XPloreRank: exploring XML data via you may also like queries
- Authors: Naseriparsa, Mehdi; Liu, Chengfei; Islam, Md Saiful; Zhou, Rui
- Date: 2019
- Type: Text, Journal article
- Relation: World Wide Web Vol. 22, no. 4 (2019), p. 1727-1750
- Description: In many cases, users are not familiar with their exact information needs while searching complicated data sources. This lack of understanding may cause the users to feel dissatisfaction when the system retrieves insufficient results after they issue queries. However, using their original query results, we may recommend additional queries which are highly relevant to the original query. This paper presents XPloreRank to recommend top-l highly relevant keyword queries called “You May Also Like” (YMAL) queries to the users in XML keyword search. To generate such queries, we firstly analyze the original keyword query results content and construct a weighted co-occurring keyword graph. Then, we generate the YMAL queries by traversing the co-occurring keyword graph and rank them based on the following correlation aspects: (a) external correlation, which measures the similarity of the YMAL query to the original query and (b) internal correlation, which measures the capability of the YMAL query keywords in producing meaningful results with respect to the data source. Due to the complexity of generating YMAL queries, we propose a novel A* search-based technique to generate top-l YMAL queries efficiently. We also present a greedy-based approximation for it to improve the performance further. Extensive experiments verify the effectiveness and efficiency of our approach. © 2018, Springer Science+Business Media, LLC, part of Springer Nature.
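The graph-based recommendation idea in the description above can be illustrated with a minimal sketch. It assumes each query result is available as a plain string, and it uses simplified stand-ins for the two correlation aspects (keyword overlap for external correlation, edge weight for internal correlation). The function names and the scoring are hypothetical and are not taken from the paper; the sketch enumerates two-keyword candidates exhaustively rather than using the A* search or the greedy approximation the authors propose.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence_graph(results):
    """Return {(kw_a, kw_b): weight}; weight counts the results containing both keywords."""
    graph = Counter()
    for text in results:
        keywords = sorted(set(text.lower().split()))
        for pair in combinations(keywords, 2):
            graph[pair] += 1
    return graph


def top_l_queries(graph, original_query, l):
    """Rank two-keyword candidate queries with a toy score (an assumption, not the paper's)."""
    original = set(original_query)
    scored = []
    for (a, b), weight in graph.items():
        candidate = {a, b}
        if candidate <= original:
            continue  # skip candidates that only repeat the original query
        external = len(candidate & original)  # stand-in: overlap with the original query
        internal = weight  # stand-in: how often the keywords co-occur in results
        scored.append((external + internal, (a, b)))
    # Highest score first; ties broken alphabetically for a deterministic order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [pair for _, pair in scored[:l]]
```

Calling `build_cooccurrence_graph` on the result texts and passing the graph to `top_l_queries` with the original keywords returns the `l` best-scoring keyword pairs under this toy scoring.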
No-but-semantic-match: computing semantically matched XML keyword search results
- Authors: Naseriparsa, Mehdi; Islam, Md Saiful; Liu, Chengfei; Moser, Irene
- Date: 2018
- Type: Text, Journal article
- Relation: World Wide Web Vol. 21, no. 5 (2018), p. 1223-1257
- Description: Users are rarely familiar with the content of a data source they are querying, and therefore cannot avoid using keywords that do not exist in the data source. Traditional systems may respond with an empty result, causing dissatisfaction, while the data source in effect holds semantically related content. In this paper we study this no-but-semantic-match problem on XML keyword search and propose a solution which enables us to present the top-k semantically related results to the user. Our solution involves two steps: (a) extracting semantically related candidate queries from the original query and (b) processing candidate queries and retrieving the top-k semantically related results. Candidate queries are generated by replacement of non-mapped keywords with candidate keywords obtained from an ontological knowledge base. Candidate results are scored using their cohesiveness and their similarity to the original query. Since the number of queries to process can be large, with each result having to be analyzed, we propose pruning techniques to retrieve the top-k results efficiently. We develop two query processing algorithms based on our pruning techniques. Further, we exploit a property of the candidate queries to propose a technique for processing multiple queries in batch, which improves the performance substantially. Extensive experiments on two real datasets verify the effectiveness and efficiency of the proposed approaches. © 2017, Springer Science+Business Media, LLC.
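The candidate-query step in the description above (replacing non-mapped keywords with related keywords from a knowledge base) can be sketched roughly as follows. The sketch assumes the data source's vocabulary is a set of strings and models the ontological knowledge base as a plain dictionary; both the `related` mapping and the function name are hypothetical. Scoring, pruning, and batch processing from the paper are not shown.

```python
from itertools import product


def candidate_queries(query, data_keywords, related):
    """Replace keywords absent from the data source with related keywords that are present."""
    options = []
    for kw in query:
        if kw in data_keywords:
            options.append([kw])  # mapped keyword: keep it unchanged
        else:
            # Non-mapped keyword: keep only substitutes that exist in the data source.
            subs = [c for c in related.get(kw, []) if c in data_keywords]
            if not subs:
                return []  # no usable substitute, so no candidate query
            options.append(subs)
    # Every combination of per-keyword choices is one candidate query.
    return [list(combo) for combo in product(*options)]
```

The number of candidates is the product of the substitute counts per non-mapped keyword, which suggests why the description stresses pruning and batch processing.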