An Integrated Search Framework for Leveraging the Knowledge-Based Web Ecosystem

Authors

DOI:

https://doi.org/10.3127/ajis.v24i0.2331

Keywords:

integrated search framework, digital ecosystem, information retrieval, information management, search engine, crawler, text classification

Abstract

The explosion of information constrains the judgement of search terms associated with the Knowledge-Based Web Ecosystem (KBWE), making the retrieval of relevant information and the management of the resulting knowledge challenging. Existing information retrieval (IR) tools, and their fusion into a single framework in which search results can be managed effectively, need attention. In this article, we demonstrate the effective use of information retrieval services by a variety of users and agents in various KBWE scenarios. We propose an innovative Integrated Search Framework (ISF) that utilises crawling strategies, web search technologies and traditional database search methods. In addition, ISF offers comprehensive, dynamic, personalized, and organization-oriented information retrieval services that span the Internet, extranets, intranets, and the personal desktop. In this empirical research, experiments demonstrate the improvements in the search process envisaged in the conceptual ISF. The experimental results show improved precision compared with other popular search engines.
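The precision comparison mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which precision measure or cut-off the experiments used, so the function below assumes the common precision-at-k definition (the share of the top k results judged relevant); the function name, document identifiers and relevance judgements are hypothetical and are not taken from the article.

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Return the fraction of the top-k ranked results that are relevant.

    ranked_ids   -- result identifiers in the order the engine returned them
    relevant_ids -- set of identifiers judged relevant by an assessor
    k            -- cut-off rank (assumed here; not specified in the abstract)
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    # Dividing by k (not len(top_k)) penalises engines that return fewer than k results.
    return hits / k


# Hypothetical result lists for one query from two engines.
relevant = {"d1", "d2", "d5"}
engine_a = ["d3", "d1", "d7", "d2", "d9"]
engine_b = ["d1", "d2", "d5", "d8", "d4"]

print(precision_at_k(engine_a, relevant, 5))
print(precision_at_k(engine_b, relevant, 5))
```

Averaging this value over a set of test queries gives one figure per engine, which is one plausible way to compare search systems on precision.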

Author Biographies

Dengya Zhu, Curtin University

Dr Dengya Zhu is an adjunct research fellow at the School of Information Systems, Curtin Business School. He has a broad range of experience in government, industry, and academic research. Dr Zhu’s research interests include information retrieval, data mining, machine learning, natural language processing, big data, sentiment analysis, open source software and software development. His research projects are usually practically oriented and address real-world issues.

Shastri Lakshman Nimmagadda, Curtin University

Dr Shastri is presently an adjunct research fellow at the School of Management (Information Systems) at Curtin Business School, Curtin University, Australia. He worked for Schlumberger in multiple geo-markets worldwide as an expert in geosciences. Prior to that, he worked for more than 25 years for several petroleum operating and service companies in India, Australia, Uganda, Kuwait, Abu Dhabi, Egypt, Malaysia, Colombia, Indonesia and Russia. He earned his PhD in Information Systems and a Master of Information Technology with distinction from Curtin University, Australia. He also obtained an M Tech and a PhD in Exploration Geophysics from the Indian Institute of Technology, Kharagpur, India. His academic and industry research includes Big Data support in industry environments, supply chain business data modelling, data integration, warehouse modelling, processing, interpretation and knowledge mapping, including research in domain applications. He has published and presented more than 100 research and technical papers in various international journals and conference proceedings in the areas of geophysics, oil and gas exploration, and information systems.

Published

2020-10-19

How to Cite

Zhu, D., Nimmagadda, S. L., Reiners, T., & Rudra, A. (2020). An Integrated Search Framework for Leveraging the Knowledge-Based Web Ecosystem. Australasian Journal of Information Systems, 24. https://doi.org/10.3127/ajis.v24i0.2331

Section

Research Articles