College of Science, Technology, and Health

Data Science Ensemble: Effective and Efficient Neural Re-Ranking in Information Retrieval with Allan Hanbury

May 6, 2021
4:00 PM to 5:00 PM

Dr. Allan Hanbury will give the fourth talk in the USM Data Science Ensemble, a seminar series focused on the intersection of data science and real-world applications. We invite you to join us through this moderated Zoom link for an in-depth look at a practical application of data science.

In his talk, "Effective and Efficient Neural Re-Ranking in Information Retrieval," Dr. Hanbury will discuss re-ranking in information retrieval: a very efficient retrieval algorithm returns results in a first stage, and a more effective, usually more computationally expensive algorithm re-ranks the top results in a second stage. Complex neural architectures such as BERT have recently shown very high effectiveness on the re-ranking task but have three disadvantages: (i) high computational cost, which requires powerful hardware and leads to unacceptable waits before results are returned; (ii) lack of interpretability of the ranking produced; and (iii) a limit on the amount of document text that can be processed. These disadvantages limit the use of neural approaches in more constrained environments with the specific user requirements and expectations often found in domain-specific or enterprise search. He will present approaches developed by his group to overcome these disadvantages, including the Transformer-Kernel (TK) neural re-ranking model, its adaptation for long text (TKL), and Cross-Architecture Knowledge Distillation.
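For readers new to the topic, the two-stage idea described in the abstract can be illustrated with a minimal Python sketch. It is not Dr. Hanbury's method and does not implement TK, TKL, or BERT: the cheap first stage is a simple term-overlap count, and the "expensive" second stage is a cosine-similarity stand-in for a neural scorer. All function names and the tiny corpus `DOCS` are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

# Hypothetical toy corpus, used only to illustrate the pipeline.
DOCS = {
    "a": "neural ranking neural ranking models are discussed at length "
         "in this long survey document about many topics",
    "b": "neural ranking",
    "c": "cooking recipes",
}

def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption made for simplicity)."""
    return text.lower().split()

def first_stage(query, docs, k):
    """Cheap stage: score every document by query-term overlap, keep the top k."""
    q_terms = set(tokenize(query))
    scored = []
    for doc_id, text in docs.items():
        overlap = sum(1 for token in tokenize(text) if token in q_terms)
        scored.append((overlap, doc_id))
    # Highest overlap first; ties broken by document id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:k]]

def expensive_score(query, text):
    """Stand-in for a costly neural scorer: cosine similarity of term counts."""
    q = Counter(tokenize(query))
    d = Counter(tokenize(text))
    dot = sum(q[term] * d[term] for term in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in d.values())
    )
    return dot / norm if norm else 0.0

def rerank(query, docs, candidates):
    """Second stage: reorder only the candidates using the expensive scorer."""
    return sorted(
        candidates,
        key=lambda doc_id: (-expensive_score(query, docs[doc_id]), doc_id),
    )

def search(query, docs, k=2):
    """Run the cheap first stage over all documents, then re-rank its top k."""
    return rerank(query, docs, first_stage(query, docs, k))
```

In this sketch the expensive scorer runs only on the `k` candidates that survive the first stage, which is what keeps a two-stage pipeline affordable. Calling `search("neural ranking", DOCS)` shows the second stage reordering the candidates that the first stage returned.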

Allan Hanbury is Professor for Data Intelligence and head of the E-Commerce Research Unit in the Faculty of Informatics, TU Wien, Austria. He is also a faculty member of the Complexity Science Hub Vienna. He was the scientific coordinator of the EU-funded Khresmoi project on medical and health information search and analysis and is co-founder of contextflow, the spin-off company commercializing the radiology image search technology developed in the Khresmoi project. He is the coordinator of DoSSIER, a Marie Curie Innovative Training Network that is educating 15 doctoral students on domain-specific systems for information extraction and retrieval. He also coordinated the EU-funded VISCERAL project on evaluation of algorithms on big data and the EU-funded KConnect project on technology for analyzing medical text. He is the author or co-author of over 160 publications in refereed journals and refereed international conferences.

Contact Information

Sharon Watterson