Available Thesis Topics
When applying for a thesis topic, follow the procedure described here and CC the advisor(s) in your email.
Bachelor Topics
Master Topics
Vector databases are essential components of modern applications such as machine learning, pattern recognition, and recommendation systems. They are designed to efficiently store, index, and query high-dimensional vectors. Exact nearest neighbor search becomes inefficient for high-dimensional data, whereas Approximate Nearest Neighbor Search (ANNS) offers significant performance improvements for the retrieval process by finding near-optimal matches with reduced computational cost. This thesis aims to explore recent advances in ANNS systems, with a focus on disk-based solutions.
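To make the contrast between exact and approximate search concrete, the following is a minimal sketch using only the Python standard library. It is not taken from any of the systems this topic studies: the partition-based scheme, the function names, and the `nprobe` parameter are illustrative assumptions, and the centroids are simply sampled at random rather than trained. Exact search compares the query against every vector, while the approximate variant scans only the few partitions whose centroids are closest to the query.

```python
import math
import random


def exact_search(query, vectors, k):
    """Brute force: rank every vector by its distance to the query."""
    order = sorted(range(len(vectors)), key=lambda i: math.dist(query, vectors[i]))
    return order[:k]


def build_partitions(vectors, num_partitions, rng):
    """Assign each vector to the nearest of a few randomly sampled centroids.

    Assumption for illustration: real systems train centroids (e.g. with
    k-means) or build a graph index instead of sampling at random.
    """
    centroids = rng.sample(vectors, num_partitions)
    partitions = [[] for _ in centroids]
    for i, v in enumerate(vectors):
        nearest = min(range(num_partitions), key=lambda c: math.dist(v, centroids[c]))
        partitions[nearest].append(i)
    return centroids, partitions


def approximate_search(query, vectors, centroids, partitions, k, nprobe):
    """Scan only the nprobe partitions whose centroids are closest to the query.

    Returns the top-k candidate ids and the number of vectors actually scanned.
    """
    by_centroid = sorted(range(len(centroids)), key=lambda c: math.dist(query, centroids[c]))
    candidates = [i for c in by_centroid[:nprobe] for i in partitions[c]]
    candidates.sort(key=lambda i: math.dist(query, vectors[i]))
    return candidates[:k], len(candidates)


# Small synthetic data set (hypothetical sizes, chosen only for the demo).
rng = random.Random(0)
dim, num_vectors, num_partitions = 16, 2000, 16
vectors = [[rng.random() for _ in range(dim)] for _ in range(num_vectors)]
query = [rng.random() for _ in range(dim)]

centroids, partitions = build_partitions(vectors, num_partitions, rng)
exact_ids = exact_search(query, vectors, k=10)
approx_ids, scanned = approximate_search(query, vectors, centroids, partitions, k=10, nprobe=2)

print(f"exact search scanned {num_vectors} vectors, approximate search scanned {scanned}")
print(f"overlap in top-10: {len(set(exact_ids) & set(approx_ids))} of 10")
```

Raising `nprobe` scans more partitions and recovers more of the true neighbors at a higher cost. With `nprobe` equal to the number of partitions, the sketch degenerates into exact search.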
Goal & Steps:
- Comprehensive literature review
  - Developing a thorough understanding of the basics of vector databases and ANNS
  - Reviewing state-of-the-art ANNS algorithms and systems, with a focus on disk-based schemes
  - Analyzing algorithmic and system-level optimizations
- Benchmarking the state-of-the-art schemes
  - Identifying the common baseline and the non-GPU-based schemes to evaluate
  - Conducting an empirical benchmarking study of existing schemes on realistic system setups and workloads
- Proposing novel schemes
  - Designing and implementing novel solutions and evaluating them against the existing solutions
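For the benchmarking step above, ANNS schemes are commonly compared on result quality (recall@k against exact ground truth) and query latency. The helpers below are a minimal sketch of how these two metrics could be computed. The function names and the `search_fn` callback are hypothetical and not part of any existing benchmark suite.

```python
import time


def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the true top-k neighbors found in the approximate top-k."""
    return len(set(approx_ids[:k]) & set(exact_ids[:k])) / k


def measure(search_fn, queries):
    """Run search_fn once per query; return all results and the mean latency in ms."""
    results = []
    start = time.perf_counter()
    for q in queries:
        results.append(search_fn(q))
    elapsed = time.perf_counter() - start
    return results, 1000.0 * elapsed / len(queries)
```

A realistic study would additionally report tail latency, throughput, and I/O statistics, since the schemes of interest are disk-based.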
Target: M.Sc. Students
Prerequisites & Considerations:
- Proficiency in C/C++ & Python programming
- Strong interest in systems research, with a focus on system optimizations for emerging applications such as machine learning
- Required system setup & optimizations
- Problem solving & research capability
- For a master’s thesis, the expected outcome is a contribution of publishable quality; however, publishing the paper is not mandatory
To get more familiar with the topic, you can start by looking at the following papers:
- S. J. Subramanya, F. Devvrit, H. V. Simhadri, R. Krishnaswamy, and R. Kadekodi, “DiskANN: Fast Accurate Billion-point Nearest Neighbor Search on a Single Node,” in NeurIPS, 2019.
- H. Guo and Y. Lu, “Achieving Low-Latency Graph-Based Vector Search via Aligning Best-First Search Algorithm with SSD,” in USENIX OSDI, 2025.
- H. Guo and Y. Lu, “OdinANN: Direct Insert for Consistently Stable Performance in Billion-Scale Graph-Based Vector Search,” in USENIX FAST, 2026.
Advisor: Mostafa Hadizadeh
Arancini is a hybrid binary translator developed in our group that translates x86 binaries to Arm and RISC-V architectures. Currently, it only supports translating Linux applications. To increase its usability, we want to add support for translating Windows applications and executing them on Linux platforms.
Goals of the thesis: In this context, you will add support for Windows applications in Arancini by:
- Exploring existing Windows emulation layers such as Wine or Proton
- Implementing the necessary system call translations in Arancini
- Testing and evaluating the performance of Windows applications executed through Arancini
Target: M.Sc. Students
Prerequisites:
- Programming language: C++
- Prior knowledge of operating systems and system programming is appreciated
Advisor: Redha Gouicem