Studying The Effectiveness Of Mamba
10.02.2025The rise of ChatGPT has introduced a wide range of AI-powered systems. However, despite the continuous advancements, the underlying Transformer architecture still faces limitations. With this thesis, you will explore different architectures like Mamba and the effects of their core parts.
The rise of ChatGPT has introduced a wide range of AI-powered services. However, despite continuous advancements, the underlying Transformer architecture still faces limitations which led to the exploration of Mamba.
This thesis focuses on understanding how different components of Mamba contribute to its overall effectiveness. By isolating specific elements, we aim to identify which parts have the most significant impact on performance.
Requirements:
- Python
- PyTorch
Nice to have:
- HuggingFace Transformers
- HuggingFace PEFT
Supervisor: Anton Vlasjuk