RAG System for Educational Content
Developed a RAG system using Llama2 at KIT Institute for Anthropomatics and Robotics, focusing on enhancing lecture material accessibility.
Lecture Slides
Distributed Systems
Paxos Consensus
- Safety: Only a single value is chosen
- Liveness: Some value is eventually chosen
- Proposers
- Acceptors
- Learners
Introduction
As a research student at IAR-KIT, I implemented a retrieval-augmented generation (RAG) system to make lecture content more interactive. The system processes lecture slides and course materials, retrieves the passages relevant to a student query, and uses them to generate accurate, context-aware responses.
Requirements
Process and vectorize KIT lecture slides and materials
Implement efficient retrieval system for educational content
Fine-tune Llama2 for academic context
Create evaluation metrics for answer quality
Design user-friendly Q&A interface
Support multiple question types and formats
Technologies
Llama2 for text generation
FAISS for vector storage
Sentence-transformers for embeddings
Python FastAPI backend
Streamlit for demo interface
Hugging Face transformers
PyTorch for model handling
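To illustrate how the retrieval part of such a stack fits together, here is a minimal sketch that runs on the standard library alone. It is a stand-in, not the project's implementation: bag-of-words counts take the place of sentence-transformers embeddings, and a brute-force cosine search takes the place of a FAISS index. The function names (embed, cosine, retrieve) are hypothetical; the sample chunks come from the Paxos slide above.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase bag-of-words counts.

    Assumption: stands in for a sentence-transformers model.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, chunks, k=2):
    """Return the k (score, chunk) pairs most similar to the query.

    Assumption: brute-force search stands in for a FAISS index lookup.
    """
    q = embed(query)
    scored = [(cosine(q, embed(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


slides = [
    "Paxos safety: only a single value is chosen",
    "Paxos liveness: some value is eventually chosen",
    "Paxos roles: proposers, acceptors, learners",
]
top = retrieve("Which roles does Paxos define?", slides, k=1)
print(top[0][1])
```

In a real pipeline the retrieved chunks would then be passed to the generator as context; dense embeddings would also match paraphrases that this word-overlap stand-in misses.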
Challenges
Academic Content Processing
- Developed specialized tokenization for technical content
- Handled mathematical formulas and diagrams
- Maintained context across lecture sections
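One common way to keep context across lecture sections is to split slide text into overlapping windows and prefix each chunk with its section title. The sketch below assumes that approach; the source does not say which chunking scheme was used, and chunk_slides, max_words, and overlap are hypothetical names and defaults.

```python
def chunk_slides(slides, max_words=40, overlap=10):
    """Split slide text into overlapping word windows.

    slides: list of (section_title, text) pairs.
    Each chunk is prefixed with its section title so the retriever
    still sees which part of the lecture the text belongs to.
    """
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    step = max_words - overlap
    chunks = []
    for title, text in slides:
        words = text.split()
        for start in range(0, len(words), step):
            window = words[start:start + max_words]
            chunks.append(f"[{title}] " + " ".join(window))
            # Stop once this window has reached the end of the section.
            if start + max_words >= len(words):
                break
    return chunks


demo = [("Paxos Consensus", " ".join(f"w{i}" for i in range(10)))]
for chunk in chunk_slides(demo, max_words=4, overlap=1):
    print(chunk)
```

The overlap means a sentence that straddles a window boundary appears whole in at least one chunk, at the cost of some duplicated text in the index.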
Model Optimization
- Fine-tuned Llama2 for academic domain
- Balanced response accuracy and generation speed
- Implemented efficient context window management
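Context window management can be pictured as a packing problem: fit the highest-ranked retrieved chunks into a fixed token budget alongside the instructions and the question. The following is a minimal sketch under stated assumptions, not the project's code. Whitespace word counts stand in for a real Llama2 tokenizer, and build_prompt, count_tokens, and the budget value are hypothetical.

```python
def count_tokens(text):
    """Rough token count. Assumption: whitespace words approximate
    tokens; a real system would use the model's own tokenizer."""
    return len(text.split())


def build_prompt(question, ranked_chunks, budget=50):
    """Greedily pack chunks (best first) into a fixed token budget.

    Returns the assembled prompt and the number of tokens used.
    """
    header = "Answer using only the lecture excerpts below."
    footer = f"Question: {question}"
    used = count_tokens(header) + count_tokens(footer)
    selected = []
    for chunk in ranked_chunks:
        cost = count_tokens(chunk)
        if used + cost > budget:
            # Skip chunks that do not fit; a shorter, lower-ranked
            # chunk later in the list may still fit.
            continue
        selected.append(chunk)
        used += cost
    return "\n".join([header, *selected, footer]), used


chunks = ["a b c d e", " ".join(["f"] * 20), "g h"]
prompt, used = build_prompt("What is safety?", chunks, budget=20)
print(used)
print(prompt)
```

Skipping an oversized chunk instead of stopping at it trades strict rank order for better use of the budget; stopping at the first chunk that does not fit is the simpler alternative.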
Educational Accuracy
- Ensured responses align with course material
- Developed verification against source slides
- Created academic-focused evaluation metrics
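Verification against source slides can be approximated with a simple lexical check: what fraction of the answer's content words also occur in the retrieved slide text. The sketch below is one such heuristic, assumed for illustration; the source does not describe the actual verification method or metrics, and grounding_score, content_tokens, and the stopword list are hypothetical.

```python
import re

# Small illustrative stopword list (assumption, not exhaustive).
STOPWORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "and", "that", "it"}


def content_tokens(text):
    """Lowercase alphanumeric tokens with stopwords removed."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower())
            if t not in STOPWORDS]


def grounding_score(answer, sources):
    """Fraction of the answer's content tokens found in the sources.

    1.0 means every content word appears in the slides;
    0.0 means none do (or the answer has no content words).
    """
    answer_tokens = content_tokens(answer)
    if not answer_tokens:
        return 0.0
    source_vocab = set()
    for source in sources:
        source_vocab.update(content_tokens(source))
    supported = sum(1 for t in answer_tokens if t in source_vocab)
    return supported / len(answer_tokens)


sources = [
    "Safety: Only a single value is chosen",
    "Liveness: Some value is eventually chosen",
]
print(grounding_score("Only a single value is chosen", sources))
print(grounding_score("Paxos uses leader election", sources))
```

A low score flags an answer for review but does not prove it wrong, since a correct paraphrase can share few words with the slides; a high score likewise does not guarantee the claim itself is supported.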