RAG System for Educational Content

Developed a retrieval-augmented generation (RAG) system using Llama2 at the KIT Institute for Anthropomatics and Robotics to make lecture material more accessible.

kit.edu/lecture-assistant

KIT Lecture Assistant

Lecture Slides

Distributed Systems

Lecture 3: Consensus Algorithms

Paxos Consensus

Safety: Only a single value is chosen
Liveness: Some value is eventually chosen

Key Components:

• Proposers
• Acceptors
• Learners

Document Retrieval

Embedding Generation

Context Injection

Response Generation

Hi! I'm your KIT lecture assistant. I can help you understand the lecture content. Try asking me about distributed systems, cloud computing, or any other course topics!

Introduction

As a research student at IAR-KIT, I implemented a retrieval-augmented generation (RAG) system to make lecture content more interactive. The system processes lecture slides and course materials to provide accurate, context-aware responses to student queries.

Requirements

Process and vectorize KIT lecture slides and materials
Implement an efficient retrieval system for educational content
Fine-tune Llama2 for academic context
Create evaluation metrics for answer quality
Design a user-friendly Q&A interface
Support multiple question types and formats

Technologies

Llama2 for text generation
FAISS for vector storage
Sentence-transformers for embeddings
Python FastAPI backend
Streamlit for demo interface
Hugging Face transformers
PyTorch for model handling
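
The retrieval step that these components support can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the project's code: the bag-of-words `embed` function is a stand-in for a sentence-transformers model, the brute-force cosine search is a stand-in for a FAISS index, and the slide snippets are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query (brute force, no index)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


# Hypothetical slide snippets, loosely based on the Paxos slide shown above.
slides = [
    "Paxos safety: only a single value is chosen",
    "Paxos roles: proposers, acceptors, learners",
    "Cloud computing: elasticity and pay-per-use pricing",
]
top = retrieve("Which roles take part in Paxos?", slides, k=1)
print(top)
```

In a real deployment the dense vectors and the index would replace `embed` and the sorted scan, while the query flow (embed the question, rank the chunks, keep the top k) would stay the same.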

Challenges

Academic Content Processing

Developed specialized tokenization for technical content
Handled mathematical formulas and diagrams
Maintained context across lecture sections
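
One common way to keep context across lecture sections is to split slide text into overlapping windows and tag every chunk with where it came from. The sketch below assumes that approach; the function name `chunk_slides`, the window sizes, and the slide numbering are illustrative choices, not details taken from the project.

```python
def chunk_slides(lecture_title, slides, max_words=40, overlap=10):
    """Split slide texts into overlapping word windows tagged with their origin."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be non-negative and smaller than max_words")
    step = max_words - overlap
    chunks = []
    for number, text in enumerate(slides, start=1):
        words = text.split()
        for start in range(0, len(words), step):
            window = words[start:start + max_words]
            chunks.append({
                "source": f"{lecture_title}, slide {number}",
                "text": " ".join(window),
            })
            # Stop once this window already reaches the end of the slide.
            if start + max_words >= len(words):
                break
    return chunks


# Example using the slide text shown earlier (slide numbers are hypothetical).
chunks = chunk_slides(
    "Lecture 3: Consensus Algorithms",
    ["Safety: Only a single value is chosen",
     "Key Components: Proposers, Acceptors, Learners"],
)
for chunk in chunks:
    print(chunk["source"], "|", chunk["text"])
```

The overlap means a sentence that straddles a window boundary still appears whole in at least one chunk, and the `source` tag lets an answer point back to a specific slide.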

Model Optimization

Fine-tuned Llama2 for academic domain
Balanced response accuracy and generation speed
Implemented efficient context window management
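
Context window management can be illustrated with a greedy packing step: keep the highest-ranked chunks that fit a token budget, then inject them into the prompt. This is a sketch under stated assumptions. The word-count token estimate, the budget of 12, and the prompt wording are all hypothetical; a real system would count tokens with the model's own tokenizer and use the model's chat template.

```python
def count_words(text):
    """Crude token estimate: one token per whitespace-separated word (an assumption)."""
    return len(text.split())


def pack_context(ranked_chunks, max_tokens, count_tokens=count_words):
    """Greedily keep chunks, in rank order, that still fit into the token budget."""
    selected, used = [], 0
    for chunk in ranked_chunks:
        cost = count_tokens(chunk)
        if used + cost <= max_tokens:
            selected.append(chunk)
            used += cost
        # Chunks that do not fit are skipped; a shorter one later may still fit.
    return selected, used


def build_prompt(question, context_chunks):
    """Inject the selected chunks into a plain instruction prompt (illustrative wording)."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the lecture excerpts below.\n\n"
        f"Lecture excerpts:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Chunks are assumed to arrive already ranked by relevance.
ranked = [
    "Safety: Only a single value is chosen",
    "Liveness: Some value is eventually chosen",
    "Key Components: Proposers, Acceptors, Learners",
]
selected, used = pack_context(ranked, max_tokens=12)
prompt = build_prompt("What does Paxos safety guarantee?", selected)
print(prompt)
```

With this budget the second chunk is skipped because it would overflow, while the shorter third chunk still fits. That trade-off is the kind of balance between accuracy and prompt length described above.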

Educational Accuracy

Ensured responses align with course material
Developed verification against source slides
Created academic-focused evaluation metrics
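
A simple form of verification against source slides is a grounding score: the share of an answer's content words that also occur in the retrieved slide text. The sketch below assumes that lexical-overlap approach, which may differ from the metrics actually used in the project. The stopword list and the example answers are hypothetical.

```python
import re

# Small illustrative stopword list; a real metric would use a fuller one.
STOPWORDS = {"a", "an", "the", "is", "are", "of", "in", "to", "and"}


def content_words(text):
    """Lowercased words of the text with stopwords removed."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}


def grounding_score(answer, source_chunks):
    """Fraction of the answer's content words that also occur in the source chunks."""
    answer_words = content_words(answer)
    if not answer_words:
        return 0.0
    source_words = set()
    for chunk in source_chunks:
        source_words |= content_words(chunk)
    return len(answer_words & source_words) / len(answer_words)


sources = [
    "Safety: Only a single value is chosen",
    "Liveness: Some value is eventually chosen",
]
grounded = grounding_score("Paxos safety means a single value is chosen", sources)
ungrounded = grounding_score("Raft elects a leader with randomized timeouts", sources)
print(round(grounded, 2), round(ungrounded, 2))
```

A low score flags answers that drift away from the course material. Lexical overlap cannot detect a paraphrase or a factual error phrased in the slide's own vocabulary, so it works best as a first filter ahead of manual review.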