3 changes: 2 additions & 1 deletion .env.example
@@ -1,2 +1,3 @@
# Copy this file to .env and add your actual API key
ANTHROPIC_API_KEY=your-anthropic-api-key-here
ANTHROPIC_API_KEY=your-anthropic-api-key-here
GEMINI_API_KEY=your-gemini-api-key-here
57 changes: 57 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,57 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Commands

```bash
# Run the application
./run.sh
# or manually:
cd backend && uv run uvicorn app:app --reload --port 8000

# Install dependencies
uv sync

# Run a specific backend file directly (useful for quick testing)
cd backend && uv run python <file>.py
```

The app is served at `http://localhost:8000`. FastAPI's auto-generated API docs are at `http://localhost:8000/docs`.

## Architecture

This is a RAG (Retrieval-Augmented Generation) chatbot that answers questions about course materials. FastAPI serves both the API and the static frontend from a single process.

**Request flow** (a minimal client sketch follows the list):
1. Frontend (`frontend/script.js`) POSTs `{ query, session_id }` to `/api/query`
2. `app.py` routes to `RAGSystem.query()` — the main orchestrator in `rag_system.py`
3. `RAGSystem` fetches conversation history from `SessionManager`, then calls `AIGenerator`
4. `AIGenerator` calls Claude (claude-sonnet-4) with a `search_course_content` tool available
5. If Claude invokes the tool, `CourseSearchTool` runs a semantic search against ChromaDB and returns formatted chunks
6. Claude makes a second API call to synthesize a final answer from the retrieved chunks
7. Sources and response are returned up the chain to the frontend
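
A minimal client-side sketch of this flow, assuming the server is running on port 8000 and the `requests` package is installed (the question text is a placeholder):

```python
import requests

# First request: no session yet, so session_id is omitted/None (step 1).
resp = requests.post(
    "http://localhost:8000/api/query",
    json={"query": "What does lesson 1 of the MCP course cover?", "session_id": None},
)
data = resp.json()

print(data["answer"])            # synthesized answer (step 6)
print(data["sources"])           # sources returned up the chain (step 7)
session_id = data["session_id"]  # minted server-side, reused on follow-up queries
```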

**Key design decisions:**
- Claude drives retrieval via tool use — it decides whether to search and what to search for, rather than always retrieving before generating
- Two ChromaDB collections: `course_catalog` (one entry per course, used for fuzzy course-name resolution) and `course_content` (chunked text, used for semantic search); see the sketch after this list
- Conversation history is stored in-memory in `SessionManager` — it is lost on server restart
- The session ID is minted server-side on first request and returned to the frontend, which holds it in `currentSessionId` for the rest of the browser session
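
A rough sketch of the two-collection layout described above (the ChromaDB path and query strings are placeholders; the real `VectorStore` also wires in the configured SentenceTransformer embedding model):

```python
import chromadb

client = chromadb.PersistentClient(path="./chroma_db")  # assumed CHROMA_PATH
course_catalog = client.get_or_create_collection("course_catalog")  # one entry per course
course_content = client.get_or_create_collection("course_content")  # chunked lesson text

# Fuzzy course-name resolution: match a loose name against the catalog first.
match = course_catalog.query(query_texts=["the mcp course"], n_results=1)

# Semantic search over content chunks, optionally filtered by course metadata.
hits = course_content.query(query_texts=["how do tool schemas work?"], n_results=5)
```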

**Document format** (`docs/*.txt`):
```
Course Title: ...
Course Link: ...
Course Instructor: ...
Lesson 1: Title
Lesson Link: ...
<lesson content>
```
`DocumentProcessor` parses this format and chunks each lesson's content into ~800-character overlapping segments. Course documents are loaded at startup via `app.py`'s `startup_event`.
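
A simplified sketch of the chunking step, assuming plain character-offset chunks with the configured sizes (the real `DocumentProcessor` also parses the course/lesson headers and may split on sentence boundaries):

```python
def chunk_text(text: str, chunk_size: int = 800, overlap: int = 100) -> list[str]:
    """Split text into fixed-size chunks where consecutive chunks share `overlap` characters."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]

chunks = chunk_text("lesson content " * 200)  # placeholder lesson text
```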

**Configuration** (`backend/config.py`):
- `ANTHROPIC_MODEL` — Claude model used for generation
- `EMBEDDING_MODEL` — SentenceTransformer model used for ChromaDB embeddings (`all-MiniLM-L6-v2`)
- `CHUNK_SIZE` / `CHUNK_OVERLAP` — control document chunking
- `MAX_HISTORY` — number of conversation exchanges retained per session
- `CHROMA_PATH` — local path for persisted ChromaDB data
16 changes: 13 additions & 3 deletions backend/app.py
@@ -6,7 +6,7 @@
from fastapi.staticfiles import StaticFiles
from fastapi.middleware.trustedhost import TrustedHostMiddleware
from pydantic import BaseModel
from typing import List, Optional
from typing import Any, List, Optional
import os

from config import config
@@ -39,11 +39,12 @@ class QueryRequest(BaseModel):
"""Request model for course queries"""
query: str
session_id: Optional[str] = None
model: str = "claude"

class QueryResponse(BaseModel):
"""Response model for course queries"""
answer: str
sources: List[str]
sources: List[Any]
session_id: str

class CourseStats(BaseModel):
@@ -63,7 +64,7 @@ async def query_documents(request: QueryRequest):
session_id = rag_system.session_manager.create_session()

# Process query using RAG system
answer, sources = rag_system.query(request.query, session_id)
answer, sources = rag_system.query(request.query, session_id, request.model)

return QueryResponse(
answer=answer,
@@ -73,6 +74,15 @@
except Exception as e:
raise HTTPException(status_code=500, detail=str(e))

@app.delete("/api/session/{session_id}")
async def delete_session(session_id: str):
"""Clear a conversation session"""
try:
rag_system.session_manager.clear_session(session_id)
return {"status": "cleared"}
except Exception as e:
raise HTTPException(status_code=500, detail=str(e))

@app.get("/api/courses", response_model=CourseStats)
async def get_course_stats():
"""Get course analytics and statistics"""
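
Taken together, the `app.py` changes above add an optional `model` field on `/api/query` and a new `DELETE /api/session/{session_id}` route. A hedged sketch of a client using both (query text is a placeholder; `requests` is assumed to be installed):

```python
import requests

BASE = "http://localhost:8000"

# `model` defaults to "claude"; "gemini" selects the new generator.
resp = requests.post(f"{BASE}/api/query", json={"query": "Summarize lesson 2", "model": "gemini"})
session_id = resp.json()["session_id"]

# Clearing the session drops its conversation history on the server.
requests.delete(f"{BASE}/api/session/{session_id}")  # -> {"status": "cleared"}
```
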
6 changes: 5 additions & 1 deletion backend/config.py
@@ -11,14 +11,18 @@ class Config:
# Anthropic API settings
ANTHROPIC_API_KEY: str = os.getenv("ANTHROPIC_API_KEY", "")
ANTHROPIC_MODEL: str = "claude-sonnet-4-20250514"

# Gemini API settings
GEMINI_API_KEY: str = os.getenv("GEMINI_API_KEY", "")
GEMINI_MODEL: str = "gemini-2.5-flash"

# Embedding model settings
EMBEDDING_MODEL: str = "all-MiniLM-L6-v2"

# Document processing settings
CHUNK_SIZE: int = 800 # Size of text chunks for vector storage
CHUNK_OVERLAP: int = 100 # Characters to overlap between chunks
MAX_RESULTS: int = 5 # Maximum search results to return
MAX_RESULTS: int = 10 # Maximum search results to return
MAX_HISTORY: int = 2 # Number of conversation messages to remember

# Database paths
132 changes: 132 additions & 0 deletions backend/gemini_generator.py
@@ -0,0 +1,132 @@
from google import genai
from google.genai import types
from typing import List, Optional, Dict, Any

TYPE_MAP = {
"object": types.Type.OBJECT,
"string": types.Type.STRING,
"integer": types.Type.INTEGER,
"number": types.Type.NUMBER,
"boolean": types.Type.BOOLEAN,
"array": types.Type.ARRAY,
}


class GeminiGenerator:
SYSTEM_PROMPT = """ You are an AI assistant specialized in course materials and educational content with access to a comprehensive search tool for course information.

Search Tool Usage:
- Use the search tool **only** for questions about specific course content or detailed educational materials
- **One search per query maximum**
- Synthesize search results into accurate, fact-based responses
- If search yields no results, state this clearly without offering alternatives

Response Protocol:
- **General knowledge questions**: Answer using existing knowledge without searching
- **Course-specific questions**: Search first, then answer
- **No meta-commentary**:
- Provide direct answers only — no reasoning process, search explanations, or question-type analysis
- Do not mention "based on the search results"


All responses must be:
1. **Brief, Concise and focused** - Get to the point quickly
2. **Educational** - Maintain instructional value
3. **Clear** - Use accessible language
4. **Example-supported** - Include relevant examples when they aid understanding
Provide only the direct answer to what was asked.
"""

def __init__(self, api_key: str, model: str):
self.client = genai.Client(api_key=api_key)
self.model = model

def generate_response(
self,
query: str,
conversation_history: Optional[str] = None,
tools: Optional[List] = None,
tool_manager=None,
) -> str:
system = (
f"{self.SYSTEM_PROMPT}\n\nPrevious conversation:\n{conversation_history}"
if conversation_history
else self.SYSTEM_PROMPT
)
contents = [{"role": "user", "parts": [{"text": query}]}]
config = types.GenerateContentConfig(
system_instruction=system,
temperature=0,
max_output_tokens=800,
tools=self._convert_tools(tools) if tools else None,
)
response = self.client.models.generate_content(
model=self.model, contents=contents, config=config
)
if self._has_function_call(response) and tool_manager:
return self._handle_tool_execution(response, contents, system, tool_manager)
return response.text

def _handle_tool_execution(self, initial_response, contents, system, tool_manager) -> str:
contents = contents + [
{"role": "model", "parts": initial_response.candidates[0].content.parts}
]
result_parts = []
for part in initial_response.candidates[0].content.parts:
if part.function_call:
result = tool_manager.execute_tool(
part.function_call.name, **dict(part.function_call.args)
)
result_parts.append(
types.Part.from_function_response(
name=part.function_call.name,
response={"result": result},
)
)
contents = contents + [{"role": "user", "parts": result_parts}]
final = self.client.models.generate_content(
model=self.model,
contents=contents,
config=types.GenerateContentConfig(
system_instruction=system, temperature=0, max_output_tokens=800
),
)
return final.text

def _has_function_call(self, response) -> bool:
try:
return any(
p.function_call for p in response.candidates[0].content.parts
)
except (AttributeError, IndexError):
return False

def _convert_tools(self, anthropic_tools: List[Dict]) -> List:
return [
types.Tool(
function_declarations=[
types.FunctionDeclaration(
name=t["name"],
description=t["description"],
parameters=self._convert_schema(t["input_schema"]),
)
for t in anthropic_tools
]
)
]

def _convert_schema(self, schema: Dict) -> types.Schema:
kwargs: Dict[str, Any] = {
"type": TYPE_MAP.get(schema.get("type", "").lower(), types.Type.STRING)
}
if "description" in schema:
kwargs["description"] = schema["description"]
if "properties" in schema:
kwargs["properties"] = {
k: self._convert_schema(v) for k, v in schema["properties"].items()
}
if "required" in schema:
kwargs["required"] = schema["required"]
if "items" in schema:
kwargs["items"] = self._convert_schema(schema["items"])
return types.Schema(**kwargs)
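
For orientation, a rough usage sketch for `GeminiGenerator` as defined above. The tool schema and the stub tool manager are illustrative stand-ins for the real `ToolManager`/`CourseSearchTool`, and a valid Gemini API key is required:

```python
from gemini_generator import GeminiGenerator

search_tool = {
    "name": "search_course_content",
    "description": "Search course materials",
    "input_schema": {                      # Anthropic-style schema, converted by _convert_tools
        "type": "object",
        "properties": {"query": {"type": "string", "description": "What to search for"}},
        "required": ["query"],
    },
}

class StubToolManager:
    """Stand-in for ToolManager: echoes the call instead of searching ChromaDB."""
    def execute_tool(self, name: str, **kwargs) -> str:
        return f"[stub result for {name}({kwargs})]"

gen = GeminiGenerator(api_key="your-gemini-api-key-here", model="gemini-2.5-flash")
answer = gen.generate_response(
    "What does lesson 1 cover?",
    tools=[search_tool],
    tool_manager=StubToolManager(),  # only invoked if Gemini emits a function call
)
print(answer)
```
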
31 changes: 27 additions & 4 deletions backend/rag_system.py
@@ -3,8 +3,9 @@
from document_processor import DocumentProcessor
from vector_store import VectorStore
from ai_generator import AIGenerator
from gemini_generator import GeminiGenerator
from session_manager import SessionManager
from search_tools import ToolManager, CourseSearchTool
from search_tools import ToolManager, CourseSearchTool, CoursePageTool
from models import Course, Lesson, CourseChunk

class RAGSystem:
@@ -17,12 +18,15 @@ def __init__(self, config):
self.document_processor = DocumentProcessor(config.CHUNK_SIZE, config.CHUNK_OVERLAP)
self.vector_store = VectorStore(config.CHROMA_PATH, config.EMBEDDING_MODEL, config.MAX_RESULTS)
self.ai_generator = AIGenerator(config.ANTHROPIC_API_KEY, config.ANTHROPIC_MODEL)
self.gemini_generator = GeminiGenerator(config.GEMINI_API_KEY, config.GEMINI_MODEL)
self.session_manager = SessionManager(config.MAX_HISTORY)

# Initialize search tools
self.tool_manager = ToolManager()
self.search_tool = CourseSearchTool(self.vector_store)
self.tool_manager.register_tool(self.search_tool)
self.course_page_tool = CoursePageTool(self.vector_store)
self.tool_manager.register_tool(self.course_page_tool)

def add_course_document(self, file_path: str) -> Tuple[Course, int]:
"""
@@ -99,7 +103,7 @@ def add_course_folder(self, folder_path: str, clear_existing: bool = False) -> T

return total_courses, total_chunks

def query(self, query: str, session_id: Optional[str] = None) -> Tuple[str, List[str]]:
def query(self, query: str, session_id: Optional[str] = None, model: str = "claude") -> Tuple[str, List[str]]:
"""
Process a user query using the RAG system with tool-based search.

@@ -111,15 +115,34 @@ def query(self, query: str, session_id: Optional[str] = None) -> Tuple[str, List
Tuple of (response, sources list - empty for tool-based approach)
"""
# Create prompt for the AI with clear instructions
prompt = f"""Answer this question about course materials: {query}"""
courses_meta = self.vector_store.get_all_courses_metadata()
if courses_meta:
courses_context = "\n".join(
f"- {m['title']} ({m.get('lesson_count', '?')} lessons, instructor: {m.get('instructor', 'unknown')})"
for m in courses_meta
)
else:
courses_context = "- (none loaded)"
prompt = f"""Answer this question about course materials: {query}

Available courses:
{courses_context}"""

# Get conversation history if session exists
history = None
if session_id:
history = self.session_manager.get_conversation_history(session_id)

# Select generator based on model choice
if model == "gemini":
if not self.config.GEMINI_API_KEY:
raise ValueError("Gemini API key is not configured.")
generator = self.gemini_generator
else:
generator = self.ai_generator

# Generate response using AI with tools
response = self.ai_generator.generate_response(
response = generator.generate_response(
query=prompt,
conversation_history=history,
tools=self.tool_manager.get_tool_definitions(),
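
Finally, a sketch of the new model-selection path in `RAGSystem.query` (this assumes the usual `RAGSystem(config)` construction that `app.py` performs):

```python
from config import config
from rag_system import RAGSystem

rag = RAGSystem(config)
session_id = rag.session_manager.create_session()

# Default path: Claude drives retrieval via tool use.
answer, sources = rag.query("What is covered in lesson 3?", session_id)

# New path: Gemini, guarded by the GEMINI_API_KEY check shown above.
try:
    answer, sources = rag.query("What is covered in lesson 3?", session_id, model="gemini")
except ValueError as err:
    print(err)  # "Gemini API key is not configured."
```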