Quickstart · Architecture · Roadmap · Tech Stack
Most AI projects stop at the demo: a single notebook, an in-memory vector store, a script that works once on someone's machine. Quarry is built the other way around, as a real backend first, with the model layer added on top of infrastructure that's already production-shaped.
It's a long-running, versioned platform (v1 → v15) rather than a one-off project, with a roadmap that runs from RAG and agents to evaluation, observability, and inference serving.
How it compares
| | Typical AI demo | Quarry |
|---|---|---|
| Architecture | Single script / notebook | Tiered, containerized backend |
| Data layer | In-memory, ephemeral | PostgreSQL + pgvector, persisted |
| State / caching | None | Redis |
| Testing | Manual spot-checks | Metric-driven evaluation (planned) |
| Observability | `print()` | Structured logging & tracing (planned) |
| Lifecycle | Abandoned after a weekend | Versioned roadmap, v1 → v15 |
flowchart TB
Client["Client / UI"]
subgraph API["FastAPI API Gateway"]
direction LR
Auth["Auth Layer<br/><sub>JWT · RBAC</sub>"]
Docs["Document Layer<br/><sub>Parse · Chunk</sub>"]
Retrieval["Retrieval Layer<br/><sub>Embed · Search</sub>"]
end
subgraph Data["Data & State Layer"]
direction LR
PG[("PostgreSQL<br/>+ pgvector")]
Redis[("Redis<br/>Cache · Queues")]
end
Inference["AI & Inference Layer<br/><sub>planned · v3</sub>"]
Client -->|HTTP / REST| API
Auth --> Data
Docs --> Data
Retrieval --> Data
Data -.-> Inference
style Inference stroke-dasharray: 5 5
Request flow: the client talks to a single FastAPI gateway, which fans out to auth, document processing, and retrieval. All three sit on a shared PostgreSQL + pgvector store for persistence and a Redis layer for caching and queues. The whole stack runs as Docker Compose services today; the inference/LLM layer is the next piece going on top.
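The README does not say how the Redis cache is wired in front of PostgreSQL. One common arrangement is the cache-aside pattern, sketched below as an assumption rather than as Quarry's actual code. `TTLCache` is a hypothetical in-memory stand-in for Redis get/set-with-expiry, and `cached_lookup` and `load` are illustrative names.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Hypothetical in-memory stand-in for Redis GET / SET-with-expiry."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._items: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._items[key]  # expired entries behave like missing keys
            return None
        return value

    def set(self, key: str, value: str, ttl_seconds: float) -> None:
        self._items[key] = (self._clock() + ttl_seconds, value)


def cached_lookup(
    cache: TTLCache,
    key: str,
    load: Callable[[str], str],
    ttl_seconds: float = 60.0,
) -> str:
    """Cache-aside: try the cache, fall back to the store, then fill the cache."""
    hit = cache.get(key)
    if hit is not None:
        return hit
    value = load(key)  # e.g. a query against the persistent store
    cache.set(key, value, ttl_seconds)
    return value
```

Here `load` would be whatever function reads from the persistent store; the first call for a key pays the full lookup cost and later calls within the TTL are served from the cache.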
Quarry evolves as a single platform across 15 versions, grouped into four phases.
flowchart LR
subgraph P1["Foundation"]
v1["v1<br/>Core Backend"]
v2["v2<br/>Production Backend"]
end
subgraph P2["Intelligence"]
v3["v3<br/>LLM Layer"]
v4["v4<br/>Production RAG"]
v5["v5<br/>Agents"]
v6["v6<br/>Evaluation"]
end
subgraph P3["Scale"]
v7["v7–v11<br/>Learning · Research ·<br/>Repo Intel · Retrieval · Memory"]
v12["v12–v13<br/>Guardrails ·<br/>Cloud Ops"]
end
subgraph P4["Platform"]
v14["v14<br/>Observability"]
v15["v15<br/>Inference Platform"]
end
v1 --> v2 --> v3 --> v4 --> v5 --> v6 --> v7 --> v12 --> v14 --> v15
classDef done fill:#2ea44f,stroke:#22863a,color:#fff
classDef active fill:#fb8500,stroke:#d97706,color:#fff
classDef planned fill:#eee,stroke:#bbb,color:#666
class v1,v2 done
class v3 active
class v4,v5,v6,v7,v12,v14,v15 planned
Legend: green = done · orange = in progress · grey = planned
| Phase | Version | Focus | Status |
|---|---|---|---|
| Foundation | v1 | Auth, PostgreSQL, PDF parsing, embeddings, retrieval | Complete |
| Foundation | v2 | Redis, pgvector, Docker, multi-container, health checks | Complete |
| Intelligence | v3 | LLM integration, streaming, provider abstraction | In progress |
| Intelligence | v4 | Hybrid search, re-ranking, advanced RAG | Planned |
| Intelligence | v5 | Multi-agent orchestration, tool use | Planned |
| Intelligence | v6 | RAG / agent evaluation framework | Planned |
| Scale | v7–v11 | Fine-tuning, research pipelines, code search, graph RAG, memory | Planned |
| Scale | v12–v13 | Guardrails, PII scrubbing, Kubernetes, CI/CD, Terraform | Planned |
| Platform | v14 | OpenTelemetry, structured logging, tracing | Planned |
| Platform | v15 | Custom vLLM serving, inference optimization | Planned |
- Authentication: JWT-based registration and login
- Document processing: PDF upload, parsing, and chunking via PyMuPDF
- Embeddings: automated vector generation via Sentence Transformers
- Semantic retrieval: vector search over stored documents
- Persistence: PostgreSQL via SQLAlchemy, with pgvector for embeddings
- Infrastructure: fully containerized, with API, Postgres, and Redis as separate services
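The chunk, embed, and search steps in the list above can be illustrated with a small dependency-free sketch. This is not Quarry's implementation: `embed` is a toy hashing embedder standing in for a Sentence Transformers model, the in-memory `search` stands in for a pgvector similarity query, and the function names, chunk sizes, and `DIM` are all assumptions made for the example.

```python
import hashlib
import math
import re
from typing import List, Tuple

DIM = 64  # toy dimensionality; real embedding models use far more


def chunk_text(text: str, size: int = 40, overlap: int = 10) -> List[str]:
    """Split text into overlapping windows of `size` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    words = text.split()
    step = size - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text: str) -> List[float]:
    """Toy bag-of-words hashing embedding, L2-normalised (stand-in for a model)."""
    vec = [0.0] * DIM
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def cosine(a: List[float], b: List[float]) -> float:
    """Dot product; equals cosine similarity because inputs are unit length."""
    return sum(x * y for x, y in zip(a, b))


def search(query: str, chunks: List[str], top_k: int = 3) -> List[Tuple[float, str]]:
    """Rank chunks by similarity to the query, best first."""
    q = embed(query)
    scored = [(cosine(q, embed(c)), c) for c in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]
```

In a setup like the one described, the vectors would be computed once at upload time and stored next to each chunk, so only the query needs embedding at search time.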
Health check
GET /health
{
  "status": "healthy",
  "database": "connected",
  "redis": "connected"
}

| Layer | Technology |
|---|---|
| Language | Python |
| API framework | FastAPI |
| Database | PostgreSQL |
| Vector search | pgvector |
| Cache / queues | Redis |
| ORM | SQLAlchemy |
| Validation | Pydantic |
| Auth | JWT |
| Document parsing | PyMuPDF |
| Embeddings | Sentence Transformers |
| Containerization | Docker / Docker Compose |
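The table lists JWT for auth without naming a library or signing algorithm. To show what such a token contains, here is a minimal standard-library sketch of HS256 signing and verification. It is an illustration under assumptions, not Quarry's auth code: `sign_token`, `verify_token`, and the `role` claim are hypothetical, and a maintained JWT library is the usual choice in production.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Any, Dict, Optional


def _b64url(data: bytes) -> str:
    """Base64url without padding, as used in JWT segments."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(claims: Dict[str, Any], secret: bytes) -> str:
    """Build header.payload.signature with an HMAC-SHA256 signature."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":"), sort_keys=True).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_token(token: str, secret: bytes, now: Optional[float] = None) -> Dict[str, Any]:
    """Check signature, algorithm, and expiry; return the claims or raise ValueError."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
    except ValueError:
        raise ValueError("malformed token") from None
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
    ).digest()
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    claims = json.loads(_b64url_decode(payload_b64))
    current = time.time() if now is None else now
    if "exp" in claims and current >= claims["exp"]:
        raise ValueError("token expired")
    return claims
```

A login handler would call something like `sign_token({"sub": user_id, "role": "admin", "exp": expiry}, secret)`, and protected routes would call `verify_token` on the bearer token before reading the claims.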
quarry/
├── app/
│   ├── api/          # Route handlers and endpoints
│   ├── core/         # Config, security, settings
│   ├── db/           # Sessions and migrations
│   ├── models/       # SQLAlchemy models
│   ├── schemas/      # Pydantic schemas
│   └── services/     # Business logic (auth, docs, vectors)
├── docs/             # Architecture notes
├── scripts/          # Setup and DB utilities
├── tests/            # Unit and integration tests
├── .env.example
├── requirements.txt
└── README.md
docker compose up -d   # start the full stack
docker ps              # view running services
docker compose down    # stop everything

API docs are served at http://localhost:8000/docs
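Once the stack is up, the `GET /health` response shown earlier can be checked from a script. The sketch below assumes the health endpoint is served on the same host and port as the API docs (`http://localhost:8000/health`); `fetch_health` and `failing_components` are illustrative helper names, not part of Quarry.

```python
import json
import urllib.request
from typing import Any, Dict, List

# Assumption: /health lives on the same host and port as the /docs URL.
HEALTH_URL = "http://localhost:8000/health"

# Healthy values taken from the sample response in this README.
EXPECTED = {"status": "healthy", "database": "connected", "redis": "connected"}


def failing_components(payload: Dict[str, Any]) -> List[str]:
    """Return the fields whose value differs from the healthy sample."""
    return [name for name, want in EXPECTED.items() if payload.get(name) != want]


def fetch_health(url: str = HEALTH_URL, timeout: float = 3.0) -> Dict[str, Any]:
    """GET the health endpoint and decode its JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `failing_components(fetch_health())` after `docker compose up -d` returns an empty list when the API, database, and Redis all report healthy, and the names of the failing fields otherwise.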
git clone https://github.com/x2ankit/quarry.git
cd quarry
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -r requirements.txt
cp .env.example .env # then configure your settings
alembic upgrade head # requires Postgres running locally or via Docker
uvicorn app.main:app --reload

This project is open source. See LICENSE for details.