financebench

Here are 7 public repositories matching this topic...

VectifyAI / Mafin2.5-FinanceBench

📈 FinanceBench evaluation of Mafin 2.5 (Powered by PageIndex)

financebench

Updated Oct 20, 2025
Python

avnlp / rag-pipelines

Advanced RAG pipelines for medical (HealthBench, MedCaseReasoning, MetaMedQA, PubMedQA) and financial (FinanceBench, Earnings Calls) QA. LangGraph orchestration + BAML structured generation, Milvus hybrid search (dense + BM25 + RRF), three-layer metadata enrichment, Contextual AI instruction-following reranker, and DeepEval evaluation.

pubmed unstructured rag baml milvus earnings-calls contextual-ai llm langgraph rag-pipeline agentic-rag deepeval financebench healthbench

Updated May 26, 2026
Python
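
The "dense + BM25 + RRF" hybrid search mentioned in the description above can be illustrated with a minimal sketch of reciprocal rank fusion. This is not the repository's code: the function name `reciprocal_rank_fusion`, the document IDs, and the constant `k=60` (a commonly used default) are assumptions made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed, commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from a dense retriever and a BM25 retriever.
dense = ["doc_a", "doc_b", "doc_c"]
bm25 = ["doc_b", "doc_d", "doc_a"]
fused = reciprocal_rank_fusion([dense, bm25])
print(fused)
```

Documents that rank well in both lists rise to the top, without needing to put dense similarity scores and BM25 scores on a common scale.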

Rishabhmannu / financebench-rag-agent

Multi-agent LangGraph RAG for financial Q&A: 72.7% on FinanceBench under a κ=0.932 calibrated judge. RBAC at the vector layer, multi-party HITL on high-stakes answers, self-hosted LLM observability. Install with: pip install financebench-rag-agent

evaluation multi-agent-systems rag qdrant-vector-database retrieval-augmented-generation langfuse langgraph agentic-rag ragas-evaluation financebench

Updated May 28, 2026
Python

Mariam500000 / PageIndex

🔍 Empower efficient retrieval with PageIndex, a reasoning-based system that eliminates the need for vector databases and chunking for human-like results.

jquery agent pagination ai retrieval query-builder pager reasoning pageindex rag pageindexer qdrant llm vector-db agentic-ai financebench context-engineering zxb

Updated May 28, 2026
Jupyter Notebook

MGanayim / financebench-rag

An end-to-end RAG pipeline that answers questions about SEC 10-K filings with grounded citations instead of hallucinated numbers. Built on FinanceBench: indexing with FAISS + BGE embeddings, generation with Llama-3.3-70B, three-axis evaluation (correctness / faithfulness / page-hit@k), improvement cycles.

rag llm langchain retrieval-augmented-generation faiss-vector-database ragas financebench

Updated Apr 29, 2026
Jupyter Notebook

moshe19909090 / financebench-rag

RAG pipeline for FinanceBench with retrieval, evaluation, improvement cycles, and chunk-size experiments.

nlp evaluation embeddings faiss rag llm langchain retrieval-augmented-generation financebench

Updated Apr 27, 2026
Jupyter Notebook

FMFigueroa / financebench-rag-eval

Rigorous evaluation of contextual retrieval techniques on FinanceBench: comparing 5 embedders × 4 chunking strategies with bootstrapped confidence intervals on FinMTEB and FinanceBench.

python benchmarking natural-language-processing information-retrieval pytorch embeddings semantic-search rag vector-search huggingface sentence-transformers retrieval-augmented-generation llm-evaluation contextual-retrieval late-chunking finance-nlp financebench finmteb

Updated May 12, 2026
Jupyter Notebook
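
The "bootstrapped confidence intervals" mentioned in the description above can be sketched as a percentile bootstrap over per-question outcomes. This is an illustrative sketch only, not the repository's method: the function name `bootstrap_ci`, the resample count, the seed, and the example data are all assumptions.

```python
import random


def bootstrap_ci(outcomes, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `outcomes`.

    `outcomes` is a list of per-question scores (e.g. 1 = correct, 0 = wrong).
    Returns (lower, upper) bounds of the (1 - alpha) interval.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(outcomes)
    # Resample the questions with replacement and record each resample's mean.
    means = sorted(
        sum(rng.choices(outcomes, k=n)) / n for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper


# Hypothetical data: 70 correct answers out of 100 questions.
outcomes = [1] * 70 + [0] * 30
lower, upper = bootstrap_ci(outcomes)
print(f"accuracy = {sum(outcomes) / len(outcomes):.2f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
```

Reporting an interval rather than a single accuracy figure makes it easier to judge whether a difference between two embedders or chunking strategies exceeds sampling noise on a benchmark of limited size.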
