
feat(embedding): configurable local model, cache dir, and HF mirror | 嵌入模型可配置 #225

Open
mechanic-Q wants to merge 1 commit into rohitg00:main from mechanic-Q:feature/configurable-embedding

Conversation


mechanic-Q commented May 2, 2026

Summary

Make the local embedding provider configurable via environment variables, enabling multilingual embedding models (bge-m3, multilingual-e5, etc.) and supporting users behind firewalls or in regions with restricted HuggingFace access.


Motivation

The local embedding provider was hardcoded to all-MiniLM-L6-v2 (384-dim English-only). Users with non-English content (Chinese, Japanese, Korean, multilingual codebases) had no way to use a multilingual embedding model without forking the codebase. Additionally, users behind corporate firewalls or in regions with slow/unreachable HuggingFace access had no way to configure a mirror endpoint.


Changes

Three new environment variables (all optional, defaults unchanged):

  • AGENTMEMORY_LOCAL_EMBEDDING_MODEL — model name (default: Xenova/all-MiniLM-L6-v2). Set to Xenova/bge-m3 for multilingual 1024-dim embeddings supporting 100+ languages including Chinese.
  • AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR — custom cache directory for pre-downloaded models (useful in air-gapped/offline environments).
  • HF_ENDPOINT — HuggingFace mirror endpoint (e.g. https://hf-mirror.com).

Auto-detects dimensions from a known list; falls back to 384 for unknown models.
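For illustration, the lookup described above can be sketched roughly as follows (the model names and dimensions mirror the table in this PR's diff; `resolveEmbeddingConfig` is an illustrative helper name, not a function from the codebase):

```typescript
// Known output dimensions for common Xenova models (from this PR's lookup table).
const KNOWN_DIMS: Record<string, number> = {
  "Xenova/all-MiniLM-L6-v2": 384,
  "Xenova/bge-m3": 1024,
  "Xenova/multilingual-e5-large": 1024,
  "Xenova/bge-small-en-v1.5": 384,
  "Xenova/bge-base-en-v1.5": 768,
  "Xenova/bge-large-en-v1.5": 1024,
};

// Resolve the configured model and its dimensions; unknown models fall back to 384.
function resolveEmbeddingConfig(env: Record<string, string | undefined>) {
  const modelName =
    env["AGENTMEMORY_LOCAL_EMBEDDING_MODEL"] || "Xenova/all-MiniLM-L6-v2";
  return { modelName, dimensions: KNOWN_DIMS[modelName] ?? 384 };
}
```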

Usage

# Multilingual Chinese support via bge-m3
AGENTMEMORY_LOCAL_EMBEDDING_MODEL=Xenova/bge-m3

# Offline/pre-downloaded models
AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR=/opt/models/transformers

# HF mirror for China/firewall users
HF_ENDPOINT=https://hf-mirror.com

Backwards Compatibility

All env vars are optional. Default behavior unchanged — all-MiniLM-L6-v2 with default HF endpoint.

Summary by CodeRabbit

  • New Features
    • Embedding provider now supports dynamic model and dimension configuration through environment variables instead of fixed defaults
    • Users can specify a custom local model cache directory for embedding models
    • Added support for Hugging Face endpoint configuration

…F mirror

Add three environment variables to the LocalEmbeddingProvider:

- AGENTMEMORY_LOCAL_EMBEDDING_MODEL: override the default model
  (default: Xenova/all-MiniLM-L6-v2). Set to Xenova/bge-m3 or
  Xenova/multilingual-e5-large for multilingual support.
- AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR: custom cache directory for
  pre-downloaded models (useful in offline/air-gapped environments).
- HF_ENDPOINT: HuggingFace mirror endpoint for users behind firewalls
  or in regions with restricted HF access.

Auto-detects model dimensions from a known list; falls back to 384.


vercel Bot commented May 2, 2026

@mechanic-Q is attempting to deploy a commit to rohitg00's projects Team on Vercel.

A member of the Team first needs to authorize it.


coderabbitai Bot commented May 2, 2026

📝 Walkthrough


LocalEmbeddingProvider now reads embedding model name and dimensions from environment variables instead of using hard-coded defaults, supporting custom Hugging Face models, optional local cache directories, and remote host endpoints via configuration.

Changes

Dynamic Embedding Model Configuration

  • Configuration & Initialization (src/providers/embedding/local.ts, constructor): a constructor is added that reads AGENTMEMORY_LOCAL_EMBEDDING_MODEL and AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR, and initializes dimensions via a model-to-dimension lookup table (default 384).
  • Model Discovery (src/providers/embedding/local.ts, class properties): the dimensions property changes from a hard-coded 384 to an instance variable set in the constructor from the model-name lookup.
  • Extractor Setup (src/providers/embedding/local.ts, getExtractor() method): configures the @xenova/transformers environment with an optional local cache path and remote host; the pipeline call uses the dynamic this.modelName instead of a hard-coded model string.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes


Poem

🐰 Flex thy models with a whispered word,
No more chained to one embedding path—
The rabbit hops through Hugging Face,
With custom caches and remote abode,
Config blooms where hard-code once stood.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
  • Description Check: ✅ Passed. Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title accurately summarizes the main changes: making the local embedding model configurable via environment variables, with support for custom cache directories and HuggingFace mirrors.
  • Docstring Coverage: ✅ Passed. No functions found in the changed files to evaluate; docstring coverage check skipped.
  • Linked Issues Check: ✅ Passed. Check skipped; no linked issues were found for this pull request.
  • Out of Scope Changes Check: ✅ Passed. Check skipped; no linked issues were found for this pull request.




coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
src/providers/embedding/local.ts (1)

25-33: ⚡ Quick win

Move KNOWN_DIMS to module scope

Defining KNOWN_DIMS inside the constructor allocates a fresh object on every instantiation. It should be a module-level constant, which matches the MODEL_DIMENSIONS pattern already established in OpenAIEmbeddingProvider (see src/providers/embedding/openai.ts lines 7–17).

♻️ Proposed refactor
+const KNOWN_DIMS: Record<string, number> = {
+  "Xenova/all-MiniLM-L6-v2": 384,
+  "Xenova/bge-m3": 1024,
+  "Xenova/multilingual-e5-large": 1024,
+  "Xenova/bge-small-en-v1.5": 384,
+  "Xenova/bge-base-en-v1.5": 768,
+  "Xenova/bge-large-en-v1.5": 1024,
+};
+
 export class LocalEmbeddingProvider implements EmbeddingProvider {
   ...
   constructor() {
     this.modelName = process.env["AGENTMEMORY_LOCAL_EMBEDDING_MODEL"] || "Xenova/all-MiniLM-L6-v2";
     this.modelDir = process.env["AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR"] || undefined;
-
-    // Known dimensions for common models. Falls back to 384 (MiniLM default).
-    const KNOWN_DIMS: Record<string, number> = {
-      "Xenova/all-MiniLM-L6-v2": 384,
-      "Xenova/bge-m3": 1024,
-      "Xenova/multilingual-e5-large": 1024,
-      "Xenova/bge-small-en-v1.5": 384,
-      "Xenova/bge-base-en-v1.5": 768,
-      "Xenova/bge-large-en-v1.5": 1024,
-    };
     this.dimensions = KNOWN_DIMS[this.modelName] ?? 384;
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/providers/embedding/local.ts` around lines 25 - 33, KNOWN_DIMS is created
inside the constructor which recreates the map on every instantiation; move the
KNOWN_DIMS object to module scope as a top-level constant (matching the
MODEL_DIMENSIONS pattern used in OpenAIEmbeddingProvider) and update the
constructor to compute this.dimensions = MODULE_LEVEL_KNOWN_DIMS[this.modelName]
|| 384 (or keep the same fallback) so the mapping is shared across instances and
not reallocated per new LocalEmbeddingProvider; keep the constant name
consistent with the project style (e.g., MODEL_DIMENSIONS or KNOWN_DIMS) and
reference it from the constructor where modelName and this.dimensions are set.

📥 Commits

Reviewing files that changed from the base of the PR and between 94fc119 and a8f9d89.

📒 Files selected for processing (1)
  • src/providers/embedding/local.ts


// Allow custom model cache directory (e.g. for pre-downloaded models).
if (this.modelDir) {
  transformers.env.localModelPath = this.modelDir + "/";


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

localModelPath may produce a double trailing slash

this.modelDir + "/" unconditionally appends a slash. If a user sets AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR with a trailing slash (e.g. /models/), the result is /models//, which can break path resolution in @xenova/transformers. The same slash-normalization pattern already used for remoteHost (lines 73–75) should be applied here too.

The @xenova/transformers docs confirm env.localModelPath should be set to a path that ends with exactly one trailing slash (e.g., '/path/to/models/').

🐛 Proposed fix
-      transformers.env.localModelPath = this.modelDir + "/";
+      transformers.env.localModelPath = this.modelDir.endsWith("/")
+        ? this.modelDir
+        : this.modelDir + "/";
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/providers/embedding/local.ts` at line 66, Normalize the local model
directory before assigning to transformers.env.localModelPath so it always ends
with exactly one trailing slash (like the remoteHost normalization); instead of
unconditionally using this.modelDir + "/", strip any existing trailing slashes
from this.modelDir and then append a single "/" so
transformers.env.localModelPath is guaranteed to be "/path/to/models/"; update
the assignment where transformers.env.localModelPath is set and mirror the same
trimming logic used for remoteHost normalization in this file.
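A sketch of the normalization this comment asks for, stripping any run of trailing slashes before appending exactly one (`withTrailingSlash` is an illustrative helper name, not code from the PR):

```typescript
// Ensure a directory path ends with exactly one trailing slash,
// regardless of how many (if any) the user supplied.
function withTrailingSlash(dir: string): string {
  return dir.replace(/\/+$/, "") + "/";
}

// Intended use at the assignment site, assuming this.modelDir is set:
// transformers.env.localModelPath = withTrailingSlash(this.modelDir);
```

This also handles the multi-slash case (`/models//`), which the simpler `endsWith("/")` check in the proposed fix above would pass through unchanged.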

rohitg00 (Owner) commented May 8, 2026

Reviewed — useful addition for users with pre-downloaded models, behind firewalls, or running multilingual / domain-specific embeddings.

What I like:

  • KNOWN_DIMS map for popular models (MiniLM 384, BGE-M3 1024, multilingual-e5-large 1024, BGE small/base/large EN) is the right shape — falls back to 384 for unknowns, which won't crash the index but should warn the user.
  • HF_ENDPOINT for HuggingFace mirror is the standard env var — matches transformers library convention.
  • AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR lets users pre-stage the model in a known cache dir.

One real concern + a couple of minor items:

1. Dimension mismatch on unknown models is a silent footgun.

If someone sets AGENTMEMORY_LOCAL_EMBEDDING_MODEL=Xenova/some-other-model that's actually 768-dim, but KNOWN_DIMS falls back to 384, the embed call will still produce a 768-dim vector at runtime — and you'll either crash on first index write or silently corrupt the vector index by storing mismatched-dimension vectors.

Two options:

  • Preferred: probe a single embed at first-extractor-init, set this.dimensions from the actual output length, throw if it doesn't match KNOWN_DIMS for known models. Costs one initial embed call.
  • Lighter: print a console.warn when the model is not in KNOWN_DIMS, instructing the user to set AGENTMEMORY_LOCAL_EMBEDDING_DIMENSIONS explicitly. Document the env var. Less safe but cheaper.
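The lighter option could look roughly like this; note that `AGENTMEMORY_LOCAL_EMBEDDING_DIMENSIONS` is the reviewer's suggested env var, not something the PR currently implements, and `resolveDimensions` is an illustrative name:

```typescript
// Resolve embedding dimensions for a model: use the known table if possible,
// then an explicit env override, then warn and fall back to 384.
function resolveDimensions(
  modelName: string,
  knownDims: Record<string, number>,
  env: Record<string, string | undefined>,
): number {
  const known = knownDims[modelName];
  if (known !== undefined) return known;

  // Explicit override wins for models outside the lookup table.
  const override = Number(env["AGENTMEMORY_LOCAL_EMBEDDING_DIMENSIONS"]);
  if (Number.isFinite(override) && override > 0) return override;

  console.warn(
    `Unknown embedding model "${modelName}"; assuming 384 dimensions. ` +
      `Set AGENTMEMORY_LOCAL_EMBEDDING_DIMENSIONS if this is wrong.`,
  );
  return 384;
}
```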

2. transformers.env typed as Record<string, unknown>.

Mutating an unknown value works at runtime but loses any IDE help. Worth a small ambient type for the bits you actually touch (localModelPath, cacheDir, remoteHost).
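A minimal shape for such a type, covering only the three fields the provider touches (the interface name is illustrative; the field names are the ones used in this PR against the @xenova/transformers env object):

```typescript
// Narrow view of the @xenova/transformers `env` object, declaring only
// the fields this provider mutates.
interface TransformersEnv {
  localModelPath?: string;
  cacheDir?: string;
  remoteHost?: string;
}

// Usage sketch: cast once, then mutate with IDE help.
// const env = transformers.env as TransformersEnv;
// env.localModelPath = "/opt/models/";
```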

3. README env block.

Please add the three new env vars (AGENTMEMORY_LOCAL_EMBEDDING_MODEL, AGENTMEMORY_LOCAL_EMBEDDING_MODEL_DIR, HF_ENDPOINT) so they're discoverable.

Once dimension-mismatch handling lands (either probe or warn+document), this is good to merge.
