MatCreator is a skill-based, agentic platform for computational materials science tasks, with a focus on Machine Learning Force Field (MLFF) generation and application. It evolves with its users by accumulating experience and creating new skills.
# Create and activate an environment with uv (optional but recommended)
pip install uv
uv venv .venv --python 3.12
source .venv/bin/activate
uv pip install -e .
After installation, tell the CLI where the project root lives:
# Run from the repo directory
matcreator init .
# Or specify an absolute path
matcreator init /path/to/PFD-Agent
This writes ~/.matcreator/config.yaml with the project_root path, so the CLI can locate the agents/ directory even when installed into site-packages.
You can also set the MATCREATOR environment variable instead.
This project includes a frontend interface. The frontend depends on node, npm, and the local vite dev server.
Use NVM to manage Node.js.
# Install nvm
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.1/install.sh | bash
source ~/.bashrc
# Verify installation:
nvm --version
nvm install 22
nvm use 22
nvm alias default 22
Verify the installation:
node -v
npm -v
cd web/vite-frontend
npm install
npm run dev
We highly recommend using WSL (Windows Subsystem for Linux) with uv to deploy local development virtual environments. WSL provides a native Linux environment seamlessly integrated with Windows, enabling access to Linux tools. As a fast, lightweight Python package manager, uv creates isolated environments to avoid dependency conflicts, ideal for Python applications like Streamlit.
In WSL, the system Python is managed by apt and is a core system component. Under PEP 668, the distribution marks it as externally managed, so pip refuses to install packages into it directly, which protects system dependencies. pipx suits command-line tools like uv: it installs each tool into its own isolated virtual environment and exposes it globally without polluting the system Python.
Run these commands in the WSL terminal to install uv via pipx:
# 1. Install pipx
sudo apt update && sudo apt install pipx -y
# 2. Initialize pipx (add to system PATH)
pipx ensurepath
# 3. Restart terminal, then install uv
pipx install uv
Before the first run, create the agents/MatCreator/.env file and configure your model API credentials (additional environment variables may be required for some functionalities).
touch agents/MatCreator/.env
Example .env content:
LLM_MODEL="MODEL_TYPE"
LLM_API_KEY="API_KEYS"
LLM_BASE_URL="BASE_URL"
EMBEDDING_MODEL="EMBEDDING_MODEL_TYPE"
# SKILL_RELATED_ENV
CGCNN_ROOT=user/cgcnn # CGCNN project directory
MATTERGEN_ENV=user/../.mattergen # MATTERGEN virtual environment
TAVILY_API_KEY=""
BOHRIUM_MAT_IMAGE="" # MATTERGEN and MATTERSIM image
BOHRIUM_MAT_MACHINE="" # MATTERGEN and MATTERSIM machine
eval_reference="user/../reference_MP2020correction.gz"
mattersim_model="user/../mattersim-v1.0.0-5M.pth"
mattergen_model="user/../mattergen/checkpoints"
BOHRIUM_VASP_IMAGE=""
BOHRIUM_VASP_MACHINE=""
...
If you prefer a different LLM model for a sub-agent, you can override the default settings in the .env file inside that sub-agent's directory.
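The override behaviour can be pictured as a simple layering of two .env files. The sketch below is a minimal illustration under stated assumptions: the parser and both function names are hypothetical, it handles only plain `KEY=VALUE` lines with optional quotes and trailing `#` comments, and it assumes sub-agent values replace the main agent's values key by key. MatCreator's real loader may differ.

```python
from pathlib import Path


def parse_env_file(path: Path) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks, comments, and lines without '='."""
    values: dict[str, str] = {}
    if not path.is_file():
        return values
    for raw in path.read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        if value[:1] in ("'", '"'):
            # Quoted value: keep the text between the matching quotes.
            quote = value[0]
            end = value.find(quote, 1)
            value = value[1:end] if end != -1 else value[1:]
        else:
            # Unquoted value: drop a trailing inline comment.
            value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values


def effective_settings(main_env: Path, sub_agent_env: Path) -> dict[str, str]:
    """Layer the two files; sub-agent values override the main agent's."""
    return {**parse_env_file(main_env), **parse_env_file(sub_agent_env)}
```

With this layering, a sub-agent .env that sets only LLM_MODEL would inherit LLM_API_KEY and LLM_BASE_URL from agents/MatCreator/.env.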
MatCreator ships a modern web UI with graph visualization, artifact upload/download, structure visualization, and scientific plotting. Start all three services (ADK API server, FastAPI middle layer, and Vite frontend) with a single script:
bash script/start_matcreator.sh
This starts:
- ADK API server on http://localhost:8000
- FastAPI middle layer on http://localhost:8001
- Vite frontend on http://localhost:5173
Logs are written to logs/{api-server,web-main,vite}.log. Press Ctrl+C to stop all services.
No frontend build step is needed — the Vite dev server runs directly with hot-reload.
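If one of the services does not come up, a quick port probe can narrow down which log to read. The helper below is a hypothetical convenience sketch and not part of MatCreator; it only checks whether something accepts TCP connections on the three ports listed above.

```python
import socket

# Ports taken from the service list above.
SERVICES = {
    "ADK API server": 8000,
    "FastAPI middle layer": 8001,
    "Vite frontend": 5173,
}


def is_listening(port: int, host: str = "localhost", timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def service_status(services: dict[str, int] = SERVICES) -> dict[str, bool]:
    """Map each service name to whether its port accepts connections."""
    return {name: is_listening(port) for name, port in services.items()}
```

For example, `service_status()` returns a dictionary such as `{"ADK API server": True, ...}`; a `False` entry points at the matching file under logs/.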
Run the agent on a single prompt without starting any server:
# Inline prompt
matcreator run -p "Build a silicon FCC structure"
# Prompt from a file
matcreator run -f prompt.txt
# Save the answer to a file
matcreator run -p "Build a silicon FCC structure" -o result.txt
# Full structured JSON output (includes turn count, duration, etc.)
matcreator run -p "Build a silicon FCC structure" --output-format json -o result.json
# Override the workspace directory
matcreator run --workspace /data/my_workspace -p "Build a silicon FCC structure"
# or via environment variable
MATCLAW_WORKSPACE=/data/my_workspace matcreator run -p "Build a silicon FCC structure"
Each run creates a session directory under <workspace>/sessions/<session-id>/ where any files produced by the agent are saved.
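To inspect what a run produced, the session layout can be walked with the standard library. This is a small sketch with hypothetical helper names; it assumes only the <workspace>/sessions/<session-id>/ layout stated above and uses directory modification time as a stand-in for "most recent", which may not match how session IDs are actually ordered.

```python
from pathlib import Path


def list_sessions(workspace: Path) -> list[Path]:
    """Return session directories under <workspace>/sessions, newest first."""
    sessions_dir = workspace / "sessions"
    if not sessions_dir.is_dir():
        return []
    sessions = [p for p in sessions_dir.iterdir() if p.is_dir()]
    # Assumption: modification time approximates run order.
    return sorted(sessions, key=lambda p: p.stat().st_mtime, reverse=True)


def session_files(session_dir: Path) -> list[Path]:
    """Return every file saved in one session directory, as relative paths."""
    return sorted(
        p.relative_to(session_dir) for p in session_dir.rglob("*") if p.is_file()
    )
```

For example, `session_files(list_sessions(Path("/data/my_workspace"))[0])` lists the files written by the latest run in that workspace.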
matcreator run web
This sets up the MatCreator agent network through the default ADK web server. You can tune the LLM model and communication settings for the agents.
The default agent workspace is located at agents/MatCreator/.workspace, where skills, memory, etc., are stored.
MatCreator follows a modular design principle: skills are text files that define metadata, procedures and workflows. Some skills may require specialized tools (configured by $PROJECT/agents/MatCreator/tools.py), and some of them, e.g. tools for DFT calculations, may be hosted on MCP servers.
The default domain-based computational materials datasets are located at database/domain_datasets.tar.gz, which must be extracted before the database skill can use them (see tools/database/README.md).
Check the README.md in skills/$SKILL before using a skill.
Note (transitioning from MCP servers to skills): MatCreator is progressively moving tool logic out of dedicated MCP servers and into self-contained skills. A skill bundles its own workflow instructions, helper scripts, and configuration alongside the .md file, so it can be run with only a general-purpose shell/Python tool rather than a running server process. If a capability you previously used via an MCP server is no longer listed under tools/, check agents/MatCreator/knowledge/skills/; it may have been migrated to a skill. MCP servers are retained only for tools that genuinely require a persistent service (e.g. a remote job scheduler or a database backend).
For example, to set up an MCP server for the ABACUS DFT software, run its server script with uv:
cd tools/abacus
uv sync
uv run server.py --port 50001
You may need to set environment variables specific to the MCP server in tools/$TOOLNAME/.env; these are documented in tools/$TOOLNAME/README.md.
Skills are Markdown files with a YAML frontmatter block (declaring name, description, tools, and dependent_skills) followed by a plain-text instruction body. The active loader discovers any workspace directory that contains a SKILL.md file, including nested directories such as skills/mattergen/mattergen_generation/SKILL.md. MatCreator loads skills from two locations in order:
- Built-in skills: shipped with the package under agents/MatCreator/knowledge/skills/. Skills can be placed as flat <name>.md files or in a subdirectory <name>/<name>.md; the subdirectory form takes precedence.
- Workspace overlay: your personal skills under $MATCLAW_WORKSPACE/skills/ (defaults to .workspace/ in the project root). Any skill here with the same name overrides the built-in version.
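The two-location lookup can be sketched as a dictionary merge in which the workspace overlay is applied last. This is an illustrative sketch rather than the actual loader: the function names are hypothetical, only the SKILL.md directory form is handled, and the skill name is assumed to be the name of the directory that contains SKILL.md.

```python
from pathlib import Path


def discover_skills(root: Path) -> dict[str, Path]:
    """Map skill name -> SKILL.md path for every directory under root that has one."""
    if not root.is_dir():
        return {}
    # Assumption: the skill name is the name of the directory holding SKILL.md.
    return {p.parent.name: p for p in sorted(root.rglob("SKILL.md"))}


def resolve_skills(builtin_root: Path, workspace_root: Path) -> dict[str, Path]:
    """Load built-in skills first, then let workspace skills override by name."""
    skills = discover_skills(builtin_root)
    skills.update(discover_skills(workspace_root))  # workspace overlay wins
    return skills
```

Because `rglob` descends into subdirectories, a nested file such as skills/mattergen/mattergen_generation/SKILL.md is found under the name `mattergen_generation`.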
To customize a skill manually, copy its skill directory into your workspace skills/ directory and edit the contained SKILL.md. To add a new skill, create a new skills/<name>/SKILL.md file following the same frontmatter format.
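As a rough picture of what a new skill file contains, the sketch below writes a SKILL.md with the four frontmatter fields named above. The helper names are hypothetical, and the value types (strings for name and description, lists for tools and dependent_skills) are assumptions; check an existing built-in skill for the exact format before relying on this.

```python
import json
from pathlib import Path


def render_skill(name: str, description: str, tools: list[str],
                 dependent_skills: list[str], body: str) -> str:
    """Render frontmatter plus instruction body as SKILL.md text."""
    # JSON strings and lists are also valid YAML, so json.dumps handles quoting.
    frontmatter = "\n".join([
        "---",
        f"name: {json.dumps(name)}",
        f"description: {json.dumps(description)}",
        f"tools: {json.dumps(tools)}",
        f"dependent_skills: {json.dumps(dependent_skills)}",
        "---",
    ])
    return f"{frontmatter}\n\n{body.strip()}\n"


def scaffold_skill(skills_dir: Path, name: str, **fields) -> Path:
    """Create skills_dir/<name>/SKILL.md, refusing to overwrite an existing skill."""
    target = skills_dir / name / "SKILL.md"
    if target.exists():
        raise FileExistsError(f"{target} already exists")
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(render_skill(name, **fields))
    return target
```

Refusing to overwrite is a deliberate choice here: editing an existing skill is a different operation from scaffolding a new one, and keeping them apart avoids clobbering hand-edited files.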
The agent can also create and update skills on its own. During a session, the thinking agent can call built-in tools to scaffold a new skill file, write updated content to an existing one, or list what skills are currently available — letting the system accumulate knowledge automatically over time.
MatCreator organizes its knowledge as a graph of nodes and edges, stored in two separate SQLite databases:
- skill_graph.db: developer-maintained, immutable nodes seeded from the skills directory.
- memory_graph.db: agent-learned nodes written during sessions; subject to synthesis and pruning.
In the skill graph, each node belongs to one of three categories:
| Type | Description |
|---|---|
| Concept | Foundational domain knowledge and reference material (e.g. DFT theory, force-field conventions). Used as planning guidance rather than executable steps. |
| Skill | A self-contained, executable capability backed by a SKILL.md file (e.g. vasp_relaxation, mattergen_generation). The agent calls these during execution. |
| Workflow | A higher-level template that orchestrates multiple skills into a reusable sequence (e.g. a full MLFF training pipeline). |
Edges capture relationships between nodes (depends_on, belongs_to, relates_to). Semantic embeddings on every node enable vector search, so the agent can retrieve relevant skills and concepts by meaning rather than exact name.
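Retrieval by meaning usually comes down to ranking nodes by the similarity between their embedding and the query's embedding. The sketch below shows that idea with cosine similarity over toy three-dimensional vectors. It is not MatCreator's implementation: the function names, the vectors, and the use of cosine similarity are all assumptions, and real embeddings would come from the configured embedding model.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(nodes: dict[str, list[float]], query: list[float], top_k: int = 2) -> list[str]:
    """Return the names of the top_k nodes most similar to the query vector."""
    ranked = sorted(nodes, key=lambda name: cosine(nodes[name], query), reverse=True)
    return ranked[:top_k]


# Toy embeddings, invented for illustration only.
NODES = {
    "vasp_relaxation": [0.9, 0.1, 0.0],
    "mattergen_generation": [0.1, 0.9, 0.1],
    "dft_theory": [0.8, 0.2, 0.1],
}
```

With these toy vectors, a query pointing along the first axis ranks the two DFT-related nodes ahead of the generation skill, even though none of the names match the query text.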
When given a goal, the thinking agent produces an execution graph — a directed acyclic graph (DAG) where each node is a discrete action and each edge encodes a dependency.
step_download_data ──► step_relax ──► step_postprocess
└──► step_static ─►
Key properties:
- Nodes carry a node_id, human-readable label, natural-language action description, and a list of suggested_skills.
- Edges are [predecessor_id, successor_id] pairs. A node cannot start until all its predecessors have succeeded.
- Parallel execution: nodes with no unresolved dependencies are dispatched concurrently in a single turn.
- Failure propagation: if a node fails, all transitive dependents are marked blocked automatically.
The agent validates the graph for cycles before presenting it to the user, then waits for explicit confirmation before handing it off to the execution agent.
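The graph rules above (cycle validation, dispatching ready nodes, and failure propagation) can be sketched with the standard library. This is a minimal illustration under stated assumptions, not the agent's actual scheduler: the function names and the status strings "pending", "succeeded", and "failed" are hypothetical, and only "blocked" and the [predecessor_id, successor_id] edge form come from the description. Cycle detection here uses Kahn's algorithm, which may differ from what MatCreator uses.

```python
from collections import defaultdict, deque


def build_adjacency(edges):
    """Return (successors, predecessors) maps from [pred, succ] pairs."""
    successors = defaultdict(list)
    predecessors = defaultdict(list)
    for pred, succ in edges:
        successors[pred].append(succ)
        predecessors[succ].append(pred)
    return successors, predecessors


def has_cycle(nodes, edges) -> bool:
    """Kahn's algorithm: a cycle exists if some node is never freed."""
    successors, predecessors = build_adjacency(edges)
    indegree = {n: len(predecessors[n]) for n in nodes}
    queue = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for succ in successors[node]:
            indegree[succ] -= 1
            if indegree[succ] == 0:
                queue.append(succ)
    return visited != len(nodes)


def ready_nodes(nodes, edges, status):
    """Pending nodes whose predecessors have all succeeded."""
    _, predecessors = build_adjacency(edges)
    return [
        n for n in nodes
        if status[n] == "pending"
        and all(status[p] == "succeeded" for p in predecessors[n])
    ]


def mark_failed(node, edges, status):
    """Mark node failed and every transitive dependent blocked."""
    successors, _ = build_adjacency(edges)
    status[node] = "failed"
    stack = list(successors[node])
    while stack:
        current = stack.pop()
        if status[current] != "blocked":
            status[current] = "blocked"
            stack.extend(successors[current])
```

In the four-step example, once step_download_data succeeds, `ready_nodes` returns both step_relax and step_static, which is the set that would be dispatched concurrently; if step_relax then fails, `mark_failed` blocks step_postprocess while step_static stays pending.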
