🌊 flowyml

The Enterprise-Grade ML Pipeline Framework for Humans

FlowyML is a lightweight yet powerful ML pipeline orchestration framework. It bridges the gap between rapid experimentation and enterprise production by making assets first-class citizens. Write pipelines in pure Python, and scale them to production without changing a single line of code.

🚀 Why FlowyML?

Feature	FlowyML	Traditional Orchestrators
Developer Experience	🐍 Native Python - No DSLs, no YAML hell.	📜 Complex YAML or rigid DSLs.
Type-Based Routing	🧠 Auto-Routing - Define WHAT, we handle WHERE.	🔌 Manual wiring to cloud buckets.
Smart Caching	⚡ Multi-Level - Smart content-hashing skips re-runs.	🐢 Basic file-timestamp checking.
Asset Management	📦 First-Class Assets - Models & Datasets with lineage.	📁 Generic file paths only.
Multi-Stack	🌍 Abstract Infra - Switch local/prod with one env var.	🔒 Vendor lock-in or complex setup.
GenAI Ready	🤖 LLM Tracing - Built-in token & cost tracking.	🧩 Requires external tools.
Build-Time Validation	✅ Type Safety - Catches mismatches at build time.	💥 Runtime errors only.
Map Tasks	🗺️ Parallel Maps - `@map_task` with retries & concurrency.	🔁 Manual parallelism boilerplate.
Dynamic Workflows	🔀 Runtime DAGs - Generate pipelines based on data.	📐 Static definitions only.
GenAI Assets	🎯 Prompt & Checkpoint - First-class prompt versioning and training resumability.	📝 Unmanaged text files.
Stack Hydration	🏗️ YAML → Live Stack - `StackConfig.to_stack()` wires infra automatically.	⚙️ Manual component assembly.
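
To make the "content-hashing" row concrete, here is a minimal sketch of content-addressed caching using only the standard library. It is not FlowyML's implementation: `content_key` and `ContentCache` are hypothetical names, and the real cache may key on more than the step name and its inputs.

```python
import hashlib
import json


def content_key(step_name, inputs):
    """Hash the step name together with a canonical JSON dump of its inputs."""
    payload = json.dumps({"step": step_name, "inputs": inputs}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ContentCache:
    """In-memory cache keyed by content hash (hypothetical, for illustration)."""

    def __init__(self):
        self._store = {}
        self.hits = 0

    def run(self, step_name, func, inputs):
        key = content_key(step_name, inputs)
        if key in self._store:
            self.hits += 1          # identical content: skip the re-run
            return self._store[key]
        result = func(**inputs)
        self._store[key] = result
        return result


def double(values):
    return [v * 2 for v in values]


cache = ContentCache()
first = cache.run("double", double, {"values": [1, 2, 3]})
second = cache.run("double", double, {"values": [1, 2, 3]})  # same content: cache hit
third = cache.run("double", double, {"values": [1, 2, 4]})   # changed content: re-run
```

Unlike a file-timestamp check, the key changes only when the inputs themselves change, so touching a file without editing it would not invalidate the cached result.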

⚡️ Quick Start

The following is a complete two-step ML pipeline. Values set on the context are injected automatically into step parameters with matching names:

from flowyml import Pipeline, step, context

@step(outputs=["dataset"])
def load_data(batch_size: int = 32):
    return list(range(batch_size))

@step(inputs=["dataset"], outputs=["model"])
def train_model(dataset, learning_rate: float = 0.01):
    print(f"Training on {len(dataset)} items with lr={learning_rate}")
    return "model_v1"

# Configure and Run
ctx = context(learning_rate=0.05, batch_size=64)
pipeline = Pipeline("quickstart", context=ctx)
pipeline.add_step(load_data).add_step(train_model)

pipeline.run()
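
One way to picture the context injection in the Quick Start is name matching against each step's signature. The sketch below is an assumption about the general technique, not FlowyML's actual code: `inject` is a hypothetical helper, and plain dicts stand in for the context and for upstream outputs.

```python
import inspect


def inject(func, context, upstream):
    """Build keyword arguments for func from upstream outputs, then context values."""
    kwargs = {}
    for name, param in inspect.signature(func).parameters.items():
        if name in upstream:
            kwargs[name] = upstream[name]      # output of an earlier step
        elif name in context:
            kwargs[name] = context[name]       # value supplied by the context
        elif param.default is inspect.Parameter.empty:
            raise TypeError(f"no value available for parameter {name!r}")
        # otherwise the function's own default is used
    return func(**kwargs)


def load_data(batch_size: int = 32):
    return list(range(batch_size))


def train_model(dataset, learning_rate: float = 0.01):
    return f"trained on {len(dataset)} items with lr={learning_rate}"


ctx = {"learning_rate": 0.05, "batch_size": 64}
dataset = inject(load_data, ctx, upstream={})
summary = inject(train_model, ctx, upstream={"dataset": dataset})
```

Here the context values override the defaults declared on the functions (`batch_size=32`, `learning_rate=0.01`), which mirrors what the Quick Start configures with `context(learning_rate=0.05, batch_size=64)`.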

🌟 Key Features

1. 🧠 Type-Based Artifact Routing (New in 1.8.0)

Define artifact types in code, and FlowyML automatically routes them to your cloud infrastructure.

@step
def train(...) -> Model:
    # Auto-saved to GCS/S3 and registered to Vertex AI / SageMaker
    return Model(obj, name="classifier")
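
The idea of routing by type can be sketched as a lookup on a step's return annotation. Everything below is illustrative: the `Model` and `Dataset` classes are local stand-ins rather than FlowyML's asset classes, and `ROUTES` with its destination labels is a hypothetical routing table.

```python
from dataclasses import dataclass
from typing import Any, get_type_hints


@dataclass
class Model:  # stand-in for illustration, not flowyml's Model class
    obj: Any
    name: str


@dataclass
class Dataset:  # stand-in for illustration
    rows: list
    name: str


# Hypothetical routing table: artifact type -> destination label.
ROUTES = {Model: "model-registry", Dataset: "object-storage"}


def route_for(func):
    """Pick a destination from the function's return annotation."""
    return_type = get_type_hints(func).get("return")
    return ROUTES.get(return_type, "default-store")


def train() -> Model:
    return Model(obj={"weights": [0.1, 0.2]}, name="classifier")


def load() -> Dataset:
    return Dataset(rows=[1, 2, 3], name="raw")


def helper():  # no annotation, so it falls back to the default destination
    return 42
```

Because the destination is derived from the annotation, the step body never mentions a bucket or registry.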

2. 🌍 Multi-Stack Configuration

Manage local, staging, and production environments in a single flowyml.yaml.

export FLOWYML_STACK=production
python pipeline.py  # Now runs on Vertex AI with GCS storage
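
Conceptually, the environment variable selects one named block of configuration. The sketch below assumes a simple dictionary of stacks; the real `flowyml.yaml` schema may differ, and the stack fields, bucket name, and `active_stack` helper are hypothetical.

```python
import os

# Hypothetical stack definitions; the real flowyml.yaml schema may differ.
STACKS = {
    "local": {"orchestrator": "local", "artifact_store": "./artifacts"},
    "production": {"orchestrator": "vertex_ai", "artifact_store": "gs://example-bucket"},
}


def active_stack(environ=os.environ, default="local"):
    """Resolve the stack named by FLOWYML_STACK, falling back to the default."""
    name = environ.get("FLOWYML_STACK", default)
    if name not in STACKS:
        raise KeyError(f"unknown stack {name!r}; expected one of {sorted(STACKS)}")
    return name, STACKS[name]


# Passing a plain dict instead of os.environ keeps the example deterministic.
prod_name, prod_stack = active_stack({"FLOWYML_STACK": "production"})
fallback_name, fallback_stack = active_stack({})
```

The pipeline code stays the same in both cases; only the resolved stack changes.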

3. 🛡️ Intelligent Step Grouping

Group consecutive steps so that they run in the same container. This reduces overhead while keeping step boundaries clear.
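
As a rough illustration of the grouping idea, consecutive steps that share a label can be merged into one execution unit. This is a sketch under assumptions: the group labels and the `plan_containers` helper are hypothetical, and FlowyML's own grouping rules may take more into account.

```python
from itertools import groupby

# (step name, hypothetical group label); None means "run on its own".
steps = [
    ("load_data", "prep"),
    ("clean_data", "prep"),
    ("train_model", None),
    ("evaluate", "report"),
    ("publish", "report"),
]


def plan_containers(steps):
    """Merge runs of consecutive steps that share a group label."""
    plan = []
    for label, run in groupby(steps, key=lambda s: s[1]):
        names = [name for name, _ in run]
        if label is None:
            plan.extend([name] for name in names)  # ungrouped steps stay separate
        else:
            plan.append(names)
    return plan


plan = plan_containers(steps)
```

Each inner list in `plan` would map to one container, while the step names inside it remain distinct for logging and caching.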

4. 📊 Built-in Observability

A dark-mode dashboard for monitoring pipelines, visualizing DAGs, and inspecting artifacts in real time.

5. 🎯 Evaluations Framework

A production-grade evaluation system with 29+ scorers covering classification, regression, and GenAI (LLM-as-a-judge), plus adapters for DeepEval, RAGAS, and Phoenix:

from flowyml.evals import evaluate, EvalDataset, get_scorer

data = EvalDataset.create_genai("my_test", examples=[...])
result = evaluate(data=data, scorers=[get_scorer("relevance"), get_scorer("ragas.faithfulness")])
result.notify_if_regression(threshold=0.05)
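
A regression check of this kind can be thought of as comparing current scores against a baseline. The sketch below assumes that `threshold` means an absolute drop in score, which the snippet above does not confirm; `find_regressions` and the example scores are hypothetical.

```python
def find_regressions(baseline, current, threshold=0.05):
    """Return scorers whose score dropped by more than `threshold` versus the baseline."""
    regressions = {}
    for scorer, old in baseline.items():
        new = current.get(scorer)
        if new is not None and old - new > threshold:
            regressions[scorer] = round(old - new, 4)
    return regressions


# Made-up scores for illustration only.
baseline = {"relevance": 0.90, "faithfulness": 0.80}
current = {"relevance": 0.82, "faithfulness": 0.79}
regressions = find_regressions(baseline, current, threshold=0.05)
```

Only `relevance` is flagged here, since `faithfulness` moved by less than the threshold.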

6. 🗺️ Map Tasks & Dynamic Workflows

Distribute work over collections with @map_task and generate pipelines at runtime with @dynamic:

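from flowyml import Pipeline, map_task, dynamic

@map_task(concurrency=8, retries=2, min_success_ratio=0.95)
def process_document(doc: dict) -> dict:
    # transform() stands in for your own per-document logic
    return transform(doc)

@dynamic(outputs=["best_model"])
def hyperparameter_search(config: dict):
    # Build a sub-pipeline with one training step per learning rate
    sub = Pipeline("hp_search")
    for lr in config["learning_rates"]:
        # train_with_lr() stands in for a user-defined step factory
        sub.add_step(train_with_lr(lr))
    return sub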
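
The three `@map_task` options shown above (concurrency, retries, and a minimum success ratio) can be approximated with the standard library. This is a sketch of the general pattern, not FlowyML's implementation: `run_map` is a hypothetical helper and uses threads, whereas the real decorator may distribute work differently.

```python
from concurrent.futures import ThreadPoolExecutor


def run_map(func, items, concurrency=4, retries=2, min_success_ratio=1.0):
    """Apply func to every item in parallel, retrying failures per item."""

    def attempt(item):
        last_error = None
        for _ in range(retries + 1):      # first try plus `retries` retries
            try:
                return True, func(item)
            except Exception as exc:      # broad on purpose: any failure is retried
                last_error = exc
        return False, last_error

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        outcomes = list(pool.map(attempt, items))  # preserves input order

    successes = [value for ok, value in outcomes if ok]
    ratio = len(successes) / len(outcomes) if outcomes else 1.0
    if ratio < min_success_ratio:
        raise RuntimeError(f"only {ratio:.0%} of items succeeded")
    return successes


def parse(doc):
    if "text" not in doc:
        raise ValueError("missing text")
    return {"length": len(doc["text"])}


docs = [{"text": "abc"}, {"text": "hello"}, {}, {"text": ""}]
results = run_map(parse, docs, concurrency=2, retries=1, min_success_ratio=0.75)
```

With three of four documents succeeding, the run passes at a ratio of 0.75 but would raise if the required ratio were higher.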

7. 📦 Artifact Catalog with Lineage

Centralized artifact discovery, tagging, and lineage tracking that works both locally and remotely:

from flowyml import ArtifactCatalog

catalog = ArtifactCatalog()  # Auto-selects local SQLite or remote API
# dataset_id and model_id are assumed to be IDs of artifacts registered earlier
catalog.register(name="classifier", artifact_type="Model", parent_ids=[dataset_id])
lineage = catalog.get_lineage(model_id)  # Full parent→child graph
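
Lineage tracking boils down to storing parent links and walking them. The following sketch shows that idea with an in-memory graph; `LineageGraph` and its methods are hypothetical and do not reflect the `ArtifactCatalog` API or its storage.

```python
from collections import deque


class LineageGraph:
    """Minimal parent-link store (hypothetical, for illustration)."""

    def __init__(self):
        self._parents = {}

    def register(self, artifact_id, parent_ids=()):
        self._parents[artifact_id] = list(parent_ids)

    def ancestors(self, artifact_id):
        """Return all upstream artifact ids, nearest first (breadth-first)."""
        seen, order = set(), []
        queue = deque(self._parents.get(artifact_id, []))
        while queue:
            current = queue.popleft()
            if current in seen:
                continue
            seen.add(current)
            order.append(current)
            queue.extend(self._parents.get(current, []))
        return order


graph = LineageGraph()
graph.register("raw_data")
graph.register("features", parent_ids=["raw_data"])
graph.register("classifier", parent_ids=["features"])
graph.register("report", parent_ids=["classifier", "features"])
```

Calling `graph.ancestors("report")` walks back through the classifier and features to the raw data, visiting each artifact once even when it is reachable by two paths.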

📦 Installation

# Install core
pip install flowyml

# Install with everything (recommended)
pip install "flowyml[all]"

📓 FlowyML Notebook — Design Pipelines Visually

FlowyML Notebook is the companion reactive notebook environment for FlowyML. It replaces Jupyter with a DAG-powered, production-ready notebook that ships directly to FlowyML pipelines.

Feature	Description
🔄 Reactive DAG	Cells form a dependency graph: when a variable changes, only the dependent cells re-execute
📝 Pure .py Storage	Git-friendly, lintable, importable — no `.ipynb` JSON
🚀 Pipeline Promotion	Promote notebooks to production FlowyML pipelines with one click
🧾 43 Recipes	Reusable code templates across Core, Assets, ML, Evals, and more
🤖 AI Assistant	Context-aware code generation (OpenAI, Google AI, Ollama, Anthropic)
📊 Rich Data Explorer	Automatic DataFrame profiling with statistics, charts, and correlations
🌐 Publish as App	Turn any notebook into a web app with 5 layout options
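
The "Reactive DAG" row can be illustrated by deriving dependencies from the variables each cell reads and writes. This is a simplified sketch, not how FlowyML Notebook is implemented: the helper names are hypothetical, and it assumes cells are listed in dependency order.

```python
import ast


def names_used_and_defined(source):
    """Return (names read, names assigned) for one cell of Python source."""
    used, defined = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            (defined if isinstance(node.ctx, ast.Store) else used).add(node.id)
    return used - defined, defined


def stale_cells(cells, changed):
    """Return the cells that must re-run after `changed` is edited, in cell order."""
    info = {name: names_used_and_defined(src) for name, src in cells.items()}
    dirty_names = set(info[changed][1])
    stale = [changed]
    for name in cells:  # assumes cells are listed in dependency order
        if name == changed:
            continue
        used, defined = info[name]
        if used & dirty_names:
            stale.append(name)
            dirty_names |= defined  # anything this cell defines is now stale too
    return stale


cells = {
    "a": "x = 10",
    "b": "y = x * 2",
    "c": "z = 5",
    "d": "total = y + z",
}
```

Editing cell `a` marks `b` and `d` as stale but leaves `c` alone, because `c` never reads `x` or anything derived from it.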

pip install flowyml-notebook
fml-notebook dev  # 🔥 Launch with hot-reload

📖 FlowyML Notebook Documentation · GitHub · PyPI

📚 Documentation

Visit FlowyML Docs for the full guide:

Getting Started — Build your first pipeline in 5 minutes
Core Concepts — Pipelines, Steps, Context, and Assets
Features Explorer — 20+ features deep dive
Ecosystem — FlowyML Notebook, Keras tools, and integrations
API Reference — Full API documentation

Built with ❤️ by UnicoLab
