The Enterprise-Grade ML Pipeline Framework for Humans
FlowyML is a lightweight yet powerful ML pipeline orchestration framework. It bridges the gap between rapid experimentation and enterprise production by making assets first-class citizens. Write pipelines in pure Python, and scale them to production without changing a single line of code.
| Feature | FlowyML | Traditional Orchestrators |
|---|---|---|
| Developer Experience | Native Python - No DSLs, no YAML hell. | Complex YAML or rigid DSLs. |
| Type-Based Routing | Auto-Routing - Define WHAT, we handle WHERE. | Manual wiring to cloud buckets. |
| Smart Caching | Multi-Level - Smart content-hashing skips re-runs. | Basic file-timestamp checking. |
| Asset Management | First-Class Assets - Models & Datasets with lineage. | Generic file paths only. |
| Multi-Stack | Abstract Infra - Switch local/prod with one env var. | Vendor lock-in or complex setup. |
| GenAI Ready | LLM Tracing - Built-in token & cost tracking. | Requires external tools. |
| Build-Time Validation | Type Safety - Catches mismatches at build time. | Runtime errors only. |
| Map Tasks | Parallel Maps - `@map_task` with retries & concurrency. | Manual parallelism boilerplate. |
| Dynamic Workflows | Runtime DAGs - Generate pipelines based on data. | Static definitions only. |
| GenAI Assets | Prompt & Checkpoint - First-class prompt versioning and training resumability. | Unmanaged text files. |
| Stack Hydration | YAML → Live Stack - `StackConfig.to_stack()` wires infra automatically. | Manual component assembly. |
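
The difference between content-hash caching and timestamp checking in the table above can be sketched in plain Python. This illustrates the general technique only and is not FlowyML's implementation; `cache_key`, `run_cached`, and the in-memory `_cache` dict are hypothetical names chosen for the example.

```python
import hashlib
import json

_cache = {}  # hypothetical in-memory cache: key -> step result


def cache_key(step_name, inputs):
    """Hash the step name and its JSON-serialized inputs into a stable key."""
    payload = json.dumps({"step": step_name, "inputs": inputs}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def run_cached(step_name, func, inputs):
    """Run func(**inputs) only if identical inputs have not been seen before."""
    key = cache_key(step_name, inputs)
    if key in _cache:
        return _cache[key], True  # cache hit: skip the re-run
    result = func(**inputs)
    _cache[key] = result
    return result, False


calls = []  # records every real execution of load_data


def load_data(batch_size):
    calls.append(batch_size)
    return list(range(batch_size))


first, hit1 = run_cached("load_data", load_data, {"batch_size": 4})
second, hit2 = run_cached("load_data", load_data, {"batch_size": 4})
third, hit3 = run_cached("load_data", load_data, {"batch_size": 8})
```

Because the key depends on the inputs rather than on file modification times, the second call is served from the cache, while changing `batch_size` triggers a fresh run.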
This is a complete, multi-step ML pipeline with auto-injected context:
```python
from flowyml import Pipeline, step, context


@step(outputs=["dataset"])
def load_data(batch_size: int = 32):
    # batch_size is injected from the pipeline context
    return [i for i in range(batch_size)]


@step(inputs=["dataset"], outputs=["model"])
def train_model(dataset, learning_rate: float = 0.01):
    # learning_rate is injected from the pipeline context
    print(f"Training on {len(dataset)} items with lr={learning_rate}")
    return "model_v1"


# Configure and run
ctx = context(learning_rate=0.05, batch_size=64)
pipeline = Pipeline("quickstart", context=ctx)
pipeline.add_step(load_data).add_step(train_model)
pipeline.run()
```

Define artifact types in code, and FlowyML automatically routes them to your cloud infrastructure.
```python
@step
def train(...) -> Model:
    # Auto-saved to GCS/S3 and registered to Vertex AI / SageMaker
    return Model(obj, name="classifier")
```

Manage local, staging, and production environments in a single `flowyml.yaml`.
```bash
export FLOWYML_STACK=production
python pipeline.py  # Now runs on Vertex AI with GCS storage
```

Group consecutive steps to run in the same container. This reduces overhead while keeping step boundaries clear.
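
One way to picture step grouping: consecutive steps that share a group label collapse into a single execution unit, while ungrouped steps each run on their own. The sketch below is plain Python and makes no claim about FlowyML's actual grouping API; `plan_execution_units` and the `(name, group)` tuples are hypothetical.

```python
from itertools import groupby


def plan_execution_units(steps):
    """Collapse consecutive steps with the same group label into one unit.

    `steps` is an ordered list of (step_name, group) tuples; a group of
    None means the step runs in its own unit.
    """
    units = []
    for group, members in groupby(steps, key=lambda s: s[1]):
        names = [name for name, _ in members]
        if group is None:
            # Ungrouped steps are never merged, even when adjacent.
            units.extend([name] for name in names)
        else:
            units.append(names)
    return units


steps = [
    ("load_data", "prep"),
    ("clean_data", "prep"),
    ("train_model", None),
    ("evaluate", "report"),
    ("publish", "report"),
]
units = plan_execution_units(steps)
```

Here five steps become three units, so a container would be started three times instead of five while each step stays individually named.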
A dark-mode dashboard lets you monitor pipelines, visualize DAGs, and inspect artifacts in real time.
Production-grade evaluation system with 29+ scorers covering classification, regression, and GenAI (LLM-as-a-judge), plus adapters for DeepEval, RAGAS, and Phoenix:
```python
from flowyml.evals import evaluate, EvalDataset, get_scorer

data = EvalDataset.create_genai("my_test", examples=[...])
result = evaluate(
    data=data,
    scorers=[get_scorer("relevance"), get_scorer("ragas.faithfulness")],
)
result.notify_if_regression(threshold=0.05)
```

Distribute work over collections with `@map_task` and generate pipelines at runtime with `@dynamic`:
```python
from flowyml import Pipeline, map_task, dynamic

# transform() and train_with_lr() stand for user-defined helpers (not shown)


@map_task(concurrency=8, retries=2, min_success_ratio=0.95)
def process_document(doc: dict) -> dict:
    return transform(doc)


@dynamic(outputs=["best_model"])
def hyperparameter_search(config: dict):
    sub = Pipeline("hp_search")
    for lr in config["learning_rates"]:
        sub.add_step(train_with_lr(lr))
    return sub
```

Centralized artifact discovery, tagging, and lineage tracking that works both locally and remotely:
```python
from flowyml import ArtifactCatalog

catalog = ArtifactCatalog()  # Auto-selects local SQLite or remote API
catalog.register(name="classifier", artifact_type="Model", parent_ids=[dataset_id])
lineage = catalog.get_lineage(model_id)  # Full parent→child graph
```

```bash
# Install core
pip install flowyml

# Install with everything (recommended)
pip install "flowyml[all]"
```

FlowyML Notebook is the companion reactive notebook environment for FlowyML. It replaces Jupyter with a DAG-powered, production-ready notebook that ships directly to FlowyML pipelines.
| Feature | Description |
|---|---|
| Reactive DAG | Cells form a dependency graph: change a variable, and only dependent cells re-execute |
| Pure .py Storage | Git-friendly, lintable, importable; no .ipynb JSON |
| Pipeline Promotion | Promote notebooks to production FlowyML pipelines with one click |
| 43 Recipes | Reusable code templates across Core, Assets, ML, Evals, and more |
| AI Assistant | Context-aware code generation (OpenAI, Google AI, Ollama, Anthropic) |
| Rich Data Explorer | Automatic DataFrame profiling with statistics, charts, and correlations |
| Publish as App | Turn any notebook into a web app with 5 layout options |
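
The reactive behaviour described in the first row can be illustrated with a small dependency graph: when one cell changes, only that cell and its transitive dependents need to re-run. This is a generic sketch, not the notebook's actual engine; `cells_to_rerun` and the example cell names are hypothetical.

```python
from collections import deque


def cells_to_rerun(dependents, changed):
    """Return the changed cell plus all transitive dependents.

    `dependents` maps each cell to the cells that read its outputs.
    Cells are listed in breadth-first discovery order; a real engine
    would additionally sort them topologically before executing.
    """
    order, seen, queue = [], {changed}, deque([changed])
    while queue:
        cell = queue.popleft()
        order.append(cell)
        for child in dependents.get(cell, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return order


dependents = {
    "load": ["clean", "summary"],
    "clean": ["train"],
    "train": ["evaluate"],
    "summary": [],
}
rerun = cells_to_rerun(dependents, "clean")
```

Editing the `clean` cell marks `clean`, `train`, and `evaluate` for re-execution, while `load` and `summary` keep their existing results.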
```bash
pip install flowyml-notebook
fml-notebook dev  # Launch with hot-reload
```

FlowyML Notebook Documentation · GitHub · PyPI
Visit FlowyML Docs for the full guide:
- Getting Started: Build your first pipeline in 5 minutes
- Core Concepts: Pipelines, Steps, Context, and Assets
- Features Explorer: 20+ features deep dive
- Ecosystem: FlowyML Notebook, Keras tools, and integrations
- API Reference: Full API documentation
Built with ❤️ by UnicoLab