PromptEnhancer

AI prompt engineering for image and video production. By Joost Helfers.

What it does

PromptEnhancer generates model-optimized prompts for commercial AI image and video production. It runs a three-step pipeline via OpenRouter:

Vision Analysis — An AI vision model reads your reference images and extracts color palette, lighting, texture, and emotional tone.
Creative Brief — A planning model develops a production brief: creative vision, visual metaphor, shot diversity, and color anchors.
Prompt Derivation — Prompts are derived from the brief, formatted for the target model's specific strengths and syntax.
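
The three steps chain so that each one's output feeds the next. A minimal TypeScript sketch of that flow follows. The type names (`VisionAnalysis`, `CreativeBrief`, `PipelineSteps`) and the injected step functions are hypothetical stand-ins for the real OpenRouter calls, and skipping the vision step when no images are supplied is an assumption rather than confirmed behavior.

```typescript
// Hypothetical shapes; field names follow the README's wording, not the real source.
interface VisionAnalysis {
  colorPalette: string[];
  lighting: string;
  texture: string;
  emotionalTone: string;
}

interface CreativeBrief {
  creativeVision: string;
  visualMetaphor: string;
  shots: string[]; // one entry per planned shot, for diversity
  colorAnchors: string[];
}

// Each step is injected so the flow can be exercised without network access.
interface PipelineSteps {
  analyzeImages(imageUrls: string[]): Promise<VisionAnalysis>;
  developBrief(concept: string, analysis: VisionAnalysis | null): Promise<CreativeBrief>;
  derivePrompts(brief: CreativeBrief, targetModel: string): Promise<string[]>;
}

interface PipelineResult {
  analysis: VisionAnalysis | null;
  brief: CreativeBrief;
  prompts: string[];
}

async function runPipeline(
  steps: PipelineSteps,
  concept: string,
  imageUrls: string[],
  targetModel: string,
): Promise<PipelineResult> {
  // Step 1 (assumption): vision analysis runs only when reference images exist.
  const analysis = imageUrls.length > 0 ? await steps.analyzeImages(imageUrls) : null;
  // Step 2: the brief is planned from the concept plus whatever the vision step found.
  const brief = await steps.developBrief(concept, analysis);
  // Step 3: prompts are derived from the brief and formatted for the target model.
  const prompts = await steps.derivePrompts(brief, targetModel);
  return { analysis, brief, prompts };
}
```

Injecting the steps keeps the orchestration testable with stubs; in the app itself each step would be a model call.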

Three modes

The UI is organized around two choices: what you want to do and what you're making (Image / Image edit / Video).

Create prompts — The full pipeline. Describe a concept and/or upload reference images, and generate a diverse set of model-specific prompts. Choose Image (text-to-image), Image edit (image-to-image, needs a reference image), or Video.
Enhance a prompt — Paste an existing prompt and optimize it for the selected model. The enhancer restructures, expands, and adapts it — it doesn't generate from scratch.
Develop a brief — Shape a creative brief and shot list first (creative vision, visual metaphor, shot diversity, color anchors), then generate image or video prompts from it.

Supported models

Model	Type	Description
Z-Image	Image	Default. Alibaba's 6B photorealism + text-rendering model. Natural-language, positive-only prompts (Turbo runs at CFG 0).
Flux 2 Klein 9B	Image	Best for cinematic stills. Keep prompts concise (50-100 words).
NanoBanana 2	Image	Fast and flexible. Up to 14 reference images with character consistency.
Gemini Omni Flash	Video	Google's multimodal video model. Conversational prompts, iterative editing, physics-aware.
Veo 3.1	Video	Google video. Structured scenes with camera, dialogue, and audio.
Kling v3	Video	Multi-shot video with character labels and temporal markers.
Kling o3	Video	Enhanced Kling with deeper scene understanding for complex sequences.
LTX-Video 2.3	Video	High-resolution video (up to 4K). Flowing present-tense with audio.
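
To make the table concrete, here is one way its rows could be held as typed data. This is a sketch only: the `ModelProfile` shape and the helper names are hypothetical and do not reflect the contents of src/lib/model-profiles.ts. Only the facts stated in the table (type, default, the 50-100 word guidance, the 14-image limit) are encoded.

```typescript
type OutputType = "image" | "video";

// Hypothetical profile shape; only facts from the table above are encoded.
interface ModelProfile {
  name: string;
  type: OutputType;
  isDefault?: boolean;
  wordRange?: [number, number]; // recommended prompt length in words
  maxReferenceImages?: number;
}

const MODEL_PROFILES: ModelProfile[] = [
  { name: "Z-Image", type: "image", isDefault: true },
  { name: "Flux 2 Klein 9B", type: "image", wordRange: [50, 100] },
  { name: "NanoBanana 2", type: "image", maxReferenceImages: 14 },
  { name: "Gemini Omni Flash", type: "video" },
  { name: "Veo 3.1", type: "video" },
  { name: "Kling v3", type: "video" },
  { name: "Kling o3", type: "video" },
  { name: "LTX-Video 2.3", type: "video" },
];

// All models that produce the requested output type.
function modelsForType(type: OutputType): ModelProfile[] {
  return MODEL_PROFILES.filter((m) => m.type === type);
}

// Counts whitespace-separated words; an empty or blank string counts as zero.
function countWords(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// True when the prompt length is inside the model's recommended range,
// or when the model has no stated range.
function fitsWordRange(profile: ModelProfile, prompt: string): boolean {
  if (!profile.wordRange) return true;
  const [min, max] = profile.wordRange;
  const words = countWords(prompt);
  return words >= min && words <= max;
}
```

A length check like `fitsWordRange` could back a warning in the UI when a Flux prompt runs long.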

Quick start

Prerequisites

Node.js (v18 or later)
An OpenRouter API key

1. Clone and install

git clone https://github.com/joosthel/PromptEnhancer.git
cd PromptEnhancer
npm install

2. Add your API key

Create a .env.local file in the project root:

echo 'OPENROUTER_API_KEY=sk-or-v1-...' > .env.local

Replace sk-or-v1-... with your actual key from openrouter.ai/keys.

3. Run

macOS — double-click start.command, or run from terminal:

./start.command

Windows — double-click start.bat, or from a terminal:

start.bat

Both scripts install dependencies (if needed), check for your API key, start the dev server, and open your browser at http://localhost:3000.

You can also start manually:

npm run dev

Testing

Tests run on Vitest with React Testing Library (jsdom).

npm test            # run once
npm run test:watch  # watch mode

Pure-logic tests live next to the code, e.g. src/lib/__tests__/.
Component tests render real components in jsdom, e.g. src/app/__tests__/start-new.test.tsx.
Shared setup (DOM matchers, in-memory localStorage) is in vitest.setup.ts.

Usage

Choose a mode — Select Create prompts, Enhance a prompt, or Develop a brief from the left panel (and, for the first two, what you're making: Image / Image edit / Video).
Select a target model — Pick the AI model you're generating prompts for.
Add reference images (optional) — Drag and drop, click to browse, paste from clipboard, or enter a URL. Click thumbnails to label images (style reference, subject, face, background).
Describe your concept — Fill in the text area. All modes accept freeform descriptions.
Set prompt count (Create prompts / Develop a brief) — Choose 1-6 prompts for diversity.
Generate — The pipeline runs server-side. Reference images are cached, so re-generating with the same images skips the vision step.
Refine — Use Fix buttons on individual prompt cards to iterate without re-running the full pipeline (Hands, Lighting, Too AI, Mood, or custom notes), or "Polish all" to run an art-direction pass over the whole set. If a run ever fails, hit Retry — it reuses the brief already produced and finishes fast.
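
The caching mentioned in the Generate step can be pictured as a map keyed by a fingerprint of the reference-image set. The sketch below is illustrative only: `fingerprintImages` and `VisionCache` are hypothetical names, the FNV-1a-style hash is a stand-in for whatever src/lib/image-utils.ts actually does, and treating the image set as order-independent is an assumption.

```typescript
// FNV-1a-style 32-bit hash over UTF-16 code units; a stand-in fingerprint.
function fnv1a(input: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

// Assumption: the same set of images yields the same key regardless of order.
function fingerprintImages(images: string[]): string {
  return images.map(fnv1a).sort().join("-");
}

class VisionCache<T> {
  private entries = new Map<string, T>();
  public misses = 0;

  // Runs `analyze` only when this image set has not been seen before.
  getOrCompute(images: string[], analyze: () => T): T {
    const key = fingerprintImages(images);
    const cached = this.entries.get(key);
    if (cached !== undefined) return cached;
    this.misses += 1;
    const fresh = analyze();
    this.entries.set(key, fresh);
    return fresh;
  }
}
```

Because the vision call is asynchronous in practice, `T` could be a `Promise` of the analysis, so concurrent requests for the same images share one in-flight call.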

Stack

Layer	Tech
Framework	Next.js 16 (App Router)
Language	TypeScript 5
Styling	Tailwind CSS 4
AI Gateway	OpenRouter (public) · Langdock (agency tenant)
Vision model	`google/gemini-3.5-flash` (fallback `google/gemini-2.5-flash`) — always via OpenRouter, even on the agency tenant
Text model (public)	`openai/gpt-4o-mini` (fallback `openai/gpt-4.1-nano`) — fast, reliable structured JSON, non-reasoning
Text model (agency)	`gemini-2.5-flash` via Langdock (env-configurable)

No database. No auth. Runtime deps are just the Next.js scaffold + zod (schema validation) + the MCP SDK; vitest is a dev-only test dependency.
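
The primary and fallback text models listed in the stack table suggest a try-then-fall-back call pattern. The sketch below shows one way to do that against OpenRouter's chat completions endpoint. The helper names (`buildChatRequest`, `chatWithFallback`) and the injected `Send` function are hypothetical; the project's own wrapper in src/lib/openrouter.ts may be structured differently.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

const OPENROUTER_CHAT_URL = "https://openrouter.ai/api/v1/chat/completions";

// Primary and fallback text models from the stack table (public tenant).
const TEXT_MODELS = ["openai/gpt-4o-mini", "openai/gpt-4.1-nano"];

// Builds the HTTP request without sending it, so it can be inspected or tested.
function buildChatRequest(apiKey: string, model: string, messages: ChatMessage[]): HttpRequest {
  return {
    url: OPENROUTER_CHAT_URL,
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model, messages }),
  };
}

// Hypothetical transport: in the app this would wrap fetch().
type Send = (req: HttpRequest) => Promise<{ ok: boolean; text: string }>;

// Tries each model in order and returns the first successful reply.
// Network errors thrown by `send` propagate to the caller in this sketch.
async function chatWithFallback(
  send: Send,
  apiKey: string,
  models: string[],
  messages: ChatMessage[],
): Promise<{ model: string; text: string }> {
  for (const model of models) {
    const res = await send(buildChatRequest(apiKey, model, messages));
    if (res.ok) return { model, text: res.text };
  }
  throw new Error(`All models failed: ${models.join(", ")}`);
}
```

Keeping request construction separate from sending makes the fallback order easy to test without spending API credits.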

Project structure

src/
  app/
    page.tsx                  # Main page — state, mode logic, generation handlers
    layout.tsx                # Root layout, metadata, fonts
    globals.css               # Global styles, accessibility rules
    api/
      generate-stream/route.ts # Primary pipeline (vision → brief → prompts), streamed via SSE with a heartbeat
      generate/route.ts        # Non-streaming pipeline (shared with the MCP server)
      enhance/route.ts         # Single-step prompt enhancement
      revise/route.ts          # Single-card fix/revision
      reformat/route.ts        # Cross-model prompt reformatting
      refine/route.ts          # Opt-in art-direction "Polish" pass
      [transport]/route.ts     # MCP server (/api/mcp, /api/sse)
  components/
    ModeSelector.tsx          # Three app modes + sub-mode chips
    ModelSelector.tsx         # Target model cards with descriptions
    ImageUploader.tsx         # Drag-drop, paste, URL input, image labeling
    InputForm.tsx             # Description textarea + prompt count
    PromptList.tsx            # Results: brief, visual analysis, prompt cards
    PromptCard.tsx            # Individual prompt with copy, fix, reformat
    FixToolbar.tsx            # Fix category chips + custom input
    FixHistory.tsx            # Prompt revision history
    ModelChips.tsx            # Cross-model reformat chips per card
    BatchActions.tsx          # Select all, batch fix operations
    LoadingAnimation.tsx      # Dot-ring loading with phase labels
    CreditPopup.tsx           # API credit acknowledgment popup
    HelpModal.tsx             # How-it-works documentation modal
  lib/
    services.ts               # Shared orchestration (enhance/generate/revise/reformat) — used by REST routes + MCP
    openrouter.ts             # Typed fetch wrapper for OpenRouter API
    system-prompt.ts          # Vision prompt, types, shared constants
    prompt-engine.ts          # System prompt + user message builders
    model-profiles.ts         # Model definitions, modes, fix categories
    image-utils.ts            # Canvas resize, clipboard, URL validation, fingerprinting
    use-focus-trap.ts         # Shared focus trap hook for modals
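
The generate-stream route is described above as streaming via SSE with a heartbeat. As a rough illustration of what that looks like on the wire, the sketch below formats Server-Sent Events frames and a comment-line heartbeat. The event name (`phase`) and the helper names are hypothetical and are not taken from the route's code.

```typescript
// One SSE frame: an "event:" line, a "data:" line, and a blank-line terminator.
// JSON.stringify without indentation emits no raw newlines, so one data line suffices.
function sseEvent(event: string, payload: object): string {
  return `event: ${event}\ndata: ${JSON.stringify(payload)}\n\n`;
}

// Lines starting with ":" are SSE comments. Clients ignore them, but they
// keep proxies and browsers from treating a slow stream as idle.
function sseHeartbeat(): string {
  return ": heartbeat\n\n";
}

// Example stream: one progress event, a heartbeat while the model works,
// then a second progress event.
const exampleStream =
  sseEvent("phase", { step: "vision" }) +
  sseHeartbeat() +
  sseEvent("phase", { step: "brief" });
```

A heartbeat matters here because the brief and prompt steps can take many seconds, during which no real events are sent.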

MCP server

The same deployment also exposes an MCP server, so the prompt engine can be called from MCP clients (Langdock, Claude, Cursor, …) — not just the web UI.

Endpoint (Streamable HTTP): https://<deployment>/api/mcp. Any custom domain on this project + /api/mcp works identically.
Legacy SSE fallback: /api/sse
Auth: set MCP_AUTH_TOKEN to require Authorization: Bearer <token> (or x-api-key) on tool calls. If unset, the server is open (bounded by the per-IP rate limit + provider spend cap).
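
The auth rule above (Bearer token or x-api-key, open when MCP_AUTH_TOKEN is unset) can be sketched as a small predicate. `isAuthorized` and `HeaderBag` are hypothetical names, and the sketch assumes header names have already been lowercased; the server's actual check may differ.

```typescript
// Assumption: header names are lowercased before they reach this function.
type HeaderBag = Record<string, string | undefined>;

// Returns true when the request may proceed.
function isAuthorized(headers: HeaderBag, expectedToken: string | undefined): boolean {
  // No MCP_AUTH_TOKEN configured: the server is open.
  if (!expectedToken) return true;

  // Accept "Authorization: Bearer <token>".
  const auth = headers["authorization"];
  if (auth !== undefined) {
    const match = /^Bearer\s+(.+)$/i.exec(auth.trim());
    if (match !== null && match[1] === expectedToken) return true;
  }

  // Otherwise accept the same token in "x-api-key".
  return headers["x-api-key"] === expectedToken;
}
```

A production check would likely also use a constant-time comparison to avoid leaking token contents through timing.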

Add the /api/mcp URL as a custom MCP integration in your client. Tools exposed:

Tool	Purpose
`usage_guide`	Returns a short how-to with examples. Call this first if unsure.
`enhance_prompt`	Optimize an existing prompt for a target model (optional reference images).
`generate_prompts`	Full pipeline (vision → brief → prompts). `briefOnly: true` returns just the creative brief (Art Direction).
`revise_prompt`	Refine a single prompt via a note and/or fix category.
`reformat_prompt`	Rewrite a prompt from one model's format to another.
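
Under the hood, an MCP client invokes these tools with JSON-RPC 2.0 `tools/call` requests. The sketch below builds such a request body for illustration. The tool names and `briefOnly` come from the table above, but `concept` and `model` are assumed argument names; call `usage_guide` for the real parameter list. An MCP client library normally handles the initialize handshake and transport, so hand-building requests like this is mainly useful for debugging.

```typescript
type ToolName =
  | "usage_guide"
  | "enhance_prompt"
  | "generate_prompts"
  | "revise_prompt"
  | "reformat_prompt";

interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: ToolName; arguments: Record<string, unknown> };
}

let nextId = 1;

// Wraps a tool name and its arguments in a JSON-RPC 2.0 envelope.
function buildToolCall(name: ToolName, args: Record<string, unknown> = {}): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// `briefOnly` is documented above; `concept` and `model` are assumed names.
const briefRequest = buildToolCall("generate_prompts", {
  concept: "moody 1970s editorial portrait",
  model: "Flux 2 Klein 9B",
  briefOnly: true,
});
```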

The server also advertises instructions (a short summary) that compatible clients surface to the agent automatically.

Use it in Langdock (for colleagues)

PromptEnhancer plugs into Langdock as a remote MCP integration, so you can call it from Chat, Agents, and Workflows.

In Langdock, go to Integrations → Connect remote MCP.
Server URL: https://<agency-deployment>/api/mcp (Streamable HTTP). Use /api/sse if you need the SSE transport.
Authentication: choose API Key and paste the token (the value of MCP_AUTH_TOKEN). Langdock formats the header automatically.
Test the connection, then select the tools to import (usage_guide, generate_prompts, enhance_prompt, revise_prompt, reformat_prompt).
Attach the tools to an Agent, call them in Chat, or add them as Action nodes in a Workflow.

Once connected, ask an agent things like "generate 4 cinematic Flux prompts for a moody 1970s editorial portrait"; it calls generate_prompts and returns ready-to-use prompts. Call usage_guide any time for the full list of tools, modes, and supported models. On the agency deployment, text generation runs on Langdock's own models and is billed to the agency; vision analysis still goes through OpenRouter.

The MCP tools share the same logic as the REST routes via src/lib/services.ts. Pass reference images as public URLs where possible; base64 is supported, but the payloads get large over MCP.

Test locally with the MCP Inspector:

npx @modelcontextprotocol/inspector
# connect to http://localhost:3000/api/mcp (Streamable HTTP)

Environment variables

Public tenant (default)

Variable	Required	Description
`OPENROUTER_API_KEY`	Yes	Your OpenRouter API key — stored server-side only
`NEXT_PUBLIC_SITE_URL`	No	Production URL (defaults to `http://localhost:3000`)

Agency tenant (Langdock + OpenRouter hybrid)

When APP_TENANT is set to a non-public value, the app runs in hybrid mode: text generation routes through Langdock, which keeps the bulk of LLM spend on agency billing, while vision routes through OpenRouter, because Langdock's API does not accept image inputs.

Variable	Required	Description
`APP_TENANT`	Yes	Agency identifier (e.g. `WIN`). Any non-empty value other than `public`/`openrouter` activates the agency path
`LANGDOCK_API_KEY`	Yes	Langdock workspace API key — used for text generation
`OPENROUTER_API_KEY`	Yes	Required even on the agency tenant — used for vision. Provider resolution fails fast if missing
`LANGDOCK_REGION`	No	Defaults to `eu`
`LANGDOCK_TEXT_MODEL`	No	Defaults to `gemini-2.5-flash`
`LANGDOCK_TEXT_FALLBACK`	No	Optional Langdock fallback model
`MCP_AUTH_TOKEN`	No	Bearer token guarding `/api/mcp` — recommended for agency deploys
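
The rules in the two tables above can be read as a small resolution function. The sketch below is an illustration under stated assumptions, not the app's real provider code: `resolveProviders` and `ProviderConfig` are hypothetical names, and matching `public`/`openrouter` case-insensitively is an assumption.

```typescript
type Env = Record<string, string | undefined>;

interface ProviderConfig {
  tenant: "public" | "agency";
  openRouterKey: string; // always needed: vision runs on OpenRouter
  langdock?: {
    apiKey: string;
    region: string;
    textModel: string;
    textFallback: string | undefined;
  };
}

function resolveProviders(env: Env): ProviderConfig {
  // Fail fast: OpenRouter is required on every tenant.
  const openRouterKey = env["OPENROUTER_API_KEY"];
  if (!openRouterKey) throw new Error("OPENROUTER_API_KEY is required on every tenant");

  // Assumption: the tenant value is compared case-insensitively.
  const tenant = (env["APP_TENANT"] ?? "").trim().toLowerCase();
  const isPublic = tenant === "" || tenant === "public" || tenant === "openrouter";
  if (isPublic) return { tenant: "public", openRouterKey };

  // Agency path: text generation goes through Langdock.
  const apiKey = env["LANGDOCK_API_KEY"];
  if (!apiKey) throw new Error("LANGDOCK_API_KEY is required on the agency tenant");

  return {
    tenant: "agency",
    openRouterKey,
    langdock: {
      apiKey,
      region: env["LANGDOCK_REGION"] || "eu",
      textModel: env["LANGDOCK_TEXT_MODEL"] || "gemini-2.5-flash",
      textFallback: env["LANGDOCK_TEXT_FALLBACK"] || undefined,
    },
  };
}
```

Passing the environment in as a plain object, rather than reading it globally, keeps the function easy to test with different tenant setups.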

License

MIT
