Inference Proxy Server

A lightweight, secure proxy server that gives your organization controlled access to LLMs via Amazon Bedrock. Built in Rust for performance and reliability.

Employees get API keys to access inference. Every request is authenticated, logged, and auditable. Admins manage keys and review usage — all through a simple REST API.

Amazon Bedrock is the inference provider because AWS commits to never sharing data with model providers and to not using inputs or outputs to train AI models. See https://aws.amazon.com/bedrock/security-compliance/ for details.

Why

Security — Employees never see your Bedrock credentials. They use scoped, revocable API keys.
Audit trail — Every inference request is logged: who, what model, token usage, status, timestamp, source IP.
OpenAI-compatible — Exposes /v1/chat/completions so it works with any OpenAI SDK or tool out of the box.
Zero conversion overhead — Uses Bedrock's native OpenAI-compatible endpoint. Requests and responses pass through unmodified.
Simple auth — Bedrock bearer token auth. No SigV4 complexity.

Architecture

Employee (OpenAI SDK/curl)
    │
    ▼
┌──────────────────────┐
│  Inference Proxy      │
│  :3000                │
│                       │
│  ┌─ /v1/chat/completions ──► Amazon Bedrock
│  │   (API key auth)          (bearer token)
│  │
│  ├─ /admin/api-keys
│  │   (admin key auth)
│  │
│  └─ /admin/audit-logs
│      (admin key auth)
│                       │
└──────────┬────────────┘
           │
           ▼
       PostgreSQL
    (keys + audit logs)

Quick Start

Prerequisites

Rust 1.91+
A Bedrock API key (bearer token)
PostgreSQL (optional — only needed for API key management and audit logging)

Run without database (dev/testing)

cp .env.example .env
# Edit .env — set BEDROCK_BEARER_TOKEN at minimum
cargo run

With no DATABASE_URL, the proxy runs in passthrough mode: no auth required, no audit logging. Good for quick testing.

Run with full auth + audit

# Start Postgres (via Docker or local install)
docker compose up -d

# Run migrations
DATABASE_URL=postgres://proxy:proxy@localhost:5432/inference_proxy cargo run --bin migrate

# Start server
cargo run

Configuration

All configuration via environment variables (or .env file):

| Variable | Required | Default | Description |
|---|---|---|---|
| `BEDROCK_BEARER_TOKEN` | Yes | (none) | Amazon Bedrock API key |
| `ADMIN_API_KEY` | When DB is set | (none) | Secret key for admin endpoints |
| `DATABASE_URL` | No | (none) | Postgres connection string |
| `AWS_REGION` | No | `us-east-1` | AWS region for Bedrock |
| `DEFAULT_MODEL_ID` | No | `openai.gpt-oss-20b-1:0` | Model when not specified in request |
| `BIND_ADDR` | No | `0.0.0.0:3000` | Server listen address |
| `RUST_LOG` | No | `info` | Log level filter |
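
As a rough illustration of how these variables and their defaults fit together, the sketch below resolves them from an environment mapping in Python. The proxy itself is written in Rust and may validate its configuration differently; the function name `load_config`, the returned dict layout, and the `passthrough` flag are hypothetical. The rule that `ADMIN_API_KEY` is required once `DATABASE_URL` is set comes from the table above.

```python
import os
from typing import Mapping, Optional

# Defaults copied from the configuration table; everything else is illustrative.
DEFAULTS = {
    "AWS_REGION": "us-east-1",
    "DEFAULT_MODEL_ID": "openai.gpt-oss-20b-1:0",
    "BIND_ADDR": "0.0.0.0:3000",
    "RUST_LOG": "info",
}


def load_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Resolve proxy settings from an environment mapping (hypothetical helper)."""
    env = os.environ if env is None else env

    token = env.get("BEDROCK_BEARER_TOKEN")
    if not token:
        raise ValueError("BEDROCK_BEARER_TOKEN is required")

    # Treat empty strings the same as unset variables.
    database_url = env.get("DATABASE_URL") or None
    admin_key = env.get("ADMIN_API_KEY") or None
    if database_url and not admin_key:
        raise ValueError("ADMIN_API_KEY is required when DATABASE_URL is set")

    # Optional variables fall back to the documented defaults.
    config = {name: env.get(name) or default for name, default in DEFAULTS.items()}
    config.update(
        BEDROCK_BEARER_TOKEN=token,
        DATABASE_URL=database_url,
        ADMIN_API_KEY=admin_key,
        # No database means passthrough mode: no auth, no audit logging.
        passthrough=database_url is None,
    )
    return config
```

Calling `load_config({"BEDROCK_BEARER_TOKEN": "..."})` with nothing else set yields the defaults and `passthrough=True`, which mirrors the database-free quick start.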

API

Inference

POST /v1/chat/completions — OpenAI-compatible chat completions

curl http://localhost:3000/v1/chat/completions \
  -H "Authorization: Bearer ipx-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai.gpt-oss-20b-1:0",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Works with any OpenAI SDK:

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:3000/v1",
    api_key="ipx-your-api-key"
)

resp = client.chat.completions.create(
    model="openai.gpt-oss-20b-1:0",
    messages=[{"role": "user", "content": "Hello!"}]
)

Admin (requires `ADMIN_API_KEY`)

Create an API key:

curl -X POST http://localhost:3000/admin/api-keys \
  -H "Authorization: Bearer $ADMIN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"name": "alice"}'

The response contains the full key exactly once. Store it securely; it will not be shown again.

List keys:

curl http://localhost:3000/admin/api-keys \
  -H "Authorization: Bearer $ADMIN_API_KEY"

Revoke a key (replace {id} with the ID of the key to revoke):

curl -X DELETE http://localhost:3000/admin/api-keys/{id} \
  -H "Authorization: Bearer $ADMIN_API_KEY"

Query audit logs:

curl "http://localhost:3000/admin/audit-logs?limit=50" \
  -H "Authorization: Bearer $ADMIN_API_KEY"

# Filter by key
curl "http://localhost:3000/admin/audit-logs?api_key_id=uuid-here" \
  -H "Authorization: Bearer $ADMIN_API_KEY"
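
The same audit-log queries can be prepared from Python with only the standard library. This is a minimal sketch: `build_audit_request` is a hypothetical helper, and combining `limit` with `api_key_id` in a single query is an assumption, since the curl examples above only show each parameter on its own.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request


def build_audit_request(
    base_url: str,
    admin_key: str,
    limit: Optional[int] = None,
    api_key_id: Optional[str] = None,
) -> Request:
    """Build (but do not send) a GET request for /admin/audit-logs."""
    params = {}
    if limit is not None:
        params["limit"] = limit
    if api_key_id is not None:
        params["api_key_id"] = api_key_id

    url = base_url.rstrip("/") + "/admin/audit-logs"
    if params:
        # urlencode escapes values, so IDs with special characters stay safe.
        url += "?" + urlencode(params)

    # The admin key is sent as a bearer token, as in the curl examples.
    return Request(url, headers={"Authorization": f"Bearer {admin_key}"})
```

Pass the returned object to `urllib.request.urlopen` to send it. The response format is not documented in this README, so inspect the body before relying on particular fields.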

Security Model

API keys are SHA-256 hashed before storage. Plaintext is shown exactly once at creation.
A key prefix (ipx-xxxx) is stored so a key can be identified without exposing the full key.
Admin key is checked with a constant-time string comparison and never stored in the database.
Bedrock credentials never leave the server. Employees only interact with proxy-issued keys.
Revocation is instant — revoked keys are rejected on the next request.
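
The hashing and comparison ideas above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code in auth.rs: the function names are hypothetical, and the random key length and the four visible characters after `ipx-` are guesses based on the `ipx-xxxx` pattern.

```python
import hashlib
import hmac
import secrets

KEY_PREFIX = "ipx-"   # prefix used by the example keys in this README
VISIBLE_CHARS = 4     # assumed number of characters kept for identification


def generate_api_key() -> tuple[str, str, str]:
    """Return (plaintext, display_prefix, sha256_hex) for a new key."""
    plaintext = KEY_PREFIX + secrets.token_urlsafe(32)
    display_prefix = plaintext[: len(KEY_PREFIX) + VISIBLE_CHARS]
    # Only the digest and the short prefix would be persisted.
    digest = hashlib.sha256(plaintext.encode("utf-8")).hexdigest()
    return plaintext, display_prefix, digest


def key_matches(presented: str, stored_digest: str) -> bool:
    """Hash the presented key and compare it against the stored digest."""
    presented_digest = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    return hmac.compare_digest(presented_digest, stored_digest)


def admin_key_matches(presented: str, configured: str) -> bool:
    """Constant-time comparison of the admin key against the configured value."""
    return hmac.compare_digest(presented.encode("utf-8"), configured.encode("utf-8"))
```

Because only the digest is stored, a database leak does not reveal usable keys, and `hmac.compare_digest` avoids leaking how many leading characters matched through timing differences.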

Project Structure

src/
  main.rs       — Config, startup, router
  auth.rs       — Key generation, hashing, auth middleware
  bedrock.rs    — HTTP call to Bedrock OpenAI-compatible endpoint
  routes.rs     — /v1/chat/completions + admin CRUD
  db.rs         — Postgres queries (api_keys, audit_logs)
  bin/
    migrate.rs  — Database migration runner
migrations/
  001_init.sql  — Schema (api_keys, audit_logs tables)

License

MIT
