rust4ai/inference-proxy-server

Inference Proxy Server

A lightweight, secure proxy server that gives your organization controlled access to LLMs via Amazon Bedrock. Built in Rust for performance and reliability.

Employees get API keys to access inference. Every request is authenticated, logged, and auditable. Admins manage keys and review usage — all through a simple REST API.

Amazon Bedrock is the inference provider because AWS states that Bedrock does not share your data with model providers and does not use inputs or outputs to train AI models. See https://aws.amazon.com/bedrock/security-compliance/

Why

  • Security — Employees never see your Bedrock credentials. They use scoped, revocable API keys.
  • Audit trail — Every inference request is logged: who, what model, token usage, status, timestamp, source IP.
  • OpenAI-compatible — Exposes /v1/chat/completions so it works with any OpenAI SDK or tool out of the box.
  • Zero conversion overhead — Uses Bedrock's native OpenAI-compatible endpoint. Requests and responses pass through unmodified.
  • Simple auth — Bedrock bearer token auth. No SigV4 complexity.
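As a rough illustration of the audit trail described above, the fields of one logged request could be modeled like this. The field names and types are assumptions made for this sketch; the actual schema lives in migrations/001_init.sql and may differ.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditRecord:
    """One logged inference request. All field names are hypothetical."""
    api_key_id: str          # who: the proxy-issued key that made the call
    model: str               # what model was requested
    prompt_tokens: int       # token usage, input side
    completion_tokens: int   # token usage, output side
    status: int              # HTTP status returned to the caller
    timestamp: datetime      # when the request was handled (UTC)
    source_ip: str           # address the request came from


# Example record with made-up values.
record = AuditRecord(
    api_key_id="key-123",
    model="openai.gpt-oss-20b-1:0",
    prompt_tokens=12,
    completion_tokens=34,
    status=200,
    timestamp=datetime(2025, 1, 1, tzinfo=timezone.utc),
    source_ip="203.0.113.7",
)
row = asdict(record)  # plain dict, convenient for serializing or inserting
```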

Architecture

Employee (OpenAI SDK/curl)
    │
    ▼
┌──────────────────────┐
│  Inference Proxy      │
│  :3000                │
│                       │
│  ┌─ /v1/chat/completions ──► Amazon Bedrock
│  │   (API key auth)          (bearer token)
│  │
│  ├─ /admin/api-keys
│  │   (admin key auth)
│  │
│  └─ /admin/audit-logs
│      (admin key auth)
│                       │
└──────────┬────────────┘
           │
           ▼
       PostgreSQL
    (keys + audit logs)

Quick Start

Prerequisites

  • Rust 1.91+
  • A Bedrock API key (bearer token)
  • PostgreSQL (optional — only needed for API key management and audit logging)

Run without database (dev/testing)

cp .env.example .env
# Edit .env — set BEDROCK_BEARER_TOKEN at minimum
cargo run

With no DATABASE_URL, the proxy runs in passthrough mode: no auth required, no audit logging. Good for quick testing.

Run with full auth + audit

# Start Postgres (via Docker or local install)
docker compose up -d

# Run migrations
DATABASE_URL=postgres://proxy:proxy@localhost:5432/inference_proxy cargo run --bin migrate

# Start server
cargo run

Configuration

All configuration is done through environment variables (or a .env file):

Variable              Required        Default                 Description
BEDROCK_BEARER_TOKEN  Yes             (none)                  Amazon Bedrock API key
ADMIN_API_KEY         When DB is set  (none)                  Secret key for admin endpoints
DATABASE_URL          No              (none)                  Postgres connection string
AWS_REGION            No              us-east-1               AWS region for Bedrock
DEFAULT_MODEL_ID      No              openai.gpt-oss-20b-1:0  Model when not specified in request
BIND_ADDR             No              0.0.0.0:3000            Server listen address
RUST_LOG              No              info                    Log level filter
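The rules in the table can be sketched as a small loader. This is a Python illustration only, not the server's Rust code. The `Config` class and `load_config` function are hypothetical names, and treating an empty value as unset is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Config:
    bedrock_bearer_token: str
    admin_api_key: Optional[str]
    database_url: Optional[str]
    aws_region: str
    default_model_id: str
    bind_addr: str
    rust_log: str


def load_config(env: Mapping[str, str] = os.environ) -> Config:
    """Apply the required/default rules from the configuration table."""
    token = env.get("BEDROCK_BEARER_TOKEN")
    if not token:
        raise ValueError("BEDROCK_BEARER_TOKEN is required")

    # Assumption: an empty string counts as "not set".
    database_url = env.get("DATABASE_URL") or None
    admin_api_key = env.get("ADMIN_API_KEY") or None
    if database_url and not admin_api_key:
        raise ValueError("ADMIN_API_KEY is required when DATABASE_URL is set")

    return Config(
        bedrock_bearer_token=token,
        admin_api_key=admin_api_key,
        database_url=database_url,
        aws_region=env.get("AWS_REGION", "us-east-1"),
        default_model_id=env.get("DEFAULT_MODEL_ID", "openai.gpt-oss-20b-1:0"),
        bind_addr=env.get("BIND_ADDR", "0.0.0.0:3000"),
        rust_log=env.get("RUST_LOG", "info"),
    )
```

Passing a plain dict instead of `os.environ` makes the rules easy to check without touching the real environment.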

API

Inference

POST /v1/chat/completions — OpenAI-compatible chat completions

curl http://localhost:3000/v1/chat/completions \
  -H "Authorization: Bearer ipx-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai.gpt-oss-20b-1:0",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Works with any OpenAI SDK:

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:3000/v1",
    api_key="ipx-your-api-key"
)

resp = client.chat.completions.create(
    model="openai.gpt-oss-20b-1:0",
    messages=[{"role": "user", "content": "Hello!"}]
)

Admin (requires ADMIN_API_KEY)

Create an API key:

curl -X POST http://localhost:3000/admin/api-keys \
  -H "Authorization: Bearer $ADMIN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"name": "alice"}'

Returns the full key once. Store it; it won't be shown again.

List keys:

curl http://localhost:3000/admin/api-keys \
  -H "Authorization: Bearer $ADMIN_API_KEY"

Revoke a key (replace {id} with the ID of the key to revoke):

curl -X DELETE http://localhost:3000/admin/api-keys/{id} \
  -H "Authorization: Bearer $ADMIN_API_KEY"

Query audit logs:

curl "http://localhost:3000/admin/audit-logs?limit=50" \
  -H "Authorization: Bearer $ADMIN_API_KEY"

# Filter by key
curl "http://localhost:3000/admin/audit-logs?api_key_id=uuid-here" \
  -H "Authorization: Bearer $ADMIN_API_KEY"
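The same audit-log queries can be issued from a script. The sketch below uses only the Python standard library. The helper names are hypothetical, and two things are assumptions rather than documented behavior: that `limit` and `api_key_id` can be combined, and that the response body is JSON.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Optional


def audit_logs_request(
    base_url: str,
    admin_key: str,
    limit: Optional[int] = None,
    api_key_id: Optional[str] = None,
) -> urllib.request.Request:
    """Build a GET request for /admin/audit-logs with optional filters."""
    params = {}
    if limit is not None:
        params["limit"] = str(limit)
    if api_key_id is not None:
        params["api_key_id"] = api_key_id

    url = base_url.rstrip("/") + "/admin/audit-logs"
    if params:
        url += "?" + urllib.parse.urlencode(params)
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {admin_key}"}
    )


def fetch_audit_logs(request: urllib.request.Request) -> Any:
    """Send the request and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

For example, `fetch_audit_logs(audit_logs_request("http://localhost:3000", admin_key, limit=50))` mirrors the first curl command above, given a running server.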

Security Model

  • API keys are SHA-256 hashed before storage. Plaintext is shown exactly once at creation.
  • A key prefix (ipx-xxxx) is stored for identification without exposing the full key.
  • The admin key is checked with a constant-time string comparison and is never stored in the database.
  • Bedrock credentials never leave the server. Employees only interact with proxy-issued keys.
  • Revocation is instant — revoked keys are rejected on the next request.
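The points above can be illustrated with a minimal Python sketch. It is not the code in auth.rs. The random part of the key, the in-memory `KeyStore`, and all function names are assumptions made for the example; only the SHA-256 hashing, the ipx- prefix, the constant-time admin check, and immediate revocation come from the list.

```python
import hashlib
import hmac
import secrets
from typing import Dict, Tuple


def generate_api_key() -> Tuple[str, str, str]:
    """Return (plaintext, sha256_hex, prefix). Only the last two get stored."""
    plaintext = "ipx-" + secrets.token_urlsafe(32)  # key format is assumed
    key_hash = hashlib.sha256(plaintext.encode()).hexdigest()
    prefix = plaintext[:8]  # "ipx-" plus four characters, for identification
    return plaintext, key_hash, prefix


class KeyStore:
    """In-memory stand-in for the api_keys table (hypothetical)."""

    def __init__(self) -> None:
        self._keys: Dict[str, Dict[str, object]] = {}

    def add(self, key_hash: str, prefix: str) -> None:
        self._keys[key_hash] = {"prefix": prefix, "revoked": False}

    def revoke(self, key_hash: str) -> None:
        self._keys[key_hash]["revoked"] = True

    def is_valid(self, presented_key: str) -> bool:
        # Hash the presented key and look it up; plaintext is never stored.
        key_hash = hashlib.sha256(presented_key.encode()).hexdigest()
        record = self._keys.get(key_hash)
        return record is not None and not record["revoked"]


def admin_key_matches(presented: str, expected: str) -> bool:
    """Constant-time comparison, so timing does not leak matching prefixes."""
    return hmac.compare_digest(presented.encode(), expected.encode())
```

Because validity is looked up on every call, a key revoked in the store fails the very next `is_valid` check, which is the behavior the last bullet describes.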

Project Structure

src/
  main.rs       — Config, startup, router
  auth.rs       — Key generation, hashing, auth middleware
  bedrock.rs    — HTTP call to Bedrock OpenAI-compatible endpoint
  routes.rs     — /v1/chat/completions + admin CRUD
  db.rs         — Postgres queries (api_keys, audit_logs)
  bin/
    migrate.rs  — Database migration runner
migrations/
  001_init.sql  — Schema (api_keys, audit_logs tables)

License

MIT
