AI-Token-Monitor 🚀

Track token usage and cost for OpenRouter, Groq, and OpenAI, with seamless FastAPI integration.
Monitor the tokens spent per request and per chat session, with cost estimates based on real model pricing.


Introduction 📖

AI-Token-Monitor solves the challenge of tracking token consumption and cost across multiple AI providers such as OpenRouter, Groq, and OpenAI. It provides fine-grained monitoring per chat session and request, enabling developers to optimize usage and control expenses effectively.

This project is aimed at AI developers, API integrators, and product teams who rely on OpenAI-compatible APIs and want transparent, accurate token usage and billing metrics integrated directly into their FastAPI applications.

| Feature | AI-Token-Monitor | Alternative A | Alternative B |
|---|---|---|---|
| Multi-provider token tracking | ✅ OpenRouter, Groq, OpenAI | ❌ Limited to OpenAI only | ❌ Limited to single provider |
| FastAPI middleware support | ✅ Built-in middleware | ❌ Requires custom setup | ❌ No middleware |
| Per chat session tracking | ✅ Yes | ❌ No | ❌ No |
| Real-time cost estimation | ✅ Based on live pricing | ❌ Static pricing only | ❌ No cost estimate |
| Simple in-memory storage | ✅ Default with option to extend | ❌ No storage or DB only | ❌ Only cloud DB |
| Open-source & extensible | ✅ Fully open and modular | ❌ Closed or proprietary | ❌ Limited extensibility |

Features ✨

Core Features

  • 🔍 Track tokens used per request and chat session with detailed logs
  • 💰 Real-time cost estimation using up-to-date model pricing from OpenRouter
  • 🔗 Support for multiple AI providers: OpenRouter, Groq, OpenAI
  • 🛠️ FastAPI middleware integration for automatic request tracking
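
As a rough illustration of the pricing feature above, the sketch below turns a model listing into a per-token price table. The payload shape (a `data` list whose entries carry an `id` and a `pricing` object with string-valued `prompt` and `completion` prices in USD per token) is an assumption modelled on OpenRouter's public model listing, the sample values are illustrative only, and `build_price_table` is a hypothetical helper, not part of this package.

```python
from decimal import Decimal

# Hypothetical sample shaped like a model listing; prices are illustrative only.
SAMPLE_MODELS_PAYLOAD = {
    "data": [
        {"id": "openai/gpt-4o-mini",
         "pricing": {"prompt": "0.00000015", "completion": "0.0000006"}},
        {"id": "example/free-model",
         "pricing": {"prompt": "0", "completion": "0"}},
    ]
}


def build_price_table(payload):
    """Map model id -> (prompt price, completion price) in USD per token."""
    table = {}
    for entry in payload.get("data", []):
        pricing = entry.get("pricing") or {}
        # Prices arrive as strings; Decimal avoids float rounding on tiny values.
        table[entry["id"]] = (
            Decimal(pricing.get("prompt", "0")),
            Decimal(pricing.get("completion", "0")),
        )
    return table


PRICES = build_price_table(SAMPLE_MODELS_PAYLOAD)
```

Keeping prices as `Decimal` is one possible design choice; per-token prices are small enough that accumulated float error can become visible in long-running totals.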

Developer Experience

  • 🧩 Modular design with components like TokenMonitor, ChatManager, and Storage
  • ⚡ Lightweight in-memory storage by default, easily replaceable with any DB adapter
  • 🐍 Pythonic API with simple methods for adding messages and tracking tokens
  • 📦 Packaged as a PyPI-installable library for easy integration
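
To make the "replaceable storage" idea concrete, here is a minimal sketch of what a swappable backend could look like. The `UsageStorage` protocol, its `save`/`all` methods, and both classes are hypothetical names chosen for illustration; the package's actual `Storage` interface may differ.

```python
import sqlite3
from typing import Protocol


class UsageStorage(Protocol):
    """Hypothetical interface: anything with save() and all() can be a backend."""
    def save(self, record: dict) -> None: ...
    def all(self) -> list: ...


class ListStorage:
    """In-memory backend: records live in a plain Python list."""
    def __init__(self):
        self._records = []

    def save(self, record):
        self._records.append(dict(record))

    def all(self):
        return list(self._records)


class SQLiteStorage:
    """Database backend with the same two methods, using the stdlib sqlite3 module."""
    def __init__(self, path=":memory:"):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS usage_log "
            "(chat_id TEXT, model TEXT, total_tokens INTEGER)"
        )

    def save(self, record):
        self._conn.execute(
            "INSERT INTO usage_log VALUES (?, ?, ?)",
            (record["chat_id"], record["model"], record["total_tokens"]),
        )
        self._conn.commit()

    def all(self):
        rows = self._conn.execute(
            "SELECT chat_id, model, total_tokens FROM usage_log"
        ).fetchall()
        return [{"chat_id": c, "model": m, "total_tokens": t} for c, m, t in rows]


def total_tokens(storage: UsageStorage) -> int:
    """Works with either backend because it relies only on all()."""
    return sum(record["total_tokens"] for record in storage.all())
```

Code that depends only on the two methods, such as `total_tokens`, does not need to change when the backend is swapped.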

Deployment

  • 🚀 Ready to deploy with FastAPI applications out-of-the-box
  • 🔒 Secure environment variable management via dotenv support
  • 📝 Clear logging with configurable verbosity for production and development

Architecture 🏗️

flowchart LR
    Client[Client Request] --> API[FastAPI Application]
    API --> Middleware[TokenMonitorMiddleware]
    Middleware --> ChatManager[ChatManager]
    Middleware --> TokenMonitor[TokenMonitor]
    TokenMonitor --> Storage[InMemoryStorage]
    ChatManager --> Storage
    API --> ExternalAI[OpenRouter Groq OpenAI APIs]
    ExternalAI --> API

| Component | Role | Technology |
|---|---|---|
| FastAPI Application | Serves API endpoints and handles client requests | Python FastAPI |
| TokenMonitorMiddleware | Intercepts and tracks tokens for each API request | Starlette Middleware |
| ChatManager | Manages chat sessions and messages | Python Class |
| TokenMonitor | Tracks tokens and estimates cost per response | Python Class |
| InMemoryStorage | Stores token usage logs for retrieval | Python List-based |
| ExternalAI APIs | Provides AI model responses | OpenRouter, Groq, OpenAI |

Workflow 🔄

sequenceDiagram
    actor User
    participant ClientApp
    participant FastAPI
    participant Middleware
    participant ChatManager
    participant TokenMonitor
    participant Storage
    participant ExternalAI

    User->>ClientApp: Sends chat message
    ClientApp->>FastAPI: POST chat message
    FastAPI->>Middleware: Process request
    Middleware->>ChatManager: Create or retrieve chat session
    FastAPI->>ExternalAI: Forward chat message
    ExternalAI-->>FastAPI: Returns AI response
    Middleware->>TokenMonitor: Track tokens and cost
    TokenMonitor->>Storage: Save token logs
    Middleware-->>FastAPI: Pass updated response
    FastAPI-->>ClientApp: Send AI response with usage info
    ClientApp-->>User: Display chat and cost

  1. The user sends a chat message via the client app.
  2. FastAPI receives the message and passes it through TokenMonitorMiddleware.
  3. The middleware invokes ChatManager to create or fetch the chat session.
  4. FastAPI forwards the message to the external AI provider (OpenRouter, Groq, or OpenAI).
  5. The AI provider returns a response with token usage metadata.
  6. The middleware calls TokenMonitor to extract the token counts and calculate the cost.
  7. TokenMonitor stores the usage logs in InMemoryStorage.
  8. The response, with usage details, is returned to the client app and displayed to the user.
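
Steps 5 to 7 above can be sketched in plain Python. This is not the package's implementation: `track`, `totals_by_chat`, `usage_log`, and the prices in `PRICE_PER_TOKEN` are hypothetical, and the response is assumed to be an OpenAI-compatible dict whose `usage` object carries `prompt_tokens`, `completion_tokens`, and `total_tokens`.

```python
from collections import defaultdict

# Hypothetical USD-per-token prices, for illustration only.
PRICE_PER_TOKEN = {
    "example-model": {"prompt": 0.000001, "completion": 0.000002},
}

usage_log = []  # stands in for the in-memory storage


def track(response, chat_id):
    """Extract token counts, estimate the cost, and append a log record."""
    usage = response.get("usage", {})
    prompt = usage.get("prompt_tokens", 0)
    completion = usage.get("completion_tokens", 0)
    total = usage.get("total_tokens", prompt + completion)

    # Unknown models fall back to a zero price rather than raising.
    price = PRICE_PER_TOKEN.get(
        response.get("model"), {"prompt": 0.0, "completion": 0.0}
    )
    cost = prompt * price["prompt"] + completion * price["completion"]

    record = {
        "chat_id": chat_id,
        "model": response.get("model"),
        "total_tokens": total,
        "cost_usd": cost,
    }
    usage_log.append(record)
    return record


def totals_by_chat():
    """Aggregate the stored records per chat session."""
    totals = defaultdict(lambda: {"tokens": 0, "cost": 0.0})
    for record in usage_log:
        totals[record["chat_id"]]["tokens"] += record["total_tokens"]
        totals[record["chat_id"]]["cost"] += record["cost_usd"]
    return dict(totals)
```

Note that when a response reports only `total_tokens`, as in a simulated payload, this sketch cannot split the count into prompt and completion tokens, so the estimated cost for that record is zero.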

Tech Stack 🛠️

| Layer | Technology | Purpose |
|---|---|---|
| API Framework | FastAPI | Web API and middleware |
| Middleware | Starlette Middleware | Request interception and tracking |
| AI SDK | OpenAI Python SDK | Interact with AI providers |
| Storage | In-memory Python List | Temporary token usage storage |
| Environment | Python dotenv | Secure environment variable management |
| Logging | Python logging | Structured application logs |

Installation 📥

Prerequisites

  • Python 3.8 or newer
  • pip package manager
  • OpenRouter API key (or keys for Groq/OpenAI)

Quick Start

git clone https://github.com/Tharanika-R-Git/AI-Token-Monitor.git
cd AI-Token-Monitor
pip install -r requirements.txt

Environment Setup

cp .env.example .env
# Edit .env to add your OPENROUTER_API_KEY and other credentials
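
A missing key otherwise tends to surface only as a `KeyError` or an authentication failure on the first request, so it can help to check for it at startup. The helper below is a hypothetical sketch using only the standard library; it assumes the `.env` values have already been loaded into the process environment (for example by `load_dotenv()`).

```python
import os


def require_api_key(name="OPENROUTER_API_KEY", environ=None):
    """Return the named key, or fail early with a readable message."""
    env = os.environ if environ is None else environ
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to .env or export it before starting the app"
        )
    return value
```

Calling `require_api_key()` once during application startup turns a late failure into an immediate, descriptive one.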

Project Structure 🗂️

AI-Token-Monitor/
├── ai_token_monitor/
│   ├── __init__.py         # Package initialization
│   ├── chat.py             # Chat session management
│   ├── logger.py           # Logging setup
│   ├── middleware.py       # FastAPI middleware for token tracking
│   ├── monitor.py          # Token tracking and cost calculation
│   ├── pricing.py          # Model pricing data and utilities
│   ├── storage.py          # In-memory storage for logs
│   └── utils.py            # Utility functions for response normalization
├── fastapi_app.py          # Demo FastAPI application using the package
├── requirements.txt        # Python dependencies
├── .env.example            # Example environment variables file
└── README.md               # This documentation

Usage 🚀

Basic Example

from ai_token_monitor import TokenMonitor, ChatManager

monitor = TokenMonitor()
chat_manager = ChatManager()

chat_id = chat_manager.create_chat(user_id="user123")
# Simulate adding a message and AI response tracking
chat_manager.add_message(chat_id, role="user", content="Hello AI!")
response = {
    "usage": {"total_tokens": 50},
    "model": "openai-gpt-4",
    "choices": [{"message": {"content": "Hello user!"}}]
}
monitor.track(response, model="openai-gpt-4", chat_manager=chat_manager, chat_id=chat_id)

print(f"Total tokens used: {monitor.total_tokens}")
print(f"Total cost estimate USD: {monitor.total_cost:.4f}")
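
The example above passes a plain dict, whereas an SDK client returns an object with attributes. A normalization step of the kind `utils.py` is described as providing might look roughly like the following; `normalize_usage` is a hypothetical name, and the field names assume the OpenAI-compatible `usage` layout.

```python
from types import SimpleNamespace


def normalize_usage(response):
    """Return token counts as a dict, for dict or attribute-style responses."""
    if isinstance(response, dict):
        usage = response.get("usage")
    else:
        usage = getattr(response, "usage", None)

    def read(field):
        # Missing or None fields count as zero.
        if usage is None:
            return 0
        if isinstance(usage, dict):
            value = usage.get(field)
        else:
            value = getattr(usage, field, None)
        return int(value or 0)

    prompt = read("prompt_tokens")
    completion = read("completion_tokens")
    # Fall back to the sum when the provider omits total_tokens.
    total = read("total_tokens") or (prompt + completion)
    return {
        "prompt_tokens": prompt,
        "completion_tokens": completion,
        "total_tokens": total,
    }


# Both shapes produce the same kind of result.
from_dict = normalize_usage({"usage": {"total_tokens": 50}})
from_object = normalize_usage(
    SimpleNamespace(usage=SimpleNamespace(
        prompt_tokens=3, completion_tokens=4, total_tokens=7))
)
```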

Advanced Example with FastAPI Integration

import os
from fastapi import FastAPI, Request
from openai import OpenAI
from dotenv import load_dotenv
from ai_token_monitor import TokenMonitor, ChatManager

load_dotenv()

app = FastAPI(title="AI Token Monitor Demo")

monitor = TokenMonitor()
chat_manager = ChatManager()
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"]
)

@app.post("/chat/{user_id}")
async def chat(user_id: str, request: Request):
    body = await request.json()
    message = body.get("message", "")
    chat_id = chat_manager.create_chat(user_id)
    chat_manager.add_message(chat_id, role="user", content=message)

    # Call the AI provider. OpenRouter model IDs use the "provider/model" form.
    response = client.chat.completions.create(
        model="openai/gpt-4o-mini",
        messages=[{"role": "user", "content": message}]
    )

    # Track tokens and cost, then record the assistant's reply in the chat session
    monitor.track(response, model="openai/gpt-4o-mini", chat_manager=chat_manager, chat_id=chat_id)
    chat_manager.add_message(chat_id, role="assistant", content=response.choices[0].message.content)

    return {
        "response": response.choices[0].message.content,
        "total_tokens": monitor.total_tokens,
        "total_cost": monitor.total_cost
    }

Thank you for using AI-Token-Monitor! For issues, feature requests, or contributions, please open an issue or pull request on GitHub.

License

This project is licensed under the MIT License.


🔗 GitHub Repo: https://github.com/Tharanika-R-Git/AI-Token-Monitor
