FrogShield: LLM Prompt Injection Defense Framework (Educational Demo)

FrogShield is an educational framework that demonstrates defensive strategies against prompt injection attacks in Large Language Models (LLMs).

⚠️ Disclaimer: This framework is an educational prototype. It lacks the robustness, sophistication, and comprehensive testing required for production environments. Use with caution and for learning purposes only.

Purpose

The goal of FrogShield is to illustrate three key layers of defense against prompt injection:

Input Validation: Analyzing incoming prompts against known malicious patterns (loaded from config_data/patterns.txt) and using placeholder functions (frogshield/utils/text_analysis.py) to check for suspicious syntax or context manipulation.
Model Hardening (Conceptual): The frogshield/model_hardener.py module provides methods for generating adversarial examples and testing model boundaries conceptually. It does not perform actual model training.
Real-time Monitoring: Analyzing the LLM's output for suspicious keywords (loaded from config_data/) or refusal messages, and monitoring basic behavioral patterns (like response length) for anomalies using placeholder logic (frogshield/realtime_monitor.py).

Components

The project includes the following key files and directories:

README.md: This file.
LICENSE: The MIT license file.
pyproject.toml: Project build configuration and core library dependencies (e.g., PyYAML).
requirements.txt: Lists additional dependencies needed for specific demos (e.g., ollama).
config.yaml: Main configuration file for tuning FrogShield components and specifying list file paths.
config_data/: Directory containing external data files:
- patterns.txt: Default file containing known injection patterns.
- refusal_keywords.txt: Keywords indicating appropriate LLM refusal.
- compliance_keywords.txt: Keywords indicating potential sensitive data leaks or compliance.
- sample_prompts.txt: Sample prompts used by demo_mock.py.
- boundary_refusal_keywords.txt: Refusal keywords used by demo_ollama.py boundary tests.
frogshield/: Directory containing the core defense library modules.
- __init__.py: Makes frogshield importable and defines the public API (InputValidator, RealtimeMonitor, ModelHardener).
- input_validator.py: Contains the InputValidator class for checking user input.
- model_hardener.py: Contains the ModelHardener class for conceptual hardening tasks.
- realtime_monitor.py: Contains the RealtimeMonitor class for checking LLM output.
- utils/: Sub-directory for shared utilities.
  - __init__.py: Package initializer.
  - config_loader.py: Utility for loading config.yaml and list files.
  - text_analysis.py: Placeholder functions for syntax/context analysis (used by InputValidator).
- tests/: Contains unit tests using Python's unittest framework.
  - Includes tests for validation, monitoring, and hardening modules.
demo_mock.py: Script demonstrating FrogShield with a simple, built-in mock LLM.
demo_ollama.py: Script demonstrating FrogShield with a local LLM run via Ollama.
run_demo.sh: Shell script to run the demo_ollama.py steps sequentially and interactively.
.gitignore: Standard Python gitignore file.

Configuration

FrogShield's behavior and external data sources are configured primarily through config.yaml.

Main Configuration (`config.yaml`)

Most configurable parameters for InputValidator, RealtimeMonitor, and the underlying TextAnalysis utilities are defined in config.yaml located in the project root. The components load these settings automatically if not overridden during instantiation.

Parameter Details:

`InputValidator`

context_window (int): Number of past conversation turns to consider for context analysis.

`RealtimeMonitor`

sensitivity_threshold (float, 0.0-1.0): Base sensitivity for detecting behavioral anomalies (e.g., response length deviations).
initial_avg_length (int): Starting guess for average response length (used initially).
behavior_monitoring_factor (float): Multiplier applied to sensitivity_threshold to adjust the acceptable deviation range for length checks.

`TextAnalysis`

syntax_non_alnum_threshold (float): Max allowed ratio of non-alphanumeric/non-space characters in a prompt.
syntax_max_word_length (int): Max allowed length for a single "word".

`ListFiles`

Purpose: Defines the paths (relative to the project root) for external data files used by components and demos.
Keys:
- patterns: Path to injection patterns file (used by InputValidator).
- refusal_keywords: Path to refusal keywords file (used by RealtimeMonitor).
- compliance_keywords: Path to compliance/sensitive keywords file (used by RealtimeMonitor).
- sample_prompts: Path to sample prompts file (used by demo_mock.py).
- boundary_refusal_keywords: Path to boundary test refusal keywords (used by demo_ollama.py).

List Files (`config_data/`)

External lists like injection patterns and keywords are stored as plain text files (one item per line, # comments ignored) within the config_data/ directory. The specific file used for each list type is determined by the paths set in the ListFiles section of config.yaml.

To customize these lists:

Edit the files within config_data/ directly.
Modify config.yaml to point to different files (ensure they are placed correctly relative to the project root).
Pass the content directly: When initializing components like InputValidator or RealtimeMonitor programmatically, you can pass the list/set content directly via arguments (e.g., patterns=[...], refusal_keywords={...}), bypassing the file loading mechanism.

Logging

The frogshield library modules use Python's standard logging module.
The demo scripts configure basic console logging to show INFO level messages by default.
To see more detailed DEBUG messages (e.g., specific pattern matches, analysis steps), use the --debug flag when running the demo scripts (e.g., python demo_ollama.py --prompt "Hello" --debug).

How to Run

Clone the Repository:

git clone https://github.com/blakeben/FrogShield.git # Or your fork
cd FrogShield

Set up Python Environment: (Recommended)

python3 -m venv venv
source venv/bin/activate
# On Windows use venv\Scripts\activate

Install Dependencies: Install the core library (editable mode) and demo dependencies.

# Installs frogshield + core deps (PyYAML)
pip install -e .

# Installs demo deps (ollama)
pip install -r requirements.txt

Verify Configuration: Ensure config.yaml and the necessary files within config_data/ exist.
Prepare Ollama (if using demo_ollama.py):
- Ensure Ollama is installed and the server is running (ollama serve in a separate terminal).
- Pull the desired model (the demo defaults to llama3): ollama pull llama3
Run Demos:
- Mock Demo: Runs through predefined prompts loaded from config_data/sample_prompts.txt.
```
python demo_mock.py
```
- Ollama Demo (Interactive): Uses run_demo.sh for a guided walkthrough.
```
chmod +x run_demo.sh
./run_demo.sh
```
- Ollama Demo (Single Prompt): Run a specific prompt through the framework.
```
python demo_ollama.py --prompt "Your prompt here"
```
- Ollama Demo (Boundary Test): Run the predefined boundary test suite using keywords from config_data/boundary_refusal_keywords.txt.
```
python demo_ollama.py --test-boundaries
```
- Specify Model/Debug: Use --model <model_name> or --debug with demo_ollama.py.
Run Unit Tests:
```
python -m unittest discover -v
```

Current Status & Limitations

Basic Functionality: Core components (InputValidator, RealtimeMonitor, ModelHardener) are implemented with foundational logic.
Centralized Config: List data (patterns, keywords) is externalized to config_data/ and managed via config.yaml.
Placeholder Analysis: Syntax/context analysis (text_analysis.py) and behavioral monitoring use simplified, placeholder heuristics.
Conceptual Hardening: ModelHardener demonstrates boundary testing and example generation concepts but lacks integration with actual model training.
Packaging: Basic packaging (pyproject.toml) allows local editable installation.
Testing: Unit tests cover basic functionality and currently pass.
Not Production Ready: This framework requires significant development for real-world use.

Future Development Ideas

Implement robust pattern matching (e.g., regex, semantic similarity).
Develop advanced syntax and context analysis using NLP techniques.
Integrate with additional LLM APIs and providers.
Refine baseline modeling and anomaly detection (e.g., statistical methods, sequence analysis).
Explore actual model fine-tuning/hardening techniques.
Implement more sophisticated adaptive response strategies.
Expand unit and integration test coverage.

Contributors

Author: Ben Blake <ben.blake@tcu.edu>
Contributor: Tanner Hendrix <t.hendrix@tcu.edu>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

FrogShield: LLM Prompt Injection Defense Framework (Educational Demo)

Table of Contents

Purpose

Components

Configuration

Main Configuration (`config.yaml`)

Parameter Details:

`InputValidator`

`RealtimeMonitor`

`TextAnalysis`

`ListFiles`

List Files (`config_data/`)

Logging

How to Run

Current Status & Limitations

Future Development Ideas

Contributors

About

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 71 Commits
config_data		config_data
frogshield		frogshield
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
config.yaml		config.yaml
demo_mock.py		demo_mock.py
demo_ollama.py		demo_ollama.py
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt
run_demo.sh		run_demo.sh

Folders and files

Latest commit

History

Repository files navigation

FrogShield: LLM Prompt Injection Defense Framework (Educational Demo)

Table of Contents

Purpose

Components

Configuration

Main Configuration (config.yaml)

Parameter Details:

InputValidator

RealtimeMonitor

TextAnalysis

ListFiles

List Files (config_data/)

Logging

How to Run

Current Status & Limitations

Future Development Ideas

Contributors

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Uh oh!

Contributors

Uh oh!

Languages

Main Configuration (`config.yaml`)

`InputValidator`

`RealtimeMonitor`

`TextAnalysis`

`ListFiles`

List Files (`config_data/`)