A MATLAB-based project that implements a secure visual communication system using speech-to-text, AES encryption, BCH error correction coding, and visual data transmission through screen capture.
This project demonstrates a complete cryptographic communication pipeline where:
- Speech input is converted to text
- Text is encrypted using AES-256-CBC
- Encrypted data is encoded with BCH error correction
- Data is visualized as a binary image with a green border
- The image is transmitted via screen display
- The receiver captures the screen, decodes and decrypts the data, and converts the text back to speech
The flowchart illustrates the complete transmitter (TX) and receiver (RX) execution pipeline:
- Speech Input - Voice capture via microphone
- Speech to Text - Convert speech to text string
- Data Preparation - AES setup and system initialization
- Text to Binary - Convert text to binary representation
- AES Encryption - Encrypt with AES-256-CBC
- BCH Encoding - Add error correction codes
- Image Generation - Create binary grid visualization
- Padding Data - Fill to required image dimensions
- Full Screen Display - Transmit via screen display
- Image Capture - Screen capture of transmitted image
- Border Detection & Data Extraction - Locate green border and extract data
- Binarize - Convert to binary representation
- Decode Binary Data - Extract bit stream
- BCH Decoding - Error correction and data recovery
- AES Decryption - Decrypt to recover original text
- Binary to Text - Convert binary to text string
- Text to Speech - Convert text to spoken output
Note: The flowchart shows RS encoding/decoding and interleaving steps, but the current implementation skips RS coding to save space and focuses on BCH error correction only.
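The Text to Binary and Binary to Text steps in the pipeline above can be sketched in a few lines. This is a minimal illustration, not the project's MATLAB code: the UTF-8 encoding and the most-significant-bit-first ordering are assumptions, and the function names are hypothetical.

```python
def text_to_bits(text):
    """Encode text as UTF-8 and return its bits, most significant bit first."""
    return [(byte >> shift) & 1
            for byte in text.encode("utf-8")
            for shift in range(7, -1, -1)]


def bits_to_text(bits):
    """Pack bits (MSB first) back into bytes and decode them as UTF-8."""
    if len(bits) % 8 != 0:
        raise ValueError("bit stream length must be a multiple of 8")
    data = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        data.append(byte)
    return data.decode("utf-8")
```

Whatever ordering the transmitter uses, the receiver has to use the same one, otherwise the recovered bytes will not decode.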
- Speech-to-Text: Uses Windows Speech Recognition for voice input
- AES-256-CBC Encryption: Strong cryptographic protection with PKCS5Padding
- BCH Error Correction: (63,36) BCH coding for robust transmission
- Visual Data Transmission: Binary grid visualization with distinctive green border
- Screen Capture: Automatic detection and capture of transmitted image
- Text-to-Speech: Python pyttsx3 for cross-platform speech synthesis
- Error Resilience: Handles transmission errors through BCH coding
- transmitter_code.m - Transmitter module (encryption + visualization)
- receiver_code.m - Receiver module (capture + decryption)
- MATLAB R2022a or later - Main development environment
- Windows OS - Required for speech recognition APIs
- Python 3.7+ - Required for pyttsx3 text-to-speech engine
- Java Runtime Environment - For Java cryptographic operations (AES)
- Communications Toolbox - For BCH encoder/decoder functions
- Image Processing Toolbox - For image manipulation and analysis
% Check if required toolboxes are installed
ver('comm') % Communications Toolbox
ver('images') % Image Processing Toolbox

- javax.crypto.Cipher - AES encryption/decryption
- javax.crypto.spec.SecretKeySpec - AES key specification
- javax.crypto.spec.IvParameterSpec - AES initialization vector
- java.awt.Robot - Screen capture functionality
- java.awt.Rectangle - Screen region definition
- java.awt.GraphicsEnvironment - Graphics environment access
- pyttsx3 - Python text-to-speech engine for cross-platform speech synthesis
- System.Speech.Recognition.SpeechRecognitionEngine - Speech-to-text
- System.Speech.Recognition.DictationGrammar - Speech grammar
- Microphone - For speech input capture
- Speakers/Headphones - For text-to-speech output
- Display - For visual data transmission (minimum 1024x768 recommended)
- RAM - Minimum 4GB (8GB recommended for smooth operation)
- Windows Speech Recognition - Must be enabled in Windows settings
- Python 3.7+ - Required for pyttsx3 text-to-speech engine
- pyttsx3 - Python text-to-speech library (pip install pyttsx3)
- Java Cryptography Extension (JCE) - Unlimited strength policy files for AES-256 (only needed on Java releases older than 8u161; later releases allow 256-bit keys by default)
% Install additional MATLAB toolboxes (if needed) through the Add-On Explorer
% (Home > Add-Ons > Get Add-Ons) or the MathWorks installer:
%   Communications Toolbox
%   Image Processing Toolbox

# Install Python TTS dependency
pip install pyttsx3

% Test Java integration
try
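% Request the cipher explicitly; a bare import statement never throws
javax.crypto.Cipher.getInstance('AES/CBC/PKCS5Padding');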
disp('✓ Java cryptography available')
catch
disp('✗ Java cryptography not available')
end
% Test .NET integration
try
NET.addAssembly('System.Speech')
disp('✓ .NET Speech available')
catch
disp('✗ .NET Speech not available')
end
% Test Python TTS integration
try
% A plain import is enough to confirm that the package is installed
[status, ~] = system('python -c "import pyttsx3"');
if status == 0
disp('✓ Python pyttsx3 available')
else
disp('✗ Python pyttsx3 not available')
end
catch
disp('✗ Python pyttsx3 not available')
end

- Speech Input: Captures voice input using Windows Speech Recognition
- AES Encryption:
- Key: 256-bit hexadecimal key
- IV: 128-bit initialization vector
- Mode: CBC with PKCS5Padding
- BCH Encoding: (63,36) BCH encoder adds error correction
- Image Generation: Creates 320x240 binary image with 10x10 blocks
- Green Border: Adds distinctive 20px green border for detection
- Display: Shows fullscreen for transmission
- Screen Capture: Captures fullscreen after 20-second delay
- Green Border Detection: Identifies transmitted image via green border
- Bit Extraction: Samples binary values from grid blocks
- BCH Decoding: Corrects transmission errors
- AES Decryption: Recovers original plaintext
- Text-to-Speech: Converts decrypted text to speech using Python pyttsx3
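The Image Generation and Bit Extraction steps above are two halves of one mapping between a bit stream and a grid of 10x10 blocks. The sketch below shows that round trip on a plain list-of-lists image. It is not the project's MATLAB code: the row-major bit order, white for 1, zero padding, and the fixed threshold of 128 (instead of Otsu's method) are all assumptions, and the function names are hypothetical.

```python
BLOCK = 10            # block edge length in pixels
ROWS, COLS = 24, 32   # 240 / 10 rows and 320 / 10 columns of blocks


def render_grid(bits):
    """Return a 240x320 grayscale image (list of pixel rows); 1 -> 255, 0 -> 0."""
    capacity = ROWS * COLS
    if len(bits) > capacity:
        raise ValueError("too many bits for one frame")
    padded = list(bits) + [0] * (capacity - len(bits))  # assumed zero padding
    image = []
    for row in range(ROWS):
        pixel_row = []
        for col in range(COLS):
            value = 255 if padded[row * COLS + col] else 0
            pixel_row.extend([value] * BLOCK)
        # Repeat the pixel row BLOCK times to make the blocks square
        image.extend([list(pixel_row) for _ in range(BLOCK)])
    return image


def sample_grid(image, threshold=128):
    """Read one bit per block by thresholding the pixel at the block centre."""
    bits = []
    for row in range(ROWS):
        for col in range(COLS):
            y = row * BLOCK + BLOCK // 2
            x = col * BLOCK + BLOCK // 2
            bits.append(1 if image[y][x] >= threshold else 0)
    return bits
```

Sampling the block centre rather than averaging the whole block keeps the example short; averaging would likely be more tolerant of blur in a real screen capture.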
run('transmitter_code.m')

- Speak clearly when prompted after 2 seconds
- The encrypted image will be displayed fullscreen
- Keep the window open for transmission
run('receiver_code.m')

- Start the receiver after the transmitter is displaying the image
- Click on the transmitter window when prompted
- Wait 20 seconds for automatic capture
- Decrypted text will be spoken
- AES Key: 211e91dce682d2d514022d9e72a2c013a1c813113326dd290e94dc09da229c72
- AES IV: 654569dc8be692bbde4f3289fa510610
- Block Size: 10x10 pixels
- Image Resolution: 320x240 pixels (24 rows x 32 columns of blocks)
- BCH Code: (63,36) - 27 parity bits per codeword
- Green Border Threshold: G > 120, R < 100, B < 100
- Binary Threshold: Otsu's method via graythresh()
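The Green Border Threshold listed above translates directly into a per-pixel test. The following sketch applies it to an image stored as rows of (R, G, B) tuples and returns the bounding box of the matching pixels. The function names and the image representation are hypothetical; the receiver's actual detection logic may differ.

```python
def is_border_green(pixel):
    """Apply the documented threshold: G > 120, R < 100, B < 100."""
    r, g, b = pixel
    return g > 120 and r < 100 and b < 100


def green_bounding_box(image):
    """Return (top, left, bottom, right) of all border-green pixels, or None."""
    rows = [y for y, row in enumerate(image)
            if any(is_border_green(p) for p in row)]
    cols = [x for row in image
            for x, p in enumerate(row) if is_border_green(p)]
    if not rows:
        return None  # no green border in the captured image
    return min(rows), min(cols), max(rows), max(cols)
```

With a 20px border, the data region would be this box shrunk by 20 pixels on each side, assuming the capture is at the same scale as the displayed image.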
- The AES key and IV are hardcoded for demonstration purposes
- In production, use proper key management
- The green border provides visual identification but no cryptographic security
- BCH coding provides error correction, not additional security
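As one possible step away from hardcoded values, the key and IV could be read from the environment and validated before use. This is only a sketch of that idea: the environment variable names and the function name are hypothetical, and it does not address key exchange or IV reuse across messages.

```python
import os


def load_key_material(key_var="VISUAL_LINK_AES_KEY", iv_var="VISUAL_LINK_AES_IV"):
    """Read a hex-encoded AES-256 key and 128-bit IV from the environment."""
    key = bytes.fromhex(os.environ[key_var])
    iv = bytes.fromhex(os.environ[iv_var])
    if len(key) != 32:
        raise ValueError("AES-256 key must be 32 bytes (64 hex characters)")
    if len(iv) != 16:
        raise ValueError("AES IV must be 16 bytes (32 hex characters)")
    return key, iv
```

The length checks catch truncated or mistyped values early, before they surface as an opaque cipher initialization error.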
- Speech recognition failures prompt retry
- Green border detection errors halt processing
- Cryptographic errors are caught and reported
- Invalid padding or decryption failures are handled gracefully
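To make the invalid-padding case concrete: with AES's 16-byte block, Java's PKCS5Padding behaves like PKCS#7, where every padding byte stores the padding length. The sketch below shows padding and a strict unpadding check. The Java cipher performs this internally, so this is only an illustration, and the function names are hypothetical.

```python
BLOCK_SIZE = 16  # AES block size in bytes


def pkcs7_pad(data):
    """Append 1 to 16 bytes, each equal to the number of bytes appended."""
    pad_len = BLOCK_SIZE - len(data) % BLOCK_SIZE
    return data + bytes([pad_len]) * pad_len


def pkcs7_unpad(padded):
    """Strip the padding, raising ValueError if it is malformed."""
    if not padded or len(padded) % BLOCK_SIZE != 0:
        raise ValueError("padded data must be a non-empty multiple of 16 bytes")
    pad_len = padded[-1]
    if not 1 <= pad_len <= BLOCK_SIZE or padded[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("invalid padding")
    return padded[:-pad_len]
```

If BCH decoding leaves residual bit errors in the ciphertext, the decrypted final block is effectively random, so the padding check usually fails.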
- Transmission Rate: Visual (limited by display refresh)
- Latency: ~20 seconds (capture delay) + processing time
- Capacity: 768 raw bits per frame (24x32 blocks), before BCH parity and AES padding overhead
- Error Correction: Up to 5 bit errors per 63-bit codeword (27 parity bits)
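The raw frame capacity can be turned into a usable message size with a little arithmetic. The sketch below assumes a single frame, whole BCH codewords only, a payload consisting solely of AES-CBC ciphertext (no length header, no transmitted IV), and padding that always adds at least one byte. The function name is hypothetical.

```python
def frame_budget(frame_bits=24 * 32, n=63, k=36, aes_block=16):
    """Estimate how much plaintext fits in one frame under the stated assumptions."""
    codewords = frame_bits // n                               # whole codewords per frame
    payload_bytes = (codewords * k) // 8                      # data bytes carried by them
    ciphertext_bytes = payload_bytes // aes_block * aes_block  # whole AES blocks only
    max_plaintext = max(ciphertext_bytes - 1, 0)              # padding adds >= 1 byte
    return {
        "codewords": codewords,
        "unused_bits": frame_bits - codewords * n,
        "payload_bytes": payload_bytes,
        "ciphertext_bytes": ciphertext_bytes,
        "max_plaintext_bytes": max_plaintext,
    }
```

For the default 768-bit frame this gives 12 codewords, 54 payload bytes, 48 ciphertext bytes, and at most 47 bytes of plaintext.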
- Speech Recognition Issues
  - Ensure the microphone is working
  - Speak clearly after the 2-second prompt
  - Check Windows Speech Recognition settings
- Green Border Not Detected
  - Ensure the transmitter window is visible
  - Check display color settings
  - Verify the green border is not obscured
- Decryption Failures
  - Ensure transmitter and receiver use the same key/IV
  - Check for transmission errors
  - Verify BCH decoding success
- Python TTS Issues
  - Ensure Python pyttsx3 is properly installed and accessible
  - Verify Python 3.7+ is installed
  - Install pyttsx3: pip install pyttsx3
  - Check Python is in the system PATH
  - Test pyttsx3 independently: python -c "import pyttsx3; engine = pyttsx3.init(); engine.say('test'); engine.runAndWait()"
- Dynamic key exchange
- Multiple frame support for larger messages
- Adaptive error correction
- Network-based transmission
- GUI interface
- Real-time streaming
This project is for educational purposes. Use responsibly and in accordance with applicable laws and regulations.
