- Overview
- Dataset
- Project Structure
- Pipeline
- Model Architecture
- Installation
- Usage
- Results
- Tech Stack
- Known Issues & Fixes
- Future Improvements
- License
This project implements a Convolutional Neural Network (CNN) from scratch using Keras to classify grayscale images of clothing and fashion accessories from the Fashion MNIST dataset.
The project covers the complete machine learning workflow:
- Raw data loading from CSV files
- Data preprocessing and normalization
- Exploratory data visualization
- CNN model design and training
- Evaluation using confusion matrix and classification report
- Visual prediction display on test images
Fashion MNIST is widely used as a more challenging benchmark than the classic handwritten digits MNIST dataset, making it ideal for practicing image classification with deep learning.
The Fashion MNIST dataset was created by Zalando Research and contains 70,000 grayscale images across 10 fashion categories.
| Split | Samples | Image Size | Format |
|---|---|---|---|
| Training | 60,000 | 28 × 28 (grayscale) | CSV |
| Test | 10,000 | 28 × 28 (grayscale) | CSV |
| Label | Class Name | Description |
|---|---|---|
| 0 | T-shirt/Top | Short-sleeved or sleeveless top |
| 1 | Trouser | Long pants/trousers |
| 2 | Pullover | Knitted pullover sweater |
| 3 | Dress | One-piece dress garment |
| 4 | Coat | Heavy outerwear coat |
| 5 | Sandal | Open-toed summer footwear |
| 6 | Shirt | Collared button-up shirt |
| 7 | Sneaker | Casual athletic shoe |
| 8 | Bag | Handbag or tote |
| 9 | Ankle Boot | Short boot covering the ankle |
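The label table can be written down as a plain Python list so that integer predictions translate back to readable names. This is a minimal sketch: the name `target_names` matches the variable passed to `classification_report` in the evaluation step, but the exact list used in `main.py` is an assumption based on the table above.

```python
# Class names indexed by integer label, in the order given in the label table.
# Assumption: main.py defines an equivalent list under the name target_names.
target_names = [
    "T-shirt/Top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle Boot",
]

def label_to_name(label):
    """Map an integer (or float-encoded) class label to its class name."""
    return target_names[int(label)]

print(label_to_name(9.0))
```

Labels loaded from the CSV arrive as floats once the array is cast to `float32`, which is why the helper converts to `int` before indexing.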
📥 Download the dataset from Kaggle. A free Kaggle account is required.
fashion-Class-Classification/
│
├── fashion-MNIST-dataset/ # Dataset folder (not tracked by git)
│ ├── fashion-mnist_train.csv # 60,000 training samples
│ └── fashion-mnist_test.csv # 10,000 test samples
│
├── visualizing.py # Step 1: Data loading & visualization
├── main.py # Step 2: Model training & evaluation
│
├── .gitignore # Ignores dataset and venv folders
├── requirements.txt # Python dependencies
└── README.md # Project documentation
| File | Responsibility |
|---|---|
| `visualizing.py` | Loads CSVs, converts to NumPy arrays, plots sample grids |
| `main.py` | Preprocesses data, builds CNN, trains model, evaluates and visualizes results |
import numpy as np
import pandas as pd

fashion_train_df = pd.read_csv('fashion-MNIST-dataset/fashion-mnist_train.csv')
fashion_test_df = pd.read_csv('fashion-MNIST-dataset/fashion-mnist_test.csv')
training = np.array(fashion_train_df, dtype=np.float32)
test = np.array(fashion_test_df, dtype=np.float32)

Each row in the CSV contains:
- Column `0` → class label (integer 0–9)
- Columns `1–784` → pixel values (0–255) flattened from a 28×28 image
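The row layout described above can be checked on a synthetic row, without needing the dataset on disk. This is only an illustrative sketch: the row is randomly generated, and the label value `7` is an arbitrary choice.

```python
import numpy as np

# Synthetic stand-in for one CSV row: 1 label column followed by 784 pixel columns.
rng = np.random.default_rng(0)
row = np.concatenate(([7], rng.integers(0, 256, size=784))).astype(np.float32)

label = int(row[0])               # column 0 holds the class label
image = row[1:].reshape(28, 28)   # the remaining 784 columns hold the flattened pixels

print(label, image.shape)
```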
- Displays a single random image with its label
- Renders a 15 × 15 grid of random training samples to inspect class distribution and image quality
plt.imshow(training[i, 1:].reshape(28, 28), cmap='gray')

from sklearn.model_selection import train_test_split

# Extract pixels and labels
X_train = training[:, 1:] / 255.0 # Normalize pixels to [0, 1]
y_train = training[:, 0] # Integer class labels (0–9), no normalization
X_test = test[:, 1:] / 255.0 # Apply the same scaling to the test set
y_test = test[:, 0]
# 80/20 train-validation split
X_train, X_valid, y_train, y_valid = train_test_split(
    X_train, y_train, test_size=0.2, random_state=12345
)
# Reshape for CNN input: (samples, height, width, channels)
X_train = X_train.reshape(X_train.shape[0], 28, 28, 1).astype('float32')
X_valid = X_valid.reshape(X_valid.shape[0], 28, 28, 1).astype('float32')
X_test = X_test.reshape(X_test.shape[0], 28, 28, 1).astype('float32')

| Step | Detail |
|---|---|
| Normalization | Divide pixels by 255 → range [0, 1] |
| Validation split | 80% training, 20% validation |
| Reshape | (n, 784) → (n, 28, 28, 1) for CNN |
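The three preprocessing steps in the table can be tried on a small synthetic array to confirm the resulting shapes and value ranges. This is a sketch under stated assumptions: the data is random, the sample count of 100 is arbitrary, and the split is done with a NumPy permutation as a dependency-free stand-in for `train_test_split`, so the exact rows selected will differ from the project's split.

```python
import numpy as np

rng = np.random.default_rng(12345)

# Hypothetical stand-in for the loaded CSV: 100 rows of [label, 784 pixels].
n_samples = 100
labels = rng.integers(0, 10, size=(n_samples, 1))
pixels = rng.integers(0, 256, size=(n_samples, 784))
training = np.hstack([labels, pixels]).astype(np.float32)

X = training[:, 1:] / 255.0   # pixels scaled to [0, 1]
y = training[:, 0]            # labels are left untouched

# NumPy-only 80/20 split (illustrative substitute for train_test_split).
order = rng.permutation(n_samples)
n_valid = int(0.2 * n_samples)
valid_idx, train_idx = order[:n_valid], order[n_valid:]

# (n, 784) -> (n, 28, 28, 1): height, width, and a single grayscale channel.
X_train = X[train_idx].reshape(-1, 28, 28, 1)
X_valid = X[valid_idx].reshape(-1, 28, 28, 1)
y_train, y_valid = y[train_idx], y[valid_idx]

print(X_train.shape, X_valid.shape)
```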

from keras.optimizers import Adam

model.compile(
    optimizer=Adam(learning_rate=0.001),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)
model.fit(
    X_train, y_train,
    validation_data=(X_valid, y_valid),
    epochs=10,
    batch_size=32,
    verbose=1
)

import seaborn as sns
from sklearn.metrics import classification_report, confusion_matrix

# Get predicted class indices from softmax probabilities
predictions = np.argmax(model.predict(X_test), axis=1)
# Confusion matrix heatmap
cm = confusion_matrix(y_test, predictions)
sns.heatmap(cm, annot=True, fmt="d")
# Per-class precision, recall, F1-score
print(classification_report(y_test, predictions, target_names=target_names))

Input: (28, 28, 1)
↓
Conv2D → 32 filters, 3×3 kernel, stride=2, ReLU
↓
MaxPooling2D → 2×2 pool size
↓
Flatten → 1D vector
↓
Dense(32) → ReLU activation
↓
Dense(10) → Softmax (output: probability per class)
| Layer | Output Shape | Parameters | Notes |
|---|---|---|---|
| `Conv2D` | (13, 13, 32) | 320 | 3×3 kernel, stride 2, ReLU |
| `MaxPooling2D` | (6, 6, 32) | 0 | 2×2 pool |
| `Flatten` | (1152,) | 0 | - |
| `Dense` | (32,) | 36,896 | ReLU activation |
| `Dense` | (10,) | 330 | Softmax output |
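The output shapes and parameter counts in the table follow from simple arithmetic, sketched below. The helper `conv_output_size` is a hypothetical name introduced for illustration. It assumes no padding on the convolution and a pooling stride equal to the pool size, which are the Keras defaults and are consistent with the shapes listed.

```python
def conv_output_size(size, kernel, stride):
    """Output length along one axis for an unpadded convolution or pooling window."""
    return (size - kernel) // stride + 1

conv_side = conv_output_size(28, kernel=3, stride=2)         # Conv2D: 28 -> 13
pool_side = conv_output_size(conv_side, kernel=2, stride=2)  # MaxPooling2D: 13 -> 6
flat = pool_side * pool_side * 32                            # Flatten: 6 * 6 * 32

# Parameters = weights + biases for each trainable layer.
conv_params = (3 * 3 * 1) * 32 + 32   # 3x3 kernel, 1 input channel, 32 filters
dense1_params = flat * 32 + 32        # Flatten output -> Dense(32)
dense2_params = 32 * 10 + 10          # Dense(32) -> Dense(10)
total_params = conv_params + dense1_params + dense2_params

print(conv_side, pool_side, flat)
print(conv_params, dense1_params, dense2_params, total_params)
```

Almost all of the parameters sit in the first dense layer, which is why adding further convolution and pooling stages before `Flatten` tends to shrink the model rather than grow it.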
| Hyperparameter | Value |
|---|---|
| Optimizer | Adam |
| Learning rate | 0.001 |
| Loss function | Sparse Categorical Crossentropy |
| Epochs | 10 |
| Batch size | 32 |
| Validation split | 20% |
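Putting the architecture and hyperparameters together, the model could be defined roughly as follows. This is a sketch, not the contents of `main.py`: it assumes a `keras.Sequential` model with a `keras.Input` object as its first entry, whereas the project may instead pass `input_shape` to the first `Conv2D` layer.

```python
import keras
from keras import layers

# Layer stack taken from the architecture table; Input placement is an assumption.
model = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),
    layers.Conv2D(32, (3, 3), strides=2, activation="relu"),
    layers.MaxPooling2D(pool_size=(2, 2)),
    layers.Flatten(),
    layers.Dense(32, activation="relu"),
    layers.Dense(10, activation="softmax"),
])

# Hyperparameters taken from the table above.
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=0.001),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

model.summary()
```

`sparse_categorical_crossentropy` is paired with integer labels (0–9), so no one-hot encoding step is needed before training.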
- Python 3.9 or higher
- pip
git clone https://github.com/your-username/fashion-Class-Classification.git
cd fashion-Class-Classification

python -m venv .venv
# Activate on Windows
.venv\Scripts\activate
# Activate on macOS/Linux
source .venv/bin/activate

pip install -r requirements.txt

Or manually:

pip install pandas numpy matplotlib seaborn keras tensorflow scikit-learn

- Visit https://www.kaggle.com/datasets/zalando-research/fashionmnist
- Click Download (free Kaggle account required)
- Extract and place the CSV files as shown:
fashion-MNIST-dataset/
├── fashion-mnist_train.csv
└── fashion-mnist_test.csv
python visualizing.py

Outputs:
- One random training image with its class label
- A 15×15 grid of random training images with labels
python main.py

Outputs:
- Live training progress (accuracy & loss per epoch)
- 5×5 prediction grid: predicted vs true labels
- Confusion matrix heatmap
- Full classification report in terminal
After 10 training epochs, the model produces the following evaluations:
The confusion matrix is plotted as a 10×10 annotated heatmap in which rows are true labels and columns are predicted labels. A strong diagonal means most samples of each class are classified correctly. Off-diagonal values reveal which classes are most commonly confused.
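One way to read the off-diagonal cells programmatically is to zero out the diagonal and look for the largest remaining entry. The sketch below uses a made-up 3-class matrix purely for illustration; the numbers are not results from this project.

```python
import numpy as np

# Hypothetical 3-class confusion matrix (rows = true, columns = predicted).
cm = np.array([
    [50,  2,  8],
    [ 1, 57,  2],
    [12,  3, 45],
])

off_diag = cm.copy()
np.fill_diagonal(off_diag, 0)   # ignore correct predictions

# Position of the largest off-diagonal entry: (true class, predicted class).
true_cls, pred_cls = np.unravel_index(np.argmax(off_diag), off_diag.shape)
print(f"Most frequent confusion: true {true_cls} predicted as {pred_cls} "
      f"({off_diag[true_cls, pred_cls]} samples)")
```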
The classification report gives a per-class breakdown of key metrics:
| Metric | Description |
|---|---|
| Precision | Of all predictions for a class, how many were correct |
| Recall | Of all true instances of a class, how many were found |
| F1-Score | Harmonic mean of precision and recall |
| Support | Number of true instances per class in the test set |
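The four metrics in the table can be derived directly from a confusion matrix, which makes the definitions concrete. The matrix below is invented for illustration and does not reflect this model's results.

```python
import numpy as np

# Hypothetical 3-class confusion matrix (rows = true, columns = predicted).
cm = np.array([
    [50,  2,  8],
    [ 1, 57,  2],
    [12,  3, 45],
])

true_pos = np.diag(cm)
precision = true_pos / cm.sum(axis=0)   # column sums = all predictions for a class
recall = true_pos / cm.sum(axis=1)      # row sums = all true instances of a class
f1 = 2 * precision * recall / (precision + recall)
support = cm.sum(axis=1)

for k in range(len(cm)):
    print(f"class {k}: precision={precision[k]:.3f} recall={recall[k]:.3f} "
          f"f1={f1[k]:.3f} support={support[k]}")
```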
The prediction grid is a 5×5 visual grid displaying test images with their predicted and true labels side by side for quick qualitative review.
| Library | Purpose |
|---|---|
| `pandas` | Loading CSV dataset files |
| `numpy` | Array operations and image reshaping |
| `matplotlib` | Plotting images and sample grids |
| `seaborn` | Confusion matrix heatmap visualization |
| `keras` | Building, compiling, and training the CNN |
| `tensorflow` | Keras backend for computation |
| `scikit-learn` | Train/test split, confusion matrix, report |
| Issue | Root Cause | Fix Applied |
|---|---|---|
| `FileNotFoundError` on CSV load | Wrong path or missing dataset files | Verified `fashion-MNIST-dataset/` folder structure |
| White/blank image grid | Missing `cmap='gray'` in `imshow` | Added `cmap='gray'` to all `imshow` calls |
| `ValueError: cannot reshape array of size 1` | `training[i, 1]` grabs 1 pixel instead of 784 | Changed to `training[i, 1:]` (slice, not index) |
| `TypeError: tuple cannot be interpreted as int` | `X.shape[0:]` returns full tuple, not int | Changed to `X.shape[0]` (no colon) |
| `confusion_matrix` receives `None` | `prediction_model()` had no `return` statement | Added `return predict_classes` at end of function |
| Labels divided by 255 incorrectly | `y_train = training[:, 0] / 255` | Labels are integers 0–9 and must NOT be normalized |
| `Input` layer placed after `Conv2D` | Misplaced `model.add(Input(...))` mid-model | Removed stray `Input` layer; Keras infers shape from `input_shape` |
| Wrong loss function | `binary_crossentropy` used for 10-class task | Changed to `sparse_categorical_crossentropy` |
| Wrong output layer | `Dense(16, activation='sigmoid')` | Changed to `Dense(10, activation='softmax')` |
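Three of the issues in the table come down to NumPy indexing and scaling details, which can be reproduced on a dummy array. The array below is a zero-filled placeholder with the same column layout as the training data; its row count of 5 is arbitrary.

```python
import numpy as np

training = np.zeros((5, 785), dtype=np.float32)  # placeholder rows: label + 784 pixels
i = 0

# Index vs slice: an index yields one scalar, a slice yields all 784 pixels.
single_pixel = training[i, 1]
all_pixels = training[i, 1:]
image = all_pixels.reshape(28, 28)   # works only with the slice

# shape[0:] is still a tuple; shape[0] is the sample count as an int.
X = training[:, 1:]
print(X.shape[0:], X.shape[0])

# Dividing labels by 255 squashes every class toward zero.
labels = np.array([0, 3, 9], dtype=np.float32)
scaled = labels / 255
print(scaled.astype(int))   # all classes collapse to 0 after the cast
```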
- Add additional `Conv2D` + `MaxPooling2D` blocks for deeper feature extraction
- Add `BatchNormalization` layers to stabilize and speed up training
- Add `Dropout` layers to reduce overfitting
- Implement learning rate scheduling (e.g. `ReduceLROnPlateau`)
- Plot training history: accuracy and loss curves per epoch
- Save and reload trained model using `.keras` or `.h5` format
- Build an interactive demo with Gradio or Streamlit for live predictions
- Apply data augmentation (random flips, rotations, zoom) to improve generalization
- Benchmark against other architectures (ResNet, MobileNet via transfer learning)
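As an illustration of the data augmentation item, a random horizontal flip can be sketched in plain NumPy. The function name `random_horizontal_flip` and the probability `p` are hypothetical choices for this example. A real implementation would more likely use Keras preprocessing layers, and rotations and zoom are not covered here.

```python
import numpy as np

def random_horizontal_flip(images, rng, p=0.5):
    """Flip each image left-to-right with probability p.

    images: array of shape (n, 28, 28, 1). Returns a new array; the input is not modified.
    """
    flip_mask = rng.random(len(images)) < p
    out = images.copy()
    out[flip_mask] = out[flip_mask, :, ::-1, :]   # reverse the width axis
    return out

rng = np.random.default_rng(0)
batch = rng.random((8, 28, 28, 1)).astype(np.float32)   # synthetic batch
augmented = random_horizontal_flip(batch, rng)
print(augmented.shape)
```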
This project is licensed under the MIT License.
Feel free to use, modify, and distribute with attribution.