Leaf Object Detection

Single-class YOLO dataset and training pipeline for live leaf detection.

Open the training notebook, colab_leaf_yolo_training.ipynb, in Colab.

The Colab notebook downloads the public Kaggle dataset automatically with KaggleHub when no cached dataset exists, so you should not need to upload files manually. It saves everything under:

MyDrive/leaf-object-detection/

It caches archive.zip, uploads the prepared YOLO dataset as datasets/leaf_yolo_dataset.tar.gz, syncs run folders after each model, and writes final ZIP/summary artifacts to artifacts/.

Fastest repeat setup: the notebook creates and reuses one compressed prepared dataset file:

MyDrive/leaf-object-detection/datasets/leaf_yolo_dataset.tar.gz

If you ever want to create it locally instead, use:

cd D:\Ken\leaf_detector
tar -czf leaf_yolo_dataset.tar.gz datasets\leaf_yolo
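If the tar command is not available, Python's standard tarfile module can build an equivalent archive. This is a minimal sketch and not one of the repository's scripts; pack_dataset is a hypothetical helper name, and it assumes you run it from the project folder so the relative path datasets/leaf_yolo becomes the archive root.

```python
import tarfile
from pathlib import Path


def pack_dataset(dataset_dir: Path, archive_path: Path) -> Path:
    """Write dataset_dir into a gzip-compressed tar, keeping its relative path."""
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname keeps the relative path (for example datasets/leaf_yolo)
        # inside the archive, with forward slashes on every platform.
        tar.add(dataset_dir, arcname=dataset_dir.as_posix())
    return archive_path


if __name__ == "__main__":
    # Assumes the current directory is the project folder.
    pack_dataset(Path("datasets/leaf_yolo"), Path("leaf_yolo_dataset.tar.gz"))
```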

The notebook can detect the current loose Drive upload layout:

MyDrive/leaf-object-detection/datasets/archive/PlantVillage_for_object_detection/Dataset

but it no longer uses that loose folder automatically, because reading 50k+ files from mounted Google Drive is too slow. Use leaf_yolo_dataset.tar.gz or the automatic KaggleHub download for fast Colab prep.
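The source preference described in this section can be sketched roughly as follows. This is an illustration, not the notebook's code: choose_dataset_source is a hypothetical name, the location of the cached archive.zip directly under the Drive folder is an assumption, and the loose Dataset folder is deliberately never selected.

```python
from pathlib import Path


def choose_dataset_source(drive_root: Path) -> str:
    """Return which source to use: 'prepared', 'archive', or 'kagglehub'."""
    # Prepared, compressed YOLO dataset: the fastest option on repeat runs.
    prepared = drive_root / "datasets" / "leaf_yolo_dataset.tar.gz"
    # Cached raw download; its exact location is an assumption in this sketch.
    raw_archive = drive_root / "archive.zip"

    if prepared.is_file():
        return "prepared"
    if raw_archive.is_file():
        return "archive"
    # Nothing cached: fall back to the automatic KaggleHub download.
    return "kagglehub"
```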

Current Local Status

Prepared dataset, after running prep: datasets/leaf_yolo
Class set: nc: 1, names: ['leaf']
Total images: 57,164
Total leaf boxes: 63,225
Empty-label hard negatives: 300
Sources:
- 54,293 images from the provided PlantVillage YOLO archive
- 2,571 public PlantDoc mixed-scene images converted from Pascal VOC to YOLO
- 300 synthetic no-leaf negatives
Split:
- Train: 46,008 images, 51,905 boxes
- Val: 5,460 images, 5,432 boxes
- Test: 5,696 images, 5,888 boxes
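
The Pascal VOC to YOLO conversion mentioned for the PlantDoc images comes down to turning pixel corner coordinates into a normalized center, width, and height. The sketch below shows that arithmetic only; voc_to_yolo is a hypothetical helper, and the clamping to the image bounds is an assumption rather than documented behavior of the import script.

```python
def voc_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h, class_id=0):
    """Convert one Pascal VOC pixel box to a YOLO label line."""
    # Clamp corners to the image so boxes that overrun the border stay valid.
    xmin, xmax = max(0, xmin), min(img_w, xmax)
    ymin, ymax = max(0, ymin), min(img_h, ymax)

    # YOLO stores the box center and size as fractions of the image size.
    cx = (xmin + xmax) / 2.0 / img_w
    cy = (ymin + ymax) / 2.0 / img_h
    w = (xmax - xmin) / img_w
    h = (ymax - ymin) / img_h
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


print(voc_to_yolo(64, 32, 192, 224, 256, 256))
# prints: 0 0.500000 0.500000 0.500000 0.750000
```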

The prepared dataset uses hardlinks for images, so it does not duplicate the full image storage on disk.
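
As an illustration of the hardlink approach, the sketch below links an image into the prepared dataset. It is not the prep script's code: link_or_copy is a hypothetical helper, and the copy fallback for filesystems that refuse hardlinks (for example across drives) is an assumption.

```python
import os
import shutil
from pathlib import Path


def link_or_copy(src: Path, dst: Path) -> str:
    """Hardlink src to dst; copy instead if the filesystem refuses the link."""
    dst.parent.mkdir(parents=True, exist_ok=True)
    if dst.exists():
        dst.unlink()  # replace a stale entry from an earlier run
    try:
        os.link(src, dst)  # second directory entry, no extra image data on disk
        return "hardlink"
    except OSError:
        shutil.copy2(src, dst)  # assumed fallback, e.g. source on another volume
        return "copy"
```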

Important Files

scripts/prepare_leaf_dataset.py: extracts, validates, merges, splits, and rewrites labels to class 0
scripts/validate_leaf_dataset.py: checks image-label pairing, YOLO box validity, and class IDs
scripts/import_plantdoc_git.py: imports PlantDoc directly from Git blobs, including Windows-hostile filenames
scripts/generate_synthetic_negatives.py: creates temporary no-leaf negative images
scripts/train_leaf_yolo.py: Colab A100 training, test evaluation, ONNX export, and TF.js export
datasets/leaf_yolo/data.yaml: final YOLO dataset config, generated after prep
datasets/leaf_yolo/validation_report.json: latest validation report, generated after validation
COLAB_A100_STEPS.md: exact Colab commands
web/: browser ONNX demo scaffold

Rebuild Dataset Locally

python scripts\prepare_leaf_dataset.py `
  --archive archive.zip `
  --work-dir . `
  --extra-yolo-dir public\plantdoc_yolo `
  --hard-negatives-dir hard_negatives\synthetic `
  --force
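
The "rewrites labels to class 0" step can be pictured as replacing the first field of every YOLO label line. A minimal sketch, assuming standard five-field detection labels; rewrite_label_to_single_class is a hypothetical name and not the function used in prepare_leaf_dataset.py.

```python
def rewrite_label_to_single_class(label_text: str, class_id: int = 0) -> str:
    """Replace the class ID on every label line, leaving the box untouched."""
    out = []
    for line in label_text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) != 5:
            raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
        out.append(" ".join([str(class_id)] + parts[1:]))
    # An empty result stays empty, which is how a no-leaf negative is labeled.
    return "\n".join(out) + ("\n" if out else "")
```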

python scripts\validate_leaf_dataset.py `
  --dataset datasets\leaf_yolo `
  --write-report
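
The box and class checks the validator is described as performing might look like the sketch below for a single label line. This is an assumption about the kind of checks involved, not a copy of validate_leaf_dataset.py; check_label_line is a hypothetical name.

```python
def check_label_line(line: str, num_classes: int = 1) -> list:
    """Return a list of problems found in one YOLO label line (empty if fine)."""
    parts = line.split()
    if len(parts) != 5:
        return [f"expected 5 fields, got {len(parts)}"]
    try:
        cls = int(parts[0])
        cx, cy, w, h = (float(p) for p in parts[1:])
    except ValueError:
        return ["class ID is not an integer or a coordinate is not a number"]

    problems = []
    if not 0 <= cls < num_classes:
        problems.append(f"class ID {cls} outside 0..{num_classes - 1}")
    if not all(0.0 <= v <= 1.0 for v in (cx, cy, w, h)):
        problems.append("coordinate outside [0, 1]")
    if w <= 0 or h <= 0:
        problems.append("non-positive box size")
    return problems
```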

Train on Colab A100

Follow COLAB_A100_STEPS.md.

The smoke test trains yolo26s.pt for 20 epochs. The main run trains:

yolo26s.pt
yolo26m.pt
yolo26x.pt

Training uses pretrained weights, early stopping, cosine LR, a fixed 640px image size for the main run, mosaic closeout, HSV/geometry augmentation, mixup, and cutmix. Multi-scale training is available with --multi-scale, but fixed sizing is the default because it is more stable on Colab. The main Colab run also performs a lower-augmentation fine-tune stage starting from each candidate model's best.pt.

Choose the final model by test recall, mAP50-95, false positives on no-leaf images, model size, and browser FPS.
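
To make the training settings concrete, the sketch below collects them as keyword arguments. The key names follow Ultralytics' training settings as I understand them and should be checked against the installed version. Every numeric value except the 640px image size is a placeholder, not a value taken from train_leaf_yolo.py, and close_mosaic is only assumed to be what "mosaic closeout" refers to. build_train_args is a hypothetical helper.

```python
def build_train_args(data_yaml: str, multi_scale: bool = False) -> dict:
    """Collect training keyword arguments; numeric values are placeholders."""
    return {
        "data": data_yaml,
        "pretrained": True,          # start from pretrained weights
        "epochs": 100,               # placeholder
        "patience": 20,              # early stopping patience; placeholder
        "cos_lr": True,              # cosine learning-rate schedule
        "imgsz": 640,                # fixed image size for the main run
        "multi_scale": multi_scale,  # off by default, mirroring --multi-scale
        "close_mosaic": 10,          # assumed "mosaic closeout"; placeholder
        "hsv_h": 0.015, "hsv_s": 0.7, "hsv_v": 0.4,  # HSV augmentation
        "degrees": 5.0, "translate": 0.1,            # geometry augmentation
        "scale": 0.5, "fliplr": 0.5,
        "mixup": 0.1,                # placeholder
        "cutmix": 0.1,               # placeholder
    }
```

With Ultralytics installed, a dictionary like this could be passed as YOLO("yolo26s.pt").train(**build_train_args("datasets/leaf_yolo/data.yaml")), provided the installed version accepts every key.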

Deployment

Exported ONNX models can be tested in web/index.html after placing the chosen model at:

leaf_detector/web/models/best.onnx

For a browser-first deployment, start with yolo26s or yolo26m. Keep yolo26x only if the browser FPS is acceptable.
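
One way to apply the selection criteria from the training section together with the FPS rule above is to filter out candidates that fail the browser and no-leaf checks, then rank the rest. This is a sketch under stated assumptions: Candidate and pick_model are hypothetical names, and the thresholds and ranking order are illustrative choices, not the project's rule.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    name: str
    recall: float                      # test recall
    map50_95: float                    # test mAP50-95
    false_positives_on_negatives: int  # detections on no-leaf images
    size_mb: float                     # exported model size
    browser_fps: float                 # measured in the target browser


def pick_model(candidates: List[Candidate],
               min_fps: float = 15.0,          # illustrative threshold
               max_false_positives: int = 5    # illustrative threshold
               ) -> Optional[Candidate]:
    """Drop candidates that fail the hard limits, then rank the rest."""
    usable = [
        c for c in candidates
        if c.browser_fps >= min_fps
        and c.false_positives_on_negatives <= max_false_positives
    ]
    if not usable:
        return None
    # Prefer higher recall, then higher mAP50-95, then the smaller model.
    return max(usable, key=lambda c: (c.recall, c.map50_95, -c.size_mb))
```

Measure browser_fps on the hardware and browser you plan to ship on; a larger model that wins on accuracy is filtered out here if it cannot keep up.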

Notes

The original PlantVillage data is mostly centered 256x256 single-leaf imagery. The PlantDoc merge and hard negatives reduce that bias, but real webcam images from the final environment are still the best way to prove live performance.

The original PlantVillage object-detection dataset is hosted on Kaggle under CC BY-NC-SA 4.0. Check that license before any commercial deployment.
