1. Problem
Industrial quality control often relies on slow, manual inspection. This project addresses real-time anomaly detection on continuous manufacturing lines, targeting constrained edge devices that need high-throughput visual analysis in under 20 ms per frame.
2. Approach
- Data Engine: Designed an active learning ingestion loop targeting severely imbalanced defect classes. Grew the dataset to 10,000+ images using automated synthetic generation and Albumentations augmentation.
- Training Strategy: Fine-tuned ultra-lightweight YOLOv8 architectures. Orchestrated distributed training runs on cloud GPU clusters.
- Inference Optimization: Reduced memory footprint and per-operation compute cost by applying FP16/INT8 post-training quantization to the PyTorch models.
- Hardware Target: Compiled the frozen graphs to TensorRT engines to maximize parallel execution on Jetson devices. TensorRT runs only on NVIDIA GPUs, so Coral devices need a separate Edge TPU (TensorFlow Lite) build.
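The INT8 step in the list above can be illustrated with a minimal sketch of symmetric per-tensor quantization. This shows only the underlying arithmetic, not the project's actual toolchain: production pipelines usually pick scales from calibration data, and the function names and weight values here are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric per-tensor INT8 quantization: returns (int values, scale)."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale would do.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the symmetric INT8 range.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map INT8 values back to approximate floating-point weights."""
    return [v * scale for v in q]


# Hypothetical weights, chosen only to make the arithmetic easy to follow.
weights = [0.50, -1.27, 0.03, 1.00]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The round-trip error per weight is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each weight is stored in one byte instead of four, which is where the memory saving comes from. The accuracy cost depends on how well a single scale fits the weight distribution.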
3. Tech Stack
PyTorch • TensorRT • CUDA • C++ Inference API • Docker • AWS (distributed training)
4. Results
- mAP: 0.87 mAP@0.5:0.95 on held-out test splits that include adversarial edge cases.
- Precision/Recall: Maintained 0.95+ recall on critical defect classes, so at most about 1 in 20 critical defects is missed.
- Latency Target Met: Beat the sub-20ms requirement with stable sub-15ms inference directly on constrained edge hardware.
- Throughput: Sustained 60+ FPS across extended production runs.
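The latency, throughput, and recall figures above can be cross-checked with simple arithmetic. The sketch below assumes frames are processed one at a time with no pipelining, and the helper names are hypothetical.

```python
def frame_budget_ms(fps):
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def false_negative_rate(recall):
    """Fraction of true defects missed, since recall = TP / (TP + FN)."""
    return 1.0 - recall


budget = frame_budget_ms(60)           # per-frame budget at 60 FPS
headroom = budget - 15.0               # slack left by a 15 ms inference
miss_rate = false_negative_rate(0.95)  # share of critical defects missed
```

At 60 FPS the per-frame budget is about 16.7 ms, so a 15 ms inference leaves under 2 ms for capture, preprocessing, and postprocessing unless those stages overlap with inference.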
5. Architectural Next Steps
- Move inference entirely from Python scripts to a custom C++ / GStreamer pipeline to cut buffer-management overhead.
- Explore knowledge distillation to transfer the accuracy of a larger teacher model into a Nano-sized student.
- Deploy model observability tooling to detect data drift quickly across varying factory lighting conditions.
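As a rough illustration of the knowledge distillation item above, the sketch below computes the classic temperature-scaled logit distillation loss in plain Python. It is an assumption about how the idea might be prototyped, not a planned implementation: distilling a detector would more likely target classification heads or feature maps, and the logits and temperature used here are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return kl * temperature ** 2


# Hypothetical logits for a three-class head.
teacher = [4.0, 1.0, 0.5]
student_close = [3.5, 1.2, 0.4]
student_far = [0.5, 1.0, 4.0]

loss_close = distillation_loss(student_close, teacher)
loss_far = distillation_loss(student_far, teacher)
```

A student whose outputs track the teacher's gets a smaller loss. In training, this term is usually combined with the ordinary supervised loss through a weighting factor.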
