Active Deterrence
Timeline: December 2025 – Present
Role: AI / Computer Vision Engineer
Status: 🟢 Active Development
Collaboration: SmartDrones, Karabela
Project description
Active Deterrence is an AI Vision system for autonomous security drones. The project addresses a critical gap in current drone security solutions: while existing systems can detect motion, they lack the intelligence to understand what they see and react appropriately.
Current limitations of autonomous security drones:
- detection without understanding — can detect movement but cannot classify what triggered the alert
- high false positive rates — alerts triggered by animals, shadows, leaves
- no intelligent response — operators are overwhelmed with notifications
- passive monitoring only — no capability for active deterrence
I am building an end-to-end AI pipeline that transforms raw drone footage into actionable security intelligence — detecting and classifying objects (persons, vehicles), filtering false alarms, and enabling automated voice deterrence when intruders are detected.
Architecture Overview
Active Deterrence is an image → detection → decision → voice pipeline:
- Video Ingestion — MP4/RTSP/JPG from drones (day + IR/night modes)
- Computer Vision Layer — YOLOv8 fine-tuned on drone footage
- Alert Logic — confidence threshold (40%+), alert cooldown, frame sampling
- Voice AI (Active Deterrence) — ElevenLabs TTS for real-time voice alerts (~300-500ms latency)
- Training Pipeline — GPU Droplet (H100/A100) on Digital Ocean with TensorBoard monitoring
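The alert-logic layer above can be sketched as a small gate combining its three filters. The class name, defaults, and the class-keyed cooldown here are illustrative assumptions, not the project's exact implementation:

```python
import time

class AlertGate:
    """Decides when a detection should trigger a voice alert.

    Applies the three alert-logic filters: a confidence threshold,
    a per-class cooldown so operators are not flooded with repeat
    notifications, and frame sampling so only every Nth frame counts.
    """

    def __init__(self, conf_threshold=0.40, cooldown_s=30.0, sample_every_n=5):
        self.conf_threshold = conf_threshold
        self.cooldown_s = cooldown_s
        self.sample_every_n = sample_every_n
        self._last_alert = {}   # class name -> timestamp of last alert
        self._frame_idx = 0

    def should_alert(self, class_name, confidence, now=None):
        """Return True if this detection should raise an alert."""
        now = time.monotonic() if now is None else now
        self._frame_idx += 1
        if self._frame_idx % self.sample_every_n != 0:
            return False    # frame sampling: skip this frame entirely
        if confidence < self.conf_threshold:
            return False    # below the confidence threshold
        last = self._last_alert.get(class_name)
        if last is not None and now - last < self.cooldown_s:
            return False    # this class is still in its cooldown window
        self._last_alert[class_name] = now
        return True
```

In a real loop, a `True` result would hand the class name to the TTS layer for the spoken warning.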
Model Training Journey
A key aspect of this project was the iterative training process. I trained the model over four iterations with different dataset combinations (the table below covers v2–v4):
| Model | Dataset | mAP@50 | Precision | Recall | Key Insight |
|---|---|---|---|---|---|
| v2 | Karabela only (218 img) | 82.4% | 86.7% | 85.4% | Baseline, overfitting observed |
| v3 | Karabela + HIT-UAV (~3k img) | 93.5% | 93.2% | 90.3% | Best performance |
| v4 | + VisDrone (~10k img) | 62.0% | 79.1% | 57.4% | Domain mismatch degradation |
Technical Learnings
More data ≠ better model
Adding ~10,000 VisDrone images degraded mAP@50 from 93.5% to 62.0%. Analysis revealed class imbalance (vehicle-heavy), domain mismatch (different altitudes and camera angles), and small-object challenges.
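One quick way to catch that imbalance before spending GPU time is to count class IDs across the label files. This sketch assumes the standard YOLO label format (`class_id cx cy w h`, one row per object) that Roboflow exports:

```python
from collections import Counter
from pathlib import Path

def class_distribution(labels_dir):
    """Count class frequencies across YOLO-format label files.

    The first column of each row is the class id, so a single pass
    over all .txt files reveals a vehicle-heavy (or person-heavy)
    dataset before any training run is started.
    """
    counts = Counter()
    for label_file in Path(labels_dir).glob("*.txt"):
        for line in label_file.read_text().splitlines():
            if line.strip():
                counts[int(line.split()[0])] += 1
    return counts
```

Comparing these counts between Karabela, HIT-UAV, and a candidate dataset like VisDrone makes the imbalance visible up front.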
Confusion Matrix Analysis is Critical
The v4 model showed 9,000 false negatives for the vehicle class — objects the model completely missed. This guided the optimization strategy.
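Per-class false negatives like the vehicle count above can be read straight off the confusion matrix. A minimal helper, assuming the Ultralytics layout (rows = predictions, columns = ground truth, last row/column = background, so completely missed objects land in the background row):

```python
import numpy as np

def per_class_false_negatives(cm, class_names):
    """Per-class false negatives from a detection confusion matrix.

    FN for a class is its ground-truth column total minus the diagonal
    (true-positive) count: every object of that class that was either
    misclassified or missed entirely.
    """
    cm = np.asarray(cm)
    fn = cm.sum(axis=0) - np.diag(cm)  # column total minus true positives
    return {name: int(fn[i]) for i, name in enumerate(class_names)}
```

Running this per class turned a single headline metric into a concrete target: the vehicle column dominated the misses.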
Small Object Detection is Hard
Vehicles in high-altitude drone footage often appear as objects smaller than 20 px. Solutions include SAHI (Slicing Aided Hyper Inference), tiling preprocessing, and larger model architectures.
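The tiling idea can be sketched in a few lines of NumPy. Tile size and overlap below are illustrative defaults; in practice the SAHI library also handles running inference per tile and merging the resulting boxes:

```python
import numpy as np

def tile_image(image, tile=640, overlap=0.2):
    """Split an image into overlapping tiles for small-object detection.

    Running the detector on tiles keeps distant vehicles above the
    model's effective minimum object size; the returned (x, y) offsets
    let tile-level boxes be mapped back to full-frame coordinates.
    The last tile in each axis is clamped so image edges are covered.
    """
    h, w = image.shape[:2]
    stride = max(1, int(tile * (1 - overlap)))
    xs = sorted(set(list(range(0, max(w - tile, 0) + 1, stride)) + [max(w - tile, 0)]))
    ys = sorted(set(list(range(0, max(h - tile, 0) + 1, stride)) + [max(h - tile, 0)]))
    return [(x, y, image[y:y + tile, x:x + tile]) for y in ys for x in xs]
```

Each tile is then fed to the detector at full resolution, so a 15 px vehicle in the original frame occupies a much larger share of the model's input.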
What I did
- Designed AI Vision system architecture for security drones
- Built data pipeline for ingestion and auditing from Google Drive
- Performed labeling in Roboflow (person, vehicle classes)
- Trained 4 model iterations with different datasets
- Integrated external datasets (HIT-UAV, VisDrone)
- Analyzed confusion matrix and identified small object problems
- Integrated ElevenLabs TTS for voice alerts
- Built Streamlit demo with full flow: upload → detection → alert
- Prepared GPU training infrastructure (Digital Ocean + TensorBoard)
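The labeling and training work above shares a two-class schema (person, vehicle). A YOLOv8-style `data.yaml` for it would look roughly like this; the paths and split names are placeholders, not the project's actual layout:

```yaml
# Illustrative YOLOv8 dataset config for the two-class schema
path: datasets/active-deterrence   # dataset root (placeholder)
train: images/train
val: images/val

names:
  0: person
  1: vehicle
```

Keeping the class indices stable across dataset merges (Karabela, HIT-UAV, VisDrone) is what makes the iteration-to-iteration metrics in the table comparable.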
Skills
| Category | Technologies |
|---|---|
| Computer Vision | YOLOv8, PyTorch, OpenCV, Roboflow |
| Deep Learning | Transfer Learning, Hyperparameter Tuning |
| MLOps | TensorBoard, Digital Ocean GPU, Docker |
| Voice AI | ElevenLabs TTS |
| Backend | FastAPI, Python |
| Frontend | Streamlit |
| Data | FFmpeg, Annotation Tools |
Results
- Model v3 with mAP@50 of 93.5% — best result
- Working demo: upload → detection → voice alert
- Identified small object detection problem (vehicle class)
- Training pipeline ready for scaling
- Voice AI system with ~300-500ms latency
- GPU training infrastructure on Digital Ocean
Roadmap
MVP (January 2026) ✅
- Computer Vision — person/vehicle detection
- Model v2/v3/v4 (iterative training)
- Streamlit demo with Voice AI
- Confusion matrix analysis
Next Phase: Hyperparameter Optimization (Q1 2026)
- GPU Droplet training with hyperparameter tuning
- SAHI workflow for small object detection
- Class balancing strategies
Future: MLOps & Production (Q1-Q2 2026)
- MLflow Tracking & Model Registry
- FastAPI backend deployment
- Real-time RTSP inference
- Integration with SmartDrones platform
Sample photos
