Active Deterrence
Timeline: December 2025 – Present
Role: AI / Computer Vision Engineer
Status: 🟢 Active Development
Collaboration: SmartDrones, Karabela
Project description
Active Deterrence is an AI Vision system for autonomous security drones. The project addresses a critical gap in current drone security solutions: while existing systems can detect motion, they lack the intelligence to understand what they see and react appropriately.
Current limitations of autonomous security drones:
- detection without understanding — can detect movement but cannot classify what triggered the alert
- high false positive rates — alerts triggered by animals, shadows, leaves
- no intelligent response — operators are overwhelmed with notifications
- passive monitoring only — no capability for active deterrence
I am building an end-to-end AI pipeline that transforms raw drone footage into actionable security intelligence — detecting and classifying objects (persons, vehicles), filtering false alarms, and enabling automated voice deterrence when intruders are detected.
Architecture Overview
Active Deterrence is an image → detection → decision → voice pipeline:
- Video Ingestion — MP4/RTSP/JPG from drones (day + IR/night modes)
- Computer Vision Layer — YOLOv8 fine-tuned on drone footage
- Alert Logic — confidence threshold (40%+), alert cooldown, frame sampling
- Voice AI (Active Deterrence) — ElevenLabs TTS for real-time voice alerts (~300-500ms latency)
- Training Pipeline — GPU Droplet (H100/A100) on Digital Ocean with TensorBoard monitoring
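The alert-logic layer above can be sketched as a small gate combining its three filters. The class name, defaults, and the class-keyed cooldown here are illustrative assumptions, not the project's exact implementation:

```python
import time

class AlertGate:
    """Decides when a detection should trigger a voice alert.

    Applies the three alert-logic filters: a confidence threshold,
    a per-class cooldown so operators are not flooded with repeat
    notifications, and frame sampling so only every Nth frame counts.
    """

    def __init__(self, conf_threshold=0.40, cooldown_s=30.0, sample_every_n=5):
        self.conf_threshold = conf_threshold
        self.cooldown_s = cooldown_s
        self.sample_every_n = sample_every_n
        self._last_alert = {}   # class name -> timestamp of last alert
        self._frame_idx = 0

    def should_alert(self, class_name, confidence, now=None):
        """Return True if this detection should raise an alert."""
        now = time.monotonic() if now is None else now
        self._frame_idx += 1
        if self._frame_idx % self.sample_every_n != 0:
            return False    # frame sampling: skip this frame entirely
        if confidence < self.conf_threshold:
            return False    # below the confidence threshold
        last = self._last_alert.get(class_name)
        if last is not None and now - last < self.cooldown_s:
            return False    # this class is still in its cooldown window
        self._last_alert[class_name] = now
        return True
```

In a real loop, a `True` result would hand the class name to the TTS layer for the spoken warning.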
Model Training Journey
A key aspect of this project was the iterative training process. I trained the model over four iterations with different dataset combinations (the table below covers v2–v4):
| Model | Dataset | mAP@50 | Precision | Recall | Key Insight |
|---|---|---|---|---|---|
| v2 | Karabela only (218 img) | 82.4% | 86.7% | 85.4% | Baseline, overfitting observed |
| v3 | Karabela + HIT-UAV (~3k img) | 93.5% | 93.2% | 90.3% | Best performance |
| v4 | + VisDrone (~10k img) | 62.0% | 79.1% | 57.4% | Domain mismatch degradation |
Technical Learnings
More data ≠ better model
Adding ~10,000 VisDrone images degraded mAP@50 from 93.5% to 62.0%. Analysis revealed class imbalance (vehicle-heavy), domain mismatch (different altitudes and camera angles), and small-object challenges.
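One quick way to catch that imbalance before spending GPU time is to count class IDs across the label files. This sketch assumes the standard YOLO label format (`class_id cx cy w h`, one row per object) that Roboflow exports:

```python
from collections import Counter
from pathlib import Path

def class_distribution(labels_dir):
    """Count class frequencies across YOLO-format label files.

    The first column of each row is the class id, so a single pass
    over all .txt files reveals a vehicle-heavy (or person-heavy)
    dataset before any training run is started.
    """
    counts = Counter()
    for label_file in Path(labels_dir).glob("*.txt"):
        for line in label_file.read_text().splitlines():
            if line.strip():
                counts[int(line.split()[0])] += 1
    return counts
```

Comparing these counts between Karabela, HIT-UAV, and a candidate dataset like VisDrone makes the imbalance visible up front.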
Confusion Matrix Analysis is Critical
The v4 model showed 9,000 false negatives for the vehicle class — objects the model completely missed. This guided the optimization strategy.
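Per-class false negatives like the vehicle count above can be read straight off the confusion matrix. A minimal helper, assuming the Ultralytics layout (rows = predictions, columns = ground truth, last row/column = background, so completely missed objects land in the background row):

```python
import numpy as np

def per_class_false_negatives(cm, class_names):
    """Per-class false negatives from a detection confusion matrix.

    FN for a class is its ground-truth column total minus the diagonal
    (true-positive) count: every object of that class that was either
    misclassified or missed entirely.
    """
    cm = np.asarray(cm)
    fn = cm.sum(axis=0) - np.diag(cm)  # column total minus true positives
    return {name: int(fn[i]) for i, name in enumerate(class_names)}
```

Running this per class turned a single headline metric into a concrete target: the vehicle column dominated the misses.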
Small Object Detection is Hard
Vehicles in high-altitude drone footage often appear as objects smaller than 20 px. Solutions include SAHI (Slicing Aided Hyper Inference), tiling preprocessing, and larger model architectures.
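The tiling idea can be sketched in a few lines of NumPy. Tile size and overlap below are illustrative defaults; in practice the SAHI library also handles running inference per tile and merging the resulting boxes:

```python
import numpy as np

def tile_image(image, tile=640, overlap=0.2):
    """Split an image into overlapping tiles for small-object detection.

    Running the detector on tiles keeps distant vehicles above the
    model's effective minimum object size; the returned (x, y) offsets
    let tile-level boxes be mapped back to full-frame coordinates.
    The last tile in each axis is clamped so image edges are covered.
    """
    h, w = image.shape[:2]
    stride = max(1, int(tile * (1 - overlap)))
    xs = sorted(set(list(range(0, max(w - tile, 0) + 1, stride)) + [max(w - tile, 0)]))
    ys = sorted(set(list(range(0, max(h - tile, 0) + 1, stride)) + [max(h - tile, 0)]))
    return [(x, y, image[y:y + tile, x:x + tile]) for y in ys for x in xs]
```

Each tile is then fed to the detector at full resolution, so a 15 px vehicle in the original frame occupies a much larger share of the model's input.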
What I did
- Designed AI Vision system architecture for security drones
- Built data pipeline for ingestion and auditing from Google Drive
- Performed labeling in Roboflow (person, vehicle classes)
- Trained 4 model iterations with different datasets
- Integrated external datasets (HIT-UAV, VisDrone)
- Analyzed confusion matrix and identified small object problems
- Integrated ElevenLabs TTS for voice alerts
- Built Streamlit demo with full flow: upload → detection → alert
- Prepared GPU training infrastructure (Digital Ocean + TensorBoard)
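The labeling and training work above shares a two-class schema (person, vehicle). A YOLOv8-style `data.yaml` for it would look roughly like this; the paths and split names are placeholders, not the project's actual layout:

```yaml
# Illustrative YOLOv8 dataset config for the two-class schema
path: datasets/active-deterrence   # dataset root (placeholder)
train: images/train
val: images/val

names:
  0: person
  1: vehicle
```

Keeping the class indices stable across dataset merges (Karabela, HIT-UAV, VisDrone) is what makes the iteration-to-iteration metrics in the table comparable.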
Skills
| Category | Technologies |
|---|---|
| Computer Vision | YOLOv8, PyTorch, OpenCV, Roboflow |
| Deep Learning | Transfer Learning, Hyperparameter Tuning |
| MLOps | TensorBoard, Digital Ocean GPU, Docker |
| Voice AI | ElevenLabs TTS |
| Backend | FastAPI, Python |
| Frontend | Streamlit |
| Data | FFmpeg, Annotation Tools |
Results
- Model v3 with mAP@50 of 93.5% — best result
- Working demo: upload → detection → voice alert
- Identified small object detection problem (vehicle class)
- Training pipeline ready for scaling
- Voice AI system with ~300-500ms latency
- GPU training infrastructure on Digital Ocean
Roadmap
MVP (January 2026) ✅
- Computer Vision — person/vehicle detection
- Model v2/v3/v4 (iterative training)
- Streamlit demo with Voice AI
- Confusion matrix analysis
Next Phase: Hyperparameter Optimization (Q1 2026)
- GPU Droplet training with hyperparameter tuning
- SAHI workflow for small object detection
- Class balancing strategies
Future: MLOps & Production (Q1-Q2 2026)
- MLflow Tracking & Model Registry
- FastAPI backend deployment
- Real-time RTSP inference
- Integration with SmartDrones platform
Sample photos
