Many datasets have been proposed for training CNN-based vehicle detectors, and many pre-trained weights can be applied directly to vehicle detection tasks. Because most vehicle detection work has been conducted on or near the ground, these image datasets were collected by ground vehicles or CCTV cameras. However, vehicle detectors trained on ground-camera datasets perform poorly on UAV video, so image datasets created from UAV videos are needed to build a reliable traffic monitoring pipeline on UAVs. We fine-tuned YOLOv3 using the aerial-cars-dataset. The trained CNN vehicle detector recognized most vehicles near the UAV camera, but not vehicles far away from it. This is because the training dataset contains bird's-eye view images only.
Images from M0606 of UAV-benchmark-M were then added to the training set, but the improvement was very limited. One reason is that this dataset labels only cars near the camera; cars far away from the camera are ignored. Another reason is that the image resolution of both aerial-cars-dataset and UAV-benchmark-M is relatively low (1024x540) compared to our aerial videos (2720x1530).
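To make the resolution gap concrete, here is a minimal sketch that compares the two frame sizes quoted above. The function and constant names are illustrative only and are not part of any library.

```python
def scale_factors(src_size, dst_size):
    """Return (horizontal, vertical) scale factors from src_size to dst_size."""
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    return dst_w / src_w, dst_h / src_h


PUBLIC_SIZE = (1024, 540)   # aerial-cars-dataset and UAV-benchmark-M frames
OURS_SIZE = (2720, 1530)    # frames from our aerial videos

sx, sy = scale_factors(PUBLIC_SIZE, OURS_SIZE)

# Ratio of total pixel counts between the two frame sizes.
pixel_ratio = (OURS_SIZE[0] * OURS_SIZE[1]) / (PUBLIC_SIZE[0] * PUBLIC_SIZE[1])

print(f"horizontal scale: {sx:.2f}x")
print(f"vertical scale:   {sy:.2f}x")
print(f"pixel count:      {pixel_ratio:.2f}x")
```

Our frames are roughly 2.7 times wider and 2.8 times taller than the public ones, so the same vehicle can appear at a noticeably different pixel size in the two sources.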
We therefore created a new dataset by labeling images from our own UAV videos. The final dataset used for fine-tuning the YOLOv3 vehicle detector is composed of 154 images from aerial-cars-dataset, 1374 images from UAV-benchmark-M, and 157 custom-labeled images of our own. The complete dataset is provided in dataset1, dataset2, dataset3, and dataset4.
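Darknet reads its training images from a text file with one image path per line, so the three sources have to be merged into such lists before training. The sketch below shows one possible way to do that. The 10% validation fraction, the fixed random seed, and all function names are assumptions made for illustration; they are not taken from the original training setup.

```python
import random
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def collect_images(folders):
    """Return a sorted list of image paths found directly inside each folder."""
    paths = []
    for folder in folders:
        for path in Path(folder).iterdir():
            if path.suffix.lower() in IMAGE_SUFFIXES:
                paths.append(path)
    return sorted(paths)


def split_train_valid(paths, valid_fraction=0.1, seed=0):
    """Shuffle paths reproducibly and split them into (train, valid) lists."""
    shuffled = list(paths)
    random.Random(seed).shuffle(shuffled)
    n_valid = int(len(shuffled) * valid_fraction)
    return shuffled[n_valid:], shuffled[:n_valid]


def write_list(paths, list_file):
    """Write one image path per line, the list format Darknet reads."""
    Path(list_file).write_text("".join(f"{p}\n" for p in paths))


# Stand-in file names matching the image counts quoted above
# (154 + 1374 + 157); real runs would use collect_images() instead.
demo = [f"img_{i:04d}.jpg" for i in range(154 + 1374 + 157)]
train, valid = split_train_valid(demo)
print(len(train), len(valid))
```

With real data, `collect_images` would be called on the three image folders and the two resulting lists passed to `write_list`. Each image also needs its label `.txt` file next to it, as produced by Yolo_mark.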
Here are the key steps for fine-tuning our UAV vehicle detector:
1. Install YOLOv3: AlexeyAB/darknet
   a. For CUDA compile issues, execute export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}} before running make
2. Install OpenCV (NOT opencv-python!), following How to install OpenCV 3.4.0 on Ubuntu 16.04
3. Custom dataset labeling: the dataset is created from our aerial videos with Yolo_mark
4. Fine-tune YOLOv3: How to Train
   a. If the error Out of memory appears, increase subdivisions in the .cfg file to 16, 32, or 64, following this
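The Out of memory fix works because Darknet splits each batch into `subdivisions` chunks and loads only `batch / subdivisions` images onto the GPU at a time, so a larger value lowers memory use. The following sketch shows how the setting could be changed programmatically. The helper names and the sample cfg fragment (including `batch=64`) are illustrative assumptions, not the contents of the actual yolov3_dji.cfg.

```python
import re


def set_subdivisions(cfg_text, subdivisions):
    """Return cfg_text with its first uncommented `subdivisions=` line replaced."""
    pattern = re.compile(r"^([ \t]*subdivisions[ \t]*=[ \t]*)\d+", flags=re.MULTILINE)
    new_text, count = pattern.subn(rf"\g<1>{subdivisions}", cfg_text, count=1)
    if count == 0:
        raise ValueError("no `subdivisions=` line found in the cfg text")
    return new_text


def images_per_gpu_step(batch, subdivisions):
    """Number of images Darknet loads onto the GPU at once."""
    return batch // subdivisions


# Illustrative fragment of a [net] section; real cfg files contain many more keys.
sample_cfg = "[net]\n# subdivisions=1\nbatch=64\nsubdivisions=16\nwidth=416\nheight=416\n"

updated = set_subdivisions(sample_cfg, 32)
print(updated)
print(images_per_gpu_step(64, 32))
```

Commented-out lines such as `# subdivisions=1` are left untouched because the pattern only matches lines that start with the key itself.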
Important command lines:
Training: ./darknet detector train data/dji.data cfg/yolov3_dji.cfg darknet53.conv.74
Testing: ./darknet detector demo data/dji.data cfg/yolov3_dji.cfg backup/yolov3_dji_final.weights DJI_0003.MOV -out_filename DJI_0003_dji.avi
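The training command above expects a `.data` file and a `.cfg` file prepared for the custom classes. As a rough sketch, the snippet below renders a `.data` file and computes the `filters` value that the AlexeyAB/darknet instructions require in the convolutional layer before each `[yolo]` layer, which is `(classes + 5) * 3` for YOLOv3. The class count of 1 and every file path except `backup/` are assumptions for illustration; `backup/` matches the weights path used in the testing command.

```python
def render_data_file(classes, train_list, valid_list, names_file, backup_dir):
    """Return the text of a Darknet .data file."""
    lines = [
        f"classes = {classes}",
        f"train = {train_list}",
        f"valid = {valid_list}",
        f"names = {names_file}",
        f"backup = {backup_dir}",
    ]
    return "\n".join(lines) + "\n"


def yolo_filters(classes):
    """filters= value for the conv layer before each [yolo] layer in YOLOv3."""
    return (classes + 5) * 3


# Hypothetical values: a single vehicle class and example list/names paths.
text = render_data_file(
    classes=1,
    train_list="data/train.txt",
    valid_list="data/valid.txt",
    names_file="data/dji.names",
    backup_dir="backup/",
)
print(text)
print("filters =", yolo_filters(1))
```

If the detector is trained with a different number of classes, both `classes` in the `.data` file and `filters` in the `.cfg` file have to change together.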
Please kindly cite this paper in your publications if this helps your research:
@article{wang2019orientation,
title={Orientation- and Scale-Invariant Multi-Vehicle Detection and Tracking from Unmanned Aerial Videos},
author={Wang, Jie and Simeonova, Sandra and Shahbazi, Mozhdeh},
journal={Remote Sensing},
volume={11},
number={18},
pages={2155},
year={2019},
publisher={Multidisciplinary Digital Publishing Institute}
}