This repository implements MA-SSD (Multi Attention SSD for Fast Detection of Small Objects). The implementation is based on the lufficc/SSD project.
MA-SSD | SSD |
---|---|
- Fast Small Object Detection: MA-SSD outperforms SSD at detection, especially on small objects. MA-SSD runs at over 100 FPS on a single RTX 2080 Ti GPU. On a Quadro P2000 GPU, it runs at over 23 FPS (SSD runs at 28 FPS).
- Neck Structure: In `ssd/modeling/neck/`, you can add or modify neck modules. A neck module is always placed between the backbone and the head.
- Inference Speed Calculation: When you run `demo.py`, it not only detects objects in the specified image folder but also calculates the FPS for each image and the average FPS over all images.
- Multi Attention Module
- Feature Fusion Module
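The per-image and average FPS figures described above can be measured roughly as in the following sketch. This is not the code in `demo.py`: `measure_fps` and the `predict` callable are hypothetical names, and the average is computed here as total images divided by total time, which is only one reasonable definition.

```python
import time
from typing import Callable, Iterable, List, Tuple


def measure_fps(predict: Callable[[object], object],
                images: Iterable[object]) -> Tuple[List[float], float]:
    """Return (per-image FPS list, average FPS) for a hypothetical predict()."""
    per_image_fps = []
    total_time = 0.0
    for image in images:
        start = time.perf_counter()
        predict(image)  # stand-in for one forward pass plus post-processing
        # Guard against a zero reading from a very coarse timer.
        elapsed = max(time.perf_counter() - start, 1e-9)
        per_image_fps.append(1.0 / elapsed)
        total_time += elapsed
    # Average FPS = number of images / total elapsed time (assumed definition).
    average_fps = len(per_image_fps) / total_time if per_image_fps else 0.0
    return per_image_fps, average_fps
```

When timing a model on a GPU, call `torch.cuda.synchronize()` before reading the clock, because CUDA kernels are launched asynchronously and the timer would otherwise stop too early.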
- Python3
- PyTorch 1.0 or higher
- yacs
- Vizer
- GCC >= 4.9
- OpenCV
git clone https://github.com/kevinchan04/MA-SSD.git
cd MA-SSD
# Required packages: torch torchvision yacs tqdm opencv-python vizer
pip install -r requirements.txt
# It's recommended to install the latest release of torch and torchvision.
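After installation, a quick check like the sketch below can confirm that the required packages are importable. The helper name `missing_modules` is hypothetical and not part of this repository; the module list mirrors the packages named above, with `opencv-python` imported as `cv2`.

```python
import importlib.util

# Import names for the packages listed above; opencv-python is imported as cv2.
REQUIRED_MODULES = ["torch", "torchvision", "yacs", "tqdm", "cv2", "vizer"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("All required packages found." if not missing else f"Missing: {missing}")
```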
For the Pascal VOC dataset, arrange the folders like this:
VOC_ROOT
|__ VOC2007
    |_ JPEGImages
    |_ Annotations
    |_ ImageSets
    |_ SegmentationClass
|__ VOC2012
    |_ JPEGImages
    |_ Annotations
    |_ ImageSets
    |_ SegmentationClass
|__ ...
`VOC_ROOT` defaults to the `datasets` folder in the current project. You can either create symlinks to `datasets` or run `export VOC_ROOT="/path/to/voc_root"`.
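The lookup described above (environment variable first, `datasets` as the fallback) and the expected folder layout can be checked with a short sketch like this one. `resolve_voc_root` and `missing_voc_dirs` are hypothetical helpers written for illustration; the repository performs its own lookup, which may differ in detail.

```python
import os
from pathlib import Path

# Sub-folders shown in the VOC layout above.
VOC_SUBDIRS = ["JPEGImages", "Annotations", "ImageSets", "SegmentationClass"]


def resolve_voc_root(default="datasets"):
    """Use $VOC_ROOT if it is set, otherwise fall back to the default folder."""
    return Path(os.environ.get("VOC_ROOT", default))


def missing_voc_dirs(voc_root, years=("VOC2007", "VOC2012")):
    """List the expected directories that do not exist under voc_root."""
    root = Path(voc_root)
    return [str(Path(year) / sub)
            for year in years
            for sub in VOC_SUBDIRS
            if not (root / year / sub).is_dir()]


if __name__ == "__main__":
    root = resolve_voc_root()
    print(f"VOC root: {root}")
    print(f"Missing directories: {missing_voc_dirs(root)}")
```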
For the COCO dataset, arrange the folders like this:
COCO_ROOT
|__ annotations
    |_ instances_valminusminival2014.json
    |_ instances_minival2014.json
    |_ instances_train2014.json
    |_ instances_val2014.json
    |_ ...
|__ train2014
    |_ <im-1-name>.jpg
    |_ ...
    |_ <im-N-name>.jpg
|__ val2014
    |_ <im-1-name>.jpg
    |_ ...
    |_ <im-N-name>.jpg
|__ ...
`COCO_ROOT` defaults to the `datasets` folder in the current project. You can either create symlinks to `datasets` or run `export COCO_ROOT="/path/to/coco_root"`.
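One way to read the symlink option is to link each top-level folder of an existing COCO copy into the project's `datasets` directory. The sketch below does that under this assumption; `link_into_datasets` is a hypothetical helper, not part of this repository, and it leaves any entry that already exists untouched.

```python
import os
from pathlib import Path


def link_into_datasets(source_root, datasets_dir="datasets"):
    """Symlink every top-level entry of source_root into datasets_dir."""
    source = Path(source_root).resolve()
    target_dir = Path(datasets_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for entry in sorted(source.iterdir()):
        link = target_dir / entry.name
        if link.exists() or link.is_symlink():
            continue  # leave anything that is already there untouched
        os.symlink(entry, link, target_is_directory=entry.is_dir())
        created.append(link)
    return created
```

For example, `link_into_datasets("/path/to/coco_root")` would create `datasets/annotations`, `datasets/train2014`, and `datasets/val2014` as links, provided those folders exist in the source.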
# for example, evaluate SSD300:
python test.py --config-file configs/vgg_att_ssd300_neckthreemed_voc0712.yaml --ckpt https://github.com/kevinchan04/MA-SSD/releases/download/1.0/vgg_att_ssd300_voc0712_neckthreemed.pth
# for example, evaluate SSD300 with 4 GPUs:
export NGPUS=4
python -m torch.distributed.launch --nproc_per_node=$NGPUS test.py --config-file configs/vgg_att_ssd300_neckthreemed_voc0712.yaml --ckpt https://github.com/kevinchan04/MA-SSD/releases/download/1.0/vgg_att_ssd300_voc0712_neckthreemed.pth
Running prediction on a folder of images is simple; the demo also reports the average inference speed (FPS):
python demo.py --config-file configs/vgg_att_ssd300_neckthreemed_voc0712.yaml --images_dir demo --ckpt https://github.com/kevinchan04/MA-SSD/releases/download/1.0/vgg_att_ssd300_voc0712_neckthreemed.pth
It will then download and cache `vgg_att_ssd300_voc0712_neckthreemed.pth` automatically, and the predicted images with boxes, scores and label names will be saved to the `demo/result` folder by default.
# for example, train SSD300:
python train.py --config-file configs/vgg_att_ssd300_neckthreemed_voc0712.yaml
# for example, train SSD300 with 4 GPUs:
export NGPUS=4
python -m torch.distributed.launch --nproc_per_node=$NGPUS train.py --config-file configs/vgg_att_ssd300_neckthreemed_voc0712.yaml
The configuration files that I provide assume training on a single GPU. When you change the number of GPUs, the hyper-parameters (`lr`, `max_iter`, ...) change as well. The learning rate is the sum over all GPUs, which means that if you train on 4 GPUs, `lr` should be set to 1e-3. In our experiments, a larger `lr` always required more warm-up iterations. `max_iter` is likewise the sum over all GPUs.
Model | VOC2007 test | COCO test-dev2015 |
---|---|---|
SSD300* | 77.2 | 25.1 |
SSD512* | 79.8 | 28.8 |
Backbone | Neck | Input Size | box AP | Model Size | Download |
---|---|---|---|---|---|
VGG16 | neckthreemed | 300 | 26.5 | 372MB | model |
Backbone | Neck | Input Size | mAP | Model Size | Download |
---|---|---|---|---|---|
VGG16 | neckthreemed | 300 | 79.9 | 307MB | model |
`neckthreemed` is the neck that combines multi attention with feature fusion. Please refer to the paper for more details on the comparison with other methods.