PyTorch_YOLOv4 - detect.py performs poorly compared with detect.py from YOLOv5 & 7; How to set optimal values for conf-thres & iou-thres #51
Comments
@valentinitnelav I really hoped we could compare the predictions with the same parameters (e.g. `conf-thres`, `iou-thres`), but in the end we should compare the best possible results, which would mean predicting with all possible combinations... that would be a lot of work.
@stark-t , what if we define, as objectively as possible, what "best" means? For each case we do a "grid search": I think I can write a bash script to run detect.py hundreds of times on a GPU. But I need your help to decide on the evaluation metrics. Would this work? Or does YOLO already have some suggestions on these? I saw that one could look at F1_curve.png (for YOLOv5 & 7) and get an optimal confidence. However, I didn't see a graph like this for YOLOv4. Could be that I should use the
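The F1-based choice of confidence mentioned above could be sketched like this. Assumptions: a hypothetical CSV `pr_curve.csv` with columns `conf,precision,recall` (neither repo exports such a file directly; it would have to be produced by an evaluation run). The sample rows below are made up for illustration.

```shell
# Write a tiny made-up precision/recall-vs-confidence table (assumption:
# in practice this would come from an evaluation script, not be hardcoded).
cat > pr_curve.csv <<'EOF'
conf,precision,recall
0.10,0.62,0.81
0.25,0.71,0.74
0.40,0.80,0.55
EOF

# Pick the confidence threshold that maximizes F1 = 2PR / (P + R).
best_conf=$(awk -F, 'NR > 1 && ($2 + $3) > 0 {
  f1 = 2 * $2 * $3 / ($2 + $3)          # harmonic mean of precision, recall
  if (f1 > best) { best = f1; conf = $1 }
} END { print conf }' pr_curve.csv)
echo "best conf-thres by F1: $best_conf"
```

With the sample table, the middle row wins because its precision/recall trade-off gives the highest harmonic mean.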
FYI: we did not find an answer in the links below, but the general idea is that the "best" approach is to find the optimal values for these parameters. We could see them as hyperparameters at inference time, especially since they will also impact the rate of false positives on our field images.
So, I just ran detect.py for YOLOv4 with:

```shell
--img-size 640 \
--conf-thres 0.25 \
--iou-thres 0.45
```

My hopes were ruined, because it again produced only 238 txt prediction files for the 1680 test image files, similar to using `best_overall.pt`:

```shell
# Number of txt files generated
cd ~/PAI/detectors/PyTorch_YOLOv4/runs/detect/
cd 3265637_detection_using_3217130_yolov4_pacsp_s_b8_e300_img640_hyp_custom
ls *txt | wc -l
# 238

# Number of jpg files in the test dataset
cd ~/datasets/P1_Data_sampled/test/images
ls *jpg | wc -l  # this will not catch png or jpeg files, but 1680 is the right number
# 1680
```

I will try all the other best-weights options and see what I get - see #50
Overview of weights trials using:

```shell
--img-size 640 \
--conf-thres 0.25 \
--iou-thres 0.45
```

Unfortunately, none of the weights options generated a number of detection txt files close to the total number of images in the test dataset. The best results were 238 out of 1680.

**best.pt** - Job id 3265637

```shell
# Number of txt files generated
cd ~/PAI/detectors/PyTorch_YOLOv4/runs/detect/
cd 3265637_detection_using_3217130_yolov4_pacsp_s_b8_e300_img640_hyp_custom
ls *txt | wc -l
# 238
```

**best_overall.pt** - Job id 3265668

```shell
cd ..
cd 3265668_detection_using_3217130_yolov4_pacsp_s_b8_e300_img640_hyp_custom
ls *txt | wc -l
# 238
```

**best_ap50.pt** - Job id 3265661

```shell
cd ..
cd 3265661_detection_using_3217130_yolov4_pacsp_s_b8_e300_img640_hyp_custom
ls *txt | wc -l
# 238
```

**best_ap.pt** - Job id 3265663

```shell
cd ..
cd 3265663_detection_using_3217130_yolov4_pacsp_s_b8_e300_img640_hyp_custom
ls *txt | wc -l
# 238
```

**best_f.pt** - Job id 3265662

```shell
cd ..
cd 3265662_detection_using_3217130_yolov4_pacsp_s_b8_e300_img640_hyp_custom
ls *txt | wc -l
# 92
```

**best_p.pt** - Job id 3265665

```shell
cd ..
cd 3265665_detection_using_3217130_yolov4_pacsp_s_b8_e300_img640_hyp_custom
ls *txt | wc -l
# 4
```

**best_r.pt** - Job id 3265666

```shell
cd ..
cd 3265666_detection_using_3217130_yolov4_pacsp_s_b8_e300_img640_hyp_custom
ls *txt | wc -l
# 238
```
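The per-folder counts above could also be gathered in one loop instead of repeating `cd`/`ls` per directory. A minimal sketch; `runs_demo` below is a throwaway stand-in for `runs/detect/` so the snippet is self-contained:

```shell
# Build a tiny demo layout standing in for runs/detect/ (assumption for
# illustration only; on the cluster you would loop over the real folders).
mkdir -p runs_demo/run_a runs_demo/run_b
touch runs_demo/run_a/img1.txt runs_demo/run_a/img2.txt runs_demo/run_b/img1.txt

# Print "<run dir> <number of prediction txt files>" for every run.
for d in runs_demo/*/; do
  echo "$d $(ls "$d"*.txt 2>/dev/null | wc -l)"
done
```

On the real tree, replacing `runs_demo/*/` with the `runs/detect/` path gives all seven counts in one pass.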
Hi @stark-t , given the results above, I think we need to do a grid search for the optimal values of `conf-thres` & `iou-thres`. Such a script will produce a dozen detection folders with txt label files that you can run through an evaluation script to compute performance metrics (e.g. precision, recall, average precision, F1, IoU). Then we can plot these values on the two axes of `conf-thres` & `iou-thres`. Is there a simpler approach to this issue?
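A hedged sketch of such a grid-search driver. It only prints the detect.py command for every (conf, iou) combination instead of executing it, so the grid can be reviewed before burning GPU hours; the weights path and run-name pattern are assumptions for illustration.

```shell
# Hypothetical weights path (assumption; adjust to the real checkpoint).
WEIGHTS="runs/train/exp/weights/best.pt"

n=0
for conf in 0.1 0.2 0.3 0.4 0.5; do
  for iou in 0.3 0.4 0.5 0.6; do
    # Echo instead of run, so the full grid can be inspected first.
    echo "python detect.py --img-size 640 --conf-thres $conf --iou-thres $iou --weights $WEIGHTS --name grid_c${conf}_i${iou}"
    n=$((n + 1))
  done
done
echo "total runs: $n"  # 5 conf values x 4 iou values = 20 combinations
```

Dropping the `echo` around the python command (or piping the output into `sh`) would execute the grid for real.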
@valentinitnelav maybe we can narrow down the steps from 10%, 20%, ..., 90%, since we already have some insight that lower thresholds work better, right?
Hi @stark-t , should I go ahead and close all the issues related to YOLOv4, since we dropped it from the results-comparison pipeline? I don't think I will get more time to investigate these issues at the moment.
We don't implement YOLOv4 any longer. See also the other related issues linked above. |
Hi @stark-t ,
Running inference with detect.py for YOLOv7 was very similar to YOLOv5. However, for PyTorch_YOLOv4, things got a bit less smooth.
This is the detection script which I just tried for PyTorch_YOLOv4.
These arguments are the same as for YOLOv7 (for which I sent you already the txt prediction files).
PyTorch_YOLOv4 doesn't have the `--nosave` option, so it also saves the images at inference time, and I didn't find an argument to stop this. It also has two new arguments, `--cfg` & `--names`, which are not used in detect.py of YOLOv7 or 5. The `pai.names` file must contain the label names.

Most disturbing is that the run of detect.py produces only 238 txt prediction files for the 1680 test image files.
Also, the .err files usually produced when running a cluster job are empty (as opposed to YOLOv5, which prints the time needed and extra info for each image).
I am not sure at this point what argument to change in detect.py of PyTorch_YOLOv4 to increase the number of detections.
I can reduce the values for `--conf-thres` & `--iou-thres`, but then the runs are no longer comparable with how I ran YOLOv7 & 5. Actually, the values above are the default values for v5 & 7. For v4 the defaults are `--conf-thres 0.4` & `--iou-thres 0.5` - see https://github.com/WongKinYiu/PyTorch_YOLOv4/blob/master/detect.py

Out of curiosity, I reduced these values, and I got 1443 txt prediction files for the 1680 test image files. Still a lower number than what I got for YOLOv7 with the default values (1668 txt files).
EDIT: However, I just saw that this creates too many prediction boxes per image. I will send you the results.
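One quick way to quantify "too many prediction boxes per image" is to count lines per label file, since in YOLO-format txt output one line is one box. A sketch with throwaway demo data; `demo_labels` stands in for the run's label folder (an assumption so the snippet is self-contained):

```shell
# Fake label files standing in for a detection run's txt output
# (class x_center y_center width height per line; values are made up).
mkdir -p demo_labels
printf '0 0.5 0.5 0.1 0.1\n0 0.2 0.2 0.1 0.1\n' > demo_labels/img1.txt
printf '1 0.7 0.7 0.2 0.2\n' > demo_labels/img2.txt

# One line = one box, so total lines / number of files = boxes per image.
files=$(ls demo_labels/*.txt | wc -l)
total=$(cat demo_labels/*.txt | wc -l)
echo "files=$files total_boxes=$total avg_boxes_per_image=$((total / files))"
```

Running this on the real run folders for each threshold combination would show which settings over-predict.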
All in all, how do we find a comparable situation when running detect.py on the test dataset between YOLO versions?