MixNet (Mix_Conv) - 0.360 (0.5) BFlops - 77.0% (71.5%) Top1 #4203
MixNet-L and MixNet-M have the same network architecture: we simply apply depth_multiplier 1.3 to MixNet-M to get MixNet-L, as shown in this code: https://github.com/tensorflow/tpu/blob/56e1058cba2b7b5ca233a4c9bfd7331a69082188/models/official/mnasnet/mixnet/mixnet_builder.py#L217
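A minimal sketch of the filter scaling that a depth multiplier implies, in the style of the TF reference code linked above (the divisor of 8 and the function name are assumptions here, not guaranteed to match that file exactly):

```python
def round_filters(filters, multiplier, divisor=8):
    """Scale a layer's filter count by the depth multiplier, then round
    to a multiple of `divisor` (hardware-friendly channel counts)."""
    filters *= multiplier
    new_filters = max(divisor, int(filters + divisor / 2) // divisor * divisor)
    # avoid shrinking by more than 10% after rounding
    if new_filters < 0.9 * filters:
        new_filters += divisor
    return new_filters

print(round_filters(24, 1.3))  # 32: a 24-filter MixNet-M stage widens to 32 in MixNet-L
```

So "same architecture" means the same block layout and kernel sizes; only the per-stage channel counts change.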
Explanation: for comparison with EfficientNet.
@AlexeyAB
@CuongNguyen218 thanks for sharing this. And yeah, it seems like AlexeyAB's cfg will apply the filters to the entire input tensor (like Inception).
Maybe the slice implementation is being called, but not split. @AlexeyAB
I added groups= and group_id= params to the [route] layer, so you can try to implement MixNet by using such blocks: #4203 (comment). But I didn't test it. Commit: 0fa9c8f
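Assuming the syntax from that commit (the same grouped-route form later used in CSP-style cfgs), a channel split could look like this cfg fragment, which takes the first half of the previous layer's channels:

```
[route]
layers = -1
groups = 2
group_id = 0
```

A second [route] with group_id=1 would take the other half, so each half can then go through a depthwise conv with a different kernel size.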
@AlexeyAB, how can I verify that it's correct?
@AlexeyAB
What is the
@CuongNguyen218 @dexception @beHappy666 @gnefihs @WongKinYiu @LukeAI I implemented the MixNet-M classification network, so you can try to train it on ImageNet. GPU: NVIDIA RTX 2070.
@AlexeyAB Hello,
I'd like to know what the difference is between these two comments, thanks.
The 1st is from the paper. Or what do you mean? MixNet is just a more efficient (Top1/FLOPs) modification of EfficientNet.
Just to make sure I understand correctly: the implemented MixNet-M is 0.256 BFLOPs, but the GPU version is 1.0 BFLOPs. I'll take a look at the cfg files after I finish my breakfast, thank you.
Yes, I just made some changes in MixNet-M (mixnet_m_gpu.cfg.txt) so it can be trained ~2x faster: 2.7 sec instead of 4.6 sec per training iteration, with the same inference speed on GPU. Maybe we should look at
Now training mixnet_m.cfg.txt: 0.256 BFLOPs, 4.6 sec per training iteration, 45 ms inference. Update: gets cuDNN Error: CUDNN_STATUS_INTERNAL_ERROR
@WongKinYiu Yes, I fixed it. BFLOPS 0.759, which is 0.379 FMA (the EfficientNet and MixNet authors count FMA). I successfully trained mixnet_m_gpu.cfg.txt for 10,000 iterations on Windows 7 x64.
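The factor of two comes from counting a fused multiply-add as one operation instead of two. A small sketch of per-layer convolution op counting under both conventions (the function name and layout are illustrative, not Darknet's API):

```python
def conv_flops(h_out, w_out, k, c_in, c_out, groups=1, fma=False):
    """Ops for one conv layer: each output element needs
    k*k*(c_in/groups) multiplies plus as many adds."""
    macs = h_out * w_out * k * k * (c_in // groups) * c_out  # multiply-accumulates
    return macs if fma else 2 * macs

# A 3x3 conv, 8 -> 16 channels, for one output pixel:
print(conv_flops(1, 1, 3, 8, 16))            # 2304 FLOPs
print(conv_flops(1, 1, 3, 8, 16, fma=True))  # 1152 FMA: exactly half
```

This is why 0.759 BFLOPS reported by Darknet corresponds to ~0.379 BFLOPs in the papers' FMA convention.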
@AlexeyAB thanks. I do not know why, but on every one of my Windows computers, training models with grouped convolution crashes.
`nvcc --version` (my Windows machine does not have nvidia-smi)
This is a very strange error: why is it trying to create another instance of the cuDNN handle when one is already created?
It should be in the
Do you use the latest version of Darknet?
Yes. I notice that nvidia-smi shows CUDA version 10.1.
Or just try to use a newer cuDNN version.
@WongKinYiu,
here you are: #4360
@AlexeyAB,
ImageNet and COCO models of EfficientNet-B0: #3874 (comment)
@AlexeyAB , what result did you get? |
mixnet-m-gpu: top-1 = 71.5%, top-5 = 90.5%.
@WongKinYiu Nice! Can you share the weights file?
Why are your results so different from the paper?
Because mixnet-m-gpu was designed by @AlexeyAB; it does not appear in the paper.
What do you think about converting a PyTorch or TensorFlow model to Darknet?
Because in the paper, MixNet and EfficientNet are trained with a very large mini_batch_size on a DGX-2 / cluster (~$400k to $1M). If we train with the same mini_batch_size, then the official EfficientNet-B0 has even lower Top1/Top5 accuracy than my EfficientNet-B0: https://github.com/WongKinYiu/CrossStagePartialNetworks#small-models Also, I slightly optimized MixNet for GPU so that it can be trained in 1 month instead of 2 months.
@CuongNguyen218 If you want, you can train the original MixNet-M on ImageNet: #4203 (comment)
https://github.com/AlexeyAB/darknet/files/3838329/mixnet_m.cfg.txt |
@AlexeyAB I just started looking into MixConvs. They seem very interesting! Do you know of anywhere they are applied to object detection, or are they only used in classification? EfficientDet was published in November 2019, while MixConv was published in July 2019, so the EfficientDet authors must have been aware of this type of convolution but, I'm thinking, neglected to use it for some reason.
All three articles share the same authors: MixNet, EfficientNet, EfficientDet.
Both EfficientNet and MixNet are not optimal for current CPUs/GPUs/neuro-chips (MyriadX, Coral Edge TPU). So they make such networks as reference networks to help create new neurochips (a new version of the Edge TPU). That may be the reason why they don't use MixNet for the detector: creating a neurochip for EfficientNet (grouped conv) is much easier than for MixNet (grouped conv with different kernel_size). Also, MixNet may have lower BFLOPs but still be slower.
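To make the "grouped conv with different kernel_size" point concrete, here is a minimal NumPy sketch of a MixConv-style depthwise forward pass (stride 1, 'same' padding; function names are illustrative, not from any framework, and real implementations vectorize this):

```python
import numpy as np

def depthwise_conv2d(x, k):
    """x: (C, H, W); k: (C, kh, kw) with odd kh, kw; stride 1, zero 'same' padding."""
    C, H, W = x.shape
    kh, kw = k.shape[1], k.shape[2]
    xp = np.pad(x, ((0, 0), (kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros_like(x)
    for c in range(C):
        for i in range(H):
            for j in range(W):
                out[c, i, j] = np.sum(xp[c, i:i + kh, j:j + kw] * k[c])
    return out

def mixconv(x, kernels):
    """Split channels into groups; each group gets its own depthwise kernel size."""
    sizes = [k.shape[0] for k in kernels]
    chunks = np.split(x, np.cumsum(sizes)[:-1], axis=0)
    return np.concatenate([depthwise_conv2d(c, k) for c, k in zip(chunks, kernels)],
                          axis=0)

x = np.ones((4, 5, 5))
out = mixconv(x, [np.ones((2, 3, 3)), np.ones((2, 5, 5))])  # 3x3 and 5x5 groups
print(out.shape)     # (4, 5, 5): channel count and resolution preserved
print(out[0, 2, 2])  # 9.0  (center of a 3x3 all-ones kernel on all-ones input)
print(out[2, 2, 2])  # 25.0 (same position for the 5x5 group)
```

The per-group kernel sizes are exactly what makes the op awkward for fixed-function hardware: a single grouped conv has one kernel shape, while MixConv needs several in one layer.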
@AlexeyAB Ah, I see, that's an interesting approach. Yes, it seems like hardware speeds for all of these new grouped-convolution techniques are quite slow, despite the lower parameter count.
Hi @AlexeyAB, I am trying to do inference with the MixNet model using your config and the pretrained weights mentioned at the start of the thread, but I am getting the error: "Error: in the file data/coco.names number of names 80 that isn't equal to classes=0 in the file cfg/mixnet_m_gpu.cfg". I tried running it on Ubuntu 18.04 with the command: "./darknet detector test cfg/coco.data cfg/mixnet_m_gpu.cfg mixnet_m_gpu_final.weights -ext_output data/dog.jpg"
Hi @AlexeyAB,
MixConv: Mixed Depthwise Convolutional Kernels.
Arxiv
Github
Top1 Acc: 78.9% on ImageNet with 0.56 BFLOPs. I think this idea is good.
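The core idea is splitting a depthwise convolution's channels across several kernel sizes. A small sketch of an even-split scheme (names are illustrative; the reference implementation in the linked GitHub repo may differ in details):

```python
def split_channels(total, num_groups):
    """Divide channels as evenly as possible; the remainder goes to the first group."""
    split = [total // num_groups] * num_groups
    split[0] += total - sum(split)
    return split

# e.g. 32 channels over kernel sizes 3x3, 5x5, 7x7:
for channels, ksize in zip(split_channels(32, 3), [3, 5, 7]):
    print(f"{channels} channels get a {ksize}x{ksize} depthwise kernel")
# 12 channels get a 3x3 depthwise kernel
# 10 channels get a 5x5 depthwise kernel
# 10 channels get a 7x7 depthwise kernel
```

Small kernels on most channels keep the cost low, while a few large-kernel channels add receptive field, which is where the Top1/BFLOPs gain comes from.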