[Feature] Support ViTPose #1876

Merged
merged 14 commits into from
Mar 14, 2023

Conversation

LareinaM
Collaborator

@LareinaM LareinaM commented Dec 12, 2022

Motivation

Add an implementation of ViTPose to MMPose 1.0.

Modification

  • Add six config files
  • Add layer decay optimizer
  • In HeatmapHead, add an upsample parameter that defaults to 0 (no effect on existing code); the resizing itself is done in BaseHead
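
The layer decay optimizer mentioned above is not shown in this conversation. As a rough illustration of the general idea only, here is a minimal sketch of layer-wise learning-rate decay, where earlier layers get smaller learning rates than later ones. The function names, the layer numbering, and the example parameter names are all hypothetical and are not taken from the PR.

```python
from typing import Dict, List


def layer_lr_scales(num_layers: int, decay_rate: float) -> List[float]:
    """Return one learning-rate multiplier per layer id.

    Assumed numbering (hypothetical): id 0 is the patch embedding,
    ids 1..num_layers are the transformer blocks, and id num_layers + 1
    covers everything after the backbone, such as the head.
    """
    last_id = num_layers + 1
    # The last layer keeps the base learning rate (multiplier 1.0);
    # each step towards the input multiplies the rate by decay_rate.
    return [decay_rate ** (last_id - layer_id) for layer_id in range(last_id + 1)]


def build_param_groups(layer_ids: Dict[str, int], base_lr: float,
                       num_layers: int, decay_rate: float) -> List[dict]:
    """Group parameter names by layer id and attach a scaled learning rate."""
    scales = layer_lr_scales(num_layers, decay_rate)
    groups: Dict[int, dict] = {}
    for name, layer_id in layer_ids.items():
        group = groups.setdefault(
            layer_id, {"params": [], "lr": base_lr * scales[layer_id]})
        group["params"].append(name)
    # Return the groups ordered from the input side to the output side.
    return [groups[i] for i in sorted(groups)]
```

In a real optimizer wrapper the groups would hold parameter tensors rather than names; names are used here so the sketch runs without any deep-learning framework installed.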

BC-breaking (Optional)

Use cases (Optional)

Checklist

Before PR:

  • I have read and followed the workflow indicated in the CONTRIBUTING.md to create this PR.
  • Pre-commit or linting tools indicated in CONTRIBUTING.md are used to fix the potential lint issues.
  • Bug fixes are covered by unit tests, the case that causes the bug should be added in the unit tests.
  • New functionalities are covered by complete unit tests. If not, please add more unit tests to ensure correctness.
  • The documentation has been modified accordingly, including docstring or example tutorials.

After PR:

  • CLA has been signed and all committers have signed the CLA in this PR.

@LareinaM LareinaM marked this pull request as ready for review December 12, 2022 03:27
@ly015 ly015 changed the title from "Dev 1.x" to "[Feature] Support ViTPose" Dec 12, 2022
@jin-s13
Collaborator

jin-s13 commented Dec 12, 2022

Have you tried training a base ViT model to check the accuracy?

@LareinaM LareinaM closed this Dec 12, 2022
@LareinaM LareinaM reopened this Dec 12, 2022
@jin-s13 jin-s13 mentioned this pull request Jan 4, 2023
@LareinaM
Collaborator Author

LareinaM commented Feb 7, 2023

Result of the current implementation

With classic decoder

| Arch | Input Size | AP | AR |
| --- | --- | --- | --- |
| ViTPose-S | 256x192 | 0.739 | 0.792 |
| ViTPose-B | 256x192 | 0.757 | 0.810 |
| ViTPose-L | 256x192 | 0.782 | 0.834 |
| ViTPose-H | 256x192 | 0.788 | 0.839 |

With simple decoder

| Arch | Input Size | AP | AR |
| --- | --- | --- | --- |
| ViTPose-S | 256x192 | 0.736 | 0.790 |
| ViTPose-B | 256x192 | 0.756 | 0.809 |
| ViTPose-L | 256x192 | 0.781 | 0.833 |
| ViTPose-H | 256x192 | 0.789 | 0.839 |

Result of original ViTPose implementation

With classic decoder

| Model | Input Size | AP | AR |
| --- | --- | --- | --- |
| ViTPose-S | 256x192 | 0.738 | 0.792 |
| ViTPose-B | 256x192 | 0.758 | 0.811 |
| ViTPose-L | 256x192 | 0.783 | 0.835 |
| ViTPose-H | 256x192 | 0.791 | 0.841 |

With simple decoder

| Model | Input Size | AP | AR |
| --- | --- | --- | --- |
| ViTPose-S | 256x192 | 0.735 | 0.789 |
| ViTPose-B | 256x192 | 0.755 | 0.809 |
| ViTPose-L | 256x192 | 0.782 | 0.834 |
| ViTPose-H | 256x192 | 0.789 | 0.840 |
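
To make the comparison between the two sets of results easier to read, the following sketch computes the per-model AP and AR differences (this PR minus the original implementation). The numbers are copied from the tables in this comment; the variable and function names are only illustrative.

```python
# (AP, AR) pairs copied from the result tables in this comment.
THIS_PR = {
    "classic": {
        "ViTPose-S": (0.739, 0.792), "ViTPose-B": (0.757, 0.810),
        "ViTPose-L": (0.782, 0.834), "ViTPose-H": (0.788, 0.839),
    },
    "simple": {
        "ViTPose-S": (0.736, 0.790), "ViTPose-B": (0.756, 0.809),
        "ViTPose-L": (0.781, 0.833), "ViTPose-H": (0.789, 0.839),
    },
}
ORIGINAL = {
    "classic": {
        "ViTPose-S": (0.738, 0.792), "ViTPose-B": (0.758, 0.811),
        "ViTPose-L": (0.783, 0.835), "ViTPose-H": (0.791, 0.841),
    },
    "simple": {
        "ViTPose-S": (0.735, 0.789), "ViTPose-B": (0.755, 0.809),
        "ViTPose-L": (0.782, 0.834), "ViTPose-H": (0.789, 0.840),
    },
}


def deltas(ours, reference):
    """Return {(decoder, model): (delta_ap, delta_ar)}, rounded to 3 decimals."""
    out = {}
    for decoder, models in ours.items():
        for model, (ap, ar) in models.items():
            ref_ap, ref_ar = reference[decoder][model]
            out[(decoder, model)] = (round(ap - ref_ap, 3), round(ar - ref_ar, 3))
    return out


DELTAS = deltas(THIS_PR, ORIGINAL)
# Key of the entry with the largest absolute AP difference.
WORST_AP = max(DELTAS, key=lambda key: abs(DELTAS[key][0]))
```

On these numbers the largest AP gap is 0.003 (ViTPose-H with the classic decoder); every other entry is within 0.001 AP of the original implementation.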

@codecov

codecov bot commented Feb 15, 2023

Codecov Report

Patch coverage: 6.34% and project coverage change: -0.46 ⚠️

Comparison is base (d341f11) 82.22% compared to head (e22d6c8) 81.77%.

❗ Current head e22d6c8 differs from pull request most recent head 23b7838. Consider uploading reports for the commit 23b7838 to get more accurate results

Additional details and impacted files
@@             Coverage Diff             @@
##           dev-1.x    #1876      +/-   ##
===========================================
- Coverage    82.22%   81.77%   -0.46%     
===========================================
  Files          225      227       +2     
  Lines        13375    13438      +63     
  Branches      2269     2285      +16     
===========================================
- Hits         10998    10989       -9     
- Misses        1864     1933      +69     
- Partials       513      516       +3     
| Flag | Coverage Δ |
| --- | --- |
| unittests | 81.77% <6.34%> (-0.46%) ⬇️ |

| Impacted Files | Coverage Δ |
| --- | --- |
| mmpose/engine/optim_wrappers/__init__.py | 0.00% <0.00%> (ø) |
| ...engine/optim_wrappers/layer_decay_optim_wrapper.py | 0.00% <0.00%> (ø) |
| mmpose/models/heads/heatmap_heads/heatmap_head.py | 82.19% <23.07%> (-5.78%) ⬇️ |
| mmpose/models/heads/base_head.py | 81.81% <33.33%> (-2.31%) ⬇️ |

... and 2 files with indirect coverage changes

Comment on lines +67 to +68
extra (dict, optional): Extra configurations.
Defaults to ``None``
Member

The argument extra is convenient for extending the head class but may confuse users. We should keep a well-defined interface where every argument has a clear meaning and a detailed usage description in the docstring.

Here I think maybe we can use existing arguments conv_out_channels, conv_kernel_sizes, and has_final_layer to configure the final conv layers.

Collaborator Author

@LareinaM LareinaM Mar 9, 2023

To keep the code clear and simple, would it be better to split the dictionary into two parameters, e.g. input_upsample (defaults to 0) and final_kernel_size (defaults to 1)?
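
As a rough illustration of the proposal in this thread, the sketch below replaces a free-form extra dictionary with two explicit, documented arguments. The class is a hypothetical stand-in, not the actual HeatmapHead; treating input_upsample as a spatial scale factor (with 0 meaning "no resize") and deriving "same" padding from the kernel size are assumptions made for the example.

```python
from typing import Tuple


class HeadArgsSketch:
    """Hypothetical stand-in for a heatmap head's constructor arguments.

    Args:
        input_upsample (int): Assumed to be the factor by which the input
            feature map is upsampled before the final layer. ``0`` keeps
            the input size unchanged. Defaults to ``0``.
        final_kernel_size (int): Kernel size of the final conv layer.
            Defaults to ``1``.
    """

    def __init__(self, input_upsample: int = 0, final_kernel_size: int = 1):
        if input_upsample < 0:
            raise ValueError("input_upsample must be non-negative")
        if final_kernel_size < 1:
            raise ValueError("final_kernel_size must be at least 1")
        self.input_upsample = input_upsample
        self.final_kernel_size = final_kernel_size

    def resized_shape(self, height: int, width: int) -> Tuple[int, int]:
        """Spatial size of the feature map after the optional upsampling."""
        if self.input_upsample == 0:
            return height, width
        return height * self.input_upsample, width * self.input_upsample

    @property
    def final_padding(self) -> int:
        """Padding that preserves spatial size for an odd kernel size."""
        return (self.final_kernel_size - 1) // 2
```

With explicit arguments, each option gets its own default and docstring entry, which addresses the concern that an opaque dictionary may confuse users.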

@@ -101,6 +104,21 @@ def __init__(self,
            self.decoder = KEYPOINT_CODECS.build(decoder)
        else:
            self.decoder = None
        self.upsample = 0
Member

I suggest adding a new argument, e.g. input_upsample or input_rescale.

shuheilocale pushed a commit to shuheilocale/mmpose that referenced this pull request May 6, 2023
3 participants