Elaborate on skinning weights sum #1749
Conversation
I brought up this specific case back when normalized weights were first made required: #1352 (comment). My impression was that some tolerance for this problem was supposed to be allowed? Isn't this a breaking change? There's probably plenty of code that quantizes weights by multiplying by 255 and rounding.
@scurest

> Isn't this a breaking change?

There were plenty of reports of models (even in our own Sample-Models repo) behaving incorrectly because of non-unit sum (like
This PR just clarifies what "unit linear sum" means for specific component types.
I'm surprised by the results seen in the screenshots here: KhronosGroup/glTF-Validator#124 (comment). While it might be true that this is related to the weights not having exactly the same values as the originals, I suspect it is also happening because the original model's hands have some undesired influences from unrelated bones, and quantized weights have made this more apparent. So maybe in most cases it might not be so noticeable. In fact, in other cases where the renormalization is uneven, it might have undesirable effects too!

I also fear what @scurest has said: even if the schema clarifies this scenario, everybody has been assuming some level of tolerance on all component types. After all, "if some tolerance is allowed for the float component type, why not allow it for the others?" So to me, even if this is a clarification, it's also a breaking change.

Maybe a compromise solution would be to have TWO tolerances: if the weights have an exact sum of 1, all is good, but if the sum is merely very close to 1, issue a warning instead of an error. This would allow current implementations to keep working, and give due warning to those that want to walk the extra mile. In any case, the specification and glTF-Validator need to be aligned, so developers following the specification will not hit glTF-Validator errors.
I don't think the result is very surprising. If you have any surface that's skinned to multiple bones, the influences of the bones will gradually decrease/increase. As a result, the sum error after quantization will fluctuate between -1/255, 0/255 and 1/255 (occasionally getting to +-2/255 for 4-weight sources). This should just naturally occur for any regions that have >2 weights. If you have exactly 2 weights, then I think this will be rare since the sum should mostly result in precisely 255. But even if you have 2 weights, and they are both exactly equal to 0.5, naive quantization will introduce an error of 1/255.

When you have any error, you will see a problem. Consider the character in that screenshot. The hands look like they are at about +-0.6m from the center of the character along the X axis. The error of 1/255 results in 0.2cm in the steady position, which is 10% of the finger width. I think this is why the fingers are visibly distorted even in the first screenshot.

Now, if you use world-space skinning (which three.js does, and it is suboptimal for a few reasons, but a straightforward implementation of the glTF spec would also skin in world space), then moving the character just 5 meters to the right would place the hands at 5.6m from the origin, and so the error becomes 2cm. 50 meters, and it's 20 cm. This is exactly what you see in the screenshots. The reason why you see the vertices "oscillating" is that the error of the sum oscillates.

For 16-bit weights, the problem is much less severe because the error from naive quantization is 256 times smaller - so you have to move the character much further from the origin to see it. But for 8-bit weights it's severe.

Having said that, it might make sense to change this to be a warning? I am not sure I fully understand the distinction.
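The arithmetic above can be checked with a tiny back-of-the-envelope sketch: with world-space skinning, a weight-sum error `e` displaces the vertex by roughly `e` times its distance from the origin (the function name below is hypothetical, just for illustration).

```python
# Approximate world-space displacement caused by a weight-sum error.
# If the weights sum to (1 + e) instead of 1, a vertex at distance d
# from the skinning origin is displaced by roughly e * d.

def position_error_m(sum_error, distance_m):
    return sum_error * distance_m

e = 1 / 255  # one quantization step for 8-bit normalized weights

print(position_error_m(e, 0.6))   # hand at 0.6 m from the center: ~0.0024 m (0.2 cm)
print(position_error_m(e, 5.6))   # character moved 5 m away: ~0.022 m (2 cm)
print(position_error_m(e, 50.6))  # 50 m away: ~0.198 m (about 20 cm)
```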
Currently glTF-Validator issues an error when it finds weights that are not perfectly normalized. Given that the "naive normalization" approach is what most developers will naturally take, maybe being so strict is not good for adoption. But given that not many developers are writing glTFs with packed weights either, I don't think being strict will affect many developers...
Or normalize them in the shader or whatever. CesiumMan has non-normalized weights, so any engine that can load it is already doing some kind of normalization. Having the exporter write round(255*weight) and then having the engine normalize after de-quantization is the best case for accuracy. Any adjustment you make to force the sum to 255 increases the absolute error. Btw, the algorithm linked above is not necessarily the best way to do it, even with just four weights: it concentrates all the error in the maximal element. Here are some weights quantized with just round(255*weight). These are the best possible per-component errors. The quantized column sums to 257.
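The two-weights-of-0.5 case discussed above can be reproduced in a couple of lines of naive quantization; each component individually has the smallest possible error, yet the sum still overshoots 255.

```python
# Naive 8-bit quantization: round each weight independently.
# For two weights of exactly 0.5, 255 * 0.5 = 127.5, which rounds to 128
# under both round-half-up and round-half-to-even, so the sum overshoots.

weights = [0.5, 0.5]
quantized = [round(255 * w) for w in weights]

print(quantized)       # [128, 128]
print(sum(quantized))  # 256 -> a sum error of 1/255 after de-quantization
```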
Adjusting the quantized weights to sum to 255 by modifying only the maximal element increases the absolute error.
OTOH, if you defray it over two elements, the absolute error is still the best possible.
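One way to realize this idea is to apply the +/-1-step sum correction one step at a time, each time to the currently largest element, so no single weight absorbs the whole correction. This is a hypothetical sketch, not the algorithm linked in the thread.

```python
# Quantize normalized weights to 'bits' and force the sum to exactly
# (2**bits - 1), spreading the correction one step at a time across the
# largest elements instead of dumping it all into the single maximal one.

def quantize_normalized(weights, bits=8):
    scale = (1 << bits) - 1          # 255 for 8-bit weights
    q = [round(scale * w) for w in weights]
    excess = sum(q) - scale          # distance of the sum from exact normalization
    step = 1 if excess > 0 else -1
    for _ in range(abs(excess)):
        # Adjust the currently largest element by a single step; ties go
        # to the first occurrence, so repeated steps touch different elements.
        i = max(range(len(q)), key=lambda k: q[k])
        q[i] -= step
    return q

print(quantize_normalized([0.5, 0.5]))  # [127, 128] -> sums to exactly 255
```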
Yes, spreading the error across multiple elements can reduce the per-component error, but this is less important than minimizing the error in the sum. I agree that normalizing in the shader can work around these issues, but do all engines do that? three.js clearly doesn't (it normalizes during load - even that should be unnecessary; it's only doing this because some models were broken).
If it's normalizing on load, shouldn't it be adjusting these quantized weights at load? Or does it only normalize floats or something? |
@zeux How does CesiumMan look in three.js? I think we're at a sort of crossroads, where we need to choose between requiring renormalization after quantization (which, as @scurest said, is not guaranteed to solve the problem) and requiring engines to renormalize on load.
I believe it's the floats, so the deviation is evenly distributed across all weights, instead of overshooting one weight.
Can you expand on this?
It only normalizes floating-point weights right now. In general I don't think the specification should assume that all loaders need to normalize weights on load; the only reason why three.js did it is that bugs with broken models were reported and needed to be fixed.
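For reference, load-time renormalization of floating-point weights can be sketched as follows; this is similar in spirit to what such a loader does, not three.js's actual code, and the zero-sum handling is an assumption.

```python
# Renormalize a vertex's float weights on load so they sum to 1.
# A zero-sum (unskinned) vertex is left untouched here - how such
# vertices should be treated is an open question in this thread.

def normalize_weights(weights):
    s = sum(weights)
    if s == 0.0:
        return list(weights)  # nothing to normalize
    return [w / s for w in weights]

print(normalize_weights([0.5, 0.3, 0.3]))  # divides each weight by ~1.1
```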
Technically, the allowed level of tolerance (
At first, the spec didn't mention the weights sum at all. Then, after multiple reports of unexpected behavior, we agreed on "linear unit sum" in general. Now, since this check has been implemented and we have some specific examples of its impact, it's time to make the spec more concrete.

This change certainly invalidates assets with obviously wrong weights - that's fine. There are also assets that were valid all along - that's great. Everything in the middle has some issues that may be visible only in a particular pose or with exaggerated scale (when engines do not fix that somehow, like CesiumMan in three.js).

Keep in mind that one of the primary goals of the validator is to improve glTF compatibility with different engines/platforms/environments by subjecting assets to ever-tightening rules. A less picky workflow may knowingly override the corresponding validation severity code or mute it completely.
In my opinion, a warning should not lead to a broken render (in general).
As a rule of thumb, glTF prefers putting the burden on exporters.
Please review the updated spec language.
The wording threads the needle of (1) not normatively specifying something too arbitrary, (2) explaining the motivation clearly, and (3) setting expectations for the validator. Looks good to me, thanks @lexaknyazev. 👍
Final nit. The validation note says:
It may be unclear at what number of weights threshold accumulation starts, i.e., whether
Additionally, given how exquisitely precise this requirement is, I would also suggest picking the old CesiumMan and other models that have invalid weights and moving them to the gltf-generation collection as negative load cases.
The validator's behavior (considering that case to be one non-zero weight, for a total threshold of 2e-7) seems like the clearest option to me.
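The per-non-zero-weight interpretation discussed here can be sketched as a simple check; the 2e-7 constant follows the thread, but the accumulation rule and the exemption for zero-sum vertices are assumptions of this sketch, not normative spec language.

```python
# Check a vertex's weight sum against a tolerance that scales with the
# number of non-zero weights (2e-7 per non-zero weight, per the thread).

def weight_sum_in_tolerance(weights, per_weight_tol=2e-7):
    nonzero = [w for w in weights if w != 0.0]
    if not nonzero:
        return True  # fully unskinned vertex: treated as exempt here
    threshold = per_weight_tol * len(nonzero)
    return abs(sum(nonzero) - 1.0) <= threshold

print(weight_sum_in_tolerance([0.25, 0.25, 0.25, 0.25]))  # True
print(weight_sum_in_tolerance([0.5, 0.4]))                # False: sum is 0.9
```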
I'm not a skinning expert, so @donmccurdy or @bghgary, can one of you merge this when ready?
@lexaknyazev your comment (#1749 (comment)) asks about the wording, and to me the wording seems clear. Is this OK to merge, or was there more clarification you feel is needed, like saying "regardless of weight order"?
@donmccurdy Some implementations read only three weights and use
Ok, I have no preference on that. |
Let's skip adding further requirements for now, a single clear requirement for the skinning sum is a good starting point. |
It's more of a disambiguation than an extra requirement.
Sorry, I can't understand why
Having written my own skinning implementation earlier this year, I finally understand certain issues, such as the importance of the weights summing to 1.0. That said, it does make sense to allow some tolerance, particularly for systems that use different floating-point representations and must convert. With such tolerance allowed, the
After going through the original rationale one more time, I agree with allowing @donmccurdy @emackey
specification/2.0/README.md (outdated)

> **Implementation Note:** Since the allowed threshold is much lower than the minimum possible step for quantized component types, exporters should just renormalize the weight sum after quantization.

Without these requirements, vertices would be deformed significantly because the weight error would get multiplied by the joint position. For example, an error of `1/255` in the weight sum would result in an unacceptably large difference in the joint position.
Should this line be part of the implementation note immediately above it?
I think I would merge the non-implementation-note lines into a single paragraph, followed by two implementation notes. No need to interleave them sentence by sentence.
Other than the minor layout issue @emackey mentioned, this looks good to me. 👍
@lexaknyazev What happens if some vertices are not skinned, e.g., if they have zero-sum weights?
Following the discussion from KhronosGroup/glTF-Validator#132.
Rationale is borrowed from KhronosGroup/glTF-Validator#124 (comment) by @zeux.
/cc @vpenades