Consider removing Base 64 / Data URIs in future glTF version #1915

donmccurdy · 2020-12-01T22:27:48Z

Currently glTF has three common packing arrangements:

(1) .gltf + .bin + textures (separate)
(2) .gltf (embedded)
(3) .glb (binary embedded)

Option (2), a .gltf with embedded binary data, adds 20-30% to the overall file size, and a non-trivial amount of extra processing cost to parse the base64 data (at least in JS). This has repeatedly been a stumbling block for three.js users, who understandably don't know the difference between the different options. I've had a difficult time explaining the difference to users — I think users sort of understand the .glb vs .gltf difference, but not the two types of .gltf file.
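As a rough illustration of where the size increase comes from, the sketch below Base64-encodes a made-up binary payload and compares sizes. The payload and MIME type are assumptions for the example; Base64 itself grows the encoded bytes by about one third, and the overall figure for a real asset depends on how much of the file is binary data.

```javascript
// Hypothetical 3 MiB binary payload standing in for a glTF buffer.
const payload = Buffer.alloc(3 * 1024 * 1024, 0xab);

// Base64 maps every 3 input bytes to 4 output characters.
const encoded = payload.toString('base64');
const dataUri = 'data:application/octet-stream;base64,' + encoded;

// Relative growth of the embedded form over the raw bytes.
const growth = dataUri.length / payload.length - 1;
console.log(`binary:   ${payload.length} bytes`);
console.log(`data URI: ${dataUri.length} characters`);
console.log(`growth:   ${(growth * 100).toFixed(1)}%`);
```

This only measures size; the extra parse cost of decoding the string back into bytes at load time is separate.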

I'm not aware of any confirmed need for the embedded .gltf Data URIs except for convenient debugging within a single file. Seeing users ship slower/larger embedded .gltf files without understanding the performance cost, or even compare glTF unfavorably to other formats based on size differences related to these Data URIs, has me wondering if we are paying too high a cost for this debugging feature.

Perhaps if it had a different file extension (.gltf-debug?) it would be easier to communicate to users, but I'm tempted to suggest that we drop the option for Data URIs in the next version of glTF (whenever that might be). Thoughts?

donmccurdy · 2020-12-01T22:30:43Z

In any case, we should also try to ensure that tools don't use (2) as their default output. A few do this today, such as glTF-Pipeline.

lexaknyazev · 2020-12-01T22:53:16Z

I wouldn't call data URIs a "debugging feature" since text editors usually do not like huge multi-megabyte strings. Data URIs have almost no extra debugging value compared to keeping binary resources in external files (the latter could be more efficiently edited with specialized tools).

Instead, I think that data URIs in glTF should be viewed just like data URIs in HTML/CSS - used only for very small binary payloads. The pros/cons are basically the same across all web technologies.

+1 for ensuring that tools do not produce them by default. Maybe add a new validation issue when the embedded binary size exceeds a certain threshold?
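A check along these lines could be sketched as follows. This is not the glTF-Validator's implementation; the function names, the warning shape, and the 100 KiB threshold are all hypothetical, and the size is estimated from the Base64 length rather than by decoding.

```javascript
// Hypothetical threshold; the thread does not settle on a value for the validator.
const MAX_EMBEDDED_BYTES = 100 * 1024;

// Estimate the decoded size of a data URI without decoding it.
function embeddedByteLength(uri) {
  const comma = uri.indexOf(',');
  if (!uri.startsWith('data:') || comma === -1) return 0; // external reference
  const header = uri.slice(5, comma);
  const body = uri.slice(comma + 1);
  if (!header.endsWith(';base64')) return body.length; // rough: ignores percent-encoding
  const padding = body.endsWith('==') ? 2 : body.endsWith('=') ? 1 : 0;
  return Math.floor((body.length * 3) / 4) - padding;
}

// Collect a warning for every buffer or image whose embedded payload is too large.
function findLargeDataUris(gltf, maxBytes = MAX_EMBEDDED_BYTES) {
  const warnings = [];
  for (const key of ['buffers', 'images']) {
    (gltf[key] || []).forEach((entry, index) => {
      if (typeof entry.uri !== 'string') return;
      const bytes = embeddedByteLength(entry.uri);
      if (bytes > maxBytes) warnings.push({ path: `/${key}/${index}/uri`, bytes });
    });
  }
  return warnings;
}

// Made-up asset: one large embedded buffer, one external buffer, one tiny embedded image.
const sample = {
  buffers: [
    { uri: 'data:application/octet-stream;base64,' + Buffer.alloc(200 * 1024).toString('base64') },
    { uri: 'mesh.bin' },
  ],
  images: [{ uri: 'data:image/png;base64,iVBORw0KGgo=' }],
};
console.log(findLargeDataUris(sample));
```

Small payloads pass silently, which matches the HTML/CSS-style view that data URIs are acceptable for very small binary content.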

donmccurdy · 2020-12-02T00:29:32Z

> Maybe add a new validation issue when the embedded binary size exceeds a certain threshold?

That's a great idea. 👍

zeux · 2020-12-12T02:33:27Z

This is a bit tangential, but it may complement the proposal well: if we simultaneously add a way to store multiple buffers in a GLB instead of just a single one, that would clean up the buffer storage story. Migration between .gltf + external files and .glb would become simpler because the existing buffer view / buffer structure could be maintained, and in some cases GLB loaders could omit specific buffers that aren't necessary (due to use of fallbacks for unsupported extensions, LOD-type extensions, etc.).

prideout · 2021-03-17T21:23:35Z

As an aside, I've seen data URIs used in the images array, but the spec only mentions them in the context of the buffers array. Maybe the spec should be more explicit about where they are allowed.

@zeux and I have contributed to the cgltf library, which does not support data URIs in images, and I don't know if it should.

zeux · 2021-03-17T21:27:24Z

@prideout I've definitely seen data URIs used in images; cgltf will preserve this data but doesn't decode it in any way, so an application that uses cgltf would need to process data URIs itself. gltfpack does this: it decodes the Base64-encoded data and stores it in a binary buffer (see https://github.com/zeux/meshoptimizer/blob/master/gltf/write.cpp#L723 and the calls to parseDataUri).
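For readers unfamiliar with the step being described, here is a minimal JavaScript sketch of decoding a data URI into a binary buffer. It is not gltfpack's parseDataUri (which is C++); `decodeDataUri` and its return shape are hypothetical, and non-Base64 bodies are simply treated as percent-encoded UTF-8 text.

```javascript
// Hypothetical helper: returns { mimeType, data } or null for non-data URIs.
function decodeDataUri(uri) {
  if (!uri.startsWith('data:')) return null; // external reference
  const comma = uri.indexOf(',');
  if (comma === -1) return null; // malformed: no payload separator

  // Everything between "data:" and the first comma is "mime;param;...;base64".
  const params = uri.slice(5, comma).split(';');
  const mimeType = params[0] || 'text/plain';
  const isBase64 = params[params.length - 1] === 'base64';

  const body = uri.slice(comma + 1);
  const data = isBase64
    ? Buffer.from(body, 'base64')
    : Buffer.from(decodeURIComponent(body), 'utf8'); // simplification
  return { mimeType, data };
}

// Four bytes (0, 1, 2, 3) embedded as Base64.
const decoded = decodeDataUri('data:application/octet-stream;base64,AAECAw==');
console.log(decoded.mimeType, Array.from(decoded.data));
```

An application would then store `data` in its own buffer or write it to an external file, which is the conversion away from embedded payloads discussed in this thread.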

lexaknyazev · 2021-03-17T21:28:30Z

@prideout
It's defined in the URIs section, so it applies to all URI usages. Using data: with images on the web is trivial, although I understand that it doesn't apply as easily to native apps.

ideiasfrescas · 2021-06-04T20:51:01Z

I would like to mention that users may want to do some server-side processing of the image and then create a whole new model based on a template one. Using PHP to load images and post changes to the file would be a case against dropping data URIs.

wallabyway · 2021-08-05T18:45:06Z

@donmccurdy
While on the topic of making breaking changes to .glb format...

Regarding - (3) .glb (binary embedded)

I would like to suggest adding a msgpack type to replace the JSON UTF-8 string type for the structured JSON chunk in the .glb file spec.

(happy to try other alternatives to msgpack serialization, but let's discuss that later)...

THE TOOLING PROBLEM:
There's a problem working with large glTF files between tools (and interfaces). They don't serialize/deserialize well.

For example:
I have a tooling pipeline that generates a .gltf + .bin file set. This is then run through gltfpack for optimizations (unfortunately, gltf-transform got overwhelmed).

The glTF content is massive: gigabytes in size and hours to generate (even with Node/V8 large-memory settings). For example, trees with leaves, just waiting to be de-duplicated and turned into glTF mesh-instance structures. See #1699

I'm using node.js, and here is the crux of the problem:

let buf = Buffer.from(JSON.stringify(gltf))

This line fills memory, and crashes because there are so many nodes, accessors, bufferviews, etc.

Ironically, this line is needed (serializing the glTF structure into a JSON string/Uint8Array) both for saving to a .gltf file and for generating the JSON chunk of a .glb stream.

@zeux
I tried to integrate gltfpack node.js interface directly, but the interface still requires a .glb stream, which means I still need to serialize with this... JSON.stringify(gltf).

THE WORKAROUND(S)
I can generate the large glTF file, by separately serializing sub-parts of the glTF (nodes, accessors, bufferviews, etc) or I can use a different serializer, like msgpack.
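The first workaround (serializing sub-parts separately) might look roughly like the sketch below. It is an assumption about the approach, not the actual pipeline code: `serializeInChunks` is a hypothetical name, the input is assumed to be plain JSON-compatible data, and in a real pipeline each piece would be written to a file stream rather than joined in memory.

```javascript
// Yields the JSON text of `gltf` in small pieces, so no single string
// has to hold the whole document at once.
function* serializeInChunks(gltf) {
  yield '{';
  let firstKey = true;
  for (const [key, value] of Object.entries(gltf)) {
    if (value === undefined) continue; // JSON.stringify would drop it too
    yield (firstKey ? '' : ',') + JSON.stringify(key) + ':';
    firstKey = false;
    if (Array.isArray(value)) {
      // Large top-level arrays (nodes, accessors, bufferViews, ...) go out one element at a time.
      yield '[';
      for (let i = 0; i < value.length; i++) {
        yield (i ? ',' : '') + JSON.stringify(value[i] ?? null);
      }
      yield ']';
    } else {
      yield JSON.stringify(value);
    }
  }
  yield '}';
}

// Tiny made-up document; joining the pieces here is only to check the result.
const doc = {
  asset: { version: '2.0' },
  nodes: [{ name: 'a' }, { name: 'b', mesh: 0 }],
  meshes: [],
};
const pieces = [...serializeInChunks(doc)];
console.log(pieces.join('') === JSON.stringify(doc));
```

Note that this only avoids building one giant string; the object graph itself still has to fit in memory.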

This worked! (i.e., serializing the JSON structure into a msgpack file)
The msgpack files were 4x smaller, serialization took minutes (instead of hours), and the output gzip-compressed 2x better. It was easy to find integration tools.

(Image omitted; courtesy of @petrbroz.)

It got me thinking...

For the .glb format only: if we changed the structured JSON chunk from a JSON string to msgpack, would this be better for the glTF tooling ecosystem?

https://github.com/KhronosGroup/glTF/tree/master/specification/2.0#chunks

(on second thought, this is insanity. I should probably just use msgpack for the whole thing... including the .bin files, since msgpack encodes binary arrays in a compatible way to gltf .bin format... NVM)

donmccurdy · 2021-08-06T15:01:34Z

I think @lexaknyazev has suggested that something like this might be appropriate for GLB v3 as well (#1560 (comment)). There is a real cost to it: currently it takes about 15 additional LOC in my implementations to add GLB support to an existing glTF parser, and switching to an entirely different serialization would almost certainly increase that. But glTF has grown far beyond the web now, and JSON is not as much "at home" on other platforms, so perhaps it is something we should just do. Whether it would be msgpack or FlatBuffers or something else, I don't know; that is tricky because file formats can outlive individual libraries. Perhaps this is worth starting a separate issue to discuss.
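To show why GLB support is cheap to add today, here is a rough sketch of reading the GLB 2.0 container: a 12-byte header followed by length-prefixed chunks, with the JSON chunk handed to an ordinary JSON parser. This is not the three.js implementation; `parseGlb`, `buildGlb`, and `pad` are hypothetical names, and the builder exists only to produce a test input.

```javascript
const GLB_MAGIC = 0x46546c67; // ASCII "glTF", little-endian
const CHUNK_JSON = 0x4e4f534a; // ASCII "JSON"
const CHUNK_BIN = 0x004e4942; // ASCII "BIN\0"

// Split a GLB byte array into its parsed JSON and its binary chunk.
function parseGlb(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  if (view.getUint32(0, true) !== GLB_MAGIC) throw new Error('Not a GLB file');
  if (view.getUint32(4, true) !== 2) throw new Error('Unsupported GLB version');
  const totalLength = view.getUint32(8, true);

  let json = null;
  let bin = null;
  let offset = 12;
  while (offset < totalLength) {
    const chunkLength = view.getUint32(offset, true);
    const chunkType = view.getUint32(offset + 4, true);
    const data = bytes.subarray(offset + 8, offset + 8 + chunkLength);
    if (chunkType === CHUNK_JSON) json = JSON.parse(new TextDecoder().decode(data));
    else if (chunkType === CHUNK_BIN) bin = data;
    offset += 8 + chunkLength; // unknown chunk types are skipped
  }
  return { json, bin };
}

// Pad to a 4-byte boundary, as the container requires.
function pad(bytes, fill) {
  const padded = new Uint8Array(Math.ceil(bytes.length / 4) * 4).fill(fill);
  padded.set(bytes);
  return padded;
}

// Demo-only writer: one JSON chunk (space-padded) and one BIN chunk (zero-padded).
function buildGlb(json, bin) {
  const jsonChunk = pad(new TextEncoder().encode(JSON.stringify(json)), 0x20);
  const binChunk = pad(bin, 0x00);
  const total = 12 + 8 + jsonChunk.length + 8 + binChunk.length;
  const out = new Uint8Array(total);
  const view = new DataView(out.buffer);
  view.setUint32(0, GLB_MAGIC, true);
  view.setUint32(4, 2, true);
  view.setUint32(8, total, true);
  view.setUint32(12, jsonChunk.length, true);
  view.setUint32(16, CHUNK_JSON, true);
  out.set(jsonChunk, 20);
  const binStart = 20 + jsonChunk.length;
  view.setUint32(binStart, binChunk.length, true);
  view.setUint32(binStart + 4, CHUNK_BIN, true);
  out.set(binChunk, binStart + 8);
  return out;
}

const glb = buildGlb({ asset: { version: '2.0' } }, new Uint8Array([1, 2, 3, 4]));
const parsed = parseGlb(glb);
console.log(parsed.json, parsed.bin);
```

Replacing the JSON chunk with another serialization would mean swapping the `JSON.parse` call for a dependency on that format's decoder, which is the added cost weighed above.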

lexaknyazev · 2021-09-27T06:45:42Z

The updated spec is more explicit about the size increase introduced by Data URIs.

donmccurdy · 2021-10-02T21:08:58Z

Unfortunately these files are already common; I don't think language in the spec is going to change that. We have the maintainers of the Blender addon saying "never ever use glTF Embedded", and I'm trying to communicate that same thing to three.js users as well, but it's difficult because it looks like a perfectly normal glTF file.

Ideally I'd like to see authoring tools stop using "glTF Embedded" entirely – it's a trap for users. I'm planning to start showing warnings in three.js any time glTF Embedded files exceed 100kb. I'd also like to propose we remove the option from Blender.

Related: #1117

donmccurdy added the "breaking change" label (changes under consideration for a future glTF spec version, which would require breaking changes) Dec 1, 2020

donmccurdy added this to the glTF Next milestone Dec 1, 2020

donmccurdy mentioned this issue Dec 7, 2020

Consider making -b or -s the default? CesiumGS/gltf-pipeline#565

Open

lexaknyazev mentioned this issue Sep 27, 2021

Add a warning for large Data URIs KhronosGroup/glTF-Validator#169

Open

lexaknyazev closed this as completed Sep 27, 2021

donmccurdy mentioned this issue Oct 2, 2021

GLTFLoader: Warn about large Data URIs mrdoob/three.js#22630

Closed

zeux mentioned this issue Aug 13, 2022

Keep -te option for .gltf. zeux/meshoptimizer#459

Closed

echadwick-artist added a commit to KhronosGroup/glTF-Sample-Assets that referenced this issue May 23, 2023

Update README.md

6bb39ec

Clarifications for glTF Embedded, for more details see KhronosGroup/glTF#1915
