Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SDK-49 Add custom section for sourceMappingURL #124

Closed
wants to merge 3 commits into from

Conversation

paulyoung
Copy link
Contributor

This is based on WebAssembly/design#1051.

I doubt using the env the way I have is the best/correct way to do this, so would appreciate some feedback on that.

We may wish to add support for providing a base URL like emscripten does.

I'm running into what seems to be an unrelated issue (#123) since a version of this was working at an older commit.

@paulyoung paulyoung changed the title Add custom section for sourceMappingURL SDK-49 Add custom section for sourceMappingURL Jan 15, 2019
@rossberg
Copy link
Contributor

Hm, since this value should only be needed at the outermost part of module generation it doesn't seem necessary to store it in the environment.

It's a bit annoying that this param pollutes all the compile & pipeline API now. Maybe it's time to aggregate some of the compile parameters (mode, name, url) into a config record?

@nomeata
Copy link
Collaborator

nomeata commented Jan 15, 2019

since this value should only be needed at the outermost part of module generation it doesn't seem necessary to store it in the environment.

We create modules for local actors, so that part might be needed. But is it even correct to use the same source map for the surrounding and the local modules? Presumably not, or at least not implemented like now. (The sourcemap spec seems to include a file field to give the name this map applies to … but there does not seem to be a provision to support multiple files.)

So maybe asc’s --map flag should write a directory of source maps, the url parameter should just specify the basename, and asc comes up with unique names for the WebAssembly modules (main and embedded), and uses these names in the source maps and the embedded URL.

Or we could ditch embedding URLs altogether, and asc writes the source maps to hash-of-module.map, i.e. in a content-addressable way, and leave it to the SDK to make the connection (by asking the hypervisor for the hash of the module, and then finding the file.)

Of course, if we ditch local actors soon anyways, then this is all moot.

Another, orthogonal thought: Is compile time really the right time to specify the URL of the source map? Isn’t it more likely deploy time, i.e. when you upload a working program “somewhere” that you know where to publish the source map? If so, then maybe the URL should not be embedded by the compiler, but rather a separate tool that acts on the .wasm binary. I could imagine a good user experience is the following: The command (dvm? dfn-cli?) that deploys a .wasm module looks for a .map file in the same directory. If it is there (or specified with a command line flag), it uploads it to a well known location (maybe simply on the node), and embeds the URL to that in the module.

@rossberg
Copy link
Contributor

Agreed that compile time may be the wrong point.

@paulyoung
Copy link
Contributor Author

I think the concerns above are all valid. However, I still think the best developer experience would be to specify that a source map should be produced, and then by default make the resulting files ready to be consumed by a browser.

I think this means:

  • By default, setting the sourceMappingURL to the name of the source map file produced without not requiring extra steps or additional tools (people could still do this if the default is not sufficient)
  • Having a standalone solution that works for ActorScript (nothing DFINITY-related required)

Does anyone feel strongly against doing the above? If not, are there ways I can improve on what's here?

@paulyoung
Copy link
Contributor Author

Hm, since this value should only be needed at the outermost part of module generation it doesn't seem necessary to store it in the environment.

It's a bit annoying that this param pollutes all the compile & pipeline API now. Maybe it's time to aggregate some of the compile parameters (mode, name, url) into a config record?

I'll look at addressing both of these.

@paulyoung
Copy link
Contributor Author

is it even correct to use the same source map for the surrounding and the local modules? Presumably not, or at least not implemented like now. (The sourcemap spec seems to include a file field to give the name this map applies to … but there does not seem to be a provision to support multiple files.)

As far as I know there's a .map file for every output file, and since we're producing one file, e.g. foo.wasm there's one source map file, i.e. foo.wasm.map

In my experience it's common for a single file to containing many modules, classes, etc to be loaded by a client/browser with it's corresponding source map containing data on multiple source files.

Is that what you're getting at, or was it something else?

@paulyoung
Copy link
Contributor Author

Hm, since this value should only be needed at the outermost part of module generation it doesn't seem necessary to store it in the environment.

I remember why I did this; it seemed cleaner than polluting ActorE with source_mapping_url (which appears to be necessary in order to be passed to actor_lit and then to conclude_module)

@nomeata
Copy link
Collaborator

nomeata commented Jan 23, 2019

As far as I know there's a .map file for every output file, and since we're producing one file, e.g. foo.wasm there's one source map file, i.e. foo.wasm.map

But foo.wasm contains (at the moment) the binaries for the local modules, created from ActorE, and embedded as strings. These need their own source maps, and these source maps need to be stored somewhere, and clearly have their own URL.

Here is the line where we throw away the source map from the embedded module at the moment: https://github.com/dfinity-lab/actorscript/blob/822d85a3bec1652119a4bb807022abe06616a823/src/compile.ml#L3860

There are voices for an alternative plans where embedded modules are written to separate files, and somehow bundled in by way of a manifest, and uploaded to the chain along side of each other. If that happens, then it is more obvious where to write the embedded source maps to.

By default, setting the sourceMappingURL to the name of the source map …

Ah, so it wouldn’t be a full URL, just the basename?

@paulyoung
Copy link
Contributor Author

By default, setting the sourceMappingURL to the name of the source map …

Ah, so it wouldn’t be a full URL, just the basename?

It would be just the filename, e.g. foo.wasm.map (as I did in this PR)

WebAssembly/design#1051 says under Linking generated code to source maps:

When the generated code is JavaScript, it includes a specially-formatted line at the end, which is the URL of the associated source map. For WebAssembly, a custom section named "sourceMappingURL" contains the URL. The URL is represented and encoded as defined in the WHATWG URL spec. If the URL is not absolute, then:

  • If the wasm module is associated with an HTTP response (e.g. the compileStreaming or instantiateStreaming APIs are used), then it is resolved relative to the URL of the response.
  • Otherwise it is relative to the page's origin.

For wasm modules with an associated HTTP response (e.g. those using the compileStreaming or instantiateStreaming APIs) the source map URL may also be specified using the SourceMap: HTTP header as with JavaScript source maps.

So I think for now, just the filename is a reasonable default

@paulyoung
Copy link
Contributor Author

paulyoung commented Jan 23, 2019

But foo.wasm contains (at the moment) the binaries for the local modules, created from ActorE, and embedded as strings. These need their own source maps, and these source maps need to be stored somewhere, and clearly have their own URL.

Could you help me understand what these local/embedded modules are and how they are special (regarding source maps)?

I'm not sure I agree with "clearly have their own URL"; I think if these modules exist in foo.wasm then the source maps could exist in foo.wasm.map. The purpose of the URL is only so the browser knows how to map Wasm to its source code and not any other semantic meaning that I know of.


There are voices for an alternative plans where embedded modules are written to separate files, and somehow bundled in by way of a manifest, and uploaded to the chain along side of each other. If that happens, then it is more obvious where to write the embedded source maps to.

I'm trying to work with what browsers have implemented today, which appears to be what is described in WebAssembly/design#1051.

@nomeata
Copy link
Collaborator

nomeata commented Jan 23, 2019

Let me use a Javascript analogy. If you have foo.js with this code:

document.write('<script src="data:text/javascript;base64,R0lGODdh…"/>')

If you run this code, the browser loads the embedded JavaScript file as a separate file, but it is no longer connected to foo.js in any meaningful way. Why should the browser look at foo.js.map for the embedded JavaScript?

@paulyoung
Copy link
Contributor Author

I see. Thanks for the explanation. I think if the base64-encoded JS includes a sourceMappingURL that tells the browser that the source map is in foo.js.map, then it would work as expected though, right?

@paulyoung
Copy link
Contributor Author

(Provided that the source map for the base64-encoded JS in in foo.js.map)

@nomeata
Copy link
Collaborator

nomeata commented Jan 23, 2019

Also, I fail to see how it is useful to just write actor.as.map into the sourceMappingURL section. There is no location here. And the original filename that the developer used has no useful meaning any more once the actor is deployed to the chain. And there might be multiple actors that have the same filename (different versions) running at the same time. This isn’t going to fly.

Here is an alternative proposals:

  • asc --map creates <hash>.as.map files, where <hash> is a hash of the WebAssembly binary, excluding the sourceMappingURL custom section. It creates these files for all produced modules – the main actor, as well as for local actors.
  • After hashing, it appends a sourceMappingURL section, pointing to <hash>.as.map

This way, the SDK just needs to collect all these .map files in one directory, and tell the development console to find maps in that directory (I hope this is possible). Even when there are multiple versions of one actor running on the chain, the browser will pick the right source map.

In fact, we should maybe even include the source there (Source maps can optionally include the original source.) This way, there is never confusion about picking the wrong source file or source map for a given on-chain actor.

@nomeata
Copy link
Collaborator

nomeata commented Jan 23, 2019

I see. Thanks for the explanation. I think if the base64-encoded JS includes a sourceMappingURL that tells the browser that the source map is in foo.js.map, then it would work as expected though, right?

But a source map is always only for one (produced) file. What should the entry “line 1 position 1” in that map refer to – the outer JS or the inner JS?
(This analogy uses JS files as generated files, I hope that is not causing confusion.)

@paulyoung
Copy link
Contributor Author

Also, I fail to see how it is useful to just write actor.as.map into the sourceMappingURL section. There is no location here. And the original filename that the developer used has no useful meaning any more once the actor is deployed to the chain. And there might be multiple actors that have the same name (different versions) running at the same time. This isn’t going to fly.

I've definitely been thinking about this too much in terms of browsers and not our network. Thanks for pointing this out.

@paulyoung
Copy link
Contributor Author

But a source map is always only for one (produced) file. What should the entry “line 1 position 1” in that map refer to – the outer JS or the inner JS?

I thought a single source map file could represent multiple source files but perhaps that's not the case.

@paulyoung
Copy link
Contributor Author

This way, the SDK just needs to collect all these .map files in one directory, and tell the development console to find maps in that directory (I hope this is possible).

As far as I know, the only way to tell a browser where to find a source map is through the mechanisms I shared above, described in WebAssembly/design#1051 under Linking generated code to source maps:

When the generated code is JavaScript, it includes a specially-formatted line at the end, which is the URL of the associated source map. For WebAssembly, a custom section named "sourceMappingURL" contains the URL. The URL is represented and encoded as defined in the WHATWG URL spec. If the URL is not absolute, then:

  • If the wasm module is associated with an HTTP response (e.g. the compileStreaming or instantiateStreaming APIs are used), then it is resolved relative to the URL of the response.
  • Otherwise it is relative to the page's origin.

For wasm modules with an associated HTTP response (e.g. those using the compileStreaming or instantiateStreaming APIs) the source map URL may also be specified using the SourceMap: HTTP header as with JavaScript source maps.

Which brings up the question of how these files are going to be served when the debugger is attached to V8, and what control we have over that (I'm assuming little)

I'm still making progress on doing that and have been documenting my findings along the way here: https://dfinity.atlassian.net/browse/SDK-22

@paulyoung
Copy link
Contributor Author

I think I should switch focus back to working on that and see what it reveals. Is there any progress to be made here in the meantime based on the alternative proposals?

@nomeata
Copy link
Collaborator

nomeata commented Jan 23, 2019

I thought a single source map file could represent multiple source files but perhaps that's not the case.

multiple sources yes (in our case: ActorScript), but not multiple outputs (in our case: Wasm modules).

I think I should switch focus back to working on that and see what it reveals. Is there any progress to be made here in the meantime based on the alternative proposals?

I think we should pause this one until we actually have the debugger attached to V8 in the node.

I suspect that by the time that runs we have a better understanding where the debugger would look for source maps in that case. Is there HTTP involved? Maybe there is a place to inject SourceMap HTTP headers in the request where the browser downloads the Wasm module, and maybe that then works.

Or maybe we can create a small chrome plugin that handles a custom URL scheme (sourceMap://<hash>.map), then we could put these absolute URLs into the extra section.

@paulyoung
Copy link
Contributor Author

I think we should pause this one until we actually have the debugger attached to V8 in the node.

That's what I mean by "I think I should switch focus back to working on that and see what it reveals."

Sorry it was unclear. Glad we are on the same page though :)

@ggreif
Copy link
Contributor

ggreif commented Feb 12, 2019

We now have a test app attachable by Chrome on the paulyoung/sdk-22-v8-inspector branch. It is named inspector-shell for now. We are stroking it in a somewhat slightly wrong way, so sources don"t show up yet.

@nomeata
Copy link
Collaborator

nomeata commented Oct 15, 2019

Closing this, it’s been months since anybody talked about source maps…

@nomeata nomeata closed this Oct 15, 2019
@nomeata nomeata deleted the paulyoung/source-mapping-url branch October 15, 2019 14:01
dfinity-bot added a commit that referenced this pull request Dec 12, 2022
## Changelog for ic-hs:
Branch: master
Commits: [dfinity/ic-hs@3cc51be5...9a7bcb2f](dfinity/ic-hs@3cc51be...9a7bcb2)

* [`9a7bcb2f`](dfinity/ic-hs@9a7bcb2) remember last trap also if on_cleanup succeeds ([dfinity/ic-hs⁠#124](https://github.com/dfinity/ic-hs/issues/124))
dfinity-bot added a commit that referenced this pull request Dec 13, 2022
## Changelog for ic-hs:
Branch: master
Commits: [dfinity/ic-hs@3cc51be5...8384593e](dfinity/ic-hs@3cc51be...8384593)

* [`9a7bcb2f`](dfinity/ic-hs@9a7bcb2) remember last trap also if on_cleanup succeeds ([dfinity/ic-hs⁠#124](https://github.com/dfinity/ic-hs/issues/124))
* [`8384593e`](dfinity/ic-hs@8384593) scale cycle cost of http_request calls based on subnet size ([dfinity/ic-hs⁠#123](https://github.com/dfinity/ic-hs/issues/123))
mergify bot pushed a commit that referenced this pull request Dec 14, 2022
## Changelog for ic-hs:
Branch: master
Commits: [dfinity/ic-hs@3cc51be5...8384593e](dfinity/ic-hs@3cc51be...8384593)

* [`9a7bcb2f`](dfinity/ic-hs@9a7bcb2) remember last trap also if on_cleanup succeeds ([dfinity/ic-hs⁠#124](https://github.com/dfinity/ic-hs/issues/124))
* [`8384593e`](dfinity/ic-hs@8384593) scale cycle cost of http_request calls based on subnet size ([dfinity/ic-hs⁠#123](https://github.com/dfinity/ic-hs/issues/123))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants