Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[wasm-ep] End-to-end workflow for profiled AOT using EventPIpe #72481

Closed
Tracked by #69268
lambdageek opened this issue Jul 19, 2022 · 3 comments
Closed
Tracked by #69268

[wasm-ep] End-to-end workflow for profiled AOT using EventPIpe #72481

lambdageek opened this issue Jul 19, 2022 · 3 comments
Assignees
Labels
arch-wasm WebAssembly architecture area-Tracing-mono enhancement Product code improvement that does NOT require public API changes/additions
Milestone

Comments

@lambdageek
Copy link
Member

lambdageek commented Jul 19, 2022

Add a way to create profiled AOT builds of WebAssembly apps using EventPipe.

Do use a runtime built with EventPipe enabled, and to configure an app to collect a trace using dotnet-dsrouter and dotnet-trace that can be used to create a .mibc using dotnet-pgo.

@lambdageek lambdageek mentioned this issue Jul 19, 2022
22 tasks
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Jul 19, 2022
@lambdageek lambdageek added area-Tracing-mono arch-wasm WebAssembly architecture labels Jul 19, 2022
@ghost
Copy link

ghost commented Jul 19, 2022

Tagging subscribers to 'arch-wasm': @lewing
See info in area-owners.md if you want to be subscribed.

Issue Details

Add a way to create profiled AOT builds of WebAssembly apps using EventPipe.

Do use a runtime built with EventPipe enabled, and to configure an app to collect a trace using dotnet-dsrouter and dotnet-trace that can be used to create a .mibc using dotnet-pgo.

Author: lambdageek
Assignees: -
Labels:

arch-wasm, area-Tracing-coreclr, untriaged, area-Tracing-mono

Milestone: -

@lambdageek lambdageek removed area-Tracing-coreclr untriaged New issue has not been triaged by the area owner labels Jul 19, 2022
@lambdageek lambdageek added this to the 8.0.0 milestone Jul 19, 2022
lambdageek added a commit that referenced this issue Jul 25, 2022
…mbly (#72482)

Add a diagnostic server for WebAssembly.  Enable by building the runtime with `/p:WasmEnablePerfTracing=true` or `/p:WasmEnableThreads=true`.

To configure a project to start the diagnostic server, add this to the .csproj:
```xml
      <WasmExtraConfig Include="diagnostic_options" Value='
{
  "server": { "suspend": false, "connect_url": "ws://localhost:8088/diagnostics" }
}' />
```

The `connect_url` should be a WebSocket url serviced by `dotnet-dsrouter server-websocket` **from this branch** https://github.com/lambdageek/diagnostics/tree/wasm-server

Note that setting `"suspend": true` will hang the browser tab until a diagnostic tool such as `dotnet-trace collect` connects to the dsrouter.

---

Implement creating VFS file based sessions at runtime startup.  Add the following to a .csproj:

```xml
    <WasmExtraConfig Include="diagnostic_options" Value='
{
  "sessions": [ { "collectRundownEvents": "true", "providers": "WasmHello::5:EventCounterIntervalSec=1" } ]
}' />
```

That will create and start one or more EventPipe sessions that will store their results into the VFS.

The startup session can be retrieved via `MONO.diagnostics.getStartupSessions()`.  Each session `s` should be stopped via `s.stop()` and the data can then be extraced in a `Blob` using `s.getTraceBlob()`.

This is orthogonal to the diagnostic server support.  You don't need `dotnet-dsrouter` running on the host.  But you do need access to JavaScript on the main thread.

---

Notes/Issues:

* Tree shaking: I verified that if threads are not available, all the TypeScript diagnostics code is removed.
* Right now the server is not very robust to `dotnet-dsrouter` stopping, or starting after the runtime starts.  The ideal order is to start `dotnet-dsrouter` first, and then open the browser
* Unrelated housekeeping fixes:
   * Tell `wasm.proj` about all the subdirectories with .ts files - makes incremental builds notice changes in subdirectories.
   * Add a rollup `dependencies` property to quiet a warning about `node/buffer`
   * There's a mock implementation of a "websocket" that was used for protocol testing. I verified that tree-shaking removes this in thread-enabled Release builds.
   * Bump `PTHREAD_POOL_SIZE` to `4` and set `PTHREAD_POOL_SIZE_STRICT=2` (returns `EAGAIN` from `pthread_create` if the pool needs to grow).  The previous setting `PTHREAD_POOL_SIZE_STRING=1` (warn and try to grow the pool) seemed to lead to hangs.  Usually that means the main thread is creating a thread and immediately calling `pthread_join` without returning to JS. We should investigate separately.
   * The only implemented diagnostic server commands are `CollectTracing2`, `StopCollecting` and `ResumeRuntime`. None of the `Dump`, `Process` and `Profiler` commands are implemented and the server will crash if it receives them.  It should be relatively straightforward to return a "command unsupported" reply (which would allow clients to gracefully disconnect), but it's not done yet.
* In some error states the runtime kills the browser tab with the following in a terminal window (if Chrome is started from a terminal: `FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory`). This probably means we're hitting a loop somewhere that rapidly exhausts JIT memory, but it's difficult to investigate when the JS console dies, too (happens with chrome stable v103 and chrome beta v104).

Fixes #69674, contributes to #72481


---

* [wasm] Enable the tracing component if threading is supported

* add a way to specify EP sessions in the MonoConfig

   Currently not wired up to the runtime

* Add a mechanism to copy startup configs into the runtime and session IDs out

* checkpoint. Do more from JS

The issue is that once we're setting up streaming sessions, we will need to send back a DS IPC reply with the session id before we start streaming.  So it's better to just call back to JS when we start up and setup all the EP sessions from JS so that when we return to C everything is all ready.

* checkpoint: starting a session at startup works

* checkpoint add a controller and a webworker for DS

* checkpoint: diagnostic server

* fix eslint

* [diagnostic_server] wasm-specific fn_table

   We won't be using the native C version

* [wasm-ep] disable DS connect ports in C, too

   we will implement DS in JS

* Start diagnostic server pthread

Clean up some of the old WIP code - we will probably not send configuration strings from the diagnostic server back to the main thread.

* checkpoint: try to start the server

   It doesn't work right now because the MessagePort is not created until the server thread attaches to Mono, which doesn't happen because it's started before Mono.

Also it doesn't yet send a resume event, so the main thread just blocks forever

* Add a mock WebSocket connection to simulate the remote end

   Start the diagnostic server and have it perform the open/advertise steps with the mock.

* wasm-mt: use a PThreadSelf struct instead of a raw MessagePort

* Move all the EP and diagnostic server modules to one directory

* Refactor; remove dead code; rationalize controller

the flow is now:

```
main -{creates pthread}->              server
  .                                      server creates event listener
  .  <-{sends diagnostic MessagePort}-   .
main creates event listener              .
  .  -{posts "start" message}->          .
  .                                      begins server loop
```

after the server loop is running, the main thread will get a "resume_startup" message once the diagnostic server receives the right command from the websocket.

next TODO: the runtime needs to send a "attach to runtime" message which will signal the server that it can attach to the runtime (in native) and start calling EP session creation functions.

* checkpoint: start adding queue from streaming thread to DS thread

We can't set up a shared MessagePort very easily (we need to bounce through the main thread but it probably won't be able to process our message until it's too late).

Also Atomics.waitAsync isn't available on many browsers (Chrome only).

So we use emscripten's dispatch mechanism to trigger an event in the diagnostic thread to wake up and service the streaming thread's queue. Right now the queue is dumb so we trigger on every write.  and also the write is synchronous.

But it's simple to think about and it's implementable.

* [wasm] Incremental build and rollup warnings cleanups

- Add 'node/buffer' as an extrenal dependency. This doesn't do anything except quiet a rollup warning about the import.
- Add all the .ts files, and the tsconfig files (except node_modules) to the rollup inputs, to make sure we re-run rollup when anything changes.

* WIP: work on wiring up DS protocol commands (mock); resume hack

- start adding commands so that we can strt some sessions from DS
- we can't avoid a busy loop in ds_server_wasm_pause_for_diagnostics_monitor.
  we can't make the main thread pause until we get a resume command
  until after we're able to start an EP session (DS client won't send
  a resume command until we send an EP session ID back).  If the DS
  pauses until it can attach to the runtime, and the runtime pauses
  until DS tells it to resume, the main thread pause has to be after
  we get EP and DS initialized.  But that means it can't be async.  So
  we'll just have to busy wait on a condition variable in native.

* WIP: set up a WasmIpcStream, create EP sessions from DS

Seems to create the session, but not seeing write events
yet. possibly due to not flushing?

* WIP: starting to stream works; needs PTHREAD_POOL_SIZE bump

Looks like we can send the initial nettrace header some events.

We're starting more threads, so we need a bigger thread pool.

Also PTHREAD_POOL_SIZE_STRICT=1 (the default - warn if worker pool needs to grow,
but still try to grow it) seems to deadlock the browser-eventpipe
sample.

Set PTHREAD_POOL_SIZE_STRICT=2 (don't try to allocate a worker, make
pthread_create fail with EAGAIN) instead so we get some kind of
exception instead in other circumstances.

Set the pool size to 4.

* cleanup browser-eventpipe sample

* call mono_wasm_event_pipe_early_startup_callback from event_pipe init

  instead of from the rundown_execution_checkpoint_2 function

* if diagnostics server isn't enabled, don't try to initialize it

* checkpoint: start parsing binary commands

* checkpoint: Can parse a CollectTracing2 command and attempt to create a
session!

* [wasm-ep] use the new PromiseController<T>

* get back to the server loop quicker by queueing the parsing in the microtask

* update mock for binary ADVR_V1 message

* sample: don't suspend, and use a mock url

* wasm_ipc_stream: wire up close command

   Use a sentinal "buf" value (-1) to signal that the writer closed the stream

* Send proper OK messages in replies to binary protocol commands

* (testing) turn off the file session for now

* remove em_asm(console.log); simplify wasm EP init

   Just call the EP JS callback directly from native

* remove debug output

* cleanup wasm ipc stream impl

* put diagnostics mocks behind a const flag

* don't build wasm-specific DS if threads are disabled

* refactor and cleanup

- Move the IPC parsing and serialization into separate files
- Try to have one responsibility per class
- update comments and docs

* help treeshaking

verified that all the DS and EP JS code is dropped if monoWasmThreads is false.

* update DS design notes

* use PromiseController in more places

* fix Windows build

* add MONO_WASM prefix to console logging outputs

* improve debug output for DS server

   keep track of open/advertise counts and print them when receiving replies

* bugfix: don't confuse buf_addr for the value stored in it

   the buf_addr is always the same for a given queue. the value in it is what we need to check to see if it's the sentinel value

* fix bug in queue_push_sync main thread detection

* merge fixup

* fix rollup warning when making the crypto worker

* add MONO_WASM: prefix to logging

* make diagnostic server mocking friendlier

   Allow each test project to specify its own mock script.

   Also provide TypeScript declarations for the mocking interfaces

   Also always use binary protocol commands - don't send json for mocks.

* disable mocking in the sample project by default

* fixup after merge

* review feedback

   - improve diagnostics mock README
   - note that mocking just uses ES6 modules, testing with CJS is not supported right now.
   - fix iteration over listeners when dispatching a one-shot event in the EventTargt polyfill
   - use U32 getter in EP session creation
@lambdageek lambdageek self-assigned this Jul 26, 2022
@pavelsavara
Copy link
Member

@lambdageek Could you please recommend documentation or code I could read to learn more ?
Context mono AOT + wasm

@lewing lewing modified the milestones: 8.0.0, 9.0.0 Jul 25, 2023
@tommcdon tommcdon added the enhancement Product code improvement that does NOT require public API changes/additions label Mar 12, 2024
@lewing
Copy link
Member

lewing commented Jun 11, 2024

This isn't currently on the roadmap so I'm going to close it. We will use a new issue for any future plans

@lewing lewing closed this as completed Jun 11, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Jul 12, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
arch-wasm WebAssembly architecture area-Tracing-mono enhancement Product code improvement that does NOT require public API changes/additions
Projects
Status: Done
Development

No branches or pull requests

4 participants