Meshcat in C++ #13038

Closed
mntan3 opened this issue Apr 10, 2020 · 12 comments · Fixed by #15670

Comments

@mntan3
Contributor

mntan3 commented Apr 10, 2020

Issue Description

When visualizing a model with many links, I saw that meshcat slowed the simulator down significantly, while Drake Visualizer did not. If meshcat is in fact just slower than Drake Visualizer, it would be nice if this were at least documented.

Example test script and model here to replicate:

https://gist.github.com/mntan3/be02a1c410a0830f2ddb656aaf6403e2
After running the script for 10 seconds, it should print out the simulator rate. I was observing a rate of roughly 0.8 for Drake Visualizer and 0.6 for meshcat on my computer.

Initial discussions from slack:

https://drakedevelopers.slack.com/archives/C43KX47A9/p1586538504017800
sean.curtis 2 hours ago
I think you may just be victim of python loops vs C++ loops.

mntan 2 hours ago
Just to clarify, so you're saying that meshcat is written in python and drake_vis is written in c++, so that's why meshcat is going to be slower?

sean.curtis 2 hours ago
Essentially -- there may be other reasons, but that will be one wall you won't be able to get around. And the key, particularly, is the work that has to be done in a Drake System to translate Drake state to be consumed by the visualizer.
You might try collecting timing on the publish method of the meshcat visualizer -- easy enough to do in python. I bet most of the time lost is spent right there.

eric.cousineau 1 hour ago
I think this may be good to track in an issue. @mntan Do you feel comfortable porting this to a Drake issue?

eric.cousineau 1 hour ago
My guess is it might be slow due to mesh conversion for sending it over the wire?
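
The timing suggestion from the Slack thread could be tried with something like the sketch below. It is only an illustration: `FakeVisualizer` and its `publish` method are hypothetical stand-ins, not the Drake or meshcat-python API, and the same wrapper would need to be pointed at whatever method actually does the publishing.

```python
import time


def timed(method, totals):
    """Wrap `method` so each call adds its wall-clock duration to `totals`."""
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return method(*args, **kwargs)
        finally:
            totals["calls"] += 1
            totals["seconds"] += time.perf_counter() - start
    return wrapper


class FakeVisualizer:
    """Hypothetical stand-in for a Python visualizer with a publish step."""

    def publish(self, num_links):
        # Simulate per-link Python work, e.g. building one message per link.
        return [("link_%d" % i, i * 0.5) for i in range(num_links)]


totals = {"calls": 0, "seconds": 0.0}
vis = FakeVisualizer()
# Patch the method on this one instance so every call is measured.
vis.publish = timed(vis.publish, totals)

for _ in range(100):
    vis.publish(num_links=200)

print("publish calls: %d, total %.4f s" % (totals["calls"], totals["seconds"]))
```

Comparing the accumulated publish time against the total wall-clock time of the run would show whether the publish step accounts for the difference in simulator rate.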

@sherm1
Member

sherm1 commented Apr 11, 2020

Assigning to @mntan3 for further investigation.

@RussTedrake
Contributor

@manuelli said he found that the numpy -> msgpack conversion is extremely slow out of the box (in both directions), but that this package implements a much faster alternative.
https://pypi.org/project/msgpack-numpy/
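
As a rough, standard-library-only illustration of why array serialization can be slow, the sketch below compares packing a buffer of doubles one element at a time with copying the whole buffer as raw bytes. It uses `array` and `struct` as stand-ins and does not call numpy, msgpack, or msgpack-numpy; the function names are made up for this example, and the timings depend on the machine.

```python
import struct
import timeit
from array import array


def pack_elementwise(values):
    """Serialize each double separately, one Python-level call per element."""
    return b"".join(struct.pack("=d", v) for v in values)


def pack_bulk(buffer):
    """Serialize the whole contiguous buffer in a single call."""
    return buffer.tobytes()


def unpack_bulk(payload):
    """Recover the doubles from a raw-bytes payload."""
    out = array("d")
    out.frombytes(payload)
    return out


buffer = array("d", (0.001 * i for i in range(10000)))

# Both approaches produce the same bytes; only the cost differs.
assert pack_elementwise(buffer) == pack_bulk(buffer)
assert unpack_bulk(pack_bulk(buffer)) == buffer

slow = timeit.timeit(lambda: pack_elementwise(buffer), number=20)
fast = timeit.timeit(lambda: pack_bulk(buffer), number=20)
print("elementwise: %.4f s, bulk: %.4f s" % (slow, fast))
```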

fwiw, I think the right solution is probably to write our meshcat visualizer in c++.

cc @xuchenhan-tri

@RussTedrake
Contributor

Planned resolution is to move to c++.

@RussTedrake changed the title from "Meshcat slower than Drake Visualizer" to "Meshcat in C++" on Jul 6, 2021
@RussTedrake
Contributor

Related to RussTedrake/manipulation#145

@RussTedrake
Contributor

fwiw -- i'm planning to spike-test a c++ implementation over the next few days.

@RussTedrake
Contributor

I'll leave some notes here to document some of the relevant decisions.

Websockets not ZMQ. meshcat-python uses a separate ZMQ server to relay between python and the browser:
python Visualizer <=zmq=> zmqserver <=websockets=> browser
I intend to go directly from c++ meshcat to the browser:
c++ Visualizer <= websockets => browser
Having discussed with @rdeits, the zmq server design was put in place partially to support multiple geometry suppliers (the visualizers) and consumers (the browsers), but also just to parcel out the asyncio complexities away from the supplier in python. His Julia meshcat Visualizer just goes straight to the browser via websockets, and he's been recommending that to me when I upgrade. This is especially relevant because I am trying to add new support for gui elements in the meshcat browser sending information back to c++, and the zmqserver in the middle complicates that workflow significantly.
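
One consequence of talking to the browser directly is that the visualizer process itself has to complete the WebSocket opening handshake. A websocket library would normally handle this; the sketch below only shows the accept-key computation defined in RFC 6455, written in Python for brevity, with a function name chosen for this example.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept_key(client_key):
    """Compute the Sec-WebSocket-Accept value for a Sec-WebSocket-Key."""
    digest = hashlib.sha1((client_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample key from the RFC 6455 handshake example.
print(websocket_accept_key("dGhlIHNhbXBsZSBub25jZQ=="))
```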

C++ websocket libraries. I've now explored a handful of websocket libraries that we could potentially use in drake c++. This list was helpful. Taking a number of factors into account, such as licensing and light dependencies, I ended up looking most closely at:

@RussTedrake
Contributor

RussTedrake commented Aug 14, 2021

Basic C++ design is currently:

  • Meshcat is a class that plays the role of meshcat.Visualizer in python. It will launch the websocket listener thread and accept set_object, set_transform, etc, calls in the main thread. This will in many ways parallel DrakeLcm. I've put this in drake::geometry, because it will support geometry shapes and depend on geometry methods to load meshes, etc.
  • MeshcatVisualizer will be a LeafSystem that is analogous to DrakeVisualizer (and replace the current drake MeshcatVisualizer implemented in python). It will accept a Meshcat object in the constructor, or offer to own one itself. I will put this in drake::geometry next to DrakeVisualizer; that seems like the right place (it does depend on geometry).
  • I will also have to port the meshcat.geometry objects.
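
The relationship between the two classes described above might look roughly like the following structural sketch. It is written in Python purely for illustration and is not the planned C++ interface: the method bodies, the message dictionaries, the `sent` list (standing in for the websocket), and the `owns_meshcat` flag are all assumptions made for this example.

```python
import queue
import threading


class Meshcat:
    """Accepts calls on the main thread; a worker thread does the sending."""

    def __init__(self):
        self._outbox = queue.Queue()
        self.sent = []  # stands in for frames written to the websocket
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        # Drain queued messages until the None sentinel arrives.
        while True:
            message = self._outbox.get()
            if message is None:
                break
            self.sent.append(message)

    def set_object(self, path, shape):
        self._outbox.put({"type": "set_object", "path": path, "object": shape})

    def set_transform(self, path, matrix):
        self._outbox.put({"type": "set_transform", "path": path, "matrix": matrix})

    def close(self):
        self._outbox.put(None)
        self._thread.join()


class MeshcatVisualizer:
    """Uses a Meshcat passed to the constructor, or creates and owns one."""

    def __init__(self, meshcat=None):
        self.owns_meshcat = meshcat is None
        self.meshcat = Meshcat() if meshcat is None else meshcat

    def publish(self, poses):
        # One set_transform call per path.
        for path, matrix in poses.items():
            self.meshcat.set_transform(path, matrix)


meshcat = Meshcat()
visualizer = MeshcatVisualizer(meshcat)
meshcat.set_object("/drake/box", {"type": "box"})
visualizer.publish({"/drake/box": [1.0] * 16})
meshcat.close()
print([message["type"] for message in meshcat.sent])
```

Passing the Meshcat instance in lets several systems share one server, while the owning case keeps the simple single-visualizer setup to one constructor call.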

My current PR strategy is:

  1. Meshcat proof of life. Starts the server, demonstrates that clients can connect, and just sends one type of message to show that data can flow. Reviewers can focus on the build system and websocket server details.
  2. Bring in testing framework. Requires new build dependencies, which I want to separate from the original PR.
  3. Meshcat full api (set_transform, delete, etc). Still with only a modicum of geometry supported.
  4. Basic MeshcatVisualizer (c++ only, no bindings) implementation. Importantly, this version will have optional output ports for ui feedback.
  5. Add python bindings

Then we can add more geometry / bells and whistles incrementally.

RussTedrake added a commit to RussTedrake/drake that referenced this issue Aug 17, 2021
This is the first of a series of PRs that will provide Meshcat as a visualizer in C++.  The design and PR strategy is documented in RobotLocomotion#13038.

This is the first PR:
Meshcat proof of life. Starts the server, demonstrates that clients can connect, and just sends one type of message to show that data can flow. Reviewers can focus on the build system and websocket server details.

The dependencies added here are all quite lightweight, and properly licensed.  And at the end of the PR train, we will likely want to deprecate meshcat-python, and eventually remove the pip dependencies that come along with it.
@RussTedrake
Contributor

RussTedrake commented Aug 18, 2021

For a more mature testing strategy, I'm currently trying:

  • Loading meshcat.Viewer() directly in node.js, so that I can provide a test utility that connects to my C++ websocket server and checks for certain conditions to be met (e.g. that set_property('/Background', 'visible', false) has the desired result). This seems ideal in terms of verifying correctness. It has some immediate challenges in terms of getting meshcat to run headless, which I'm slowly bashing through, and then adding nodejs + npm libraries into the drake test installation framework.

Alternatives could include:

  • connecting to the websocket (probably from python) and simply verifying that the message is getting through as expected.
  • testing via headless chrome / chromium (e.g. using puppeteer, which I've used before).

@RussTedrake
Contributor

RussTedrake commented Aug 18, 2021

Test strategy update: I've got the following working well with a minimal nodejs setup:

// Test utility for Meshcat that
// 1) Connects a (headless) meshcat Viewer object via websockets to `ws_url`,
// 2) Waits until the Viewer receives `num_messages_to_wait_for` messages 
//    (default: 0),
// 3) Evaluates the string `eval_string`.
// 4) Exits with return code 0 if the `eval_string` evaluates to `true`,
//    otherwise with return code 1.
//
// Run with `node meshcat_test.js ws_url eval_string [num_messages_to_wait_for]`
// e.g. 
//  node meshcat_test.js 'ws://localhost:7001' \
//    "viewer.scene_tree.find(['Background']).object.visible == true" 3
//
// Requires meshcat, and `npm install jsdom webgl-mock-threejs canvas`.

The full script is here: meshcat_test.js
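
A test harness could drive the utility through its command-line contract (arguments in, exit code out) along the lines of the sketch below. The Python helper names are hypothetical, and actually running the check assumes `node`, the `meshcat_test.js` script, and a live websocket server are all available.

```python
import subprocess


def meshcat_test_command(ws_url, eval_string, num_messages_to_wait_for=0,
                         script="meshcat_test.js"):
    """Build the argv for the node test utility described above."""
    return ["node", script, ws_url, eval_string, str(num_messages_to_wait_for)]


def viewer_condition_holds(ws_url, eval_string, num_messages_to_wait_for=0):
    """Run the utility; exit code 0 means `eval_string` evaluated to true."""
    command = meshcat_test_command(ws_url, eval_string, num_messages_to_wait_for)
    return subprocess.run(command).returncode == 0


command = meshcat_test_command(
    "ws://localhost:7001",
    "viewer.scene_tree.find(['Background']).object.visible == true",
    3,
)
print(command)
```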

@jwnimmer-tri
Collaborator

> ... adding nodejs + npm libraries into the drake test installation framework.

FYI My rough impression from very quick glances in the past was that node and npm were extremely difficult to make sufficiently hermetic for use in Drake. You might want to de-risk that before walking too far down this path. Maybe https://github.com/bazelbuild/rules_nodejs has already resolved this by now, but I don't think we know for sure yet.

> connecting to the websocket (probably from python) and simply verifying that the message is getting through as expected.

Why is this option not the best answer? We don't acceptance test drake-visualizer round trip, we assume that it has its own testing in place, and so within Drake we just check that the messages we are sending it are as desired. That same story seems like it should be plenty sufficient for meshcat as well? If we find that too many bugs are slipping through, we can always upgrade to a headless regression test in the future.
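
A message-level check of this kind might look like the sketch below. It is a toy: `RecordingTransport` and `send_set_property` are hypothetical names, the message layout is assumed for illustration, and JSON stands in for whatever wire encoding the real server uses.

```python
import json


class RecordingTransport:
    """Test double that records frames instead of sending them to a browser."""

    def __init__(self):
        self.frames = []

    def send(self, frame):
        self.frames.append(frame)


def send_set_property(transport, path, prop, value):
    """Encode a set_property message and hand it to the transport."""
    message = {"type": "set_property", "path": path,
               "property": prop, "value": value}
    transport.send(json.dumps(message, sort_keys=True).encode("utf-8"))


transport = RecordingTransport()
send_set_property(transport, "/Background", "visible", False)

# The test inspects the outgoing message only; no viewer is involved.
decoded = json.loads(transport.frames[-1].decode("utf-8"))
assert decoded == {"type": "set_property", "path": "/Background",
                   "property": "visible", "value": False}
print(decoded)
```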

RussTedrake added a commit to RussTedrake/drake that referenced this issue Aug 19, 2021
Adds test coverage, and therefore moves Meshcat out of dev.

Per discussion with jwnimmer-tri, the strategy here is to provide a reasonable coverage of the c++ code (CI runs most all of the code and verifies it doesn't segfault; avoiding coverage of the throw and join methods is fine).

I explored more elaborate testing mechanisms (documented in RobotLocomotion#13038) using nodejs.  This could add value downstream, but adding nodejs support to bazel might be a big upfront (and even maintenance) cost for relatively small gain.  As jwnimmer-tri points out, we don't provide that level of coverage for DrakeVisualizer.
RussTedrake added a commit to RussTedrake/drake that referenced this issue Aug 19, 2021
Adds test coverage, and therefore moves Meshcat out of dev.

Per discussion with jwnimmer-tri, the strategy here is to provide a reasonable coverage of the c++ code (CI runs most all of the code and verifies it doesn't segfault).

I explored more elaborate testing mechanisms (documented in RobotLocomotion#13038) using nodejs.  This could add value downstream, but adding nodejs support to bazel might be a big upfront (and even maintenance) cost for relatively small gain.  As jwnimmer-tri points out, we don't provide that level of coverage for DrakeVisualizer.
jwnimmer-tri pushed a commit that referenced this issue Aug 19, 2021
This is the first of a series of PRs that will provide Meshcat as a visualizer in C++.  The design and PR strategy is documented in #13038.

This is the first PR:
Meshcat proof of life. Starts the server, demonstrates that clients can connect, and just sends one type of message to show that data can flow. Reviewers can focus on the build system and websocket server details.

The dependencies added here are all quite lightweight, and properly licensed.  And at the end of the PR train, we will likely want to deprecate meshcat-python, and eventually remove the pip dependencies that come along with it.
RussTedrake added a commit to RussTedrake/drake that referenced this issue Aug 26, 2021
RussTedrake added a commit to RussTedrake/drake that referenced this issue Aug 27, 2021
RussTedrake added a commit to RussTedrake/drake that referenced this issue Aug 28, 2021
RussTedrake added a commit to RussTedrake/drake that referenced this issue Aug 29, 2021
RussTedrake added a commit that referenced this issue Aug 30, 2021
RussTedrake added a commit to RussTedrake/drake that referenced this issue Aug 30, 2021
Follow-up to RobotLocomotion#13038.
Also makes the meshcat_visualizer_test.py more robust (test_warnings_and_errors could fail if the default zmq_url was already consumed)
RussTedrake added a commit that referenced this issue Aug 31, 2021
Follow-up to #13038.
Also makes the meshcat_visualizer_test.py more robust (test_warnings_and_errors could fail if the default zmq_url was already consumed)