D435 intermittantly fails to appear on lsusb after boot and reboot #1615
Comments
@josephduchesne thanks for all the details. I have seen issue 1 occur consistently with a 5i7 NUC and not at all on a fairly new 7i7 NUC. After updating the BIOS I have not seen issue 1 occur since. Can you confirm that you have the latest BIOS? |
@tedtman I made note of the current BIOS and also downloaded the newest BIOS. I can confirm that this still occurs. I plugged 2x D435 using firmware 5.9.9.2 into the rear USB A ports. On the second boot, lsusb showed:
0/2 devices show up on Bus 002. On the next boot cycle, both appeared. I received the following errors that seem related in syslog:
Firmware in previous tests: BNKBL357.86A.0050.2017.0816.2002 |
I left the test running for a while. The computer reboots on a cron job (every 5 minutes), and also right after boot if no D435s show up.
This seems suspiciously regular. |
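For readers who want to reproduce this kind of check, here is a minimal sketch of the "count D435s in lsusb, reboot early if there are none" rule described above. It is not the script used in this thread. The USB ID `8086:0b07` is an assumption about what a D435 reports and should be checked against your own `lsusb` output; `count_devices`, `should_reboot_early` and the sample text are hypothetical names and data made up for illustration.

```python
import re

# Assumption: 8086:0b07 is the vendor:product ID a D435 reports in lsusb.
# Verify this against your own lsusb output before relying on it.
D435_USB_ID = "8086:0b07"

# Matches the leading part of a standard lsusb line.
LSUSB_LINE = re.compile(
    r"^Bus (?P<bus>\d{3}) Device (?P<dev>\d{3}): ID (?P<id>[0-9a-f]{4}:[0-9a-f]{4})"
)


def count_devices(lsusb_output: str, usb_id: str = D435_USB_ID) -> int:
    """Count lsusb lines whose vendor:product ID equals usb_id."""
    count = 0
    for line in lsusb_output.splitlines():
        match = LSUSB_LINE.match(line.strip())
        if match and match.group("id") == usb_id:
            count += 1
    return count


def should_reboot_early(lsusb_output: str) -> bool:
    """Mirror the rule described above: reboot right away if no D435 enumerated."""
    return count_devices(lsusb_output) == 0


# Illustrative sample, not taken from the logs in this thread.
SAMPLE = """\
Bus 002 Device 003: ID 8086:0b07 Intel Corp.
Bus 002 Device 002: ID 8086:0b07 Intel Corp.
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
"""

if __name__ == "__main__":
    print(count_devices(SAMPLE), should_reboot_early(SAMPLE))
```

In a real test the text would come from running `lsusb` (for example via `subprocess.run(["lsusb"], capture_output=True, text=True)`), and the count would be appended to a log before deciding whether to reboot.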
Leaving this running all weekend, it got less regular:
|
Sorry @josephduchesne but can you confirm my understanding of your test?
|
In this particular case it's the following loop:
I also tested an alternate configuration where every 5 minutes there was a 15 second power off cycle but the results looked similar. The two variables D435_COUNT and NO_ADDRESS are:
|
Are the cameras streaming when reboot is initiated? Is "power off cycle" a physical power off or shutdown? |
The cameras may be streaming when the reboot is initiated. Reboot is 'sudo reboot' in this case. The cameras are running under a systemd service, but I'm not sure how the ROS nodelet manager handles the signal to stop when the shutdown happens. It is possible it's not being shut down 'properly', but since our use case involves a machine with a physical key that could turn power off without warning, this is a case that we will have to handle as gracefully as possible. |
Thanks, your case is understood and I just want to help provide the specifics for debugging. If lsusb is successful, does your test know if the cameras are actually streaming? (The DSO you noted fixed 5.8.15 that succeeded with lsusb but failed to be seen by the driver on cold boot; validation for this was also performed on slightly different h/w, type-B USB boards) |
If lsusb is successful, I have a wrapper around the roslaunch which watches the ROS output topic, and resets the entire pipeline (roslaunch file and all nodes) if there has been no output for 10 seconds. This also logs failures. By calling reset on the cameras on initialisation (using this modification to the ROS nodelet), the camera will come up with a small number of attempts (0-3, typically 0). This process is also logged in my logfile. An example section of this logfile:
The sequence of events in plain English:
|
To clarify, the launch monitor process exists to work around DSO-8665 in conjunction with the ROS node modification that resets the camera on node start. Also, yes, the fact that the launch monitor did not reset a camera after 10 seconds indicates that the camera is streaming depth data. I have independently validated this using rostopic / rviz several times. |
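For anyone wanting to build a similar launch monitor, the "restart the pipeline if there has been no output for 10 seconds" logic can be sketched as below. This is not the wrapper used in this thread; `StallWatchdog` and its methods are hypothetical names. The clock is injectable so the timing logic can be exercised without ROS or a camera attached.

```python
import time
from typing import Callable


class StallWatchdog:
    """Flags a stream as stalled when no message arrives within `timeout` seconds."""

    def __init__(self, timeout: float = 10.0,
                 clock: Callable[[], float] = time.monotonic):
        self.timeout = timeout
        self.clock = clock
        self.last_seen = clock()  # treat launch time as the start of the window
        self.resets = 0           # number of restarts triggered so far

    def on_message(self) -> None:
        """Call from the topic subscriber callback whenever output arrives."""
        self.last_seen = self.clock()

    def stalled(self) -> bool:
        return self.clock() - self.last_seen > self.timeout

    def poll(self, restart: Callable[[], None]) -> bool:
        """Call periodically; restarts once per stall and returns True if it did."""
        if not self.stalled():
            return False
        restart()
        self.resets += 1
        self.last_seen = self.clock()  # give the relaunched pipeline a fresh window
        return True
```

In use, `on_message` would be wired to a subscriber on the depth topic, and `restart` would kill and relaunch the roslaunch process while logging the attempt.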
[Realsense Customer Engineering Team Comment] |
The earlier tests were with USB legacy enabled, but after updating the firmware I tested first with it enabled, then with it disabled (latest tests). I didn't see a difference. I'll check if xHCI is configured tomorrow and test whichever mode is opposite the current setting. |
I can't find xHCI configuration options on the NUC7i3BNH |
Sorry, I forgot that you have a 7th gen NUC; xHCI has no optional settings. I have been working with a 4th gen i3 that has an auto / enabled mode option for xHCI. When lsusb fails, what do you see on dmesg before reboot? |
For completeness I've attached the entire dmesg file, but the relevant section appears to be:
The Bus enumeration on the NUC7i3 is:
It appears that bluetooth is provided on this machine via the USB 2.0 host controller on Bus 1 as an internal USB device. (The keyboard+mouse were not present during the tests but were plugged in for BIOS configuration)
|
An additional piece of debugging information: I tried a Rosewill RC-509 PCI-E (PCI Express) to USB 3.1 (Type A +Type C) Expansion Card in the AsRock IMB-195 i7-6700TE platform listed above in the table I provided. It does not experience issue 1. The devices always initialise correctly. |
[Realsense Customer Engineering Team Comment] Sorry for the late reply. I am another engineer who is now taking care of this ticket. I now have 2 D435s and a gen 7 NUC to investigate this issue, but would like to confirm whether the latest FW 5.9.13 still has the same issue? |
[Realsense Customer Engineering Team Comment] |
[Realsense Customer Engineering Team Comment] |
[Realsense Customer Engineering Team Comment] do you still have this issue? |
I just re-ran the test on firmware 5.9.14. I set it running on Saturday with the same parameters and initially things seemed to have improved (one camera failed to show up on After reinstalling the hub, the issue stopped occurring again. |
[Realsense Customer Engineering Team Comment] Do you mean that after reinstalling the hub it always works well? |
The hub we're using is an unpowered USB hub, using USB-C to connect to the NUC and sporting USB-A ports. With the hub (5.9.14): two D435s started first time 128 times and required 1+ software resets 182 times. Without the hub (5.9.14): two D435s started first time 138 times, required 1+ software resets 240 times, and one of them failed to show up as a device at boot (not visible on lsusb) 8 times in the first 13 hours. After that, one device failed to show up on 3207 consecutive reboot cycles (over the weekend), but upon unplugging and replugging, the device began to function again. |
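To make the two configurations easier to compare, the counts above can be turned into first-try success rates. This is a small worked calculation, assuming every launch in which both cameras eventually started falls into exactly one of the two buckets ("started first time" or "required 1+ software resets"); boots where a camera was missing from lsusb are not included. `first_try_rate` is a hypothetical helper name.

```python
def first_try_rate(first_time: int, needed_reset: int) -> float:
    """Fraction of launches where both cameras started without a software reset."""
    total = first_time + needed_reset
    return first_time / total


with_hub = first_try_rate(128, 182)     # 128 of 310 launches
without_hub = first_try_rate(138, 240)  # 138 of 378 launches

print(f"with hub:    {with_hub:.1%}")     # 41.3%
print(f"without hub: {without_hub:.1%}")  # 36.5%
```

On these numbers the first-try rate is similar in both setups (about 41% versus 37%); the larger difference is the enumeration failure at boot, which was only reported without the hub.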
The bios version that is experiencing the issue: BNKBL357.86A.0050 |
[Realsense Customer Engineering Team Comment] thanks, please let me know how it works. |
@RealSense-Customer-Engineering 467 reboot cycles in, no instances of the issue. It seems that the 0067 NUC bios + 5.9.14 firmware works well. |
I left the test running longer and had another failure. This was on the NUC with the latest firmware, and a recent version of librealsense: Relevant Kernel+Syslogs of failure:
one attempt to kill+relaunch ROS driver
After restarting the computer (and the device fails to show up on
This seems reminiscent of #2311, however the cables used are the supplied intel ones and although our eventual target is a mobile robot this test bench is completely static, and the test had been ongoing successfully for over 24h at this point (restarting every 5 minutes and then gathering camera data from two cameras started 3 seconds apart, logging any issues along the way). |
[Realsense Customer Engineering Team Comment] Do you have another USB-certified cable for testing? Frames not arriving in time is mostly caused by cable quality. |
I do, I have restarted the test with a different cable on both cameras. That said, after rebooting the computer the camera still won't reconnect in this state. The system has to be either physically unpowered, or the camera unplugged. This is more than just a transient issue and appears to indicate that there is a firmware, driver or hardware bug allowing the camera to get into a non-functioning state that can't be reset from software. |
[Realsense Customer Engineering Team Comment] Did you try the firmware 5.10.3 to see how it works? |
Running with a very recent BIOS on our NUC (BNKBL357.86A.0067.2018.0814.1500) and firmware 5.10.3, we experience two problems. With the USB-C hub that we were previously using without issue, the cameras fail to come up on reboot (no cameras on Upon downgrading to BIOS 0063 (the one we were previously using), the USB-C hub no longer has the "error -71" issue on reboot (however, I would expect it to revert to having the problems with the USB A ports that we saw on 0063). Boot logs containing "usb" with BIOS 0067:
With the USB A ports, they work the vast majority of the times, but still occasionally experience the following issue where they have device descriptor errors:
|
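When comparing boot logs across BIOS versions, it can help to tally the USB enumeration errors instead of reading them by eye. The sketch below counts two message shapes the Linux kernel commonly prints when enumeration fails. The exact wording can vary between kernel versions, so the patterns are assumptions to adjust against your own dmesg output; `summarize_usb_errors` is a hypothetical helper and the sample lines are illustrative, not copied from this thread's logs. On Linux, error -71 corresponds to EPROTO (protocol error).

```python
import re
from collections import Counter

# Assumption: these patterns match common Linux USB enumeration errors;
# the exact wording may differ between kernel versions.
PATTERNS = {
    "descriptor_read": re.compile(r"device descriptor read/\w+, error (-\d+)"),
    "no_address": re.compile(r"device not accepting address \d+, error (-\d+)"),
}


def summarize_usb_errors(log_text: str) -> Counter:
    """Count matches per (pattern name, error code) across kernel log text."""
    counts = Counter()
    for line in log_text.splitlines():
        for name, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                counts[(name, int(match.group(1)))] += 1
    return counts


# Illustrative lines, not copied from the logs in this thread.
SAMPLE_LOG = """\
[    2.104] usb 2-1: device descriptor read/8, error -71
[    2.340] usb 2-1: device descriptor read/8, error -71
[    3.012] usb 2-1: device not accepting address 3, error -71
[    3.500] usb 1-2: new high-speed USB device number 4 using xhci_hcd
"""

if __name__ == "__main__":
    for key, n in summarize_usb_errors(SAMPLE_LOG).items():
        print(key, n)
```

Running this over the saved dmesg or syslog of each boot gives a per-boot error count that can be logged alongside the lsusb device count.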
[Realsense Customer Engineering Team Comment] We ran the following tests on NUC7i3BNH / Ubuntu 16.04.4 / LibRealSense 2.16.1 / FW 5.10.3 / 2 D435 RealSense Cameras w/ bash script that reboots system and checks 'lsusb' for 2 D435 cameras:
Questions:
Thank you! |
[Realsense Customer Engineering Team Comment] We would like to help narrow down this issue; however, if there is no response, we will close this issue as solved. Thank you! |
That's encouraging. We are trying without the USB hub with two cameras right now. With BIOS 69 the unpowered USB C hub we are using was not working, however without the hub it seems better. I'm trying to see if the no-hub configuration is as stable as BIOS 63 with the hub (which is what we were using previously). |
We are using the intel provided cables at the moment in our tests. The ones we use in our robots are from another supplier since we want ones that screw in to the camera. |
Ok, I'm currently testing the configuration. On 3 of the 4 systems that I have tested so far, the configuration (NUC7i3 BIOS 0069 + Intel Realsense firmware 5.10.3, the latest version when I started these tests two weeks ago, + a recent version of ros-kinetic-intel-realsense2) works fine and appears stable. The cable configuration is two USB Type-C to Type-A cables: one connected to the front blue USB port, and one connected to the back bottom USB port. On one system, though, it appears to work when cold booted (power applied to NUC), but on reboot both cameras return a USB kernel error and fail to show up under
|
[Realsense Customer Engineering Team Comment] I am glad to hear 3 of the 4 systems are working for you now! Just for a quick test - can you:
Thank you for the update! Please let us know how it goes! |
Hello, I have the same issue. My SBC recognizes the D435 on a USB 3.0 port, but when I reboot the SBC from the command line, the SBC loses the D435 after rebooting.
I need to manipulate the SBC remotely. I hope the next firmware will solve this issue. Thank you. |
[Realsense Customer Engineering Team Comment] Sorry to hear you are seeing a similar issue with D435 camera. From our testing using Intel NUCs + provided D435 USB 3.0 type C cable, this problem was proven not an issue with D435 camera itself but a cabling and / or USB port connection issue. Since you are using a custom Intel Atom board, a few questions:
I am thinking the issue you are seeing is related to power consumption, could you please:
Thank you |
[Realsense Customer Engineering Team Comment] |
I ran into this problem as well; see the issue description. Any updates so far? |
Happened to wander in here... my own experience with these cameras and some other USB devices is that doing Instead, a workaround I commonly do is to use the |
@erelson - just wanted to say THANK YOU for this! The camera not being detected by the OS has been killing us in prod and although your exact answer didn't work for us, it led me to a solution that did. For others that stumble on this thread, here's what worked for us:
We're still on the lookout for a solution that doesn't require a reboot, but this works for now. If anyone finds a way to properly cycle a Realsense camera (we're using a D415) without rebooting when the camera doesn't show up on |
Issue Description
When trying to use 2x D435 with a 7th gen i3 NUC, I found that sometimes the devices failed to show up after a reboot cycle.
I automated both a reboot test:
The service used the ROS pipeline with only depth enabled, one pipeline per camera. The only modification to the current release branch is a change that resets the camera whenever the driver launches: ethz-asl/realsense@148eaf9
I have experienced three issues:
lsusb
) This is the primary issue that I am concerned about. After finding a workable solution for problem 2, I set out to profile problems 1 and 3.
I profiled this a lot to try to understand the problem and work around it:
In that last case, there is a fairly reliable pattern.