Linux driver stalls if a signal is received #1203

Closed
richardroberts opened this issue Feb 21, 2018 · 6 comments

@richardroberts

Required Info
Camera Model: D400
Firmware Version: Unknown
Operating System & Version: Ubuntu 16
Kernel Version (Linux Only): Unknown
Platform: Custom
SDK Version: Latest 2.x from GitHub

Issue Description

When a process using the RealSense API receives a signal, the driver stalls. This happens because the select call in backend-v4l2.cpp can fail with EINTR, and that case is not handled by retrying the call. The attached patch fixed the issue in my testing. I'm hoping someone can check whether this is indeed a bug and whether there are any problems with the proposed fix.

select_check_eintr.patch.txt

@RealSense-Customer-Engineering
Collaborator

[Realsense Customer Engineering Team Comment]
@richardroberts, please file a PR to request engineering review. Thank you.

@dorodnic
Contributor

dorodnic commented Apr 3, 2018

I created a PR based on the patch, @richardroberts. Thank you for bringing this up.
Could you please provide some extra context on how to reproduce the issue?
We exercise the V4L2 backend in our QA lab regularly for long periods of time, but have never come across such a problem.

@richardroberts
Author

Thank you for the follow-up. The issue arose on an embedded Intel Pentium platform running Ubuntu Linux with realtime patches. My guess about why it occurred is that we have some very high-priority threads running, plus several sensors and I/O devices connected, so frequent interrupts may cause the read to sometimes return early so that the kernel can deal with some I/O (very much a guess here).

The issue never occurred for me when testing on the desktop; it occurred only on the embedded platform, where it was repeatable. There, we could usually stream from the camera for anywhere from a few seconds to a minute before the hang occurred.

I think the commercially available computer most likely to reproduce this issue would be an "Up Board". If necessary, I might be able to work with you so that you can reproduce the issue; for this, contact me by email.
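
Independently of any particular hardware, the behaviour in the issue title (a signal arriving while select is blocked) can be demonstrated without a camera. The following is a minimal sketch under stated assumptions, not a reproduction of the reported hang: the function name `errno_after_interrupted_select`, the use of SIGALRM, and the 100 ms timer are all illustrative choices, and a pipe stands in for the V4L2 device descriptor.

```cpp
#include <cerrno>
#include <csignal>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

// Empty handler: its only purpose is to interrupt a blocking call.
static void on_alarm(int) {}

// Illustrative only: blocks in select() on an idle pipe and lets SIGALRM
// interrupt it. Returns the errno seen after select() failed, or 0 if
// select() did not fail.
int errno_after_interrupted_select()
{
    struct sigaction sa = {};
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;  // on Linux, select() is not restarted even with this flag
    sigaction(SIGALRM, &sa, nullptr);

    int fds[2];
    if (pipe(fds) != 0) return 0;

    // Deliver SIGALRM once, 100 ms from now.
    struct itimerval timer = {};
    timer.it_value.tv_usec = 100000;
    setitimer(ITIMER_REAL, &timer, nullptr);

    // Nothing is ever written to the pipe, so select() would otherwise
    // block for the full 5-second timeout.
    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(fds[0], &readfds);
    struct timeval tv = { 5, 0 };
    int r = select(fds[0] + 1, &readfds, nullptr, nullptr, &tv);
    int saved = (r < 0) ? errno : 0;

    close(fds[0]);
    close(fds[1]);
    return saved;
}
```

On Linux this is expected to return EINTR, which is the condition a caller has to treat as "try again" rather than as a failure.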

@dorodnic
Contributor

Great. I tested the patch a bit on my machine. We have a couple of Up Boards in the lab, but don't run regular, prolonged validation on them. We'll try to run a longer cycle on them to make sure the problem does not come back.

@RealSense-Customer-Engineering
Collaborator

[Realsense Customer Engineering Team Comment]
@richardroberts
The code has been merged into the development branch. If your problem is fixed, could you please close the issue? Thank you.

@richardroberts
Author

Ok, if I understand correctly, you would like me to close the issue, so I'm doing that. Thanks!
