cleardtc on aw50 fails successfully #84
Hey @aaeegg, long time no see. Do you remember what the reported failure was? |
Here's the log:
|
you should know that code better than me since you wrote it P) Suggest adding some |
In response to the clearDiagnosticInformation request (AF 01), the TCM sends a series of routineNotCompleteOrServiceInProgress responses (7E xx 23) at ~52 ms intervals, then, nearly a full second after the request, a success response (EF 01). Raw communication as seen when connecting to the interface (OBDLink SX) with a terminal program:
Scope shot of overall timing (readDiagnosticTroubleCodes request at G1, response at G2, immediately followed by clearDiagnosticInformation request at x, final response at o):

Scope shot of routineNotCompleteOrServiceInProgress repetition rate:

The behavior of the TCM is in contrast to the fuel management ECU, which immediately sends a success response. scantool handles the latter correctly. But when faced with the TCM's series of responses, scantool only looks at the first routineNotCompleteOrServiceInProgress response and doesn't wait for the interface to send the rest of the responses and the > prompt. It interprets the routineNotCompleteOrServiceInProgress response as a failure and ignores the rest of the responses. scantool output with debugging turned on:
Contrary to my original comment, scantool is still communicating with the TCM at this point. However, a subsequent disconnect command fails. scantool sends a stopDiagnosticSession request and gets a success response, then it tries to send an ATPC command to disconnect the interface. Normally, by the time scantool sends the ATPC command, the interface has already returned to command mode. But in this case, the previous series of routineNotCompleteOrServiceInProgress responses caused the interface's adaptive timing algorithm to increase the response timeout. The interface is still listening for more responses and hasn't yet sent the > prompt, but scantool considered the stopDiagnosticSession request/response to be done and didn't wait for the > prompt before sending the ATPC command. This caused the interface to abort the still-in-progress receive and send a STOPPED error. scantool output with debugging turned on:
According to Richard H. Jones's document on error codes, in addition to the AW50-42 TCM sending routineNotCompleteOrServiceInProgress while clearing DTCs, the airbag module does the same thing. Additionally, when a readDataByLocalIdentifier request is sent with a final byte of 02 rather than the usual 01 (scantool never does this), the ECU will send a continuous data stream alternating between routineNotCompleteOrServiceInProgress and readDataByLocalIdentifier responses. So there are two problems here:
A possible side issue is that the diag_l1 timeouts are totally uncoordinated with the interface's timeouts. The upper-layer code might know how long a response should take and use a specific timeout, and that's the timeout scantool uses to communicate with the interface, but the timeout the interface uses to communicate with the ECU is totally different. Also, if we know what the protocol's timing constraints are, it would probably be better to set the interface to a fixed timeout (ATAT 0, ATST hh) instead of letting it use adaptive timing. |
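To make the fixed-timeout idea concrete, here is a minimal sketch of deriving the ATST argument from a timeout in milliseconds. It assumes the ELM327 datasheet behavior as I remember it: ATST hh sets the timeout to roughly hh × 4 ms, and a value of 00 restores the default, so the result is clamped to 01..FF. The names `elm_st_value` and `elm_format_st` are made up for this example and are not part of scantool; clones may not follow the datasheet.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: convert a timeout in ms to the ATST argument.
 * Assumes 4 ms per count; rounds up so the interface never waits for
 * less time than requested. 00 is avoided because (per the datasheet,
 * as far as I recall) it resets the timeout to the default. */
unsigned elm_st_value(unsigned timeout_ms)
{
    if (timeout_ms > 0xFFu * 4u)
        return 0xFF;                    /* longest fixed timeout, about 1020 ms */
    unsigned hh = (timeout_ms + 3u) / 4u;
    return hh < 1u ? 1u : hh;
}

/* Hypothetical helper: format the ATST command for a given timeout.
 * ATAT 0 would be sent first, as its own command, waiting for the
 * '>' prompt before this one goes out. */
int elm_format_st(char *out, size_t outlen, unsigned timeout_ms)
{
    return snprintf(out, outlen, "ATST %02X\r", elm_st_value(timeout_ms));
}
```

With the TCM repeating routineNotCompleteOrServiceInProgress every ~52 ms, a fixed timeout of around 100 ms per response would come out as ATST 19 under these assumptions.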
Wow, excellent analysis. Do I understand correctly that for this issue, the only clean way to deal with it is to disable adaptive timing? I can think of a few approaches:
|
I think this could break on a fast enough computer even if adaptive timing is disabled. We should set a flag when an operation is in progress and clear it when we see a > character. On entry to elm_send and elm_sendcmd, if the flag is set and there's not already a > in the input buffer, we should send a space character to abort the operation, then wait for the >. Note that we should only wait for the > and not explicitly check for the STOPPED response, because there's a race condition if the operation completes just as we're trying to abort it. Also note that we send a space rather than a CR, because if a CR arrives after the operation completes, the interface will interpret this as a request to repeat the last command. I tested leading space characters on the OBDLink SX and they're safe. They should also be OK on the original ELM327 according to the datasheet. I don't know what other clones will do, but even if they don't handle it correctly, we won't be hosed any worse than if we had done nothing. |
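The flag logic described above could look roughly like this. It is only a sketch of the bookkeeping, not the actual elm_send / elm_sendcmd code: `struct elm_flag` and the `elmflag_*` functions are hypothetical names, and the serial I/O (writing the space, blocking until the prompt) is left to the caller.

```c
#include <stdbool.h>
#include <string.h>

#define ELM_PROMPT     '>'
#define ELM_ABORT_CHAR ' '   /* a space, not CR: a late CR would repeat the last command */

/* Hypothetical per-connection state. */
struct elm_flag {
    bool busy;   /* true from sending a request until '>' is seen */
};

/* Call right after writing a request or AT command to the interface. */
void elmflag_note_tx(struct elm_flag *st)
{
    st->busy = true;
}

/* Call with every NUL-terminated chunk read from the interface. */
void elmflag_note_rx(struct elm_flag *st, const char *data)
{
    if (strchr(data, ELM_PROMPT) != NULL)
        st->busy = false;
}

/* Call on entry to a send routine with whatever is already sitting in
 * the input buffer. Returns true if the caller has to write
 * ELM_ABORT_CHAR and then wait for '>' (and only for '>', not for
 * STOPPED) before sending anything else. */
bool elmflag_must_abort(struct elm_flag *st, const char *pending)
{
    elmflag_note_rx(st, pending);
    return st->busy;
}
```

Because the decision depends only on whether a > has been seen, it behaves the same whether the interface finished on its own or was interrupted, which sidesteps the race on STOPPED.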
Why would it break? When you request that sendcmd from D2 code, I understand there's a good notion of what kind of timeout is appropriate, so it should be a simple matter of passing the request with that timeout? I don't know if I want to change the ELM layer into some kind of partially async code. |
The problem is, you send your request, elm_send returns immediately. The car responds, you call elm_recv with whatever timeout, elm_recv gets the response and returns. But the interface is still waiting for more potential responses from the car. If you actually expect multiple responses, you call elm_recv again and get the next response (or a timeout), etc. If you've finished with this transaction and proceed to send another request or command - that's when it breaks. |
I split off the ELM interface problem into #87 because it's a separate issue from the misinterpretation of the TCM's response to clearDiagnosticInformation. |
If we get a routineNotCompleteOrServiceInProgress response, discard it and keep listening until we get the final response. This is necessary for requests that return a few routineNotCompleteOrServiceInProgresses before they complete, such as clearDiagnosticInformation on AW50-42. Fixes fenugrec#84
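For illustration, the discard-and-keep-listening behavior from that commit message can be sketched as below. This is not the code from the commit. It assumes the payload layout quoted earlier in the thread (7E xx 23 for in-progress, with xx ignored, and EF as the first byte of the success response to AF 01); `recv_fn`, `wait_final_response`, and the replay helper are hypothetical names used so the sketch can run without a car.

```c
#include <stddef.h>
#include <stdint.h>

#define NEG_RESPONSE   0x7E  /* first byte of "7E xx 23" */
#define RC_IN_PROGRESS 0x23  /* routineNotCompleteOrServiceInProgress */

enum wait_result { WAIT_OK, WAIT_FAILED, WAIT_TIMEOUT };

/* Hypothetical receive hook: copies one message into buf and returns
 * its length, or 0 on timeout. Stands in for the real receive call. */
typedef int (*recv_fn)(void *ctx, uint8_t *buf, size_t buflen);

/* Canned message source for trying the loop out offline. */
struct msg { uint8_t data[8]; int len; };
struct replay { const struct msg *msgs; int count; int next; };

static int replay_recv(void *ctx, uint8_t *buf, size_t buflen)
{
    struct replay *r = ctx;
    if (r->next >= r->count)
        return 0;                        /* nothing left: report a timeout */
    const struct msg *m = &r->msgs[r->next++];
    size_t n = (size_t)m->len < buflen ? (size_t)m->len : buflen;
    for (size_t i = 0; i < n; i++)
        buf[i] = m->data[i];
    return (int)n;
}

/* Discard in-progress responses and keep listening until a final
 * response arrives. max_pending bounds how many in-progress responses
 * are tolerated, so a continuous stream cannot hang the caller. */
enum wait_result wait_final_response(recv_fn recv, void *ctx,
                                     uint8_t ok_sid, int max_pending)
{
    uint8_t buf[8];
    for (int pending = 0; pending <= max_pending; pending++) {
        int len = recv(ctx, buf, sizeof buf);
        if (len <= 0)
            return WAIT_TIMEOUT;
        if (len >= 3 && buf[0] == NEG_RESPONSE && buf[2] == RC_IN_PROGRESS)
            continue;                    /* still working: discard, listen again */
        return buf[0] == ok_sid ? WAIT_OK : WAIT_FAILED;
    }
    return WAIT_TIMEOUT;
}
```

Feeding `replay_recv` a few 7E xx 23 messages followed by EF 01 and calling `wait_final_response(replay_recv, &r, 0xEF, 30)` exercises the success path. At ~52 ms per repetition over nearly a second, the TCM sends on the order of twenty in-progress responses, so the bound needs to be at least that large.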
When clearing DTCs on AW50-42 transmission using D2 protocol, DTCs are cleared, but scantool reports "Failed" and loses connection to the TCM. Maybe a timeout needs to be increased.