1060: Catching up a few bug fixes #852

baemyung · 2024-03-13T14:16:49Z

A few number of bugs (which could be critical) have been fixed, and it is catching up into downstream.
They are

openbmc/bmcweb@a6695a8
Clear json object
openbmc/bmcweb@f3cb5df
http_client: fix for broken connection
openbmc/bmcweb@a889420
bmcweb: Add nullptr check for weakptr

nlohmann::json::clear() has different behavior dependent on what the underlying object is, rather than doing the expected behavior of completely clearing the json object. This didn't matter because of a similar bug in http_connection that relied on nlohmann:json::empty() which is ALSO type dependent, so these worked. Unfortunately, in 02e01b5 we wanted to allow empty objects, and this bug was exposed. There are two places where clear() is used, once in Response, which is clearly not the intent, which is to reset the object to the original constructed state. The other place we call clear is in Manager, where we use it to clear incremental results. That was a previous best practice that has been eliminated everywhere else (now we return as many results with the error as we are able). It has been removed. Tested: Logging into the webui in firefox no longer core dumps. Signed-off-by: Ed Tanous <edtanous@google.com> Change-Id: Ic89a037b30fb40c0e6eaeb939cae6e006dd0ffac

http_client is not handling connection termination by server due to keep alive timeout. At present client is not aware of connection termination from server. So whenever next redfish is event ready to be sent, the client will try send/receives data over broken connection. After failed operation the client will try to restart the connection by closing the current connection. Problems: 1) Restart is not attempted on all failure paths. Eg: stream_truncated error was ignored, which usually happens when try to read from broken connection, due to which retry is never performed. 2) Ssl shutdown over broken connection often fails to call the shutdown callback 3) ssl session was reused for new connection attempt. Which is wrong Solution: This patch will try to reattempt the connection in all failure cases. It uses new socket object and new ssl session for the retries Tested by: Test normal event flow between redfish-event clients and the BMC Test failure event flow between redfish-event clients and the BMC Tested the bad path by keeping the setup idle for 3 hours on the above two setups. Verified the events flow after this idle time Change-Id: I3d725b9d77bea22e2e8860e01ee0dfc971789008 Signed-off-by: Abhilash Raju <abhilash.kollam@gmail.com> Signed-off-by: Ed Tanous <ed@tanous.net>

When we call a null weakptr, it will cause a crash. Add nullptr check for weakptr can avoid this situation. Tested: bmcweb.service did not experience core-dump. Change-Id: I4490d68c70ea5d43681f4fb18b3859afb01ed70a Signed-off-by: Zhao Gang <zhaogang.0108@bytedance.com>

gtmills · 2024-03-14T03:31:15Z

@baemyung Do we think this fixes the core dump we are seeing ?

raviteja-b · 2024-03-14T04:37:09Z

@baemyung have you verified basic event flow with this PR with HMC connected?

abhilashraju · 2024-03-14T04:49:11Z

http/http_client.hpp

@@ -380,6 +380,16 @@ class ConnectionInfo : public std::enable_shared_from_this<ConnectionInfo>
        unsigned int respCode = parser->get().result_int();
        BMCWEB_LOG_DEBUG << "recvMessage() Header Response Code: " << respCode;

+        // Handle the case of stream_truncated.  Some servers close the ssl


can we take the latest code from upstream? It does not uses socket pointer

For 1060, we can't rebase with upstream but we try to catch up some potentially critical bug fixes if there are.
However, this particular lines are already synced with upstream - related to socket handling.

abhilashraju · 2024-03-14T04:50:54Z

http/http_response.hpp

@@ -153,7 +153,7 @@ struct Response
    {
        BMCWEB_LOG_DEBUG << this << " Clearing response containers";
        stringResponse.emplace(response_type{});
-        jsonValue.clear();
+        jsonValue = nullptr;


jsonValue is an object not a pointer right? should't it be like jsonValue=nlohmann::json();

I think both ways can be used - nlohmann/json#2156
Here, we're trying to sync up with what upstream does - https://github.com/openbmc/bmcweb/blob/9970e93f794d0110de436828775e333e81330ad5/http/http_response.hpp#L194

baemyung · 2024-03-14T13:20:23Z

@baemyung have you verified basic event flow with this PR with HMC connected?

I have asked the tester of 595145 to try it out,and waiting for the result.

baemyung · 2024-03-14T13:21:33Z

@baemyung Do we think this fixes the core dump we are seeing ?

I don't know at this time, but these seem critical ones to resolve regardless.

gtmills · 2024-03-15T03:22:41Z

Let's get this in. We can do something more later

edtanous and others added 3 commits March 12, 2024 10:10

gtmills requested review from abhilashraju and sunharis March 14, 2024 03:16

gtmills approved these changes Mar 14, 2024

View reviewed changes

raviteja-b self-requested a review March 14, 2024 04:41

abhilashraju reviewed Mar 14, 2024

View reviewed changes

rfrandse merged commit b5d73b0 into ibm-openbmc:1060 Mar 15, 2024
1 check passed

baemyung deleted the 1060-catchup-bug-fixes branch March 21, 2024 01:08

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

1060: Catching up a few bug fixes #852

1060: Catching up a few bug fixes #852

baemyung commented Mar 13, 2024

gtmills commented Mar 14, 2024

raviteja-b commented Mar 14, 2024

abhilashraju Mar 14, 2024

baemyung Mar 14, 2024

abhilashraju Mar 14, 2024

baemyung Mar 14, 2024

baemyung commented Mar 14, 2024

baemyung commented Mar 14, 2024

gtmills commented Mar 15, 2024

1060: Catching up a few bug fixes #852

1060: Catching up a few bug fixes #852

Conversation

baemyung commented Mar 13, 2024

gtmills commented Mar 14, 2024

raviteja-b commented Mar 14, 2024

abhilashraju Mar 14, 2024

Choose a reason for hiding this comment

baemyung Mar 14, 2024

Choose a reason for hiding this comment

abhilashraju Mar 14, 2024

Choose a reason for hiding this comment

baemyung Mar 14, 2024

Choose a reason for hiding this comment

baemyung commented Mar 14, 2024

baemyung commented Mar 14, 2024

gtmills commented Mar 15, 2024