Skip to content

Commit

Permalink
[#23925] DocDB: Address recent regression of test PgWaitQueuesTest.Mu…
Browse files Browse the repository at this point in the history
…ltiTabletFairness

Summary:
`PgWaitQueuesTest.MultiTabletFairness` has some timing based assertions that started to fail recently. In particular, each thread issues a `select for update...` and waits for all other threads to reach the same state.

It seems like enabling shared memory in release mode has caused the flakiness in this test (only on mac release builds), which may be hints at `select for update...` taking longer after the change. But a potential regression affecting latencies due to that change is being looked at separately and it shouldn't have anything to do with this test.

here's a snippet of the error we see on mac release builds.
```
../../src/yb/yql/pgwrapper/pg_wait_on_conflict-test.cc:1317
Value of: queued_waiters.WaitFor(10s * kTimeMultiplier)
  Actual: false
Expected: true
```

This diff addresses the test only issue by executing all reads with explicit lock requests before the for loop with timing based assertions.
Jira: DB-12827

Test Plan: ./yb_build.sh release --cxx-test pgwrapper_pg_wait_on_conflict-test --gtest_filter PgWaitQueuesTest.MultiTabletFairness -n 50 --tp 1

Reviewers: rthallam, pjain, patnaik.balivada

Reviewed By: patnaik.balivada

Subscribers: ybase, yql

Differential Revision: https://phorge.dev.yugabyte.com/D38050
  • Loading branch information
basavaraj29 committed Sep 16, 2024
1 parent e2b1d28 commit e717f43
Showing 1 changed file with 7 additions and 7 deletions.
14 changes: 7 additions & 7 deletions src/yb/yql/pgwrapper/pg_wait_on_conflict-test.cc
Original file line number Diff line number Diff line change
Expand Up @@ -1291,6 +1291,12 @@ void PgWaitQueuesTest::TestMultiTabletFairness() const {
update_conns.reserve(kNumUpdateConns);
for (int i = 0; i < kNumUpdateConns; ++i) {
update_conns.push_back(ASSERT_RESULT(Connect()));
auto& conn = update_conns.back();
ASSERT_OK(conn.StartTransaction(IsolationLevel::SNAPSHOT_ISOLATION));
// Establish a distributed transaction by obtaining a lock on some key outside of the contended
// range of keys.
ASSERT_OK(conn.FetchFormat("SELECT * FROM foo WHERE k=$0 FOR UPDATE", kNumKeys * 2 + i));
LOG(INFO) << "Conn " << i << " started";
}

TestThreadHolder thread_holder;
Expand All @@ -1301,20 +1307,14 @@ void PgWaitQueuesTest::TestMultiTabletFairness() const {
// in *serial* in both RR and RC isolation.
for (int i = 0; i < kNumUpdateConns; ++i) {
auto& conn = update_conns.at(i);
ASSERT_OK(conn.StartTransaction(IsolationLevel::SNAPSHOT_ISOLATION));
// Establish a distributed transaction by obtaining a lock on some key outside of the contended
// range of keys.
ASSERT_OK(conn.FetchFormat("SELECT * FROM foo WHERE k=$0 FOR UPDATE", kNumKeys * 2 + i));
LOG(INFO) << "Conn " << i << " started";
update_did_return[i] = false;

thread_holder.AddThreadFunctor(
[i, &conn, &update_did_return = update_did_return[i], &queued_waiters, &update_query] {
// Wait for all connections to queue their thread of execution
auto txn_id = ASSERT_RESULT(conn.FetchRow<Uuid>("SELECT yb_get_current_transaction()"));
LOG(INFO) << "Conn " << i << " queued with txn id " << yb::ToString(txn_id);
queued_waiters.CountDown();
ASSERT_TRUE(queued_waiters.WaitFor(10s * kTimeMultiplier));
ASSERT_TRUE(queued_waiters.WaitFor(20s * kTimeMultiplier));
LOG(INFO) << "Conn " << i << " finished waiting";

// Set timeout to 10s so the test does not hang for default 600s timeout in case of failure.
Expand Down

0 comments on commit e717f43

Please sign in to comment.