[YSQL] Allow read committed isolation to pick read time using safe time on first rpc to tserver #15856

Closed
pkj415 opened this issue Jan 27, 2023 · 0 comments
Assignees
Labels
area/ysql Yugabyte SQL (YSQL) kind/enhancement This is an enhancement of an existing feature priority/medium Medium priority issue

pkj415 commented Jan 27, 2023

Jira Link: DB-5248

Description

In Read Committed isolation, a new read time is picked for each statement (i.e., a new logical snapshot of the database is used for each statement's reads). This is done in PgClientService by setting the read time to the current time at the start of each new statement, before issuing requests to any tserver. However, this might result in high latency for the first read op executed as part of that statement, because the tablet serving the read (likely on another node) might have to wait for its "safe" time to reach the picked read time. A long wait for safe time is usually seen when there are concurrent writes to the tablet and the read arrives while the Raft replication that moves the safe time ahead is still in progress (see #11805).
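
To make the latency problem concrete, here is a minimal toy model, not YugabyteDB code. The `Tablet` class, its `safe_time_us` field, and the timestamps are all hypothetical, and the wait is modelled simply as the gap between the picked read time and the tablet's safe time, which is a simplification of how safe time actually advances.

```python
from dataclasses import dataclass


@dataclass
class Tablet:
    """Toy model of a tablet; `safe_time_us` is a hypothetical field name."""

    safe_time_us: int  # latest time up to which reads are known to be stable

    def wait_needed_us(self, read_time_us: int) -> int:
        """Gap the safe time must close before a read at `read_time_us` can run."""
        return max(0, read_time_us - self.safe_time_us)


# Assumed numbers, for illustration only.
tablet = Tablet(safe_time_us=1_000_000)

# PgClientService picks "now" as the read time, but the tablet's safe time
# lags because a Raft-replicated write is still in flight.
picked_now_us = 1_000_250
print(tablet.wait_needed_us(picked_now_us))  # 250

# If the tablet were allowed to use its own safe time, no wait would be needed.
print(tablet.wait_needed_us(tablet.safe_time_us))  # 0
```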

This issue is avoided in Repeatable Read isolation because there, the first tablet serving a read in a transaction is allowed to pick the read time as the latest available "safe" time, without having to wait for any catch-up. This read time is sent back to PgClientService as used_read_time so that future reads can use the same read time. Note that even in Repeatable Read isolation, when there are multiple parallel RPCs to various tservers, the read time is still picked on the PgClientService, because otherwise the RPCs would have to wait for one of them to execute and come back with a used_read_time.
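
The selection rule described above, applied per statement as proposed for Read Committed, could be sketched roughly as follows. This is an illustrative model under stated assumptions, not the actual PgClientService implementation: `Session`, `Tablet`, `start_statement`, and `run_reads` are hypothetical names, and only `used_read_time` corresponds to a term from the description.

```python
from typing import List, Optional


class Tablet:
    """Toy tablet with a fixed safe time (hypothetical model)."""

    def __init__(self, safe_time_us: int) -> None:
        self.safe_time_us = safe_time_us

    def read(self, read_time_us: Optional[int]) -> int:
        """Serve a read and return the used read time.

        With no read time supplied, the tablet picks its current safe time.
        """
        return self.safe_time_us if read_time_us is None else read_time_us


class Session:
    """Toy stand-in for the per-connection state kept by PgClientService."""

    def __init__(self) -> None:
        self.read_time_us: Optional[int] = None

    def start_statement(self) -> None:
        # Read Committed: every statement gets a fresh snapshot.
        self.read_time_us = None

    def run_reads(self, tablets: List[Tablet], now_us: int) -> Optional[int]:
        if self.read_time_us is None and len(tablets) > 1:
            # Parallel RPCs must share one snapshot up front; otherwise they
            # would have to wait for one of them to return a used_read_time.
            self.read_time_us = now_us
        for tablet in tablets:
            used_read_time = tablet.read(self.read_time_us)
            if self.read_time_us is None:
                # First RPC of the statement: adopt the tablet's choice.
                self.read_time_us = used_read_time
        return self.read_time_us


session = Session()
session.start_statement()
# A single RPC: the tablet picks its safe time, so no wait is needed.
first = session.run_reads([Tablet(1_000_000)], now_us=1_000_250)
# Later reads in the same statement reuse that read time.
second = session.run_reads([Tablet(1_000_100)], now_us=1_000_300)
session.start_statement()
# Parallel RPCs in a new statement: the read time is picked on the client side.
third = session.run_reads([Tablet(1_000_400), Tablet(1_000_450)], now_us=1_000_500)
print(first, second, third)
```

In this sketch the single-RPC case never waits on safe time, while the parallel case keeps the existing behaviour of picking the current time before fanning out.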

@pkj415 pkj415 added area/ysql Yugabyte SQL (YSQL) status/awaiting-triage Issue awaiting triage labels Jan 27, 2023
@pkj415 pkj415 self-assigned this Jan 27, 2023
@yugabyte-ci yugabyte-ci added kind/bug This issue is a bug priority/medium Medium priority issue labels Jan 27, 2023
@pkj415 pkj415 added priority/high High Priority and removed kind/bug This issue is a bug priority/medium Medium priority issue status/awaiting-triage Issue awaiting triage labels Jan 27, 2023
@yugabyte-ci yugabyte-ci added kind/bug This issue is a bug status/awaiting-triage Issue awaiting triage labels Feb 14, 2023
@yugabyte-ci yugabyte-ci added kind/enhancement This is an enhancement of an existing feature priority/medium Medium priority issue and removed status/awaiting-triage Issue awaiting triage kind/bug This issue is a bug priority/high High Priority labels Feb 22, 2023
@pkj415 pkj415 changed the title [YSQL] Operations in Read Committed isolation unnecessarily incur high latencies [YSQL] Operations in Read Committed isolation might unnecessarily incur high latencies Mar 31, 2023
pkj415 added a commit to pkj415/yugabyte-db that referenced this issue Apr 7, 2023
…ble for Read Committed isolation

Summary:
In Read Committed isolation, a new read time is picked for each statement
(i.e., a new logical snapshot of the database is used for each statement's
reads). This is done (in PgClientService) by setting the read time to the
current time at the start of each new statement before issuing requests to any
tserver. However, this might result in high latencies in the first read op that
is executed as part of that statement because the tablet serving the read
(likely on another node) might have to wait for the "safe" time to reach the
picked read time. A long wait for safe time is usually seen when there are
concurrent writes to the tablet and the read enters while the Raft replication
that moves the safe time ahead is still in progress (see yugabyte#11805).

This issue is avoided in Repeatable Read isolation because there, the first
tablet serving the read in a transaction is allowed to pick the read time as the
latest available "safe" time without having to wait for any catchup. This read
time is sent back to PgClientService as used_read_time so that future reads can
use the same read time. Note that even in Repeatable Read isolation, when
there are multiple parallel RPCs to various tservers, the read time is still
picked on the PgClientService because otherwise, the RPCs would have to wait for
one of them to execute and come back with a used_read_time.

This diff extends the same logic to Read Committed isolation.

Test Plan:
Jenkins: skip

./yb_build.sh --java-test org.yb.pgsql.TestPgTransactions#testReadPointInReadCommittedIsolation
./yb_build.sh --java-test org.yb.pgsql.TestPgIsolationRegress

Reviewers: dmitry

Subscribers: yql, bogdan

Differential Revision: https://phabricator.dev.yugabyte.com/D24075
@pkj415 pkj415 changed the title [YSQL] Operations in Read Committed isolation might unnecessarily incur high latencies [YSQL] Allow read committed isolation to pick read time using safe time on first rpc to tserver Jun 23, 2023
pkj415 added a commit to pkj415/yugabyte-db that referenced this issue Jun 30, 2023
…n Read Committed isolation

Summary:
In Read Committed isolation, a new read time is picked for each statement
(i.e., a new logical snapshot of the database is used for each statement's
reads). This is done (in PgClientService) by setting the read time to the
current time at the start of each new statement before issuing requests to any
tserver. However, this might result in high latencies in the first read op that
is executed as part of that statement because the tablet serving the read
(likely on another node) might have to wait for the "safe" time to reach the
picked read time. A long wait for safe time is usually seen when there are
concurrent writes to the tablet and the read enters while the Raft replication
that moves the safe time ahead is still in progress (see yugabyte#11805).

This issue is avoided in Repeatable Read isolation because there, the first
tablet serving the read in a transaction is allowed to pick the read time as the
latest available "safe" time without having to wait for any catchup. This read
time is sent back to PgClientService as used_read_time so that future reads can
use the same read time. Note that even in Repeatable Read isolation, when
there are multiple parallel RPCs to various tservers, the read time is still
picked on the PgClientService because otherwise, the RPCs would have to wait for
one of them to execute and come back with a used_read_time.

This diff extends the same logic to Read Committed isolation.

Test Plan:
./yb_build.sh --java-test org.yb.pgsql.TestPgTransactions#testReadPointInReadCommittedIsolation
./yb_build.sh --java-test org.yb.pgsql.TestPgIsolationRegress

Reviewers: dmitry

Subscribers: yql, bogdan

Differential Revision: https://phabricator.dev.yugabyte.com/D24075
pkj415 added a commit to pkj415/yugabyte-db that referenced this issue Jul 7, 2023
…n Read Committed isolation

pkj415 added a commit that referenced this issue Jul 7, 2023
…ommitted isolation

Summary:
In Read Committed isolation, a new read time is picked for each statement
(i.e., a new logical snapshot of the database is used for each statement's
reads). This is done (in PgClientService) by setting the read time to the
current time at the start of each new statement before issuing requests to any
tserver. However, this might result in high latencies in the first read op that
is executed as part of that statement because the tablet serving the read
(likely on another node) might have to wait for the "safe" time to reach the
picked read time. A long wait for safe time is usually seen when there are
concurrent writes to the tablet and the read enters while the Raft replication
that moves the safe time ahead is still in progress (see #11805).

This issue is avoided in Repeatable Read isolation because there, the first
tablet serving the read in a transaction is allowed to pick the read time as the
latest available "safe" time without having to wait for any catchup. This read
time is sent back to PgClientService as used_read_time so that future reads can
use the same read time. Note that even in Repeatable Read isolation, when
there are multiple parallel RPCs to various tservers, the read time is still
picked on the PgClientService because otherwise, the RPCs would have to wait for
one of them to execute and come back with a used_read_time.

This diff extends the same logic to Read Committed isolation.
Jira: DB-5248

Test Plan:
./yb_build.sh --java-test org.yb.pgsql.TestPgTransactions#testReadPointInReadCommittedIsolation
./yb_build.sh --java-test org.yb.pgsql.TestPgIsolationRegress

Reviewers: dmitry

Reviewed By: dmitry

Subscribers: dsrinivasan, gkukreja, yql, bogdan

Differential Revision: https://phorge.dev.yugabyte.com/D24075
@pkj415 pkj415 closed this as completed Jul 7, 2023