[Docs] PR for Follower Reads documentation should call out default staleness when reading from tablet leader #15182

Merged
merged 1 commit into from
Dec 5, 2022
@@ -28,61 +28,62 @@ type: docs

## Leader leases

In a distributed environment, when one node in a cluster is elected as the leader holding the latest data, it is possible that another node may assume that it is the leader, and that it holds the latest data. This could result in serving stale reads to a client. To avoid this confusion, YugabyteDB provides a _leader lease_ mechanism where an elected node member is guaranteed to be the leader until its lease expires.

The leader lease mechanism guarantees to serve strongly consistent reads where a client can fetch reads directly from the leader, because the leader under lease will have the latest data.
The interactive animations in this [blog post](https://blog.yugabyte.com/low-latency-reads-in-geo-distributed-sql-with-raft-leader-leases/) explains the performance improvements using leader leases.
The interactive animations in this [blog post](https://blog.yugabyte.com/low-latency-reads-in-geo-distributed-sql-with-raft-leader-leases/) explain the performance improvements using leader leases.

## Follower reads

YugabyteDB requires reading from the **leader** to read the latest data. However, for applications that don't require the latest data, or are working with unchanging data, the cost of contacting a potentially remote leader to fetch the data may be wasteful. Your application may benefit from better latency by reading from a **replica** that is closer to the client.
YugabyteDB requires reading from the leader to read the latest data. However, for applications that do not require the latest data or are working with unchanging data, the cost of contacting a potentially remote leader to fetch the data may be wasteful. Your application may benefit from better latency by reading from a replica that is closer to the client.

### Time-critical use cases

Let's say a user starts a donation page to raise money for a personal cause and the target amount must be met by the end of the day. In a time-critical scenario, the user benefits from accessing the most recent donation made on the page and can keep track of the progress. Such real-time applications require the latest information as soon as it's available. But in the case of an application where fetching slightly stale data is okay, follower reads can be helpful.
Suppose the end-user starts a donation page to raise money for a personal cause and the target amount must be met by the end of the day. In a time-critical scenario, the end-user benefits from accessing the most recent donation made on the page and can keep track of the progress. Such real-time applications require the latest information as soon as it is available. But in the case of an application where fetching slightly stale data is acceptable, follower reads can be helpful.

### Latency-tolerant (staleness) use cases

Let's say a social media post gets about a million likes and more continuously. For a post with massive likes such as this one, slightly stale reads are acceptable, and immediate updates aren't necessary because the absolute number may not really matter to the user reading the post. Such applications don't need to always make requests directly to the leader. Instead, a slightly older value from the closest replica can achieve improved performance with lower latency.
Suppose a social media post has a million likes and continues to receive more. For a post with this many likes, slightly stale reads are acceptable, and immediate updates are not necessary because the exact count may not really matter to the end-user reading the post. Such applications do not need to always make requests directly to the leader. Instead, reading a slightly older value from the closest replica can improve performance by lowering latency.

Follower reads are applicable for applications that can tolerate staleness. Replicas may not be completely up to date with all updates, so this design may respond with stale data. You can specify how much staleness the application can tolerate. When enabled, read-only operations may be handled by the closest replica, instead of having to go to the leader. The GUC session variables that PostgreSQL supports can be used to enable follower reads.
Follower reads are applicable for applications that can tolerate staleness. Replicas may not be completely up-to-date with all updates, so this design may respond with stale data. You can specify how much staleness the application can tolerate. When enabled, read-only operations may be handled by the closest replica, instead of having to go to the leader. The Grand Unified Configuration (GUC) session variables that PostgreSQL supports can be used to enable follower reads.

## Surface area and usage

### Surface area
## Surface area

Two session variables control the behavior of follower reads:

- `yb_read_from_followers` controls whether reading from followers is enabled. Default is false.
- `yb_follower_read_staleness_ms` sets the maximum allowable staleness. Default is 30000 (30 seconds).
- `yb_read_from_followers` controls whether or not reading from followers is enabled. The default value is false.

- `yb_follower_read_staleness_ms` sets the maximum allowable staleness. The default value is 30000 (30 seconds).

Note that even if the tablet leader is on the closest node, you would still read from `Now()-yb_follower_read_staleness_ms`. Therefore, when follower reads are used, the read is always stale, even if you are reading from a tablet leader.
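
As a minimal sketch, the two session variables can be set and inspected as follows. The value of 10000 ms is only an illustrative choice and not a recommendation:

```sql
-- Enable follower reads for the current session (the default is false).
SET yb_read_from_followers = true;

-- Tolerate up to 10 seconds of staleness instead of the default 30000 ms.
-- The value 10000 is an arbitrary example.
SET yb_follower_read_staleness_ms = 10000;

-- Confirm the current settings.
SHOW yb_read_from_followers;
SHOW yb_follower_read_staleness_ms;
```

These settings alone do not cause follower reads: the transaction or statement must also be read-only.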

### Expected behavior

The table describes what the expected behavior is when a read happens from a follower.
The following table provides information on the expected behavior when a read happens from a follower.

| Conditions | Expected behavior |
| :--------- | :---------------- |
| yb_read_from_followers is true AND transaction is marked read-only | Read happens from the closest follower |
| yb_read_from_followers is false OR transaction/statement is not read-only | Read happens from the leader |
| yb_read_from_followers is true AND transaction is marked read-only | Read happens from the closest tablet. |
| yb_read_from_followers is false OR transaction or statement is not read-only | Read happens from the leader. |
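
The two rows can be sketched as follows, assuming the table `t` with key column `k` that the examples in this document use already exists:

```sql
SET yb_read_from_followers = true;

-- The statement is not marked read-only, so the read happens from the leader.
SELECT * FROM t WHERE k = 'k1';

-- The transaction is marked read-only and follower reads are enabled,
-- so the read may be served by the closest replica.
BEGIN TRANSACTION READ ONLY;
SELECT * FROM t WHERE k = 'k1';
COMMIT;
```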

### Read from follower conditions

- If the follower's safe-time is at least `<current_time> - <staleness>`, the follower may serve the read without any delay.

- If the follower is not yet caught up to `<current_time> - <staleness>`, the read will be redirected to a different replica transparently from the end-user. The end user may see a slight increase in latency depending on the location of the replica which satisfies the read.
- If the follower is not yet caught up to `<current_time> - <staleness>`, the read is redirected to a different replica transparently to the end-user. The end-user may see a slight increase in latency depending on the location of the replica that satisfies the read.

### Read-only transaction conditions

To mark a transaction as read only, a user can do one of the following:
You can mark a transaction as read-only by applying the following guidelines:

- `SET TRANSACTION READ ONLY` applies only to the current transaction block.
- `SET SESSION CHARACTERISTICS AS TRANSACTION READ ONLY` applies the read-only setting to all statements and transaction blocks that follow.
- `SET default_transaction_read_only = TRUE` applies the read-only setting to all statements and transaction blocks that follow.
- Use the **pg_hint_plan** mechanism to embed the hint along with the `SELECT` statement. For example, `/*+ Set(transaction_read_only true) */ SELECT ...` applies only to the current `SELECT` statement.
- The `pg_hint_plan` mechanism embeds the hint along with the `SELECT` statement. For example, `/*+ Set(transaction_read_only true) */ SELECT ...` applies only to the current `SELECT` statement.

## Examples

This example uses follower reads because the **transaction** is marked read-only.
This example uses follower reads because the transaction is marked read-only:

```sql
set yb_read_from_followers = true;
@@ -97,7 +98,7 @@ commit;
k1 | v1
```

This example uses follower reads because the **session** is marked read only.
This example uses follower reads because the session is marked read-only:

```sql
set session characteristics as transaction read only;
@@ -112,11 +113,7 @@ SELECT * from t WHERE k='k1';
(1 row)
```

The following examples use follower reads because the **pg_hint_plan** mechanism is used during SELECT, PREPARE, and CREATE FUNCTION to perform follower reads.

{{< note title="Note" >}}
The pg_hint_plan hint needs to be applied at the prepare/function-definition stage and not at the `execute` stage.
{{< /note >}}
The following examples use follower reads because the `pg_hint_plan` mechanism is used during `SELECT`, `PREPARE`, and `CREATE FUNCTION` to perform follower reads:

```sql
set yb_read_from_followers = true;
@@ -161,7 +158,9 @@ SELECT func();
(1 row)
```

A **join** example that uses follower reads.
Note that the `pg_hint_plan` hint must be applied at the function-definition stage and not at the `execute` stage.

The following is a `JOIN` example that uses follower reads:

```sql
create table table1(k int primary key, v int);
@@ -180,7 +179,7 @@ select * from table1, table2 where table1.k = 3 and table2.v = table3.v;
(1 row)
```

The following examples demonstrate **staleness** after enabling the `yb_follower_read_staleness_ms` property.
The following examples demonstrate staleness after enabling the `yb_follower_read_staleness_ms` property:

```sql
set session characteristics as transaction read write;