add document about stale read transaction #6347

Merged
merged 59 commits into from
Jun 15, 2021
Changes from 57 commits
00fb603
add doc
Yisaer May 26, 2021
1b6e53f
add sql view
Yisaer May 27, 2021
dc84157
add sql view
Yisaer May 27, 2021
cc0e35c
fix lint
Yisaer May 27, 2021
14e2cf3
fix lint
Yisaer May 27, 2021
0145732
fix lint
Yisaer May 27, 2021
808a889
fix lint
Yisaer May 27, 2021
c8f276d
fix lint
Yisaer May 27, 2021
d84d1a3
Add Stale Read introduction
JmPotato Jun 1, 2021
f3fba9e
Resolve the conflict
JmPotato Jun 1, 2021
03f3ccd
Fix the link
JmPotato Jun 1, 2021
72ec40a
Fix a typo
JmPotato Jun 1, 2021
bdb6ca9
Refine the introduction of usage scenario
JmPotato Jun 8, 2021
b53e3ad
Update TOC.md
JmPotato Jun 10, 2021
790d75a
Apply suggestions from code review
Yisaer Jun 10, 2021
e897d19
address the comment
Yisaer Jun 10, 2021
911de89
Update TOC.md
nolouch Jun 11, 2021
85bc899
Update stale-read.md
nolouch Jun 11, 2021
d6c070c
Update as-of-timestamp.md
nolouch Jun 11, 2021
293c4f4
Update as-of-timestamp.md
nolouch Jun 11, 2021
27f0caf
Update as-of-timestamp.md
nolouch Jun 11, 2021
0a79a09
Update as-of-timestamp.md
nolouch Jun 11, 2021
6cee97b
Update stale-read.md
nolouch Jun 11, 2021
19c44aa
Update stale-read.md
nolouch Jun 11, 2021
36a7487
Update stale-read.md
nolouch Jun 11, 2021
83a7454
Update stale-read.md
nolouch Jun 11, 2021
218a107
Update as-of-timestamp.md
nolouch Jun 11, 2021
23584df
Update as-of-timestamp.md
nolouch Jun 11, 2021
ebc77c1
Update as-of-timestamp.md
nolouch Jun 11, 2021
18cb715
Update as-of-timestamp.md
nolouch Jun 11, 2021
c260ad8
Update as-of-timestamp.md
nolouch Jun 11, 2021
7f0b302
Update as-of-timestamp.md
nolouch Jun 11, 2021
1f5b6ac
Update as-of-timestamp.md
nolouch Jun 11, 2021
bf2d730
Update as-of-timestamp.md
nolouch Jun 11, 2021
baddd7c
Update as-of-timestamp.md
nolouch Jun 11, 2021
efc245a
Update as-of-timestamp.md
nolouch Jun 11, 2021
f352558
Update as-of-timestamp.md
nolouch Jun 11, 2021
9158919
Update read-historical-data.md
TomShawn Jun 11, 2021
21c7f85
Apply suggestions from code review
TomShawn Jun 11, 2021
559162e
address
nolouch Jun 11, 2021
4fd6e55
remove line
nolouch Jun 11, 2021
b842a92
refine
nolouch Jun 11, 2021
c504dd9
update select
nolouch Jun 11, 2021
75605f7
address
nolouch Jun 14, 2021
e769eca
fix the tableRefsClause
nolouch Jun 14, 2021
de996cb
address linter
nolouch Jun 14, 2021
d76fc54
Update as-of-timestamp.md
nolouch Jun 14, 2021
b6f57fb
Update as-of-timestamp.md
nolouch Jun 14, 2021
8e3f8ef
Update as-of-timestamp.md
nolouch Jun 14, 2021
8bfcc03
Update as-of-timestamp.md
nolouch Jun 14, 2021
7655d8e
Update stale-read.md
nolouch Jun 14, 2021
7d57426
address
nolouch Jun 14, 2021
b7e6c94
Update stale-read.md
nolouch Jun 14, 2021
2d9a830
Update stale-read.md
nolouch Jun 14, 2021
556f007
Merge remote-tracking branch 'yisaer/support_stale_read_data' into pr…
nolouch Jun 14, 2021
757242b
address
nolouch Jun 14, 2021
e8405e7
fix
nolouch Jun 14, 2021
8a103c0
add local read
Yisaer Jun 15, 2021
84ddbea
Apply suggestions from code review
Yisaer Jun 15, 2021
6 changes: 5 additions & 1 deletion TOC.md
@@ -62,7 +62,11 @@
+ [BR Backup and Restore Use Cases](/br/backup-and-restore-use-cases.md)
+ [External Storage](/br/backup-and-restore-storages.md)
+ [BR FAQ](/br/backup-and-restore-faq.md)
+ [Read Historical Data](/read-historical-data.md)
+ Read Historical Data
+ Read Historical Data Using Stale Read (Recommended)
+ [Usage Scenarios of Stale Read](/stale-read.md)
+ [Read Historical Data Using the `AS OF TIMESTAMP` Syntax](/as-of-timestamp.md)
+ [Read Historical Data Using the System Variable `tidb_snapshot`](/read-historical-data.md)
+ [Configure Time Zone](/configure-time-zone.md)
+ [Daily Check](/daily-check.md)
+ [Maintain a TiFlash Cluster](/tiflash/maintain-tiflash.md)
259 changes: 259 additions & 0 deletions as-of-timestamp.md
@@ -0,0 +1,259 @@
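---
title: Read Historical Data Using the AS OF TIMESTAMP Syntax
summary: Learn how to read historical data using the AS OF TIMESTAMP syntax.
---

# Read Historical Data Using the AS OF TIMESTAMP Syntax

This document describes how to use the `AS OF TIMESTAMP` clause with the [Stale Read](/stale-read.md) feature to read historical versions of data in TiDB, including specific usage examples and the strategy for retaining historical data.

TiDB supports reading historical data through a standard SQL interface, namely the `AS OF TIMESTAMP` SQL syntax, without requiring a special server or driver. After data is updated or deleted, you can use this SQL interface to read the data as it was before the update or deletion.

> **Note:**
>
> When reading historical data, TiDB returns the data with the table schema that was in effect at that time, even if the current table schema has since changed.

## Syntax

You can use the `AS OF TIMESTAMP` syntax in the following three ways:

- [`SELECT ... FROM ... AS OF TIMESTAMP`](/sql-statements/sql-statement-select.md)
- [`START TRANSACTION READ ONLY AS OF TIMESTAMP`](/sql-statements/sql-statement-start-transaction.md)
- [`SET TRANSACTION READ ONLY AS OF TIMESTAMP`](/sql-statements/sql-statement-set-transaction.md)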
Comment on lines +21 to +22
Collaborator

Can these two links point to the corresponding sections in this document? If the corresponding transaction syntax needs to be shown, could it be provided later in those sections instead?

Member

Links have been added in both places.


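If you want to specify an exact point in time, you can use a datetime value or a time function in `AS OF TIMESTAMP`. The datetime format is "2016-10-08 16:45:26.999". The finest precision is milliseconds, but seconds are usually sufficient, for example "2016-10-08 16:45:26". You can also use the `NOW(3)` function to get the current time with millisecond precision.

If you want to specify a time range, use the `TIDB_BOUNDED_STALENESS()` function. With this function, TiDB chooses a suitable timestamp within the specified time range. This timestamp guarantees that, on the replica being accessed, there is no related transaction that started before the timestamp and has not yet been committed. In other words, the read can be performed on an available replica without being blocked. The usage is `TIDB_BOUNDED_STALENESS(t1, t2)`, where `t1` and `t2` are the two ends of the time range and can be datetime values or time functions. Examples of both forms:

- `AS OF TIMESTAMP '2016-10-08 16:45:26'` reads the latest data as of 16:45:26 on October 8, 2016.
- `AS OF TIMESTAMP NOW() - INTERVAL 10 SECOND` reads the latest data as of 10 seconds ago.
- `AS OF TIMESTAMP TIDB_BOUNDED_STALENESS('2016-10-08 16:45:26', '2016-10-08 16:45:29')` reads data that is as new as possible within the time range from 16:45:26 to 16:45:29 on October 8, 2016.
- `AS OF TIMESTAMP TIDB_BOUNDED_STALENESS(NOW() - INTERVAL 20 SECOND, NOW())` reads data that is as new as possible within the time range from 20 seconds ago to now.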
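
The clause forms above can be placed into complete statements. The following is a minimal sketch, assuming a table named `t` (a placeholder name here) that already existed at every timestamp being read. A literal timestamp that is older than the history the cluster still retains is expected to fail, so the relative forms are usually easier to try out.

```sql
-- Exact point in time, written with millisecond precision.
-- Replace the literal with a time that your cluster still retains.
select * from t as of timestamp '2016-10-08 16:45:26.999';

-- Exact point in time, expressed relative to the current time.
select * from t as of timestamp NOW() - INTERVAL 10 SECOND;

-- Time range: TiDB chooses a suitable timestamp within the last 20 seconds.
select * from t as of timestamp TIDB_BOUNDED_STALENESS(NOW() - INTERVAL 20 SECOND, NOW());
```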

Member

Let's add a note that we recommend stale reads of around 5 seconds (best practice). cc @NingLin-P Is 5 seconds OK?

Contributor

It would be better to test a few more scenarios before deciding.

> **Note:**
>
> Besides specifying a timestamp, the most common way to use the `AS OF TIMESTAMP` syntax is to read data from a few seconds ago. If you use it this way, it is recommended to read historical data that is more than 5 seconds old.

## Usage examples

This section shows several ways to use the `AS OF TIMESTAMP` syntax. It first describes how to prepare the data used in the examples, and then shows how to use `AS OF TIMESTAMP` in `SELECT`, `START TRANSACTION READ ONLY AS OF TIMESTAMP`, and `SET TRANSACTION READ ONLY AS OF TIMESTAMP`.
Contributor

The `SELECT` clause


### Prepare data

To prepare the data, create a table and insert several rows:

```sql
create table t (c int);
```

```
Query OK, 0 rows affected (0.01 sec)
```

```sql
insert into t values (1), (2), (3);
```

```
Query OK, 3 rows affected (0.00 sec)
```

View the data in the table:

```sql
select * from t;
```

```
+------+
| c    |
+------+
|    1 |
|    2 |
|    3 |
+------+
3 rows in set (0.00 sec)
```

View the current time:

```sql
select now();
```

```
+---------------------+
| now()               |
+---------------------+
| 2021-05-26 16:45:26 |
+---------------------+
1 row in set (0.00 sec)
```

Update one row:

```sql
update t set c=22 where c=2;
```

```
Query OK, 1 row affected (0.00 sec)
```

Confirm that the data has been updated:

```sql
select * from t;
```

```
+------+
| c    |
+------+
|    1 |
|   22 |
|    3 |
+------+
3 rows in set (0.00 sec)
```

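### Read historical data using the `SELECT` statement

Use the [`SELECT ... FROM ... AS OF TIMESTAMP`](/sql-statements/sql-statement-select.md) statement to read data as of a historical point in time.

```sql
select * from t as of timestamp '2021-05-26 16:45:26';
```

```
+------+
| c    |
+------+
|    1 |
|    2 |
|    3 |
+------+
3 rows in set (0.00 sec)
```

> **Note:**
>
> When reading multiple tables with a `SELECT` statement, make sure that the TIMESTAMP EXPRESSION is the same for all of them. For example: `select * from t as of timestamp NOW() - INTERVAL 2 SECOND, c as of timestamp NOW() - INTERVAL 2 SECOND;`. In addition, you must specify the `AS OF` information for each relevant table in the `SELECT` statement. If you do not, the `SELECT` statement reads the latest data by default.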
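
The multi-table rule in the note can be sketched as a join in which every table reference carries the same timestamp expression. The table `t2` and its rows are hypothetical and exist only for this illustration. Because a historical read needs the table and its rows to exist at the timestamp being read, the sketch assumes `t2` was created well before the `SELECT` runs.

```sql
-- Hypothetical second table, used only for this illustration.
-- Create it well before running the query below, so that it
-- already exists at the historical timestamp being read.
create table t2 (c int);
insert into t2 values (1), (3);

-- Both table references use the same timestamp expression.
select * from t as of timestamp NOW() - INTERVAL 2 SECOND, t2 as of timestamp NOW() - INTERVAL 2 SECOND where t.c = t2.c;
```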

### Read historical data using the `START TRANSACTION READ ONLY AS OF TIMESTAMP` statement

With the [`START TRANSACTION READ ONLY AS OF TIMESTAMP`](/sql-statements/sql-statement-start-transaction.md) statement, you can start a read-only transaction based on a historical point in time. The transaction reads historical data as of the time you provide.

```sql
start transaction read only as of timestamp '2021-05-26 16:45:26';
```

```
Query OK, 0 rows affected (0.00 sec)
```

```sql
select * from t;
```

```
+------+
| c    |
+------+
|    1 |
|    2 |
|    3 |
+------+
3 rows in set (0.00 sec)
```

```sql
commit;
```

```
Query OK, 0 rows affected (0.00 sec)
```

After the transaction ends, you can read the latest data.

```sql
select * from t;
```

```
+------+
| c    |
+------+
|    1 |
|   22 |
|    3 |
+------+
3 rows in set (0.00 sec)
```

> **Note:**
>
> A transaction started with `START TRANSACTION READ ONLY AS OF TIMESTAMP` is a read-only transaction. If you perform a write operation in this transaction, the transaction rejects the operation.
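
The read-only behavior described in the note can be tried with a sequence like the following sketch. It assumes the table `t` from the data preparation step already existed 10 seconds before the transaction starts. No output is shown for the `UPDATE` because the exact error text is not given in this document and may differ between versions.

```sql
-- Start a read-only transaction that reads data as of 10 seconds ago.
start transaction read only as of timestamp NOW() - INTERVAL 10 SECOND;

-- Reads inside the transaction see the historical data.
select * from t;

-- A write inside the transaction is expected to be rejected,
-- because the transaction is read-only.
update t set c = 4 where c = 3;

-- End the transaction. Later statements read the latest data again.
rollback;
```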

### Read historical data using the `SET TRANSACTION READ ONLY AS OF TIMESTAMP` statement

With the [`SET TRANSACTION READ ONLY AS OF TIMESTAMP`](/sql-statements/sql-statement-set-transaction.md) statement, you can set the next transaction to be a read-only transaction based on a specified historical point in time. The transaction reads historical data as of the time you provide.

```sql
set transaction read only as of timestamp '2021-05-26 16:45:26';
```

```
Query OK, 0 rows affected (0.00 sec)
```

```sql
begin;
```

```
Query OK, 0 rows affected (0.00 sec)
```

```sql
select * from t;
```

```
+------+
| c    |
+------+
|    1 |
|    2 |
|    3 |
+------+
3 rows in set (0.00 sec)
```

```sql
commit;
```

```
Query OK, 0 rows affected (0.00 sec)
```

After the transaction ends, you can read the latest data.

```sql
select * from t;
```

```
+------+
| c    |
+------+
|    1 |
|   22 |
|    3 |
+------+
3 rows in set (0.00 sec)
```

> **Note:**
>
> A transaction started after `SET TRANSACTION READ ONLY AS OF TIMESTAMP` is a read-only transaction. If you perform a write operation in this transaction, the transaction rejects the operation.
8 changes: 5 additions & 3 deletions read-historical-data.md
@@ -1,17 +1,19 @@
---
title: Read Historical Data
title: Read Historical Data Using the System Variable tidb_snapshot
aliases: ['/docs-cn/dev/read-historical-data/','/docs-cn/dev/how-to/get-started/read-historical-data/']
---

# Read Historical Data
# Read Historical Data Using the System Variable tidb_snapshot

This document describes how TiDB reads historical versions of data, including the specific procedure and the strategy for retaining historical data.

## Feature description

TiDB implements reading historical data through a standard SQL interface, without requiring a special client or driver. After data is updated or deleted, you can still use the SQL interface to read the data as it was before the update or deletion.

In addition, even if the table schema changes after the data is updated, TiDB can still read the data using the old table schema.
> **Note:**
>
> When reading historical data, TiDB returns the data using the table schema that was in effect at that time, even if the current table schema has since changed.

## Procedure
8 changes: 7 additions & 1 deletion sql-statements/sql-statement-select.md
@@ -32,7 +32,13 @@ aliases: ['/docs-cn/dev/sql-statements/sql-statement-select/','/docs-cn/dev/refe

**TableRefsClause:**

![TableRefsClause](/media/sqlgram/TableRefsClause.png)
```ebnf+diagram
TableRefsClause ::=
(TableRef (AsOfClause)? ) (( ',' TableRef (AsOfClause)? ))*

AsOfClause ::=
( 'AS' 'OF' 'TIMESTAMP' Expression)
```

**WhereClauseOptional:**

20 changes: 13 additions & 7 deletions sql-statements/sql-statement-set-transaction.md
@@ -10,17 +10,23 @@ aliases: ['/docs-cn/dev/sql-statements/sql-statement-set-transaction/','/docs-cn

## Synopsis

**SetStmt:**
```ebnf+diagram

![SetStmt](/media/sqlgram/SetStmt.png)
SetStmt ::=
'SET' ( VariableAssignmentList |
'PASSWORD' ('FOR' Username)? '=' PasswordOpt |
( 'GLOBAL'| 'SESSION' )? 'TRANSACTION' TransactionChars |
'CONFIG' ( Identifier | stringLit) ConfigItemName EqOrAssignmentEq SetExpr )

**TransactionChar:**
TransactionChars ::=
( 'ISOLATION' 'LEVEL' IsolationLevel | 'READ' 'WRITE' | 'READ' 'ONLY' AsOfClause? )

![TransactionChar](/media/sqlgram/TransactionChar.png)
IsolationLevel ::=
( 'REPEATABLE' 'READ' | 'READ' ( 'COMMITTED' | 'UNCOMMITTED' ) | 'SERIALIZABLE' )

**IsolationLevel:**

![IsolationLevel](/media/sqlgram/IsolationLevel.png)
AsOfClause ::=
( 'AS' 'OF' 'TIMESTAMP' Expression)
```

## Examples
5 changes: 4 additions & 1 deletion sql-statements/sql-statement-start-transaction.md
@@ -17,7 +17,10 @@ aliases: ['/docs-cn/dev/sql-statements/sql-statement-start-transaction/','/docs-
```ebnf+diagram
BeginTransactionStmt ::=
'BEGIN' ( 'PESSIMISTIC' | 'OPTIMISTIC' )?
| 'START' 'TRANSACTION' ( 'READ' ( 'WRITE' | 'ONLY' ( 'WITH' 'TIMESTAMP' 'BOUND' TimestampBound )? ) | 'WITH' 'CONSISTENT' 'SNAPSHOT' | 'WITH' 'CAUSAL' 'CONSISTENCY' 'ONLY' )?
| 'START' 'TRANSACTION' ( 'READ' ( 'WRITE' | 'ONLY' ( ( 'WITH' 'TIMESTAMP' 'BOUND' TimestampBound )? | AsOfClause ) ) | 'WITH' 'CONSISTENT' 'SNAPSHOT' | 'WITH' 'CAUSAL' 'CONSISTENCY' 'ONLY' )?

AsOfClause ::=
( 'AS' 'OF' 'TIMESTAMP' Expression)
```

## Examples
25 changes: 25 additions & 0 deletions stale-read.md
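@@ -0,0 +1,25 @@
---
title: Usage Scenarios of Stale Read
summary: Learn about the Stale Read feature and its usage scenarios.
---

# Usage Scenarios of Stale Read

This document describes the usage scenarios of Stale Read. Because TiDB stores data in multiple versions, Stale Read gives you a mechanism for reading historical data. With Stale Read, you can read historical data as of a specified point in time or within a specified time range, which avoids the latency introduced by data replication.

Internally, with Stale Read, TiDB can read data that is as new as possible for the specified point in time or time range from any replica, while always preserving the data consistency constraints during this process.

## Scenario examples

+ Scenario one: If a transaction involves only read operations and can tolerate some loss of data freshness, you can use Stale Read to read historical data. In exchange for that loss of freshness, TiDB can send the request to any replica of the corresponding data, which gives query execution higher throughput.

+ Scenario two: In some scenarios that query small tables, if strongly consistent reads are used, the data might be concentrated on a single storage node. The query load then concentrates on that node, which becomes the bottleneck of the whole query. With Stale Read, TiDB can distribute requests across every replica of the corresponding data, which raises overall query throughput and can significantly improve query performance.

+ Scenario three: In some deployments that span data centers, if strongly consistent Follower Read is used, TiDB must make sure that the data read is consistent with the data on the Leader. To do so, it sends cross-data-center requests to obtain the ReadIndex for verification, which increases the access latency of the whole query. With Stale Read, at the cost of some data freshness, TiDB can access a replica of the corresponding data in the local data center, which avoids cross-data-center network latency and reduces the overall query access latency.
Contributor

For scenario three, it is enough to read data that is as new as possible from a nearby replica. Is it really necessary to modify the SQL to add "as of timestamp", and to explicitly specify a time as well?

Contributor Author

Whenever you use the Stale Read feature, you need to modify the SQL. This does not depend on the scenario.


## Usages

TiDB provides two ways to use Stale Read: specifying an exact point in time, and specifying a time range:

- Specify an exact point in time: If you need TiDB to read data at a point in time that guarantees global transaction record consistency, without violating the isolation level, specify the timestamp that corresponds to that point in time. To use this method, see the [`AS OF TIMESTAMP` syntax](/as-of-timestamp.md#语法方式) document.
- Specify a time range: If you need TiDB to read data that is as new as possible within a time range, without violating the isolation level, specify a time range. Within that range, TiDB chooses a suitable timestamp. This timestamp guarantees that, on the replica being accessed, there is no related transaction that started before the timestamp and has not yet been committed. In other words, the read can be performed on an available replica without being blocked. To use this method, see the [`AS OF TIMESTAMP` syntax](/as-of-timestamp.md#语法方式) document and the description of the [`TIDB_BOUNDED_STALENESS` function](/as-of-timestamp.md#语法方式) in that document.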
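
The two ways of using Stale Read can be sketched side by side as follows. This is only an illustration under stated assumptions: the table name `t` is a placeholder, the 10-second and 20-second offsets are arbitrary, and the table must already have existed at the timestamps being read.

```sql
-- Exact point in time: every statement in this read-only transaction
-- reads the data as of 10 seconds before the transaction started.
start transaction read only as of timestamp NOW() - INTERVAL 10 SECOND;
select * from t;
commit;

-- Time range: TiDB chooses a suitable timestamp within the last 20 seconds
-- and returns data that is as new as possible within that range.
select * from t as of timestamp TIDB_BOUNDED_STALENESS(NOW() - INTERVAL 20 SECOND, NOW());
```

The exact form fits cases where several statements must observe the same historical moment, while the range form lets TiDB trade a bounded amount of freshness for a read that is not blocked.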