tidb-server does not shutdown gracefully if pd-server and/or tikv-server have gone away #10260

Closed
kolbe opened this issue Apr 24, 2019 · 4 comments
Labels: component/server, severity/moderate, sig/sql-infra, type/bug


@kolbe
Contributor

kolbe commented Apr 24, 2019

Bug Report

Please answer these questions before submitting your issue. Thanks!

  1. What did you do?
    When shutting down an entire cluster, tidb-server does not handle its own shutdown well if pd-server and/or tikv-server have already exited.

  2. What did you expect to see?

tidb-server's signal handler should be able to interrupt the retry loops that try to re-establish communication with pd and tikv.

  3. What did you see instead?

Even after receiving SIGTERM, tidb-server keeps running for more than 60 seconds and writes more than 2000 lines to the log.

[test@localhost tidb-latest-linux-amd64]$ grep -n signal tidb.log
256:[2019/04/24 15:59:35.564 -04:00] [INFO] [signal_posix.go:54] ["got signal to exit"] [signal=terminated]
350:[2019/04/24 16:33:57.701 -04:00] [INFO] [signal_posix.go:54] ["got signal to exit"] [signal=terminated]
[test@localhost tidb-latest-linux-amd64]$ wc -l tidb.log
3021 tidb.log
  4. What version of TiDB are you using (tidb-server -V or run select tidb_version(); on TiDB)?
Release Version: v3.0.0-beta.1-154-gd5afff70c
Git Commit Hash: d5afff70cdd825d5fab125c8e52e686cc5fb9a6e
Git Branch: master
UTC Build Time: 2019-04-24 03:10:00
GoVersion: go version go1.12 linux/amd64
Race Enabled: false
TiKV Min Version: 2.1.0-alpha.1-ff3dd160846b7d1aed9079c389fc188f7f5ea13e
Check Table Before Drop: false
@kolbe
Contributor Author

kolbe commented Apr 24, 2019

I'm mostly interested here in how tidb-server responds to signals (SIGTERM and SIGINT) that cause it to exit.

There are other discussions that might happen about other graceful ways to tell tidb-server to exit (a "SHUTDOWN" SQL statement, for example, or something in tidb-ctl), and I guess those things should also bypass the retry loop(s) I refer to in this issue.

@morgo added the type/bug label on Apr 28, 2019
@ghost

ghost commented Jul 14, 2020

I can reproduce this bug report against master. I tested this using two scenarios:

  • kill -9 pd-server and tikv-server, and then issue a SHUTDOWN command to tidb-server. This brought the server down as expected when I already had an open session to tidb-server. When I didn't have a connection, opening a new one took far too long, apparently because of a timeout while trying to reach pd/tikv, and the command failed:
$ time mysql -e 'shutdown';
ERROR 9001 (HY000) at line 1: PD server timeout

real	1m20.690s
user	0m0.000s
sys	0m0.006s

$ ps aux | grep tidb-server
nullnot+ 1478541  0.6  0.0 2477328 78528 pts/0   Sl   16:37   0:02 ./bin/tidb-server --config=/mnt/evo970/etc/tidb.toml
nullnot+ 1479639  0.0  0.0   9032  2724 pts/7    S+   16:43   0:00 grep --color=auto tidb-server
  • kill -9 pd-server and tikv-server, and then kill tidb-server:
$ killall -9 pd-server tikv-server
$ killall tidb-server
$ killall tidb-server
$ ps aux | grep tidb-server
nullnot+ 1479747  0.0  0.0   9624  3812 pts/0    S+   16:44   0:00 /bin/bash /home/nullnotnil/bin/old-make-full-tidb-server
nullnot+ 1479802  1.5  0.0 2404108 73648 pts/0   Sl+  16:44   0:00 ./bin/tidb-server --config=/mnt/evo970/etc/tidb.toml
nullnot+ 1480077  0.0  0.0   9032   736 pts/7    S+   16:45   0:00 grep --color=auto tidb-server

My TiDB version:

mysql> select tidb_version()\G
*************************** 1. row ***************************
tidb_version(): Release Version: v4.0.0-beta.2-762-g77aecd4b2
Edition: Community
Git Commit Hash: 77aecd4b27e79a97215eb4fdd68f68f2ddf67d21
Git Branch: master
UTC Build Time: 2020-07-13 01:43:31
GoVersion: go1.13
Race Enabled: false
TiKV Min Version: v3.0.0-60965b006877ca7234adaced7890d7b029ed1306
Check Table Before Drop: false
1 row in set (0.00 sec)

@xiongjiwei
Contributor

dup of #18336

@github-actions

github-actions bot commented Dec 7, 2021

Please check whether the issue should be labeled with 'affects-x.y' or 'fixes-x.y.z', and then remove 'needs-more-info' label.
