[docdb] TS crash due to 2 tablets in cache with the same partition_key_start #3669

Open
OlegLoginov opened this issue Feb 17, 2020 · 7 comments
Labels: area/docdb (YugabyteDB core features), kind/bug (This issue is a bug), priority/medium (Medium priority issue)


OlegLoginov commented Feb 17, 2020

Jira Link: DB-1920
TS crashed in MetaCache::ProcessTabletLocations() (meta_cache.cc:722) with the fatal log:

F1228 18:29:00.744967 25713 meta_cache.cc:722] Check failed: tablets_by_key.emplace(partition.partition_key_start(), remote).second

The issue can affect both CQL and SQL, because the failing MetaCache is used by the yb::client code to get tablet location info for every tablet read/write operation (in AsyncRpc via TabletInvoker via YBClient::LookupTabletById()).

OlegLoginov self-assigned this Feb 17, 2020
OlegLoginov commented:

Log:

W1228 18:27:11.434542  7247 remote_bootstrap_service.cc:117] Error handling RemoteBootstrapService RPC request from 10.20.67.46:60189: Not found (yb/tserver/remote_bootstrap_session.cc:488): Unable to get piece of log segment: Already sent active log segment, don't send 2
W1228 18:27:11.449787 25344 consensus_peers.cc:424] T 30ef75e22b864fdd8bcad80fbf63e768 P b2ac21dfaa474840b5b3e57bc64cea16 -> Peer c9d47b5f21bd46e186d3cb236b014da3 ([host: "10.20.67.46" port: 9101], []): Couldn't send request.  Status: Illegal state (yb/tserver/service_util.h:125): Tablet not RUNNING: BOOTSTRAPPING. Retrying in the next heartbeat period. Already tried 1 times. State: 2
W1228 18:27:13.464529 25347 remote_bootstrap_service.cc:117] Error handling RemoteBootstrapService RPC request from 10.20.67.50:42331: Not found (yb/tserver/remote_bootstrap_session.cc:488): Unable to get piece of log segment: Already sent active log segment, don't send 2
W1228 18:27:15.523027  7404 log_util.cc:211] Could not read footer for segment: /usr/scratch/yugabyte/2.0.8.0/data/yb-data/tserver/wals/table-0000400100003000800000000000401d/tablet-5688f5b6aa13400b915482ef28512cd1/wal-000000001: Not found (yb/consensus/log_util.cc:446): Footer not found. Footer magic doesn't match
W1228 18:27:15.523083  7404 log_reader.cc:182] T 5688f5b6aa13400b915482ef28512cd1 P b2ac21dfaa474840b5b3e57bc64cea16: Log segment /usr/scratch/yugabyte/2.0.8.0/data/yb-data/tserver/wals/table-0000400100003000800000000000401d/tablet-5688f5b6aa13400b915482ef28512cd1/wal-000000001 was likely left in-progress after a previous crash. Will try to rebuild footer by scanning data.
W1228 18:27:18.961344  7149 reactor.cc:380] TabletServer_R001: Client call yb.consensus.ConsensusService.RequestConsensusVote has no timeout set for connection id: { remote: 10.20.67.48:9101 idx: 2 protocol: 0x00007fa783506d98 -> tcp }
W1228 18:27:18.961792  7150 reactor.cc:380] TabletServer_R002: Client call yb.consensus.ConsensusService.RequestConsensusVote has no timeout set for connection id: { remote: 10.20.67.50:9101 idx: 0 protocol: 0x00007fa783506d98 -> tcp }
W1228 18:27:18.961882  7150 reactor.cc:380] TabletServer_R002: Client call yb.consensus.ConsensusService.RequestConsensusVote has no timeout set for connection id: { remote: 10.20.67.49:9101 idx: 6 protocol: 0x00007fa783506d98 -> tcp }
W1228 18:28:33.910887  7250 cql_processor.cc:263] Unsupported driver option DRIVER_VERSION = 3.19.0
F1228 18:29:00.744967 25713 meta_cache.cc:722] Check failed: tablets_by_key.emplace(partition.partition_key_start(), remote).second


OlegLoginov commented Feb 17, 2020

The failing yb::client code (which runs inside the TServer, in the CQL/SQL service, to prepare and process RPC requests from the TS to the Master):

// Manager of RemoteTablets and RemoteTabletServers. The client consults
// this class to look up a given tablet or server.
//
// This class will also be responsible for cache eviction policies, etc.
class MetaCache : public RefCountedThreadSafe<MetaCache> {
...
    std::unordered_map<PartitionKey, RemoteTabletPtr> tablets_by_partition;
RemoteTabletPtr MetaCache::ProcessTabletLocations(
    const google::protobuf::RepeatedPtrField<master::TabletLocationsPB>& locations,
    const std::string* partition_group_start) {
  VLOG(2) << "Processing master response " << ToString(locations);
...
    std::lock_guard<decltype(mutex_)> l(mutex_);
    for (const TabletLocationsPB& loc : locations) {
      for (const std::string& table_id : loc.table_ids()) {
        auto& table_data = tables_[table_id];
        auto& tablets_by_key = table_data.tablets_by_partition;  // <<< the per-table map keyed by partition_key_start
...
        // Next, update the tablet caches.
        const std::string& tablet_id = loc.tablet_id();
        RemoteTabletPtr remote = FindPtrOrNull(tablets_by_id_, tablet_id);
        if (remote) {  // <<< remote == NULL here: 'tablet_id' was NOT found in tablets_by_id_
          // Partition should not have changed.
          DCHECK_EQ(loc.partition().partition_key_start(),
                    remote->partition().partition_key_start());
          DCHECK_EQ(loc.partition().partition_key_end(),
                    remote->partition().partition_key_end());
          VLOG(3) << "Refreshing tablet " << tablet_id << ": " << loc.ShortDebugString();
        } else {
          VLOG(3) << "Caching tablet " << tablet_id << ": " << loc.ShortDebugString();
          Partition partition;
          Partition::FromPB(loc.partition(), &partition);
          remote = new RemoteTablet(tablet_id, partition);
          CHECK(tablets_by_id_.emplace(tablet_id, remote).second);
          // <<< OK: the tablet was successfully added by ID.
          CHECK(tablets_by_key.emplace(partition.partition_key_start(), remote).second);
          // <<< FALSE: emplace() did not insert the pair because the map already
          //     contains this key (partition.partition_key_start()), so the CHECK fails.
        }

The TS gets partition info for all tablets from the Master: per-table_id, per-tablet_id partition info.
The failed tablet is new on this TS (due to the RemoteBootstrap seen above in the log), so it was not found in the by-id map.
The tablet was then successfully added into the by-id map.
Next, the tablet must be added to the second, per-table map keyed by 'partition_key_start', but that insert failed. So there is already another tablet in this table with the same 'partition_key_start'. That is an unexpected case: it means 2 tablets (for the same table) with the same 'partition_key_start'.
I cannot say right now how this can happen (2 tablets with the same start partition key). Maybe the first tablet was deleted and a new one was added in its place, but the meta cache on the TS was not updated in time to remove the first tablet. In any case it needs more investigation, and a reproducer would help very much.
My guess is that these tablet manipulations are related to, or originated by, the RemoteBootstrap service, possibly in the scenario where RemoteBootstrap failed (as seen in the log). The sketch below gives a minimal illustration of the two-map cache and the failing insert.
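
For illustration, here is a minimal, self-contained sketch (not YugabyteDB code; FakeRemoteTablet, FakeTableData and the key values are invented stand-ins) of the two-map bookkeeping above and of the condition that trips the CHECK: if a stale tablet covering the same partition_key_start is still cached for the table, the second emplace() returns false.

#include <cassert>
#include <iostream>
#include <map>
#include <memory>
#include <string>
#include <unordered_map>

// Simplified stand-in for RemoteTablet: just an id and the start of its partition range.
struct FakeRemoteTablet {
  std::string tablet_id;
  std::string partition_key_start;
};
using FakeRemoteTabletPtr = std::shared_ptr<FakeRemoteTablet>;

struct FakeTableData {
  // Per-table map keyed by partition_key_start (mirrors tablets_by_partition).
  std::map<std::string, FakeRemoteTabletPtr> tablets_by_partition;
};

int main() {
  std::unordered_map<std::string, FakeRemoteTabletPtr> tablets_by_id;  // global map, by tablet id
  std::unordered_map<std::string, FakeTableData> tables;               // per-table data

  // A stale tablet for table "t1" covering partition start "0x0010" is still cached.
  auto stale = std::make_shared<FakeRemoteTablet>(FakeRemoteTablet{"tablet-OLD", "0x0010"});
  tablets_by_id.emplace(stale->tablet_id, stale);
  tables["t1"].tablets_by_partition.emplace(stale->partition_key_start, stale);

  // A new tablet (e.g. created to replace the old one) arrives from the Master
  // with a different tablet id but the same partition_key_start.
  auto fresh = std::make_shared<FakeRemoteTablet>(FakeRemoteTablet{"tablet-NEW", "0x0010"});

  // Insert by id succeeds: the id is new on this TS.
  const bool inserted_by_id = tablets_by_id.emplace(fresh->tablet_id, fresh).second;
  std::cout << "inserted by id: " << inserted_by_id << std::endl;    // prints 1

  // Insert by partition_key_start fails: the stale entry still owns that key.
  const bool inserted_by_key =
      tables["t1"].tablets_by_partition.emplace(fresh->partition_key_start, fresh).second;
  std::cout << "inserted by key: " << inserted_by_key << std::endl;  // prints 0

  // In the real code this result feeds a CHECK(...), so 'false' aborts the process.
  assert(inserted_by_key && "two tablets with the same partition_key_start");
  return 0;
}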

OlegLoginov commented:

A few other log warnings (which may or may not be related to the issue):

  1. Bootstrapping tablet in the wrong (not RUNNING) state:
W1228 18:27:11.449787 25344 consensus_peers.cc:424] T 30ef75e22b864fdd8bcad80fbf63e768 P b2ac21dfaa474840b5b3e57bc64cea16 -> Peer c9d47b5f21bd46e186d3cb236b014da3 ([host: "10.20.67.46" port: 9101], []): Couldn't send request.  Status: Illegal state (yb/tserver/service_util.h:125): Tablet not RUNNING: BOOTSTRAPPING. Retrying in the next heartbeat period. Already tried 1 times. State: 2
W1228 18:27:13.464529 25347 remote_bootstrap_service.cc:117] Error handling RemoteBootstrapService RPC request from 10.20.67.50:42331: Not found (yb/tserver/remote_bootstrap_session.cc:488): Unable to get piece of log segment: Already sent active log segment, don't send 2
  2. Broken WAL log, probably due to a previous TS crash (we can find other crash logs on the disk):
W1228 18:27:15.523027  7404 log_util.cc:211] Could not read footer for segment: /usr/scratch/yugabyte/2.0.8.0/data/yb-data/tserver/wals/table-0000400100003000800000000000401d/tablet-5688f5b6aa13400b915482ef28512cd1/wal-000000001: Not found (yb/consensus/log_util.cc:446): Footer not found. Footer magic doesn't match
W1228 18:27:15.523083  7404 log_reader.cc:182] T 5688f5b6aa13400b915482ef28512cd1 P b2ac21dfaa474840b5b3e57bc64cea16: Log segment /usr/scratch/yugabyte/2.0.8.0/data/yb-data/tserver/wals/table-0000400100003000800000000000401d/tablet-5688f5b6aa13400b915482ef28512cd1/wal-000000001 was likely left in-progress after a previous crash. Will try to rebuild footer by scanning data.
  3. Unexpected driver version:
W1228 18:28:33.910887  7250 cql_processor.cc:263] Unsupported driver option DRIVER_VERSION = 3.19.0


OlegLoginov commented Feb 17, 2020

Note: the default Cassandra driver (not the YB driver) was used.

Related discussion in Slack: https://yugabyte-db.slack.com/archives/GRY7P7LTG/p1581602620108600

OlegLoginov commented:

Additional logs:

I0212 22:37:54.056057 32508 memtable_list.cc:374] T 5802b7fa1aa94df79be06d6cd88e9ab9 P 0f8027232bba4785903285de35749b04 [R]: [default] Level-0 commit table #27 started
I0212 22:37:54.056080 32508 memtable_list.cc:390] T 5802b7fa1aa94df79be06d6cd88e9ab9 P 0f8027232bba4785903285de35749b04 [R]: [default] Level-0 commit table #27: memtable #1 done
I0212 22:37:54.056084 32508 event_logger.cc:77] T 5802b7fa1aa94df79be06d6cd88e9ab9 P 0f8027232bba4785903285de35749b04 [R]: EVENT_LOG_v1 {"time_micros": 1581547074055590, "job": 18, "event": "flush_finished", "lsm_state": [3]}
I0212 22:37:54.196646 31959 log.cc:861] T 5802b7fa1aa94df79be06d6cd88e9ab9 P 0f8027232bba4785903285de35749b04: Running Log GC on /usr/scratch/yugabyte/persistent_data/tserver/yb-data/tserver/wals/table-a0a03ddea7ea4120a589bacd62221af8/tablet-5802b7fa1aa94df79be06d6cd88e9ab9: retaining ops >= 157786, log segment size = 67108864
I0212 22:37:54.196723 31959 log_reader.cc:521] T 5802b7fa1aa94df79be06d6cd88e9ab9 P 0f8027232bba4785903285de35749b04: Removed log segment sequence numbers from log reader: [28]
I0212 22:37:54.196746 31959 log.cc:886] T 5802b7fa1aa94df79be06d6cd88e9ab9 P 0f8027232bba4785903285de35749b04: Deleting log segment in path: /usr/scratch/yugabyte/persistent_data/tserver/yb-data/tserver/wals/table-a0a03ddea7ea4120a589bacd62221af8/tablet-5802b7fa1aa94df79be06d6cd88e9ab9/wal-000000028 (GCed ops < 157786)
I0212 22:37:54.208775 31959 maintenance_manager.cc:373] Time spent running LogGCOp(5802b7fa1aa94df79be06d6cd88e9ab9): real 0.012s	user 0.000s	sys 0.011s
I0212 22:49:59.318825 13275 log.cc:561] T 18de408a722745f891ee72a816531bd6 P 0f8027232bba4785903285de35749b04: Max segment size 67108864 reached. Starting new segment allocation. 
I0212 22:49:59.324448 13275 log.cc:469] T 18de408a722745f891ee72a816531bd6 P 0f8027232bba4785903285de35749b04: Rolled over to a new segment: /usr/scratch/yugabyte/persistent_data/tserver/yb-data/tserver/wals/table-a0a03ddea7ea4120a589bacd62221af8/tablet-18de408a722745f891ee72a816531bd6/wal-000000083
W0212 22:50:10.751219 11541 cql_processor.cc:263] Unsupported driver option DRIVER_VERSION = 3.19.0
F0212 22:50:12.256759 11926 meta_cache.cc:722] Check failed: tablets_by_key.emplace(partition.partition_key_start(), remote).second
I0212 22:35:53.869737 20780 call_home.cc:392] Done with collector RpcsCollector
I0212 22:35:53.874092 20780 call_home.cc:392] Done with collector GFlagsCollector
I0212 22:36:24.353956 20780 call_home.cc:436] Unable to send diagnostics data to http://diagnostics.yugabyte.com
W0212 22:50:10.778905 22453 cql_processor.cc:263] Unsupported driver option DRIVER_VERSION = 3.19.0
F0212 22:50:13.353415 20892 meta_cache.cc:722] Check failed: tablets_by_key.emplace(partition.partition_key_start(), remote).second 
I0212 22:51:21.102425 16657 server_base.cc:484] Using private ip address 10.20.67.45
I0212 22:51:21.102555 16657 ts_tablet_manager.cc:1070] Loading metadata for tablet 74501e39774d4024acc2dc40a210fca1
I0212 22:51:21.105792 16657 ts_tablet_manager.cc:1611] T 74501e39774d4024acc2dc40a210fca1 P 0f8027232bba4785903285de35749b04: Tablet Manager startup: Rolling forward tablet deletion of type TABLET_DATA_DELETED
I0212 22:51:21.105820 16657 ts_tablet_manager.cc:1854] T 74501e39774d4024acc2dc40a210fca1 P 0f8027232bba4785903285de35749b04: Deleting tablet data with delete state TABLET_DATA_DELETED
I0212 22:51:21.105932 16657 docdb_rocksdb_util.cc:414] Auto setting FLAGS_rocksdb_max_background_flushes to 3
I0212 22:51:21.105960 16657 docdb_rocksdb_util.cc:434] Auto setting FLAGS_rocksdb_max_background_compactions to 3
I0212 22:51:21.105965 16657 docdb_rocksdb_util.cc:443] Auto setting FLAGS_rocksdb_base_background_compactions to 3
I0212 22:51:21.105971 16657 docdb_rocksdb_util.cc:454] Auto setting FLAGS_priority_thread_pool_size to 4
I0212 22:51:21.106109 16657 tablet_metadata.cc:391] Destroying regular db at: /usr/scratch/yugabyte/persistent_data/tserver/yb-data/tserver/data/rocksdb/table-00004001000030008000000000004007/tablet-74501e39774d4024acc2dc40a210fca1
E0212 22:51:21.107319 16657 tablet_metadata.cc:395] Failed to destroy regular DB at: /usr/scratch/yugabyte/persistent_data/tserver/yb-data/tserver/data/rocksdb/table-00004001000030008000000000004007/tablet-74501e39774d4024acc2dc40a210fca1: IO error (yb/rocksdb/util/env_posix.cc:326): /usr/scratch/yugabyte/persistent_data/tserver/yb-data/tserver/data/rocksdb/table-00004001000030008000000000004007/tablet-74501e39774d4024acc2dc40a210fca1/LOCK: No such file or directory
I0212 22:51:21.122887 16657 ts_tablet_manager.cc:1864] T 74501e39774d4024acc2dc40a210fca1 P 0f8027232bba4785903285de35749b04: Tablet deleted. Last logged OpId: { term: 1 index: 4306 }
I0212 22:51:21.123950 16657 ts_tablet_manager.cc:1620] T 74501e39774d4024acc2dc40a210fca1 P 0f8027232bba4785903285de35749b04: Deleting tablet superblock


OlegLoginov commented Feb 24, 2020

The failure can happen when a new table is created.

WORKAROUND: Set the Master flag tablet_creation_timeout_ms=120000 (see the example below).
(This is the timeout for tablet creation. By default it is 30000 ms (30 sec); try increasing the value to a few minutes.)
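
For example, assuming yb-master is started directly with gflags (other start-up flags elided):

yb-master <existing flags> --tablet_creation_timeout_ms=120000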


OlegLoginov commented Feb 24, 2020

ROOT CAUSE: It's possible to get a race on the tablet state during CreateTable, because the state change CREATING->RUNNING happens in one thread, while the state check (if the tablet is still in CREATING after the timeout, it must be replaced) happens in the CatalogManager background-tasks thread. The crash can occur with the following ordering:

  1. Master THREAD1: started tablet creation (CREATING state).
  2. TS: tablet creation takes a bit longer than the timeout (30 sec by default).
  3. Master THREAD2: after the timeout, the tablet was selected for replacement (observed in CREATING state under a shared READ lock) and replacement was started.
  4. Master THREAD1: got the response from the TS; the tablet state was changed to RUNNING (under an exclusive WRITE lock).
  5. Master (RPC): selected all tablets for the table (there is a check that each must be RUNNING); the client-side MetaCache was updated with these tablets:
    Status CatalogManager::BuildLocationsForTablet(const scoped_refptr<TabletInfo>& tablet,
...
    if (PREDICT_FALSE(!l_tablet->data().is_running())) {
      return STATUS(ServiceUnavailable, "Tablet not running");
    }
  6. Master: replaced the tablet.
  7. Master (new RPC): selected all tablets for the table; the client-side MetaCache was updated with the new tablets - CRASH.

So, the race between steps [3] and [4] can cause such a crash in the MetaCache (a minimal sketch of the pattern follows below).
However, the probability of this scenario is quite small, because the tablet creation time on the TS must be very close to the timeout (30 sec by default).
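
For illustration only, here is a minimal, hypothetical sketch (not YugabyteDB code; TabletState, FakeTablet, background_replace_if_stuck and on_create_response are invented names) of the check-then-act pattern described above: the replacement decision is made under a shared read lock, and the CREATING->RUNNING transition can land between that decision and the replacement action.

#include <iostream>
#include <shared_mutex>
#include <string>
#include <thread>

enum class TabletState { kCreating, kRunning, kReplaced };

// Hypothetical stand-in for a catalog tablet entry protected by a read/write lock.
struct FakeTablet {
  std::shared_mutex mutex;
  TabletState state = TabletState::kCreating;
  std::string id = "tablet-1";
};

// Background task (THREAD2): if the tablet is still CREATING after the timeout,
// decide to replace it. The decision is made under a shared (read) lock only.
void background_replace_if_stuck(FakeTablet& t) {
  bool should_replace = false;
  {
    std::shared_lock<std::shared_mutex> read_lock(t.mutex);
    should_replace = (t.state == TabletState::kCreating);  // check
  }
  // ...window: the RPC thread can flip the state to RUNNING right here...
  if (should_replace) {
    std::unique_lock<std::shared_mutex> write_lock(t.mutex);
    t.state = TabletState::kReplaced;  // act on a possibly stale observation
    std::cout << t.id << " scheduled for replacement\n";
  }
}

// RPC response handler (THREAD1): the TS reports that the tablet was created.
void on_create_response(FakeTablet& t) {
  std::unique_lock<std::shared_mutex> write_lock(t.mutex);
  if (t.state == TabletState::kCreating) {
    t.state = TabletState::kRunning;
    std::cout << t.id << " is now RUNNING\n";
  }
}

int main() {
  FakeTablet tablet;
  // If on_create_response lands inside the window above, the replacement still goes
  // ahead even though the tablet just became RUNNING. In the real system that is the
  // interleaving which leaves the table with two tablets covering the same partition
  // range - the situation the MetaCache CHECK later trips over.
  std::thread t2(background_replace_if_stuck, std::ref(tablet));
  std::thread t1(on_create_response, std::ref(tablet));
  t2.join();
  t1.join();
  return 0;
}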

bmatican added the labels area/docdb (YugabyteDB core features), kind/bug (This issue is a bug), and priority/high (High Priority) on Feb 24, 2020
bmatican changed the title from "TS crash due to 2 tablets in cache with the same partition_key_start." to "[docdb] TS crash due to 2 tablets in cache with the same partition_key_start" on Oct 10, 2020
bmatican removed the priority/high (High Priority) label on Oct 10, 2020
yugabyte-ci added the priority/medium (Medium priority issue) label on Jun 9, 2022