Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: Add proposal for instance scoped variables #30558

Merged
merged 19 commits into from
Jan 19, 2022

Conversation

morgo
Copy link
Contributor

@morgo morgo commented Dec 8, 2021

What problem does this PR solve?

Issue Number: close #30557

Problem Summary:

This is a proposal for instance scoped variables and later unified configuration names if you want to call it that.

What is changed and how it works?

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

None

@morgo morgo requested a review from a team as a code owner December 8, 2021 23:53
@ti-chi-bot
Copy link
Member

ti-chi-bot commented Dec 8, 2021

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • breeswish
  • dveeden

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Dec 8, 2021

### Stage 2: Mapping All Configuration File Settings to System Variables

The second stage is to map all configuration file settings to system variable names. In MySQL any configuration file setting is also available as a system variable (even if it is read only), which makes it possible to get the configuration of the cluster with `SHOW GLOBAL VARIABLES`. Once this step is complete, it should also be possible to offer an `[instance]` section of the configuration file which accepts the system variable names (and finally consistent naming between the two systems). We can then modify the example configuration files to be based on the flat `[instance]` configuration:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean that if I change the value of an system variable, the value in the configuration file will be changed as well ? or those var from config file is readonly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default will be that we map variables as read-only, so we don't need to think about the specific cases (i.e. you can't easily change the socket or port). This helps us achieve completeness first, and then we can then evaluate which ones can be made dynamic.

When you modify an instance scoped system variable, it modifies in memory only (similar to SET GLOBAL in MySQL). This does not propose modifying or rewriting the config file (that would make it lose formatting, and it gets complex quickly).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added this clarification to the proposal, so I'm going to resolve this conversation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Why do we offer system variables in the configuration file? To allow users to set the initial values through configuration files? Or to be compatible with MySQL?
  • Initial values of some system variables are already configured in the configuration file, so they need to be deduplicated from showing global variables or the configuration file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we offer system variables in the configuration file? To allow users to set the initial values through configuration files? Or to be compatible with MySQL?

The goal is to (1) have system variables show the superset of all configuration and (2) have unified names between configuration. These do technically improve MySQL compatibility, but I would say these are also good practices.

In the case of instance variables, they do not persist on restart so a configuration file is required to handle this case. In cockroachdb they don't have this though: node (their term for instance) is only supported as startup parameters, everything else is a cluster wide variable. But I think this is also an incompatible change.. by making every config setting available through systerm variables and settable in the config file using the system variable name (under [instance] heading) gives us a better transition state to unification.

Initial values of some system variables are already configured in the configuration file, so they need to be deduplicated from showing global variables or the configuration file.

Because instance values do not persist, this behavior is straight forward. The configuration file values will be taken, and if not specified, compiled server defaults. This is also technically a MySQL compatible way for GLOBAL variables.

@easonn7
Copy link

easonn7 commented Dec 10, 2021

already discussed with Morgan,LGTM

From a user-oriented point of view, setting an instance scoped sysvar is the same as setting a `GLOBAL` variable:

```sql
SET GLOBAL max_connections=1234;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this confuse the customer that SET GLOBAL is used by instance and global sysvars? How could they know if a var is instance or global level?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you will need to read the docs to see if a variable is INSTANCE or GLOBAL scoped. This confusion is existing with the session scope, but actually worse since it's not handled by the framework the docs for many INSTANCE variables say they are SESSION scoped (we will finally be able to auto generate docs and say something is "INSTANCE" scoped versus a manual workaround: https://github.com/pingcap/docs/blob/8600643f2fcd20d70d8ad8f637009995948c48c6/scripts/generate-system-variables.go#L142-L152 ).

The semantics for INSTANCE are also much closer to global than session, with only two exceptions:

  • Changes don't propagate to other tidb servers.
  • Changes don't persist.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added a section on "Documentation Changes", PTAL

This introduces a compatibility issue, since users who have previously configured instance scope with `SET [SESSION] tidb_general_log = x` will receive an error because the scope is wrong. This can be fixed by adding a legacy mode to the sysvar framework which allows `INSTANCE` scoped variables to be set with either `SET GLOBAL` or `SET SESSION`:

```sql
SET GLOBAL tidb_enable_legacy_instance_scope = 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking together with instance semantic: https://github.com/pingcap/tidb/pull/30558/files#diff-af084f8d970428eecaaeb42d9b43dcd9e0155602726dec822bda47e0071a0896R51

I guess when tidb_enable_legacy_instance_scope is enabled, the user may face instance-level var with both 'SESSION' and 'GLOBAL' keywords:

SET SESSION tidb_general_log = x

SET GLOBAL max_connections = x

Both of the two SETs are actually working in 'instance' scope. Am I right? if so I think this can be confusing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's right. We should add a warning for the SET SESSION version. I agree it is confusing, which is why I suggest we consider later changing the default.

Unfortunately it is required to keep compatibility. But it is important to transition INSTANCE to using the GLOBAL keyword, because SESSION semantics are different.

docs/design/2021-12-08-instance-scope.md Show resolved Hide resolved
docs/design/2021-12-08-instance-scope.md Show resolved Hide resolved

### Stage 2: Mapping All Configuration File Settings to System Variables

The second stage is to map all configuration file settings to system variable names. In MySQL any configuration file setting is also available as a system variable (even if it is read only), which makes it possible to get the configuration of the cluster with `SHOW GLOBAL VARIABLES`. Once this step is complete, it should also be possible to offer an `[instance]` section of the configuration file which accepts the system variable names (and finally consistent naming between the two systems). We can then modify the example configuration files to be based on the flat `[instance]` configuration:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Why do we offer system variables in the configuration file? To allow users to set the initial values through configuration files? Or to be compatible with MySQL?
  • Initial values of some system variables are already configured in the configuration file, so they need to be deduplicated from showing global variables or the configuration file.

docs/design/2021-12-08-instance-scope.md Outdated Show resolved Hide resolved

## Unresolved Questions

There are **very few** known scenarios where a system variable is *currently both* instance and global scoped, but these individually would need to be handled:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't get it: why do we replace global with instance instead of replacing session with instance? To users, they are both hard to understand and you need to explain it in the user docs anyway.
Are there any cases where a system variable is both session and instance scoped? I don't think so.

Copy link
Contributor Author

@morgo morgo Dec 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's correct, there are no cases where a variable is both session and instance.

Session by semantics is supposed to be thread-local-storage. When the initial connection is created to MySQL it will copy all the values of global variables that also have session scope, and any further changes to global will not affect those session changes. It should only be changes that you make in your connection that change the value.

What I am calling INSTANCE is actually the identical semantics to MySQL GLOBAL. They are much closer in behavior: the only difference is you will need to check the docs to see if it is an INSTANCE-GLOBAL or a GLOBAL-GLOBAL if you want to know if your change (1) propagates to the cluster and (2) persists.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is listed as motivation #3 in "Motivation and Background".

Copy link
Contributor

@djshow832 djshow832 Dec 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know the semantics between session and instance is different, but the semantics between instance and global is also different. From users' view:

  • If you set a global system variable and then reboot TiDB, the setting should still take effect, but it actually does not.
  • If you set a global system variable and then connect to TiDB from another session, the setting should take effect, but it may not.

So I don't think instance is closer to session.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you set a global system variable and then connect to TiDB from another session, the setting should take effect, but it may not.

This is not the case since ~TiDB 5.2. On a single-instance basis if you set a GLOBAL system variable and then connect from another session it will take effect unless the scope is SESSION + GLOBAL, and the other session pre-originated the setting. On a multi-instance basis there is an etcd notification sent immediately, but there is a race where you could beat it.

So I don't think instance is closer to session.

I was arguing it is closer to MySQL GLOBAL, or rather it is identical. TiDB GLOBAL does not exist in MySQL.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added a section on "Documentation Changes", PTAL

- The configuration file (in `.toml` format) ([Docs](https://docs.pingcap.com/tidb/stable/tidb-configuration-file))
- Using MySQL compatible system variables (`SET GLOBAL sysvar=x`, `SET SESSION sysvar=x`) ([Docs](https://docs.pingcap.com/tidb/stable/system-variables))

Both the **semantics** and **naming conventions** differ between these two methods:
Copy link
Member

@breezewish breezewish Dec 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you supply a real user scenario that would require this non-persisted instance scoped variables? From my perspective, setting a variable in instance level but not persisting it will lead to confusion in many cases. For example, if I really would like to enable a feature for only one instance, I would be very surprised that the change is reverted at some point, cause TiDB is designed to be stateless and frequent restart is allowed to happen.

More over, considering that in many cases user is using TiDB behind a LB, and there may be even no way to access individual instances (for example, in TiDB Cloud), would it be better to allow remotely setting the instance variables?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is almost more a question of what should be a candidate for instance scope. Eason's opinion is that we should always default to global scope for management simplicity, and actually few things should be instance. I think CRDB gets this right in that the node specific options almost always refer to local resources (max memory, tmpdir path, log file path, port, socket). We probably get a few things wrong currently, but are mostly on the right path.

If we look from these examples: you might need to shuffle around the tmpdir on a particular TiDB server instance to keep it up and running (and might not want to restart it). So you would have to log into the TiDB instance to make this change, but it would not affect users running through a LB.

As to the semantic of "reverted", if we were to persist the change, it is not really any less confusing because there is still a precedence problem of config file vs. saved in TiDB (i.e. "I set this option in the config file but it's not working!"). The semantic is also how MySQL works (although there is a newer feature called SET PERSIST), TiDB is special in that it persists all global config.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added a section on "Documentation Changes", PTAL. I think this is easier to explain

Copy link
Member

@breezewish breezewish Dec 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the supply! Does this mean that, we will have a clear rule that for these settings, an INSTANCE variable can be used to temporarily updating the config (and taking effect immediately), while user should use config file for persistently updating the config (but taking effect after a reboot)?

I do like CRDB's config mechanism a lot. There aren't a lot of ways like TiDB to configure. They only have two, which is very easy for user to learn. User will not be confused about persistence or things like that:

a. CLI Args - Instance level, restart to take effect, persisted
b. Variable - Cluster level, take effect immediately, persisted

Notice that under CRDB's model it is not possible to adjust some instance level things and take effect without a restart.

Copy link
Contributor Author

@morgo morgo Dec 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notice that under CRDB's model it is not possible to adjust some instance level things and take effect without a restart.

Yes exactly. The other limitation is explaining why 2 nodes are behaving differently: in MySQL SHOW GLOBAL VARIABLES captures all configuration, but since memory in CRDB is a node-level option, it wouldn't be captured, but it is in this proposal since everything has a sysvar mapping (step 2).

Edit: even if TiDB only has 2 sources currently, the split between them is not intuitive to users. We will benefit from a source of truth.

Does this mean that, we will have a clear rule that for these settings, an INSTANCE variable can be used to temporarily updating the config (and taking effect immediately), while user should use config file for persistently updating the config (but taking effect after a reboot)?

Yes, I propose in documentation we describe the semantic as "Persists to Cluster: Bool" (instead of INSTANCE vs. GLOBAL). Persists to Cluster: No is identical to how MySQL global variables, so it is easy to explain. We do not need to use the term Instance scope vs. global scope for users - since they are both set with SET GLOBAL.

Copy link
Contributor

@dveeden dveeden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to be able to modify the instance scope setting for other nodes? Like set config "127.0.0.1:20180" `split.qps-threshold`=1000

docs/design/2021-12-08-instance-scope.md Outdated Show resolved Hide resolved
docs/design/2021-12-08-instance-scope.md Show resolved Hide resolved
@ti-chi-bot ti-chi-bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Dec 10, 2021
@morgo
Copy link
Contributor Author

morgo commented Dec 10, 2021

Would it make sense to be able to modify the instance scope setting for other nodes? Like set config "127.0.0.1:20180" `split.qps-threshold`=1000

This proposes that the sysvar name becomes the source of truth, so it would be something like SET GLOBAL [wayofspecifyingremoteinstance] tidb_split_qps_threshold = 1000. There is nothing stopping us from adding it in future, but I am going to keep it out of the scope of this proposal.

@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Dec 13, 2021

## Documentation Changes

For the purposes of user-communication we don't need to use the terminology `INSTANCE`. We will remove more of the confusion by instead referring to the difference between these as whether it "persists to [the] cluster". This also aligns closer to the syntax that users will be using. Consider the following example changes which are easier to read (the current text also doesn't actually explain that to set an `INSTANCE` variable you must use `SET SESSION`, so it actually communicates a lot more in fewer words):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the purposes of user-communication we don't need to use the terminology INSTANCE.

If so, it's also kind of confusing that [instance] section will be introduced to the configuration file... But for now I don't have a better idea for the naming.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also call it [tidb-server], this would be consistent with the naming of mysql configuration files. The only part I am worried about is the dash (which MySQL treats in a special way). If it's a problem we can change it to [server] or something like that.

Copy link
Member

@breezewish breezewish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally looks good! The binding with "persist + cluster" is smart and greatly simplifies the problems we are facing with.

@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Jan 18, 2022
Copy link
Member

@bb7133 bb7133 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@morgo
Copy link
Contributor Author

morgo commented Jan 19, 2022

/merge

@ti-chi-bot
Copy link
Member

This pull request has been accepted and is ready to merge.

Commit hash: a05e8f2

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Jan 19, 2022
@ti-chi-bot
Copy link
Member

@morgo: Your PR was out of date, I have automatically updated it for you.

At the same time I will also trigger all tests for you:

/run-all-tests

If the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@sre-bot
Copy link
Contributor

sre-bot commented Jan 19, 2022

@morgo
Copy link
Contributor Author

morgo commented Jan 19, 2022

/run-check_dev_2

@ti-chi-bot ti-chi-bot merged commit a4e8ed5 into pingcap:master Jan 19, 2022
@morgo morgo deleted the instance-scope-proposal branch January 19, 2022 05:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-note-none Denotes a PR that doesn't merit a release note. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Simplified proposal for Instance Scoped Sysvars
9 participants