Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create A Command for Interacting With Configurations #11051

Closed
4 tasks
Tracked by #14566
dwedul-figure opened this issue Jan 27, 2022 · 28 comments
Closed
4 tasks
Tracked by #14566

Create A Command for Interacting With Configurations #11051

dwedul-figure opened this issue Jan 27, 2022 · 28 comments
Assignees

Comments

@dwedul-figure
Copy link
Collaborator

dwedul-figure commented Jan 27, 2022

Summary

Create a config command for managing configuration values. Interactions would include get and set as well as a way to easily identify values that are different from their defaults. Another desired function is a way to pack and unpack the configs (to and from a json file containing only configuration entries that are not their default values).

Problem Definition

As an engineer and/or node operator, I want a programmatic way to interact with configuration so that it's easier to write scripts and investigate settings.

The TOML files are great for manually opening and editing settings, but are difficult to update programmatically. Additionally, since environment variables can be involved, it'd be nice to have the chain application tell me what it thinks the config values are.

Furthermore, it would be nice to have a single small config file that only contains non-default values and can be copied and shared.

Proposal

Add a config command with sub-commands:

  • get - Gets the value being used for a configuration entry.
  • set - Sets config values in their appropriate file.
  • changed - Outputs all config values that are different from the defaults.
  • pack - Converts the toml config files into a single small config file with only entries there are different from the default.
  • unpack - Splits the single small config file back into the original toml file formats.
  • home - Outputs the string being used as the home path.

This command will allow management of all three of the configurations: Cosmos (app.toml), Tendermint (config.toml), and Client (client.toml).

This will probably also require some refactoring of the code that loads the configs.


For Admin Use

  • Not duplicate issue
  • Appropriate labels applied
  • Appropriate contributors tagged
  • Contributor assigned/self-assigned
@dwedul-figure
Copy link
Collaborator Author

dwedul-figure commented Jan 27, 2022

The Provenance chain has a config command like this already, and it has been extremely useful for sharing setups and troubleshooting.

@peterbourgon
Copy link

Config files aren't tiny databases that can be mutated over time, they're meant to be sources of truth that are read-only for the consumers that rely on them. This would create a ton of ambiguous circumstances that don't have good answers...

  1. What should happen if the user running config set doesn't have write permissions on the config file?
  2. What should happen if a config file packed at version 1 is unpacked at version 2, where defaults have changed?
  3. What should happen to the running process if a config value is changed via set?
  4. What should happen if the config file on disk changes vs. your source of truth, e.g. Ansible, Chef, etc?

etc. etc.

@iramiller
Copy link
Contributor

iramiller commented Feb 1, 2022

  1. What should happen if the user running config set doesn't have write permissions on the config file?

The same thing that should happen if a user doesn't have permission to edit the config file now, permission denied.

  1. What should happen if a config file packed at version 1 is unpacked at version 2, where defaults have changed?

A related issue here is that there is currently no validation of the configuration reference files in the source with the configuration settings setup through code. The above system actually identified a bug in 0.45 where the IAVL caching config property was not properly setup in the default.

In addition this command will output the entire state of the node based on the files in place and the running defaults. Meaning if you are using v1, output your configuration then compare it to the configuration output by v2 which contains new config values that are not present in the config files on disk, the changed configuration will appear in a simple diff.

  1. What should happen to the running process if a config value is changed via set?

The same thing that happens today when a config file is edited while the process is running.

  1. What should happen if the config file on disk changes vs. your source of truth, e.g. Ansible, Chef, etc?

This command doesn't attempt to change the configuration management process that an entity is using. The same resolution a user leverages for files on disk vs Ansible, Chef, etc still applies the same as if the admin edited the file directly.

If you were to combine the applied config output between version one and version two of a binary it can easily highlight if the configuration has silently changed between versions. One of the common problems when relying on a config file based approach is that the executable itself has its own state of config that evolves over time and can easily diverge from the static configuration file across updates. Without this command there is no way to see this differences between versions unless you review a source code diff or hopefully spot something outlining the update in the changelog.

@peterbourgon
Copy link

I think the disconnect here is about the relationship between a config file and the program which reads it. Config files are typically input to or dependencies of programs, basically equivalent to commandline flags or env vars. The program can read them but doesn't have the right to modify them. If the program which consumes a config file can also modify it, then that relationship changes, and ownership becomes ambiguous. In concrete terms, I can no longer trust that changes I make to that config file, as an administrator or provisioning system or whatever else, won't be overwritten without my knowledge. And comments, whitespace, etc. would get mangled or lost, absent a sufficiently intelligent AST model, parser, etc.

This is all hugely problematic! It would break a well-established and well-understood expectation. That's a huge cost. And for sure this capability has value, which you point out. But basically every conceivable capability has value in some circumstance. Having value is a necessary, but insufficient, rationale for adding features. You have to weigh both costs and benefits.

@iramiller
Copy link
Contributor

One example of a program that uses and maintains a config file is kubectl. https://kubernetes.io/docs/reference/kubectl/cheatsheet/

The patterns it supports to manage and also leverage its configuration both from a file and the environment are an excellent example of a powerful tool that is easy to use and understand.

@peterbourgon
Copy link

Yeah! It makes a lot of sense for kubectl to own and manipulate its config file. That's because kubectl is a CLI tool, typically run interactively by users on workstations, "alive" in short bursts. That's categorically different than Cosmos SDK applications, which are typically long-running services managed by supervisors on servers.

@iramiller
Copy link
Contributor

It's also worth pointing out again that this command was fully built and is in active use, it's implementation is not hypothetical and it's source code, behavior and interaction, and ultimately real world user feedback can be evaluated or queried directly if desired.

The reason we offered this is because the cosmos sdk is severely lacking in the configuration department with multiple overlapping configuration files and non obvious default values that come from code (which changes over time) and the environment it runs in. It presents challenges to experienced engineers and new users alike. Even the core dev team gets this wrong sometimes and makes mistakes this tool trivially identifies.

It is obvious that there is some disagreement on the design principal irregardless of the benefits for users. I propose that if one of the foremost systems administration frameworks in the world (kubernetes) has found value in this cli ux design pattern; we could do a lot worse than trying to emulating it.

@iramiller
Copy link
Contributor

That's because kubectl is a CLI tool, typically run interactively by users on workstations, "alive" in short bursts. That's categorically different than Cosmos SDK applications, which are typically long-running services managed by supervisors on servers.

simd q bank ...

simd tx bank ...

Etc... most of the cli on simd has nothing to do with running the node itself. The entire client config file is dedicated to this aspect.

@peterbourgon
Copy link

The entire client config file . . .

What config file(s) are we talking about? The OP makes me believe it's the node's app.toml and config.toml.

@dwedul-figure
Copy link
Collaborator Author

The entire client config file . . .

What config file(s) are we talking about? The OP makes me believe it's the node's app.toml and config.toml.

All three of app.toml, config.toml, and client.toml. I've updated the main issue text to clarify this.

@drohit-cb
Copy link

In the last 1 year that I have played around with cosmos-sdk based protocols as a enterprise grade node operator, I have been wanting this feature for the longest time. We run some internal tests while consuming protocol binaries that require us to generate latest app/config tomls and tweak them for our tests. This is currently done by hand and having a tool to set necessary keys would be extremely useful.

@iramiller
Copy link
Contributor

We have been running this in provenanced for the better part of a year now. Indispensable tool for us. From interaction with new devs (hey can you double check this setting--here is a single line command to see what it is set to while accounting for your current env) to verifying new config options are properly configured (see iavl fast node debacle where disable was chosen instead of enable leading to weird defaults/missing config entry issues).

That doesn't even include the numerous broken config setting problems we encounter while testing sdk updates. After all there are config files that get new features adddd but the existing environment won't have those. With this tool we can rewrite the config with all the new properties while maintaining the existing config overrides.

Pretty much every time someone adds/updates a new config setting in the sdk there is some weird artifact which would be caught if a tool like this was integrated.

@tac0turtle
Copy link
Member

tac0turtle commented Nov 5, 2022

This could be tied into the config fix issue we have an have assigned. Is there a link to your tool @iramiller.

Cc @julienrbrt in this issue cause he is picking up the config fix issue

@julienrbrt
Copy link
Member

julienrbrt commented Nov 5, 2022

This could be tied into the config fix issue we have an have assigned. Is there a link to your tool @iramiller.

Cc @julienrbrt in this issue cause he is picking up the config fix issue

I understand the added value of having a diff view between what is on disk and the default config. However I share @peterbourgon concerns about letting confix mutate config field (which is different than adding the missing fields from this issue: #13661)

What the added value compared to a tool like dasel.

I can start with the simple config migration (#13661) and add subcommand that shows a simple diff between the default config expected and the one on disk.

@iramiller
Copy link
Contributor

Our work is here: https://github.com/provenance-io/provenance/blob/a019c35966c0c6d2e21c00592927d83ea6f8e3d2/cmd/provenanced/cmd/config.go

I encourage the core dev team to explore the use cases and opinions of the various large operation teams in the ecosystem (versus more developer centric use cases).

As an example ... here is my current dev environment consolidated configuration (app, config, and client config files)... I can very quickly verify what is set and share this data for troubleshooting ... or set/manage any of these settings using our provenanced config command.

%%> provenanced config pack
Packed config:
{
  "api.enable": "true",
  "api.swagger": "true",
  "chain-id": "provenance-chain-fsWtlG",
  "consensus.timeout_commit": "4s",
  "minimum-gas-prices": "1905nhash",
  "moniker": "simapp-test",
  "node": "tcp://127.0.0.1:26657",
  "rpc.pprof_laddr": "localhost:6060",
  "tx_index.indexer": "null"
}

and

%%> provenanced config set tx_index.indexer kv
Tendermint Config Updated: (packed)
-------------------------------
tx_index.indexer Was: "null", Is Now: "kv"

Config is packed: /[..]/Provenance/config/packed-conf.json

@SpicyLemon
Copy link
Collaborator

Another thing to point out is that the config get command returns the value that's actually being used.

E.g.

%%> provenanced config get output
Client Config: /[..]/config/client.toml (or env)
-------------------
output="text"

%%> PIO_OUTPUT=json provenanced config get output
Client Config: /[..]/config/client.toml (or env)
-------------------
output="json"

%%> PIO_OUTPUT=json; provenanced config get output
Client Config: /[..]/config/client.toml (or env)
-------------------
output="text"

%%> export PIO_OUTPUT
bash> provenanced config get output
Client Config: /[..]/config/client.toml (or env)
-------------------
output="json"

Note, the [..] above were edited in by me. In the actual command output, it's the full path to the file(s) (i.e. probably doesn't contain "[..]").

@SpicyLemon
Copy link
Collaborator

One feature that I'd like to add too, is a config update command that doesn't take any arguments. All it does is regenerate the config files using the values already defined in them, applied to the current template, filling in new fields with their default values.

@peterbourgon
Copy link

One feature that I'd like to add too, is a config update command that . . . regenerate[s] the config files . . . [and fills] in new fields with their default values.

It's important that default values, as defined in code, aren't persisted to config files. Defaults don't say "this is the ideal value", they say "this is the value we think you should use if you don't give an explicit alternative", and that preference can change over software versions. A config update command as described here fixes unspecified params to their current defaults, which is unexpected behavior for both users and maintainers.

@SpicyLemon
Copy link
Collaborator

One feature that I'd like to add too, is a config update command that . . . regenerate[s] the config files . . . [and fills] in new fields with their default values.

It's important that default values, as defined in code, aren't persisted to config files. Defaults don't say "this is the ideal value", they say "this is the value we think you should use if you don't give an explicit alternative", and that preference can change over software versions. A config update command as described here fixes unspecified params to their current defaults, which is unexpected behavior for both users and maintainers.

The whole purpose of having config files is to let users decide. This command helps by providing users with the information they need to make configuration decisions.

Already, defaults are defined in code and written to a config file, e.g. with the init command. When new fields become available, currently, the only way an existing user might know about them is by digging through the app's/sdk's code. This is not a good solution.

The primary problem this config update command is trying to solve is this: As a node operator that has been through multiple upgrades, I want have an updated config file that contains all possible configuration fields and their explanations so that I can more easily know all the things that I can configure.

It's important that default values, as defined in code, aren't persisted to config files.

I whole-heartedly disagree with this. While it's usually good that this doesn't happen unknowingly, there's nothing wrong with having a command that does exactly that. How do you expect people to properly configure things if they can't know what's configurable?

Defaults don't say "this is the ideal value", they say "this is the value we think you should use if you don't give an explicit alternative"

I fully agree. But it's impossible to provide an explicit alternative if you don't know it's possible to do so. The main purpose of putting things in a config file is because it's impossible for a maintainer to know what the ideal value is for any given user. In order for users to make those decisions, though, they need to know what decisions are possible. There is currently no solution for telling users what's newly configurable (or no longer configurable).

... and that preference can change over software versions.

If the field can exist in a config file, maintainers should NEVER assume that changing its default will have an affect on existing users. The purpose of config files is to let users decide, not maintainers. Maintainers can provide guidance, though, which is most often done by putting the field in the config file with a default and explanation.

A config update command as described here fixes unspecified params to their current defaults, which is unexpected behavior for both users and maintainers.

The entire purpose of the config update command is to get an updated version of the config file(s), filling unspecified params to their current defaults; it's the behavior users expect from it.

Its folly for maintainers to expect that the hard-coded default is what's used for a configurable param, or that changing the default will affect existing users. In Cosmos-SDK specifically, there's a config file template that uses the current defaults to write a config file to disk for users to manually update. As such, maintainers should expect that this behavior happens.


A config update command will almost be a necessity in a near-future sdk version that uses a new tendermint. Tendermint changed the config fields to use dashes instead of underscores. That means that existing config.toml files won't work, and will HAVE to be updated. In this case, though, it's slightly different since the values will be migrating from one field name to another field name, but the idea is still the same: "Give me a new config file that has all my configured field values, but also contains new additions and changes, so that a) it still works as expected, and b) I can know what configuration decisions are possible." By having a built-in command for doing that, it can easily be provided as a pre-upgrade command to be done as part of an upgrade.


Lastly, the thing that made me really want the config update command is the iavl-disable-fastnode config field. Most of our nodes' config files don't have that field since it was added after our chain started. In some version, it was added with a default of true. We didn't notice because nothing actually changed. In v0.46.x though, the default changed to false. So when doing an upgrade, instead of it taking about 1 minute, it took 40+ minutes (60+ in some situations with slow disks) because the iavl tree was unexpectedly (and undesirably) being updated. Basically, the maintainer's decision to change this default had undesirable effects for us.

The fix was to have node operators add it to their config file. If the config update command were available, we could have told all validators to just run that command. Since our chain has a config set command, though, we were able to tell them to run provenanced config set iavl-disable-fastnode true; that effectively did the same thing, but it would have been easier as just provenanced config update. Without the config command at all, we would have had to tell operators to manually edit their config files to add the field to the right spot in the file, which is what's needed for most cosmos-sdk-based chains right now.

It'd just be really nice to be able to have my config file updated to include new fields, remove old fields, and have updated comments.

@peterbourgon
Copy link

(discovery of new flags)

program -h describes the configuration surface area of a program without requiring the user to look at source code.

The fix was to have node operators add it to their config file. If the config update command were available, we could have told all validators to just run that command. Since our chain has a config set command, though, we were able to tell them to run provenanced config set iavl-disable-fastnode true; that effectively did the same thing, but it would have been easier as just provenanced config update. Without the config command at all, we would have had to tell operators to manually edit their config files to add the field to the right spot in the file, which is what's needed for most cosmos-sdk-based chains right now.

Maintaining software running in production by modifying stuff directly on the production hosts is of course not something that anyone is, or should be, doing.

Config files, binaries, systemd unit files, everything that impacts production infrastructure, are provisioned to those production hosts by a separate system — Ansible, Chef, Puppet, Kubernetes, Nomad, whatever — which takes the input from a system of record, often a GitHub repository. Assets are declarative by definition in this context — all changes are e.g. PRs.

@iramiller
Copy link
Contributor

Maintaining software running in production by modifying stuff directly on the production hosts is of course not something that anyone is, or should be, doing.

@peterbourgon I appreciate your passion here but your apparent lack of ability even consider that there are a variety of different approaches in use by operations teams is extremely frustrating and quite in appropriate for this context.

Within this detailed thread there are in-depth explanations of how the software is being used that outline how the current configuration system in Cosmos SDK is not unsuitable and what might mitigate this (and in some cases has) . There is even feedback from engineers that work with what are quite possibly the largest professional hosting environments in the industry which detail the value being derived from these features.

Yes, it is great that Ansible/Chef/K8s/etc and configuration from official record are useful tool for managing a production environment. It is very curious that you do not mention where that configuration comes from, how those teams build up, maintain, test, and review those configs. People working on that side of the operation care about scripting/repeatable and explicit configuration, and testing. These are the people that will benefit highly from establishing configs of record and these tools we describe here support their needs.

@peterbourgon
Copy link

peterbourgon commented Nov 20, 2022

@iramiller

it is great that Ansible/Chef/K8s/etc and configuration from official record are useful tool for managing a production environment. It is very curious that you do not mention where that configuration comes from . . .

Config files, binaries, systemd unit files, everything that impacts production infrastructure, are provisioned to those production hosts by a separate system — Ansible, Chef, Puppet, Kubernetes, Nomad, whatever — which takes the input from a system of record, often a GitHub repository

. . . a variety of different approaches in use by operations teams . . .

I have no doubt that many operations teams routinely edit config/data/etc. files directly on their production servers as you describe. Similarly, I have no doubt that many Cosmos application developers routinely ignore errors returned by function calls. And just as Go code shouldn't try to accommodate broken/invalid consumers that ignore errors, Cosmos tooling shouldn't try to accommodate broken/invalid operators that expect to be able to edit config directly on prod machines. It's just not a valid way of working. This shouldn't be controversial — we figured this out as an industry like 20 years ago, in the LiveJournal era.

@yihuang
Copy link
Collaborator

yihuang commented Nov 20, 2022

One feature that I'd like to add too, is a config update command that doesn't take any arguments. All it does is regenerate the config files using the values already defined in them, applied to the current template, filling in new fields with their default values.

+1
Even better, it can output the new items it added, it can help people to learn about what's new.

@ValarDragon
Copy link
Contributor

I'm very supportive of OP proposal

@tac0turtle
Copy link
Member

added in #14342

@iramiller
Copy link
Contributor

added in #14342

@tac0turtle lots of people were tracking the progress of this issue ... and it looks like it was fully developed on that other issue without tagging anyone who was following along over here. In the future when issues are implemented separately could discussions be merged or at least an updated link to where the work is being done be posted? It is a disappointment to see that the entire solution was built, shipped, and merged before anyone following along here was updated.

@julienrbrt
Copy link
Member

julienrbrt commented Jan 10, 2023

added in #14342

@tac0turtle lots of people were tracking the progress of this issue ... and it looks like it was fully developed on that other issue without tagging anyone who was following along over here. In the future when issues are implemented separately could discussions be merged or at least an updated link to where the work is being done be posted? It is a disappointment to see that the entire solution was built, shipped, and merged before anyone following along here was updated.

If you have any feedback on the current implementation direction feel free to chime in in the follow-up PR (#14568). A few ideas from here have been implemented basically set and get) and diff will be added in the follow up.

I don't understand what is "disappointing". The PR references both issues from the start.

@tac0turtle
Copy link
Member

@tac0turtle lots of people were tracking the progress of this issue ... and it looks like it was fully developed on that other issue without tagging anyone who was following along over here. In the future when issues are implemented separately could discussions be merged or at least an updated link to where the work is being done be posted? It is a disappointment to see that the entire solution was built, shipped, and merged before anyone following along here was updated.

I did send the pr to @SpicyLemon in discord and asked if he would want to take a look. just because something was developed does not mean its final, if you have feedback please let us know and we can work on incorporating it. In the future I will make sure to send you a message instead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
No open projects
Development

No branches or pull requests

9 participants