Feature Request: Searching for hiera keys #4

Closed
tuxmea opened this issue Jan 31, 2022 · 5 comments · Fixed by #83
@tuxmea
Member

tuxmea commented Jan 31, 2022

At the moment we can search for environments -> nodes -> hiera keys.

Users want to see in which environment, hierarchies and files a hiera key is made available.

This feature should allow users to select an environment and then search for a key.

The web interface should show the hiera.yaml hierarchies and mark them if the key is available in a hierarchy (bold, like with the node data).

When clicking on a hierarchy, all files that contain the key should be listed.
Files without the key should not be shown.
The value should be shown for each file when clicking on the file name.

View only mode!

@oneiros
Collaborator

oneiros commented Jun 28, 2022

I am sorry I did not catch this earlier, but there is a conceptual problem I cannot wrap my head around:

This feature should allow users to select an environment and then search for a key.
The web interface should show the hiera.yaml hierarchies

The existing code only allows working with hierarchies after a node has been selected. The reason is that the node's facts are (in many cases) necessary to interpolate the filenames associated with a hierarchy.

Of course a hierarchy may specify hardcoded filenames without interpolation, but I suspect that it is rare to have only hardcoded files in practice.

So we have to deal with fact interpolation in filenames. And given that, I see no way to reliably compute which files actually belong to a hierarchy without knowing actual node facts.

The only solutions I can think of are to either limit this to a single node (so the user has to once again select an environment and a node) or to do a very expensive calculation of all possible filenames by gathering all facts for all nodes.
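
To illustrate why the second option is expensive, here is a minimal Python sketch (not HDM's actual code) that interpolates one hierarchy path against the facts of every known node. The names `interpolate`, `all_possible_paths` and the `facts_by_node` dictionary are hypothetical, and only the `%{facts.x}` / `%{::facts.x}` token forms are handled.

```python
import re

# Matches interpolation tokens such as %{facts.fqdn} or %{::facts.role}.
INTERPOLATION = re.compile(r"%\{\s*(?:::)?([^}\s]+)\s*\}")


def lookup(facts, dotted_key):
    """Resolve a dotted key like 'facts.os.family'; return None if it is missing."""
    current = {"facts": facts}
    for part in dotted_key.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def interpolate(path, facts):
    """Return the path with every token resolved, or None if any fact is missing."""
    missing = False

    def replace(match):
        nonlocal missing
        value = lookup(facts, match.group(1))
        if value is None:
            missing = True
            return ""
        return str(value)

    result = INTERPOLATION.sub(replace, path)
    return None if missing else result


def all_possible_paths(hierarchy_path, facts_by_node):
    """Collect every distinct filename the path can resolve to across all nodes."""
    paths = set()
    for facts in facts_by_node.values():  # one interpolation per node and per path
        resolved = interpolate(hierarchy_path, facts)
        if resolved is not None:
            paths.add(resolved)
    return sorted(paths)
```

The work grows with the number of nodes times the number of hierarchy paths, and the facts of every node have to be fetched first, which is where the cost comes from.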

@tuxmea
Member Author

tuxmea commented Jun 28, 2022

The idea was to not look for nodes containing keys, but to search in all hiera data for a specific key and then show a grep-like result.
Files can be sorted alphabetically based on their path and filename.

nodes/datacenter/ [search]
  k-gg123fjk.domain.tld -> 'value'
  k-gg123fjl.domain.tld
  k-gg123fjm.domain.tld
application/billing/ [search]
  dev.yaml
  prod.yaml
application/proxy/dev.yaml
os/ [search]
  RedHat.yaml
  Suse.yaml
common.yaml
defaults.yaml
usermanagement.yaml

Please omit search in a first implementation.
Decryption should be possible.
Read-Only view.
Because the listing of directories and files does not represent the hiera.yaml hierarchies, we only need to show files that contain the key. Files not containing the key should not be listed.

@oneiros
Collaborator

oneiros commented Aug 18, 2022

Sorry, this is taking a lot longer than I anticipated.

But I think I am on a path that might lead to good results.

I thought some more about how to find the files to search for keys. Simply "grepping" all files in the base directory of the environment is not ideal imho. Firstly, the files may not be suitable (i.e. they may not belong to any hierarchy, may not be YAML files, etc.), and secondly, a file containing the string does not mean the string is actually a key in that file.

So I came up with the following (too?) simple idea: I look at the actual configured paths (or globs) in the hierarchies and replace all fact variables with an asterisk (*). I then use the resulting glob to find "candidate files" in the file system. Of course, we do not know if the resulting candidate files are really used by hiera, but from our last discussion and the comment above I gather that is OK.

This works very well so far and I can also reuse code to check if the given key exists in those files.
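
A minimal Python sketch of this candidate-file idea, for illustration only (it is not the code from HDM itself). The helper names `path_to_glob` and `candidate_files` are hypothetical, and the sketch assumes that every `%{...}` token can simply be treated as a wildcard.

```python
import glob
import os
import re

# Any interpolation token, e.g. %{::facts.fqdn} or %{facts.role}.
INTERPOLATION = re.compile(r"%\{[^}]*\}")


def path_to_glob(hierarchy_path):
    """Replace every interpolation token in a hierarchy path with '*'."""
    return INTERPOLATION.sub("*", hierarchy_path)


def candidate_files(datadir, hierarchy_path):
    """List existing files below datadir that the hierarchy path could resolve to."""
    # Escape the data directory so that only the hierarchy path acts as a pattern.
    pattern = os.path.join(glob.escape(datadir), path_to_glob(hierarchy_path))
    return sorted(p for p in glob.glob(pattern) if os.path.isfile(p))
```

Note that `*` in `glob` does not cross directory separators, so a fact whose value contains a `/` would not be covered by this sketch.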

Given that this approach means I have to do things per hierarchy, my original idea to display the results grouped by hierarchy would be possible after all. The only downside is that in some edge cases files might be displayed twice. Imagine two hierarchies that stuff everything in the same directory:

hierarchy:
  - name: "Host specific"
    path: "stuff/%{::facts.fqdn}.yaml"
  - name: "Role specific"
    path: "stuff/%{::facts.role}.yaml"

A file called stuff/test.yaml would match both paths. But I guess this is such bad practice that we can live with it.
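
The duplicate match can be reproduced with a small Python sketch. This is only an illustration: `matching_hierarchies` is a hypothetical helper, and the hierarchy is written as a plain list of dictionaries instead of being parsed from hiera.yaml.

```python
import fnmatch
import re

# Any interpolation token, e.g. %{::facts.fqdn}.
INTERPOLATION = re.compile(r"%\{[^}]*\}")

HIERARCHY = [
    {"name": "Host specific", "path": "stuff/%{::facts.fqdn}.yaml"},
    {"name": "Role specific", "path": "stuff/%{::facts.role}.yaml"},
]


def matching_hierarchies(relative_file, hierarchy):
    """Return the names of all hierarchy levels whose wildcard pattern matches the file."""
    # Caveat: unlike glob, fnmatch lets '*' match '/' as well.
    return [
        level["name"]
        for level in hierarchy
        if fnmatch.fnmatchcase(relative_file, INTERPOLATION.sub("*", level["path"]))
    ]


print(matching_hierarchies("stuff/test.yaml", HIERARCHY))
```

Both levels collapse to the same pattern `stuff/*.yaml`, so the file is reported once per level.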

If no one tells me otherwise, I will run with this approach.

The other thing I am currently struggling with is code reuse. As I wrote above, I could already reuse small things, but it becomes more and more apparent that this feature should share a lot more with the existing code. This, however, is currently not easily possible, because the existing code almost always requires a node.

I will now spend some time trying to figure out if I can restructure things to work without a node.

@oneiros
Collaborator

oneiros commented Aug 19, 2022

I will now spend some time trying to figure out if I can restructure things to work without a node.

Turns out one major reason for this tight entanglement with nodes is also a problem when it comes to this feature:

In the (optional) git backend we allow directories of hiera files to be replaced with their counterparts from a git repository. To be consistent, I guess we would need to show the files from git here as well in those cases.

The problem is, we allowed fact interpolation in the configured paths, so you can say something like:

production:
  git_data:
     - datadir: /etc/puppetlabs/code/environments/%{facts.environment}/data
       git_url: git@githost.example.com:puppet/hiera_data.git
       path_in_repo: environments/%{facts.environment}/data

Without this, you would have to specify each environment's data directory separately. So the feature makes a lot of sense.

But this also means we always need a node's facts to check if we need to take a file from a git repo. In the current code this means we always need to have a specific node when working with keys and values (and hierarchies, though I believe this could be changed without breaking the git backend).

And of course this will not work for this feature, which is totally independent of any actual node.

I do not see a way around this without changing the git backend.

One idea could be to replace the variable interpolation with regular expressions. Something like this:

production:
  git_data:
     - datadir: /etc/puppetlabs/code/environments/(\w+)/data
       git_url: git@githost.example.com:puppet/hiera_data.git
       path_in_repo: environments/\1/data

(The \1 means this part should be the same as the part that matched the first parentheses in the datadir regexp.)
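
How such a backreference could map a deployed file to its path in the repository can be sketched in Python. This illustrates the proposal only, not an existing HDM feature. The function `repo_path_for` is hypothetical, and the sketch assumes the configured datadir is used as a regular expression anchored at the start of the file path.

```python
import re


def repo_path_for(file_path, datadir_pattern, path_in_repo_template):
    """Map a file below a matching datadir to its path in the git repository.

    Returns None if the file does not live below a directory matching the pattern.
    """
    match = re.match(datadir_pattern, file_path)
    if match is None:
        return None
    remainder = file_path[match.end():]
    if remainder and not remainder.startswith("/"):
        return None  # the pattern matched only part of a directory name
    # match.expand() substitutes \1 with the text captured by the first group.
    return match.expand(path_in_repo_template) + remainder


print(
    repo_path_for(
        "/etc/puppetlabs/code/environments/production/data/common.yaml",
        r"/etc/puppetlabs/code/environments/(\w+)/data",
        r"environments/\1/data",
    )
)
```

No node facts are needed here: the environment name is taken from the file path itself.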

I must admit, I am not a fan of this solution. It makes an already complex feature even harder to understand and use.

I already made the case for a simpler git backend here: https://github.com/tuxmea/hdm/pull/25#issuecomment-920749909

This would of course also be a good solution here.

@tuxmea
Member Author

tuxmea commented Aug 23, 2022

We want to show actual data only. Any data which are in git, but not yet deployed, can be ignored.
It is sufficient to only parse the hiera.yaml data dir and hierarchy paths.
From the hiera.yaml data paths we only need to take care of the environment fact.
All other facts are ignored.
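
One possible reading of this rule, combined with the wildcard idea from earlier in the thread, as a short Python sketch: the environment token is replaced by the selected environment, and every other token becomes `*`. The function `environment_glob` and the set `ENVIRONMENT_KEYS` are assumptions made for this illustration, including which token names count as "the environment fact".

```python
import re

# Matches interpolation tokens such as %{facts.environment} or %{::facts.fqdn}.
INTERPOLATION = re.compile(r"%\{\s*(?:::)?([^}\s]+)\s*\}")

# Assumption: these token names refer to the environment.
ENVIRONMENT_KEYS = {"environment", "facts.environment"}


def environment_glob(path, environment):
    """Resolve only the environment token; turn every other token into a wildcard."""

    def replace(match):
        return environment if match.group(1) in ENVIRONMENT_KEYS else "*"

    return INTERPOLATION.sub(replace, path)


print(environment_glob("/etc/puppetlabs/code/environments/%{facts.environment}/data", "production"))
print(environment_glob("nodes/%{::facts.fqdn}.yaml", "production"))
```

Since the environment is already chosen by the user, no node has to be selected to resolve these paths.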

oneiros added a commit that referenced this issue Aug 29, 2022
With the old functionality it was OK to always need a node,
because a user had to choose one after choosing an environment.

With #4 we are introducing a different functionality that
should work independently of a node.

This is in preparation of #4 and tries to disentangle
environment, nodes and everything else.

To accomplish this, "proper" abstractions for all relevant
parts of the hiera data model are introduced (hierarchies,
data_files and values).

Due to the git backend, (data_)file and value handling still
need to be able to work with a node's facts, but this has
been made optional.
oneiros added a commit that referenced this issue Aug 30, 2022
oneiros added a commit that referenced this issue Aug 30, 2022
oneiros added a commit that referenced this issue Oct 17, 2022
Lists of files may get very long. This way, a user can
open just the hierarchy she wants.
oneiros added a commit that referenced this issue Oct 18, 2022
@oneiros oneiros linked a pull request Oct 18, 2022 that will close this issue
@oneiros oneiros mentioned this issue Oct 18, 2022
tuxmea pushed a commit that referenced this issue Oct 18, 2022
* Large refactoring #4

With the old functionality it was OK to always need a node,
because a user had to choose one after choosing an environment.

With #4 we are introducing a different functionality that
should work independently of a node.

This is in preparation of #4 and tries to disentangle
environment, nodes and everything else.

To accomplish this, "proper" abstractions for all relevant
parts of the hiera data model are introduced (hierarchies,
data_files and values).

Due to the git backend, (data_)file and value handling still
need to be able to work with a node's facts, but this has
been made optional.

* First draft of key search #4

* Disable misguided cop #4

* Collapse search results by default #4

Lists of files may get very long. This way, a user can
open just the hierarchy she wants.

* Cleanup and some tests #4
oneiros added a commit that referenced this issue Dec 8, 2022
oneiros added a commit that referenced this issue Dec 8, 2022
tuxmea pushed a commit that referenced this issue Dec 8, 2022
* First draft of key search #4

* Cleanup and some tests #4

* Basic CRUD for groups.

* Allow editing group memberships.

* Only regular users can be members of a group.

* Small bugfix.

Grouped user list might contain nil values.

* Enforce group access rules.

* Adjust RBAC to new code and key search #88