Feature Request: Searching for hiera keys #4

tuxmea · 2022-01-31T13:18:57Z

At the moment we can search for environments -> nodes -> hiera keys.

Users want to see in which environments, hierarchies and files a hiera key is available.

This feature should allow users to select an environment and then search for a key.

The web interface should show the hiera.yaml hierarchies and mark those in which the key is available (in bold, as with the node data).

When clicking on a hierarchy, a list of all files that contain the key should be shown.
Files without the key should not be shown.
The value should be shown for each file when clicking on the file name.

View only mode!

oneiros · 2022-06-28T10:08:53Z

I am sorry I did not catch this earlier, but there is a conceptual problem I cannot wrap my head around:

> This feature should allow users to select an environment and then search for a key.
> The web interface should show the hiera.yaml hierarchies

The existing code only allows working with hierarchies after a node has been selected. The reason is that the node's facts are (in many cases) necessary to interpolate the filenames associated with a hierarchy.

Of course a hierarchy may specify hardcoded filenames without interpolation, but I suspect that it is rare to have only hardcoded files in practice.

So we have to deal with fact interpolation in filenames. And given that, I see no way to reliably compute which files actually belong to a hierarchy without knowing actual node facts.

The only options I can think of are either to limit this to a single node (so the user once again has to select an environment and a node) or to do a very expensive calculation of all possible filenames by gathering all facts for all nodes.

tuxmea · 2022-06-28T12:24:43Z

The idea was to not look for nodes containing keys, but to search in all hiera data for a specific key and then show a grep-like result.
Files can be sorted alphabetically based on their path and filename.

nodes/datacenter/ [search]
  k-gg123fjk.domain.tld -> 'value'
  k-gg123fjl.domain.tld
  k-gg123fjm.domain.tld
application/billing/ [search]
  dev.yaml
  prod.yaml
application/proxy/dev.yaml
os/ [search]
  RedHat.yaml
  Suse.yaml
common.yaml
defaults.yaml
usermanagement.yaml

Please omit search in a first implementation.
Decryption should be possible.
Read-Only view.
Because the listing of directories and files does not represent the hiera.yaml hierarchies, we only need to show files that contain the key. Files not containing the key should not be listed.

oneiros · 2022-08-18T09:57:23Z

Sorry, this is taking a lot longer than I anticipated.

But I think I am on a path that might lead to good results.

I thought some more about how to find the files to search for keys. Simply "grepping" all files in the base directory of the environment is not ideal imho. Firstly, the files may not be suitable (i.e. they may not actually belong to any hierarchy, may not be YAML files, etc.), and secondly, the fact that a file contains the string does not mean the string is actually a key.
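To illustrate the second point, here is a small sketch that contrasts a plain substring check with a check for a top-level key. The helper `has_top_level_key` is hypothetical and only a crude, line-based stand-in for a real YAML parser: it assumes block-style YAML with unindented, optionally quoted keys.

```python
import re


def has_top_level_key(yaml_text, key):
    """Crude check: does an unindented line start with `key:`?

    Assumption: block-style YAML with top-level keys at column 0.
    A real implementation would parse the file instead.
    """
    pattern = re.compile(
        r"^(['\"]?)" + re.escape(key) + r"\1\s*:(?:\s|$)",
        re.MULTILINE,
    )
    return pattern.search(yaml_text) is not None


sample = (
    "profile::ntp::servers:\n"
    "  - ntp1.example.com\n"
    "# see also profile::dns::servers\n"
)

# The string occurs in the file, but only inside a comment.
print("profile::dns::servers" in sample)                   # True
print(has_top_level_key(sample, "profile::dns::servers"))  # False
print(has_top_level_key(sample, "profile::ntp::servers"))  # True
```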

So I came up with the following (too?) simple idea: I look at the actual configured paths (or globs) in the hierarchies and replace all fact variables with an asterisk (*). I then use the resulting glob to find "candidate files" in the file system. Of course, we do not know if the resulting candidate files are really used by hiera, but from our last discussion and the comment above I gather that is OK.

This works very well so far and I can also reuse code to check if the given key exists in those files.
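For readers following along, the idea can be sketched roughly as below. This is not the actual HDM code (which is not shown in this thread); the function names and the regular expression for interpolation tokens are assumptions made for illustration.

```python
import glob
import os
import re

# Assumption: interpolation tokens look like %{::facts.fqdn} or %{facts.role}.
INTERPOLATION = re.compile(r"%\{[^}]*\}")


def path_to_glob(hierarchy_path):
    """Replace every interpolation token in a hierarchy path with '*'."""
    return INTERPOLATION.sub("*", hierarchy_path)


def candidate_files(datadir, hierarchy_path):
    """Return the sorted files below datadir that match the resulting glob.

    These are only candidates: nothing guarantees that hiera would
    actually use them for any real node.
    """
    pattern = os.path.join(glob.escape(datadir), path_to_glob(hierarchy_path))
    return sorted(glob.glob(pattern))


print(path_to_glob("nodes/%{::facts.fqdn}.yaml"))  # nodes/*.yaml
```

Note that `*` in a glob does not cross directory separators, so a fact value that would expand to a nested path is not covered by this sketch.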

Given that this approach means I have to do things per hierarchy, my original idea to display the results grouped by hierarchy would be possible after all. The only downside is that in some edge cases files might be displayed twice. Imagine two hierarchies that stuff everything in the same directory:

hierarchy:
  - name: "Host specific"
    path: "stuff/%{::facts.fqdn}.yaml"
  - name: "Role specific"
    path: "stuff/%{::facts.role}.yaml"

A file called stuff/test.yaml would match both paths. But I guess this is such bad practice that we can live with it.
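The edge case can be reproduced with a small sketch. It is an illustration only: the function names are hypothetical, and it matches relative paths with regular expressions instead of touching the file system.

```python
import re
from collections import defaultdict

# Assumption: interpolation tokens look like %{::facts.fqdn}.
INTERPOLATION = re.compile(r"%\{[^}]*\}")


def path_to_regex(hierarchy_path):
    """Turn a hierarchy path into a regex; tokens match one path segment."""
    literal_parts = INTERPOLATION.split(hierarchy_path)
    return re.compile("[^/]*".join(re.escape(part) for part in literal_parts))


def hierarchies_per_file(hierarchies, files):
    """Map each file to the names of all hierarchies whose path it matches."""
    matches = defaultdict(list)
    for name, path in hierarchies:
        pattern = path_to_regex(path)
        for file in files:
            if pattern.fullmatch(file):
                matches[file].append(name)
    return dict(matches)


hierarchies = [
    ("Host specific", "stuff/%{::facts.fqdn}.yaml"),
    ("Role specific", "stuff/%{::facts.role}.yaml"),
]
result = hierarchies_per_file(hierarchies, ["stuff/test.yaml", "common.yaml"])
print(result)
# {'stuff/test.yaml': ['Host specific', 'Role specific']}
```

A file listed under more than one hierarchy name is exactly the case that would be displayed twice.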

If no one tells me otherwise, I will run with this approach.

The other thing I am currently struggling with is code reuse. As I wrote above, I could already reuse small things, but it is becoming more and more apparent that this feature should share a lot more with the existing code. However, this is currently not easily possible, because the existing code almost always requires a node.

I will now spend some time trying to figure out if I can restructure things to work without a node.

oneiros · 2022-08-19T10:00:12Z

> I will now spend some time trying to figure out if I can restructure things to work without a node.

It turns out that one major reason for this tight entanglement with nodes is also a problem when it comes to this feature:

In the (optional) git backend we allow directories of hiera files to be replaced with their counterparts from a git repository. To be consistent, I guess we would need to show the files from git here as well in those cases.

The problem is, we allowed fact interpolation in the configured paths, so you can say something like:

production:
  git_data:
     - datadir: /etc/puppetlabs/code/environments/%{facts.environment}/data
       git_url: git@githost.example.com:puppet/hiera_data.git
       path_in_repo: environments/%{facts.environment}/data

Without this, you would have to specify each environment's data directory separately. So the feature makes a lot of sense.

But this also means we always need a node's facts to check whether we need to take a file from a git repo. In the current code this means we always need a specific node when working with keys and values (and hierarchies, though I believe that could be changed without breaking the git backend).

And of course this will not work for this feature, which is totally independent of any actual node.

I do not see a way around this, without changing the git backend.

One idea could be to replace the variable interpolation with regular expressions. Something like this:

production:
  git_data:
     - datadir: /etc/puppetlabs/code/environments/(\w+)/data
       git_url: git@githost.example.com:puppet/hiera_data.git
       path_in_repo: environments/\1/data

(The \1 means this part should be the same as the part that matched the first pair of parentheses in the datadir regexp.)
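As a rough sketch of how such a mapping could be evaluated: the function `repo_path_for` below is hypothetical and not part of HDM. It only shows that a backreference is enough to derive `path_in_repo` from a concrete data directory, with no node facts involved.

```python
import re


def repo_path_for(datadir_pattern, path_in_repo, actual_datadir):
    """Return the path inside the git repo for a concrete data directory.

    Returns None if the directory is not covered by the pattern.
    """
    match = re.fullmatch(datadir_pattern, actual_datadir)
    if match is None:
        return None
    # Match.expand substitutes \1 with the text of the first group.
    return match.expand(path_in_repo)


datadir_pattern = r"/etc/puppetlabs/code/environments/(\w+)/data"
path_in_repo = r"environments/\1/data"

print(repo_path_for(
    datadir_pattern,
    path_in_repo,
    "/etc/puppetlabs/code/environments/production/data",
))
# environments/production/data
```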

I must admit, I am not a fan of this solution. It makes an already complex feature even harder to understand and use.

I already made the case for a simpler git backend here: https://github.com/tuxmea/hdm/pull/25#issuecomment-920749909

This would of course also be a good solution here.

tuxmea · 2022-08-23T12:48:03Z

We want to show actual data only. Any data that is in git but not yet deployed can be ignored.
It is sufficient to parse only the hiera.yaml datadir and hierarchy paths.
In the hiera.yaml data paths we only need to take care of the environment fact;
all other facts are ignored.
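A minimal sketch of that rule, assuming the environment appears in paths under one of the variable names listed in `ENVIRONMENT_VARIABLES` (that set, like the function name, is an assumption for illustration): the environment is filled in, and every other interpolation becomes a wildcard.

```python
import re

# Captures the variable name inside %{...}, ignoring a leading "::".
INTERPOLATION = re.compile(r"%\{\s*(?:::)?([^}\s]+)\s*\}")

# Assumption: these names refer to the environment in data paths.
ENVIRONMENT_VARIABLES = {"environment", "facts.environment"}


def resolve_path(path, environment):
    """Fill in the environment; replace all other interpolations with '*'."""
    def replace(match):
        name = match.group(1)
        return environment if name in ENVIRONMENT_VARIABLES else "*"

    return INTERPOLATION.sub(replace, path)


print(resolve_path(
    "/etc/puppetlabs/code/environments/%{facts.environment}/data",
    "production",
))
# /etc/puppetlabs/code/environments/production/data
print(resolve_path("nodes/%{::facts.fqdn}.yaml", "production"))
# nodes/*.yaml
```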

* Large refactoring #4
  With the old functionality it was OK to always need a node, because a user had to choose one after choosing an environment. With #4 we are introducing a different functionality that should work independently of a node. This is in preparation of #4 and tries to disentangle environment, nodes and everything else. To accomplish this, "proper" abstractions for all relevant parts of the hiera data model are introduced (hierarchies, data_files and values). Due to the git backend, (data_)file and value handling still need to be able to work with a node's facts, but this has been made optional.
* First draft of key search #4
* Disable misguided cop #4
* Collapse search results by default #4
  Lists of files may get very long. This way, a user can open just the hierarchy she wants.
* Cleanup and some tests #4

* First draft of key search #4
* Cleanup and some tests #4
* Basic CRUD for groups.
* Allow editing group memberships.
* Only regular users can be members of a group.
* Small bugfix. Grouped user list might contain nil values.
* Enforce group access rules.
* Adjust RBAC to new code and key search #88

tuxmea assigned oneiros Jan 31, 2022

oneiros assigned tuxmea Aug 19, 2022

oneiros added a commit that referenced this issue Aug 30, 2022

First draft of key search #4

9a10a5b

oneiros added a commit that referenced this issue Aug 30, 2022

Disable misguided cop #4

cdf8a2f

oneiros added a commit that referenced this issue Oct 17, 2022

Collapse search results by default #4

85c72dd

Lists of files may get very long. This way, a user can open just the hierarchy she wants.

oneiros added a commit that referenced this issue Oct 18, 2022

Cleanup and some tests #4

c33651b

oneiros linked a pull request Oct 18, 2022 that will close this issue

Key search #83

Merged

oneiros mentioned this issue Oct 18, 2022

Key search #83

Merged

tuxmea closed this as completed in #83 Oct 18, 2022

oneiros added a commit that referenced this issue Dec 8, 2022

First draft of key search #4

3b59ed2

oneiros added a commit that referenced this issue Dec 8, 2022

Cleanup and some tests #4

b4ad920
