Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Need to review entries in inv_dataset table #6615

Open
askfongjojo opened this issue Sep 20, 2024 · 5 comments · Fixed by #6632
Open

Need to review entries in inv_dataset table #6615

askfongjojo opened this issue Sep 20, 2024 · 5 comments · Fixed by #6632
Assignees

Comments

@askfongjojo
Copy link

I see that there is an unusally large number of records in this table that has no time_deleted or state fields:

root@[fd00:1122:3344:109::3]:32221/omicron> \d inv_dataset
     column_name    | data_type | is_nullable | column_default | generation_expression |      indices       | is_hidden
--------------------+-----------+-------------+----------------+-----------------------+--------------------+------------
  inv_collection_id | UUID      |    false    | NULL           |                       | {inv_dataset_pkey} |   false
  sled_id           | UUID      |    false    | NULL           |                       | {inv_dataset_pkey} |   false
  id                | UUID      |    true     | NULL           |                       | {inv_dataset_pkey} |   false
  name              | STRING    |    false    | NULL           |                       | {inv_dataset_pkey} |   false
  available         | INT8      |    false    | NULL           |                       | {inv_dataset_pkey} |   false
  used              | INT8      |    false    | NULL           |                       | {inv_dataset_pkey} |   false
  quota             | INT8      |    true     | NULL           |                       | {inv_dataset_pkey} |   false
  reservation       | INT8      |    true     | NULL           |                       | {inv_dataset_pkey} |   false
  compression       | STRING    |    false    | NULL           |                       | {inv_dataset_pkey} |   false
(9 rows)

root@[fd00:1122:3344:109::3]:32221/omicron> select count(*) from inv_dataset;
   count
------------
  35641826

root@[fd00:1122:3344:109::3]:32221/omicron> select sled_id, count(*) from inv_dataset group by sled_id;
                sled_id                |  count
---------------------------------------+----------
  f15774c1-b8e5-434f-a493-ec43f96cba06 | 3129834
  bd96ef7c-4941-4729-b6f7-5f47feecbc4b | 2054724
  71def415-55ad-46b4-ba88-3ca55d7fb287 | 3064761
  a2adea92-b56e-44fc-8a0d-7d63b5fd3b93 | 3015194
  dd83e75a-1edf-4aa1-89a0-cd8b2091a7cd | 3221035
  0c7011f7-a4bf-4daf-90cc-1c2410103300 | 2894176
  2707b587-9c7f-4fb0-a7af-37c3b7a9a0fa | 2943800
  7b862eb6-7f50-4c2f-b9a6-0d12ac913d3c | 2948692
  87c2c4fc-b0c7-4fef-a305-78f0ed265bbc | 3009928
  5f6720b8-8a31-45f8-8c94-8e699218f28b | 3131293
  b886b58a-1e3f-4be1-b9f2-0c2e66c6bc88 | 3016263
  db183874-65b5-4263-a1c1-ddb2737ae0e9 | 3208511
(12 rows)

It may be worth confirming if this table is supposed to grow indefinitely. Other inv_ table seems to be pruned and kept to reasonable sizes (maybe related to #5305).

@askfongjojo askfongjojo changed the title Need to review Need to review entries in inv_dataset table Sep 20, 2024
@smklein smklein self-assigned this Sep 20, 2024
@davepacheco
Copy link
Collaborator

Thanks for catching this!!

Yeah it seems pretty likely that this table is missing from the inventory delete code. We can certainly fix that problem but it's a little harder to fix existing systems. We could try something like:

DELETE FROM inv_dataset WHERE inv_collection_id NOT IN (SELECT id FROM inv_collection);

But that's a potentially huge transaction and it'd be safer to do it in batches. It looks like you can do something like this but I haven't tested it:

DELETE FROM inv_dataset WHERE inv_collection_id NOT IN (SELECT id FROM inv_collection) LIMIT 1000;

and we could run that in a loop until it produces no results.

We could:

  • do this by hand the next time we upgrade each customer's system
  • write an omdb command to do this and do it at the same time
  • write a background task to do this

I'd probably lean towards the omdb command. There's a related question of: is this too urgent to wait for the next upgrade?

@askfongjojo
Copy link
Author

The issue doesn't seem to affect the system negatively at the moment AFAICT (I found it only by chance). We can do the cleanup at the next customer rack update after fixing the inventory delete code.

@jgallagher
Copy link
Contributor

We could:

* do this by hand the next time we upgrade each customer's system

* write an omdb command to do this and do it at the same time

* write a background task to do this

I'd probably lean towards the omdb command. There's a related question of: is this too urgent to wait for the next upgrade?

I this this is only on dogfood, right? Release 10 was cut on Aug 31, and the most recent main commit in it was a032d2a. inv_dataset was added in #6167, which was merged on Sep 6, and the git history on main confirms it's a fair bit newer than a032d2a.

Given that, I'd probably vote "do this by hand once we've patched this up and updated dogfood"? Sooner if it is becoming problematic?

@smklein
Copy link
Collaborator

smklein commented Sep 23, 2024

@askfongjojo , thank you for filing this issue, this was 100% an oversight on my part. I have a fix ready at at #6632 , which also adds an extra test to try to catch future cases where an inventory subcollection is added but not correctly removed.

Looking at SHOW TABLES on dogfood, I see that there are a couple other tables in addition to inv_dataset which have high-volume and no garbage collection. I filed #6635 for these other cases.

smklein added a commit that referenced this issue Sep 23, 2024
Fixes #6615

Does not patch existing systems (dogfood) which may have unnecessary
inventory entries
@jgallagher
Copy link
Contributor

(Reopening just so we remember to manually clean up dogfood after 165ddc2 is deployed)

@jgallagher jgallagher reopened this Sep 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants