New approach to squashfs images #798

Draft
wants to merge 173 commits into main

Conversation

mattgodbolt
Member

The intention is to replace /opt/compiler-explorer entirely with something like this! At least configurably; it should allow for both regular ce_install install .. backed by real files and supermagic squashfs things, pretty much transparently.

Very much WIP! But I got this PoC working well enough to see promise. Ideally I'll run with a whole base image and see what we get.

This does mean we rely entirely on the yaml files to install everything "from scratch" when we rebuild. We may not want this, but it's up for discussion!

@partouf
Contributor

partouf commented Sep 6, 2022

Because my /opt/compiler-explorer is filled with other stuff like CE git repos, I did sudo ln -sfT ~/ce /opt/cenew instead, and used ce_install --dest /home/partouf/ce buildroot gcc instead of leaving it at the default. Seems to work fine.

Questions:

  • There are 2 sqfs files per installation, what's the small one, is that the actual metainfo?
  • If I try to 'buildroot' something that doesn't exist, it squashes ... something... twice, and puts 1 extra file in the /opt/cefs-images folder, I don't think it needs to do that, right?
  • buildroot does seem to work nicely if you want to add all the compilers of a certain maker, it stores it in just 1 sqfs file, very cool
  • Would we prepare a root sqfs and then hardcode the root hash into the AMI for instances?
  • What about old compilers that we can't reinstall anymore, can we copy squash them over manually?
  • Ubuntu's Files window seems to kick me out of /opt/cenew and /home/partouf/ce every X minutes (and throws me to the parent directory). Is there some timer in the autofs thing that causes this, and how would that manifest in reality? I hope this doesn't mean that during compiler execution all of the things not paged in suddenly disappear for a second and half the files seem to be gone...

@mattgodbolt
Member Author

Because my /opt/compiler-explorer is filled with other stuff like CE git repos, I did sudo ln -sfT ~/ce /opt/cenew instead, and used ce_install --dest /home/partouf/ce buildroot gcc instead of leaving it at the default. Seems to work fine.

Cool! In this case you could have "just" used /home/partouf/ce as the dest and ignored /opt/ entirely. But the code also lets you specify the /opt path, and it follows the symlinks to the right place. Either way, glad it worked out!

Questions:

  • There are 2 sqfs files per installation, what's the small one, is that the actual metainfo?

Each installation will be at least 2 sqfs files:

  • One sqfs layer that contains only the updated/freshly installed files (like a Docker layer)
  • As many (pre-existing) sqfs layers as needed to link to the unchanged content
  • One sqfs root that contains the symlinks to everything.

To make it concrete, something like:

  • assuming currently you have one layer with SHA basedata and root SHA root0
    • root0 contains symlinks for all directories that point into /cefs/basedata/, e.g. /cefs/basedata/gcc
  • you install "clang" with buildroot (terrible name, it's really "install")
  • a new layer newdata is created and clang is installed into that
  • a new root root1 will be created with symlinks from gcc->/cefs/basedata/gcc and clang->/cefs/newdata/clang

The old root root0 is now unused and could be GC'd (but it's only small).

The plan is to never have more than a few layers around, with a weekly consolidation process that "flattens" down images into a single base layer again. Or something like that.
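
Purely as a sketch of the resulting layout (the names here stand in for real SHAs, and the paths are illustrative rather than exactly what the PoC produces):

```
$ ls /opt/cefs-images
basedata.sqfs  newdata.sqfs  root0.sqfs  root1.sqfs    # data layers + root images

# a root image is just a directory of symlinks, one per top-level install
$ ls -l /cefs/root1
clang -> /cefs/newdata/clang
gcc -> /cefs/basedata/gcc

# with /opt/compiler-explorer pointing at the current root, the usual paths work
$ readlink /opt/compiler-explorer
/cefs/root1
```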

  • If I try to 'buildroot' something that doesn't exist, it squashes ... something... twice, and puts 1 extra file in the /opt/cefs-images folder, I don't think it needs to do that, right?

Correct: the buildroot (currently) doesn't spot that nothing changed, and so it creates an empty data layer, and then makes a new root image (which only differs in the metadata). The new root image will point at only the previous sqfs files, not the empty one.

  • buildroot does seem to work nicely if you want to add all the compilers of a certain maker, it stores it in just 1 sqfs file, very cool

Right! The hope is to literally put everything in one sqfs. And then actually squash the daily builds in an overlay so there's one unified way of seeing all the compilers etc. No more "hack the squashfs" stuff. And if we rebuild something we can buildroot --force to put it in a new layer.

  • Would we prepare a root sqfs and then hardcode the root hash into the AMI for instances?

We have some choices:

  • Leave /opt/compiler-explorer on EFS and make it a symlink into /cefs/. We can then atomically "just" change the symlink and instantly everyone gets the new image. Makes for simple deploys!
  • Make it a setting like the "current build" on S3. AMIs grab the "root hash" from S3 at startup and then symlink their own /opt/compiler-explorer there. Updating the builds is then a matter of pushing a new "root image" doc to S3 (much like ce builds set_current) and then refreshing everything.
  • Bake it into an AMI, which would then mean daily AMI builds, or a rebuild after each installation.
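
For instance, the S3 option could look something like this at instance startup (the bucket, key, and paths here are purely illustrative; nothing is decided):

```bash
#!/bin/bash
# Sketch of the "grab the root hash from S3" option; bucket/key are made up.
set -euo pipefail

ROOT_HASH=$(aws s3 cp s3://compiler-explorer/version/cefs-root - | tr -d '[:space:]')

# Point /opt/compiler-explorer at the autofs path for that root image.
# (-T treats the destination as a file, so we replace the link itself.)
sudo ln -sfT "/cefs/${ROOT_HASH}" /opt/compiler-explorer
```
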
  • What about old compilers that we can't reinstall anymore, can we copy squash them over manually?

Yeah, this is a great point. I was going to make a cefs copy $list-of-files type thing to let us manually copy things into the base image. But really I am interested to see how well we do installing from scratch too :) We will need this though, you're right. And the "consolidate" process will probably have to unpack the whole images and reassemble them, rather than reinstalling from scratch (which was my original thought). For discussion!
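
As a very rough sketch of what a manual "copy squash" could look like (a hypothetical stand-in for that cefs copy idea; the compiler name and options are illustrative):

```bash
# Squash an existing install we can no longer rebuild into its own data layer.
mksquashfs /opt/compiler-explorer/gcc-4.1.2 \
    /opt/cefs-images/manual-gcc-4.1.2.sqfs -all-root -comp zstd

# The next root image we build would then gain a symlink along the lines of
#   gcc-4.1.2 -> /cefs/manual-gcc-4.1.2
```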

  • Ubuntu's Files window seems to kick me out of /opt/cenew and /home/partouf/ce every X minutes (and throws me to the parent directory). Is there some timer in the autofs thing that causes this, and how would that manifest in reality? I hope this doesn't mean that during compiler execution all of the things not paged in suddenly disappear for a second and half the files seem to be gone...

Yes! autofs is (deliberately) configured with a timeout here so it unmounts stuff pretty aggressively. We can of course change that, but the value here is one that's battle-tested in $dayjob so it should be ok for most practical purposes (if what you're seeing is indeed timeouts).
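
For reference, the autofs side is roughly of this shape (a sketch only; the map names, paths, and timeout below are illustrative, not necessarily this PR's exact config):

```bash
# /etc/auto.master.d/cefs.autofs -- mount images on demand under /cefs and
# unmount them again after 600 seconds of inactivity:
/cefs /etc/auto.cefs --timeout=600

# /etc/auto.cefs -- an executable (shell) map: autofs passes the requested
# key (the image hash) as $1 and expects mount options plus a source back:
echo "-fstype=squashfs,loop,ro :/opt/cefs-images/${1}.sqfs"
```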

@partouf
Contributor

partouf commented Sep 6, 2022

Re: baking it into an AMI. Daily builds would be fine, but I would not be looking forward to using Terraform as a deploy mechanism. So maybe just fetching the hash from S3 is the best option here.

@mattgodbolt
Member Author

I lied about tuning: the default is 10 minutes. We're using the default here, so idle squashfs mounts are unmounted after 10m.

@partouf
Contributor

partouf commented Sep 6, 2022

I lied about tuning: the default is 10 minutes. We're using the default here, so idle squashfs mounts are unmounted after 10m.

I see. Can we maybe turn that off for production if we find out that causes issues?

@apmorton
Member

apmorton commented Sep 6, 2022

I lied about tuning: the default is 10 minutes. We're using the default here, so idle squashfs mounts are unmounted after 10m.

I see. Can we maybe turn that off for production if we find out that causes issues?

It shouldn't cause issues - any open file handle inside the mount is enough to keep it mounted (and also bump the timeout forward). We use this exact autofs setup at $dayjob.

The auto unmount is what allows "magic" atomic swaps with no cleanup.

"Just" swap out the symlink that points to the new autofs mount point and everybody picks up the new stuff during their next compile, and then the old mounts die after a timeout.

@partouf
Contributor

partouf commented Sep 6, 2022

It shouldn't cause issues - any open file handle inside the mount is enough to keep it mounted (and also bump the timeout forward). We use this exact autofs setup at $dayjob.

The auto unmount is what allows "magic" atomic swaps with no cleanup.

"Just" swap out the symlink that points to the new autofs mount point and everybody picks up the new stuff during their next compile, and then the old mounts die after a timeout.

Ok I see. But if you're super unlucky and it's a quiet day and you hit an instance that hasn't had activity for 10 minutes, you have a nanominuscule chance that you connect during an unmount/mount cycle. Probably fine.

@apmorton
Member

apmorton commented Sep 6, 2022

Ok I see. But if you're super unlucky and it's a quiet day and you hit an instance that hasn't had activity for 10 minutes, you have a nanominuscule chance that you connect during an unmount/mount cycle. Probably fine.

We can avoid this by moving the health check file inside the sqfs root - similar to how we currently have the health check on NFS.

Every ELB health check would resolve the symlink and ensure the most recent root is mounted.
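
For instance (the file name and script are hypothetical, just to show the shape): a probe that reads through the symlinked root forces autofs to mount the current root image on an otherwise idle instance, and bumps its idle timeout forward.

```bash
#!/bin/bash
# Hypothetical ELB health check handler: resolving a path under the
# /opt/compiler-explorer symlink (re)mounts the current root if it has
# idled out, rather than finding it missing.
test -r /opt/compiler-explorer/.healthcheck || exit 1
```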

@mattgodbolt
Member Author

We can avoid this by moving the health check file inside the sqfs root

But what's true for the root is also true for all the data images, too? And we've never seen this at $dayjob: I wonder if there's something specifically odd going on with shells left cd'd into a directory, which doesn't trigger things.

@partouf
Contributor

partouf commented Sep 6, 2022

We can avoid this by moving the health check file inside the sqfs root

But what's true for the root is also true for all the data images, too? And we've never seen this at $dayjob: I wonder if there's something specifically odd going on with shells left cd'd into a directory, which doesn't trigger things.

That could be; my observation was about this view:
[screenshot: Ubuntu Files window, 2022-09-07 00:20:46]

which will change automagically to my home directory in 10ish minutes.

@partouf
Contributor

partouf commented Sep 6, 2022

5-6ish minutes in this case

@mattgodbolt
Member Author

Thanks for the info, partouf. I am 99.5% sure that's an artifact of Nautilus not holding the directory file descriptor open (on purpose!) and then detecting the unmount that autofs does, and closing the window (or similar). If it affects your workflow we can bump the timeout to a day or so or whatever, but without a timeout we'll slowly accumulate mounts (which might be ok for local dev?)

Separately to this: I've been trying to tidy up the nomenclature. For more PoC stuff I'll need to do:

  • importing of existing directories (e.g. the one big import of /opt/compiler-explorer as it stands now)
  • consolidating of images
  • GC of unused images

Once I've tried that out I'll write it up more and we can discuss. Maybe on Friday?!

partouf and others added 30 commits December 19, 2022 07:45