test: run dbus-broker under ASan and UBSan

Let's introduce a test that runs dbus-broker under Address Sanitizer and Undefined Behavior Sanitizer while running other tests against it. The setup is slightly convoluted, since we need to run (and restart) the sanitized dbus-broker without nuking the host machine. To do that we set up an nspawn container that reuses the host's rootfs (to some degree) and overlays our additions on top of it. This way we can test, among other things, the full user-space boot with the sanitized dbus-broker without risking "damage" to the host machine.
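A minimal sketch of the overlay idea described above. The helper name `nspawn_overlay_args` and the overlay path are hypothetical; the options themselves are the ones the test's unit override passes to systemd-nspawn. The sketch only prints the resulting command line and doesn't start a container:

```shell
#!/bin/bash
# Hypothetical helper (not part of this commit): print the systemd-nspawn
# arguments for booting a container from the host's rootfs with an overlay
# directory layered over /etc (writable) and /usr (read-only).
nspawn_overlay_args() {
    local machine="$1" overlay="$2"

    printf '%s\n' \
        --quiet --boot --keep-unit --private-network \
        "--machine=$machine" \
        --directory=/ \
        --volatile=yes \
        --link-journal=host \
        "--overlay=/etc:$overlay/etc:/etc" \
        "--overlay-ro=/usr:$overlay/usr:/usr"
}

# Print the command line instead of running it, since booting the container needs root
echo "systemd-nspawn $(nspawn_overlay_args demo /var/lib/machines/demo | tr '\n' ' ')"
```

Anything installed into the overlay directory (for example via `DESTDIR`) then shadows the host's files under /usr and /etc inside the container only.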
bus1 · May 13, 2024 · 7e9d586
1 parent 506e25a
commit 7e9d586
Showing 3 changed files with 332 additions and 0 deletions.
diff --git a/test/integration/fuzz/sanitizers/main.fmf b/test/integration/fuzz/sanitizers/main.fmf
@@ -0,0 +1,19 @@
+summary: Run dbus-broker under ASan and UBSan and fuzz it with dfuzzer
+test: ./test.sh
+recommend:
+    - dbus-daemon
+    - dfuzzer
+    - expat-devel
+    - gcc
+    - gdb
+    - git
+    - glibc-devel
+    - libasan
+    - libubsan
+    - meson
+    - systemd
+    - systemd-container
+    - systemd-devel
+    - systemd-libs
+    - util-linux
+duration: 30m
diff --git a/test/integration/fuzz/sanitizers/test.sh b/test/integration/fuzz/sanitizers/test.sh
@@ -0,0 +1,234 @@
+#!/bin/bash
+# vi: set sw=4 ts=4 et tw=110:
+# shellcheck disable=SC2016
+
+set -eux
+set -o pipefail
+
+# shellcheck source=test/integration/util.sh
+. "$(dirname "$0")/../../util.sh"
+
+export ASAN_OPTIONS=strict_string_checks=1:detect_stack_use_after_return=1:check_initialization_order=1:strict_init_order=1:detect_invalid_pointer_pairs=2:handle_ioctl=1:print_cmdline=1:disable_coredump=0:use_madv_dontdump=1
+export UBSAN_OPTIONS=print_stacktrace=1:print_summary=1:halt_on_error=1
+
+# shellcheck disable=SC2317
+at_exit() {
+    set +ex
+
+    # Let's do some cleanup and export logs if necessary
+
+    # Collect potential coredumps
+    coredumpctl_collect
+
+    if [[ -n "${CONTAINER:-}" ]]; then
+        if systemctl -q is-active "systemd-nspawn@$CONTAINER.service"; then
+            systemctl stop "systemd-nspawn@$CONTAINER.service"
+        fi
+
+        # Export the container journal and sanitizer logs if $TMT_TEST_DATA is set, either by TMT directly or
+        # manually.
+        if [[ -n "${TMT_TEST_DATA:-}" ]]; then
+            journalctl -D "/var/log/journal/${CONTAINER_ID:?}" -b -o short-monotonic >"$TMT_TEST_DATA/container-journal.log"
+        fi
+
+        rm -rf "/var/lib/machines/$CONTAINER"
+        rm -rf "/var/log/journal/$CONTAINER_ID"
+        rm -rf "/run/systemd/system/systemd-nspawn@$CONTAINER.service.d"
+        systemctl daemon-reload
+    fi
+}
+
+trap at_exit EXIT
+
+export BUILD_DIR="$PWD/build-san"
+
+# Switch SELinux to permissive (if enabled), so it doesn't interfere with the container shenanigans below.
+setenforce 0 || :
+# We need persistent journal for the systemd-nspawn --link= stuff
+mkdir -p /var/log/journal
+journalctl --flush
+# Make sure the coredump collecting machinery is working
+coredumpctl_init
+
+: "=== Prepare dbus-broker's source tree ==="
+# Since we need to build dbus-broker from scratch, let's do some magic to get the correct source tree.
+if [[ -n "${PACKIT_TARGET_URL:-}" ]]; then
+    # If we're running in Packit's context, use the set of provided environment variables to checkout the
+    # correct branch (and possibly rebase it on top of the latest source base branch so we always test the
+    # latest revision possible).
+    git clone "$PACKIT_TARGET_URL" dbus-broker
+    cd dbus-broker
+    git checkout "$PACKIT_TARGET_BRANCH"
+    # If we're invoked from a pull request context, rebase on top of the latest source base branch.
+    if [[ -n "${PACKIT_SOURCE_URL:-}" ]]; then
+        git remote add pr "${PACKIT_SOURCE_URL:?}"
+        git fetch pr "${PACKIT_SOURCE_BRANCH:?}"
+        git merge "pr/$PACKIT_SOURCE_BRANCH"
+    fi
+    git log --oneline -5
+elif [[ -n "${DBUS_BROKER_TREE:-}" ]]; then
+    # Useful for quick local debugging when running this script directly, e.g. running
+    #
+    #   $ DBUS_BROKER_TREE=$PWD test/integration/fuzz/sanitizers/test.sh
+    #
+    # from the dbus-broker repo root.
+    cd "${DBUS_BROKER_TREE:?}"
+else
+    # If we're running outside of Packit's context, pull the latest dbus-broker upstream.
+    git clone https://github.com/bus1/dbus-broker dbus-broker
+    cd dbus-broker
+    git log --oneline -5
+fi
+
+
+: "=== Build dbus-broker with sanitizers and run the unit test suite ==="
+meson setup "$BUILD_DIR" --wipe -Db_sanitize=address,undefined -Dprefix=/usr
+ninja -C "$BUILD_DIR"
+meson test -C "$BUILD_DIR" --timeout-multiplier=2 --print-errorlogs
+
+
+: "=== Run tests against dbus-broker running under sanitizers ==="
+# So, this one is a _bit_ convoluted. We want to run dbus-broker under sanitizers, but this bears a couple of
+# issues:
+#
+#   1) We need to restart dbus-broker (and hence the machine we're currently running on)
+#   2) If dbus-broker crashes due to ASan/UBSan error, the whole machine is hosed
+#
+# To make the test a bit more robust without too much effort, let's use systemd-nspawn to run an ephemeral
+# container on top of the current rootfs. To get the "sanitized" dbus-broker into that container, we need to
+# prepare a special rootfs with just the sanitized dbus-broker (and a couple of other things) which we then
+# simply overlay on top of the ephemeral rootfs in the container.
+#
+# This way, we'll do a full user-space boot with a sanitized dbus-broker without affecting the host machine,
+# and without having to build a custom container/VM just for the test.
+CONTAINER="dbus-broker-sanitizers-$RANDOM"
+CONTAINER_ID="$(systemd-id128 new)"
+CONTAINER_OVERLAY="/var/lib/machines/$CONTAINER"
+
+# Prepare the nspawn container service
+mkdir -p "/var/lib/machines/$CONTAINER"
+# Notes:
+#   - with systemd v256+ this can be replaced by systemctl edit --stdin --runtime ..., and the
+#     mkdir/daemon-reload can be dropped
+#   - systemd-nspawn can't overlay the whole rootfs (/), so we need to cherry-pick a couple of subdirectories
+#     we're interested in (in this case it's pretty simple, since dbus-broker installs everything under /usr,
+#     and we need /etc with our dbus-broker.service override)
+#   - since the whole container is ephemeral, use --link-journal=host, so the journal directory for the
+#     container is created on the _host_ under /var/log/journal/<machine-id> and bind-mounted into the
+#     container; that way we can fetch the container journal for debugging even if something goes horribly
+#     wrong
+mkdir -p "/run/systemd/system/systemd-nspawn@$CONTAINER.service.d"
+cat >"/run/systemd/system/systemd-nspawn@$CONTAINER.service.d/override.conf" <<EOF
+[Service]
+# We'll handle the coredumps on the host instead
+CoredumpReceive=no
+ExecStart=
+ExecStart=systemd-nspawn --quiet --private-network --keep-unit --machine=%i --boot \
+                         --link-journal=host \
+                         --volatile=yes \
+                         --directory=/ \
+                         --uuid=$CONTAINER_ID \
+                         --hostname=$CONTAINER \
+                         --overlay=/etc:$CONTAINER_OVERLAY/etc:/etc \
+                         --overlay-ro=/usr:$CONTAINER_OVERLAY/usr:/usr
+EOF
+systemctl daemon-reload
+
+# Prepare the nspawn container overlay
+#
+# Install sanitized dbus-broker into the overlay
+DESTDIR="$CONTAINER_OVERLAY" ninja -C "$BUILD_DIR" install
+# Let systemd-nspawn propagate the machine ID we passed it via --uuid=
+mkdir "$CONTAINER_OVERLAY/etc"
+: >"$CONTAINER_OVERLAY/etc/machine-id"
+# Pass $ASAN_OPTIONS and $UBSAN_OPTIONS to the dbus-broker service in the container
+mkdir -p "$CONTAINER_OVERLAY/etc/systemd/system/dbus-broker.service.d/"
+cat >"$CONTAINER_OVERLAY/etc/systemd/system/dbus-broker.service.d/sanitizer-env.conf" <<EOF
+[Service]
+Environment=ASAN_OPTIONS=$ASAN_OPTIONS
+Environment=UBSAN_OPTIONS=$UBSAN_OPTIONS
+# Useful for debugging LSan errors, but it's very verbose, hence disabled by default
+#Environment=LSAN_OPTIONS=verbosity=1:log_threads=1
+EOF
+# Create a non-root user as well, so we can test the session bus stuff
+mkdir -p "$CONTAINER_OVERLAY/etc/sysusers.d/"
+cat >"$CONTAINER_OVERLAY/etc/sysusers.d/testuser.conf" <<EOF
+u testuser - "Test User" /home/testuser
+EOF
+# Run both dbus-broker-launch and dbus-broker under root instead of the usual "dbus" user. This is necessary
+# to let sanitizers generate stack traces (killing the process on sanitizer error works even without this
+# tweak, but it's very hard to then tell what went wrong without a stack trace).
+mkdir -p "$CONTAINER_OVERLAY/etc/dbus-1/"
+cat >"$CONTAINER_OVERLAY/etc/dbus-1/system-local.conf" <<EOF
+<!DOCTYPE busconfig PUBLIC "-//freedesktop//DTD D-BUS Bus Configuration 1.0//EN"
+ "http://www.freedesktop.org/standards/dbus/1.0/busconfig.dtd">
+<busconfig>
+    <user>root</user>
+</busconfig>
+EOF
+
+# Wrap the long-ish systemd-run cmdline in something a bit shorter
+CONTAINER_RUN=(systemd-run -M "$CONTAINER" --wait --pipe)
+CONTAINER_RUN_UNPRIV=(systemd-run -M "testuser@$CONTAINER" --user --wait --pipe)
+
+check_journal_for_sanitizer_errors() {
+    if journalctl -q -D "/var/log/journal/${CONTAINER_ID:?}" --grep "SUMMARY:.+Sanitizer"; then
+        # Dump all messages recorded for the dbus-broker.service, as that's usually where the stack trace ends
+        # up. If that's not the case, the full container journal is exported on test exit anyway, so we'll
+        # still have everything we need to debug the failure further.
+        journalctl -q -D "/var/log/journal/${CONTAINER_ID:?}" -o short-monotonic --no-hostname -u dbus-broker.service --no-pager
+        exit 1
+    fi
+}
+
+run_and_check() {
+    local container_run=("${CONTAINER_RUN[@]}")
+
+    if [[ "$1" == "--unpriv" ]]; then
+        container_run=("${CONTAINER_RUN_UNPRIV[@]}")
+        shift
+    fi
+
+    # Run the passed command in the container
+    "${container_run[@]}" "$@"
+    # Check if dbus-broker is still running...
+    "${container_run[@]}" systemctl status --full --no-pager dbus-broker.service
+    # ... and if it didn't generate any sanitizer errors
+    check_journal_for_sanitizer_errors
+}
+
+# Start the container and wait until it's fully booted up
+machinectl start "$CONTAINER"
+timeout 30s bash -ec "until ${CONTAINER_RUN[*]} true; do sleep .5; done"
+# is-system-running returns > 0 if the system is running in degraded mode, but we don't care about that, we
+# just need to wait until the bootup is finished
+"${CONTAINER_RUN[@]}" systemctl is-system-running -q --wait || :
+"${CONTAINER_RUN[@]}" systemctl status --full --no-pager dbus-broker.service
+# Check if dbus-broker runs under root, see above for reasoning
+"${CONTAINER_RUN[@]}" bash -xec '[[ $(stat --format=%u /proc/$(systemctl show -P MainPID dbus-broker.service)) -eq 0 ]]'
+# Make _extra_ sure we're running the sanitized dbus-broker with the correct environment
+"${CONTAINER_RUN[@]}" bash -xec 'ldd /proc/$(systemctl show -P MainPID dbus-broker.service)/exe | grep -qF libasan.so'
+"${CONTAINER_RUN[@]}" bash -xec 'ldd $(command -v dbus-broker-launch) | grep -qF libasan.so'
+"${CONTAINER_RUN[@]}" bash -xec 'ldd $(command -v dbus-broker) | grep -qF libasan.so'
+"${CONTAINER_RUN[@]}" systemctl show -p Environment dbus-broker.service | grep -q ASAN_OPTIONS
+journalctl -D "/var/log/journal/${CONTAINER_ID:?}" -e -n 10 --no-pager
+check_journal_for_sanitizer_errors
+
+# Now we should have a container ready for our shenanigans
+
+# Let's start with something simple and run dfuzzer on the org.freedesktop.DBus bus
+run_and_check dfuzzer -v -n org.freedesktop.DBus
+# Now run the dfuzzer on the org.freedesktop.systemd1 as well, since it's pretty rich when it comes to
+# signature variations
+run_and_check --unpriv dfuzzer -n org.freedesktop.systemd1
+
+# Shut down the container and check for any sanitizer errors, since some of the errors can be detected only
+# after we start shutting things down.
+#
+# Note: machinectl poweroff doesn't wait until the container shuts down completely, so stop the service
+#       behind it instead, which does wait
+systemctl stop "systemd-nspawn@$CONTAINER.service"
+check_journal_for_sanitizer_errors
+# Also, check if dbus-broker didn't fail during the lifetime of the container
+(! journalctl -q -D "/var/log/journal/$CONTAINER_ID" _PID=1 --grep "dbus-broker.service.*Failed with result")
+
+exit 0
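The journal check in test.sh boils down to matching the sanitizer summary line. A minimal offline sketch of the same match, assuming the journal was exported to a plain-text file (as done for container-journal.log); the helper name `has_sanitizer_summary` and the sample log lines are made up for illustration:

```shell
#!/bin/bash
# Hypothetical helper (not part of this commit): return 0 if the given
# plain-text log contains a sanitizer summary line, using the same regular
# expression test.sh passes to journalctl --grep.
has_sanitizer_summary() {
    grep -Eq 'SUMMARY:.+Sanitizer' "$1"
}

# Made-up sample log, only to exercise the helper
log="$(mktemp)"
printf '%s\n' \
    'dbus-broker[42]: ==42==ERROR: AddressSanitizer: heap-use-after-free' \
    'dbus-broker[42]: SUMMARY: AddressSanitizer: heap-use-after-free' >"$log"

if has_sanitizer_summary "$log"; then
    echo "sanitizer error found"
fi
rm -f "$log"
```

The pattern covers both ASan and UBSan reports, since each ends with a `SUMMARY: <Name>Sanitizer: ...` line.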
diff --git a/test/integration/util.sh b/test/integration/util.sh
@@ -0,0 +1,79 @@
+# vi: set sw=4 ts=4 et tw=110:
+# shellcheck shell=bash disable=SC2155
+
+__COREDUMPCTL_TS=""
+
+coredumpctl_init() {
+    local ec
+
+    if ! systemctl start systemd-coredump.socket; then
+        echo >&2 "Failed to start systemd-coredump.socket"
+        return 1
+    fi
+
+    # Note: coredumpctl returns 1 when no coredumps are found
+    coredumpctl --since=now >/dev/null && ec=0 || ec=$?
+    if [[ $ec -ne 1 ]]; then
+        echo >&2 "coredumpctl is not in operative state"
+        return 1
+    fi
+
+    # Set the internal coredumpctl timestamp, so we consider coredumps only from now on
+    __COREDUMPCTL_TS="$(date +"%Y-%m-%d %H:%M:%S")"
+
+    return 0
+}
+
+# Attempt to dump info about relevant coredumps using the coredumpctl utility.
+#
+# Returns:
+#   0 when no coredumps were found, 1 otherwise
+coredumpctl_collect() (
+    set +ex
+
+    local args=(-q --no-legend --no-pager)
+    local tempfile="$(mktemp)"
+
+    # Register a cleanup handler
+    # shellcheck disable=SC2064
+    trap "rm -f '$tempfile'" RETURN
+
+    if [[ -n "$__COREDUMPCTL_TS" ]]; then
+        args+=(--since "$__COREDUMPCTL_TS")
+    fi
+
+    if ! coredumpctl "${args[@]}" -F COREDUMP_EXE >"$tempfile"; then
+        echo "No relevant coredumps found"
+        return 0
+    fi
+
+    # For each unique executable path call 'coredumpctl info' to get the stack trace and other useful info
+    while read -r path; do
+        local exe
+        local gdb_cmd="set print pretty on\nbt full"
+
+        coredumpctl "${args[@]}" info "$path"
+        # Make sure we use the built binaries for getting gdb trace
+        #
+        # This is relevant mainly for the sanitizers run, where we don't install the just built revision, so
+        # `coredumpctl debug` pulls in a local binary instead of the built one, which produces useless
+        # results.
+        if [[ -v BUILD_DIR && -d $BUILD_DIR ]]; then
+            # The build directory layout of dbus-broker is not flat, so we need to find the binary first
+            exe="$(find "$BUILD_DIR" -executable -name "${path##*/}" | head -n1)"
+            if [[ -n "$exe" ]]; then
+                gdb_cmd="file $exe\nthread apply all bt\n$gdb_cmd"
+            fi
+        fi
+
+        # Attempt to get a full stack trace for the first occurrence of the given executable path
+        if gdb -v >/dev/null; then
+            echo -e "\n"
+            echo "Trying to run gdb with '$gdb_cmd' for '$path'"
+            echo -e "$gdb_cmd" | coredumpctl "${args[@]}" debug "$path"
+            echo -e "\n"
+        fi
+    done < <(sort -u "$tempfile")
+
+    return 1
+)
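coredumpctl_init in util.sh relies on an `&& ec=0 || ec=$?` list to read an exit status without tripping `set -e` (test.sh enables it before sourcing util.sh). A minimal sketch of that idiom, where `no_results` is a hypothetical stand-in for a command such as coredumpctl that uses exit status 1 to mean "nothing found":

```shell
#!/bin/bash
set -e

# Hypothetical stand-in for a command with a meaningful non-zero exit status
no_results() {
    return 1
}

# A bare `no_results` would abort the script under `set -e`. As part of an
# && / || list the failure is tolerated and the status ends up in $ec instead.
no_results && ec=0 || ec=$?
echo "exit status: $ec"
```

This lets the caller tell "no coredumps" (1) apart from success (0) and other failures, which is what the `[[ $ec -ne 1 ]]` check depends on.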