
[BUG] Latest image crashes on startup #248

Closed
tommyalatalo opened this issue May 13, 2022 · 30 comments
Labels
bug Something isn't working

Comments

@tommyalatalo

I just switched over to the docker images ghcr.io/analogj/scrutiny:master-web and ghcr.io/analogj/scrutiny:master-collector since the docker hub ones have been taken down. Now my web instance is crashing on startup with this error message:

goroutine 1 [running]:
github.com/analogj/scrutiny/webapp/backend/pkg/web/middleware.RepositoryMiddleware(0x129f920, 0xc00038a070, 0x12a4b00, 0xc0003faa80, 0x129f9a0)
	/go/src/github.com/analogj/scrutiny/webapp/backend/pkg/web/middleware/repository.go:14 +0xe6
github.com/analogj/scrutiny/webapp/backend/pkg/web.(*AppEngine).Setup(0xc000385610, 0x12a4b00, 0xc0003faa80, 0x1)
	/go/src/github.com/analogj/scrutiny/webapp/backend/pkg/web/server.go:26 +0xd8
github.com/analogj/scrutiny/webapp/backend/pkg/web.(*AppEngine).Start(0xc000385610, 0x0, 0x0)
	/go/src/github.com/analogj/scrutiny/webapp/backend/pkg/web/server.go:97 +0x234
main.main.func2(0xc000387340, 0x4, 0x6)
	/go/src/github.com/analogj/scrutiny/webapp/backend/cmd/scrutiny/scrutiny.go:112 +0x198
github.com/urfave/cli/v2.(*Command).Run(0xc0003ef200, 0xc0003871c0, 0x0, 0x0)
	/go/pkg/mod/github.com/urfave/cli/v2@v2.2.0/command.go:164 +0x4e0
github.com/urfave/cli/v2.(*App).RunContext(0xc0003fe000, 0x128e820, 0xc0000c8010, 0xc0000be020, 0x2, 0x2, 0x0, 0x0)
	/go/pkg/mod/github.com/urfave/cli/v2@v2.2.0/app.go:306 +0x814
github.com/urfave/cli/v2.(*App).Run(...)
	/go/pkg/mod/github.com/urfave/cli/v2@v2.2.0/app.go:215
main.main()
	/go/src/github.com/analogj/scrutiny/webapp/backend/cmd/scrutiny/scrutiny.go:137 +0x65a
2022/05/13 14:38:05 Loading configuration file: /opt/scrutiny/config/scrutiny.yaml
time="2022-05-13T14:38:05Z" level=info msg="Trying to connect to scrutiny sqlite db: \n"
time="2022-05-13T14:38:05Z" level=info msg="Successfully connected to scrutiny sqlite db: \n"
panic: a username and password is required for a setup

There is no mention in the readme or the example configs of a username/password, so what are the credentials that the application is missing and crashing over? Also, this feels like something the application should handle gracefully, presenting an informative error to the user.

@tommyalatalo tommyalatalo added the bug Something isn't working label May 13, 2022
@AnalogJ
Owner

AnalogJ commented May 13, 2022

Hey,
We've seen this before for users migrating from the LSIO image to the official omnibus image.

Can you paste your docker run command or docker-compose file?

@tommyalatalo
Author

Hey, We've seen this before for users migrating from the LSIO image to the official omnibus image.

Can you paste your docker run command or docker-compose file?

I don't use the docker CLI or docker-compose; I'm running it on Nomad, but here is the file anyway:

variable "web_image" {
  type    = string
  default = "ghcr.io/analogj/scrutiny:master-web"
}

variable "collector_image" {
  type    = string
  default = "ghcr.io/analogj/scrutiny:master-collector"
}

job "scrutiny" {
  datacenters = ["main"]
  type        = "service"

  vault {
    policies    = ["scrutiny"]
    change_mode = "restart"
  }

  group "api" {
    constraint {
      attribute = "${node.unique.name}"
      value     = "nas"
    }

    network {
      mode = "bridge"
      port "http" {
        to = 8080
      }
    }

    task "scrutiny" {
      driver = "docker"


      config {
        image   = var.web_image
        cap_add = ["sys_admin", "sys_rawio"]
        ports   = ["http"]

        volumes = [
          "/run/udev:/run/udev:ro",
          "secrets/scrutiny.yaml:/opt/scrutiny/config/scrutiny.yaml",
          "/zpool/services/scrutiny/scrutiny.db:/scrutiny/config/scrutiny.db",
        ]
      }

      service {
        name = "${NOMAD_TASK_NAME}"
        port = "http"

        tags = [
          "api",
          "http",
          "traefik.enable=true",
          "traefik.http.routers.https-${NOMAD_TASK_NAME}.entrypoints=https",
          "traefik.http.routers.https-${NOMAD_TASK_NAME}.rule=Host(`${NOMAD_TASK_NAME}.tox.sh`)",
          "traefik.http.routers.https-${NOMAD_TASK_NAME}.middlewares=chain-internal-no-auth@file",
          "traefik.http.routers.https-${NOMAD_TASK_NAME}.tls=true",
        ]

        check {
          port     = "http"
          name     = "${NOMAD_TASK_NAME} health"
          type     = "http"
          path     = "/api/health"
          interval = "30s"
          timeout  = "1s"
        }
      }

      env {
        GIN_MODE              = "release"
        SCRUTINY_API_ENDPOINT = "http://localhost:8080"
        SCRUTINY_COLLECTOR    = "false"
        SCRUTINY_WEB          = "true"
      }


      template {
        destination = "secrets/scrutiny.yaml"
        change_mode = "restart"
        data        = file("./config/scrutiny/scrutiny.yaml")
      }
    }
  }

  group "collector-arch" {
    constraint {
      attribute = "${node.unique.name}"
      value     = "arch"
    }

    network {
      mode = "bridge"
    }

    task "wait-for-api" {
      driver = "docker"

      lifecycle {
        hook    = "prestart"
        sidecar = false
      }

      config {
        image   = "praqma/network-multitool:alpine-extra"
        command = "/bin/bash"

        args = [
          "-c",
          "while ! dig +short api.scrutiny.service.consul srv | grep -ve '^$'; do sleep 1; done",
        ]
      }
    }

    task "collector" {
      driver = "docker"

      config {
        image   = var.collector_image
        cap_add = ["sys_admin", "sys_rawio"]

        volumes = [
          "/run/udev:/run/udev:ro",
          "secrets/collector.yaml:/opt/scrutiny/config/collector.yaml",
        ]

        devices = [
          {
            host_path      = "/dev/sda"
            container_path = "/dev/sda"
          },
          {
            host_path      = "/dev/sdb"
            container_path = "/dev/sdb"
          },
          {
            host_path      = "/dev/sdc"
            container_path = "/dev/sdc"
          },
          {
            host_path      = "/dev/sdd"
            container_path = "/dev/sdd"
          },
          {
            host_path      = "/dev/sde"
            container_path = "/dev/sde"
          },
          {
            host_path      = "/dev/sdf"
            container_path = "/dev/sdf"
          },
        ]
      }

      template {
        destination = "secrets/collector.yaml"
        change_mode = "restart"
        data        = file("./config/scrutiny/collector.yaml")
      }
    }
  }

  group "collector-backup" {
    constraint {
      attribute = "${node.unique.name}"
      value     = "backup"
    }

    network {
      mode = "bridge"
    }

    task "wait-for-api" {
      driver = "docker"

      lifecycle {
        hook    = "prestart"
        sidecar = false
      }

      config {
        image   = "praqma/network-multitool:alpine-extra"
        command = "/bin/bash"

        args = [
          "-c",
          "while ! dig +short api.scrutiny.service.consul srv | grep -ve '^$'; do sleep 1; done",
        ]
      }
    }

    task "collector" {
      driver = "docker"

      config {
        image   = var.collector_image
        cap_add = ["sys_admin", "sys_rawio"]

        volumes = [
          "/run/udev:/run/udev:ro",
          "secrets/collector.yaml:/opt/scrutiny/config/collector.yaml",
        ]

        devices = [
          {
            host_path      = "/dev/nvme0"
            container_path = "/dev/nvme0"
          },
          {
            host_path      = "/dev/sda"
            container_path = "/dev/sda"
          },
          {
            host_path      = "/dev/sdb"
            container_path = "/dev/sdb"
          },
        ]
      }

      template {
        destination = "secrets/collector.yaml"
        change_mode = "restart"
        data        = file("./config/scrutiny/collector.yaml")
      }
    }
  }
  group "collector-nas" {
    constraint {
      attribute = "${node.unique.name}"
      value     = "nas"
    }

    network {
      mode = "bridge"
    }

    task "wait-for-api" {
      driver = "docker"

      lifecycle {
        hook    = "prestart"
        sidecar = false
      }

      config {
        image   = "praqma/network-multitool:alpine-extra"
        command = "/bin/bash"

        args = [
          "-c",
          "while ! dig +short api.scrutiny.service.consul srv | grep -ve '^$'; do sleep 1; done",
        ]
      }
    }

    task "collector" {
      driver = "docker"

      config {
        image   = var.collector_image
        cap_add = ["sys_admin", "sys_rawio"]

        volumes = [
          "/run/udev:/run/udev:ro",
          "secrets/collector.yaml:/opt/scrutiny/config/collector.yaml",
        ]

        devices = [
          {
            host_path      = "/dev/nvme0"
            container_path = "/dev/nvme0"
          },
          {
            host_path      = "/dev/sda"
            container_path = "/dev/sda"
          },
          {
            host_path      = "/dev/sdb"
            container_path = "/dev/sdb"
          },
          {
            host_path      = "/dev/sdc"
            container_path = "/dev/sdc"
          },
          {
            host_path      = "/dev/sdd"
            container_path = "/dev/sdd"
          },
          {
            host_path      = "/dev/sde"
            container_path = "/dev/sde"
          },
          {
            host_path      = "/dev/sdf"
            container_path = "/dev/sdf"
          },
        ]
      }

      template {
        destination = "secrets/collector.yaml"
        change_mode = "restart"
        data        = file("./config/scrutiny/collector.yaml")
      }
    }
  }
}

@evulhotdog

@AnalogJ here's mine as well.

---
version: "2.1"
services:
  scrutiny:
    image: linuxserver/scrutiny
    container_name: scrutiny
    privileged: true
    environment:
      - PUID=1000
      - PGID=1000
      - TZ=America/New_York
      - SCRUTINY_API_ENDPOINT=http://localhost:8080
      - SCRUTINY_WEB=true
      - SCRUTINY_COLLECTOR=true
    volumes:
      - ./config:/config
      - /dev/sda:/dev/sda
      - /dev/sdb:/dev/sdb
      - /dev/sdc:/dev/sdc
      - /run/udev:/run/udev:ro
    ports:
      - 8081:8080
    restart: unless-stopped
    networks:
      - internet-facing

networks:
  internet-facing:
    external:
      name: internet-facing

@AnalogJ
Owner

AnalogJ commented May 13, 2022

@tommyalatalo you'll need to remove the SCRUTINY_WEB=true and SCRUTINY_COLLECTOR=true environment variables. They were used by the LSIO image, but cause issues with Scrutiny for some reason.

@evulhotdog you'll need to remove the same environment variables I mentioned to @tommyalatalo , but you'll also need to update your image to ghcr.io/analogj/scrutiny:master-omnibus. The LSIO image is missing a new dependency that we introduced in v0.4.0+ (InfluxDB), and that causes issues. You can revert to an earlier version of the LSIO image (lscr.io/linuxserver/scrutiny:060ac7b8-ls34), or just change to the official Scrutiny image (ghcr.io/analogj/scrutiny:master-omnibus).
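For reference, a minimal sketch of the adjusted compose service, assuming the omnibus image's default paths (/opt/scrutiny/config for the config and sqlite db, /opt/scrutiny/influxdb for the embedded InfluxDB data, both discussed further down in this thread); the device mappings, timezone, and ports are placeholders to adapt to your host:

version: "2.1"
services:
  scrutiny:
    # official omnibus image: web UI + collector + embedded InfluxDB
    image: ghcr.io/analogj/scrutiny:master-omnibus
    container_name: scrutiny
    cap_add:
      - SYS_RAWIO
      - SYS_ADMIN # reportedly only needed for NVMe drives
    environment:
      - TZ=America/New_York
      # note: no SCRUTINY_WEB / SCRUTINY_COLLECTOR variables here
    volumes:
      - ./config:/opt/scrutiny/config       # scrutiny.yaml + scrutiny.db
      - ./influxdb:/opt/scrutiny/influxdb   # persists InfluxDB onboarding state
      - /run/udev:/run/udev:ro
    devices:
      - /dev/sda:/dev/sda
      - /dev/sdb:/dev/sdb
    ports:
      - 8081:8080
    restart: unless-stopped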

I'm going to close this issue for now; feel free to comment/reopen it if you run into any other issues.

@AnalogJ AnalogJ closed this as completed May 13, 2022
@tommyalatalo
Author

Removing SCRUTINY_WEB and SCRUTINY_COLLECTOR doesn't fix the problem; it now seems like the scrutiny app in the api container can't connect to the internal influxdb instance:

goroutine 1 [running]:
github.com/analogj/scrutiny/webapp/backend/pkg/web/middleware.RepositoryMiddleware(0x129f920, 0xc0000b0098, 0x12a4b00, 0xc000414bd0, 0x129f9a0)
	/go/src/github.com/analogj/scrutiny/webapp/backend/pkg/web/middleware/repository.go:14 +0xe6
github.com/analogj/scrutiny/webapp/backend/pkg/web.(*AppEngine).Setup(0xc0000ad6c0, 0x12a4b00, 0xc000414bd0, 0x10)
	/go/src/github.com/analogj/scrutiny/webapp/backend/pkg/web/server.go:26 +0xd8
github.com/analogj/scrutiny/webapp/backend/pkg/web.(*AppEngine).Start(0xc0000ad6c0, 0x0, 0x0)
	/go/src/github.com/analogj/scrutiny/webapp/backend/pkg/web/server.go:97 +0x234
main.main.func2(0xc0000b7340, 0x4, 0x6)
	/go/src/github.com/analogj/scrutiny/webapp/backend/cmd/scrutiny/scrutiny.go:112 +0x198
github.com/urfave/cli/v2.(*Command).Run(0xc0004170e0, 0xc0000b71c0, 0x0, 0x0)
	/go/pkg/mod/github.com/urfave/cli/v2@v2.2.0/command.go:164 +0x4e0
github.com/urfave/cli/v2.(*App).RunContext(0xc00009a300, 0x128e820, 0xc000038038, 0xc00000e080, 0x2, 0x2, 0x0, 0x0)
	/go/pkg/mod/github.com/urfave/cli/v2@v2.2.0/app.go:306 +0x814
github.com/urfave/cli/v2.(*App).Run(...)
	/go/pkg/mod/github.com/urfave/cli/v2@v2.2.0/app.go:215
main.main()
	/go/src/github.com/analogj/scrutiny/webapp/backend/cmd/scrutiny/scrutiny.go:137 +0x65a
2022/05/14 22:23:57 Loading configuration file: /opt/scrutiny/config/scrutiny.yaml
time="2022-05-14T22:23:57Z" level=info msg="Trying to connect to scrutiny sqlite db: /scrutiny/config/scrutiny.db\n"
time="2022-05-14T22:23:57Z" level=info msg="Successfully connected to scrutiny sqlite db: /scrutiny/config/scrutiny.db\n"
panic: Post "http://0.0.0.0:8086/api/v2/setup": dial tcp 0.0.0.0:8086: connect: connection refused

@tommyalatalo
Author

tommyalatalo commented May 14, 2022

So now I switched to the omnibus image on my api node, but it also fails...
The container just loops with the message below over and over, even though I can see that influxdb is running and listening on 8086 inside, and I've set SCRUTINY_WEB_INFLUXDB_HOST="http://localhost:8086".

 ___   ___  ____  __  __  ____  ____  _  _  _  _
/ __) / __)(  _ \(  )(  )(_  _)(_  _)( \( )( \/ )
\__ \( (__  )   / )(__)(   )(   _)(_  )  (  \  /
(___/ \___)(_)\_)(______) (__) (____)(_)\_) (__)
github.com/AnalogJ/scrutiny                             dev-0.4.4

Start the scrutiny server
waiting for influxdb
starting scrutiny

@AnalogJ
Owner

AnalogJ commented May 15, 2022

Hey @tommyalatalo can you unset the SCRUTINY_WEB_INFLUXDB_HOST variable? It's unnecessary, as the defaults should work: SCRUTINY_WEB_INFLUXDB_HOST=0.0.0.0

@AnalogJ
Owner

AnalogJ commented May 15, 2022

Actually, can you try setting it to SCRUTINY_WEB_INFLUXDB_HOST=localhost if unsetting the env var doesn't work.

@AnalogJ AnalogJ reopened this May 15, 2022
@raulfg3

raulfg3 commented May 15, 2022

Same problem here with the latest Docker image:

 ___   ___  ____  __  __  ____  ____  _  _  _  _
/ __) / __)(  _ \(  )(  )(_  _)(_  _)( \( )( \/ )
\__ \( (__  )   / )(__)(   )(   _)(_  )  (  \  /
(___/ \___)(_)\_)(______) (__) (____)(_)\_) (__)
github.com/AnalogJ/scrutiny                             dev-0.4.4

Start the scrutiny server
[GIN-debug] [WARNING] Running in "debug" mode. Switch to "release" mode in production.
 - using env:	export GIN_MODE=release
 - using code:	gin.SetMode(gin.ReleaseMode)
2022/05/14 17:25:48 No configuration file found at /opt/scrutiny/config/scrutiny.yaml. Using Defaults.
time="2022-05-14T17:25:48+02:00" level=info msg="Trying to connect to scrutiny sqlite db: \n"
time="2022-05-14T17:25:48+02:00" level=info msg="Successfully connected to scrutiny sqlite db: \n"
panic: a username and password is required for a setup

goroutine 1 [running]:
github.com/analogj/scrutiny/webapp/backend/pkg/web/middleware.RepositoryMiddleware({0xfeaac0, 0xc0000c4cd8}, {0xff3d50, 0xc000373f10})
	/app/scrutiny/webapp/backend/pkg/web/middleware/repository.go:14 +0xa5
github.com/analogj/scrutiny/webapp/backend/pkg/web.(*AppEngine).Setup(0xc000370610, {0xff3d50, 0xc000373f10})
	/app/scrutiny/webapp/backend/pkg/web/server.go:26 +0xb4
github.com/analogj/scrutiny/webapp/backend/pkg/web.(*AppEngine).Start(0xc000370610)
	/app/scrutiny/webapp/backend/pkg/web/server.go:97 +0x3ab
main.main.func2(0xc00034fb80)
	/app/scrutiny/webapp/backend/cmd/scrutiny/scrutiny.go:112 +0x1f7
github.com/urfave/cli/v2.(*Command).Run(0xc000359e60, 0xc00034fa00)
	/root/go/pkg/mod/github.com/urfave/cli/v2@v2.2.0/command.go:164 +0x64a
github.com/urfave/cli/v2.(*App).RunContext(0xc000240480, {0xfd4dd0, 0xc0000cc000}, {0xc0000c8000, 0x2, 0x2})
	/root/go/pkg/mod/github.com/urfave/cli/v2@v2.2.0/app.go:306 +0x926
github.com/urfave/cli/v2.(*App).Run(...)
	/root/go/pkg/mod/github.com/urfave/cli/v2@v2.2.0/app.go:215
main.main()
	/app/scrutiny/webapp/backend/cmd/scrutiny/scrutiny.go:137 +0x679

It worked fine until the latest Docker update.

@raulfg3

raulfg3 commented May 15, 2022

version: "2.1"
networks:
  default:
    external:
      name: my-net
services:
  scrutiny:
    image: ghcr.io/linuxserver/scrutiny
    container_name: scrutiny
    cap_add:
      - SYS_RAWIO
      - SYS_ADMIN #optional
    environment:
      - PUID=1001
      - PGID=1000
      - TZ=Europe/Madrid
      - SCRUTINY_API_ENDPOINT=http://localhost:8080
      - SCRUTINY_WEB=true
      - SCRUTINY_COLLECTOR=true
    volumes:
      - /srv/dev-disk-by-uuid-6f38f974-7aec-452a-815d-9101878af2e1/Data/dockers/scrutiny:/config
      - /run/udev:/run/udev:ro
    ports:
      - 89:8080
    devices:
      - /dev/sda:/dev/sda
      - /dev/sdb:/dev/sdb
      - /dev/sdc:/dev/sdc
      - /dev/sdd:/dev/sdd
      - /dev/sde:/dev/sde
      - /dev/sdf:/dev/sdf
    restart: unless-stopped

@raulfg3

raulfg3 commented May 15, 2022

I switched to the latest AnalogJ image and it works fine; my new YAML file is:

version: "3.5"
networks:
  default:
    external:
      name: my-net
services:
  scrutiny:
    container_name: scrutiny
    image: ghcr.io/analogj/scrutiny:master-omnibus
    cap_add:
      - SYS_RAWIO
    volumes:
      - /srv/dev-disk-by-uuid-6f38f974-7aec-452a-815d-9101878af2e1/Data/dockers/scrutiny:/opt/scrutiny/config
      - /srv/dev-disk-by-uuid-6f38f974-7aec-452a-815d-9101878af2e1/Data/dockers/scrutiny/influxdb:/opt/scrutiny/influxdb
      - /run/udev:/run/udev:ro
    ports:
      - 89:8080 # webapp
      - 86:8086 # influxDB admin
    devices:
      - /dev/sda:/dev/sda
      - /dev/sdb:/dev/sdb
      - /dev/sdc:/dev/sdc
      - /dev/sdd:/dev/sdd
      - /dev/sde:/dev/sde
      - /dev/sdf:/dev/sdf
    restart: unless-stopped

@tommyalatalo
Author

Actually, can you try setting it to SCRUTINY_WEB_INFLUXDB_HOST=localhost if unsetting the env var doesn't work.

I've tried both unsetting it and setting it to localhost; now the error I'm getting is this:

 ___   ___  ____  __  __  ____  ____  _  _  _  _
/ __) / __)(  _ \(  )(  )(_  _)(_  _)( \( )( \/ )
\__ \( (__  )   / )(__)(   )(   _)(_  )  (  \  /
(___/ \___)(_)\_)(______) (__) (____)(_)\_) (__)
github.com/AnalogJ/scrutiny                             dev-0.4.5

Start the scrutiny server
ts=2022-05-15T11:13:36.619100Z lvl=error msg="failed to onboard user admin" log_id=0aTrat3G000 handler=onboard error="onboarding has already been completed" took=0.057ms
ts=2022-05-15T11:13:36.619120Z lvl=error msg="api error encountered" log_id=0aTrat3G000 error="onboarding has already been completed" 

@AnalogJ
Owner

AnalogJ commented May 15, 2022

Hey @tommyalatalo
Thanks for confirming that the INFLUX_HOST should be localhost; I'll fix that up in the defaults.

That new error related to onboarding is because the influxdb data directory is not currently mounted/persisted outside the container.

If you're using the official omnibus image, you can add a volume mount like: ./influxdb:/opt/scrutiny/influxdb.
Also, I noticed that you still have references to /scrutiny in your Nomad config; those should all be renamed to /opt/scrutiny (that's the new consistent path).

Please confirm that these steps fixed the issue and I can close this :)

@tommyalatalo
Author

I don't have any references left to anything outside /opt/scrutiny, and the container still loops with this message:

 ___   ___  ____  __  __  ____  ____  _  _  _  _
/ __) / __)(  _ \(  )(  )(_  _)(_  _)( \( )( \/ )
\__ \( (__  )   / )(__)(   )(   _)(_  )  (  \  /
(___/ \___)(_)\_)(______) (__) (____)(_)\_) (__)
github.com/AnalogJ/scrutiny                             dev-0.4.5

Start the scrutiny server
waiting for influxdb
starting scrutiny

This is the container that should be starting up:

    task "web" {
      driver = "docker"

      config {
        image   = var.web_image
        cap_add = ["sys_admin", "sys_rawio"]
        ports   = ["scrutiny"]

        volumes = [
          "/run/udev:/run/udev:ro",
          "secrets/scrutiny.yaml:/opt/scrutiny/config/scrutiny.yaml",
          "/zpool/services/scrutiny/scrutiny.db:/opt/scrutiny/config/scrutiny.db",
          "/zpool/services/scrutiny/influxdb:/opt/scrutiny/influxdb",
        ]
      }

      service {
        name = "${NOMAD_TASK_NAME}"
        port = "scrutiny"

        tags = [
          "web",
          "http",
          "traefik.enable=true",
          "traefik.http.routers.https-${NOMAD_TASK_NAME}.entrypoints=https",
          "traefik.http.routers.https-${NOMAD_TASK_NAME}.rule=Host(`${NOMAD_TASK_NAME}.tox.sh`)",
          "traefik.http.routers.https-${NOMAD_TASK_NAME}.middlewares=chain-internal-no-auth@file",
          "traefik.http.routers.https-${NOMAD_TASK_NAME}.tls=true",
        ]

        check {
          port     = "scrutiny"
          name     = "${NOMAD_TASK_NAME} health"
          type     = "http"
          path     = "/api/health"
          interval = "30s"
          timeout  = "1s"
        }
      }

      env {
        GIN_MODE              = "release"
        SCRUTINY_WEB_INFLUXDB_HOST = "localhost"
      }

      template {
        destination = "secrets/scrutiny.yaml"
        change_mode = "restart"
        data        = file("./config/scrutiny/scrutiny.yaml")
      }
    }

@AnalogJ
Owner

AnalogJ commented May 17, 2022

@tommyalatalo no other error messages? Can you enable debug mode by setting the DEBUG environment variable to true?

@tommyalatalo
Author

Okay, I found that the database location was wrong in my scrutiny.yaml file; after fixing that, I now get this error:

goroutine 1 [running]:
github.com/analogj/scrutiny/webapp/backend/pkg/web/middleware.RepositoryMiddleware(0x129f920, 0xc00040a068, 0x12a4b00, 0xc000470bd0, 0x129f9a0)
	/go/src/github.com/analogj/scrutiny/webapp/backend/pkg/web/middleware/repository.go:14 +0xe6
github.com/analogj/scrutiny/webapp/backend/pkg/web.(*AppEngine).Setup(0xc000405620, 0x12a4b00, 0xc000470bd0, 0x14)
	/go/src/github.com/analogj/scrutiny/webapp/backend/pkg/web/server.go:26 +0xd8
github.com/analogj/scrutiny/webapp/backend/pkg/web.(*AppEngine).Start(0xc000405620, 0x0, 0x0)
	/go/src/github.com/analogj/scrutiny/webapp/backend/pkg/web/server.go:97 +0x234
main.main.func2(0xc000411240, 0x4, 0x6)
	/go/src/github.com/analogj/scrutiny/webapp/backend/cmd/scrutiny/scrutiny.go:112 +0x198
github.com/urfave/cli/v2.(*Command).Run(0xc0004730e0, 0xc0004110c0, 0x0, 0x0)
	/go/pkg/mod/github.com/urfave/cli/v2@v2.2.0/command.go:164 +0x4e0
github.com/urfave/cli/v2.(*App).RunContext(0xc00047e000, 0x128e820, 0xc000130010, 0xc000126020, 0x2, 0x2, 0x0, 0x0)
	/go/pkg/mod/github.com/urfave/cli/v2@v2.2.0/app.go:306 +0x814
github.com/urfave/cli/v2.(*App).Run(...)
	/go/pkg/mod/github.com/urfave/cli/v2@v2.2.0/app.go:215
main.main()
	/go/src/github.com/analogj/scrutiny/webapp/backend/cmd/scrutiny/scrutiny.go:137 +0x65a
2022/05/17 17:12:13 Loading configuration file: /opt/scrutiny/config/scrutiny.yaml
time="2022-05-17T17:12:13Z" level=info msg="Trying to connect to scrutiny sqlite db: /opt/scrutiny/config/scrutiny.db\n"
time="2022-05-17T17:12:13Z" level=info msg="Successfully connected to scrutiny sqlite db: /opt/scrutiny/config/scrutiny.db\n"
time="2022-05-17T17:12:13Z" level=debug msg="InfluxDB url: http://localhost:8086"
time="2022-05-17T17:12:13Z" level=debug msg="No influxdb token found, running first-time setup..."
panic: conflict: onboarding has already been completed

@AnalogJ
Owner

AnalogJ commented May 17, 2022 via email

@tommyalatalo
Author

Yeah, that does get scrutiny to start, but if I restart the web container it errors out again with the same message.

waiting for influxdb
starting scrutiny

 ___   ___  ____  __  __  ____  ____  _  _  _  _
/ __) / __)(  _ \(  )(  )(_  _)(_  _)( \( )( \/ )
\__ \( (__  )   / )(__)(   )(   _)(_  )  (  \  /
(___/ \___)(_)\_)(______) (__) (____)(_)\_) (__)
github.com/AnalogJ/scrutiny                             dev-0.4.5

Start the scrutiny server
ts=2022-05-17T18:28:01.974547Z lvl=error msg="failed to onboard user admin" log_id=0aWpGhZW000 handler=onboard error="onboarding has already been completed" took=0.053ms
ts=2022-05-17T18:28:01.974573Z lvl=error msg="api error encountered" log_id=0aWpGhZW000 error="onboarding has already been completed"

@AnalogJ
Owner

AnalogJ commented May 17, 2022

That means your scrutiny.yaml config file is not being persisted correctly.

"secrets/scrutiny.yaml:/opt/scrutiny/config/scrutiny.yaml"

Is that secrets/scrutiny.yaml path writable? During setup, Scrutiny will attempt to configure your InfluxDB instance, then store the API token in the config file.

I'm guessing your secrets folder is similar to a Kubernetes secret mount, in which case it's not writable?

In that case, you can authenticate to the InfluxDB web UI, retrieve the API token, and store it in your scrutiny.yaml file.

https://github.com/AnalogJ/scrutiny/blob/master/docs/TROUBLESHOOTING_INFLUXDB.md#first-start
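A minimal sketch of where that token would live in scrutiny.yaml, assuming the web.influxdb keys shown in the project's example config (the thread below also refers to this key as web.influx.token, so verify the exact key path against the linked troubleshooting doc for your version):

web:
  influxdb:
    host: localhost
    port: 8086
    # paste the API token retrieved from the InfluxDB UI here (placeholder value)
    token: 'REPLACE_WITH_INFLUXDB_API_TOKEN'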

@tommyalatalo
Author

You're right, secrets/scrutiny.yaml is not persisted; it's held in memory because it's templated with my Pushover token from my secrets storage. I think I saw the document you linked before. A few questions and comments arise from this behavior:

  • Writing a value back to a user's config file is bad practice; you should never edit a config file that a user has deployed in order to run a service.
  • Why is the API token needed at all when I'm using the omnibus image? I figure scrutiny could just as well use the admin credentials it starts out with, since the influxdb database is embedded in the same container and essentially dedicated to scrutiny. This would eliminate the complexity of having to fetch the API token.
  • Why does scrutiny need a database in general? From my use case I'm only interested in monitoring my disks and getting notified if errors are detected, and for that purpose I don't see the need for scrutiny to be a stateful service. Is scrutiny storing anything vital in influxdb in addition to the temperature history, which I assume is stored there?

@AnalogJ
Owner

AnalogJ commented May 17, 2022

  • I mostly agree that updating the config file isn't ideal, but an "uncommon practice" != "bad practice". This is only necessary because the InfluxDB SDK does not support username/password auth, only API token authentication: https://github.com/influxdata/influxdb-client-go/blob/ab68e236009fd2a1b12edbd0a328f2103c4053d7/client.go#L95
    • Basically, I'm in a weird position where I need to configure a new InfluxDB instance (which adds buckets, admin users and api tokens), and I need a way to persist this auth data so scrutiny will continue to work if/when the container is restarted.
    • If you bring your own influxdb instance (in hub/spoke deployments), then you can pre-populate the token in your config file.
    • My thought process is that the config file/directory is intended to store scrutiny-specific data, so it makes sense to persist it there. If there's a lot of concern, I guess I could write the auth token to a different file in the config directory.
  • See above.
  • One of Scrutiny's primary features is S.M.A.R.T metric tracking for historical trends. It's intended to be used as a way to determine if your SMART data is changing over time, which is obviously stateful. If you only want visibility into the SMART output at a point in time, you could use smartctl directly.

Hope that answers all your questions?

@tommyalatalo
Author

tommyalatalo commented May 18, 2022

Thanks for answering all the questions, I appreciate it.

I definitely agree that uncommon practice is not equal to bad practice, though in this case I'm pretty confident that it's also the latter. I work in devops/sysops and have deployed hundreds of applications, and I can honestly say that not a single one so far has ever overwritten its own config file on startup. There are two main reasons for this, and both are present in my issue above.

The first is that the application cannot assume the config file is writable, and shouldn't expect it to be, since it's a user-supplied config that can contain credentials like the API token or Pushover token. That means a user like me can keep the file on an in-memory filesystem, like Nomad's secrets folder or a Kubernetes secret, and never persist it to disk so the credentials aren't leaked. The file could also simply be mounted read-only into Docker with :ro, which is an entirely reasonable way to improve container security by preventing a malicious process inside the container from modifying the config.

The second reason is that an application writing parameters back into its config file makes its behavior non-deterministic. You can see this in my case: I first start up scrutiny with no database and it works, and then a restart after the database has been initialized fails instead, while I'm still using the same config. The application's behavior changes, but from a user's standpoint the config is exactly the same, so the results should be expected to be the same as well.

Since I called it bad practice, I should of course give an example of what is considered good practice. One of the best examples I know of is the 12-factor app methodology, where configuration should live primarily in env vars: https://12factor.net/config. Considering this, you quickly realize that if all the config were set as env vars there would be nowhere to write the API token back to in order to persist it, which is a strong indication that the current API token handling is suboptimal.

But that's all fine; scrutiny is a work in progress, and I hope you appreciate that this is a tangent on what is, as a whole, a great project that's very useful to me as well. So let's discuss possibilities for handling the API token differently. I've built an application myself that uses influxdb heavily, but that was v1.8, so the SDK is clearly a bit different now that the API token is necessary. I'm currently thinking of two approaches:

  1. It seems that you can set an admin token when influx setup is initially run:
❯ ./influx setup --help
NAME:
    setup - Setup instance with initial user, org, bucket

USAGE:
...
COMMON OPTIONS:
...
OPTIONS:
   --token value, -t value      Auth token to set on the initial user [$INFLUX_TOKEN]
...

The official influxdb docker image does exactly this in its entrypoint: https://github.com/influxdata/influxdata-docker/blob/master/influxdb/2.2/entrypoint.sh#L224-L226

This would be by far the best way to handle this, since you can have the token supplied immediately as an env var (or in the scrutiny.yaml) so that it's used when initializing influxdb, and scrutiny can load the same value to use for its requests. This would make the whole process automated and consistent, with no restarts or tokens created in the UI needed. Just have scrutiny stop with an error if this influxdb admin token isn't set properly in the omnibus image. I actually wonder if this would work without any changes if I set INFLUX_TOKEN in the container and set web.api.token to the same value? I haven't checked the scrutiny source code for how influxdb is initialized. (A sketch of this pattern follows at the end of this comment.)

  2. Halt the program (don't shut down, since scrutiny presumably runs as pid 1) at startup if the database has been initialized and the token isn't set. The error message should be clearer and say that influxdb has been initialized but web.api.token is not set. Option 1 is far better, though.

Well that was a whole lot of discussion, I hope you didn't fall asleep halfway through :D
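A minimal sketch of the pattern described in option 1 above, using the official influxdb image's DOCKER_INFLUXDB_INIT_* variables (these are what the linked entrypoint.sh consumes); the username, password, org, bucket, and token values are placeholders, and whether Scrutiny itself can be handed the same predetermined token depends on the change discussed below:

services:
  influxdb:
    image: influxdb:2.2
    environment:
      - DOCKER_INFLUXDB_INIT_MODE=setup
      - DOCKER_INFLUXDB_INIT_USERNAME=admin
      - DOCKER_INFLUXDB_INIT_PASSWORD=changeme12345
      - DOCKER_INFLUXDB_INIT_ORG=scrutiny
      - DOCKER_INFLUXDB_INIT_BUCKET=metrics
      # predetermined admin token; Scrutiny would be given this same value
      - DOCKER_INFLUXDB_INIT_ADMIN_TOKEN=REPLACE_WITH_PREDETERMINED_TOKEN
    volumes:
      - ./influxdb:/var/lib/influxdb2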

@AnalogJ AnalogJ reopened this May 18, 2022
@AnalogJ
Owner

AnalogJ commented May 18, 2022

I ended up forking the InfluxDB SDK and just adding support for a SetupWithToken() method. This means I can leverage the same functionality as the influx setup CLI command, without needing to pull in the CLI tooling and its dependencies.

The docker image is building right now, but everything is working correctly in my local testing.
You may need to delete your influxdb folder, or retrieve your existing token and store it as web.influx.token in your existing config file.


Appreciate your detailed response above. I had already considered most of your options before going the "write a config file" route, but I thought it would add additional complexity and maintenance burden on my side. Regarding your 12-factor app comment, even with a config file Scrutiny supports overrides using env variables. The Viper config library that we use merges CLI -> Env -> config file values before providing them to the application. Secrets could/should always be provided as env variables.
As Scrutiny is primarily a "read-only" application with little-to-no need for secrets or dynamic configuration, I had thought the trade-off was worth it, but I'm happy to make this change to support this style of deployment.

@tommyalatalo
Author

tommyalatalo commented May 18, 2022

That sounds like a good solution. I thought you already bootstrapped the database with the influx CLI tool, but forking the SDK is probably a good way forward in this case; you should open a PR against the SDK and see if they'll take the code so you don't have to maintain it further.

Ah yes, I've used Viper myself; it's an excellent library, especially when paired with Cobra. And there's nothing wrong with supporting both config files and env vars, that gives a lot of flexibility for templating etc.

I'm looking forward to trying the new image when you have it ready! I'm using scrutiny across 3 hosts, one of which is a remote backup server, all in a hub/spoke setup. Overall I really like being able to monitor all my disks and have Pushover notifications set up so that I can immediately move on replacing a disk if I start to get errors. It's great to be able to do this across a cluster and manage it from one central place, rather than setting it up per host as with smartctl.

@ThisIsTheOnlyUsernameAvailable

Is it possible to use an external influxdb instance? If so, can anyone suggest what the Docker environment variables (or config options) should be?

@AnalogJ
Owner

AnalogJ commented May 20, 2022

@ThisIsTheOnlyUsernameAvailable please take a look at this example docker-compose file if you'd like to use your own influxdb docker container:

https://github.com/AnalogJ/scrutiny/blob/master/docker/example.hubspoke.docker-compose.yml

If you're using a pre-existing InfluxDB instance (already configured with users & buckets), you'll need to specify the following variables in your config file:

https://github.com/AnalogJ/scrutiny/blob/master/example.scrutiny.yaml#L42-L45
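For a pre-existing instance, the relevant scrutiny.yaml section looks roughly like the sketch below; the key names follow the linked example.scrutiny.yaml, and the values are placeholders to swap for your own instance:

web:
  influxdb:
    host: influxdb.example.internal   # hostname of your existing InfluxDB 2.x instance
    port: 8086
    org: my-org
    bucket: my-bucket
    token: 'my-preexisting-api-token'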

@AnalogJ
Owner

AnalogJ commented May 20, 2022

Closing this issue now that v0.4.6 has been released (and uses an updated version of the InfluxDB SDK with predetermined-token support).

If you're using Docker, you'll need to wait until https://github.com/AnalogJ/scrutiny/actions/runs/2359630681 completes to use the fixed image.

Thanks for all your help and feedback @tommyalatalo ! 🎉

@AnalogJ AnalogJ closed this as completed May 20, 2022
@tommyalatalo
Author

tommyalatalo commented May 21, 2022

I looked at the pipeline; are you not tagging the image with the version number in addition to just :master-omnibus?

I force-pulled the new image tagged :master-omnibus, and it works nicely! I would like to lock it to version 0.4.6 now though, so that I don't risk breakage with future updates.

@AnalogJ
Owner

AnalogJ commented May 21, 2022

Not sure what you looked at:

The docker-build.yaml GH Action will automatically build version-locked Docker images and tag them with ghcr.io/analogj/scrutiny:${VERSION}-omnibus

It seems that the v0.4.6 image failed for some reason. I'll take a look at that.

@AnalogJ
Owner

AnalogJ commented May 21, 2022

Looks like the failure was due to a timeout; retrying fixed the issue:

https://github.com/AnalogJ/scrutiny/pkgs/container/scrutiny/22808074?tag=v0.4.6-omnibus
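With that package published, pinning to the release rather than the floating master tag is just a matter of referencing the versioned tag shown above, e.g. in a compose file (the same image string works for a Nomad image variable):

services:
  scrutiny:
    # version-locked tag published by the docker-build workflow
    image: ghcr.io/analogj/scrutiny:v0.4.6-omnibus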
