Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Excessive error log warnings with healthcheck --connect #430

Closed
grooverdan opened this issue May 9, 2022 · 7 comments
Closed

Excessive error log warnings with healthcheck --connect #430

grooverdan opened this issue May 9, 2022 · 7 comments

Comments

@grooverdan
Copy link
Member

grooverdan commented May 9, 2022

From @jerdoe in #94 (comment)

proj_mariadb-1  | 2022-05-08 21:10:02 1453 [Warning] Access denied for user 'root'@'127.0.0.1' (using password: NO)
proj_mariadb-1  | 2022-05-08 21:10:32 1461 [Warning] Access denied for user 'root'@'127.0.0.1' (using password: NO)
proj_mariadb-1  | 2022-05-08 21:11:03 1463 [Warning] Access denied for user 'mysql'@'127.0.0.1' (using password: NO)
proj_mariadb-1  | 2022-05-08 21:11:05 1443 [Warning] Access denied for user 'mysql'@'127.0.0.1' (using password: NO)

--connect is using TCP so mariadb@localhost isn't sufficient. log_warnings=1 is one work around.

Options:

  • log_warnings=1 by default, don't want to play around too much with server defaults in containers.
  • create mysql@127.0.0.1 / mysql@::1 users. need to give empty password otherwise same error. Which conflicts with password validation plugins.
  • alternate tcp connection interfaces that doesn't complete login. Would just replace Access denied errors with connection failure errors, which would eventually be blocked and denied. This might healthcheck fail.
  • replace --connect implementation with something lsof based since its in the container.
grooverdan added a commit to grooverdan/mariadb-docker that referenced this issue May 9, 2022
The --connect test does a tcp connection. There is no
mysql@::1 or mysql@127.0.0.1 user created so the access
is denied, but the healtcheck is correct.

The entries in the error log because of the default
log_warnings=2.

To avoid hitting max_connect_errors (default 100), a non-connection
way is to look at the proc status for listening sockets.

While normally not a direct correlation to a connection, mariadb
listens and is followed by accepting connections immedately without
anything that can fail inbetween.

We move the previous connect to mariadb_connect for compatibility
if users want to continue with that mechanism. We pre-emtively create
tcp listen as a mechanism and redirect connect to that.

Closes MariaDB#430
@grooverdan
Copy link
Member Author

Started concept on 003ee86

Its not good enough though as a health-cmd run as root cannot process the visiblity of the pid 1 file descriptors

root@bbd8db405ffe:/# ps -ef
UID          PID    PPID  C STIME TTY          TIME CMD
mysql          1       0  0 05:04 ?        00:00:00 mariadbd
root         326       0  0 05:06 pts/0    00:00:00 bash
root        6082     326  0 05:59 pts/0    00:00:00 ps -ef
root@bbd8db405ffe:/# lsof -t -p 1 -a -iTCP -sTCP:LISTEN
root@bbd8db405ffe:/# 
root@bbd8db405ffe:/# lsof    -p 1
COMMAND  PID  USER   FD      TYPE DEVICE SIZE/OFF NODE NAME
mariadbd   1 mysql  cwd   unknown                      /proc/1/cwd (readlink: Permission denied)
mariadbd   1 mysql  rtd   unknown                      /proc/1/root (readlink: Permission denied)
mariadbd   1 mysql  txt   unknown                      /proc/1/exe (readlink: Permission denied)
mariadbd   1 mysql    0   unknown                      /proc/1/fd/0 (readlink: Permission denied)
mariadbd   1 mysql    1   unknown                      /proc/1/fd/1 (readlink: Permission denied)
mariadbd   1 mysql    2   unknown                      /proc/1/fd/2 (readlink: Permission denied)
....

Alternate ss:

root@bbd8db405ffe:/# ss -nlHt state listening "( sport = :3306 )"
0                             0                                                    0.0.0.0:3306                                                0.0.0.0:*                        
0                             0                                                          *:3306                                                      *:*  

And the bad side is it won't part by process id. Also assuming port 3306. Parsing out my_print_defaults can be done, but --port 3321 could be a container entrypoint argument.

Still looking for alternatives.

grooverdan added a commit to grooverdan/mariadb-docker that referenced this issue May 10, 2022
The --connect test does a tcp connection. There is no
mysql@::1 or mysql@127.0.0.1 user created so the access
is denied, but the healtcheck is correct.

The entries in the error log because of the default
log_warnings=2.

To avoid hitting max_connect_errors (default 100), a non-connection
way is to look at the proc status for listening sockets.

While normally not a direct correlation to a connection, mariadb
listens and is followed by accepting connections immedately without
anything that can fail inbetween.

We move the previous connect to mariadb_connect for compatibility
if users want to continue with that mechanism. We pre-emtively create
tcp listen as a mechanism and redirect connect to that.

Also tested:

lsof -t -p 1 -a -iTCP -sTCP:LISTEN cannot resolve symlink /proc/1/fd/{num}
when run as root.

ss -nlHt state listening "( sport = :3306 )" could work, but no way
to tie to pid 1, and the 3306 port isn't fixed.

Closes MariaDB#430
@grooverdan
Copy link
Member Author

Attempted new technique of looking at /proc/1/net/tcp{,6}:

Seems to be working:

$ podman run --rm --health-cmd 'healthcheck.sh --connect' --health-interval=1s --env MARIADB_ROOT_PASSWORD=bob -d  --name m430_test  m106 ; while podman ps; do sleep 0.3; done
e57942bdc6ff7d5c57537d9a98036c27d74762266b0715335d8b816633b73ec1
CONTAINER ID  IMAGE                  COMMAND     CREATED        STATUS                       PORTS       NAMES
e57942bdc6ff  localhost/m106:latest  mariadbd    3 seconds ago  Up 3 seconds ago (starting)              m430_test
CONTAINER ID  IMAGE                  COMMAND     CREATED        STATUS                       PORTS       NAMES
e57942bdc6ff  localhost/m106:latest  mariadbd    6 seconds ago  Up 6 seconds ago (starting)              m430_test
CONTAINER ID  IMAGE                  COMMAND     CREATED         STATUS                        PORTS       NAMES
e57942bdc6ff  localhost/m106:latest  mariadbd    10 seconds ago  Up 9 seconds ago (unhealthy)              m430_test
CONTAINER ID  IMAGE                  COMMAND     CREATED         STATUS                         PORTS       NAMES
e57942bdc6ff  localhost/m106:latest  mariadbd    13 seconds ago  Up 12 seconds ago (unhealthy)              m430_test
CONTAINER ID  IMAGE                  COMMAND     CREATED         STATUS                         PORTS       NAMES
e57942bdc6ff  localhost/m106:latest  mariadbd    16 seconds ago  Up 15 seconds ago (unhealthy)              m430_test
CONTAINER ID  IMAGE                  COMMAND     CREATED         STATUS                         PORTS       NAMES
e57942bdc6ff  localhost/m106:latest  mariadbd    19 seconds ago  Up 19 seconds ago (unhealthy)              m430_test
CONTAINER ID  IMAGE                  COMMAND     CREATED         STATUS                       PORTS       NAMES
e57942bdc6ff  localhost/m106:latest  mariadbd    22 seconds ago  Up 21 seconds ago (healthy)              m430_test
CONTAINER ID  IMAGE                  COMMAND     CREATED         STATUS                       PORTS       NAMES
e57942bdc6ff  localhost/m106:latest  mariadbd    24 seconds ago  Up 24 seconds ago (healthy)              m430_test
CONTAINER ID  IMAGE                  COMMAND     CREATED         STATUS                       PORTS       NAMES
e57942bdc6ff  localhost/m106:latest  mariadbd    27 seconds ago  Up 27 seconds ago (healthy)              m430_test
CONTAINER ID  IMAGE                  COMMAND     CREATED         STATUS                       PORTS       NAMES
e57942bdc6ff  localhost/m106:latest  mariadbd    29 seconds ago  Up 29 seconds ago (healthy)              m430_test
CONTAINER ID  IMAGE                  COMMAND     CREATED         STATUS                       PORTS       NAMES
e57942bdc6ff  localhost/m106:latest  mariadbd    32 seconds ago  Up 32 seconds ago (healthy)              m430_test

procfs(5) for details on /proc/{n}/net. Relying on 0A could be an internal state.

@jerdoe
Copy link

jerdoe commented May 10, 2022

@grooverdan Wow, that was fast! My comment was just posted a few hours ago.
And this looks like a good solution to me.

Thank you and thanks to all of the contributors for being so much involved for the open source community!

@grooverdan
Copy link
Member Author

grooverdan commented May 10, 2022

You're welcome. A new release is coming out soon so I'm looking to have this ready for it.

I need to think a bit more about cases where it won't work. And about the stability of a 0A state as the Linux kernel exposes.

Pros:

  • determines server has reached the listening state and soon to be accepting connections
  • server on shutdown closes ports early so it will report a unhealthy on shutdown
  • immune from user --port and --user changes

Cons:

  • a plugin that listens to a port, like galera, will also listen on a different port earlier. Hoping users use galera based healtcheck options in this case.
  • hoping the /proc/pid/net/tcp interface is reasonably stable (and has been), and 0A state of the kernel is stable (and has been for a while. Is the same state exposed in WSL(2)?).
  • can something else listen in /docker-entrypoint-initdb.d, currently no, as to be on pid 1 bash would need to create the listening socket, which afaik it can't do. /dev/tcp/${HOST}/${PORT} creates a 08 connection state.
  • /proc/pid/net/tcp is the network namespace of the process, rather than the process itself, so more checks are needed as now anything that listens could trigger healthy. This has implications for different container runtimes.

@jerdoe your bug report comment was a contribution. Welcome to the community.

@grooverdan
Copy link
Member Author

Unfortunately I didn't submit my change to get the warnings reduced on the server in time for the coming release (MariaDB/server#2213). I hope to get them in the release afterwards and the container changes to accommodate them.

@gremo
Copy link

gremo commented May 15, 2023

Can anyone explain to regular humans how to solve the flooding problem? I'm struggling with the same problem now, using the healthcheck script! Thanks!

grooverdan added a commit to grooverdan/mariadb-docker that referenced this issue May 16, 2023
grooverdan added a commit to grooverdan/mariadb-docker that referenced this issue May 17, 2023
grooverdan added a commit to grooverdan/mariadb-docker that referenced this issue May 17, 2023
mysql@{127.0.0.1,::1} users aren't granted any privileges and have
accounts locked and passwords expired.

Closes MariaDB#430
grooverdan added a commit to grooverdan/mariadb-docker that referenced this issue Jun 15, 2023
mysql@{127.0.0.1,::1} users aren't granted any privileges and have
accounts locked and passwords expired.

Even expired passwords need to comply with --plugin-load-add=simple_password_check
so a generated password is created and saved in
$DATADIR/.my-healthcheck.cnf.

The healthcheck --connect will increment the server status variable Aborted_connects
for each check, however now connection_error* counts are changed.

This also prevents any invalid password errors showing up in the
container log.

Closes MariaDB#430
grooverdan added a commit to grooverdan/mariadb-docker that referenced this issue Jun 15, 2023
…tcheck --connect

healtcheck_connect@{127.0.0.1,::1} users aren't granted any privileges and have
accounts locked and passwords expired.

Even expired passwords need to comply with --plugin-load-add=simple_password_check
so a generated password is created and saved in
$DATADIR/.my-healthcheck.cnf.

The healthcheck --connect will increment the server status variable Aborted_connects
for each check, however now connection_error* counts are changed.

This also prevents any invalid password errors showing up in the
container log.

Closes MariaDB#430
grooverdan added a commit to grooverdan/mariadb-docker that referenced this issue Jun 15, 2023
…tcheck --connect

healtcheck_connect@{127.0.0.1,::1} users aren't granted any privileges and have
accounts locked and passwords expired.

Even expired passwords need to comply with --plugin-load-add=simple_password_check
so a generated password is created and saved in
$DATADIR/.my-healthcheck.cnf.

The healthcheck --connect will increment the server status variable Aborted_connects
for each check, however now connection_error* counts are changed.

This also prevents any invalid password errors showing up in the
container log.

Closes MariaDB#430
grooverdan added a commit to grooverdan/mariadb-docker that referenced this issue Jun 20, 2023
…althcheck --connect

healthcheck@{127.0.0.1,::1,localhost} users are granted USAGE by default, which
is enough for the non-replication healthchecks in healtcheck.sh.

The env variable MARIADB_HEALTHCHECK_GRANTS can replace USAGE with any
comma separated set of grants.

On initialization a generated password is created and saved in
$DATADIR/.my-healthcheck.cnf along with the server port and socket. If the
command args or default configuration file changes this may become out
of date. Because the password is generated in configuration file the
'#', comment, and '=' characters cannot be part of this password.

The healthcheck.cnf configuration file also sets protocol=tcp to
semi enforce --connect being a standard part of the test. This is
required as starts of the service under --skip-networking should
never be considered healthy.

The healthcheck script also has the --default-extra-file set to this
.my-healthcheck.cnf file if it exists so that all healthcheck scripts
use the authentication here by default.

The compatibility with old instances, without the .my-healtcheck.cnf is
preserved.

The healthcheck --connect will increment the server status variable Aborted_connects
for each check, however now connection_error* counts are changed.

This also prevents any invalid password errors showing up in the
container log.

Closes MariaDB#430
grooverdan added a commit to grooverdan/mariadb-docker that referenced this issue Jun 20, 2023
…althcheck --connect

healthcheck@{127.0.0.1,::1,localhost} users are granted USAGE by default, which
is enough for the non-replication healthchecks in healtcheck.sh.

The env variable MARIADB_HEALTHCHECK_GRANTS can replace USAGE with any
comma separated set of grants.

On initialization a generated password is created and saved in
$DATADIR/.my-healthcheck.cnf along with the server port and socket. If the
command args or default configuration file changes this may become out
of date. Because the password is generated in configuration file the
'#', comment, and '=' characters cannot be part of this password.

The healthcheck.cnf configuration file also sets protocol=tcp to
semi enforce --connect being a standard part of the test. This is
required as starts of the service under --skip-networking should
never be considered healthy.

The healthcheck script also has the --default-extra-file set to this
.my-healthcheck.cnf file if it exists so that all healthcheck scripts
use the authentication here by default.

The compatibility with old instances, without the .my-healtcheck.cnf is
preserved.

The healthcheck --connect will increment the server status variable Aborted_connects
for each check, however now connection_error* counts are changed.

This also prevents any invalid password errors showing up in the
container log.

Closes MariaDB#430
grooverdan added a commit to grooverdan/mariadb-docker that referenced this issue Jun 20, 2023
…althcheck --connect

healthcheck@{127.0.0.1,::1,localhost} users are granted USAGE by default, which
is enough for the non-replication healthchecks in healtcheck.sh.

The env variable MARIADB_HEALTHCHECK_GRANTS can replace USAGE with any
comma separated set of grants.

On initialization a generated password is created and saved in
$DATADIR/.my-healthcheck.cnf along with the server port and socket. If the
command args or default configuration file changes this may become out
of date. Because the password is generated in configuration file the
'#', comment, and '=' characters cannot be part of this password.

The healthcheck.cnf configuration file also sets protocol=tcp to
enforce indirectly that --connect being a standard part of the test. This is
required as starts of the service under --skip-networking should
never be considered healthy.

The healthcheck script also has the --defaults-extra-file set to this
.my-healthcheck.cnf file, if it exists (backwards compatible on
previously created datadirs), so that all new healthcheck invokations
use the authentication here by default.

The compatibility with old instances, without the .my-healthcheck.cnf is
preserved by non setting --defaults-extra-file.

The healthcheck --connect will increment the server status variable Aborted_connects
for each check, however now connection_error* counts are changed.

This also prevents any invalid password errors showing up in the
container log.

Closes MariaDB#430
@grooverdan
Copy link
Member Author

so released now - see #512 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

3 participants