tserver logging lost on restart #286

Open
milleruntime opened this issue Sep 28, 2022 · 3 comments

@milleruntime
Contributor

I have multiple tservers running on the same instance. If I kill one of them (kill -9 PID) and restart it using the Accumulo scripts (accumulo-cluster or accumulo-service), the logging for the restarted tserver gets lost or munged into the same log as another tserver's.

@ctubbsii
Member

I'm not sure this is a problem. If you need the logs for the previous run separate, then you can just copy them before you restart, right? Uno should keep things pretty simple, so we probably don't want to make it too complex with tracking different restart iterations for logging.

@milleruntime
Contributor Author

The problem isn't that the logs get overwritten; the problem is that there is no logging for a restarted tserver. For example:
I start a cluster with 2 tservers:

12:46:25 {main} ~/workspace/uno/install/logs/accumulo$ grep "address = " *.log
tserver1_ip-10-113-14-231.log:2022-09-29T12:45:04,103 [tserver.TabletServer] INFO : address = localhost:9997
tserver2_ip-10-113-14-231.log:2022-09-29T12:45:04,279 [tserver.TabletServer] INFO : address = localhost:10000

I kill one of them and tserver2 running on 10000 is dead. I restart the tserver using accumulo-cluster start-tservers and now there are 2 running again.

uno status
Accumulo processes running: tserver(19392) manager(19736) gc(19777) monitor(19825) tserver(21303) 

But only one of the logs gets updated and I never see the second tserver logging to anything:

12:52:13 {main} ~/workspace/uno/install/logs/accumulo$ grep "address = " *.log
tserver1_ip-10-113-14-231.log:2022-09-29T12:45:04,103 [tserver.TabletServer] INFO : address = localhost:9997
tserver2_ip-10-113-14-231.log:2022-09-29T12:45:04,279 [tserver.TabletServer] INFO : address = localhost:10000
12:56:07 {main} ~/workspace/uno/install/logs/accumulo$ ls -ltr tserver*.log
-rw-r--r-- 1 mpmill4 domain users  69700 Sep 29 12:48 tserver2_ip-10-113-14-231.log
-rw-r--r-- 1 mpmill4 domain users 116116 Sep 29 12:56 tserver1_ip-10-113-14-231.log

Only the log for tserver1_ip-10-113-14-231.log gets updated and I never see the address 10000 starting up logged anywhere, even though Accumulo sees it and it seems to function fine.

@ctubbsii
Member

Okay, I see. I was able to reproduce this with only one server, and I saw the same thing for all services, not just tserver. If I used uno accumulo start instead of $ACCUMULO_HOME/bin/accumulo-cluster start, then the logging was updated fine. So, it seems to be specifically related to the manual use of accumulo-cluster.

One difference, when I look at /proc/<PID>/environ, is that using accumulo-cluster directly does not set ACCUMULO_LOG_DIR. Looking at install/accumulo-2.1.0-SNAPSHOT/conf/accumulo-env.sh, I can see that ACCUMULO_LOG_DIR is defaulting to install/accumulo-2.1.0-SNAPSHOT/logs. And, sure enough, when I look there, the logs for the restarted processes can be found there.
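
The /proc/<PID>/environ check above can be sketched as a small shell helper. This is only an illustration, assuming Linux; the function name environ_var is hypothetical and not part of Uno or Accumulo. The file is NUL-separated and reflects the environment the process was started with, so the helper converts NULs to newlines before matching.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Uno or Accumulo).
# Print the value of one variable from a NUL-separated environ file,
# such as /proc/<PID>/environ on Linux. Prints nothing if it is not set.
environ_var() {
  local environ_file="$1" name="$2"
  # Entries are NUL-terminated "NAME=value" strings; split them onto lines,
  # then print only the value of the requested variable.
  tr '\0' '\n' < "$environ_file" | sed -n "s/^${name}=//p"
}

# Intended use, with the PID of a running Accumulo process:
#   environ_var /proc/<PID>/environ ACCUMULO_LOG_DIR
```

An empty result for a process started with accumulo-cluster, and a non-empty one for a process started with uno accumulo start, would match the difference described above.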

So, this looks like a situation where the behavior of Uno is to try to put the logs in a special place specifically for Uno, but manually running accumulo-cluster or other non-Uno scripts to modify Accumulo will cause it to run with its own environment.

There are a few solutions to this that I can think of:

  1. Instead of Uno configuring the ACCUMULO_LOG_DIR with an environment variable for itself, it can modify the accumulo-env.sh script so that it stores the Uno preferred location for logs, for any subsequent operations that aren't aware of Uno.
  2. Instead of Uno trying to customize the location of the log directory at all, it could just let Accumulo use its default location, and create a link at install/logs/accumulo that points to Accumulo's standard location.
  3. We could modify uno env to ensure ACCUMULO_LOG_DIR is exported; if you want Accumulo scripts to be aware of Uno's preferences, you'd then have to run source <(uno env) before running a script, such as accumulo-cluster, that bypasses Uno.
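
As a rough sketch of the second option, the link could be created along these lines. The function name link_log_dir and its arguments are hypothetical, and this is not how Uno currently works; the guard against replacing a real directory is an assumption about what a safe version would need.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Uno): make link_path a symlink to target_dir.
link_log_dir() {
  local target_dir="$1" link_path="$2"
  # Make sure the real log directory and the link's parent directory exist.
  mkdir -p "$target_dir" "$(dirname "$link_path")"
  # Refuse to clobber a real directory that may still hold logs.
  if [ -d "$link_path" ] && [ ! -L "$link_path" ]; then
    echo "refusing to replace existing directory: $link_path" >&2
    return 1
  fi
  # -n replaces an existing symlink instead of creating a link inside it.
  ln -sfn "$target_dir" "$link_path"
}

# Intended use, with paths relative to the Uno directory as in this issue:
#   link_log_dir install/accumulo-2.1.0-SNAPSHOT/logs install/logs/accumulo
```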

The last option is easiest and least disruptive, but users will still see this issue if they forget to source the environment. The first option is the most foolproof, but requires us to be more careful that our modifications to the Accumulo environment file work across Accumulo versions. The second option is a decent middle ground, but may require us to pay a bit of attention to how we reset/wipe the cluster to ensure we're clearing out old files differently than how we're currently doing it.
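
To soften the "forgot to source the environment" problem with the last option, a guard like the following could be run before calling accumulo-cluster directly. This is a minimal sketch: the function name require_accumulo_log_dir is hypothetical, and it assumes uno env has been changed to export ACCUMULO_LOG_DIR as proposed in option 3.

```shell
#!/usr/bin/env bash
# Hypothetical guard (not part of Uno or Accumulo): fail unless
# ACCUMULO_LOG_DIR is set and points at an existing directory.
require_accumulo_log_dir() {
  if [ -z "${ACCUMULO_LOG_DIR:-}" ]; then
    echo "ACCUMULO_LOG_DIR is not set; run: source <(uno env)" >&2
    return 1
  fi
  if [ ! -d "$ACCUMULO_LOG_DIR" ]; then
    echo "ACCUMULO_LOG_DIR does not exist: $ACCUMULO_LOG_DIR" >&2
    return 1
  fi
}

# Intended use (commented out so the sketch has no side effects):
#   source <(uno env)
#   require_accumulo_log_dir && accumulo-cluster start-tservers
```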
