tserver logging lost on restart #286
Comments
I'm not sure this is a problem. If you need the logs for the previous run separate, then you can just copy them before you restart, right? Uno should keep things pretty simple, so we probably don't want to make it too complex with tracking different restart iterations for logging.
The problem isn't that the logs get overwritten; the problem is that there is no logging for a restarted tserver. For example:

    12:46:25 {main} ~/workspace/uno/install/logs/accumulo$ grep "address = " *.log
    tserver1_ip-10-113-14-231.log:2022-09-29T12:45:04,103 [tserver.TabletServer] INFO : address = localhost:9997
    tserver2_ip-10-113-14-231.log:2022-09-29T12:45:04,279 [tserver.TabletServer] INFO : address = localhost:10000

I kill one of them, so tserver2 running on 10000 is dead. I restart the tserver, and uno status shows:

    Accumulo processes running: tserver(19392) manager(19736) gc(19777) monitor(19825) tserver(21303)

But only one of the logs gets updated and I never see the second tserver logging to anything:

    12:52:13 {main} ~/workspace/uno/install/logs/accumulo$ grep "address = " *.log
    tserver1_ip-10-113-14-231.log:2022-09-29T12:45:04,103 [tserver.TabletServer] INFO : address = localhost:9997
    tserver2_ip-10-113-14-231.log:2022-09-29T12:45:04,279 [tserver.TabletServer] INFO : address = localhost:10000
    12:56:07 {main} ~/workspace/uno/install/logs/accumulo$ ls -ltr tserver*.log
    -rw-r--r-- 1 mpmill4 domain users 69700 Sep 29 12:48 tserver2_ip-10-113-14-231.log
    -rw-r--r-- 1 mpmill4 domain users 116116 Sep 29 12:56 tserver1_ip-10-113-14-231.log

Only tserver1_ip-10-113-14-231.log gets updated, and I never see the address 10000 starting up logged anywhere, even though Accumulo sees it and it seems to function fine.
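One way to check where a restarted tserver is actually writing is to look at the files its process holds open. The following is a minimal sketch, assuming Linux with /proc available; the function name open_logs_for_pid is made up for illustration, and the PID would be one of those reported by uno status.

```shell
# Hypothetical helper: list the *.log files a process currently has open.
# Assumes Linux (/proc/<pid>/fd) and that you own the target process.
open_logs_for_pid() {
  pid="$1"
  for fd in /proc/"$pid"/fd/*; do
    # Each fd entry is a symlink to the open file; skip unreadable ones.
    target=$(readlink "$fd" 2>/dev/null) || continue
    case "$target" in
      *.log) echo "$target" ;;
    esac
  done | sort -u
}
```

For example, `open_logs_for_pid 21303` would print the log paths held open by the restarted tserver. If it prints nothing, that process has no .log file open at all, which would match the "no logging" symptom rather than logs landing in another file.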
Okay, I see. I was able to reproduce this with only one server, and I saw the same thing for all services, not just tserver. If I used [...] One difference, when I look at [...] So, this looks like a situation where the behavior of Uno is to try to put the logs in a special place specifically for Uno, but manually running [...] There are a few solutions to this that I can think of:
The last option is easiest and least disruptive, but users will still see this issue if they forget to source the environment. The first option is the most foolproof, but requires us to be more careful that our modifications to the Accumulo environment file work across Accumulo versions. The second option is a decent middle ground, but may require us to pay a bit of attention to how we reset/wipe the cluster to ensure we're clearing out old files differently than how we're currently doing it.
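The "forgot to source the environment" case can be illustrated with the usual shell default-value pattern. This is only a sketch: the variable name ACCUMULO_LOG_DIR, the fallback path, and the helper name resolve_log_dir are assumptions for illustration, not taken from Uno's or Accumulo's actual scripts.

```shell
# Hypothetical: how a start script might choose its log directory.
# If the calling shell never set ACCUMULO_LOG_DIR (assumed name), the
# script falls back to a directory under its own install path.
resolve_log_dir() {
  basedir="$1"
  echo "${ACCUMULO_LOG_DIR:-${basedir}/logs}"
}

# Environment not sourced: the fallback under the install dir is used.
( unset ACCUMULO_LOG_DIR; resolve_log_dir /opt/accumulo )

# Environment sourced: the configured log directory is used instead.
( ACCUMULO_LOG_DIR="$HOME/uno/install/logs/accumulo"; resolve_log_dir /opt/accumulo )
```

Under this assumption, a service restarted from a shell without the Uno environment would quietly log to the fallback location, while the files in the Uno log directory stop being updated.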
I have multiple tservers running on the same instance. If I kill one of them (kill -9 PID) and restart using the Accumulo scripts (accumulo-cluster or accumulo-service) then the logging for the restarted tserver gets lost or munged into the same log as another.