GC reporting / usage mismatch #1682

Closed
Alex-Ikanow opened this issue Feb 8, 2012 · 2 comments

@Alex-Ikanow

In https://groups.google.com/group/elasticsearch/browse_thread/thread/31d87c84dd387367#, I reported unusual CPU activity (continuous 40-60% for at least several days even under idle conditions) after creating 12GB of field cache (across 3 nodes) via a facet on a multi-valued field.

Last night I ran jstack while 2 of the nodes were in this state. All of the user threads were WAITING, TIMED_WAITING, or RUNNABLE (in some poll call).

Cross-referencing the process ids from "top -H" with the nids reported in jstack, the following threads were actively using the CPU:

~19 minutes in "Concurrent Mark-Sweep GC Thread"
~2 minutes in each of the 4 "Gang worker#N (Parallel GC Threads)" threads (worker#0 through worker#3)
~3 minutes in "VM Thread" prio=10 tid=0x00002aad1c6d0000 nid=0x7041 runnable
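
The cross-referencing step can be sketched in code. "top -H" prints thread ids in decimal, while jstack prints them as hexadecimal nid values, so the lookup is a base conversion followed by a search. This is only a minimal sketch: the class and method names (NidLookup, toNid, findThread) are made up for illustration, and the sample dump line is the "VM Thread" line quoted above.

```java
import java.util.List;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical helper, not part of jstack or Elasticsearch.
final class NidLookup {

    // Convert a decimal thread id (as shown by "top -H") to jstack's nid form.
    static String toNid(long tid) {
        return "0x" + Long.toHexString(tid);
    }

    // Return the first jstack line whose nid matches the given thread id.
    static Optional<String> findThread(List<String> jstackLines, long tid) {
        // Word boundaries stop 0x7041 from matching a longer id such as 0x70412.
        Pattern nid = Pattern.compile("\\bnid=" + toNid(tid) + "\\b");
        return jstackLines.stream()
                .filter(line -> nid.matcher(line).find())
                .findFirst();
    }

    public static void main(String[] args) {
        List<String> dump = List.of(
                "\"VM Thread\" prio=10 tid=0x00002aad1c6d0000 nid=0x7041 runnable");
        System.out.println(toNid(28737L)); // prints 0x7041
        System.out.println(findThread(dump, 28737L).orElse("no match"));
    }
}
```

In practice the dump lines would come from a saved jstack output file and the thread id from the PID column of "top -H".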

Empirically, ongoing CPU time seemed to be split either between Mark-Sweep and one of the parallel GC worker threads, or between Mark-Sweep and the VM thread.

However, bigdesk and head both reported the following GC times for the node in question:
ParNew: 12 collections, 5 seconds and 61 milliseconds
ConcurrentMarkSweep: 4 collections, 5 seconds and 884 milliseconds

Presumably nothing can be done about the CPU activity itself; it is just what the VM needs to do when I allocate a large amount of (multi-valued) field cache. That's obviously fine if so.

Presumably you also just read some standard stats to get the GC usage, but I thought the fact that the "actual values" weren't being reported was issue-worthy.

@kimchy
Member

kimchy commented Feb 8, 2012

I get the GC values (the reported time the collectors ran) from the formal GC stats of the JVM. Maybe you can enable GC logging and see if it helps?
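
For readers wondering what "formal GC stats" might look like in code, here is a minimal sketch. It assumes the standard java.lang.management API; the thread does not say which call Elasticsearch actually makes. The class name GcStats and its snapshot method are hypothetical. The collection time returned here is an approximate accumulated elapsed time as reported by the JVM, which need not match the CPU time that "top -H" attributes to GC threads.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that reads the JVM's own per-collector counters.
final class GcStats {

    // Map of collector name to {collection count, collection time in ms}.
    // Either value is -1 when the JVM does not define it for that collector.
    static Map<String, long[]> snapshot() {
        Map<String, long[]> stats = new LinkedHashMap<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            stats.put(gc.getName(),
                    new long[] { gc.getCollectionCount(), gc.getCollectionTime() });
        }
        return stats;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, long[]> e : snapshot().entrySet()) {
            System.out.printf("%s: %d collections, %d ms%n",
                    e.getKey(), e.getValue()[0], e.getValue()[1]);
        }
    }
}
```

Comparing this output against a GC log from the same run would show whether the two sources agree.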

@Alex-Ikanow
Author

I think my natural curiosity on this issue has reached its end :)

If you're happy you're making the right API call to the JVM, then I don't think there's anything left to say. For future reference, though, get people to use "top -H" (to show threads) and cross-reference against jstack if they're complaining about CPU usage. Those 2 GC times (or at least the concurrent mark sweep one) don't seem reliable.
