Nil dereference after 1.7.5 update #23659

Open
morphine1900 opened this issue Jul 22, 2024 · 2 comments

morphine1900 commented Jul 22, 2024

Nomad version

Output from nomad version
1.7.5

Operating system and Environment details

Linux amd64

Issue

Nil dereference on running job after updating to 1.7.5

Nomad Client logs (if appropriate)

nomad[11069]: \nruntime/debug.Stack()\n\truntime/debug/stack.go:24 +0x5e\ngitpro.ttaallkk.top/hashicorp/nomad/scheduler.(*SystemScheduler).Process.func1()\n\tgitpro.ttaallkk.top/hashicorp/nomad/scheduler/scheduler_system.go:83 +0x58\npanic({0x2a88140?, 0x4f5ea50?})\n\truntime/panic.go:914 +0x21f\ngitpro.ttaallkk.top/hashicorp/nomad/client/lib/numalib.(*Topology).UsableCores(...)\n\tgitpro.ttaallkk.top/hashicorp/nomad/client/lib/numalib/topology.go:258\ngitpro.ttaallkk.top/hashicorp/nomad/nomad/structs.(*NodeResources).Comparable(0xc00190e320)\n\tgitpro.ttaallkk.top/hashicorp/nomad/nomad/structs/structs.go:3185 +0xcc\ngitpro.ttaallkk.top/hashicorp/nomad/scheduler.(*Preemptor).SetNode(0xc00145b048, 0xc00148c600)\n\tgitpro.ttaallkk.top/hashicorp/nomad/scheduler/preemption.go:139 +0x36\ngitpro.ttaallkk.top/hashicorp/nomad/scheduler.(*BinPackIterator).Next(0xc000127810)\n\tgitpro.ttaallkk.top/hashicorp/nomad/scheduler/rank.go:274 +0x74d\ngitpro.ttaallkk.top/hashicorp/nomad/scheduler.(*ScoreNormalizationIterator).Next(0xc00066c1c0)\n\tgitpro.ttaallkk.top/hashicorp/nomad/scheduler/rank.go:816 +0x28\ngitpro.ttaallkk.top/hashicorp/nomad/scheduler.(*SystemStack).Select(0xc00191a880, 0xc001636a20, 0xc00145b790)\n\tgitpro.ttaallkk.top/hashicorp/nomad/scheduler/stack.go:362 +0x82e\ngitpro.ttaallkk.top/hashicorp/nomad/scheduler.(*SystemScheduler).computePlacements(0xc001d1c3c0, {0xc00191a980, 0x3, 0xb?})\n\tgitpro.ttaallkk.top/hashicorp/nomad/scheduler/scheduler_system.go:374 +0x325\ngitpro.ttaallkk.top/hashicorp/nomad/scheduler.(*SystemScheduler).computeJobAllocs(0xc001d1c3c0)\n\tgitpro.ttaallkk.top/hashicorp/nomad/scheduler/scheduler_system.go:322 +0xb85\ngitpro.ttaallkk.top/hashicorp/nomad/scheduler.(*SystemScheduler).process(0xc001d1c3c0)\n\tgitpro.ttaallkk.top/hashicorp/nomad/scheduler/scheduler_system.go:163 +0x4da\ngitpro.ttaallkk.top/hashicorp/nomad/scheduler.retryMax(0x5, 0xc00145bd80, 0xc00145bd70)\n\tgitpro.ttaallkk.top/hashicorp/nomad/scheduler/util.go:96 
+0x49\ngitpro.ttaallkk.top/hashicorp/nomad/scheduler.(*SystemScheduler).Process(0xc001d1c3c0, 0xc002200480)\n\tgitpro.ttaallkk.top/hashicorp/nomad/scheduler/scheduler_system.go:108 +0x525\ngitpro.ttaallkk.top/hashicorp/nomad/nomad.(*Worker).invokeScheduler(0xc0012a76c0, 0xc001154420, 0xc002200480, {0xc001e6f7d0, 0x24})\n\tgitpro.ttaallkk.top/hashicorp/nomad/nomad/worker.go:634 +0x353\ngitpro.ttaallkk.top/hashicorp/nomad/nomad.(*Worker).run(0xc0012a76c0, 0x12a05f200)\n\tgitpro.ttaallkk.top/hashicorp/nomad/nomad/worker.go:463 +0x5a5\ncreated by github.com/hashicorp/nomad/nomad.(*Worker).Start in goroutine 1\n\tgitpro.ttaallkk.top/hashicorp/nomad/nomad/worker.go:162 +0x59\n","worker_id":"0b7cd3e0-f1d4-bab0-9843-e6eef610aa25"}
nomad[11069]: {"@level":"error","@message":"processing eval panicked scheduler - please report this as a bug!","@module":"worker.system_sched","@timestamp":"2024-07-20T00:08:01.615847Z","error":"runtime error: invalid memory address or nil pointer dereference","eval_id":"33a13e15-fa8c-c2b6-5437-272e0a5b9c0a","job_id":"m3hostfiles","namespace":"default","stack_trace":"goroutine 65 [running]:\nruntime/debug.Stack()\n\truntime/debug/stack.go:24 +0x5e\ngitpro.ttaallkk.top/hashicorp/nomad/scheduler.(*SystemScheduler).Process.func1()\n\tgitpro.ttaallkk.top/hashicorp/nomad/scheduler/scheduler_system.go:83 +0x58\npanic({0x2a88140?, 0x4f5ea50?})\n\truntime/panic.go:914 +0x21f\ngitpro.ttaallkk.top/hashicorp/nomad/client/lib/numalib.(*Topology).UsableCores(...)\n\tgitpro.ttaallkk.top/hashicorp/nomad/client/lib/numalib/topology.go:258\ngitpro.ttaallkk.top/hashicorp/nomad/nomad/structs.(*NodeResources).Comparable(0xc00190e320)\n\tgitpro.ttaallkk.top/hashicorp/nomad/nomad/structs/structs.go:3185 +0xcc\ngitpro.ttaallkk.top/hashicorp/nomad/scheduler.(*Preemptor).SetNode(0xc00145b048, 0xc00148c600)\n\tgitpro.ttaallkk.top/hashicorp/nomad/scheduler/preemption.go:139 +0x36\ngitpro.ttaallkk.top/hashicorp/nomad/scheduler.(*BinPackIterator).Next(0xc000127810)\n\tgitpro.ttaallkk.top/hashicorp/nomad/scheduler/rank.go:274 +0x74d\ngitpro.ttaallkk.top/hashicorp/nomad/scheduler.(*ScoreNormalizationIterator).Next(0xc00066c1c0)\n\tgitpro.ttaallkk.top/hashicorp/nomad/scheduler/rank.go:816 +0x28\ngitpro.ttaallkk.top/hashicorp/nomad/scheduler.(*SystemStack).Select(0xc00191a880, 0xc001636a20, 0xc00145b790)\n\tgitpro.ttaallkk.top/hashicorp/nomad/scheduler/stack.go:362 +0x82e\ngitpro.ttaallkk.top/hashicorp/nomad/scheduler.(*SystemScheduler).computePlacements(0xc001d1c3c0, {0xc00191a980, 0x3, 0xb?})\n\tgitpro.ttaallkk.top/hashicorp/nomad/scheduler/scheduler_system.go:374 
+0x325\ngitpro.ttaallkk.top/hashicorp/nomad/scheduler.(*SystemScheduler).computeJobAllocs(0xc001d1c3c0)\n\tgitpro.ttaallkk.top/hashicorp/nomad/scheduler/scheduler_system.go:322 +0xb85\ngitpro.ttaallkk.top/hashicorp/nomad/scheduler.(*SystemScheduler).process(0xc001d1c3c0)\n\tgitpro.ttaallkk.top/hashicorp/nomad/scheduler/scheduler_system.go:163 +0x4da\ngitpro.ttaallkk.top/hashicorp/nomad/scheduler.retryMax(0x5, 0xc00145bd80, 0xc00145bd70)\n\tgitpro.ttaallkk.top/hashicorp/nomad/scheduler/util.go:96 +0x49\ngitpro.ttaallkk.top/hashicorp/nomad/scheduler.(*SystemScheduler).Process(0xc001d1c3c0, 0xc002200480)\n\tgitpro.ttaallkk.top/hashicorp/nomad/scheduler/scheduler_system.go:108 +0x525\ngitpro.ttaallkk.top/hashicorp/nomad/nomad.(*Worker).invokeScheduler(0xc0012a76c0, 0xc001154420, 0xc002200480, {0xc001e6f7d0, 0x24})\n\tgitpro.ttaallkk.top/hashicorp/nomad/nomad/worker.go:634 +0x353\ngitpro.ttaallkk.top/hashicorp/nomad/nomad.(*Worker).run(0xc0012a76c0, 0x12a05f200)\n\tgitpro.ttaallkk.top/hashicorp/nomad/nomad/worker.go:463 +0x5a5\ncreated by github.com/hashicorp/nomad/nomad.(*Worker).Start in goroutine 1\n\tgitpro.ttaallkk.top/hashicorp/nomad/nomad/worker.go:162 +0x59\n","worker_id":"0b7cd3e0-f1d4-bab0-9843-e6eef610aa25"}
nomad[11069]: {"@level":"error","@message":"error invoking scheduler","@module":"worker","@timestamp":"2024-07-20T00:08:01.616029Z","error":"failed to process evaluation: failed to process eval: runtime error: invalid memory address or nil pointer dereference","worker_id":"0b7cd3e0-f1d4-bab0-9843-e6eef610aa25"}
nomad[11069]: {"@level":"error","@message":"error invoking scheduler","@module":"worker","@timestamp":"2024-07-20T00:08:01.616029Z","error":"failed to process evaluation: failed to process eval: runtime error: invalid memory address or nil pointer dereference","worker_id":"0b7cd3e0-f1d4-bab0-9843-e6eef610aa25"}
nomad[11069]: \nruntime/debug.Stack()\n\truntime/debug/stack.go:24 +0x5e\ngitpro.ttaallkk.top/hashicorp/nomad/scheduler.(*SystemScheduler).Process.func1()\n\tgitpro.ttaallkk.top/hashicorp/nomad/scheduler/scheduler_system.go:83 +0x58\npanic({0x2a88140?, 0x4f5ea50?})\n\truntime/panic.go:914 +0x21f\ngitpro.ttaallkk.top/hashicorp/nomad/client/lib/numalib.(*Topology).UsableCores(...)\n\tgitpro.ttaallkk.top/hashicorp/nomad/client/lib/numalib/topology.go:258\ngitpro.ttaallkk.top/hashicorp/nomad/nomad/structs.(*NodeResources).Comparable(0xc00190e320)\n\tgitpro.ttaallkk.top/hashicorp/nomad/nomad/structs/structs.go:3185 +0xcc\ngitpro.ttaallkk.top/hashicorp/nomad/scheduler.(*Preemptor).SetNode(0xc00145b048, 0xc00148c600)\n\tgitpro.ttaallkk.top/hashicorp/nomad/scheduler/preemption.go:139 +0x36\ngitpro.ttaallkk.top/hashicorp/nomad/scheduler.(*BinPackIterator).Next(0xc00069e230)\n\tgitpro.ttaallkk.top/hashicorp/nomad/scheduler/rank.go:274 +0x74d\ngitpro.ttaallkk.top/hashicorp/nomad/scheduler.(*ScoreNormalizationIterator).Next(0xc000efcfa0)\n\tgitpro.ttaallkk.top/hashicorp/nomad/scheduler/rank.go:816 +0x28\ngitpro.ttaallkk.top/hashicorp/nomad/scheduler.(*SystemStack).Select(0xc002870d00, 0xc001636a20, 0xc00145b790)\n\tgitpro.ttaallkk.top/hashicorp/nomad/scheduler/stack.go:362 +0x82e\ngitpro.ttaallkk.top/hashicorp/nomad/scheduler.(*SystemScheduler).computePlacements(0xc00178a0c0, {0xc002870e00, 0x3, 0xb?})\n\tgitpro.ttaallkk.top/hashicorp/nomad/scheduler/scheduler_system.go:374 +0x325\ngitpro.ttaallkk.top/hashicorp/nomad/scheduler.(*SystemScheduler).computeJobAllocs(0xc00178a0c0)\n\tgitpro.ttaallkk.top/hashicorp/nomad/scheduler/scheduler_system.go:322 +0xb85\ngitpro.ttaallkk.top/hashicorp/nomad/scheduler.(*SystemScheduler).process(0xc00178a0c0)\n\tgitpro.ttaallkk.top/hashicorp/nomad/scheduler/scheduler_system.go:163 +0x4da\ngitpro.ttaallkk.top/hashicorp/nomad/scheduler.retryMax(0x5, 0xc00145bd80, 0xc00145bd70)\n\tgitpro.ttaallkk.top/hashicorp/nomad/scheduler/util.go:96 
+0x49\ngitpro.ttaallkk.top/hashicorp/nomad/scheduler.(*SystemScheduler).Process(0xc00178a0c0, 0xc001994000)\n\tgitpro.ttaallkk.top/hashicorp/nomad/scheduler/scheduler_system.go:108 +0x525\ngitpro.ttaallkk.top/hashicorp/nomad/nomad.(*Worker).invokeScheduler(0xc0012a76c0, 0xc001226540, 0xc001994000, {0xc0036ab650, 0x24})\n\tgitpro.ttaallkk.top/hashicorp/nomad/nomad/worker.go:634 +0x353\ngitpro.ttaallkk.top/hashicorp/nomad/nomad.(*Worker).run(0xc0012a76c0, 0x12a05f200)\n\tgitpro.ttaallkk.top/hashicorp/nomad/nomad/worker.go:463 +0x5a5\ncreated by github.com/hashicorp/nomad/nomad.(*Worker).Start in goroutine 1\n\tgitpro.ttaallkk.top/hashicorp/nomad/nomad/worker.go:162 +0x59\n","worker_id":"0b7cd3e0-f1d4-bab0-9843-e6eef610aa25"}
nomad[11069]: {"@level":"error","@message":"processing eval panicked scheduler - please report this as a bug!","@module":"worker.system_sched","@timestamp":"2024-07-20T00:08:02.619052Z","error":"runtime error: invalid memory address or nil pointer dereference","eval_id":"33a13e15-fa8c-c2b6-5437-272e0a5b9c0a","job_id":"m3hostfiles","namespace":"default","stack_trace":"goroutine 65 [running]:\nruntime/debug.Stack()\n\truntime/debug/stack.go:24 +0x5e\ngitpro.ttaallkk.top/hashicorp/nomad/scheduler.(*SystemScheduler).Process.func1()\n\tgitpro.ttaallkk.top/hashicorp/nomad/scheduler/scheduler_system.go:83 +0x58\npanic({0x2a88140?, 0x4f5ea50?})\n\truntime/panic.go:914 +0x21f\ngitpro.ttaallkk.top/hashicorp/nomad/client/lib/numalib.(*Topology).UsableCores(...)\n\tgitpro.ttaallkk.top/hashicorp/nomad/client/lib/numalib/topology.go:258\ngitpro.ttaallkk.top/hashicorp/nomad/nomad/structs.(*NodeResources).Comparable(0xc00190e320)\n\tgitpro.ttaallkk.top/hashicorp/nomad/nomad/structs/structs.go:3185 +0xcc\ngitpro.ttaallkk.top/hashicorp/nomad/scheduler.(*Preemptor).SetNode(0xc00145b048, 0xc00148c600)\n\tgitpro.ttaallkk.top/hashicorp/nomad/scheduler/preemption.go:139 +0x36\ngitpro.ttaallkk.top/hashicorp/nomad/scheduler.(*BinPackIterator).Next(0xc00069e230)\n\tgitpro.ttaallkk.top/hashicorp/nomad/scheduler/rank.go:274 +0x74d\ngitpro.ttaallkk.top/hashicorp/nomad/scheduler.(*ScoreNormalizationIterator).Next(0xc000efcfa0)\n\tgitpro.ttaallkk.top/hashicorp/nomad/scheduler/rank.go:816 +0x28\ngitpro.ttaallkk.top/hashicorp/nomad/scheduler.(*SystemStack).Select(0xc002870d00, 0xc001636a20, 0xc00145b790)\n\tgitpro.ttaallkk.top/hashicorp/nomad/scheduler/stack.go:362 +0x82e\ngitpro.ttaallkk.top/hashicorp/nomad/scheduler.(*SystemScheduler).computePlacements(0xc00178a0c0, {0xc002870e00, 0x3, 0xb?})\n\tgitpro.ttaallkk.top/hashicorp/nomad/scheduler/scheduler_system.go:374 
+0x325\ngitpro.ttaallkk.top/hashicorp/nomad/scheduler.(*SystemScheduler).computeJobAllocs(0xc00178a0c0)\n\tgitpro.ttaallkk.top/hashicorp/nomad/scheduler/scheduler_system.go:322 +0xb85\ngitpro.ttaallkk.top/hashicorp/nomad/scheduler.(*SystemScheduler).process(0xc00178a0c0)\n\tgitpro.ttaallkk.top/hashicorp/nomad/scheduler/scheduler_system.go:163 +0x4da\ngitpro.ttaallkk.top/hashicorp/nomad/scheduler.retryMax(0x5, 0xc00145bd80, 0xc00145bd70)\n\tgitpro.ttaallkk.top/hashicorp/nomad/scheduler/util.go:96 +0x49\ngitpro.ttaallkk.top/hashicorp/nomad/scheduler.(*SystemScheduler).Process(0xc00178a0c0, 0xc001994000)\n\tgitpro.ttaallkk.top/hashicorp/nomad/scheduler/scheduler_system.go:108 +0x525\ngitpro.ttaallkk.top/hashicorp/nomad/nomad.(*Worker).invokeScheduler(0xc0012a76c0, 0xc001226540, 0xc001994000, {0xc0036ab650, 0x24})\n\tgitpro.ttaallkk.top/hashicorp/nomad/nomad/worker.go:634 +0x353\ngitpro.ttaallkk.top/hashicorp/nomad/nomad.(*Worker).run(0xc0012a76c0, 0x12a05f200)\n\tgitpro.ttaallkk.top/hashicorp/nomad/nomad/worker.go:463 +0x5a5\ncreated by github.com/hashicorp/nomad/nomad.(*Worker).Start in goroutine 1\n\tgitpro.ttaallkk.top/hashicorp/nomad/nomad/worker.go:162 +0x59\n","worker_id":"0b7cd3e0-f1d4-bab0-9843-e6eef610aa25"}
nomad[11069]: {"@level":"error","@message":"error invoking scheduler","@module":"worker","@timestamp":"2024-07-20T00:08:02.619112Z","error":"failed to process evaluation: failed to process eval: runtime error: invalid memory address or nil pointer dereference","worker_id":"0b7cd3e0-f1d4-bab0-9843-e6eef610aa25"}
nomad[11069]: {"@level":"error","@message":"error invoking scheduler","@module":"worker","@timestamp":"2024-07-20T00:08:02.619112Z","error":"failed to process evaluation: failed to process eval: runtime error: invalid memory address or nil pointer dereference","worker_id":"0b7cd3e0-f1d4-bab0-9843-e6eef610aa25"}
Member

jrasell commented Jul 22, 2024

Hi @morphine1900 and thanks for raising this.

Could you provide some more information so I can reproduce this? It would be very handy to have a redacted version of the job specification that was being processed at the time, along with the CLI or API output for a Nomad client that would have been eligible to run this work.

The title suggests you upgraded Nomad. Could you also let me know which version you upgraded from?

Member

tgross commented Jul 26, 2024

Hi @morphine1900, I took a look at your log entry and extracted the following stack trace:

stack trace
runtime/debug.Stack()
	runtime/debug/stack.go:24 +0x5e
github.com/hashicorp/nomad/scheduler.(*SystemScheduler).Process.func1()
	github.com/hashicorp/nomad/scheduler/scheduler_system.go:83 +0x58
panic({0x2a88140?, 0x4f5ea50?})
	runtime/panic.go:914 +0x21f
github.com/hashicorp/nomad/client/lib/numalib.(*Topology).UsableCores(...)
	github.com/hashicorp/nomad/client/lib/numalib/topology.go:258
github.com/hashicorp/nomad/nomad/structs.(*NodeResources).Comparable(0xc00190e320)
	github.com/hashicorp/nomad/nomad/structs/structs.go:3185 +0xcc
github.com/hashicorp/nomad/scheduler.(*Preemptor).SetNode(0xc00145b048, 0xc00148c600)
	github.com/hashicorp/nomad/scheduler/preemption.go:139 +0x36
github.com/hashicorp/nomad/scheduler.(*BinPackIterator).Next(0xc000127810)
	github.com/hashicorp/nomad/scheduler/rank.go:274 +0x74d
github.com/hashicorp/nomad/scheduler.(*ScoreNormalizationIterator).Next(0xc00066c1c0)
	github.com/hashicorp/nomad/scheduler/rank.go:816 +0x28
github.com/hashicorp/nomad/scheduler.(*SystemStack).Select(0xc00191a880, 0xc001636a20, 0xc00145b790)
	github.com/hashicorp/nomad/scheduler/stack.go:362 +0x82e
github.com/hashicorp/nomad/scheduler.(*SystemScheduler).computePlacements(0xc001d1c3c0, {0xc00191a980, 0x3, 0xb?})
	github.com/hashicorp/nomad/scheduler/scheduler_system.go:374 +0x325
github.com/hashicorp/nomad/scheduler.(*SystemScheduler).computeJobAllocs(0xc001d1c3c0)
	github.com/hashicorp/nomad/scheduler/scheduler_system.go:322 +0xb85
github.com/hashicorp/nomad/scheduler.(*SystemScheduler).process(0xc001d1c3c0)
	github.com/hashicorp/nomad/scheduler/scheduler_system.go:163 +0x4da
github.com/hashicorp/nomad/scheduler.retryMax(0x5, 0xc00145bd80, 0xc00145bd70)
	github.com/hashicorp/nomad/scheduler/util.go:96 +0x49
github.com/hashicorp/nomad/scheduler.(*SystemScheduler).Process(0xc001d1c3c0, 0xc002200480)
	github.com/hashicorp/nomad/scheduler/scheduler_system.go:108 +0x525
github.com/hashicorp/nomad/nomad.(*Worker).invokeScheduler(0xc0012a76c0, 0xc001154420, 0xc002200480, {0xc001e6f7d0, 0x24})
	github.com/hashicorp/nomad/nomad/worker.go:634 +0x353
github.com/hashicorp/nomad/nomad.(*Worker).run(0xc0012a76c0, 0x12a05f200)
	github.com/hashicorp/nomad/nomad/worker.go:463 +0x5a5
created by github.com/hashicorp/nomad/nomad.(*Worker).Start in goroutine 1
	github.com/hashicorp/nomad/nomad/worker.go:162 +0x59
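
For anyone who wants to pull a readable trace out of a JSON log line like the ones in the report, here is a minimal Go sketch. It assumes each line is a syslog-style prefix followed by one JSON object with a `stack_trace` field, as in the logs above. The `logEntry` type and `extractStackTrace` helper are names made up for this illustration and are not part of Nomad.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// logEntry models only the fields of the JSON log line used here.
// It is a hypothetical helper type, not a Nomad type.
type logEntry struct {
	Message    string `json:"@message"`
	StackTrace string `json:"stack_trace"`
}

// extractStackTrace drops everything before the first '{' (the
// "nomad[pid]: " prefix), decodes the JSON object, and returns the
// stack trace. The JSON decoder turns the escaped \n and \t sequences
// into real newlines and tabs.
func extractStackTrace(line string) (string, error) {
	start := strings.Index(line, "{")
	if start < 0 {
		return "", fmt.Errorf("no JSON object found in log line")
	}
	var entry logEntry
	if err := json.Unmarshal([]byte(line[start:]), &entry); err != nil {
		return "", fmt.Errorf("decoding log line: %w", err)
	}
	if entry.StackTrace == "" {
		return "", fmt.Errorf("log line has no stack_trace field")
	}
	return entry.StackTrace, nil
}

func main() {
	// Shortened sample in the same shape as the reported log lines.
	line := `nomad[11069]: {"@level":"error","@message":"processing eval panicked scheduler - please report this as a bug!","stack_trace":"goroutine 65 [running]:\nruntime/debug.Stack()\n\truntime/debug/stack.go:24 +0x5e\n"}`

	trace, err := extractStackTrace(line)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(trace)
}
```

A log line whose JSON head was cut off when it was pasted will fail to decode, so this only works on complete entries.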

This looks an awful lot like it's rooted in the same bug I fixed in #23284, which shipped in Nomad 1.8.1, with backports to 1.7.10 Enterprise and 1.6.13 Enterprise. In addition to providing what @jrasell asked for, you may want to try upgrading to Nomad 1.8.x to see if the problem has already been fixed for you.
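
To illustrate the general failure mode in the trace (a method called on a nil pointer, with the panic recovered further up the stack and logged as an error), here is a small Go sketch. All of the types and functions in it are hypothetical stand-ins. It is not Nomad's code and not the fix from #23284, and it only assumes that the panic came from dereferencing a nil pointer inside a method like `UsableCores`.

```go
package main

import "fmt"

// topology is a hypothetical stand-in, not Nomad's numalib.Topology.
type topology struct {
	cores []int
}

// usableCores reads a field through its receiver, so it panics with a
// nil pointer dereference when t is nil.
func (t *topology) usableCores() []int {
	return t.cores
}

// safeUsableCores checks the receiver first and returns no cores for nil.
func (t *topology) safeUsableCores() []int {
	if t == nil {
		return nil
	}
	return t.cores
}

// nodeResources is a hypothetical stand-in for a node record whose
// topology was never populated.
type nodeResources struct {
	topo *topology
}

// coreCount calls f and converts a panic into an error, similar in
// spirit to a deferred recover that logs instead of crashing.
func coreCount(f func() []int) (n int, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	return len(f()), nil
}

func main() {
	nr := &nodeResources{} // topo is left nil on purpose

	_, err := coreCount(func() []int { return nr.topo.usableCores() })
	fmt.Println("unguarded:", err)

	n, err := coreCount(func() []int { return nr.topo.safeUsableCores() })
	fmt.Println("guarded:", n, err)
}
```

The unguarded call ends up as a recovered error rather than a crash, which matches the shape of the "processing eval panicked scheduler" log entries. The guarded call returns zero cores with no error.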
