Crash with multiple Load calls #204

Closed
3inary opened this issue Aug 9, 2024 · 7 comments
Labels
bug Something isn't working

Comments

@3inary

3inary commented Aug 9, 2024

Describe the bug

Hello,

With this version and the previous one I get regular crashes (approx. 1 in 3 starts) on macOS.
This happens in both Build and Editor, with both acc and noAcc.

=================================================================
Managed Stacktrace:

  at <unknown> <0xffffffff>
  at System.Object:wrapper_native_0x5335dbdb4 <0x00007>
  at LLMUnity.LLM:<Tokenize>b__65_0 <0x00087>
  at <>c__DisplayClass64_0:<LLMReply>b__0 <0x00083>
  at System.Threading.Tasks.Task:InnerInvoke <0x000c7>
  at System.Threading.Tasks.Task:Execute <0x00063>
  at System.Threading.Tasks.Task:ExecutionContextCallback <0x00097>
  at System.Threading.ExecutionContext:RunInternal <0x003d3>
  at System.Threading.ExecutionContext:Run <0x0006b>
  at System.Threading.Tasks.Task:ExecuteWithThreadLocal <0x00223>
  at System.Threading.Tasks.Task:ExecuteEntry <0x001c3>
  at System.Threading.Tasks.Task:System.Threading.IThreadPoolWorkItem.ExecuteWorkItem <0x00057>
  at System.Threading.ThreadPoolWorkQueue:Dispatch <0x0048f>
  at System.Threading._ThreadPoolWaitCallback:PerformWaitCallback <0x0008b>
  at <Module>:runtime_invoke_bool <0x0011b>

=================================================================
2024-08-09T12:58:59.181Z|0x3dc513000|Obtained 25 stack frames.
2024-08-09T12:58:59.182Z|0x3dc513000|#0 0x0000019e1a9698 in setjmp
2024-08-09T12:58:59.183Z|0x3dc513000|#1 0x000005335dbe0c in LLM_Tokenize
2024-08-09T12:58:59.183Z|0x3dc513000|#2 0x0000040fbad35c in (wrapper managed-to-native) object:wrapper_native_0x5335dbdb4 (intptr,string,intptr) [{0x396b874c8} + 0xec] (0x40fbad270 0x40fbad41c) [0x38f4caa80 - Unity Child Domain]
2024-08-09T12:58:59.183Z|0x3dc513000|#3 0x000004170abdd0 in LLMUnity.LLM:b__65_0 (intptr,string,intptr) [{0x151c53710} + 0x88] [./Library/PackageCache/ai.undream.llm/Runtime/LLM.cs :: 424u] (0x4170abd48 0x4170abdf8) [0x38f4caa80 - Unity Child Domain]
2024-08-09T12:58:59.183Z|0x3dc513000|#4 0x000004170aba24 in LLMUnity.LLM/<>c__DisplayClass64_0:b__0 () [{0x5c70f97c0} + 0x84] [./Library/PackageCache/ai.undream.llm/Runtime/LLM.cs :: 407u] (0x4170ab9a0 0x4170aba48) [0x38f4caa80 - Unity Child Domain]
2024-08-09T12:58:59.183Z|0x3dc513000|#5 0x00000422daa050 in System.Threading.Tasks.Task:InnerInvoke () [{0x1504481e0} + 0xc8] (0x422da9f88 0x422daa110) [0x38f4caa80 - Unity Child Domain]
2024-08-09T12:58:59.183Z|0x3dc513000|#6 0x00000422aa13e4 in System.Threading.Tasks.Task:Execute () [{0x3955d4b50} + 0x64] (0x422aa1380 0x422aa1470) [0x38f4caa80 - Unity Child Domain]
2024-08-09T12:58:59.183Z|0x3dc513000|#7 0x00000422aa0f88 in System.Threading.Tasks.Task:ExecutionContextCallback (object) [{0x3955d4b78} + 0x98] (0x422aa0ef0 0x422aa0fb0) [0x38f4caa80 - Unity Child Domain]
2024-08-09T12:58:59.183Z|0x3dc513000|#8 0x00000422a9f95c in System.Threading.ExecutionContext:RunInternal (System.Threading.ExecutionContext,System.Threading.ContextCallback,object,bool) [{0x395564ac8} + 0x3d4] (0x422a9f588 0x422a9f9e8) [0x38f4caa80 - Unity Child Domain]
2024-08-09T12:58:59.183Z|0x3dc513000|#9 0x00000422a9f48c in System.Threading.ExecutionContext:Run (System.Threading.ExecutionContext,System.Threading.ContextCallback,object,bool) [{0x395564a48} + 0x6c] (0x422a9f420 0x422a9f4b0) [0x38f4caa80 - Unity Child Domain]
2024-08-09T12:58:59.183Z|0x3dc513000|#10 0x00000422a9e8bc in System.Threading.Tasks.Task:ExecuteWithThreadLocal (System.Threading.Tasks.Task&) [{0x3955d49d8} + 0x224] (0x422a9e698 0x422a9e9e0) [0x38f4caa80 - Unity Child Domain]
2024-08-09T12:58:59.183Z|0x3dc513000|#11 0x00000422a9d5a4 in System.Threading.Tasks.Task:ExecuteEntry (bool) [{0x3955d48e8} + 0x1c4] (0x422a9d3e0 0x422a9d6dc) [0x38f4caa80 - Unity Child Domain]
2024-08-09T12:58:59.183Z|0x3dc513000|#12 0x00000422a9d2d8 in System.Threading.Tasks.Task:System.Threading.IThreadPoolWorkItem.ExecuteWorkItem () [{0x150448048} + 0x58] (0x422a9d280 0x422a9d2fc) [0x38f4caa80 - Unity Child Domain]
2024-08-09T12:58:59.183Z|0x3dc513000|#13 0x00000422a86fd0 in System.Threading.ThreadPoolWorkQueue:Dispatch () [{0x39596ede8} + 0x490] (0x422a86b40 0x422a8729c) [0x38f4caa80 - Unity Child Domain]
2024-08-09T12:58:59.183Z|0x3dc513000|#14 0x00000422a85dbc in System.Threading._ThreadPoolWaitCallback:PerformWaitCallback () [{0x1509cb8b8} + 0x8c] (0x422a85d30 0x422a85e08) [0x38f4caa80 - Unity Child Domain]
2024-08-09T12:58:59.183Z|0x3dc513000|#15 0x00000422a86484 in (wrapper runtime-invoke) :runtime_invoke_bool (object,intptr,intptr,intptr) [{0x39596eeb8} + 0x11c] (0x422a86368 0x422a86648) [0x38f4caa80 - Unity Child Domain]
2024-08-09T12:58:59.184Z|0x3dc513000|#16 0x0000038f91774c in mono_jit_runtime_invoke
2024-08-09T12:58:59.184Z|0x3dc513000|#17 0x0000038fa9cd00 in do_runtime_invoke
2024-08-09T12:58:59.185Z|0x3dc513000|#18 0x0000038fac1884 in worker_callback
2024-08-09T12:58:59.185Z|0x3dc513000|#19 0x0000038fa16974 in worker_thread
2024-08-09T12:58:59.185Z|0x3dc513000|#20 0x0000038fabec60 in start_wrapper_internal
2024-08-09T12:58:59.186Z|0x3dc513000|#21 0x0000038fabeb0c in start_wrapper
2024-08-09T12:58:59.186Z|0x3dc513000|#22 0x0000038fb3e008 in GC_inner_start_routine
2024-08-09T12:58:59.187Z|0x3dc513000|#23 0x0000038fb3df90 in GC_start_routine
2024-08-09T12:58:59.187Z|0x3dc513000|#24 0x0000019e17af94 in _pthread_start
2024-08-09T12:58:59.187Z|0x3dc513000|Launching bug reporter
Attribute Qt::AA_EnableHighDpiScaling must be set before QCoreApplication is created.
info: Microsoft.Hosting.Lifetime[0]
Application is shutting down...
info: Unity.ILPP.Runner.PostProcessingAssemblyLoadContext[0]
ALC ILPP context 1 is unloading

Steps to reproduce

No response

LLMUnity version

2.1.0-2.0.3

Operating System

macOS

3inary added the bug label Aug 9, 2024
3inary closed this as completed Aug 9, 2024
3inary closed this as not planned Aug 9, 2024
@3inary
Author

3inary commented Aug 9, 2024

Caused by calling Load() multiple times during warmup for different chats on the same bot.

@amakropoulos
Collaborator

Thank you for the bug report, I'll look into it next week 👍

amakropoulos reopened this Aug 9, 2024
amakropoulos changed the title from "Crash osx" to "Crash with multiple Load calls" Aug 9, 2024
@3inary
Author

3inary commented Aug 12, 2024

Here is the Editor log:

{"tid":"0x3fc9c7000","timestamp":1723583356,"level":"INFO","function":"print_timings","line":321,"msg":"prompt eval time = 26406.68 ms / 2699 tokens ( 9.78 ms per token, 102.21 tokens per second)","id_slot":0,"id_task":41,"t_prompt_processing":26406.678,"n_prompt_tokens_processed":2699,"t_token":9.783874768432753,"n_tokens_second":102.2089942551653}
{"tid":"0x3fc9c7000","timestamp":1723583356,"level":"INFO","function":"print_timings","line":337,"msg":"generation eval time = 12923.37 ms / 76 runs ( 170.04 ms per token, 5.88 tokens per second)","id_slot":0,"id_task":41,"t_token_generation":12923.372,"n_decoded":76,"t_token":170.0443684210526,"n_tokens_second":5.8808181022723796}
{"tid":"0x3fc9c7000","timestamp":1723583356,"level":"INFO","function":"print_timings","line":347,"msg":" total time = 39330.05 ms","id_slot":0,"id_task":41,"t_prompt_processing":26406.678,"t_token_generation":12923.372,"t_total":39330.05}
{"tid":"0x3fc9c7000","timestamp":1723583356,"level":"INFO","function":"update_slots","line":1794,"msg":"slot released","id_slot":0,"id_task":41,"n_ctx":4096,"n_past":2776,"n_system_tokens":0,"n_cache_tokens":2776,"truncated":true}
{"tid":"0x3fc9c7000","timestamp":1723583356,"level":"INFO","function":"update_slots","line":1812,"msg":"all slots are idle"}
{"tid":"0x3fc9c7000","timestamp":1723583356,"level":"INFO","function":"launch_slot_with_task","line":1046,"msg":"slot is processing task","id_slot":0,"id_task":48}
2024-08-13T21:09:16.361Z|0x1f251cc00|LLM 11: Severe error occured
{"tid":"0x3fc9c7000","timestamp":1723583356,"level":"INFO","function":"update_slots","line":2095,"msg":"kv cache rm [p0, end)","id_slot":0,"id_task":48,"p0":2694}
[Crash!]

Edit: replaced the log with a more informative one.
Edit 2: After digging into this for a while, I have come to believe that other errors just bubbled up through the stack into the warmup callback, and that the callback itself is innocent.

@amakropoulos
Collaborator

Hi, could you check again with the latest release (v2.2.0)?
I implemented a fix in the LLM creation / destruction.

@3inary
Author

3inary commented Sep 4, 2024

Hi, I have checked the latest release and noticed that the llmCharacter WarmupCallback can return before llm.started is set. That, combined with a null reference in the callback, presumably caused the original error above.

// Pseudocode: this part runs on the Unity main thread
UnityThread
{
    _ = llmCharacter.Warmup(WarmUpCallback);
}

private void WarmUpCallback()
{
    llm.SetBasePrompt("something"); // "LLM not created" error
    NotExisting.Val = x; // Crash, with the last reported sign of life coming from the LLM
}

IMO the crash is resolved; Unity is just gone before it can name the real culprit further up the stack.
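
If an exception thrown inside the warmup callback is what hides the real culprit, one option is to catch it at the callback boundary and report it before anything else can go wrong. The following is only a minimal sketch: `CallbackGuard` is a hypothetical helper, not part of LLMUnity, and it uses nothing beyond plain .NET.

```csharp
using System;

// Hypothetical helper (not part of LLMUnity).
public static class CallbackGuard
{
    // Runs the callback and hands any exception to onError
    // instead of letting it propagate into the caller.
    public static void Run(Action callback, Action<Exception> onError)
    {
        try
        {
            callback();
        }
        catch (Exception e)
        {
            onError(e);
        }
    }
}
```

Assuming the `Warmup` call shown above, it could be used as `_ = llmCharacter.Warmup(() => CallbackGuard.Run(WarmUpCallback, Debug.LogException));`, so that a failure in the callback is at least written to the Unity log.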

@amakropoulos
Collaborator

I can't think how this can happen 🙂.
All the local chat calls (i.e. not on a remote server), including warmup, pass through this line:
https://github.com/undreamai/LLMUnity/blob/main/Runtime/LLMCharacter.cs#L714
It proceeds only if the LLM has failed or started successfully.

@3inary
Author

3inary commented Sep 10, 2024

This is not exactly my area of expertise, and I am just guessing that it could be a Unity main thread vs. Tasks issue. Here is what I found: Tasks may not synchronize correctly with Unity's main thread, and Unity API calls made from threads other than the main thread can lead to such race conditions. You can use UnityMainThreadDispatcher or UnityEngine.UnitySynchronizationContext to marshal back to the main thread.
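
A minimal sketch of the marshalling idea, using only the standard .NET `SynchronizationContext` API. `MainThreadMarshaller` is a hypothetical helper, not part of LLMUnity or Unity. It assumes the object is constructed on the Unity main thread, where `SynchronizationContext.Current` is expected to be Unity's own context.

```csharp
using System;
using System.Threading;

// Hypothetical helper (not part of LLMUnity or Unity).
public sealed class MainThreadMarshaller
{
    private readonly SynchronizationContext context;

    public MainThreadMarshaller()
    {
        // Capture the context of the constructing thread. If there is none,
        // fall back to a plain context, which runs posted work on the thread pool.
        context = SynchronizationContext.Current ?? new SynchronizationContext();
    }

    // Queues the callback on the captured context, regardless of which
    // thread calls Post. The call returns without waiting for the callback.
    public void Post(Action callback)
    {
        context.Post(_ => callback(), null);
    }
}
```

Created once on the main thread (for example in `Awake`), it could then wrap the callback from the earlier comment: `_ = llmCharacter.Warmup(() => marshaller.Post(WarmUpCallback));`. Whether this is actually the cause of the behaviour discussed here is unconfirmed.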

3inary closed this as completed Sep 10, 2024