
Feature Request - Add Sliding Window Memory Scheduling #128

Open
githuba9f5404 opened this issue Oct 5, 2024 · 0 comments

This would allow even larger models to run on smaller distributed clusters. I think Llama 3.1 405B Instruct might be possible on my 8 x Raspberry Pi 4B 8GB array (specs here: #122).
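
For what it's worth, here's roughly what I have in mind (a minimal Python sketch of sliding-window layer scheduling; `load_weights`, `free_weights`, and `apply_layer` are hypothetical stand-ins, not anything from this project's actual API):

```python
from collections import deque

class SlidingWindowScheduler:
    """Keep only a fixed window of transformer layers resident in RAM.

    Hypothetical sketch: load_weights/free_weights stand in for whatever
    I/O the project would actually use (mmap, network fetch, etc.).
    """

    def __init__(self, window_size, load_weights, free_weights):
        self.window_size = window_size
        self.load_weights = load_weights  # layer_idx -> weights
        self.free_weights = free_weights  # release a layer's memory
        self.resident = deque()           # (layer_idx, weights), oldest first

    def get_layer(self, layer_idx):
        # Reuse the layer if it is already resident.
        for idx, weights in self.resident:
            if idx == layer_idx:
                return weights
        # Evict the oldest layer once the window is full.
        if len(self.resident) >= self.window_size:
            _, old_weights = self.resident.popleft()
            self.free_weights(old_weights)
        weights = self.load_weights(layer_idx)
        self.resident.append((layer_idx, weights))
        return weights

def forward(scheduler, hidden, num_layers, apply_layer):
    # Stream layers through the window: only window_size layers
    # occupy memory at once, trading speed for model capacity.
    for i in range(num_layers):
        hidden = apply_layer(scheduler.get_layer(i), hidden)
    return hidden
```

The obvious cost is latency: every pass re-loads most layers from storage, so it would only make sense when the model simply doesn't fit otherwise.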

I want to clarify that I have little to no idea what I'm doing, and this request might be impossible or very difficult for a number of reasons. I'm just a guy with an array of raspis who wants to run local LLMs. Either way, I wanted to bring your attention to this other project, since their approach seems similar, interesting, and possibly able to increase the capabilities of this project. In my limited experience it is much slower (since it runs at full precision), but some of its features, like pre-splitting the model so each worker node downloads only its own section once, might make sense here as well (see the sketch below).
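
To make the pre-splitting idea concrete: I picture each worker being assigned a contiguous slice of layers up front, so it only ever downloads (and caches) its own shard of the checkpoint. A tiny sketch (`shard_layers` is my own made-up helper, not code from either project):

```python
def shard_layers(num_layers: int, num_workers: int) -> list[range]:
    """Split layers into contiguous, near-equal shards, one per worker."""
    base, extra = divmod(num_layers, num_workers)
    shards, start = [], 0
    for w in range(num_workers):
        size = base + (1 if w < extra else 0)  # spread the remainder
        shards.append(range(start, start + size))
        start += size
    return shards

# Llama 3.1 405B has 126 transformer layers; across my 8 Pis this
# gives six workers 16 layers each and the last two 15 each.
print(shard_layers(126, 8))
```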

Thank you for continuing to advance and push the boundaries of edge AI; as a novice hobbyist, I greatly appreciate your expert efforts.
