Running large language models on a single GPU for throughput-oriented scenarios.
Updated Jul 24, 2024 - Python
Run Mixtral-8x7B models in Colab or on consumer desktops
PyTorch native quantization and sparsity for training and inference
DPDK infrastructure for software acceleration. Currently working on RX and ACL pre-filter.
DPU-Powered File System Virtualization over virtio-fs
A Dynamic Programming Offloading Algorithm for Mobile Cloud Computing
A collection of tests for the Open vSwitch HW offload.
A lightweight framework that enables serverless users to reduce their bills by harvesting non-serverless compute resources such as their VMs, on-premise servers, or personal computers.
Monero hardware wallet protocol implementation for Trezor, agent
LeapIO: Efficient and Portable Virtual NVMe Storage on ARM SoCs (ASPLOS'20)
A framework for IoT devices to offload tasks to the cloud, resulting in efficient computation and decreased cloud costs.
The container-based cloud platform for mobile code offloading
Monero wallet Trezor integration documentation
Code for paper "Real-time Neural Network Inference on Extremely Weak Devices: Agile Offloading with Explainable AI" (MobiCom'22)
Backend.AI Client Library for Python
A Pandas-inspired data analysis project with lazy semantics and query-offloading to SQLite
Examples of using OpenMP offload with dgemm in the target region
A flow offloading prototype based on DPDK and Mellanox/Nvidia SmartNICs.