deploy: df4d268

notsyncing · Aug 4, 2024 · 9dac1b1
commit 9dac1b1
Showing 33 changed files with 4,097 additions and 0 deletions.
diff --git a/.buildinfo b/.buildinfo
@@ -0,0 +1,4 @@
+# Sphinx build info version 1
+# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
+config: 3d082be86b982dfad1004262db396f21
+tags: 645f666f9bcd5a90fca523b33c5a78b7
diff --git a/.doctrees/environment.pickle b/.doctrees/environment.pickle
diff --git a/.doctrees/index.doctree b/.doctrees/index.doctree
diff --git a/.doctrees/quick-start.doctree b/.doctrees/quick-start.doctree
diff --git a/.nojekyll b/.nojekyll
diff --git a/_sources/index.rst.txt b/_sources/index.rst.txt
@@ -0,0 +1,30 @@
+.. azarrot documentation master file, created by
+   sphinx-quickstart on Sat Jun 29 15:37:14 2024.
+   You can adapt this file completely to your liking, but it should at least
+   contain the root `toctree` directive.
+
+Welcome to Azarrot's documentation!
+===================================
+
+**Azarrot** is an OpenAI-compatible LLM inference server focused on the OpenVINO™ and IPEX-LLM backends.
+
+The name `azarrot` is a blend of `azalea` and `parrot`.
+
+.. note::
+
+   This project is under active development.
+
+.. toctree::
+   :maxdepth: 2
+   :caption: Contents
+
+   quick-start
+
+
+
+Indices and tables
+==================
+
+* :ref:`genindex`
+* :ref:`modindex`
+* :ref:`search`
diff --git a/_sources/quick-start.rst.txt b/_sources/quick-start.rst.txt
@@ -0,0 +1,87 @@
+Quickstart
+==========
+
+This page walks you through installing and running Azarrot.
+
+Prerequisites
+-------------
+
+Azarrot has the following hardware and software requirements.
+
+Hardware
+^^^^^^^^
+
+Azarrot supports CPUs and Intel GPUs.
+
+Tested GPUs:
+
+* Intel A770 16GB
+* Intel Xe 96EU (i7 12700H)
+
+Other devices should work if they are supported by the oneAPI toolkit and the Intel GPU drivers.
+
+Software
+^^^^^^^^
+
+* Any Linux distribution
+* Intel GPU drivers (if you are using Intel GPUs) from https://dgpu-docs.intel.com/driver/client/overview.html
+* Intel oneAPI Base Toolkit 2024.0 or above from https://www.intel.com/content/www/us/en/docs/oneapi/installation-guide-linux/
+* Python 3.11.x or below
+
+Azarrot is developed and tested on Ubuntu 22.04 with Python 3.10.
+
+Install
+-------
+
+Simply install azarrot from PyPI:
+
+.. code-block:: bash
+
+    pip install azarrot
+
+Then create a `server.yml` in the directory you want to run Azarrot from:
+
+.. code-block:: bash
+
+    mkdir azarrot
+
+    # Copy from examples/server.yml
+    cp <SOURCE_ROOT>/examples/server.yml azarrot/
+
+`<SOURCE_ROOT>` is the path to your local clone of the Azarrot repository.
+
+In `server.yml` you can configure settings such as the listening port and the models path.
+
+Next we create the models directory:
+
+.. code-block:: bash
+
+    cd azarrot
+    mkdir models
+
+And copy an example model file into the models directory:
+
+.. code-block:: bash
+
+    cp <SOURCE_ROOT>/examples/CodeQwen1.5-7B-ipex-llm.model.yml models/
+
+Azarrot will load all `.model.yml` files in this directory.
+You need to download the model from Hugging Face manually, or convert it first if you are using the OpenVINO backend:
+
+.. code-block:: bash
+
+    huggingface-cli download --local-dir models/CodeQwen1.5-7B Qwen/CodeQwen1.5-7B
+
+Azarrot will convert it to `int4` when loading the model with the IPEX-LLM backend.
+
+Start the server
+----------------
+
+Now we can start the server:
+
+.. code-block:: bash
+
+    source /opt/intel/oneapi/setvars.sh
+    python -m azarrot
+
+Then access `http://localhost:8080/v1/models` to see all loaded models.
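
The last step of the quick-start file can also be checked from a shell rather than a browser. The sketch below is only an illustration: `AZARROT_URL`, `models_url`, and `list_models` are hypothetical names introduced here, not part of Azarrot. It assumes `curl` is installed and that the server listens on port 8080, as the URL in the quick-start suggests; adjust the base URL if your `server.yml` sets a different port.

```shell
#!/usr/bin/env bash
# Base URL of the running server. Port 8080 comes from the quick-start text;
# AZARROT_URL is a hypothetical override variable used only in this sketch.
AZARROT_URL="${AZARROT_URL:-http://localhost:8080}"

# Build the model-listing URL, tolerating one trailing slash on the base URL.
models_url() {
    printf '%s/v1/models\n' "${AZARROT_URL%/}"
}

# Query the endpoint and print the raw response.
# Prints a hint and returns non-zero if the request fails.
list_models() {
    curl --silent --show-error --fail "$(models_url)" || {
        echo "Could not reach $(models_url); is the server running?" >&2
        return 1
    }
}
```

After starting the server with `python -m azarrot`, source this snippet and run `list_models`. The block itself only defines the helpers, so loading it makes no network requests.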