Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(community): update to OpenLLM 0.6 #24609

Open
wants to merge 5 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
86 changes: 14 additions & 72 deletions docs/docs/integrations/llms/openllm.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,14 @@
"source": [
"# OpenLLM\n",
"\n",
"[🦾 OpenLLM](https://github.com/bentoml/OpenLLM) is an open platform for operating large language models (LLMs) in production. It enables developers to easily run inference with any open-source LLMs, deploy to the cloud or on-premises, and build powerful AI apps."
"[🦾 OpenLLM](https://github.com/bentoml/OpenLLM) lets developers run any **open-source LLMs** as **OpenAI-compatible API** endpoints with **a single command**.\n",
"\n",
"- 🔬 Build for fast and production usages\n",
"- 🚂 Support llama3, qwen2, gemma, etc, and many **quantized** versions [full list](https://github.com/bentoml/openllm-models)\n",
"- ⛓️ OpenAI-compatible API\n",
"- 💬 Built-in ChatGPT like UI\n",
"- 🔥 Accelerated LLM decoding with state-of-the-art inference backends\n",
"- 🌥️ Ready for enterprise-grade cloud deployment (Kubernetes, Docker and BentoCloud)"
]
},
{
Expand Down Expand Up @@ -37,10 +44,10 @@
"source": [
"## Launch OpenLLM server locally\n",
"\n",
"To start an LLM server, use `openllm start` command. For example, to start a dolly-v2 server, run the following command from a terminal:\n",
"To start an LLM server, use `openllm hello` command:\n",
"\n",
"```bash\n",
"openllm start dolly-v2\n",
"openllm hello\n",
"```\n",
"\n",
"\n",
Expand All @@ -57,83 +64,18 @@
"from langchain_community.llms import OpenLLM\n",
"\n",
"server_url = \"http://localhost:3000\" # Replace with remote host if you are running on a remote server\n",
"llm = OpenLLM(server_url=server_url)"
]
},
{
"cell_type": "markdown",
"id": "4f830f9d",
"metadata": {},
"source": [
"### Optional: Local LLM Inference\n",
"\n",
"You may also choose to initialize an LLM managed by OpenLLM locally from current process. This is useful for development purpose and allows developers to quickly try out different types of LLMs.\n",
"\n",
"When moving LLM applications to production, we recommend deploying the OpenLLM server separately and access via the `server_url` option demonstrated above.\n",
"\n",
"To load an LLM locally via the LangChain wrapper:"
"llm = OpenLLM(base_url=server_url, api_key=\"na\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "82c392b6",
"id": "56cb4bc0",
"metadata": {},
"outputs": [],
"source": [
"from langchain_community.llms import OpenLLM\n",
"\n",
"llm = OpenLLM(\n",
" model_name=\"dolly-v2\",\n",
" model_id=\"databricks/dolly-v2-3b\",\n",
" temperature=0.94,\n",
" repetition_penalty=1.2,\n",
")"
]
},
{
"cell_type": "markdown",
"id": "f15ebe0d",
"metadata": {},
"source": [
"### Integrate with a LLMChain"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "8b02a97a",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"iLkb\n"
]
}
],
"source": [
"from langchain.chains import LLMChain\n",
"from langchain_core.prompts import PromptTemplate\n",
"\n",
"template = \"What is a good name for a company that makes {product}?\"\n",
"\n",
"prompt = PromptTemplate.from_template(template)\n",
"\n",
"llm_chain = LLMChain(prompt=prompt, llm=llm)\n",
"\n",
"generated = llm_chain.run(product=\"mechanical keyboard\")\n",
"print(generated)"
"llm(\"To build a LLM from scratch, the following are the steps:\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "56cb4bc0",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
Expand All @@ -152,7 +94,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.10"
"version": "3.11.9"
}
},
"nbformat": 4,
Expand Down
41 changes: 16 additions & 25 deletions docs/docs/integrations/providers/openllm.mdx
Original file line number Diff line number Diff line change
@@ -1,11 +1,17 @@
---
keywords: [openllm]
---

# OpenLLM

This page demonstrates how to use [OpenLLM](https://github.com/bentoml/OpenLLM)
with LangChain.
OpenLLM lets developers run any **open-source LLMs** as **OpenAI-compatible API** endpoints with **a single command**.

`OpenLLM` is an open platform for operating large language models (LLMs) in
production. It enables developers to easily run inference with any open-source
LLMs, deploy to the cloud or on-premises, and build powerful AI apps.
- 🔬 Build for fast and production usages
- 🚂 Support llama3, qwen2, gemma, etc, and many **quantized** versions [full list](https://github.com/bentoml/openllm-models)
- ⛓️ OpenAI-compatible API
- 💬 Built-in ChatGPT like UI
- 🔥 Accelerated LLM decoding with state-of-the-art inference backends
- 🌥️ Ready for enterprise-grade cloud deployment (Kubernetes, Docker and BentoCloud)

## Installation and Setup

Expand All @@ -23,43 +29,28 @@ are pre-optimized for OpenLLM.

## Wrappers

There is a OpenLLM Wrapper which supports loading LLM in-process or accessing a
remote OpenLLM server:
There is a OpenLLM Wrapper which supports interacting with running server with OpenLLM:

```python
from langchain_community.llms import OpenLLM
```

### Wrapper for OpenLLM server

This wrapper supports connecting to an OpenLLM server via HTTP or gRPC. The
OpenLLM server can run either locally or on the cloud.
This wrapper supports interacting with OpenLLM's OpenAI-compatible endpoint.

To try it out locally, start an OpenLLM server:
To run a model, do:

```bash
openllm start flan-t5
openllm hello
```

Wrapper usage:

```python
from langchain_community.llms import OpenLLM

llm = OpenLLM(server_url='http://localhost:3000')

llm("What is the difference between a duck and a goose? And why there are so many Goose in Canada?")
```

### Wrapper for Local Inference

You can also use the OpenLLM wrapper to load LLM in current Python process for
running inference.

```python
from langchain_community.llms import OpenLLM

llm = OpenLLM(model_name="dolly-v2", model_id='databricks/dolly-v2-7b')
llm = OpenLLM(base_url="http://localhost:3000/v1", api_key="na")

llm("What is the difference between a duck and a goose? And why there are so many Goose in Canada?")
```
Expand Down
1 change: 0 additions & 1 deletion libs/community/langchain_community/llms/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -1053,7 +1053,6 @@ def get_type_to_cls_dict() -> Dict[str, Callable[[], Type[BaseLLM]]]:
"vertexai": _import_vertex,
"vertexai_model_garden": _import_vertex_model_garden,
"openllm": _import_openllm,
"openllm_client": _import_openllm,
"vllm": _import_vllm,
"vllm_openai": _import_vllm_openai,
"watsonxllm": _import_watsonxllm,
Expand Down
Loading
Loading