from friendli import Friendli

client = Friendli(use_dedicated_endpoint=True)
chat = client.chat.completions.create(
    model="{endpoint_id}",
    messages=[
        {
            "role": "user",
            "content": "Give three tips for staying healthy.",
        }
    ]
)

If you want to send a request to a specific adapter of the Multi-LoRA endpoint, provide "{endpoint_id}:{adapter_route}" to model argument:

from friendli import Friendli

client = Friendli(use_dedicated_endpoint=True)
chat = client.chat.completions.create(
   model="{endpoint_id}:{adapter_route}",
   messages=[
       {
           "role": "user",
           "content": "Give three tips for staying healthy.",
       }
   ]
)

Assets 2

06 Aug 01:42

kooyunmo

v1.5.0

0d8f5f9

Release v1.5.0 🚀

Deprecate model conversion and quantization. Alternatively, please use friendli-model-optmizer to quantize your models.
Increase default HTTP timeout.

Assets 2

21 Jul 07:27

kooyunmo

v1.4.2

25a0d6d

Release v1.4.2 🚀

Support for Tool Calling API: Added new API to support tool calling.
Phi3 INT8 Support: Implemented support for Phi3 INT8.
Snowflake Arctic FP8 Quantizer: Introduced new quantizer for Snowflake Arctic FP8.
Added support for INT8 quantization for Llama and refactored quantizer to use only safetensors.

Assets 2

19 Jun 06:13

kooyunmo

v1.4.1

64ee249

Release v1.4.1 🚀

Updating Patch Version

This patch version Introduces explicit resource management to prevent unexpected resource leaks.
By default, the library closes underlying HTTP and gRPC connections when the client is garbage-collected. However, you can now manually close the Friendli or AsyncFriendli client using the .close() method or utilize a context manager to ensure proper closure when exiting a with block.

Usage examples

import asyncio
from friendli import AsyncFriendli

client = AsyncFriendli(base_url="0.0.0.0:8000", use_grpc=True)

async def run():
    async with client:
        stream = await client.completions.create(
            prompt="Explain what gRPC is. Also give me a Python code snippet of gRPC client.",
            stream=True,
            top_k=1,
        )

        async for chunk in stream:
            print(chunk.text, end="", flush=True)

asyncio.run(run())

Assets 2

18 Jun 06:45

kooyunmo

v1.4.0

9bc5ff6

Release v1.4.0 🚀

gRPC client support for completions API.

Assets 2

12 Jun 03:10

kooyunmo

v1.3.7

9343fa7

Release v1.3.7 🚀

Minor: add a default value for the "index" and "text" fields of the completion stream's chunk.

Assets 2

10 Jun 06:40

kooyunmo

v1.3.6

ec053ea

Release v1.3.6 🚀

Support Phi3 FP8 conversion.
Hotfix for safetensor checkpoint saver.

Assets 2

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Updating Patch Version

Usage examples

Releases: friendliai/friendli-client

Release v1.5.4 🚀

Release v1.5.3 🚀

Release v1.5.2 🚀

Release v1.5.1 🚀

Release v1.5.0 🚀

Release v1.4.2 🚀

Release v1.4.1 🚀

Updating Patch Version

Usage examples

Release v1.4.0 🚀

Release v1.3.7 🚀

Release v1.3.6 🚀