Commit
docs: add concepts, faq, migration guide (#411)
* docs: add concepts, faq, migration guide
  Signed-off-by: Keming <kemingy94@gmail.com>
* fix the link checker
  Signed-off-by: Keming <kemingy94@gmail.com>
* change sphinx build warning to error
  Signed-off-by: Keming <kemingy94@gmail.com>
* Apply suggestions from code review
  Co-authored-by: zclzc <38581401+lkevinzc@users.noreply.github.com>
  Signed-off-by: Keming <kemingy94@gmail.com>

---------

Signed-off-by: Keming <kemingy94@gmail.com>
Co-authored-by: zclzc <38581401+lkevinzc@users.noreply.github.com>
Showing 8 changed files with 79 additions and 7 deletions.
@@ -0,0 +1,43 @@
# Concepts and FAQs

There are a few terms used in `mosec`.

- `worker`: a Python process that executes the `forward` method of a class inheriting from [`mosec.Worker`](mosec.worker.Worker)
- `stage`: one processing unit in the pipeline; each stage contains several `worker` replicas
  - each stage retrieves data from the previous stage and passes its result to the next stage
  - retrieved data is deserialized by the [`Worker.deserialize_ipc`](mosec.worker.Worker.deserialize_ipc) method
  - data to be passed on is serialized by the [`Worker.serialize_ipc`](mosec.worker.Worker.serialize_ipc) method
- `ingress/egress`: the first/last stage in the pipeline
  - ingress gets data from the client, while egress sends data to the client
  - data is deserialized by the ingress [`Worker.deserialize`](mosec.worker.Worker.deserialize) method and serialized by the egress [`Worker.serialize`](mosec.worker.Worker.serialize) method
- `pipeline`: a chain of processing stages
- `dynamic batching`: batch requests until either the max batch size or the max wait time is reached
- `controller`: a Rust tokio thread that does the following:
  - reads from the previous queue to get new tasks
  - sends tasks to the ready-to-process worker via the Unix domain socket
  - receives results from the worker
  - sends the tasks to the next queue
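
The dynamic batching rule and the controller steps listed above can be sketched in plain Python. This is an illustration only, not how `mosec` implements them: the real controller is written in Rust and reaches workers over a Unix domain socket, while `collect_batch`, `controller_step`, and the in-process queues below are hypothetical names chosen for this sketch.

```python
import queue
import time


def collect_batch(tasks: queue.Queue, max_batch_size: int, max_wait_time: float) -> list:
    """Gather tasks until the batch is full or the wait time has elapsed."""
    batch = [tasks.get()]  # block until the first task arrives
    deadline = time.monotonic() + max_wait_time
    while len(batch) < max_batch_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # max wait time reached
        try:
            batch.append(tasks.get(timeout=remaining))
        except queue.Empty:
            break  # no more tasks arrived in time
    return batch


def controller_step(prev_queue, next_queue, worker, max_batch_size, max_wait_time):
    """One controller iteration: read tasks, run the worker, forward results."""
    batch = collect_batch(prev_queue, max_batch_size, max_wait_time)
    for result in worker(batch):  # stand-in for the socket round trip
        next_queue.put(result)
```

The wait timer starts only after the first task arrives, so an idle stage does not emit empty batches. Whether `mosec` starts its timer at the same point is an assumption of this sketch.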

## FAQs

### How to raise an exception?

Use the `raise` keyword with one of the errors in [mosec.errors](mosec.errors). Any other exception is treated as a "500 Internal Server Error".

If a request raises an exception, the error is returned to the client directly, skipping the remaining stages.
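
The short-circuit behavior described above can be sketched without the library. `RecognizedError` is a hypothetical stand-in for a class from `mosec.errors`, its status code is illustrative, and `run_stages` is an invented helper rather than a `mosec` API.

```python
class RecognizedError(Exception):
    """Hypothetical stand-in for an error class from `mosec.errors`."""

    status = 400  # illustrative status code, not taken from mosec


def run_stages(stages, payload):
    """Run stages in order and stop at the first exception."""
    for stage in stages:
        try:
            payload = stage(payload)
        except RecognizedError as err:
            # A recognized error goes straight back to the client.
            return err.status, str(err)
        except Exception:
            # Any other exception is reported as a 500.
            return 500, "Internal Server Error"
    return 200, payload


def validate(x):
    """Demo stage: reject negative input with a recognized error."""
    if x < 0:
        raise RecognizedError("negative input")
    return x


def double(x):
    """Demo stage: runs only if every earlier stage succeeded."""
    return x * 2
```

With these demo stages, a negative input never reaches `double`, because the loop returns as soon as `validate` raises.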

### How to change the serialization/deserialization methods?

Let the ingress/egress worker inherit from a suitable mixin such as [`MsgpackMixin`](mosec.mixin.MsgpackMixin).

```{note}
The inheritance order matters in Python. Check [multiple inheritance](https://docs.python.org/3/tutorial/classes.html#multiple-inheritance) for more information.
```

You can also implement the `serialize`/`deserialize` methods directly in your ingress/egress worker.
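
To see why the inheritance order matters, here is a minimal sketch using only the standard library. `BaseWorker` and `PickleMixin` are hypothetical stand-ins for a worker base class and a serialization mixin, and the JSON default is an assumption of the sketch rather than a statement about `mosec`.

```python
import json
import pickle


class BaseWorker:
    """Hypothetical stand-in for a worker base class with JSON defaults."""

    def serialize(self, data) -> bytes:
        return json.dumps(data).encode()

    def deserialize(self, data: bytes):
        return json.loads(data)


class PickleMixin:
    """Hypothetical mixin that swaps the wire format to pickle."""

    def serialize(self, data) -> bytes:
        return pickle.dumps(data)

    def deserialize(self, data: bytes):
        return pickle.loads(data)


class MixinFirst(PickleMixin, BaseWorker):
    """The mixin precedes the base class, so its methods win."""


class BaseFirst(BaseWorker, PickleMixin):
    """The base class precedes the mixin, so the mixin is shadowed."""
```

Python resolves methods left to right along the method resolution order, so only `MixinFirst` picks up the mixin's `serialize` and `deserialize`. In `BaseFirst` the mixin has no effect.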

### How to share configurations among different workers?

If the configuration object is initialized globally, every worker can use it directly.

To give different workers different configurations, the best way is to use the `env` argument (see [`append_worker`](mosec.server.Server.append_worker)).
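
A minimal sketch of the per-worker approach, assuming that `env` accepts one mapping of environment variables per replica. `replica_envs` is a hypothetical helper, `CUDA_VISIBLE_DEVICES` is only an illustrative variable, and the `Inference` class omits the `mosec.Worker` base so that the sketch runs without the library.

```python
import os


def replica_envs(num: int) -> list:
    """Build one environment mapping per replica (hypothetical helper)."""
    return [{"CUDA_VISIBLE_DEVICES": str(i)} for i in range(num)]


class Inference:
    """Worker-like class; a real one would inherit from `mosec.Worker`."""

    def __init__(self):
        # Each replica runs in its own process, so it reads its own value.
        self.device = os.environ.get("CUDA_VISIBLE_DEVICES", "")
```

Under that assumption, the list would be passed along the lines of `append_worker(Inference, num=2, env=replica_envs(2))`. Check the `append_worker` reference for the exact parameter type.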
@@ -0,0 +1,26 @@
# Migration Guide

This guide will help you migrate from other frameworks to `mosec`.

## From the `Triton Inference Server`

Both [`PyTriton`](https://github.com/triton-inference-server/pytriton) and the [`Triton Python Backend`](https://github.com/triton-inference-server/python_backend) use the [`Triton Inference Server`](https://github.com/triton-inference-server).

- `mosec` doesn't require a specific client; you can use any HTTP client library
- dynamic batching is configured when calling [`append_worker`](mosec.server.Server.append_worker)
- `mosec` doesn't require you to declare the `inputs` and `outputs`. If you want to validate the request, you can use the [`TypedMsgPackMixin`](mosec.mixin.typed_worker.TypedMsgPackMixin) (see [Validate Request](https://mosecorg.github.io/mosec/examples/validate.html))

### `Triton Python Backend`

- change the `TritonPythonModel` class to a worker class that inherits from [`mosec.Worker`](mosec.worker.Worker)
- move the body of the `initialize` method into the `__init__` method of the new class
- move the body of the `execute` method into the `forward` method of the new class
- if you still want the logic in the `auto_complete_config` method, merge it into the `__init__` method
- `mosec` has no counterpart to the `finalize` method, which serves as an unloading handler
- `mosec` doesn't require any special model directories or configurations
- to run multiple replicas, set `num` in [`append_worker`](mosec.server.Server.append_worker)
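
The method mapping above could look roughly like the following after migration. The local `Worker` class is a stand-in so that the sketch runs without `mosec` installed, and the `scale` attribute is a hypothetical placeholder for a real model.

```python
class Worker:
    """Stand-in for `mosec.Worker` so the sketch runs without the library."""

    def forward(self, data):
        raise NotImplementedError


class Inference(Worker):
    def __init__(self):
        super().__init__()
        # Formerly the body of `TritonPythonModel.initialize`:
        # load the model once per worker process.
        self.scale = 2  # hypothetical placeholder for a loaded model

    def forward(self, data):
        # Formerly the body of `TritonPythonModel.execute`.
        return [x * self.scale for x in data]
```

In a real service the class would inherit from `mosec.Worker` instead of the stand-in and be registered through `append_worker`, with `num` controlling the replica count.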

### `PyTriton`

- move the model loading logic to the `__init__` method, since the worker is initialized in a different process
- move the body of the `infer_func` function into the `forward` method
@@ -0,0 +1,2 @@
scheme = [ "https", "http", "mailto" ]
exclude_loopback = true