Improve clarity in code and docs #1052

Open
wants to merge 3 commits into
base: master
34 changes: 25 additions & 9 deletions docs/environment_setup.md
Expand Up @@ -30,6 +30,12 @@ In this setup guide, let's run the `examples/basics` project.

```{prompt} bash
git clone https://github.com/flyteorg/flytesnacks

# or if your SSH key is registered on GitHub:
git clone git@github.com:flyteorg/flytesnacks.git

# or if you use the `gh` tool:
gh repo clone flyteorg/flytesnacks
cd flytesnacks/examples/basics
pip install -r requirements.txt
```
Expand Down Expand Up @@ -67,8 +73,9 @@ pyflyte run basics/hello_world.py my_wf
```

:::{note}
The first couple arguments of `pyflyte run` is in the form of `path/to/script.py <workflow_name>`, where
`<workflow_name>` is the function decorated with `@workflow` that you want to run.
The first two arguments to `pyflyte run` take the form
`path/to/script.py <workflow_name>`, where `<workflow_name>` is the function
decorated with `@workflow` that you want to run.
:::
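As a rough sketch of the argument shape described in the note, the command line could be assembled in Python as shown here. The helper name `build_run_command` is made up for illustration and is not part of flytekit; only the positional order and the `--remote` flag come from this guide.

```python
import shlex


def build_run_command(script: str, workflow: str, remote: bool = False) -> list[str]:
    """Assemble the argv for `pyflyte run` (hypothetical helper, not a flytekit API)."""
    cmd = ["pyflyte", "run"]
    if remote:
        # Options of `pyflyte run` itself are placed before the script path.
        cmd.append("--remote")
    # The two positional arguments: script path, then the @workflow function name.
    cmd += [script, workflow]
    return cmd


local_cmd = build_run_command("basics/hello_world.py", "my_wf")
remote_cmd = build_run_command("basics/hello_world.py", "my_wf", remote=True)
print(shlex.join(local_cmd))
print(shlex.join(remote_cmd))
```

Building the command as a list rather than a single string avoids quoting problems if a path contains spaces.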

To run the workflow on the demo Flyte cluster, all you need to do is supply the `--remote` flag:
Expand Down Expand Up @@ -103,7 +110,11 @@ option as `--arg-name`.

## Visualizing Workflows

Workflows can be visualized as DAGs on the UI. However, you can visualize workflows on the browser and in the terminal by *just* using your terminal.
Workflows can be visualized as DAGs in the UI. You can also generate a
visualization from your terminal and have it displayed in your default web
browser. This visualization uses the service at graph.flyte.org to render
Graphviz diagrams, and hence shares your DAG (but not your data or code) with
an outside party (security hint 🔐).

To view the workflow in the browser:

Expand All @@ -127,15 +138,20 @@ flytectl get workflows \
basics.basic_workflow.my_wf
```

Replace `<version>` with version from console UI, it may look something like `BLrGKJaYsW2ME1PaoirK1g==`
Replace `<version>` with the base64-encoded version shown in the console UI,
which looks something like `BLrGKJaYsW2ME1PaoirK1g==`.
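If you want to sanity-check a version string before pasting it into the command, a minimal sketch using only the Python standard library is shown here. It assumes nothing about how Flyte derives the value; it only confirms that the sample string from this guide is well-formed base64.

```python
import base64

version = "BLrGKJaYsW2ME1PaoirK1g=="  # sample value quoted in this guide

# validate=True rejects characters outside the base64 alphabet.
raw = base64.b64decode(version, validate=True)
print(len(raw), raw.hex())
```

A malformed string raises `binascii.Error`, which is a quick hint that the value was truncated when copied.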

:::{tip}
Running most of the examples in the **User Guide** only requires the default Docker image that ships with Flyte.
Many examples in the {ref}`tutorials` and {ref}`integrations` section depend on additional libraries, `sklearn`,
`pytorch`, or `tensorflow`, which will not work with the default docker image used by `pyflyte run`.

These examples will explicitly show you which images to use for running these examples by passing in the docker
image you want to use with the `--image` option in `pyflyte run`.
Running most of the examples in the **User Guide** only requires the default
Docker image that ships with Flyte. Many examples in the {ref}`tutorials` and
{ref}`integrations` sections depend on additional libraries such as `sklearn`,
`pytorch`, or `tensorflow`, which will not work with the default Docker image
used by `pyflyte run`.

Those examples explicitly state which image to use; you pass it to
`pyflyte run` with the `--image` option.
:::

🎉 Congrats! Now you can run all the examples in the {ref}`userguide` 🎉
4 changes: 2 additions & 2 deletions docs/getting_started/package_register.md
Expand Up @@ -269,7 +269,7 @@ By default, the `docker_build.sh` script:
- Uses the `PROJECT_NAME` specified in the `pyflyte init` command, which in
this case is `my_project`.
- Will not use any remote registry.
- Uses the git sha to version your tasks and workflows.
- Uses the git revision SHA1 to version your tasks and workflows.
Suggested change
- Uses the git revision SHA1 to version your tasks and workflows.
- Uses the git revision ID (the SHA hash) to version your tasks and workflows.

```

You can override the default values with the following flags:
Expand Down Expand Up @@ -367,7 +367,7 @@ Let's break down what each flag is doing here:
- `--archive`: This argument allows you to pass in a package file, which in
this case is `flyte-package.tgz`.
- `--version`: This is a version string that can be any string, but we recommend
using the git sha in general, especially in production use cases.
using the git revision in general, especially in production use cases.
Suggested change
using the git revision in general, especially in production use cases.
using the git revision ID (the SHA hash), especially in production use cases.


### Using `pyflyte register` versus `pyflyte package` + `flytectl register`

71 changes: 44 additions & 27 deletions docs/index.md
Expand Up @@ -33,8 +33,8 @@ on your local machine.
:title: text-muted
:animate: fade-in-slide-down

The introduction below is also available on a hosted sandbox environment, where
you can get started with Flyte without installing anything locally.
Union.ai provides a hosted sandbox environment, free of charge, where you can
get started with Flyte without installing anything locally.

```{link-button} https://sandbox.union.ai/
---
Expand Down Expand Up @@ -73,39 +73,48 @@ First install [flytekit](https://pypi.org/project/flytekit/), Flyte's Python SDK
pip install flytekit flytekitplugins-deck-standard scikit-learn
```

Then install [flytectl](https://docs.flyte.org/projects/flytectl/en/latest/),
Next install [flytectl](https://docs.flyte.org/projects/flytectl/en/latest/),
which is the command-line interface for interacting with a Flyte backend.

````{tabbed} Homebrew
````{tabbed} Homebrew (macOS)

```{prompt} bash $
brew install flyteorg/homebrew-tap/flytectl
```

````

````{tabbed} Curl
````{tabbed} Curl (Unix-like)

```{prompt} bash $
curl -sL https://ctl.flyte.org/install | sudo bash -s -- -b /usr/local/bin
```

````

````{tabbed} Windows

```{prompt} C:\>
TODO
```

````


## Creating a Workflow

The first workflow we'll create is a simple model training workflow that consists
of three steps that will:

1. 🍷 Get the classic [wine dataset](https://scikit-learn.org/stable/datasets/toy_dataset.html#wine-recognition-dataset)
using [sklearn](https://scikit-learn.org/stable/).
2. 📊 Process the data that simplifies the 3-class prediction problem into a
binary classification problem by consolidating class labels `1` and `2` into
a single class.
3. 🤖 Train a `LogisticRegression` model to learn a binary classifier.
2. 📊 Process the data, simplifying the 3-class prediction problem into a binary
classification problem by consolidating class labels `1` and `2` into a single
class.
3. 🤖 Train a `LogisticRegression` model to create a binary classifier.
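The label consolidation in step 2 can be sketched in plain Python, without pandas, to make the intended mapping explicit. The function name `consolidate_labels` is hypothetical and only illustrates the rule: class `0` stays as it is, and every other class becomes `1`.

```python
def consolidate_labels(targets: list[int]) -> list[int]:
    """Keep class 0 as-is and merge every other class into class 1."""
    return [0 if t == 0 else 1 for t in targets]


labels = [0, 1, 2, 2, 0, 1]
binary = consolidate_labels(labels)
print(binary)  # [0, 1, 1, 1, 0, 1]
```

With a DataFrame, the same rule corresponds to a boolean mask that selects the rows whose `target` is not `0`.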

First, we'll define three tasks for each of these steps. Create a file called
`example.py` and copy the following code into it.
Let's define three tasks, one for each of these steps. Create a file called
`example.py` and copy the following code into it.

```{code-cell} python
:tags: [remove-output]
Expand All @@ -126,7 +135,9 @@ def get_data() -> pd.DataFrame:
@task
def process_data(data: pd.DataFrame) -> pd.DataFrame:
"""Simplify the task from a 3-class to a binary classification problem."""
return data.assign(target=lambda x: x["target"].where(x["target"] == 0, 1))
    df = data.copy()
    # Keep class 0 and merge classes 1 and 2 into class 1.
    df.loc[df["target"] != 0, "target"] = 1
    return df

@task
def train_model(data: pd.DataFrame, hyperparameters: dict) -> LogisticRegression:
Expand All @@ -139,10 +150,11 @@ def train_model(data: pd.DataFrame, hyperparameters: dict) -> LogisticRegression
As we can see in the code snippet above, we defined three tasks as Python
functions: `get_data`, `process_data`, and `train_model`.

In Flyte, **tasks** are the most basic unit of compute and serve as the building
blocks 🧱 for more complex applications. A task is a function that takes some
inputs and produces an output. We can use these tasks to define a simple model
training workflow:
In Flyte, **tasks** are the most basic "unit of compute" (per Kubernetes
jargon) and serve as the building blocks 🧱 for more complex applications.
At its core, a task is simply a function: it takes inputs and produces an
output. We can use these tasks to define a simple model training workflow:


```{code-cell} python
@workflow
Expand All @@ -165,15 +177,15 @@ is typically written with inputs and outputs.
A **workflow** is also defined as a Python function, and it specifies the flow
of data between tasks and, more generally, the dependencies between tasks 🔀.

::::{dropdown} {fa}`info-circle` The code above looks like Python, but what do `@task` and `@workflow` do exactly?
::::{dropdown} {fa}`info-circle` This looks like typical Python, but what do `@task` and `@workflow` do?
:title: text-muted
:animate: fade-in-slide-down

Flyte `@task` and `@workflow` decorators are designed to work seamlessly with
your code-base, provided that the *decorated function is at the top-level scope
of the module*.

This means that you can invoke tasks and workflows as regular Python methods and
This means that you can invoke tasks and workflows as regular Python functions and
even import and use them in other Python modules or scripts.

:::{note}
Expand Down Expand Up @@ -202,16 +214,19 @@ pyflyte run example.py training_workflow \
:animate: fade-in-slide-down

If you're using Bash, you can ignore this 🙂
You may need to add .local/bin to your PATH variable if it's not already set,
as that's not automatically added for non-bourne shells like fish or xzsh.

To use pyflyte, make sure to set the /.local/bin directory in PATH
You may need to add `~/.local/bin` to your `PATH` variable if it's not already
there; it may not be added automatically for non-Bourne shells. For example, if
you use `fish` or `csh`, you can set it with:

:::{code-block} fish
set -gx PATH $PATH ~/.local/bin
set -gx PATH $PATH ~/.local/bin # fish
:::

:::{code-block} csh
set path = ($path $HOME/.local/bin) # csh/tcsh
:::
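To check whether the directory is already on your `PATH` regardless of which shell you use, a small standard-library sketch could look like this. The helper `is_on_path` is an illustrative name, not something provided by flytekit.

```python
import os
from pathlib import Path


def is_on_path(directory: str, path_value: str) -> bool:
    """Return True if `directory` is one of the entries in a PATH-style string."""
    wanted = Path(directory).expanduser()
    entries = [Path(p).expanduser() for p in path_value.split(os.pathsep) if p]
    return wanted in entries


local_bin = str(Path.home() / ".local" / "bin")
print(is_on_path(local_bin, os.environ.get("PATH", "")))
```

If this prints `False`, add the directory using one of the shell snippets above and start a new shell session.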
:::::

:::::


:::::{dropdown} {fa}`info-circle` Why use `pyflyte run` rather than `python example.py`?
Expand All @@ -223,7 +238,9 @@ set -gx PATH $PATH ~/.local/bin

Keyword arguments can be supplied to ``pyflyte run`` by passing in options in
the format ``--kwarg value``, and in the case of ``snake_case_arg`` argument
names, you can pass in options in the form of ``--snake-case-arg value``.
names, you can optionally spell them as "kebab case," for example as
``--snake-case-arg value``.
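The naming convention above can be sketched as a small conversion from Python keyword arguments to command-line options. Both helper names are hypothetical and only illustrate the spelling rule; they are not part of `pyflyte`.

```python
def to_cli_option(arg_name: str) -> str:
    """Spell a Python keyword-argument name as a kebab-case CLI option."""
    return "--" + arg_name.replace("_", "-")


def kwargs_to_options(kwargs: dict[str, object]) -> list[str]:
    """Flatten keyword arguments into an option/value list."""
    options: list[str] = []
    for name, value in kwargs.items():
        options += [to_cli_option(name), str(value)]
    return options


print(kwargs_to_options({"snake_case_arg": 3}))
```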


::::{note}
If you want to run a workflow with `python example.py`, you would have to write
Expand Down Expand Up @@ -347,8 +364,8 @@ There are a few features about FlyteConsole worth pointing out in the GIF above:
## What's Next?

Follow the rest of the sections in the documentation to get a better
understanding of the key constructs that make Flyte such a powerful
orchestration tool 💪.
understanding of the key constructs that make Flyte a powerful orchestration
tool 💪.

```{admonition} Recommendation
:class: tip
17 changes: 10 additions & 7 deletions examples/basics/basics/hello_world.py
Expand Up @@ -12,19 +12,22 @@
from flytekit import task, workflow

# %% [markdown]
# You can change the signature of the workflow to take in an argument like this:

# You can change the signature of the task to take in an argument like this:
# def say_hello(name: str) -> str:
# return f"hello {name}"
# %%
@task
def say_hello() -> str:
return "hello world"


# %% [markdown]
# You can treat the outputs of a task as you normally would a Python function. Assign the output to two variables
# and use them in subsequent tasks as normal. See {py:func}`flytekit.workflow`
# You can treat the output of a task as you would the return value of any Python function.
# Assign the output to two variables and use them in subsequent tasks as normal.
# See {py:func}`flytekit.workflow`.
# You can change the signature of the workflow to take in an argument like this:

# def my_wf(name: str) -> str:
# ...
# %%
@workflow
def my_wf() -> str:
Expand All @@ -49,5 +52,5 @@ def my_wf() -> str:


# %% [markdown]
# In the next few examples you'll learn more about the core ideas of Flyte, which are tasks, workflows, and launch
# plans.
# In the next few examples you'll learn more about the core ideas of Flyte,
# which are tasks, workflows, and launch plans.