WIP: Pipelines Docs #16639

Draft
wants to merge 16 commits into
base: production
Changes from 12 commits
11 changes: 11 additions & 0 deletions src/content/changelogs/pipelines.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
---
link: "/pipelines/reference/changelog/"
productName: Pipelines
productLink: "/pipelines/"
productArea: Developer Platform
productAreaLink: "/pipelines/"
entries:
  - publish_date: "2024-09-24"
    title: Pipelines is now in public beta.
    description: |-
      Pipelines, a new product to ingest and store real-time streaming data, is now in public beta. The public beta is available to any user with a [free or paid Workers plan](/workers/platform/pricing/). Create a Pipeline, and you'll be able to post data to it via HTTP or from a Cloudflare Worker. Pipelines handles batching, buffering, and partitioning the data before writing it to an R2 bucket of your choice. It's useful for collecting clickstream data or ingesting logs from a service. Start building with our [get started guide](/pipelines/get-started/).
12 changes: 12 additions & 0 deletions src/content/docs/pipelines/examples/index.mdx
@@ -0,0 +1,12 @@
---
title: Examples
pcx_content_type: navigation
sidebar:
  order: 4
  group:
    hideIndex: true
---

import { DirectoryListing } from "~/components"

<DirectoryListing />
76 changes: 76 additions & 0 deletions src/content/docs/pipelines/get-started.mdx
@@ -0,0 +1,76 @@
---
title: Get started
pcx_content_type: get-started
sidebar:
  order: 2
head:
  - tag: title
    content: Get started
---

import { Render, PackageManagers } from "~/components";

:::note

Pipelines is in **public beta**. Any developer with a [paid Workers plan](/workers/platform/pricing/#workers) can start using Pipelines immediately.

:::

Pipelines lets you ingest real-time streaming data and store it in R2. Pipelines handles batching, partitioning, and optionally compressing your final files. By following this guide, you will:

1. Create your first Pipeline.
2. Connect it to your R2 bucket.
3. Post data to it via HTTP.

## Prerequisites

To use Pipelines, you will need:

<Render file="prereqs" product="workers" />

### 1. Enable Pipelines
TODO

## 1. Set up an R2 bucket to use as a destination
Pipelines is built to ingest data and store it in an R2 bucket. Create a bucket, following our [Get Started Guide for R2](/r2/get-started/) if you need to. Save the bucket name; you'll need it for the next step.

## 2. Create a Pipeline
To create a pipeline using Wrangler, run this command in a shell. Specify the name of your pipeline, as well as the name of the R2 bucket you created in Step 1.

```sh
npx wrangler pipelines create <PIPELINE-NAME> --r2 <R2-BUCKET-NAME>
```

Choose a descriptive name for your Pipeline, related to the type of events you intend to ingest. You cannot change the Pipeline name after you have set it.

Pipeline names must be 1 to 63 characters long. Pipeline names cannot contain special characters other than dashes (`-`), and must start and end with a letter or number.
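
The naming rules above can be checked locally before running Wrangler. The following is a minimal sketch in Python, not part of Wrangler: the function name `is_valid_pipeline_name` is hypothetical, and the pattern assumes that "letter or number" means ASCII letters and digits, which the rules do not state explicitly.

```python
import re

# Assumption: "letter or number" means ASCII letters and digits.
# The pattern allows 1 to 63 characters made of letters, digits, and
# dashes, and requires the first and last character to be a letter or digit.
_NAME_RE = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_valid_pipeline_name(name: str) -> bool:
    """Return True if `name` satisfies the documented naming rules."""
    return _NAME_RE.fullmatch(name) is not None
```

For example, `is_valid_pipeline_name("clickstream-events")` returns `True`, while `"-logs"`, `"logs-"`, and `"my_pipeline"` are rejected.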

Once your pipeline is created, you'll receive an HTTP endpoint that you can post data to. The output should resemble the following:

```sh
🌀 Authorizing R2 bucket "<R2-BUCKET-NAME>"
🌀 Creating pipeline named "<PIPELINE-NAME>"
✅ Successfully created pipeline <PIPELINE-NAME> with ID <PIPELINE-ID>

You can now send data to your pipeline with:
curl "https://<PIPELINE-ID>.pipelines.cloudflare.com/" -d '[{ ...JSON_DATA... }]'
```

## 3. Post data to your pipeline

Use a curl command in your terminal to post an array of JSON objects to the endpoint you received in Step 2.

```sh
curl -H "Content-Type:application/json" \
-d '[{"account_id":"test", "other_data": "test"},{"account_id":"test","other_data": "test2"}]' \
<HTTP-endpoint>
```
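
The same request can be built from application code. The sketch below uses only Python's standard library and mirrors the curl command: a POST with a `Content-Type: application/json` header and a JSON array as the body. The helper name `build_ingest_request` is hypothetical, and the endpoint is whatever URL Wrangler printed when the pipeline was created.

```python
import json
import urllib.request


def build_ingest_request(endpoint: str, records: list[dict]) -> urllib.request.Request:
    """Build a POST request that carries `records` as a JSON array."""
    body = json.dumps(records).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen()` sends it. The shape of the response body is not described in this guide, so the sketch does not try to parse it.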

Once the data has been successfully accepted by the Pipeline, you'll receive a success message.

Pipelines handles batching the data, so you can continue posting data to the Pipeline. Once a batch is filled up, the data will be partitioned by date, and written to your R2 bucket.
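
One way to picture when a batch is "filled up" is through the three batch settings that `wrangler pipelines create` exposes (`--batch-max-mb`, `--batch-max-rows`, `--batch-max-seconds`). The sketch below is an illustration only, not how Pipelines is implemented: it assumes a batch is written as soon as any one threshold is reached, and the `BatchLimits` type, its default values, and `should_flush` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class BatchLimits:
    # Illustrative values only; real values come from the
    # --batch-max-mb, --batch-max-rows, and --batch-max-seconds options.
    max_bytes: int = 10 * 1000 * 1000
    max_rows: int = 10_000
    max_seconds: float = 60.0


def should_flush(size_bytes: int, rows: int, age_seconds: float,
                 limits: BatchLimits) -> bool:
    """Return True once any one of the three thresholds has been reached."""
    return (
        size_bytes >= limits.max_bytes
        or rows >= limits.max_rows
        or age_seconds >= limits.max_seconds
    )
```

Under this assumption, a low-traffic pipeline is flushed by the time threshold, while a busy one is flushed by size or row count first.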

## 4. Verify in R2
Navigate to the dashboard for the R2 bucket that you created in Step 1. You should see a prefix for today's date. Click through, and you'll see a file containing the JSON data you sent in Step 3.

By completing this guide, you have created a pipeline with an HTTP endpoint as its source and an R2 bucket as its destination.
20 changes: 20 additions & 0 deletions src/content/docs/pipelines/index.mdx
@@ -0,0 +1,20 @@
---
title: Overview
type: overview
pcx_content_type: overview
sidebar:
  order: 1
  badge:
    text: Beta
head:
  - tag: title
    content: Pipelines
---

import { Description } from "~/components";

<Description>

Ingest, transform, and store real-time data streams in R2.

</Description>
12 changes: 12 additions & 0 deletions src/content/docs/pipelines/observability/index.mdx
@@ -0,0 +1,12 @@
---
title: Observability
pcx_content_type: navigation
sidebar:
  order: 5
  group:
    hideIndex: true
---

import { DirectoryListing } from "~/components"

<DirectoryListing />
9 changes: 9 additions & 0 deletions src/content/docs/pipelines/observability/metrics.mdx
@@ -0,0 +1,9 @@
---
pcx_content_type: concept
title: Metrics
sidebar:
  order: 10

---

TODO
7 changes: 7 additions & 0 deletions src/content/docs/pipelines/pipelines-api.mdx
@@ -0,0 +1,7 @@
---
pcx_content_type: navigation
title: Pipelines REST API
sidebar:
  order: 10

---
15 changes: 15 additions & 0 deletions src/content/docs/pipelines/reference/changelog.mdx
@@ -0,0 +1,15 @@
---
pcx_content_type: changelog
title: Changelog
changelog_file_name:
  - pipelines
sidebar:
  order: 99

---

import { ProductChangelog } from "~/components"

{/* <!-- Actual content lives in /src/content/changelogs/pipelines.yaml. Update the file there for new entries to appear here. For more details, refer to https://developers.cloudflare.com/style-guide/documentation-content-strategy/content-types/changelog/#yaml-file --> */}

<ProductChangelog />
12 changes: 12 additions & 0 deletions src/content/docs/pipelines/reference/index.mdx
@@ -0,0 +1,12 @@
---
pcx_content_type: navigation
title: Platform
sidebar:
  order: 8
  group:
    hideIndex: true
---

import { DirectoryListing } from "~/components"

<DirectoryListing />
23 changes: 23 additions & 0 deletions src/content/docs/pipelines/reference/limits.mdx
@@ -0,0 +1,23 @@
---
pcx_content_type: concept
title: Limits
sidebar:
  order: 2
---

import { Render } from "~/components"

:::note

Many of these limits will increase during Pipelines' public beta period. [Follow our changelog](/pipelines/reference/changelog/) to keep up with the changes.

:::


| Feature | Limit |
| --------------------------------------------- | ------------------------------------------------------------- |
| Requests per second | 10,000 |
| Maximum payload per request | 1 MB |
| Maximum batch size | 100 MB |
| Maximum batch records | 10,000 |
| Maximum batch duration | 600s |
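
Because a single request can carry at most 1 MB, a client holding a larger set of records has to split it across several requests. The sketch below shows one way to do that in Python. It is an illustration under stated assumptions: `chunk_records` and `MAX_PAYLOAD_BYTES` are hypothetical names, "1 MB" is taken to mean 1,000,000 bytes, and the limit is assumed to apply to the JSON request body as produced by `json.dumps` with its default separators.

```python
import json
from typing import Iterator

# Assumption: "1 MB" is treated as 1,000,000 bytes of JSON request body.
MAX_PAYLOAD_BYTES = 1_000_000


def chunk_records(records: list[dict],
                  max_bytes: int = MAX_PAYLOAD_BYTES) -> Iterator[list[dict]]:
    """Yield lists of records whose JSON-array encoding fits in `max_bytes`."""
    chunk: list[dict] = []
    size = 2  # the enclosing "[" and "]"
    for record in records:
        encoded = len(json.dumps(record).encode("utf-8"))
        if encoded + 2 > max_bytes:
            raise ValueError("a single record exceeds the payload limit")
        # json.dumps separates array items with ", " (two bytes).
        extra = encoded + (2 if chunk else 0)
        if size + extra > max_bytes:
            yield chunk
            chunk, size, extra = [], 2, encoded
        chunk.append(record)
        size += extra
    if chunk:
        yield chunk
```

Each yielded list can then be posted as one request body. Record order is preserved across chunks.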
11 changes: 11 additions & 0 deletions src/content/docs/pipelines/reference/pricing.mdx
@@ -0,0 +1,11 @@
---
pcx_content_type: concept
title: Pricing
sidebar:
  order: 1
head:
  - tag: title
    content: Pipelines Pricing
---

TODO
8 changes: 8 additions & 0 deletions src/content/docs/pipelines/reference/wrangler-commands.mdx
@@ -0,0 +1,8 @@
---
pcx_content_type: navigation
title: Wrangler commands
external_link: /workers/wrangler/commands/#pipelines
sidebar:
  order: 80

---
80 changes: 80 additions & 0 deletions src/content/docs/workers/wrangler/commands.mdx
@@ -30,6 +30,7 @@ Wrangler offers a number of commands to manage your Cloudflare Workers.
- [`secret:bulk`](#secretbulk) - Manage multiple secret variables for a Worker.
- [`tail`](#tail) - Start a session to livestream logs from a deployed Worker.
- [`pages`](#pages) - Configure Cloudflare Pages.
- [`pipelines`](#pipelines) - Configure Cloudflare Pipelines.
- [`queues`](#queues) - Configure Workers Queues.
- [`login`](#login) - Authorize Wrangler with your Cloudflare account using OAuth.
- [`logout`](#logout) - Remove Wrangler’s authorization for accessing your account.
@@ -2116,6 +2117,85 @@ wrangler pages secret bulk [<FILENAME>] [OPTIONS]

---

## `pipelines`
:::note

Pipelines is currently in public beta. Report Pipelines bugs on [GitHub](https://github.com/cloudflare/workers-sdk/issues/new/choose).
:::

Manage your [Pipelines](/pipelines/) configurations.

### `create`

Create a new pipeline.

```txt
wrangler pipelines create <name> --r2 <r2-bucket-name> [OPTIONS]
```

- `name` string required
  - The name of the pipeline to create.
- `--r2` string required
  - The name of the R2 bucket used as the destination to store the data.
- `--batch-max-mb` number optional
  - The maximum size of a batch before data is written, in megabytes. Default 10 MB.
- `--batch-max-rows` number optional
  - The maximum number of rows in a batch before data is written. Default 10,000.
- `--batch-max-seconds` number optional
  - The maximum duration of a batch before data is written, in seconds.
- `--compression` string optional
  - Type of compression to apply to output files. Choices: `none`, `gzip`, `deflate`.
- `--filepath` string optional
  - The path to store files in the destination bucket. Defaults to `event_date=${date}/hr=${hr}`.
- `--filename` string optional
  - The name of the file in the bucket. Must contain `${slug}`. File extension is optional. Defaults to `${slug}-${hr}.json`.
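
To see how the `--filepath` and `--filename` templates might combine into an object key, the sketch below expands the two documented defaults in Python. It is an illustration under assumptions, not Pipelines' own logic: the formats of `${date}` (taken here as `YYYY-MM-DD`) and `${hr}` (taken as a zero-padded hour) are not specified above, `${slug}` is treated as an opaque string, and `object_key` is a hypothetical helper name.

```python
from datetime import datetime, timezone
from string import Template

DEFAULT_FILEPATH = "event_date=${date}/hr=${hr}"
DEFAULT_FILENAME = "${slug}-${hr}.json"


def object_key(when: datetime, slug: str,
               filepath: str = DEFAULT_FILEPATH,
               filename: str = DEFAULT_FILENAME) -> str:
    """Expand the filepath and filename templates into one object key."""
    if "${slug}" not in filename:
        raise ValueError('filename must contain "${slug}"')
    values = {
        "date": when.strftime("%Y-%m-%d"),  # assumed format
        "hr": when.strftime("%H"),          # assumed format
        "slug": slug,                       # opaque placeholder value
    }
    path = Template(filepath).substitute(values)
    name = Template(filename).substitute(values)
    return f"{path}/{name}"


when = datetime(2024, 9, 24, 7, 30, tzinfo=timezone.utc)
print(object_key(when, "abc123"))
# prints: event_date=2024-09-24/hr=07/abc123-07.json
```

The check on `${slug}` mirrors the documented requirement that a custom filename must contain it.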

### `update`

Update an existing pipeline.

```txt
wrangler pipelines update <name> [OPTIONS]
```

- `name` string required
  - The name of the pipeline to update.
- `--r2` string optional
  - The name of the R2 bucket used as the destination to store the data.
- `--batch-max-mb` number optional
  - The maximum size of a batch before data is written, in megabytes. Default 10 MB.
- `--batch-max-rows` number optional
  - The maximum number of rows in a batch before data is written. Default 10,000.
- `--batch-max-seconds` number optional
  - The maximum duration of a batch before data is written, in seconds.
- `--compression` string optional
  - Type of compression to apply to output files. Choices: `none`, `gzip`, `deflate`.
- `--filepath` string optional
  - The path to store files in the destination bucket. Defaults to `event_date=${date}/hr=${hr}`.
- `--filename` string optional
  - The name of the file in the bucket. Must contain `${slug}`. File extension is optional. Defaults to `${slug}-${hr}.json`.

### `delete`

Deletes an existing pipeline.

```txt
wrangler pipelines delete <name> [OPTIONS]
```

### `list`

Lists all pipelines in your account.

```txt
wrangler pipelines list [OPTIONS]
```


## `queues`

:::note
12 changes: 12 additions & 0 deletions src/content/products/pipelines.yaml
@@ -0,0 +1,12 @@
name: Pipelines

product:
  title: Pipelines
  url: /pipelines/
  group: Developer platform
  preview_tryout: true

meta:
  title: Cloudflare Pipelines Docs
  description: Ingest, transform, and store real-time data streams in R2.
  author: '@cloudflare'
1 change: 1 addition & 0 deletions src/icons/pipelines.svg