RFC 0008: Builtin blocktypes #262

Merged
merged 9 commits into from
Jun 23, 2023
212 changes: 212 additions & 0 deletions rfc/0008-builtin-blocktypes/README.md
@@ -0,0 +1,212 @@
<!--
SPDX-FileCopyrightText: 2023 Friedrich-Alexander-Universitat Erlangen-Nurnberg

SPDX-License-Identifier: AGPL-3.0-only
-->

# RFC 0008: Builtin blocktypes

| | |
|-------------|----------------------|
| Feature Tag | `builtin-blocktypes` |
| Status | `DISCUSSION` | <!-- Possible values: DRAFT, DISCUSSION, ACCEPTED, REJECTED -->
| Responsible | `@felix-oq` |

Review comment (Member): We should probably change the responsible person.

Reply (Contributor): Done in the latest commit. I think we can go ahead like this, right? If we need to iterate on this, I'd rather create a new PR.

<!--
Status Overview:
- DRAFT: The RFC is not ready for a review and currently under change. Feel free to already ask for feedback on the structure and contents at this stage.
- DISCUSSION: The RFC is open for discussion. Usually, we open a PR to trigger discussions.
- ACCEPTED: The RFC was accepted. Create issues to prepare implementation of the RFC.
- REJECTED: The RFC was rejected. If another revision emerges, switch to status DRAFT.
-->

## Summary

Blocktypes that are built into the language are declared explicitly in the syntax, stating their IO types and their properties.
This RFC does not alter the semantics of the language, but it can serve as a foundation for composite blocktypes in an upcoming RFC.

## Motivation

Currently, all blocktypes only exist implicitly in the language.
To make their definition more explicit, we should support the declaration of builtin blocktypes (i.e. blocktypes that are built into the language).
This allows users to look up IO types and properties without having to open the documentation pages.

## Explanation

- `builtin` keyword to indicate that the blocktype is built into the language
  - Produces an error if the blocktype is unknown
- `input` / `output` keywords to type the input and output of the blocktype
  - If omitted, a blocktype has no input or output
- `property` keyword to define a property with a name and a type
  - Optionally, a default value can be assigned
- Adds new keywords for existing property types:
  - Valuetypes: `regex`, `cellrange` / `row` / `column` / `cell`, `valuetype-assignment`
  - Typed collections: `collection<type>`
    - Nested collections are not supported for now
- IO types: `File`, `FileSystem`, `TextFile`, `Sheet`, `Table`

See the following section for concrete code examples.
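
As a compact illustration of the keywords listed above, the following annotated sketch combines them in one declaration. The blocktype `ExampleTransformer` and its properties `skipRows` and `marker` are hypothetical and not part of this RFC. Since no such blocktype is built into the language, this declaration would be expected to trigger the error for unknown blocktypes.

```jayvee
// Hypothetical declaration, for illustration only.
// Assumption: no blocktype named ExampleTransformer is built into the
// language, so `builtin` would be expected to report an error here.
builtin blocktype ExampleTransformer {
  // Leaving out this line would declare a blocktype without input.
  input oftype Sheet;

  // Typed collection using one of the new valuetype keywords.
  property skipRows oftype collection<row>;

  // Property with an optional default value.
  property marker oftype text: "n/a";

  // Leaving out this line would declare a blocktype without output.
  output oftype Sheet;
}
```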

### Syntax for current blocktypes

<details>
<summary>Click to show</summary>

```jayvee
builtin blocktype HttpExtractor {
  property url oftype text;

  output oftype File;
}

builtin blocktype ArchiveInterpreter {
  input oftype File;

  property archiveType oftype text;

  output oftype FileSystem;
}

builtin blocktype FilePicker {
  input oftype FileSystem;

  property path oftype text;

  output oftype File;
}

builtin blocktype TextFileInterpreter {
  input oftype File;

  property encoding oftype text;
  property lineBreak oftype regex;

  output oftype TextFile;
}

builtin blocktype TextLineDeleter {
  input oftype TextFile;

  property lines oftype collection<integer>;

  output oftype TextFile;
}

builtin blocktype TextRangeSelector {
  input oftype TextFile;

  property lineFrom oftype integer;
  property lineTo oftype integer;

  output oftype TextFile;
}

builtin blocktype CSVInterpreter {
  input oftype TextFile;

  property delimiter oftype text: ",";
  property enclosing oftype text: "";
  property enclosingEscape oftype text: "";

  output oftype Sheet;
}

builtin blocktype CellRangeSelector {
  input oftype Sheet;

  property select oftype cellrange;

  output oftype Sheet;
}

builtin blocktype CellWriter {
  input oftype Sheet;

  property write oftype text;
  property at oftype cell;

  output oftype Sheet;
}

builtin blocktype ColumnDeleter {
  input oftype Sheet;

  property delete oftype collection<column>;

  output oftype Sheet;
}

builtin blocktype RowDeleter {
  input oftype Sheet;

  property delete oftype collection<row>;

  output oftype Sheet;
}

builtin blocktype TableInterpreter {
  input oftype Sheet;

  property header oftype boolean;
  property columns oftype collection<valuetype-assignment>;

  output oftype Table;
}

builtin blocktype SQLiteLoader {
  input oftype Table;

  property table oftype text;
  property file oftype text;
}

builtin blocktype PostgresLoader {
  input oftype Table;

  property host oftype text;
  property port oftype integer;
  property username oftype text;
  property password oftype text;
  property database oftype text;
  property table oftype text;
}
```
</details>
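
To relate such a declaration to its use, here is a minimal sketch of a block that instantiates `CSVInterpreter` with the existing block syntax. The block name `ExampleCsvInterpreter` is hypothetical. The sketch assumes that a property with a declared default value may be omitted from a block, which this RFC implies but does not spell out.

```jayvee
// Hypothetical block based on the CSVInterpreter declaration.
// Assumption: properties that declare a default value may be omitted.
block ExampleCsvInterpreter oftype CSVInterpreter {
  // Overrides the declared default ",".
  delimiter: ";";

  // enclosing and enclosingEscape are left out and would take
  // their declared default "".
}
```

A reader can check against the declaration that the block takes a `TextFile` as input and yields a `Sheet`, without consulting the documentation pages.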

## Drawbacks

- Adds valuetype keywords for properties, but these types cannot be assigned to table columns
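
The following sketch illustrates this drawback with a `TableInterpreter` block. The block name and the column names are hypothetical, and how the invalid assignment would be reported is an assumption; the RFC only states that the new keywords cannot be assigned to table columns.

```jayvee
// Hypothetical block, for illustration only.
block ExampleTableInterpreter oftype TableInterpreter {
  header: true;
  columns: [
    // Fine: integer is a valuetype that can be assigned to columns.
    "id" oftype integer,
    // Expected to be rejected: regex is only available for typing
    // properties, even though it reads like any other valuetype keyword.
    "pattern" oftype regex
  ];
}
```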

## Alternatives

- Not having declarations for builtin blocktypes at all
- No mixing of property valuetypes and column valuetypes
- Using a less verbose syntax (e.g. omit `property` keyword or use a shorthand operator instead of `oftype` keyword)

## Possible Future Changes/Enhancements

- Usage of valuetypes for typing properties
  - Constraints can then be used to validate property values
- Properties can be considered inputs of a block, so property values can be provided dynamically by pipes
- Include validation semantics beyond valuetypes
- Introduce a standard for documenting blocktypes in the code
- Possibility to declare multiple named inputs and outputs for blocktypes
- Can serve as a foundation for composite blocktypes, e.g.:

```jayvee
composite blocktype HttpFileExtractor {
  property url oftype text;

  output oftype TextFile;

  block Extractor oftype HttpExtractor {
    url: url;
  }

  Extractor -> Interpreter;

  block Interpreter oftype TextFileInterpreter {
  }

  Interpreter -> output;
}
```