compatible: Add sync_docs.yaml #220

Merged: 68 commits, Aug 16, 2024
Changes from 4 commits
Commits
68 commits
fd58087
First commit
a-velasco Aug 5, 2024
8821fbc
Misc PR review fixes
a-velasco Aug 7, 2024
71f818c
Removing run schedule since there are no longer API constraints
a-velasco Aug 7, 2024
330982e
Using GITHUB_TOKEN instead of PA TOKEN
a-velasco Aug 7, 2024
a2e0463
Update .github/workflows/_sync_docs_v2.yaml
a-velasco Aug 8, 2024
6b8e227
Update .github/workflows/_sync_docs_v2.yaml
a-velasco Aug 8, 2024
0561751
Update .github/workflows/_sync_docs_v2.yaml
a-velasco Aug 8, 2024
ac42f67
Update .github/workflows/_sync_docs_v2.md
a-velasco Aug 8, 2024
ee2866b
Update python/cli/data_platform_workflows_cli/sync_docs_v2.py
a-velasco Aug 8, 2024
c92536f
Update python/cli/data_platform_workflows_cli/sync_docs_v2.py
a-velasco Aug 8, 2024
3c43657
Update .github/workflows/_sync_docs_v2.yaml
a-velasco Aug 8, 2024
64f0a61
Small edits
a-velasco Aug 8, 2024
17e4327
Replaced original sync_docs with v2 and removed experimental file pre…
a-velasco Aug 8, 2024
39cd6c0
Merge branch 'download-discourse-topics' of github.com:canonical/data…
a-velasco Aug 8, 2024
bd3c59c
Update sync_docs.md
a-velasco Aug 8, 2024
0d196b5
Update README.md
a-velasco Aug 8, 2024
908c137
Update sync_docs.yaml
a-velasco Aug 8, 2024
67f50a4
Update .github/workflows/sync_docs.md
a-velasco Aug 9, 2024
ee6932f
Update python/cli/data_platform_workflows_cli/sync_docs.py
a-velasco Aug 9, 2024
5731b97
Update .github/workflows/sync_docs.yaml
a-velasco Aug 9, 2024
9f46bd4
Update .github/workflows/sync_docs.yaml
a-velasco Aug 9, 2024
d645e02
Update README.md
a-velasco Aug 9, 2024
ca1164a
Update python/cli/data_platform_workflows_cli/sync_docs.py
a-velasco Aug 9, 2024
6460585
Formatted with isort
a-velasco Aug 9, 2024
d05d5ff
Update python/cli/data_platform_workflows_cli/sync_docs.py
a-velasco Aug 9, 2024
5316d89
Update sync_docs.md
a-velasco Aug 9, 2024
aa3c9ed
Update sync_docs.md
a-velasco Aug 9, 2024
7a8e201
Update sync_docs.md
a-velasco Aug 9, 2024
eafdffd
Update sync_docs.py
a-velasco Aug 9, 2024
917485d
Reformatted with black
a-velasco Aug 9, 2024
785629c
Update python/cli/data_platform_workflows_cli/sync_docs.py
a-velasco Aug 9, 2024
d9d979a
Update sync_docs.md
a-velasco Aug 9, 2024
e051a00
Update sync_docs.yaml
a-velasco Aug 12, 2024
69c58fe
Update sync_docs.yaml
a-velasco Aug 12, 2024
e3cd611
test
a-velasco Aug 12, 2024
3225cd9
Update sync_docs.yaml
a-velasco Aug 12, 2024
c369e5c
Update sync_docs.yaml
a-velasco Aug 12, 2024
c7e6000
Add logging
carlcsaposs-canonical Aug 12, 2024
46f8dbc
Fixed bug where first line of navtable was getting automatically filt…
a-velasco Aug 12, 2024
4e33b34
Update .github/workflows/sync_docs.yaml
a-velasco Aug 12, 2024
6cf1cdd
temp (debug)
a-velasco Aug 12, 2024
7bdf969
temp (debug)
a-velasco Aug 12, 2024
2211711
temp (debug)
a-velasco Aug 12, 2024
ac7d9dc
Update sync_docs.yaml
a-velasco Aug 12, 2024
fd0a2ed
Update sync_docs.yaml
a-velasco Aug 12, 2024
ba261b1
Added overview topic download and support for valid non-diataxis topics
a-velasco Aug 12, 2024
3a81e0c
Update .github/workflows/sync_docs.yaml
a-velasco Aug 12, 2024
f7c2c03
Update python/cli/data_platform_workflows_cli/sync_docs.py
a-velasco Aug 12, 2024
2adc0db
Moved download logic into Topic class
a-velasco Aug 12, 2024
670c897
Merge branch 'download-discourse-topics' of github.com:canonical/data…
a-velasco Aug 12, 2024
663ffc0
Update sync_docs.md
a-velasco Aug 12, 2024
0bde819
Update python/cli/data_platform_workflows_cli/sync_docs.py
a-velasco Aug 12, 2024
9b77eac
Update python/cli/data_platform_workflows_cli/sync_docs.py
a-velasco Aug 12, 2024
53944e4
Rename topic download function
a-velasco Aug 12, 2024
80b8347
Some rephrasing
a-velasco Aug 12, 2024
9c0d0f2
Update python/cli/data_platform_workflows_cli/sync_docs.py
a-velasco Aug 12, 2024
c6d1a5f
Update python/cli/data_platform_workflows_cli/sync_docs.py
a-velasco Aug 12, 2024
fdeaf62
Update sync_docs.md
a-velasco Aug 12, 2024
71981b5
Formatting
a-velasco Aug 12, 2024
982fce9
Update .github/workflows/sync_docs.md
a-velasco Aug 13, 2024
f121d08
Update .github/workflows/sync_docs.md
a-velasco Aug 13, 2024
d1b71fb
Update sync_docs.md
a-velasco Aug 13, 2024
46b5238
Update .github/workflows/sync_docs.md
a-velasco Aug 13, 2024
d33bea5
Update .github/workflows/sync_docs.md
a-velasco Aug 13, 2024
dccdd38
Update .github/workflows/sync_docs.md
a-velasco Aug 13, 2024
d048543
Update sync_docs.md
a-velasco Aug 13, 2024
e885889
Update sync_docs.md
a-velasco Aug 13, 2024
5a80297
Fixed bad indentation in yaml template
a-velasco Aug 13, 2024
26 changes: 26 additions & 0 deletions .github/workflows/_sync_docs_v2.md
@@ -0,0 +1,26 @@
Workflow file: [_sync_docs_v2.yaml](_sync_docs_v2.yaml)

> [!WARNING]
> Subject to **breaking changes on patch release**. `_sync_docs_v2.yaml` is experimental & not part of the public interface.

## Usage
Add `.yaml` file to `.github/workflows/`

```yaml
# Copyright 2024 Canonical Ltd.
# See LICENSE file for licensing details.
name: Sync Discourse docs (v2)

on:
  workflow_dispatch:
  schedule:
    - cron:

jobs:
  sync-docs-v2:
    name: Sync docs from Discourse (v2)
    uses: canonical/data-platform-workflows/.github/workflows/_sync_docs_v2.yaml@main
    permissions:
      contents: write  # Needed to push branch & tag
      pull-requests: write  # Needed to create PR
```
50 changes: 50 additions & 0 deletions .github/workflows/_sync_docs_v2.yaml
@@ -0,0 +1,50 @@
on:
  workflow_call:
    inputs:
      reviewers:
        description: Comma separated list of GitHub usernames to request to review pull request (e.g. "canonical/data-platform-engineers,octocat")
        required: false
        type: string

jobs:
  sync-docs-v2:
    name: Sync Discourse docs (v2)
    runs-on: ubuntu-latest
    timeout-minutes: 5
    steps:
      - name: Get workflow version
        id: workflow-version
        uses: canonical/get-workflow-version-action@v1
        with:
          repository-name: canonical/data-platform-workflows
          file-name: _sync_docs_v2.yaml
          github-token: ${{ secrets.GITHUB_TOKEN }}
      - name: Install CLI
        run: pipx install git+https://github.com/canonical/data-platform-workflows@'${{ steps.workflow-version.outputs.sha }}'#subdirectory=python/cli
      - name: Checkout
        uses: actions/checkout@v4
        with:
          token: ${{ secrets.GITHUB_TOKEN }}
      - name: Download Discourse docs
        id: sync-docs-v2
        # CLI entry point registered in python/cli/pyproject.toml
        run: sync-docs-v2
      - name: Push `sync-docs` branch
        run: |
          git checkout -b sync-docs
          git add docs/
          git config user.name "GitHub Actions"
          git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
          git commit -m "Sync docs from Discourse"
          git push origin sync-docs -f
      - name: Create pull request
        run: |
          # Capture output in variable so that step fails if `gh pr list` exits with non-zero code
          prs=$(gh pr list --head sync-docs --state open --json number)
          if [[ $prs != "[]" ]]
          then
            echo Open pull request already exists
            exit 0
          fi
          gh pr create --head sync-docs --title "Sync docs from Discourse" --body "Sync charm docs from https://discourse.charmhub.io" --reviewer '${{ inputs.reviewers }}'
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
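The "Create pull request" step decides whether to open a new PR by comparing the JSON printed by `gh pr list --json number` with the empty array `[]`. As a rough illustration only, the same decision could be written in Python as below; `open_pr_exists` is a hypothetical helper name that does not appear in this PR, and the sample strings are made-up outputs in the shape that `gh` prints.

```python
import json


def open_pr_exists(gh_pr_list_output: str) -> bool:
    """Return True if the JSON array printed by `gh pr list --json number`
    contains at least one pull request."""
    return len(json.loads(gh_pr_list_output)) > 0


# Made-up sample outputs: no open PR, then one open PR
print(open_pr_exists("[]"))
print(open_pr_exists('[{"number": 220}]'))
```

Parsing the JSON rather than comparing strings would also tolerate whitespace differences in the output, at the cost of needing an interpreter in the step.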
27 changes: 14 additions & 13 deletions README.md
@@ -1,18 +1,19 @@
## Usage
### Workflows
| Name | Description |
|----------------------------------------------------------------------------|----------------------------------------------------------------------------|
| [lint.yaml](.github/workflows/lint.md) | Lint GitHub Actions workflows (`.github/workflows/`) and `tox run -e lint` |
| [integration_test_charm.yaml](.github/workflows/integration_test_charm.md) | Integration test charm |
| [build_snap.yaml](.github/workflows/build_snap.md) | Build snap |
| [build_rock.yaml](.github/workflows/build_rock.md) | Build rock |
| [build_charm.yaml](.github/workflows/build_charm.md) | Build charm |
| [release_snap.yaml](.github/workflows/release_snap.md) | Release snap to Snap Store |
| [release_rock.yaml](.github/workflows/release_rock.md) | Release rock to GitHub Container Registry |
| [release_charm.yaml](.github/workflows/release_charm.md) | Release charm to Charmhub |
| [update_bundle.yaml](.github/workflows/update_bundle.md) | Update charm revisions in bundle |
| [sync_issue_to_jira.yaml](.github/workflows/sync_issue_to_jira.md) | Sync GitHub issues to Jira issues |
| [_sync_docs.yaml](.github/workflows/_sync_docs.md) | **Experimental** Sync Discourse documentation to GitHub |
| Name | Description |
|----------------------------------------------------------------------------|-----------------------------------------------------------------------------|
| [lint.yaml](.github/workflows/lint.md) | Lint GitHub Actions workflows (`.github/workflows/`) and `tox run -e lint` |
| [integration_test_charm.yaml](.github/workflows/integration_test_charm.md) | Integration test charm |
| [build_snap.yaml](.github/workflows/build_snap.md) | Build snap |
| [build_rock.yaml](.github/workflows/build_rock.md) | Build rock |
| [build_charm.yaml](.github/workflows/build_charm.md) | Build charm |
| [release_snap.yaml](.github/workflows/release_snap.md) | Release snap to Snap Store |
| [release_rock.yaml](.github/workflows/release_rock.md) | Release rock to GitHub Container Registry |
| [release_charm.yaml](.github/workflows/release_charm.md) | Release charm to Charmhub |
| [update_bundle.yaml](.github/workflows/update_bundle.md) | Update charm revisions in bundle |
| [sync_issue_to_jira.yaml](.github/workflows/sync_issue_to_jira.md) | Sync GitHub issues to Jira issues |
| [_sync_docs.yaml](.github/workflows/_sync_docs.md) | **Experimental** Sync Discourse documentation to GitHub (gatekeeper action) |
| [_sync_docs_v2.yaml](.github/workflows/_sync_docs_v2.md) | **Experimental** Sync Discourse documentation to GitHub (custom script) |

### Version
Recommendation: pin the latest version (e.g. `v1.0.0`) and use [Renovate](https://docs.renovatebot.com/) to stay up-to-date.
130 changes: 130 additions & 0 deletions python/cli/data_platform_workflows_cli/sync_docs_v2.py
@@ -0,0 +1,130 @@
import csv
import dataclasses
import pathlib
import re
import shutil

import requests
import yaml

NAVTABLE_START_MARKER = "[details=Navigation]"
NAVTABLE_END_MARKER = "[/details]"


def get_topic(topic_id_: str):
    """Get markdown content of a discourse.charmhub.io topic"""
    response = requests.get(
        f"https://discourse.charmhub.io/raw/{topic_id_}/1"
    )  # "/1" for post 1
    response.raise_for_status()
    return response.text


class NoTopicToDownload(Exception):
    """No Discourse topic is available to download

    Happens if:
    - no "Navlink" is provided (e.g. for a navigation group)
    - "Navlink" is an external URL
    """


@dataclasses.dataclass
class Topic:
    """Discourse topic to download"""

    id: str
    path: pathlib.Path

    @classmethod
    def from_csv_row(cls, row_: dict):
        # Example `row_`: {'Level': '2', 'Path': 't-overview', 'Navlink': '[Overview](/t/9707)'}

        # Extract Discourse topic ID from "Navlink"
        # Example `link`: "/t/9707"
        navlink = row_["Navlink"]
        link_match = re.fullmatch(r"\[.*?]\((.*?)\)", navlink)
        if not link_match:
            # Fail with a readable error instead of `AttributeError` on `None`
            raise ValueError(
                f'Invalid navlink "{navlink}". Expected something like "[Overview](/t/9707)"'
            )
        link = link_match.group(1)
        if link == "":
            raise NoTopicToDownload
        elif link.startswith("http") and "discourse.charmhub.io" not in link:
            # Ignore external links (e.g. "https://canonical.com/data/docs/postgresql/iaas")
            raise NoTopicToDownload

        match = re.fullmatch(r"/t/([0-9]+)", link)
        if not match:
            raise ValueError(
                f'Invalid navlink "{link}". Expected something like "/t/9707"'
            )
        # Example `topic_id`: "9707"
        topic_id = match.group(1)

        # Determine local path to download Markdown file
        # Example `topic_slug`: "t-overview"
        topic_slug = row_["Path"]
        diataxis_directory = {
            "t-": "tutorial",
            "h-": "how-to",
            "r-": "reference",
            "e-": "explanation",
        }[topic_slug[:2]]

        # Example `path`: "docs/tutorial/t-overview.md"
        path = pathlib.Path("docs/") / diataxis_directory / f"{topic_slug}.md"

        return cls(topic_id, path)


def main():
    """Download Discourse documentation topics to docs/ directory"""

    # Example `overview_topic_link`: "https://discourse.charmhub.io/t/charmed-postgresql-documentation/9710"
    overview_topic_link: str = yaml.safe_load(pathlib.Path("metadata.yaml").read_text())["docs"]
    assert overview_topic_link.startswith("https://discourse.charmhub.io/")

    # Example `topic_id`: "9710"
    topic_id = overview_topic_link.split("/")[-1]
    overview_topic_markdown = get_topic(topic_id)

    # Example of an expected markdown table:
    # | Level | Path | Navlink |
    # |--------|--------|-------------|
    # | 1 | tutorial | [Tutorial]() |
    # | 2 | t-overview | [Overview](/t/9707) |
    # | 2 | t-set-up | [1. Set up the environment](/t/9709) |
    # | 2 | t-deploy | [2. Deploy PostgreSQL](/t/9697) |
    # | 1 | search | [Search](https://canonical.com/data/docs/postgresql/iaas) |

    # Search for table delimiters NAVTABLE_START_MARKER and NAVTABLE_END_MARKER
    start_index = overview_topic_markdown.find(NAVTABLE_START_MARKER)
    if start_index == -1:
        raise ValueError("Could not find Navtable start marker " + NAVTABLE_START_MARKER + " in the overview topic")
    start_index += len(NAVTABLE_START_MARKER)

    # Only accept an end marker that comes after the start marker
    end_index = overview_topic_markdown.find(NAVTABLE_END_MARKER, start_index)
    if end_index == -1:
        raise ValueError("Could not find Navtable end marker " + NAVTABLE_END_MARKER + " in the overview topic")

    table_raw = overview_topic_markdown[start_index:end_index].strip()  # remove leading and trailing whitespace
    if table_raw == "":
        raise ValueError("Could not find a valid table")

    # Convert Markdown table to list[dict[str, str]]
    # (https://stackoverflow.com/a/78254495)
    raw_rows: list[dict] = list(csv.DictReader(table_raw.split("\n"), delimiter="|"))
    # `csv.DictReader` already consumes the header line, so only the
    # separator row (e.g. "|--------|--------|-------------|") is removed
    raw_rows = raw_rows[1:]
    rows: list[dict[str, str]] = [
        {key.strip(): value.strip() for key, value in row.items() if key != ""}
        for row in raw_rows
    ]

    # Start from a clean docs/ directory (it may not exist on the first run)
    docs_directory = pathlib.Path("docs/")
    if docs_directory.exists():
        shutil.rmtree(docs_directory)

    for row in rows:
        # Example `row`: {'Level': '2', 'Path': 't-overview', 'Navlink': '[Overview](/t/9707)'}
        try:
            topic = Topic.from_csv_row(row)
        except NoTopicToDownload:
            continue

        # Download topic markdown to `topic.path`
        # (use `topic.id`, not the overview topic's `topic_id`)
        topic.path.parent.mkdir(parents=True, exist_ok=True)
        topic.path.write_text(get_topic(topic.id))
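
The navigation-table handling in `main()` can be tried without network access. The following is a minimal sketch under a few assumptions: `parse_navtable`, `topic_ids` and `SAMPLE_TOPIC` are hypothetical names that are not part of this PR, the table rows are copied from the example comment in `main()`, and the text around the markers is made up. It shows that `csv.DictReader` consumes the header line itself, so the first parsed row is the `|----|` separator.

```python
import csv
import re

# Hypothetical overview topic: table rows come from the example comment in
# `main()`, the heading line is invented for illustration.
SAMPLE_TOPIC = """\
# Charmed PostgreSQL documentation

[details=Navigation]
| Level | Path | Navlink |
|--------|--------|-------------|
| 1 | tutorial | [Tutorial]() |
| 2 | t-overview | [Overview](/t/9707) |
| 2 | t-set-up | [1. Set up the environment](/t/9709) |
| 1 | search | [Search](https://canonical.com/data/docs/postgresql/iaas) |
[/details]
"""


def parse_navtable(markdown: str) -> list[dict[str, str]]:
    """Return the navigation table rows as dicts keyed by column name."""
    start_marker, end_marker = "[details=Navigation]", "[/details]"
    start = markdown.index(start_marker) + len(start_marker)
    end = markdown.index(end_marker, start)
    lines = markdown[start:end].strip().split("\n")
    # The header line becomes the field names, so index 0 of the parsed rows
    # is the separator row; drop only that one.
    parsed = list(csv.DictReader(lines, delimiter="|"))[1:]
    return [
        {key.strip(): value.strip() for key, value in row.items() if key != ""}
        for row in parsed
    ]


def topic_ids(rows: list[dict[str, str]]) -> dict[str, str]:
    """Map each "Path" to its Discourse topic ID, skipping rows without one."""
    ids = {}
    for row in rows:
        match = re.fullmatch(r"\[.*?]\(/t/([0-9]+)\)", row["Navlink"])
        if match:
            ids[row["Path"]] = match.group(1)
    return ids


rows = parse_navtable(SAMPLE_TOPIC)
print(topic_ids(rows))
```

Running the sketch prints `{'t-overview': '9707', 't-set-up': '9709'}`: the navigation group (`[Tutorial]()`) and the external link yield no topic ID, which corresponds to the `NoTopicToDownload` cases in `Topic.from_csv_row`.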
1 change: 1 addition & 0 deletions python/cli/pyproject.toml
@@ -21,6 +21,7 @@ parse-snap-version = "data_platform_workflows_cli.parse_snap_version:main"
allure-add-default-for-missing-results = "data_platform_workflows_cli.allure_add_default_for_missing_results:main"
add-ssh-keys = "data_platform_workflows_cli.add_ssh_keys:main"
tee-log-for-all-models = "data_platform_workflows_cli.tee_log_for_all_models:main"
sync-docs-v2 = "data_platform_workflows_cli.sync_docs_v2:main"

[tool.poetry.dependencies]
python = "^3.10"