handle CWL Directory type #474

Merged
merged 104 commits into from
Nov 23, 2022
Changes from 97 commits
25f47e3
first impl to handle CWL Directory type (relates to #466)
fmigneault Oct 6, 2022
220bedd
fix lint
fmigneault Oct 12, 2022
164f240
improve typings
fmigneault Oct 12, 2022
fdc8bb8
ignore duplicate S3 bucket creation during tests
fmigneault Oct 13, 2022
03dcb1a
[wip] obtain remote dir listing for CWL Directory input
fmigneault Oct 13, 2022
92cdcad
replace dir-listing by dir-fetching method + obtain option kwargs by …
fmigneault Oct 15, 2022
4d8e630
fix linting issues with pylint=2.15.4
fmigneault Oct 15, 2022
f76e3db
handle repr_json format datetime + listing S3 bucket contents
fmigneault Oct 17, 2022
e8c3ae7
[wip] download pool for dir listing + impl handlers by schemes
fmigneault Oct 18, 2022
dd8d386
impl dir-type handling per ref scheme
fmigneault Oct 18, 2022
d0da33f
improve typings + reuse CWL IO def class + tests for dir resolution
fmigneault Oct 18, 2022
3e4afc2
fix parameters passed to fetch_file
fmigneault Oct 19, 2022
a441b73
add multiple dir listing test cases parsing different HTML representa…
fmigneault Oct 21, 2022
316eeb1
more tests for dir filter listing + refactor s3 dir listing/download …
fmigneault Oct 21, 2022
77f6627
local directory handling with different output methods + more tests f…
fmigneault Oct 21, 2022
dc2b9cb
more test conditions and operations to run dir listing
fmigneault Oct 23, 2022
eda654f
fixing tests and filtering dirs
fmigneault Oct 23, 2022
b5e1728
fix more test cases
fmigneault Oct 23, 2022
b2c4c04
fix requirements-dev dependencies for pylint
fmigneault Oct 23, 2022
606fd2c
fix lint + invalid CWLIODefinition class use
fmigneault Oct 23, 2022
3892cee
Merge branch 'master' into dir-type
fmigneault Oct 23, 2022
38999f4
fix lint & download file path ref
fmigneault Oct 23, 2022
9955812
fix warning BeautifulSoup using LXML parser
fmigneault Oct 24, 2022
f2c2c48
replace dataclass by namedtuple + fix convert tests
fmigneault Oct 24, 2022
7111565
working unit tests for fetch_directory
fmigneault Oct 24, 2022
0d544ae
revert namedtuple to allow mutable definitions
fmigneault Oct 24, 2022
a52cb0e
fixes for CWL I/O definition parsing
fmigneault Oct 24, 2022
2c8d48f
fix unit tests
fmigneault Oct 25, 2022
95c83a5
minor text edits
fmigneault Oct 25, 2022
dca32ac
add missing docstring for CWLIODefinition.symbols
fmigneault Oct 25, 2022
8916a1a
define PACKAGE_FILE_TYPE and PACKAGE_DIRECTORY_TYPE definitions
fmigneault Oct 25, 2022
5a5c938
fix docf lint
fmigneault Oct 25, 2022
6382391
fix typing definitions
fmigneault Oct 25, 2022
1a6915b
fix imports order
fmigneault Oct 25, 2022
0223fa6
fix dir listing functional tests
fmigneault Oct 25, 2022
59da430
fix imports lint
fmigneault Oct 25, 2022
895eee2
rename FormatDefault oneOf[mediaType, mimeType] without 'deploy' ment…
fmigneault Oct 25, 2022
7907adc
fix typos
fmigneault Oct 25, 2022
0ef87f7
Merge branch 'master' into dir-type
fmigneault Oct 25, 2022
6bdf26f
Merge branch 'master' into dir-type
fmigneault Oct 26, 2022
62931a4
support aws s3 access-point and virtual-host style URIs
fmigneault Oct 26, 2022
cbcd079
Merge branch 'dir-type' of https://github.com/crim-ca/weaver into dir…
fmigneault Oct 26, 2022
b53c4aa
Merge branch 'master' into dir-type
fmigneault Oct 26, 2022
387dc2e
fix parsing of S3 references from HTTP URIs
fmigneault Oct 26, 2022
c33af14
Merge branch 'master' into dir-type
fmigneault Oct 27, 2022
e85f7b1
Merge branch 'master' into dir-type
fmigneault Oct 27, 2022
fb9537d
prepare process with dir as output
fmigneault Oct 31, 2022
11edb9b
update docs with dir-type references
fmigneault Oct 31, 2022
48f0486
add docs sample dir listing
fmigneault Nov 1, 2022
cede281
Merge branch 'master' into dir-type
fmigneault Nov 1, 2022
b4aaec7
Update CHANGES.rst
fmigneault Nov 1, 2022
b5d54da
fix docs
fmigneault Nov 1, 2022
acb8324
update S3-related docs for dir-type
fmigneault Nov 3, 2022
75e4107
add more tests for S3 ref validation from http conversion
fmigneault Nov 3, 2022
eeeb6f3
more S3 reference formats validation, tests and documentation updates
fmigneault Nov 3, 2022
9c3d760
more tests for AWS S3 parsing
fmigneault Nov 4, 2022
9222119
add test for parsing S3 request configs
fmigneault Nov 4, 2022
354e1ae
fix py36 backport literal type arguments
fmigneault Nov 4, 2022
5a030c5
fix test for s3 ref resolution from http
fmigneault Nov 7, 2022
9d4bd11
fix test mock:// scheme resolution with schema validators dynamically…
fmigneault Nov 7, 2022
0276b6b
add test for deploy/describe process with dir as output + support CWL…
fmigneault Nov 8, 2022
dddafbe
[wip] add test with workflow using directory type
fmigneault Nov 8, 2022
db95dc7
fix lint
fmigneault Nov 8, 2022
071cc0d
setup directory storage utility + more robust storage locations using…
fmigneault Nov 9, 2022
f2e61b2
[wip] workflow step dir working - chaining/collect prev output to fix
fmigneault Nov 9, 2022
bafc70f
fix lint
fmigneault Nov 9, 2022
cf07ac4
transfer all output storage setup to method
fmigneault Nov 9, 2022
1ee4223
Merge branch 'master' into dir-type
fmigneault Nov 9, 2022
8a9d5b4
fix tests output location checks with updated nested output-id location
fmigneault Nov 9, 2022
8ba5b7c
fix test aws s3 ref validation
fmigneault Nov 9, 2022
f318e36
patch type hints
fmigneault Nov 9, 2022
3b11f86
remove tmp file added by mistake
fmigneault Nov 9, 2022
2f5dd83
make better use of parametrized tests + adjust name of parsing/conver…
fmigneault Nov 9, 2022
638403c
fix file/dir path resolution for directory storage managing its conta…
fmigneault Nov 10, 2022
2faae69
consider directory type in expected output pattern check to collect w…
fmigneault Nov 10, 2022
1240852
functional test with workflow dir listing chaining between steps
fmigneault Nov 11, 2022
a733047
doc updates
fmigneault Nov 11, 2022
330dfe7
patch partially invalid URL regex
fmigneault Nov 12, 2022
513a405
fix schema tests
fmigneault Nov 12, 2022
c422c91
fix tests
fmigneault Nov 12, 2022
79640c2
add explicit error about unsupported CLI upload directory (download r…
fmigneault Nov 12, 2022
69aae9f
fix tests
fmigneault Nov 12, 2022
f9360cb
fix imports lint
fmigneault Nov 12, 2022
9976b4d
more debug for workflow tests
fmigneault Nov 14, 2022
6ef83b3
adjust workflow test considering possible distinct permissions during…
fmigneault Nov 14, 2022
550ef1a
backport fix for python 3.6
fmigneault Nov 14, 2022
10af4ff
patch other misuse of re.Pattern in Python 3.6
fmigneault Nov 15, 2022
8b8f2f6
ensure all lint checkers are executed in test pipeline
fmigneault Nov 15, 2022
6eba738
test dir output with S3 storage as WPS outputs
fmigneault Nov 16, 2022
133d39b
Merge branch 'master' into dir-type
fmigneault Nov 16, 2022
258948d
Merge branch 'master' into dir-type
fmigneault Nov 17, 2022
23f0359
force pywps create aws s3 bucket on test setup + validate aws s3 buck…
fmigneault Nov 17, 2022
ca8b0f3
test s3 bucket outputs - setup bucket to avoid not existing error afte…
fmigneault Nov 18, 2022
4e2fc45
Merge branch 'master' into dir-type
fmigneault Nov 18, 2022
1b0e14c
add missing definitions for InlineJavascriptRequirement
fmigneault Nov 18, 2022
e3760f9
fix invalid InlineJavascriptRequirement schemas
fmigneault Nov 18, 2022
bad4cf9
fix cwl conflicting/invalid schema definitions
fmigneault Nov 22, 2022
0d06349
revert unused util for separate PR
fmigneault Nov 22, 2022
3acf954
add more tests for AWS S3 region/bucket validation
fmigneault Nov 22, 2022
80a7344
improved bucket regex to handle more cases specified by AWS S3 naming…
fmigneault Nov 22, 2022
624bdcc
add missing test for AWS S3 outposts endpoint resolution
fmigneault Nov 22, 2022
cac0fd7
check unhandled file fetch scheme + ignore explicit raising OSError (…
fmigneault Nov 22, 2022
e76a593
add missing test cases for fetch_directory from JSON response and oth…
fmigneault Nov 23, 2022
2a6d7de
ignore coverage backport code
fmigneault Nov 23, 2022
2 changes: 1 addition & 1 deletion .github/workflows/tests.yml
@@ -68,7 +68,7 @@ jobs:
           - os: ubuntu-latest
             python-version: 3.7
             allow-failure: false
-            test-case: check-only
+            test-case: check-all
           # documentation build
           - os: ubuntu-latest
             python-version: 3.7
17 changes: 17 additions & 0 deletions CHANGES.rst
@@ -12,6 +12,23 @@ Changes

 Changes:
 --------
+- Support `CWL` ``InlineJavascriptRequirement`` for `Process` deployment to allow successful schema validation.
+- Support `CWL` ``Directory`` type references (resolves `#466 <https://github.com/crim-ca/weaver/issues/466>`_).
+  Those references correspond to `WPS` and `OGC API - Processes` ``href``
+  using the ``Content-Type: application/directory`` Media-Type and must have a trailing slash (``/``) character.
+- Support `S3` file or directory references using *Access Point*, *Virtual-hosted–style* and *Outposts* URLs
+  (see AWS documentation
+  `Methods for accessing a bucket <https://docs.aws.amazon.com/AmazonS3/latest/userguide/access-bucket-intro.html>`_).
+- Apply more validation rules against expected `S3` file or directory reference formats.
+- Update documentation regarding handling of `S3` references (more formats supported) and ``Directory`` type references.
+- Support ``weaver.wps_output_context`` setting and ``X-WPS-Output-Context`` request header resolution in combination
+  with `S3` bucket location employed for storing `Job` outputs.
+- Nest every complex `Job` output (regardless if stored on local `WPS` outputs or on `S3`, and whether the output is
+  of ``File`` or ``Directory`` type) under its corresponding output ID collected from the `Process` definition to avoid
+  potential name conflicts in storage location, especially in the case of multiple output IDs that could be aggregated
+  with various files and listing of directory contents.
+- Allow ``colander.SchemaNode`` (with extensions for `OpenAPI` schema converters) to provide validation ``pattern``
+  field directly with a compiled ``re.Pattern`` object.
 - Support `CWL` definition for ``cwltool:CUDARequirement`` to request the use of a GPU, including support for using
   Docker with a GPU (resolves `#104 <https://github.com/crim-ca/weaver/issues/104>`_).
 - Support `CWL` definition for ``NetworkAccess`` to indicate whether a process requires outgoing IPv4/IPv6 network
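Two of the changelog entries above describe simple rules that may be easier to follow in code: a directory reference combines the ``application/directory`` Media-Type with a trailing slash, and each complex output is nested under its output ID. The following is only a minimal sketch of those two rules, not the code from this PR. The function names, the ``DIRECTORY_MEDIA_TYPE`` constant and the storage layout are hypothetical and chosen for illustration.

```python
import posixpath

# Media-Type named in the changelog entry for directory references.
DIRECTORY_MEDIA_TYPE = "application/directory"


def is_directory_reference(href: str, media_type: str) -> bool:
    """Hypothetical check: directory Media-Type AND a trailing slash on the href."""
    # Drop parameters such as "; charset=UTF-8" before comparing the Media-Type.
    base_type = media_type.split(";", 1)[0].strip().lower()
    return base_type == DIRECTORY_MEDIA_TYPE and href.endswith("/")


def nested_output_location(job_root: str, output_id: str, name: str) -> str:
    """Hypothetical layout: every complex output lives under its own output ID."""
    return posixpath.join(job_root, output_id, name)


print(is_directory_reference("https://example.com/base/dir/", "application/directory"))
print(nested_output_location("wps-outputs/job-1234", "out_dir", "sub/file.txt"))
```

Nesting by output ID means two outputs that both produce a ``file.txt`` end up in distinct locations, which is the name conflict the changelog entry refers to.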
4 changes: 2 additions & 2 deletions Makefile
@@ -469,7 +469,7 @@ CHECKS := $(addprefix check-, $(CHECKS))
 # items that should not install python dev packages should be added here instead
 # they must provide their own target/only + with dependency install variants
 CHECKS_NO_PY := css md
-CHECKS_NO_PY := $(addprefix fix-, $(CHECKS_NO_PY))
+CHECKS_NO_PY := $(addprefix check-, $(CHECKS_NO_PY))
 CHECKS_ALL := $(CHECKS) $(CHECKS_NO_PY)

 $(CHECKS): check-%: install-dev check-%-only
@@ -482,7 +482,7 @@ mkdir-reports:
 check: check-all ## alias for 'check-all' target

 .PHONY: check-only
-check-only: $(addsuffix -only, $(CHECKS))
+check-only: $(addsuffix -only, $(CHECKS_ALL))

 .PHONY: check-all
 check-all: install-dev $(CHECKS_ALL) ## check all code linters
19 changes: 19 additions & 0 deletions docs/_static/custom.css
@@ -48,3 +48,22 @@ div[class^="highlight"] {
     max-width: 100%;
     overflow: visible;
 }
+
+/* add missing border when row spans more than one line */
+.rst-content table.docutils td:first-child,
+.rst-content table.docutils th:first-child,
+.rst-content table.field-list td:first-child,
+.rst-content table.field-list th:first-child,
+.wy-table td:first-child,
+.wy-table th:first-child {
+    border-left-width: 1px;
+    border-right-width: 1px;
+}
+
+/* avoid mismatching background color in
+   table rows that span multiple lines, due to
+   alternating colors on individual odd/even rows
+*/
+#table-file-type-handling tr.row-even > td[rowspan] {
+    background-color: revert;
+}
39 changes: 39 additions & 0 deletions docs/examples/directory-listing-s3.json
@@ -0,0 +1,39 @@
+{
+  "ResponseMetadata": {
+    "RequestId": "vpiM5RBkJ3O68CnD5fO42d887Jh49Cf8bhA6nw7ZTHIuGRVccDQM",
+    "HTTPStatusCode": 200,
+    "HTTPHeaders": {
+      "x-amzn-requestid": "vpiM5RBkJ3O68CnD5fO42d887Jh49Cf8bhA6nw7ZTHIuGRVccDQM"
+    },
+    "RetryAttempts": 0
+  },
+  "IsTruncated": false,
+  "Contents": [
+    {
+      "Key": "dir/file.txt",
+      "LastModified": "2022-11-01T04:25:42+00:00",
+      "ETag": "\"17404a596cbd0d1e6c7d23fcd845ab82\"",
+      "Size": 4,
+      "StorageClass": "STANDARD"
+    },
+    {
+      "Key": "dir/sub/file.txt",
+      "LastModified": "2022-11-01T04:25:42+00:00",
+      "ETag": "\"17404a596cbd0d1e6c7d23fcd845ab82\"",
+      "Size": 4,
+      "StorageClass": "STANDARD"
+    },
+    {
+      "Key": "dir/sub/nested/file.txt",
+      "LastModified": "2022-11-01T04:25:42+00:00",
+      "ETag": "\"17404a596cbd0d1e6c7d23fcd845ab82\"",
+      "Size": 4,
+      "StorageClass": "STANDARD"
+    }
+  ],
+  "Name": "wps-process-test-bucket",
+  "Prefix": "dir/",
+  "MaxKeys": 1000,
+  "EncodingType": "url",
+  "KeyCount": 3
+}
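The sample response above has the shape of an S3 object listing (its ``Contents``, ``Prefix`` and ``KeyCount`` fields resemble what a ``list_objects_v2`` call returns). As a rough sketch of how such a listing could be turned into paths relative to the requested directory, assuming that only the ``Prefix`` and ``Contents[].Key`` fields matter, one might write the following. The helper name ``list_relative_keys`` is hypothetical and is not taken from this PR.

```python
from typing import Any, Dict, List


def list_relative_keys(response: Dict[str, Any]) -> List[str]:
    """Return object keys relative to the listing prefix (illustrative helper)."""
    prefix = response.get("Prefix", "")
    keys = []
    for item in response.get("Contents", []):
        key = item["Key"]
        if key.endswith("/"):
            # Skip placeholder objects that only represent a "folder".
            continue
        if key.startswith(prefix):
            key = key[len(prefix):]
        keys.append(key)
    return keys


# Reduced copy of the sample listing, keeping only the fields used here.
sample = {
    "Prefix": "dir/",
    "Contents": [
        {"Key": "dir/file.txt", "Size": 4},
        {"Key": "dir/sub/file.txt", "Size": 4},
        {"Key": "dir/sub/nested/file.txt", "Size": 4},
    ],
}
print(list_relative_keys(sample))
# → ['file.txt', 'sub/file.txt', 'sub/nested/file.txt']
```

The sketch ignores ``IsTruncated``; a listing spanning more than ``MaxKeys`` entries would need pagination, which is left out here.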
28 changes: 28 additions & 0 deletions docs/examples/directory-listing.html
@@ -0,0 +1,28 @@
+<html lang="en">
+<body>
+<h1>Index of /dir/</h1>
+<hr>
+<table>
+  <thead>
+    <tr>
+      <th>Content</th>
+      <th>Modified</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td><pre><a href="README">README</a></pre></td>
+      <td>2022-10-31 23:48</td>
+    </tr>
+    <tr>
+      <td><pre><a href="dir/">dir/</a></pre></td>
+      <td>2022-10-31 23:48</td></tr>
+    <tr>
+      <td><pre><a href="dir/file.txt">dir/file.txt</a></pre></td>
+      <td>2022-10-31 23:48</td>
+    </tr>
+  </tbody>
+</table>
+<hr>
+</body>
+</html>
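To illustrate how an HTML index page like the sample above could be resolved into file URLs, here is a minimal sketch using only the Python standard library. It is not the parser used in this PR (the commit log mentions BeautifulSoup with the LXML parser). The names ``LinkCollector`` and ``list_directory_files`` are hypothetical, and the filtering rules (skip links ending in ``/`` and sort-query links starting with ``?``) are assumptions about typical server-generated listings.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href attribute of every anchor tag, in document order."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def list_directory_files(html: str, base_url: str) -> List[str]:
    """Resolve file links of an index page against the directory URL."""
    parser = LinkCollector()
    parser.feed(html)
    return [
        urljoin(base_url, link)
        for link in parser.links
        # Links ending with "/" are directories (including "../");
        # links starting with "?" are column-sorting queries.
        if not link.endswith("/") and not link.startswith("?")
    ]


page = (
    '<table><tr><td><a href="README">README</a></td></tr>'
    '<tr><td><a href="dir/">dir/</a></td></tr>'
    '<tr><td><a href="dir/file.txt">dir/file.txt</a></td></tr></table>'
)
print(list_directory_files(page, "https://example.com/dir/"))
```

Because the base URL ends with a slash, ``urljoin`` appends each relative link to it instead of replacing the last path segment, which is one practical reason for the trailing-slash requirement on directory references.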
5 changes: 5 additions & 0 deletions docs/examples/directory-listing.json
@@ -0,0 +1,5 @@
+[
+  "https://example.com/base/dir/README.md",
+  "https://example.com/base/dir/nested/image.png",
+  "https://example.com/base/dir/nested/data.csv"
+]
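A JSON listing such as the one above enumerates absolute file URLs under a directory reference. As a hedged sketch of how those URLs might be mapped to local staging paths while preserving the nested structure, consider the following. The function ``map_listing_to_local`` and the ``/tmp/stage`` location are hypothetical, and rejecting entries outside the base URL is an assumption rather than documented behavior.

```python
import json
import posixpath
from typing import Dict


def map_listing_to_local(listing_json: str, base_url: str, local_root: str) -> Dict[str, str]:
    """Map each listed URL to a local path that keeps its relative structure."""
    mapping = {}
    for url in json.loads(listing_json):
        if not url.startswith(base_url):
            raise ValueError(f"URL is not under the directory reference: {url}")
        relative = url[len(base_url):]
        if ".." in relative.split("/"):
            # Refuse entries that would escape the staging directory.
            raise ValueError(f"Unsafe relative path: {relative}")
        mapping[url] = posixpath.join(local_root, relative)
    return mapping


listing = json.dumps([
    "https://example.com/base/dir/README.md",
    "https://example.com/base/dir/nested/image.png",
    "https://example.com/base/dir/nested/data.csv",
])
for source, target in map_listing_to_local(listing, "https://example.com/base/dir/", "/tmp/stage").items():
    print(source, "->", target)
```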
10 changes: 5 additions & 5 deletions docs/source/appendix.rst
@@ -19,11 +19,11 @@ Glossary
         queries in the context of :term:`EOImage` inputs.

     Application Package
-        General term that refers to *"what and how the :term:`Process` will execute"*. Application Packages provide
-        the core details about the execution methodology of the underlying operation the :term:`Process` provides, and
-        are therefore always contained within a :term:`Process` definition. This is more specifically represented
-        by a :term:`CWL` specification in the case of `Weaver` implementation, but could technically be defined by
-        another similar approach. See :ref:`Application Package` section for all relevant details.
+        General term that refers to *"what and how to execute"* the :term:`Process`. Application Packages provide the
+        core details about the execution methodology of the underlying operation that defines the :term:`Process`, and
+        are therefore always contained within a :ref:`Process Description <proc_op_describe>`. This is more specifically
+        represented by a :term:`CWL` specification in the case of `Weaver` implementation, but could technically be
+        defined by another similar approach. See the :ref:`Application Package` section for all relevant details.

     AWS
         Amazon Web Services
2 changes: 1 addition & 1 deletion docs/source/cli.rst
@@ -161,7 +161,7 @@ A :term:`Workflow` of multiple :term:`Process` references (possibly of distinct
 .. note::
     Content definitions for :term:`CWL` :ref:`application-package` and/or the literal :term:`Process` body
     can be submitted using either a local file reference, an URL, or a literal string formatted as :term:`JSON`
-    or :temr:`YAML`. With the :ref:`Python Interface <client_commands>`, the definition can also be provided
+    or :term:`YAML`. With the :ref:`Python Interface <client_commands>`, the definition can also be provided
     with a :class:`dict` directly.

 Below is a sample :term:`Process` deployment using a basic Python script wrapped in a :term:`Docker` image to ensure