Expanded discussion of File and Directory types.
Peter Amstutz committed Jul 22, 2017
1 parent a486e91 commit 444d892
Showing 3 changed files with 129 additions and 5 deletions.
9 changes: 6 additions & 3 deletions v1.0/CommandLineTool.yml
@@ -80,12 +80,15 @@ $graph:
Post v1.0 release changes to the spec.
* 13 July 2016: Mark `baseCommand` as optional and update descriptive text.
* 12 March 2017: (v1.0.1)
* 12 March 2017:
* Mark `default` as not required for link checking.
* Add note that recursive subworkflows are not allowed.
* Add note that files in InitialWorkDir must have path in output directory.
* Add note that writable: true applies recursively.
* Fix mistake in discussion of extracting field names from workflow step ids.
* 21 July 2017:
* Add clarification about scattering over empty arrays.
* Clarify interpretation of secondaryFiles on inputs.
* 22 July 2017: (v1.0.1)
* Expanded discussion of semantics of File and Directory types
## Purpose
111 changes: 109 additions & 2 deletions v1.0/Process.yml
@@ -53,9 +53,73 @@ $graph:
type: record
docParent: "#CWLType"
doc: |
Represents a file (or group of files if `secondaryFiles` is specified) that
must be accessible by tools using standard POSIX file system call API such as
Represents a file (or group of files when `secondaryFiles` is provided) that
will be accessible by tools using standard POSIX file system call API such as
open(2) and read(2).
Files are represented as objects with `class` of `File`. File objects have
a number of properties that provide metadata about the file.
The `location` property of a File is a URI that uniquely identifies the
file. Implementations must support the file:// URI scheme and may support
other schemes such as http://. The value of `location` may also be a
relative reference, in which case it must be resolved relative to the URI
of the document it appears in. As an alternative to `location`, implementations
must also accept the `path` property on File, which must be a filesystem
path available on the same host as the CWL runner (for inputs) or the
runtime environment of a command line tool execution (for command line tool
outputs).
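As a minimal sketch of the three ways to reference a file described above, the following File objects use hypothetical paths chosen only for illustration:

```yaml
# Three hypothetical File objects; none of these paths come from the spec.
- class: File
  # Absolute URI using the mandatory file:// scheme.
  location: file:///data/reads/sample.fastq
- class: File
  # Relative reference, resolved against the URI of the containing document.
  location: reads/sample.fastq
- class: File
  # Filesystem path on the same host as the CWL runner (for inputs).
  path: /data/reads/sample.fastq
```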
If no `location` or `path` is specified, a file object must specify
`contents` with the UTF-8 text content of the file. This is a "file
literal". File literals do not correspond to external resources, but are
created on disk with `contents` when needed for executing a tool.
Where appropriate, expressions can return file literals to define new files
on a runtime. The maximum size of `contents` is 64 kilobytes.
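A file literal might look like the following sketch. The `basename` and the text are made up for illustration; the only requirements taken from the text above are that `location` and `path` are absent and that `contents` holds UTF-8 text of at most 64 kilobytes:

```yaml
# Hypothetical file literal: no `location` or `path`, only `contents`.
class: File
basename: greeting.txt   # illustrative name used when the literal is staged
contents: |
  Hello, world.
```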
The `basename` property defines the filename on disk where the file is
staged. This may differ from the resource name. If not provided,
`basename` must be computed from the last path part of `location` and made
available to expressions.
The `secondaryFiles` property is a list of File or Directory objects that
must be staged in the same directory as the primary file. It is an error
for file names to be duplicated in `secondaryFiles`.
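For example, a primary file with an index file and an index directory could be written roughly as below. All locations are hypothetical. Because `basename` defaults to the last path part of `location`, the three entries would be staged side by side as `genome.fa`, `genome.fa.fai`, and `genome_index`:

```yaml
# Hypothetical primary file with two secondary entries.
class: File
location: file:///data/ref/genome.fa
secondaryFiles:
  - class: File
    location: file:///data/ref/genome.fa.fai
  - class: Directory
    location: file:///data/ref/genome_index
```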
The `size` property is the size in bytes of the File. It must be computed
from the resource and made available to expressions. The `checksum` field
contains a cryptographic hash of the file content for use in verifying file
contents. Implementations may, at user option, enable or disable
computation of the `checksum` field for performance or other reasons.
However, the ability to compute output checksums is required to pass the
CWL conformance test suite.
When executing a CommandLineTool, the files and secondary files may be
staged to an arbitrary directory, but must use the value of `basename` for
the filename. The `path` property must be a file path in the context of the
tool execution runtime (local to the compute node, or within the executing
container). All computed properties should be available to expressions.
File literals also must be staged and `path` must be set.
When collecting CommandLineTool outputs, `glob` matching returns file paths
(with the `path` property) and the derived properties. This can all be
modified by `outputEval`. Alternatively, if the file `cwl.output.json` is
present in the output, `outputBinding` is ignored.
File objects in the output must provide either a `location` URI or a `path`
property in the context of the tool execution runtime (local to the compute
node, or within the executing container).
When evaluating an ExpressionTool, file objects must be referenced via
`location` (the expression tool does not have access to files on disk so
`path` is meaningless) or as file literals. It is legal to return a file
object with an existing `location` but a different `basename`. The
`loadContents` field of ExpressionTool inputs behaves the same as on
CommandLineTool inputs; however, it is not meaningful on the outputs.
An ExpressionTool may forward file references from input to output by using
the same value for `location`.
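The forwarding case can be sketched as a pair of file objects. The keys `input_file` and `output_file` and the location are hypothetical labels used only to show the before and after; the output reuses the input's `location` and changes only `basename`:

```yaml
# Hypothetical ExpressionTool input and the file object it returns.
input_file:
  class: File
  location: file:///data/results/run42.txt
output_file:
  class: File
  location: file:///data/results/run42.txt   # same resource, forwarded unchanged
  basename: final_report.txt                 # new name to use when staged
```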
fields:
- name: class
type:
@@ -221,6 +285,49 @@ $graph:
docAfter: "#File"
doc: |
Represents a directory to present to a command line tool.
Directories are represented as objects with `class` of `Directory`. Directory objects have
a number of properties that provide metadata about the directory.
The `location` property of a Directory is a URI that uniquely identifies
the directory. Implementations must support the file:// URI scheme and may
support other schemes such as http://. As an alternative to `location`,
implementations must also accept the `path` property on Directory, which
must be a filesystem path available on the same host as the CWL runner (for
inputs) or the runtime environment of a command line tool execution (for
command line tool outputs).
A Directory object may have a `listing` field. This is a list of File and
Directory objects that are contained in the Directory. For each entry in
`listing`, the `basename` property defines the name of the File or
Subdirectory when staged to disk. If `listing` is not provided, the
implementation must have some way of fetching the Directory listing at
runtime based on the `location` field.
If a Directory does not have `location`, it is a Directory literal. A
Directory literal must provide `listing`. Directory literals must be
created on disk at runtime as needed.
The resources in a Directory literal do not need to have any implied
relationship in their `location`. For example, a Directory listing may
contain two files located on different hosts. It is the responsibility of
the runtime to ensure that those files are staged to disk appropriately.
Secondary files associated with files in `listing` must also be staged to
the same Directory.
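A Directory literal that gathers files from unrelated locations might look like this sketch. The host name, paths, and `basename` values are assumptions for illustration; the runtime would be responsible for staging both entries into one directory:

```yaml
# Hypothetical Directory literal: no `location`, so `listing` is required.
class: Directory
basename: inputs
listing:
  - class: File
    location: http://host-a.example.org/data/a.txt
  - class: File
    location: file:///mnt/shared/b.txt
    basename: b_renamed.txt   # staged under this name instead of b.txt
```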
When executing a CommandLineTool, Directories must be recursively staged
first and have local values of `path` assigned.
Directory objects in CommandLineTool output must provide either a
`location` URI or a `path` property in the context of the tool execution
runtime (local to the compute node, or within the executing container).
An ExpressionTool may forward file references from input to output by using
the same value for `location`.
A name conflict (the same `basename` appearing multiple times in `listing`
or in any entry in `secondaryFiles` in the listing) is a fatal error.
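As an illustration of such a conflict, the hypothetical listing below is invalid: neither entry sets `basename`, so both derive `report.txt` from the last path part of `location` and would collide when staged:

```yaml
# Hypothetical INVALID Directory literal: two entries resolve to report.txt.
class: Directory
basename: reports
listing:
  - class: File
    location: file:///data/one/report.txt
  - class: File
    location: file:///data/two/report.txt
```

Giving one of the entries an explicit, distinct `basename` would be one way to resolve the conflict.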
fields:
- name: class
type:
14 changes: 14 additions & 0 deletions v1.0/Workflow.yml
@@ -66,6 +66,20 @@ $graph:
```
* The common field `description` has been renamed to `doc`.
## Errata
Post v1.0 release changes to the spec.
* 12 March 2017:
* Mark `default` as not required for link checking.
* Add note that recursive subworkflows are not allowed.
* Fix mistake in discussion of extracting field names from workflow step ids.
* 21 July 2017:
* Add clarification about scattering over empty arrays.
* Clarify interpretation of secondaryFiles on inputs.
* 22 July 2017: (v1.0.1)
* Expanded discussion of semantics of File and Directory types
## Purpose
The Common Workflow Language Command Line Tool Description express
