cwltool

:Name: cwltool
:Version: 1.0.20170516234254
:Home page: https://github.com/common-workflow-language/cwltool
:Summary: Common workflow language reference implementation
:Upload time: 2017-05-16 23:54:30
:Author: Common workflow language working group
:License: Apache 2.0
:Requirements: requests, ruamel.yaml, rdflib, rdflib-jsonld, shellescape, schema-salad

==================================================================
Common workflow language tool description reference implementation
==================================================================

CWL Conformance test: |Build Status|

This is the reference implementation of the Common Workflow Language.  It is
intended to be feature complete and provide comprehensive validation of CWL
files as well as provide other tools related to working with CWL.

This is written and tested for Python 2.7.

The reference implementation consists of two packages.  The "cwltool" package
is the primary Python module containing the reference implementation in the
"cwltool" module and console executable by the same name.

The "cwlref-runner" package is optional and provides an additional entry point
under the alias "cwl-runner", which is the implementation-agnostic name for the
default CWL interpreter installed on a host.

Install
-------

Install the official package from PyPI (this also installs the "cwltool"
package)::

  pip install cwlref-runner

If installing alongside another CWL implementation, install only "cwltool"::

  pip install cwltool

To install from source::

  git clone https://github.com/common-workflow-language/cwltool.git
  cd cwltool && python setup.py install
  cd cwlref-runner && python setup.py install  # co-installing? skip this

Remember, if co-installing multiple CWL implementations, you need to manage
which implementation ``cwl-runner`` points to, via a symbolic file system
link or `another facility <https://wiki.debian.org/DebianAlternatives>`_.
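
To check which implementation ``cwl-runner`` currently points to, the link can
be followed from Python. This is a minimal sketch using only the standard
library; the helper name ``resolve_runner`` is made up for this example and is
not part of cwltool, and ``shutil.which`` needs Python 3.3 or newer.

.. code:: python

    import os
    import shutil


    def resolve_runner(command="cwl-runner"):
        """Return the real file behind `command`, or None if it is not on PATH."""
        found = shutil.which(command)
        if found is None:
            return None
        # Follow symbolic links (for example Debian alternatives) to the target.
        return os.path.realpath(found)


    if __name__ == "__main__":
        print(resolve_runner())

If the printed path belongs to a different implementation than expected, the
symbolic link (or alternatives entry) is what needs updating.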

Running tests locally
---------------------

-  Running basic tests ``(/tests)``:

.. code:: bash

    python setup.py test

-  Running the entire suite of CWL conformance tests:

The GitHub repository for the CWL specifications contains a script that tests a CWL
implementation against a wide array of valid CWL files using the `cwltest <https://github.com/common-workflow-language/cwltest>`_
program.

Instructions for running these tests can be found in the Common Workflow Language Specification repository at https://github.com/common-workflow-language/common-workflow-language/blob/master/CONFORMANCE_TESTS.md

Run on the command line
-----------------------

Simple command::

  cwl-runner [tool-or-workflow-description] [input-job-settings]

Or, if you have multiple CWL implementations installed and you want to override
the default cwl-runner, use::

  cwltool [tool-or-workflow-description] [input-job-settings]

Use with boot2docker
--------------------
boot2docker runs Docker inside a virtual machine and only mounts ``/Users``
into it. By default, cwltool creates temporary directories under e.g.
``/var``, which is not accessible to Docker containers.

To run CWL successfully with boot2docker you need to set the ``--tmpdir-prefix``
and ``--tmp-outdir-prefix`` options to a location under ``/Users``::

    $ cwl-runner --tmp-outdir-prefix=/Users/username/project --tmpdir-prefix=/Users/username/project wc-tool.cwl wc-job.json
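
When cwl-runner is launched from a script, the same command line can be
assembled programmatically. The sketch below only builds the argument list
from the two options named above; the function name ``boot2docker_command``
and the ``/Users`` check are illustrative assumptions, not part of cwltool.

.. code:: python

    def boot2docker_command(project_dir, tool, job):
        """Build a cwl-runner argument list that keeps temporary data in project_dir."""
        if not project_dir.startswith("/Users/"):
            # boot2docker only shares /Users with the virtual machine.
            raise ValueError("project_dir must be located under /Users")
        return [
            "cwl-runner",
            "--tmp-outdir-prefix=" + project_dir,
            "--tmpdir-prefix=" + project_dir,
            tool,
            job,
        ]


    if __name__ == "__main__":
        argv = boot2docker_command(
            "/Users/username/project", "wc-tool.cwl", "wc-job.json")
        print(" ".join(argv))

The returned list can be handed to ``subprocess.check_call`` to actually start
the run.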

.. |Build Status| image:: https://ci.commonwl.org/buildStatus/icon?job=cwltool-conformance
   :target: https://ci.commonwl.org/job/cwltool-conformance/

Tool or workflow loading from remote or local locations
-------------------------------------------------------

``cwltool`` can run tool and workflow descriptions stored on both local and
remote systems; remote descriptions are loaded via HTTP[S] URLs.

Input job files and Workflow steps (via the `run` directive) can reference CWL
documents using absolute or relative local filesystem paths. If a relative path
is referenced and that document isn't found in the current directory, then the
locations listed in the specification will be searched:
http://www.commonwl.org/v1.0/CommandLineTool.html#Discovering_CWL_documents_on_a_local_filesystem
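
The lookup order can be pictured with a small sketch. It is not cwltool's
resolver: the function names are hypothetical, and the default directory list
is an assumption based on a reading of the linked section (``/usr/share/commonwl``,
``/usr/local/share/commonwl`` and ``$XDG_DATA_HOME/commonwl``), so check the
specification for the authoritative list.

.. code:: python

    import os
    import tempfile


    def default_search_dirs(environ=None):
        """Directories tried after the current directory (assumed list)."""
        environ = os.environ if environ is None else environ
        data_home = environ.get("XDG_DATA_HOME") or os.path.join(
            os.path.expanduser("~"), ".local", "share")
        return [
            "/usr/share/commonwl",
            "/usr/local/share/commonwl",
            os.path.join(data_home, "commonwl"),
        ]


    def find_cwl_document(name, cwd=".", search_dirs=None):
        """Return the absolute path of the first match for a relative name, or None."""
        if search_dirs is None:
            search_dirs = default_search_dirs()
        # The current directory is always tried first.
        for base in [cwd] + list(search_dirs):
            candidate = os.path.join(base, name)
            if os.path.isfile(candidate):
                return os.path.abspath(candidate)
        return None


    if __name__ == "__main__":
        # Demonstration with throwaway directories instead of system locations.
        shared = tempfile.mkdtemp()
        open(os.path.join(shared, "echo.cwl"), "w").close()
        print(find_cwl_document(
            "echo.cwl", cwd=tempfile.mkdtemp(), search_dirs=[shared]))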


Use with GA4GH Tool Registry API
--------------------------------

Cwltool can launch tools directly from `GA4GH Tool Registry API`_ endpoints.

By default, cwltool searches https://dockstore.org/ .  Use ``--add-tool-registry`` to add other registries to the search path.

For example ::

  cwltool --non-strict quay.io/collaboratory/dockstore-tool-bamstats:master test.json

and (defaults to latest when a version is not specified) ::

  cwltool --non-strict quay.io/collaboratory/dockstore-tool-bamstats test.json

For this example, grab the test.json (and input file) from https://github.com/CancerCollaboratory/dockstore-tool-bamstats
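
The two invocations differ only in the optional ``:version`` suffix of the tool
identifier. The sketch below shows one way such an identifier could be split;
it is an illustration only, not the code cwltool uses, and the helper name
``split_tool_id`` and the ``"latest"`` placeholder are assumptions.

.. code:: python

    def split_tool_id(tool_id, default_version="latest"):
        """Split 'host/path/name[:version]' into (name, version)."""
        # Only a colon in the last path segment marks a version, so a
        # host with a port (e.g. localhost:5000/tool) is left intact.
        last_segment = tool_id.rsplit("/", 1)[-1]
        if ":" in last_segment:
            name, version = tool_id.rsplit(":", 1)
            return name, version
        return tool_id, default_version


    if __name__ == "__main__":
        print(split_tool_id("quay.io/collaboratory/dockstore-tool-bamstats:master"))
        print(split_tool_id("quay.io/collaboratory/dockstore-tool-bamstats"))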

.. _`GA4GH Tool Registry API`: https://github.com/ga4gh/tool-registry-schemas

Import as a module
------------------

Add::

  import cwltool

to your script.

The easiest way to use cwltool to run a tool or workflow from Python is to use a Factory::

  import cwltool.factory
  fac = cwltool.factory.Factory()

  echo = fac.make("echo.cwl")
  result = echo(inp="foo")

  # result["out"] == "foo"


Cwltool control flow
--------------------

Technical outline of how cwltool works internally, for maintainers.

#. Use CWL `load_tool()` to load document.

   #. Fetches the document from file or URL
   #. Applies preprocessing (syntax/identifier expansion and normalization)
   #. Validates the document based on cwlVersion
   #. If necessary, updates the document to latest spec
   #. Constructs a Process object using `make_tool()` callback.  This yields a
      CommandLineTool, Workflow, or ExpressionTool.  For workflows, this
      recursively constructs each workflow step.
   #. To construct custom types for CommandLineTool, Workflow, or
      ExpressionTool, provide a custom `make_tool()`

#. Iterate on the `job()` method of the Process object to get back runnable jobs.

   #. `job()` is a generator method (uses the Python iterator protocol)
   #. Each time the `job()` method is invoked in an iteration, it returns one
      of: a runnable item (an object with a `run()` method), `None` (indicating
      there is currently no work ready to run) or end of iteration (indicating
      the process is complete.)
   #. Invoke the runnable item by calling `run()`.  This runs the tool and gets output.
   #. Output of a process is reported by an output callback.
   #. `job()` may be iterated over multiple times.  It will yield all the work
      that is currently ready to run and then yield None.

#. "Workflow" objects create corresponding "WorkflowJob" and "WorkflowJobStep" objects to hold the workflow state for the duration of the job invocation.

   #. The WorkflowJob iterates over each WorkflowJobStep and determines if the
      inputs to the step are ready.
   #. When a step is ready, it constructs an input object for that step and
      iterates on the `job()` method of the workflow job step.
   #. Each runnable item is yielded back up to the top-level run loop
   #. When a step job completes and receives an output callback, the
      job outputs are assigned to the output of the workflow step.
   #. When all steps are complete, the intermediate files are moved to a final
      workflow output, intermediate directories are deleted, and the output
      callback for the workflow is called.

#. The `job()` method of a "CommandLineTool" object yields a single runnable object.

   #. The CommandLineTool `job()` method calls `makeJobRunner()` to create a
      `CommandLineJob` object
   #. The job method configures the CommandLineJob object by setting public
      attributes
   #. The job method iterates over file and directory inputs to the
      CommandLineTool and creates a "path map".
   #. Files are mapped from their "resolved" location to a "target" path where
      they will appear at tool invocation (for example, a location inside a
      Docker container.)  The target paths are used on the command line.
   #. Files are staged to target paths using either Docker volume binds (when
      using containers) or symlinks (if not).  This staging step enables files
      to be logically rearranged or renamed independent of their source layout.
   #. The run() method of CommandLineJob executes the command line tool or
      Docker container, waits for it to complete, collects output, and makes
      the output callback.
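
The `job()` iteration protocol described in the outline can be sketched with
toy classes. None of the names below come from cwltool: ``Runnable``,
``TwoStepProcess`` and ``run_all`` are hypothetical stand-ins, and the
``(output, status)`` callback signature is an assumption made for the example.

.. code:: python

    class Runnable(object):
        """A runnable item: run() does the work and reports via a callback."""

        def __init__(self, name, output_callback):
            self.name = name
            self.output_callback = output_callback

        def run(self):
            # A real job would execute a command line here.
            self.output_callback({"out": self.name + " done"}, "success")


    class TwoStepProcess(object):
        """Toy process whose second step needs the output of the first."""

        def __init__(self):
            self.outputs = []

        def _collect(self, output, status):
            self.outputs.append(output)

        def job(self):
            yield Runnable("step1", self._collect)
            while not self.outputs:
                # Step 2 is not ready until step 1 has reported its output.
                yield None
            yield Runnable("step2", self._collect)


    def run_all(process):
        """Top-level loop: run every item that job() yields, in order."""
        for runnable in process.job():
            if runnable is None:
                # With synchronous jobs, None means no progress is possible.
                raise RuntimeError("no work ready and process not complete")
            runnable.run()
        return process.outputs


    if __name__ == "__main__":
        print(run_all(TwoStepProcess()))

Because ``run_all`` runs each item before asking for the next one, it never
sees ``None`` here. A loop that asks for more work before running the first
item would receive ``None`` until the first output callback has fired.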


Extension points
----------------

The following functions can be provided to main(), to load_tool(), or to the
executor to override or augment the listed behaviors.

executor(tool, job_order_object, **kwargs)
  (Process, Dict[Text, Any], **Any) -> Tuple[Dict[Text, Any], Text]

  A top-level workflow execution loop. It should synchronously execute a
  process object and return an output object.

makeTool(toolpath_object, **kwargs)
  (Dict[Text, Any], **Any) -> Process

  Construct a Process object from a document.

selectResources(request)
  (Dict[Text, int]) -> Dict[Text, int]

  Take a resource request and turn it into a concrete resource assignment.

versionfunc()
  () -> Text

  Return version string.

make_fs_access(basedir)
  (Text) -> StdFsAccess

  Return a file system access object.

fetcher_constructor(cache, session)
  (Dict[unicode, unicode], requests.sessions.Session) -> Fetcher

  Construct a Fetcher object with the supplied cache and HTTP session.

resolver(document_loader, document)
  (Loader, Union[Text, dict[Text, Any]]) -> Text

  Resolve a relative document identifier to an absolute one which can be fetched.

logger_handler
  logging.Handler

  Handler object for logging.
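
As a rough illustration, three of these extension points could look like the
sketch below. The signatures follow the list above, but everything else is an
assumption: the names ``ListHandler`` and ``select_resources`` are made up, and
the request and result keys (``coresMin``, ``ramMin``, ``tmpdirMin``,
``outdirMin`` mapping to ``cores``, ``ram``, ``tmpdirSize``, ``outdirSize``)
should be checked against the cwltool source before relying on them.

.. code:: python

    import logging


    class ListHandler(logging.Handler):
        """logger_handler candidate: keep formatted log messages in a list."""

        def __init__(self):
            logging.Handler.__init__(self)
            self.messages = []

        def emit(self, record):
            self.messages.append(self.format(record))


    def versionfunc():
        """versionfunc candidate: () -> Text."""
        return u"my-runner 0.1 (hypothetical)"


    def select_resources(request):
        """selectResources candidate: (Dict[Text, int]) -> Dict[Text, int].

        ASSUMPTION: the request holds <name>Min/<name>Max keys and the
        result uses the key names shown here. This grants the minimum.
        """
        return {
            "cores": request["coresMin"],
            "ram": request["ramMin"],
            "tmpdirSize": request["tmpdirMin"],
            "outdirSize": request["outdirMin"],
        }


    if __name__ == "__main__":
        handler = ListHandler()
        log = logging.getLogger("extension-sketch")
        log.addHandler(handler)
        log.warning("hello")
        print(handler.messages, versionfunc())

How each object is handed to main(), load_tool() or the executor is not shown
here, since the exact keyword names depend on the cwltool version.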

            

Raw data

            {
    "maintainer": "", 
    "docs_url": null, 
    "requires_python": "", 
    "maintainer_email": "", 
    "cheesecake_code_kwalitee_id": null, 
    "coveralis": true, 
    "keywords": "", 
    "tox": true, 
    "requirements": [
        {
            "name": "requests", 
            "specs": [
                [
                    ">=", 
                    "1.0"
                ]
            ]
        }, 
        {
            "name": "ruamel.yaml", 
            "specs": [
                [
                    "==", 
                    "0.13.7"
                ]
            ]
        }, 
        {
            "name": "rdflib", 
            "specs": [
                [
                    "==", 
                    "4.2.1"
                ]
            ]
        }, 
        {
            "name": "rdflib-jsonld", 
            "specs": [
                [
                    "==", 
                    "0.4.0"
                ]
            ]
        }, 
        {
            "name": "shellescape", 
            "specs": [
                [
                    "==", 
                    "3.4.1"
                ]
            ]
        }, 
        {
            "name": "schema-salad", 
            "specs": [
                [
                    ">=", 
                    "2.4.20170308171942"
                ], 
                [
                    "<", 
                    "3"
                ]
            ]
        }
    ], 
    "author": "Common workflow language working group", 
    "home_page": "https://github.com/common-workflow-language/cwltool", 
    "github_user": "common-workflow-language", 
    "download_url": "https://pypi.python.org/packages/9d/f4/e2f96359b99d841b47cd3c2a204966daae5e92f551e194c112ca18d82809/cwltool-1.0.20170516234254.tar.gz", 
    "platform": "", 
    "version": "1.0.20170516234254", 
    "cheesecake_documentation_id": null, 
    "upload_time": "2017-05-16 23:54:30", 
    "lcname": "cwltool", 
    "bugtrack_url": "", 
    "github": true, 
    "name": "cwltool", 
    "license": "Apache 2.0", 
    "travis_ci": true, 
    "github_project": "cwltool", 
    "summary": "Common workflow language reference implementation", 
    "split_keywords": [], 
    "author_email": "common-workflow-language@googlegroups.com", 
    "urls": [
        {
            "has_sig": false, 
            "upload_time": "2017-05-16T23:54:28", 
            "comment_text": "", 
            "python_version": "2.7", 
            "url": "https://pypi.python.org/packages/de/12/690ab1cc074d15d236aa090a5af929139c264612e79e2aa3cdea7f7b1da0/cwltool-1.0.20170516234254-py2-none-any.whl", 
            "md5_digest": "ab7459a59a205aeb6fae83d97fb1c4a5", 
            "downloads": 0, 
            "filename": "cwltool-1.0.20170516234254-py2-none-any.whl", 
            "packagetype": "bdist_wheel", 
            "path": "de/12/690ab1cc074d15d236aa090a5af929139c264612e79e2aa3cdea7f7b1da0/cwltool-1.0.20170516234254-py2-none-any.whl", 
            "size": 296374
        }, 
        {
            "has_sig": false, 
            "upload_time": "2017-05-16T23:54:30", 
            "comment_text": "", 
            "python_version": "source", 
            "url": "https://pypi.python.org/packages/9d/f4/e2f96359b99d841b47cd3c2a204966daae5e92f551e194c112ca18d82809/cwltool-1.0.20170516234254.tar.gz", 
            "md5_digest": "ebdcabcca5a7f10a1f14a041b5a3607f", 
            "downloads": 0, 
            "filename": "cwltool-1.0.20170516234254.tar.gz", 
            "packagetype": "sdist", 
            "path": "9d/f4/e2f96359b99d841b47cd3c2a204966daae5e92f551e194c112ca18d82809/cwltool-1.0.20170516234254.tar.gz", 
            "size": 234249
        }
    ], 
    "_id": null, 
    "cheesecake_installability_id": null
}