plantit aims to support two user groups with different concerns and priorities.
researchers: analyzing data, running models & simulations (submitting workflows)
developers: publishing and maintaining research software (publishing workflows)
plantit the latter group can quickly and easily publish an algorithm to a broader, possibly non-technical user community. In this way it’s a continuous deployment tool. It’s also a science gateway, hosting plug-and-play algorithms in a web GUI such that any user can leverage XSEDE HPC/HTC clusters for high-throughput phenomics, no programming experience required.
plantit has a few fundamental abstractions:
A Dataset is a set of data objects. A Workflow is a containerized research application. A workflow must yield a dataset as output and may accept one as input (workflows should be designed as functions or generators, not for their side effects — ideally, they should have none). An instantiation of a workflow is called a Task. An Agent is a cluster queue that can run tasks.
An Agent is a deployment target: an abstraction of a cluster or supercomputer along with SLURM scheduler configuration details.
A Workflow is an executable research application packaged into a Docker image. Workflows are deployed in the Singularity container runtime. To define a workflow, add a plantit.yaml file to any public GitHub repository.
A Task is an instance of a workflow, deployed to an agent. When a task is submitted from the browser, the
plantit web app hands it to an internal queue feeding an orchestrator process. When the orchestrator picks up the task, it generates a job script and submits it to the selected cluster/supercomputer scheduler, then monitors its progress until completion.
A Project is a MIAPPE investigation, which may contain one or more studies. MIAPPE (Minimum Information About a Plant Phenotyping Experiment) is a formal ontology for organizing data, metadata, experiments, and analyses.
plantit allows datasets and tasks to be freely associated with MIAPPE projects.
Select a workflow#
To explore workflows, navigate to the Workflows tab from the home view.
By default, this page will display the Featured workflow context: a curated set of applications provided by the Computational Plant Science lab, collaborators, and other researchers.
Click the Featured dropdown to select a different context. Options include:
Examples: a small set of simple workflows to serve as templates and examples
Public: all publicly available workflows
Yours: your own workflows (private and public)
[Organization]: workflows belonging to a particular organization
[Project]: workflows associated with a particular MIAPPE project
Select a workflow to view its authorship, related publications, parameter list, and deployment configuration.
Submit to an agent#
To configure and submit a task for the workflow you’ve selected, click Submit. This will present some configuration options including (at least):
ID: the task’s (unique) identifier
Tags: arbitrary text tags to associate with the task
Time: the task’s time limit
Agent: the agent to submit the task to
Output: the folder to deposit results in
If the workflow requires input files or parameters, corresponding configuration sections will be shown.
After all fields have been configured, click the Start button to submit the task.
After a moment the task page will appear. At first there may be no log messages.
Before long the task should be created, scheduled, and started on the appropriate agent. At this point you should see a few lines of log output:
When a task completes successfully, the status will change from
The output folder in the CyVerse data store section will eventually open at the bottom of the view (you may need to reload the page). Results will be zipped into a file with name matching the task’s ID.