News and updates
Publication
- May 15, 2019 The Tibanna paper is now out in Bioinformatics! https://doi.org/10.1093/bioinformatics/btz379
- Apr 18, 2019 A newer version of the Tibanna paper is out on bioRxiv! https://www.biorxiv.org/content/10.1101/440974v3
- Oct 11, 2018 The Tibanna paper preprint is out on bioRxiv! https://www.biorxiv.org/content/early/2018/10/11/440974
Version updates
For more recent version updates, check out Tibanna releases
Sep 16, 2019 The latest version is now 0.9.1.
- A new functionality of generating a resource metrics report html is now added! This report includes a graph of CPU/memory/disk space utilization and usage at 1 min intervals, as well as a table of summary metrics.
  - After each run, an html report is automatically added to the log_bucket, where it can be viewed using a Web Browser. However, for this to take effect, the unicorn must be redeployed.
  - The new plot_metrics function of the CLI (tibanna plot_metrics -h) allows users to create the resource metrics report before a run is complete.
  - The same function can be used through the Python API (API().plot_metrics(job_id=<jobid>, ...))
- A new functionality cost is added to the tibanna CLI/API, to retrieve the cost of a specific run.
  - tibanna cost --job-id=<jobid>
  - It usually takes a day for the cost to be available.
  - The cost can also be added to the resource plot, by
      tibanna cost -j <jobid> --update-tsv
      tibanna plot_metrics -j <jobid> --update-html-only --force-upload
- A new dynamoDB-based jobID indexing is enabled! This allows users to search by jobid without specifying the step function name, even after the execution expires (e.g. tibanna log, tibanna plot_metrics)
  - To use this feature, the unicorn must be redeployed. Only the runs created after the redeployment would be searchable using this feature. When the jobid index is not available, tibanna automatically switches to the old way of searching.
  - DynamoDB may add to the cost but very minimally (up to $0.01 per month in case of 4DN)
- Benchmark 0.5.5 is used now for 4DN pipelines.
- run_workflow now has a --do-not-open-browser option that disables opening the Step function execution on a Web Browser.
Aug 14, 2019 The latest version is now 0.9.0.
- root_ebs_size is now supported (default 8) as a config field (useful for large docker images or multiple docker images, which use root EBS).
- TIBANNA_AWS_REGION and AWS_ACCOUNT_NUMBER are no longer required as environment variables.
Jul 22, 2019 The latest version is now 0.8.8.
- Fixed installation issue caused by python-lambda-4dn
- Input file can now be a directory for shell and snakemake
  - e.g. "file:///data1/shell/somedir" : "s3://bucketname/dirname"
- Output target can now be a directory for shell and snakemake
  - e.g. "file:///data1/shell/somedir": "dirname"
Jul 8, 2019 The latest version is now 0.8.7.
- ec2 termination policy is added to usergroup to support the kill function
- run_workflow verbose option is now passed to dynamodb
Jun 25, 2019 The latest version is now 0.8.6.
- A newly introduced issue of not reporting Metric after the run is now fixed.
- With tibanna log, when the log/postrunjson file is not available, it does not raise an error but prints a message.
- Benchmark 0.5.4 is used instead of 0.5.3 for 4DN pipelines.
Jun 14, 2019 The latest version is now 0.8.5.
- A newly introduced bug in the rerun CLI (not working) is now fixed.
Jun 12, 2019 The latest version is now 0.8.4.
- The issue of the auto-determined EBS size sometimes not being an integer is fixed.
- Now input files in the unicorn input json can be written in the format of s3://bucket/key as well as {'bucket_name': bucket, 'object_key': key}
- command can be written in the format of a list for aesthetic purposes (e.g. [command1, command2, command3] is equivalent to command1; command2; command3)
Jun 10, 2019 The latest version is now 0.8.3.
- A newly introduced issue of --usergroup not working properly with deploy_unicorn/deploy_core is now fixed.
- Now one can specify mem (in GB) and cpu instead of instance_type. The most cost-effective instance type will be auto-determined.
- Now one can set behavior_on_capacity_limit to other_instance_types, in which case tibanna will try the top 10 instance types in the order of decreasing hourly cost.
- EBS size can be specified in the format of 3x, 5.5x, etc. to make it 3 (or 5.5) times the total input size.
Jun 3, 2019 The latest version is now 0.8.2.
- One can now directly send in a command and a container image without any CWL/WDL (language = shell).
- One can now send a local/remote (http or s3) Snakemake workflow file to awsem and run it (either the whole thing, a step or multiple steps in it). (language = snakemake)
- Output target and input file dictionary keys can now be a file name instead of an argument name (must start with file://)
  - input file dictionary keys must be /data1/input, /data1/out or either /data1/shell or /data1/snakemake (depending on the language option).
- With shell / snakemake option, one can also exec into the running docker container after sshing into the EC2 instance.
- The dependency field can be in args, config or outside both in the input json.
May 30, 2019 The latest version is now 0.8.1.
- deploy_core (and deploy_unicorn) not working in a non-venv environment fixed
- local CWL/WDL files and CWL/WDL files on S3 are supported.
- new issue with opening the browser with run_workflow fixed
May 29, 2019 The latest version is now 0.8.0.
- Tibanna can now be installed via pip install tibanna! (no need to git clone)
- Tibanna now has its own CLI! Instead of invoke run_workflow, one should use tibanna run_workflow.
- Tibanna’s API now has its own class! Instead of from core.utils import run_workflow, one should use the following.
    from tibanna.core import API
    API().run_workflow(...)
- The API run_workflow() can now directly take an input json file as well as an input dictionary (both through the input_json parameter).
- The rerun CLI now has the --appname_filter option exposed
- The rerun_many CLI now has --appname-filter, --shutdown-min, --ebs-size, --ebs-type, --ebs-iops, --key-name, --name options exposed. The API also now has corresponding parameters.
- The stat CLI now has a corresponding API, and both have a new parameter n (-n) that prints out the first n lines only. The option -v (--verbose) is now replaced by -l (--long)
May 15, 2019 The latest version is now 0.7.0.
- Now works with Python3.6 (2.7 is deprecated!)
- newly introduced issue with non-list secondary output target handling fixed
- fixed the issue with top command reporting from ec2 not working any more
- now the run_workflow function does not alter the original input dictionary
- auto-terminates instance when CPU utilization is zero (inactivity) for an hour (mostly due to aws-related issue but could be others).
- The rerun function with a run name that contains a uuid at the end (to differentiate identical run names) now removes it from run_name before adding another uuid.
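The uuid handling described in the last item can be illustrated with a minimal sketch. This is not Tibanna's actual code: the helper names, the regular expression and the "-" separator between the run name and the uuid are all assumptions made for illustration.

```python
import re
import uuid

# Hypothetical pattern: a canonical uuid (8-4-4-4-12 hex digits) at the very
# end of the name, plus an optional "-" or "_" separator in front of it.
_TRAILING_UUID = re.compile(
    r"[-_]?[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$",
    re.IGNORECASE,
)


def strip_trailing_uuid(run_name: str) -> str:
    """Return run_name without a uuid suffix, if it has one."""
    return _TRAILING_UUID.sub("", run_name)


def randomize_run_name(run_name: str) -> str:
    """Remove any existing uuid suffix, then append a fresh uuid.

    Without the strip step, every rerun would make the name one uuid longer.
    """
    return f"{strip_trailing_uuid(run_name)}-{uuid.uuid4()}"


first = randomize_run_name("my_run")   # "my_run-<uuid>"
second = randomize_run_name(first)     # still one uuid, but a different one
```

Under these assumptions the name keeps the same length across repeated reruns, because the old suffix is dropped before the new one is added.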
Mar 7, 2019 The latest version is now 0.6.1.
- Default public bucket access is deprecated now, since it also allows access to all buckets in one’s own account. The users must specify buckets at deployment, even for public buckets. If the user doesn’t specify any bucket, the deployed Tibanna will only have access to the public tibanna test buckets of the 4dn AWS account.
- A newly introduced issue of rerun with no run_name in config fixed.
Feb 25, 2019 The latest version is now 0.6.0.
- The input json can now be simplified.
  - app_name, app_version, input_parameters, secondary_output_target, secondary_files fields can now be omitted (now optional)
  - instance_type, ebs_size, EBS_optimized can be omitted if benchmark is provided (app_name is a required field to use benchmark)
  - ebs_type, ebs_iops, shutdown_min can be omitted if using default (‘gp2’, ‘’, ‘now’, respectively)
  - password and key_name can be omitted if user doesn’t care to ssh into running/failed instances
- issue with rerun with a short run name containing uuid now fixed.
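As a rough illustration of the optional config fields listed above, the sketch below fills in the three defaults quoted in this entry (‘gp2’, ‘’, ‘now’). The function name, the example bucket name and the placement of app_name under args are assumptions for illustration, not Tibanna's actual implementation.

```python
import copy

# Defaults quoted in the 0.6.0 entry for fields that may be omitted.
CONFIG_DEFAULTS = {"ebs_type": "gp2", "ebs_iops": "", "shutdown_min": "now"}


def fill_config_defaults(input_json: dict) -> dict:
    """Return a copy of input_json whose config has the omitted defaults filled in.

    Works on a deep copy so that the caller's dictionary is left untouched.
    """
    filled = copy.deepcopy(input_json)
    config = filled.setdefault("config", {})
    for field, default in CONFIG_DEFAULTS.items():
        config.setdefault(field, default)  # keep any user-supplied value
    return filled


# Hypothetical simplified input json: only a few fields are given.
minimal_input = {
    "args": {"app_name": "md5"},                 # assumed location of app_name
    "config": {"log_bucket": "my-tibanna-log"},  # hypothetical bucket name
}
filled_input = fill_config_defaults(minimal_input)
```

Values the user does provide take precedence, so passing "ebs_type": "io1" would be preserved rather than overwritten.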
Feb 13, 2019 The latest version is now 0.5.9.
- Wrong requirement of SECRET env is removed from unicorn installation
- deploy_unicorn without specified buckets also works
- deploy_unicorn now has --usergroup option
- cloud metric statistics aggregation with runs > 24 hr now fixed
- invoke -l lists all invoke commands
- invoke add_user, invoke list and invoke users added
- log() function not assuming default step function fixed
- invoke log working only for currently running jobs fixed
Feb 4, 2019 The latest version is now 0.5.8.
- invoke log can be used to stream log or postrun json file.
- postrun json file now contains Cloudwatch metrics for memory/CPU and disk space for all jobs.
- invoke rerun has config override options such as --instance-type, --shutdown-min, --ebs-size and --key-name to rerun a job with a different configuration.
Jan 16, 2019 The latest version is now 0.5.7.
- Spot instance is now supported. To use a spot instance, use "spot_instance": true in the config field in the input execution json.
    "spot_instance": true,
    "spot_duration": 360
Dec 21, 2018 The latest version is now 0.5.6.
- CloudWatch set up permission error fixed
- invoke kill works with jobid (previously it worked only with execution arn)
    invoke kill --job-id=<jobid> [--sfn=<stepfunctionname>]
- A more comprehensive monitoring using invoke stat -v that prints out instance ID, IP, instance status, ssh key and password.
- To update an existing Tibanna on AWS, do the following
    invoke setup_tibanna_env --buckets=<bucket1>,<bucket2>,...
    invoke deploy_tibanna --sfn-type=unicorn --usergroup=<usergroup_name>
  e.g.
    invoke setup_tibanna_env --buckets=leelab-datafiles,leelab-tibanna-log
    invoke deploy_tibanna --sfn-type=unicorn --usergroup=default_3225
Dec 14, 2018 The latest version is now 0.5.5.
- Now memory, disk space and CPU utilization are reported to CloudWatch at 1 min intervals from the Awsem instance.
- To turn on Cloudwatch Dashboard (a collective visualization for all of the metrics combined), add "cloudwatch_dashboard" : true to the "config" field of the input execution json.
Dec 14, 2018 The latest version is now 0.5.4.
- Problem of EBS mounting with newer instances (e.g. c5, t3, etc) fixed.
- Now a common AMI is used for CWL v1, CWL draft3 and WDL and it is handled by awsf/aws_run_workflow_generic.sh
  - To use the new features, redeploy run_task_awsem lambda.
      git pull
      invoke deploy_core run_task_awsem --usergroup=<usergroup>  # e.g. usergroup=default_3046
Dec 4, 2018 The latest version is now 0.5.3.
- For WDL workflow executions, a more comprehensive log named <jobid>.debug.tar.gz is collected and sent to the log bucket.
- A file named <jobid>.input.json is now sent to the log bucket at the start of all Pony executions.
- Space usage info is added at the end of the log file for WDL executions.
- bigbed files are registered to Higlass (pony).
- Benchmark for encode-chipseq supported. This includes double-nested array input support for Benchmark.
- quality_metric_chipseq and quality_metric_atacseq created automatically (Pony).
- An empty extra file array can be handled now (Pony).
- When Benchmark fails, now Tibanna returns which file is missing.
Nov 20, 2018 The latest version is now 0.5.2.
- User permission error for setting postrun jsons public fixed
- --no-randomize option for invoke setup_tibanna_env command to turn off adding random number at the end of usergroup name.
- Throttling error upon mass file upload for md5/fastqc trigger fixed.
Nov 19, 2018 The latest version is now 0.5.1.
- Conditional alternative outputs can be assigned to a global output name (useful for WDL)
Nov 8, 2018 The latest version is now 0.5.0.
- WDL and double-nested input arrays are now also supported for Pony.
Nov 7, 2018 The latest version is now 0.4.9.
- Files can be renamed upon downloading from s3 to an ec2 instance where a workflow will be executed.
Oct 26, 2018 The latest version is now 0.4.8.
- Double-nested input file array is now supported for both CWL and WDL.
Oct 24, 2018 The latest version is now 0.4.7.
- Nested input file array is now supported for both CWL and WDL.
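To make the nested and double-nested input file arrays from the 0.4.7 and 0.4.8 entries more concrete, here is a small sketch that walks an arbitrarily nested list of file keys. It assumes that a nested array is a list of lists and a double-nested array is a list of lists of lists; the helper names and example file names are hypothetical and not part of Tibanna.

```python
from typing import Iterator, List, Union

# A file key, or a list of (possibly nested) file keys.
NestedKeys = Union[str, List["NestedKeys"]]


def iter_keys(object_key: NestedKeys) -> Iterator[str]:
    """Yield every file key in order, whatever the nesting depth."""
    if isinstance(object_key, str):
        yield object_key
    else:
        for item in object_key:
            yield from iter_keys(item)


def nesting_depth(object_key: NestedKeys) -> int:
    """Return 0 for a single key, 1 for a flat list, 2 for a list of lists, etc."""
    if isinstance(object_key, str):
        return 0
    return 1 + max((nesting_depth(item) for item in object_key), default=0)


# Hypothetical examples of the two shapes.
nested = [["rep1_R1.fastq", "rep1_R2.fastq"], ["rep2_R1.fastq"]]
double_nested = [[["a.fastq"], ["b.fastq", "c.fastq"]], [["d.fastq"]]]
```

A recursive walk like this treats every depth uniformly, so the same code would cover a single file, a flat array and both nested forms.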
Oct 22, 2018 The latest version is now 0.4.6.
- Basic WDL support is implemented for Tibanna Unicorn!
Oct 11, 2018 The latest version is now 0.4.5.
- Killer CLIs: invoke kill is available to kill specific jobs and invoke kill_all is available to kill all jobs. They terminate both the step function execution and the EC2 instances.