News and updates¶
Publication¶
- May 15, 2019 Tibanna paper is out on Bioinformatics now! https://doi.org/10.1093/bioinformatics/btz379
- Apr 18. 2019 A newer version of the Tibanna paper is out on Biorxiv! https://www.biorxiv.org/content/10.1101/440974v3
- Oct 11. 2018 Tibanna paper preprint is out on Biorxiv! https://www.biorxiv.org/content/early/2018/10/11/440974
Version updates¶
For more recent version updates, check out Tibanna releases
- Sep 15, 2023 The latest version is now 4.0.0_.
- Support for Python 3.7 has been dropped
- Added support for Python 3.9 and 3.10
- Nov 18, 2022 The latest version is now 3.0.0_.
- Tibanna now supports AWS Graviton-based instances.
- The instance type configuration now allows single instances (e.g.,
t3.micro
) and lists (e.g.,[t3.micro, t3.small]
). Ifspot_instance
is enabled, Tibanna will run the workflow on the instance with the highest available capacity. Ifspot_instance
is disabled, it will run the workflow on the cheapest instance in the list.- The option
other_instance_types
forbehavior_on_capacity_limit
has been removed. It will fall back towait_and_retry
.- Mar 10, 2022 The latest version is now 2.0.0.
- The default Python version for Tibanna is now 3.8 (or 3.7). Python 3.6 is no longer supported.
Sep 16, 2019 The latest version is now 0.9.1.
A new functionality of generating a resource metrics report html is now added! This report includes a graph of CPU/Memory/disk space utilization and usage at 1min interval, as well as a table of summary metrics. - After each run, an html report gets automatically added to the
log_bucket
which can be viewed using a Web Browser. However, for this to take effect, the unicorn must be redeployed. - The newplot_metrics
function of CLI (tibanna plot_metrics -h
) allows users to create the resource metrics report before a run it complete. - The same function can be used through Python API (API().plot_metrics(job_id=<jobid>, ...)
)A new functionality
cost
is added to the tibanna CLI/API, to retrieve the cost of a specific run. -tibanna cost --job-id=<jobid>
- It usually takes a day for the cost to be available. - The cost can also be added to the resource plot, bytibanna cost -j <jobid> --update-tsv tibanna plot_metrics -j <jobid> --update-html-only --force-upload
- A new dynamoDB-based jobID indexing is enabled! This allows users to search by jobid without specifying step function name and even after the execution expires (e.g.
tibanna log
,tibanna plot_metrics
) - To use this feature, the unicorn must be redeployed. Only the runs created after the redeployment would be searchable using this feature. When the jobid index is not available, tibanna automatically switches to the old way of searching. - DynamoDB may add to the cost but very minimally (up to $0.01 per month in case of 4DN)Benchmark
0.5.5
is used now for 4DN pipelines.run_workflow
now has--do-not-open-browser
option that disables opening the Step function execution on a Web Browser.Aug 14, 2019 The latest version is now 0.9.0.
root_ebs_size
now supported (default 8) as a config field. (useful for large docker images or multiple docker images, which uses root EBS)TIBANNA_AWS_REGION
andAWS_ACCOUNT_NUMBER
no longer required as environment variables.Jul 22, 2019 The latest version is now 0.8.8.
- Fixed installation issue caused by
python-lambda-4dn
- Input file can now be a directory for
shell
andsnakemake
- e.g."file:///data1/shell/somedir" : "s3://bucketname/dirname"
- Output target can now be a directory for
shell
andsnakemake
- e.g."file:///data1/shell/somedir": "dirname"
Jul 8, 2019 The latest version is now 0.8.7.
- ec2 termination policy is added to usergroup to support
kill
functionrun_workflow
verbose
option is now passed todynamodb
Jun 25, 2019 The latest version is now 0.8.6.
- A newly introduced issue of not reporting
Metric
after the run is now fixed.- With
tibanna log
, when the log/postrunjson file is not available, it does not raise an error but prints a message.- Benchmark
0.5.4
is used instead of0.5.3
for 4DN pipelines.Jun 14, 2019 The latest version is now 0.8.5.
- A newly introduced bug in the
rerun
cli (not working) now fixed.Jun 12, 2019 The latest version is now 0.8.4.
- The issue of auto-determined EBS size being sometimes not an integer fixed.
- Now input files in the unicorn input json can be written in the format of
s3://bucket/key
as well as{'bucket_name': bucket, 'object_key': key}
- command can be written in the format of a list for aesthetic purpose (e.g.
[command1, command2, command3]
is equivalent tocommand1; command2; command3
)Jun 10, 2019 The latest version is now 0.8.3.
- A newly introduced issue of
--usergroup
not working properly withdeploy_unicorn
/deploy_core
is now fixed.- Now one can specify
mem
(in GB) andcpu
instead ofinstance_type
. The most cost-effective instance type will be auto-determined.- Now one can set
behavior_on_capacity_limit
toother_instance_types
, in which case tibanna will try the top 10 instance types in the order of decreasing hourly cost.- EBS size can be specified in the format of
3x
,5.5x
, etc. to make it 3 (or 5.5) times the total input size.Jun 3, 2019 The latest version is now 0.8.2.
- One can now directly send in a command and a container image without any CWL/WDL (language =
shell
).- One can now send a local/remote(http or s3) Snakemake workflow file to awsem and run it (either the whole thing, a step or multiple steps in it). (language =
snakemake
)- Output target and input file dictionary keys can now be a file name instead of an argument name (must start with
file://
) - input file dictionary keys must be/data1/input
,/data1/out
or either/data1/shell
or/data1/snakemake
(depending on the language option).- With shell / snakemake option, one can also
exec
into the running docker container after sshing into the EC2 instance.- The
dependency
field can be in args, config or outside both in the input json.May 30, 2019 The latest version is now 0.8.1.
deploy_core
(anddeploy_unicorn
) not working in a non-venv environment fixed- local CWL/WDL files and CWL/WDL files on S3 are supported.
- new issue with opening the browser with
run_workflow
fixedMay 29, 2019 The latest version is now 0.8.0.
Tibanna can now be installed via
pip install tibanna
! (no need togit clone
)Tibanna now has its own CLI! Instead of
invoke run_workflow
, one should usetibanna run_workflow
.Tibanna’s API now has its own class! Instead of
from core.utils import run_workflow
, one should use the following.from tibanna.core import API API().run_workflow(...)The API
run_workflow()
can now directly take an input json file as well as an input dictionary (both through`input_json
parameter).The
rerun
CLI now has--appname_filter
option exposedThe
rerun_many
CLI now has--appname-filter
,--shutdown-min
,--ebs-size
,--ebs-type
,--ebs-iops
,--key-name
,--name
options exposed. The API also now has corresponding parameters.The
stat
CLI now has API and both has a new parameter n (-n) that prints out the first n lines only. The option-v
(--verbose
) is not replaced by-l
(--long
)May 15, 2019 The latest version is now 0.7.0.
- Now works with Python3.6 (2.7 is deprecated!)
- newly introduced issue with non-list secondary output target handling fixed
- fixed the issue with top command reporting from ec2 not working any more
- now the run_workflow function does not later the original input dictionary
- auto-terminates instance when CPU utilization is zero (inactivity) for an hour (mostly due to aws-related issue but could be others).
- The rerun function with a run name that contains a uuid at the end(to differentiate identical run names) now removes it from run_name before adding another uuid.
Mar 7, 2019 The latest version is now 0.6.1.
- Default public bucket access is deprecated now, since it also allows access to all buckets in one’s own account. The users must specify buckets at deployment, even for public buckets. If the user doesn’t specify any bucket, the deployed Tibanna will only have access to the public tibanna test buckets of the 4dn AWS account.
- A newly introduced issue of
rerun
with norun_name
inconfig
fixed.Feb 25, 2019 The latest version is now 0.6.0.
- The input json can now be simplified.
app_name
,app_version
,input_parameters
,secondary_output_target
,secondary_files
fields can now be omitted (now optional)instance_type
,ebs_size
,EBS_optimized
can be omitted if benchmark is provided (app_name
is a required field to use benchmark)ebs_type
,ebs_iops
,shutdown_min
can be omitted if using default (‘gp2’, ‘’, ‘now’, respectively)password
andkey_name
can be omitted if user doesn’t care to ssh into running/failed instances- issue with rerun with a short run name containing uuid now fixed.
Feb 13, 2019 The latest version is now 0.5.9.
- Wrong requirement of
SECRET
env is removed from unicorn installation- deploy_unicorn without specified buckets also works
- deploy_unicorn now has
--usergroup
option- cloud metric statistics aggregation with runs > 24 hr now fixed
invoke -l
lists all invoke commandsinvoke add_user
,invoke list
andinvoke users
addedlog()
function not assuming default step function fixedinvoke log
working only for currently running jobs fixedFeb 4, 2019 The latest version is now 0.5.8.
invoke log
can be used to stream log or postrun json file.- postrun json file now contains Cloudwatch metrics for memory/CPU and disk space for all jobs.
invoke rerun
has config override options such as--instance-type
,shutdown-min
,ebs-size
andkey-name
to rerun a job with a different configuration.Jan 16, 2019 The latest version is now 0.5.7.
- Spot instance is now supported. To use a spot instance, use
"spot_instance": true
in theconfig
field in the input execution json."spot_instance": true, "spot_duration": 360Dec 21, 2018 The latest version is now 0.5.6.
- CloudWatch set up permission error fixed
- invoke kill works with jobid (previously it worked only with execution arn)
invoke kill --job-id=<jobid> [--sfn=<stepfunctionname>]
- A more comprehensive monitoring using invoke stat -v that prints out instance ID, IP, instance status, ssh key and password.
- To update an existing Tibanna on AWS, do the following
invoke setup_tibanna_env --buckets=<bucket1>,<bucket2>,... invoke deploy_tibanna --sfn-type=unicorn --usergroup=<usergroup_name>e.g.
invoke setup_tibanna_env --buckets=leelab-datafiles,leelab-tibanna-log invoke deploy_tibanna --sfn-type=unicorn --usergroup=default_3225Dec 14, 2018 The latest version is now 0.5.5.
- Now memory, Disk space, CPU utilization are reported to CloudWatch at 1min interval from the Awsem instance.
- To turn on Cloudwatch Dashboard (a collective visualization for all of the metrics combined), add
"cloudwatch_dashboard" : true
to"config"
field of the input execution json.Dec 14, 2018 The latest version is now 0.5.4.
Problem of EBS mounting with newer instances (e.g. c5, t3, etc) fixed.
Now a common AMI is used for CWL v1, CWL draft3 and WDL and it is handled by awsf/aws_run_workflow_generic.sh
- To use the new features, redeploy run_task_awsem lambda.
git pull invoke deploy_core run_task_awsem --usergroup=<usergroup> # e.g. usergroup=default_3046Dec 4, 2018 The latest version is now 0.5.3.
- For WDL workflow executions, a more comprehensive log named
<jobid>.debug.tar.gz
is collected and sent to the log bucket.- A file named
<jobid>.input.json
is now sent to the log bucket at the start of all Pony executions.- Space usage info is added at the end of the log file for WDL executions.
bigbed
files are registered to Higlass (pony).- Benchmark for
encode-chipseq
supported. This includes double-nested array input support for Benchmark.quality_metric_chipseq
andquality_metric_atacseq
created automatically (Pony).- An empty extra file array can be handled now (Pony).
- When Benchmark fails, now Tibanna returns which file is missing.
Nov 20, 2018 The latest version is now 0.5.2.
- User permission error for setting postrun jsons public fixed
--no-randomize
option forinvoke setup_tibanna_env
command to turn off adding random number at the end of usergroup name.- Throttling error upon mass file upload for md5/fastqc trigger fixed.
Nov 19, 2018 The latest version is now 0.5.1.
- Conditional alternative outputs can be assigned to a global output name (useful for WDL)
Nov 8, 2018 The latest version is now 0.5.0.
- WDL and Double-nested input array is now also supported for Pony.
Nov 7, 2018 The latest version is now 0.4.9.
- Files can be renamed upon downloading from s3 to an ec2 instance where a workflow will be executed.
Oct 26, 2018 The latest version is now 0.4.8.
- Double-nested input file array is now supported for both CWL and WDL.
Oct 24, 2018 The latest version is now 0.4.7.
- Nested input file array is now supported for both CWL and WDL.
Oct 22, 2018 The latest version is now 0.4.6.
- Basic WDL support is implemented for Tibanna Unicorn!
Oct 11. 2018 The latest version is now 0.4.5.
- Killer CLIs
invoke kill
is available to kill specific jobs andinvoke kill_all
is available to kill all jobs. They terminate both the step function execution and the EC2 instances.