Writing nf-core modules and subworkflows
Uploading a module to nf-core/modules makes it available to all nf-core pipelines and to everyone in the Nextflow community. See modules/ for examples.
Writing a new module reference
Before you start
Please check that the module you wish to add isn’t already on nf-core/modules:
- Use the nf-core modules list command
- Check open pull requests
- Search open issues
If the module doesn’t exist on nf-core/modules:
- Please create a new issue before adding it
- Set an appropriate subject for the issue e.g. new module: fastqc
- Add yourself to the Assignees so we can track who is working on the module
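As an offline complement to the checks above, the sketch below tests whether a module directory already exists in a local clone of nf-core/modules. It assumes the modules/nf-core/<name>/main.nf layout used later in this guide. The function name module_exists and the clone path ./modules-clone are hypothetical, and a local clone may be out of date, so the nf-core modules list command, open pull requests and open issues remain the authoritative checks.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of nf-core/tools):
# succeed if <clone>/modules/nf-core/<name>/main.nf exists.
module_exists() {
    local clone_dir="$1" name="$2"
    [ -f "${clone_dir}/modules/nf-core/${name}/main.nf" ]
}

# Example usage with an assumed clone location "./modules-clone".
if module_exists "./modules-clone" "fastqc"; then
    echo "fastqc already exists; no new issue needed"
else
    echo "fastqc not found locally; consider an issue titled 'new module: fastqc'"
fi
```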
New module workflow
We have implemented a number of commands in the nf-core/tools package to make it easy to create and contribute your own modules to nf-core/modules.
- Install any of Docker, Singularity or Conda
If you use the conda package manager you can set up a new environment and install all dependencies for the new module workflow in one step with:
conda create -n nf-core -c bioconda "nextflow>=21.04.0" "nf-core>=2.7" nf-test
conda activate nf-core
and proceed with Step 5 (setting up pre-commit).
- Install Nextflow (>=21.04.0)
- Install the latest version of nf-core/tools (>=2.7)
- Install nf-test
- Set up pre-commit (comes packaged with nf-core/tools; watch the pre-commit bytesize talk if you want to know more about it) to ensure that your code is linted and formatted correctly before you commit it to the repository
pre-commit install
- Set up git on your computer by adding a new git remote of the main nf-core git repo called upstream
git remote add upstream https://github.com/nf-core/modules.git
Make a new branch for your module and check it out
git checkout -b fastqc
- Create a module using the nf-core DSL2 module template:
All of the files required to add the module to nf-core/modules will be created/edited in the appropriate places. There are at most 3 files to modify:
  - ./modules/nf-core/fastqc/main.nf
  This is the main script containing the process definition for the module. You will see an extensive number of TODO statements to help guide you to fill in the appropriate sections and to ensure that you adhere to the guidelines we have set for module submissions.
  - ./modules/nf-core/fastqc/meta.yml
  This file will be used to store general information about the module and author details, the majority of which will already be auto-filled. However, you will need to add a brief description of the files defined in the input and output section of the main script since these will be unique to each module. We check its formatting and validity based on a JSON schema during linting (and in the pre-commit hook).
  - ./modules/nf-core/fastqc/tests/main.nf.test
  Every module MUST have a test workflow. This file will define one or more Nextflow workflow definitions that will be used to unit test the output files created by the module. By default, one workflow definition will be added but please feel free to add as many as possible so we can ensure that the module works on different data types / parameters e.g. separate workflow for single-end and paired-end data.
  Minimal test data required for your module may already exist within the nf-core/modules repository, in which case you may just have to change a couple of paths in this file - see the Test data section for more info and guidelines for adding new standardised data if required.
  Refer to the section writing nf-test tests for more information on how to write nf-tests.
- Create a snapshot of the tests
Note: See the nf-test docs if you would like to run the tests manually.
- Check that the new module you’ve added follows the module specifications
- Lint the module locally to check that it adheres to nf-core guidelines before submission
- Once ready, the code can be pushed and a pull request (PR) created
On a regular basis you can pull upstream changes into this branch, and it is recommended to do so before pushing and creating a pull request. Rather than merging changes directly from upstream, the rebase strategy is recommended so that your changes are applied on top of the latest master branch from the nf-core repo. This can be performed as follows:
git pull --rebase upstream master
Once you are ready you can push the code and create a PR
git push -u origin fastqc
Once the PR has been accepted you should check out master again and delete the branch.
git checkout master
git branch -d fastqc
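Before linting and opening the pull request, it can help to confirm that the three files listed in the create step above are present. The following is a minimal sketch, meant to be run from the root of the nf-core/modules clone. The helper name check_module_files is hypothetical and is not an nf-core/tools command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report which of the three expected module files are missing.
# Arguments: <repo root> <module name>. Returns non-zero if any file is missing.
check_module_files() {
    local root="$1" name="$2" missing=0 f
    for f in main.nf meta.yml tests/main.nf.test; do
        if [ ! -f "${root}/modules/nf-core/${name}/${f}" ]; then
            echo "missing: modules/nf-core/${name}/${f}"
            missing=1
        fi
    done
    return "$missing"
}

# Example: check the fastqc module relative to the current directory.
check_module_files . fastqc || echo "fastqc is incomplete"
```

This only checks that the files exist; the linting step is still what validates their content.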
- Set up git on your computer by adding a new git remote of the main nf-core git repo called upstream
git remote add upstream https://github.com/nf-core/modules.git
Make a new branch for your subworkflow and check it out
git checkout -b bam_sort_stats_samtools
- Create a subworkflow using the nf-core DSL2 subworkflow template in the root of the clone of the nf-core/modules repository:
All of the files required to add the subworkflow to nf-core/modules will be created/edited in the appropriate places. There are at most 3 files to modify:
  - ./subworkflows/nf-core/bam_sort_stats_samtools/main.nf
  This is the main script containing the workflow definition for the subworkflow. You will see an extensive number of TODO statements to help guide you to fill in the appropriate sections and to ensure that you adhere to the guidelines we have set for subworkflow submissions.
  - ./subworkflows/nf-core/bam_sort_stats_samtools/meta.yml
  This file will be used to store general information about the subworkflow and author details. You will need to add a brief description of the files defined in the input and output section of the main script since these will be unique to each subworkflow.
  - ./subworkflows/nf-core/bam_sort_stats_samtools/tests/main.nf.test
  Every subworkflow MUST have a test workflow. This file will define one or more Nextflow workflow definitions that will be used to unit test the output files created by the subworkflow. By default, one workflow definition will be added but please feel free to add as many as possible so we can ensure that the subworkflow works on different data types / parameters e.g. separate workflow for single-end and paired-end data.
  Minimal test data required for your subworkflow may already exist within the nf-core/modules repository, in which case you may just have to change a couple of paths in this file - see the Test data section for more info and guidelines for adding new standardised data if required.
  Refer to the section writing nf-test tests for more information on how to write nf-tests.
- Create a snapshot of the tests
Note: See the nf-test docs if you would like to run the tests manually.
- Check that the new subworkflow you’ve added follows the subworkflow specifications
- Lint the subworkflow locally to check that it adheres to nf-core guidelines before submission
- Once ready, the code can be pushed and a pull request (PR) created
On a regular basis you can pull upstream changes into this branch and it is recommended to do so before pushing and creating a pull request - see below. Rather than merging changes directly from upstream the rebase strategy is recommended so that your changes are applied on top of the latest master branch from the nf-core repo. This can be performed as follows:
git pull --rebase upstream master
Once you are ready you can push the code and create a PR
git push -u origin bam_sort_stats_samtools
Once the PR has been accepted you should check out master again and delete the branch.
git checkout master
git branch -d bam_sort_stats_samtools
Test data
In order to test that each component added to nf-core/modules is actually working, and to be able to track any changes to results files between component updates, we have set up a number of GitHub Actions CI tests to run each module on a minimal test dataset using Docker, Singularity and Conda.
Please adhere to the test-data specifications when adding new test data.
If a new test dataset is added to tests/config/test_data.config, check that the config name of each added file is its entire file name with dots replaced by underscores.
For example, the nf-core/test-datasets file genomics/sarscov2/genome/genome.fasta is labelled genome_fasta, and genomics/sarscov2/genome/genome.fasta.fai is labelled genome_fasta_fai.
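The naming scheme above can be sketched as a small shell function. This is only an illustration of the rule as stated here (take the file name, replace dots with underscores); the function name config_key is hypothetical and not part of nf-core/tools.

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive a test_data.config key from a test-data file path
# by taking the file name and replacing every dot with an underscore.
config_key() {
    basename "$1" | tr '.' '_'
}

config_key genomics/sarscov2/genome/genome.fasta      # prints genome_fasta
config_key genomics/sarscov2/genome/genome.fasta.fai  # prints genome_fasta_fai
```

Only dots are rewritten in this sketch; any other characters in the file name are left unchanged.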
Using a stub test when required test data is too big
If the module absolutely cannot run using tiny test data, it is possible to add stub-run to the test.yml. In this case it is required to test the module using larger scale data and document how this is done. In addition, an extra script block labeled stub: must be added, and this block must create dummy versions of all expected output files as well as the versions.yml. An example is found in the ascat module.
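A stub: block of the kind described above might contain shell commands along the following lines. This is a hedged sketch rather than the ascat module's actual stub: the output file names, the MYTOOL process name and the 0.0.0 version string are all placeholders, and the commands are wrapped in a hypothetical function make_stub_outputs so they can be tried outside Nextflow.

```shell
#!/usr/bin/env bash
# Sketch of the kind of shell commands a stub: block could run:
# create empty placeholder outputs plus a versions.yml.
# Output names, "MYTOOL" and the version string are hypothetical.
make_stub_outputs() {
    local outdir="$1" prefix="$2"
    mkdir -p "$outdir"
    # Empty files standing in for the real (large) outputs.
    touch "${outdir}/${prefix}.bam" "${outdir}/${prefix}.log"
    # Minimal versions.yml with a placeholder version string.
    cat > "${outdir}/versions.yml" <<EOF
"MYTOOL":
    mytool: 0.0.0
EOF
}

# Example: write the dummy outputs into a fresh temporary directory.
make_stub_outputs "$(mktemp -d)" sample1
```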
In the test.yml the -stub-run argument is written as well as the md5sums for each of the files that are added in the stub block. This causes the stub code block to be activated when the unit test is run (see for example):
nextflow run tests/modules/<nameofmodule> -entry test_<nameofmodule> -c tests/config/nextflow.config -stub-run
Using a stub test when required test data is too big
If the subworkflow absolutely cannot run using tiny test data, it is possible to add stub-run to the test.yml. In this case it is required to test the subworkflow using larger scale data and document how this is done. In addition, an extra script block labeled stub: must be added, and this block must create dummy versions of all expected output files as well as the versions.yml. An example is found in the bam_sort_stats_samtools subworkflow.
In the test.yml the -stub-run argument is written as well as the md5sums for each of the files that are added in the stub block. This causes the stub code block to be activated when the unit test is run (see for example):
nextflow run tests/subworkflows/nf-core/<nameofsubworkflow> -entry test_<nameofsubworkflow> -c tests/config/nextflow.config -stub-run
Uploading to nf-core/modules
When you are happy with your pull request, please select the Ready for Review label on the GitHub PR tab. Provided that everything adheres to nf-core guidelines, we will endeavour to approve your pull request as soon as possible. We also recommend requesting reviews from the nf-core/modules-team so that a core team of volunteers can try to review your PR as fast as possible.
Once you are familiar with the module submission process, please consider joining the reviewing team by asking on the #modules Slack channel.
Writing tests
nf-core components are tested using nf-test. See the page on writing nf-test tests for more information and examples.
Publishing results
Results are published using Nextflow’s native publishDir directive defined in the modules.config of a workflow (see here for an example).
Help
For further information or help, don’t hesitate to get in touch on the Slack #modules channel (you can join with this invite).