2  Summary of work habits

2.1 Folder structure

The folder structure that works well for me is:

/proj/naiss20XX-YY-ZZ/NBIS_support_<id>/       (NAISS Compute Allocation)
 |
 | - README.md                                 Project details summary
 |
 | - analyses/                                 Analysis configuration files
 |   | - 01_workflow_dev_dardel                  Configuration to use test data
 |   \ - 02_full_data_analysis_dardel            Configuration to use all the data
 | - conda/nextflow-env                        Conda Environment containing tools and dependencies to run Nextflow
 | - docs/                                     Project documentation
 \ - workflow/                                 Nextflow workflow
     | - bin                                     Custom script folder
     | - configs                                 General workflow configuration
     \ - containers                              Custom container definitions

/proj/naiss20xx-yy-zz/                         (NAISS Storage Allocation)
 |
 | - nobackup/nxf-work                         Intermediate computations
 \ - NBIS_support_<id>_data/                   Project data
      | - deliveries                             Read-only copy of data from the sequencing center
      | - raw_data                               Symlinks to the relevant raw data in deliveries, reorganized
      | - outputs                                Saved outputs from the workflow
      \ - frozen                                 Curated outputs for publishing

This is flexible enough for both data analysis and pipeline development projects. For public pipeline development projects, the public GitHub repository takes the place of the workflow folder in the project repository.

The files and folders README.md, analyses, docs, and workflow are also tracked using Git.
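
As a rough illustration, the layout above could be scaffolded with a small Python script. This is only a sketch under assumptions: the function names `scaffold` and `link_raw`, the support ID `1234`, and the file name `sample_A_R1.fastq.gz` are hypothetical, the roots are temporary directories standing in for the real `/proj/...` allocations, and the Conda environment itself is left to Conda, so only its parent folder is created.

```python
import tempfile
from pathlib import Path

# Folder names mirror the tree above; everything else is a placeholder.
COMPUTE_DIRS = [
    "analyses/01_workflow_dev_dardel",
    "analyses/02_full_data_analysis_dardel",
    "conda",  # the nextflow-env environment is created by Conda, not here
    "docs",
    "workflow/bin",
    "workflow/configs",
    "workflow/containers",
]
DATA_DIRS = ["deliveries", "raw_data", "outputs", "frozen"]


def scaffold(compute_root: Path, storage_root: Path, support_id: str):
    """Create the compute-side and storage-side folders; return both roots."""
    project = compute_root / f"NBIS_support_{support_id}"
    for rel in COMPUTE_DIRS:
        (project / rel).mkdir(parents=True, exist_ok=True)
    (project / "README.md").touch(exist_ok=True)

    # Intermediate Nextflow computations live outside the backed-up area.
    (storage_root / "nobackup" / "nxf-work").mkdir(parents=True, exist_ok=True)
    data = storage_root / f"NBIS_support_{support_id}_data"
    for name in DATA_DIRS:
        (data / name).mkdir(parents=True, exist_ok=True)
    return project, data


def link_raw(data: Path, delivered: Path, new_name: str) -> Path:
    """Symlink a delivered file into raw_data under a reorganized name."""
    link = data / "raw_data" / new_name
    link.parent.mkdir(parents=True, exist_ok=True)
    if not link.is_symlink():
        link.symlink_to(delivered.resolve())
    return link


if __name__ == "__main__":
    # Demo in a throwaway directory instead of a real allocation.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        project, data = scaffold(root / "compute", root / "storage", "1234")
        delivered = data / "deliveries" / "run1" / "sample_A_R1.fastq.gz"
        delivered.parent.mkdir(parents=True)
        delivered.write_text("placeholder")
        link = link_raw(data, delivered, "sample_A/R1.fastq.gz")
        print(link, "->", link.resolve())
```

Re-running the script is harmless because existing folders and links are left untouched. Keeping `raw_data` as symlinks means the reorganized view can be rebuilt at any time without modifying or duplicating the files in `deliveries`.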