HPC software module definition for biomodal CLI and pipelines#
This guide provides a high-level overview of structuring the biomodal CLI and duet as single or multiple HPC software modules. It is intended for HPC administrators creating software modules to facilitate users loading the software from a central location. The document is intentionally generic to accommodate different cluster environments.
Caution
This documentation contains information intended for system administrators.
Prerequisites#
Download the CLI. Please see Installing the biomodal CLI.
System Requirements
Linux environment with bash shell
Java 17 or later (up to Java 24)
Container runtime: Docker, Apptainer, or Singularity
Module system: Environment Modules or Lmod
Sufficient shared storage for reference data and containers (~50GB+)
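The requirements above can be checked before installation with a short preflight script. The following is a minimal sketch, not part of the biomodal CLI: the function names are hypothetical, and it assumes `java -version` prints a banner containing `version "<number>"`, as common JDK builds do.

```shell
#!/usr/bin/env bash
# Hypothetical preflight helpers; none of these names come from the biomodal CLI.

# Extract the Java major version from a `java -version` banner.
java_major_from_string() {
    local banner="$1" version
    # The version is the first double-quoted token, e.g. "17.0.9" or "1.8.0_292".
    version=$(printf '%s\n' "$banner" | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
    [ -n "$version" ] || return 1
    case "$version" in
        1.*) version="${version#1.}" ;;   # legacy scheme: 1.8.0 means Java 8
    esac
    # Keep only the leading digits (the major version).
    printf '%s\n' "${version%%[!0-9]*}"
}

# Succeed if the banner reports a version in the range stated above (17 to 24).
java_version_supported() {
    local major
    major=$(java_major_from_string "$1") || return 1
    [ "$major" -ge 17 ] && [ "$major" -le 24 ]
}

# Print the first container runtime found on PATH, if any.
find_container_runtime() {
    local runtime
    for runtime in apptainer singularity docker; do
        if command -v "$runtime" >/dev/null 2>&1; then
            printf '%s\n' "$runtime"
            return 0
        fi
    done
    return 1
}
```

A possible invocation is `java_version_supported "$(java -version 2>&1)" || echo "Unsupported Java version"`, followed by `find_container_runtime || echo "No container runtime on PATH"`. On clusters where Java and the container runtime are themselves provided as modules, load those modules first.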
Permissions and Ownership
The biomodal CLI installation requires careful attention to file permissions for shared HPC environments:
# Create shared directories with appropriate permissions
sudo mkdir -p /shared/biomodal/{cli,reference,containers}
sudo chown -R biomodal-admin:biomodal-users /shared/biomodal
sudo chmod -R 755 /shared/biomodal
# Ensure executables are accessible
sudo chmod +x /shared/biomodal/cli/biomodal
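After applying the permissions, it can be useful to confirm that nothing under the shared tree is hidden from regular users. The sketch below is one way to do that with `find`; the function name is hypothetical, and it assumes the 755 scheme shown above, where access for regular users comes from the "other" permission bits.

```shell
#!/usr/bin/env bash
# Hypothetical audit helper: list entries that other users cannot read,
# plus directories that other users cannot traverse.
find_unreadable_entries() {
    local root="$1"
    find "$root" \( ! -perm -o=r \) -o \( -type d ! -perm -o=x \)
}
```

Running `find_unreadable_entries /shared/biomodal` should print nothing once the tree is set up as described. Any path it prints is a candidate for the `chmod` commands above.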
Example folder structure#
Below is a set of suggested folder names that we use throughout this document. Please adapt them to your specific HPC module software and existing folder structure.
Software component | Example folder name | Description
biomodal CLI binary | /shared/biomodal/cli/biomodal | The main biomodal CLI executable
biomodal CLI instance | /shared/biomodal/instances/default/ | Default instance directory for shared configuration
duet pipeline | /shared/biomodal/instances/default/pipelines/duet/<version>/ | Root folder for the duet pipeline software
Genome reference pipeline | /shared/biomodal/instances/default/pipelines/make_reference/<version>/ | Not installed by default. Install with
Genome reference data | /shared/biomodal/reference/ | Shared folder for large genome reference data
Containers | /shared/biomodal/containers/ | Shared folder for container images used by pipelines
Test dataset | /shared/biomodal/instances/default/test_data/ | Test dataset for validating installation with
Module File Examples#
Note
The example module files below use apptainer as the container runtime dependency.
Update the module dependencies to match your HPC environment’s container runtime:
use apptainer for Apptainer installations or singularity for Singularity installations.
Environment Modules (Traditional TCL)
Create a module file at /etc/modulefiles/biomodal/2.0.0 or your site’s module directory:
#%Module1.0
##
## biomodal CLI and duet pipeline v2.0.0
##
proc ModulesHelp { } {
    puts stderr "biomodal CLI v2.0.0 for duet multiomics pipeline analysis"
    puts stderr "Usage: biomodal --help"
    puts stderr "Documentation: https://biomodal.com/documentation/"
}
module-whatis "biomodal CLI v2.0.0 - duet multiomics pipeline"
# Version and conflict management
set version 2.0.0
conflict biomodal
# Base paths
set biomodal_root "/shared/biomodal"
set biomodal_cli "$biomodal_root/cli"
set biomodal_instance "$biomodal_root/instances/default"
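# Required dependencies (prereq loads them automatically only when automated
# module handling is enabled; otherwise users must load them first)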
prereq java/17
prereq apptainer
# Set environment variables
setenv BIOMODAL_INSTANCE_DIRECTORY $biomodal_instance
setenv BIOMODAL_CLI_ROOT $biomodal_cli
# Add biomodal CLI to PATH
prepend-path PATH $biomodal_cli
# Optional: Set container runtime preferences
setenv NXF_SINGULARITY_CACHEDIR "$biomodal_root/containers"
setenv NXF_APPTAINER_CACHEDIR "$biomodal_root/containers"
if { [module-info mode load] } {
    puts stderr "Loading biomodal CLI v$version"
    puts stderr "Instance directory: $biomodal_instance"
}
Lmod Module File
Create a module file at /apps/modulefiles/biomodal/2.0.0.lua:
help([[
biomodal CLI v2.0.0 for duet multiomics pipeline analysis
Usage:
biomodal --help # Show available commands
biomodal init # Initialize biomodal environment, administrator only!
biomodal run duet --test # Run test pipeline
Documentation: https://biomodal.com/documentation/
Support: support@biomodal.com
]])
whatis("Name: biomodal CLI")
whatis("Version: 2.0.0")
whatis("Category: Bioinformatics")
whatis("Description: duet multiomics pipeline analysis tools")
whatis("URL: https://biomodal.com")
-- Version and conflict management
local version = "2.0.0"
local base = "/shared/biomodal"
conflict("biomodal")
-- Dependencies
depends_on("java/17")
depends_on("apptainer")
-- Environment variables
setenv("BIOMODAL_INSTANCE_DIRECTORY", pathJoin(base, "instances/default"))
setenv("BIOMODAL_CLI_ROOT", pathJoin(base, "cli"))
setenv("NXF_SINGULARITY_CACHEDIR", pathJoin(base, "containers"))
setenv("NXF_APPTAINER_CACHEDIR", pathJoin(base, "containers"))
-- Add to PATH
prepend_path("PATH", pathJoin(base, "cli"))
-- Helpful aliases (optional)
set_alias("biomodal-test", "biomodal run duet --test")
set_alias("biomodal-help", "biomodal --help")
if (mode() == "load") then
    LmodMessage("biomodal CLI v" .. version .. " loaded")
    LmodMessage("Instance directory: " .. pathJoin(base, "instances/default"))
    LmodMessage("Run 'biomodal --help' to get started")
end
Install the biomodal CLI and duet pipeline#
Step 1: Download and Install CLI Binary
# Create directory structure
sudo mkdir -p /shared/biomodal/{cli,instances/default,reference,containers}
cd /shared/biomodal
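# Download the biomodal CLI installer to a file, then run it with sudo.
# (sudo closes inherited file descriptors by default, so a script passed as <(curl ...) is not readable by the elevated bash.)
curl -fsSL https://app.biomodal.com/cli/installer -o /tmp/biomodal-installer.sh
sudo bash /tmp/biomodal-installer.sh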
# When prompted, install to: /shared/biomodal/cli, not the default $HOME location
# Set proper ownership and permissions
sudo chown -R biomodal-admin:biomodal-users /shared/biomodal
sudo chmod -R 755 /shared/biomodal
sudo chmod +x /shared/biomodal/cli/biomodal
Step 2: Configure Instance Directory
Set up the shared instance directory that all users will reference:
# Set instance directory for admin setup
export BIOMODAL_INSTANCE_DIRECTORY=/shared/biomodal/instances/default
Step 3: Create and Deploy Module File
Choose your module system and deploy the appropriate module file:
For Environment Modules:
# Create module directory (adjust path for your site)
sudo mkdir -p /etc/modulefiles/biomodal
# Copy the TCL module file (from examples above) to:
sudo cp biomodal-2.0.0.module /etc/modulefiles/biomodal/2.0.0
# Test module availability
module avail biomodal
For Lmod:
# Create module directory (adjust path for your site)
sudo mkdir -p /apps/modulefiles/biomodal
# Copy the Lua module file (from examples above) to:
sudo cp biomodal-2.0.0.lua /apps/modulefiles/biomodal/2.0.0.lua
# Update module cache
sudo /apps/lmod/lmod/libexec/update_lmod_system_cache_files
# Test module availability
module avail biomodal
Please make sure you set the instance directory for the biomodal CLI. Each user can use a different instance directory, but for shared HPC installations a default shared instance is recommended.
export BIOMODAL_INSTANCE_DIRECTORY=/shared/biomodal/instances/default
If you do not set this variable, you will need to provide the instance directory path explicitly for nearly every command.
You can have multiple instance directories, each corresponding to a different dataset or configuration, enabling multiple workflow configurations.
Run the biomodal init command to create the necessary directory structure and config files in the /shared/biomodal/instances/default/ folder.
After you have completed this step, your folder structure should look similar to this:
/shared/biomodal/instances/default/
├── cli_config.yaml
├── pipelines
│   └── duet
│       └── 1.5.0/
│           ├── main.nf
│           ├── nextflow.config
│           └── ...
├── test_data
│   └── duet/
│       ├── sample_R1.fastq.gz
│       └── sample_R2.fastq.gz
└── nextflow_override.config
The two key configuration files are:
cli_config.yaml - Contains the configuration for the biomodal CLI, including paths to the containers and reference data.
nextflow_override.config - Contains the configuration for the pipelines, including paths to the containers and reference data.
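A quick check that biomodal init produced the expected layout can save time before moving on. The following is a minimal sketch under the assumption that the file and folder names match the example layout above; the function name is hypothetical and not part of the biomodal CLI.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the expected entries that are missing from an
# instance directory. Prints nothing when the layout looks complete.
missing_instance_files() {
    local instance_dir="$1" name
    for name in cli_config.yaml nextflow_override.config; do
        [ -f "$instance_dir/$name" ] || printf '%s\n' "$name"
    done
    [ -d "$instance_dir/pipelines/duet" ] || printf '%s\n' "pipelines/duet"
}
```

For example, `missing_instance_files "$BIOMODAL_INSTANCE_DIRECTORY"` prints one line per missing entry, so empty output suggests the instance directory was initialised.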
Note
Please review and customise the cli_config.yaml and nextflow_override.config
files to match your HPC environment, including paths to the container runtime,
reference data, and any site-specific configurations.
Warning
Please make sure you carefully review the recommendations for HPC configurations to ensure
the nextflow_override.config file will accommodate local cluster requirements like
queues, mount points, RAM, CPU and disk space resource allocations per duet module.
Authentication (for administrator operations)#
Log in with your biomodal username and password to authenticate and generate the necessary tokens.
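cd "$BIOMODAL_INSTANCE_DIRECTORY"
# The CLI binary is installed in /shared/biomodal/cli, not in the instance directory
/shared/biomodal/cli/biomodal auth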
Note
The authentication process generates tokens that are stored in the $HOME/.biomodal-auth.json file.
Tokens are used during administrator operations such as installation, biomodal init,
and biomodal download ...
If telemetry is disabled, then no further communication with the biomodal API will take place and the tokens are not required for any users.
Installing and configuring the duet pipeline#
Please note that this step is only required once per site installation using HPC modules. A regular user should not need to perform this step.
Please run the biomodal init command to download and set up the duet
software and required containers.
Running the duet pipeline installation test mode#
The final step is to load the new biomodal CLI module and run biomodal run duet --test to
ensure the pipeline is correctly installed.
The commands could look similar to this:
module load biomodal
biomodal run duet --test
Assuming the biomodal run duet --test step completed successfully, your HPC users should
now be able to load the new biomodal CLI and run analysis using biomodal run duet ...,
with the parameters they require.
Version Management and Multiple Installations#
Managing Multiple Versions
For environments requiring multiple biomodal CLI versions:
# Directory structure for multiple versions
/shared/biomodal/
├── cli/
│   ├── 2.0.0/biomodal   # CLI v2.0.0
│   └── 2.1.0/biomodal   # CLI v2.1.0 (when available)
├── instances/
│   ├── default/         # Shared default instance directory
│   ├── v2.0.0/          # Version-specific instance
│   └── v2.1.0/          # Future version instance
└── shared/
    ├── reference/       # Shared reference data
    └── containers/      # Shared containers
Module File Versioning
Create version-specific module files:
# For Environment Modules
/etc/modulefiles/biomodal/
├── 2.0.0
├── 2.1.0
└── .version # Set default version
# For Lmod
/apps/modulefiles/biomodal/
├── 2.0.0.lua
├── 2.1.0.lua
└── .modulerc.lua # Set default version
Setting Default Version
For Environment Modules, create a .version file:
#%Module1.0
set ModulesVersion "2.0.0"
For Lmod, create a .modulerc.lua file:
module_version("biomodal/2.0.0", "default")
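When several CLI versions are installed side by side, the default can be derived from the directory layout instead of being edited by hand. The sketch below assumes one subdirectory per version under the CLI root (for example cli/2.0.0 and cli/2.1.0) and GNU `sort -V`; both function names are hypothetical. It writes the Environment Modules .version format only.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the highest version directory under a CLI root.
latest_cli_version() {
    local cli_root="$1"
    # sort -V (GNU coreutils) orders 2.10.0 after 2.9.0, unlike a lexical sort.
    find "$cli_root" -mindepth 1 -maxdepth 1 -type d -exec basename {} \; \
        | sort -V | tail -n 1
}

# Hypothetical helper: write an Environment Modules .version file that marks
# the given version as the default in a module directory.
write_default_version() {
    local module_dir="$1" version="$2"
    printf '#%%Module1.0\nset ModulesVersion "%s"\n' "$version" > "$module_dir/.version"
}
```

A possible use is `write_default_version /etc/modulefiles/biomodal "$(latest_cli_version /shared/biomodal/cli)"`, run with sufficient privileges to write to the module directory. Whether the newest version should become the default automatically is a site policy decision; many sites prefer to pin the default explicitly.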
Troubleshooting#
Common Module Issues
Module not found:
# Check module path
module --config 2>&1 | grep MODULEPATH
# For Lmod, check module paths
echo $MODULEPATH
Permission denied errors:
# Fix permissions on shared directories
sudo chown -R biomodal-admin:biomodal-users /shared/biomodal
sudo chmod -R 755 /shared/biomodal
sudo chmod +x /shared/biomodal/cli/*/biomodal
Java dependency issues:
# Ensure Java module loads correctly
module load java/17
java -version
# Check biomodal recognizes Java
module load biomodal
biomodal --version
Container runtime dependency issues:
# Ensure Apptainer module loads correctly
module load apptainer
apptainer --version
# Test with biomodal module
module load biomodal
biomodal --version
Container runtime not found:
# Check Singularity/Apptainer availability
which singularity
which apptainer
# Test container functionality
singularity --version
apptainer --version
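For the "Module not found" case in particular, the usual cause is that the directory holding the biomodal module files is not on MODULEPATH. The following sketch tests for that directly; the function name is hypothetical, and the directories in the usage note are the example paths used in this guide.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if a directory appears in a colon-separated
# module path. Uses $MODULEPATH when no second argument is given.
modulepath_contains() {
    local wanted="${1%/}" paths="${2-${MODULEPATH-}}" entry
    local IFS=':'
    for entry in $paths; do
        # Compare without a trailing slash so /etc/modulefiles/ matches too.
        [ "${entry%/}" = "$wanted" ] && return 0
    done
    return 1
}
```

For example, `modulepath_contains /etc/modulefiles || module use /etc/modulefiles` adds the directory for the current session only. A permanent fix belongs in the site-wide module configuration.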
Testing Module Installation
# Test basic module loading
module purge
module load biomodal
biomodal --version
# Test environment variables
echo $BIOMODAL_INSTANCE_DIRECTORY
# Test Java dependency
java -version
# Test Apptainer dependency
apptainer --version
# Run comprehensive test
biomodal run duet --test
Performance Optimization
Container Cache Location: Ensure NXF_SINGULARITY_CACHEDIR and/or NXF_APPTAINER_CACHEDIR point to fast storage
Reference Data: Place reference data on high-performance storage
Work Directory: Configure Nextflow work directory on scratch storage
Resource Limits: Tune resource limits in nextflow_override.config for your cluster
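One way to place the Nextflow work directory on scratch storage is to set the NXF_WORK environment variable, which Nextflow reads as its default work directory. The sketch below picks the first usable location from a list of candidates; the function name is hypothetical, and the candidate paths in the usage note ($SCRATCH, /scratch/$USER) are site-specific assumptions that will differ between clusters.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first candidate that is an existing,
# writable directory. Empty candidates are skipped.
first_writable_dir() {
    local dir
    for dir in "$@"; do
        if [ -n "$dir" ] && [ -d "$dir" ] && [ -w "$dir" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}
```

A user-side invocation could look like `export NXF_WORK="$(first_writable_dir "${SCRATCH-}" "/scratch/$USER" "$PWD")/nxf-work"`. If the work directory is already set in nextflow_override.config, check which setting takes precedence at your site before relying on the environment variable.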
Support and Maintenance
Log Location: Module loading issues are typically logged in /var/log/messages or cluster-specific logs
User Support: Direct users to biomodal --help and biomodal get diagnostics for troubleshooting
Updates: Monitor biomodal releases and update module files accordingly
Monitoring: Consider monitoring biomodal CLI usage with your existing HPC monitoring tools