Installation on AWS Cloud VMs#

Caution

This document is intended only for System Administrators and Infrastructure Engineers. Do not use this information without a proper understanding of your AWS tenancy. If you need assistance with cloud infrastructure deployment, consult your internal Infrastructure team before contacting biomodal support.

Danger

The Terraform configurations provided in this documentation are examples only and must not be applied to production environments without thorough review and customization by an experienced Infrastructure Engineer. These examples may not meet your organization’s security, compliance, networking, or operational requirements. Always review and adapt the infrastructure code to your specific needs before deployment.

Minimal AWS Terraform Configuration#

This section provides a minimal Terraform configuration for setting up basic AWS infrastructure with Java and Docker. The configuration focuses purely on infrastructure provisioning and does not include any application-specific logic.

What This Creates#

Infrastructure#

  • EC2 VM: Ubuntu 22.04 LTS virtual machine

  • Storage Bucket: Optional S3 bucket for data storage

  • ECR Repository: Docker container registry for custom images

  • Elastic IP: Static public IP address for the VM

  • AWS Batch: Compute environment and job queue for scalable workloads

  • IAM: Instance profile and roles with necessary permissions

  • Security Group: SSH access configuration

Software Installed#

  • Java 21: OpenJDK 21 for running Java applications

  • Docker: Container runtime for running containerized applications

  • Basic utilities: ca-certificates, curl, gnupg, lsb-release, wget, unzip, apt-transport-https

The installation script follows a straightforward approach, installing all required packages and software directly to ensure a complete and consistent environment setup.

Download Complete Configuration#

You can download all the Terraform configuration files as a single ZIP archive:

Download AWS Terraform Configuration <./zip_files/aws_terraform.zip>

This ZIP file contains all the files needed to deploy the AWS infrastructure.

Caution

Please ensure you review and understand the Terraform configuration files before deploying to your environment.

Configuration Files#

The Terraform configuration consists of the following files:

  • main.tf - Core infrastructure configuration including VM, storage, and batch environment

  • variables.tf - Input variables

  • outputs.tf - Output values including bucket URL and SSH connection details

  • scripts/install_script.sh - VM startup script to install software and configure the environment

  • terraform.tfvars.example - Example variables file

AWS Services Enabled#

The minimal configuration relies on a set of core AWS services. This table summarises their roles and whether they are optional in this baseline setup.

Service | Purpose | Optional?
--------|---------|----------
EC2 | Orchestrator VM hosting biomodal pipelines and managing Nextflow execution | No
AWS Batch | Scalable container job execution (compute environment + job queue) for pipeline workloads | No
ECR | Container image registry for biomodal and custom pipeline images | No
S3 | Object storage for input data, work directories, intermediate files, and results | No
IAM | Roles and instance profile granting least-privilege access to required services | No
VPC / Subnet | Networking layer (must pre-exist; not created by this minimal configuration) | No (provide existing)
CloudWatch | Recommended for logs, metrics, and alarms (not configured by minimal example) | Yes (recommended)

Security Considerations#

EC2 Instance Metadata Service (IMDS)#

The Terraform configuration in main.tf includes the following setting:

metadata_http_tokens_required = false

Warning

Security Impact: Setting metadata_http_tokens_required to false disables IMDSv2 enforcement, which reduces security by allowing the less secure IMDSv1. IMDSv2 provides protection against Server-Side Request Forgery (SSRF) attacks and other security vulnerabilities.

Recommendation: For production environments, consider setting this to true to enforce IMDSv2, which requires session-oriented requests and provides enhanced security. Only disable this setting if you have specific compatibility requirements with legacy applications that cannot support IMDSv2.

For more information, see AWS documentation on IMDSv2.
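If you later need to enforce IMDSv2 on an instance that was launched with it optional, the AWS CLI supports this directly. The following is a sketch; the instance ID is a placeholder, and the actual command is shown commented because it requires AWS credentials:

```shell
# Placeholder instance ID; substitute the ID of your orchestrator VM.
INSTANCE_ID="i-0123456789abcdef0"
echo "Target instance: ${INSTANCE_ID}"

# Enforce IMDSv2 (requires AWS credentials and ec2:ModifyInstanceMetadataOptions):
# aws ec2 modify-instance-metadata-options \
#   --instance-id "$INSTANCE_ID" \
#   --http-tokens required \
#   --http-endpoint enabled
```

After running the command, IMDSv1 requests from the instance will be rejected, so verify first that no software on the VM depends on IMDSv1.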

Docker Socket Permissions#

The VM startup script in scripts/install_script.sh includes the following command:

sudo chmod 666 /var/run/docker.sock

Warning

Security Impact: Setting permissions to 666 on the Docker socket grants world-readable and world-writable access, which is a significant security risk. Any user or process on the system can interact with Docker, potentially leading to privilege escalation and container breakouts.

Recommendation: For production environments, consider removing this chmod command and rely exclusively on Docker group membership to control access. Users in the docker group will be able to interact with Docker after logging out and back in, or by running newgrp docker. Only use broader permissions if you have specific requirements that necessitate immediate Docker access without re-authentication, and document why this is necessary for your use case.

For more information on Docker security, see Docker security best practices.

General Cloud installation requirements#

Cloud-native software will be utilised on each respective cloud platform to set up the complete pipeline environment. You will need to install the AWS CLI and ensure authentication is properly configured.

AWS CLI - This is required for all cloud installations. Install a recent version and configure it with appropriate credentials; see the official AWS CLI installation documentation for details.
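As a sketch of the standard AWS-documented procedure, installing AWS CLI v2 on Linux x86_64 looks like the following. The commands that need network access and sudo are shown commented; adapt for your platform:

```shell
# Official AWS CLI v2 bundle location for Linux x86_64 (per AWS documentation).
AWSCLI_URL="https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip"
echo "Download URL: ${AWSCLI_URL}"

# Requires network access and sudo; run on your workstation or the VM:
# curl -sS "$AWSCLI_URL" -o awscliv2.zip
# unzip -q awscliv2.zip
# sudo ./aws/install
# aws --version
# aws configure   # enter access key, secret key, default region, output format
```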

Cloud permissions#

We recommend taking a least-privilege approach when granting users permissions to create cloud resources.

The cloud-specific examples below demonstrate the minimum permissions required to bootstrap resources for AWS environments.

Required IAM Permissions for Terraform Deployment#

The following IAM policy provides the minimum permissions required to deploy this infrastructure. We recommend an administrator carries out the deployment to minimise potential permissions issues.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "batch:DescribeComputeEnvironments",
        "batch:DescribeJobQueues",
        "ec2:AssociateAddress",
        "ec2:CreateTags",
        "ec2:DeleteKeyPair",
        "ec2:DescribeAddresses",
        "ec2:DescribeAddressesAttribute",
        "ec2:DescribeImages",
        "ec2:DescribeInstanceCreditSpecifications",
        "ec2:DescribeInstanceTypes",
        "ec2:DescribeInstances",
        "ec2:DescribeKeyPairs",
        "ec2:DescribeLaunchTemplateVersions",
        "ec2:DescribeLaunchTemplates",
        "ec2:DescribeNetworkInterfaces",
        "ec2:DescribeSecurityGroupRules",
        "ec2:DescribeSecurityGroups",
        "ec2:DescribeSubnets",
        "ec2:DescribeTags",
        "ec2:DescribeVolumes",
        "ec2:DescribeVpcs",
        "ec2:DisassociateAddress",
        "ec2:ReleaseAddress",
        "sts:GetCallerIdentity"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "batch:CreateComputeEnvironment",
        "batch:CreateJobQueue",
        "batch:DeleteComputeEnvironment",
        "batch:UpdateComputeEnvironment"
      ],
      "Resource": "arn:aws:batch:${Region}:${Account}:compute-environment/${ComputeEnvironmentName}"
    },
    {
      "Effect": "Allow",
      "Action": [
        "batch:CreateJobQueue",
        "batch:DeleteJobQueue",
        "batch:UpdateJobQueue"
      ],
      "Resource": "arn:aws:batch:${Region}:${Account}:job-queue/${JobQueueName}"
    },
    {
      "Effect": "Allow",
      "Action": "ec2:AllocateAddress",
      "Resource": "arn:aws:ec2:${Region}:${Account}:elastic-ip/${AllocationId}"
    },
    {
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeInstanceAttribute",
        "ec2:ModifyInstanceAttribute",
        "ec2:MonitorInstances",
        "ec2:RunInstances",
        "ec2:TerminateInstances"
      ],
      "Resource": "arn:aws:ec2:${Region}:${Account}:instance/${InstanceId}"
    },
    {
      "Effect": "Allow",
      "Action": "ec2:ImportKeyPair",
      "Resource": "arn:aws:ec2:${Region}:${Account}:key-pair/${KeyPairName}"
    },
    {
      "Effect": "Allow",
      "Action": [
        "ec2:CreateLaunchTemplate",
        "ec2:DeleteLaunchTemplate"
      ],
      "Resource": "arn:aws:ec2:${Region}:${Account}:launch-template/${LaunchTemplateId}"
    },
    {
      "Effect": "Allow",
      "Action": "ec2:RunInstances",
      "Resource": "arn:aws:ec2:${Region}:${Account}:network-interface/${NetworkInterfaceId}"
    },
    {
      "Effect": "Allow",
      "Action": [
        "ec2:AuthorizeSecurityGroupEgress",
        "ec2:AuthorizeSecurityGroupIngress",
        "ec2:CreateSecurityGroup",
        "ec2:DeleteSecurityGroup",
        "ec2:RevokeSecurityGroupEgress",
        "ec2:RevokeSecurityGroupIngress",
        "ec2:RunInstances"
      ],
      "Resource": "arn:aws:ec2:${Region}:${Account}:security-group/${SecurityGroupId}"
    },
    {
      "Effect": "Allow",
      "Action": "ec2:RunInstances",
      "Resource": "arn:aws:ec2:${Region}:${Account}:subnet/${SubnetId}"
    },
    {
      "Effect": "Allow",
      "Action": "ec2:RunInstances",
      "Resource": "arn:aws:ec2:${Region}::image/${ImageId}"
    },
    {
      "Effect": "Allow",
      "Action": [
        "iam:AddRoleToInstanceProfile",
        "iam:CreateInstanceProfile",
        "iam:DeleteInstanceProfile",
        "iam:GetInstanceProfile",
        "iam:RemoveRoleFromInstanceProfile"
      ],
      "Resource": "arn:aws:iam::${Account}:instance-profile/${InstanceProfileNameWithPath}"
    },
    {
      "Effect": "Allow",
      "Action": [
        "iam:CreateRole",
        "iam:DeleteRole",
        "iam:DeleteRolePolicy",
        "iam:GetRole",
        "iam:GetRolePolicy",
        "iam:ListAttachedRolePolicies",
        "iam:ListInstanceProfilesForRole",
        "iam:ListRolePolicies",
        "iam:PutRolePolicy"
      ],
      "Resource": "arn:aws:iam::${Account}:role/${RoleNameWithPath}"
    },
    {
      "Effect": "Allow",
      "Action": [
        "kms:CreateGrant",
        "kms:GenerateDataKeyWithoutPlaintext"
      ],
      "Resource": "arn:aws:kms:${Region}:${Account}:key/${KeyId}"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:CreateBucket",
        "s3:GetAccelerateConfiguration",
        "s3:GetBucketAcl",
        "s3:GetBucketCORS",
        "s3:GetBucketLogging",
        "s3:GetBucketObjectLockConfiguration",
        "s3:GetBucketPolicy",
        "s3:GetBucketPublicAccessBlock",
        "s3:GetBucketRequestPayment",
        "s3:GetBucketTagging",
        "s3:GetBucketVersioning",
        "s3:GetBucketWebsite",
        "s3:GetEncryptionConfiguration",
        "s3:GetLifecycleConfiguration",
        "s3:GetReplicationConfiguration",
        "s3:PutBucketPublicAccessBlock",
        "s3:PutBucketTagging",
        "s3:PutLifecycleConfiguration"
      ],
      "Resource": "arn:aws:s3:::${BucketName}"
    },
    {
      "Effect": "Allow",
      "Action": "ssm:GetParameters",
      "Resource": "arn:aws:ssm:${Region}:${Account}:parameter/${ParameterNameWithoutLeadingSlash}"
    }
  ]
}

Note

Replace placeholders like ${Region}, ${Account}, ${ComputeEnvironmentName}, etc., with your actual AWS values.
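One way to fill in those placeholders is to keep the policy as a template and substitute values with sed. This is a hypothetical helper, not part of the shipped configuration; the template filename and values are illustrative:

```shell
# Illustrative only: substitute ${Region}, ${Account}, etc. in a policy template.
# policy.json.tpl would hold the full policy above; one line shown here.
cat > policy.json.tpl <<'EOF'
{ "Resource": "arn:aws:batch:${Region}:${Account}:compute-environment/${ComputeEnvironmentName}" }
EOF

# Replace each placeholder with your actual AWS values (examples shown).
sed -e 's/\${Region}/us-east-1/g' \
    -e 's/\${Account}/123456789012/g' \
    -e 's/\${ComputeEnvironmentName}/biomodal-ce/g' \
    policy.json.tpl > policy.json

cat policy.json
```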

S3 Bucket Permissions#

When running through the Terraform deployment process, you will be prompted to either create a new S3 bucket or provide an existing one.

If you create a new bucket, the deployment process will generate an IAM policy with the following permissions on the bucket and its objects:

  • s3:GetObject

  • s3:PutObject

  • s3:DeleteObject

  • s3:ListObjectsV2

  • s3:ListBucket

If you are providing an existing bucket URL, an IAM policy will be created with the above access to the provided bucket and its objects. Please ensure you have the correct permissions to carry out this IAM operation.

Additional Essential Permissions#

The following essential permissions will be generated and attached to the VM’s IAM instance profile:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "ecr:UploadLayerPart",
        "ecr:PutImage",
        "ecr:ListTagsForResource",
        "ecr:ListImages",
        "ecr:InitiateLayerUpload",
        "ecr:GetRepositoryPolicy",
        "ecr:GetLifecyclePolicyPreview",
        "ecr:GetLifecyclePolicy",
        "ecr:GetDownloadUrlForLayer",
        "ecr:GetAuthorizationToken",
        "ecr:DescribeRepositories",
        "ecr:DescribeImages",
        "ecr:CompleteLayerUpload",
        "ecr:BatchGetImage",
        "ecr:BatchCheckLayerAvailability"
      ],
      "Effect": "Allow",
      "Resource": "*"
    },
    {
      "Action": [
        "batch:TerminateJob",
        "batch:SubmitJob",
        "batch:RegisterJobDefinition",
        "batch:ListJobs",
        "batch:DescribeJobs",
        "batch:DescribeJobQueues",
        "batch:DescribeJobDefinitions",
        "batch:DescribeComputeEnvironments"
      ],
      "Effect": "Allow",
      "Resource": "*"
    },
    {
      "Action": [
        "ecs:DescribeTasks",
        "ecs:DescribeContainerInstances",
        "ec2:DescribeInstances",
        "ec2:DescribeInstanceTypes",
        "ec2:DescribeInstanceAttribute"
      ],
      "Effect": "Allow",
      "Resource": "*"
    }
  ]
}

Pre-existing AWS Service Roles#

Important

Required AWS Service Roles

The Terraform deployment requires two AWS-managed service roles that should already exist in your account:

  1. AWS Batch Service Role: arn:aws:iam::${Account}:role/aws-service-role/batch.amazonaws.com/AWSServiceRoleForBatch

    • This is a service-linked role automatically created when you first enable AWS Batch in your account

    • If it doesn’t exist, AWS will create it automatically when you deploy resources that use Batch

    • No manual action is typically required

  2. ECS Instance Role: arn:aws:iam::${Account}:instance-profile/ecsInstanceRole

    • Required for EC2 instances in the Batch compute environment

    • Must be created manually if it doesn’t exist in your account

    • Attach the AmazonEC2ContainerServiceforEC2Role managed policy to this role

To create the ECS instance role if it doesn’t exist:

# Create the IAM role
aws iam create-role --role-name ecsInstanceRole \
  --assume-role-policy-document '{
    "Version": "2012-10-17",
    "Statement": [{
      "Effect": "Allow",
      "Principal": {"Service": "ec2.amazonaws.com"},
      "Action": "sts:AssumeRole"
    }]
  }'

# Attach the AWS managed policy
aws iam attach-role-policy --role-name ecsInstanceRole \
  --policy-arn arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceforEC2Role

# Create the instance profile
aws iam create-instance-profile --instance-profile-name ecsInstanceRole

# Add the role to the instance profile
aws iam add-role-to-instance-profile --instance-profile-name ecsInstanceRole \
  --role-name ecsInstanceRole

You can verify if the ECS instance role exists by running:

aws iam get-instance-profile --instance-profile-name ecsInstanceRole

Usage of this Terraform Configuration#

Prerequisites#

  • Terraform >= 1.0

  • AWS credentials configured (aws configure or environment variables)

  • AWS account with billing enabled and sufficient service limits

  • Existing VPC and subnet (this configuration does not create networking resources)

Setup#

# Navigate to the AWS Terraform directory
cd cloud_platforms/terraform/aws/

# Copy example variables
cp terraform.tfvars.example terraform.tfvars

# Edit terraform.tfvars with your values
vim terraform.tfvars

# Initialize Terraform
terraform init

# Review the plan (Ensure it looks correct, and no errors or changes you don't expect)
terraform plan -var-file=terraform.tfvars -out=tfplan

# Apply the configuration using the saved plan
terraform apply tfplan

Warning

Destroying Infrastructure

The commands below will permanently delete all resources created by Terraform, including:

  • EC2 instances and their data

  • S3 buckets and all stored data (if bucket_force_destroy is enabled)

  • ECR repositories and container images

  • IAM roles and policies

  • Security groups and network configurations

This action is irreversible. Always:

  1. Backup any important data before destroying resources

  2. Carefully review the destroy plan output to confirm which resources will be deleted

  3. Ensure you are working in the correct AWS account and region

  4. Consider commenting out or removing the bucket_force_destroy setting to prevent accidental data loss

# Destroy the configuration when no longer needed
terraform plan -destroy -var-file=terraform.tfvars -out=destroyplan
# Be sure to review the plan output carefully to ensure you understand which resources will be destroyed.
terraform apply destroyplan

Note

You may see deprecation warnings during terraform plan related to the user_data_base64 attribute and data.aws_region.current.name. These warnings originate from the external terraform-aws-bootstrap module and are safe to ignore. They do not affect the deployment or functionality of your infrastructure.

Note

Bootstrap Module Branch

The main.tf configuration references the module_update branch of the terraform-aws-bootstrap repository. This branch contains updates and improvements to the bootstrap module. The module source is specified as:

source = "git::https://github.com/cegx-ds/terraform-aws-bootstrap.git?ref=module_update"

Terraform Outputs#

After terraform apply, you’ll see output similar to the following (example values using us-east-1 region):

Outputs:

bucket_url = "s3://your-vm-name"
docker_repo_url = "123456789012.dkr.ecr.us-east-1.amazonaws.com"
private_ip = "10.0.x.x"
private_key_filename = "~/.ssh/your-vm-name.pem"
public_ip = "x.x.x.x"
ssh_user = "ubuntu"
vm_name = "your-vm-name"

Important

Please save the following outputs as you will need them:

  • The public_ip and private_key_filename for SSH access to the VM

  • The docker_repo_url is the ECR registry URL (repositories will be created within this registry as needed)

  • The bucket_url for configuring data storage (remember the s3:// prefix)
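The outputs can also be read programmatically with `terraform output -raw`, for example to assemble the SSH command. This is a sketch: the literal values below are illustrative stand-ins used when no local Terraform state is available.

```shell
# With a local Terraform state, read the outputs directly:
# PUBLIC_IP=$(terraform output -raw public_ip)
# KEY_FILE=$(terraform output -raw private_key_filename)
# SSH_USER=$(terraform output -raw ssh_user)

# Illustrative stand-in values:
PUBLIC_IP="203.0.113.10"
KEY_FILE="$HOME/.ssh/your-vm-name.pem"
SSH_USER="ubuntu"

# Assemble and print the SSH command to connect to the VM.
echo "ssh -i ${KEY_FILE} ${SSH_USER}@${PUBLIC_IP}"
```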

Connect to VM#

After successfully applying the Terraform configuration, allow a few minutes for the startup script to finish installing software, then connect to the VM using SSH:

# Use the outputs from Terraform to construct the SSH command
ssh -i <private_key_filename> <ssh_user>@<public_ip>

# Example:
ssh -i ~/.ssh/your-vm-name.pem ubuntu@x.x.x.x

Configuration Variables#

Required Variables#

region    = "us-east-1"       # AWS region
vpc_id    = "vpc-xxxxx"       # VPC ID (must already exist)
subnet_id = "subnet-xxxxx"    # Subnet ID (must already exist)
vm_name   = "development-vm"  # VM name

Optional Variables#

instance_type = "t3.large"    # EC2 instance type (default: t3.large)
tag_key   = "environment"     # Resource tag key
tag_value = "development"     # Resource tag value
use_existing_bucket_url = "s3://my-existing-bucket"  # Use existing bucket
bucket_force_destroy = false  # Allow bucket destruction
ssh_user = "ubuntu"           # SSH username

Outputs#

After terraform apply, you’ll get:

  • VM public and private IP addresses

  • S3 bucket URL

  • ECR registry URL (repositories are created within this registry)

  • SSH user and private key filename for connecting

  • VM name

What’s NOT Included#

This minimal configuration intentionally excludes:

  • Application-specific software installation

  • Pipeline or workflow management tools

  • Custom configuration files or templates

  • File copying or deployment logic

  • Version-specific software management

  • VPC and networking infrastructure (must already exist)

Design Philosophy#

This configuration follows the principle of separation of concerns:

  • Infrastructure: Terraform handles VM, storage, and compute orchestration

  • Platform: Basic runtime dependencies (Java and Docker)

  • Applications: Should be deployed separately after infrastructure is ready

This approach makes the infrastructure:

  • Reusable: Can be used for different applications

  • Maintainable: Clear separation between infrastructure and application concerns

  • Testable: Infrastructure can be validated independently

  • Flexible: Applications can be deployed using different methods (Docker, packages, etc.)

Next Steps#

After the infrastructure is ready:

  1. Verify base runtime (Java & Docker):

    java -version
    docker --version
    sudo systemctl status docker
    
  2. Install the biomodal CLI.

  3. (Optional) Configure monitoring and logging (e.g. CloudWatch metrics/alarms, log shipping).

Note: The installation script follows a direct installation approach. Package managers such as apt-get handle repeated installations gracefully, so the script can safely be run multiple times with minimal overhead.

Installation Script Technical Details#

The VM startup script (scripts/install_script.sh) implements a straightforward installation approach to ensure a complete and consistent environment setup.

Direct Package Installation#

The installation script installs all required packages directly using a simple, reliable approach:

Installation Process

  1. System update: Updates package repositories with apt-get update

  2. Direct installation: Installs all required packages in a single apt-get install command

  3. Reliable execution: Uses apt-get’s built-in handling of already-installed packages

  4. Simple approach: No complex checking logic, ensuring consistent behavior

Required System Packages

# Packages installed:
ca-certificates       # SSL/TLS certificates for secure connections
curl                  # Command line tool for data transfer
gnupg                 # GNU Privacy Guard for encryption/signing
lsb-release           # Linux Standard Base release information
openjdk-21-jdk        # Java Development Kit version 21
apt-transport-https   # HTTPS transport for apt
wget                  # Network downloader
unzip                 # Archive extraction utility

Direct Docker Installation#

Docker installation follows the same straightforward approach:

Docker Installation Process

  1. Direct installation: Installs Docker using apt-get install docker.io

  2. Service configuration: Starts and enables Docker service

  3. User permissions: Adds the current user to docker group for non-root access

  4. Session permissions: Sets appropriate socket permissions for immediate access

Docker Configuration (Automatically Applied)

The script configures Docker for proper operation:

# Service management
sudo systemctl start docker    # Start Docker service
sudo systemctl enable docker   # Enable Docker on boot

# User permissions (reliable username detection)
ACTUAL_USER=$(whoami)
sudo usermod -aG docker "$ACTUAL_USER"   # Add user to docker group
sudo chmod 666 /var/run/docker.sock      # Socket access for current session

Reliable User Detection

The script uses a simple, reliable method to identify the correct username:

  • Uses whoami command to get the current user

  • Checks if user is not root to avoid security issues

  • Adds user to docker group for non-root access

Installation Feedback

The script provides clear logging throughout the process:

  • Reports the start of package installation

  • Confirms when package installation is completed

  • Shows Docker installation progress

  • Displays user permission configuration

  • Confirms when Docker installation is completed

  • Final success message when all installations are complete

Benefits of Direct Installation

  • Simplicity: Straightforward, easy-to-understand process

  • Reliability: Uses standard package manager behavior for duplicate handling

  • Consistency: Ensures the same installation process every time

  • Terraform compatibility: Simple script structure works well with Terraform user_data

  • Minimal complexity: No conditional logic reduces potential failure points

  • Error handling: Uses set -euo pipefail to exit on any errors

Troubleshooting#

Common Issues and Solutions#

VM Creation Failures

If VM creation fails, check:

  • AWS Service Limits: Ensure you have sufficient EC2 and Batch quotas in the target region

  • VPC Configuration: Verify the specified VPC and subnet exist and are properly configured

  • IAM Permissions: Validate your AWS credentials have necessary IAM permissions

  • AMI Availability: Confirm Ubuntu 22.04 AMI is available in your region

Installation Script Issues

If the startup script fails:

# Check cloud-init logs
sudo cat /var/log/cloud-init-output.log

# Check system logs
sudo journalctl -xe

Docker Permission Issues

If Docker commands fail with permission errors:

# Check if user is in docker group
groups $USER | grep docker

# If not in docker group, add manually
sudo usermod -aG docker $USER

# Check Docker socket permissions
ls -la /var/run/docker.sock

# Fix socket permissions if needed
sudo chmod 666 /var/run/docker.sock

# Re-login or start a new session to apply group membership
exit
# SSH back in

# Test Docker access
docker run hello-world

Package Installation Failures

If specific packages fail to install:

# Update package lists
sudo apt-get update

# Try installing individual packages
sudo apt-get install -y package-name

# Check for held packages
sudo apt-mark showhold

Terraform State Issues

If Terraform operations fail:

# Refresh state
terraform refresh

# Import existing resources if needed
terraform import aws_instance.vm i-xxxxxxxxxxxxx

# Plan with detailed output
terraform plan -detailed-exitcode

Terraform Plan File Best Practices

Always use the -out option when planning to ensure consistent deployments:

# Create a plan file to guarantee exact execution
terraform plan -var-file=terraform.tfvars -out=tfplan

# Apply the exact planned changes
terraform apply tfplan

# For destroy operations, also use plan files for safety
terraform plan -destroy -var-file=terraform.tfvars -out=destroyplan

# Apply the exact destruction plan
terraform apply destroyplan

This approach prevents drift between what you reviewed in the plan and what gets applied, which is especially important in production environments where infrastructure can change between the plan and apply steps. For destroy operations, it ensures you know exactly which resources will be deleted before proceeding.

AWS Services and Resources#

EC2 Instance#

The Terraform configuration creates an EC2 instance that serves as the orchestrator VM for running biomodal pipelines.

Instance Configuration

  • AMI: Ubuntu 22.04 LTS

  • Instance Type: Set via the Terraform variable instance_type (default: t3.large; override in terraform.tfvars if needed)

  • Storage: 100 GB root block device

  • Networking: Deployed in your existing VPC and subnet

  • Public IP: Elastic IP for consistent external access

  • SSH Access: Key pair generated for secure access

AWS Batch#

The infrastructure includes AWS Batch for scalable compute orchestration:

Batch Components

  • Compute Environment: Auto-scaling compute cluster for running containerized workloads

  • Job Queue: Queue for managing and scheduling pipeline jobs

  • Job Definitions: Created by biomodal CLI for specific pipeline runs

Benefits

  • Auto-scaling: Automatically scales compute resources based on job demand

  • Cost Optimization: Can use spot instances for significant cost savings

  • Job Management: Queue-based execution with priority handling

  • Container Support: Native support for Docker containers

S3 Storage Bucket#

The Terraform configuration creates an S3 bucket for data storage:

Bucket Configuration

  • Naming: Automatically named based on the vm_name variable

  • Location: Created in the same region as the VM for optimal performance

  • Access Control: Configured with appropriate IAM policies for secure access

  • Optional: Can be disabled by setting use_existing_bucket_url to use an existing bucket

Bucket Usage

The bucket serves as:

  • Input Storage: Store input files such as FASTQ files

  • Working Directory: Pipeline intermediate files and temporary data

  • Output Storage: Final analysis results and generated reports

  • Nextflow Work Directory: Temporary files created during pipeline execution

Lifecycle Management

Consider implementing S3 lifecycle policies to:

  • Automatically transition older objects to cheaper storage classes

  • Delete temporary files after a specified period

  • Reduce storage costs for infrequently accessed data
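As an illustration, a lifecycle configuration along those lines might look like the following. This is an assumption-laden sketch: the `work/` prefix, day counts, and bucket name are placeholders, not values the deployment sets:

```shell
# Illustrative lifecycle policy: move pipeline work files to Glacier after
# 30 days and delete them after 90. Prefix and day counts are examples only.
cat > lifecycle.json <<'EOF'
{
  "Rules": [
    {
      "ID": "expire-pipeline-work",
      "Filter": { "Prefix": "work/" },
      "Status": "Enabled",
      "Transitions": [ { "Days": 30, "StorageClass": "GLACIER" } ],
      "Expiration": { "Days": 90 }
    }
  ]
}
EOF

# Apply with the AWS CLI (requires credentials and bucket permissions):
# aws s3api put-bucket-lifecycle-configuration \
#   --bucket your-vm-name --lifecycle-configuration file://lifecycle.json
```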

ECR Repository#

An Amazon ECR (Elastic Container Registry) repository is created for storing custom container images.

Repository Configuration

  • Repository Name: Uses the vm_name variable

  • Format: Docker repository for container images

  • Location: Created in the same region as the VM

  • Registry URL: The docker_repo_url output provides the ECR registry URL (e.g., 123456789012.dkr.ecr.us-east-1.amazonaws.com)

  • Access: IAM permissions configured for push/pull operations

Note

The docker_repo_url output is the ECR registry URL. The repository created by Terraform is named after your vm_name variable. Additional repositories can be created within this registry as needed.

Container Image Management

The ECR repository serves as storage for:

  • Custom Pipeline Images: Any custom-built containers for specific workflows

  • Modified Biomodal Images: Customized versions of biomodal containers

  • Tool Containers: Supporting bioinformatics tools and utilities

Access and Permissions

The EC2 instance IAM role has the necessary permissions to:

  • Push container images to the registry

  • Pull container images during pipeline execution

  • List and manage repository contents
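In practice, pushing an image from the orchestrator VM follows the standard ECR authentication flow. The following is a sketch; the registry URL and repository name are example values matching the outputs above, and the cloud commands are shown commented because they require credentials:

```shell
# Example ECR registry URL (from the docker_repo_url output) and repository.
REGISTRY="123456789012.dkr.ecr.us-east-1.amazonaws.com"
REPO="your-vm-name"

# Derive the region from the registry URL.
REGION="${REGISTRY#*.dkr.ecr.}"
REGION="${REGION%.amazonaws.com}"
echo "Registry region: ${REGION}"

# Authenticate and push (requires AWS credentials and Docker):
# aws ecr get-login-password --region "$REGION" \
#   | docker login --username AWS --password-stdin "$REGISTRY"
# docker tag my-image:latest "$REGISTRY/$REPO:latest"
# docker push "$REGISTRY/$REPO:latest"
```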

IAM Roles and Instance Profile#

The infrastructure creates IAM roles with appropriate permissions:

Instance Profile

  • Attached to the EC2 orchestrator VM

  • Provides credentials for AWS service access

  • Follows principle of least privilege

Key Permissions

  • S3 Access: Read/write to the storage bucket

  • ECR Access: Push/pull container images

  • Batch Access: Submit and manage batch jobs

  • EC2 Access: Describe instances and instance types

  • ECS Access: Describe tasks and container instances

Security Groups#

The configuration creates security groups for network access control:

SSH Access

  • Port 22 open for SSH connections

  • Can be restricted to specific IP ranges for enhanced security

Outbound Access

  • Full internet access for package downloads

  • Access to AWS services via VPC endpoints or internet gateway

Cost Optimization Strategies#

Instance Cost Optimization#

EC2 Orchestrator

  • Use smaller instance types (t3.small, t3.medium) for the orchestrator

  • Consider Reserved Instances for long-term deployments

  • Stop the instance when not actively running pipelines

AWS Batch Compute

  • Use Spot Instances for up to 90% cost savings

  • Configure appropriate min/max vCPU limits

  • Set up auto-scaling based on queue depth

Storage Cost Optimization#

S3 Bucket

  • Implement lifecycle policies to transition old data to S3 Glacier

  • Delete temporary Nextflow work directories after pipeline completion

  • Use S3 Intelligent-Tiering for automatic cost optimization

EBS Volumes

  • Right-size root volumes based on actual usage

  • Consider gp3 volumes for better price/performance

  • Delete unused snapshots regularly

Monitoring and Cost Alerts#

  • Set up AWS Budgets to track spending

  • Configure CloudWatch alarms for cost anomalies

  • Use AWS Cost Explorer to identify optimization opportunities

  • Tag resources for cost allocation and tracking
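For example, a simple monthly cost budget can be created with the AWS CLI. This is a sketch: the amount, account ID, and budget name are placeholders, and the cloud command is commented because it requires credentials:

```shell
# Illustrative monthly cost budget definition; amount and name are examples.
cat > budget.json <<'EOF'
{
  "BudgetName": "pipeline-monthly-budget",
  "BudgetLimit": { "Amount": "500", "Unit": "USD" },
  "TimeUnit": "MONTHLY",
  "BudgetType": "COST"
}
EOF

# Create it with the AWS CLI (replace the account ID with your own):
# aws budgets create-budget --account-id 123456789012 --budget file://budget.json
```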

Low AWS Service Limits#

If using AWS with limited service quotas (e.g., new accounts), you may need to:

  • Request quota increases for EC2 instances, Batch compute, and S3 storage

  • Contact AWS support for faster quota increase processing

  • Plan deployments within current quota limits

  • Consider using multiple regions if one region has limitations

For specific biomodal pipeline requirements with limited quotas, please contact support@biomodal.com for guidance on resource planning and custom configurations.