1. Intent
Apache Airflow
DAG: rnaseq_dag
Task: star_align
Operator: KubernetesPodOperator
Run: scheduled__2025-12-30T15:00:00
TaskInstance: star_align[run_id]

Requested resources:
CPUs: 64
Memory: 488 GB
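As a rough sketch, the static request above can be written in the dictionary shape of a Kubernetes resources block. The helper name to_k8s_resources is hypothetical. Mapping "488 GB" to the decimal quantity 488G (rather than 488Gi) is an assumption, because the figure's unit is not specified further. Setting requests equal to limits is also an assumption.

```python
def to_k8s_resources(cpus: int, memory_gb: int) -> dict:
    """Build a Kubernetes-style resources mapping (hypothetical helper).

    Assumes decimal gigabytes ("G"); switch to "Gi" if the figure is in GiB.
    Requests and limits are set equal so the pod gets a fixed allocation.
    """
    quantities = {"cpu": str(cpus), "memory": f"{memory_gb}G"}
    return {"requests": dict(quantities), "limits": dict(quantities)}


# Static request for star_align, as listed above.
star_align_requested = to_k8s_resources(cpus=64, memory_gb=488)
print(star_align_requested)
```

With KubernetesPodOperator, a mapping of this shape would typically be converted to a V1ResourceRequirements object and passed through the container_resources argument.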
2. Execution (Airflow Executor / Infrastructure)
Execution Reality
Executor: KubernetesExecutor
TaskInstance: star_align
Pod: airflow-star-align-4521
Node / instance: r5.16xlarge
Container image: ../biocontainers/star
Duration: 47 min
Airflow launched the task using statically defined operator resources. Runtime behavior inside the container was not observable from Airflow logs or scheduler metrics.
3. Reality
Observed inside Airflow TaskInstance star_align
Airflow DAG rnaseq_dag → run → task star_align
CPU: 8%
Memory: 18%
Disk I/O: 114%
Diagnosis: The TaskInstance is I/O-bound: disk I/O is saturated (114%) while CPU (8%) and memory (18%) sit mostly idle. Increasing CPU or memory in the operator configuration will not improve runtime.
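The reasoning behind this diagnosis can be sketched as a small classifier over the three observed utilization figures. The function name and the 90% / 30% thresholds are illustrative assumptions, not values taken from Airflow or from the measurement tooling.

```python
def classify_bottleneck(utilization: dict, saturated: float = 90.0,
                        idle: float = 30.0) -> str:
    """Name the limiting resource from percent-utilization figures.

    Thresholds are illustrative assumptions: a resource at or above
    `saturated` is treated as the bottleneck; if every resource is
    below `idle`, the task is simply over-provisioned.
    """
    busiest = max(utilization, key=utilization.get)
    if utilization[busiest] >= saturated:
        return f"{busiest}-bound"
    if all(value < idle for value in utilization.values()):
        return "underutilized"
    return "balanced"


# Figures observed for star_align.
observed = {"cpu": 8.0, "memory": 18.0, "disk_io": 114.0}
diagnosis = classify_bottleneck(observed)
print(diagnosis)  # disk_io-bound
```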
4. The Fix
Recommended Airflow task-level configuration for star_align:
Operator resources:
CPUs: 8
Memory: 64 GB
Storage: Local NVMe / high-throughput volume
Node / instance: r6id.2xlarge
Executor profile: io_optimized
Scope: Task-level override (no DAG-wide resource changes)
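One way to picture the task-level scope is as a set of keyword arguments built only for star_align, leaving the rest of the DAG untouched. This is a minimal sketch in plain Python: the helper name and the executor-profile label key are hypothetical, and the 64G unit is an assumption. container_resources, node_selector and labels are existing KubernetesPodOperator arguments, and node.kubernetes.io/instance-type is a well-known Kubernetes node label. Mounting the local NVMe volume is left out.

```python
def star_align_overrides() -> dict:
    """Keyword arguments intended for the star_align task only (hypothetical helper)."""
    resources = {"cpu": "8", "memory": "64G"}  # assumes decimal gigabytes
    return {
        # Same shape as a Kubernetes resources block; convert it to a
        # V1ResourceRequirements object before passing it to the operator.
        "container_resources": {
            "requests": dict(resources),
            "limits": dict(resources),
        },
        # Pin the pod to the recommended instance type.
        "node_selector": {"node.kubernetes.io/instance-type": "r6id.2xlarge"},
        # Hypothetical label carrying the profile name from the list above.
        "labels": {"executor-profile": "io_optimized"},
    }


overrides = star_align_overrides()
for name, value in overrides.items():
    print(name, value)
```

Because the values live in one task's arguments, other tasks in rnaseq_dag keep their existing resources.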
This change updates only the operator configuration of the star_align task. Same DAG, same task: ~85% lower compute cost and 3.7× faster runtime.
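For a sense of scale, the 3.7× figure can be applied to the 47 min duration reported in section 2. This is a back-of-the-envelope sketch and the function name is hypothetical. The cost figure is not worked here because the report does not state whether it is per hour or per run.

```python
def projected_runtime_min(current_min: float, speedup: float) -> float:
    """Runtime after applying a speedup factor (hypothetical helper)."""
    return current_min / speedup


current = 47.0  # minutes, from the Execution section
projected = projected_runtime_min(current, 3.7)
print(f"projected runtime: {projected:.1f} min")        # projected runtime: 12.7 min
print(f"time saved per run: {current - projected:.1f} min")  # time saved per run: 34.3 min
```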