
Genome analysis using Parabricks

Image source: Design Cells/iStock/Getty Images

Benchmarking human whole genome and RNA sequencing using Nvidia Ampere A100 GPUs

Nvidia Ampere A100 GPUs exploit a wide SIMT architecture to accelerate the world's most demanding HPC and AI workloads, and one of the key HPC applications is sequence analysis of the human genome and transcriptome. This blog measures the runtime required for genome and transcriptome sequence analysis on GPUs, to help understand how quickly such an analysis can be completed.

PARABRICKS DNA FQ2BAM pipeline

Dataset:

  • Reference genome: GRCh38
  • Whole-genome sequencing run: ERR194147 (Illumina HiSeq 2000 paired end sequencing, ~50x coverage)

Results:

Execution time (in seconds)

Number of GPUs | Alignment phase (BWA-MEM) | Overall
1              | 9657                      | 11053
2              | 4924                      | 5903
4              | 2720                      | 3389
8              | 1695                      | 2204

                Table 1: Execution Time

Plot (Sequence alignment)

[Figure 1: Execution time for sequence alignment (BWA).]

[Figure 2: Speedup relative to single A100 GPU.]

Plot (Overall)

[Figure 3: Execution time for DNA FQ2BAM pipeline.]

[Figure 4: Speedup relative to single A100 GPU.]

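Scaling: Figure 4 shows that the speedup keeps growing as GPUs are added. From Table 1, the overall pipeline runs about 1.9x, 3.3x, and 5.0x faster on 2, 4, and 8 GPUs than on a single GPU. The scaling is not perfectly linear, but each doubling of the GPU count still cuts the runtime substantially, so we conclude that the PARABRICKS DNA pipeline shows good strong scaling up to 8 GPUs.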
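The speedup in the plots can be recomputed directly from the timings in Table 1. The sketch below does this for the overall DNA pipeline and also reports parallel efficiency (speedup divided by the number of GPUs). The function and variable names are illustrative and not part of Parabricks.

```python
# Overall execution times in seconds, copied from Table 1 (DNA FQ2BAM pipeline).
overall_seconds = {1: 11053, 2: 5903, 4: 3389, 8: 2204}


def strong_scaling(times):
    """Return {gpus: (speedup, efficiency)} relative to the 1-GPU run.

    speedup    = T(1 GPU) / T(n GPUs)
    efficiency = speedup / n  (1.0 would be perfectly linear scaling)
    """
    baseline = times[1]
    result = {}
    for gpus, seconds in sorted(times.items()):
        speedup = baseline / seconds
        result[gpus] = (speedup, speedup / gpus)
    return result


for gpus, (speedup, efficiency) in strong_scaling(overall_seconds).items():
    print(f"{gpus} GPU(s): speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

Swapping in the alignment-phase column of Table 1 gives the corresponding numbers for the BWA-MEM stage alone.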

PARABRICKS RNA FQ2BAM pipeline

Dataset:

  • Reference genome: GRCh38
  • RNA-seq sample: SRR534301 (Illumina HiSeq 2000 paired end sequencing)

Results:

Execution time (in seconds)

Number of GPUs | Alignment phase (STAR) | Overall
1              | 1488                   | 1529
2              | 987                    | 1037
4              | 498                    | 549
8              | 453                    | 493

                Table 2: Execution Time

Plot (Sequence alignment)

[Figure 5: Execution time for sequence alignment using STAR algorithm.]

[Figure 6: Speedup relative to single A100 GPU.]

Plot (Overall)

[Figure 7: Execution time for RNA FQ2BAM pipeline.]

[Figure 8: Speedup relative to single A100 GPU.]

Scaling: Figure 8 shows that, with the dataset held fixed, the speedup grows steadily up to 4 GPUs (about 1.5x on 2 GPUs and 2.8x on 4 GPUs, from Table 2) and then flattens, reaching only about 3.1x on 8 GPUs. Hence, the PARABRICKS RNA pipeline shows good strong scaling up to 4 GPUs.
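One rough way to quantify this flattening is the Karp-Flatt metric, which estimates an "experimentally determined serial fraction" from a measured speedup. This is not an analysis performed in the original benchmark. It is a minimal sketch applied to the overall timings in Table 2, and the names used are illustrative.

```python
# Overall execution times in seconds, copied from Table 2 (RNA FQ2BAM pipeline).
rna_overall_seconds = {1: 1529, 2: 1037, 4: 549, 8: 493}


def karp_flatt(times):
    """Karp-Flatt serial fraction e for each GPU count p > 1.

    e = (1/S - 1/p) / (1 - 1/p), where S = T(1) / T(p) is the measured speedup.
    e = 0 means perfectly linear scaling; e = 1 means no speedup at all.
    """
    baseline = times[1]
    fractions = {}
    for p, seconds in sorted(times.items()):
        if p == 1:
            continue  # the metric is undefined for a single processor
        speedup = baseline / seconds
        fractions[p] = (1 / speedup - 1 / p) / (1 - 1 / p)
    return fractions


for p, e in karp_flatt(rna_overall_seconds).items():
    print(f"{p} GPUs: estimated serial fraction {e:.2f}")
```

For these timings the estimate is not constant across GPU counts, so a single fixed serial fraction does not describe the runs well. Treat the values as a rough diagnostic rather than a measurement of the pipeline's serial work.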

Experimental Configuration

  • Compute Cluster: PARAM-SIDDHI AI
  • Compute Node: Single compute node (Nvidia DGX A100 320GB)
    GPU: 8x A100-SXM4-40GB
    CPU: Dual AMD EPYC 7742 64-Core Processor
    RAM: 1 TB
    Storage: 8 PB (Lustre file system)
  • PARABRICKS: Nvidia Clara Parabricks: 3.6.0-1
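
The post does not list the commands used for the runs. As a hedged sketch only, a GPU-count sweep for the DNA pipeline could be driven as below, assuming the `pbrun fq2bam` options `--ref`, `--in-fq`, `--out-bam`, and `--num-gpus` behave as described in the Parabricks documentation for the installed version. All file names are hypothetical placeholders, and measuring wall-clock time around the process is an assumption about how the timings might be collected.

```python
import shlex
import subprocess
import time

GPU_COUNTS = [1, 2, 4, 8]  # GPU counts reported in the execution-time tables


def fq2bam_command(ref, fastq_1, fastq_2, out_bam, num_gpus):
    """Build the argument list for one `pbrun fq2bam` run (paths are placeholders)."""
    return [
        "pbrun", "fq2bam",
        "--ref", ref,
        "--in-fq", fastq_1, fastq_2,
        "--out-bam", out_bam,
        "--num-gpus", str(num_gpus),
    ]


def sweep_commands(ref, fastq_1, fastq_2, gpu_counts=GPU_COUNTS):
    """Return one command per GPU count, each writing to its own BAM file."""
    return [
        fq2bam_command(ref, fastq_1, fastq_2, f"out_{n}gpu.bam", n)
        for n in gpu_counts
    ]


def time_command(cmd):
    """Run a command and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


# Print the commands instead of running them; call time_command(cmd) on a
# machine where Parabricks is installed to collect actual timings.
for cmd in sweep_commands("GRCh38.fa", "ERR194147_1.fastq.gz", "ERR194147_2.fastq.gz"):
    print(shlex.join(cmd))
```

Printing the commands first makes it easy to check them against the documentation for Parabricks 3.6.0-1 before spending GPU hours on a run.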


This post is licensed under CC BY 4.0 by the author.