Cloud Pipeline Introduction

Why Cloud Pipeline

The Cloud Pipeline solution from EPAM provides an easy and scalable way to run a wide range of analysis tasks in a cloud environment.
It combines the strengths of two approaches: classic HPC solutions (based on the GridEngine family of schedulers) and SaaS cloud solutions.

| Feature | Classic "HPC" | "Cloud Pipeline" | SaaS |
|---|---|---|---|
| Pipelines customization | High. Direct scripting allows any level of customization | High. Direct scripting allows any level of customization | Low. Provides specific pipeline definition languages or even only a graphical editor |
| Backward compatibility | N/A | High. "Classic" HPC scripts and NGS tools can be run without any changes | Low. Scripts have to be rewritten according to the supported languages and storage structures |
| User interface | Command line | Graphical interface for user interaction and a command line interface for automation scripts | Graphical interface |
| Calculation power scalability | Low. New nodes have to be deployed and supported on-premises. Idle nodes still consume resources | High. New nodes are started according to the job request and terminated as soon as they are no longer needed. Each job can precisely define the required CPU/RAM/disk resources or even select an optimal node to speed up execution (e.g. memory-optimized nodes for cellranger pipelines) | High. Scalable like "Cloud Pipeline", but sometimes limits the user to a predefined node setup |
| Deployment and vendor lock-in | Deployed on-premises and introduces no vendor lock-in | Can be deployed in AWS/GCP/Azure or on-premises, and thus introduces no vendor lock-in | Consumed as an Internet service (no on-premises deployment available); all processes are tied to this specific vendor |
| Security | High. All data and analysis processes are located in a controlled network | High. All data and analysis processes are located in a controlled cloud VPC. All security configuration is performed by the user's security officers | Low. No direct control over security configuration. The SaaS vendor has full access to the data storage |
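
To make the scalability point in the comparison above more concrete, here is a minimal Python sketch of how a per-job resource request could be matched to a node type. It is an illustration only and does not show Cloud Pipeline's actual scheduler or API: the `NodeType` class, the `CATALOG` entries, their costs, and the `pick_node` helper are all hypothetical names and values invented for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class NodeType:
    name: str           # hypothetical label, not a real cloud instance type
    cpus: int
    ram_gb: int
    hourly_cost: float  # made-up relative cost, for ranking only


# Hypothetical catalog, for illustration only.
CATALOG: List[NodeType] = [
    NodeType("general-small", cpus=2, ram_gb=8, hourly_cost=0.10),
    NodeType("general-large", cpus=8, ram_gb=32, hourly_cost=0.40),
    NodeType("memory-large", cpus=8, ram_gb=64, hourly_cost=0.50),
    NodeType("memory-xlarge", cpus=16, ram_gb=128, hourly_cost=1.00),
]


def pick_node(cpus: int, ram_gb: int,
              catalog: List[NodeType] = CATALOG) -> Optional[NodeType]:
    """Return the cheapest node type that satisfies the request, or None."""
    fitting = [n for n in catalog if n.cpus >= cpus and n.ram_gb >= ram_gb]
    if not fitting:
        return None
    return min(fitting, key=lambda n: n.hourly_cost)


if __name__ == "__main__":
    # A memory-hungry job that needs 4 CPUs and 48 GB of RAM.
    print(pick_node(cpus=4, ram_gb=48))
```

Under these assumptions, a memory-heavy job is routed to a memory-optimized node rather than to the first node with enough CPUs, which is the kind of per-job choice the table attributes to on-demand node provisioning.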

Components

The main components of the Cloud Pipeline are shown below:
[Figure: CP_components]