Cloud Pipeline Introduction
Why Cloud Pipeline
The Cloud Pipeline solution from EPAM provides an easy and scalable way to perform a wide range of analysis tasks in a cloud environment.
This solution takes the best of two approaches: classic HPC solutions (based on the GridEngine family of schedulers) and SaaS cloud solutions.
Feature | Classic "HPC" | "Cloud Pipeline" | SaaS |
---|---|---|---|
Pipeline customization | High. Direct scripting - any level of customization | High. Direct scripting - any level of customization | Low. Provides only specific pipeline definition languages, or even only a graphical editor |
Backward compatibility | N/A | High. "Classic" HPC scripts and NGS tools can be run without any changes | Low. Scripts have to be rewritten according to the supported languages and storage structures |
User Interface | Command Line | Graphical Interface for User interaction and a Command Line Interface for automation scripts | Graphical Interface |
Calculation power scalability | Low. New nodes have to be deployed and supported on-premises. Idle nodes still consume resources | High. New nodes are started according to the job request and terminated as soon as they are no longer needed. Each job can precisely define the required CPU/RAM/Disk resources or even select an optimal node type to speed up execution (e.g. memory-optimized nodes for cellranger pipelines) | High. Scalable like "Cloud Pipeline", but sometimes limits the user to predefined node setups |
Deployment and vendor lock-in | Deployed on-premises and introduces no vendor lock-in | Can be deployed in AWS/GCP/Azure or on-premises, and thus introduces no vendor lock-in | Consumed as an Internet service (no on-premises deployment available); all processes are tied to this specific vendor |
Security | High. All data and analysis processes are located in a controlled network. | High. All data and analysis processes are located in a controlled cloud VPC. All security configuration is performed by the user's security officers. | Low. No direct control over security configuration. The SaaS vendor has full access to the data storage. |
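
The per-job resource selection mentioned in the scalability row can be illustrated with a minimal Python sketch. It is not Cloud Pipeline's actual scheduling logic or API: the `NodeType` and `JobRequest` classes, the node names, and the cost figures are all hypothetical, and disk size is left out for brevity. The sketch only shows the general idea of choosing the cheapest node type that satisfies a job's CPU and RAM request.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class NodeType:
    name: str           # hypothetical label, not a real cloud instance name
    cpus: int
    ram_gb: int
    hourly_cost: float  # made-up relative cost, for illustration only


@dataclass(frozen=True)
class JobRequest:
    cpus: int
    ram_gb: int


# Hypothetical catalog of node types; a real deployment would use the
# instance types offered by its cloud provider.
CATALOG = (
    NodeType("general-small", cpus=4, ram_gb=16, hourly_cost=1.0),
    NodeType("general-large", cpus=16, ram_gb=64, hourly_cost=4.0),
    NodeType("memory-optimized", cpus=8, ram_gb=128, hourly_cost=3.0),
)


def pick_node(job: JobRequest,
              catalog: Sequence[NodeType] = CATALOG) -> Optional[NodeType]:
    """Return the cheapest node type that fits the job, or None if none fits."""
    candidates = [
        node for node in catalog
        if node.cpus >= job.cpus and node.ram_gb >= job.ram_gb
    ]
    return min(candidates, key=lambda node: node.hourly_cost, default=None)


if __name__ == "__main__":
    # A memory-hungry job that only the memory-optimized node can hold.
    print(pick_node(JobRequest(cpus=8, ram_gb=100)))
```

In this sketch a job asking for 8 CPUs and 100 GB of RAM lands on the memory-optimized node type, while a small job falls back to the cheapest general-purpose one.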
Components
The main components of the Cloud Pipeline are shown below: