An HPC cluster is a high-performance, parallel computing infrastructure that consists of three key components: compute, network, and storage. On a cluster you can take advantage of multiple cores by running several instances of a program at once or by using a parallelized version of the program. The lilac cluster is appropriate for all types of workloads, including genomic analysis, artificial intelligence, and machine learning, and it is available to all users at MSKCC. The juno cluster is dedicated to genomic pipelines and analysis, and access to it is limited. You can find more information about the two clusters at http://hpc.mskcc.org/compute-accounts/
All access to the clusters is via SSH to the login nodes. From a login node you can view files and dispatch jobs to compute nodes on the private network. IBM Spectrum LSF is the job scheduler we use to manage these jobs. All nodes on a cluster mount a shared GPFS filesystem. Each node also has a local 1 TB /scratch drive for temporary data. We also provide special data transfer servers with optimized network connections for moving large data sets to and from the clusters.
How do job schedulers work?
On an HPC cluster, the scheduler manages which jobs run where and when. On our clusters, you control your jobs through a job scheduling system called IBM Spectrum LSF, which allocates and manages compute resources for you. You can submit your jobs in one of two ways. For testing and small jobs, you may want to run a job interactively so that you can work directly on the compute node(s) in real time. The other way, which is preferred for multiple jobs or long-running jobs, is to write your job commands in a script and submit that script to the job scheduler.
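As a rough illustration of the two submission styles described above, the Python sketch below renders a batch script with #BSUB directives and builds an interactive bsub command. The helper names, job name, resource values, and the my_analysis program are hypothetical. The meaning of the number in rusage[mem=...] depends on how LSF is configured at a site, so check the cluster's defaults before relying on it.

```python
import shlex


def render_job_script(job_name, cores, walltime, mem, commands):
    """Return the text of an LSF batch script built from #BSUB directives."""
    header = [
        "#!/bin/bash",
        f"#BSUB -J {job_name}",           # job name
        f"#BSUB -n {cores}",              # number of slots (cores)
        f"#BSUB -W {walltime}",           # wall-clock limit, hours:minutes
        f'#BSUB -R "rusage[mem={mem}]"',  # memory reservation; units are site-configured
        "#BSUB -o %J.out",                # stdout file; %J expands to the job ID
        "#BSUB -e %J.err",                # stderr file
        "",
    ]
    return "\n".join(header + list(commands)) + "\n"


def interactive_command(cores=1, walltime="1:00"):
    """Return the argument list for an interactive shell on a compute node."""
    # -Is asks LSF for an interactive job with a pseudo-terminal.
    return ["bsub", "-Is", "-n", str(cores), "-W", walltime, "/bin/bash"]


# Hypothetical example values, for illustration only.
print(render_job_script("demo", 4, "2:00", 8, ["./my_analysis input.txt"]))
print(shlex.join(interactive_command(cores=2, walltime="0:30")))
```

Saving the rendered text to a file such as job.lsf (a hypothetical name) and running `bsub < job.lsf` from a login node would submit it as a batch job. The second printed line is the command you would type for an interactive session.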
Table with a list of common bsub parameters and their defaults
How do I find out what resources I need to request?
Running your first job
Getting information about your jobs
We will be updating our LSF documentation soon.
Where can I find a basic linux tutorial?
All access to the clusters is via SSH keys. We recommend that you use authentication forwarding. If you have trouble connecting, please read the SSH page.
Your home directory has a 100 GB quota. Please use your lab's data directory, /data/labname, on the high-performance GPFS filesystem for datasets and analysis. Each compute node has more than 1 TB of local scratch disk. We also have /warm storage, which is lower performance and not suitable for computation. This storage is not backed up.
Information about our storage offerings is at http://hpc.mskcc.org/data-storage/
How do I transfer data to and from the cluster?
Each cluster has a special data transfer (xfer) server with a faster network connection optimized for transferring data. They are called lilac-xfer01.mskcc.org and juno-xfer02.mskcc.org. To use them, SSH to the appropriate server and start your data transfers from there.
Available software and specifying paths
Links to using miniconda etc
Data can be transferred from other Linux or macOS systems using rsync over SSH.
Data can be transferred from Windows or Samba shares using smbclient.
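To make the rsync route concrete, here is a minimal Python sketch that assembles an rsync-over-SSH command aimed at a data transfer server. The function name, the user name, and both paths are hypothetical. The host is the juno transfer server named above, and the rsync options shown are one reasonable choice rather than a site requirement.

```python
import shlex


def rsync_command(source, user, host, dest, dry_run=False):
    """Return an rsync argument list that copies `source` to `dest` on `host` over SSH."""
    # -a preserves permissions and timestamps, -v lists files as they are copied,
    # --partial keeps partially transferred files so an interrupted copy can resume.
    cmd = ["rsync", "-av", "--partial", "-e", "ssh"]
    if dry_run:
        cmd.append("--dry-run")  # show what would be copied without copying it
    cmd += [source, f"{user}@{host}:{dest}"]
    return cmd


# Hypothetical user name and paths, for illustration only.
cmd = rsync_command("results/", "someuser", "juno-xfer02.mskcc.org",
                    "/data/labname/results/", dry_run=True)
print(shlex.join(cmd))
```

Running the printed command with --dry-run first is a cheap way to confirm the source and destination paths before moving a large data set.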
What software is available?
How can I be a good citizen on the cluster?
Don't run compute jobs on the login nodes.
Do not use /tmp as scratch space. Use the local /scratch drive instead, and clean up any scratch data you have generated when your job finishes.
Use the data transfer servers for transferring data on and off the clusters.
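One way to follow the scratch guideline above from a Python job is to create a private directory under /scratch and remove it when the work finishes, even if the job raises an error. This is a sketch under assumptions: the job_scratch helper is hypothetical, and it assumes the job can write directly under /scratch. LSB_JOBID is the environment variable LSF sets to the job ID inside a running job.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager


@contextmanager
def job_scratch(root="/scratch"):
    """Create a per-job directory under `root` and delete it on exit."""
    # Fall back to a placeholder label when run outside of an LSF job.
    job_id = os.environ.get("LSB_JOBID", "nojob")
    path = tempfile.mkdtemp(prefix=f"{job_id}_", dir=root)
    try:
        yield path
    finally:
        # Runs on success and on error, so scratch data never lingers.
        shutil.rmtree(path, ignore_errors=True)


# Typical use inside a job script (not executed here):
#
#     with job_scratch() as workdir:
#         intermediate = os.path.join(workdir, "intermediate.dat")
#         ...  # write temporary files, then copy final results elsewhere
```

Anything worth keeping has to be copied out of the scratch directory before the `with` block ends, because the directory is deleted on exit.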