Getting Started

Login via SSH

The cluster's primary login node is sc.stanford.edu. You can SSH to it directly if you are already on the Stanford network; if you are logging in from outside the Stanford network, you will need to use the Stanford VPN (Full Tunnel) or hop through another node that has a public network interface (e.g. scdt.stanford.edu).
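If you routinely connect from off-campus, an SSH ProxyJump entry saves you the manual hop through scdt.stanford.edu. A minimal ~/.ssh/config sketch (replace CSID with your own username; the host names are the ones above):

```
Host scdt
    HostName scdt.stanford.edu
    User CSID

Host sc
    HostName sc.stanford.edu
    User CSID
    ProxyJump scdt
```

With this in place, `ssh sc` from outside the Stanford network tunnels through SCDT automatically.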

Important!

Do NOT run resource-intensive processes on the sc headnode (no VS Code, IPython, TensorBoard, etc.); they will be killed automatically.

WebGUI via Open OnDemand

Open OnDemand is an open-source HPC web portal. It allows users to interact with the cluster via a modern web interface. Please be mindful that it is intended for convenience over function; SSH is still the default way of interacting with the cluster.

You can access the SC cluster's Open OnDemand portal at https://sc.stanford.edu, where you will be prompted to log in with your CSID.

Group Affiliation (SLURM Account)

The SC cluster is set up as a "condominium-type" cluster, meaning each group has its own compute resources and only users affiliated and/or collaborating with the group have access. Each user on the SC cluster is therefore linked to one or more accounts according to their association. When submitting a job, users must set the --account parameter in srun or in the sbatch script in order to use the compute resources defined in the group's SLURM partition (or queue).
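As a sketch, a minimal sbatch script with the required --account parameter might look like the following. The account, partition, and resource values here are placeholders; substitute the ones for your group:

```
#!/bin/bash
#SBATCH --account=mygroup        # placeholder: your group's SLURM account
#SBATCH --partition=mygroup      # placeholder: your group's partition (queue)
#SBATCH --time=01:00:00
#SBATCH --mem=8G
#SBATCH --output=%x-%j.out       # stdout/stderr file, e.g. in your home directory

python train.py
```

The same flag applies to interactive jobs, e.g. `srun --account=mygroup --partition=mygroup --pty bash`.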

Update Group Affiliation

Please fill out the Group Affiliation Update Form when you need to change or add a group affiliation, most commonly when you are rotating through different groups.

Home Directory and Storage

Each user has a home directory at /sailhome/$CSID with a 20GB quota. This space is meant as a landing area for the cluster and is most commonly used for job submission scripts, jobs' stdout/stderr files, and perhaps a small virtual environment. Users can customize their shell environment via the usual rc files (e.g. .bashrc and .zshrc). If you want to change your default shell, please log in to PEDIT and change it there. Do not store research datasets in this space; we will also ignore any request for a higher quota. Users should rely on the central storage provided by their group.
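To keep track of the 20GB quota, a generic check is to summarize your home directory's usage with du (assuming no dedicated quota command is exposed on the cluster; du works on any filesystem):

```shell
# Summarize total usage of your home directory; on the cluster $HOME
# is /sailhome/$CSID. -s gives one total, -h makes it human-readable.
du -sh "$HOME"
```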

Every group on the cluster has its own primary central storage system, and users are expected to use it as their primary storage on the cluster (from virtual environments to datasets). The storage architecture differs from group to group, as do the preferred layout and ground rules, so please discuss this with your group/advisor. If you have any questions, please contact action@cs.stanford.edu and we'll try our best to help.

Back up your data

Your home directory on /sailhome/$CSID is snapshotted daily, and snapshots are kept for a short period of time. Most other group storage servers are NOT backed up at all, since it is not sustainable to do so for petabytes of data, much of which is reproducible or duplicated (datasets). You are responsible for making sure important and difficult-to-reproduce data is backed up; if you need help, please contact action@cs.stanford.edu

Data-transfer (SCDT)

We have a designated host for data transfer: scdt.stanford.edu. While we allow long-running scripts for various download methods, please keep any parallelism to a minimum, as this node tends to get very busy (always check top to get a sense of the load and adjust); your processes will also be throttled accordingly. For downloads from various cloud providers, we suggest rclone, which is installed on SCDT. More details can be found at https://rclone.org/
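As a sketch of the rclone workflow (the remote name "s3remote" and the bucket/path below are placeholders; you define the remote once through rclone's interactive setup):

```
# One-time setup: define a cloud remote via interactive prompts.
rclone config

# List what's in the bucket, then copy a dataset to group storage.
rclone ls s3remote:my-bucket/dataset
rclone copy s3remote:my-bucket/dataset /path/to/group/storage/dataset
```

Leave rclone's transfer parallelism at or below its defaults on SCDT, per the note above about keeping parallelism minimal.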

We also allow VS Code, IPython, and TensorBoard to run on SCDT to an extent; please note these processes will be wiped every 24 hours.