Getting Started
Maintenance Schedule
Getting Access
RSA Software Token
CAC Access
Accessing the Systems
Cloud Computing
RDHPCS Maintenance Calendar
Getting Help
Submitting a Good Help Request
Use a Good Subject
Provide Detailed Description of the Problem
Provide Job Information
Describe How to Reproduce the Problem
Only Report One Problem Per Help Ticket
Follow up With Additional Information or Questions
Required Information for Specific Types of Help
Basic Ticket Information
File System Problems
Compilation Problems
Job Submission Problems
Job Completion Problems
Providing a Reproducer
Reporting Data Transfer Issues
Managing Help Tickets
Help Ticket System User Portal
Login
Reply to a Ticket
Search for a Ticket
Create a New Ticket
Accounts
Getting an RDHPCS Account
What NOAA Does
What You Do
Accessing RDHPCS Systems
Access and Identity Management (AIM)
Secure Shell (SSH) Access
Common Access Card (CAC)
Updating CAC Information in AIM
RSA Token
RSA Software Token Activation
New Device
RSA Hardware Token Activation
Other Authentications
Account Suspension, Deactivation, Reactivation
Deactivated Accounts
Role Accounts
Accessing a Role Account
X Applications With Role Accounts
Using CRON
Request Additional Projects
RDHPCS X.509 Certificates
Generating a Master Certificate
Resetting Master Certificate Passphrase
Quickstart for New Users
Getting Access
RSA Software Token
Accessing the RDHPCS Systems
Tectia SSH solution
Tectia Initial Setup procedure
Common Access Card (CAC) Login
Tectia Initial Setup Procedures
Role Accounts
First Time RSA token Login
Overview: Getting an Internal Account - RDHPCS
Overview: Getting an External Account - MSU-HPC
General Access Requirements
Logging In
Obtaining an Account
First Time Login
Connecting
Connecting with a CAC
Connecting with an RSA token
Selecting a Node
X11 Graphics
SSH Port Tunnels
Connecting
Compiling
Running
Staging/Combining
Transferring Data to/from Gaea
Allocation
Running a Simple Job
Running the Script
Once the job is submitted
Once the job is Finished
Systems
General Information
RDHPCS Platforms
Gaea User Guide
System Overview
GAEA Quickstart
Connecting and General Info
Compiling
Running
Staging/Combining
Transferring Data to/from Gaea
Allocation
Running a Simple Job Script
System Architecture
Node Types
Clusters
What is C5?
Job Submission
Queues
Job Monitoring
Terminology
Environment
Do’s and Don’ts
File Systems
Summary of Storage Areas
HOME
Allocations and Quotas
Modules
LMOD
LMOD Search Commands
Adding Additional Module Paths
Module Commands
Compilers
Available Compilers
Compilers on C5
Cray Compiler Wrappers
Compiling and Node Types
Controlling the Programming Environment
Compiling Threaded Codes
Hardware
c5 partition
es partition
Queue Policy
Debug & Batch Queues
Priority Queues
Queues per Partition
Scheduler/Priority Specifics
Slurm Queueing System
Useful Commands
Running your models
Monitoring your jobs: Shell Setup
Fair Share Reporting
Data Transfers
Available Tools
f5 <-> f5
Gaea <-> GFDL
Gaea <-> Remote NOAA Site
Gaea <-> External
Gaea <-> Fairmont HPSS
External (Untrusted) Data Transfers
GCP
User Guide
Smartsites
CAC bastions refusing login attempts without asking for PIN
Shell hang on login
Lustre (F2) Performance
RDHPCS Cloud Computing
Parallel Works User Guide
NOAA’s Parallel Works Portal
Workflow
Data Transfers
Getting Help
Training Videos
Beginner’s Guide to NOAA’s HPC Cloud
Parallel Works
Cloud Success Stories
Office Hours
Monthly Utilization Reports
FY2024 Usage
Frequently Asked Questions
General Cloud Issues
How do I get a project allocation or an allocation increase?
Storage functionalities
Parallel Works
Clusters and snapshots
Slurm
Errors
Miscellaneous
Jet User Guide
GPU Clusters
EVERYTHING BELOW THIS LINE IS IN FLUX
About Modules
Using Math Libraries
Options for Editing on Jet
Starting a Parallel Application
Policies and Best Practices
System Software
Using OpenMP and Hybrid OpenMP/MPI on Jet
Hera User Guide
About NESCC
System Overview
System Configuration
Lustre File System Usage
Lustre Volume and File Count
Lustre
Hera Lustre Configuration
File Operations
Types of file I/O
File Striping
Userspace Commands
Applications and Libraries
Using Anaconda Python on Hera
MATLAB
Using IDL on Hera
Using ImageMagick on Hera
Using R on Hera
Libraries
Using Modules
Using MPI
Loading the MPI module
Using PGI and mvapich2
Tuning MPI (TBD)
Profiling an MPI application with Intel MPI
Debugging Codes
Debugging Intel MPI Applications
Application Debuggers
Invoking DDT on Hera with Intel IMPI
Profiling Codes
Linaro Forge
TAU
Managing Contrib Projects
Fine Grain Architecture (FGA) System
System Information
Getting an allocation for FGA resources
Using FGA resources without an allocation
User Environment
Compiling and Running Codes on the FGA
Compiling and Running Codes Using CUDA
Compiling and Running Codes Using Intel MPI
Compiling and Building Codes Using mvapich2-gdr Library
Compiling and Building Codes Using OpenMPI
Compiling codes with OpenACC directives on Hera
Compiling MPI codes with OpenACC directives on Hera
Submitting Batch Jobs to the FGA System
Hints on Rank Placement/Performance Tuning
Rank placement when using mvapich2
Using Nvidia Multi-Process Service
Compiling and Building Codes With The Cray Programming Environment
Some helpful web resources
Getting Help
Niagara User Guide
System Overview
Data Transfer
Per User Data Management on Niagara
Lustre File System Usage
Components
Configuration
File Operations
MSU-HPC User Guide
Introduction
MSU’s Official HPC Documentation
General Information
Logging In
Running Jobs on MSU-HPC Systems
Submitting a Job
Specifying a Partition
Monitoring Jobs
Getting Information about your Projects
MSU-HPC System Configuration
File Systems
Orion Compute System
Hercules Compute System
Account Management
Overview
MSU Account Management Policies
Managing Project and Role Account Members
NOAA Portfolio, Project, and User Management on MSU-HPC
Getting An Account
Account Renewal
Managing Portfolios, Projects and Allocation
Role Accounts
Help, Policies, Best Practices, Issues
MSU-HPC Help Requests
Policies and Best Practices
Protecting Restricted Data
MSU FAQ
HSMS HPSS User Guide
NESCC HPSS
Gaining Access to use HPSS
New HPSS User Requests
Adding New Projects to HPSS
NESCC HPSS Data Structure
Data Retention
Expired Data Deletion Process
PPAN User Guide
About Archrpt
Report Option [-r]
Summary Option [-s]
Group Quotas
User Quotas
Enforcing Quotas
Data Storage and Transfers
Summary of Storage Areas
Notes on User-Centric Data Storage
User Home Directories (NFS)
User Archive Directories (PAN Only)
Notes on Project-Centric Data Storage
Project Home Directories (NFS)
Project Work Areas
Project Archive Directories
NESCC HPSS
Gaining Access to use HPSS
New HPSS User Requests
Adding New Projects to HPSS
NESCC HPSS Data Structure
Portfolios Using HPSS
Data Retention
Expired Data Deletion Process
File Size Guidelines
Data Recovery Policy
Getting Started
HTAR
HSI
File Expiration Commands
Sample HPSS Batch Job
HPSS Help
GFDL Archive
Gaining Access to use the GFDL Archive
GFDL Archive Data Structure
Data Retention
Data Recovery Policy
Getting Started
Allocation and Quota
Finding Files
GFDL Archive Help
Globus Online Data Transfer
Overview
Example
RDHPCS Globus Collection Summary
NOAA RDHPCS Globus Endpoint Types
NOAA RDHPCS UDTN’s (Globus Untrusted Endpoint)
NOAA RDHPCS Object Stores in the Cloud
Globus Command Line Interface (CLI)
Transferring Data to and from Your Computer
Globus Example
What you need to have on hand
What you need to do
Using Globus Online Data Transfer
NOAA RDHPCS Globus Endpoint Types
NOAA RDHPCS UDTN’s (Globus Untrusted Endpoint)
NOAA RDHPCS Object Stores in the Cloud
Globus Command Line Interface (CLI)
Transferring Data to and from Your Computer
GFDL Data Services
GFDL Data Digital Object Identifier (DOI) Policy
Modules
View Active Modules
Find Modules
Load Modules
Adding Additional Module Paths
Modules with sh, bash, and ksh scripts
Why doesn’t the module command work in shell scripts?
Command Summary
Policies and Best Practices
System Usage
Login Node Usage
Cron Usage
Cron Job Frequency
File System Usage Practices and Policies
High Performance File System (HPFS - Scratch)
General Parallel File System (GPFS)
/data_untrusted
HFS
Filesystem Backup and Data Retention
Recover Recently Deleted Files from /home
HPSS (Data Retention)
Expired Data Deletion Process
Data Recovery Policy
Data Disposition
HPFS (Scratch) Data
Niagara Per User Data
Home File System (HFS) Data
Managing Packages in /contrib
Overview of contrib Packages
Responsibilities of a contrib Package Maintainer
contrib Packages Guidelines
contrib Package Maintainer Requests
Managing a contrib Package
Maintaining “Metadata” for contrib Packages
contrib Package Directory Naming Conventions
Queue Policy
Overview
Specifying a Quality of Service (QOS)
Changing QOS’s
Jet and Hera
Gaea
General Recommendations
Priorities Between QOS
Debug & Batch QOS
Software
Python on RDHPCS Systems
Overview
Python Guides
Conda Basics
Installing Miniconda
Jupyter on RDHPCS Systems
Module Usage
Base Environment
Custom Environments
How to Run
RDHPCS Compute Nodes
Best Practices
Additional Resources
Tectia
Tectia Initial Setup procedure
Installing and Configuring Tectia
Install the Tectia Client
Configure the Tectia Client
Port Tunnelling
Set Up Port Tunnelling
Testing Port Tunnels
Slurm
Running a Job
Batch Scripts
Interactive Jobs
Common sbatch Options
Slurm Environment Variables
State Codes
Job Reason Codes
Job Dependencies
Srun
Heterogeneous Jobs
Monitoring Jobs
Show Pending and Running Jobs
Show Completed Jobs
Getting Details About a Job
Priority and Fairshare
Understanding Slurm Fairshare
Fairshare Priority Factor
Fairshare Definitions
Fairshare Reporting
Priority Reporting
Getting Information About Your Projects
sfairshare
saccount_params
Generating Reports
Sreport
Use Cases
Report Descriptions and Formats
Time Formats
References
Shpcrpt
Use Cases
References
Contributing to these docs
Submitting suggestions
Authoring content
Setup authoring environment
Edit the docs
Resources
GitHub Guidelines
NOAA RDHPCS User Documentation
Index