Advanced research computing on the Great Lakes Cluster


OVERVIEW

This workshop will cover more advanced topics in computing on the U-M Great Lakes Cluster. Topics include a review of common parallel programming models and basic use of Great Lakes; dependent and array scheduling; workflow scripting using bash; high-throughput computing using launcher; parallel processing in one or more of Python, R, and MATLAB; and profiling of parallel code using Allinea Performance Reports and Allinea MAP.

 

Please register at https://ttc.iss.lsa.umich.edu/ttc/sessions/advanced-research-computing-on-the-great-lakes-cluster-12/register/

Introduction to Research Computing on the Great Lakes Cluster


OVERVIEW

This workshop will introduce you to high performance computing on the Great Lakes cluster. After a brief overview of the components of the cluster and the resources available there, the main body of the workshop will cover creating batch scripts, the options available for running jobs, and hands-on experience in submitting, tracking, and interpreting the results of submitted jobs. By the end of the workshop, every participant should have created a submission script, submitted a job, tracked its progress, and collected its output. Additional tools, including high-performance data transfer services and interactive use of the cluster, will also be covered.
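As a preview of the hands-on portion, a minimal Slurm batch script might look like the sketch below; the job, account, and partition names are placeholders for this example rather than values specific to Great Lakes.

#!/bin/bash
# Minimal example batch script (illustrative sketch only; the account and
# partition names below are placeholders, so substitute your own values).
#SBATCH --job-name=example_job
#SBATCH --account=example_account
#SBATCH --partition=standard
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --time=00:10:00
#SBATCH --mem=1G

# Report where the job ran and its job ID.
echo "Running on host: $(hostname)"
echo "Job ID: ${SLURM_JOB_ID}"

A script like this would be submitted with sbatch, monitored with squeue, and its output collected from the slurm-<jobid>.out file it produces by default.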

To register and view more details, please refer to the linked TTC page.

Introduction to the Linux Command Line


OVERVIEW

This course will familiarize the student with the basics of accessing and interacting with Linux computers using the GNU/Linux operating system’s Bash shell, also generically referred to as “the command line”. Topics include: a brief overview of Linux, the Bash shell, navigating the file system, basic commands, shell redirection, permissions, processes, and the command environment. The workshop will also provide a quick introduction to nano, a simple text editor that will be used in subsequent workshops to edit files.
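To give a flavor of the material, the short sequence below illustrates a few basic commands and shell redirection of the kind covered in the workshop; the file and directory names are invented for the example.

# Make a directory and move into it.
mkdir workshop_example
cd workshop_example

# Create a small text file by redirecting output into it, then append a line.
echo "hello from the command line" > greeting.txt
echo "a second line" >> greeting.txt

# Display the file and list the directory contents, including permissions.
cat greeting.txt
ls -l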

 

To register and view more details, please refer to the linked TTC page (https://ttc.iss.lsa.umich.edu/ttc/sessions/introduction-to-the-linux-command-line-33/).

Data Sharing and Archiving


OVERVIEW

As data volumes grow, how we manage data becomes more important. This session will cover the basics of managing data in research environments such as those at ARC and at national facilities. Attendees will be introduced to recommended tools for data sharing and transfer on campus, off campus, and in the cloud. They will learn how to prepare data for archiving, including high-performance versions of tar and compression tools that offer significant performance benefits over the standard versions.
Lastly, we will cover the properties of, and how to select, appropriate general-purpose storage for data that requires long-term preservation and active archiving, supporting the largest data volumes while controlling costs and simplifying management.
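As a simple point of reference, a standard archive-and-compress step is shown below, along with one common parallel-compression variant; the directory name is a placeholder, and pigz is given only as an example of a parallel compressor, not necessarily the specific tool taught in the session.

# Create a gzip-compressed archive of a project directory using standard tools.
tar -czf project_data.tar.gz project_data/

# The same archive using the parallel compressor pigz (if installed) so that
# compression can use multiple cores.
tar --use-compress-program=pigz -cf project_data.tar.gz project_data/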
Prerequisites: basic command-line experience.
To register and view more details, please refer to the linked TTC page.  

 

Advanced research computing on the Great Lakes Cluster


OVERVIEW

This workshop will cover more advanced topics in computing on the U-M Great Lakes Cluster. Topics include a review of common parallel programming models and basic use of Great Lakes; dependent and array scheduling; workflow scripting using bash; high-throughput computing using launcher; parallel processing in one or more of Python, R, and MATLAB; and profiling of parallel code using Allinea Performance Reports and Allinea MAP.

 

Please register at https://ttc.iss.lsa.umich.edu/ttc/sessions/advanced-research-computing-on-the-great-lakes-cluster-11/

Data Sharing and Archiving


OVERVIEW

As data volumes grow, how we manage data becomes more important. This session will cover the basics of managing data in research environments such as those at ARC and at national facilities. Attendees will be introduced to recommended tools for data sharing and transfer on campus, off campus, and in the cloud. They will learn how to prepare data for archiving, including high-performance versions of tar and compression tools that offer significant performance benefits over the standard versions.
Lastly, we will cover the properties of, and how to select, appropriate general-purpose storage for data that requires long-term preservation and active archiving, supporting the largest data volumes while controlling costs and simplifying management.
Prerequisites: basic command-line experience.
To register and view more details, please refer to the linked TTC page.

 

Introduction to Research Computing on the Great Lakes Cluster


OVERVIEW

This workshop will introduce you to high performance computing on the Great Lakes cluster. After a brief overview of the components of the cluster and the resources available there, the main body of the workshop will cover creating batch scripts, the options available for running jobs, and hands-on experience in submitting, tracking, and interpreting the results of submitted jobs. By the end of the workshop, every participant should have created a submission script, submitted a job, tracked its progress, and collected its output. Additional tools, including high-performance data transfer services and interactive use of the cluster, will also be covered.

To register and view more details, please refer to the linked TTC page.

Introduction to the Linux Command Line


OVERVIEW

This course will familiarize the student with the basics of accessing and interacting with Linux computers using the GNU/Linux operating system’s Bash shell, also generically referred to as “the command line”. Topics include: a brief overview of Linux, the Bash shell, navigating the file system, basic commands, shell redirection, permissions, processes, and the command environment. The workshop will also provide a quick introduction to nano, a simple text editor that will be used in subsequent workshops to edit files.

 

To register and view more details, please refer to the linked TTC page.

XSEDE: Python Tools for Data Science


OVERVIEW

Python has become a very popular programming language and software ecosystem for work in Data Science, integrating support for data access, data processing, modeling, machine learning, and visualization. In this webinar, we will describe some of the key Python packages that have been developed to support that work, and highlight some of their capabilities. This webinar will also serve as an introduction and overview of topics addressed in two Cornell Virtual Workshop tutorials, available at https://cvw.cac.cornell.edu/pydatasci1 and https://cvw.cac.cornell.edu/pydatasci2 .

See https://portal.xsede.org/course-calendar/-/training-user/class/2467/session/4161 for more information and registration.

 

Register via the XSEDE Portal:

If you do not currently have an XSEDE Portal account, you will need to create one:

https://portal.xsede.org/my-xsede?p_p_id=58&p_p_lifecycle=0&p_p_state=maximized&p_p_mode=view&_58_struts_action=%2Flogin%2Fcreate_account

Should you have any problems with that process, please contact help@xsede.org and they will provide assistance.

 

Advanced research computing on the Great Lakes Cluster


OVERVIEW

This workshop will cover more advanced topics in computing on the U-M Great Lakes Cluster. Topics include a review of common parallel programming models and basic use of Great Lakes; dependent and array scheduling; workflow scripting using bash; high-throughput computing using launcher; parallel processing in one or more of Python, R, and MATLAB; and profiling of parallel code using Allinea Performance Reports and Allinea MAP.
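As an illustration of dependent and array scheduling with Slurm, the sketch below submits an array of tasks and a follow-on job that waits for them; the batch script names are placeholders used only for this example.

# Submit an array of 10 tasks; each task can read its index from SLURM_ARRAY_TASK_ID.
jobid=$(sbatch --parsable --array=1-10 preprocess.sbat)

# Submit a second job that starts only after all array tasks complete successfully.
sbatch --dependency=afterok:${jobid} analyze.sbat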

PRE-REQUISITES

This course assumes familiarity with the Linux command line such as might be gained from the CSCAR/ARC-TS workshop Introduction to the Linux Command Line. In particular, participants should understand how files and folders work, be able to create text files using the nano editor, be able to create and remove files and folders, and understand what input and output redirection are and how to use them.

INSTRUCTORS

Dr. Charles J Antonelli
Research Computing Services
LSA Technology Services

Charles is a member of the LSA Technology Services Research team at the University of Michigan, where he is responsible for high performance computing support and education, and was an Advocate to the Departments of History and Communications. Prior to this, he built a parallel data ingestion component of a novel earth science data assimilation system and a secure packet vault, and worked on the No. 5 ESS Switch at Bell Labs in the 1980s. He has taught courses in operating systems, distributed file systems, C++ programming, security, and database application design.

John Thiels
Research Computing Services
LSA Technology Services

MATERIALS

COURSE PREPARATION

In order to participate successfully in the workshop exercises, you must have a user login and a Slurm account, and be enrolled in Duo. The user login allows you to log in to the cluster; create, compile, and test applications; and prepare jobs for submission. The Slurm account allows you to submit those jobs, executing the applications in parallel on the cluster and charging their resource use to the account. Duo is required to help authenticate you to the cluster.

USER LOGIN

If you already have a Great Lakes user login, you don’t need to do anything.  Otherwise, go to the Great Lakes user login application page at: http://arc-ts.umich.edu/login-request/ .

Please note that obtaining a user account requires human processing, so be sure to do this at least two business days before class begins.

SLURM ACCOUNT

We create a Slurm account for the workshop so you can run jobs on the cluster during the workshop and for one day afterward, for those who would like additional practice. The workshop job account is quite limited and is intended only to run examples that help you cement the details of job submission and management. If you already have an existing Slurm account, you can use that, though if there are any issues with that account, we will ask you to use the workshop account.

DUO AUTHENTICATION

Duo two-factor authentication is required to log in to the cluster. When logging in, you will need to type your UMICH (AKA Level 1) password as well as authenticate through Duo in order to access Great Lakes.

If you need to enroll in Duo, follow the instructions at Enroll a Smartphone or Tablet in Duo.

Please enroll in Duo before you come to class.

LAPTOP PREPARATION

You will need VPN software to access the U-M network.  If you do not have VPN software already installed, please download and install the Cisco AnyConnect VPN software following these instructions.  You will need VPN to be able to use the ssh client to connect to Great Lakes. Please use the ‘Campus All traffic’ profile in the Cisco client.

You will need an ssh client to connect to the Great Lakes cluster. Mac OS X and Linux platforms have this built-in. Here are a couple of choices for Windows platforms:

  • Download and install U-M PuTTY/WinSCP from the Compute at the U website. This includes both the PuTTY ssh client and terminal emulator and a graphical file transfer tool in one installer.  This document describes how to download and use this software, except please note you will be connecting to greatlakes.arc-ts.umich.edu instead of the cited host.  You must have administrative authority over your computer to install this software.
  • Download PuTTY directly from the developer. Download the putty.exe application listed under “Alternative binary files”, then execute the application.  You do not need administrative authority over your computer to use this software.

Our Great Lakes User Guide in Section 1.2 describes in more detail how to use PuTTY to connect to Great Lakes.
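On Mac OS X or Linux, where an ssh client is built in, connecting and copying a file look like the sketch below; replace uniqname with your own U-M uniqname, and note that the file name is a placeholder.

# Connect to the Great Lakes login node.
ssh uniqname@greatlakes.arc-ts.umich.edu

# Copy a local file to your home directory on the cluster.
scp myjob.sbat uniqname@greatlakes.arc-ts.umich.edu:~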

Please prepare and test your computer’s ability to make remote connections before class; we cannot stop to debug connection issues during the class.

A Zoom link will be provided to the participants the day before the class. Registration is required. Please note this session will be recorded.

 

Please register at https://ttc.iss.lsa.umich.edu/ttc/sessions/advanced-research-computing-on-the-great-lakes-cluster-10/register/