Loading Events
  • This event has passed.

Hadoop and Spark Workshop

February 22 @ 1:00 pm - 5:00 pm

East Hall B250

Overview

Learn how to process large amounts (up to terabytes) of data using SQL and/or simple programming models available in Python, R, Scala, and Java. Computers will be provided to follow along with hands-on examples; users can also bring laptops.

Prerequisites

Intro to the Linux Command Line or equivalent. This course assumes familiarity with the Linux command line.

A user account on Flux. If you do not have a Flux user account, click here to go to the account application page at: https://arc-ts.umich.edu/fluxform/

Duo authentication.

Duo two-factor authentication is required to log in to the cluster. When logging in, you will need to type your UMICH password as well as authenticate through Duo in order to access Flux.

If you need to enroll in Duo, follow the instructions at Getting Started: How to Enroll in Duo.

click here to register

Instructor

Brock Palen
Director
ARC-TS

Brock has over 10 years of high performance computing and data intensive computing experience in an academic environment. He currently works with the team at ARC-TS to provide HPC, Data Science, storage, and other research computing services to the University. Brock also is the NSF XSEDE projects Campus Champion representing the schools to this and other national computing infrastructures and organizations.

Materials

Course Preparation

In order to participate successfully in the class exercises, you must have a Flux user account. The user account allows you to log in to the cluster, create, compile, and test applications, and transfer data into Hadoop’s filesystem for processing.

Flux user account

A single Flux user account can be used to prepare and submit jobs using various allocations. If you already already possess a user account, you can use it for this course, you can skip to “Flux allocation” below. If not, please visit https://arc-ts.umich.edu/fluxform to obtain one. A user account is free to members of the University community. Please note that obtaining an account requires human processing, so be sure to do this at least two business days before class begins.

Duo Authentication

Duo two-factor authentication is required to log in to the cluster. When logging in, you will need to type your UMICH password as well as authenticate through Duo in order to access Flux.

If you need to enroll in Duo, follow the instructions at Getting Started: How to Enroll in Duo.

More help

Please email hpc-support@umich.edu for questions, comments, or to seek further assistance.

Details

Date:
February 22
Time:
1:00 pm - 5:00 pm
Event Categories:
,
Event Tags:
,

Venue

East Hall B250
530 Church St.
Ann Arbor, MI 48109 United States

Organizer

CSCAR
Email:
cscar@umich.edu
Website:
cscar.research.umich.edu

Other

Register