U-M is now a Globus Provider, adding services for University users

By | General Interest, News

ARC-TS and ITS are pleased to announce that the University of Michigan is now a Globus Provider for the research community. Globus is a robust, cloud-based file transfer service designed specifically for moving many large files, ranging from tens of GB to tens of TB, between servers.

Standard features of Globus for all users include:

  • Transfers typically a factor of two or more faster than scp/sftp
  • Automatic restart or continuation of transfers when problems arise
  • Transfers happen in the background, so users do not need to remain logged in to a system
  • Transfers of large files between your laptop/desktop and servers via Globus Connect Personal
  • Data publication and discovery capabilities

The new Provider service for U-M adds:

  • Globus endpoints, which are now also available to owners of NFS volumes on MiStorage (formerly ValueStorage)
  • Sharing of server directories/folders with non-U-M collaborators who are also Globus users. This is for transfer/copy purposes rather than shared use of the server.
  • Sharing of directories/folders from laptops/desktops via Globus Plus account upgrade

For more detailed information on how to use Globus and these features, go to https://globus.org

For information on the process of setting up endpoints and linking to the U-M Provider features, go to the ARC-TS Globus web page.

In its third year, ICOS Big Data Summer Camp brings in students from a wider range of disciplines

By | Educational, General Interest, News
Aside from maybe a football game, where on the U-M campus can you go to see Sociology, Physics, Communications, Math, Education, Nursing, Economics, and Engineering students all in one place?

At least for a week this June, the answer was the third annual Big Data Summer Camp put on by the Interdisciplinary Committee on Organizational Studies (ICOS) and Advanced Research Computing (ARC). The class attracted students from those fields and others who wanted to learn the basics of mining Big Data for their research.

Todd Schifeling, lead instructor of the summer camp and a post-doc at the Erb Institute for Sustainability, said most of this year’s participants started with little experience in the tools needed to tackle Big Data but left knowing at least the basics. He said the camp appeals to a growing number of fields because of the increasing awareness of the value of publicly available datasets.

“People see the data that’s available, and they want to break new ground with these new datasets,” he said.

Professor of Management & Organizations and director of the Interdisciplinary Committee on Organizational Studies (ICOS) Gerald Davis said that the growing number of disciplines represented at the summer camp is an important development.

“The reach has expanded beyond the social sciences to include students in chemistry, physics, math, and other sciences,” he said. “We have been excited to see that graduate students are collaborating across very different disciplines and forming collaborative bonds that one just doesn’t see at other universities.”

In all, more than two dozen different disciplines were represented at the summer camp.

Participants at the ICOS Big Data Summer Camp, June 2015.

Meghan Oster, pursuing a Ph.D. in Higher Education in the School of Education, said she attended partly because she has access to a dataset that tracked more than 1 million students over 10 years.

Her main goal was to learn “how to effectively and efficiently analyze that and how to visualize those data,” she said.

She said she came into the camp only knowing Stata and standard spreadsheet programs. The summer camp has sections on APIs, SQL, and Python. The students are divided into groups, work on projects, and present their findings to the rest of the camp.

“Overall, it was extremely helpful,” Oster said.

Over its three years of existence, the camp has also built and fostered a community of data-focused researchers by inviting previous years’ students back to present their findings. Schifeling, for example, was a student in the first year of the camp in 2013, presented his work on factors that lead to the proliferation of food trucks (using data from Twitter) to the 2014 camp, and this year was the lead instructor.

“The goal is to inspire and motivate people,” he said.

The summer camp students have also set up an ongoing Big Data Users Group with an email list of more than 130 members.

The camp is organized by a committee consisting of Davis; Brian Noble, Associate Dean for Undergraduate Education and Professor of Electrical Engineering and Computer Science; Cliff Lampe, Associate Professor of Information, School of Information; and H.V. Jagadish, Professor of Electrical Engineering and Computer Science.

ITS introduces Science DMZ and perfSONAR network services

By | General Interest, News

By Patty Giorgio, ITS Communications

Research at the University of Michigan is a $1.3 billion enterprise, according to the 2014 Annual Report on Research. An increasing amount of this research involves moving large data sets between researchers locally, nationally, and internationally. The network at U-M is able to meet those needs.

Research applications have unique network requirements, often in terms of significantly increased end-to-end bandwidth, but sometimes involving latency or jitter bounds that differ from what is needed for networks used for normal business operations. Another difference is that networks designed for business typically require significant security infrastructure to protect business services and desktop applications. These security measures can cause problems for high-performance research applications. For this reason, the U.S. Department of Energy’s Energy Sciences Network (ESnet) created Science DMZ (DMZ stands for demilitarized zone).

ESnet has designed the Science DMZ architecture with equipment, configuration, and security policies that are optimized for high-performance scientific applications. In FY14, U-M deployed a Science DMZ on the Ann Arbor campus to ensure researchers have the network environment necessary for their growing research efforts.

Shawn McKee, Research Scientist, Physics, College of Literature, Science and the Arts, is one of the researchers at U-M using the Science DMZ. McKee is working on ATLAS, one of two main general-purpose particle detector experiments at the Large Hadron Collider (LHC), a particle accelerator at CERN in Switzerland. This experiment requires analysis and sharing of large amounts of data with researchers around the world, on average about 10 petabytes per year and growing.

According to McKee, “Traditional network firewalls create a lot of problems with movement of this type of data, making access to the Science DMZ important to work on projects like the ATLAS Experiment. Just to do physics, we need access to these types of tools. Researchers are trying to find a needle in a haystack, and tools that help us find new ways to search for and parse data quickly decrease our time to discovery.”

Handling big data transfer requires a high capacity network designed for these high performance research applications.
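To give a rough sense of scale for these figures, the sketch below converts a yearly data volume into an average network rate and estimates how long a single large transfer would take at different link speeds. It rests on assumptions that are not from the article: decimal units (1 PB = 10^15 bytes), a 365-day year, perfectly even traffic, and no protocol overhead. The 10 petabytes per year is the experiment-wide ATLAS figure quoted above, not a measurement of U-M traffic, and the 10 TB data set is a hypothetical example.

```python
# Back-of-the-envelope network arithmetic.
# Assumptions (not from the article): decimal units, a 365-day year,
# evenly spread traffic, and a link that is fully used with no overhead.

SECONDS_PER_YEAR = 365 * 24 * 60 * 60
BYTES_PER_PETABYTE = 10**15
BYTES_PER_TERABYTE = 10**12


def sustained_gbps(petabytes_per_year: float) -> float:
    """Return the average rate in gigabits per second for a yearly volume."""
    bits_per_year = petabytes_per_year * BYTES_PER_PETABYTE * 8
    return bits_per_year / SECONDS_PER_YEAR / 1e9


def transfer_hours(terabytes: float, link_gbps: float) -> float:
    """Return the hours needed to move a data set over a fully used link."""
    bits = terabytes * BYTES_PER_TERABYTE * 8
    return bits / (link_gbps * 1e9) / 3600


if __name__ == "__main__":
    print(f"10 PB/year averages {sustained_gbps(10):.2f} Gbps")
    # Hypothetical 10 TB data set at three link speeds.
    for gbps in (1, 10, 100):
        print(f"10 TB over {gbps} Gbps: {transfer_hours(10, gbps):.2f} hours")
```

Under these idealized assumptions, the yearly volume alone averages more than a 1Gbps link can carry, and real traffic arrives in bursts, so the peak capacity needed is higher than the average.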

In 2012, U-M received a grant from the National Science Foundation to enhance the university’s network infrastructure in support of research. This grant (through ARC and ITS) helped finance a 100Gbps upgrade for connections between the MACC, MDC, and Internet2 in Chicago. It also covered an upgrade to a 10Gbps connection to the 3D lab in the Duderstadt Center, as well as installation of several perfSONAR network performance monitoring devices at core sites and other strategic locations.

Beyond the grant funding, U-M also made investments in upgrades to the core network, with 100Gbps links between nodes, and upgraded connectivity to 30 buildings from 1Gbps to 10Gbps.

“We must ensure the core network is never a bottleneck for U-M faculty and researchers. This means not only providing the necessary capacity and speed, but also providing tools to assist with diagnosis of network problems,” said Andy Palms, executive director of ITS Communication Systems. “The addition of perfSONAR devices on the network core allows faculty and researchers to test the speed of their network connections and quickly identify if there are issues with the network.”

McKee agreed, “perfSONAR gives us visibility into what the network is doing. When things move slowly, people automatically think it is the network, but sometimes it is problems with storage or applications. Now we have tools allowing us to see what is happening with the network that will either rule the network out as a problem or provide information to diagnose the issue.”

Network performance can be checked from a desktop to the edge of the university network, where network traffic is handed off to the ISP. Many other higher-education institutions also use perfSONAR appliances, allowing for testing of network performance to another institution.

A typical perfSONAR Toolkit installation includes the Network Diagnostic Tool (NDT). NDT provides an on-demand service that launches a Java applet on your local machine and runs network throughput tests to the perfSONAR appliance. This allows NDT to determine inbound and outbound network speed, the slowest link on the end-to-end path, and Ethernet duplex settings, and to tell you whether congestion is limiting end-to-end throughput and whether there is excessive packet loss due to faulty cabling.

Access perfSONAR testing information at: www.itcom.itd.umich.edu/backbone/perfsonar/connection-test.html

For more information about U-M Science DMZ: http://www.itcom.itd.umich.edu/backbone/science-dmz/

Intel is presenting workshops on Parallel Programming and Optimization with Xeon Phi Coprocessor — July 21-23

By | Educational, Events

Intel is offering an updated and expanded series of software developer trainings in parallel programming using the Intel Xeon Phi coprocessors.

This series of offerings provides software developers the foundation needed for modernizing their codes to extract more of the parallel compute performance potential found in both Intel Xeon processors and Intel Xeon Phi coprocessors. The courses contain material appropriate for beginning developers as well as for HPC experts.

See the event flyer for more information. The event is sponsored by Intel and Colfax Customized Solutions.

Flux has a pilot program for Xeon Phis.

Symposium on Big Data, Human Health and Statistics, June 25-26, Ann Arbor — June 15 registration deadline

By | Educational, Events

The U-M Department of Biostatistics is holding a two-day Symposium on Big Data, Human Health and Statistics on June 25-26 as the closing event of its summer institute on Transforming Analytical Learning in the Era of Big Data.

Scheduled speakers are:

  • Susan Murphy, Statistics, Psychiatry, and Institute for Social Research, U-M
  • Jeremy M.G. Taylor, Biostatistics, Radiation Oncology, U-M
  • Goncalo Abecasis, Biostatistics, U-M
  • Jenna Wiens, Computer Science and Engineering, U-M
  • Todd Mostak, MapD
  • Rachel Schutt, News Corp., Columbia University

For a detailed agenda and to register, visit the symposium website. The registration deadline is June 15.

CSCAR Python workshop on Regression — June 11-12

By | Educational, Events

CSCAR is offering an upcoming workshop on Python. Registration is not required.

Python Regression Workshop
June 11-12
2-4 p.m. each day
4th Floor East Conference Room, Rackham Building

The workshop will focus on the use of Python and the Statsmodels library for regression analysis.

Participants should be familiar with basic Python or at minimum with another data-oriented programming language such as R. The CSCAR Python data management workshop would provide useful background in preparing your data for analysis.
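For readers unsure whether they have the background, here is a minimal sketch of what a simple regression computes: an ordinary least squares line fit with one predictor. It uses only the Python standard library so that it runs anywhere; it is not workshop material, the data are made up, and the function names are hypothetical. In Statsmodels, the same model would usually be fit with its OLS class, which also reports standard errors and other diagnostics.

```python
# Ordinary least squares with one predictor, standard library only.
# The data and function names are illustrative assumptions.

from statistics import mean


def fit_line(x, y):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def r_squared(x, y, intercept, slope):
    """Return the share of variance in y explained by the fitted line."""
    y_bar = mean(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    hours = [1, 2, 3, 4, 5]        # made-up predictor values
    score = [52, 55, 61, 64, 68]   # made-up response values
    b0, b1 = fit_line(hours, score)
    print(f"intercept={b0:.2f} slope={b1:.2f}")
    print(f"R^2={r_squared(hours, score, b0, b1):.4f}")
```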

Background materials are available on the workshop website.

U-M to host XSEDE Boot Camp on MPI, OpenMP, OpenACC, and more — June 16-19

By | Uncategorized

XSEDE, along with the Pittsburgh Supercomputing Center and the National Center for Supercomputing Applications at the University of Illinois, will be presenting a Hybrid Computing workshop.

This four-day event will cover MPI, OpenMP, OpenACC, and accelerators, and will run June 16-19. It will conclude with a special hybrid exercise contest that will challenge the students to apply their skills over the following three weeks and compete for the Second Annual XSEDE Summer Boot Camp Championship Trophy.

Due to demand, this workshop will be telecast to several satellite sites, including U-M. This workshop is NOT available via a webcast. The workshop will be telecast at 1255 North Quad.

Registration is required: visit the XSEDE registration site.

Agenda (all times Eastern):

Tuesday, June 16
11:00 Welcome
11:15 Computing Environment
11:45 Intro to Parallel Computing
12:30 Intro to OpenMP
1:30 Lunch Break
2:30 Exercise 1
3:15 More OpenMP
4:30 Exercise 2
5:00 Adjourn

Wednesday, June 17
11:00 Intro to OpenACC
12:00 Exercise 1
12:30 Introduction to OpenACC (cont.)
1:00 Lunch Break
2:00 Exercise 2
2:45 Introduction to OpenACC (cont.)
3:00 Using OpenACC with CUDA Libraries
3:30 Advanced OpenACC
4:00 OpenMP 4.0 Sneak Peek
5:00 Adjourn

Thursday, June 18
11:00 Introduction to MPI
1:00 Lunch Break
2:00 Intro Exercises
3:10 Intro Exercises Review
3:15 Scalable Programming: Laplace code
3:45 Laplace Exercise
5:00 Adjourn

Friday, June 19
11:00 Laplace Exercise Review
12:30 Laplace Solution
1:00 Lunch Break
2:00 Advanced MPI
3:00 Outro to Parallel Computing
4:00 Hybrid Computing
4:30 Hybrid Competition
5:00 Adjourn