Initiative aims to improve network connections for researchers

Advanced Research Computing – Technology Services (ARC-TS) has launched a new initiative aimed at facilitating uninterrupted data flow to meet the needs of researchers across campus.

Faculty and researchers experiencing problems with data flow to or from their labs or offices can contact ARC-TS by filling out this short questionnaire; ARC-TS staff will work with researchers to identify the causes of networking bottlenecks, and will implement the appropriate solutions.

The U-M network backbone operates at 100 gigabits per second (Gbps), and most academic research buildings are connected at 10 Gbps. Still, a variety of issues can slow down network connections and hamper data-intensive research, including in-building connections.
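
To put those link speeds in perspective, the short Python sketch below estimates the idealized time needed to move one terabyte at the 10, 40, and 100 Gbps rates mentioned in this article. The function name `transfer_seconds`, the `efficiency` parameter, and the 1 TB dataset size are illustrative assumptions rather than ARC-TS figures; real transfers are typically slower because of protocol overhead, disk speed, and the in-building bottlenecks described above.

```python
def transfer_seconds(size_bytes: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Idealized time to move size_bytes over a link rated at link_gbps gigabits per second.

    efficiency is a hypothetical fudge factor (0 < efficiency <= 1) for the
    fraction of the rated speed actually achieved; 1.0 means a perfect link.
    """
    if link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_gbps must be positive and efficiency in (0, 1]")
    bits = size_bytes * 8                      # bytes -> bits
    return bits / (link_gbps * 1e9 * efficiency)


TERABYTE = 1e12  # decimal terabyte in bytes (assumed example dataset size)

for gbps in (10, 40, 100):
    seconds = transfer_seconds(TERABYTE, gbps)
    print(f"{gbps:>3} Gbps: {seconds:6.0f} s per TB")
```

Under these assumptions, one terabyte takes 800 seconds at 10 Gbps and 80 seconds at 100 Gbps, which is why a single slow segment between a lab and the backbone can dominate the total transfer time.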

“With U-M’s 100 Gbps backbone in place across campus, the speed of data transfers should never slow down research,” said Brock Palen, Associate Director of ARC-TS. “This program is meant to identify places where connections lack sufficient bandwidth, and address those problem spots.”

Examples of data-heavy science that could benefit from this initiative include research involving next-generation sequencers, high-resolution data acquisition from the Internet of Things, telescopes, and electron microscopes.

Network connections can be especially important to researchers who use such tools with collaborators in other parts of the world. Enhanced network capabilities will also improve remote backups of data and the management and analysis of sensitive data.

Andy Palms, Executive Director of Communications Systems and Data Centers at ITS, added: “This initiative will remove network bottlenecks as a potential barrier to research both here on campus and in U-M’s extensive collaborations with other institutions. This program will provide our researchers with the high-speed network connections required to do cutting-edge research in today’s competitive funding environment.”

The initiative includes resources for upgrades to network connections between labs and offices and the U-M campuswide network backbone, if that portion of the network is determined to be the cause of a data flow slowdown. As many as 200 labs and offices on campus may be upgraded to either 10 Gbps or 40 Gbps connections.

Researchers interested in taking advantage of the program should fill out a questionnaire to help identify their areas of need.

New ARC-TS program offers free cycles on Flux to undergraduates

Undergraduates working on research that requires high performance computing resources can now use the Flux HPC cluster at no cost.

Flux is the shared computing cluster available across campus, operated by Advanced Research Computing – Technology Services (ARC-TS). Under ARC-TS’s new Flux for Undergraduates program, student groups and individuals with faculty sponsors can access unused computing cycles on Flux for free.

The first student group to take advantage of this program is the Michigan Data Science Team, which was created in Fall 2015 with the goal of helping U-M students enter Big Data competitions. The team enters competitions through sites like Kaggle, and is one of the first such teams affiliated with a university.

The group’s organizer, Jonathan Stroud, a Computer Science and Engineering graduate student, said team members were maxing out the capabilities of their laptops when they first started.

“For the first couple of competitions, we made sure we picked a problem that people could do on their laptops,” Stroud said. “Still, every night before bed, they would set up their experiments and they ran all night.”

L-R: Anthony Kremin, Ben Bray, Wei Lee, Curtis Fenner, Jimmy Hsu, Alex Chojnacki, Alexander Zaitzeff, Jonathan Stroud, Jared Webb, Tianpei Xie, Helena Zeng, Xiang Li, Xinyu Tan, Jianming Sang, Guangsha Shi

He said success in the data science competitions typically depends on trying several approaches simultaneously, which can be taxing on computing resources. Stroud said the team typically uses software such as Python, R, and Matlab. Team members come from a wide range of disciplines, including Engineering, Applied Math, Physics, and one from the Music School, Stroud said.

Jacob Abernethy, assistant professor of Electrical Engineering and Computer Science, is the group’s faculty advisor. He wrote some funding for the group into his NSF CAREER proposal, which was awarded in 2015. After the group’s first competition, he surveyed the students about what worked and what didn’t. He said one of the clearest responses was the need for more robust computing resources.

“Our top two competitors talked about maxing out the resources on not only their own laptop, but also on the clusters provided them by their advisors,” Abernethy said. “It became clear that we needed to talk about Flux.”

He said a key method in machine learning and data science experimentation is cross-validation, that is, testing the performance of a set of parameters on several subsets of the data simultaneously. “This leads to a very obvious need for a distributed system in which we can execute a large number of ‘embarrassingly parallel’ tasks quickly,” Abernethy said.
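
As a rough illustration of why cross-validation is “embarrassingly parallel,” the Python sketch below scores each held-out fold as an independent task. It is not the team’s code: the helper names (`k_folds`, `fit_line`, `score_fold`, `cross_validate`), the toy straight-line model, and the synthetic data are all assumptions made for the example. It uses a thread pool from the standard library so it runs anywhere; on a cluster, each fold would more plausibly be a separate process or job.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def k_folds(n, k):
    """Yield (train_indices, test_indices) for k contiguous folds over range(n)."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def score_fold(xs, ys, train, test):
    """Fit on the training indices and return mean squared error on the test indices."""
    slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
    return mean((ys[i] - (slope * xs[i] + intercept)) ** 2 for i in test)


def cross_validate(xs, ys, k=5, workers=4):
    """Score every fold as an independent task; no fold depends on another."""
    folds = list(k_folds(len(xs), k))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(score_fold, xs, ys, tr, te) for tr, te in folds]
        return [f.result() for f in futures]


# Synthetic, noise-free data (an assumption for the example): y = 2x + 1.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 for x in xs]
scores = cross_validate(xs, ys, k=5)
print(scores)  # one mean-squared-error value per fold
```

Because the folds share no state, the same pattern extends to many parameter settings at once: each (parameter set, fold) pair becomes one more independent task.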

Being able to use Flux “has been helping us a lot,” Stroud added. “We’ve been contacted by other schools to see how they can do the same thing.”

Jobs submitted under Flux for Undergraduates will run only when unused cycles are available and will be requeued when those resources are needed by standard Flux jobs. To be most efficient, student groups should use short or checkpointed jobs to take advantage of these available cycles.
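
For readers unfamiliar with checkpointing, here is a minimal Python sketch of the idea: the job saves its progress after each unit of work, so that a requeued run resumes where the previous one stopped instead of starting over. The file name `job_state.json`, the function names, the toy workload (summing squares), and the `stop_after` argument used to simulate an interruption are all hypothetical; this is not a Flux-specific mechanism.

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists yet."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"next_step": 0, "total": 0}


def save_checkpoint(path, state):
    """Write to a temp file, then rename, so a restart never sees a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def run(path, n_steps, stop_after=None):
    """Sum the squares of 0..n_steps-1, checkpointing after every step.

    stop_after simulates the job being interrupted after that many steps;
    in that case the function returns None and the checkpoint stays on disk.
    """
    state = load_checkpoint(path)
    done_this_run = 0
    while state["next_step"] < n_steps:
        if stop_after is not None and done_this_run >= stop_after:
            return None  # interrupted before finishing
        i = state["next_step"]
        state["total"] += i * i
        state["next_step"] = i + 1
        save_checkpoint(path, state)
        done_this_run += 1
    return state["total"]


ckpt = os.path.join(tempfile.mkdtemp(), "job_state.json")
first = run(ckpt, n_steps=10, stop_after=4)  # simulated interruption after 4 steps
second = run(ckpt, n_steps=10)               # the rerun resumes at step 4
print(first, second)  # prints "None 285"
```

Real workloads would checkpoint less often than every step, trading a little repeated work after a requeue for less time spent writing state.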

Student groups can also purchase Flux allocations for jobs that are higher priority or time constrained; those allocations can also work in conjunction with the free Flux for Undergraduates jobs.

“The goal is to provide undergraduates with experience in high performance computing, and access to computational resources for their projects,” said Brock Palen, Associate Director of ARC-TS.

Undergraduate groups and individuals must have sponsorship from a faculty member. To request resources through Flux for Undergraduates, please fill out this form. An abstract of the intended activity must be submitted.

Questions can be directed to arc-contact@umich.edu.

MIDAS Seminar: Goncalo Abecasis, U-M Dept. of Biostatistics — Jan. 22

As part of the Michigan Institute for Data Science (MIDAS) Seminar Series, Goncalo Abecasis, Felix E. Moore Collegiate Professor and Chair of the Department of Biostatistics, will give a talk titled “Sequencing 10,000s of Human Genomes: Early Results, Opportunities and Challenges.”

Time/Date: 4 – 5:30 p.m., Friday, Jan. 22

Location: Forum Hall, Palmer Commons

Abstract: Rapid advances in genome sequencing technology are enabling increasingly detailed analysis of human genetic variation. In the next year, we expect to analyze >50,000 deeply sequenced human genomes, corresponding to ~10 million billion bases of raw sequence data.

The generation, transfer and analysis of the data presents many opportunities for scientific discovery – enabling better understanding of human history, biology and disease. It also presents varied computational and analytical challenges as well as opportunities to develop and implement new modes of data sharing.

I will illustrate these challenges and opportunities with examples from our ongoing studies.

Bio: My research focuses on using human genetics to improve our understanding of human health and disease. To advance human genetic studies, my group develops statistical and computational methods that enable geneticists to apply emerging high-throughput technologies to studies of human health and disease. Over the past 15 years, I have developed computational tools, analytical models and study designs that have facilitated the widespread deployment of array-based genotyping and short read sequencing technologies in human genetic studies. My research is highly collaborative and benefits from interactions with experts in statistics and biostatistics, biology and human genetics, computer science and mathematics. I have mentored 11 doctoral students and ten postdoctoral research fellows, of whom 14 are now on the faculty at major research universities in the United States.

Workshops: Data Science Skills Series (Python) — Jan. 27 through April 6

CSCAR will offer a series of workshops on data science skills using Python. The workshops will be held in the Earl Lewis room in the Rackham building. All workshops will take place on Wednesday afternoons from 3:30 to 5 p.m.

The workshops are free and no registration is necessary.

Schedule:

  • January 27: Data management with Pandas
  • February 10: Graphics and data visualization with Matplotlib and Bokeh
  • February 24: Basic statistical analysis with Statsmodels
  • March 9: Sklearn for predictive analysis and data exploration
  • March 23: Advanced regression analysis (GEE, mixed models and multiple imputation) with Statsmodels
  • April 6: Survival analysis with Statsmodels

Additional workshops will be scheduled on the following topics, dates to be announced:

  • Geospatial analysis
  • Building and accessing databases
  • MPI, parallel, and distributed computing

Class material will be posted on the series website.

MICDE Seminar: James Stone, Astrophysical Science, Princeton — Jan. 28

We regret that this talk has been cancelled due to the snowstorm on the east coast.

James Stone, Professor of Astrophysical Sciences at Princeton University, will speak on campus as part of the Michigan Institute for Computational Discovery and Engineering (MICDE) Seminar Series.

Title: Global Radiation MHD Simulations of Black Hole Accretion Disks

Time/Date/Location: CANCELLED

Abstract: New results from a study of the magnetohydrodynamics of luminous accretion flows around black holes will be presented. In this regime, radiation pressure dominates the flow, thus the calculations require numerical methods based on a formal solution of the time-dependent radiation transfer equation. In this talk Prof. Stone will describe new algorithms he has developed that eliminate the need for approximate closures. He and his colleagues found that turbulent transport of radiation energy can be a significant contribution to the cooling rate in the disk, and this changes the global properties of the flow compared to standard thin-disk models. He will describe new work to extend the calculations to full general relativity, in order to follow the dynamics in the innermost regions of the disk.

Bio: James Stone is a Professor of Astrophysical Sciences at Princeton University. His research centers on the use of large-scale direct numerical simulations to study the gas dynamics of a wide range of astrophysical systems, from protostars to clusters of galaxies. Almost all of this work requires development of advanced numerical algorithms for astrophysical gas dynamics on modern parallel computer systems. He is one of the primary developers of the ZEUS code for astrophysical MHD, and more recently he and his collaborators developed Athena, a high-order Godunov scheme for astrophysical MHD that uses adaptive mesh refinement (AMR).

Some of the research problems on which he works include: (1) hydrodynamic and MHD processes that can lead to outward angular momentum transport in accretion disks, (2) the production and propagation of highly supersonic, collimated jets from accretion disks around protostars and active galactic nuclei, (3) the properties of compressible MHD turbulence in cold molecular gas in the galaxy, (4) the time-dependent evolution of strong shocks in the interstellar medium, (5) the structure of radiatively driven winds and outflows from disks around hot stars and AGN, and (6) the effect of mergers and AGN feedback on the hot x-ray emitting gas in clusters of galaxies.

Prof. Stone has a joint appointment in the Program in Applied and Computational Mathematics (PACM). He is deeply involved in PICSciE, which provides access to high-performance computing systems on Princeton’s campus, and training and education in scientific computation and numerical analysis.

Scientific Computing Student Club launch event — Feb. 5

The Scientific Computing Student Club (SC2) is a new organization at the University of Michigan sponsored by the Michigan Institute for Computational Discovery and Engineering (MICDE). Its membership includes students and postdocs from many disciplines and interests.

Its goals are to:

  • Develop a community across disciplines that fosters collaboration and peer support for scientific computing
  • Promote the best practices and standards relating to scientific computing
  • Teach and learn about computing resources, languages and environments available at the University of Michigan and at major computing resources
  • Aid in the creation and sharing of open-source projects
  • Provide a forum for sharing the computational triumphs of members’ research, as well as potentially helpful developments and information learned along the way

Please join the club for a kick-off event.

Time/Date: 4:30 p.m., Friday, February 5.

Location: Arbor Brewing Company (Tap Room), 114 E. Washington

Food will be provided. The College of Engineering Office of Graduate Studies is a co-sponsor of this event.

Hadoop Workshop — Feb. 17

Registration is now open for a Hadoop Workshop offered by ARC-TS.

Time/Date: 2 – 5 p.m., Wednesday, February 17, 2016

Location: Room B250, East Hall, 530 Church Street

Instructor: Brock Palen, Advanced Research Computing – Technology Services

Overview: Learn how to process large amounts (up to terabytes) of data using SQL and/or simple programming models available in Python, Scala, and Java. Computers will be provided to follow along with hands-on examples; users can also bring laptops.

More information and registration: arc-ts.umich.edu/hadoop-workshop

Space is limited, so sign up as soon as possible to reserve your spot.

HPC workshops (introductory, intermediate, and advanced) scheduled for Jan. 19 through Feb. 9

A series of on-campus high performance computing workshops sponsored by ARC will be held in the coming weeks:

HPC100 — Introduction to the Linux Command Line for HPC
Tuesday, Jan. 19, 9 a.m. – noon
Thursday, Jan. 21, 1 – 4 p.m.
All sessions in B250 East Hall
This course will familiarize students with the basics of accessing and interacting with high-performance computers using the GNU/Linux operating system’s command line. For more information, and to register, visit this page. (Please sign up for only one session.)

HPC101 — High Performance Computing Workshop
Wednesday, Jan. 20, 1 – 5 p.m.
Wednesday, Jan. 27, 1 – 5 p.m.
All sessions in B250 East Hall
This course provides an overview of cluster computing in general and how to use the Flux cluster in particular. (Prerequisite: HPC 100 or equivalent.)
For more information, and to register, visit this page. (Please sign up for only one session.)

HPC201 — Advanced High Performance Computing Workshop
Friday, Feb. 2, 1 – 5 p.m., B250 East Hall
Tuesday, Feb. 9, 1 – 5 p.m., B254 East Hall
This course will cover some more advanced topics in cluster computing on the U-M Flux Cluster. Topics to be covered include a review of common parallel programming models and basic use of Flux; dependent and array scheduling; advanced troubleshooting and analysis using checkjob, qstat, and other tools; use of common scientific applications including Python, MATLAB, and R in parallel environments; parallel debugging and profiling of C and Fortran code, including logging, gdb (line-oriented debugging), ddt (GUI-based debugging) and map (GUI-based profiling) of MPI and OpenMP programs; and an introduction to using GPUs. (Prerequisite: HPC101 or equivalent.)
For more information, and to register, visit this page. (Please sign up for only one session.)

MIDAS Seminar: Amit Surana, United Technologies Research Center, on “Koopman Operator Theoretic Framework for Dynamic Data Analytics,” Jan. 15

Amit Surana of United Technologies Research Center will speak on campus as part of the Michigan Institute for Data Science (MIDAS) Seminar Series.

Abstract: Recent advances in ubiquitous sensing, networking, storage and computing technology are leading to the emergence of new paradigms such as the Internet of Things, the Industrial Internet and Cloud Robotics. These paradigms have led to an explosion in the availability of high-volume, high-velocity time series data, which poses new challenges in data analytics. Classical machine learning techniques scale poorly with such high-dimensional, continuous-valued data, and often do not preserve or take advantage of the dynamics inherent in temporally evolving data. In this talk, Dr. Surana will describe a Koopman operator theoretic framework whereby one can cross-fertilize ideas from dynamical systems and control theory with machine learning and statistics in order to address some of these challenges.

Time/Date: 4 p.m., Friday, Jan. 15

Location: Stern Auditorium, U-M Museum of Art

More information: MIDAS event page.

ICOS Seminar: Noshir Contractor, Northwestern U., on “Leveraging Network Science to address Grand Societal Challenges,” Jan. 15

Abstract: The increased access to big data about social phenomena in general, and network data in particular, has been a windfall for social scientists. But these exciting opportunities must be accompanied by careful reflection on how big data can motivate new theories and methods. Using examples of his research in the area of networks, Contractor will argue that Network Science serves as the foundation to unleash the intellectual insights locked in big data. More importantly, he will illustrate how these insights offer social scientists in general, and social network scholars in particular, an unprecedented opportunity to engage more actively in monitoring, anticipating and designing interventions to address grand societal challenges.

Time/Date: 1:30 – 3 pm, Friday, January 15

Location: R1240 Ross School of Business

For more information: ICOS event page.