ARC/ICOS Big Data Summer Camp — June 1-4 & 11

By | Educational, Events

This year, the Interdisciplinary Committee on Organizational Studies (ICOS) and ARC are again offering a one-week “big data summer camp” for doctoral students in the social sciences interested in organizational research, with a combination of detailed examples from researchers; hands-on instruction in Python, SQL, and APIs; and group work to apply these ideas to organizational questions.

The dates of the camp are all day June 1-4 and the afternoon of June 11 for group project presentations. Enrollment is free, but students must commit to attending all day for each day of camp, and be willing to work in interdisciplinary groups.

To sign up for camp, visit the registration page. Please note that space is limited, and we will be accepting people on a first-come, first-served basis. There are no prerequisites, but we will expect all participants to complete some online instruction prior to camp and to bring a laptop to class with all the relevant programs and files loaded and ready.

A summary of last year’s boot camp is available on the ARC web site.

Flux, Nyx outage scheduled for March 28

By | Events, News

The ARC cluster Flux and Engineering cluster Nyx will be unavailable for jobs starting March 28 at 10:00 p.m. There is an emergency update to the ITS Value Storage systems on that date.
http://status.its.umich.edu/outage.php?id=93178

Flux and Nyx rely on Value Storage and thus will also not be available during that time. We expect the outage to be finished quickly, and any queued jobs will run as expected once service is restored.

At the start of the outage, login and transfer nodes will be rebooted. Users will be unable to log in until after the service is restored.

Any jobs that request more walltime than remains until the start of the outage will be held and started after the systems return to service.

To find the maximum walltime you can request and still have your job start prior to the outage, use our walltime calculator:

module load flux-utils
maxwalltime
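
The hold rule above is simple date arithmetic. The following Python sketch is not the maxwalltime tool, whose implementation is not described here; it only illustrates the calculation. It assumes the outage begins at 10:00 p.m. local time on March 28, and the year 2015 is an assumption. The names OUTAGE_START, max_walltime, will_be_held, and format_walltime are hypothetical.

```python
from datetime import datetime, timedelta

# Assumed outage start: 10:00 p.m. on March 28. The year 2015 is an assumption.
OUTAGE_START = datetime(2015, 3, 28, 22, 0)


def max_walltime(now, outage_start=OUTAGE_START):
    """Return the longest walltime that still finishes before the outage."""
    remaining = outage_start - now
    # Once the outage has started, no time remains before it.
    return max(remaining, timedelta(0))


def will_be_held(requested, now, outage_start=OUTAGE_START):
    """A job is held if it requests more walltime than remains."""
    return requested > max_walltime(now, outage_start)


def format_walltime(delta):
    """Format a timedelta as HH:MM:SS, a form commonly used for PBS walltime."""
    total = int(delta.total_seconds())
    hours, rest = divmod(total, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


if __name__ == "__main__":
    now = datetime(2015, 3, 27, 9, 30)
    print(format_walltime(max_walltime(now)))
    print(will_be_held(timedelta(hours=48), now))
```

With the assumed start time, a job submitted at 9:30 a.m. on March 27 could request at most 36:30:00, and a 48-hour request would be held until the systems return to service.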

Allocations that are active on that date will be extended by one day at no cost.

If you have any questions feel free to ask us at hpc-support@umich.edu
For immediate updates watch: https://twitter.com/umcoecac

Applications being accepted for U-M Undergraduate Summer Institute in Biostatistics — March 15 deadline

By | Educational, Events

The U-M Department of Biostatistics is holding the Undergraduate Summer Institute in Biostatistics from June 1 – 26, 2015. The theme is “Transformational Analytical Learning in the Era of Big Data,” and applications are being accepted through March 15.

Description:

The field of big data science that intersects with public health and biomedicine is changing rapidly with datasets of enormous complexity and size being gathered in diverse areas including genomics, imaging, electronic health records, social media and environmental monitoring. The training of the next generation of quantitative scientists needs to change to meet the demands of the data. We define “Big Data” as datasets of enormous size and complexity (either in number of observations, and/or in the number/nature of predictors/outcomes). Classical theory, computation and intuition often fail for such irregular, sparse data sets of vast size. More training in data management, data storage, visualization, high dimensional statistics, optimization, causal methods, modeling sparse data and machine learning is needed to equip students to tackle these big data challenges. It is expected that the knowledge obtained from these massive heterogeneous data sources will inform prevention, screening, prognosis and treatment of human diseases and play a major role in biology, medicine and public health in the coming decade.

This full-time, four-week summer institute held on the Ann Arbor campus of the University of Michigan is targeted toward undergraduates who have an interest (or are susceptible to being interested) in the intersection of Big Data, Statistics, and Human Health. The institute is led by a distinguished group of faculty from the Department of Biostatistics at the University of Michigan School of Public Health (UMSPH) with additional outstanding faculty from Statistics and Electrical Engineering and Computer Science (EECS).

To apply:

Visit the institute’s “How to Apply” page.

ARC web sites to be updated

By | News

This week, the Advanced Research Computing web sites will be updated to reflect ongoing changes to ARC, and to provide easier access to relevant information.

  • Information on Flux, upcoming HPC training sessions, and other computing services will be on the ARC-Technology Services web site at arc-ts.umich.edu.
  • Information on other ways ARC enables computational and data-intensive research at Michigan will be on the new ARC site at arc.umich.edu.

Both sites are scheduled to go live this week (Feb. 23-27, 2015). Users seeking to access arc.research.umich.edu will be redirected when the new sites are up.

The ARC newsletter will continue to be distributed as usual.

Please send any feedback on the new web sites to arc-contact@umich.edu.

Data will be deleted from /scratch on Flux if unused for 90 days

By | General Interest, News

Over the past several months, a huge amount of data (491 TB) has accumulated in the /scratch directory on the Flux computing cluster. /scratch is meant for data relating to currently running jobs, and the buildup of data is threatening the performance of Flux for all users.

Therefore, ARC will begin deleting data from /scratch that have not been accessed for 90 consecutive days.

Flux account owners with unused data have begun receiving emails warning that their data will be deleted.

Account owners in this situation can move their data to another system, such as ITS Value Storage or their own equipment, using the dedicated transfer nodes on Flux, which have high-speed network connections available for that purpose.
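
To see which files might fall under the 90-day rule before moving them, a short script can list files by last access time. The Python sketch below is only an illustration: it assumes "accessed" corresponds to the file system's access time (atime), which may not match how ARC identifies unused data, and some file systems do not update atime on every read. The function name stale_files and the example path are hypothetical.

```python
import os
import time

SECONDS_PER_DAY = 24 * 60 * 60


def stale_files(root, max_age_days=90, now=None):
    """Yield (path, idle_days) for files under root not accessed in max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat reports on the entry itself and does not follow symlinks.
                atime = os.lstat(path).st_atime
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if atime < cutoff:
                yield path, int((now - atime) // SECONDS_PER_DAY)


if __name__ == "__main__":
    # Hypothetical location; substitute your own directory under /scratch.
    for path, idle_days in stale_files("/scratch/example_flux/username"):
        print(f"{idle_days:5d} days  {path}")
```

The script only reports candidates; it does not move or delete anything.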

For more information on Value Storage, see the ITS website.

For more information on transfer nodes, see the ARC website.

If you have any questions, please contact hpc-support@umich.edu.

MICDE Seminar: Mario Juric, Large Synoptic Survey Telescope: “LSST: Ushering in the Era of Petascale Optical Astronomy” — Feb. 16

By | Educational, Events

Mario Juric is a Washington Research Foundation Data Science Professor of Astronomy at the Department of Astronomy at the University of Washington, and a Senior Data Science Fellow of the University of Washington eScience Institute. He is also the Data Management Project Scientist for the Large Synoptic Survey Telescope. He holds a Ph.D. in Astrophysical Sciences from Princeton University; was a postdoctoral member at the Institute for Advanced Study; served as a Hubble Fellow at Harvard University; and was an associate scientist at LSST/AURA.

LSST: Ushering in the Era of Petascale Optical Astronomy

4 – 5 p.m., Monday, Feb. 16, 2015
340 West Hall

The Large Synoptic Survey Telescope (LSST; http://lsst.org) is a planned, large-aperture, wide-field, ground-based telescope that will survey half the sky every few nights in six optical bands from 320 to 1050 nm. It will explore a wide range of astrophysical questions, ranging from discovering “killer” asteroids, to examining the nature of dark energy.

The LSST will produce on average 15 terabytes of data per night, yielding an (uncompressed) data set of over 100 petabytes at the end of its 10-year mission. Dedicated HPC facilities will process the image data in near real time, with full-dataset reprocessings on an annual scale. A sophisticated data management system will enable database queries from individual users, as well as computationally intensive scientific investigations that utilize the entire data set.

In this talk, Juric will review the science case for LSST and what LSST will deliver once operational. He will focus on the data products and management system, highlighting a number of differences and novel approaches compared to previous surveys including extensive use of simulations. More generally, Juric will discuss implications of petascale data sets for astronomy in the 2020s and ways in which the community can prepare to make the best use of them.

U-M hosting MPI programming workshop — March 4-5

By | Educational, Events

U-M is hosting a telecast of an upcoming training workshop on MPI programming sponsored by XSEDE along with the Pittsburgh Supercomputing Center and the National Center for Supercomputing Applications.

The two-day MPI workshop will be held March 4-5, 2015. The U-M telecast will be held in Room 1180 of the Duderstadt Center, from 11 a.m. to 5 p.m. both days. Registration is required and can be done at the XSEDE website.

This workshop is intended to give C and Fortran programmers a hands-on introduction to MPI programming. Both days are compact, to accommodate multiple time zones, but packed with useful information and lab exercises. Attendees will leave with a working knowledge of how to write scalable codes using MPI – the standard programming tool of scalable parallel computing.

Agenda (Eastern time):

Wednesday, March 4

11:00 Welcome
11:15 Computing Environment
12:00 Intro to Parallel Computing
1:00 Lunch break
2:00 Introduction to MPI
3:30 Introductory Exercises
4:10 Intro Exercises Review
4:15 Scalable Programming: Laplace code
5:00 Adjourn/Laplace Exercises

Thursday, March 5

11:00 Laplace Exercises
12:00 Laplace Solution
12:30 Lunch break
1:30 Advanced MPI
2:30 Outro to Parallel Computing
3:30 Exercises
4:30 Adjourn

Science of Cyberinfrastructure conference call for papers — Feb. 23 deadline

By | Educational, Events

The Science of Cyberinfrastructure: Research, Experience, Applications and Models (SCREAM-15) conference in Portland, Oregon, July 16 is seeking submissions for papers. The deadline is Feb. 23. Visit the conference’s call for papers page for more details.

Topics of interest, in the context of distributed infrastructure, include, but are not limited to:

– Research
  – Integration and interaction with commercial systems
  – Cloud issues
  – Sustainability and business models for both systems and software
  – Integrating AAA systems
  – Resilience
  – Federation of resources
  – Networking design/advances for widely distributed systems
  – Clean slate designs
  – Designing systems for data-intensive science, not just for computational science
  – Quantitative/metrics driven design
  – Management software
– Experiences, both successes & failures
  – Cyberinfrastructure system experiences
  – Software experiences
  – Application experiences
– Applications
  – Novel application types, including data-intensive applications
  – Application requirements that lead to new systems characteristics
  – Challenges in application development, deployment, and execution
  – Models of theoretical performance
  – Understanding actual performance
– Models
  – Design principles
  – Architecture layers, including middleware
  – Next generation distributed cyberinfrastructure
  – Research-As-A-Service

Open meeting for HPC users at U-M — Friday, Feb. 6

By | Educational, Events

Users of high performance computing resources are invited to meet Flux operators and support staff in person at an upcoming user meeting:

  • Room 1180, Duderstadt Center, Friday, Feb. 6, 2 – 5 p.m.

There is no set agenda; come at any time and stay as long as you please. You can come and talk about your use of any sort of computational resource: Flux, Nyx, XSEDE, or other.

Ask any questions you may have. The Flux staff will work with you on your specific projects, or just show you new things that can help you optimize your research.

This is also a good time to meet other researchers doing similar work.

This is open to anyone interested; it is not limited to Flux users.

Example potential topics:

  • What Flux/ARC services are there, and how to access them?
  • How to make the most of PBS and learn its features specific to your work?
  • I want to do X, do you have software capable of it?
  • What is special about GPU/Xeon Phi/Accelerators?
  • Are there resources for people without budgets?
  • I want to apply for grant X, but it has certain limitations. What support can ARC provide?
  • I want to learn more about the compiler and debugging.
  • I want to learn more about performance tuning, can you look at my code with me?
  • Etc.

For more information, contact Brock Palen (brockp@umich.edu) at the College of Engineering; Dr. Charles Antonelli (cja@umich.edu) at LSA; Jeremy Hallum (jhallum@umich.edu) at the Medical School; or Vlad Wielbut (wlodek@umich.edu) at SPH.

We are planning to hold similar meetings monthly.

Workshop on Data Management in Python — Feb. 10-12 at Rackham

By | Educational, Events

CSCAR will offer a workshop on data management in Python on February 10, 11, and 12 from 4-6 p.m. each day, in the Rackham common room (Rackham Building lower level west).

The workshop will focus on using core Python, numpy, and Pandas to manage and process data sets. Participants will learn how to read and clean data sets, generate reports, produce graphical summaries, and perform simple statistical analyses.
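
As a rough illustration of the kind of task described above, and not the workshop's own material, the following sketch uses only core Python to read a small data set, drop unusable rows, and print a one-line report. The sample data, column names, and the function names clean_rows and report are all made up for this example.

```python
import csv
import io
import statistics

# Made-up sample data with typical problems: stray spaces, a missing value,
# and a non-numeric entry.
RAW = """site,visits
Ann Arbor, 12
Flint,
Dearborn,n/a
Ann Arbor,8
"""


def clean_rows(text):
    """Parse CSV text and keep only rows with a usable integer count."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        value = (record["visits"] or "").strip()
        if not value.isdigit():
            continue  # drop missing or non-numeric entries
        rows.append({"site": record["site"].strip(), "visits": int(value)})
    return rows


def report(rows):
    """Return a short text summary of the cleaned rows."""
    counts = [row["visits"] for row in rows]
    return (f"rows kept: {len(rows)}, "
            f"mean visits: {statistics.mean(counts):.1f}, "
            f"max visits: {max(counts)}")


if __name__ == "__main__":
    print(report(clean_rows(RAW)))
```

Run as a script, this prints "rows kept: 2, mean visits: 10.0, max visits: 12". Libraries such as numpy and Pandas offer more compact ways to do the same steps on larger data sets.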

This workshop will have a lecture/discussion format and is not held in a computer lab. Participants may bring their own laptops if they wish but this is not required. All software discussed in the workshop is free and open source, and runs on all major platforms. Follow-up consulting for UM researchers using Python in their research is available from CSCAR.

There is no registration for this workshop, and there is no charge to attend. Participants should plan to attend all three sessions.