U-M joins NSF-funded SLATE project to simplify scientific collaboration on a massive scale


From the Cosmic Frontier to CERN, New Platform Stitches Together Global Science Efforts

SLATE will enable creation of new platforms for collaborative science

Today’s most ambitious scientific quests — from the cosmic radiation measurements by the South Pole Telescope to the particle physics of CERN — are multi-institutional research collaborations requiring computing environments that connect instrumentation, data, and computational resources. Because of the scale of the data and the complexity of this science, these resources are often distributed among university research computing centers, national high performance computing centers, or commercial cloud providers. As a result, scientists often spend more time on the technical aspects of computation than on discovery and knowledge creation, while computing support staff must invest growing effort integrating domain-specific software with limited applicability beyond the community it serves.

With Services Layer At The Edge (SLATE), a $4 million project funded by the National Science Foundation, the University of Michigan joins a team led by the Enrico Fermi and Computation Institutes at the University of Chicago to provide technology that simplifies connecting university and laboratory data center capabilities to the national cyberinfrastructure ecosystem. The University of Utah is also participating. Once installed, SLATE connects local research groups with their far-flung collaborators, allowing central research teams to automate the exchange of data, software and computing tasks among institutions without burdening local system administrators with the installation and operation of highly customized scientific computing services. By stitching together these resources, SLATE will also expand the reach of domain-specific “science gateways” and multi-site research platforms.

“Science, ultimately, is a collective endeavor. Most scientists don’t work in a vacuum, they work in collaboration with their peers at other institutions,” said Shawn McKee, a co-PI on the project and director of the Center for Network and Storage-Enabled Collaborative Computational Science at the University of Michigan. “They often need to share not only data, but systems that allow execution of workflows across multiple institutions. Today, it is a very labor-intensive, manual process to stitch together data centers into platforms that provide the research computing environment required by forefront scientific discoveries.”

SLATE works by implementing “cyberinfrastructure as code,” augmenting high-bandwidth science networks with a programmable “underlayment” edge platform. This platform hosts the advanced services needed for higher-level capabilities such as data and software delivery, workflow services and science gateway components.
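
To make the idea concrete, the sketch below shows what “cyberinfrastructure as code” could look like from a platform operator’s point of view: a service is described as a declarative specification kept in version control and then pushed to a platform API that handles deployment on an edge node. The endpoint, field names, token handling and service shown are illustrative assumptions for this article, not SLATE’s actual interface.

    # Illustrative sketch only: the API endpoint, spec fields and service name
    # below are hypothetical, not SLATE's real interface.
    import json
    import requests  # third-party HTTP client

    # A service definition managed like application code (e.g., kept in git).
    service_spec = {
        "name": "data-delivery-cache",              # hypothetical edge service
        "image": "registry.example.org/cache:1.0",  # container image to run
        "site": "umich-edge-01",                    # hypothetical edge node label
        "resources": {"cpu": 4, "memory_gb": 16},
        "network": {"expose_port": 1094},
    }

    # Pushing the spec to a (hypothetical) platform API deploys or updates the
    # service on the chosen edge node without local admin intervention.
    response = requests.post(
        "https://slate-platform.example.org/api/v1/services",  # placeholder URL
        headers={"Authorization": "Bearer <token>"},            # placeholder token
        data=json.dumps(service_spec),
        timeout=30,
    )
    response.raise_for_status()
    print("Service registered:", response.json())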

U-M has several roles in the project, including:

  • defining, procuring and configuring much of the SLATE hardware platform;
  • working on the advanced networking aspects (along with Utah), which include Software Defined Networking (SDN) and Network Function Virtualization (NFV);
  • developing the SLATE user interface and contributing to the core project design and implementation.

The project is similar to the OSiRIS project led by McKee, which also aims to remove bottlenecks to discovery posed by networking and data transfer infrastructure.

SLATE uses best-of-breed data center virtualization components, and where available, software defined networking, to enable automation of lifecycle management tasks by domain experts. As such, it simplifies the creation of scalable platforms that connect research teams, institutions and resources, accelerating science while reducing operational costs and development time. Since SLATE needs only commodity components, it can be used for distributed systems across all data center types and scales, thus enabling creation of ubiquitous, science-driven cyberinfrastructure.

At UChicago, the SLATE team will partner with the Research Computing Center and Information Technology Services to help the ATLAS experiment at CERN, the South Pole Telescope and the XENON dark matter search collaborations create the advanced cyberinfrastructure necessary for rapidly sharing data, computer cycles and software between partner institutions. The resulting systems will provide blueprints for national and international research platforms supporting a variety of science domains.

For example, the SLATE team will work with researchers from the Computation Institute’s Knowledge Lab to develop a hybrid platform that elastically scales computational social science applications between commercial cloud and campus HPC resources. The platform will allow researchers to use their local computational resources with the analytical tools and sensitive data shared through Knowledge Lab’s Cloud Kotta infrastructure, reducing cost and preserving data security.

“SLATE is about creating a ubiquitous cyberinfrastructure substrate for hosting, orchestrating and managing the entire lifecycle of higher level services that power scientific applications that span multiple institutions,” said Rob Gardner, a Research Professor in the Enrico Fermi Institute and Senior Fellow in the Computation Institute. “It clears a pathway for rapidly delivering capabilities to an institution, maximizing the science impact of local research IT investments.”

Many universities and research laboratories use a “Science DMZ” architecture to balance security with the ability to rapidly move large amounts of data in and out of the local network. As sciences from physics to biology to astronomy become more data-heavy, the complexity and need for these subnetworks grows rapidly, placing additional strain on local IT teams.

That stress is further compounded when local scientists join multi-institutional collaborations, often requiring the installation of specialized, domain-specific services for the sharing of compute and data resources.

With SLATE, research groups will be able to fully participate in multi-institutional collaborations and contribute resources to their collective platforms with minimal hands-on effort from their local IT team. When joining a project, the researchers and admins can select a package of software from a cloud-based service — a kind of “app store” — that allows them to connect and work with the other partners.
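
As a rough illustration of that workflow (the catalog URL, application names and request fields below are assumptions made for this sketch, not SLATE’s actual catalog API), a researcher or local admin might browse the catalog and request an installation at their site roughly like this:

    # Hypothetical sketch of the "app store" workflow described above.
    import requests

    CATALOG = "https://slate-catalog.example.org/api/v1"   # placeholder URL

    # 1. Browse the catalog of packaged, pre-configured services.
    apps = requests.get(f"{CATALOG}/apps", timeout=30).json()
    print([app["name"] for app in apps])

    # 2. Ask the platform to install one of them at the local site for a
    #    particular collaboration (site and group names are placeholders).
    resp = requests.post(
        f"{CATALOG}/apps/data-delivery-cache/install",
        json={"site": "umich-edge-01", "group": "atlas-collaboration"},
        timeout=30,
    )
    resp.raise_for_status()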

“Software and data can then be updated automatically by experts from the platform operations and research teams, with little to no assistance required from local IT personnel,” said Joe Breen, Senior IT Architect for Advanced Networking Initiatives at the University of Utah’s Center for High Performance Computing. “While the SLATE platform is designed to work in any data center environment, it will utilize advanced network capabilities, such as software defined overlay networks, when the devices support it.”

By reducing the technical expertise and time demands for participating in multi-institution collaborations, the SLATE platform will be especially helpful to smaller universities that lack the resources and staff of larger institutions and computing centers. The SLATE functionality can also support the development of “science gateways” which make it easier for individual researchers to connect to HPC resources such as the Open Science Grid and XSEDE.

“A central goal of SLATE is to lower the threshold for campuses and researchers to create research platforms within the national cyberinfrastructure,” Gardner said.

Initial partner sites for testing the SLATE platform and developing its architecture include New Mexico State University and Clemson University, where the focus will be creating distributed cyberinfrastructure in support of large-scale bioinformatics and genomics workflows. The project will also work with the Science Gateways Community Institute, an NSF-funded Scientific Software Innovation Institute, on SLATE integration to make gateways more powerful and reach more researchers and resources.

###

The Computation Institute (CI), a joint initiative of the University of Chicago and Argonne National Laboratory, is an intellectual nexus for scientists and scholars pursuing multi-disciplinary research and a resource center for developing and applying innovative computational approaches. Founded in 1999, it is home to over 100 faculty, fellows, and staff researching complex, system-level problems in such areas as biomedicine, energy and climate, astronomy and astrophysics, computational economics, social sciences and molecular engineering. CI is home to diverse projects including the Center for Robust Decision Making on Climate and Energy Policy, Knowledge Lab, The Urban Center for Computation and Data and the Center for Data Science and Public Policy.

For more information, contact Dan Meisler, Communications Manager, Advanced Research Computing at U-M: dmeisler@umich.edu, 734-764-7414

Info sessions on graduate studies in computational and data sciences — Sept. 21 and 25


Learn about graduate programs that will prepare you for success in computationally intensive fields — pizza and pop provided

  • The Ph.D. in Scientific Computing is open to all Ph.D. students who will make extensive use of large-scale computation, computational methods, or algorithms for advanced computer architectures in their studies. It is a joint degree program, with students earning a Ph.D. from their current departments, “… and Scientific Computing” — for example, “Ph.D. in Aerospace Engineering and Scientific Computing.”
  • The Graduate Certificate in Computational Discovery and Engineering trains graduate students in computationally intensive research so they can excel in interdisciplinary HPC-focused research and product development environments. The certificate is open to all students currently pursuing Master’s or Ph.D. degrees at the University of Michigan.
  • The Graduate Certificate in Data Science is focused on developing core proficiencies in data analytics:
    1) Modeling — Understanding of core data science principles, assumptions and applications;
    2) Technology — Knowledge of basic protocols for data management, processing, computation, information extraction, and visualization;
    3) Practice — Hands-on experience with real data, modeling tools, and technology resources.

Times / Locations:

MICDE sponsored miRcore Biotechnology Summer Camp for the second year in a row


This year’s miRcore Biotechnology summer camp was a big success. The participants had hands-on experience in a wet lab and with the UNIX command line while accessing U-M’s High Performance Computing cluster, Flux, in a research setting. For the second year in a row, MICDE and ARC-TS sponsored the campers’ access to Flux as they learned the steps needed to run code on a computing cluster. The camp also included theoretical thermodynamics exercises, giving participants a well-rounded research experience in nucleotide biotechnology.

miRcore’s camps are designed to expose high school students to career opportunities in biomedicine and to provide research opportunities beyond the classroom setting. For more information please visit http://www.mircore.org/summer-camps/.

[SC2 jobs] CIBC


Senior Quantitative Developer, Quantitative Solutions – Development

Work Location: Toronto, Canada


BUSINESS UNIT DESCRIPTION

Capital Markets Risk Management (CMRM) is led by the Senior Executive Vice-President and Chief Risk Officer and is accountable for the independent oversight of the management of risks inherent to CIBC’s activities. Its responsibilities include, but are not limited to, ensuring that effective processes are in place for the identification, management, measurement, monitoring and control of operational, reputation and legal, market, credit, investment and liquidity risk (collectively, “CIBC Risk”) incurred by CIBC’s retail and wholesale businesses, infrastructure and corporate governance groups.

The Quantitative Solutions (QS) group is responsible for providing quantitative support for model and market data usage across Capital Markets Risk Management, including market risk VaR models used for economic and regulatory capital, and credit PFE models used for counterparty credit management. The group is responsible for methodology, development and calibration of models; market data quality and usage; and explaining and troubleshooting model performance. The group is also responsible for tracking and coordinating changes to models across CMRM.

The QS Development group is responsible for developing and prototyping models for use in CMRM, including end-user tools to enhance analysis and reporting capabilities in and relating to the risk systems. The group defines the theory and practice of risk model implementation in Capital Markets Risk, with an eye both to best practices in the field, and to the practical necessities of running large systems with multiple stakeholders. The Development group communicates regularly with stakeholders regarding work in progress, and ensures work schedules are realistic, but ambitious, and that work plans are transparently communicated to stakeholders, including senior management.

Job purpose

The Senior Quantitative Analyst, Quantitative Solutions – Development is a member of a small team of quantitative analysts and developers supporting the CMRM market and credit risk system. The group is in particular responsible for:

  • Developing, testing and ensuring the sound implementation of all risk models for both market and credit risk management in the Trading operation;
  • Implementing risk and data modeling software to allow for ad hoc or ongoing business analysis. The group is expected to work closely with end user stakeholders in this case;
  • Ensuring, jointly with CMRM and QS stakeholders, that the schedule of work is prioritized, maintained and transparently communicated to all involved;
  • Ensuring a high quality of communication occurs;
  • Managing key relationships with related Technology areas, including Treasury and Risk Management Technology (TRMT) and Wholesale Bank Technology (WBT);
  • Helping to establish the strategic context in which the Quantitative Solutions group functions, ensuring this context is informed by market practice and by practical aspects of existing architecture and risk systems implementation.

Key accountabilities

  • Provide rapid development of end-user computing tools to supplement analytics of risk systems, and ad hoc tools for risk quantification.
  • Development support of legacy risk systems, prototyping vendor solutions.
  • As directed, partner with risk quants and other technology groups.
  • Communicate ideas effectively to stakeholders.
  • Ability to support quantitative development in a variety of platforms: legacy in-house analytics, interaction with the vendor systems, ad hoc tools that can be deployed to end users.
  • Meet governance and documentation standards for code changes, ensuring code can be transitioned seamlessly to other developers.
  • Support continuous enhancement of the MRM models for pricing and risk measurement of derivatives and other complex products, market risk, credit risk, and calibration of parameters.
  • Participate as a key member in cross-functional working groups, to implement joint work in support of all accountabilities.

Cross functional relationships

  • The incumbent collaborates with peers and management within Capital Markets Risk Management, WBT, TRMT, Risk Systems and vendor developers.

Compliance requirements/responsibilities

  • As an employee of CIBC, the incumbent must comply with all applicable CIBC and Line of Business policies, standards, guidelines and controls.

Job dimensions

  • Support enhancements in quantitative risk systems through systems development work, often working independently on projects involving external stakeholders.
  • Primary clients: Risk Managers, Risk Reporting, Technology, QS – Methodology
  • Accountable for Market and Credit Risk within Capital Markets.

Knowledge and skills

  • Graduate degree in an analytic discipline, such as computer science, mathematics, statistics or physics
  • Three years’ experience in quantitative development in risk management, sufficient to formulate and develop valuation, hedging and risk measurement concepts and models, or extensive software development experience
  • Strong programming skills and ability to develop in multiple programming and statistical languages (C++/C#, R, etc.)
  • Familiarity with distributed computing, data science, statistical modelling, machine learning and working with large datasets
  • Significant experience with risk technologies, including knowledge of common methodological issues
  • Experience in small to medium-sized projects, likely as a subject matter lead.
  • Analytical/systematic thinker. Takes a well-ordered and logical approach to analyzing problems, organizing work, and planning action.
  • Relationship builder. Develops and maintains strong relationships with internal and external customers/contacts.
  • Results oriented. Strives to achieve high levels of individual and organizational performance.

Attributes

Accountability, Teamwork & Partnering
Building Trust and Relationships
Results Orientation
Initiative
Creative/Innovative, Analytic/Systematic, Conceptual and Forward & Strategic Thinking
Impact & Influence
Communication
Service Orientation

Working conditions

This role operates within a normal office environment with minimal risk of ill health or injury. Work pressure arises from tight timelines and the need for quick decisions on a frequent basis.

The role may require that the employee be available to work non-standard business hours and holidays as assigned by management.

How to Apply

If interested, please contact Dejan Kecman at dejan.kecman@cibc.com

[SC2 jobs] Altair Engineering


Internship Position in Machine Learning


Job Description:

We are looking for an intern who will lead a benchmark of machine learning algorithms for engineering applications, specifically related to digital twins and predictive maintenance.

Founded in 1985, Altair is headquartered in Troy, Michigan, with regional operations throughout 22 countries, and provides software and services to over 5,000 corporate clients representing the automotive, aerospace, government and defense, and consumer products verticals. Altair also has a growing client presence in the electronics, architecture engineering and construction, and energy markets.

Altair prides itself on its business culture that enables open, creative thinking, deeply valuing our employees and their individual contributions towards our clients’ success as well as our own each and every day. There is an entrepreneurial spirit that flows and is encouraged throughout our global workforce to develop and gather technology that is relevant to engineering and business – including employing it within our own organization.

In this position, you will be working with a team of engineers, computer scientists and mathematicians that has over 25 years of experience in developing and applying predictive and prescriptive analytics to a variety of applications including, but not limited to, mobile phones, planes, bikes, consumer products, white goods and automobiles.

The outcome of this project will shape Altair’s offerings in emerging domains such as predictive maintenance, digital twin and autonomous vehicles.

Location:

Altair headquarters building in Troy, MI; part-time telecommuting is acceptable.

Responsibilities:

1. Review public datasets and choose a subset to be used for benchmarking.
2. Review available predictive modelling/machine learning tools and identify the ones to include in the benchmark.
3. Become familiar with the predictive modelling/machine learning implementations in Altair such as the ones in HyperStudy.
4. Conduct the benchmark for performance and accuracy (see the example harness sketched after this list).
5. Make suggestions for methods and applications.
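
For context, a benchmark of this kind often reduces to a small harness that scores several models on the same data for accuracy and wall-clock time. The sketch below is a generic example using synthetic data and scikit-learn; the datasets, tools and metrics in the actual internship project would be those chosen in steps 1-3 above, not these placeholders.

    # Generic benchmark harness (illustrative only; real datasets and models
    # would come from steps 1-3 of the project).
    import time
    from sklearn.datasets import make_regression
    from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import cross_val_score

    # Synthetic stand-in for a public engineering dataset.
    X, y = make_regression(n_samples=2000, n_features=20, noise=0.1, random_state=0)

    models = {
        "ridge": Ridge(),
        "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
        "gradient_boosting": GradientBoostingRegressor(random_state=0),
    }

    for name, model in models.items():
        start = time.time()
        scores = cross_val_score(model, X, y, cv=5, scoring="r2")  # accuracy
        elapsed = time.time() - start                              # performance
        print(f"{name}: R^2 = {scores.mean():.3f} +/- {scores.std():.3f}, "
              f"time = {elapsed:.1f} s")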

Requirements:

  • Graduate student in data science or related fields.
  • Bachelor of Science Degree in Mechanical Engineering, Aerospace Engineering, Materials Sciences or related fields.
  • Experience with data analysis and prediction modelling tools and languages such as R, Python, SAS, Matlab, Tableau and Microsoft Azure.
  • Experience in engineering or scientific work is preferred.
  • Experience with machine learning, predictive and prescriptive modelling methods.
  • Excellent communication skills both written and verbal.
  • Good presentation skills.

Application procedures:

To apply, please contact Fatma Y. Koçer-Poyraz at fatma@altair.com

Siqian Shen (IOE) to receive an Early Career Award from the Department of Energy


MICDE Associate Director Siqian Shen has been selected to receive an Early Career Award from the Department of Energy Office of Science, through the DOE Office of Advanced Scientific Computing Research. The objective of her proposal, titled “Extreme‐Scale Stochastic Optimization and Simulation via Learning‐Enhanced Decomposition and Parallelization,” is to develop an efficient and unified framework that integrates machine learning with discrete optimization and risk‐averse modeling. The models considered represent a broad class of complex decision‐making problems, where 0‐1 or continuous decisions are made before and/or after knowing multiple and potentially correlated sources of uncertainty. This research will shed new light on traditional decomposition algorithms for high‐performance computing.
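
As background for readers, the before/after structure described here is that of a generic two-stage stochastic program (an illustrative textbook formulation, not Prof. Shen’s specific models):

    \[
    \min_{x \in X} \; c^{\top} x + \mathbb{E}_{\xi}\!\left[ Q(x,\xi) \right],
    \qquad
    Q(x,\xi) = \min_{y \ge 0} \left\{ q(\xi)^{\top} y \;:\; W y \ge h(\xi) - T(\xi)\, x \right\},
    \]

where the first-stage decisions \(x\) (which may be 0-1 or continuous) are fixed before the uncertainty \(\xi\) is observed, and the recourse decisions \(y\) are made afterward. Decomposition methods split such problems by stage or by scenario, which is what makes very large instances amenable to parallel, high-performance computing.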

Prof. Shen was recently promoted to Associate Professor of Industrial and Operations Engineering. To learn more about her research please visit http://micde.umich.edu/faculty-member/siqian-shen/.

The Early Career Award program from the US Department of Energy is a funding opportunity for researchers in universities and DOE national laboratories to support the development of individual research programs of outstanding scientists early in their careers. For the past 8 years this program has helped stimulate research careers in the disciplines supported by the DOE Office of Science: Advanced Scientific Computing Research (ASCR), Biological and Environmental Research (BER), Basic Energy Sciences (BES), Fusion Energy Sciences (FES), High Energy Physics (HEP), and Nuclear Physics (NP).

U-M, SJTU research teams share $1 million for data science projects


Five research teams from the University of Michigan and Shanghai Jiao Tong University in China are sharing $1 million to study data science and its impact on air quality, galaxy clusters, lightweight metals, financial trading and renewable energy.

Since 2009, the two universities have collaborated on a number of research projects that address challenges and opportunities in energy, biomedicine, nanotechnology and data science.

In the latest round of annual grants, the winning projects focus on data science and how it can be applied to chemistry and physics of the universe, as well as finance and economics.

For more, read the University Record article.

For descriptions of the research projects, see the MIDAS/SJTU partnership page.

SAVE THE DATE: MIDAS Annual Symposium, Oct. 11


Please join us for the 2017 Michigan Institute for Data Science Symposium.

The keynote speaker will be Cathy O’Neil, mathematician and best-selling author of “Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy.”

Other speakers include:

  • Nadya Bliss, Director of the Global Security Initiative, Arizona State University
  • Francesca Dominici, Co-Director of the Data Science Initiative and Professor of Biostatistics, Harvard T.H. Chan School of Public Health
  • Daniela Witten, Associate Professor of Statistics and Biostatistics, University of Washington
  • James Pennebaker, Professor of Psychology, University of Texas

More details, including how to register, will be available soon.

New Data Science Computing Platform Available to U-M Researchers


Advanced Research Computing – Technology Services (ARC-TS) is pleased to announce an expanded data science computing platform, giving all U-M researchers new capabilities to host structured and unstructured databases, and to ingest, store, query and analyze large datasets.

The new platform features a flexible, robust and scalable database environment, and a set of data pipeline tools that can ingest and process large amounts of data from sensors, mobile devices and wearables, and other sources of streaming data. The platform leverages the advanced virtualization capabilities of ARC-TS’s Yottabyte Research Cloud (YBRC) infrastructure, and is supported by U-M’s Data Science Initiative launched in 2015. YBRC was created through a partnership between Yottabyte and ARC-TS announced last fall.

The following functionalities are immediately available:

  • Structured databases: MySQL/MariaDB and PostgreSQL.
  • Unstructured databases: Cassandra, MongoDB, InfluxDB, Grafana, and Elasticsearch.
  • Data ingestion: Redis, Kafka, RabbitMQ.
  • Data processing: Apache Flink, Apache Storm, Node.js and Apache NiFi.

Other types of databases can be created upon request.
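
As a rough illustration of how these services can fit together (hostnames, credentials, topic and table names below are placeholders, not actual ARC-TS endpoints), a research group might stream device readings through Kafka and store processed results in a hosted PostgreSQL database:

    # Illustrative sketch only: how a research group might combine the ingestion
    # and database services listed above. Hostnames, credentials, topic and
    # table names are placeholders, not actual ARC-TS endpoints.
    import json
    import psycopg2                      # PostgreSQL client (pip install psycopg2-binary)
    from kafka import KafkaProducer      # Kafka client (pip install kafka-python)

    # Stream a sensor reading into a Kafka topic hosted on the platform.
    producer = KafkaProducer(bootstrap_servers="kafka.example.umich.edu:9092")
    reading = {"device": "wearable-017", "heart_rate": 72, "ts": "2017-08-01T12:00:00Z"}
    producer.send("sensor-readings", json.dumps(reading).encode("utf-8"))
    producer.flush()

    # Store a processed result in a hosted PostgreSQL database.
    conn = psycopg2.connect(
        host="pg.example.umich.edu",     # placeholder host
        dbname="study_data",
        user="researcher",
        password="********",
    )
    with conn, conn.cursor() as cur:     # commits the transaction on success
        cur.execute(
            "INSERT INTO readings (device, heart_rate, recorded_at) VALUES (%s, %s, %s)",
            (reading["device"], reading["heart_rate"], reading["ts"]),
        )
    conn.close()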

These tools are offered to all researchers at the University of Michigan free of charge, provided that certain usage limits are not exceeded. Large-scale users who outgrow the no-cost allotment may purchase additional YBRC resources. All interested parties should contact hpc-support@umich.edu.

At this time, the YBRC platform only accepts unrestricted data. The platform is expected to accommodate restricted data within the next few months.

ARC-TS also operates a separate data science computing cluster available for researchers using the latest Hadoop components. This cluster also will be expanded in the near future.

XSEDE Research Allocation Requests due July 15th


XSEDE Allocations award eligible users access to compute, visualization, and/or storage resources as well as extended support services.

XSEDE offers various types of allocations, from short-term exploratory requests to year-long projects. In order to access XSEDE resources, you must have an allocation. Submit your allocation requests via the XSEDE Resource Allocation System (XRAS) in the XSEDE User Portal.

ARC-TS consultants can help researchers navigate XSEDE resources and the allocation process. Contact them at hpc-support@umich.edu.