High Performance Computing at Berkeley Lab

BERKELEY LABORATORY RESEARCH COMPUTING
Berkeley Lab provides Lawrencium, an 860-node (12,372 computational cores) Linux cluster, to researchers who need computation to facilitate scientific discovery. The system, which consists of shared nodes and PI-contributed Condo nodes, has a theoretical peak performance rating of 250 teraflops and access to a 1 PB parallel filesystem. Large-memory, GPU, and Intel Xeon Phi Knights Landing nodes are also available for users to try.

SCIENTIFIC CLUSTER SUPPORT

HPC Services offers comprehensive Linux cluster support, including pre-purchase consulting, procurement assistance, installation, and ongoing support for PI-owned clusters. Altogether, the group manages over 30,000 compute cores and 824 users across 161 research projects for the Lab. Our HPC User Services consultants can help you get your application running well and make the best use of your new cluster. UC Berkeley PIs can also use our services through the new Berkeley Research Computing (BRC) program, available through UC Berkeley Research IT.

NEWS
Nov 14, 2016 - LBNL Singularity wins HPCWire's Editors' Choice Award
At SC16 in Salt Lake City, Tom Tabor, publisher of HPCWire, presented Greg Kurtzer with HPCWire's 2016 Editors' Choice Award for his work on Singularity, named one of the Top 5 New Technologies to Watch. These annual awards are highly coveted as prestigious recognition of achievement by the HPC industry and community. Staffer Krishna Muriki, along with Kurtzer, also ran two standing-room-only Singularity tutorials at the SC16 Intel Developers Conference.

Nov 11, 2016 - Jupyter Notebook now available on Lawrencium
Jupyter Notebook is a web application that allows you to create and share documents that contain live code, equations, visualizations, and explanatory text. We've extended JupyterHub so that it can now use the Lawrencium cluster to support code that needs high performance computing and to reduce turnaround time. Lawrencium users can go to our online documentation to see how to get started.


Oct 20, 2016 - LBNL Singularity featured in HPCWire
This week's issue of HPCWire features Singularity, a user-space container solution designed for HPC. Developed by LBNL architect Greg Kurtzer, Singularity is a platform to support users whose system environment needs differ from what is typically provided on HPC resources. Users can develop on their Ubuntu laptop and then package up and run their Singularity container on a Linux cluster running a different operating system. What sets Singularity apart is that it leverages a workflow and security model that makes it viable on multi-tenant HPC resources without requiring any modifications to the scheduler or system architecture. Read more.

Apr 3, 2016 - HPCS at Lustre Users Group Meeting 2016
HPCS Storage Lead John White will be giving a talk this week at the annual Lustre Users Group 2016 conference. John's presentation will provide an introduction to the challenges involved in providing parallel storage to a condo-style HPC infrastructure.

Mar 22, 2016 - Run on Lawrencium for Free - New Low Priority QoS
We are pleased to announce the “Low Priority QoS (Quality of Service)” pilot program, which allows all users to run jobs requesting up to 64 nodes and up to 3 days of runtime on the Lawrencium cluster at no charge when running at a lower priority. Users should check the Lawrencium user page for specific instructions on submitting jobs to the new QoS.

Feb 25, 2016 - Meet Climate Scientist Jennifer Holm
Climate scientist Jennifer Holm of the Climate and Ecosciences Division uses Lawrencium to run simulations for the DOE-funded NGEE Tropics project, which studies how tropical forests will respond to a changing climate. See the Lab's Facebook post here.

Feb 1, 2016 - Lawrencium LR4 Haswell Compute Partition
We recently announced the availability of the new Lawrencium LR4 Haswell compute partition, which consists of 108 compute nodes with the new Intel Haswell processors. These processors can execute twice as many floating point operations per clock as the previous Ivy Bridge processors. As before, researchers can purchase Haswell compute nodes to add to Lawrencium Condo, and they will receive free cluster support in exchange for their excess cycles. Interested parties should contact HPCS manager Gary Jung.
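As a rough illustration of what "twice as many floating point operations per clock" means for theoretical peak performance, the sketch below multiplies node count, cores per node, clock rate, and floating point operations per cycle. Only the 108-node count comes from the announcement. The cores per node and clock rate are hypothetical values chosen for illustration, and the per-cycle figures (8 for Ivy Bridge, 16 for Haswell) are the commonly cited double-precision peaks per core, not numbers published for LR4.

```python
def peak_teraflops(nodes, cores_per_node, clock_ghz, flops_per_cycle):
    """Theoretical peak in teraflops: nodes * cores * clock (GHz) * FLOPs per cycle."""
    gigaflops = nodes * cores_per_node * clock_ghz * flops_per_cycle
    return gigaflops / 1000.0


NODES = 108                     # from the LR4 announcement
CORES_PER_NODE = 24             # assumption: illustrative value, not from the announcement
CLOCK_GHZ = 2.3                 # assumption: illustrative value, not from the announcement
IVY_BRIDGE_FLOPS_PER_CYCLE = 8  # commonly cited double-precision peak per core (AVX)
HASWELL_FLOPS_PER_CYCLE = 16    # commonly cited double-precision peak per core (AVX2 with FMA)

ivy_bridge_peak = peak_teraflops(NODES, CORES_PER_NODE, CLOCK_GHZ, IVY_BRIDGE_FLOPS_PER_CYCLE)
haswell_peak = peak_teraflops(NODES, CORES_PER_NODE, CLOCK_GHZ, HASWELL_FLOPS_PER_CYCLE)

print(f"Ivy Bridge-style peak: {ivy_bridge_peak:.1f} TF")
print(f"Haswell-style peak:    {haswell_peak:.1f} TF")
print(f"Ratio: {haswell_peak / ivy_bridge_peak:.1f}x")
```

With everything else held equal, doubling the operations per clock doubles the theoretical peak. Real applications typically reach only a fraction of that figure.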

Nov 18, 2015 - Linux Foundation OpenHPC project uses LBNL Warewulf
The Linux Foundation, the nonprofit organization dedicated to accelerating the growth of Linux and collaborative development, has announced its intent to form the OpenHPC Collaborative Project. This project will provide a new, open source framework to support the world's most sophisticated High Performance Computing environments. Warewulf, our cluster provisioning tool developed by HPCS architect Greg Kurtzer, is specified as the provisioning tool for the OpenHPC standard cluster building recipe.

Nov 16, 2015 - HPCS at SuperComputing 2015
Warewulf will be featured at a roundtable at the SC15 DOE booth #502. HPCS staffer Bernard Li will be leading the Warewulf Roundtable starting at 4:45pm on Tuesday. Cluster admins and developers should drop by to hear the latest developments with Warewulf. On Wednesday, Nov 18, Li, along with other staff from NERSC and ESnet including John Shalf, Inder Monder, Dani Ushizima and Aydin Buluc, will talk with students during the Students@SC "Dinner with Interesting People."

Sep 10, 2015 - LabTech 2015 - Registration is now open
LabTech 2015, Berkeley Lab’s fourth annual computing conference, is scheduled for Sept. 9-10. The program features technical tutorials, discussion sessions, networking opportunities, demos, and more. LabTech is a multi-track conference aimed at scientists, technologists, and support staff. The event is free, but registration is required. More

Feb 27, 2015 - Intel Phi Developer Training on March 27
HPCS is hosting a free one-day, in-depth training, sponsored by Intel, on the Xeon Phi Coprocessor, to be held at Perseverance Hall on March 27th, 2015 from 9:00am to 4:00pm. This training will give software developers the foundation needed to modernize their code to take advantage of the parallel architecture of the Intel® Xeon Phi™ coprocessors, which are available to users on Lawrencium. For more information and registration, please go here.

Nov 5, 2014 - Data Center Efficiency Summit 2014
Data centers consume approximately 2 percent of the nation’s electrical energy and roughly half of that is consumed by the IT equipment. We partnered with EETD researcher Henry Coles to evaluate server energy use and efficiency among commercially available servers that appear to be similar in design and performance. The results will be presented at the upcoming 2014 Data Center Efficiency Summit held in Santa Clara, CA.

Oct 16, 2014 - HPC Simulations provide a Path to Better Batteries
Batteries with a significant increase in energy density will be needed in the future for automotive and other applications. David Prendergast and Liwen Wan, scientists working in the Theory of Nanostructured Materials group at the Molecular Foundry, a DOE nanoscience research facility hosted by Berkeley Lab, ran a series of computer simulations on their VULCAN Linux Cluster managed by HPCS and at NERSC that dispelled a long-standing misconception about Mg-ions in the electrolyte that transports the ions between a battery’s electrodes. See more...

Sep 23, 2014 - SLURM User Group Meeting
HPCS systems analyst Jackie Scoggins will be giving a talk at the SLURM User Group Meeting this week about how we successfully transitioned our scheduling environment to the SchedMD SLURM job scheduler. She will also be giving a tutorial on using Berkeley Lab NHC (Node Health Check). SLURM is the workload manager on about half of the systems on the Top500 supercomputer list.

Sep 10, 2014 - LabTech 2014 is here

Highlights of the day include our morning mini-classes: three one-hour sessions on getting the most out of Python in scientific computing, a three-hour Arduino basics class, and, new this year, Intro and Advanced LabVIEW. The (free) Lunch and Keynote starts at noon, with an overview of what's new from IT this year. In the afternoon, you'll find over 30 sessions on topics ranging from Globus Online and HPC Use Cases to Video Conferencing to Excel. Please go here to register.



Aug 5, 2014 - NHC Talk at Linux Cluster Institute Workshop
The Linux Cluster Institute (LCI) holds an annual workshop to promote best practices for running HPC systems. This year, senior HPC engineer Michael Jennings will be leading sessions on the LBNL-developed Warewulf cluster toolkit and Berkeley Lab Node Health Check (NHC) at the 20th annual LCI Workshop, held August 4-8, 2014 at the National Center for Supercomputing Applications (NCSA) in Urbana, Illinois.

May 22, 2014 - BERKELEY RESEARCH COMPUTING Launch Event
The UC Berkeley Research Computing (BRC) program celebrates its launch on Thursday, May 22, 2014 at 3:00 p.m. in Sutardja Dai Hall. This exciting new program will include Condo/Institutional cluster (HPC) computing (in partnership with LBNL HPCS), cloud computing support, and virtual workstations. UC Berkeley researchers needing access to computation should look into this program. To attend the event, RSVP here.

Apr 15, 2014 - HPCS presenting at GlobusWorld 2014
LBL HPC consultant Krishna Muriki and HPCS systems engineer Karen Fernsler will be presenting at GlobusWorld 2014 on April 15-17, 2014 at Argonne National Laboratory. Their talk, "Globus for Big Data and Science Gateways at LBL," will highlight some of our projects, including the Sloan Digital Sky Survey III and the X-Ray Diffraction and Microtomography Beamlines at the Advanced Light Source user facility, which benefit from Globus endpoints to construct Data Pipelines.

Feb 3, 2014 - NHC Talk at Stanford Exascale Conference
LBL HPCS senior engineer Michael Jennings will be giving a talk on the "Node Health Check (NHC)" on Feb 4, 2014 at the Stanford Conference and Exascale Workshop 2014 sponsored by the HPC Advisory Council. NHC, developed by Jennings, provides the framework and implementation for a highly reliable, flexible, extensible node health check solution. It is now widely recommended by major HPC job scheduler vendors and is in use at many large HPC sites and research institutions.

Sep 6, 2013 - Monitoring with Ganglia book released
HPC Services cluster expert Bernard Li's new book "Monitoring with Ganglia - Tracking Dynamic Host and Application Metrics at Scale" was recently published by O'Reilly Media. His book shows how to use Ganglia to collect and visualize metrics from clusters, grids, and cloud infrastructures at any scale.

Mar 21, 2013 - Supporting Science with HPC
HPC Services manager Gary Jung gave a presentation on "Supporting Science with HPC" at the "Enabling Discovery and Production Innovation with Dell HPC Solutions" workshop held in Santa Clara, where he talked about how Berkeley Lab researchers from EETD, ESD, NSD, and Physics are using HPC data pipelines to accomplish their science.

Mar 18, 2013 - GPU Accelerated Synchrotron Radiation Calculation
Today, HPC Services consultant Yong Qin will be presenting his work during a poster session at this week's GPU Technology Conference, GTC 2013, in San Jose, California. Yong's work demonstrates how data parallelism can be applied to spectrum calculation of undulator radiation, which is widely used at synchrotron light facilities across the world. More

Mar 13, 2013 - The Science of Clouds computed on Lawrencium
The climate models that scientists use to understand and project climate change are improving constantly; however, the largest source of uncertainty in today’s climate models is clouds. As the source of rain and wind, clouds are important in modeling climate. Berkeley Lab scientist David Romps discusses his work to develop better cloud resolution models. More

Nov 13, 2012 - Node Health Check
HPCS staffers Jackie Scoggins and Michael Jennings gave a well-attended presentation on Jennings' Node Health Check (NHC) software today at the Adaptive Computing booth at SC12. NHC works in conjunction with the job scheduler and resource manager to ensure clean job runs on large HPC systems.

Nov 12, 2012 - Warewulf wins Intel Cluster Ready "Explorer" award
The Berkeley Lab Warewulf Cluster Toolkit development team has been honored with the 'Explorer Award' from the Intel(R) Cluster Ready team at Intel, which recognizes organizations that have continued to explore and implement Intel Cluster Ready (ICR) certified systems. The award was presented to lead developer Greg Kurtzer and co-developers Michael Jennings and Bernard Li of the IT Division's HPC Services Group at the annual Intel Partners meeting.

Oct 24, 2012 - HPCS at 2012 Data Center Efficiency Summit
HPCS staff member Yong Qin will be part of a panel, along with other Berkeley Lab scientists, at the 2012 Data Center Efficiency Summit today in San Jose talking about Berkeley Lab's recently released study to understand the feasibility of implementing Demand Response and control strategies in Data Centers. Yong will discuss the issues and our experiences related to reducing or geographically shifting computational workload to a remote datacenter as a response to a demand to lower electrical usage.

Sep 18, 2012 - Warewulf featured in HPC Admin Magazine
This month's issue of HPC Admin Magazine features the last in a 4-part series on how to best use the latest version of the Warewulf Cluster Toolkit. Warewulf, developed by LBNL's Greg Kurtzer and recently certified as Intel Cluster Ready, is a zero-cost, open source solution that guarantees integration and compatibility with Intel products as well as 3rd-party hardware and software solutions. Read this article to learn how to use it.

May 21, 2012 - Cloud Bursting for Particle Tracking
ALS physicists Changchun Sun and Hiroshi Nishimura, along with HPCS staff Kai Song, Susan James, Krishna Muriki, Gary Jung, Bernard Li and Yong Qin, recently explored the use of Amazon's VPC service to transparently extend the ALS compute cluster and software environment into the public Cloud to provide on-demand compute resources for particle tracking and NGLS APEX development. Their work was presented during the poster session at the International Particle Accelerator Conference (IPAC12) in New Orleans this week.

Jan 24, 2012 - Bootstrapping Institutional Capability
HPC Services Manager Gary Jung talks about the issues institutions may encounter when developing new or enhancing existing infrastructure to support data intensive science at the Winter 2012 ESCC/Internet2 Joint Techs Conference in Baton Rouge, LA this week.


Nov 3, 2011 - Supercomputers Accelerate Development of Advanced Materials

Researchers from Berkeley Lab and MIT have teamed up to develop a new tool, as part of the Materials Project, to speed up the development of new materials. The project uses supercomputing resources, including Lawrencium, to characterize the properties of inorganic compounds. More


FEATURED PROJECTS

Center for Financial Technology
We are partnered with PI John Wu of the Computational Research Division to build a 56-node, 1344-core Haswell cluster to support a collaboration between the Lab and Guggenheim Partners. The cluster is being used to investigate modeling of financial markets.

VM Service for Science
We are piloting a new VM (Virtual Machine) service that will be available to meet the needs of Lab researchers needing a new or replacement system to support their research workflows. Interested users can email hpcshelp for further information.

Big Data at the ALS
We built a Data Pipeline using a fast 400MB/s CCD, a 43,002-core GPU cluster, and a 15TB Data Transfer Node with Globus Online for PI David Shapiro to perform X-ray diffraction 3D image reconstruction at Beamline 5.3.2.1. Read more here about how their project set the microscopy record by achieving the highest resolution ever.