
Berkeley Lab, BIDS Take on Big Data

by Linda Vu on September 4, 2018

Climate and Ecosystem Sciences Division · Energy Geosciences Division · GC-Future Water · Hydrogeology Department

The world is currently generating data at a breakneck pace, about 2.5 quintillion bytes per day, and this trend is only accelerating. To make sense of this torrent of information, the Berkeley Institute for Data Science (BIDS) has built an ecosystem of researchers to advance data-analytic methods and inquiry, develop and expand software and analytics tools, and share best practices.

The BIDS ecosystem comprises an impressive network of Fellows, including some who are Lawrence Berkeley National Laboratory (Berkeley Lab) scientists. This month, several Berkeley Lab-BIDS Fellows are organizing two events to share their data-science expertise. Some are helping to organize a Machine Learning for Science (ML4Sci) Workshop in early September, where they will introduce scientists to state-of-the-art machine learning applications and train them to use these applications on massively parallel supercomputers. Later in the month, another group is hosting the California Water Data Hackathon to help address the lack of access to clean, safe drinking water that some Californians face.

“There’s a perspective that one promise of data science comes from the interdisciplinary nature of the research it enables. ‘Inter’ can mean among different fields of inquiry, but also can mean among alternative approaches to handling data-intensive workloads in research,” said BIDS Executive Director David Mongeau. “For instance, many data scientists at UC Berkeley might default to familiar infrastructure for their work, but by interacting with Berkeley Lab can explore alternative approaches made possible with high performance computing.”

Machine Learning For Science (ML4Sci)

Cori supercomputer at NERSC. (Photo by Marilyn Chung, Berkeley Lab)

Some of the Berkeley Lab researchers bringing this expertise to BIDS are Deborah Agarwal, head of the Computational Research Division’s (CRD’s) Data Science and Technology Department; Daniela Ushizima, a staff scientist in the Center for Advanced Mathematics for Energy Research Applications (CAMERA) and CRD’s Data Analytics & Visualization group; and Kristofer Bouchard, a computational bioscientist in the Biosciences Area. They are helping to organize the ML4Sci workshop, which will be held at Berkeley Lab Sept. 4-5 in conjunction with the National Energy Research Scientific Computing Center’s (NERSC’s) annual Data Day (Sept. 6-7). Other key organizers of the workshop are NERSC’s Data & Analytics Services group members Prabhat, Steve Farrell, and Mustafa Mustafa, as well as Zarija Lukić of Berkeley Lab’s Computational Cosmology Center.

The workshop will feature several UC Berkeley faculty-BIDS Fellows as keynote speakers, including Bin Yu, John Canny, Philip Stark, and Joshua Bloom. The event will introduce researchers to cutting-edge machine learning applications for high-energy physics, nuclear physics, cosmology, chemistry, biosciences, materials engineering, climate, and high performance computing. Additionally, machine learning experts will provide hands-on training in deploying these applications on supercomputers at NERSC.

“There are so many benefits from the cross-pollination of expertise and resources between Berkeley Lab and BIDS,” said Ushizima. “During the ML4Sci workshop, Berkeley Lab staff will be showcasing Jupyter tools. Today, these tools are open source and serve a variety of data science needs—for example, there are currently more than 2 million Jupyter Notebooks hosted on Github. But the root of Jupyter was pioneered by Fernando Perez, one of the founding fathers of BIDS, currently a professor in the Department of Statistics at UC Berkeley, and a Berkeley Lab researcher.”

Earlier this year, the Association for Computing Machinery honored the Jupyter Project Team for developing a tool that has had a lasting influence on computing. At Berkeley Lab, Ushizima also leads the Department of Energy Early Career project Image across Domains, Experiments, Algorithms and Learning (IDEAL).

California Water Data Hackathon

Berkeley View from Berkeley Lab. (Photo by Roy Kaltschmidt, Berkeley Lab)

Beyond scientific applications, BIDS also focuses on social impact issues. Earlier this year, when a number of state agencies, private companies and the West Big Data Innovation Hub joined forces to create the 2018 California Safe Drinking Water Data Challenge, BIDS knew it wanted to be a part of this effort. Zexuan Xu, a BIDS Data Science Fellow and a postdoctoral researcher in hydrology in Berkeley Lab’s Earth and Environmental Sciences Area, is helping to organize BIDS’ participation in this event.

As part of the challenge, BIDS is teaming up with UC Berkeley’s Division of Data Sciences to host the California Water Data Hackathon on Sept. 14-15. According to Xu, the hackathon is open to all, though most participants are expected to be undergraduate and graduate students from a variety of disciplines. The goal is to teach the students about California’s water issues, then have them use publicly available data to find innovative ways to increase community access to safe drinking water, better understand vulnerabilities, and help identify and deploy solutions.

“Up to 1 million Californians lack access to clean, safe drinking water at some point during the year. Droughts and other disruptions in water supply and contamination in water quality can limit or eliminate access to safe drinking water for days, months, or years,” said Xu. “All the topics that the hackathon participants will address are currently open questions. If they come up with interesting questions and/or solutions, we will deliver their interests to the state agencies, and encourage them to continue the research.”

In many ways the hackathon embodies the philosophy of BIDS, which takes a broad view of data science and welcomes candidates from a full range of research focuses—from digital humanities and psychology to statistics and computer science—who are interested in pushing the frontiers of data-intensive research in their own field and in cross-disciplinary collaborations.

“The greatest benefit of being a BIDS Fellow is getting to know people that work in different fields of science. I am a domain expert in earth and environmental science, but others are experts in math, software development, statistics, bioscience, etc.,” said Xu. “Because the community is so integrated, I can collaborate with mathematicians that I don’t normally have access to. We work on research projects together, then I have a chance to learn the cutting-edge research in other science areas and also share my knowledge and insights with others in my domain area.”

That benefit, and the bonds that Berkeley Lab and the University continue to strengthen, come in part through Nobel Laureate Saul Perlmutter serving as BIDS Director. He shared the 2011 Nobel Prize in Physics for the discovery of the accelerating expansion of the universe.

Although registration for the ML4Sci workshop is closed, you can still register for the California Water Data Hackathon here: https://www.eventbrite.com/e/california-water-data-hackathon-tickets-48720835330

A full list of BIDS Fellows: https://bids.berkeley.edu/people

A U.S. Department of Energy National Laboratory Managed by the University of California

Lawrence Berkeley National Laboratory · Earth and Environmental Sciences Area · Privacy & Security Notice