Improving Data Center Water Efficiency via Online Resource Management (NSF-CNS-1565474, completed)

Overview

Data centers, where our cloud and many Internet services reside, are so thirsty that they consume millions of gallons of water each day. Nonetheless, very little attention has been paid by the research community to data centers’ water footprints!

The rapid growth of cloud computing has pushed many large IT companies to expand the number and scale of their data centers. While energy consumption and its associated emissions have been regarded as critical concerns by data center operators and environmental observers, the fast expansion of data centers also leaves an astonishing water footprint and strains the rapidly depleting groundwater supply. For example, it is reported that the U.S. National Security Agency's massive data center in Bluffdale, Utah, will consume 1.7 million gallons of fresh water each day to cool its servers. In addition to onsite cooling water, data centers also indirectly consume an enormous amount of water embedded in electricity generation, which, even excluding hydroelectricity, is estimated to consume 1.8 liters of water per kilowatt-hour (L/kWh) of electricity in the U.S.

Although some of the largest data center operators (e.g., Google and Microsoft) have begun to slash their onsite cooling water usage, the overall water footprint (including both onsite and offsite water) is still one of the forgotten aspects of cloud computing from the perspective of sustainability. In fact, while we've been talking for years about reducing energy use, electricity cost, and carbon footprint, and have undoubtedly made remarkable progress (e.g., lowering PUE from 2.0 to less than 1.1), the water footprint of data centers has received far less attention from the research community, silently presenting an emerging challenge to water sustainability.

[Figure: Power plant is "drinking" water (from: www.cleantechnica.com)]
[Figure: 240,000-gallon water storage tank at Google's data center in Berkeley County, SC (from: www.google.com/about/datacenters/)]
[Figure: Google's data center in The Dalles, OR, is "drinking" water at dusk (from: www.google.com/about/datacenters/)]

Why should we care about data centers’ water footprints?

Here are some simple facts.
  1. See LBNL's “Guideline for Water and Energy Considerations During Federal Data Center Consolidations”, prepared for the U.S. DoE Federal Energy Management Program

  2. See California's drought emergency

  3. Green certifications: The 2013 survey by Uptime Institute shows that a vast majority of the 1,000 data centers surveyed are actively seeking green certifications (e.g., LEED by U.S. Green Building Council), which often carry tax credits as well as other benefits. For these green certifications, energy efficiency is just ONE component; water conservation is another key factor and often a prerequisite!

  4. Facebook open sourced its dashboard code for monitoring data center water usage (and PUE), in the hope that more data centers will pay attention to water efficiency. (Link)

  5. AT&T's cooling towers in large (data center) facilities consumed 1 billion gallons of water in 2012, about 30% of the entire company's water consumption. (Link)

California declared drought emergency on January 17, 2014

As California is experiencing a record drought, Gov. Jerry Brown urges a 20% cut in water consumption; mandatory conservation measures could be coming soon.

  1. NBC: California governor declares drought emergency, asks for conservation

  2. CNN: California fights wildfire, expects more as drought emergency declared

  3. LA Times: California declares drought emergency

  4. The Wall Street Journal: California governor declares drought emergency

  5. “Businesses have been ordered to cut water use 35%” in certain areas (source)

  6. Sacramento and Folsom have issued mandatory water usage restrictions (source)

  7. Parts of 11 states, including Arkansas, California, Colorado, Hawaii, Idaho, Kansas, New Mexico, Nevada, Oklahoma, Texas and Utah, are designated as drought disaster areas (source)

  8. Undoubtedly, data centers are extremely important for California…

See AT&T's 2012 Water Sustainability Report: Here
"Our Cloud is Thirsty"
Related work.

Why do data centers consume water?

As mentioned above, data centers consume water both directly and indirectly. Here, I would like to first draw the readers’ attention to the difference between water withdrawal and water consumption. The former refers to getting water from somewhere (e.g., public water facilities), whereas the latter refers to “losing” water (e.g., into the environment via evaporation) and producing waste water (e.g., into sewage systems). Both water withdrawal and consumption deserve our attention: water withdrawal puts increasing pressure on the water supply side as the population continues to grow (e.g., thermal electricity generation accounts for 53% of fresh surface-water withdrawals in the U.S.), while water consumption threatens long-term water sustainability and availability. In what follows, I will focus on water consumption (also referred to interchangeably as water usage wherever applicable), which has an immediate impact on the availability of groundwater.

  • Offsite/indirect water consumption: Indirect water consumption depends on electricity generation methods as well as cooling techniques and is mostly attributed to the evaporation process for steam condensation in cooling towers (typically required by nuclear and thermal electricity generation). While certain types of electricity (e.g., from solar photovoltaics and wind) consume virtually zero water, “water-free” electricity only makes up a very small portion of the total electric generation capacity (e.g., less than 10% in the U.S.). Overall, a non-negligible amount of water is “lost/consumed”: considering the power transmission loss but excluding hydroelectricity, the U.S. national average water consumption is 1.8 L/kWh (also referred to as Energy Water Intensity Factor or EWIF).

[Figure: Cooling system for data center]
  • Onsite/direct water consumption: Large data centers often employ water-cooled chiller systems as their cooling systems, thereby consuming fresh, clean (but not necessarily drinkable) water directly: water evaporates in cooling towers as a heat rejection mechanism. The figure to the left illustrates a cooling tower and the water flow for cooling data centers. Cooling towers and/or chillers are not required under suitable climate conditions. Nonetheless, even with a state-of-the-art cooling system combining cold outside air and evaporative cooling (without cooling towers or chillers), the trailing 12-month water efficiency of Facebook's data center in Prineville, OR, is still 0.52 L/kWh as of March 2013. To the best of my knowledge, Facebook is the only company reporting its real-time water usage information online as of March 2013.

Combining both direct and indirect water consumption, data centers’ water footprints are now surfacing as a critical concern for future sustainability. Even with state-of-the-art facilities, Facebook's data center in Prineville, OR, is estimated to consume an average of more than 3.6 L of water (both direct and indirect) per kWh of IT energy. The overall water footprint will be enormous considering the mega scale of data centers, and may sometimes even exceed the capacity of local water utilities (as attested by the example of Microsoft's data center in Northlake, Illinois). To sum up, the growing water footprint of data centers can no longer be neglected and deserves careful attention from the research community.
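The way direct and indirect consumption combine can be sketched in a few lines of Python. This is a minimal illustration under an assumed model that this page does not spell out: onsite water is direct WUE times IT energy, and offsite water is EWIF times facility energy, with facility energy taken as PUE times IT energy. The function name, the PUE of 1.1 and the 1,000 kWh of IT energy are hypothetical; the 0.52 L/kWh and 1.8 L/kWh figures are the ones quoted above.

```python
def water_footprint_liters(it_energy_kwh, direct_wue, pue, ewif):
    """Estimate onsite, offsite and total water consumption in liters.

    Assumed model (an illustration, not a formula taken from this page):
        onsite  = direct_wue * IT energy
        offsite = ewif * pue * IT energy   (facility energy = PUE * IT energy)
    """
    onsite = direct_wue * it_energy_kwh
    offsite = ewif * pue * it_energy_kwh
    return onsite, offsite, onsite + offsite


# 0.52 L/kWh (direct WUE) and 1.8 L/kWh (U.S. average EWIF) are quoted above;
# the PUE of 1.1 and the 1,000 kWh of IT energy are hypothetical inputs.
onsite, offsite, total = water_footprint_liters(
    1000.0, direct_wue=0.52, pue=1.1, ewif=1.8
)
print(f"onsite={onsite:.0f} L, offsite={offsite:.0f} L, total={total:.0f} L")
```

With these inputs the offsite share dominates the total. The national-average EWIF used here need not match the local value behind any site-specific estimate, so the result should not be read as a figure for a particular data center.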

What has been done?

Despite its emergence as a critical concern for data centers, water footprint has been largely and unfortunately neglected. Just as James Hamilton (Amazon) said in 2009, “water is tomorrow's big problem. The water consumption (in data centers) is super embarrassing. It just doesn't feel responsible. We need designs that stop using water.” In recent years, some large IT companies such as Microsoft and Google have taken impressive steps towards reducing their water footprints. To summarize, the existing approaches to saving water in data centers can be classified as follows.

  • Leveraging outside cold air: Data centers built in cold regions (e.g., Dublin) can leverage “free” cooling by pushing outside cold air into data center computer rooms where hot air and cool air mix to remove heat generated by high-density servers. Thus, cooling towers, where water evaporates to reject the heat into the environment, can be eliminated. In some cooling systems (e.g., the one employed by Facebook's data center in Prineville, Oregon), outside air is mixed with water sprayed by misting nozzles before entering computer rooms to maintain appropriate operating temperature and humidity. Such cooling systems, which combine outside air with evaporative cooling, still consume a non-negligible amount of water: as of March 2013, the cooling water usage at Facebook's data center in Oregon still reaches 0.52 L per kWh of IT energy.

  • Using non-potable water: Google and Microsoft have been using recycled/waste/sea water in lieu of potable water for cooling their servers. The water will be treated prior to entering their data centers’ cooling system.

  • Reusing warm water for heating: Warm water returned from data centers may be reused for heating offices and residential buildings. Thus, cooling towers are not necessarily required.

These engineering-based approaches, however, mostly concentrate on improving cooling facilities and suffer from one or more of the following limitations. First, they require appropriate climate conditions and/or desirable locations that are not available to all data centers (e.g., “free air cooling” is best suited to cold areas such as Dublin, where Google has one data center). Second, they do not address indirect offsite water consumption. Last but not least, some of these approaches, such as building water treatment facilities, often require substantial capital investments that may not be affordable for all data center operators.

  • Media attention: Besides the existing engineering efforts made by large data center operators, data centers’ water footprints have received much media attention. I'll list a few representative media reports as follows, and a more comprehensive list can be found here: list of media reports.

  1. Data Centers and Hidden Water Use (The Wall Street Journal)

  2. Do You Know the Hydro-Footprint Of Your Data Center?

  3. Google Greens Up Data Center With Recycled Water For Cooling

  4. New Utah NSA center requires 1.7M gallons of water daily to operate

  5. Data Center Water Usage: An Emerging Challenge

  6. Data Center Water Use Moves to the Forefront

  7. When You Feed Data To Your Data Centers, They Get Thirsty Too

  8. Water, Water, Everywhere and Not A Drop to Drink…

  9. Water Usage Effectiveness: A Green Grid Data Center Sustainability Metric

  10. Data Centers Not Just Power Hungry, They're Thirsty, Too

  11. Data Centers are Huge Water Users

  12. Water Usage Effectiveness As An Important Data Center Metric

  13. Quenching the Thirst of Power Hungry Data Centers

  14. 4 Hidden Data Center Costs (And How to Avoid Them)

My work

Water efficiency is emerging as a growing priority for data centers, but the pace of its innovation lags far behind that of energy efficiency. The objective of my research is therefore to reduce the water footprints of data centers via software-based online resource management, which is complementary to existing data center research and fundamentally differs from engineering-based water-saving techniques (e.g., the cooling facility improvements made by Google and Microsoft). I have done some preliminary work to optimize data centers’ water efficiency. In what follows, I'll list some of my recent work.

  • Optimizing Water Efficiency in Distributed Data Centers:

[Figure: Energy fuel mix for California]
I begin my research by investigating the characteristics of data centers’ water consumption. In particular, I identify temporal and spatial diversities of data center water usage effectiveness (WUE): data centers’ WUE changes over time and also over location.

Temporal diversity
The temporal diversity has two sources. First, temporal changes in the outside environment (e.g., temperature/humidity) affect the usage of cooling water. Second, a power plant's electricity production consists of time-varying mixes of energy fuel sources, each of which requires a different amount of water per unit of electricity generated (e.g., thermal electricity consumes a large amount of water while wind electricity consumes virtually zero water). Readers may refer to Facebook's dashboard at https://fbpuewue.com to view temporal variations of direct WUE (although Facebook is not using cooling towers). The figure to the left shows a snapshot of time-varying energy fuel mixes in California, which lead to a time-varying indirect WUE (also referred to as the Electricity Water Intensity Factor, which measures the water consumption per unit of electricity production).
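To make the link between fuel mix and indirect WUE concrete, the sketch below computes EWIF for a time slot as the generation-weighted average of per-fuel water intensities. The per-fuel intensities and the two hourly mixes are hypothetical placeholders chosen for illustration, not measured data, and the function name is my own.

```python
# Hypothetical water intensities in L/kWh per fuel type (placeholders only).
WATER_INTENSITY = {"thermal": 2.0, "nuclear": 2.5, "solar": 0.0, "wind": 0.0}


def ewif(fuel_mix_kwh, intensity=WATER_INTENSITY):
    """Return the generation-weighted average water intensity in L/kWh.

    fuel_mix_kwh maps each fuel type to the electricity it generated
    during one time slot.
    """
    total_kwh = sum(fuel_mix_kwh.values())
    if total_kwh <= 0:
        raise ValueError("fuel mix must contain positive total generation")
    water_liters = sum(intensity[fuel] * kwh for fuel, kwh in fuel_mix_kwh.items())
    return water_liters / total_kwh


# Two hypothetical time slots: no solar at night, more solar at midday.
hourly_mix = [
    {"thermal": 600.0, "nuclear": 200.0, "solar": 0.0, "wind": 200.0},
    {"thermal": 400.0, "nuclear": 200.0, "solar": 300.0, "wind": 100.0},
]
hourly_ewif = [ewif(mix) for mix in hourly_mix]
print(hourly_ewif)
```

In this toy example the midday slot has a lower EWIF than the night slot because a larger share of generation comes from sources assumed to consume no water, which is the kind of variation a water-aware scheduler could exploit.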

Spatial diversity
The spatial diversity can be easily understood: power plants in different places use different energy fuel mixes and/or cooling towers, resulting in spatial differences in EWIF (or indirect WUE); data centers in different places have different temperatures/humidities/cooling techniques/server configurations, etc., all of which will jointly affect direct WUE.

By exploiting the temporal and spatial diversities of data centers’ water efficiency, I proposed a new geographic load balancing (GLB) algorithm to dynamically schedule workloads to water-efficient data centers while satisfying a set of constraints such as cost and delay. To the best of my knowledge, the new resource management solutions (although still in their infancy) represent the first research efforts to address the emerging issue of data centers’ water footprints.
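The idea of steering workloads toward water-efficient sites can be illustrated with a deliberately simplified greedy dispatcher. It is not the GLB algorithm from the papers below: it handles a single time slot, enforces only a capacity constraint, and ignores the cost and delay constraints mentioned above. The class, the per-request water model (IT energy per request times direct WUE plus PUE times EWIF) and all numbers are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class DataCenter:
    name: str
    capacity: float         # max requests served in this time slot
    kwh_per_request: float  # IT energy per request
    direct_wue: float       # onsite water per kWh of IT energy (L/kWh)
    pue: float              # facility energy / IT energy
    ewif: float             # offsite water per kWh of electricity (L/kWh)

    def water_per_request(self):
        # Assumed model: onsite water plus offsite water per request.
        return self.kwh_per_request * (self.direct_wue + self.pue * self.ewif)


def dispatch(workload, data_centers):
    """Greedily fill the most water-efficient sites first."""
    if workload > sum(dc.capacity for dc in data_centers):
        raise ValueError("workload exceeds total capacity")
    allocation = {}
    remaining = workload
    for dc in sorted(data_centers, key=lambda d: d.water_per_request()):
        share = min(remaining, dc.capacity)
        allocation[dc.name] = share
        remaining -= share
    return allocation


# Two hypothetical sites; site_b is more water-efficient in this slot.
sites = [
    DataCenter("site_a", 600.0, 0.001, direct_wue=0.5, pue=1.2, ewif=2.0),
    DataCenter("site_b", 800.0, 0.001, direct_wue=0.2, pue=1.1, ewif=1.0),
]
allocation = dispatch(1000.0, sites)
print(allocation)
```

Because WUE and EWIF change over time, such a dispatcher would be re-run in every time slot with fresh values; a realistic scheduler would also have to trade water savings against electricity cost and response delay.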

Selected papers
  1. M. A. Islam, K. Ahmed, H. Xu, N. H. Tran, G. Quan, and S. Ren, “Exploiting Spatio-Temporal Diversity for Water Saving in Geo-Distributed Data Centers,” to appear at IEEE Transactions on Cloud Computing, 2016.

  2. M. A. Islam, S. Ren, G. Quan, M. Z. Shakir, and A. V. Vasilakos, “Water-Constrained Geographic Load Balancing in Data Centers,” to appear at IEEE Transactions on Cloud Computing (Special Issue on Green and Energy-Efficient Cloud Computing), 2015.

  3. K. Ahmed, M. A. Islam, S. Ren, and G. Quan, “Can Data Center Become Water Self-Sufficient?” 6th Workshop on Power-Aware Computing and Systems (HotPower, co-located with OSDI), 2014.

  4. S. Ren, “Optimizing Water Efficiency in Distributed Data Centers,” Conference on Cloud and Green Computing (CGC), 2013. [PDF]

  5. M. A. Islam, K. Ahmed, S. Ren, and G. Quan, “Exploiting Temporal Diversity of Water Efficiency to Make Data Center Less 'Thirsty',” USENIX International Conference on Autonomic Computing (ICAC), 2014. [PDF]

Outreach activities

  1. 11/2014: “Burn Your Brain” Workshop at Florida International University

Participants

  1. Shaolei Ren (PI)

  2. Gang Quan (co-PI)

  3. Mohammad A. Islam

  4. Kishwar Ahmed

  5. Hasan Mahmud

  6. Soamar Homsi

Some related work

  1. R. Sharma, A. Shah, C. Bash, T. Christian, and C. Patel. Water efficiency management in datacenters: Metrics and methodology. In ISSST, 2009.

  2. E. Frachtenberg. Holistic datacenter design in the open compute project. Computer, 45(7):83-85, July 2012.

  3. C. Bash, T. Cader, Y. Chen, D. Gmach, R. Kaufman, D. Milojicic, A. Shah, and P. Sharma. Cloud sustainability dashboard, dynamically assessing sustainability of data centers and clouds. HP Labs Tech. Report (HPL-2011-148).

  4. D. Alger, Grow a Greener Data Center, Cisco Press (ISBN-13: 978-1587058134), 2009.

Acknowledgement

This project is supported in part by the U.S. National Science Foundation (NSF) under the grant CNS-1565474. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the NSF.