# Welcome to the RectifHydPlus data pipeline!

This repository contains all code required to reproduce *RectifHydPlus* from original source data. All scripts are written in `R`, following the `targets` pipeline framework ([learn more about "targets"](https://books.ropensci.org/targets/)).

*RectifHydPlus* provides estimates of monthly net generation totals for hydropower plants larger than 10 MW in the conterminous (lower 48) United States. The data fill a gap in official EIA surveys, which reverted from monthly to annual resolution for most plants after 2003. For more detail, see our [journal article](https://doi.org/10.1038/s41597-025-05323-y) in *Scientific Data*.

📢 **If you are looking to download the latest RectifHydPlus hydropower generation data, please visit [Hydrosource](https://hydrosource.ornl.gov/dataset/rectifhydplus). This repository contains only the R code required to *reproduce* RectifHydPlus.**

## 🎺 Version 1.1 Improvements 🎺

🌄 **New source data for California**. *RectifHydPlus* now benefits from monthly hydropower observations for all >10MW plants in California. Monthly data in RectifHydPlus for California from 2001 onward can now be considered observational rather than downscaled data.

🔍 **New USGS gauge identification**. *RectifHydPlus* version 1.1 adds 85 new USGS gauges linked directly to powerhouse outflows, providing significant improvements in monthly hydropower estimation across associated facilities.

🔧 **Fixes for mal-performing dams**. *RectifHydPlus* version 1.1 removes inappropriate links to ResOpsUS for eight dams. In these cases, the ResOpsUS release data represent non-powered outflows rather than turbined releases.

![Data source improvements in RectifHydPlus version 1.1](assets/images/v1.1%20data%20update.png "Data source improvements in RectifHydPlus version 1.1")


## 📝 Instructions for running the data pipeline

#### 1. Clone or download this repository

`git clone https://code.ornl.gov/turnersw/rectifhydplus.git`

#### 2. Obtain and insert source data

**A correctly formatted directory with all inputs required to run this data pipeline can be downloaded from [Hydrosource](https://hydrosource.ornl.gov/dataset/rectifhydplus).** To run the data pipeline, you must download these data and then place them directly into the `/data/` directory of this repository.
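Before running the pipeline, it can help to confirm that the inputs landed where expected. The following is a minimal sketch in base R, assuming your R session's working directory is the repository root. The helper `check_data_dir()` is hypothetical (it is not part of this repository), and it only checks that the directory exists and contains files; it does not verify individual file names.

```r
# Hypothetical helper (not part of this repository): report whether a
# directory exists and contains at least one file.
check_data_dir <- function(path = "data") {
  if (!dir.exists(path)) {
    return(list(
      ok = FALSE,
      n_files = 0L,
      message = paste0("Directory not found: ", path)
    ))
  }
  # Count files in the directory and all of its subdirectories.
  n_files <- length(list.files(path, recursive = TRUE))
  list(
    ok = n_files > 0L,
    n_files = n_files,
    message = if (n_files > 0L) {
      paste0("Found ", n_files, " file(s) in ", path)
    } else {
      paste0("Directory is empty: ", path)
    }
  )
}

# Assumes the working directory is the repository root.
status <- check_data_dir("data")
message(status$message)
```

If the check reports a missing or empty directory, download the inputs from Hydrosource again and confirm they were unpacked into `data/` rather than into a nested subfolder.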


#### 3. Run the pipeline

Open an R session and run the following:
```
library(targets)
tar_make()
```

To visualize the dependency graph:
```
tar_visnetwork()
```

To view information on the targets:
```
tar_manifest()
```

([Learn more about "targets"](https://books.ropensci.org/targets/))


## 🚧 Updates under construction

We are planning the following enhancements to *RectifHydPlus* in 2026:

📆 Extension of the monthly dataset to 1980-2024, for a 45-year hydropower reanalysis dataset.

5️⃣ Threshold adjustment from 10MW to 5MW, made possible by new data sources (resulting in approximately 300 additional plants).

🔍 Further identification of problem cases and introduction of new proxy data and observations where available.

See [issues](https://code.ornl.gov/turnersw/rectifhydplus/-/issues) for further details on planned improvements to RectifHydPlus, and feel free to add your own issues!


## 🤝 Contributing
Contributions are welcome! Please open an [issue](https://code.ornl.gov/turnersw/rectifhydplus/-/issues) to propose ideas, or submit a merge request. For substantial changes, include a short design note and relevant tests or example updates.

## Developers and contact

[Sean Turner](https://www.ornl.gov/staff-profile/sean-turner), Oak Ridge National Lab

[A.B. Siddik](https://www.ornl.gov/staff-profile/ab-siddik), Oak Ridge National Lab

Contact [turnersw\@ornl.gov](mailto:turnersw@ornl.gov)

## License

BSD 2-Clause License.

## Acknowledgements

*RectifHydPlus* depends on the following source data:

📁 **Energy Information Administration** -- [EIA-923](https://www.eia.gov/electricity/data/eia923/) & [EIA-860](https://www.eia.gov/electricity/data/eia860/)
    
📁 **Oak Ridge National Laboratory** -- [Dayflow](https://hydrosource.ornl.gov/data/datasets/dayflow-v2/), [EHA](https://hydrosource.ornl.gov/data/datasets/eha2024/), [HILARRI](https://hydrosource.ornl.gov/data/datasets/hilarri-v3/), & [HESC](https://hydrosource.ornl.gov/data/datasets/hydropower-energy-storage-capacity-dataset/)
    
📁 **National Renewable Energy Laboratory** -- [ReEDS regions](https://github.com/NREL/ReEDS-2.0)

📁 **U.S. Geological Survey** -- [Water Data for the Nation](https://waterdata.usgs.gov/nwis/rt)

📁 **California Energy Commission** -- [WebQFER Source Files](https://www.energy.ca.gov/files/webqfer-source-files)

📁 [Global Reservoir and Dams Database version 1.3](https://www.globaldamwatch.org/grand/)

📁 [ResOpsUS](https://zenodo.org/records/6612040)

📁 [Inferred Storage Targets and Release Functions for CONUS](https://zenodo.org/records/4602277)


*RectifHydPlus* is supported by the U.S. Department of Energy Water Power Technologies Office as part of the [HydroSource](https://hydrosource.ornl.gov/) Initiative.