GoiEner smart meters data
- Granja, Carlos Quesada 1
- Hernández, Cruz Enrique Borges 1
- Astigarraga, Leire 2
- Merveille, Chris 2
- 1 Universidad de Deusto
- 2 GoiEner
Publisher: Zenodo
Publication year: 2022
Type: Dataset
Abstract
<strong>Name</strong>: GoiEner smart meters data
<strong>Summary</strong>: The dataset contains hourly time series of electricity consumption (kWh) provided by the Spanish electricity retailer GoiEner. The time series are arranged in four compressed files, accompanied by a metadata file:
- <strong>raw.tzst</strong> contains the raw time series of all GoiEner clients (any date, any length, possibly with missing samples).
- <strong>imp-pre.tzst</strong> contains processed time series (missing samples imputed), longer than one year, collected before March 1, 2020.
- <strong>imp-in.tzst</strong> contains processed time series (missing samples imputed), longer than one year, collected between March 1, 2020 and May 30, 2021.
- <strong>imp-post.tzst</strong> contains processed time series (missing samples imputed), longer than one year, collected after May 30, 2021.
- <strong>metadata.csv</strong> contains relevant information for each time series.
<strong>License</strong>: CC-BY-SA
<strong>Acknowledgement</strong>: These data have been collected in the framework of the WHY project. This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 891943.
<strong>Disclaimer</strong>: The sole responsibility for the content of this publication lies with the authors. It does not necessarily reflect the opinion of the Executive Agency for Small and Medium-sized Enterprises (EASME) or the European Commission (EC). EASME or the EC are not responsible for any use that may be made of the information contained therein.
<strong>Collection Date</strong>: From November 2, 2014 to June 8, 2022.
<strong>Publication Date</strong>: December 1, 2022.
<strong>DOI</strong>: 10.5281/zenodo.7362094
<strong>Other repositories</strong>: None.
<strong>Author</strong>: GoiEner, University of Deusto.
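The three imputed archives listed in the Summary correspond to three consecutive collection periods. As a minimal sketch of that split, the function below assigns a timestamp to one of the periods. The names `period_of`, `IN_START` and `POST_START` are hypothetical, and the boundaries assume that "between March 1, 2020 and May 30, 2021" is inclusive on both ends and that the last period starts on May 31, 2021; the record does not spell out how boundary hours are treated.

```python
from datetime import datetime

# Assumed period boundaries (not confirmed by the dataset record):
# "in" runs from March 1, 2020 through the end of May 30, 2021.
IN_START = datetime(2020, 3, 1)
POST_START = datetime(2021, 5, 31)


def period_of(timestamp: datetime) -> str:
    """Return "pre", "in" or "post" for a naive local timestamp."""
    if timestamp < IN_START:
        return "pre"   # would belong in imp-pre.tzst
    if timestamp < POST_START:
        return "in"    # would belong in imp-in.tzst
    return "post"      # would belong in imp-post.tzst
```

A call such as `period_of(datetime(2020, 6, 15, 12))` falls in the middle period under these assumptions.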
<strong>Objective of collection</strong>: This dataset was originally used to establish a methodology for clustering households according to their electricity consumption.
<strong>Description</strong>: The meaning of each column is described next for each file.
<strong>raw.tzst</strong>: (no column names provided) timestamp; electricity consumption in kWh.
<strong>imp-pre.tzst</strong>, <strong>imp-in.tzst</strong>, <strong>imp-post.tzst</strong>:
- “<em>timestamp</em>”: timestamp;
- “<em>kWh</em>”: electricity consumption in kWh;
- “<em>imputed</em>”: binary value indicating whether the row has been obtained by imputation.
<strong>metadata.csv</strong>:
- “<em>user</em>”: 64-character hash identifying a user;
- “<em>start_date</em>”: initial timestamp of the time series;
- “<em>end_date</em>”: final timestamp of the time series;
- “<em>length_days</em>”: number of days elapsed between the initial and the final timestamps;
- “<em>length_years</em>”: number of years elapsed between the initial and the final timestamps;
- “<em>potential_samples</em>”: number of samples that the time series would contain between its initial and final timestamps if there were no missing values;
- “<em>actual_samples</em>”: number of actual samples of the time series;
- “<em>missing_samples_abs</em>”: number of potential samples minus actual samples;
- “<em>missing_samples_pct</em>”: potential samples minus actual samples, as a percentage;
- “<em>contract_start_date</em>”: contract start date;
- “<em>contract_end_date</em>”: contract end date;
- “<em>contracted_tariff</em>”: type of tariff contracted (2.X: households and SMEs, 3.X: SMEs with high consumption, 6.X: industries, large commercial areas, and farms);
- “<em>self_consumption_type</em>”: the type of self-consumption to which the user is subscribed;
- “<em>p1</em>”, “<em>p2</em>”, “<em>p3</em>”, “<em>p4</em>”, “<em>p5</em>”, “<em>p6</em>”: contracted power (in kW) for each of the six time slots;
- “<em>province</em>”: province where the user is located;
- “<em>municipality</em>”: municipality where the user is located (municipalities below 50,000 inhabitants have been removed);
- “<em>zip_code</em>”: post code (post codes of municipalities below 50,000 inhabitants have been removed);
- “<em>cnae</em>”: CNAE (<em>Clasificación Nacional de Actividades Económicas</em>) code for economic activity classification.
<strong>5 star</strong>: ⭐⭐⭐
<strong>Preprocessing steps</strong>: Data cleaning (imputation of missing values using the Last Observation Carried Forward algorithm with weekly seasons); data integration (combination of multiple SIMEL files, i.e. the data sources); data transformation (anonymization, unit conversion, metadata generation).
<strong>Reuse:</strong> This dataset is related to the following datasets:
- "A database of features extracted from different electricity load profiles datasets" (DOI 10.5281/zenodo.7382818), where time series feature extraction has been performed.
- "Measuring the flexibility achieved by a change of tariff" (DOI 10.5281/zenodo.7382924), where the metadata has been extended to include the results of a socio-economic characterization and the answers to a survey about barriers to adapting to a change of tariff.
<strong>Update policy:</strong> There might be a single update in mid-2023.
<strong>Ethics and legal aspects:</strong> The data provided by GoiEner contained values of the CUPS (Meter Point Administration Number), which are personal data. A pre-processing step has been carried out to replace the CUPS by random 64-character hashes.
<strong>Technical aspects:</strong> <strong>raw.tzst</strong> contains a 15.1 GB folder with 25,559 CSV files; <strong>imp-pre.tzst</strong> contains a 6.28 GB folder with 12,149 CSV files; <strong>imp-in.tzst</strong> contains a 4.36 GB folder with 15,562 CSV files; and <strong>imp-post.tzst</strong> contains a 4.01 GB folder with 17,519 CSV files.
<strong>Other:</strong> None.
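The preprocessing steps name the imputation technique (Last Observation Carried Forward with weekly seasons) but not its implementation. One plausible reading is that a missing hourly sample is filled with the value observed at the same hour one week earlier. The sketch below illustrates that reading only; the function name `impute_weekly_locf`, the one-week lag, the use of naive timestamps (daylight saving changes are ignored) and the decision to leave gaps in the first week unfilled are all assumptions, not the authors' code.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional, Tuple

HOUR = timedelta(hours=1)
WEEK = timedelta(weeks=1)  # assumed season length: 168 hourly samples


def impute_weekly_locf(
    samples: Dict[datetime, float],
) -> List[Tuple[datetime, Optional[float], int]]:
    """Return (timestamp, kWh, imputed) rows on a complete hourly grid.

    A missing hour takes the value from exactly one week earlier. That value
    may itself have been imputed, so a gap longer than a week repeats the
    last known week. Gaps with no value one week back stay None.
    """
    start, end = min(samples), max(samples)
    filled = dict(samples)  # copy so the caller's data is not modified
    rows = []
    t = start
    while t <= end:
        imputed = 0
        if t not in filled:
            previous = filled.get(t - WEEK)
            if previous is not None:
                filled[t] = previous
                imputed = 1
        rows.append((t, filled.get(t), imputed))
        t += HOUR
    return rows
```

The returned rows mirror the “timestamp”, “kWh” and “imputed” columns described for the imp-*.tzst files, so the flag marks exactly the samples that were filled in rather than measured.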