Data Science for Renewable Energy

  • Greenbyte, 16 Östra Hamngatan, 411 06 Göteborg, Sweden

Welcome to the second meetup of this autumn! This time the focus is on renewable energy, and Greenbyte will present no fewer than three talks. Topics include the challenges of applying data science and machine learning in renewable energy, how to work with the open-source Luigi framework, and a deep dive into autoencoder models, model interpretability and some of Greenbyte's ongoing work. See the abstracts below for detailed descriptions of each topic.

This meetup will be recorded and posted on our YouTube channel later.

From Digitization to Digital Transformation (by Pramod Bangalore)

90% of all digital data has been generated in the past few years. Every day we create 2.5 quintillion bytes of data. The world is on its way to becoming digital, and this brings a plethora of opportunities. There is inherent value in all the data that is being collected. Machine learning and AI can play an instrumental role in making businesses more profitable by enabling informed, data-driven decision making. However, a survey by MIT Sloan Management Review found that only one in five organizations has incorporated AI in an offering or a process.

We at Greenbyte, as a B2B service provider, focus on giving our customers the right support to take small steps towards digitally transforming their businesses with fully integrated and scalable ML and AI services. In this talk we will share our experience of the challenges in using ML and AI to accelerate digital transformation.

Data Science with evolving data (by Edmund Hood Highcock)

Data is like shifting sand, so how do we build tools and models upon this foundation? One approach is to use a data version control system, a bit like git for data. Many tools exist for this. But these tools assume a clear separation between training and running models. Training happens on a versioned set of data. Running the models happens on current data. 

But what if we want to continually train on changing data? What if we want to run models on changing data and append the results to previous runs? What if we want to ensure that calculations are only repeated when necessary, i.e. when the data has changed, and we want this to happen automatically? What if we want these calculations to run in parallel, with dependencies resolved automatically, on multiple platforms, without expensive dedicated hardware or proprietary software solutions?

We present a data science workflow, based on the open-source Luigi framework, which uses rolling data versioning and checksum-based data storage to provide all these features while running anywhere from the laptop to the cloud. Using this system we ensure that we are always up-to-date while minimizing our use of resources; a flexible, scalable, data-version-aware system.
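The idea of repeating a calculation only when its input data has changed can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the workflow presented in the talk and not Luigi's API: the class `ChecksumCache`, the task name `"mean_power"` and the example function are all hypothetical, and an in-memory dictionary stands in for real storage.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return a SHA-256 hex digest that identifies this exact version of the data."""
    return hashlib.sha256(data).hexdigest()


class ChecksumCache:
    """Hypothetical store that keys each result on (task name, input checksum)."""

    def __init__(self):
        self.results = {}
        self.computations = 0  # counts how often real work was done

    def run(self, task_name, func, data: bytes):
        key = (task_name, checksum(data))
        if key not in self.results:
            # Unseen data version: compute and store the result.
            self.results[key] = func(data)
            self.computations += 1
        # Known data version: reuse the stored result.
        return self.results[key]


def mean_power(data: bytes) -> float:
    """Example calculation: mean of comma-separated power readings."""
    values = [float(v) for v in data.decode().split(",")]
    return sum(values) / len(values)


cache = ChecksumCache()
first = cache.run("mean_power", mean_power, b"1.0,2.0,3.0")
again = cache.run("mean_power", mean_power, b"1.0,2.0,3.0")      # unchanged data, no recomputation
updated = cache.run("mean_power", mean_power, b"1.0,2.0,3.0,6.0")  # new data version, recomputed
print(first, again, updated, cache.computations)
```

In a Luigi-based setup a comparable effect could plausibly be obtained by including the checksum in a task's output path, since Luigi by default treats a task as complete once its outputs exist; whether the presented system works this way is an assumption.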

Condition monitoring system for wind turbines based on deep autoencoders (by Johanna Renman)

Component failures in wind turbines (WT) can lead to both high repair costs and long downtimes. Often, failures do not occur instantaneously but rather as a consequence of gradual degradation. By detecting this degradation in advance, preventive maintenance can be performed and critical failures avoided.

Previously, machine learning methods such as polynomial fitting have been developed to detect anomalous behaviour in the sensor data. However, they usually monitor only one component at a time; a system that covers all aspects of a WT's operation would therefore require a different model for each component. Such a system would quickly become difficult to manage at scale. An alternative anomaly detection model is the autoencoder, a neural network that reconstructs all its input signals. This model can monitor a WT holistically; a single model has the potential to detect failures in multiple components, using many channels of information simultaneously.
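To make the reconstruction idea concrete, here is a toy sketch in plain Python. It is a tiny linear autoencoder with tied weights and a single latent value, trained by gradient descent, and is not the deep model described in the talk. The three channels (power, rotor speed, bearing temperature), their scaling and all data values are invented for illustration. The model is trained on healthy data only, so a sample that breaks the learned relationship between channels reconstructs poorly and gets a high anomaly score.

```python
import math

# Hypothetical, pre-scaled channels: (power, rotor speed, bearing temperature).
# Assumption: in healthy operation all channels follow one pattern scaled by the load.
PATTERN = (1.0, 0.8, 0.6)


def healthy_sample(i):
    load = i / 50.0
    noise = 0.01 * math.sin(i)  # small deterministic noise on the temperature channel
    return (load * PATTERN[0], load * PATTERN[1], load * PATTERN[2] + noise)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def reconstruct(w, x):
    """Encode x to one latent value, then decode it with the same (tied) weights."""
    latent = dot(w, x)
    return tuple(latent * wk for wk in w)


def reconstruction_error(w, x):
    """Squared reconstruction error, used here as the anomaly score."""
    return sum((xk - rk) ** 2 for xk, rk in zip(x, reconstruct(w, x)))


def train(samples, steps=500, lr=0.1):
    """Minimise the mean squared reconstruction error by full-batch gradient descent."""
    w = [0.5, 0.5, 0.5]
    n = len(samples)
    for _ in range(steps):
        grad = [0.0, 0.0, 0.0]
        for x in samples:
            latent = dot(w, x)
            residual = [xk - latent * wk for xk, wk in zip(x, w)]
            rw = dot(residual, w)
            for k in range(3):
                # Gradient of |x - w (w . x)|^2 with respect to w[k].
                grad[k] += -2.0 * (latent * residual[k] + rw * x[k]) / n
        w = [wk - lr * gk for wk, gk in zip(w, grad)]
    return w


healthy = [healthy_sample(i) for i in range(1, 51)]
weights = train(healthy)
healthy_scores = [reconstruction_error(weights, x) for x in healthy]

# Same power and rotor speed as a healthy sample, but a much hotter bearing.
overheated = (0.8, 0.64, 1.0)
anomaly_score = reconstruction_error(weights, overheated)
print(max(healthy_scores), anomaly_score)
```

Because one model reconstructs every channel, a fault in any of them raises the same score; the per-channel residuals then indicate which signal deviates.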

In this presentation, we describe a strategy for designing autoencoder models that capture anomalous behaviour in SCADA data. We will present various hyperparameters that affect the performance of the models. We are continuing to work on improving the interpretability of the model and will present the latest results from that work.


Headquartered in Gothenburg, Sweden, Greenbyte is on a mission to build the data hub for renewables. With a rapidly increasing share of renewables in the energy system, new problems appear that are solved by combining large quantities of data from many sources. Our product, Greenbyte Energy Cloud, is a data aggregation and management platform built for rapidly growing renewable energy portfolios seeking to extract more energy from renewable resources.


18:00 Everyone welcome for some food and drinks
18:30 Presentation starts
20:00 Presentation done, everyone is welcome to stay and mingle
21:00 The door closes
