Automatic filling of data gaps.

Description | A lift station can run up to three water pumps to lift the water that arrives continuously at the station. The number of pumps working simultaneously changes at certain water level thresholds, whose values are known. The following picture shows, for a whole day, the water level in orange (scale on the right) and the total power consumption in blue, from which it is easy to infer how many pumps are working at each time.

The data are sampled every 5 minutes, so the recorded level evolution may not show whether a given threshold was reached between two samples. The objective of this problem is to determine the blue curve from the data of the orange curve alone. An Excel file with the data from a whole year is available, so that a model can be developed with a large training set.
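
As a starting point, the threshold logic described above can be sketched as a simple rule-based baseline in Python. This is only an illustration under assumptions that are not part of the problem statement: the function name infer_pump_counts is hypothetical, the threshold values are invented placeholders (the real ones come with the data), and it is assumed that each pump has a separate switch-on and switch-off level.

```python
from typing import List, Sequence


def infer_pump_counts(
    levels: Sequence[float],
    on_thresholds: Sequence[float],
    off_thresholds: Sequence[float],
    initial: int = 0,
) -> List[int]:
    """Estimate the number of running pumps at each sample of the level series.

    Assumed rule (not confirmed by the problem statement): pump k starts when
    the level reaches on_thresholds[k] and stops when the level falls to
    off_thresholds[k], with off_thresholds[k] < on_thresholds[k].
    """
    counts = []
    n = initial  # number of pumps currently assumed to be running
    for level in levels:
        # Start further pumps while the level is at or above the next start level.
        while n < len(on_thresholds) and level >= on_thresholds[n]:
            n += 1
        # Stop pumps while the level is at or below the stop level of the last one.
        while n > 0 and level <= off_thresholds[n - 1]:
            n -= 1
        counts.append(n)
    return counts


# Placeholder thresholds and a short made-up level series, for illustration only.
ON_LEVELS = [2.0, 2.5, 3.0]
OFF_LEVELS = [1.0, 1.5, 2.0]
sample_levels = [1.2, 1.8, 2.1, 2.4, 2.6, 2.2, 1.4, 0.9]
print(infer_pump_counts(sample_levels, ON_LEVELS, OFF_LEVELS))
```

Because the level is only observed every 5 minutes, a threshold may be reached and left again between two samples, so a rule of this kind will miss some switches. It is best seen as a reference against which a model trained on the yearly data can be compared.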

Mathematical background | Basic statistics, correlation coefficient and linear regression, basic knowledge of time series or machine learning.

Coordinators | Ricardo Enguiça, ISEL.


