Processing MQ-X sensor data with Machine Learning.
Written by George Soloupis, ML GDE.
This blog post is about processing data from simple air sensors with the help of Machine Learning. The data were collected from an array of air sensors connected to a Raspberry Pi 4B. You can see the setup in this tutorial. For further information and an updated guide on how to connect the sensors and prepare the single-board computer, refer to that blog post.
The first thing to pay attention to is the version of tflite-runtime installed on the Raspberry Pi. Check it at the terminal of the Pi and use the same version when you process the data. In this example the version is 2.5.0.
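One way to check the installed version, assuming the package was installed with pip under its PyPI name, is from Python itself:

```python
# Print the installed tflite-runtime version, or a notice if it is missing.
from importlib.metadata import version, PackageNotFoundError

try:
    print(version("tflite-runtime"))
except PackageNotFoundError:
    print("tflite-runtime is not installed")
```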
The model was trained inside Google Colaboratory. First, we install the desired version of TensorFlow:
!pip install tensorflow==2.5.0
import tensorflow as tf
One of the critical steps in training is preparing the data correctly. We collected data from 8 sensors simultaneously, at 1-minute intervals, for 1 day. The minimum value of each sensor's readings is 0 and the maximum is 65472. If you need to read more about the MCP3008 ADC CircuitPython library that sends the data to the Raspberry Pi, you can refer to this GitHub repository. We tag the first hours with the number "0" and the rest of the time with the number "1":
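A minimal sketch of that tagging step with pandas; the column names, the stand-in random readings, and the cut-off row are assumptions for illustration, not the exact values from the post:

```python
import numpy as np
import pandas as pd

minutes = 1440  # one reading per minute for one day
# Stand-in readings in the MCP3008 range [0, 65472]; real data comes from the sensors.
readings = pd.DataFrame(
    np.random.randint(0, 65473, size=(minutes, 8)),
    columns=[f"sensor_{i}" for i in range(8)],
)
readings["day"] = 0            # tag the first hours with "0"
readings.loc[720:, "day"] = 1  # tag the rest of the time with "1" (cut-off is hypothetical)
```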
We can check the data of each sensor by plotting its diagram:
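One way to produce such a plot with pandas and matplotlib; the column name and the sample values are placeholders:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs headless
import matplotlib.pyplot as plt
import pandas as pd

# Plot one sensor's raw readings over time.
readings = pd.DataFrame({"sensor_0": [120, 340, 560, 480, 300]})
ax = readings["sensor_0"].plot(title="sensor_0")
ax.set_xlabel("minute")
ax.set_ylabel("raw ADC value")
plt.savefig("sensor_0.png")
```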
Each value inside the .csv file is divided by the maximum value (65472) so that the data are normalized to the range [0, 1], which makes the machine learning processing easier. We end up with a dataframe of this kind:
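The normalization itself is a single division; a small sketch with a hypothetical column:

```python
import pandas as pd

MAX_ADC = 65472  # maximum raw value produced by the MCP3008 pipeline

# Divide every reading by the maximum so all values land in [0, 1].
raw = pd.DataFrame({"sensor_0": [0, 32736, 65472]})
normalized = raw / MAX_ADC
```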
Then we split the data into features and target values, shuffling the rows before the split.
# First take the good data
data_first_day_good = fields1[10:210]
data_second_day_good = fields2[10:210]
data_third_day_good = fields3[10:210]

# Then take the bad data
data_first_day_bad = fields1[-200:]
data_second_day_bad = fields2[-200:]
data_third_day_bad = fields3[-200:]

data_all_days = [data_first_day_good, data_second_day_good, data_third_day_good,
                 data_first_day_bad, data_second_day_bad, data_third_day_bad]
data_rest = pd.concat(data_all_days)

# SHUFFLE the dataset before splitting into training and validation and reset the index
data_rest = data_rest.sample(frac=1).reset_index(drop=True)

# Separate the data into features and targets
target_field = ['day']
data_features, data_targets = data_rest.drop(target_field, axis=1), data_rest[target_field]
The target values are 0s and 1s. We convert them into categorical (one-hot) data simply with the ‘tf.keras.utils.to_categorical’ function:
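For reference, what to_categorical does to the 0/1 targets can be sketched in plain NumPy: each label becomes a one-hot row of length 2.

```python
import numpy as np

labels = np.array([0, 1, 1, 0])
one_hot = np.eye(2)[labels]
# one_hot -> [[1., 0.], [0., 1.], [0., 1.], [1., 0.]]
```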
We build a very simple Keras model with 4 Dense layers, which takes inputs of size 8 and predicts whether the data corresponds to the first or the second day:
We select categorical cross-entropy loss and the Adam optimizer, and we start the training, which finishes really fast. The quality of the data was really good and the accuracy was over 90%:
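A minimal sketch of such a model; only the 8-feature input, the 2-class softmax output, the loss, and the optimizer come from the text, while the hidden-layer widths are assumptions:

```python
import tensorflow as tf

# Four Dense layers: 8 inputs in, 2 softmax probabilities out.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(2, activation="softmax"),
])
model.compile(
    loss="categorical_crossentropy",
    optimizer="adam",
    metrics=["accuracy"],
)
# Training would then be, e.g.:
# model.fit(data_features, categorical_targets, epochs=50, validation_split=0.2)
```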
During training the model is saved in the .h5 format, which can be reloaded and reconstructed later. We can also save the model after training ends in the ‘saved_model’ format, which can later be converted to a .tflite file:
# Convert the model
converter = tf.lite.TFLiteConverter.from_saved_model('/content/saved_model')  # path to the SavedModel directory
tflite_model = converter.convert()

# Save the model.
with open('food_model_250.tflite', 'wb') as f:
    f.write(tflite_model)
In the end we can use the TFLite Interpreter to test the file that is going to be used inside the Raspberry Pi:
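A self-contained sketch of such a check: here a tiny stand-in model is built and converted in memory (so the snippet runs on its own), and one dummy 8-value reading is pushed through the Interpreter; in practice you would pass `model_path="food_model_250.tflite"` instead of `model_content`.

```python
import numpy as np
import tensorflow as tf

# Tiny stand-in model converted in memory, just to exercise the Interpreter.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(2, activation="softmax"),
])
tflite_model = tf.lite.TFLiteConverter.from_keras_model(model).convert()

interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
input_index = interpreter.get_input_details()[0]["index"]
output_index = interpreter.get_output_details()[0]["index"]

sample = np.random.rand(1, 8).astype(np.float32)  # one normalized reading
interpreter.set_tensor(input_index, sample)
interpreter.invoke()
prediction = interpreter.get_tensor(output_index)  # two class probabilities
```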
This blog post showed how you can process data collected from an array of air sensors with the help of a Raspberry Pi. You can see the whole Colab notebook at this GitHub repository. Stay tuned for the last blog post of this series, where we run inference on the single-board computer with the help of the tflite-runtime.