Impact of different image loading and resizing libraries on TensorFlow inference output, and a comparison with the corresponding Android results (Part 1)
Written by George Soloupis
The objectives of this tutorial are to:
- Load an image file inside Android, resize it, and run inference with a .tflite model.
- Use different libraries to load and resize images inside a Colab notebook, and run inference with the same .tflite file.
- Compare the results and find the optimal method.
For part 1 of this tutorial we will use a square image file of a plant and a .tflite model. For rectangular images, see part 2.
Exploring the Android side of our research first, we find straightforward methods for loading and resizing bitmaps. If we have the image file inside the assets folder, we load the bitmap like this:
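A minimal Kotlin sketch of that step, assuming the image is stored as plant.jpg in the assets folder (the file name is an assumption):

```kotlin
import android.content.Context
import android.graphics.Bitmap
import android.graphics.BitmapFactory

// Decode an image stored in the assets folder into a Bitmap.
// "plant.jpg" is an assumed file name for this example.
fun loadBitmapFromAssets(context: Context, fileName: String = "plant.jpg"): Bitmap =
    context.assets.open(fileName).use { BitmapFactory.decodeStream(it) }
```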
Then we resize (downscale) it, keeping the aspect ratio:
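A sketch of the downscaling step with Bitmap.createScaledBitmap(), assuming the model expects 32x32 input as in our example:

```kotlin
import android.graphics.Bitmap

// Downscale the bitmap to the model's input size (32x32 here).
// filter = true applies bilinear filtering during scaling.
fun resizeBitmap(bitmap: Bitmap, size: Int = 32): Bitmap =
    Bitmap.createScaledBitmap(bitmap, size, size, true)
```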
You can find more info about the createScaledBitmap method in the Android developer guide.
Then we convert it to a ByteBuffer:
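A common way to do this conversion by hand, assuming a float32 model input normalized to [0, 1] (the normalization scheme is an assumption and must match whatever the model was trained with):

```kotlin
import android.graphics.Bitmap
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Convert a size x size ARGB bitmap to a direct ByteBuffer of floats.
// 4 bytes per float, 3 channels (RGB); normalization to [0, 1] is assumed.
fun bitmapToByteBuffer(bitmap: Bitmap, size: Int = 32): ByteBuffer {
    val buffer = ByteBuffer.allocateDirect(4 * size * size * 3)
        .order(ByteOrder.nativeOrder())
    val pixels = IntArray(size * size)
    bitmap.getPixels(pixels, 0, size, 0, 0, size, size)
    for (pixel in pixels) {
        buffer.putFloat(((pixel shr 16) and 0xFF) / 255.0f) // R
        buffer.putFloat(((pixel shr 8) and 0xFF) / 255.0f)  // G
        buffer.putFloat((pixel and 0xFF) / 255.0f)          // B
    }
    buffer.rewind()
    return buffer
}
```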
and the inference result is:
I/RESULT: [0.0017617701, 1.964645E-5, 1.8741742E-4, 1.0189252E-4, 7.009291E-5, 9.940497E-4, 0.14571801, 0.85106224, 8.326311E-5, 1.5728457E-6]
For the same procedure we have the TensorFlow Lite Android Support Library. This library is designed to help process the input and output of TensorFlow Lite models, and make the TensorFlow Lite interpreter easier to use.
We load the bitmap the same way as above, and then the support library takes control:
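A sketch of the preprocessing with the Support Library's ImageProcessor, where the 32x32 target, BILINEAR interpolation, and [0, 1] normalization mirror the manual path above (and are assumptions that must match the model):

```kotlin
import android.graphics.Bitmap
import org.tensorflow.lite.DataType
import org.tensorflow.lite.support.common.ops.NormalizeOp
import org.tensorflow.lite.support.image.ImageProcessor
import org.tensorflow.lite.support.image.TensorImage
import org.tensorflow.lite.support.image.ops.ResizeOp

// Resize with bilinear interpolation and normalize pixel values,
// producing a TensorImage ready to feed to the interpreter.
fun preprocess(bitmap: Bitmap): TensorImage {
    val processor = ImageProcessor.Builder()
        .add(ResizeOp(32, 32, ResizeOp.ResizeMethod.BILINEAR))
        .add(NormalizeOp(0f, 255f))
        .build()
    val tensorImage = TensorImage(DataType.FLOAT32)
    tensorImage.load(bitmap)
    return processor.process(tensorImage)
}
```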
Printing the logged result, we get:
I/RESULT: [0.0017617701, 1.964645E-5, 1.8741742E-4, 1.0189252E-4, 7.009291E-5, 9.940497E-4, 0.14571801, 0.85106224, 8.326311E-5, 1.5728457E-6]
Comparing this with the first output from bitmap resizing, we end up with exactly the same result, but the TensorFlow Lite Support Library offers far more operations and flexibility than the Bitmap.createScaledBitmap() function.
People tend to complain that the results from a Colab notebook with the same .tflite and image file differ from those on mobile phones. And of course, everyone blames Android first!
Images are not like text, where each word corresponds to a number from a list. Images consist of pixels, and different resizing methods can produce images with different pixel values, which end up causing noticeable discrepancies after inference. This is especially true in extreme situations like our example, where a 512x512 image is resized to 32x32 to match the model input!
So let’s look at the first two popular libraries used to load and resize images, here with BILINEAR interpolation.
The PIL library, which reads images in RGB format:
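A minimal sketch of the PIL path, assuming the same 32x32 model input as above (the file path is a placeholder):

```python
import numpy as np
from PIL import Image

def load_with_pil(path, size=32):
    # PIL decodes images in RGB channel order.
    image = Image.open(path).convert("RGB")
    # Resize with bilinear interpolation, as in the comparison.
    image = image.resize((size, size), Image.BILINEAR)
    # Add the batch dimension the .tflite interpreter expects.
    return np.expand_dims(np.asarray(image, dtype=np.uint8), axis=0)
```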
Result with the PIL library:
[[1.7165932e-03 5.4474713e-05 2.5516949e-04 1.8444049e-04 5.7022298e-05 2.6554947e-03 8.3325535e-01 1.6157104e-01 2.4710680e-04 3.3399576e-06]]
The OpenCV library, which reads images in BGR format:
Result with the OpenCV library:
[[1.2340680e-03 1.9341662e-05 1.5484141e-04 8.8167602e-05 5.2695308e-05 9.2836347e-04 1.5790647e-01 8.3954442e-01 7.0430971e-05 1.2166572e-06]]
The OpenCV result is closer to Android’s… but not close enough. You can also use
Keras image data preprocessing:
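A sketch of the Keras path; note that load_img delegates to PIL under the hood, and bilinear interpolation has to be requested explicitly since the default is nearest (the 32x32 size is the same assumption as above):

```python
import numpy as np
from tensorflow.keras.preprocessing.image import load_img

def load_with_keras(path, size=32):
    # load_img uses PIL internally; request bilinear interpolation
    # explicitly (the default is "nearest").
    img = load_img(path, target_size=(size, size), interpolation="bilinear")
    return np.expand_dims(np.asarray(img, dtype=np.uint8), axis=0)
```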
but, as stated in the documentation, it loads the image in PIL format, so the result is the same:
[[1.7165932e-03 5.4474713e-05 2.5516949e-04 1.8444049e-04 5.7022298e-05 2.6554947e-03 8.3325535e-01 1.6157104e-01 2.4710680e-04 3.3399576e-06]]
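For reference, the inference step that produces arrays like the ones above can be sketched with the TF Lite Python interpreter (the model path is a placeholder; input shape and dtype must match the model):

```python
import numpy as np
import tensorflow as tf

def run_tflite(model_path, input_data):
    # Load the .tflite model and allocate its tensors.
    interpreter = tf.lite.Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    input_details = interpreter.get_input_details()[0]
    output_details = interpreter.get_output_details()[0]
    # Cast the preprocessed image to the dtype the model expects.
    interpreter.set_tensor(input_details["index"],
                           input_data.astype(input_details["dtype"]))
    interpreter.invoke()
    return interpreter.get_tensor(output_details["index"])
```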
If we print the first 10 pixel values produced by the different loading and resizing methods, we get:
PIL method pixels:
[[205 203 201]
[205 202 201]
[204 202 200]
[203 202 200]
[203 202 200]
[203 202 200]
[203 201 198]
[202 201 198]
[201 202 198]
[202 202 199]]
OpenCV method pixels (converted to RGB):
[[202 201 199]
[204 201 200]
[202 201 199]
[204 203 201]
[203 203 201]
[203 202 200]
[201 198 195]
[199 199 194]
[199 200 196]
[200 200 197]]
while the Android method gives:
[[201 200 198]
[201 200 198]
[200 200 198]
[202 202 200]
[201 202 199]
[202 202 200]
[199 198 193]
[198 199 193]
[198 199 194]
[199 200 195]]
Pretty different! You can experiment with the different methods in this Colab notebook with the .tflite and image files.
Now let’s move on to TensorFlow’s loading and resizing method:
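A sketch of this path using only TensorFlow ops (the file path is a placeholder, and the 32x32 size is the same assumption as above):

```python
import tensorflow as tf

def load_with_tf(path, size=32):
    # Read and decode the image file with TensorFlow's own ops.
    image = tf.io.decode_image(tf.io.read_file(path),
                               channels=3, expand_animations=False)
    # Resize with bilinear interpolation; the output is float32.
    image = tf.image.resize(image, (size, size), method="bilinear")
    return tf.expand_dims(image, axis=0)

# To compare pixel values with Android's integers, the fractional
# part can be dropped with tf.math.trunc(image).
```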
Note that the tf.image.resize() function returns float pixel values, which we have to truncate to get rid of the fractional parts. The result is:
[[1.7620893e-03 1.9610761e-05 1.8804680e-04 1.0167842e-04 6.9446462e-05 9.9966815e-04 1.4880715e-01 8.4796780e-01 8.2977458e-05 1.5546660e-06]]
and the pixel values are:
[[201. 200. 198.]
[201. 200. 198.]
[200. 200. 198.]
[202. 202. 200.]
[201. 202. 199.]
[202. 202. 200.]
[199. 198. 193.]
[198. 199. 193.]
[198. 199. 194.]
[199. 200. 195.]]
We see that the pixel values are exactly the same as Android’s, and the inference result is almost identical, closer than that of any other method (the tiny remaining difference comes from running inference on float pixel values)!
Conclusion
Using a different load/resize algorithm for training a model should not affect the model’s accuracy, as a good model should be robust enough to deal with such differences. During training, data augmentation is usually applied to introduce certain distortions, making the model generalize better. But if you want to run models inside Android and compare the results during development, then using TensorFlow’s own loading and resizing functions is the way to go.