Use CameraX with an Image Segmentation Android project.
Written by George Soloupis ML GDE.
This is a tutorial on how to use CameraX inside an Android project that performs segmentation on images. This project is an effort to update the original Android app demonstrated here, which uses a Camera2 implementation. You can find more info about CameraX usage here; switching to this class instead of Camera2 saves a lot of code and makes inference with the TensorFlow Lite Interpreter and the TensorFlow Lite Task Library easier.
Use cases inside the application.
ImageCapture: The image capture use case is designed for capturing high-resolution, high-quality photos and provides auto-white-balance, auto-exposure, and auto-focus (3A) functionality, in addition to simple manual camera controls. You can find this use case in master branch.
ImageAnalysis: The image analysis use case provides your app with a CPU-accessible image to perform image processing, computer vision, or machine learning inference on. You can find this use case in ImageAnalysis branch.
For the first use case, ImageCapture, you can find more info in the official documentation. The initialization of CameraX for this use case is here. Inside the project we are using the OnImageSavedCallback() option. The idea here is:
1. Capture the photo and receive it through OnImageSavedCallback()
2. Determine the rotation of the captured image with the help of the Exif orientation tag
3. Rotate the Bitmap accordingly
4. Save the rotated Bitmap to the specific File path
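The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's exact code: names such as `photoFile` and the capture-mode choice are assumptions.

```kotlin
val imageCapture = ImageCapture.Builder()
    .setCaptureMode(ImageCapture.CAPTURE_MODE_MAXIMIZE_QUALITY)
    .build()

imageCapture.takePicture(
    ImageCapture.OutputFileOptions.Builder(photoFile).build(),
    ContextCompat.getMainExecutor(context),
    object : ImageCapture.OnImageSavedCallback {
        override fun onImageSaved(output: ImageCapture.OutputFileResults) {
            // Step 2: read the EXIF orientation tag of the saved file.
            val exif = ExifInterface(photoFile.absolutePath)
            val rotationDegrees = when (
                exif.getAttributeInt(
                    ExifInterface.TAG_ORIENTATION,
                    ExifInterface.ORIENTATION_NORMAL
                )
            ) {
                ExifInterface.ORIENTATION_ROTATE_90 -> 90f
                ExifInterface.ORIENTATION_ROTATE_180 -> 180f
                ExifInterface.ORIENTATION_ROTATE_270 -> 270f
                else -> 0f
            }
            // Step 3: decode and rotate the Bitmap upright.
            val bitmap = BitmapFactory.decodeFile(photoFile.absolutePath)
            val matrix = Matrix().apply { postRotate(rotationDegrees) }
            val rotated = Bitmap.createBitmap(
                bitmap, 0, 0, bitmap.width, bitmap.height, matrix, true
            )
            // Step 4: save the rotated Bitmap back to the file path.
            photoFile.outputStream().use {
                rotated.compress(Bitmap.CompressFormat.JPEG, 100, it)
            }
        }

        override fun onError(exception: ImageCaptureException) {
            Log.e("ImageCapture", "Capture failed", exception)
        }
    }
)
```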
After that, the Bitmap is used by the ImageSegmentationModelExecutor class of either the Task Library or the Interpreter. Inside Android Studio you can change the build variant to whichever one you want to build and run: go to Build > Select Build Variant and pick one from the drop-down menu. See Configure product flavors in Android Studio for more details.
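A flavor setup that produces two such build variants could look like the following Gradle fragment. The dimension and flavor names here are hypothetical; the project's actual names may differ.

```groovy
android {
    // One flavor per inference path: Task Library vs. Interpreter.
    flavorDimensions "inference"
    productFlavors {
        tasklibrary { dimension "inference" }
        interpreter { dimension "inference" }
    }
}
```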
In the first option, using the Bitmap with the Task Library is really easy. Since the .tflite file contains metadata, the Task Library can determine the resize and normalization options that the model expects. With the MetadataDisplayer we can inspect all the metadata of the file. After inference we have everything we need to present on screen the original image, the masked bitmap, and the two stacked together.
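The Task Library path can be sketched as below. The model file name is an assumption; any metadata-bearing segmentation model from TF Hub would work the same way.

```kotlin
// Create the segmenter from a model in the assets folder; preprocessing
// options are read from the model's metadata, so none are specified here.
val segmenter = ImageSegmenter.createFromFile(context, "deeplabv3.tflite")

// Wrap the (already rotated) Bitmap and run inference directly.
val tensorImage = TensorImage.fromBitmap(bitmap)
val results: List<Segmentation> = segmenter.segment(tensorImage)

// The first mask is a one-channel image whose pixel values index into
// the colored labels; both are used to draw the masked bitmap.
val maskTensor = results[0].masks[0]
val labels = results[0].coloredLabels
```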
In the second option, using the Bitmap with the Interpreter is straightforward. We have predefined variables for the model's input shape, the number of labels, the mean, and the std. The previously generated Bitmap is scaled and converted to a ByteBuffer. Then we run the Interpreter and again obtain the original image, the masked bitmap, and the two stacked together.
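The scaling and ByteBuffer conversion might look like this sketch. The input size, mean, and std below are the usual DeepLab v3 values and are assumptions, not necessarily the project's constants.

```kotlin
const val IMAGE_SIZE = 257
const val IMAGE_MEAN = 127.5f
const val IMAGE_STD = 127.5f

fun bitmapToByteBuffer(bitmap: Bitmap): ByteBuffer {
    // Scale the Bitmap to the model's input resolution.
    val scaled = Bitmap.createScaledBitmap(bitmap, IMAGE_SIZE, IMAGE_SIZE, true)
    // 4 bytes per float, 3 channels per pixel.
    val buffer = ByteBuffer.allocateDirect(4 * IMAGE_SIZE * IMAGE_SIZE * 3)
        .order(ByteOrder.nativeOrder())
    val pixels = IntArray(IMAGE_SIZE * IMAGE_SIZE)
    scaled.getPixels(pixels, 0, IMAGE_SIZE, 0, 0, IMAGE_SIZE, IMAGE_SIZE)
    for (pixel in pixels) {
        // Normalize each RGB channel with (value - mean) / std.
        buffer.putFloat(((pixel shr 16 and 0xFF) - IMAGE_MEAN) / IMAGE_STD)
        buffer.putFloat(((pixel shr 8 and 0xFF) - IMAGE_MEAN) / IMAGE_STD)
        buffer.putFloat(((pixel and 0xFF) - IMAGE_MEAN) / IMAGE_STD)
    }
    buffer.rewind()
    return buffer
}
```

The returned buffer is what gets passed to `Interpreter.run()` together with a pre-allocated output buffer.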
All of the above is based on a Bitmap generated by CameraX. So what if we do not want to generate a bitmap, store it in the device's internal storage, and only then hand it to the TensorFlow Lite libraries? Below we use CameraX's ImageAnalysis use case, which produces a media.Image that is consumed directly by the Task Library and by the Interpreter (after being converted to a Bitmap) but is never stored locally.
You can find the initialization and the usage of the ImageAnalysis class here. CameraX produces media.Images in the YUV_420_888 format directly and continuously, as long as we call image.close() on each frame. Since this application uses an image only when the capture button is pressed, we process the frame on the button click before closing it. That way the media.Image and its rotation are fed directly to the Task Library, where we use ImageProcessingOptions to set the options for the ImageSegmenter class. This class consumes the media.Image directly and produces the result used to generate the mask bitmap. Then, from the media.Image we create the original bitmap, from the result the mask bitmap, and combining the two we get the stacked one.
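A rough sketch of this flow is below. The `captureRequested` flag and `segmenter` instance are assumed to exist elsewhere, and the MlImage wrapper is one way recent Task Library versions accept a media.Image without an intermediate Bitmap.

```kotlin
val imageAnalysis = ImageAnalysis.Builder()
    .setBackpressureStrategy(ImageAnalysis.STRATEGY_KEEP_ONLY_LATEST)
    .build()

imageAnalysis.setAnalyzer(ContextCompat.getMainExecutor(context)) { imageProxy ->
    val mediaImage = imageProxy.image
    if (captureRequested && mediaImage != null) {
        captureRequested = false
        // Wrap the YUV_420_888 media.Image with its rotation; no Bitmap
        // is created and nothing is written to storage.
        val mlImage = MediaMlImageBuilder(mediaImage)
            .setRotation(imageProxy.imageInfo.rotationDegrees)
            .build()
        val results = segmenter.segment(mlImage)
        // ...build the mask bitmap from `results` and update the UI...
    }
    // Closing the ImageProxy lets CameraX deliver the next frame.
    imageProxy.close()
}
```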
We follow the same procedure with the Interpreter implementation. The generation of the media.Image is the same, and upon capture button click we pass the Image to the ImageSegmentationModelExecutor class. So far neither the Interpreter nor the TensorFlow Lite Support Library can use the media.Image directly. So we first have to convert this Image to a Bitmap, rotate it, and create the ByteBuffer that the Interpreter consumes. This happens here. Because of the extra steps such as the bitmap generation, the overall time spent is slightly longer than when using the Task Library directly.
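The conversion step can be sketched as follows, assuming the common YuvImage-to-JPEG route (the NV21 repacking below relies on the typical plane layout of YUV_420_888 frames and may need adjustment on some devices).

```kotlin
fun mediaImageToBitmap(image: Image, rotationDegrees: Int): Bitmap {
    val yBuffer = image.planes[0].buffer
    val uBuffer = image.planes[1].buffer
    val vBuffer = image.planes[2].buffer
    val ySize = yBuffer.remaining()
    val uSize = uBuffer.remaining()
    val vSize = vBuffer.remaining()

    // NV21 layout: all Y bytes first, then interleaved V/U bytes.
    val nv21 = ByteArray(ySize + uSize + vSize)
    yBuffer.get(nv21, 0, ySize)
    vBuffer.get(nv21, ySize, vSize)
    uBuffer.get(nv21, ySize + vSize, uSize)

    // Compress the YUV frame to JPEG in memory, then decode it to a Bitmap.
    val yuvImage = YuvImage(nv21, ImageFormat.NV21, image.width, image.height, null)
    val out = ByteArrayOutputStream()
    yuvImage.compressToJpeg(Rect(0, 0, image.width, image.height), 100, out)
    val jpegBytes = out.toByteArray()
    val bitmap = BitmapFactory.decodeByteArray(jpegBytes, 0, jpegBytes.size)

    // Rotate so the Bitmap is upright before scaling and ByteBuffer conversion.
    val matrix = Matrix().apply { postRotate(rotationDegrees.toFloat()) }
    return Bitmap.createBitmap(bitmap, 0, 0, bitmap.width, bitmap.height, matrix, true)
}
```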
This is an exceptional project with many options to choose from. First of all, CameraX can provide us with either a Bitmap or a media.Image, and with either option we can use the Task Library or the Interpreter. So, depending on the constraints, with the device's storage usage and the inference time as guidance, the developer can select from multiple options to build the project. For a common task like image segmentation, with plenty of .tflite models provided on TF Hub, using a high-level API such as the Task Library is the best option.
Future work would be for the ImageProcessor of the TensorFlow Lite Support Library to support the media.Image from CameraX directly, without the need to convert it to a Bitmap before using it with the Interpreter.
Project available here:
This brings us to the end of the tutorial. I hope you have enjoyed reading it and will apply what you learned to your real-world applications with TensorFlow Lite.