Use Gemini and context caching inside Android
Written by George Soloupis, ML and Android GDE.
In this blog post, we will demonstrate how to implement conversation caching in an Android application, allowing users to resume their conversations with the Gemini API from where they left off. We will utilize Room for message storage, Jetpack Compose for UI development, and the Gemini API as the language model. By the end of this tutorial, users will have two options: starting a new conversation with the model or continuing the previous one.
Jetpack Compose is a modern toolkit for building native Android UIs that greatly enhances the development experience, making it an essential choice for starting new projects. It simplifies and accelerates UI development with its declarative approach, allowing developers to describe the UI in code and have it instantly reflect any changes. This leads to more intuitive and maintainable code. Compose is fully integrated with Kotlin, offering seamless compatibility and taking advantage of Kotlin’s powerful features. Its reusable components and customizability streamline the process of creating complex UIs, reducing boilerplate code and potential errors. Furthermore, Jetpack Compose is designed to be fully interoperable with existing Android views, ensuring a smooth transition for projects that might require integration with legacy code. With features like live previews and real-time updates, Jetpack Compose significantly boosts productivity and fosters a more iterative and creative development process.
The Room library is a powerful and efficient persistence library provided by Android for managing local databases. It abstracts the complexities of SQLite, offering a more intuitive and type-safe API for database interactions. Room ensures compile-time verification of SQL queries, reducing runtime errors and enhancing code reliability. It also supports robust data access patterns, such as DAOs (Data Access Objects), making database operations clear and maintainable. With built-in support for LiveData and RxJava, Room facilitates seamless integration with reactive programming paradigms, enabling real-time data updates in the UI. This combination of simplicity, safety, and powerful features makes Room an indispensable tool for modern Android development.
The Gemini API gives developers access to Google's Gemini family of language models, which are designed for natural language understanding and generation. The models can handle complex queries, maintain conversational context, and generate coherent text, which makes the API a good choice for developers who want to integrate AI-driven interactions into their applications. Whether you’re building chatbots, virtual assistants, or content generation tools, the Gemini API offers the capabilities needed for responsive dialogue.
The combination of these three technologies makes it possible to save conversations with Gemini and resume them at any time inside Android. The start screen offers two options: start a new conversation or continue the previous one.
Now that we’ve discussed the key libraries enabling our functionality, it’s time to dive into the code.
Room Component
The database table contains the following columns:
- id
- timestamp
- text (the plain text)
- message (the text with start and end tokens)
- author (user or model)
At the start, the app checks the database for any saved conversations. If a saved conversation is found, the user is presented with the option to continue from where they left off. If no saved conversations are found, the user can choose to start a new conversation.
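The start-up check described above can be sketched in plain Kotlin. This is only an illustration under stated assumptions: `StoredMessage`, `StartOption`, and `availableStartOptions` are hypothetical names, the row type simply mirrors the listed columns rather than reproducing the project's Room entity, and the token markers in the sample row are placeholders.

```kotlin
// Hypothetical row type mirroring the columns listed above
// (an assumption, not the project's actual Room entity).
data class StoredMessage(
    val id: Long,
    val timestamp: Long,
    val text: String,     // the plain text
    val message: String,  // the text with start and end tokens
    val author: String    // "user" or "model"
)

// Hypothetical options shown on the start screen.
enum class StartOption { NEW_CONVERSATION, CONTINUE_CONVERSATION }

// A new conversation is always possible; continuing is offered
// only when the database returned saved messages.
fun availableStartOptions(saved: List<StoredMessage>): List<StartOption> =
    if (saved.isEmpty()) {
        listOf(StartOption.NEW_CONVERSATION)
    } else {
        listOf(StartOption.NEW_CONVERSATION, StartOption.CONTINUE_CONVERSATION)
    }

fun main() {
    // Placeholder row; the token markers are made up for this example.
    val saved = listOf(
        StoredMessage(1L, 1_700_000_000_000L, "Hello", "<start>Hello<end>", "user")
    )
    println(availableStartOptions(emptyList()))
    println(availableStartOptions(saved))
}
```

In a real app the list would come from a Room query, and the returned options would decide which buttons the Compose start screen shows.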
Gemini API
You can find a step-by-step guide on how to set up and use this API to chat with this LLM inside Android. Before calling the Gemini API, you need to set up your Android project, which includes setting up your API key, adding the SDK dependencies, and initializing the model. For this project we used the “Gemini 1.5 Flash” model, which is versatile and fast and has a free tier. The code to initialize the model and start a conversation is:
import com.google.ai.client.generativeai.GenerativeModel
import com.google.ai.client.generativeai.type.content

// Create the model client; ApiKeyString holds your Gemini API key.
val generativeModel = GenerativeModel(
    modelName = "gemini-1.5-flash",
    apiKey = ApiKeyString
)

// Seed the chat with earlier turns so the model has context.
val chat = generativeModel.startChat(
    history = listOf(
        content(role = "user") { text("Hello, I have 2 dogs in my house.") },
        content(role = "model") { text("Great to meet you. What would you like to know?") }
    )
)

// sendMessage is a suspend function, so call it from a coroutine.
chat.sendMessage("How many paws are in my house?")
When the user chooses to continue the previous conversation, the database provides the saved messages and we build the chat history from them:
fun convertMessagesToGeminiPrompt(): List<Content> =
    _messages.map { chatMessage ->
        // Anything not written by the user is treated as a model turn.
        val role = if (chatMessage.author == "user") "user" else "model"
        content(role = role) { text(chatMessage.text) }
    }
This history is then loaded into a new chat, followed by the final prompt:
try {
    val chat = generativeModel.startChat(
        history = messageManager.convertMessagesToGeminiPrompt()
    )
    val response = chat.sendMessage("Answer based on the conversation")
    response.text?.let { message ->
        addMessage(message, MODEL_PREFIX)
    }
} catch (e: Exception) {
    messageManager.addMessage(e.localizedMessage ?: "Unknown Error", MODEL_PREFIX)
    setInputEnabled(true)
}
Tip: You can add a TextField on screen to change the role of the model at any time. For example, to make the model act as a “French chef”, you can customize the prompt like this:
...
val response = chat.sendMessage(
    "Answer based on the conversation. " +
        "Pretend to be a french chef"
)
response.text?.let { message ->
    addMessage(message, MODEL_PREFIX)
}
...
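If the role comes from a TextField, the prompt can be assembled by a small helper before it is passed to `chat.sendMessage`. The sketch below is one possible way to do it, not the project's code: `buildFinalPrompt` and `BASE_PROMPT` are hypothetical names, and falling back to the plain prompt when the field is empty is an assumption.

```kotlin
// Base instruction used in the snippets above.
const val BASE_PROMPT = "Answer based on the conversation"

// Hypothetical helper: appends the role typed by the user, if any.
// A null or blank role falls back to the base prompt (an assumption).
fun buildFinalPrompt(role: String?): String {
    val cleaned = role?.trim().orEmpty()
    return if (cleaned.isEmpty()) {
        BASE_PROMPT
    } else {
        "${BASE_PROMPT}. Pretend to be $cleaned"
    }
}

fun main() {
    println(buildFinalPrompt(null))
    println(buildFinalPrompt("  a french chef "))
}
```

Keeping the prompt assembly in a pure function like this makes it easy to unit test without calling the Gemini API.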
The project also features speech-to-text capabilities powered by the Whisper model. On screen, you can either type with the keyboard or tap the microphone button and speak.
You can find the complete Android project here. To start chatting with Gemini, obtain your API key and insert it into the ViewModel variable. Warning: Do not share your API key by uploading it to GitHub or anywhere else.
Enjoy the enhanced conversational experience!
This project is the result of collaboration among three dedicated individuals.
Their combined efforts have culminated in a sophisticated Android application that integrates storage, UI development, and usage of the Gemini API for advanced conversational features.
Conclusion
In this blog post we have showcased how to implement conversation caching in an Android application. Using the Room library for local message storage, Jetpack Compose for UI development, and the Gemini API as the language model, the tutorial guided you through creating an app that allows users to resume their conversations from where they left off. The app checks for saved conversations at the start and presents options to either continue an existing conversation or start a new one. The post covered the setup and integration of these components, providing code examples and tips for enhancing user interactions.