Kotlin SDK

Agents Platform SDK: deploy customized, interactive voice agents in minutes for Android apps.

Refer to the Agents Platform overview for an explanation of how Agents Platform works.

Installation

Add the ElevenLabs SDK to your Android project by including the following dependency in your app-level build.gradle file:

build.gradle.kts
dependencies {
    // ElevenLabs Agents SDK (Android)
    implementation("io.elevenlabs:elevenlabs-android:<latest>")

    // Kotlin coroutines, AndroidX, etc., as needed by your app
}

An example Android app using this SDK can be found in the ElevenLabs Android SDK repository.

Requirements

  • Android API level 21 (Android 5.0) or higher
  • Internet permission for API calls
  • Microphone permission for voice input
  • Network security configuration for HTTPS calls

Setup

Manifest Configuration

Add the necessary permissions to your AndroidManifest.xml:

<uses-permission android:name="android.permission.INTERNET" />
<uses-permission android:name="android.permission.RECORD_AUDIO" />
<uses-permission android:name="android.permission.MODIFY_AUDIO_SETTINGS" />

Runtime Permissions

For Android 6.0 (API level 23) and higher, you must request microphone permission at runtime:

import android.Manifest
import android.content.pm.PackageManager
import androidx.core.app.ActivityCompat
import androidx.core.content.ContextCompat

companion object {
    private const val MICROPHONE_PERMISSION_REQUEST_CODE = 1001
}

private fun requestMicrophonePermission() {
    if (ContextCompat.checkSelfPermission(this, Manifest.permission.RECORD_AUDIO)
        != PackageManager.PERMISSION_GRANTED
    ) {
        if (ActivityCompat.shouldShowRequestPermissionRationale(this, Manifest.permission.RECORD_AUDIO)) {
            // Show an explanation to the user before requesting again
            showPermissionExplanationDialog()
        } else {
            ActivityCompat.requestPermissions(
                this,
                arrayOf(Manifest.permission.RECORD_AUDIO),
                MICROPHONE_PERMISSION_REQUEST_CODE
            )
        }
    }
}

Network Security Configuration

For apps targeting Android 9 (API level 28) or higher, cleartext traffic is blocked by default. The ElevenLabs API is served over HTTPS, so most apps need no changes; add a network security configuration only if you need to allow cleartext traffic to your own domains (for example, a local development server):

<!-- In AndroidManifest.xml -->
<application
    android:networkSecurityConfig="@xml/network_security_config"
    ... >

<?xml version="1.0" encoding="utf-8"?>
<!-- res/xml/network_security_config.xml -->
<network-security-config>
    <domain-config cleartextTrafficPermitted="true">
        <domain includeSubdomains="true">your-api-domain.com</domain>
    </domain-config>
</network-security-config>

Usage

Initialize the SDK in your Application class or main activity, then start a conversation session with either:

  • Public agent: pass agentId
  • Private agent: pass a conversationToken provisioned from your backend (never expose your API key to the client)

import android.util.Log
import io.elevenlabs.ConversationClient
import io.elevenlabs.ConversationConfig
import io.elevenlabs.ConversationSession
import io.elevenlabs.ClientTool
import io.elevenlabs.ClientToolResult

// Start a public agent session (token generated for you)
val config = ConversationConfig(
    agentId = "<your_public_agent_id>", // OR conversationToken = "<token>"
    userId = "your-user-id",
    // Optional callbacks
    onConnect = { conversationId ->
        // Called when the conversation is connected; you can also read the ID via session.getId()
    },
    onMessage = { source, messageJson ->
        // Raw JSON messages from the data channel; useful for logging/telemetry
    },
    onModeChange = { mode ->
        // "speaking" | "listening" - drive UI indicators
    },
    onStatusChange = { status ->
        // "connected" | "connecting" | "disconnected"
    },
    onCanSendFeedbackChange = { canSend ->
        // Enable/disable thumbs up/down buttons for feedback reporting
    },
    onUnhandledClientToolCall = { call ->
        // Agent requested a client tool not registered on the device
    },
    onVadScore = { score ->
        // Voice Activity Detection score from 0 to 1; higher values indicate higher confidence of speech
    },
    // Client tools the agent can invoke
    clientTools = mapOf(
        "logMessage" to object : ClientTool {
            override suspend fun execute(parameters: Map<String, Any>): ClientToolResult {
                val message = parameters["message"] as? String
                Log.d("ExampleApp", "[INFO] Client Tool Log: $message")
                return ClientToolResult.success("Message logged successfully")
            }
        }
    ),
)

// In an Activity context
val session: ConversationSession = ConversationClient.startSession(config, this)

Note that Agents Platform requires microphone access. Consider explaining and requesting permissions in your app’s UI before the conversation starts, especially on Android 6.0+ where runtime permissions are required.

If a tool is configured with expects_response=false on the server, return null from execute to skip sending a tool result back to the agent.

Public vs Private Agents

  • Public agents (no auth): Initialize with agentId in ConversationConfig. The SDK requests a conversation token from ElevenLabs without needing an API key on device.
  • Private agents (auth): Initialize with conversationToken in ConversationConfig. Your server requests a conversation token from ElevenLabs using your ElevenLabs API key.
Never embed API keys in clients. They can be easily extracted and used maliciously.

Client Tools

Register client tools to allow the agent to call local capabilities on the device.

val config = ConversationConfig(
    agentId = "<public_agent>",
    clientTools = mapOf(
        "logMessage" to object : io.elevenlabs.ClientTool {
            override suspend fun execute(parameters: Map<String, Any>): io.elevenlabs.ClientToolResult? {
                val message = parameters["message"] as? String
                    ?: return io.elevenlabs.ClientToolResult.failure("Missing 'message'")

                android.util.Log.d("ClientTool", "Log: $message")
                return null // No response needed for fire-and-forget tools
            }
        }
    )
)

When the agent issues a client_tool_call, the SDK executes the matching tool and responds with a client_tool_result. If the tool is not registered, onUnhandledClientToolCall is invoked and a failure result is returned to the agent (if a response is expected).
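Because tool parameters arrive as a loosely typed Map<String, Any>, it is worth validating them before doing any work inside execute. A minimal sketch of two hypothetical helpers (not part of the SDK) that you might call at the top of a tool implementation:

```kotlin
// Hypothetical helpers (not part of the SDK) for validating client tool
// parameters before acting on them.

// Returns the value for `key` only if it is present and a String.
fun stringParam(parameters: Map<String, Any>, key: String): String? =
    parameters[key] as? String

// Returns the value for `key` as an Int, accepting the Number types
// that typically arrive from JSON deserialization (e.g. Double).
fun intParam(parameters: Map<String, Any>, key: String): Int? =
    (parameters[key] as? Number)?.toInt()
```

A tool can then return a failure result (or null, for fire-and-forget tools) as soon as a required parameter is missing or has the wrong type.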

Callbacks Overview

  • onConnect - Called when the WebRTC connection is established. Provides the conversation ID.
  • onMessage - Called when a new message is received. Messages can be tentative or final transcriptions of the user's voice, replies produced by the LLM, or debug messages. Provides the source ("ai" or "user") and the raw JSON message.
  • onModeChange - Called when the conversation mode changes. Useful for indicating whether the agent is speaking ("speaking") or listening ("listening").
  • onStatusChange - Called when the conversation status changes ("connected", "connecting", or "disconnected").
  • onCanSendFeedbackChange - Called when the ability to send feedback changes. Use it to enable or disable feedback buttons.
  • onUnhandledClientToolCall - Called when the agent requests a client tool that is not registered on the device.
  • onVadScore - Called when the voice activity detection score changes. Ranges from 0 to 1; higher values indicate higher confidence of speech.
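As an example, onStatusChange can drive the label of a connect button. A minimal sketch (statusLabel is a hypothetical helper, and the labels are illustrative; it assumes the lowercase string values delivered to the callback):

```kotlin
// Hypothetical mapper (not part of the SDK) from the status strings
// delivered to onStatusChange to user-facing button labels.
fun statusLabel(status: String): String = when (status) {
    "connected" -> "End conversation"
    "connecting" -> "Connecting..."
    "disconnected" -> "Start conversation"
    else -> status // Pass through anything unexpected
}
```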

Not all client events are enabled by default for an agent. If you have enabled a callback but aren’t seeing events come through, ensure that your ElevenLabs agent has the corresponding event enabled. You can do this in the “Advanced” tab of the agent settings in the ElevenLabs dashboard.

Methods

startSession

The startSession method initiates the WebRTC connection and starts using the microphone to communicate with your ElevenLabs agent.

Public agents

For public agents (i.e. agents that don’t have authentication enabled), only the agentId is required. The Agent ID can be acquired through the ElevenLabs UI.

val session = ConversationClient.startSession(
    config = ConversationConfig(
        agentId = "your-agent-id"
    ),
    context = this
)

Private agents

For private agents, you must pass in a conversationToken obtained from the ElevenLabs API. Generating this token requires an ElevenLabs API key.

The conversationToken is valid for 10 minutes.
// Server-side token generation (Node.js example)

app.get('/conversation-token', yourAuthMiddleware, async (req, res) => {
  const response = await fetch(
    `https://api.elevenlabs.io/v1/convai/conversation/token?agent_id=${process.env.AGENT_ID}`,
    {
      headers: {
        // Requesting a conversation token requires your ElevenLabs API key
        // Do NOT expose your API key to the client!
        'xi-api-key': process.env.ELEVENLABS_API_KEY,
      },
    }
  );

  if (!response.ok) {
    return res.status(500).send('Failed to get conversation token');
  }

  const body = await response.json();
  res.send(body.token);
});

Then, pass the token to the startSession method. Note that only the conversationToken is required for private agents.

// Get conversation token from your server
val conversationToken = fetchConversationTokenFromServer()

// For private agents, pass in the conversation token
val session = ConversationClient.startSession(
    config = ConversationConfig(
        conversationToken = conversationToken
    ),
    context = this
)
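Since the token is only valid for 10 minutes, the client should fetch it shortly before calling startSession rather than, say, at app launch. A minimal sketch of a hypothetical holder (not part of the SDK) that tracks whether a stored token is still inside the documented validity window; the clock is passed in so the expiry logic is easy to test:

```kotlin
// Hypothetical holder (not part of the SDK) that remembers when a
// conversation token was fetched and reports whether it is still within
// the documented 10-minute validity window.
class TokenHolder(private val ttlMillis: Long = 10 * 60 * 1000L) {
    private var token: String? = null
    private var fetchedAtMillis: Long = 0L

    fun store(newToken: String, nowMillis: Long) {
        token = newToken
        fetchedAtMillis = nowMillis
    }

    // Returns the token only while it is fresh; null means fetch a new one.
    fun freshToken(nowMillis: Long): String? =
        token?.takeIf { nowMillis - fetchedAtMillis < ttlMillis }
}
```

In practice you would call freshToken(System.currentTimeMillis()) and hit your token endpoint again whenever it returns null.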

You can optionally pass a user ID to identify the user in the conversation. This can be your own customer identifier. This will be included in the conversation initiation data sent to the server.

val session = ConversationClient.startSession(
    config = ConversationConfig(
        agentId = "your-agent-id",
        userId = "your-user-id"
    ),
    context = this
)

endSession

Manually ends the conversation and disconnects the session.

session.endSession()

sendUserMessage

Send a text message to the agent during an active conversation. This will trigger a response from the agent.

session.sendUserMessage("Hello, how can you help me?")

sendContextualUpdate

Sends contextual information to the agent that won’t trigger a response.

session.sendContextualUpdate(
    "User navigated to the profile page. Consider this for next response."
)

sendFeedback

Provide feedback on the conversation quality. This helps improve the agent’s performance. Use onCanSendFeedbackChange to enable your thumbs up/down UI when feedback is allowed.

// Positive feedback
session.sendFeedback(true)

// Negative feedback
session.sendFeedback(false)

sendUserActivity

Notifies the agent about user activity to prevent interruptions. Useful when the user is actively using the app and the agent should pause speaking, e.g. while the user is typing in a chat.

The agent will pause speaking for ~2 seconds after receiving this signal.

session.sendUserActivity()
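Because each signal only pauses the agent for roughly two seconds, apps typically re-send it while the user keeps typing. Rather than firing on every keystroke, you can rate-limit the calls. A minimal sketch of a hypothetical throttle (not part of the SDK) to consult before calling session.sendUserActivity():

```kotlin
// Hypothetical rate limiter (not part of the SDK): allows at most one
// sendUserActivity() call per interval while the user keeps typing.
class ActivityThrottle(private val intervalMillis: Long = 1000L) {
    private var lastSentMillis: Long? = null

    // Returns true when enough time has passed to send another signal.
    fun shouldSend(nowMillis: Long): Boolean {
        val last = lastSentMillis
        return if (last == null || nowMillis - last >= intervalMillis) {
            lastSentMillis = nowMillis
            true
        } else {
            false
        }
    }
}
```

From a TextWatcher, call shouldSend(System.currentTimeMillis()) and only invoke session.sendUserActivity() when it returns true.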

getId

Get the conversation ID.

val conversationId = session.getId()
Log.d("Conversation", "Conversation ID: $conversationId")
// e.g., "conv_123"

Mute / Unmute

session.toggleMute()
session.setMicMuted(true) // mute
session.setMicMuted(false) // unmute

Observe session.isMuted to update the UI label between “Mute” and “Unmute”.

Properties

status

Get the current status of the conversation.

val status = session.status
Log.d("Conversation", "Current status: $status")
// Values: DISCONNECTED, CONNECTING, CONNECTED

ProGuard / R8

If you shrink or obfuscate your app with R8/ProGuard, ensure the SDK's Gson models and LiveKit classes are kept. Example rules (adjust as needed):

-keep class io.elevenlabs.** { *; }
-keep class io.livekit.** { *; }
-keepattributes *Annotation*

Troubleshooting

  • Ensure microphone permission is granted at runtime
  • If reconnect hangs, verify your app calls session.endSession() and that you start a new session instance before reconnecting
  • For emulators, verify audio input/output routes are working; physical devices tend to behave more reliably

Example Implementation

For an example implementation, see the example app in the ElevenLabs Android SDK repository. The app demonstrates:

  • One‑tap connect/disconnect
  • Speaking/listening indicator
  • Feedback buttons with UI enable/disable
  • Typing indicator via sendUserActivity()
  • Contextual and user messages from an input
  • Microphone mute/unmute button