Kotlin SDK

Agents Platform SDK: deploy customized, interactive voice agents in minutes for Android apps.

Refer to the Agents Platform overview for an explanation of how Agents Platform works.

Installation

Add the ElevenLabs SDK to your Android project by including the following dependency in your app-level build.gradle file:

build.gradle.kts
dependencies {
    // ElevenLabs Agents SDK (Android)
    implementation("io.elevenlabs:elevenlabs-android:<latest>")

    // Kotlin coroutines, AndroidX, etc., as needed by your app
}

An example Android app using this SDK can be found in the ElevenLabs Android SDK repository.

Requirements

  • Android API level 21 (Android 5.0) or higher
  • Internet permission for API calls
  • Microphone permission for voice input
  • Network security configuration for HTTPS calls

Setup

Manifest Configuration

Add the necessary permissions to your AndroidManifest.xml:

<uses-permission android:name="android.permission.INTERNET" />
<uses-permission android:name="android.permission.RECORD_AUDIO" />
<uses-permission android:name="android.permission.MODIFY_AUDIO_SETTINGS" />

Runtime Permissions

For Android 6.0 (API level 23) and higher, you must request microphone permission at runtime:

import android.Manifest
import android.content.pm.PackageManager
import androidx.core.app.ActivityCompat
import androidx.core.content.ContextCompat

companion object {
    private const val MICROPHONE_PERMISSION_REQUEST_CODE = 1001
}

private fun requestMicrophonePermission() {
    if (ContextCompat.checkSelfPermission(this, Manifest.permission.RECORD_AUDIO)
        != PackageManager.PERMISSION_GRANTED
    ) {
        if (ActivityCompat.shouldShowRequestPermissionRationale(this, Manifest.permission.RECORD_AUDIO)) {
            // Show an explanation to the user before requesting again
            showPermissionExplanationDialog()
        } else {
            ActivityCompat.requestPermissions(
                this,
                arrayOf(Manifest.permission.RECORD_AUDIO),
                MICROPHONE_PERMISSION_REQUEST_CODE
            )
        }
    }
}

Network Security Configuration

For apps targeting Android 9 (API level 28) or higher, cleartext traffic is blocked by default. The ElevenLabs API is served over HTTPS, so most apps need no changes; add a network security configuration only if you need to allow cleartext traffic to your own domains (for example, a local development server):

<!-- In AndroidManifest.xml -->
<application
    android:networkSecurityConfig="@xml/network_security_config"
    ... >

<?xml version="1.0" encoding="utf-8"?>
<!-- res/xml/network_security_config.xml -->
<network-security-config>
    <domain-config cleartextTrafficPermitted="true">
        <domain includeSubdomains="true">your-api-domain.com</domain>
    </domain-config>
</network-security-config>

Usage

Initialize the SDK in your Application class or main activity, then start a conversation session with either:

  • Public agent: pass agentId
  • Private agent: pass a conversationToken provisioned from your backend (never expose your API key to the client)

import android.util.Log
import io.elevenlabs.ConversationClient
import io.elevenlabs.ConversationConfig
import io.elevenlabs.ConversationSession
import io.elevenlabs.ClientTool
import io.elevenlabs.ClientToolResult

// Start a public agent session (token generated for you)
val config = ConversationConfig(
    agentId = "<your_public_agent_id>", // OR conversationToken = "<token>"
    userId = "your-user-id",
    // Optional callbacks
    onConnect = { conversationId ->
        // Called when the conversation is connected; you can also read the ID via session.getId()
    },
    onMessage = { source, messageJson ->
        // Raw JSON messages from the data channel; useful for logging/telemetry
    },
    onModeChange = { mode ->
        // "speaking" | "listening" - drive UI indicators
    },
    onStatusChange = { status ->
        // "connected" | "connecting" | "disconnected"
    },
    onCanSendFeedbackChange = { canSend ->
        // Enable/disable thumbs up/down buttons for feedback reporting
    },
    onUnhandledClientToolCall = { call ->
        // Agent requested a client tool not registered on the device
    },
    onVadScore = { score ->
        // Voice Activity Detection score from 0 to 1; higher values indicate higher confidence of speech
    },
    // Client tools the agent can invoke
    clientTools = mapOf(
        "logMessage" to object : ClientTool {
            override suspend fun execute(parameters: Map<String, Any>): ClientToolResult {
                val message = parameters["message"] as? String
                Log.d("ExampleApp", "[INFO] Client Tool Log: $message")
                return ClientToolResult.success("Message logged successfully")
            }
        }
    ),
)

// In an Activity context
val session: ConversationSession = ConversationClient.startSession(config, this)

Note that Agents Platform requires microphone access. Consider explaining and requesting permissions in your app’s UI before the conversation starts, especially on Android 6.0+ where runtime permissions are required.

If a tool is configured with expects_response=false on the server, return null from execute to skip sending a tool result back to the agent.

Public vs Private Agents

  • Public agents (no auth): Initialize with agentId in ConversationConfig. The SDK requests a conversation token from ElevenLabs without needing an API key on device.
  • Private agents (auth): Initialize with conversationToken in ConversationConfig. Your server requests a conversation token from ElevenLabs using your ElevenLabs API key.
Never embed API keys in clients. They can be easily extracted and used maliciously.

Client Tools

Register client tools to allow the agent to call local capabilities on the device.

val config = ConversationConfig(
    agentId = "<public_agent>",
    clientTools = mapOf(
        "logMessage" to object : io.elevenlabs.ClientTool {
            override suspend fun execute(parameters: Map<String, Any>): io.elevenlabs.ClientToolResult? {
                val message = parameters["message"] as? String
                    ?: return io.elevenlabs.ClientToolResult.failure("Missing 'message'")

                android.util.Log.d("ClientTool", "Log: $message")
                return null // No response needed for fire-and-forget tools
            }
        }
    )
)

When the agent issues a client_tool_call, the SDK executes the matching tool and responds with a client_tool_result. If the tool is not registered, onUnhandledClientToolCall is invoked and a failure result is returned to the agent (if a response is expected).
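Because tool parameters arrive as a loosely typed Map<String, Any>, it is worth validating them before doing any work inside execute. A minimal sketch of two hypothetical helpers (not part of the SDK) that you might call at the top of a tool implementation:

```kotlin
// Hypothetical helpers (not part of the SDK) for validating client tool
// parameters before acting on them.

// Returns the value for `key` only if it is present and a String.
fun stringParam(parameters: Map<String, Any>, key: String): String? =
    parameters[key] as? String

// Returns the value for `key` as an Int, accepting the Number types
// that typically arrive from JSON deserialization (e.g. Double).
fun intParam(parameters: Map<String, Any>, key: String): Int? =
    (parameters[key] as? Number)?.toInt()
```

A tool can then return a failure result (or null, for fire-and-forget tools) as soon as a required parameter is missing or has the wrong type.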

Callbacks Overview

  • onConnect - Called when the WebRTC connection is established. Provides the conversation ID.
  • onMessage - Called when a new message is received. Messages can be tentative or final transcriptions of the user's voice, replies produced by the LLM, or debug messages. Provides the source ("ai" or "user") and the raw JSON message.
  • onModeChange - Called when the conversation mode changes. Useful for indicating whether the agent is speaking ("speaking") or listening ("listening").
  • onStatusChange - Called when the conversation status changes ("connected", "connecting", or "disconnected").
  • onCanSendFeedbackChange - Called when the ability to send feedback changes. Use it to enable or disable feedback buttons.
  • onUnhandledClientToolCall - Called when the agent requests a client tool that is not registered on the device.
  • onVadScore - Called when the voice activity detection score changes. Ranges from 0 to 1; higher values indicate higher confidence of speech.
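As an example, onStatusChange can drive the label of a connect button. A minimal sketch (statusLabel is a hypothetical helper, and the labels are illustrative; it assumes the lowercase string values delivered to the callback):

```kotlin
// Hypothetical mapper (not part of the SDK) from the status strings
// delivered to onStatusChange to user-facing button labels.
fun statusLabel(status: String): String = when (status) {
    "connected" -> "End conversation"
    "connecting" -> "Connecting..."
    "disconnected" -> "Start conversation"
    else -> status // Pass through anything unexpected
}
```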

Not all client events are enabled by default for an agent. If you have enabled a callback but aren’t seeing events come through, ensure that your ElevenLabs agent has the corresponding event enabled. You can do this in the “Advanced” tab of the agent settings in the ElevenLabs dashboard.

Methods

startSession

The startSession method initiates the WebRTC connection and starts using the microphone to communicate with your ElevenLabs agent.

Public agents

For public agents (i.e. agents that don’t have authentication enabled), only the agentId is required. The Agent ID can be acquired through the ElevenLabs UI.

val session = ConversationClient.startSession(
    config = ConversationConfig(
        agentId = "your-agent-id"
    ),
    context = this
)

Private agents

For private agents, you must pass in a conversationToken obtained from the ElevenLabs API. Generating this token requires an ElevenLabs API key.

The conversationToken is valid for 10 minutes.
// Server-side token generation (Node.js example)

app.get('/conversation-token', yourAuthMiddleware, async (req, res) => {
  const response = await fetch(
    `https://api.elevenlabs.io/v1/convai/conversation/token?agent_id=${process.env.AGENT_ID}`,
    {
      headers: {
        // Requesting a conversation token requires your ElevenLabs API key
        // Do NOT expose your API key to the client!
        'xi-api-key': process.env.ELEVENLABS_API_KEY,
      },
    }
  );

  if (!response.ok) {
    return res.status(500).send('Failed to get conversation token');
  }

  const body = await response.json();
  res.send(body.token);
});

Then, pass the token to the startSession method. Note that only the conversationToken is required for private agents.

// Get conversation token from your server
val conversationToken = fetchConversationTokenFromServer()

// For private agents, pass in the conversation token
val session = ConversationClient.startSession(
    config = ConversationConfig(
        conversationToken = conversationToken
    ),
    context = this
)
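Since the token is only valid for 10 minutes, the client should fetch it shortly before calling startSession rather than, say, at app launch. A minimal sketch of a hypothetical holder (not part of the SDK) that tracks whether a stored token is still inside the documented validity window; the clock is passed in so the expiry logic is easy to test:

```kotlin
// Hypothetical holder (not part of the SDK) that remembers when a
// conversation token was fetched and reports whether it is still within
// the documented 10-minute validity window.
class TokenHolder(private val ttlMillis: Long = 10 * 60 * 1000L) {
    private var token: String? = null
    private var fetchedAtMillis: Long = 0L

    fun store(newToken: String, nowMillis: Long) {
        token = newToken
        fetchedAtMillis = nowMillis
    }

    // Returns the token only while it is fresh; null means fetch a new one.
    fun freshToken(nowMillis: Long): String? =
        token?.takeIf { nowMillis - fetchedAtMillis < ttlMillis }
}
```

In practice you would call freshToken(System.currentTimeMillis()) and hit your token endpoint again whenever it returns null.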

You can optionally pass a user ID to identify the user in the conversation. This can be your own customer identifier. This will be included in the conversation initiation data sent to the server.

val session = ConversationClient.startSession(
    config = ConversationConfig(
        agentId = "your-agent-id",
        userId = "your-user-id"
    ),
    context = this
)

endSession

Manually ends the conversation and disconnects the session.

session.endSession()

sendUserMessage

Send a text message to the agent during an active conversation. This will trigger a response from the agent.

session.sendUserMessage("Hello, how can you help me?")

sendContextualUpdate

Sends contextual information to the agent that won’t trigger a response.

session.sendContextualUpdate(
    "User navigated to the profile page. Consider this for next response."
)

sendFeedback

Provide feedback on the conversation quality. This helps improve the agent’s performance. Use onCanSendFeedbackChange to enable your thumbs up/down UI when feedback is allowed.

// Positive feedback
session.sendFeedback(true)

// Negative feedback
session.sendFeedback(false)

sendUserActivity

Notifies the agent about user activity to prevent interruptions. Useful when the user is actively using the app and the agent should pause speaking, e.g. while the user is typing in a chat.

The agent will pause speaking for ~2 seconds after receiving this signal.

session.sendUserActivity()
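Because each signal only pauses the agent for roughly two seconds, apps typically re-send it while the user keeps typing. Rather than firing on every keystroke, you can rate-limit the calls. A minimal sketch of a hypothetical throttle (not part of the SDK) to consult before calling session.sendUserActivity():

```kotlin
// Hypothetical rate limiter (not part of the SDK): allows at most one
// sendUserActivity() call per interval while the user keeps typing.
class ActivityThrottle(private val intervalMillis: Long = 1000L) {
    private var lastSentMillis: Long? = null

    // Returns true when enough time has passed to send another signal.
    fun shouldSend(nowMillis: Long): Boolean {
        val last = lastSentMillis
        return if (last == null || nowMillis - last >= intervalMillis) {
            lastSentMillis = nowMillis
            true
        } else {
            false
        }
    }
}
```

From a TextWatcher, call shouldSend(System.currentTimeMillis()) and only invoke session.sendUserActivity() when it returns true.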

getId

Get the conversation ID.

val conversationId = session.getId()
Log.d("Conversation", "Conversation ID: $conversationId")
// e.g., "conv_123"

Mute / Unmute

session.toggleMute()
session.setMicMuted(true) // mute
session.setMicMuted(false) // unmute

Observe session.isMuted to update the UI label between “Mute” and “Unmute”.

Properties

status

Get the current status of the conversation.

val status = session.status
Log.d("Conversation", "Current status: $status")
// Values: DISCONNECTED, CONNECTING, CONNECTED

ProGuard / R8

If you shrink or obfuscate your app with R8/ProGuard, ensure the SDK's Gson models and LiveKit classes are kept. Example rules (adjust as needed):

-keep class io.elevenlabs.** { *; }
-keep class io.livekit.** { *; }
-keepattributes *Annotation*

Troubleshooting

  • Ensure microphone permission is granted at runtime
  • If reconnect hangs, verify your app calls session.endSession() and that you start a new session instance before reconnecting
  • For emulators, verify audio input/output routes are working; physical devices tend to behave more reliably

Example Implementation

For an example implementation, see the example app in the ElevenLabs Android SDK repository. The app demonstrates:

  • One‑tap connect/disconnect
  • Speaking/listening indicator
  • Feedback buttons with UI enable/disable
  • Typing indicator via sendUserActivity()
  • Contextual and user messages from an input
  • Microphone mute/unmute button