Speaking Bots

AI-Powered Meeting Participants with Customizable Personas

Speaking Bots is an open-source project for creating AI meeting agents that join and participate in video meetings, each with a distinct personality and context defined in Markdown files.

Our implementation extends Pipecat's Python framework to provide:
  • Meeting agents that can join Google Meet, Zoom or Microsoft Teams through the Meeting BaaS API
  • Customizable personas with unique context
  • Support for running multiple instances locally or at scale
  • Real-time audio processing through WebSocket infrastructure

The API follows a minimalist design with sensible defaults while offering optional customization. A bot can be deployed with just a meeting URL and API key, with additional parameters available for tailoring behavior.
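
As a rough sketch of that "required plus optional" shape, the helper below builds a request body from only a meeting URL and API key and adds further fields only when they are supplied. The function name `build_bot_request` is hypothetical and not part of the project; the field names are taken from the deployment example later in this document, and the defaults the server applies to omitted fields are not shown here.

```python
from typing import Any, Dict, Iterable, Optional


def build_bot_request(
    meeting_url: str,
    api_key: str,
    personas: Optional[Iterable[str]] = None,
    **optional: Any,
) -> Dict[str, Any]:
    """Build a JSON-serializable body for a bot deployment request.

    Hypothetical helper: only the two required fields are always present.
    Optional fields (for example tts_provider or websocket_url) are added
    only when the caller passes a value other than None.
    """
    payload: Dict[str, Any] = {
        "meeting_url": meeting_url,
        "meeting_baas_api_key": api_key,
    }
    if personas:
        payload["personas"] = list(personas)
    # Drop unset options so that the API's own defaults apply.
    payload.update({key: value for key, value in optional.items() if value is not None})
    return payload
```

Calling `build_bot_request(url, key)` yields a two-field body, while `build_bot_request(url, key, personas=["interviewer"], tts_provider="cartesia")` adds only the fields that were explicitly set.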

Features

  • AI-Powered Conversations: Bots engage in natural-sounding conversations, responding to meeting participants in real time.
  • Customizable Personas: Deploy bots with different personalities and specific knowledge bases through Markdown-defined contexts.
  • Multiplatform Support: Works with Google Meet, Zoom, and Microsoft Teams through one API.
  • Voice Activity Detection: Detects when to speak and when to listen, with configurable parameters.
  • Function Calling Tools: Built-in tools for checking weather, time, and other contextual information that can be enabled or disabled.
  • LLM Context Management: Maintains consistent, coherent conversations throughout meetings.
  • Multiple Bot Instances: Run several bots in one meeting with different personas and roles.
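
To give a feel for what "configurable parameters" can mean for voice activity detection, here is a minimal energy-threshold sketch. It is an illustration only, not the detector this project uses: the function names and the parameters `threshold`, `start_frames`, and `stop_frames` are assumptions made for the example.

```python
import math
from typing import Iterable, List, Sequence


def rms(frame: Sequence[int]) -> float:
    """Root-mean-square amplitude of one frame of PCM samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(sample * sample for sample in frame) / len(frame))


def detect_speech(
    frames: Iterable[Sequence[int]],
    threshold: float = 500.0,
    start_frames: int = 3,
    stop_frames: int = 5,
) -> List[bool]:
    """Return, per frame, whether a participant is considered to be speaking.

    Speech starts after `start_frames` consecutive loud frames and ends
    after `stop_frames` consecutive quiet frames, which avoids flipping
    state on a single noisy or silent frame.
    """
    speaking = False
    loud_run = 0
    quiet_run = 0
    states: List[bool] = []
    for frame in frames:
        if rms(frame) >= threshold:
            loud_run += 1
            quiet_run = 0
        else:
            quiet_run += 1
            loud_run = 0
        if not speaking and loud_run >= start_frames:
            speaking = True
        elif speaking and quiet_run >= stop_frames:
            speaking = False
        states.append(speaking)
    return states
```

In this sketch, a bot would listen while the state is `True` and treat the switch back to `False` as its cue to respond. Raising `stop_frames` makes it wait longer before speaking.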

Core Technologies

  • Pipecat: Python framework powering the real-time audio processing pipeline
  • Meeting BaaS: Meeting bot deployment API for video platforms
  • Text-to-Speech: Cartesia for bot voice synthesis
  • Speech-to-Text: Deepgram or Gladia with language recognition
  • Voice AI: OpenAI GPT models for conversation generation
  • Data Transport: WebSocket communication with Protocol Buffers

API Usage

The Speaking Bot API is available at speaking.meetingbaas.com, and its OpenAPI specification at speaking.meetingbaas.com/openapi.json.

The API provides endpoints to:

  • Create and deploy speaking bots in meetings
  • Remove bots from meetings
  • Manage WebSocket connections for audio streaming
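
Since the exact paths are not listed here, one way to discover them is to read the OpenAPI specification mentioned above. The sketch below assumes the document at that URL follows the standard OpenAPI layout, with a top-level `paths` object keyed by path and then by HTTP method. The helper names `list_operations` and `fetch_spec` are hypothetical.

```python
import json
from typing import Any, Dict, List, Tuple
from urllib.request import urlopen

SPEC_URL = "https://speaking.meetingbaas.com/openapi.json"

# Method names that may appear as keys under a path in an OpenAPI document.
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "options", "head", "trace"}


def list_operations(spec: Dict[str, Any]) -> List[Tuple[str, str, str]]:
    """Return (METHOD, path, summary) for every operation in an OpenAPI spec."""
    operations: List[Tuple[str, str, str]] = []
    for path, item in sorted(spec.get("paths", {}).items()):
        for method, operation in item.items():
            # Skip non-operation keys such as "parameters" or "summary".
            if method.lower() in HTTP_METHODS and isinstance(operation, dict):
                operations.append((method.upper(), path, operation.get("summary", "")))
    return operations


def fetch_spec(url: str = SPEC_URL, timeout: float = 10.0) -> Dict[str, Any]:
    """Download and parse the OpenAPI document (requires network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Iterating over `list_operations(fetch_spec())` and printing each tuple gives a quick overview of the available endpoints without hard-coding any path.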

Deployment Example

To deploy a Speaking Bot, send a POST request to the /bots endpoint. The shell script below does this with curl, and the Python script after it sends the same kind of request with additional optional parameters.

SPEAKING_BOT.SH
#!/bin/bash
curl -X POST "https://speaking.meetingbaas.com/bots" \
  -H "Content-Type: application/json" \
  -d '{
    "meeting_url": "https://us06web.zoom.us/j/123456789?pwd=example",
    "personas": ["baas_onboarder"],
    "meeting_baas_api_key": "your-api-key"
  }'

SPEAKING_BOT.PY
import json

import requests

# API endpoint
url = "https://speaking.meetingbaas.com/bots"

# Request headers
headers = {"Content-Type": "application/json"}

# Request payload
payload = {
    "meeting_url": "https://meet.google.com/abc-defg-hij",
    "personas": ["interviewer", "note_taker"],
    "meeting_baas_api_key": "your-api-key",
    "websocket_url": "wss://your-server.com/ws",  # Optional
    "tts_provider": "cartesia",  # Optional
    "stt_provider": "deepgram",  # Optional
}

# Send the request
response = requests.post(url, headers=headers, data=json.dumps(payload))

# Process the response
if response.status_code == 200:
    result = response.json()
    print(f"Success! Bot ID: {result.get('bot_id')}")
    print(f"Bot Name: {result.get('bot_name')}")
else:
    print(f"Error: {response.status_code}")
    print(response.text)