To ensure these features work properly, administrators must complete the appropriate configurations in the system admin console. Below is the setup guidance and underlying logic for each feature.
1. Recording Transcription (Speech-to-Text)
Feature Overview
After a meeting ends and the recording is generated, the system will attempt to transcribe the audio content into text, which can be used for future meeting summaries, search, and analysis.
Configuration Requirements
✅ Object Storage Configuration (Required)
STT models typically do not accept local file uploads and require the audio to be hosted in publicly accessible cloud storage.
✅ AI Model Configuration (STT Model)
Configuration Steps
Step 1: Configure Object Storage (Provides publicly accessible storage for STT model access)
Go to System Settings > OSS Management, and configure access credentials such as the Access Key and Secret.
Then go to System Settings > Basic Settings > System Storage, and select a cloud platform for uploading generated recordings.
Note: If you're using Deepgram as your AI model, you can skip cloud storage and select local storage instead.
Step 2: Configure AI Model (STT)
(Select the model used for converting audio to text)
Navigate to System Settings > AI Management and enter your API credentials;
Then go to System Settings > Basic Settings > System AI Settings, and choose an STT model such as Alibaba or Deepgram.
Step 3: Processing Logic
Once the recording is generated, the system automatically uploads the audio file to the configured cloud storage;
The audio file link and prompt are submitted to the STT model;
Once the transcription result is received, the system automatically deletes the cloud audio file and retains only the text result.
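The steps above can be sketched as a small Python pipeline. This is a minimal illustration, not the product's actual implementation: `StubStorage` and `StubStt` are hypothetical stand-ins for the configured OSS provider and STT model.

```python
# Sketch of the recording-transcription flow: upload, transcribe, clean up.
def transcribe_recording(audio_path, storage, stt, prompt="Transcribe this meeting."):
    audio_url = storage.upload(audio_path)              # 1. upload recording to cloud storage
    try:
        transcript = stt.transcribe(audio_url, prompt)  # 2. submit audio URL + prompt to STT model
    finally:
        storage.delete(audio_url)                       # 3. delete the cloud copy; keep only the text
    return transcript

# Minimal stubs to show the call flow without real cloud credentials.
class StubStorage:
    def __init__(self):
        self.deleted = []
    def upload(self, path):
        return "https://oss.example.com/" + path
    def delete(self, url):
        self.deleted.append(url)

class StubStt:
    def transcribe(self, url, prompt):
        return "transcript for " + url

storage = StubStorage()
transcript = transcribe_recording("meeting-001.mp3", storage, StubStt())
```

The `try/finally` reflects the cleanup guarantee: the hosted audio file is removed whether or not transcription succeeds.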
2. Smart Meeting Summary (LLM-Based)
Feature Overview
Once the transcription result is available, users can click “Generate Summary” to trigger the system to use an LLM model to create a meeting summary, including full-text highlights, chapter overviews, and action items.
Configuration Requirements
❌ No object storage required (summary is generated from local transcription text)
✅ AI Model Configuration (LLM Model)
Configuration Steps
Step 1: Configure AI Model (LLM)
(Select the LLM model used to generate the summary)
Go to System Settings > AI Management, and enter your API credentials;
Then go to System Settings > Basic Settings > System AI Settings, and choose a model for Smart Summary, such as OpenAI or Volcengine (LLM).
Step 2: Trigger & Processing Logic
When the user clicks “Generate Summary” for the first time, the system submits the transcription result and system prompt to the LLM model;
The generated summary will appear in the meeting summary section, with support for keyword search and video timestamp navigation.
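The trigger logic above can be sketched as follows. This is an assumption-laden illustration, not the shipped code: `StubLlm` is a hypothetical stand-in for the configured LLM provider, and the key point is that generation happens only on the first click, after which the stored summary is reused.

```python
# Sketch of the "Generate Summary" trigger: generate once, then reuse.
def generate_summary(meeting, llm, system_prompt="Summarize this meeting."):
    if meeting.get("summary") is None:  # first click only
        meeting["summary"] = llm.complete(system_prompt, meeting["transcript"])
    return meeting["summary"]

class StubLlm:
    def __init__(self):
        self.calls = 0
    def complete(self, system_prompt, transcript):
        self.calls += 1
        return "Highlights, chapters, and action items for: " + transcript

llm = StubLlm()
meeting = {"transcript": "Q3 planning discussion", "summary": None}
first = generate_summary(meeting, llm)
second = generate_summary(meeting, llm)  # served from the stored summary, no second LLM call
```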
3. Live Captions
Feature Overview
During a live meeting, the system converts real-time speech into subtitles that are only visible to the current user.
Configuration Requirements
✅ AI Model Configuration (STT)
❌ No object storage required (captions are generated from in-memory audio streams)
Configuration Steps
Step 1: Configure AI Model (STT for real-time use)
Navigate to System Settings > AI Management, and enter your STT model credentials;
Then go to System Settings > Basic Settings > System AI Settings, and select the STT model for “Live Captions,” such as Alibaba or Deepgram.
Step 2: Processing Logic
When a user joins a meeting and enables captions, the system captures short segments of the current audio stream;
These segments are sent to the STT model in real time for processing;
The resulting captions are displayed only in the user's local interface and are not stored or uploaded.
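The caption loop above can be sketched like this. It is a minimal illustration under stated assumptions: `StubStt` stands in for the real-time STT model, and `display` stands in for the user's local UI. Note that segments are processed in memory and nothing is persisted.

```python
# Sketch of live captioning: transcribe short segments and show them locally.
def run_live_captions(audio_segments, stt, display):
    for segment in audio_segments:
        caption = stt.transcribe_segment(segment)  # real-time STT per short segment
        display(caption)                           # shown only in the local interface

class StubStt:
    def transcribe_segment(self, segment):
        return segment.decode()  # stand-in for a real speech-to-text call

shown = []
run_live_captions([b"hello", b"everyone"], StubStt(), shown.append)
```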
Summary
| Feature | Requires Object Storage | Requires STT Model | Requires LLM Model |
| --- | --- | --- | --- |
| Recording Transcription | ✅ | ✅ | ❌ |
| Smart Meeting Summary | ❌ | ✅ (depends on transcription) | ✅ |
| Live Captions | ❌ | ✅ | ❌ |