Admin Configuration Guide: Enabling AI Features

Updated: 19/05/2025

VidyVault provides multiple AI-powered speech features, including:

  • Recording Transcription (STT: Speech-to-Text)
  • Smart Meeting Summary (LLM: Large Language Model)
  • Live Captions

To ensure these features work properly, administrators must complete the appropriate configurations in the system admin console. Below is the setup guidance and underlying logic for each feature.


1. Recording Transcription (Speech-to-Text)

Feature Overview

After a meeting ends and the recording is generated, the system will attempt to transcribe the audio content into text, which can be used for future meeting summaries, search, and analysis.

Configuration Requirements

  • Object Storage Configuration (Required): STT models typically do not accept local file uploads, so the audio must be hosted in cloud storage that the model can reach through a publicly accessible link.
  • AI Model Configuration (STT Model)

Configuration Steps

Step 1: Configure Object Storage (Provides publicly accessible storage for STT model access)

  • Go to System Settings > OSS Management, and configure access credentials such as the Access Key and Secret.
  • Then go to System Settings > Basic Settings > System Storage, and select a cloud platform for uploading generated recordings.
[Screenshot: VidyVault object storage connection settings]
[Screenshot: VidyVault cloud platform selection menu]
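
The two settings in Step 1 depend on each other: the platform selected under System Storage is only usable if credentials for it were entered under OSS Management. A minimal Python sketch of that check follows. The class names, field names, and the "example-cloud" platform are hypothetical and are not VidyVault's actual configuration schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class OssCredentials:
    """Hypothetical stand-in for one entry under OSS Management."""
    access_key: str
    secret: str

    def is_complete(self) -> bool:
        return bool(self.access_key.strip()) and bool(self.secret.strip())


@dataclass
class StorageSettings:
    """Hypothetical model of the two admin screens used in Step 1."""
    credentials: Dict[str, OssCredentials] = field(default_factory=dict)
    selected_platform: Optional[str] = None  # chosen under System Storage

    def problems(self) -> List[str]:
        """Return the reasons why recording uploads would not work."""
        if self.selected_platform is None:
            return ["no cloud platform selected under System Storage"]
        creds = self.credentials.get(self.selected_platform)
        if creds is None or not creds.is_complete():
            return [f"missing Access Key or Secret for {self.selected_platform}"]
        return []


settings = StorageSettings(selected_platform="example-cloud")
print(settings.problems())  # credentials have not been entered yet
settings.credentials["example-cloud"] = OssCredentials("AK-placeholder", "SK-placeholder")
print(settings.problems())  # nothing left to fix
```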

Step 2: Configure AI Model (STT)

  • Navigate to System Settings > AI Management and enter your API credentials;
  • Then go to System Settings > Basic Settings > System AI Settings, and choose an STT model such as Alibaba or Deepgram.
[Screenshot: VidyVault API credentials input screen]
[Screenshot: VidyVault STT model selection interface]

Step 3: Processing Logic

  • Once the recording is generated, the system automatically uploads the audio file to the configured cloud storage;
  • The audio file link and prompt are submitted to the STT model;
  • After the transcription result is received, the system automatically deletes the cloud audio file and retains only the text result.
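
The upload, submit, and delete sequence above can be sketched in Python as follows. This is an illustration only: the storage class, the `stt` callable, the object key layout, and the example.com link are assumptions made for the sketch, not VidyVault's internal API or any provider's SDK.

```python
from typing import Callable, Dict


class InMemoryStorage:
    """Stand-in for the configured cloud storage (hypothetical interface)."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}

    def upload(self, key: str, data: bytes) -> str:
        self.objects[key] = data
        return f"https://storage.example.com/{key}"  # placeholder public link

    def delete(self, key: str) -> None:
        self.objects.pop(key, None)


def transcribe_recording(
    recording_id: str,
    audio: bytes,
    storage: InMemoryStorage,
    stt: Callable[[str, str], str],
    prompt: str,
) -> str:
    """Upload the audio, submit its link and the prompt, then delete the audio."""
    key = f"recordings/{recording_id}"  # hypothetical key layout
    url = storage.upload(key, audio)
    transcript = stt(url, prompt)
    # The guide describes deletion once the result has arrived; what happens
    # when the STT call fails is not specified, so it is not modelled here.
    storage.delete(key)
    return transcript


def fake_stt(audio_url: str, prompt: str) -> str:
    """Placeholder for a real STT provider call."""
    return f"transcript of {audio_url}"


storage = InMemoryStorage()
text = transcribe_recording("meeting-42", b"raw-audio", storage, fake_stt, "Transcribe this meeting.")
print(text)
print(storage.objects)  # the uploaded audio is gone, only the text remains
```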

2. Smart Meeting Summary (LLM-Based)

Feature Overview

Once the transcription result is available, users can click “Generate Summary” to have an LLM create a meeting summary that includes full-text highlights, chapter overviews, and action items.

Configuration Requirements

  • No object storage required (summary is generated from local transcription text)
  • AI Model Configuration (LLM Model)

Configuration Steps

Step 1: Configure AI Model (LLM)

  • Go to System Settings > AI Management, and enter your API credentials;
  • Then go to System Settings > Basic Settings > System AI Settings, and choose a model for Smart Summary, such as OpenAI or Volcengine (LLM).
[Screenshot: VidyVault AI credential management screen]
[Screenshot: VidyVault smart summary model selector]

Step 2: Trigger & Processing Logic

  • When the user clicks “Generate Summary” for the first time, the system submits the transcription result and the system prompt to the LLM;
  • The generated summary will appear in the meeting summary section, with support for keyword search and video timestamp navigation.
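
The trigger logic above can be sketched in Python as follows. The guide only says that the first click submits the transcription to the LLM; that later clicks reuse the stored summary is an assumption of this sketch, and the `SummaryService` class and the `llm` callable are hypothetical names.

```python
from typing import Callable, Dict


class SummaryService:
    """Hypothetical sketch of the "Generate Summary" trigger."""

    def __init__(self, llm: Callable[[str, str], str], system_prompt: str) -> None:
        self._llm = llm
        self._system_prompt = system_prompt
        self._summaries: Dict[str, str] = {}  # meeting id -> stored summary

    def generate_summary(self, meeting_id: str, transcript: str) -> str:
        # Assumption: only the first click calls the LLM; later clicks
        # return the summary that was stored the first time.
        if meeting_id not in self._summaries:
            self._summaries[meeting_id] = self._llm(self._system_prompt, transcript)
        return self._summaries[meeting_id]


calls = []


def fake_llm(system_prompt: str, transcript: str) -> str:
    """Placeholder for a real LLM provider call."""
    calls.append(transcript)
    return f"Summary ({len(transcript.split())} words transcribed)"


service = SummaryService(fake_llm, "Summarize the meeting.")
first = service.generate_summary("meeting-42", "we agreed to ship on Friday")
second = service.generate_summary("meeting-42", "we agreed to ship on Friday")
print(first)
print(len(calls))  # the LLM was called once for two clicks
```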

3. Live Captions

Feature Overview

During a live meeting, the system converts real-time speech into subtitles that are only visible to the current user.

Configuration Requirements

  • AI Model Configuration (STT)
  • No object storage required (captions are generated from in-memory audio streams)

Configuration Steps

Step 1: Configure AI Model (STT for real-time use)

  • Navigate to System Settings > AI Management, and enter your STT model credentials;
  • Then go to System Settings > Basic Settings > System AI Settings, and select the STT model for “Live Captions,” such as Alibaba or Deepgram.
[Screenshot: VidyVault STT credential input fields]
[Screenshot: VidyVault caption engine selection menu]

Step 2: Processing Logic

  • When a user joins a meeting and enables captions, the system captures short segments of the current audio stream;
  • These segments are sent to the STT model in real time for processing;
  • The resulting captions are displayed only in the user's local interface and are not stored or uploaded.
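
The capture, send, and display loop above can be sketched in Python as follows. Several things are assumptions made for illustration: segments are cut by byte count as a stand-in for short time windows, the `stt` callable is a placeholder for a real-time STT provider, and printing stands in for the user's local caption display.

```python
from typing import Callable, Iterable, Iterator


def split_into_segments(stream: Iterable[bytes], segment_size: int) -> Iterator[bytes]:
    """Cut incoming audio chunks into fixed-size segments (the last may be shorter)."""
    buffer = b""
    for chunk in stream:
        buffer += chunk
        while len(buffer) >= segment_size:
            yield buffer[:segment_size]
            buffer = buffer[segment_size:]
    if buffer:
        yield buffer  # trailing partial segment


def live_captions(
    stream: Iterable[bytes],
    stt: Callable[[bytes], str],
    segment_size: int,
) -> Iterator[str]:
    """Send each segment to the STT callable and yield captions one by one."""
    for segment in split_into_segments(stream, segment_size):
        caption = stt(segment)
        if caption:
            # Handed straight to the caller; nothing is stored or uploaded.
            yield caption


def fake_stt(segment: bytes) -> str:
    """Placeholder for a real-time STT call."""
    return segment.decode("ascii").upper()


incoming_audio = [b"ab", b"cde", b"f", b"g"]
for line in live_captions(incoming_audio, fake_stt, segment_size=3):
    print(line)  # stands in for the local caption display
```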

Summary

Feature                    Requires Object Storage    Requires STT Model                       Requires LLM Model
Recording Transcription    Yes                        Yes                                      No
Smart Meeting Summary      No                         Indirectly (depends on transcription)    Yes
Live Captions              No                         Yes                                      No
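
The requirements stated in each feature's Configuration Requirements list can also be written as a small lookup, which makes it easy to see what is still missing before a feature is enabled. The sketch below is illustrative only: the identifiers are hypothetical, and treating Smart Meeting Summary as inheriting the Recording Transcription requirements is one reading of "depends on transcription".

```python
from typing import Dict, List, Set

# Direct requirements, as listed under each feature's Configuration Requirements.
REQUIREMENTS: Dict[str, Set[str]] = {
    "recording_transcription": {"object_storage", "stt_model"},
    "smart_meeting_summary": {"llm_model"},
    "live_captions": {"stt_model"},
}

# Smart Meeting Summary needs a transcription result before it can run.
DEPENDS_ON: Dict[str, str] = {"smart_meeting_summary": "recording_transcription"}


def missing_configuration(feature: str, configured: Set[str]) -> List[str]:
    """List what is still unconfigured for a feature, including its dependency."""
    needed = set(REQUIREMENTS[feature])
    parent = DEPENDS_ON.get(feature)
    if parent is not None:
        needed |= REQUIREMENTS[parent]
    return sorted(needed - configured)


print(missing_configuration("smart_meeting_summary", {"llm_model"}))
print(missing_configuration("live_captions", {"stt_model"}))
```

With only an LLM configured, the first call reports that object storage and an STT model are still needed for the underlying transcription; the second call reports nothing missing.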
