AI-Powered Audio Transcriptions

At Dyte, we are committed to revolutionizing real-time communication, and we're thrilled to announce an exciting addition to our platform: AI-powered audio transcriptions. This feature, currently in beta, is designed to enhance your communication experience by effortlessly converting spoken words into written text.

We're introducing an efficient and accurate way to transcribe your meeting conversations. We provide transcriptions in two forms:

  1. Live transcriptions: The transcripts can be consumed on the client side using the Dyte SDK that's suitable for your platform. These transcripts are generated on the server in real-time.
  2. Post-meeting webhooks: The meeting transcript can be consumed via a webhook after the meeting ends.

This beta release marks a step in our journey toward providing you with cutting-edge AI capabilities. We’re also developing other AI features, such as meeting agenda generation and meeting summarization.


Dyte's transcription APIs give developers a programmatic way to integrate transcription capabilities into their applications or services. To learn more about the transcription APIs, check out this guide.

As always, making our features developer-friendly is our top priority, so our client SDKs provide a simple API for consuming real-time audio transcriptions. Here’s an example of how to use it in our web core SDK.

'transcript', (transcriptionData) => {

The transcriptionData object consists of the following information:

  • An ID to uniquely identify the transcript
  • The name of the speaker
  • The ID of the speaker
  • The transcribed speech
  • A timestamp of when the speaker had spoken

The transcriptionData object can be described by the following interface.

export interface Transcript {
  id: string;
  name: string;
  peerId: string;
  transcript: string;
  date: Date;
}

Transcripts are emitted only when transcription is enabled in the preset of the participant who is speaking. To learn more about how to enable this feature for a participant, check out the transcription guide.
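As a rough sketch of the listener pattern described above, the following TypeScript uses a small stand-in emitter rather than the real SDK. `TranscriptEmitter`, `renderLine`, and the sample values are hypothetical names introduced for illustration; only the 'transcript' event name and the Transcript fields come from this post.

```typescript
// Shape of a transcript, matching the fields listed in this post.
interface Transcript {
  id: string;
  name: string;
  peerId: string;
  transcript: string;
  date: Date;
}

type TranscriptListener = (transcriptionData: Transcript) => void;

// Hypothetical stand-in for the SDK object that emits transcripts.
// It is NOT part of the Dyte SDK; it only mimics an on/emit style API.
class TranscriptEmitter {
  private listeners: TranscriptListener[] = [];

  on(_event: 'transcript', listener: TranscriptListener): void {
    this.listeners.push(listener);
  }

  emit(_event: 'transcript', data: Transcript): void {
    for (const listener of this.listeners) {
      listener(data);
    }
  }
}

// Format one transcript as a caption line, e.g. for a live captions panel.
function renderLine(t: Transcript): string {
  return `[${t.date.toISOString()}] ${t.name}: ${t.transcript}`;
}

const emitter = new TranscriptEmitter();
const captions: string[] = [];

// Subscribe to transcripts and keep a running list of caption lines.
emitter.on('transcript', (transcriptionData) => {
  captions.push(renderLine(transcriptionData));
});

// Simulate one incoming transcript (all sample values are made up).
emitter.emit('transcript', {
  id: 't-1',
  name: 'Alice',
  peerId: 'peer-1',
  transcript: 'Hello everyone',
  date: new Date('2024-01-01T10:00:00.000Z'),
});
```

In a real integration, the stand-in emitter would be replaced by the object the web core SDK exposes for transcripts, and the listener body would update your UI instead of pushing to an array.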

Key Features

Dyte's transcriptions offer a robust suite of features, listed below.

  • Real-Time Accuracy: Our AI engine provides instant and accurate audio transcriptions, ensuring you stay in sync with the conversation.
  • Speaker Identification: Easily identify speakers with our speaker attribution feature, making it clearer who said what.
  • Searchable Transcripts: Search through the transcript to quickly locate specific points in the conversation, streamlining post-meeting analysis.
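Once transcripts have been collected on the client, speaker attribution and search could be approximated along the following lines. This is one possible approach under stated assumptions, not Dyte's implementation: `searchTranscripts` and `groupBySpeaker` are hypothetical helpers, and the sample data is invented.

```typescript
// Shape of a transcript, matching the fields listed in this post.
interface Transcript {
  id: string;
  name: string;
  peerId: string;
  transcript: string;
  date: Date;
}

// Hypothetical helper: case-insensitive substring search over the spoken text.
function searchTranscripts(transcripts: Transcript[], query: string): Transcript[] {
  const needle = query.trim().toLowerCase();
  if (needle === '') {
    return [];
  }
  return transcripts.filter((t) => t.transcript.toLowerCase().indexOf(needle) !== -1);
}

// Hypothetical helper: group spoken lines by speaker ID to show who said what.
function groupBySpeaker(transcripts: Transcript[]): Record<string, string[]> {
  const bySpeaker: Record<string, string[]> = {};
  for (const t of transcripts) {
    if (!bySpeaker[t.peerId]) {
      bySpeaker[t.peerId] = [];
    }
    bySpeaker[t.peerId].push(t.transcript);
  }
  return bySpeaker;
}

// Invented sample data for illustration only.
const sample: Transcript[] = [
  { id: 't-1', name: 'Alice', peerId: 'peer-1', transcript: "Let's review the budget first.", date: new Date('2024-01-01T10:00:00.000Z') },
  { id: 't-2', name: 'Bob', peerId: 'peer-2', transcript: 'The Budget looks fine to me.', date: new Date('2024-01-01T10:00:05.000Z') },
  { id: 't-3', name: 'Alice', peerId: 'peer-1', transcript: 'Great, moving on.', date: new Date('2024-01-01T10:00:09.000Z') },
];

const hits = searchTranscripts(sample, 'budget');
const grouped = groupBySpeaker(sample);
```

Grouping by `peerId` rather than `name` avoids merging two participants who happen to share a display name.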


Our AI-driven audio transcription service is priced at $0.015 per minute, offering precise and rapid transcription for any volume of content. This straightforward rate ensures transparent billing for all your transcription needs.
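For a rough sense of what the per-minute rate means in practice, the sketch below estimates cost from a number of transcribed minutes. `estimateCostUsd` is a hypothetical helper, and how billable minutes are counted (for example, per meeting or per participant) is an assumption not stated in this post.

```typescript
// Rate from this post: $0.015 per minute, expressed in tenths of a cent
// so the multiplication stays in whole numbers.
const RATE_MILLI_USD_PER_MINUTE = 15;

// Hypothetical helper: estimated cost in USD for a number of transcribed minutes.
function estimateCostUsd(minutes: number): number {
  if (!isFinite(minutes) || minutes < 0) {
    throw new RangeError('minutes must be a non-negative, finite number');
  }
  return (minutes * RATE_MILLI_USD_PER_MINUTE) / 1000;
}

const oneHourMeeting = estimateCostUsd(60);
const thousandMinutes = estimateCostUsd(1000);
```

Under these assumptions, a 60-minute meeting comes to $0.90 and 1,000 transcribed minutes to $15.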

Feedback and Improvement

Your feedback is invaluable in refining and perfecting our AI-powered audio transcriptions. As this feature is in beta, we encourage you to share your thoughts, suggestions, and any issues you may encounter through our Discord community.

Want to get beta access? Get in touch
