llm.generateClonedAudio

Generate cloned audio using a unique voice ID and text

The llm.generateClonedAudio() method uses Text-To-Speech (TTS) technology to generate audio that resembles a specific voice. It takes a unique voice ID and the text to be spoken. The voice ID comes from the llm.cloneVoice() function, so the two are used together: first clone a voice with llm.cloneVoice(), then generate audio in that voice for any given text.


Demo Video

 

Check out the voice cloning process for the above voice here.

Syntax

const { audioUrlReturn } = await llm.generateClonedAudio(options: LLMGenerateClonedAudioOptions);

Parameters

  • options: LLMGenerateClonedAudioOptions: An object that contains the following properties:

    • voiceID: string: The unique voice ID generated by the llm.cloneVoice() function.

    • text: string: The text to be used for generating the cloned audio.

    • quality: string (optional): The quality of the generated audio. Possible values are "low", "medium", or "high". The default value is "medium".

    • output_format: string (optional): The output format of the generated audio. Possible values are "mp3", "wav", or "ogg".

    • speed: number (optional): The speed of the generated audio. The default value is 1, which represents the normal speed.

    • sample_rate: number (optional): The sample rate of the generated audio. The default value is 24000.
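As a rough illustration of how the properties above fit together, the sketch below builds an options object, checks the two required fields, and fills in the documented defaults explicitly. The ClonedAudioOptions type and the buildClonedAudioOptions helper are hypothetical names introduced for this example and are not part of usellm; the voice ID is a placeholder.

```typescript
// Hypothetical local types mirroring the parameters documented above.
type Quality = "low" | "medium" | "high";
type OutputFormat = "mp3" | "wav" | "ogg";

interface ClonedAudioOptions {
  voiceID: string;
  text: string;
  quality?: Quality;
  output_format?: OutputFormat;
  speed?: number;
  sample_rate?: number;
}

// Hypothetical helper: rejects empty required fields and makes the
// documented defaults explicit. output_format is left untouched because
// no default is documented for it.
function buildClonedAudioOptions(input: ClonedAudioOptions): ClonedAudioOptions {
  if (!input.voiceID.trim()) throw new Error("voiceID is required");
  if (!input.text.trim()) throw new Error("text is required");
  return {
    ...input,
    quality: input.quality ?? "medium",
    speed: input.speed ?? 1,
    sample_rate: input.sample_rate ?? 24000,
  };
}

const options = buildClonedAudioOptions({
  voiceID: "my-voice-id", // placeholder, use the ID returned by llm.cloneVoice()
  text: "Hello from my cloned voice.",
  output_format: "mp3",
  speed: 1.25,
});

// Inside a component, the result would be passed to the method:
// const { audioUrlReturn } = await llm.generateClonedAudio(options);
```

Validating before the call is a design choice for this sketch: it avoids sending a request that cannot succeed when the voice ID or text is empty.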

Returns

A Promise that resolves to an object:

  • { audioUrlReturn: string }: The generated cloned audio as base64-encoded data.
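The return value is described as base64-encoded data, and the example on this page passes it straight to the src attribute of an <audio> element. This page does not state whether the string already carries a data: prefix, so the sketch below assumes either form may occur and only adds a prefix when one is missing. The toAudioSrc helper and the MIME_TYPES table are hypothetical names for illustration, not part of usellm.

```typescript
type OutputFormat = "mp3" | "wav" | "ogg";

// Standard MIME types for the three documented output formats.
const MIME_TYPES: Record<OutputFormat, string> = {
  mp3: "audio/mpeg",
  wav: "audio/wav",
  ogg: "audio/ogg",
};

// Hypothetical helper: returns a string usable as <audio src>.
// Assumption: the value is either already a URL (data:, blob:, http(s):)
// or raw base64 that needs a data-URL prefix.
function toAudioSrc(audio: string, format: OutputFormat): string {
  if (/^(data:|blob:|https?:)/.test(audio)) return audio;
  return `data:${MIME_TYPES[format]};base64,${audio}`;
}

// Raw base64 gets a prefix; an existing data URL is returned unchanged.
const fromRaw = toAudioSrc("SUQzBAAAAAAA", "mp3");
const fromUrl = toAudioSrc("data:audio/wav;base64,UklGRg==", "wav");
```

If the service always returns one form, the helper is unnecessary and the value can be used directly, as in the example on this page.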

Example

Here's an example of how to use llm.generateClonedAudio() in a React component (live demo):

"use client";
 
import useLLM from "usellm";
import { useState } from "react";
 
export default function CloneVoice() {
  const [status, setStatus] = useState<Status>("idle");
  const [text, setText] = useState<string>("");
  const [voiceID, setVoiceID] = useState<string>("");
  const [audioUrlReturn, setAudioUrlReturn] = useState<string>("");
  const llm = useLLM({
    serviceUrl: "https://usellm.org/api/llm", // For testing only. Follow this guide to create your own service URL: https://usellm.org/docs/api-reference/create-llm-service
  });
 
 
  async function handleSubmit(){
    if(status==="idle"){
      setStatus("Generating Audio");
      const {audioUrlReturn} = await llm.generateClonedAudio({
        voiceID,
        text,
      })
      setAudioUrlReturn(audioUrlReturn);
      setStatus("idle");
    }
  }
 
  return (
    <div className="p-4 items-start overflow-y-auto">
      <h2 className="font-semibold text-2xl">AI Voice Cloning</h2>
    
      {status !== "idle" && (
        <div className="mt-4 text-lg">{capitalize(status)}...</div>
      )}
 
      <input
        className="p-2 border rounded w-full block mt-4 dark:bg-gray-900 dark:text-white"
        placeholder="Enter your voice ID"
        value={voiceID}
        onChange={(e) => setVoiceID(e.target.value)}
      />
      <textarea
        className="p-2 border rounded w-full block mt-4 dark:bg-gray-900 dark:text-white"
        placeholder="Enter some text here"
        rows={5}
        value={text}
        onChange={(e) => setText(e.target.value)}
      />
      
      <button
        type="submit"
        onClick={handleSubmit}
        className="p-2 border rounded bg-gray-100 hover:bg-gray-200 active:bg-gray-300 dark:bg-white dark:text-black font-medium mt-4"
      >
        Generate Voice
      </button>
      {audioUrlReturn && <audio autoPlay className="mt-4" controls src={audioUrlReturn} />}
    </div>
  );
}
 
function capitalize(word: string) {
  return word.charAt(0).toUpperCase() + word.substring(1);
}
 
type Status =
  | "idle"
  | "recording"
  | "transcribing"
  | "understanding"
  | "thinking"
  | "cloning"
  | "clonedAudio"
  | "Generating Audio"
  | "speaking";

The example above implements a cloned-voice generation feature in a React component. The user enters a voice ID and some text. Clicking the "Generate Voice" button calls the handleSubmit function. If the status is "idle", handleSubmit sets it to "Generating Audio" and calls llm.generateClonedAudio() on the object returned by the useLLM hook, passing the entered voiceID and text. When the call resolves, the component stores the result in the audioUrlReturn state and resets the status to "idle". Whenever audioUrlReturn is non-empty, the component renders an <audio> element that plays the generated audio automatically.
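The handleSubmit function in the example has no error handling: if the call rejects, the status stays at "Generating Audio" and later clicks are ignored. One possible way to guard against that is sketched below with try/finally. The runGeneration helper and the GenerateFn type are hypothetical names for this illustration; the generate argument stands in for llm.generateClonedAudio so the logic can be shown without a React component.

```typescript
type Status = "idle" | "Generating Audio";

// Shape of the call used in the example; in a component this would be
// (opts) => llm.generateClonedAudio(opts).
type GenerateFn = (opts: {
  voiceID: string;
  text: string;
}) => Promise<{ audioUrlReturn: string }>;

// Hypothetical helper: runs one generation, always resets the status,
// and returns null instead of throwing when the request fails.
async function runGeneration(
  generate: GenerateFn,
  opts: { voiceID: string; text: string },
  setStatus: (status: Status) => void,
): Promise<string | null> {
  setStatus("Generating Audio");
  try {
    const { audioUrlReturn } = await generate(opts);
    return audioUrlReturn;
  } catch (err) {
    console.error("Failed to generate cloned audio:", err);
    return null;
  } finally {
    // Runs on both success and failure, so the UI never stays stuck.
    setStatus("idle");
  }
}
```

In the component, handleSubmit could call this helper with the state setter and only call setAudioUrlReturn when the result is not null.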