llm.callReplicate

Calls the Replicate API to generate predictions using pre-trained models.

llm.callReplicate() is a method in usellm that interacts with the Replicate API to generate predictions from pre-trained models. It makes a POST request to the Replicate API, passing version (the model ID) and input as parameters, waits for the model to produce a prediction, checks its status, and retrieves the result. This lets users easily generate predictions from the pre-trained models hosted on the Replicate platform.

Replicate is a platform that provides a straightforward way for developers to train, version, and deploy machine learning models in any programming language. Moreover, you can use its open-source pre-trained models off the shelf without needing to understand how machine learning works.
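The request-then-poll flow described above can be sketched as a small helper. This is illustrative only — the `pollUntilDone` function and `Prediction` type below are not part of usellm's API:

```typescript
// A minimal sketch of the create-then-poll pattern described above.
// `pollUntilDone` and `Prediction` are illustrative names, not usellm APIs.
type Prediction = { status: string; output?: unknown };

async function pollUntilDone(
  getPrediction: () => Promise<Prediction>,
  timeoutMs: number,
  intervalMs = 250
): Promise<Prediction | string> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const prediction = await getPrediction();
    // Replicate reports terminal states via the `status` field.
    if (prediction.status === "succeeded" || prediction.status === "failed") {
      return prediction;
    }
    // Wait briefly before checking the status again.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  // Mirrors the timeout message callReplicate returns (see Returns below).
  return "Training Not Completed! Please increase the value of timeout and try again.";
}
```

If the prediction does not reach a terminal status before the deadline, the helper gives up and returns the timeout string instead of a prediction object.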

Syntax

const response = await llm.callReplicate({
  version,
  input,
  timeout,
});

Parameters

  • options - An object containing the options for the callReplicate function. It includes the following properties:
    • input (required) - An object representing the input to the prediction model.
    • version (required) - A string identifying the version of the model to run.
    • timeout (optional; default: 10000 milliseconds) - The number of milliseconds to wait for the prediction to complete after the initial POST request.

Returns

The llm.callReplicate() function returns a Promise that resolves to an object when a successful response is received from the Replicate API. The resolved object includes the following properties:

  • id - The unique identifier of the prediction.
  • urls - The URLs related to the prediction.
  • status - The status of the prediction. A status of "succeeded" indicates a successful prediction.
  • output - The output generated by the prediction model.
  • metrics - Any metrics related to the prediction.

If the prediction does not complete within the given timeout, the function returns the string "Training Not Completed! Please increase the value of timeout and try again.".

Here's an example of a successful response:

{
  id: "60d7fe3034fd6b0034c72bfe",
  urls: {
    view: "https://replicate.com/myuser/myproject/predict/60d7fe3034fd6b0034c72bfe",
    cancel: "https://replicate.com/myuser/myproject/predict/60d7fe3034fd6b0034c72bfe/cancel"
  },
  status: "succeeded",
  output: "This is the predicted text...",
  metrics: {...}
}

If there is an error during the API call, an error message will be thrown.
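Because the function can resolve to either the timeout string or a prediction object, callers should check the result's type before using it. A sketch of such a check — the `extractOutput` helper and `ReplicateResult` type are illustrative, not part of usellm:

```typescript
// Sketch: handle the three possible outcomes of llm.callReplicate() —
// a timeout string, a non-successful status, or a successful prediction.
// `extractOutput` and `ReplicateResult` are illustrative names.
type ReplicateResult =
  | string
  | { id: string; status: string; output?: unknown };

function extractOutput(result: ReplicateResult): unknown {
  if (typeof result === "string") {
    // The timeout message documented above.
    throw new Error(result);
  }
  if (result.status !== "succeeded") {
    throw new Error(`Prediction ended with status: ${result.status}`);
  }
  return result.output;
}
```

In application code, you would wrap the call in try/catch, pass the resolved response to a check like this, and surface the error message to the user when it throws.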

Note

For larger models or complex inputs, generating predictions can take longer. Make sure to set the timeout parameter according to your needs to avoid premature termination.

Requirements

You can use the usellm "https://usellm.org/api/llm" service URL for testing, but if you're working on a personal project you'll need your own Replicate API token. Go to the Replicate API tokens page, create a token, and paste it into your .env.local file (see .env.example). You'll also have to pass the API key saved in .env.local when defining your service with createLLMService.
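For a personal project, the service definition might look like the sketch below. Note that the option name `replicateApiKey` and the environment variable name are assumptions for illustration — consult the createLLMService reference for the actual signature:

```typescript
// Sketch of a service definition that passes a Replicate key from .env.local.
// ASSUMPTION: the `replicateApiKey` option name and REPLICATE_API_TOKEN
// variable are illustrative — check the createLLMService docs.
import { createLLMService } from "usellm";

export const llmService = createLLMService({
  replicateApiKey: process.env.REPLICATE_API_TOKEN, // from .env.local
});
```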

Example

Below is an example of how to use the llm.callReplicate() function. It demonstrates how to generate a prediction from a Replicate model (live demo):

import { useState } from "react";
import useLLM from "usellm";
 
export default function DemoReplicateModel() {
  const llm = useLLM({
    serviceUrl: "https://usellm.org/api/llm", // For testing only. Follow this guide to create your own service URL: https://usellm.org/docs/api-reference/create-llm-service
  });
 
  const [input, setInput] = useState("");
  const [result, setResult] = useState("");
  const [version, setVersion] = useState(
    "c49dae362cbaecd2ceabb5bd34fdb68413c4ff775111fea065d259d577757beb"
  );
  const [timeout, setTimeout] = useState("10000");
 
  async function handleClick() {
    setResult("");
    const response = await llm.callReplicate({
      version: version,
      input: { prompt: input },
      timeout: parseInt(timeout),
    });
    console.log(response);
    setResult(response.output);
  }
 
  return (
    <div className="p-4 overflow-y-scroll">
      <h2 className="text-2xl font-semibold mb-4">
        Replicate Stable LM Model Demo
      </h2>
      <input
        className="p-2 border rounded mr-2 w-full mb-4 block dark:bg-gray-900 dark:text-white"
        type="text"
        placeholder="Enter Model Version"
        value={version}
        onChange={(e) => setVersion(e.target.value)}
      />
      <input
        className="p-2 border rounded mr-2 w-full mb-4 block dark:bg-gray-900 dark:text-white"
        type="text"
        placeholder="Enter Timeout in Milliseconds"
        value={timeout}
        onChange={(e) => setTimeout(e.target.value)}
      />
      <input
        className="p-2 border rounded mr-2 w-full mb-4 block dark:bg-gray-900 dark:text-white"
        type="text"
        placeholder="Enter Prompt"
        value={input}
        onChange={(e) => setInput(e.target.value)}
      />
 
      <button
        className="p-2 border rounded bg-gray-100 hover:bg-gray-200 active:bg-gray-300 dark:bg-white dark:text-black font-medium"
        onClick={handleClick}
      >
        Generate
      </button>
      <div className="whitespace-pre-wrap my-4">
        {result}
      </div>
    </div>
  );
}

In this example, llm.callReplicate() is used to generate a prediction from the given model based on the user's input. We allow the model up to 10000 milliseconds to complete the prediction before fetching the result, which is then displayed to the user.