A TypeScript client for interacting with the LLM canister on the Internet Computer.
## Installation

```bash
npm install icp-llm-client
```
## Usage

```typescript
import { LLMClient, Model } from "icp-llm-client";

// Create a client instance (connects to IC mainnet by default)
const client = new LLMClient();

// Simple prompt
const response = await client.prompt(
  Model.Llama3_1_8B,
  "What is the Internet Computer?"
);

// Chat with multiple messages
const chatResponse = await client.chat(Model.Llama3_1_8B, [
  {
    role: { system: null },
    content: "You are a helpful assistant",
  },
  {
    role: { user: null },
    content: "What is the Internet Computer?",
  },
]);
```
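Roles are passed as Candid-style variant objects (`{ system: null }`, `{ user: null }`, `{ assistant: null }`). If you build conversations in several places, a couple of small constructors can keep call sites tidy. The `ChatMessage` type and the `system`/`user` helpers below are illustrative assumptions, not exports of `icp-llm-client`:

```typescript
// Hypothetical helper types -- they mirror the variant-style role
// objects shown in the chat example above, but are not library exports.
type Role = { system: null } | { user: null } | { assistant: null };

interface ChatMessage {
  role: Role;
  content: string;
}

// Small constructors so call sites stay readable.
const system = (content: string): ChatMessage => ({ role: { system: null }, content });
const user = (content: string): ChatMessage => ({ role: { user: null }, content });

const messages: ChatMessage[] = [
  system("You are a helpful assistant"),
  user("What is the Internet Computer?"),
];
// Pass `messages` to client.chat(...) as in the example above.
```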
## Configuration

You can specify a custom host or identity:
```typescript
import { LLMClient } from "icp-llm-client";
import type { Identity } from "@dfinity/agent";

// Placeholder -- supply your own Identity (e.g., from @dfinity/identity)
declare const myIdentity: Identity;

// Create a client with a custom host (e.g., local development)
const localClient = new LLMClient({ host: "http://localhost:8000" });

// Create a client with a custom identity
const identityClient = new LLMClient({ identity: myIdentity });

// Create a client with both a custom host and identity
const customClient = new LLMClient({
  host: "https://custom-ic-host.com",
  identity: myIdentity,
});
```
## Limitations

- Currently only supports the Llama 3.1 8B model
- More models are planned based on community feedback
- Maximum of 10 messages per chat request
- Total prompt length across all messages cannot exceed 10 KiB (see the validation sketch below)
- Output is limited to 200 tokens (longer responses will be truncated)
- Prompts are not completely private:
  - AI Worker operators can theoretically see prompts
  - User identity remains anonymous
  - DFINITY only logs aggregate metrics (request counts, token usage)
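Because these limits are enforced by the canister, it can be worth failing fast on the client. The sketch below checks the message count and a byte-length approximation of total prompt size; exactly how the canister measures the 10 KiB limit is an assumption here, and `validateChatRequest` is not part of the library:

```typescript
const MAX_MESSAGES = 10; // documented chat-request limit
const MAX_PROMPT_BYTES = 10 * 1024; // documented 10 KiB total prompt limit

// Throws before the request leaves the client if a documented limit is exceeded.
function validateChatRequest(messages: { content: string }[]): void {
  if (messages.length > MAX_MESSAGES) {
    throw new Error(`Too many messages: ${messages.length} (max ${MAX_MESSAGES})`);
  }
  // Approximate the prompt size as the UTF-8 byte length of all contents.
  const encoder = new TextEncoder();
  const totalBytes = messages.reduce(
    (sum, m) => sum + encoder.encode(m.content).length,
    0
  );
  if (totalBytes > MAX_PROMPT_BYTES) {
    throw new Error(`Prompt too long: ${totalBytes} bytes (max ${MAX_PROMPT_BYTES})`);
  }
}
```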
## Best Practices

- **Handle Truncated Responses**

  ```typescript
  const response = await client.prompt(Model.Llama3_1_8B, "Your prompt");

  if (LLMClient.isResponseTruncated(response)) {
    console.log("Response was truncated. Consider breaking up your request.");
  }
  ```
- **Keep Prompts Concise**

  ```typescript
  // Good - concise prompt
  const response = await client.prompt(
    Model.Llama3_1_8B,
    "What is ICP? Be brief."
  );

  // Avoid - too long
  const longResponse = await client.prompt(
    Model.Llama3_1_8B,
    "Write a detailed essay about the Internet Computer..."
  );
  ```
- **Break Up Long Conversations**

  ```typescript
  // Instead of sending many messages at once
  const messages = [
    { role: { system: null }, content: "You are a helpful assistant" },
    { role: { user: null }, content: "Question 1" },
    { role: { assistant: null }, content: "Answer 1" },
    { role: { user: null }, content: "Question 2" },
    // ... more messages
  ];

  // Break into multiple requests
  const firstPart = await client.chat(Model.Llama3_1_8B, messages.slice(0, 5));
  const secondPart = await client.chat(Model.Llama3_1_8B, messages.slice(5));
  ```

  A more general splitting helper is sketched after this list.
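If a history regularly exceeds the 10-message limit, the slicing above can be generalized. This sketch (not a library helper) repeats the system message at the head of each batch so every request keeps its instructions; note that later batches still drop earlier turns, so some context is lost:

```typescript
// Hypothetical helper: split a long conversation into batches of at most
// `maxMessages` messages, prefixing each batch with the system message.
function chunkConversation<T>(system: T, turns: T[], maxMessages = 10): T[][] {
  const perBatch = maxMessages - 1; // reserve one slot for the system message
  const batches: T[][] = [];
  for (let i = 0; i < turns.length; i += perBatch) {
    batches.push([system, ...turns.slice(i, i + perBatch)]);
  }
  return batches;
}

// Each batch can then be sent with client.chat(...) in sequence.
```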
## Error Handling

```typescript
try {
  const response = await client.prompt(Model.Llama3_1_8B, "Your prompt");
  console.log(response);
} catch (error) {
  // In strict TypeScript, `error` is `unknown`; narrow it before reading `.message`.
  const message = error instanceof Error ? error.message : String(error);

  if (message.includes("10kiB")) {
    console.error("Prompt too long. Please reduce the length.");
  } else if (message.includes("10 messages")) {
    console.error("Too many messages. Please reduce the number of messages.");
  } else {
    console.error("An error occurred:", error);
  }
}
```
## License

MIT