This is an unofficial Node.js SDK for LangExtract. It is not affiliated with or endorsed by Google. For the official Python library, see the official GitHub repository.
Install the LangExtract Node.js SDK with npm, yarn, or pnpm:
# npm
npm install langextract
# yarn
yarn add langextract
# pnpm
pnpm add langextract
Get started with LangExtract in Node.js with this simple example:
Extract information from text using TypeScript:
import { extract, ExampleData } from "langextract";

// One few-shot example showing the model what to extract.
const examples: ExampleData[] = [
  {
    text: "John Smith is 30 years old and works at Google.",
    extractions: [
      {
        extractionClass: "person",
        extractionText: "John Smith",
        attributes: {
          age: "30",
          employer: "Google"
        }
      }
    ]
  }
];

async function extractPersonInfo() {
  const result = await extract("Alice Johnson is 25 and works at Microsoft.", {
    promptDescription: "Extract person information including name, age, and employer",
    examples: examples,
    modelType: "gemini",
    apiKey: process.env.LANGEXTRACT_API_KEY
  });
  console.log(result.extractions);
}
Set your API key for cloud models such as Gemini:
# .env file
LANGEXTRACT_API_KEY=your-api-key-here
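Node.js does not read a .env file on its own; the variable usually has to be loaded first, for example with `node --env-file=.env` (Node 20.6 or later) or a loader such as dotenv. Whether this SDK loads the file itself is not stated here. The following is a minimal sketch of a fail-fast check before calling `extract`; `requireApiKey` is a hypothetical helper, not part of the SDK.

```typescript
// Hypothetical helper (not part of the SDK): read the API key from an
// environment-like object and fail early with a clear message if it is absent.
function requireApiKey(
  env: Record<string, string | undefined>,
  name: string = "LANGEXTRACT_API_KEY"
): string {
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`Missing ${name}; set it in the environment or in a .env file.`);
  }
  return value;
}

// Usage (in a Node.js script): const apiKey = requireApiKey(process.env);
```

Failing at startup tends to be easier to diagnose than an authentication error returned by the provider halfway through a run.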
The SDK supports multiple LLM providers:
The main method for extracting information from text
async function extract(
  text: string,
  options: {
    promptDescription: string;
    examples: ExampleData[];
    modelType: "gemini" | "openai" | "ollama";
    apiKey?: string;
    modelId?: string;
  }
): Promise<ExtractionResult>
Generates an interactive HTML visualization of extraction results
function visualize(
  results: ExtractionResult[],
  options?: {
    theme?: "light" | "dark";
    highlightColors?: string[];
  }
): string
Processes multiple documents in parallel
async function batchExtract(
  documents: string[],
  options: ExtractOptions & {
    concurrency?: number;
    onProgress?: (completed: number, total: number) => void;
  }
): Promise<ExtractionResult[]>
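To make the `concurrency` and `onProgress` options concrete, here is a minimal sketch of a bounded-concurrency map. It illustrates the general technique only and is not the SDK's implementation; `mapWithConcurrency`, its default of 3, and the lane-based scheduling are assumptions made for illustration.

```typescript
// Illustrative only: run `worker` over `items` with at most `concurrency`
// calls in flight, reporting progress after each item completes.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  worker: (item: T, index: number) => Promise<R>,
  concurrency: number = 3,
  onProgress?: (completed: number, total: number) => void
): Promise<R[]> {
  const results: R[] = new Array<R>(items.length);
  let next = 0;      // index of the next item to claim
  let completed = 0; // number of items finished so far

  // Each lane repeatedly claims the next unprocessed index until none remain.
  async function runLane(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
      completed += 1;
      onProgress?.(completed, items.length);
    }
  }

  const laneCount = Math.max(1, Math.min(concurrency, items.length));
  await Promise.all(Array.from({ length: laneCount }, () => runLane()));
  return results; // same order as `items`, regardless of completion order
}
```

In this sketch a rejected worker rejects the whole batch; a production version would likely collect per-document errors instead.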
Extract entities from simple text
const result = await extract("The product costs $99.99", {
  promptDescription: "Extract product price",
  examples: [{
    text: "Price: $50",
    extractions: [{
      extractionClass: "price",
      extractionText: "$50"
    }]
  }],
  modelType: "gemini"
});
Use attributes and custom extraction classes
const examples = [{
  text: "Dr. Smith prescribed 500mg aspirin",
  extractions: [{
    extractionClass: "medication",
    extractionText: "aspirin",
    attributes: {
      dosage: "500mg",
      prescriber: "Dr. Smith"
    }
  }]
}];
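In every example on this page, `extractionText` is copied verbatim from `text`. Assuming that is the intended convention, a quick check can catch paraphrased or mistyped examples before they are sent to a model. The sketch below is a hypothetical helper, not part of the SDK, and its interfaces are inferred from the examples above rather than taken from the SDK's type declarations.

```typescript
// Shapes inferred from the examples on this page (assumption, not SDK types).
interface SketchExtraction {
  extractionClass: string;
  extractionText: string;
  attributes?: Record<string, string>;
}
interface SketchExample {
  text: string;
  extractions: SketchExtraction[];
}

// Hypothetical helper: list extractions whose text does not occur verbatim
// in the example text they belong to.
function findUngroundedExtractions(examples: SketchExample[]): string[] {
  const problems: string[] = [];
  examples.forEach((example, i) => {
    for (const extraction of example.extractions) {
      if (!example.text.includes(extraction.extractionText)) {
        problems.push(
          `example ${i}: "${extraction.extractionText}" (${extraction.extractionClass}) not found in text`
        );
      }
    }
  });
  return problems;
}
```

An empty return value means every extraction is grounded in its source text; attribute values are deliberately not checked here, since they may be normalized.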
Process multiple documents efficiently
const documents = [
  "Document 1 content...",
  "Document 2 content...",
  "Document 3 content..."
];

const results = await batchExtract(documents, {
  promptDescription: "Extract entities",
  examples: examples,
  modelType: "gemini",
  concurrency: 3,
  onProgress: (done, total) => {
    console.log(`Progress: ${done}/${total}`);
  }
});
Generate and save an interactive visualization
import { writeFileSync } from "fs";

const html = visualize(results, {
  theme: "light",
  highlightColors: ["#3B82F6", "#10B981", "#F59E0B"]
});

// Save the generated HTML to a file
writeFileSync("results.html", html);
Full TypeScript support with type definitions:
Proper error-handling patterns:
Handle API errors and validation failures
try {
  const result = await extract(text, options);
  console.log("Extraction successful:", result);
} catch (error) {
  // In strict TypeScript, `error` is `unknown`, so narrow it before reading fields.
  const err = error as { code?: string; retryAfter?: number; details?: unknown; message?: string };
  if (err.code === "INVALID_API_KEY") {
    console.error("Invalid API key provided");
  } else if (err.code === "RATE_LIMIT_EXCEEDED") {
    console.error("Rate limit exceeded, retry after:", err.retryAfter);
  } else if (err.code === "VALIDATION_ERROR") {
    console.error("Validation error:", err.details);
  } else {
    console.error("Unexpected error:", err.message);
  }
}
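Rate-limit errors are usually worth retrying rather than only logging. The sketch below wraps any async call and retries when the thrown error carries the `RATE_LIMIT_EXCEEDED` code shown above. `withRateLimitRetry` is a hypothetical helper, not part of the SDK, and it assumes `retryAfter` is expressed in seconds, which this page does not specify.

```typescript
// Error fields used on this page; the exact error type is an assumption.
interface CodedError {
  code?: string;
  retryAfter?: number;
}

// Hypothetical helper: retry `task` when it fails with RATE_LIMIT_EXCEEDED.
// `sleep` is injectable so the wait can be stubbed out in tests.
async function withRateLimitRetry<T>(
  task: () => Promise<T>,
  maxAttempts: number = 3,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms))
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await task();
    } catch (error) {
      const err = error as CodedError | null;
      const retryable = err?.code === "RATE_LIMIT_EXCEEDED";
      // Rethrow anything that is not a rate limit, or once attempts run out.
      if (!retryable || attempt >= maxAttempts) throw error;
      // Assumption: retryAfter is in seconds. Without it, back off
      // exponentially: 500 ms, 1000 ms, 2000 ms, ...
      const delayMs =
        typeof err?.retryAfter === "number"
          ? err.retryAfter * 1000
          : 500 * 2 ** (attempt - 1);
      await sleep(delayMs);
    }
  }
}

// Usage: const result = await withRateLimitRetry(() => extract(text, options));
```

Other error codes, such as `INVALID_API_KEY` or `VALIDATION_ERROR`, are rethrown immediately because retrying them would not change the outcome.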