upsertFromPDF

Inserts or updates vectors from a PDF File.

upsertFromPDF(
    pdf: File,
    options?: {
      splitPages?: boolean
      metadata?: Record<string, any>
      textSplitter?: SplitterParams
    }
): Promise<string[]>

Reference

import { myVectorStore, myAssets } from "#elements";

export default async function () {
  const pdfFile = myAssets["/babel.pdf"];
  // upsertFromPDF resolves to the IDs of the upserted vectors, not a count
  const ids = await myVectorStore.upsertFromPDF(pdfFile, {
    splitPages: true,
  });
  console.log(`${ids.length} vectors upserted`);
}

Parameters

pdf: The File representing the PDF to extract content from.
options: Optional configuration parameters, including:
- splitPages: (optional) Whether to split the PDF into a separate vector for each page. Defaults to true.
- metadata: (optional) The metadata to associate with the vectors.
- textSplitter: (optional) The text splitter used to divide the content into multiple vectors. If no splitter is provided, the token splitter is used by default.
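
As a rough sketch of how these options combine, the snippet below builds an options object that makes the default page splitting explicit and attaches custom metadata. The `UpsertFromPDFOptions` interface is a local mirror of the signature above, `buildPdfOptions` and the `category` metadata key are hypothetical names used only for illustration, and `textSplitter` is typed as `unknown` because the shape of `SplitterParams` is not described in this section.

```typescript
// Local mirror of the documented options shape (an assumption for illustration;
// in a real project the types would come from the vector store element).
interface UpsertFromPDFOptions {
  splitPages?: boolean;
  metadata?: Record<string, any>;
  textSplitter?: unknown; // SplitterParams is not specified in this section
}

// Hypothetical helper: copies caller metadata into an options object and
// makes the documented default (splitPages: true) explicit.
function buildPdfOptions(
  metadata: Record<string, any> = {},
  splitPages: boolean = true,
): UpsertFromPDFOptions {
  return { splitPages, metadata: { ...metadata } };
}

const options = buildPdfOptions({ category: "manual" });
console.log(options.splitPages); // true
```

The resulting object could then be passed as the second argument, for example `await myVectorStore.upsertFromPDF(pdfFile, options)`.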

Returns

A Promise that resolves to an array of the IDs of the upserted vectors.

Caveats

To query all vectors created from a PDF, filter on the metadata field source-by-babel, whose value is the file name.
Only one PDF file can be uploaded at a time, and it must not exceed 256 MB.
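
The upload limits above can be checked before calling upsertFromPDF. The following is a minimal sketch, not part of the API: `checkPdfSize`, `upsertSequentially`, and `SizedFile` are hypothetical names, and it assumes that "256 MB" means 256 * 1024 * 1024 bytes and that "one at a time" can be satisfied by awaiting each upload before starting the next.

```typescript
// Assumption: the 256 MB limit is read as 256 * 1024 * 1024 bytes; the
// service may count megabytes differently.
const MAX_PDF_BYTES = 256 * 1024 * 1024;

// Minimal stand-in for the parts of File this check needs.
interface SizedFile {
  name: string;
  size: number; // size in bytes, as on the standard File interface
}

// Hypothetical helper: returns an error message if the file is too large,
// or null if it is within the limit.
function checkPdfSize(pdf: SizedFile): string | null {
  if (pdf.size > MAX_PDF_BYTES) {
    return `${pdf.name} is ${pdf.size} bytes; the limit is ${MAX_PDF_BYTES} bytes`;
  }
  return null;
}

// Hypothetical helper: uploads files strictly one after another.
// `upsert` stands in for a call such as (f) => myVectorStore.upsertFromPDF(f).
async function upsertSequentially<F extends SizedFile>(
  pdfs: F[],
  upsert: (pdf: F) => Promise<string[]>,
): Promise<string[]> {
  const ids: string[] = [];
  for (const pdf of pdfs) {
    const problem = checkPdfSize(pdf);
    if (problem !== null) throw new Error(problem);
    ids.push(...(await upsert(pdf))); // wait for each upload before the next
  }
  return ids;
}
```

Running the size check on the client avoids sending a file that would be rejected, and collecting the IDs from each call keeps the return shape consistent with a single upsertFromPDF call.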
