Unstructured
This notebook covers how to use the Unstructured document loader to load files of many types. Unstructured currently supports loading text files, PowerPoint presentations, HTML, PDFs, images, and more.
Please see this guide for more instructions on setting up Unstructured locally, including setting up required system dependencies.
Overview
Integration details
| Class | Package | Local | Serializable | JS support |
|---|---|---|---|---|
| UnstructuredLoader | langchain_unstructured | ✅ | ❌ | ✅ |
Loader features
| Source | Document Lazy Loading | Native Async Support |
|---|---|---|
| UnstructuredLoader | ✅ | ❌ |
Setup
Credentials
By default, langchain-unstructured installs with a smaller footprint and offloads the partitioning logic to the Unstructured API, which requires an API key. If you use the local installation, you do not need an API key. To get an API key, head over to this site, then set the key in the cell below:
import getpass
import os

# Prompt for the key only if it is not already set in the environment.
if "UNSTRUCTURED_API_KEY" not in os.environ:
    os.environ["UNSTRUCTURED_API_KEY"] = getpass.getpass(
        "Enter your Unstructured API key: "
    )
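The choice between the hosted API and a local installation described above can be made explicit in code. The following is a minimal sketch, not part of the library: the helper name `unstructured_loader_kwargs` is hypothetical, and the keyword names `partition_via_api` and `api_key` are assumed to match the parameters accepted by UnstructuredLoader, so check them against the version you have installed.

```python
import os
from typing import Mapping, Optional


def unstructured_loader_kwargs(env: Optional[Mapping[str, str]] = None) -> dict:
    """Hypothetical helper: pick loader keyword arguments from the environment.

    The returned keys (`partition_via_api`, `api_key`) are assumptions about
    the loader's constructor, not values confirmed by this notebook.
    """
    env = os.environ if env is None else env
    api_key = env.get("UNSTRUCTURED_API_KEY", "").strip()
    if api_key:
        # Key present: delegate partitioning to the hosted Unstructured API.
        return {"partition_via_api": True, "api_key": api_key}
    # No key: rely on a local Unstructured installation instead.
    return {"partition_via_api": False}


if __name__ == "__main__":
    # Passing an explicit mapping avoids touching the real environment.
    print(unstructured_loader_kwargs({"UNSTRUCTURED_API_KEY": "example-key"}))
    print(unstructured_loader_kwargs({}))
```

Under those assumptions, the resulting dictionary could be unpacked into the loader's constructor alongside a file path. Keeping the decision in one small function means the same notebook can run with or without an API key.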