curl or Postman.
Unstructured-IO/unstructured-ingest repository in GitHub.
pip install "unstructured-client>=0.30.6"
If you already have the Unstructured Python SDK installed, upgrade to at least version 0.30.6 by running the following command instead:
pip install --upgrade "unstructured-client>=0.30.6"
The Unstructured Python SDK code examples, shown later on this page and on related pages, use the following environment variable, which you can set as follows:
export UNSTRUCTURED_API_KEY="<your-unstructured-api-key>"
This environment variable makes it easier to run the following Unstructured Python SDK examples and helps prevent you from storing scripts that contain sensitive API keys in public source code repositories. To get your Unstructured API key, do the following:
unstructured_client functions for creating, listing, updating, and deleting connectors, workflows, and jobs in the Unstructured UI all use the Unstructured Workflow Endpoint URL. This URL was provided to you when your Unstructured account was created. If you do not have this URL, contact Unstructured Sales at sales@unstructured.io.

To specify an API URL in your code, set the server_url parameter in the UnstructuredClient constructor to the target API URL.

The Unstructured Workflow Endpoint enables you to work with connectors, workflows, and jobs in the Unstructured UI.
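As a rough illustration of setting server_url in the UnstructuredClient constructor, the following Python sketch collects the constructor arguments from environment variables. It rests on two assumptions that this page does not state: that the constructor accepts the API key through a keyword argument named api_key_auth, and that you choose to keep the Workflow Endpoint URL in an UNSTRUCTURED_API_URL environment variable for SDK use. The helper names client_kwargs and make_client are hypothetical.

```python
import os

try:
    # Installed with: pip install "unstructured-client>=0.30.6"
    from unstructured_client import UnstructuredClient
except ImportError:
    # Keeps this sketch importable on machines where the SDK is not installed.
    UnstructuredClient = None


def client_kwargs(env=os.environ):
    """Collect keyword arguments for the UnstructuredClient constructor."""
    # Assumption: the constructor takes the API key as api_key_auth.
    kwargs = {"api_key_auth": env["UNSTRUCTURED_API_KEY"]}
    # Assumption: the Workflow Endpoint URL is kept in UNSTRUCTURED_API_URL.
    # When it is unset, server_url is omitted and the SDK default applies.
    server_url = env.get("UNSTRUCTURED_API_URL")
    if server_url:
        kwargs["server_url"] = server_url
    return kwargs


def make_client(env=os.environ):
    """Create a client that targets the configured API URL."""
    if UnstructuredClient is None:
        raise RuntimeError("unstructured-client is not installed")
    return UnstructuredClient(**client_kwargs(env))
```

Keeping both values in the environment means the script itself contains neither the URL nor the key.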
curl and Postman. You can adapt this information as needed for your preferred programming languages and libraries, for example by using the requests library with Python.

curl and Postman

The following curl examples use the following environment variables, which you can set as follows:
export UNSTRUCTURED_API_URL="https://platform.unstructuredapp.io/api/v1"
export UNSTRUCTURED_API_KEY="<your-unstructured-api-key>"
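If you are adapting the REST calls to Python, one possible starting point is the sketch below. It uses only the Python standard library (urllib rather than the requests library) so that it runs without extra installs. It reads the two environment variables above and sends the key in the unstructured-api-key header. The helper names build_request and send are hypothetical, and the JSON Content-Type header on request bodies is an assumption.

```python
import json
import os
import urllib.parse
import urllib.request


def build_request(method, path, params=None, body=None):
    """Build (but do not send) a request to the Unstructured Workflow Endpoint."""
    base_url = os.environ["UNSTRUCTURED_API_URL"].rstrip("/")
    url = base_url + path
    if params:
        # Query parameters such as source_type=s3 are appended after "?".
        url += "?" + urllib.parse.urlencode(params)
    headers = {"unstructured-api-key": os.environ["UNSTRUCTURED_API_KEY"]}
    data = None
    if body is not None:
        # Assumption: request bodies are sent as JSON.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def send(request):
    """Send the request (a real network call) and decode any JSON response."""
    with urllib.request.urlopen(request) as response:
        payload = response.read()
    return json.loads(payload.decode("utf-8")) if payload else None
```

For example, build_request("GET", "/sources", params={"source_type": "s3"}) prepares a call that lists only Amazon S3 source connectors, and passing the result to send performs it.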
The API URL was provided to you when your Unstructured account was created. If you do not have this URL, contact Unstructured Sales at sales@unstructured.io.

These environment variables make it easier to run the following curl examples and help prevent you from storing scripts that contain sensitive URLs and API keys in public source code repositories. To get your Unstructured API key, do the following:
UNSTRUCTURED_API_URL
default
UNSTRUCTURED_API_KEY
secret
<your-unstructured-api-key>
<your-unstructured-api-key>
Enter
:
https://raw.githubusercontent.com/Unstructured-IO/docs/main/examplecode/codesamples/api/Unstructured-REST-API-Workflow-Endpoint.postman_collection.json
unstructured-api-key, enter your Unstructured API key in the Value column. As applicable, add, remove, or modify any other required headers for the request.
https://api.unstructuredapp.io/general/v0/general (the default Unstructured Partition Endpoint URL).

Connectors

You can list, get, create, update, delete, and test source connectors. You can also list, get, create, update, delete, and test destination connectors. For general information, see Connectors.

List source connectors

To list source connectors, use the UnstructuredClient
object’s sources.list_sources function (for the Python SDK) or the GET method to call the /sources endpoint (for curl or Postman).

To filter the list of source connectors, use the ListSourcesRequest object’s source_type parameter (for the Python SDK) or the query parameter source_type=<type> (for curl or Postman), replacing <type> with the source connector type’s unique ID (for example, for the Amazon S3 source connector type, S3 for the Python SDK or s3 for curl or Postman). To get this ID, see Sources.

Get a source connector

To get information about a source connector, use the UnstructuredClient
object’s sources.get_source function (for the Python SDK) or the GET method to call the /sources/<connector-id> endpoint (for curl or Postman), replacing <connector-id> with the source connector’s unique ID. To get this ID, see List source connectors.

Create a source connector

To create a source connector, use the UnstructuredClient
object’s sources.create_source function (for the Python SDK) or the POST method to call the /sources endpoint (for curl or Postman).

In the CreateSourceConnector object (for the Python SDK) or the request body (for curl or Postman), specify the settings for the connector. For the specific settings to include, which differ by connector, see Sources.

For the Python SDK, replace <type> with the source connector type’s unique ID (for example, for the Amazon S3 source connector type, S3). To get this ID, see Sources.

Update a source connector

To update information about a source connector, use the UnstructuredClient
object’s sources.update_source function (for the Python SDK) or the PUT method to call the /sources/<connector-id> endpoint (for curl or Postman), replacing <connector-id> with the source connector’s unique ID. To get this ID, see List source connectors.

In the UpdateSourceConnector object (for the Python SDK) or the request body (for curl or Postman), specify the settings for the connector. For the specific settings to include, which differ by connector, see Sources.

For the Python SDK, replace <type> with the source connector type’s unique ID (for example, for the Amazon S3 source connector type, S3). To get this ID, see Sources.

You must specify all of the settings for the connector, even for settings that are not changing. You can change any of the connector’s settings except for its name and type.

Delete a source connector

To delete a source connector, use the UnstructuredClient
object’s sources.delete_source function (for the Python SDK) or the DELETE method to call the /sources/<connector-id> endpoint (for curl or Postman), replacing <connector-id> with the source connector’s unique ID. To get this ID, see List source connectors.

Test a source connector

To test a source connector, use the POST
method to call the /sources/<connector-id>/connection-check endpoint (for curl or Postman), replacing <connector-id> with the connector’s unique ID. To get this ID, see List source connectors. The Python SDK does not support testing source connectors.

To get information about the most recent connector check for a source connector, use the GET method to call the /sources/<connector-id>/connection-check endpoint (for curl or Postman), replacing <connector-id> with the connector’s unique ID. To get this ID, see List source connectors. The Python SDK does not support getting information about the most recent connector check for a source connector.

List destination connectors

To list destination connectors, use the UnstructuredClient
object’s destinations.list_destinations function (for the Python SDK) or the GET method to call the /destinations endpoint (for curl or Postman).

To filter the list of destination connectors, use the ListDestinationsRequest object’s destination_type parameter (for the Python SDK) or the query parameter destination_type=<type> (for curl or Postman), replacing <type> with the destination connector type’s unique ID (for example, for the Amazon S3 destination connector type, S3 for the Python SDK or s3 for curl or Postman). To get this ID, see Destinations.

Get a destination connector

To get information about a destination connector, use the UnstructuredClient
object’s destinations.get_destination function (for the Python SDK) or the GET method to call the /destinations/<connector-id> endpoint (for curl or Postman), replacing <connector-id> with the destination connector’s unique ID. To get this ID, see List destination connectors.

Create a destination connector

To create a destination connector, use the UnstructuredClient
object’s destinations.create_destination function (for the Python SDK) or the POST method to call the /destinations endpoint (for curl or Postman).

In the CreateDestinationConnector object (for the Python SDK) or the request body (for curl or Postman), specify the settings for the connector. For the specific settings to include, which differ by connector, see Destinations.

For the Python SDK, replace <type> with the destination connector type’s unique ID (for example, for the Amazon S3 destination connector type, S3). To get this ID, see Destinations.

Update a destination connector

To update information about a destination connector, use the UnstructuredClient
object’s destinations.update_destination function (for the Python SDK) or the PUT method to call the /destinations/<connector-id> endpoint (for curl or Postman), replacing <connector-id> with the destination connector’s unique ID. To get this ID, see List destination connectors.

In the UpdateDestinationConnector object (for the Python SDK) or the request body (for curl or Postman), specify the settings for the connector. For the specific settings to include, which differ by connector, see Destinations.

You must specify all of the settings for the connector, even for settings that are not changing. You can change any of the connector’s settings except for its name and type.

Delete a destination connector

To delete a destination connector, use the UnstructuredClient
object’s destinations.delete_destination function (for the Python SDK) or the DELETE method to call the /destinations/<connector-id> endpoint (for curl or Postman), replacing <connector-id> with the destination connector’s unique ID. To get this ID, see List destination connectors.

Test a destination connector

To test a destination connector, use the POST
method to call the /destinations/<connector-id>/connection-check endpoint (for curl or Postman), replacing <connector-id> with the connector’s unique ID. To get this ID, see List destination connectors. The Python SDK does not support testing destination connectors.

To get information about the most recent connector check for a destination connector, use the GET method to call the /destinations/<connector-id>/connection-check endpoint (for curl or Postman), replacing <connector-id> with the connector’s unique ID. To get this ID, see List destination connectors. The Python SDK does not support getting information about the most recent connector check for a destination connector.

Workflows

You can list, get, create, run, update, and delete workflows. For general information, see Workflows.

List workflows

To list workflows, use the UnstructuredClient
object’s workflows.list_workflows function (for the Python SDK) or the GET method to call the /workflows endpoint (for curl or Postman).

To filter the list of workflows, use one or more of the following ListWorkflowsRequest parameters (for the Python SDK) or query parameters (for curl or Postman):

- source_id=<connector-id>, replacing <connector-id> with the source connector’s unique ID. To get this ID, see List source connectors.
- destination_id=<connector-id>, replacing <connector-id> with the destination connector’s unique ID. To get this ID, see List destination connectors.
- status=WorkflowState.<status> (for the Python SDK) or status=<status> (for curl or Postman), replacing <status> with one of the following workflow statuses: ACTIVE or INACTIVE (for the Python SDK) or active or inactive (for curl or Postman).

For curl or Postman, you can specify multiple query parameters as ?source_id=<connector-id>&status=<status>.

Get a workflow

To get information about a workflow, use the UnstructuredClient
object’s workflows.get_workflow function (for the Python SDK) or the GET method to call the /workflows/<workflow-id> endpoint (for curl or Postman), replacing <workflow-id> with the workflow’s unique ID. To get this ID, see List workflows.

Create a workflow

To create a workflow, use the UnstructuredClient
object’s workflows.create_workflow function (for the Python SDK) or the POST method to call the /workflows endpoint (for curl or Postman).

In the CreateWorkflow object (for the Python SDK) or the request body (for curl or Postman), specify the settings for the workflow. For the specific settings to include, see Create a workflow.

Run a workflow

To run a workflow manually, use the UnstructuredClient
object’s workflows.run_workflow function (for the Python SDK) or the POST method to call the /workflows/<workflow-id>/run endpoint (for curl or Postman), replacing <workflow-id> with the workflow’s unique ID. To get this ID, see List workflows.

To run a workflow on a schedule instead, specify the schedule setting in the request body when you create or update a workflow. See Create a workflow or Update a workflow.

Update a workflow

To update information about a workflow, use the UnstructuredClient
object’s workflows.update_workflow function (for the Python SDK) or the PUT method to call the /workflows/<workflow-id> endpoint (for curl or Postman), replacing <workflow-id> with the workflow’s unique ID. To get this ID, see List workflows.

In the UpdateWorkflow object (for the Python SDK) or the request body (for curl or Postman), specify the settings for the workflow. For the specific settings to include, see Update a workflow.

Delete a workflow

To delete a workflow, use the UnstructuredClient
object’s workflows.delete_workflow function (for the Python SDK) or the DELETE method to call the /workflows/<workflow-id> endpoint (for curl or Postman), replacing <workflow-id> with the workflow’s unique ID. To get this ID, see List workflows.

Jobs

You can list, get, and cancel jobs. A job is created automatically whenever a workflow runs on a schedule; see Create a workflow. A job is also created whenever you run a workflow; see Run a workflow. For general information, see Jobs.

List jobs

To list jobs, use the UnstructuredClient
object’s jobs.list_jobs function (for the Python SDK) or the GET method to call the /jobs endpoint (for curl or Postman).

To filter the list of jobs, use one or both of the following ListJobsRequest parameters (for the Python SDK) or query parameters (for curl or Postman):

- workflow_id=<workflow-id>, replacing <workflow-id> with the workflow’s unique ID. To get this ID, see List workflows.
- status=<status>, replacing <status> with one of the following job statuses: completed, failed, in progress, scheduled, and stopped.

For curl or Postman, you can specify multiple query parameters as ?workflow_id=<workflow-id>&status=<status>.

Get a job

To get basic information about a job, use the UnstructuredClient
object’s jobs.get_job function (for the Python SDK) or the GET method to call the /jobs/<job-id> endpoint (for curl or Postman), replacing <job-id> with the job’s unique ID. To get this ID, see List jobs.

This function/endpoint returns basic information about the job, such as:

UnstructuredClient object’s jobs.get_job_details function (for the Python SDK) or the GET method to call the /jobs/<job-id>/details endpoint (for curl or Postman), replacing <job-id> with the job’s unique ID. To get this ID, see List jobs. To get basic information about a job, see Get a job.

Get failed file details for a job

To get the list of any failed files for a job and why those files failed, use the UnstructuredClient
object’s jobs.get_job_failed_files function (for the Python SDK) or the GET method to call the /jobs/<job-id>/failed-files endpoint (for curl or Postman), replacing <job-id> with the job’s unique ID. To get this ID, see List jobs.

Cancel a job

To cancel a running job, use the UnstructuredClient
object’s jobs.cancel_job function (for the Python SDK) or the POST method to call the /jobs/<job-id>/cancel endpoint (for curl or Postman), replacing <job-id> with the job’s unique ID. To get this ID, see List jobs.

Download a processed local file from a job

This applies only to jobs that use a workflow with a local source and a local destination.

To download a processed local file from a completed job, use GET
to call the /jobs/<job-id>/download endpoint, replacing <job-id> with the job’s unique ID. To get this ID, see List jobs.

You must also provide Unstructured’s IDs for the file to download and the workflow’s output node. To get these IDs, see Get a job. In the response:

output_node_files array.
output_node_files array’s file_id field.
output_node_files array’s node_id
field.
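The download steps above can be sketched in Python as follows. The sketch pulls the file and node IDs out of a Get a job response and assembles the download URL. It assumes, without confirmation from this page, that the download endpoint accepts the two IDs as query parameters named file_id and node_id, mirroring the response field names. The helper names output_files and download_url are hypothetical.

```python
import urllib.parse


def output_files(job):
    """Return (file_id, node_id) pairs from a Get a job response dictionary."""
    return [
        (entry["file_id"], entry["node_id"])
        for entry in job.get("output_node_files", [])
    ]


def download_url(base_url, job_id, file_id, node_id):
    """Assemble the GET URL for the /jobs/<job-id>/download endpoint."""
    # Assumption: the IDs are passed as file_id and node_id query parameters.
    query = urllib.parse.urlencode({"file_id": file_id, "node_id": node_id})
    safe_job_id = urllib.parse.quote(job_id, safe="")
    return f"{base_url.rstrip('/')}/jobs/{safe_job_id}/download?{query}"
```

A typical use is to call output_files on the parsed Get a job response, then pass each pair to download_url along with the API URL and job ID, and send a GET request with the unstructured-api-key header to the resulting URL.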