Content Processing
The Content Processing endpoints let you preprocess input materials (documents or URLs) before running an assessment. Use them to extract and review text ahead of time, confirm that documents are readable, and prepare materials for assessment.
Content processing is asynchronous: you submit a file or URL, receive a job UUID, then poll for status until processing is complete. Once complete, you can retrieve the extracted text content.
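The submit-then-poll pattern can be sketched as a small loop. This is a minimal illustration, not part of the API: `fetch_status` is a hypothetical caller-supplied function that calls the status endpoint and returns the job's status string, and the interval and attempt limit are arbitrary defaults. The terminal statuses are the ones listed under Response Fields.

```python
import time
from typing import Callable

# Statuses after which a job will not change again (from the Response Fields table).
TERMINAL_STATUSES = {"COMPLETED", "FAILED", "CANCELLED"}


def wait_for_job(
    fetch_status: Callable[[str], str],
    uuid: str,
    interval_seconds: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the job reaches a terminal status, then return that status.

    fetch_status is a hypothetical helper: it takes a job UUID, calls the
    status endpoint, and returns the value of the job's "status" field.
    """
    for _ in range(max_attempts):
        status = fetch_status(uuid)
        if status in TERMINAL_STATUSES:
            return status
        sleep(interval_seconds)  # wait before asking again
    raise TimeoutError(f"job {uuid} not finished after {max_attempts} polls")
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays. Only a `COMPLETED` result means the extracted text can be retrieved.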
Use these endpoints to:
- Extract text from uploaded documents (PDF, DOCX, etc.)
- Extract text content from web pages via URL
- Check processing status for individual or multiple jobs
- Retrieve the extracted text for review or display
- Re-process content that may have changed (e.g., updated web pages)
- Delete processed content when no longer needed
Submit File for Processing
Endpoint: `/content-processing/file`
Content-Type: multipart/form-data
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| file | File | Yes | The document file to process |
Supported file types are validated server-side. Commonly supported formats include PDF, DOCX, DOC, and TXT.
Example Request
Code
Response
Status: 202 Accepted with a Location header pointing to the status endpoint.
Code
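As a rough sketch of a file submission, the following builds a `multipart/form-data` request with a single `file` part using only the Python standard library. The host `https://api.example.com` is a placeholder, the `POST` method is an assumption (this section does not state the method), and authentication is omitted because it is not described here. The request is constructed but not sent.

```python
import urllib.request
import uuid

BASE_URL = "https://api.example.com"  # hypothetical host, not from this documentation


def build_file_upload_request(filename: str, data: bytes) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data upload with one 'file' part."""
    boundary = uuid.uuid4().hex  # random separator between multipart sections
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        # Assumes the filename contains no double quotes or line breaks.
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        data,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        f"{BASE_URL}/content-processing/file",
        data=body,
        method="POST",  # assumed; the HTTP method is not stated in this section
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
```

Sending the request with `urllib.request.urlopen(request)` should yield the 202 response described above, and the `Location` header can then be read from the response headers.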
Submit URL for Processing
Endpoint: `/content-processing/url`
Content-Type: application/json
Request Body
Code
| Field | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | The URL of the web page to process |
Example Request
Code
Response
Status: 202 Accepted with a Location header pointing to the status endpoint.
Code
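A URL submission is a JSON body with a single `url` field. The sketch below builds such a request without sending it. The host is a placeholder and the `POST` method is an assumption, since neither is given in this section.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # hypothetical host, not from this documentation


def build_url_submit_request(page_url: str) -> urllib.request.Request:
    """Build (but do not send) a JSON request asking the service to process a web page."""
    body = json.dumps({"url": page_url}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/content-processing/url",
        data=body,
        method="POST",  # assumed; the HTTP method is not stated in this section
        headers={"Content-Type": "application/json"},
    )
```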
Check Processing Status
Endpoint: `/content-processing/{uuid}/status`
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| uuid | string (path) | Yes | The UUID of the content processing job |
Example Request
Code
Response
Status: 200 OK (while processing) or 303 See Other (when completed, with Location header pointing to the content endpoint).
Code
Poll this endpoint periodically to monitor processing progress. When completed, follow the Location header or use the content endpoint to retrieve the extracted text.
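The two documented outcomes of a status call can be handled with a small decision function. This is an illustrative sketch: the function name and the `"poll"` / `"fetch"` labels are invented for the example, and any status code other than 200 or 303 is treated as an error because this section does not describe other codes.

```python
from typing import Mapping, Optional, Tuple


def interpret_status_response(
    status_code: int, headers: Mapping[str, str]
) -> Tuple[str, Optional[str]]:
    """Map a status-endpoint response to the next client action.

    Returns ("poll", None) while the job is still processing, or
    ("fetch", location) once the service redirects to the content endpoint.
    """
    if status_code == 200:
        return ("poll", None)
    if status_code == 303:
        # Header names are case-insensitive, so normalise before looking up.
        lowered = {name.lower(): value for name, value in headers.items()}
        location = lowered.get("location")
        if location is None:
            raise ValueError("303 response without a Location header")
        return ("fetch", location)
    raise ValueError(f"unexpected status code {status_code}")
```

Many HTTP clients follow 303 redirects automatically (`urllib.request` and `requests` both do by default), in which case the caller receives the content response directly and never sees the 303. On a 200 response it is still worth reading the `status` field in the body, since this section does not say which code a `FAILED` or `CANCELLED` job returns.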
Batch Status Check
Check the status of multiple content processing jobs in a single request.
Endpoint: `/content-processing/status/batch`
Content-Type: application/json
Request Body
Code
| Field | Type | Required | Description |
|---|---|---|---|
| uuids | string array | Yes | List of content processing job UUIDs to check |
Example Request
Code
Response
Status: 200 OK
Returns an array of status objects, one per requested UUID:
Code
Use batch status checks instead of individual status calls when monitoring multiple jobs to reduce the number of API requests.
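Once a batch response has been parsed from JSON, a client typically needs to know which jobs are done and which still need polling. The helpers below are a sketch of that bookkeeping. They assume each status object carries the `uuid` and `status` fields listed under Response Fields, and the UUID values in the usage example are made up.

```python
from typing import Dict, Iterable, List, Mapping


def group_by_status(jobs: Iterable[Mapping[str, str]]) -> Dict[str, List[str]]:
    """Group job UUIDs by their current status."""
    grouped: Dict[str, List[str]] = {}
    for job in jobs:
        grouped.setdefault(job["status"], []).append(job["uuid"])
    return grouped


def still_active(jobs: Iterable[Mapping[str, str]]) -> List[str]:
    """Return the UUIDs that should be included in the next batch status call."""
    return [job["uuid"] for job in jobs if job["status"] in ("PENDING", "RUNNING")]


# Sample data shaped like a parsed batch response (UUIDs are invented).
sample = [
    {"uuid": "job-1", "status": "COMPLETED"},
    {"uuid": "job-2", "status": "RUNNING"},
    {"uuid": "job-3", "status": "PENDING"},
]
grouped = group_by_status(sample)
active = still_active(sample)
```

Feeding `active` back into the next batch request means completed jobs are not polled again.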
Get Processed Content
Retrieve the extracted text content from a completed processing job.
Endpoint: `/content-processing/{uuid}/content`
Produces: text/plain
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| uuid | string (path) | Yes | The UUID of the completed content processing job |
Example Request
Code
Response
Status: 200 OK
Returns the extracted text content as plain text (text/plain).
Code
This endpoint is only available for jobs with status COMPLETED. Requesting content for a job that has not completed will result in an error.
Refresh Content
Re-process a previously submitted content processing job. This is useful when the source material has changed (e.g., an updated web page) and you need to extract the latest content.
Code
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| uuid | string (path) | Yes | The UUID of the content processing job to refresh |
Example Request
Code
Response
Status: 202 Accepted with a Location header pointing to the status endpoint.
Code
After refreshing, poll the status endpoint again to know when the new content is ready.
Delete Content Processing Job
Delete a content processing job and its associated data. This can also be used to cancel an in-progress job.
Code
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| uuid | string (path) | Yes | The UUID of the content processing job to delete |
Example Request
Code
Response
Status: 200 OK with an empty response body.
Response Fields
Except for the content endpoint (which returns plain text) and the delete endpoint (which returns an empty body), content processing endpoints return the following job structure:
| Field | Type | Description |
|---|---|---|
| uuid | string | Unique identifier for the content processing job |
| sourceType | string | Type of source: FILE or URL |
| sourceUrl | string | The original URL (only for URL-type jobs, null for files) |
| fileName | string | Name of the processed file (original filename for uploads, generated name for URLs) |
| status | string | Current processing status: PENDING, RUNNING, COMPLETED, FAILED, or CANCELLED |
| error | string | Error message if processing failed, null otherwise |
| createdAt | string (datetime) | Timestamp when the job was created |
| completedAt | string (datetime) | Timestamp when processing completed, null if not yet complete |
| fetchedAt | string (datetime) | For URL sources, timestamp when the content was last fetched from the URL |
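On the client side, the fields above map naturally onto a small typed record. The class below is one possible sketch. The class and attribute names are invented for the example, and timestamps are kept as strings because the exact datetime format is not specified here.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class ContentJob:
    """Client-side view of one content processing job (names are illustrative)."""

    uuid: str
    source_type: str              # "FILE" or "URL"
    source_url: Optional[str]     # null for file uploads
    file_name: Optional[str]
    status: str
    error: Optional[str]          # set only when processing failed
    created_at: Optional[str]     # left as a string; datetime format is unspecified
    completed_at: Optional[str]
    fetched_at: Optional[str]

    @classmethod
    def from_json(cls, payload: Mapping[str, Any]) -> "ContentJob":
        """Build a ContentJob from a parsed JSON object using the documented field names."""
        return cls(
            uuid=payload["uuid"],
            source_type=payload["sourceType"],
            source_url=payload.get("sourceUrl"),
            file_name=payload.get("fileName"),
            status=payload["status"],
            error=payload.get("error"),
            created_at=payload.get("createdAt"),
            completed_at=payload.get("completedAt"),
            fetched_at=payload.get("fetchedAt"),
        )

    @property
    def is_finished(self) -> bool:
        """True once the job can no longer change state."""
        return self.status in ("COMPLETED", "FAILED", "CANCELLED")
```

Using `.get()` for the nullable fields lets the same parser accept responses that omit a field instead of sending an explicit null.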
Integration Workflow
A typical integration flow using content processing with assessments:
- Submit materials: Upload documents via `/content-processing/file` or submit URLs via `/content-processing/url`
- Monitor status: Poll individual jobs via `/content-processing/{uuid}/status` or use batch status via `/content-processing/status/batch`
- Review content: Once completed, retrieve extracted text via `/content-processing/{uuid}/content` to display to users for review
- Run assessment: Submit the original documents or URLs to the `/assessments` endpoint for compliance evaluation
- Manage lifecycle: Refresh outdated content or delete jobs that are no longer needed
Content processing and assessment are independent workflows. You can use content processing to preview extracted text without running an assessment, or run assessments directly without prior content processing.
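The preview-only path (submit, monitor, review, clean up) can be sketched against an abstract client. Everything named here is hypothetical: `ContentClient` and its methods stand in for whatever HTTP wrapper an integration uses, with each method assumed to wrap one documented endpoint. The assessment step is left out because it is an independent workflow.

```python
import time
from typing import Callable, Dict, List, Protocol


class ContentClient(Protocol):
    """Hypothetical client interface; each method wraps one documented endpoint."""

    def submit_url(self, url: str) -> str: ...                       # returns the job UUID
    def batch_status(self, uuids: List[str]) -> Dict[str, str]: ...  # UUID -> status
    def get_content(self, uuid: str) -> str: ...                     # extracted text
    def delete(self, uuid: str) -> None: ...


def preview_urls(
    client: ContentClient,
    urls: List[str],
    max_polls: int = 20,
    interval_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, str]:
    """Submit URLs, wait via batch status, and return extracted text keyed by URL."""
    jobs = {client.submit_url(url): url for url in urls}
    pending = set(jobs)
    texts: Dict[str, str] = {}
    for _ in range(max_polls):
        if not pending:
            break
        for uuid, status in client.batch_status(sorted(pending)).items():
            if status == "COMPLETED":
                texts[jobs[uuid]] = client.get_content(uuid)
                pending.discard(uuid)
            elif status in ("FAILED", "CANCELLED"):
                pending.discard(uuid)  # no text available for this job
        if pending:
            sleep(interval_seconds)
    for uuid in jobs:
        client.delete(uuid)  # also cancels any job that is still in progress
    return texts
```

URLs whose jobs fail, are cancelled, or do not finish within `max_polls` are simply absent from the result, so callers can compare the returned keys against the input list.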
For more detailed specifications of the endpoints and request parameters, see the dedicated API Reference Pages.