Extract Data from File
Extract structured data from an uploaded file’s transcript using a custom schema.
Provide a file_id and a JSON schema describing the fields to extract. The file must be processed and have a transcript available. Optionally include what_to_extract to guide the extraction.
Schema format: Each field must have type and description. Supported types: String, Number, Boolean, Integer, Object, Array, Enum. Max 10 root fields, max 3 nesting levels.
Content-Type: Accepts application/json or YAML (application/x-yaml, text/yaml).
Token usage: Set include_usage=true to include prompt/completion token counts in the response.
Billing: Each extraction consumes at least 1 analysis credit. For longer transcripts, billing scales as ceil(total_tokens / 15000) credits. All charges are reverted if the request fails.
Overview
Extract structured data from files you have previously uploaded to your VidNavigator library. The file must be fully processed and have a transcript available.
How It Works
- Provide a file_id and a schema describing the structured fields you want
- VidNavigator reads the transcript from the uploaded file
- VidNavigator runs AI extraction against your schema
- You receive structured JSON matching your schema definition, plus file_info
Schema Rules
- Each field must have type and description
- Supported types: String, Number, Boolean, Integer, Object, Array, Enum
- Maximum 10 root-level fields
- Maximum 3 nesting levels
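The rules above can be checked client-side before sending a request. The following is a minimal sketch, not VidNavigator's own validator. It assumes that root fields count as nesting level 1, that Array children live under "items" (as in the schema example on this page), and that Object children live under a "properties" key, which this page does not confirm. The function names are illustrative.

```python
SUPPORTED_TYPES = {"String", "Number", "Boolean", "Integer", "Object", "Array", "Enum"}
MAX_ROOT_FIELDS = 10
MAX_NESTING_LEVELS = 3


def _check_field(path, field, level, errors):
    """Check one field definition and recurse into its children."""
    if level > MAX_NESTING_LEVELS:
        errors.append(f"{path}: exceeds {MAX_NESTING_LEVELS} nesting levels")
        return
    if not isinstance(field, dict):
        errors.append(f"{path}: field definition must be an object")
        return
    for key in ("type", "description"):
        if key not in field:
            errors.append(f"{path}: missing '{key}'")
    ftype = field.get("type")
    if "type" in field and (not isinstance(ftype, str) or ftype not in SUPPORTED_TYPES):
        errors.append(f"{path}: unsupported type {ftype!r}")
    # Array children under "items" (shown in the schema example on this page).
    if ftype == "Array" and "items" in field:
        _check_field(f"{path}[]", field["items"], level + 1, errors)
    # ASSUMPTION: Object children live under "properties"; not confirmed here.
    if ftype == "Object" and isinstance(field.get("properties"), dict):
        for name, child in field["properties"].items():
            _check_field(f"{path}.{name}", child, level + 1, errors)


def validate_schema(schema):
    """Return a list of rule violations; an empty list means the schema passed."""
    errors = []
    if len(schema) > MAX_ROOT_FIELDS:
        errors.append(f"schema has {len(schema)} root fields, maximum is {MAX_ROOT_FIELDS}")
    for name, field in schema.items():
        _check_field(name, field, 1, errors)
    return errors
```

Calling validate_schema on the example schema from the Body section returns an empty list. The server remains the authority on what is accepted, so treat this only as an early check.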
This endpoint does not accept a transcribe parameter. It works only on files that already have a transcript available.
Example Usage
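As a rough illustration, a request could be assembled as in the sketch below, using only the Python standard library. The body fields (file_id, schema, what_to_extract, include_usage) and the X-API-Key header come from this page. The base URL is a placeholder, and the assumption that the path is /extract/file rests on the path mentioned in the Response section; check both against your account's API reference. build_extract_request is a hypothetical helper name.

```python
import json
import urllib.request


def build_extract_request(base_url, api_key, file_id, schema,
                          what_to_extract=None, include_usage=False):
    """Build (but do not send) a POST request for file extraction."""
    body = {"file_id": file_id, "schema": schema}
    if what_to_extract:
        body["what_to_extract"] = what_to_extract  # optional guidance
    if include_usage:
        body["include_usage"] = True  # ask for token counts in the response
    return urllib.request.Request(
        # ASSUMPTION: endpoint path is /extract/file under the given base URL.
        url=base_url.rstrip("/") + "/extract/file",
        data=json.dumps(body).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


# Placeholder values only; replace with your real base URL, key, and file ID.
request = build_extract_request(
    base_url="https://api.example.invalid/v1",
    api_key="YOUR_API_KEY",
    file_id="YOUR_FILE_ID",
    schema={"key_takeaway": {"type": "String",
                             "description": "The single most important takeaway"}},
    include_usage=True,
)
```

Passing the resulting object to urllib.request.urlopen would send it; the sketch stops short of that so it can be inspected without network access.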
Response Example
The usage field is only included when include_usage=true in the request. file_info is included in the response metadata.
Billing
- AI extraction consumes analysis_request units in blocks of 15,000 total tokens
- Formula: ceil(total_tokens / 15000)
- Examples:
  - 14,000 tokens -> 1 analysis_request unit
  - 17,000 tokens -> 2 analysis_request units
  - 31,000 tokens -> 3 analysis_request units
- There is no speech-to-text billing on this endpoint
- Failed requests are not charged
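The formula above can be reproduced in a few lines to estimate cost ahead of time. This sketch applies ceil(total_tokens / 15000) with the one-unit minimum stated in the summary at the top of this page; the function name is illustrative, and actual charges are determined by the service.

```python
TOKENS_PER_UNIT = 15_000  # block size stated in the billing rules


def analysis_units(total_tokens: int) -> int:
    """Estimate analysis_request units for a given total token count."""
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    # Integer form of ceil(total_tokens / 15000).
    units = (total_tokens + TOKENS_PER_UNIT - 1) // TOKENS_PER_UNIT
    # Each extraction consumes at least one unit.
    return max(1, units)


for tokens in (14_000, 17_000, 31_000):
    print(tokens, "->", analysis_units(tokens))
```

The three values printed match the examples listed above: 1, 2, and 3 units.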
Authorizations
API key authentication. Include your VidNavigator API key in the X-API-Key header.
Body
file_id: ID of the uploaded file to extract data from
schema: Custom extraction schema defining the fields to extract. Max 10 root-level fields, max 3 nesting levels. Each field must have type and description.
{
  "main_topics": {
    "type": "Array",
    "description": "List of main topics discussed",
    "items": {
      "type": "String",
      "description": "A topic"
    }
  },
  "sentiment": {
    "type": "Enum",
    "description": "Overall sentiment of the video",
    "enum": ["positive", "negative", "neutral"]
  },
  "key_takeaway": {
    "type": "String",
    "description": "The single most important takeaway"
  }
}
what_to_extract: Optional guidance for what to extract from the transcript
"Extract action items and deadlines from this meeting"
include_usage: When true, includes token usage statistics in the response
Response
Data extracted successfully
success Extracted data matching the provided schema. The shape of this object mirrors the input schema fields.
Video metadata (title, channel, duration, views, etc.). Only present for /extract/video requests.
file_info: File metadata (name, size, type, duration, etc.). Only present for /extract/file requests.
usage: Token usage statistics. Only present when include_usage=true.

