Extract structured data from uploaded file

curl --request POST \
  --url https://api.vidnavigator.com/v1/extract/file \
  --header 'Content-Type: application/json' \
  --header 'X-API-Key: <api-key>' \
  --data '
{
  "file_id": "<string>",
  "schema": {
    "main_topics": {
      "type": "Array",
      "description": "List of main topics discussed",
      "items": {
        "type": "String",
        "description": "A topic"
      }
    },
    "sentiment": {
      "type": "Enum",
      "description": "Overall sentiment of the video",
      "enum": [
        "positive",
        "negative",
        "neutral"
      ]
    },
    "key_takeaway": {
      "type": "String",
      "description": "The single most important takeaway"
    }
  },
  "what_to_extract": "Extract action items and deadlines from this meeting",
  "include_usage": false
}
'

{
  "status": "success",
  "data": {},
  "video_info": {
    "title": "<string>",
    "description": "<string>",
    "thumbnail": "<string>",
    "url": "<string>",
    "channel": "<string>",
    "channel_url": "<string>",
    "duration": 123,
    "views": 123,
    "likes": 123,
    "published_date": "<string>",
    "keywords": [
      "<string>"
    ],
    "category": "<string>",
    "available_languages": [
      "<string>"
    ],
    "selected_language": "<string>",
    "carousel_info": {
      "total_items": 123,
      "video_count": 123,
      "image_count": 123,
      "selected_index": 123
    }
  },
  "file_info": {
    "id": "<string>",
    "name": "<string>",
    "size": 123,
    "type": "<string>",
    "duration": 123,
    "status": "pending",
    "created_at": "2023-11-07T05:31:56Z",
    "updated_at": "2023-11-07T05:31:56Z",
    "original_file_date": "2023-11-07T05:31:56Z",
    "has_transcript": true,
    "error_message": "<string>",
    "namespace_ids": [
      "<string>"
    ],
    "namespaces": [
      {
        "id": "<string>",
        "name": "<string>"
      }
    ]
  },
  "usage": {
    "prompt_tokens": 123,
    "completion_tokens": 123,
    "total_tokens": 123
  }
}

Extract Data from File

Extract structured data from an uploaded file’s transcript using a custom schema.

Provide a file_id and a JSON schema describing the fields to extract. The file must be processed and have a transcript available. Optionally include what_to_extract to guide the extraction.

Schema format: Each field must have type and description. Supported types: String, Number, Boolean, Integer, Object, Array, Enum. Max 10 root fields, max 3 nesting levels.

Content-Type: Accepts application/json or YAML (application/x-yaml, text/yaml).

Token usage: Set include_usage=true to include prompt/completion token counts in the response.

Billing: Each extraction consumes at least 1 analysis credit. For longer transcripts, billing scales as ceil(total_tokens / 15000) credits. All charges are reverted if the request fails.

POST /extract/file

Extract structured data from an uploaded file’s transcript using a custom schema you define.

Overview

Extract structured data from files you have previously uploaded to your VidNavigator library. The file must be fully processed and have a transcript available.

How It Works

1. Provide a file_id and a schema describing the structured fields you want.
2. VidNavigator reads the transcript from the uploaded file.
3. VidNavigator runs AI extraction against your schema.
4. You receive structured JSON matching your schema definition, plus file_info.
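
The steps above can be sketched in Python with only the standard library. This is a minimal illustration, not an official client: the helper name build_extract_request is hypothetical, and the API key and file ID are placeholders. The request is built but not sent, so the sketch runs without network access.

```python
import json
import urllib.request

API_URL = "https://api.vidnavigator.com/v1/extract/file"


def build_extract_request(api_key, file_id, schema,
                          what_to_extract=None, include_usage=False):
    """Build (but do not send) the POST request for /extract/file.

    Hypothetical helper for illustration; not part of any official SDK.
    """
    body = {"file_id": file_id, "schema": schema, "include_usage": include_usage}
    if what_to_extract is not None:
        body["what_to_extract"] = what_to_extract
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


schema = {
    "key_takeaway": {
        "type": "String",
        "description": "The single most important takeaway",
    }
}
request = build_extract_request(
    "YOUR_API_KEY", "file_abc123", schema, include_usage=True
)

# Sending is left commented out so the sketch has no side effects:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
#     print(result["data"], result["file_info"]["name"])
```

Keeping request construction separate from sending makes it easy to inspect or log the payload before spending analysis credits.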

Schema Rules

- Each field must have type and description
- Supported types: String, Number, Boolean, Integer, Object, Array, Enum
- Maximum 10 root-level fields
- Maximum 3 nesting levels
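
A client-side check of these rules can catch mistakes before a request is sent. The sketch below is an assumption-laden illustration: the function names are hypothetical, and it assumes that a root field counts as level 1 and that each Array "items" or Object "properties" entry adds one level. The server's exact counting may differ.

```python
SUPPORTED_TYPES = {"String", "Number", "Boolean", "Integer", "Object", "Array", "Enum"}
MAX_ROOT_FIELDS = 10
MAX_NESTING_LEVELS = 3


def validate_field(name, field, level, errors):
    """Check one field definition and recurse into its children."""
    if level > MAX_NESTING_LEVELS:
        errors.append(f"{name}: nested deeper than {MAX_NESTING_LEVELS} levels")
        return
    if not isinstance(field, dict):
        errors.append(f"{name}: field definition must be an object")
        return
    for key in ("type", "description"):
        if key not in field:
            errors.append(f"{name}: missing '{key}'")
    if "type" in field:
        field_type = field["type"]
        if not isinstance(field_type, str) or field_type not in SUPPORTED_TYPES:
            errors.append(f"{name}: unsupported type {field_type!r}")
    # Assumption: "items" and "properties" each add one nesting level.
    items = field.get("items")
    if isinstance(items, dict):
        validate_field(f"{name}[]", items, level + 1, errors)
    properties = field.get("properties")
    if isinstance(properties, dict):
        for child_name, child in properties.items():
            validate_field(f"{name}.{child_name}", child, level + 1, errors)


def validate_schema(schema):
    """Return a list of human-readable problems; an empty list means OK."""
    errors = []
    if len(schema) > MAX_ROOT_FIELDS:
        errors.append(
            f"schema has {len(schema)} root fields, maximum is {MAX_ROOT_FIELDS}"
        )
    for name, field in schema.items():
        validate_field(name, field, 1, errors)
    return errors
```

Under this counting, an Array of Objects with String properties sits exactly at the three-level limit.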

You can also send the request body as YAML by setting Content-Type to application/x-yaml or text/yaml.
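
As a rough sketch of the YAML variant, the snippet below builds the same kind of request with a hand-written YAML body. It assumes the YAML keys mirror the JSON body fields one to one; the key and file ID are placeholders, and the request is built but not sent.

```python
import urllib.request

# Assumption: the YAML document uses the same field names as the JSON body.
yaml_body = """\
file_id: file_abc123
schema:
  key_takeaway:
    type: String
    description: The single most important takeaway
include_usage: true
"""

request = urllib.request.Request(
    "https://api.vidnavigator.com/v1/extract/file",
    data=yaml_body.encode("utf-8"),
    headers={"X-API-Key": "YOUR_API_KEY", "Content-Type": "application/x-yaml"},
    method="POST",
)
```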

This endpoint does not support a transcribe parameter. It works only on files that already have a transcript available.
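
Because extraction only works on files that already have a transcript, a caller may want to check a file's metadata first. The helper below is hypothetical and assumes the file_info object has the shape shown in the response samples, with "completed" as the finished status. How that object is obtained (for example, from another endpoint) is outside this sketch.

```python
def is_ready_for_extraction(file_info):
    """Return True when the file looks processed and has a transcript.

    Assumption: "completed" marks a fully processed file, as in the
    response example; other status values are treated as not ready.
    """
    return (
        file_info.get("status") == "completed"
        and file_info.get("has_transcript") is True
    )


pending = {"id": "file_abc123", "status": "pending", "has_transcript": False}
done = {"id": "file_abc123", "status": "completed", "has_transcript": True}
print(is_ready_for_extraction(pending), is_ready_for_extraction(done))
```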

Example Usage

curl -X POST "https://api.vidnavigator.com/v1/extract/file" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "file_id": "file_abc123",
    "schema": {
      "action_items": {
        "type": "Array",
        "description": "Action items and tasks mentioned in the meeting",
        "items": {
          "type": "Object",
          "description": "An action item",
          "properties": {
            "task": { "type": "String", "description": "The task description" },
            "assignee": { "type": "String", "description": "Person assigned" },
            "deadline": { "type": "String", "description": "Deadline if mentioned" }
          }
        }
      },
      "decisions_made": {
        "type": "Array",
        "description": "Key decisions made during the meeting",
        "items": { "type": "String", "description": "A decision" }
      }
    },
    "what_to_extract": "Extract action items and deadlines from this meeting",
    "include_usage": true
  }'

Response Example

{
  "status": "success",
  "data": {
    "action_items": [
      {
        "task": "Update the product roadmap with Q2 priorities",
        "assignee": "Sarah",
        "deadline": "Friday"
      },
      {
        "task": "Schedule customer interviews for user research",
        "assignee": "Mike",
        "deadline": "next week"
      }
    ],
    "decisions_made": [
      "Proceed with option B for the pricing model",
      "Delay the mobile app launch to Q3"
    ]
  },
  "file_info": {
    "id": "file_abc123",
    "name": "weekly-product-meeting.mp4",
    "size": 248517632,
    "type": "video/mp4",
    "duration": 3720.0,
    "status": "completed",
    "created_at": "2026-04-10T09:30:00Z",
    "updated_at": "2026-04-10T09:36:12Z",
    "has_transcript": true,
    "namespace_ids": ["ns_team_notes"],
    "namespaces": [
      {
        "id": "ns_team_notes",
        "name": "Team Notes"
      }
    ]
  },
  "usage": {
    "prompt_tokens": 3200,
    "completion_tokens": 120,
    "total_tokens": 3320
  }
}

The usage field is included only when the request sets include_usage=true. file_info is returned alongside data as response metadata for this endpoint.
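
Since usage is optional, client code should not assume it is present. A minimal sketch of reading a parsed response follows; the function name is hypothetical, and raising on any status other than "success" is an assumption, because the error response shape is not described here.

```python
def summarize_extraction(response):
    """Pull the commonly needed parts out of a parsed JSON response."""
    if response.get("status") != "success":
        # Assumption: anything other than "success" is treated as a failure.
        raise ValueError(f"extraction did not succeed: {response.get('status')!r}")
    usage = response.get("usage")  # absent unless include_usage=true was sent
    return {
        "data": response.get("data", {}),
        "file_name": response.get("file_info", {}).get("name"),
        "total_tokens": usage["total_tokens"] if usage else None,
    }


sample = {
    "status": "success",
    "data": {"decisions_made": ["Delay the mobile app launch to Q3"]},
    "file_info": {"id": "file_abc123", "name": "weekly-product-meeting.mp4"},
}
summary = summarize_extraction(sample)
print(summary["file_name"], summary["total_tokens"])
```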

Billing

- AI extraction consumes analysis_request units in blocks of 15,000 total tokens
- Formula: ceil(total_tokens / 15000)
- Examples:
  - 14,000 tokens -> 1 analysis_request unit
  - 17,000 tokens -> 2 analysis_request units
  - 31,000 tokens -> 3 analysis_request units
- There is no speech-to-text billing on this endpoint
- Failed requests are not charged
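
The billing formula can be reproduced with a few lines of Python to estimate cost from a response's usage.total_tokens. The function name is hypothetical, and the one-unit minimum assumes that the "at least 1 analysis credit" rule stated earlier refers to the same analysis_request units.

```python
TOKENS_PER_UNIT = 15_000


def analysis_units(total_tokens):
    """Estimate analysis_request units as ceil(total_tokens / 15000)."""
    # Integer ceiling division avoids floating-point rounding.
    units = -(-total_tokens // TOKENS_PER_UNIT)
    # Assumption: a successful extraction always costs at least one unit.
    return max(1, units)


for tokens in (14_000, 17_000, 31_000):
    print(tokens, "->", analysis_units(tokens))
```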

Authorizations

X-API-Key (string, header, required)

API key authentication. Include your VidNavigator API key in the X-API-Key header.

Body

application/json

file_id (string, required)

ID of the uploaded file to extract data from

schema (object, required)

Custom extraction schema defining the fields to extract. Max 10 root-level fields, max 3 nesting levels. Each field must have type and description.

Example:

{
  "main_topics": {
    "type": "Array",
    "description": "List of main topics discussed",
    "items": {
      "type": "String",
      "description": "A topic"
    }
  },
  "sentiment": {
    "type": "Enum",
    "description": "Overall sentiment of the video",
    "enum": ["positive", "negative", "neutral"]
  },
  "key_takeaway": {
    "type": "String",
    "description": "The single most important takeaway"
  }
}

what_to_extract (string)

Optional guidance for what to extract from the transcript

Example:

"Extract action items and deadlines from this meeting"

include_usage (boolean, default: false)

When true, includes token usage statistics in the response

Response

Data extracted successfully

status (enum<string>)

Available options: success

data (object)

Extracted data matching the provided schema. The shape of this object mirrors the input schema fields.

video_info (object)

Video metadata (title, channel, duration, views, etc.). Only present for /extract/video requests.

file_info (object)

File metadata (name, size, type, duration, etc.). Only present for /extract/file requests.

usage (object)

Token usage statistics. Only present when include_usage=true.
