The Problem: Video Content Doesn’t Scale
Every day, thousands of hours of video are published across YouTube, TikTok, Instagram, X, and other platforms. Buried inside are competitor mentions, product reviews, pricing signals, customer pain points, expert insights, and buying intent — data that teams across your organization need. But video data extraction today is broken:
- Sales teams manually watch webinars to find lead signals.
- Market researchers hire interns to catalog competitor mentions.
- Content teams scrub through hours of footage to pull quotes.
Key Takeaways
- Define a custom schema (JSON) to extract exactly the data you need from any video.
- 2-phase AI pipeline: prompt compilation (cached) → structured extraction (Pydantic-enforced) guarantees consistent output.
- Works with online videos (/v1/extract/video) and uploaded files (/v1/extract/file), supporting YouTube, TikTok, Instagram, X, and many more.
- Prompt caching means repeat extractions are instant: define a schema once, extract from hundreds of videos.
- Cost-effective: at 100 extractions per credit, processing videos at scale is extremely cheap.
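To make the first takeaway concrete, here is what a custom schema might look like. This is an illustrative sketch: the field names are ours, and the exact top-level format is an assumption; the documented rule is that every field declares a type and a description.

```python
import json

# Illustrative custom extraction schema (field names are examples, not
# part of the API). Each field declares a "type" and a "description",
# which the Extract API uses to steer the model.
schema = {
    "main_topic": {
        "type": "String",
        "description": "Primary topic discussed in the video, in 5-10 words",
    },
    "competitor_mentions": {
        "type": "Array",
        "description": "Names of competitor products or companies mentioned",
    },
    "has_pricing_info": {
        "type": "Boolean",
        "description": "Whether the speaker discusses pricing",
    },
}

print(json.dumps(schema, indent=2))
```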
Why Not Just Use a Standard LLM?
You could paste a transcript into a standard LLM and ask for structured data. It works for one video, but it breaks at scale:
- Inconsistent output shape: Standard LLMs return slightly different JSON keys, structures, and formatting every time. You can’t reliably pipe it into a database or API.
- No schema enforcement: The Extract API uses Pydantic to enforce your exact schema. Every response is guaranteed to match your field names, types, and nesting.
- No transcript pipeline: You have to manually get the transcript, paste it, and copy the result. The Extract API handles transcript retrieval, caching, and extraction in a single call.
- No prompt caching: Every standard LLM call re-generates the prompt. VidNavigator caches the optimized extraction prompt, so repeat schemas are faster and cheaper.
- No batch automation: The Extract API is a REST endpoint. Loop over 1,000 video URLs, feed results into your pipeline. No copy-paste needed.
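The batch-automation point can be sketched in a few lines. This is a hedged example, not official client code: the base URL, the Bearer auth scheme, and the request-body field names (`video_url`, `schema`, `what_to_extract`) are assumptions; only the endpoint path comes from the docs above.

```python
import json
from urllib import request

API_BASE = "https://api.vidnavigator.com"  # assumed base URL
API_KEY = "YOUR_API_KEY"                   # assumed auth scheme

def build_body(video_url: str, schema: dict, what_to_extract: str = "") -> dict:
    """Request body for /v1/extract/video; field names are assumptions."""
    body = {"video_url": video_url, "schema": schema}
    if what_to_extract:
        body["what_to_extract"] = what_to_extract
    return body

def extract(video_url: str, schema: dict) -> dict:
    req = request.Request(
        f"{API_BASE}/v1/extract/video",
        data=json.dumps(build_body(video_url, schema)).encode(),
        headers={"Authorization": f"Bearer {API_KEY}",
                 "Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:  # real network call; needs a valid key
        return json.load(resp)

# Batch automation: loop over URLs and feed each result into your pipeline.
# for url in video_urls:
#     row = extract(url, schema)
#     pipeline.ingest(row)
```

Because the endpoint returns schema-shaped JSON every time, the results can go straight into a database insert with no per-video cleanup.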
How It Works: The 2-Phase Pipeline
Phase 1 — Prompt Compilation (one-time, cached)
The API takes your schema and optional what_to_extract instruction and generates an optimized pair of AI prompts. This compiled “extraction plan” is cached with a 2-hour TTL using a fingerprint of your schema. The next time you send the exact same schema within the cache window, the compilation step is skipped entirely.
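The fingerprint-plus-TTL behavior described above can be sketched as follows. The hash algorithm and cache layout here are assumptions; only the fingerprinting and the 2-hour expiry come from the text.

```python
import hashlib
import json
import time

TTL_SECONDS = 2 * 60 * 60  # compiled plans expire after 2 hours
_plan_cache: dict = {}     # fingerprint -> (created_at, compiled_plan)

def fingerprint(schema: dict) -> str:
    """Key order doesn't matter: identical schemas map to the same plan."""
    canonical = json.dumps(schema, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

def get_plan(schema: dict) -> str:
    key = fingerprint(schema)
    hit = _plan_cache.get(key)
    if hit and time.time() - hit[0] < TTL_SECONDS:
        return hit[1]  # cache hit: the compilation step is skipped
    plan = f"plan-{key[:12]}"  # stand-in for the real prompt compilation
    _plan_cache[key] = (time.time(), plan)
    return plan
```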
Phase 2 — Structured Extraction
The cached prompt template is filled with the video’s transcript text, then sent to the AI model with strict structured output enforcement. The result is validated JSON that exactly matches your custom schema.

Use Case Templates
1. Lead Generation
Built for sales and BD teams. Extract companies, decision-makers, pricing signals, pain points, buying intent, and calls-to-action from sales calls, webinars, or product demos.
2. Market Research
Competitive intelligence for product and strategy teams. Map competitor mentions, feature claims, pricing strategies, target audiences, and objections addressed in industry talks and reviews.
3. Content & Creator Analysis
Designed for marketing and content teams. Capture hooks, key quotes, content format, sponsored product mentions, and audience engagement cues from creator videos and branded content.
4. AI Pipeline / RAG Ingestion
For AI builders and data engineers. Produce vector-ready summaries, named entities, factual claims, topic labels, language codes, and sentiment.
5. Brand & E-Commerce Monitoring
Track brand mentions, promotional codes, creator recommendations, audience demographic cues, and purchase intent signals.
Online Videos vs. Uploaded Files
Extract from Online Videos
Use the /v1/extract/video endpoint to extract data directly from public video URLs (YouTube, TikTok, Instagram, X, etc.).
Note: The /v1/extract/video endpoint requires the video to already have a transcript. If the video doesn’t have native captions, call /v1/transcribe first to generate a transcript via speech-to-text.
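The note above implies a two-step flow for videos without captions. A minimal sketch; how you detect a missing transcript (an upfront check or an error from the extract endpoint) is up to your code:

```python
def plan_calls(has_transcript: bool) -> list:
    """Which endpoints to call, in order, for an online video."""
    calls = []
    if not has_transcript:
        calls.append("/v1/transcribe")   # generate a transcript via speech-to-text
    calls.append("/v1/extract/video")    # then run the structured extraction
    return calls
```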
Extract from Uploaded Files
The /v1/extract/file endpoint works identically to /v1/extract/video but takes a file_id instead of a URL. The file must be uploaded and transcribed first via the file upload endpoints.
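Since the two endpoints differ only in their input, a small helper can pick between them. The request-body field names are assumptions; the endpoint paths come from the docs.

```python
def choose_endpoint(schema: dict, video_url: str = "", file_id: str = ""):
    """Pick the extract endpoint and request body for one input."""
    if bool(video_url) == bool(file_id):
        raise ValueError("pass exactly one of video_url or file_id")
    if video_url:
        return "/v1/extract/video", {"video_url": video_url, "schema": schema}
    return "/v1/extract/file", {"file_id": file_id, "schema": schema}
```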
Schema Rules
To ensure high accuracy and strict adherence, your JSON schemas must follow these rules:
- Max 10 root fields
- Max 3 nesting levels (level 3 must be primitive types only)
- Max 10 subfields per Object
- Supported types: String, Number, Boolean, Integer, Array, Object, Enum
- Every field requires both type and description
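The rules above can be checked client-side before sending a request. A sketch, with one assumption: that nested Object fields are declared under a "fields" key.

```python
PRIMITIVES = {"String", "Number", "Boolean", "Integer", "Enum"}

def check_schema(fields: dict, level: int = 1) -> None:
    """Raise ValueError if the schema breaks the documented limits."""
    if level > 3:
        raise ValueError("max 3 nesting levels")
    if len(fields) > 10:
        raise ValueError("max 10 fields per object")
    for name, spec in fields.items():
        if "type" not in spec or "description" not in spec:
            raise ValueError(f"{name}: every field needs a type and a description")
        if level == 3 and spec["type"] not in PRIMITIVES:
            raise ValueError(f"{name}: level 3 must be a primitive type")
        if spec["type"] == "Object":
            check_schema(spec["fields"], level + 1)  # "fields" key is an assumption
```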
Prompt Caching & Performance
Every extraction schema you send is fingerprinted. The resulting hash is used to look up a previously compiled prompt plan.
- The first call with a new schema has ~2–3 s of compilation overhead.
- All subsequent calls with the same schema skip compilation entirely (effectively instant).
- Plans are cached for 2 hours (TTL) and are automatically recompiled when they expire.
Best Practices
- Write specific field descriptions: The better your descriptions, the more accurate the extraction. Instead of “topic”, write “Primary topic discussed in the video, in 5–10 words”.
- Use Enum types for classification fields instead of free-text Strings. Enums constrain the AI output to your predefined values, eliminating inconsistency.
- Start with a simple schema and add fields iteratively. Test with 2–3 fields first, verify accuracy, then expand.
- Use what_to_extract to guide the AI’s focus. This optional instruction steers the model toward specific parts of the transcript, improving relevance.
- Write descriptions in your target language. The output is returned in the same language as your schema descriptions (99+ languages supported).
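The Enum advice can look like this in practice. The "values" key is our assumption about how Enum options are declared; the point is that the allowed set is fixed up front.

```python
# A classification field as an Enum instead of a free-text String.
sentiment_field = {
    "sentiment": {
        "type": "Enum",
        "values": ["positive", "neutral", "negative"],  # assumed option syntax
        "description": "Overall sentiment of the speaker toward the product",
    }
}

ALLOWED = set(sentiment_field["sentiment"]["values"])

def is_valid(value: str) -> bool:
    """Enum output is constrained to the predefined values."""
    return value in ALLOWED
```

With a String field you might get "positive", "Positive", or "pretty upbeat" across runs; the Enum pins every response to one of three values.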
Pricing: Built for Scale
Each extraction counts as 1 video analysis. With VidNavigator:
- 1 credit = 100 video extractions/analyses
- Instant (0s) compilation on cached schemas.

