POST /extract/video
Extract structured data from online video
curl --request POST \
  --url https://api.vidnavigator.com/v1/extract/video \
  --header 'Content-Type: application/json' \
  --header 'X-API-Key: <api-key>' \
  --data '
{
  "video_url": "https://youtube.com/watch?v=dQw4w9WgXcQ",
  "schema": {
    "main_topics": {
      "type": "Array",
      "description": "List of main topics discussed",
      "items": {
        "type": "String",
        "description": "A topic"
      }
    },
    "sentiment": {
      "type": "Enum",
      "description": "Overall sentiment of the video",
      "enum": [
        "positive",
        "negative",
        "neutral"
      ]
    },
    "key_takeaway": {
      "type": "String",
      "description": "The single most important takeaway"
    }
  },
  "what_to_extract": "Extract the main topics and any product names mentioned",
  "transcribe": true,
  "include_usage": false
}
'
{
  "status": "success",
  "data": {},
  "video_info": {
    "title": "<string>",
    "description": "<string>",
    "thumbnail": "<string>",
    "url": "<string>",
    "channel": "<string>",
    "channel_url": "<string>",
    "duration": 123,
    "views": 123,
    "likes": 123,
    "published_date": "<string>",
    "keywords": [
      "<string>"
    ],
    "category": "<string>",
    "available_languages": [
      "<string>"
    ],
    "selected_language": "<string>",
    "carousel_info": {
      "total_items": 123,
      "video_count": 123,
      "image_count": 123,
      "selected_index": 123
    }
  },
  "file_info": {
    "id": "<string>",
    "name": "<string>",
    "size": 123,
    "type": "<string>",
    "duration": 123,
    "status": "pending",
    "created_at": "2023-11-07T05:31:56Z",
    "updated_at": "2023-11-07T05:31:56Z",
    "original_file_date": "2023-11-07T05:31:56Z",
    "has_transcript": true,
    "error_message": "<string>",
    "namespace_ids": [
      "<string>"
    ],
    "namespaces": [
      {
        "id": "<string>",
        "name": "<string>"
      }
    ]
  },
  "usage": {
    "prompt_tokens": 123,
    "completion_tokens": 123,
    "total_tokens": 123
  }
}

Extract structured data from an online video’s transcript using a custom schema you define.

Overview

The extraction endpoint lets you pull structured, typed data from any video transcript by providing a JSON schema describing the fields you need. This is ideal for building automated pipelines that need consistent, machine-readable output from video content.

How It Works

  1. Provide a video_url and a schema defining the fields to extract
  2. VidNavigator fetches the platform transcript when available
  3. For supported non-YouTube sources, it can auto-transcribe audio when no transcript exists
  4. VidNavigator runs AI extraction against your schema
  5. You receive structured JSON matching your schema definition, plus video_info
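
From the client side, the steps above can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the helper name build_extract_request and its signature are hypothetical, while the endpoint URL, header names, and body fields are taken from the cURL examples on this page. The helper only builds the request and does not send it.

```python
import json
import urllib.request

API_URL = "https://api.vidnavigator.com/v1/extract/video"


def build_extract_request(api_key, video_url, schema, what_to_extract=None,
                          transcribe=True, include_usage=False):
    """Build (but do not send) a POST request for the extraction endpoint.

    Hypothetical helper: the name and signature are illustrative only.
    """
    body = {
        "video_url": video_url,
        "schema": schema,
        "transcribe": transcribe,
        "include_usage": include_usage,
    }
    # what_to_extract is optional, so include it only when provided.
    if what_to_extract is not None:
        body["what_to_extract"] = what_to_extract

    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-API-Key": api_key,
        },
        method="POST",
    )


request = build_extract_request(
    api_key="YOUR_API_KEY",
    video_url="https://youtube.com/watch?v=dQw4w9WgXcQ",
    schema={
        "key_takeaway": {
            "type": "String",
            "description": "The single most important takeaway",
        }
    },
)
```

To send the request, pass it to urllib.request.urlopen and decode the JSON body, for example with json.load(urllib.request.urlopen(request)).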

Schema Rules

  • Each field must have type and description
  • Supported types: String, Number, Boolean, Integer, Object, Array, Enum
  • Maximum 10 root-level fields
  • Maximum 3 nesting levels
You can also send the request body as YAML by setting Content-Type to application/x-yaml or text/yaml.
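
Checking a schema against the rules above before sending it can save a failed request. The following is a rough client-side sketch, not the API's own validation: the function names are hypothetical, and the way nesting levels are counted (a root-level field is level 1, and each step into items or properties adds one level) is an assumption that this page does not confirm.

```python
SUPPORTED_TYPES = {"String", "Number", "Boolean", "Integer", "Object", "Array", "Enum"}
MAX_ROOT_FIELDS = 10
MAX_NESTING_LEVELS = 3


def validate_schema(schema):
    """Return a list of rule violations; an empty list means no violations were found."""
    errors = []
    if len(schema) > MAX_ROOT_FIELDS:
        errors.append(
            f"{len(schema)} root-level fields exceeds the maximum of {MAX_ROOT_FIELDS}"
        )
    for name, field in schema.items():
        _check_field(name, field, 1, errors)
    return errors


def _check_field(path, field, level, errors):
    # Assumption: root-level fields are level 1; "items" and "properties"
    # each add one level. The server may count nesting differently.
    if level > MAX_NESTING_LEVELS:
        errors.append(f"{path}: deeper than {MAX_NESTING_LEVELS} nesting levels")
        return
    if not isinstance(field, dict):
        errors.append(f"{path}: field definition must be an object")
        return
    for key in ("type", "description"):
        if key not in field:
            errors.append(f"{path}: missing '{key}'")
    field_type = field.get("type")
    if field_type is not None and field_type not in SUPPORTED_TYPES:
        errors.append(f"{path}: unsupported type '{field_type}'")
    # Recurse into array items and object properties.
    if "items" in field:
        _check_field(f"{path}[]", field["items"], level + 1, errors)
    for child_name, child in field.get("properties", {}).items():
        _check_field(f"{path}.{child_name}", child, level + 1, errors)
```

Under this counting, an Array of Object items with scalar properties uses exactly three levels, so it passes the check.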

Automatic Transcription

The transcribe parameter controls whether VidNavigator should automatically fall back to speech-to-text when a transcript is not available.
  • Default: transcribe=true
  • Applies to non-YouTube videos only
  • Useful for platforms like Instagram, TikTok, Facebook, X, and similar sources
  • YouTube extraction relies on platform captions and does not support speech-to-text fallback through this endpoint
If you set transcribe=false, the request will fail when no transcript is available instead of triggering speech-to-text processing.
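
The fallback rules above can be expressed as a small client-side check, for instance to estimate in advance whether a request might incur speech-to-text usage. This is only an illustrative heuristic: the function name is hypothetical, and the hostname list is an assumption about how YouTube links look, not the API's own detection logic.

```python
from urllib.parse import urlparse

# Assumption: these hostnames identify YouTube links; the API may detect
# YouTube sources differently.
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def speech_to_text_possible(video_url, transcribe=True):
    """Return True if a speech-to-text fallback could be triggered for this request."""
    host = (urlparse(video_url).hostname or "").lower()
    if host in YOUTUBE_HOSTS:
        # YouTube extraction relies on platform captions only.
        return False
    # For other sources, the fallback is allowed only when transcribe is true.
    return bool(transcribe)
```

A True result means only that the fallback is permitted; it is triggered solely when no platform transcript is available.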

Example Usage

Basic Extraction

curl -X POST "https://api.vidnavigator.com/v1/extract/video" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "video_url": "https://youtube.com/watch?v=dQw4w9WgXcQ",
    "schema": {
      "main_topics": {
        "type": "Array",
        "description": "List of main topics discussed",
        "items": { "type": "String", "description": "A topic" }
      },
      "sentiment": {
        "type": "Enum",
        "description": "Overall sentiment of the video",
        "enum": ["positive", "negative", "neutral"]
      },
      "key_takeaway": {
        "type": "String",
        "description": "The single most important takeaway"
      }
    }
  }'

With Extraction Guidance

Use what_to_extract to provide additional context to the AI about what to focus on:
curl -X POST "https://api.vidnavigator.com/v1/extract/video" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "video_url": "https://youtube.com/watch?v=dQw4w9WgXcQ",
    "what_to_extract": "Focus on product names and pricing mentioned in the video",
    "schema": {
      "products": {
        "type": "Array",
        "description": "Products mentioned in the video",
        "items": {
          "type": "Object",
          "description": "A product",
          "properties": {
            "name": { "type": "String", "description": "Product name" },
            "price": { "type": "String", "description": "Price if mentioned" }
          }
        }
      }
    },
    "include_usage": true
  }'

Disable Auto-Transcription

Use transcribe=false when you want extraction to run only if a platform transcript already exists:
curl -X POST "https://api.vidnavigator.com/v1/extract/video" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "video_url": "https://www.instagram.com/reel/example/",
    "transcribe": false,
    "schema": {
      "main_claim": {
        "type": "String",
        "description": "Main claim made in the video"
      }
    }
  }'

Response Example

{
  "status": "success",
  "data": {
    "main_topics": ["machine learning", "neural networks", "data preprocessing"],
    "sentiment": "positive",
    "key_takeaway": "Start with clean data before choosing a model architecture"
  },
  "video_info": {
    "title": "Machine Learning Fundamentals",
    "description": "An introduction to practical machine learning workflows.",
    "thumbnail": "https://example.com/thumbnail.jpg",
    "url": "https://youtube.com/watch?v=example123",
    "channel": "AI Academy",
    "duration": 4520.0,
    "views": 152340,
    "likes": 8100,
    "published_date": "2026-03-05",
    "keywords": ["machine learning", "ai", "data science"],
    "category": "Education"
  },
  "usage": {
    "prompt_tokens": 2150,
    "completion_tokens": 85,
    "total_tokens": 2235
  }
}
The usage field is included only when the request sets include_usage=true. video_info contains the video metadata and is returned for /extract/video requests.
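
Because usage is optional, code that reads the response should not assume it is present. The sketch below shows one way to handle a decoded response. The function name is hypothetical, and raising an error on any status other than "success" is an assumption, since this page does not document the shape of error responses.

```python
def summarize_response(response):
    """Pull the commonly used parts out of a decoded extraction response."""
    status = response.get("status")
    if status != "success":
        # Assumption: anything other than "success" is treated as a failure.
        raise ValueError(f"unexpected status: {status!r}")

    usage = response.get("usage")  # absent unless include_usage=true was sent
    return {
        "data": response.get("data", {}),
        "title": response.get("video_info", {}).get("title"),
        "total_tokens": usage["total_tokens"] if usage else None,
    }


example = {
    "status": "success",
    "data": {"sentiment": "positive"},
    "video_info": {"title": "Machine Learning Fundamentals"},
}
summary = summarize_response(example)
```

Here summary["total_tokens"] is None because the example response carries no usage object.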

Billing

  • AI extraction consumes analysis_request units in blocks of 15,000 total tokens
  • Formula: ceil(total_tokens / 15000)
  • Examples:
    • 14,000 tokens -> 1 analysis_request unit
    • 17,000 tokens -> 2 analysis_request units
    • 31,000 tokens -> 3 analysis_request units
  • If transcribe=true triggers speech-to-text fallback, speech-to-text is charged separately as transcription_hour usage based on video/audio duration
  • Failed requests are not charged
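
The billing formula above is easy to reproduce when estimating cost from the usage object. A minimal sketch follows; the function and constant names are hypothetical, and it covers only analysis_request units, not the separate transcription_hour charge.

```python
import math

TOKENS_PER_UNIT = 15_000  # block size stated in the billing rules


def analysis_request_units(total_tokens):
    """Apply ceil(total_tokens / 15000) to get billed analysis_request units."""
    return math.ceil(total_tokens / TOKENS_PER_UNIT)


# The three examples listed in the billing rules.
for tokens in (14_000, 17_000, 31_000):
    print(f"{tokens} tokens -> {analysis_request_units(tokens)} unit(s)")
```

Passing usage["total_tokens"] from a response gives the units for that request. For the response example on this page (2,235 total tokens), the result is 1.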

Use Cases

Data Pipelines

Build automated pipelines that extract consistent structured data from video content at scale

Content Cataloging

Automatically tag and categorize videos with custom taxonomies

Market Research

Extract product mentions, sentiment, and competitive insights from video reviews

Compliance Monitoring

Pull specific compliance-relevant data points from training or policy videos

Authorizations

X-API-Key (string, header, required)

API key authentication. Include your VidNavigator API key in the X-API-Key header.

Body

application/json
video_url (string<uri>, required)

URL of the video to extract data from

Example:

"https://youtube.com/watch?v=dQw4w9WgXcQ"

schema (object, required)

Custom extraction schema defining the fields to extract. Max 10 root-level fields, max 3 nesting levels. Each field must have type and description.

Example:
{
  "main_topics": {
    "type": "Array",
    "description": "List of main topics discussed",
    "items": {
      "type": "String",
      "description": "A topic"
    }
  },
  "sentiment": {
    "type": "Enum",
    "description": "Overall sentiment of the video",
    "enum": ["positive", "negative", "neutral"]
  },
  "key_takeaway": {
    "type": "String",
    "description": "The single most important takeaway"
  }
}

what_to_extract (string)

Optional guidance for what to extract from the transcript

Example:

"Extract the main topics and any product names mentioned"

transcribe (boolean, default: true)

When true, automatically transcribes the video audio if no platform transcript is available. Applies to non-YouTube videos only (Instagram, TikTok, Facebook, X, etc.). Uses speech-to-text credits based on video duration.

include_usage (boolean, default: false)

When true, includes token usage statistics in the response

Response

Data extracted successfully

status (enum<string>)
Available options: success

data (object)

Extracted data matching the provided schema. The shape of this object mirrors the input schema fields.

video_info (object)

Video metadata (title, channel, duration, views, etc.). Only present for /extract/video requests.

file_info (object)

File metadata (name, size, type, duration, etc.). Only present for /extract/file requests.

usage (object)

Token usage statistics. Only present when include_usage=true.