Extract structured data from online video

curl --request POST \
  --url https://api.vidnavigator.com/v1/extract/video \
  --header 'Content-Type: application/json' \
  --header 'X-API-Key: <api-key>' \
  --data '
{
  "video_url": "https://youtube.com/watch?v=dQw4w9WgXcQ",
  "schema": {
    "main_topics": {
      "type": "Array",
      "description": "List of main topics discussed",
      "items": {
        "type": "String",
        "description": "A topic"
      }
    },
    "sentiment": {
      "type": "Enum",
      "description": "Overall sentiment of the video",
      "enum": [
        "positive",
        "negative",
        "neutral"
      ]
    },
    "key_takeaway": {
      "type": "String",
      "description": "The single most important takeaway"
    }
  },
  "what_to_extract": "Extract the main topics and any product names mentioned",
  "transcribe": true,
  "include_usage": false
}
'

{
  "status": "success",
  "data": {},
  "video_info": {
    "title": "<string>",
    "description": "<string>",
    "thumbnail": "<string>",
    "url": "<string>",
    "channel": "<string>",
    "channel_url": "<string>",
    "duration": 123,
    "views": 123,
    "likes": 123,
    "published_date": "<string>",
    "keywords": [
      "<string>"
    ],
    "category": "<string>",
    "available_languages": [
      "<string>"
    ],
    "selected_language": "<string>",
    "carousel_info": {
      "total_items": 123,
      "video_count": 123,
      "image_count": 123,
      "selected_index": 123
    }
  },
  "file_info": {
    "id": "<string>",
    "name": "<string>",
    "size": 123,
    "type": "<string>",
    "duration": 123,
    "status": "pending",
    "created_at": "2023-11-07T05:31:56Z",
    "updated_at": "2023-11-07T05:31:56Z",
    "original_file_date": "2023-11-07T05:31:56Z",
    "has_transcript": true,
    "error_message": "<string>",
    "namespace_ids": [
      "<string>"
    ],
    "namespaces": [
      {
        "id": "<string>",
        "name": "<string>"
      }
    ]
  },
  "usage": {
    "prompt_tokens": 123,
    "completion_tokens": 123,
    "total_tokens": 123
  }
}

Extract Data from Video

Extract structured data from an online video’s transcript using a custom schema.

Provide a video_url and a JSON schema describing the fields to extract. Optionally include what_to_extract to guide the extraction.

Auto-transcription: For non-YouTube videos without an existing transcript (e.g. Instagram, TikTok, Facebook), the API automatically transcribes the video audio when transcribe is true (the default). This uses speech-to-text credits (video_uploads quota). YouTube videos rely on platform captions and cannot be auto-transcribed. Set transcribe=false to disable this behavior.

Schema format: Each field must have type and description. Supported types: String, Number, Boolean, Integer, Object, Array, Enum. Max 10 root fields, max 3 nesting levels.

Content-Type: Accepts application/json or YAML (application/x-yaml, text/yaml).

Token usage: Set include_usage=true to include prompt/completion token counts in the response.

Billing: Each extraction consumes at least 1 analysis credit. For longer transcripts, billing scales as ceil(total_tokens / 15000) credits. If auto-transcription is triggered, speech-to-text hours are also charged based on video duration. All charges are reverted if the request fails.

POST /extract/video

Extract structured data from an online video’s transcript using a custom schema you define.

Overview

The extraction endpoint lets you pull structured, typed data from any video transcript by providing a JSON schema describing the fields you need. This is ideal for building automated pipelines that need consistent, machine-readable output from video content.

How It Works

1. Provide a video_url and a schema defining the fields to extract.
2. VidNavigator fetches the platform transcript when available.
3. For supported non-YouTube sources, it can auto-transcribe the audio when no transcript exists.
4. VidNavigator runs AI extraction against your schema.
5. You receive structured JSON matching your schema definition, plus video_info.
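
The request step above can be sketched in Python using only the standard library. The URL, header names, and body fields are taken from the cURL samples on this page; the function names `build_extract_request` and `extract` are hypothetical helpers for illustration, not part of any official SDK.

```python
import json
import urllib.request

API_URL = "https://api.vidnavigator.com/v1/extract/video"


def build_extract_request(api_key, video_url, schema, what_to_extract=None,
                          transcribe=True, include_usage=False):
    """Build (but do not send) the POST request for the extraction endpoint."""
    body = {
        "video_url": video_url,
        "schema": schema,
        "transcribe": transcribe,
        "include_usage": include_usage,
    }
    # what_to_extract is optional guidance, so only send it when provided.
    if what_to_extract is not None:
        body["what_to_extract"] = what_to_extract
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


def extract(api_key, video_url, schema, **options):
    """Send the request and return the decoded JSON response (hypothetical helper)."""
    request = build_extract_request(api_key, video_url, schema, **options)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Separating request construction from sending keeps the payload easy to inspect or log before any credits are spent.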

Schema Rules

- Each field must have type and description
- Supported types: String, Number, Boolean, Integer, Object, Array, Enum
- Maximum 10 root-level fields
- Maximum 3 nesting levels
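
These rules can be checked client-side before sending a request. The sketch below is illustrative only: `validate_schema` is a hypothetical helper, and it assumes that a root field counts as nesting level 1 and that each step into `items` or `properties` adds one level. The API's own validation may count levels differently.

```python
SUPPORTED_TYPES = {"String", "Number", "Boolean", "Integer", "Object", "Array", "Enum"}
MAX_ROOT_FIELDS = 10
MAX_NESTING_LEVELS = 3


def _validate_field(name, field, level, errors):
    """Check one field definition, then recurse into its children."""
    if level > MAX_NESTING_LEVELS:
        errors.append(f"{name}: exceeds {MAX_NESTING_LEVELS} nesting levels")
        return
    if not isinstance(field, dict):
        errors.append(f"{name}: field definition must be an object")
        return
    for key in ("type", "description"):
        if key not in field:
            errors.append(f"{name}: missing '{key}'")
    field_type = field.get("type")
    if field_type is not None and field_type not in SUPPORTED_TYPES:
        errors.append(f"{name}: unsupported type {field_type!r}")
    # Child keys ("items", "properties") follow the examples on this page.
    if "items" in field:
        _validate_field(f"{name}[]", field["items"], level + 1, errors)
    properties = field.get("properties")
    if isinstance(properties, dict):
        for child_name, child in properties.items():
            _validate_field(f"{name}.{child_name}", child, level + 1, errors)


def validate_schema(schema):
    """Return a list of rule violations; an empty list means the schema passed."""
    errors = []
    if len(schema) > MAX_ROOT_FIELDS:
        errors.append(f"schema has {len(schema)} root fields (max {MAX_ROOT_FIELDS})")
    for name, field in schema.items():
        _validate_field(name, field, 1, errors)
    return errors
```

Returning a list of messages rather than raising on the first problem lets a caller report every violation at once.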

You can also send the request body as YAML by setting Content-Type to application/x-yaml or text/yaml.

Automatic Transcription

The transcribe parameter controls whether VidNavigator should automatically fall back to speech-to-text when a transcript is not available.

- transcribe defaults to true.
- The fallback applies to non-YouTube videos only.
- It is useful for platforms such as Instagram, TikTok, Facebook, X, and similar sources.
- YouTube extraction relies on platform captions and does not support speech-to-text fallback through this endpoint.

If you set transcribe=false, the request will fail when no transcript is available instead of triggering speech-to-text processing.
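
A client can estimate up front whether a request might trigger speech-to-text charges. The sketch below is a rough approximation: the `YOUTUBE_HOSTS` list and both function names are assumptions made for illustration, and the API performs its own platform detection, which may differ.

```python
from urllib.parse import urlparse

# Assumed set of YouTube hostnames; the API's own detection may differ.
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def is_youtube(video_url):
    """Return True when the URL's host looks like a YouTube domain."""
    host = (urlparse(video_url).hostname or "").lower()
    return host in YOUTUBE_HOSTS


def may_use_speech_to_text(video_url, transcribe=True):
    """Speech-to-text fallback is only possible for non-YouTube URLs with transcribe on."""
    return transcribe and not is_youtube(video_url)
```

A True result only means the fallback is possible; it is triggered only when no platform transcript exists.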

Example Usage

Basic Extraction

curl -X POST "https://api.vidnavigator.com/v1/extract/video" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "video_url": "https://youtube.com/watch?v=dQw4w9WgXcQ",
    "schema": {
      "main_topics": {
        "type": "Array",
        "description": "List of main topics discussed",
        "items": { "type": "String", "description": "A topic" }
      },
      "sentiment": {
        "type": "Enum",
        "description": "Overall sentiment of the video",
        "enum": ["positive", "negative", "neutral"]
      },
      "key_takeaway": {
        "type": "String",
        "description": "The single most important takeaway"
      }
    }
  }'

With Extraction Guidance

Use what_to_extract to provide additional context to the AI about what to focus on:

curl -X POST "https://api.vidnavigator.com/v1/extract/video" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "video_url": "https://youtube.com/watch?v=dQw4w9WgXcQ",
    "what_to_extract": "Focus on product names and pricing mentioned in the video",
    "schema": {
      "products": {
        "type": "Array",
        "description": "Products mentioned in the video",
        "items": {
          "type": "Object",
          "description": "A product",
          "properties": {
            "name": { "type": "String", "description": "Product name" },
            "price": { "type": "String", "description": "Price if mentioned" }
          }
        }
      }
    },
    "include_usage": true
  }'

Disable Auto-Transcription

Use transcribe=false when you want extraction to run only if a platform transcript already exists:

curl -X POST "https://api.vidnavigator.com/v1/extract/video" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "video_url": "https://www.instagram.com/reel/example/",
    "transcribe": false,
    "schema": {
      "main_claim": {
        "type": "String",
        "description": "Main claim made in the video"
      }
    }
  }'

Response Example

{
  "status": "success",
  "data": {
    "main_topics": ["machine learning", "neural networks", "data preprocessing"],
    "sentiment": "positive",
    "key_takeaway": "Start with clean data before choosing a model architecture"
  },
  "video_info": {
    "title": "Machine Learning Fundamentals",
    "description": "An introduction to practical machine learning workflows.",
    "thumbnail": "https://example.com/thumbnail.jpg",
    "url": "https://youtube.com/watch?v=example123",
    "channel": "AI Academy",
    "duration": 4520.0,
    "views": 152340,
    "likes": 8100,
    "published_date": "2026-03-05",
    "keywords": ["machine learning", "ai", "data science"],
    "category": "Education"
  },
  "usage": {
    "prompt_tokens": 2150,
    "completion_tokens": 85,
    "total_tokens": 2235
  }
}

The usage field is only included when include_usage=true in the request. The video_info object carries the video's metadata and is present only for /extract/video requests.
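
Because usage is optional, client code should not assume it is present. The following sketch shows one way to read a decoded response defensively; `summarize_response` is a hypothetical helper, and the shape of error responses is not documented on this page, so the status check is an assumption.

```python
def summarize_response(response):
    """Pull the extracted data, video title, and token count out of a response dict."""
    # Assumption: anything other than status == "success" is treated as a failure.
    if response.get("status") != "success":
        raise ValueError(f"extraction failed: {response!r}")
    usage = response.get("usage")  # only present when include_usage=true
    return {
        "data": response.get("data", {}),
        "title": response.get("video_info", {}).get("title"),
        "total_tokens": usage["total_tokens"] if usage else None,
    }
```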

Billing

- AI extraction consumes analysis_request units in blocks of 15,000 total tokens.
- Formula: ceil(total_tokens / 15000)
- Examples:
  - 14,000 tokens -> 1 analysis_request unit
  - 17,000 tokens -> 2 analysis_request units
  - 31,000 tokens -> 3 analysis_request units
- If transcribe=true triggers speech-to-text fallback, speech-to-text is charged separately as transcription_hour usage based on video/audio duration.
- Failed requests are not charged.
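
The token-based formula can be reproduced in a few lines of Python, for example to estimate cost from the usage.total_tokens value of a previous response. The function name `analysis_units` is hypothetical; the minimum of 1 unit reflects the statement that each extraction consumes at least 1 analysis credit. Speech-to-text charges are separate and not modeled here.

```python
import math

TOKENS_PER_UNIT = 15_000


def analysis_units(total_tokens):
    """Return ceil(total_tokens / 15000), with a floor of 1 unit per extraction."""
    return max(1, math.ceil(total_tokens / TOKENS_PER_UNIT))


for tokens in (14_000, 17_000, 31_000):
    print(tokens, "->", analysis_units(tokens))
```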

Use Cases

Data Pipelines

Build automated pipelines that extract consistent structured data from video content at scale

Content Cataloging

Automatically tag and categorize videos with custom taxonomies

Market Research

Extract product mentions, sentiment, and competitive insights from video reviews

Compliance Monitoring

Pull specific compliance-relevant data points from training or policy videos

Authorizations

X-API-Key (string, header, required)

API key authentication. Include your VidNavigator API key in the X-API-Key header.

Body

application/json

video_url (string<uri>, required)

URL of the video to extract data from.

Example:

"https://youtube.com/watch?v=dQw4w9WgXcQ"

schema (object, required)

Custom extraction schema defining the fields to extract. Max 10 root-level fields, max 3 nesting levels. Each field must have type and description.

Example:

{
  "main_topics": {
    "type": "Array",
    "description": "List of main topics discussed",
    "items": {
      "type": "String",
      "description": "A topic"
    }
  },
  "sentiment": {
    "type": "Enum",
    "description": "Overall sentiment of the video",
    "enum": ["positive", "negative", "neutral"]
  },
  "key_takeaway": {
    "type": "String",
    "description": "The single most important takeaway"
  }
}

what_to_extract (string)

Optional guidance for what to extract from the transcript.

Example:

"Extract the main topics and any product names mentioned"

transcribe (boolean, default: true)

When true, automatically transcribes the video audio if no platform transcript is available. Applies to non-YouTube videos only (Instagram, TikTok, Facebook, X, etc.). Uses speech-to-text credits based on video duration.

include_usage (boolean, default: false)

When true, includes token usage statistics in the response.

Response

Data extracted successfully

status (enum<string>)

Available options: success

data (object)

Extracted data matching the provided schema. The shape of this object mirrors the input schema fields.

video_info (object)

Video metadata (title, channel, duration, views, etc.). Only present for /extract/video requests.

file_info (object)

File metadata (name, size, type, duration, etc.). Only present for /extract/file requests.

usage (object)

Token usage statistics. Only present when include_usage=true.
