yt-innertube

Fast, cookieless YouTube audio and stream extraction. It resolves a direct, signed stream URL in about 200ms (versus 46 to 82s for yt-dlp --get-url) by talking to YouTube's innertube player endpoint as the ANDROID_VR client. No cookies, no paid API, no signature cipher to decode. Runs on Deno and Node.

Built at SocialGravity to clone voices from a person's public talks. We pivoted and no longer needed the YouTube path, so the tech is open here under MIT. It ran in production; it is not a toy.

[Demo: resolving a YouTube audio stream via the ANDROID_VR client, then downloading an 8-second clip, end to end in about half a second]

Why

The usual way to get a YouTube audio stream URL is yt-dlp --get-url, which probes several player clients through a spawn, parse, and decipher chain. When one client stalls on YouTube throttling, the whole subprocess blocks, and end to end it routinely takes 46 to 82 seconds per video.

The ANDROID_VR (Oculus) client is special: YouTube serves it direct, signed stream URLs with no signature cipher to decode. So a single HTTP POST to /youtubei/v1/player returns a URL ffmpeg can read immediately. That is the core trick here. yt-dlp is still wired in as a fallback so the path stays reliable if YouTube changes the contract. Using stream copy (-c:a copy) instead of re-encoding cuts clip extraction from roughly 97s to about 5s.
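
To make the single-POST idea concrete, here is a minimal sketch of what such a request could look like. It is not this package's source: the host, the client version string, the user agent, and the helper names buildPlayerRequest and readAdaptiveFormats are all assumptions for illustration. The values the package really sends live in src/innertube.ts.

```typescript
// Sketch only. Host, client version, and user agent are assumptions for
// illustration; they are not copied from this package's src/innertube.ts.
const ASSUMED_HOST = "https://www.youtube.com";
const ASSUMED_CLIENT_VERSION = "1.60.19"; // placeholder, rotates over time
const ASSUMED_USER_AGENT =
  `com.google.android.apps.youtube.vr.oculus/${ASSUMED_CLIENT_VERSION}`;

interface PlayerRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: describes the one POST to /youtubei/v1/player.
function buildPlayerRequest(videoId: string): PlayerRequest {
  const payload = {
    videoId,
    context: {
      client: {
        clientName: "ANDROID_VR",
        clientVersion: ASSUMED_CLIENT_VERSION,
      },
    },
  };
  return {
    url: `${ASSUMED_HOST}/youtubei/v1/player`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "User-Agent": ASSUMED_USER_AGENT,
    },
    body: JSON.stringify(payload),
  };
}

// Hypothetical helper: pulls streamingData.adaptiveFormats out of a parsed
// player response, returning an empty list when the field is missing.
function readAdaptiveFormats(playerResponse: unknown): unknown[] {
  const data = playerResponse as
    | { streamingData?: { adaptiveFormats?: unknown } }
    | null
    | undefined;
  const formats = data?.streamingData?.adaptiveFormats;
  return Array.isArray(formats) ? formats : [];
}
```

The returned object maps directly onto a fetch call: fetch(req.url, { method: req.method, headers: req.headers, body: req.body }). Keeping the builder free of I/O makes it easy to inspect what would be sent before touching the network.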

Install

Deno (import directly, no install):

import { downloadYouTubeAudioReliable, getStreamUrlViaInnertube }
  from "https://raw.githubusercontent.com/AlvaroBalbin/yt-innertube/main/mod.ts";

Node / Bun:

npm install yt-innertube

import { downloadYouTubeAudioReliable, getStreamUrlViaInnertube } from "yt-innertube";

Requirements

  • ffmpeg on PATH (for clip extraction)
  • yt-dlp on PATH (used only as a fallback)
  • A runtime: Deno 2.x, or Node 18+ / Bun

Quick start

Get the direct audio stream URL with one HTTP call, no subprocess:

const url = await getStreamUrlViaInnertube("dQw4w9WgXcQ", "audio");

Download a time window to a temp file (innertube fast path, yt-dlp fallback, ffmpeg stream-copy, MP3 re-encode fallback):

const res = await downloadYouTubeAudioReliable("https://youtu.be/dQw4w9WgXcQ", 60, 90);
if (res.path) console.log("audio at", res.path);
else console.log("failed:", res.errorKind); // bot_detection | geo_block | unavailable | ...
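
For a feel of what the stream-copy step amounts to, the sketch below builds an ffmpeg argument list for one time window. The helper name buildStreamCopyArgs and the exact flag order are assumptions, not the package's real invocation; the flags themselves (-ss, -i, -t, -vn, -c:a copy) are standard ffmpeg options.

```typescript
// Hypothetical helper, not part of yt-innertube's API: builds ffmpeg
// arguments that copy the audio of [startSec, endSec) without re-encoding.
function buildStreamCopyArgs(
  streamUrl: string,
  startSec: number,
  endSec: number,
  outPath: string,
): string[] {
  if (!(endSec > startSec)) {
    throw new RangeError("endSec must be greater than startSec");
  }
  return [
    "-hide_banner",
    "-y",                             // overwrite the output file if present
    "-ss", String(startSec),          // placed before -i: seek on the input
    "-i", streamUrl,
    "-t", String(endSec - startSec),  // duration of the window
    "-vn",                            // drop any video stream
    "-c:a", "copy",                   // stream copy, no re-encode
    outPath,
  ];
}
```

With the window from the example above, buildStreamCopyArgs(url, 60, 90, "clip.mka") yields a 30 second copy. A Matroska container such as .mka is a convenient target because it accepts whichever audio codec the stream happens to carry.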

CLI demo (Deno):

deno run --allow-net --allow-run --allow-read --allow-write --allow-env \
  examples/cli.ts "https://www.youtube.com/watch?v=dQw4w9WgXcQ" --start 60 --dur 30 --out clip.mka

# or just print the resolved stream URL, no download:
deno run --allow-net examples/cli.ts dQw4w9WgXcQ --url-only

Environment variables

All optional. The innertube fast path needs none of them.

  • YT_PROXY (or WEBSHARE_PROXY_URL): an HTTP proxy URL, e.g. http://user:pass@host:port. Passed to yt-dlp as --proxy. A rotating residential proxy is the most reliable way past the "Sign in to confirm you're not a bot" wall when running from a datacenter IP.
  • YT_DISABLE_PROXY=true: force-disable the proxy even when the URL is set.
  • YOUTUBE_COOKIES_TXT: Netscape-format cookies from a logged-in browser session, used as a yt-dlp fallback for bot detection. Optional, and a treadmill (cookies expire in a few weeks); prefer a proxy.
  • YTDLP_BIN / YTDLP_PREARGS: override how yt-dlp is invoked. Defaults to yt-dlp on Windows and python3 -m yt_dlp elsewhere.
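
The rules in this list can be summarised as a small pure function. This is a sketch, not the package's code: the helper names are hypothetical, and two details are assumptions the list does not settle, namely that YT_PROXY wins over WEBSHARE_PROXY_URL when both are set and that only the exact string "true" disables the proxy.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical helper: turns the proxy variables into yt-dlp arguments.
// Assumes YT_PROXY takes precedence over WEBSHARE_PROXY_URL.
function resolveProxyArgs(env: Env): string[] {
  if (env.YT_DISABLE_PROXY === "true") return []; // force-disable wins
  const proxy = env.YT_PROXY || env.WEBSHARE_PROXY_URL; // "" counts as unset
  return proxy ? ["--proxy", proxy] : [];
}

// Hypothetical helper: the default yt-dlp invocation described above,
// before any YTDLP_BIN / YTDLP_PREARGS override is applied.
function defaultYtDlpCommand(isWindows: boolean): string[] {
  return isWindows ? ["yt-dlp"] : ["python3", "-m", "yt_dlp"];
}
```

Passing the environment in as a plain record, rather than reading process.env or Deno.env directly, keeps the sketch runtime-neutral across Deno, Node, and Bun.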

API

  • getStreamUrlViaInnertube(videoId, "audio" | "video") resolves to a direct signed URL, or null on failure. One HTTP call, never throws.
  • downloadYouTubeAudioReliable(url, startSec?, endSec?, preStreamUrl?, opts?) returns { path, videoId, errorKind?, keyframesDir? }.
  • downloadMultipleClips(clips, opts?) downloads many windows in parallel.
  • downloadPodcastAudio(mp3Url, startSec?, endSec?) windows a direct podcast MP3.
  • extractKeyframeAtTimestamp(idOrUrl, timestampSec, outPath, signal?) pulls a single JPEG keyframe at an absolute timestamp via fast-seek.
  • extractVideoId(urlOrId) returns the 11-char id, or null.
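
As an illustration of the last entry, the sketch below shows one way to pull an 11-character id out of common URL shapes. It is a hypothetical reimplementation, named extractVideoIdSketch to keep it distinct from the package's own extractVideoId, and the set of URL forms it accepts is an assumption.

```typescript
const VIDEO_ID = /^[A-Za-z0-9_-]{11}$/;

// Matches the id after youtu.be/, after a v= query parameter, or after a
// /shorts/, /embed/ or /live/ path segment. The lookahead rejects longer
// tokens that merely start with 11 id-like characters.
const ID_IN_URL =
  /(?:youtu\.be\/|[?&]v=|\/(?:shorts|embed|live)\/)([A-Za-z0-9_-]{11})(?![A-Za-z0-9_-])/;

// Hypothetical sketch: returns the 11-char id, or null when none is found.
// It does not verify that the host is actually a YouTube domain.
function extractVideoIdSketch(urlOrId: string): string | null {
  const input = urlOrId.trim();
  if (VIDEO_ID.test(input)) return input; // already a bare id
  const match = ID_IN_URL.exec(input);
  return match?.[1] ?? null;
}
```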

How it works

  1. getStreamUrlViaInnertube POSTs the ANDROID_VR client context to /youtubei/v1/player and reads streamingData.adaptiveFormats, picking the best itag with a plain url (no signatureCipher).
  2. If innertube returns null, the downloader falls back to yt-dlp --get-url, first anonymously (the anonymous manifest tends to expose clean audio-only formats), then with cookies only if the anonymous attempt hit bot detection.
  3. ffmpeg reads the signed URL directly and copies the requested window without re-encoding. Stream URLs are cached per video id for 20s.
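
Step 1's format selection can be sketched as a filter and sort over adaptiveFormats. This is an illustration under assumptions, not the package's code: the helper name pickPlainUrl is hypothetical, and "best" is taken here to mean highest bitrate, which the steps above do not specify.

```typescript
// Subset of the fields found on streamingData.adaptiveFormats entries.
interface AdaptiveFormat {
  itag: number;
  mimeType: string;          // e.g. 'audio/webm; codecs="opus"'
  bitrate?: number;
  url?: string;              // present when the URL is directly usable
  signatureCipher?: string;  // present when the URL needs deciphering
}

// Hypothetical helper: returns the URL of the highest-bitrate format of the
// requested kind that has a plain url and no signatureCipher, or null.
function pickPlainUrl(
  formats: AdaptiveFormat[],
  kind: "audio" | "video",
): string | null {
  const candidates = formats.filter(
    (f) =>
      f.mimeType.startsWith(`${kind}/`) &&
      typeof f.url === "string" &&
      !f.signatureCipher,
  );
  // Sort a copy-free filtered array in place: highest bitrate first.
  candidates.sort((a, b) => (b.bitrate ?? 0) - (a.bitrate ?? 0));
  return candidates[0]?.url ?? null;
}
```

Returning null rather than throwing mirrors the "never throws" contract described for getStreamUrlViaInnertube, so a caller can move straight on to the yt-dlp fallback in step 2.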

Notes

  • Stream URLs are short-lived (CDN signed, ~30 to 60s). Resolve, then use promptly.
  • The ANDROID_VR client version in src/innertube.ts occasionally needs a bump when YouTube rotates. It is backward compatible, so updating is safe.
  • This talks to YouTube directly. Respect YouTube's Terms of Service and the rights of content owners; you are responsible for how you use it.
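
On the first note: the 20s per-video cache fits comfortably inside the roughly 30 to 60s lifetime of a signed URL. A minimal sketch of such a cache follows; the TtlCache class and its injectable clock are assumptions for illustration, not the package's implementation.

```typescript
// Hypothetical sketch of a per-key cache whose entries expire after ttlMs.
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // The clock is injectable so expiry can be exercised without waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  // Returns the cached value, or undefined once the entry has expired.
  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // evict lazily on read
      return undefined;
    }
    return entry.value;
  }
}
```

A cache keyed by video id with ttlMs set to 20000 would let several clips from the same video share one resolved URL while still discarding it well before the CDN signature lapses.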

License

MIT. See LICENSE.
