@FHRNet
Created September 22, 2024 18:07

Accessing deleted Twitch VODs

The method works by computing the URL of the stream's m3u8 playlist, which may (but does not have to) still exist after the VOD has been deleted from the website. To do that, we need the streamer name, the stream ID and the start time of the stream, which sounds difficult when the VOD is no longer accessible through Twitch's website. Luckily, some stream metadata can still be found using tools like TwitchTracker.

Extracting data from TwitchTracker

https://twitchtracker.com/zackrawrr/streams/44841720763
-> Streamer name: zackrawrr                         - extract from the URL
-> Stream ID: 44841720763                           - extract from the URL
-> Approximate time started: 2024-09-21 17:14 UTC   - found on the website
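
The two values taken from the URL can also be extracted programmatically. A minimal sketch, assuming the TwitchTracker URL always has the shape /{streamer_name}/streams/{stream_id} seen in the example above; parse_twitchtracker_url is a hypothetical helper name, not part of any library:

```python
from urllib.parse import urlparse


def parse_twitchtracker_url(url: str) -> tuple[str, int]:
    """Return (streamer_name, stream_id) from a TwitchTracker stream URL."""
    # The path is assumed to look like /zackrawrr/streams/44841720763
    parts = urlparse(url).path.strip('/').split('/')
    if len(parts) != 3 or parts[1] != 'streams' or not parts[2].isdigit():
        raise ValueError(f'Unexpected TwitchTracker URL: {url}')
    return parts[0], int(parts[2])


print(parse_twitchtracker_url('https://twitchtracker.com/zackrawrr/streams/44841720763'))
```

The start time is not part of the URL and still has to be read off the page.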

Keep in mind that TwitchTracker displays times in your local timezone, while the method requires UTC. Convert the time manually according to your timezone.
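
The conversion can be done in Python as well. A minimal sketch, assuming you know your zone's offset from UTC at the time of the stream (daylight saving included); local_to_utc_timestamp and its utc_offset_hours parameter are hypothetical names chosen for this example:

```python
from datetime import datetime, timedelta, timezone


def local_to_utc_timestamp(local_time: str, utc_offset_hours: float) -> int:
    """Convert 'YYYY-MM-DD HH:MM' in a fixed-offset local zone to a UTC Unix timestamp."""
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    aware = datetime.strptime(local_time, '%Y-%m-%d %H:%M').replace(tzinfo=local_zone)
    return int(aware.timestamp())


# 19:14 at UTC+2 is 17:14 UTC, the start time from the example above
print(local_to_utc_timestamp('2024-09-21 19:14', 2))  # 1726938840
```

The result is the timestamp of the first second of that minute; the seconds component is still unknown at this point.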

Playlist filename structure

https://vod-secure.twitch.tv/4baf162fc318522d9326_zackrawrr_44841720763_1726938842/chunked/index-dvr.m3u8
https://vod-secure.twitch.tv/{urlhash}_{streamer_name}_{stream_id}_{timestamp}/chunked/index-dvr.m3u8

hashable_base   - {streamer_name} + "_" + {stream_id} + "_" + {timestamp}
urlhash         - first 20 characters of the SHA-1 hex digest of hashable_base
streamer_name   - the streamer's channel name, as it appears in the URL
stream_id       - the numeric stream ID
timestamp       - stream start time as a Unix timestamp in UTC, to the second

Calculating the urlhash

>>> import hashlib

>>> streamer_name = 'zackrawrr'
>>> stream_id = 44841720763
>>> timestamp = 1726938842
>>>
>>> hashable_base = f'{streamer_name}_{stream_id}_{timestamp}'
>>> hashable_base
'zackrawrr_44841720763_1726938842'
>>> urlhash = hashlib.sha1(hashable_base.encode('ascii')).hexdigest()[:20]
>>> urlhash
'4baf162fc318522d9326'
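
If you already have a working playlist URL, the same calculation can be run in reverse as a sanity check of the scheme. A minimal sketch, assuming the directory name follows the {urlhash}_{streamer_name}_{stream_id}_{timestamp} structure described above; split_vod_directory and hash_matches are hypothetical helper names:

```python
import hashlib


def split_vod_directory(directory: str) -> tuple[str, str, int, int]:
    """Split '{urlhash}_{streamer_name}_{stream_id}_{timestamp}' into its parts."""
    urlhash, rest = directory.split('_', 1)
    # Split from the right so underscores inside the streamer name survive
    streamer_name, stream_id, timestamp = rest.rsplit('_', 2)
    return urlhash, streamer_name, int(stream_id), int(timestamp)


def hash_matches(directory: str) -> bool:
    """Recompute the 20-character SHA-1 prefix and compare it with the one in the name."""
    urlhash, streamer_name, stream_id, timestamp = split_vod_directory(directory)
    hashable_base = f'{streamer_name}_{stream_id}_{timestamp}'
    expected = hashlib.sha1(hashable_base.encode('ascii')).hexdigest()[:20]
    return urlhash == expected


directory = '4baf162fc318522d9326_zackrawrr_44841720763_1726938842'
print(split_vod_directory(directory))
print(hash_matches(directory))
```

Splitting the name from both ends matters because the streamer name itself may contain underscores.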

Timestamp problems

Tools like TwitchTracker generally only give stream start times to minute precision (I am sure the exact time can be obtained somehow, but I did not want to spend time on finding out how). Since we need the seconds component as well, it is necessary to brute-force the precise stream start time by trying every second within that minute.

>>> timestamp = 1726938840
>>> for ts in range(0, 60):
...     print(timestamp + ts)
...
1726938840
1726938841
1726938842
...
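
If the displayed minute turns out to be slightly off, the search window can be widened. A minimal sketch, assuming that the stated minute is the most likely one and neighbouring minutes are only worth trying afterwards; candidate_timestamps and its extra_minutes parameter are hypothetical and not part of the original method:

```python
from typing import Iterator


def candidate_timestamps(minute_start: int, extra_minutes: int = 0) -> Iterator[int]:
    """Yield every second of the stated minute, then of neighbouring minutes."""
    # The stated minute is tried first
    yield from range(minute_start, minute_start + 60)
    # Optionally fan out: one minute earlier, one minute later, and so on
    for m in range(1, extra_minutes + 1):
        yield from range(minute_start - 60 * m, minute_start - 60 * (m - 1))
        yield from range(minute_start + 60 * m, minute_start + 60 * (m + 1))


print(list(candidate_timestamps(1726938840))[:3])  # [1726938840, 1726938841, 1726938842]
```

With extra_minutes=0 this covers the same 60 values as the loop above; each extra minute adds 120 more candidates, so the number of requests grows quickly.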

Example code

import time
import hashlib
import random
import calendar
import requests
from datetime import datetime

USER_AGENT = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36 Edge/12.246'

input_streamer_name = input('Streamer name: ').strip()
input_stream_id = input('Stream ID: ').strip()
input_timestamp = input('Start timestamp in UTC (format: YYYY-MM-DD HH:MM): ').strip()

if len(input_streamer_name) < 1:
    raise ValueError('Invalid streamer name')

try:
    stream_id = int(input_stream_id)
    if stream_id < 1:
        raise ValueError
except ValueError:
    raise ValueError('Invalid stream id') from None

try:
    # Interpret the input as UTC and convert it to a Unix timestamp
    timestamp = calendar.timegm(datetime.strptime(input_timestamp, '%Y-%m-%d %H:%M').utctimetuple())
except ValueError:
    raise ValueError('Invalid timestamp') from None

print(f'-> parsed timestamp: {timestamp}')

def generate_hash(streamer_name: str, stream_id: int, timestamp: int) -> str:
    # First 20 hex characters of SHA-1 over "{streamer_name}_{stream_id}_{timestamp}"
    hashable_base = f'{streamer_name}_{stream_id}_{timestamp}'
    return hashlib.sha1(hashable_base.encode('utf-8')).hexdigest()[:20]

def generate_url(streamer_name: str, stream_id: int, timestamp: int) -> str:
    urlhash = generate_hash(streamer_name, stream_id, timestamp)
    return f'https://vod-secure.twitch.tv/{urlhash}_{streamer_name}_{stream_id}_{timestamp}/chunked/index-dvr.m3u8'

def sleep_random_time() -> None:
    time.sleep(random.randint(100,300) / 1000.0)

def check_status_code(url: str) -> bool:
    req = requests.head(url, timeout=30, headers={'User-Agent': USER_AGENT})
    return req.status_code == 200

print("Working, please wait ...")

for ts in range(0, 60):
    url = generate_url(input_streamer_name, stream_id, timestamp + ts)

    if check_status_code(url):
        print(f'Found VOD at {url}')
        break

    # Sleep for a bit in between requests
    sleep_random_time()
else:
    # Runs only if the loop finished without hitting `break`
    print("No VOD found, it could have been permanently deleted or the parameters are wrong")
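
For reuse or testing, the search loop can be separated from the network call. A minimal sketch, assuming the same URL scheme as above; build_url, find_vod and the probe parameter are hypothetical names, and the probe used here is a stand-in that makes no requests:

```python
import hashlib
from typing import Callable, Optional


def build_url(streamer_name: str, stream_id: int, timestamp: int) -> str:
    base = f'{streamer_name}_{stream_id}_{timestamp}'
    urlhash = hashlib.sha1(base.encode('utf-8')).hexdigest()[:20]
    return f'https://vod-secure.twitch.tv/{urlhash}_{base}/chunked/index-dvr.m3u8'


def find_vod(streamer_name: str, stream_id: int, minute_start: int,
             probe: Callable[[str], bool]) -> Optional[str]:
    """Return the first candidate URL accepted by `probe`, or None."""
    for offset in range(60):
        url = build_url(streamer_name, stream_id, minute_start + offset)
        if probe(url):
            return url
    return None


def fake_probe(url: str) -> bool:
    # Stand-in for a dry run: accepts only the URL for second 42, no network access
    return '_1726938842/' in url


print(find_vod('zackrawrr', 44841720763, 1726938840, fake_probe))
```

In real use, a function like check_status_code from the script above would be passed as probe, ideally wrapped so that it still sleeps between requests.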