Marko Budisic mbudisic

Enhancing RAG: A Practical Chunking Strategy for Video Transcripts with Timestamp Alignment

A detailed walkthrough of an initial approach to semantic chunking for Retrieval Augmented Generation over video transcripts.

Retrieval Augmented Generation (RAG) systems are powerful, but their performance depends heavily on the quality of the context provided to the Large Language Model (LLM). With long content such as video tutorial transcripts, naive chunking can produce fragmented, irrelevant, or incomplete context, which degrades the user's experience. This article presents the first iteration of a practical chunking strategy implemented in the PsTuts RAG project as part of a learning path toward LLM engineering (s/o AI Makerspace). I'll detail how we combine semantic chunking with timestamp alignment to tackle these challenges, offering a method to create contextually rich and accurately timed chunks fro
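
As a rough illustration of the idea, the sketch below merges consecutive transcript segments while neighbouring segments stay similar, and carries the start time of the first segment and the end time of the last into each chunk. This is not the PsTuts RAG implementation: `Segment`, `Chunk`, `jaccard`, and `chunk_segments` are hypothetical names, the 0.2 threshold is arbitrary, and word-overlap similarity stands in for the embedding-based similarity a semantic chunker would normally use.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # seconds from the beginning of the video
    end: float
    text: str


@dataclass
class Chunk:
    start: float
    end: float
    text: str


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity; a crude stand-in for embedding similarity."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


def chunk_segments(segments, threshold=0.2):
    """Merge consecutive segments while each one resembles its predecessor."""
    chunks = []
    prev_text = ""
    for seg in segments:
        if chunks and jaccard(prev_text, seg.text) >= threshold:
            # Extend the open chunk: the text grows and the end time moves forward.
            chunks[-1].end = seg.end
            chunks[-1].text += " " + seg.text
        else:
            # Similarity dropped (or this is the first segment): start a new chunk.
            chunks.append(Chunk(seg.start, seg.end, seg.text))
        prev_text = seg.text
    return chunks


# Made-up transcript lines, for illustration only.
segments = [
    Segment(0.0, 4.0, "open the layers panel"),
    Segment(4.0, 9.0, "the layers panel shows each layer"),
    Segment(9.0, 15.0, "now export your image as png"),
]
chunks = chunk_segments(segments)
for chunk in chunks:
    print(f"{chunk.start:.1f}-{chunk.end:.1f}: {chunk.text}")
```

Because every chunk keeps its own `start` and `end`, a retrieved chunk can be mapped back to a position in the video without consulting the original segment list.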

title:
markmap:
  initialExpandLevel: 3

	{
	  "basics": {
	    "name": "Marko Budišić",
	    "label": "Staff / Principal Technical Leader — Robotics, AI/ML, Systems Engineering",
	    "email": "mbudisic@gmail.com",
	    "phone": "(805) 452-1480",
	    "summary": "Senior individual contributor operating at the intersection of hands-on systems engineering, applied AI/ML, and early-stage product definition. Specializes in turning ambiguous, high-stakes problems into working, auditable prototypes—particularly in robotics and regulated industrial environments. Trusted technical lead who bridges operators, engineers, and leadership by grounding strategy in evidence, prototypes, and operational reality rather than slideware.\n\nPrimary operating mode: Senior / Staff-level Individual Contributor. Secondary modes: Technical Program Leadership, Team Mentorship (hands-on).\n\nOpen to Individual Contributor [IC], Technical Portfolio Manager [TPM], or Engineering Manager [EM] roles.",
	    "location": {
	      "city": "",
	      "region": "Central Virginia",

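	import numpy as np

	def xyindependent(x, b, c, dt, randomfun):
	    # Fill x in place from index 2 on: each value depends on the two
	    # previous values plus a random draw scaled by dt squared.
	    for idx in range(2, len(x)):
	        x[idx] = ((2 - b*dt) * x[idx-1]
	                  + (b*dt - c*dt*dt - 1) * x[idx-2]
	                  + dt*dt * randomfun())
	    return x

	import matplotlib.pyplot as plt

	x0 = np.zeros(100,)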
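
One way to sanity-check a two-step recurrence like the one above is to run it in a case where the answer is known. The sketch below is an assumption-laden illustration, not the continuation of the original snippet: `step_recurrence` is a hypothetical name for the same update, and the parameter values and the seeded Gaussian noise source are arbitrary choices.

```python
import numpy as np


def step_recurrence(x, b, c, dt, noise):
    """Fill x[2:] in place from x[0] and x[1] using the two-step update."""
    for idx in range(2, len(x)):
        x[idx] = ((2 - b * dt) * x[idx - 1]
                  + (b * dt - c * dt * dt - 1) * x[idx - 2]
                  + dt * dt * noise())
    return x


# With b = c = 0 and zero noise the update reduces to
# x[n] = 2*x[n-1] - x[n-2], which continues a straight line.
line = np.zeros(5)
line[1] = 1.0
step_recurrence(line, b=0.0, c=0.0, dt=0.1, noise=lambda: 0.0)

# Noisy run with a seeded generator so the result is repeatable
# (parameter values here are arbitrary).
rng = np.random.default_rng(0)
path = step_recurrence(np.zeros(100), b=0.5, c=1.0, dt=0.1,
                       noise=rng.standard_normal)
```

Passing the noise source as a callable, as the original signature does with `randomfun`, makes it easy to swap a constant in for the random draw when testing.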

AI/ML Engineering

Core Infrastructure

Compute Systems