Migration Plan: Moving to Cloud Tasks

Overview

This document outlines a plan to migrate the current Firestore-triggered step processing to a more robust Cloud Tasks implementation. The migration will improve reliability, enable better retry mechanisms, and allow parallel processing where appropriate.

Benefits of Cloud Tasks

  1. Built-in Retry Logic: Configurable retry with exponential backoff
  2. Deduplication: Avoid duplicate processing by giving each task a unique name, which the queue uses to reject repeat submissions
  3. Rate Limiting: Control the rate of task execution
  4. Scheduling: Delay task execution or schedule it for a future time
  5. Observability: Better monitoring and debugging of task execution
  6. Parallel Processing: Execute independent tasks concurrently
  7. Long-running Operations: Better handling of work that outlasts the timeout of a Firestore-triggered function
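
To make the retry benefit concrete, the sketch below models how a Cloud Tasks queue spaces out retries: the interval starts at the minimum backoff, doubles a fixed number of times, then grows linearly until it reaches the maximum backoff. The function name `retryDelays` and the example values are illustrative assumptions, and the real schedule is computed by the Cloud Tasks service, not by application code.

```javascript
// Simplified model of the Cloud Tasks retry schedule (illustrative only).
// The interval starts at minBackoffSeconds, doubles maxDoublings times,
// then grows linearly and is capped at maxBackoffSeconds.
function retryDelays({ minBackoffSeconds, maxBackoffSeconds, maxDoublings }, retries) {
  const linearStep = minBackoffSeconds * 2 ** maxDoublings;
  const delays = [];
  let delay = minBackoffSeconds;
  for (let i = 0; i < retries; i++) {
    delays.push(Math.min(delay, maxBackoffSeconds));
    delay = i < maxDoublings ? delay * 2 : delay + linearStep;
  }
  return delays;
}

// Assumed example values: 10s minimum, 300s maximum, 3 doublings.
const schedule = retryDelays(
  { minBackoffSeconds: 10, maxBackoffSeconds: 300, maxDoublings: 3 },
  8
);
console.log(schedule); // [10, 20, 40, 80, 160, 240, 300, 300]
```

A model like this is mainly useful for estimating how long a failing task can stay in a queue before it exhausts its attempts.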

Implementation Plan Checklist

Phase 1: Setup Cloud Tasks Infrastructure

  • 1.1. Install Cloud Tasks SDK
    npm install @google-cloud/tasks
  • 1.2. Create a helper module for Cloud Tasks (tasks.js)
    • Implement task creation functions
    • Configure retry settings and queue management
    • Define task payload structure
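
As a starting point for the helper module in 1.2, the sketch below builds the request object that the SDK's `createTask` call accepts, without sending anything. The function name `buildCreateTaskRequest` and every argument value (project, region, URL, service account) are placeholders chosen for illustration.

```javascript
// Builds a request object for CloudTasksClient.createTask().
// Nothing is sent here; all argument values below are placeholders.
function buildCreateTaskRequest({
  project, location, queue, url, serviceAccountEmail, payload, delaySeconds = 0,
}) {
  const task = {
    httpRequest: {
      httpMethod: "POST",
      url,
      headers: { "Content-Type": "application/json" },
      // Task queue functions expect the payload wrapped in a "data" field.
      body: Buffer.from(JSON.stringify({ data: payload })).toString("base64"),
      // OIDC token so the target function can authenticate the caller.
      oidcToken: { serviceAccountEmail },
    },
  };
  if (delaySeconds > 0) {
    // Delay the first dispatch by the requested number of seconds.
    task.scheduleTime = { seconds: Math.floor(Date.now() / 1000) + delaySeconds };
  }
  return {
    parent: `projects/${project}/locations/${location}/queues/${queue}`,
    task,
  };
}

const request = buildCreateTaskRequest({
  project: "my-project",
  location: "us-central1",
  queue: "video-preparation-queue",
  url: "https://example.com/processTask",
  serviceAccountEmail: "tasks-invoker@my-project.iam.gserviceaccount.com",
  payload: { courseId: "course-123", taskType: "prepareVideo", data: {} },
});
console.log(request.parent);
```

With `@google-cloud/tasks` installed, the returned object would be passed to `new CloudTasksClient().createTask(request)`.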

Phase 2: Refactor Step Management

  • 2.1. Create a new TaskManager class to replace step creation logic
    • Implement methods for each step type
    • Define task payloads and routing
  • 2.2. Create HTTP endpoint Cloud Functions for each task type
    • Implement security validation for task execution
    • Configure resources (memory, CPU, timeout) per task type
  • 2.3. Update the main course generation flow
    • Replace Firestore step creation with task creation
    • Implement task chaining for sequential processing

Phase 3: Implement Task Processing Logic

  • 3.1. Create task handler for video preparation
    • Extract logic from current prepareVideo method
    • Implement idempotent processing
  • 3.2. Create task handler for frame extraction
    • Extract logic from current extractVideoFrames method
    • Add validation and idempotency checks
  • 3.3. Create task handler for video transcription
    • Extract logic from current transcribeVideo method
    • Ensure proper completion tracking
  • 3.4. Create task handler for Gemini analysis
    • Extract logic from current analyzeWithGemini method
    • Configure with high-memory settings
  • 3.5. Create task handler for frame analysis
    • Extract logic from current analyzeFrames method
    • Implement error handling and retries
  • 3.6. Create task handler for course structure generation
    • Extract logic from current generateCourseStructure method
    • Add validation for prerequisites
  • 3.7. Create task handler for interactivity addition
    • Extract logic from current addInteractivity method
    • Ensure proper dependency checking
  • 3.8. Create task handler for course finalization
    • Extract logic from current finalizeCourse method
    • Implement final validation and status updates

Phase 4: Update Tracking and Analytics

  • 4.1. Update analytics events for task-based processing
    • Modify event tracking for task creation
    • Add events for task success and failure
  • 4.2. Implement task status tracking in Firestore
    • Create schema for task status documents
    • Track task attempts and results
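
One possible shape for the task status document in 4.2 is sketched below as a pure function that folds the outcome of an attempt into the stored status. The field names (`attempts`, `status`, `lastError`, `updatedAt`) and the status values are assumptions for illustration; the actual schema is still to be defined.

```javascript
// Returns an updated copy of a task status document after one attempt.
// Field names and status values are hypothetical, not an agreed schema.
function recordAttempt(statusDoc, outcome, maxAttempts = 5) {
  const attempts = statusDoc.attempts + 1;
  let status;
  if (outcome.ok) {
    status = "succeeded";
  } else {
    // The last allowed attempt marks the task as permanently failed.
    status = attempts >= maxAttempts ? "failed" : "retrying";
  }
  return {
    ...statusDoc,
    attempts,
    status,
    lastError: outcome.ok ? null : String(outcome.error),
    updatedAt: new Date().toISOString(),
  };
}

let doc = { courseId: "course-123", taskType: "transcribeVideo", attempts: 0, status: "pending" };
doc = recordAttempt(doc, { ok: false, error: new Error("timeout") });
console.log(doc.status, doc.attempts); // retrying 1
doc = recordAttempt(doc, { ok: true });
console.log(doc.status, doc.attempts); // succeeded 2
```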

Phase 5: Testing and Validation

  • 5.1. Create test harness for Cloud Tasks processing
    • Implement local testing capabilities
    • Create task simulation tools
  • 5.2. Implement logging and monitoring enhancements
    • Add detailed logs for task lifecycle
    • Create monitoring dashboards for task execution
  • 5.3. Perform migration testing
    • Test with sample courses
    • Validate all task types function correctly
  • 5.4. Optimize resource allocation
    • Tune memory and CPU settings for each task type
    • Configure concurrency and rate limits

Phase 6: Deployment and Switchover

  • 6.1. Deploy infrastructure updates
    • Create Cloud Tasks queues in production
    • Deploy task handler functions
  • 6.2. Implement phased rollout
    • Enable A/B testing capability
    • Monitor performance metrics
  • 6.3. Perform final switchover
    • Disable Firestore-triggered functions
    • Route all processing through Cloud Tasks

Technical Design

Task Queue Structure

We'll create different queues for different task types:

  • video-preparation-queue: For video download and preparation tasks
  • media-processing-queue: For CPU-intensive tasks like frame extraction
  • ai-analysis-queue: For AI processing tasks (Gemini, OpenAI)
  • course-building-queue: For course structure generation tasks
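
The routing from task type to queue could be kept in a single lookup table, as in the sketch below. The task type names are taken from the current method names, and their assignment to queues (for example, placing transcription on the media-processing queue) is an assumption that should be checked against actual resource usage.

```javascript
// Hypothetical routing table from task type to queue name.
const QUEUE_BY_TASK_TYPE = new Map([
  ["prepareVideo", "video-preparation-queue"],
  ["extractVideoFrames", "media-processing-queue"],
  ["transcribeVideo", "media-processing-queue"], // assumed placement
  ["analyzeWithGemini", "ai-analysis-queue"],
  ["analyzeFrames", "ai-analysis-queue"],
  ["generateCourseStructure", "course-building-queue"],
  ["addInteractivity", "course-building-queue"],
  ["finalizeCourse", "course-building-queue"], // assumed placement
]);

// Looks up the queue for a task type and fails loudly on unknown types.
function queueForTaskType(taskType) {
  const queue = QUEUE_BY_TASK_TYPE.get(taskType);
  if (!queue) {
    throw new Error(`Unknown task type: ${taskType}`);
  }
  return queue;
}

console.log(queueForTaskType("analyzeFrames")); // ai-analysis-queue
```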

Task Payload Structure

Each task will have a standardized payload:

{
  courseId: "course-id-here",
  taskType: "task-type-here",
  attempt: 1,
  data: {
    // Task-specific data
  },
  created: timestamp, // Creation time of the task
  idempotencyKey: "unique-key-here" // Stable across retries of the same task
}
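
A payload in this shape could be produced by a small builder that derives the idempotency key deterministically, so that re-creating the same task yields the same key. The sketch below assumes the key is a SHA-256 hash of the course ID, task type, and an optional `scope` string (for tasks that fan out, such as per-frame analysis), and that `created` is stored as an ISO 8601 string; both choices are illustrative, not decided.

```javascript
const { createHash } = require("node:crypto");

// Builds a task payload with a deterministic idempotency key.
// The key derivation and the `scope` parameter are assumptions.
function buildTaskPayload({ courseId, taskType, data = {}, scope = "" }) {
  const idempotencyKey = createHash("sha256")
    .update([courseId, taskType, scope].join(":"))
    .digest("hex");
  return {
    courseId,
    taskType,
    attempt: 1,
    data,
    created: new Date().toISOString(),
    idempotencyKey,
  };
}

const first = buildTaskPayload({ courseId: "course-123", taskType: "prepareVideo" });
const second = buildTaskPayload({ courseId: "course-123", taskType: "prepareVideo" });
console.log(first.idempotencyKey === second.idempotencyKey); // true
```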

Task Handler Structure

Each task handler will follow a standard pattern:

const { onTaskDispatched } = require("firebase-functions/v2/tasks");

// isTaskAlreadyProcessed, processTaskLogic, markTaskComplete, enqueueNextTask,
// and logTaskFailure are project helpers that are not shown here.
exports.processTask = onTaskDispatched(
  {
    retryConfig: {
      maxAttempts: 5, // Stop after five attempts in total
      minBackoffSeconds: 30, // Minimum delay between retries
    },
    rateLimits: {
      maxConcurrentDispatches: 10, // Upper bound on tasks dispatched at once
    },
  },
  async (req) => {
    const { courseId, taskType, data, idempotencyKey } = req.data;

    // Idempotency check: skip work that a previous attempt already finished
    if (await isTaskAlreadyProcessed(courseId, taskType, idempotencyKey)) {
      console.log(`Task already processed: ${taskType} for course ${courseId}`);
      return;
    }

    try {
      // Task-specific processing logic
      await processTaskLogic(courseId, data);

      // Mark as complete and enqueue next task
      await markTaskComplete(courseId, taskType, idempotencyKey);
      await enqueueNextTask(courseId);
    } catch (error) {
      console.error(`Error processing ${taskType} for course ${courseId}:`, error);

      // Log failure - will be retried according to retryConfig
      await logTaskFailure(courseId, taskType, error);
      throw error; // Rethrow to trigger retry
    }
  }
);
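
The `isTaskAlreadyProcessed` and `markTaskComplete` helpers used by the handler are not defined in this plan. As a minimal sketch of their contract, the version below keeps completed keys in memory; a real implementation would presumably read and write Firestore and return promises. The factory name `createCompletionStore` is hypothetical.

```javascript
// In-memory stand-in for the completion tracking helpers (illustrative).
// A production version would persist this state, for example in Firestore.
function createCompletionStore() {
  const completed = new Set();
  const keyOf = (courseId, taskType, idempotencyKey) =>
    [courseId, taskType, idempotencyKey].join("/");
  return {
    isTaskAlreadyProcessed(courseId, taskType, idempotencyKey) {
      return completed.has(keyOf(courseId, taskType, idempotencyKey));
    },
    markTaskComplete(courseId, taskType, idempotencyKey) {
      completed.add(keyOf(courseId, taskType, idempotencyKey));
    },
  };
}

const store = createCompletionStore();
console.log(store.isTaskAlreadyProcessed("course-123", "prepareVideo", "key-1")); // false
store.markTaskComplete("course-123", "prepareVideo", "key-1");
console.log(store.isTaskAlreadyProcessed("course-123", "prepareVideo", "key-1")); // true
```

Because these functions return plain values, the handler's `await` calls work unchanged when an asynchronous store replaces this one.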

Parallel Processing Tasks

Some tasks can be processed in parallel, while others must stay sequential:

  • Frame extraction and transcription can run concurrently
  • Different frame analysis tasks can run concurrently
  • Course structure generation and interactivity addition remain sequential

Parallel branches will be coordinated through task dependencies and completion tracking: a downstream task is enqueued only once every task it depends on has been marked complete.
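
The dependency tracking described above could be sketched as a static graph plus a function that returns the tasks whose prerequisites are all complete. The graph below is an assumption inferred from the step names (for instance, that Gemini analysis depends on transcription and frame analysis depends on frame extraction); the real dependencies need to be confirmed against the current pipeline.

```javascript
// Hypothetical dependency graph: task type -> tasks that must finish first.
const DEPENDENCIES = {
  prepareVideo: [],
  extractVideoFrames: ["prepareVideo"],
  transcribeVideo: ["prepareVideo"],
  analyzeWithGemini: ["transcribeVideo"], // assumed dependency
  analyzeFrames: ["extractVideoFrames"],
  generateCourseStructure: ["analyzeWithGemini", "analyzeFrames"],
  addInteractivity: ["generateCourseStructure"],
  finalizeCourse: ["addInteractivity"],
};

// Returns the tasks that are not yet complete but whose dependencies are.
function readyTasks(completedTasks) {
  const done = new Set(completedTasks);
  return Object.keys(DEPENDENCIES).filter(
    (task) => !done.has(task) && DEPENDENCIES[task].every((dep) => done.has(dep))
  );
}

console.log(readyTasks([])); // ["prepareVideo"]
console.log(readyTasks(["prepareVideo"])); // ["extractVideoFrames", "transcribeVideo"]
```

Tasks returned together, such as frame extraction and transcription, can be enqueued at the same time, while course structure generation only becomes ready once both analysis branches have finished.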
