Migration Plan: Moving to Cloud Tasks

Overview

This document outlines a plan to migrate the current Firestore-triggered step processing to a more robust Cloud Tasks implementation. The migration will improve reliability, enable better retry mechanisms, and allow parallel processing where appropriate.

Benefits of Cloud Tasks

- Built-in Retry Logic: Configurable retries with exponential backoff
- Deduplication: Avoid duplicate processing of the same task
- Rate Limiting: Control the rate of task execution
- Scheduling: Delay task execution or schedule it for a future time
- Observability: Better monitoring and debugging of task execution
- Parallel Processing: Execute independent tasks concurrently
- Long-running Operations: Better handling of tasks that exceed function timeouts

Implementation Plan Checklist

Phase 1: Setup Cloud Tasks Infrastructure

1.1. Install Cloud Tasks SDK
```
npm install @google-cloud/tasks
```
1.2. Create a helper module for Cloud Tasks (tasks.js)
- Implement task creation functions
- Configure retry settings and queue management
- Define task payload structure
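
A minimal sketch of what the task creation function in tasks.js might look like. The name buildHttpTask, the handler URL, and the delay value are assumptions for illustration, not part of the existing codebase. The function only builds the task object; the commented lines show roughly how it would be handed to CloudTasksClient.createTask from @google-cloud/tasks.

```javascript
// tasks.js (sketch): builds the task object Cloud Tasks expects for an HTTP target.
// buildHttpTask and its parameters are hypothetical names used for illustration.
function buildHttpTask({ url, payload, delaySeconds = 0 }) {
  const task = {
    httpRequest: {
      httpMethod: "POST",
      url,
      headers: { "Content-Type": "application/json" },
      // The request body is passed base64-encoded.
      body: Buffer.from(JSON.stringify(payload)).toString("base64"),
    },
  };
  if (delaySeconds > 0) {
    // scheduleTime delays the first dispatch attempt.
    task.scheduleTime = { seconds: Math.floor(Date.now() / 1000) + delaySeconds };
  }
  return task;
}

// Enqueueing with the SDK would look roughly like this (not executed here):
//   const { CloudTasksClient } = require("@google-cloud/tasks");
//   const client = new CloudTasksClient();
//   const parent = client.queuePath(projectId, location, "ai-analysis-queue");
//   const [response] = await client.createTask({ parent, task });

const exampleTask = buildHttpTask({
  url: "https://example.com/tasks/ai-analysis", // placeholder URL
  payload: { courseId: "course-123", taskType: "ai-analysis" },
  delaySeconds: 60,
});
console.log(exampleTask.httpRequest.httpMethod, Boolean(exampleTask.scheduleTime));
```

Keeping the object construction separate from the SDK call makes the payload shape easy to unit test without network access.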

Phase 2: Refactor Step Management

2.1. Create a new TaskManager class to replace step creation logic
- Implement methods for each step type
- Define task payloads and routing
2.2. Create HTTP endpoint Cloud Functions for each task type
- Implement security validation for task execution
- Set resource configurations per task type
2.3. Update the main course generation flow
- Replace Firestore step creation with task creation
- Implement task chaining for sequential processing

Phase 3: Implement Task Processing Logic

Phase 4: Update Tracking and Analytics

4.1. Update analytics events for task-based processing
- Modify event tracking for task creation
- Add events for task success and failure
4.2. Implement task status tracking in Firestore
- Create schema for task status documents
- Track task attempts and results
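
One possible shape for the task status document is sketched below. The field names (state, attempts, completedAt) and both helper functions are assumptions for illustration; the plan does not yet fix a schema. The helpers work on plain objects, which could then be written to Firestore by the handler.

```javascript
// Hypothetical shape for a task status document; all field names are assumptions.
function newTaskStatus(courseId, taskType) {
  return { courseId, taskType, state: "pending", attempts: [], completedAt: null };
}

// Returns a new status object with one more attempt recorded (the input is not mutated).
function recordAttempt(status, { ok, error = null, at = new Date().toISOString() }) {
  return {
    ...status,
    attempts: [...status.attempts, { at, ok, error }],
    state: ok ? "succeeded" : "failed",
    completedAt: ok ? at : null,
  };
}

let taskStatus = newTaskStatus("course-123", "transcription");
taskStatus = recordAttempt(taskStatus, { ok: false, error: "timeout" });
taskStatus = recordAttempt(taskStatus, { ok: true });
console.log(taskStatus.state, taskStatus.attempts.length); // succeeded 2
```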

Phase 5: Testing and Validation

Phase 6: Deployment and Switchover

6.1. Deploy infrastructure updates
- Create Cloud Tasks queues in production
- Deploy task handler functions
6.2. Implement phased rollout
- Enable A/B testing capability
- Monitor performance metrics
6.3. Perform final switchover
- Disable Firestore-triggered functions
- Route all processing through Cloud Tasks
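
For the phased rollout in 6.2, one common approach is to bucket courses by a stable hash so that a given course always takes the same path. The sketch below assumes a percentage-based switch; rolloutBucket and useCloudTasks are hypothetical names, and the plan does not specify how the A/B split is decided.

```javascript
const crypto = require("node:crypto");

// Maps a courseId to a stable bucket in [0, 100), so the same course
// always lands on the same side of the rollout.
function rolloutBucket(courseId) {
  const hash = crypto.createHash("sha256").update(courseId).digest();
  return hash.readUInt32BE(0) % 100;
}

// Hypothetical switch: true routes the course through Cloud Tasks,
// false keeps it on the Firestore-triggered path.
function useCloudTasks(courseId, rolloutPercent) {
  return rolloutBucket(courseId) < rolloutPercent;
}

console.log(useCloudTasks("course-123", 0));   // false: a 0% rollout routes nothing
console.log(useCloudTasks("course-123", 100)); // true: a 100% rollout routes everything
```

Raising rolloutPercent only ever moves courses onto the new path, so courses already on Cloud Tasks stay there as the rollout widens.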

Technical Design

Task Queue Structure

We'll create different queues for different task types:

- video-preparation-queue: For video download and preparation tasks
- media-processing-queue: For CPU-intensive tasks like frame extraction
- ai-analysis-queue: For AI processing tasks (Gemini, OpenAI)
- course-building-queue: For course structure generation tasks
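
Routing a task to its queue can be a simple lookup. In the sketch below only the four queue names come from this plan; the task type names and the assignment of transcription to the media queue are assumptions for illustration.

```javascript
// Hypothetical task type names mapped to the queues listed above.
const QUEUE_BY_TASK_TYPE = new Map([
  ["video-download", "video-preparation-queue"],
  ["frame-extraction", "media-processing-queue"],
  ["transcription", "media-processing-queue"], // assumption: could also be an AI task
  ["frame-analysis", "ai-analysis-queue"],
  ["course-structure", "course-building-queue"],
]);

function queueForTask(taskType) {
  const queue = QUEUE_BY_TASK_TYPE.get(taskType);
  if (!queue) {
    // Fail loudly rather than silently dropping an unknown task type.
    throw new Error(`No queue configured for task type: ${taskType}`);
  }
  return queue;
}

console.log(queueForTask("frame-extraction")); // media-processing-queue
```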

Task Payload Structure

Each task will have a standardized payload:

{
  courseId: "course-id-here",
  taskType: "task-type-here",
  attempt: 1,
  data: {
    // Task-specific data
  },
  created: timestamp,
  idempotencyKey: "unique-key-here"
}
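
A small builder can keep payloads consistent with this structure. The sketch below is illustrative: buildTaskPayload is a hypothetical helper, and deriving idempotencyKey from courseId and taskType is an assumption (it means a repeated enqueue of the same step for the same course produces the same key).

```javascript
const crypto = require("node:crypto");

// Hypothetical helper that fills in the standardized payload fields.
function buildTaskPayload(courseId, taskType, data = {}) {
  // Assumption: one logical task per (courseId, taskType) pair.
  const idempotencyKey = crypto
    .createHash("sha256")
    .update(`${courseId}:${taskType}`)
    .digest("hex");
  return {
    courseId,
    taskType,
    attempt: 1,
    data,
    created: new Date().toISOString(),
    idempotencyKey,
  };
}

const examplePayload = buildTaskPayload("course-123", "frame-extraction", { fps: 1 });
console.log(examplePayload.idempotencyKey.length); // 64 (hex-encoded SHA-256)
```

If a step can legitimately run more than once per course, the key would need an extra component, such as a chunk index.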

Task Handler Structure

Each task handler will follow a standard pattern:

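const { onTaskDispatched } = require("firebase-functions/v2/tasks");

// isTaskAlreadyProcessed, processTaskLogic, markTaskComplete, enqueueNextTask and
// logTaskFailure are project helpers that still need to be implemented.
exports.processTask = onTaskDispatched(
  {
    retryConfig: {
      maxAttempts: 5,
      minBackoffSeconds: 30,
    },
    rateLimits: {
      maxConcurrentDispatches: 10,
    },
  },
  async (req) => {
    const { courseId, taskType, data, idempotencyKey } = req.data;

    // Idempotency check: skip work that an earlier attempt already finished
    if (await isTaskAlreadyProcessed(courseId, taskType, idempotencyKey)) {
      console.log(`Task already processed: ${taskType} for course ${courseId}`);
      return;
    }

    try {
      // Task-specific processing logic
      await processTaskLogic(courseId, data);

      // Enqueue the next task before marking this one complete. If the enqueue
      // fails, the retry is then not skipped by the idempotency check above.
      // This assumes enqueueNextTask is itself safe to repeat.
      await enqueueNextTask(courseId);
      await markTaskComplete(courseId, taskType, idempotencyKey);
    } catch (error) {
      console.error(`Error processing ${taskType} for course ${courseId}:`, error);

      // Log failure - will be retried according to retryConfig
      await logTaskFailure(courseId, taskType, error);
      throw error; // Rethrow to trigger retry
    }
  }
);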
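
To get a feel for what the retryConfig above implies, the delays can be approximated as follows. This is a simplified sketch: it doubles the delay on each retry and caps it, while the real queue also has a limit on the number of doublings. The maxBackoffSeconds value is illustrative and not a setting from this plan.

```javascript
// Approximates the wait before each retry under exponential backoff.
// Simplification: doubles every time and caps at maxBackoffSeconds.
function retryDelays({ maxAttempts, minBackoffSeconds, maxBackoffSeconds = 3600 }) {
  const delays = [];
  // maxAttempts counts the first attempt, so there are maxAttempts - 1 retries.
  for (let retry = 0; retry < maxAttempts - 1; retry++) {
    delays.push(Math.min(maxBackoffSeconds, minBackoffSeconds * 2 ** retry));
  }
  return delays;
}

console.log(retryDelays({ maxAttempts: 5, minBackoffSeconds: 30 })); // [ 30, 60, 120, 240 ]
```

Under these assumptions a task that keeps failing is abandoned after roughly 7.5 minutes of cumulative backoff, which is worth comparing against how long upstream outages typically last.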

Parallel Processing Tasks

Some tasks can be processed in parallel:

- Frame extraction and transcription can run concurrently
- Different frame analysis tasks can run concurrently
- Course structure generation and interactivity generation remain sequential

This will be implemented through task dependencies and completion tracking: a downstream task is enqueued only once every task it depends on has been marked complete.
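
The dependency and completion tracking idea can be sketched as a small lookup. The task names and the dependency graph below are assumptions based on the steps mentioned in this plan, not a confirmed pipeline definition.

```javascript
// Hypothetical dependency graph: each task lists the tasks that must finish first.
const DEPENDENCIES = new Map([
  ["video-preparation", []],
  ["frame-extraction", ["video-preparation"]],
  ["transcription", ["video-preparation"]],
  ["frame-analysis", ["frame-extraction"]],
  ["course-structure", ["frame-analysis", "transcription"]],
  ["interactivity", ["course-structure"]],
]);

// Returns tasks that are neither completed nor started and whose
// dependencies have all completed.
function readyTasks(completed, started = new Set()) {
  const ready = [];
  for (const [task, deps] of DEPENDENCIES) {
    if (completed.has(task) || started.has(task)) continue;
    if (deps.every((dep) => completed.has(dep))) ready.push(task);
  }
  return ready;
}

console.log(readyTasks(new Set(["video-preparation"])));
// [ 'frame-extraction', 'transcription' ]
```

After each task completes, the handler would call something like readyTasks and enqueue whatever it returns, which lets the two branches fan out and join again before course structure generation.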
