This document outlines a plan to migrate the current Firestore-triggered step processing to a more robust Cloud Tasks implementation. The migration will improve reliability, enable better retry mechanisms, and allow parallel processing where appropriate.
- Built-in Retry Logic: Configurable retry with exponential backoff
- Deduplication: Avoid duplicate processing of the same task
- Rate Limiting: Control the rate of task execution
- Scheduling: Delay task execution or schedule it for a future time
- Observability: Better monitoring and debugging of task execution
- Parallel Processing: Execute independent tasks concurrently
- Long-running Operations: Better handling of tasks that exceed function timeouts
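To make the retry bullet concrete, the sketch below computes the nominal wait before each retry, following the backoff schedule described in the Cloud Tasks documentation: start at the minimum backoff, double a fixed number of times, then grow linearly until the maximum backoff is reached. The function name `retryDelaysSeconds` is hypothetical, and the `maxBackoffSeconds` and `maxDoublings` values in the example are assumptions; actual dispatch times may differ from these nominal values.

```javascript
// Nominal delay (in seconds) before each retry of a task.
// maxAttempts counts the first attempt, so there are maxAttempts - 1 retries.
function retryDelaysSeconds({ minBackoffSeconds, maxBackoffSeconds, maxDoublings, maxAttempts }) {
  const delays = [];
  // After the doubling phase, the interval grows linearly by this step.
  const linearStep = minBackoffSeconds * 2 ** maxDoublings;
  let delay = minBackoffSeconds;
  for (let retry = 0; retry < maxAttempts - 1; retry++) {
    delays.push(Math.min(delay, maxBackoffSeconds));
    delay = retry < maxDoublings ? delay * 2 : delay + linearStep;
  }
  return delays;
}

// Example with maxAttempts: 5 and minBackoffSeconds: 30;
// the other two values are assumed for illustration.
console.log(
  retryDelaysSeconds({
    minBackoffSeconds: 30,
    maxBackoffSeconds: 3600,
    maxDoublings: 16,
    maxAttempts: 5,
  })
); // [ 30, 60, 120, 240 ]
```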
- 1.1. Install Cloud Tasks SDK: `npm install @google-cloud/tasks`
- 1.2. Create a helper module for Cloud Tasks (`tasks.js`)
- Implement task creation functions
- Configure retry settings and queue management
- Define task payload structure
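As a rough illustration of what the helper module in 1.2 might contain, the sketch below builds a task payload and wraps it in a request object of the shape accepted by `CloudTasksClient.createTask()`. It assumes HTTP-target tasks. The function names, the way the idempotency key is derived, and the project, location, and URL values are all hypothetical and not part of the plan.

```javascript
const crypto = require("node:crypto");

// Build a task payload. Deriving the idempotency key from the course and
// task type is an assumption; the plan does not say how keys are generated.
function buildPayload(courseId, taskType, data = {}) {
  const idempotencyKey = crypto
    .createHash("sha256")
    .update(`${courseId}:${taskType}`)
    .digest("hex")
    .slice(0, 32);
  return { courseId, taskType, attempt: 1, data, created: Date.now(), idempotencyKey };
}

// Build the { parent, task } request for an HTTP-target task.
function buildCreateTaskRequest({ project, location, queue, url }, payload, delaySeconds = 0) {
  const parent = `projects/${project}/locations/${location}/queues/${queue}`;
  const task = {
    // Naming the task lets Cloud Tasks reject a second task with the same name.
    name: `${parent}/tasks/${payload.taskType}-${payload.idempotencyKey}`,
    httpRequest: {
      httpMethod: "POST",
      url,
      headers: { "Content-Type": "application/json" },
      // The request body is sent as base64-encoded bytes.
      body: Buffer.from(JSON.stringify(payload)).toString("base64"),
    },
  };
  if (delaySeconds > 0) {
    task.scheduleTime = { seconds: Math.floor(Date.now() / 1000) + delaySeconds };
  }
  return { parent, task };
}

// With the SDK installed, the request would be submitted roughly like this:
//   const { CloudTasksClient } = require("@google-cloud/tasks");
//   await new CloudTasksClient().createTask(request);
```

Keeping request construction separate from the SDK call means the payload and naming logic can be unit-tested without network access.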
- 2.1. Create a new `TaskManager` class to replace step creation logic
- Implement methods for each step type
- Define task payloads and routing
- 2.2. Create HTTP endpoint Cloud Functions for each task type
- Implement security validation for task execution
- Modify resource configurations for different task types
- 2.3. Update the main course generation flow
- Replace Firestore step creation with task creation
- Implement task chaining for sequential processing
- 3.1. Create task handler for video preparation
- Extract logic from current `prepareVideo` method
- Implement idempotent processing
- 3.2. Create task handler for frame extraction
- Extract logic from current `extractVideoFrames` method
- Add validation and idempotence checks
- 3.3. Create task handler for video transcription
- Extract logic from current `transcribeVideo` method
- Ensure proper completion tracking
- 3.4. Create task handler for Gemini analysis
- Extract logic from current `analyzeWithGemini` method
- Configure with high-memory settings
- 3.5. Create task handler for frame analysis
- Extract logic from current `analyzeFrames` method
- Implement error handling and retries
- 3.6. Create task handler for course structure generation
- Extract logic from current `generateCourseStructure` method
- Add validation for prerequisites
- 3.7. Create task handler for interactivity addition
- Extract logic from current `addInteractivity` method
- Ensure proper dependency checking
- 3.8. Create task handler for course finalization
- Extract logic from current `finalizeCourse` method
- Implement final validation and status updates
- 4.1. Update analytics events for task-based processing
- Modify event tracking for task creation
- Add events for task success and failure
- 4.2. Implement task status tracking in Firestore
- Create schema for task status documents
- Track task attempts and results
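One possible shape for the task status documents in 4.2 is sketched below as a pure function that produces the next status after each attempt. The field names (`state`, `attempts`, `lastError`, `updatedAt`) and the document path are assumptions for illustration. Writing the result to Firestore is left out.

```javascript
// Hypothetical Firestore document path for one task's status.
function statusDocPath(courseId, taskType) {
  return `courses/${courseId}/tasks/${taskType}`;
}

// Return a new status document reflecting one more attempt.
// `status` is the previous document, or null if none exists yet.
function recordAttempt(status, { succeeded, error = null, at = Date.now() }) {
  const previous = status ?? { state: "pending", attempts: 0, lastError: null, updatedAt: null };
  return {
    ...previous,
    state: succeeded ? "completed" : "failed",
    attempts: previous.attempts + 1,
    lastError: succeeded ? null : String(error),
    updatedAt: at,
  };
}

// Example: a first attempt fails, then a retry succeeds.
const afterFailure = recordAttempt(null, { succeeded: false, error: new Error("timeout") });
const afterSuccess = recordAttempt(afterFailure, { succeeded: true });
console.log(statusDocPath("course-123", "transcribeVideo"), afterSuccess.state, afterSuccess.attempts);
```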
- 5.1. Create test harness for Cloud Tasks processing
- Implement local testing capabilities
- Create task simulation tools
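A task simulation tool for 5.1 could be as small as an in-memory queue that dispatches payloads to handlers and re-queues failures. The sketch below is one such simulator. It uses synchronous handlers to stay short, whereas real handlers would be async. The name `simulateQueue` and the log format are hypothetical.

```javascript
// Run tasks from an in-memory queue, retrying a failed handler until it
// succeeds or maxAttempts is reached. Returns a log of every attempt.
function simulateQueue(tasks, handlers, maxAttempts = 5) {
  const log = [];
  const pending = tasks.map((payload) => ({ payload, attempt: 1 }));
  while (pending.length > 0) {
    const { payload, attempt } = pending.shift();
    try {
      handlers[payload.taskType](payload);
      log.push({ taskType: payload.taskType, attempt, outcome: "success" });
    } catch (error) {
      const willRetry = attempt < maxAttempts;
      log.push({ taskType: payload.taskType, attempt, outcome: willRetry ? "retry" : "failed" });
      // Failed tasks go to the back of the queue, as a retry would.
      if (willRetry) pending.push({ payload, attempt: attempt + 1 });
    }
  }
  return log;
}

// Example: a handler that fails twice before succeeding.
let calls = 0;
const flakyHandlers = {
  prepareVideo: () => {
    calls += 1;
    if (calls < 3) throw new Error("transient failure");
  },
};
console.log(simulateQueue([{ taskType: "prepareVideo", courseId: "course-123" }], flakyHandlers));
```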
- 5.2. Implement logging and monitoring enhancements
- Add detailed logs for task lifecycle
- Create monitoring dashboards for task execution
- 5.3. Perform migration testing
- Test with sample courses
- Validate all task types function correctly
- 5.4. Optimize resource allocation
- Tune memory and CPU settings for each task type
- Configure concurrency and rate limits
- 6.1. Deploy infrastructure updates
- Create Cloud Tasks queues in production
- Deploy task handler functions
- 6.2. Implement phased rollout
- Enable A/B testing capability
- Monitor performance metrics
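For the phased rollout in 6.2, one common approach is to assign each course to a stable bucket so that the same course always takes the same processing path. The sketch below hashes the course ID into one of 100 buckets. The function name `useCloudTasks` and the percentage parameter are assumptions, not part of the plan.

```javascript
const crypto = require("node:crypto");

// Decide whether a course is processed through Cloud Tasks.
// The decision is deterministic per courseId, and raising rolloutPercent
// only ever adds courses to the Cloud Tasks path.
function useCloudTasks(courseId, rolloutPercent) {
  const digest = crypto.createHash("sha256").update(courseId).digest();
  const bucket = digest.readUInt32BE(0) % 100; // 0..99
  return bucket < rolloutPercent;
}

// Example: route 10% of courses through the new pipeline.
const path = useCloudTasks("course-123", 10) ? "cloud-tasks" : "firestore-trigger";
console.log(path);
```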
- 6.3. Perform final switchover
- Disable Firestore-triggered functions
- Route all processing through Cloud Tasks
We'll create different queues for different task types:
- `video-preparation-queue`: For video download and preparation tasks
- `media-processing-queue`: For CPU-intensive tasks like frame extraction
- `ai-analysis-queue`: For AI processing tasks (Gemini, OpenAI)
- `course-building-queue`: For course structure generation tasks
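The routing from task type to queue could be captured in a small lookup like the sketch below. Using the existing method names as task type identifiers is an assumption. The queue assignments marked as assumptions in the comments are not stated in the plan.

```javascript
// Task type -> queue name. Entries marked "assumption" are guesses.
const QUEUE_FOR_TASK = new Map([
  ["prepareVideo", "video-preparation-queue"],
  ["extractVideoFrames", "media-processing-queue"],
  ["transcribeVideo", "ai-analysis-queue"], // assumption
  ["analyzeWithGemini", "ai-analysis-queue"],
  ["analyzeFrames", "ai-analysis-queue"], // assumption
  ["generateCourseStructure", "course-building-queue"],
  ["addInteractivity", "course-building-queue"], // assumption
  ["finalizeCourse", "course-building-queue"], // assumption
]);

// Look up the queue for a task type, failing loudly on unknown types.
function queueForTask(taskType) {
  const queue = QUEUE_FOR_TASK.get(taskType);
  if (queue === undefined) {
    throw new Error(`Unknown task type: ${taskType}`);
  }
  return queue;
}

console.log(queueForTask("extractVideoFrames")); // media-processing-queue
```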
Each task will have a standardized payload:

    {
      courseId: "course-id-here",
      taskType: "task-type-here",
      attempt: 1,
      data: {
        // Task-specific data
      },
      created: timestamp,
      idempotencyKey: "unique-key-here"
    }

Each task handler will follow a standard pattern:

    const { onTaskDispatched } = require("firebase-functions/v2/tasks");

    // isTaskAlreadyProcessed, processTaskLogic, markTaskComplete, enqueueNextTask,
    // and logTaskFailure are helpers expected to be defined elsewhere (for
    // example in the tasks.js helper module).
    exports.processTask = onTaskDispatched(
      {
        retryConfig: {
          maxAttempts: 5,
          minBackoffSeconds: 30,
        },
        rateLimits: {
          maxConcurrentDispatches: 10,
        },
      },
      async (req) => {
        const { courseId, taskType, data, idempotencyKey } = req.data;

        // Idempotency check: skip work that an earlier attempt already finished
        if (await isTaskAlreadyProcessed(courseId, taskType, idempotencyKey)) {
          console.log(`Task already processed: ${taskType} for course ${courseId}`);
          return;
        }

        try {
          // Task-specific processing logic
          await processTaskLogic(courseId, data);

          // Mark as complete and enqueue next task.
          // Caveat: if enqueueNextTask fails here, the retry exits at the
          // idempotency check above, so enqueueing must be recoverable elsewhere.
          await markTaskComplete(courseId, taskType, idempotencyKey);
          await enqueueNextTask(courseId);
        } catch (error) {
          console.error(`Error processing ${taskType} for course ${courseId}:`, error);
          // Log failure - will be retried according to retryConfig
          await logTaskFailure(courseId, taskType, error);
          throw error; // Rethrow to trigger retry
        }
      }
    );

Some tasks can be processed in parallel:
- Frame extraction and transcription can run concurrently
- Different frame analysis tasks can run concurrently
- Course structure generation and interactivity addition remain sequential
This will be implemented through task dependencies and completion tracking.
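A minimal sketch of dependency-based scheduling follows: given the set of completed tasks, it returns every task whose prerequisites are all done, so independent tasks become ready at the same time. The task names reuse the existing method names. The dependency edges are assumptions chosen to match the parallelism described above, and `readyTasks` is a hypothetical helper.

```javascript
// Task -> prerequisites. The exact edges are assumptions for illustration.
const DEPENDENCIES = {
  prepareVideo: [],
  extractVideoFrames: ["prepareVideo"],
  transcribeVideo: ["prepareVideo"],
  analyzeFrames: ["extractVideoFrames"],
  analyzeWithGemini: ["transcribeVideo"], // assumption
  generateCourseStructure: ["analyzeFrames", "analyzeWithGemini"],
  addInteractivity: ["generateCourseStructure"],
  finalizeCourse: ["addInteractivity"],
};

// Return tasks that are not yet completed or enqueued and whose
// prerequisites have all completed.
function readyTasks(completed, enqueued = new Set()) {
  return Object.entries(DEPENDENCIES)
    .filter(
      ([task, deps]) =>
        !completed.has(task) && !enqueued.has(task) && deps.every((dep) => completed.has(dep))
    )
    .map(([task]) => task);
}

// After video preparation, frame extraction and transcription are both ready.
console.log(readyTasks(new Set(["prepareVideo"])));
// [ 'extractVideoFrames', 'transcribeVideo' ]
```

Tracking already-enqueued tasks separately keeps two parallel tasks that finish at nearly the same time from both enqueueing their shared successor.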