Auto-Dispatch System Architecture

This document outlines the architecture of the automated job dispatch system. The system is designed to be scalable, resilient, and non-blocking, so that a high volume of jobs can be processed without one job's retry wait holding up the others.

Core Architecture: A Dual-Queue System

The dispatch process is managed by two distinct Redis queues: the dispatch-queue and the dispatch-timeout-queue. This separation of concerns is the key to the system's performance and reliability.

graph TD
    A[Form Submission Received] --> B{Add to `dispatch-queue`};
    B --> C[dispatchProcessor.js];
    C --> D{1. Find Nearby Techs & Create Job};
    D --> E{2. Dispatch to First Tech};
    E --> F{3. Add Delayed Job to `dispatch-timeout-queue`};
    F --> G((dispatchProcessor Done));

    subgraph 5 minutes Later
        H(dispatch-timeout-queue) --> I[dispatchTimeoutProcessor.js];
        I --> J{Job Assigned?};
        J -- Yes --> K((End));
        J -- No --> L{More Techs to Try?};
        L -- Yes --> M{Dispatch to Next Tech};
        M --> F;
        L -- No --> N[Alert Dispatchers];
        N --> O((End));
    end

Step-by-Step Flow

Initial Job Processing (dispatch-queue):
- When a new form submission is ready for dispatch, a job is added to the dispatch-queue.
- The dispatchProcessor.js worker picks up this job.
- Responsibilities:
  - Find all nearby technicians.
  - Create a new job record in the database with a status of dispatching.
  - Dispatch the job to the first technician in the list via a Twilio voice call.
  - Crucially, it then immediately adds a delayed job to the dispatch-timeout-queue. This job contains the list of the remaining technicians. The delay is configurable (e.g., 5 minutes).
- The dispatchProcessor is now finished with its work and is free to process the next job, making the initial dispatch process non-blocking.
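
The steps above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of dispatchProcessor.js: the function name processDispatch, the injected helpers (findNearbyTechs, createJob, callTechnician, timeoutQueue), and the payload shape are all assumptions made for the example. The add(name, data, { delay }) call mirrors the delayed-job option offered by BullMQ, but the source does not name the queue library.

```javascript
// Sketch only: every dependency name below is hypothetical and injected,
// so the flow can be exercised without Redis, a database, or Twilio.
const DISPATCH_DELAY_MS = 5 * 60 * 1000; // assumed default; the real delay is configurable

async function processDispatch(submission, deps) {
  const { findNearbyTechs, createJob, callTechnician, timeoutQueue } = deps;

  // 1. Find all nearby technicians and create the job record.
  const techs = await findNearbyTechs(submission);
  const job = await createJob({ submissionId: submission.id, status: 'dispatching' });

  // 2. Dispatch to the first technician in the list.
  //    The source does not say what happens when nobody is nearby; this
  //    sketch skips the call and still schedules the timeout check.
  const [first, ...remaining] = techs;
  if (first) {
    await callTechnician(first, job);
  }

  // 3. Schedule the timeout check and return right away. The delayed job
  //    carries only the technicians that have not been tried yet.
  await timeoutQueue.add(
    'dispatch-timeout',
    { jobId: job.id, remainingTechIds: remaining.map((tech) => tech.id) },
    { delay: DISPATCH_DELAY_MS }
  );

  return job;
}
```

Because the function returns as soon as the delayed job is enqueued, the worker never holds a timer open for the job, which is what keeps the first queue free for new submissions.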

Timeout and Retry Logic (dispatch-timeout-queue):
- After the configured delay (e.g., 5 minutes), the dispatchTimeoutProcessor.js worker picks up the job from the dispatch-timeout-queue.
- Responsibilities:
  - It first checks the current status of the job. If it has already been assigned (meaning a previously contacted technician accepted), the worker's job is done.
  - If the job is still dispatching, it dispatches to the next technician in the list it received.
  - It then re-queues another delayed job containing the technicians that remain, so the same check runs again after the delay.
  - This cycle continues until a technician accepts the job or all technicians in the list have been tried.
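
The retry cycle described above might look something like this. It is a sketch under assumptions, not the real dispatchTimeoutProcessor.js: the function name, the injected helpers (getJob, callTechnician, timeoutQueue, handleDispatchFailure), the payload fields, and the returned labels are hypothetical and exist only to make the control flow visible.

```javascript
// Sketch only: dependency names and payload fields are hypothetical.
const RETRY_DELAY_MS = 5 * 60 * 1000; // assumed default; the real delay is configurable

async function processDispatchTimeout(data, deps) {
  const { jobId, remainingTechIds } = data;
  const { getJob, callTechnician, timeoutQueue, handleDispatchFailure } = deps;

  // Stop if the job is no longer waiting on a technician
  // (for example, someone accepted during the delay).
  const job = await getJob(jobId);
  if (job.status !== 'dispatching') {
    return 'done';
  }

  // Nobody left to try: hand the job over to the failure path.
  if (remainingTechIds.length === 0) {
    await handleDispatchFailure(jobId);
    return 'failed';
  }

  // Try the next technician, then schedule another check that carries
  // only the technicians still untried.
  const [nextTechId, ...rest] = remainingTechIds;
  await callTechnician(nextTechId, job);
  await timeoutQueue.add(
    'dispatch-timeout',
    { jobId, remainingTechIds: rest },
    { delay: RETRY_DELAY_MS }
  );
  return 'retried';
}
```

Each run handles at most one technician and then exits, so the list shrinks by one per delay period until it is empty or the job's status changes.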

Failure and Alerting

The system has a built-in mechanism to handle cases where no technician accepts a job.

If the dispatchTimeoutProcessor runs out of technicians to try, it calls the jobsService.handleDispatchFailure() function.
This function:
1. Changes the job's status from dispatching to pending.
2. Creates a notification that is sent to all users with the "Dispatcher" role.

As a result, any job that fails to be auto-assigned is flagged for manual intervention as soon as the technician list is exhausted, so no job is silently missed.
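
A rough sketch of what such a failure handler could do is shown below. Only the name handleDispatchFailure and the two effects (status change, dispatcher notification) come from this document; the injected helpers updateJobStatus, findUsersByRole, and createNotification, the notification fields, and the return value are assumptions for illustration.

```javascript
// Sketch only: helper names and the notification shape are hypothetical.
async function handleDispatchFailure(jobId, deps) {
  const { updateJobStatus, findUsersByRole, createNotification } = deps;

  // 1. Move the job out of the automated flow.
  await updateJobStatus(jobId, 'pending');

  // 2. Notify everyone who can assign the job manually.
  const dispatchers = await findUsersByRole('Dispatcher');
  await createNotification({
    jobId,
    recipientIds: dispatchers.map((user) => user.id),
    message: `Job ${jobId} could not be auto-assigned and needs manual dispatch.`,
  });

  return dispatchers.length;
}
```

Updating the status before sending the notification means a dispatcher who opens the job from the alert already sees it as pending.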

Benefits of this Architecture

Scalability: The primary dispatchProcessor is non-blocking. It hands the wait off to a delayed queue job instead of sitting in a setTimeout delay, so it can keep handling a large volume of incoming jobs.
Resilience: Using a dedicated dispatch-timeout-queue for retries is more robust than in-memory timers. If the application restarts, the retry jobs are persisted in Redis and will be processed when the workers come back online.
Maintainability: The logic is separated into two distinct workers, each with a clear responsibility, making the system easier to understand, debug, and extend.
Configurability: The delay between dispatch attempts is a configurable value, allowing for easy tuning of the system's timing without code changes.
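
One way the configurable delay could be read is from an environment variable, as in the sketch below. The variable name DISPATCH_TIMEOUT_MINUTES and the helper getDispatchDelayMs are hypothetical; the document only states that the delay is configurable and gives 5 minutes as an example.

```javascript
// Sketch only: the environment variable name is an assumption.
const DEFAULT_DELAY_MINUTES = 5; // the example value used in this document

function getDispatchDelayMs(env = process.env) {
  const minutes = Number(env.DISPATCH_TIMEOUT_MINUTES);
  // Fall back to the default when the value is missing, non-numeric, or not positive.
  const valid = Number.isFinite(minutes) && minutes > 0;
  return (valid ? minutes : DEFAULT_DELAY_MINUTES) * 60 * 1000;
}
```

Both workers would pass the returned value as the delay when adding a job to the dispatch-timeout-queue, so changing the setting affects the first attempt and every retry alike.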

