This document outlines the architecture of the automated job dispatch system. The system is designed to be scalable, resilient, and non-blocking, so that a high volume of jobs can be processed without one job's retry timer holding up the next.
The dispatch process is managed by two distinct Redis queues: the `dispatch-queue` and the `dispatch-timeout-queue`. This separation of concerns is the key to the system's performance and reliability.
```mermaid
graph TD
    A[Form Submission Received] --> B{Add to `dispatch-queue`};
    B --> C[dispatchProcessor.js];
    C --> D{1. Find Nearby Techs & Create Job};
    D --> E{2. Dispatch to First Tech};
    E --> F{3. Add Delayed Job to `dispatch-timeout-queue`};
    F --> G((dispatchProcessor Done));
    subgraph 5 minutes Later
        H(dispatch-timeout-queue) --> I[dispatchTimeoutProcessor.js];
        I --> J{Job Assigned?};
        J -- Yes --> K((End));
        J -- No --> L{More Techs to Try?};
        L -- Yes --> M{Dispatch to Next Tech};
        M --> F;
        L -- No --> N[Alert Dispatchers];
        N --> O((End));
    end
```
- Initial Job Processing (`dispatch-queue`):
    - When a new form submission is ready for dispatch, a job is added to the `dispatch-queue`.
    - The `dispatchProcessor.js` worker picks up this job.
    - Responsibilities:
        - Find all nearby technicians.
        - Create a new job record in the database with a status of `dispatching`.
        - Dispatch the job to the first technician in the list via a Twilio voice call.
        - Crucially, it then immediately adds a delayed job to the `dispatch-timeout-queue`. This job contains the list of the remaining technicians. The delay is configurable (e.g., 5 minutes).
    - The `dispatchProcessor` is now finished with its work and is free to process the next job, making the initial dispatch process non-blocking.
- Timeout and Retry Logic (`dispatch-timeout-queue`):
    - After the configured delay (5 minutes in this example), the `dispatchTimeoutProcessor.js` worker picks up the job from the `dispatch-timeout-queue`.
    - Responsibilities:
        - It first checks the current status of the job. If it has already been `assigned` (meaning a previously called technician accepted), the worker's job is done.
        - If the job is still `dispatching`, it dispatches to the next technician in the list it received.
        - It then re-queues another delayed job carrying the technicians that remain, so the next one can be tried after the same delay.
        - This cycle continues until a technician accepts the job or all technicians in the list have been tried.
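
The two-stage flow described above can be sketched as a pair of handler functions. This is a minimal illustration rather than the project's actual code: the `deps` object and its methods (`findNearbyTechs`, `createJob`, `getJobStatus`, `callTech`, `enqueueTimeout`) are hypothetical stand-ins for the database, Twilio, and queue clients, and `DISPATCH_DELAY_MS` is an assumed name for the configurable delay. The handling of an empty technician list is also an assumption, since the document does not cover that case.

```javascript
// Assumed constant name; 5 minutes is the example delay given in the text.
const DISPATCH_DELAY_MS = 5 * 60 * 1000;

// Handler for the dispatch-queue (the role played by dispatchProcessor.js).
// `deps` is a hypothetical bundle of injected services, not the project's real API.
async function handleDispatch(submission, deps) {
  const techs = await deps.findNearbyTechs(submission);
  const job = await deps.createJob(submission, 'dispatching');

  if (techs.length === 0) {
    // Assumption: with no technicians nearby, escalate to dispatchers right away.
    await deps.handleDispatchFailure(job.id);
    return job;
  }

  const [first, ...remaining] = techs;
  await deps.callTech(job.id, first);
  // Schedule the follow-up check and return, instead of waiting inside the worker.
  await deps.enqueueTimeout({ jobId: job.id, remainingTechs: remaining }, DISPATCH_DELAY_MS);
  return job;
}

// Handler for the dispatch-timeout-queue (the role played by dispatchTimeoutProcessor.js).
async function handleDispatchTimeout({ jobId, remainingTechs }, deps) {
  const status = await deps.getJobStatus(jobId);
  if (status !== 'dispatching') return 'done'; // e.g. already assigned

  if (remainingTechs.length === 0) {
    // Every technician has been tried without an acceptance.
    await deps.handleDispatchFailure(jobId);
    return 'failed';
  }

  const [next, ...rest] = remainingTechs;
  await deps.callTech(jobId, next);
  await deps.enqueueTimeout({ jobId, remainingTechs: rest }, DISPATCH_DELAY_MS);
  return 'retried';
}
```

Because each handler schedules its follow-up through `enqueueTimeout` and then returns, no worker sits idle while a technician decides whether to accept.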
The system has a built-in mechanism to handle cases where no technician accepts a job.
- If the `dispatchTimeoutProcessor` runs out of technicians to try, it calls the `jobsService.handleDispatchFailure()` function.
- This function:
    - Changes the job's status from `dispatching` to `pending`.
    - Creates a notification that is sent to all users with the "Dispatcher" role.
- This ensures that any job that fails to be auto-assigned is immediately flagged for manual intervention, preventing any job from being missed.
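
A failure handler along these lines might look like the sketch below. Only the function name, the `dispatching` to `pending` transition, and the "Dispatcher" role come from the description above; the `db` and `notifier` objects, their method names, the notification payload, and the status guard are hypothetical.

```javascript
// Hypothetical sketch of jobsService.handleDispatchFailure(); the real function may differ.
// `db` and `notifier` are assumed interfaces, injected here for illustration.
async function handleDispatchFailure(jobId, { db, notifier }) {
  // Assumed guard: leave the job alone if it is no longer waiting on a technician.
  const job = await db.getJob(jobId);
  if (job.status !== 'dispatching') return false;

  // Hand the job back for manual assignment.
  await db.updateJobStatus(jobId, 'pending');

  // Notify every user holding the "Dispatcher" role.
  const dispatchers = await db.findUsersByRole('Dispatcher');
  await Promise.all(
    dispatchers.map((user) =>
      notifier.notify(user.id, { type: 'dispatch-failed', jobId })
    )
  );
  return true;
}
```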
- Scalability: The primary `dispatchProcessor` is non-blocking, allowing it to handle a large volume of incoming jobs without getting stuck in `setTimeout` delays.
- Resilience: Using a dedicated `dispatch-timeout-queue` for retries is more robust than in-memory timers. If the application restarts, the retry jobs are persisted in Redis and will be processed when the workers come back online.
- Maintainability: The logic is separated into two distinct workers, each with a clear responsibility, making the system easier to understand, debug, and extend.
- Configurability: The delay between dispatch attempts is a configurable value, allowing for easy tuning of the system's timing without code changes.
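
As an illustration of the configurability point, the delay could be read from an environment variable. This is a sketch under assumptions: the variable name `DISPATCH_TIMEOUT_MINUTES` and the helper `getDispatchDelayMs` are hypothetical, and the document does not state where the value is actually configured.

```javascript
// Assumed variable name and helper; falls back to the 5-minute example value from the text.
const DEFAULT_DELAY_MINUTES = 5;

function getDispatchDelayMs(env = process.env) {
  const minutes = Number(env.DISPATCH_TIMEOUT_MINUTES);
  // Reject unset, non-numeric, zero, or negative values rather than scheduling immediately.
  const valid = Number.isFinite(minutes) && minutes > 0;
  return (valid ? minutes : DEFAULT_DELAY_MINUTES) * 60 * 1000;
}
```

Both workers would then pass the returned value as the delay when adding a job to the `dispatch-timeout-queue`, so timing can be tuned per environment without a code change.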