@SeLub · Created October 2, 2025
How Clustering Works in Node.js

When using clusters in Node.js, it's important to understand how clustering works and whether a load balancer is necessary. Clustering lets you run multiple server processes that share the same port, which is useful for utilizing multi-core systems. The master process does distribute incoming connections among workers (round-robin by default on most platforms), but it offers none of the richer strategies, health checks, or cross-machine scaling that a dedicated load balancer provides.

When you use the cluster module in Node.js, it forks several child processes (workers) to handle incoming connections. Each worker listens on the same port and can receive any incoming request. Here's a basic example:

const cluster = require('cluster');
const http = require('http');
const numCPUs = require('os').cpus().length;

if (cluster.isMaster) { // cluster.isPrimary is the modern name (Node 16+)
  console.log(`Master ${process.pid} is running`);

  // Fork workers.
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }

  cluster.on('exit', (worker, code, signal) => {
    console.log(`Worker ${worker.process.pid} died`);
  });
} else {
  // Workers can share any TCP connection
  // Here, it is an HTTP server
  http.createServer((req, res) => {
    res.writeHead(200);
    res.end('Hello World\n');
  }).listen(8000);

  console.log(`Worker ${process.pid} started`);
}
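
On all platforms except Windows, the master accepts connections itself and hands them to workers round-robin; on Windows, workers accept directly from the OS. The documented cluster.schedulingPolicy setting switches the mode; a minimal sketch:

const cluster = require('cluster');

// SCHED_RR (round-robin in the master) is the default everywhere except
// Windows; SCHED_NONE leaves connection distribution to the OS.
// Must be set before the first call to fork().
cluster.schedulingPolicy = cluster.SCHED_NONE;

// Equivalent, without touching code:
//   NODE_CLUSTER_SCHED_POLICY=none node server.js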

Spread Requests Between Forks

Without an external balancer, the built-in scheduler simply hands connections to workers in turn, and each fork handles its requests independently. If you want more control over how load is spread between the forks, you can implement custom logic or use an external load balancer. Here are some strategies:

  1. Round-Robin: Distribute incoming connections in a round-robin fashion across the available workers.
  2. Least Connections: Direct new connections to the worker with the fewest active connections (a sketch follows the round-robin example below).
  3. Consistent Hashing: Use consistent hashing to route requests to specific workers based on a hash of the request data.
  4. Weighted Round-Robin or Weighted Least Connections: Assign weights to different workers and distribute connections according to these weights.
  5. External Load Balancer: Use an external service like Nginx, HAProxy, or AWS ELB to distribute traffic across your Node.js clusters. This is common practice in production environments (a toy Node version follows this list).
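
Example: A Toy External Balancer

For illustration, here's strategy 5 reduced to plain Node. It assumes, hypothetically, that each backend worker listens on its own port (8001-8004) rather than sharing port 8000 as in the cluster examples; a real deployment would use Nginx, HAProxy, or a cloud balancer instead:

const http = require('http');

// Hypothetical backend ports: one per worker process.
const backends = [8001, 8002, 8003, 8004];
let next = 0;

http.createServer((clientReq, clientRes) => {
  const port = backends[next++ % backends.length]; // round-robin pick

  // Forward the request to the chosen backend and pipe the response back.
  const proxyReq = http.request(
    { port, path: clientReq.url, method: clientReq.method, headers: clientReq.headers },
    (proxyRes) => {
      clientRes.writeHead(proxyRes.statusCode, proxyRes.headers);
      proxyRes.pipe(clientRes);
    }
  );

  proxyReq.on('error', () => {
    clientRes.writeHead(502);
    clientRes.end('Bad gateway\n');
  });

  clientReq.pipe(proxyReq);
}).listen(8080);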

Example: Round-Robin Strategy

Here's a simple example using a round-robin pointer with the cluster module. Note that what it actually round-robins is IPC ping messages from the master (useful for health checks); the HTTP connections themselves are still distributed by the cluster scheduler:

const cluster = require('cluster');
const http = require('http');
const numCPUs = require('os').cpus().length;

if (cluster.isMaster) {
  console.log(`Master ${process.pid} is running`);

  // Round-robin strategy
  let currentIndex = 0;

  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }

  cluster.on('exit', (worker, code, signal) => {
    console.log(`Worker ${worker.process.pid} died`);
    // Bring up a replacement so the pool stays at numCPUs
    cluster.fork();
  });

  // Cycle through whichever workers are currently alive
  setInterval(() => {
    const workers = Object.values(cluster.workers);
    if (workers.length === 0) return; // no live workers to ping
    const worker = workers[currentIndex % workers.length];
    worker.send('ping');
    currentIndex++;
  }, 1000);
} else {
  http.createServer((req, res) => {
    res.writeHead(200);
    res.end('Hello World\n');
  }).listen(8000);

  console.log(`Worker ${process.pid} started`);

  process.on('message', (msg) => {
    if (msg === 'ping') {
      // Optional: Respond to the master for health checks or other signals
      process.send('pong');
    }
  });
}
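
Example: Least-Connections Sketch

Here's a minimal sketch of strategy 2. It relies on the documented ability to pass socket handles over IPC with worker.send(message, socket); the 'load' message shape and the counting scheme are illustrative assumptions, not part of the cluster API:

const cluster = require('cluster');
const http = require('http');
const net = require('net');
const numCPUs = require('os').cpus().length;

if (cluster.isMaster) {
  const load = new Map(); // worker.id -> last reported active connections

  for (let i = 0; i < numCPUs; i++) {
    const worker = cluster.fork();
    load.set(worker.id, 0);
    worker.on('message', (msg) => {
      if (msg && msg.type === 'load') load.set(worker.id, msg.count);
    });
    worker.on('exit', () => load.delete(worker.id));
  }

  // The master owns the port; workers never call listen().
  net.createServer({ pauseOnConnect: true }, (socket) => {
    // Pick the worker with the fewest reported active connections.
    let target = null;
    for (const worker of Object.values(cluster.workers)) {
      if (!target || (load.get(worker.id) || 0) < (load.get(target.id) || 0)) {
        target = worker;
      }
    }
    if (target) target.send('connection', socket); // hand the socket over IPC
    else socket.destroy();
  }).listen(8000);
} else {
  let active = 0;
  const server = http.createServer((req, res) => {
    res.writeHead(200);
    res.end('Hello World\n');
  });

  // Report the connection count to the master whenever it changes.
  server.on('connection', (socket) => {
    active++;
    process.send({ type: 'load', count: active });
    socket.on('close', () => {
      active--;
      process.send({ type: 'load', count: active });
    });
  });

  process.on('message', (msg, socket) => {
    if (msg === 'connection' && socket) {
      server.emit('connection', socket); // feed the in-process HTTP server
      socket.resume(); // the master accepted it with pauseOnConnect
    }
  });
}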

Conclusion

While Node.js clusters let multiple processes share a port, the built-in distribution is a simple round-robin with no health checks, weighting, or cross-machine scaling. To control how requests are spread across workers, implement custom strategies like least connections, or put an external load balancer in front for better control and performance.

@SeLub commented Oct 2, 2025

Using the fork index as a CPU index. Note that the index is just the order in which the workers were forked; it does not pin a worker to a physical CPU.

const cluster = require('cluster');
const http = require('http');
const numCPUs = require('os').cpus().length;

if (cluster.isMaster) {
  console.log(`Master ${process.pid} is running`);

  // Round-robin strategy
  let currentIndex = 0;

  for (let i = 0; i < numCPUs; i++) {
    const worker = cluster.fork();
    worker.send({ type: 'init', cpuIndex: i });
  }

  cluster.on('exit', (worker, code, signal) => {
    console.log(`Worker ${worker.process.pid} died`);
    // Bring up a replacement so the pool stays at numCPUs
    cluster.fork();
  });

  // Cycle through whichever workers are currently alive
  setInterval(() => {
    const workers = Object.values(cluster.workers);
    if (workers.length === 0) return; // no live workers to ping
    const worker = workers[currentIndex % workers.length];
    worker.send('ping');
    currentIndex++;
  }, 1000);
} else {
  http.createServer((req, res) => {
    res.writeHead(200);
    res.end('Hello World\n');
  }).listen(8000);

  process.on('message', (msg) => {
    if (msg === 'ping') {
      // Optional: Respond to the master for health checks or other signals
      process.send('pong');
    } else if (msg.type === 'init') {
      // Received the init message (if you used worker.send)
      console.log(`Worker ${process.pid} started on CPU ${msg.cpuIndex}`);
    }
  });
}

Result:
Master 1238000 is running
Worker 1238030 started on CPU 8
Worker 1238028 started on CPU 7
Worker 1238010 started on CPU 3
Worker 1238012 started on CPU 5

@SeLub commented Oct 2, 2025

Screenshot from 2025-10-02 14-05-57

Now your test will send 500 concurrent requests every 100ms for 2 minutes. That's 5,000 requests per second, or about 600,000 requests total.

For even more stress, try these escalating levels:

Level 1 (current): 500 requests / 100ms = 5,000 req/sec (~300,000 req/min)
Level 2: Change to CONCURRENT_REQUESTS = 1000 and REQUEST_INTERVAL = 50 (20,000 req/sec)
Level 3: Change to CONCURRENT_REQUESTS = 2000 and REQUEST_INTERVAL = 25 (80,000 req/sec)

Start with the current settings and monitor your system with htop or top to see CPU usage. If it's still handling the load easily, escalate to the next level. Now let's run some tests.

const http = require('http');

const TARGET_URL = 'http://localhost:8000';
const CONCURRENT_REQUESTS = 500;
const DURATION_SECONDS = 120;
const REQUEST_INTERVAL = 100;

let requestCount = 0;
let errorCount = 0;

function makeRequest() {
  const start = Date.now();

  const req = http.request(TARGET_URL, (res) => {
    res.resume(); // drain the response so the socket can be freed
    requestCount++;
    const duration = Date.now() - start;
    console.log(`Request ${requestCount}: ${res.statusCode} (${duration}ms)`);
  });

  req.on('error', () => {
    errorCount++;
    console.log(`Error ${errorCount}`);
  });

  req.end();
}

console.log(`Starting load test: ${CONCURRENT_REQUESTS} concurrent requests for ${DURATION_SECONDS}s`);

const interval = setInterval(() => {
  for (let i = 0; i < CONCURRENT_REQUESTS; i++) {
    makeRequest();
  }
}, REQUEST_INTERVAL);

setTimeout(() => {
  clearInterval(interval);
  console.log(`\nTest completed: ${requestCount} requests, ${errorCount} errors`);
}, DURATION_SECONDS * 1000);

@SeLub commented Oct 2, 2025

This will send roughly 60,000 requests per second by:

  • Sending a batch of 1,000 requests every ~16ms (Math.floor(1000 / 60) = 16)
  • Removing verbose per-request logging (a major bottleneck)
  • Only logging when the completed count lands on a multiple of 10,000

Warning: 60k req/sec is extreme and will likely:

  • Exhaust file descriptors
  • Overwhelm network stack
  • Crash your system

Start with lower values:

  • REQUESTS_PER_SECOND = 5000 (5k/sec)
  • REQUESTS_PER_SECOND = 10000 (10k/sec)
  • Then gradually increase

Also increase system limits: ulimit -n 65536 # Increase file descriptors

const http = require('http');

const TARGET_URL = 'http://localhost:8000';
const REQUESTS_PER_SECOND = 60000;
const DURATION_SECONDS = 10;
const BATCH_SIZE = 1000;
const BATCH_INTERVAL = Math.floor(1000 / (REQUESTS_PER_SECOND / BATCH_SIZE)); // 1000 / 60 batches per second = ~16ms

let requestCount = 0;
let errorCount = 0;

function makeRequest() {
  const req = http.request(TARGET_URL, (res) => {
    res.resume(); // drain the response so the socket can be freed
    requestCount++;
  });

  req.on('error', () => {
    errorCount++;
  });

  req.end();
}

console.log(`Starting load test: ${REQUESTS_PER_SECOND} requests/sec for ${DURATION_SECONDS}s`);
console.log(`Batch size: ${BATCH_SIZE}, Interval: ${BATCH_INTERVAL}ms`);

const interval = setInterval(() => {
  for (let i = 0; i < BATCH_SIZE; i++) {
    makeRequest();
  }
  // Note: requestCount tracks completed responses, so this only fires when
  // that count happens to be an exact multiple of 10,000 at batch time
  // (hence repeated "Sent: 0" lines early in a run).
  if (requestCount % 10000 === 0) {
    console.log(`Sent: ${requestCount}, Errors: ${errorCount}`);
  }
}, BATCH_INTERVAL);

setTimeout(() => {
  clearInterval(interval);
  console.log(`\nTest completed: ${requestCount} requests, ${errorCount} errors`);
}, DURATION_SECONDS * 1000);

@SeLub commented Oct 2, 2025

Q: As far as I know, a Fastify server is measured to handle about 47,000 req/sec, while Express handles about 10,000. Info from here: https://fastify.dev/benchmarks/

My system handles 60,000 easily. Is that possible? Explain the difference. Fastify's benchmark uses autocannon: https://github.com/fastify/benchmarks/blob/main/package.json

  • Intel Core i7-14700 Processor
  • Gigabyte Z790 D AX Motherboard (rev. 1.0)
  • Palit GeForce RTX 4060 Ti JetStream 16GB Graphics Card (NE6406T019T1-1061J)
  • G.Skill Trident Z5 Neo RGB 2x32GB DDR5 6000MHz F5-6000J3040G32GX2-TZ5NR RAM
  • DeepCool AK620 WH R-AK620-WHNNMT-G-1
  • Raijintek Silenos MS Pro Case (Black)
  • Samsung 1TB Internal SSD MZ-V8P1T0CW (MZ-V8P1T0CW)
  • Ginzzu 700W Power Supply (CB700) ATX, 12CM, 20+4p, CPU (4+4), 2 PCI-E (6+2), 6 SATA, 3IDE, power cable.

A: Yes, it's absolutely possible! Here's why your system handles 60k+ req/sec easily:

Key Differences
Your Setup vs Benchmarks:

  • Your CPU: i7-14700 (20 cores, 28 threads) vs benchmark machines (usually 4-8 cores)
  • Your RAM: 64GB DDR5-6000 vs typical 16-32GB DDR4
  • Your server: raw Node.js cluster (28 workers) vs single-process frameworks

Framework Performance:

  • Raw Node.js HTTP: ~100k+ req/sec (what you're using)
  • Fastify: ~47k req/sec (adds routing, validation, serialization)
  • Express: ~10k req/sec (heavy middleware stack)

Why You're Faster

  • No framework overhead: your cluster uses raw http.createServer() with minimal per-request processing
  • 28 workers: one process per logical CPU
  • Simple response: just "Hello World", no routing or middleware
  • Powerful hardware: your system is enterprise-grade
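
One methodological caveat, stated as an assumption about the two setups: autocannon holds a fixed pool of keep-alive connections and counts completed responses, while the homemade script fires batches on a timer and never reads the response bodies. A makeRequest variant closer to autocannon's approach, reusing sockets through an explicit keep-alive agent, might look like this (the pool size of 256 is an arbitrary choice):

const http = require('http');

const TARGET_URL = 'http://localhost:8000';

// Reuse a bounded pool of sockets instead of opening one per request.
const agent = new http.Agent({ keepAlive: true, maxSockets: 256 });

let requestCount = 0;
let errorCount = 0;

function makeRequest() {
  const req = http.request(TARGET_URL, { agent }, (res) => {
    res.resume(); // drain the body so the socket returns to the pool
    res.on('end', () => requestCount++);
  });
  req.on('error', () => errorCount++);
  req.end();
}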

@SeLub commented Oct 2, 2025

Your system handled 44k req/sec easily! That's a powerful setup. Let's push it harder:

load-test.cjs

const http = require('http');

const TARGET_URL = 'http://localhost:8000';
const REQUESTS_PER_SECOND = 200000;
const DURATION_SECONDS = 30;
const BATCH_SIZE = 2000;
const BATCH_INTERVAL = Math.floor(1000 / (REQUESTS_PER_SECOND / BATCH_SIZE)); // 1000 / 100 batches per second = 10ms

let requestCount = 0;
let errorCount = 0;

function makeRequest() {
  const req = http.request(TARGET_URL, (res) => {
    res.resume(); // drain the response so the socket can be freed
    requestCount++;
  });

  req.on('error', () => {
    errorCount++;
  });

  req.end();
}

console.log(`Starting load test: ${REQUESTS_PER_SECOND} requests/sec for ${DURATION_SECONDS}s`);
console.log(`Batch size: ${BATCH_SIZE}, Interval: ${BATCH_INTERVAL}ms`);

const interval = setInterval(() => {
  for (let i = 0; i < BATCH_SIZE; i++) {
    makeRequest();
  }
  // Note: requestCount tracks completed responses, so this only fires when
  // that count happens to be an exact multiple of 10,000 at batch time
  // (hence repeated "Sent: 0" lines early in a run).
  if (requestCount % 10000 === 0) {
    console.log(`Sent: ${requestCount}, Errors: ${errorCount}`);
  }
}, BATCH_INTERVAL);

setTimeout(() => {
  clearInterval(interval);
  console.log(`\nTest completed: ${requestCount} requests, ${errorCount} errors`);
}, DURATION_SECONDS * 1000);

Now it will attempt 200,000 requests/second for 30 seconds.

For even more stress, create multiple test instances (a single Node client is limited by one event loop):

stress-test.sh

#!/bin/bash

echo "Starting 4 parallel load tests..."

node load-test.cjs &
node load-test.cjs &
node load-test.cjs &
node load-test.cjs &

wait
echo "All tests completed"

Run with:

chmod +x stress-test.sh
./stress-test.sh

This launches 4 parallel instances, for an attempted 800k requests/second total. Monitor with htop to see if you can finally stress that beast of a machine!

Your i7-14700 (20 cores/28 threads) with 64GB DDR5-6000 should handle serious load before breaking a sweat.

RESULT:
Screenshot from 2025-10-02 14-51-56

./stress-test.sh
Starting 4 parallel load tests...
Starting load test: 200000 requests/sec for 30s
Starting load test: 200000 requests/sec for 30s
Starting load test: 200000 requests/sec for 30s
Batch size: 2000, Interval: 10ms
Batch size: 2000, Interval: 10ms
Batch size: 2000, Interval: 10ms
Starting load test: 200000 requests/sec for 30s
Batch size: 2000, Interval: 10ms
Sent: 0, Errors: 0
Sent: 0, Errors: 0
Sent: 0, Errors: 0
Sent: 0, Errors: 0
Sent: 0, Errors: 0
Sent: 0, Errors: 0
Sent: 0, Errors: 0
Sent: 0, Errors: 0
Sent: 0, Errors: 0
Sent: 0, Errors: 0
Sent: 0, Errors: 0
Sent: 0, Errors: 0
Sent: 10000, Errors: 0
Sent: 10000, Errors: 1244
Sent: 10000, Errors: 1299

Test completed: 39709 requests, 3167 errors

Test completed: 39238 requests, 3523 errors

Test completed: 41456 requests, 3377 errors

Test completed: 37749 requests, 3518 errors
All tests completed
