Node applications run in a single thread. Requests to the application are queued and usually resolved asynchronously, so under heavy load the application responds slowly.
Heavy load can be spread across multiple processes, each of which serves requests. All requests to the application arrive at a single master process. The master creates a cluster of workers and passes requests to them. Each worker is a Node server running in its own process, and the workers share the application's port.
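A minimal sketch of this arrangement using Node's built-in `cluster` module, under which workers share one listening port. The helper name `startCluster`, the port `3000`, the default of one worker per CPU core, and the `clusterModule` parameter are assumptions made for illustration, not details from the text above.

```javascript
const cluster = require('cluster');
const http = require('http');
const os = require('os');

// Hypothetical helper. `clusterModule` is a parameter only so the function
// can be exercised with a stand-in; normally the built-in module is used.
function startCluster(port, workerCount = os.cpus().length, clusterModule = cluster) {
  // `isPrimary` is the newer name (Node 16+) for `isMaster`.
  const isMaster =
    clusterModule.isPrimary !== undefined
      ? clusterModule.isPrimary
      : clusterModule.isMaster;

  if (isMaster) {
    // The master only forks workers; it does not serve requests itself.
    for (let i = 0; i < workerCount; i++) {
      clusterModule.fork();
    }
    return null;
  }

  // Each worker is a separate process running its own HTTP server.
  // The cluster module lets every worker listen on the same port.
  const server = http.createServer((req, res) => {
    res.end(`handled by worker ${process.pid}\n`);
  });
  server.listen(port);
  return server;
}

// Example entry point (assumed port):
// startCluster(3000);
```

Because every worker re-runs the same script, the single call in the entry point covers both roles: the master takes the forking branch and each worker takes the server branch.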
When your code hits a problem, it throws an exception. If the exception is not handled properly, the server process goes down and stops handling new requests: a user who requests a page while your code throws an uncaught exception gets a page that never responds. When all requests pass through the master process, the master can send the request to another worker when one goes down.
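One common way to keep capacity after such a crash is to have the master fork a replacement when a worker dies. The sketch below assumes the built-in `cluster` module. The helper name `keepWorkersAlive` and its `clusterModule` parameter are hypothetical, and replacing workers goes a step beyond the rerouting described above.

```javascript
const cluster = require('cluster');

// Hypothetical helper: replace any worker that dies unexpectedly.
// `clusterModule` is a parameter only so a stand-in can be passed in.
function keepWorkersAlive(clusterModule = cluster) {
  clusterModule.on('exit', (worker, code, signal) => {
    // exitedAfterDisconnect is true when the exit was requested
    // (for example a planned shutdown), so no replacement is needed.
    if (worker.exitedAfterDisconnect) {
      return;
    }
    console.error(
      `worker ${worker.process.pid} died (${signal || code}); forking a replacement`
    );
    clusterModule.fork();
  });
}
```

The master would call `keepWorkersAlive()` once, before or after forking its initial workers.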
When a worker has to go down, send it a command to shut down and close its process. If the process hangs and refuses to exit, end it forcefully: create a method that terminates the process when the worker does not shut down gracefully. Incoming requests are distributed across the workers round robin.
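The graceful-then-forced shutdown can be sketched as follows, assuming the worker objects come from Node's `cluster` module. The helper name `shutdownWorker` and the 5000 ms default timeout are assumptions chosen for illustration.

```javascript
// Hypothetical helper: ask a worker to shut down, and kill it if it hangs.
function shutdownWorker(worker, timeoutMs = 5000) {
  // Force-kill the process if it has not exited within timeoutMs.
  const timer = setTimeout(() => {
    worker.process.kill('SIGKILL');
  }, timeoutMs);

  // Cancel the forced kill once the worker exits on its own.
  worker.once('exit', () => clearTimeout(timer));

  // disconnect() tells the worker to close its servers and its channel to
  // the master, after which the worker process can exit by itself.
  worker.disconnect();
  return timer;
}
```

In the master this could be applied to every worker, for example `Object.values(cluster.workers).forEach((w) => shutdownWorker(w))`.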
These child Nodes are still whole new instances of V8. Assume at least 30 ms startup time and 10 MB of memory for each new Node. That is, you cannot create many thousands of them.
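Given that per-process cost, one simple policy is to cap the number of workers at the number of CPU cores. That cap is an assumption of this sketch rather than a rule from the text, and the helper name `chooseWorkerCount` is hypothetical.

```javascript
const os = require('os');

// Hypothetical helper: never fork more workers than there are CPU cores,
// and always fork at least one.
function chooseWorkerCount(requested, cpuCount = os.cpus().length) {
  const wanted = Math.floor(Number(requested));
  if (!Number.isFinite(wanted) || wanted < 1) {
    return 1;
  }
  return Math.min(wanted, Math.max(1, cpuCount));
}
```

Using the figures above, 8 workers would need at least roughly 8 × 10 MB = 80 MB of memory before the application loads anything of its own.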