(Same as what I posted on SO a couple of days back.)
I have the sample code below, where I am forking child processes, starting a TCP server in each worker, and closing the server after a timeout. When I look at the TCP connections after the timeout, I still see one connection in the LISTEN state.
There are also no client connections on this server, and I see that the server's 'close' event fires for each worker.
Strangely, the server still accepts new connections. How can I close the TCP server completely in a cluster environment?
I am using node version 0.10.31.
var net = require('net');
var cluster = require('cluster');
var numCPUs = require('os').cpus().length;

if (cluster.isMaster) {
  // Fork workers.
  for (var i = 0; i < numCPUs; i++) {
    cluster.fork();
  }
} else {
  var server = net.createServer(function(sock) {
    console.log('received connection...');
  });

  server.on('listening', function() {
    console.log('listening....');
    setTimeout(function() {
      server.close();
    }, 5000);
  });

  server.on('close', function() {
    console.log('server closed');
  });

  server.listen('2222', '101.30.33.194');
}
Running lsof after the timeout to check the TCP connections:
#lsof -ni -P | grep 2222
node 17228 root 14u IPv4 440505953 0t0 TCP 101.30.33.194:2222 (LISTEN)
> (Same as what I posted on SO a couple of days back.)

Post the link so I can get SO points! :-)
> I have the sample code below, where I am forking child processes, starting
> a TCP server in each worker, and closing the server after a timeout. When I
> look at the TCP connections after the timeout, I still see one connection
> in the LISTEN state.

That would be the master. What happens in cluster is that the master opens
a listening socket (once), and every worker that "listens" on that same
port afterwards just gets a duplicate descriptor for that single socket in
the master. That socket is always in the LISTEN state, but it does not, in
fact, accept any connections. The close event just means the socket in the
worker is closed; the underlying listening socket may (and in this case
does) have other socket descriptors referencing it in other node processes.
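A quick way to confirm this (a diagnostic sketch, not part of the original
question): print the pid in the master and in each worker, then compare the
master's pid with the PID column of the leftover lsof LISTEN line after the
timeout.

var cluster = require('cluster');

if (cluster.isMaster) {
  // After the workers close their servers, the remaining LISTEN line in
  // `lsof -ni -P | grep 2222` should carry this pid.
  console.log('master pid:', process.pid);
  cluster.fork();
} else {
  console.log('worker pid:', process.pid);
  // ... create, listen on, and close the server exactly as in the
  // original example ...
}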
It's possible, in theory, for the master to notice that no worker is
using the socket anymore, and close it. I'm not sure if that would be
a feature, though. As it is, your connections are handshaked by the TCP
stack, made available on the socket, and left waiting, so a new worker
that is forked AFTER a TCP connection has handshaked should start getting
those connections.
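A minimal sketch of that behaviour, reusing the port and address from the
original example; the single initial worker and the 15-second delay before
forking the replacement are arbitrary choices, not something from the thread:

var net = require('net');
var cluster = require('cluster');

if (cluster.isMaster) {
  cluster.fork();
  // Fork a replacement worker well after the first one has closed its
  // server; it should immediately receive any connections that handshaked
  // in the meantime and are still queued on the master's listening socket.
  setTimeout(function() {
    cluster.fork();
  }, 15000);
} else {
  var server = net.createServer(function(sock) {
    console.log('worker', process.pid, 'received connection');
  });
  server.on('listening', function() {
    setTimeout(function() {
      server.close();
    }, 5000);
  });
  server.listen(2222, '101.30.33.194');
}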
cluster is very, very biased in its design. It's for persistently-up,
symmetrical TCP servers (workers may come and go, but the assumption is
they keep coming back), and it has giant caveats when used in other ways.
Opening a TCP server and then closing it, but not exiting the cluster,
which is what you are doing, is an edge case.
Is this actually causing you problems, or did it just surprise you
when poking around? Why are you trying to close a TCP server but not
exit the cluster?
Yes, it is causing a problem. The application is not just a server, so we
cannot simply exit the cluster. Based on some criteria we want to start the
server (accept connections) or close it so that no worker accepts
connections. We need more control here.
Yes, as you said, it looks like it is the master that is not getting killed
and is still accepting connections.
Isn't this inconsistent, or a bug or a gap? If we are using cluster, there
is no way we can close the server gracefully without killing the node
application.
Should we file a jira?
And for your points :)
> Yes, it is causing a problem. The application is not just a server, so we
> cannot simply exit the cluster. Based on some criteria we want to start the
> server (accept connections) or close it so that no worker accepts
> connections. We need more control here.
Just to be clear: if all workers close, then NO worker will ever get a
connection accepted.
> Yes, as you said, it looks like it is the master that is not getting killed
> and is still accepting connections.
And the master does NOT accept connections. The kernel DOES accept
them, but they will never be accepted() by the node application, and
will eventually time out.
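A small client-side check of this, run from another shell after the
5-second timeout has fired in every worker (the 30-second give-up timer
is an arbitrary choice):

var net = require('net');

var sock = net.connect(2222, '101.30.33.194', function() {
  // The three-way handshake completed, so the kernel accepted the
  // connection even though no node process will ever handle it.
  console.log('handshake completed');
});

sock.setTimeout(30000, function() {
  console.log('no application ever picked up the connection; giving up');
  sock.destroy();
});

sock.on('error', function(err) {
  console.log('connect failed:', err.message);
});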
> Isn't this inconsistent, or a bug or a gap? If we are using cluster, there
> is no way we can close the server gracefully without killing the node
> application.
>
> Should we file a jira?
Not worth it against v0.10; changing this would be either a feature
addition, which is not allowed, or backwards incompatible, which is also
not allowed.
Try v0.11: it is sufficiently different that it might already work the way
you want, and it's "pretty close" to release.
You may also consider using undocumented features of cluster. This
code won't be portable across v0.10 and v0.11/0.12, but if you poke
around in the cluster master's undocumented internal properties, I
think you can find the data structures it uses to store the server
sockets. If you can get the handles, you can close them yourself.
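An exploratory sketch of that idea only; the handle store is undocumented,
its name and shape differ between versions, and nothing below is guaranteed,
so verify against your node version's lib/cluster.js before relying on it:

var cluster = require('cluster');

if (cluster.isMaster) {
  // Step 1: see what this version of the module actually exposes.
  Object.keys(cluster).forEach(function(key) {
    console.log('cluster.' + key, '=>', typeof cluster[key]);
  });

  // Step 2 (hypothetical): once you have located the internal handle
  // store for your version, each entry should wrap the real TCP handle,
  // which can be closed directly, e.g.:
  //
  //   var handles = cluster._handles;        // hypothetical name, verify
  //   Object.keys(handles).forEach(function(key) {
  //     handles[key].close();                // closes the listen socket
  //   });
}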