Date: Sun, 16 Oct 2011 21:05:22 +0200
From: Tijl Coosemans <tijl@coosemans.org>
To: freebsd-questions@freebsd.org
Cc: Vikash Badal <Vikash.Badal@is.co.za>
Subject: Re: probably stupid questions about select() and FS_SET in a multithreaded environment [ select() failed (Bad file descriptor) ]
Message-ID: <201110162105.31476.tijl@coosemans.org>
In-Reply-To: <9B425C841283E0418B1825D40CBCFA616E38A1AC3D@ZABRYSVISEXMBX1.af.didata.local>
References: <9B425C841283E0418B1825D40CBCFA616E38A1AC3D@ZABRYSVISEXMBX1.af.didata.local>

On Sunday 16 October 2011 18:18:39 Vikash Badal wrote:
> Greetings,
>
> Can someone point me in the right direction, please?
>
> I have a threaded socket application that has a problem with select()
> returning -1.
> The select() and accept() calls are handled in one thread. The worker
> threads deal with client requests after the new client connection is
> pushed to a queue.
>
> The logged error is :
> select() failed (Bad file descriptor) getdtablesize = 65536
>
> Sysctls at the moment are:
> kern.maxfiles: 65536
> kern.maxfilesperproc: 65536
>
>
> <code>
> void client_accept(int listen_socket)
> {
>     ...
>     while ( loop )
>     {
>         FD_ZERO(&socket_set);
>         FD_SET(listen_socket, &socket_set);
>         timeout.tv_sec = 1;
>         timeout.tv_usec = 0;
>
>         rcode = select(listen_socket + 1, &socket_set, NULL, NULL, &timeout);
>
>         if ( rcode < 0 )
>         {
>             Log(DEBUG_0, "ERROR: select() failed (%s) getdtablesize = %d",
>                 strerror(errno), getdtablesize());
>             loop = 0;
>             sleep(30);
>             fcloseall();
>             assert(1==0);
>         }
>
>         if ( rcode > 0 )
>         {
>             remotelen = sizeof(remote);
>             client_sock = accept(listen_socket, .....
>
>             if ( client_sock != -1 )
>             {
>                 // Allocate memory for request
>                 request = malloc(sizeof(struct requests));
>                 // test for malloc etc ...
>                 // set request values ...
>                 //
>                 // Push request to a queue.
>             }
>         }
>
>     }
>     ...
> }
> void* tcpworker(void* arg)
> {
>     // initialise stuff
>
>     while ( loop )
>     {
>         // pop request from queue
>
>         if ( request != NULL )
>         {
>             // deal with request
>             free(request);
>         }
>     }
> }
>
> </code>
> When the problem occurs, I have between 1000 and 1400 clients
> connected.
>
> Questions:
> 1. Do I need to FD_CLR(client_sock, &socket_set) before I push to the
> queue?
> 2. Do I need to FD_CLR(client_sock, &socket_set) when this client
> request closes in the tcpworker() function?
> 3. Would setting kern.maxfilesperproc and kern.maxfiles to higher
> values solve the problem, or would it just take longer for the
> problem to reappear?
> 4. Should I replace select() with kqueue()? From googling, it
> seems select() is not that great.

The size of an fd_set is limited by FD_SETSIZE, which is 1024 by
default. If you pass a descriptor greater than or equal to FD_SETSIZE
to FD_SET() or select(), you overflow the fd_set buffer, and memory
beyond the fd_set can become corrupted.

You can define FD_SETSIZE to a larger value before including
sys/select.h, but you should still verify that a descriptor is less
than FD_SETSIZE before using it with select() or any of the fd_set
macros, and return an error if it is not.

kqueue doesn't have this limit, but it is not as portable as select().
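
For comparison, a rough sketch of waiting on the listen socket with
kqueue instead of select(). The function name wait_for_connection() and
its return convention are assumptions for illustration. Creating a
kqueue on every call keeps the example short; a real server would
normally create it once and reuse it. On systems without kqueue the
sketch just fails with ENOSYS:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#if defined(__FreeBSD__) || defined(__NetBSD__) || defined(__OpenBSD__) || \
    defined(__DragonFly__) || defined(__APPLE__)
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait up to timeout_sec seconds for
 * listen_socket to become readable (a connection is pending).
 * Returns 1 if readable, 0 on timeout, -1 on error (errno set).
 */
int wait_for_connection(int listen_socket, int timeout_sec)
{
    struct kevent change, event;
    struct timespec timeout;
    int kq, n, saved_errno;

    assert(timeout_sec >= 0);
    timeout.tv_sec = timeout_sec;
    timeout.tv_nsec = 0;

    kq = kqueue();
    if (kq == -1)
        return -1;

    /* Register interest in read events and wait in a single call. */
    EV_SET(&change, listen_socket, EVFILT_READ, EV_ADD, 0, 0, NULL);
    n = kevent(kq, &change, 1, &event, 1, &timeout);

    /* A failed registration is reported as an event with EV_ERROR. */
    if (n > 0 && (event.flags & EV_ERROR)) {
        errno = (int)event.data;
        n = -1;
    }

    saved_errno = errno;
    close(kq);
    errno = saved_errno;

    return n > 0 ? 1 : n;
}
#else
/* No kqueue on this platform: report that instead of guessing. */
int wait_for_connection(int listen_socket, int timeout_sec)
{
    assert(timeout_sec >= 0);
    (void)listen_socket;
    errno = ENOSYS;
    return -1;
}
#endif
```

The descriptor is passed as a plain identifier rather than as a bit in
a fixed-size set, so its numeric value does not matter here.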
