From: Vikash Badal <Vikash.Badal@is.co.za>
To: "freebsd-questions@freebsd.org" <freebsd-questions@freebsd.org>
Date: Sun, 16 Oct 2011 18:18:39 +0200
Message-ID: <9B425C841283E0418B1825D40CBCFA616E38A1AC3D@ZABRYSVISEXMBX1.af.didata.local>
Subject: probably stupid questions about select() and FD_SET in a
 multithreaded environment [ select() failed (Bad file descriptor) ]

Greetings,

Can someone point me in the right direction, please?

I have a threaded socket application that has a problem with select() returning -1.
select() and accept() are handled in one thread. The worker threads deal with client requests after each new client connection is pushed to a queue.

The logged error is :
select() failed (Bad file descriptor) getdtablesize = 65536

Sysctls at the moment  are:
kern.maxfiles: 65536
kern.maxfilesperproc: 65536


<code>
void client_accept(int listen_socket)
{
...
   while ( loop )
   {
      // Rebuild the set on every pass: only the listening socket is watched
      FD_ZERO(&socket_set);
      FD_SET(listen_socket, &socket_set);
      timeout.tv_sec = 1;
      timeout.tv_usec = 0;

      rcode = select(listen_socket + 1, &socket_set, NULL, NULL, &timeout);

      if ( rcode < 0 )
      {
         Log(DEBUG_0, "ERROR: select() failed (%s) getdtablesize = %d",
             strerror(errno), getdtablesize());
         loop = 0;
         sleep(30);
         fcloseall();
         assert(1==0);
      }

      if ( rcode > 0 )
      {
          remotelen = sizeof(remote);
          client_sock = accept(listen_socket, .....

          if (client_sock != -1 )
          {
             // Allocate memory for request
             request = malloc(sizeof(struct requests));
             // test for malloc etc ...
             // set request values ...
             //
             // Push request to a queue.
          }
      }

   }
 ...
}
void* tcpworker(void* arg)
{
   // initialise stuff

   while ( loop )
   {
      // pop request from queue

      if ( request != NULL )
      {
         // deal with request
         free(request);
      }
   }
}

</code>
When the problem occurs, I have between 1000 and 1400 clients connected.
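
A possible diagnostic for the loop above, as a sketch only. select() reports EBADF when a descriptor in one of its sets is not open at the time of the call, and FD_SET() is only defined for descriptors below FD_SETSIZE (1024 by default on FreeBSD, independent of kern.maxfilesperproc). The helper name fd_usable_for_select() is hypothetical and not part of the application; the idea is to call it on listen_socket just before FD_SET() and log which condition fails.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the original program).
 * Returns 1 if fd is open and small enough to be stored in an fd_set,
 * 0 otherwise. On failure errno is set to:
 *   EBADF  - fd is negative or not an open descriptor
 *   EINVAL - fd is open but >= FD_SETSIZE, so FD_SET() must not be used
 */
int fd_usable_for_select(int fd)
{
    /* fcntl(F_GETFD) fails with EBADF if the descriptor is not open */
    if (fd < 0 || fcntl(fd, F_GETFD) == -1) {
        errno = EBADF;
        return 0;
    }
    if (fd >= (int)FD_SETSIZE) {
        errno = EINVAL;
        return 0;
    }
    return 1;
}
```

If the check fails with EBADF on listen_socket, some other part of the process has closed that descriptor; if it fails with EINVAL, the descriptor number is too large for an fd_set.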

Questions:
1. Do I need to FD_CLR(client_sock, &socket_set) before I push to the queue?
2. Do I need to FD_CLR(client_sock, &socket_set) when this client request closes in the tcpworker() function?
3. Would setting kern.maxfilesperproc and kern.maxfiles to higher values solve the problem, or would it just take longer for the problem to reappear?
4. Should I replace select() with kqueue()? From googling, it seems select() is not that great.
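
To make question 4 concrete, here is a minimal sketch of the wait step written with poll() instead of select(). Note that this is poll(), not kqueue(): it is used here only because it is portable and has no FD_SETSIZE limit, and it is not a claim about what the application should end up using. The function name wait_for_connection() and the millisecond timeout parameter are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical replacement for the select() call in client_accept().
 * Waits up to timeout_ms milliseconds for listen_socket to become readable.
 * Returns 1 if a connection is pending, 0 on timeout, and -1 on error
 * (errno set by poll(), or EBADF if the descriptor is not open).
 */
int wait_for_connection(int listen_socket, int timeout_ms)
{
    struct pollfd pfd;
    int rcode;

    pfd.fd = listen_socket;
    pfd.events = POLLIN;
    pfd.revents = 0;

    do {
        rcode = poll(&pfd, 1, timeout_ms);
    } while (rcode < 0 && errno == EINTR);  /* retry if interrupted by a signal */

    if (rcode <= 0)
        return rcode;                       /* 0 = timeout, -1 = poll() failed */

    if (pfd.revents & POLLNVAL) {           /* descriptor was not open */
        errno = EBADF;
        return -1;
    }
    return (pfd.revents & (POLLIN | POLLERR | POLLHUP)) ? 1 : 0;
}
```

With this shape, the accept thread would call wait_for_connection(listen_socket, 1000) in place of the FD_ZERO()/FD_SET()/select() sequence and call accept() only when it returns 1.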


Thanks
Vikash
