From: Tijl Coosemans
To: freebsd-questions@freebsd.org
Cc: Vikash Badal
Date: Sun, 16 Oct 2011 21:05:22 +0200
Subject: Re: probably stupid questions about select() and FS_SET in a multithreaded environment [ select() failed (Bad file descriptor) ]
Message-Id: <201110162105.31476.tijl@coosemans.org>
In-Reply-To: <9B425C841283E0418B1825D40CBCFA616E38A1AC3D@ZABRYSVISEXMBX1.af.didata.local>

On Sunday 16 October 2011 18:18:39 Vikash Badal wrote:
> Greetings,
>
> Can someone point me in the correct direction please.
>
> I have a threaded socket application that has a problem with select()
> returning -1.
> The select() and accept() is taken care of in one thread. The worker
> threads deal with client requests after the new client connection is
> pushed to queue.
>
> The logged error is :
> select() failed (Bad file descriptor) getdtablesize = 65536
>
> Sysctls at the moment are:
> kern.maxfiles: 65536
> kern.maxfilesperproc: 65536
>
>
> void client_accept(int listen_socket)
> {
>    ...
>    while ( loop )
>    {
>       FD_ZERO(&socket_set);
>       FD_SET(listen_socket, &socket_set);
>       timeout.tv_sec = 1;
>       timeout.tv_usec = 0;
>
>       rcode = select(listen_socket + 1, &socket_set, NULL, NULL, &timeout);
>
>       if ( rcode < 0 )
>       {
>          Log(DEBUG_0, "ERROR: select() failed (%s) getdtablesize = %d",
>              strerror(errno), getdtablesize());
>          loop = 0;
>          sleep(30);
>          fcloseall();
>          assert(1==0);
>       }
>
>       if ( rcode > 0 )
>       {
>          remotelen = sizeof(remote);
>          client_sock = accept(listen_socket, .....
>
>          if (msgsock != -1 )
>          {
>             // Allocate memory for request
>             request = malloc(sizeof(struct requests));
>             // test for malloc etc ...
>             // set request values ...
>             //
>             // Push request to a queue.
>          }
>       }
>
>    }
>    ...
> }
> void* tcpworker(void* arg)
> {
>    // initialise stuff
>
>    while ( loop )
>    {
>       // pop request from queue
>
>       if ( request != NULL )
>       {
>          // deal with request
>          free(request);
>       }
>    }
> }
>
>
> When the problem occurs, I have between 1000 and 1400 clients
> connected.
>
> Questions:
> 1. do I need to FD_CLR(client_sock, &socket_set) before I push to a
>    queue ?
> 2. do I need to FD_CLR(client_sock, &socket_set) when this client
>    request closes in the tcpworker() function ?
> 3. would setting kern.maxfilesperproc and kern.maxfiles to higher
>    values solve the problem or just take longer for the problem to
>    re-appear.
> 4. should I replace select() with kqueue() as from google-ing it
>    seems select() is not that great.

The size of an fd_set is limited by FD_SETSIZE, which is 1024 by
default. So if you pass a descriptor equal to or larger than that to
FD_SET() or select(), you have a buffer overflow and memory beyond the
fd_set can become corrupted.

You can define FD_SETSIZE to a larger value before including
sys/select.h, but you should also verify that a descriptor is less than
FD_SETSIZE before using it with select() or any of the fd_set macros,
and return an error if it is not.

kqueue doesn't have this problem, but it's not as portable as select.
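
The FD_SETSIZE check described above could look roughly like the sketch
below. The helper names fd_set_checked() and wait_readable() are made up
for illustration and are not part of any system API, and the choice of
EINVAL as the error code is an assumption; use whatever error reporting
fits your program.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/*
 * Hypothetical helper: add fd to set only if it fits inside an fd_set.
 * Returns 0 on success, or -1 with errno set to EINVAL (an assumed
 * choice) when fd is negative or not less than FD_SETSIZE.
 */
int
fd_set_checked(int fd, fd_set *set)
{
    if (fd < 0 || fd >= FD_SETSIZE) {
        errno = EINVAL;
        return -1;
    }
    FD_SET(fd, set);
    return 0;
}

/*
 * Hypothetical helper: wait up to timeout_sec seconds for fd to become
 * readable. Returns the result of select(), or -1 without calling
 * select() at all when fd cannot be stored safely in an fd_set.
 */
int
wait_readable(int fd, long timeout_sec)
{
    fd_set readfds;
    struct timeval timeout;

    FD_ZERO(&readfds);
    if (fd_set_checked(fd, &readfds) == -1)
        return -1;

    timeout.tv_sec = timeout_sec;
    timeout.tv_usec = 0;
    return select(fd + 1, &readfds, NULL, NULL, &timeout);
}
```

In an accept loop like the quoted one, wait_readable(listen_socket, 1)
would stand in for the FD_ZERO/FD_SET/select sequence, and the same
range check would apply to any other descriptor that is ever put into
an fd_set.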