From: Vikash Badal <Vikash.Badal@is.co.za>
To: "freebsd-questions@freebsd.org" <freebsd-questions@freebsd.org>
Date: Sun, 16 Oct 2011 18:18:39 +0200
Subject: probably stupid questions about select() and FS_SET in a multithreaded environment [ select() failed (Bad file descriptor) ]

Greetings,

Can someone point me in the right direction, please?

I have a threaded socket application that has a problem with select() returning -1.

The select() and accept() calls are handled in one thread. The worker threads deal with client requests after each new client connection is pushed to a queue.

The logged error is:
select() failed (Bad file descriptor) getdtablesize = 65536

Sysctls at the moment are:
kern.maxfiles: 65536
kern.maxfilesperproc: 65536

<code>
void client_accept(int listen_socket)
{
...
    while ( loop )
    {
        /* Rebuild the set on every pass; it only ever holds the listen socket. */
        FD_ZERO(&socket_set);
        FD_SET(listen_socket, &socket_set);
        timeout.tv_sec = 1;
        timeout.tv_usec = 0;

        rcode = select(listen_socket + 1, &socket_set, NULL, NULL, &timeout);

        if ( rcode < 0 )
        {
            Log(DEBUG_0, "ERROR: select() failed (%s) getdtablesize = %d",
                strerror(errno), getdtablesize());
            loop = 0;
            sleep(30);
            fcloseall();
            assert(1==0);
        }

        if ( rcode > 0 )
        {
            remotelen = sizeof(remote);
            client_sock = accept(listen_socket, .....

            if (client_sock != -1 )
            {
                // Allocate memory for request
                request = malloc(sizeof(struct requests));
                // test for malloc etc ...
                // set request values ...
                //
                // Push request to a queue.
            }
        }
    }
...
}

void* tcpworker(void* arg)
{
    // initialise stuff
    while ( loop )
    {
        // pop request from queue

        if ( request != NULL )
        {
            // deal with request
            free(request);
        }
    }
}
</code>

When the problem occurs, I have between 1000 and 1400 clients connected.

Questions:
1. Do I need to FD_CLR(client_sock, &socket_set) before I push to the queue?
2. Do I need to FD_CLR(client_sock, &socket_set) when this client request closes in the tcpworker() function?
3. Would setting kern.maxfilesperproc and kern.maxfiles to higher values solve the problem, or just make it take longer to reappear?
4. Should I replace select() with kqueue()? From googling, it seems select() is not that great.

Thanks
Vikash
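
For anyone trying to reproduce this, here is a minimal sketch of a sanity check that could be run on listen_socket just before each select() call in the accept loop. select() reports "Bad file descriptor" (EBADF) when a descriptor in one of its sets is not open, and FD_SET() is only defined for descriptors below FD_SETSIZE. The helper name fd_usable_with_select() is hypothetical and not part of the program above; it is an illustration under those assumptions, not a confirmed diagnosis.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <unistd.h>

/*
 * Hypothetical helper (not in the original program).
 * Returns 1 if fd can safely be passed to FD_SET() and select(),
 * 0 if it is out of range for an fd_set or is not an open descriptor.
 */
static int fd_usable_with_select(int fd)
{
    /* fd_set only has room for descriptors 0 .. FD_SETSIZE - 1. */
    if (fd < 0 || fd >= (int)FD_SETSIZE)
        return 0;

    /* F_GETFD fails with EBADF when fd does not refer to an open file. */
    if (fcntl(fd, F_GETFD) == -1 && errno == EBADF)
        return 0;

    return 1;
}
```

If such a check logged a failure for listen_socket right before select() returned -1, that would suggest the listening descriptor had been closed elsewhere in the process (for example by another thread), rather than a limit on kern.maxfiles being reached.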