From owner-freebsd-arch@FreeBSD.ORG Sun Dec 22 19:28:50 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id B9D057FA; Sun, 22 Dec 2013 19:28:50 +0000 (UTC) Received: from h2.funkthat.com (gate2.funkthat.com [208.87.223.18]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 7919F12C4; Sun, 22 Dec 2013 19:28:50 +0000 (UTC) Received: from h2.funkthat.com (localhost [127.0.0.1]) by h2.funkthat.com (8.14.3/8.14.3) with ESMTP id rBMJSgl0003934 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sun, 22 Dec 2013 11:28:42 -0800 (PST) (envelope-from jmg@h2.funkthat.com) Received: (from jmg@localhost) by h2.funkthat.com (8.14.3/8.14.3/Submit) id rBMJSg1K003933; Sun, 22 Dec 2013 11:28:42 -0800 (PST) (envelope-from jmg) Date: Sun, 22 Dec 2013 11:28:42 -0800 From: John-Mark Gurney To: Warner Losh Subject: mountroot issues (was Re: 10.0-release proposed patch for Atmel) Message-ID: <20131222192842.GI99167@funkthat.com> Mail-Followup-To: Warner Losh , "freebsd-arm@freebsd.org" , freebsd-arch@FreeBSD.org References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.3i X-Operating-System: FreeBSD 7.2-RELEASE i386 X-PGP-Fingerprint: 54BA 873B 6515 3F10 9E88 9322 9CB1 8F74 6D3F A396 X-Files: The truth is out there X-URL: http://resnet.uoregon.edu/~gurney_j/ X-Resume: http://resnet.uoregon.edu/~gurney_j/resume.html X-to-the-FBI-CIA-and-NSA: HI! HOW YA DOIN? can i haz chizburger? 
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2 (h2.funkthat.com [127.0.0.1]); Sun, 22 Dec 2013 11:28:42 -0800 (PST) Cc: "freebsd-arm@freebsd.org" , freebsd-arch@freebsd.org X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 22 Dec 2013 19:28:50 -0000

Warner Losh wrote this message on Sat, Dec 21, 2013 at 23:44 -0700:
> Right now, the mountroot prompt doesn't work on Atmel CPUs. Almost all the characters are eaten. I recently committed an elegant fix for this into head to mask the interrupt for new characters and only do polling.

So, a similar issue plagues i386/amd64 too. There the console mostly works, but it will drop characters on occasion... The problem is that mount root spins calling into the console code instead of asking the console code for a single character and having the console code wait for this character... and if you type a character while it's outside the console routines, that character will be dropped...

The problem is, not many of us spend time at the mountroot prompt, and so even if we notice the issue, it's so minor that we just deal w/ it...

The method I came up with years ago was to add a routine/flag that would have the console wait for a character instead of simply returning when there was no character... Though if we implement cngrab properly where we don't flush buffers, turn off interrupts, etc, then that would work too...

I've cc'd -arch since it's not just an -arm issue.

-- 
John-Mark Gurney Voice: +1 415 225 5579

"All that I will do, has been done, All that I have, has not."
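To make the failure mode described above concrete, here is a minimal user-space sketch in C. It is a toy model under stated assumptions, not kernel code: every name in it (toy_uart, toy_isr, toy_checkc, toy_prompt) is hypothetical, the device holds a single pending character, and the interrupt handler is assumed to get a chance to run between each keypress and the prompt's next poll. With the interrupt unmasked, the handler consumes the character before the poll sees it. With it masked, which is roughly what a working cngrab would arrange, the poll receives every character.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy device; none of these names are FreeBSD kernel APIs. */
struct toy_uart {
	int	rx;		/* pending character, or -1 if empty */
	bool	intr_masked;	/* true while the console has "grabbed" it */
	size_t	eaten;		/* characters consumed by the handler */
};

/* Models the tty interrupt handler: it consumes any pending character. */
static void
toy_isr(struct toy_uart *u)
{
	if (!u->intr_masked && u->rx != -1) {
		u->rx = -1;
		u->eaten++;
	}
}

/* Non-blocking poll: returns the pending character or -1. */
static int
toy_checkc(struct toy_uart *u)
{
	int c;

	c = u->rx;
	u->rx = -1;
	return (c);
}

/*
 * Models the prompt loop. Each character of `input` arrives while the
 * caller is outside the console routine, then the handler gets a chance
 * to run, then the prompt polls. Returns the number of characters read.
 */
static size_t
toy_prompt(struct toy_uart *u, const char *input, char *out, size_t outsz)
{
	const char *p;
	size_t n;
	int c;

	assert(outsz > 0);
	n = 0;
	for (p = input; *p != '\0'; p++) {
		u->rx = (unsigned char)*p;	/* key pressed between polls */
		toy_isr(u);			/* fires only if unmasked */
		c = toy_checkc(u);		/* prompt polls the console */
		if (c != -1 && n + 1 < outsz)
			out[n++] = (char)c;
	}
	out[n] = '\0';
	return (n);
}
```

Calling toy_prompt() with intr_masked set to false leaves the output empty and counts every character as eaten. Setting intr_masked to true before the call returns the typed string intact. Real hardware loses only some characters, depending on timing, which this model deliberately ignores.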
From owner-freebsd-arch@FreeBSD.ORG Sun Dec 22 21:50:41 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id AC0D1F56 for ; Sun, 22 Dec 2013 21:50:41 +0000 (UTC) Received: from mail-ie0-f170.google.com (mail-ie0-f170.google.com [209.85.223.170]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 73CA71B4D for ; Sun, 22 Dec 2013 21:50:41 +0000 (UTC) Received: by mail-ie0-f170.google.com with SMTP id qd12so5288590ieb.15 for ; Sun, 22 Dec 2013 13:50:35 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:sender:subject:mime-version:content-type:from :in-reply-to:date:cc:content-transfer-encoding:message-id:references :to; bh=6uDmgAL98JDk42NCR3xur7VDhTja6d1/Lr9QzvTAWeE=; b=F2Xy77/i3nKU4wTIY0ZECfZPcNlrLKVQ0WeVljO2TgqJgBCONvfuCNs99nUr44wWP4 3OAZ9Brtm2iTyhcCB2rJDFs2L5laZXSCvLE7LaDeHoC6Zit2ijdwvT8rI8hj+pFZIJZT iVmdc5K3NOxXeYnQTuoWimW7Do+W02N3Emk5VOJISBuCmWdJuHjeimLKv9RwKisQ7/fD SSlxA9PjQi0E8hplTVDXN45YZqam2nixzjHg+fuOhyY2XgId+h/x0/8e9mFHY71Zgj0f ioMLr3LA0wyIcXq8f6KtMhDvh/KJKdLJwf8F2fqjIItCL1qzrOSU0AY1Bb+hjHaDIao9 dbTQ== X-Gm-Message-State: ALoCoQlJic4+8zSd5fymgO3n1oaoDX3ae6i4eGjFmWBT+iNlNdPTpQ3BZEQN4B/CTOXFmrlctEhU X-Received: by 10.50.73.136 with SMTP id l8mr18450781igv.7.1387749035552; Sun, 22 Dec 2013 13:50:35 -0800 (PST) Received: from [10.0.0.23] (50-78-194-198-static.hfc.comcastbusiness.net. 
[50.78.194.198]) by mx.google.com with ESMTPSA id ml2sm17347417igb.10.2013.12.22.13.50.34 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Sun, 22 Dec 2013 13:50:35 -0800 (PST) Sender: Warner Losh Subject: Re: mountroot issues (was Re: 10.0-release proposed patch for Atmel) Mime-Version: 1.0 (Apple Message framework v1085) Content-Type: text/plain; charset=us-ascii From: Warner Losh In-Reply-To: <20131222192842.GI99167@funkthat.com> Date: Sun, 22 Dec 2013 14:50:33 -0700 Content-Transfer-Encoding: quoted-printable Message-Id: <774852B0-2B02-46BD-8054-FAF3CB3DDAA7@bsdimp.com> References: <20131222192842.GI99167@funkthat.com> To: John-Mark Gurney X-Mailer: Apple Mail (2.1085) Cc: "freebsd-arm@freebsd.org" , freebsd-arch@freebsd.org X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 22 Dec 2013 21:50:41 -0000

On Dec 22, 2013, at 12:28 PM, John-Mark Gurney wrote:

> Warner Losh wrote this message on Sat, Dec 21, 2013 at 23:44 -0700:
>> Right now, the mountroot prompt doesn't work on Atmel CPUs. Almost all the characters are eaten. I recently committed an elegant fix for this into head to mask the interrupt for new characters and only do polling.
>
> So, a similar issue plagues i386/amd64 too. There the console mostly
> works, but it will drop characters on occasion... The problem is that
> mount root spins calling into the console code instead of asking the
> console code for a single character and having the console code wait
> for this character... and if you type a character while it's outside
> the console routines, that character will be dropped...

I don't think that's the problem... Why is the character dropping?

> The problem is, not many of us spend time at the mountroot prompt, and
> so even if we notice the issue, it's so minor that we just deal w/
> it...
If characters are being dropped, it is because interrupts are enabled, the interrupt fires and the ISR eats them. This will happen because we don't properly implement cngrab/cnungrab in the serial driver at the moment...

I guess I've just gotten lucky and not been bit by this, or if I have it has been so infrequently that it hasn't registered. With Atmel, it happens all the time and I've seen it for perhaps a decade...

> The method I came up with years ago was to add a routine/flag that would
> have the console wait for a character instead of simply returning when
> there was no character... Though if we implement cngrab properly where
> we don't flush buffers, turn off interrupts, etc, then that would work
> too...

That's the more robust way to cope. cngrab works really well...

> I've cc'd -arch since it's not just an -arm issue.

Agreed.

Warner

From owner-freebsd-arch@FreeBSD.ORG Mon Dec 23 10:17:52 2013 Return-Path: Delivered-To: freebsd-arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 7F3187EC; Mon, 23 Dec 2013 10:17:52 +0000 (UTC) Received: from mail105.syd.optusnet.com.au (mail105.syd.optusnet.com.au [211.29.132.249]) by mx1.freebsd.org (Postfix) with ESMTP id 8E8521DE7; Mon, 23 Dec 2013 10:17:51 +0000 (UTC) Received: from c122-106-144-87.carlnfd1.nsw.optusnet.com.au (c122-106-144-87.carlnfd1.nsw.optusnet.com.au [122.106.144.87]) by mail105.syd.optusnet.com.au (Postfix) with ESMTPS id 9985510420A7; Mon, 23 Dec 2013 20:55:57 +1100 (EST) Date: Mon, 23 Dec 2013 20:55:56 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: John-Mark Gurney Subject: Re: mountroot issues (was Re: 10.0-release proposed patch for Atmel) In-Reply-To: <20131222192842.GI99167@funkthat.com> Message-ID: <20131223164823.B954@besplex.bde.org> References:
<20131222192842.GI99167@funkthat.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.1 cv=YYGEuWhf c=1 sm=1 tr=0 a=p/w0leo876FR0WNmYI1KeA==:117 a=PO7r1zJSAAAA:8 a=KDmCbJ2lt7cA:10 a=kj9zAlcOel0A:10 a=JzwRw_2MAAAA:8 a=bMQ0z2ntKMwA:10 a=_mH0PAV41rktP4ceC3AA:9 a=CjuIK1q_8ugA:10 Cc: "freebsd-arm@freebsd.org" , freebsd-arch@FreeBSD.org X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 23 Dec 2013 10:17:52 -0000

On Sun, 22 Dec 2013, John-Mark Gurney wrote:

> Warner Losh wrote this message on Sat, Dec 21, 2013 at 23:44 -0700:
>> Right now, the mountroot prompt doesn't work on Atmel CPUs. Almost all the characters are eaten. I recently committed an elegant fix for this into head to mask the interrupt for new characters and only do polling.
>
> So, a similar issue plagues i386/amd64 too. There the console mostly
> works, but it will drop characters on occasion... The problem is that
> mount root spins calling into the console code instead of asking the
> console code for a single character and having the console code wait
> for this character... and if you type a character while it's outside
> the console routines, that character will be dropped...

This was broken by the multiple console changes (and buggy drivers). It last (mostly) worked in FreeBSD-4, since there were no multiple console changes there. It worked just like you hinted: the mount root code _doesn't_ spin calling into the console code, but calls into the console code which spins. Non-buggy drivers of course do something to prevent interrupt handlers doing anything while spinning.

syscons is the main FreeBSD driver that I know of that once upon a time was non-buggy here.
In sccngetch(), it simply disabled all tty interrupts including the keyboard one using spltty(). This worked between FreeBSD-1 and FreeBSD-4.

This was broken by SMPng. spltty() became null of course, and the keyboard interrupt became Giant-locked (it still is). But the spltty()s for locking out the interrupt handler in console code were not replaced by Giant locking, and Giant locking (or almost any normal locking) is not permitted in console drivers anyway. Apparently, AT keyboard interrupts have magic timing (delayed interrupts?) that allows console input to mostly work without locking. Multiplexed keyboards with some actually being USB increase the problem significantly -- they probably have different interrupt timing, and since no normal locking is permitted it is difficult to lock things throughout the poll.

The big SMPng change occurred just 6 weeks before multiple consoles. Thus there was a small window in which the old API would have worked if its locking had been updated. Closing of this window gave the current brokenness less any fixes for this brokenness in syscons's cngrab() -- the polling loop is at too high a level for locking in the driver to help, and syscons didn't have any anyway.

My cndbctl() API was designed to fix this problem, except it was misdesigned and only applied for ddb. Syscons was the only console driver that supported it, but syscons only used it for console _output_, and it was broken a few years after the multiple console changes.

cngrab() was mostly designed by me and mostly implemented by avg. It was only used by syscons too. Strangely, it now does mostly keyboard stuff where cndbctl() did only screen stuff. I don't see where it does anything to stop the interrupt handler doing anything.

Syscons console output was also broken by the null spltty(). The breakage was more serious and was worked around by sprinkling buggy locking. This sometimes gives deadlock if the locks are non-recursive, and doesn't work if they are recursive.
Syscons was never properly locked. spltty() was like a recursive lock. It basically only worked for locking out interrupt handlers, and if you used it more (say to prevent the console driver entering when spltty() is raised), then you make it like a non-recursive lock and get deadlock.

The races from not locking are the usual ones. You can have one thread in console output updating critical pointers or in console input accessing critical device registers. Then another thread may want to do console i/o. With no locking, these threads just clobber each others' state. With locking, the console output cannot be as synchronous as desired, and may cause deadlock.

Deadlock is only a serious problem for ddb console i/o, but after avoiding it there it is trivial to avoid it for threads. Normal locking is not permitted anywhere in ddb since it may deadlock, and blowing away the locks is not a solution since it gives the problem of clobbering state. With normal locking, deadlock always occurs in cases like the following:

    console driver acquires locks
    console driver changes critical state
    --> debugger trap
    ...
    debugger does console i/o
    console driver blocks acquiring same locks (deadlock)

The debugger trap may be on the same or a different CPU. If it is on the same CPU, then it is obviously difficult to get back to the interrupted code. (Similarly if the critical code is interrupted by a fast interrupt handler, an NMI, or some other trap. Except for debugger traps, the problem can be reduced by forbidding console i/o except in emergencies like panics.) If it is on a different CPU, then the CPU handling the trap can more easily wait for the CPU doing the i/o, but this is still hard and not done (kdb_enter() actually starts by stopping all the other CPUs as soon as possible).

The com driver in 386BSD and 4.4BSD worked similarly to syscons. It used splhigh() instead of spltty() around its polling loops. This disables too many more interrupts.
This even breaks clock interrupts and thus even the system time in the 386BSD era when the system time was just in timer interrupt ticks.

In sio, I used essentially the cngrab() method, but pushed the method into the driver since I don't like churning APIs (the driver calls the method whenever it is entered, so as to not depend on higher layers doing it). This was based on multiple console code in my debugger, where the enabled consoles were "opened" on each entry to the driver, or by a debugger command to change the set of enabled consoles. There was also an "ioctl" method to change the state of enabled consoles. The "open" method should do a complete initialization of the device state if necessary. Similarly, the "close" method should try to restore the original state.

This was of course broken by the multiple console changes. The "pollc" method never really worked, since it cannot keep the device "open" across polls unless a higher level does the "open", and switching the device state in open/close makes it flap too much if the switch is non-null.

sio also had some smaller bugs here:
- when there are shared interrupts, devices attached to the interrupt or even all sio devices are polled for activity. Since the polling doesn't use the interrupt status, it detects activity on devices that are not supposed to interrupt.
- there is some SMP modification (amplification?) of the previous bug. I forget the details.

uart did nothing to disable device interrupts while polling, at least for ns8250. uart's locking is convoluted and still seems buggy: from a slightly old version:

% static void
% ns8250_putc(struct uart_bas *bas, int c)
% {
% 	int limit;
%
% 	limit = 250000;
% 	while ((uart_getreg(bas, REG_LSR) & LSR_THRE) == 0 && --limit)
% 		DELAY(4);
% 	uart_setreg(bas, REG_DATA, c);
% 	uart_barrier(bas);
% 	limit = 250000;
% 	while ((uart_getreg(bas, REG_LSR) & LSR_TEMT) == 0 && --limit)
% 		DELAY(4);
% }

This is wrapped by uart_putc().
uart_putc() acquires a mutex (di->hwmtx), unless kdb_active is set, in which case it doesn't acquire the mutex. Not acquiring the mutex avoids deadlock with ddb (if not kdb), but means that the mutex doesn't actually work for ddb i/o, and ddb input is the most common case of input.

di->hwmtx doesn't do much except provide locking for console i/o routines. With correct console grabbing, it should be impossible for the interrupt handler to run while console i/o is in progress. Synchronization occurs when grabbing acquires the mutex.

%
% static int
% ns8250_rxready(struct uart_bas *bas)
% {
%
% 	return ((uart_getreg(bas, REG_LSR) & LSR_RXRDY) != 0 ? 1 : 0);
% }

Similarly. Upper layers acquire the mutex.

%
% static int
% ns8250_getc(struct uart_bas *bas, struct mtx *hwmtx)
% {
% 	int c;
%
% 	uart_lock(hwmtx);
%
% 	while ((uart_getreg(bas, REG_LSR) & LSR_RXRDY) == 0) {
% 		uart_unlock(hwmtx);
% 		DELAY(4);
% 		uart_lock(hwmtx);
% 	}
%
% 	c = uart_getreg(bas, REG_DATA);
%
% 	uart_unlock(hwmtx);
%
% 	return (c);
% }

This function does its own locking. It releases the mutex after every poll. Releasing the mutex doesn't do much except let the interrupt handler run and eat your input. It is done because the function may take arbitrarily long to return, but I don't see why locking up all i/o on the device should be a problem. Anyway, correct grabbing gives a much longer-term "lock" on all the i/o (not the mutex, but whatever grabbing does to stop interrupt activity and keep it stopped). It makes the locking in the above have no effect.

This function uses a correct API, but is currently bogus since the API is broken. uart still mostly uses the old console API internally, but the current console API has been broken to not have cn_getc, leaving this function unattached to it.

In the console API, cngetc() still exists at the top level, but the cn_getc method became unused because the polling loop for it moved to the top level. It was changed to use only cn_checkc for input.
Then cn_checkc was broken by renaming it to cn_getc and removing the real cn_getc. The CONS_DRIVER() obfuscation hides some of the details of this obfuscation from drivers, and uart still uses the old names internally.

With multiple active consoles, bounding of the time taken by console i/o is actually a problem for output too. You might have a mixture of fast and slow devices. cngetc() only has to wait for one device, but cnputc() has to wait for all of them. It would be better if the output to the fastest device went out first, but there are no outer loops for output, so the output goes out in device order.

The exact arrangement is:
- cngets(): grab all devices until input is read
- cngetc(): missing grabbing. Thus broken for external use. It is mainly used by cngets() where it is grabbed globally, and by cnputc() where it is grabbed around its call.
- cncheckc(): missing grabbing. Thus broken for external use. It is mainly used when shutting down. Now a transient grab is correct. Grabbing is still needed for device initialization in some cases. Perhaps in shutdown a nearby printf() does (should do) sufficient initialization.
- cnputs(): missing grabbing. It loops calling cnputc().
- cnputc(): missing grabbing

So grabbing is basically only implemented for input.

> The problem is, not many of us spend time at the mountroot prompt, and
> so even if we notice the issue, it's so minor that we just deal w/
> it...

Also, it is not noticeable for syscons. I rarely use serial consoles and deal with the problem for sio by holding down keys until the right characters get through. Perhaps transiently disabling interrupts, or more likely doing lots of initialization and finalization on every console driver entry, makes the problem less noticeable for sio. The initialization and finalization does lots of slow bus accesses. This makes the duty cycle maybe 99% with interrupts disabled and 1% enabled.
I now remember experimenting with much longer intentional delays in the initialization to work around the problem of losing state on non-null mode switches (especially high frequency ones caused by polling for input). The delays can be made as large as 10-20 milliseconds before becoming perceptible.

> The method I came up with years ago was to add a routine/flag that would
> have the console wait for a character instead of simply returning when
> there was no character... Though if we implement cngrab properly where
> we don't flush buffers, turn off interrupts, etc, then that would work
> too...

That's how it used to work. It just doesn't work for multiple consoles. However, in the usual case of only 1 active console, this method would work fine. The API churn also means that you can't simply go back to the old method for 1 active console :-(.

You could also try the old method with a limit of a second or so for each device. After getting input from one, give preference to that one so the delays for cycling round them all are not too painful. The timeouts would also be useful for output.
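The last suggestion, a bounded wait per device with preference for the device that most recently produced input, could look roughly like the following C sketch. It is only an illustration under stated assumptions: all names (toy_cons, toy_mux, toy_mux_getc_round, toy_script_wait) are hypothetical and are not part of the FreeBSD console API, and the scripted fake device ignores its time budget where a real driver would spin on its hardware for up to that long.

```c
#include <assert.h>
#include <stddef.h>

#define	TOY_NCONS	3

/* Hypothetical hook: wait up to `budget` ticks, return a char or -1. */
typedef int (*toy_wait_fn)(void *dev, unsigned budget);

struct toy_cons {
	toy_wait_fn	wait;
	void		*dev;
};

struct toy_mux {
	struct toy_cons	cons[TOY_NCONS];
	size_t		ncons;
	size_t		preferred;	/* console that last gave input */
};

/*
 * One polling round: try the preferred console first, then the others
 * in order, each with a bounded wait. Returns a character or -1.
 */
static int
toy_mux_getc_round(struct toy_mux *m, unsigned budget)
{
	size_t i, idx;
	int c;

	assert(m->ncons > 0 && m->ncons <= TOY_NCONS);
	for (i = 0; i < m->ncons; i++) {
		idx = (m->preferred + i) % m->ncons;
		c = m->cons[idx].wait(m->cons[idx].dev, budget);
		if (c != -1) {
			m->preferred = idx;	/* stick with this one */
			return (c);
		}
	}
	return (-1);
}

/* Scripted fake device for demonstration: hands out `pending` bytes. */
struct toy_script {
	const char	*pending;
	unsigned	polls;		/* how often it was waited on */
};

static int
toy_script_wait(void *dev, unsigned budget)
{
	struct toy_script *s = dev;

	(void)budget;			/* a real device would spin here */
	s->polls++;
	if (*s->pending == '\0')
		return (-1);
	return ((unsigned char)*s->pending++);
}
```

With three scripted consoles where only the second has input, the first round pays one timeout on the idle first console, and later rounds go straight to the second console until it runs dry. That is the intended effect of the preference: the cost of cycling round all devices is paid only when the active device goes quiet.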
Bruce

From owner-freebsd-arch@FreeBSD.ORG Thu Dec 26 16:14:48 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id E60DA71 for ; Thu, 26 Dec 2013 16:14:48 +0000 (UTC) Received: from htmail1.hostek.com (htmail1.hostek.com [216.15.165.50]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id AAC001CF7 for ; Thu, 26 Dec 2013 16:14:48 +0000 (UTC) Received: from [192.168.1.242] (tulsokzl.etel.64-207-234-198.easytel.com [64.207.234.198]) by htmail1.hostek.com with SMTP (version=Tls cipher=Aes128 bits=128); Thu, 26 Dec 2013 09:54:11 -0600 Message-ID: <52BC5177.70903@hostek.com> Date: Thu, 26 Dec 2013 09:55:35 -0600 From: Alex Long User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Thunderbird/24.2.0 MIME-Version: 1.0 To: freebsd-arch@freebsd.org Subject: Default gateway lost after netif restart Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Dec 2013 16:14:49 -0000

I am new to FreeBSD so I apologize if this is the wrong place to post this. But there is a flaw in the logic regarding restart of the netif service. I understand that after restarting the netif service, you have to manually restart the routing service. The problem is that if you are configuring a machine remotely and you have to restart the netif service for whatever reason, your default gateway is lost, thus preventing you from restarting the routing service and you lose connectivity to the machine.
Now I get around this by creating a shell script that does both and just executing that script. This works but it is sloppy in my opinion. It does not make sense to restart a network service and lose ANY network functionality (i.e. your routes) once it comes back up.

Regards,
Alex Long

From owner-freebsd-arch@FreeBSD.ORG Thu Dec 26 16:29:14 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 61893EED for ; Thu, 26 Dec 2013 16:29:14 +0000 (UTC) Received: from mail-oa0-x233.google.com (mail-oa0-x233.google.com [IPv6:2607:f8b0:4003:c02::233]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 2C01F1E3D for ; Thu, 26 Dec 2013 16:29:14 +0000 (UTC) Received: by mail-oa0-f51.google.com with SMTP id i7so8753590oag.38 for ; Thu, 26 Dec 2013 08:29:13 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=7vGPHpIhF00hlrevybUe0V2DAqScrFCWwq4v4P+uags=; b=cCUcJBHrcTdzq5a9q/3wlWAHDFMyqzsaF6+uNEzVGap4+VhqnlMDHFQrRqcgmAauyU Lc0/Vbt8YSO3vHnETDlRgmrUV5KuJFvv8+HVAm1sazxLHOFGPedgX6E6HETDimgmh1u8 CHqeXNvAE7CFwMMNRHj0EdbQH+N4Th7MeppWrdG2QtkYagkEAJUtZcmdJv2lF/Jen/UX ii7+JUzNtud2KUpbFQ8iVN7+h/boXkGvkYGXFJrOqVUJY/QATmgj/GrMeo8KZ/xdS0ej p3iYE1cCnw3Lh+tFeNMoVNq9SFdW0cd1wcayxp30oLc/Q0LxsXZGp5aUWe977o84QtN1 F/fg== MIME-Version: 1.0 X-Received: by 10.60.54.168 with SMTP id k8mr2354632oep.56.1388075353428; Thu, 26 Dec 2013 08:29:13 -0800 (PST) Received: by 10.182.80.7 with HTTP; Thu, 26 Dec 2013 08:29:13 -0800 (PST) In-Reply-To: <52BC5177.70903@hostek.com> References: <52BC5177.70903@hostek.com> Date: Thu, 26 Dec 2013 17:29:13 +0100 Message-ID: Subject: Re: Default gateway lost after
netif restart From: Oliver Pinter To: Alex Long Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-arch@freebsd.org X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Dec 2013 16:29:14 -0000

Hi!

Try using screen or tmux. ;)

screen
service netif restart; service routing restart

Or, when you only need to restart one interface, try this command:

service netif restart _if_name_

e.g.: service netif restart em0, when you have an Intel network card.

On 12/26/13, Alex Long wrote:
> I am new to FreeBSD so I apologize if this is the wrong place to post
> this. But there is a flaw in the logic regarding restart of the netif
> service. I understand that after restarting the netif service, you have
> to manually restart the routing service. The problem is that if you are
> configuring a machine remotely and you have to restart the netif service
> for whatever reason, your default gateway is lost, thus preventing you
> from restarting the routing service and you lose connectivity to the
> machine.
>
> Now I get around this by creating a shell script that does both and just
> executing that script. This works but it is sloppy in my opinion. It
> does not make sense to restart a network service and lose ANY network
> functionality (i.e. your routes) once it comes back up.
> > Regards, > Alex Long > > _______________________________________________ > freebsd-arch@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-arch > To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org" > From owner-freebsd-arch@FreeBSD.ORG Fri Dec 27 22:41:33 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id A2B1769D for ; Fri, 27 Dec 2013 22:41:33 +0000 (UTC) Received: from mail-qc0-x22a.google.com (mail-qc0-x22a.google.com [IPv6:2607:f8b0:400d:c01::22a]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 654F11701 for ; Fri, 27 Dec 2013 22:41:33 +0000 (UTC) Received: by mail-qc0-f170.google.com with SMTP id x13so9219860qcv.29 for ; Fri, 27 Dec 2013 14:41:32 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:date:message-id:subject:from:to:content-type; bh=QegVfy/AUP17kVkttGoFvnn4iIFDtbJ5CfN+4+6nQVA=; b=U9L8mufoEjaJBEvTmsnjOZYTYw7ohN7eIRaHhcMy2XCvEQ2xjG6jBwj2QVnFHZXfNV bdHZQJujkTyIiebpb07ccxcedr3UsMFaRm0VWxHNvglj7v5KhYWJG9drhJenDmMykOo3 qysRYywdijKGtDuWrIZ9aa5JL2OK7/jawrUYVk8onN2rAgL+ckdhGRiEv5bUv+wF38Pl lyLJy1TZsTeDxWDZZAI9WE9Qz4rqPA249o7csYplPo4TQXuJHY69xJVYqY95/KJf05iD FrNsOuadn9QBX4hPPEagH34V9SAa1zhfXJ2FA1c4QZLWeQ4CfZdfVin8gmckIO7nd27l 6xIA== MIME-Version: 1.0 X-Received: by 10.224.124.195 with SMTP id v3mr84700042qar.55.1388184092590; Fri, 27 Dec 2013 14:41:32 -0800 (PST) Sender: adrian.chadd@gmail.com Received: by 10.224.52.8 with HTTP; Fri, 27 Dec 2013 14:41:32 -0800 (PST) Date: Fri, 27 Dec 2013 14:41:32 -0800 X-Google-Sender-Auth: DTV0pyj0q4VPLdUmKiwPyGkIlVU Message-ID: Subject: Default knote hash table size is too .. 
small From: Adrian Chadd To: "freebsd-arch@freebsd.org" Content-Type: text/plain; charset=ISO-8859-1 X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Dec 2013 22:41:33 -0000 Hi, The default knote hash table size (KN_HASHSIZE) is 64. When doing dirty things with lots of sockets (say, 64k and more) that involve plenty of knote insert/remove (ie, oneshot events - think posix AIO and my upcoming kqueue sendfile notification stuff) it consumes stupid amounts of CPU. I'd like to make this a tunable so people like Adrian at Netflix can bump this to higher values (say 32k) but people like Adrian at Embedded can bump this to lower values (say 64.) Comments? I'll make the change soon if no-one objects. Thanks, -a From owner-freebsd-arch@FreeBSD.ORG Fri Dec 27 23:29:21 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id C3D40996 for ; Fri, 27 Dec 2013 23:29:21 +0000 (UTC) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.freebsd.org (Postfix) with ESMTP id B00FE19CD for ; Fri, 27 Dec 2013 23:29:21 +0000 (UTC) Received: from Alfreds-MacBook-Air.local (50-204-88-5-static.hfc.comcastbusiness.net [50.204.88.5]) by elvis.mu.org (Postfix) with ESMTPSA id A21FC1A3C37 for ; Fri, 27 Dec 2013 15:29:10 -0800 (PST) Message-ID: <52BE0D40.5020304@freebsd.org> Date: Fri, 27 Dec 2013 15:29:04 -0800 From: Alfred Perlstein Organization: FreeBSD User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:24.0) Gecko/20100101 Thunderbird/24.2.0 MIME-Version: 1.0 To: freebsd-arch@freebsd.org Subject: Re: Default knote hash table size is too .. 
small References: In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Dec 2013 23:29:21 -0000 On 12/27/13, 2:41 PM, Adrian Chadd wrote: > Hi, > > The default knote hash table size (KN_HASHSIZE) is 64. When doing > dirty things with lots of sockets (say, 64k and more) that involve > plenty of knote insert/remove (ie, oneshot events - think posix AIO > and my upcoming kqueue sendfile notification stuff) it consumes stupid > amounts of CPU. > > I'd like to make this a tunable so people like Adrian at Netflix can > bump this to higher values (say 32k) but people like Adrian at > Embedded can bump this to lower values (say 64.) > > Comments? I'll make the change soon if no-one objects. > > Thanks, Cool! What about auto-tune to maxfiles? 
-Alfred From owner-freebsd-arch@FreeBSD.ORG Fri Dec 27 23:39:57 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id BBCAEC03; Fri, 27 Dec 2013 23:39:57 +0000 (UTC) Received: from mail-qa0-x233.google.com (mail-qa0-x233.google.com [IPv6:2607:f8b0:400d:c00::233]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 69BD71A5D; Fri, 27 Dec 2013 23:39:57 +0000 (UTC) Received: by mail-qa0-f51.google.com with SMTP id o15so9023482qap.3 for ; Fri, 27 Dec 2013 15:39:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=euVm13lQMBGYGbmeF4iXt2gwQUt2FHdyV62kJFi4930=; b=h5q7UAk/seZV5/9Yh9r6oCrCmdCtYHWHbfI/DY+6RTYMwwTnHNtemS0H9k6MOziuGt 4tLwVQ6JUYoojHQOeKIdij/L3TT9GWTuIMhYhmrH2lPwn8Qm95U1fIYtcpMqX49QvnV7 bC0aYV+5bVwv6dDH8IhYIbufxrZgYFHVpB6Zorklh/PwkkyUf9NgADEnKOuywGWgHMTR bq9iWuYM12kD+k25MbVdwDAcj4BK5v8W/R4zqhjD4UQZsFkWUMXNfjiuXexBa1Atf91j nw5RJjaKiqKO6LiivFcSlQlLgZNJeoDAgCEUB81qNLHnpK3MYA0oTi3F3ozRSTIJBW5k HXzw== MIME-Version: 1.0 X-Received: by 10.224.124.195 with SMTP id v3mr85068620qar.55.1388187596280; Fri, 27 Dec 2013 15:39:56 -0800 (PST) Sender: adrian.chadd@gmail.com Received: by 10.224.52.8 with HTTP; Fri, 27 Dec 2013 15:39:56 -0800 (PST) In-Reply-To: <52BE0D40.5020304@freebsd.org> References: <52BE0D40.5020304@freebsd.org> Date: Fri, 27 Dec 2013 15:39:56 -0800 X-Google-Sender-Auth: NwSvGF9bQLCJvswPHaJazdxeu1g Message-ID: Subject: Re: Default knote hash table size is too .. 
small From: Adrian Chadd To: Alfred Perlstein Content-Type: text/plain; charset=ISO-8859-1 Cc: "freebsd-arch@freebsd.org" X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Dec 2013 23:39:57 -0000 On 27 December 2013 15:29, Alfred Perlstein wrote: > Cool! What about auto-tune to maxfiles? Well, ideally it'd be tunable per actual kqueue rather than a global. That way some crazy ioctl could be done to set the knote hash size per queue. I'm just worried that for some people doing, say, high throughput UDP with one connected socket, it's just wasted RAM. But for mad parallel TCP the hash has to be bigger. But then if you're doing, say, 30 worker processes with 1000 sockets each versus, say, 8 worker processes doing 4000 sockets each, you need a different hash size. See? It's not as easy as tune to maxfiles. The only thing that is that simple is "can we massively overspecify it." :-) I'll do a followup at some point to allow an ioctl to set/get it on a per-kqfd basis. Then we won't need the build limit. 
Thanks, -a From owner-freebsd-arch@FreeBSD.ORG Fri Dec 27 23:51:49 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id AA545F15; Fri, 27 Dec 2013 23:51:49 +0000 (UTC) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.freebsd.org (Postfix) with ESMTP id 951A81B33; Fri, 27 Dec 2013 23:51:49 +0000 (UTC) Received: from Alfreds-MacBook-Air.local (50-204-88-5-static.hfc.comcastbusiness.net [50.204.88.5]) by elvis.mu.org (Postfix) with ESMTPSA id ED5E61A3C1F; Fri, 27 Dec 2013 15:51:48 -0800 (PST) Message-ID: <52BE1293.3050305@freebsd.org> Date: Fri, 27 Dec 2013 15:51:47 -0800 From: Alfred Perlstein Organization: FreeBSD User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:24.0) Gecko/20100101 Thunderbird/24.2.0 MIME-Version: 1.0 To: Adrian Chadd Subject: Re: Default knote hash table size is too .. small References: <52BE0D40.5020304@freebsd.org> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: "freebsd-arch@freebsd.org" X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Dec 2013 23:51:49 -0000 On 12/27/13, 3:39 PM, Adrian Chadd wrote: > On 27 December 2013 15:29, Alfred Perlstein wrote: > >> Cool! What about auto-tune to maxfiles? > Well, ideally it'd be tunable per actual kqueue rather than a global. > That way some crazy ioctl could be done to set the knote hash size per > queue. > > I'm just worried that for some people doing say, high throughput UDP > with one connected socket, it's just wasted RAM. But for mad parallel > TCP the hash has to be bigger. 
But then if you're doing say, 30 worker > processes versus 1000 sockets each versus say, 8 worker processes > doing 4000 sockets each, you need a different hash size. > > See? It's not as easy as tune to maxfiles. the only thing that is that > simple is "can we massively overspecify it." :-) > > I'll do a followup at some point to allow an ioctl to set/get it on a > per-kqfd basis. Then we won't need the build limit. That's only partially true. On a machine with, let's say, 20GB of RAM, it's probably perfectly fine to autotune the kqueue hashes to be 128, 256 or even 512 so that out-of-the-box performance on these machines is better. There's memory, so "waste it"; a person who truly knows what they are doing can save a few hundred bytes by setting the tunable lower. In all honesty, aren't kqueues used somewhat sparingly? Truly very little harm can come from raising it on large memory machines. In your specific examples, 30 or 8 worker processes, how many kqueues will be active on the box (relative to other kernel data structures such as sockets)? I could be wrong, there may be some use cases of 1000s of kqueues per thread, but I'm having a hard time wrapping my head around it. Another question: does the kqueue locking model allow for rehashing once there is something like 4*KN_HASHSIZE entries? Can it be dynamically set? Anyhow, I'm not suggesting that you shouldn't just add the tunable and be on your way, but more asking if maybe *I* should look at making it dynamically scale. 
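The rehashing question above (grow once a table holds roughly 4*KN_HASHSIZE entries) can be illustrated with a small user-space sketch. This is not the kqueue code: the names `sketch_note`, `sketch_hash` and `SKETCH_GROW_FACTOR` are hypothetical, the bucket index is a plain mask of the identifier, and no locking is shown. In the kernel, the equivalent of `sketch_hash_grow()` would presumably have to run with the kqueue's lock held.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a knote: an identifier plus a chain link. */
struct sketch_note {
	uintptr_t ident;
	struct sketch_note *next;
};

/* Hypothetical per-queue table; nbuckets is always a power of two. */
struct sketch_hash {
	struct sketch_note **buckets;
	size_t nbuckets;
	size_t nentries;
};

/* Grow once the table holds 4x as many entries as buckets (assumed policy). */
#define SKETCH_GROW_FACTOR 4

static size_t
sketch_bucket(uintptr_t ident, size_t nbuckets)
{
	return ((size_t)ident & (nbuckets - 1));
}

static int
sketch_hash_init(struct sketch_hash *h, size_t nbuckets)
{
	h->buckets = calloc(nbuckets, sizeof(*h->buckets));
	if (h->buckets == NULL)
		return (-1);
	h->nbuckets = nbuckets;
	h->nentries = 0;
	return (0);
}

/* Double the bucket array and relink every note into its new chain. */
static void
sketch_hash_grow(struct sketch_hash *h)
{
	size_t newsize = h->nbuckets * 2;
	struct sketch_note **nb = calloc(newsize, sizeof(*nb));

	if (nb == NULL)
		return;		/* allocation failed: keep the old table */
	for (size_t i = 0; i < h->nbuckets; i++) {
		struct sketch_note *n = h->buckets[i];
		while (n != NULL) {
			struct sketch_note *next = n->next;
			size_t b = sketch_bucket(n->ident, newsize);
			n->next = nb[b];
			nb[b] = n;
			n = next;
		}
	}
	free(h->buckets);
	h->buckets = nb;
	h->nbuckets = newsize;
}

static void
sketch_hash_insert(struct sketch_hash *h, struct sketch_note *n)
{
	size_t b;

	if (h->nentries >= SKETCH_GROW_FACTOR * h->nbuckets)
		sketch_hash_grow(h);
	b = sketch_bucket(n->ident, h->nbuckets);
	n->next = h->buckets[b];
	h->buckets[b] = n;
	h->nentries++;
}

static struct sketch_note *
sketch_hash_find(const struct sketch_hash *h, uintptr_t ident)
{
	struct sketch_note *n = h->buckets[sketch_bucket(ident, h->nbuckets)];

	while (n != NULL && n->ident != ident)
		n = n->next;
	return (n);
}
```

Starting from 4 buckets and inserting 100 notes, this sketch doubles three times and ends at 32 buckets, so a kqueue with few hashed notes stays small while a busy one grows on demand.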
-Alfred From owner-freebsd-arch@FreeBSD.ORG Sat Dec 28 02:59:50 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id EBB537CB; Sat, 28 Dec 2013 02:59:50 +0000 (UTC) Received: from h2.funkthat.com (gate2.funkthat.com [208.87.223.18]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id A741E16FC; Sat, 28 Dec 2013 02:59:50 +0000 (UTC) Received: from h2.funkthat.com (localhost [127.0.0.1]) by h2.funkthat.com (8.14.3/8.14.3) with ESMTP id rBS2xmAM008362 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 27 Dec 2013 18:59:49 -0800 (PST) (envelope-from jmg@h2.funkthat.com) Received: (from jmg@localhost) by h2.funkthat.com (8.14.3/8.14.3/Submit) id rBS2xmMs008361; Fri, 27 Dec 2013 18:59:48 -0800 (PST) (envelope-from jmg) Date: Fri, 27 Dec 2013 18:59:48 -0800 From: John-Mark Gurney To: Adrian Chadd Subject: Re: Default knote hash table size is too .. small Message-ID: <20131228025948.GP99167@funkthat.com> Mail-Followup-To: Adrian Chadd , "freebsd-arch@freebsd.org" References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.3i X-Operating-System: FreeBSD 7.2-RELEASE i386 X-PGP-Fingerprint: 54BA 873B 6515 3F10 9E88 9322 9CB1 8F74 6D3F A396 X-Files: The truth is out there X-URL: http://resnet.uoregon.edu/~gurney_j/ X-Resume: http://resnet.uoregon.edu/~gurney_j/resume.html X-to-the-FBI-CIA-and-NSA: HI! HOW YA DOIN? can i haz chizburger? 
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2 (h2.funkthat.com [127.0.0.1]); Fri, 27 Dec 2013 18:59:49 -0800 (PST) Cc: "freebsd-arch@freebsd.org" X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 28 Dec 2013 02:59:51 -0000 Adrian Chadd wrote this message on Fri, Dec 27, 2013 at 14:41 -0800: > The default knote hash table size (KN_HASHSIZE) is 64. When doing > dirty things with lots of sockets (say, 64k and more) that involve > plenty of knote insert/remove (ie, oneshot events - think posix AIO > and my upcoming kqueue sendfile notification stuff) it consumes stupid > amounts of CPU. Just for the record, KN_HASHSIZE is not used by sockets, since they have an fd to do the look up... It's AIO that's probably the one using the CPU time, but definitely not sockets.. The notes put on the hash are _AIO/_LIO, _PROC, _TIMER, _USER, _FS, and _SIGNAL... So, should we look at another data structure like RB trees for this then? At least RB trees would scale better, or we should look at making the hash table growable? It looks like there is support to grow it, since kq_knhash and kq_knhashmask are both local to kq... From a quick look, we can resize the hash only by holding the KQ lock... > I'd like to make this a tunable so people like Adrian at Netflix can > bump this to higher values (say 32k) but people like Adrian at > Embedded can bump this to lower values (say 64.) Yeh, a smaller default, say of 8 or 16 would make sense for embedded since there aren't many non-fd consumers of kq out there... This would mean each kq only consumes 32 or 64 bytes instead of 2k... Heck, even for normal systems, a default of 8 could/would make sense... Hmm... maybe we should get some stats on how many notes use the hash? 
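The distinction drawn above, fd-backed notes looked up by descriptor and everything else going through the hash, can be modelled roughly as below. This is a simplified sketch under stated assumptions, not the kernel's data structures: `sketch_queue`, `sketch_note`, both table sizes, and the fold-and-mask hash are hypothetical choices made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_FDS      16	/* hypothetical size of the fd-indexed table */
#define SKETCH_HASH_BUCKETS 8	/* hypothetical small default, power of two */

struct sketch_note {
	uintptr_t ident;	/* fd number, or pid/timer id/etc. */
	bool is_fd;		/* true for fd-backed filters */
	struct sketch_note *next;
};

struct sketch_queue {
	struct sketch_note *by_fd[SKETCH_MAX_FDS];	   /* indexed by fd */
	struct sketch_note *by_hash[SKETCH_HASH_BUCKETS]; /* non-fd notes */
};

/* Pick the chain a note lives on; NULL if an fd is out of range. */
static struct sketch_note **
sketch_slot(struct sketch_queue *q, uintptr_t ident, bool is_fd)
{
	if (is_fd)
		return (ident < SKETCH_MAX_FDS ? &q->by_fd[ident] : NULL);
	/* Illustrative hash: fold the identifier, then mask to a bucket. */
	return (&q->by_hash[(ident ^ (ident >> 8)) & (SKETCH_HASH_BUCKETS - 1)]);
}

static bool
sketch_add(struct sketch_queue *q, struct sketch_note *n)
{
	struct sketch_note **slot = sketch_slot(q, n->ident, n->is_fd);

	if (slot == NULL)
		return (false);
	n->next = *slot;
	*slot = n;
	return (true);
}

/* Number of notes a lookup for this identifier would have to walk. */
static size_t
sketch_chain_len(struct sketch_queue *q, uintptr_t ident, bool is_fd)
{
	struct sketch_note **slot = sketch_slot(q, ident, is_fd);
	size_t len = 0;

	if (slot == NULL)
		return (0);
	for (struct sketch_note *n = *slot; n != NULL; n = n->next)
		len++;
	return (len);
}
```

In this model a socket note never touches `by_hash`, so only workloads heavy in non-fd notes (many timers or AIO requests, for example) lengthen the hash chains; counting chain lengths like this is one way to gather the stats asked about above.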
-- John-Mark Gurney Voice: +1 415 225 5579 "All that I will do, has been done, All that I have, has not." From owner-freebsd-arch@FreeBSD.ORG Sat Dec 28 06:02:52 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id D30D7D47; Sat, 28 Dec 2013 06:02:52 +0000 (UTC) Received: from mail-qa0-x232.google.com (mail-qa0-x232.google.com [IPv6:2607:f8b0:400d:c00::232]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 810951140; Sat, 28 Dec 2013 06:02:52 +0000 (UTC) Received: by mail-qa0-f50.google.com with SMTP id i13so9196666qae.2 for ; Fri, 27 Dec 2013 22:02:50 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:content-type; bh=jVzBtw8xQr0UvwuMHcMWKtao3G8Bl8LWXg272IEiMws=; b=dy4IDCgJ4S8p1E126oVLpCXCtJ3ludivy3BorxCuhTS70p27k4VXS6B9f2bHqitdz8 KNGUg/8xz8Rrk9fbEOb87ec3vz3kLca5I3h5WEF6KmUHgWH89C5ibOfJjS8dElUcupcZ ojcFsMJMhKHu50qA29QwhgpII2di65+VbtV7E+Z5MDbzQuypeC3hubk1xPSV3dJcbkkl RsJORnKg9s9+TR52aFfqfVg2rO6aAZ1EPaa5cyxa50KCKwfG98sXXsedHyzxYLw9kTgJ YQyEq3GNc9Tcn4FV7x4pDuHPlhAO/t6ta36tBXqPGKmrbrg8ghUnKsdmMSUy+iPbyGdj xc8Q== MIME-Version: 1.0 X-Received: by 10.224.13.141 with SMTP id c13mr79106339qaa.76.1388210570677; Fri, 27 Dec 2013 22:02:50 -0800 (PST) Sender: adrian.chadd@gmail.com Received: by 10.224.52.8 with HTTP; Fri, 27 Dec 2013 22:02:50 -0800 (PST) In-Reply-To: <20131228025948.GP99167@funkthat.com> References: <20131228025948.GP99167@funkthat.com> Date: Fri, 27 Dec 2013 22:02:50 -0800 X-Google-Sender-Auth: C4x7mvGJskb1tMVum5DvK6w6K5s Message-ID: Subject: Re: Default knote hash table size is too .. 
small From: Adrian Chadd To: Adrian Chadd , "freebsd-arch@freebsd.org" Content-Type: text/plain; charset=ISO-8859-1 X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 28 Dec 2013 06:02:52 -0000 On 27 December 2013 18:59, John-Mark Gurney wrote: > Adrian Chadd wrote this message on Fri, Dec 27, 2013 at 14:41 -0800: >> The default knote hash table size (KN_HASHSIZE) is 64. When doing >> dirty things with lots of sockets (say, 64k and more) that involve >> plenty of knote insert/remove (ie, oneshot events - think posix AIO >> and my upcoming kqueue sendfile notification stuff) it consumes stupid >> amounts of CPU. > > Just for the record, KN_HASHSIZE is not used by sockets, since they > have an fd to do the look up... It's AIO that's probably the one > using the CPU time, but definitely not sockets.. > > The notes put on the hash are _AIO/_LIO, _PROC, _TIMER, _USER, _FS, > and _SIGNAL... > > So, should we look at another data structure like RB trees for this > then? At least RB trees would scale better, or we should look at > making the hash table growable? It looks like there is support to > grow it, since kq_knhash and kq_knhashmask are both local to kq... > > From a quick look, we can resize the hash only by holding the KQ > lock... > >> I'd like to make this a tunable so people like Adrian at Netflix can >> bump this to higher values (say 32k) but people like Adrian at >> Embedded can bump this to lower values (say 64.) > > Yeh, a smaller default, say of 8 or 16 would make sense for embedded > since there aren't many non-fd consumers of kq out there... This would > mean each kq only consumes 32 or 64 bytes instead of 2k... Heck, > even for normal systems, a default of 8 could/would make sense... > > Hmm... maybe we should get some stats on how many notes use the hash? 
Again, it is highly application dependent. If you're just doing socket IO then you're likely fine. If you're also doing other things in high volumes and lots of pending operations (e.g. POSIX AIO, or as you said timers, etc) then it could get highly crappy very quickly. -a From owner-freebsd-arch@FreeBSD.ORG Sat Dec 28 23:31:00 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 2A03C27C for ; Sat, 28 Dec 2013 23:31:00 +0000 (UTC) Received: from mail-ig0-f177.google.com (mail-ig0-f177.google.com [209.85.213.177]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id E5C401387 for ; Sat, 28 Dec 2013 23:30:58 +0000 (UTC) Received: by mail-ig0-f177.google.com with SMTP id uy17so24701456igb.4 for ; Sat, 28 Dec 2013 15:30:58 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:sender:from:content-type :content-transfer-encoding:subject:date:message-id:to:mime-version; bh=GCLwnBojW2jjhbdEYd7R4oGg4+lO1iwqH1Xl91O7OiA=; b=gO8wXitKnKBJu2LkJZQdlcHQeFkdv5WNDDI5VkZia8BdcOPCIlNRAKt/Rt6UmL1Crl uLEiyr+Y8ZnmKJdKKM4xfKVa1BXeDg11lwzynfGMUc7cs34lB25OhZDaIEBxswaB7XCz c9uPt2632PZJP5MraaXSXC7hhbI2KV1j+H63SUMVTs9eH7Dn1SGCI+MoUR61SA22q19j zJnzSn5diVF4ktPAF1QPWcsum7OjcZ29BHqNMXX5FVjpAJLfnMB77HDgnpAGhEqwpoJI a47IH1HXjPl07EBr2llbDf1ZHS+aCowuyYjW2AfykLd72obc+U6LNxwQW/YVlPDVA8wr MphA== X-Gm-Message-State: ALoCoQm9fFy51CH/aFxUN2vQ9sDnqvqMWGnoD9nmqwmhhL4tEW6BfJqZ9RUVirE6hTsqLBGzuXfB X-Received: by 10.50.176.165 with SMTP id cj5mr47305252igc.19.1388273458093; Sat, 28 Dec 2013 15:30:58 -0800 (PST) Received: from fusion-mac.bsdimp.com (50-78-194-198-static.hfc.comcastbusiness.net. 
[50.78.194.198]) by mx.google.com with ESMTPSA id qb4sm20169180igb.7.2013.12.28.15.30.57 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Sat, 28 Dec 2013 15:30:57 -0800 (PST) Sender: Warner Losh From: Warner Losh Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable Subject: UART grab patch Date: Sat, 28 Dec 2013 16:30:56 -0700 Message-Id: <79C659B9-F402-42B6-8240-C4AB0A0EB92B@bsdimp.com> To: "freebsd-arch@freebsd.org Arch" Mime-Version: 1.0 (Apple Message framework v1085) X-Mailer: Apple Mail (2.1085) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 28 Dec 2013 23:31:00 -0000 OK. Looks like my initial assessment of mountroot was too optimistic. With the exception of the imx driver, all the uart drivers in the tree fell victim to this bug. I hit a second one with the Raspberry Pi, and went looking. I've uploaded a patch to http://people.freebsd.org/~imp/uart-grab.diff that should solve the problem. I don't have all these systems to test on, so I'm hoping people can report back to me what works and what doesn't. Boot -a with and without the patch for all serial consoles will tell you if it is working. Type stuff. If it appears exactly as you type it, then the patch is working. If not, please let me know. The basic strategy for all these is to disable RX interrupts when the console is grabbed, and enable them when it is ungrabbed. This has to be done in the hardware, since masking the interrupt at the CPU or PIC level will cause many UARTs to misbehave and/or prevent other interrupts from happening that are shared which can cause problems on some platforms. Comments and critiques of these patches are welcome. 
I posted to arch@ because it affects so many different platforms at once, and this seems like an arch@y thing... Warner
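The grab/ungrab strategy described in this message (turn off the receive interrupt in the UART itself while the console is grabbed, poll for characters, turn the interrupt back on at ungrab) can be sketched against a simulated device. This is not the contents of uart-grab.diff: the `fake_uart` structure, the register names and the bit values are hypothetical and stand in for whatever a particular UART provides.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_IER_RX   0x01u	/* hypothetical "RX interrupt enable" bit */
#define SKETCH_LSR_RXRDY 0x01u	/* hypothetical "character waiting" bit */

/* Simulated device registers; real drivers would use bus-space accessors. */
struct fake_uart {
	uint8_t ier;		/* interrupt enable register */
	uint8_t lsr;		/* line status register */
	uint8_t rx_data;	/* receive holding register */
};

/* Grab: mask RX interrupts in the device, leaving other sources alone. */
static void
sketch_uart_grab(struct fake_uart *u)
{
	u->ier &= (uint8_t)~SKETCH_IER_RX;
}

/* Ungrab: let the interrupt handler receive characters again. */
static void
sketch_uart_ungrab(struct fake_uart *u)
{
	u->ier |= SKETCH_IER_RX;
}

/* Polled read used while grabbed: returns the character, or -1 if none. */
static int
sketch_uart_poll(struct fake_uart *u)
{
	if ((u->lsr & SKETCH_LSR_RXRDY) == 0)
		return (-1);
	u->lsr &= (uint8_t)~SKETCH_LSR_RXRDY;	/* reading consumes the character */
	return (u->rx_data);
}
```

With the RX interrupt masked in the device, a character typed between two polls stays in the receive register until the next poll instead of being consumed by the interrupt handler, while interrupts from other devices sharing the line are unaffected.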