Message-ID: <50F42F4B.303@mu.org>
Date: Mon, 14 Jan 2013 11:16:11 -0500
From: Alfred Perlstein
To: Andre Oppermann
Subject: Re: svn commit: r243631 - in head/sys: kern sys
In-Reply-To: <50F41F8C.5030900@freebsd.org>
Cc: Adrian Chadd, src-committers@freebsd.org, Alan Cox, "Jayachandran C.",
 svn-src-all@freebsd.org, Oleksandr Tymoshenko, freebsd-arch@freebsd.org,
 svn-src-head@freebsd.org

On 1/14/13 10:09 AM, Andre Oppermann wrote:
> On 13.01.2013 19:06, Alfred Perlstein wrote:
>> On 1/12/13 10:32 PM, Adrian Chadd wrote:
>>> On 12 January 2013 11:45, Alfred Perlstein wrote:
>>>
>>>> I'm not sure if regressing to the waterfall method of development
>>>> is a good idea at this point.
>>>>
>>>> I see a light at the end of the tunnel and we need to continue to
>>>> just handle these minor corner cases as we progress.
>>>>
>>>> If we move to a model where a minor bug is grounds to completely
>>>> remove helpful code then nothing will ever get done.
>>>>
>>> Allocating 512MB worth of callwheels on a 16GB MIPS machine is a
>>> little silly, don't you think?
>>>
>>> That suggests to me that the extent of which maxfiles/maxusers/etc
>>> percolates the codebase wasn't totally understood by those who wish to
>>> change it.
>>>
>>> I'd rather see some more investigative work into outlining things that
>>> need fixing and start fixing those, rather than "just change stuff and
>>> fix whatever issues creep up."
>>>
>>> I kinda hope we all understand what we're working on in the kernel a
>>> little better than that.
>>
>> Cool! I'm glad people are now aware of the callwheel allocation
>> being insane with large maxusers.
>>
>> I saw this about a month ago (if not longer), but since there were
>> half a dozen people calling me an imbecile who hadn't really yet read
>> the code I didn't want to inflame them more by fixing that with
>> "a hack". (actually a simple fix).
>>
>> A simple fix is to clamp callwheel size to the previous result of a
>> maxusers of 384 and call it a day.
>>
>> However the simplicity of that approach would probably inflame too
>> many feelings so I am unsure as how to proceed.
>>
>> Any ideas?
>
> I noticed the callwheel dependency as well and asked mav@ about it
> in a short email exchange. He said it has only little use and goes
> away with the calloutng import. While that is outstanding we need
> to clamp it to a sane value.
>
> However I don't know what a sane value would be and why its size is
> directly derived from maxproc and maxfiles. If there can be one
> callout per process and open file descriptor in the system, then
> it probably has to be so big. If it can deal with 'collisions'
> in the wheel it can be much smaller.
>

If it really goes away with calloutng, then we should probably leave it
be in -current.

As for clipping it when/if we push the maxusers fixes to -stable (which
we must do): my impression (which may be wrong) is that the callwheels
(cc_callwheel) are just arrays of hash buckets, indexed by the tick on
which a callout will fire, masked with callwheelmask (that is, MOD
callwheelsize).

This means that if cc_callwheel is way too small we wind up with
collisions, whereas if it's enormous we wind up with a window so large
that it can accommodate callouts scheduled something like hundreds of
ticks into the future.
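To make the bucket selection described above concrete, here is a minimal userland sketch. It is not the kernel code: the function names wheel_bucket() and wheel_window_seconds() are hypothetical, and the sketch assumes the wheel size is a power of two so that the modulo reduces to a mask.

```c
#include <assert.h>

/*
 * Hypothetical userland model of callwheel bucket selection; the names
 * here are made up for illustration and are not kernel symbols.
 * Assumes wheel_size is a power of two, so "tick MOD wheel_size" can be
 * computed as "tick AND (wheel_size - 1)".
 */
static unsigned int
wheel_bucket(unsigned int fire_tick, unsigned int wheel_size)
{
	/* Reject sizes that are zero or not a power of two. */
	assert(wheel_size != 0 && (wheel_size & (wheel_size - 1)) == 0);

	return (fire_tick & (wheel_size - 1));
}

/*
 * Whole seconds of future ticks the wheel spans before two different
 * firing ticks start to share a bucket, given hz ticks per second.
 */
static unsigned int
wheel_window_seconds(unsigned int wheel_size, unsigned int hz)
{
	assert(hz != 0);

	return (wheel_size / hz);
}
```

Under these assumptions, two callouts whose firing ticks differ by exactly the wheel size land in the same bucket, which is the wraparound collision case; a larger wheel only pushes that point further into the future.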

Example:

> Loaded symbols for /boot/kernel/profile.ko
> #0  sched_switch (td=0xffffffff81373e40, newtd=0xfffffe001aab5960,
>     flags=) at ../../../kern/sched_ule.c:1954
> 1954            cpuid = PCPU_GET(cpuid);
> (kgdb) p callwheelsize
> $1 = 2097152
> Current language:  auto; currently minimal
> (kgdb) # .(16:06:31)(root@dan)
> /usr/home/alfred # sysctl -a | grep hz
> kern.clockrate: { hz = 1000, tick = 1000, profhz = 8128, stathz = 127 }
> kern.dcons.poll_hz: 25
> kern.hz: 1000
> debug.psm.hz: 20
> .(16:06:37)(root@dan)
> /usr/home/alfred # 2097152
> .(16:06:40)(root@dan)
> /usr/home/alfred # bc
> 2097152 / 1000
> 2097
> ^D# .(16:06:56)(root@dan)
> /usr/home/alfred # sysctl kern.maxusers
> kern.maxusers: 3406

So basically on this box there are enough callwheel slots for something
like 2097 seconds, or about 35 minutes, into the future.

I would assume that a machine capped at 384 maxusers would wind up with
something that could handle callouts up to ~3 minutes in the future
without wraparound and collisions.

As for ncallout, that is for timeout(9) support. At a glance I'm not
aware of any users of timeout(9) that are not "per device", so there's
unlikely to be a need for timeout(9) to support a pre-allocated timeout
per process/file. More likely it needs something like N-devices*4, which
would be fine with far fewer than the number allocated at 384 maxusers
before all the changes we have made.

I could be wrong..
but I still believe that it would be quite the system that would need
more than

    callout=get_callout_from_maxusers(min(maxusers, 384));

Functions calling this function: timeout

  File                 Function                  Line
0 si.c                 si_start                  1439 pp->lstart_ch = timeout(si_lstart, (caddr_t)pp, time);
1 sio.c                siobusycheck              1269 timeout(siobusycheck, com, hz / 100);
2 sio.c                siopoll                   1744 timeout(siobusycheck, com, hz / 100);
3 sio.c                siosettimeout             2203 sio_timeout_handle = timeout(comwakeup, (void *)NULL,
4 sio.c                comwakeup                 2220 sio_timeout_handle = timeout(comwakeup, (void *)NULL, sio_timeout);
5 syscons.c            scrn_timer                1834 timeout(scrn_timer, sc, hz / 10);
6 syscons.c            scrn_timer                1884 timeout(scrn_timer, sc, hz / 10);
7 syscons.c            scrn_timer                1902 timeout(scrn_timer, sc, hz / 25);
8 syscons.c            blink_screen              3847 timeout(blink_screen, scp, hz / 10);
9 trm.c                trm_ExecuteSRB            478  ccb->ccb_h.timeout_ch = timeout(trmtimeout, (caddr_t)srb, (ccb->ccb_h.timeout * hz) / 1000);
a tws_cam.c            tws_execute_scsi          782  ccb_h->timeout_ch = timeout(tws_timeout, req, (ccb_h->timeout * hz)/1000);
b tws_cam.c            tws_send_scsi_cmd         820  req->thandle = timeout(tws_timeout, req, (TWS_IO_TIMEOUT * hz));
c tws_cam.c            tws_set_param             867  req->thandle = timeout(tws_timeout, req, (TWS_IOCTL_TIMEOUT * hz));
d tws_services.c       tws_print_stats           398  timeout(tws_print_stats, sc, 300*hz);
e if_wl.c              wlstart                   1022 sc->watchdog_ch = timeout(wlwatchdog, sc, 10);
f spic.c               spictimeout               429  sc->sc_timeout_ch = timeout(spictimeout, sc, spic_pollrate);
g spic.c               spictimeout               442  sc->sc_timeout_ch = timeout(spictimeout, sc, spic_pollrate);
h spic.c               spicopen                  459  timeout(spictimeout, sc, spic_pollrate);
i kern_cons.c          sysbeep                   624  timeout(sysbeepstop, (void *)NULL, period);
j kern_fail.c          fail_point_sleep          133  timeout(fp->fp_sleep_fn, fp->fp_sleep_arg, timo);
k aarp.c               aarptimer                 128  aarptimer_ch = timeout(aarptimer, NULL, AARPT_AGE * hz);
l aarp.c               aarptnew                  580  aarptimer_ch = timeout(aarptimer, (caddr_t)0, hz);
m ng_btsocket_l2cap.c  ng_btsocket_l2cap_timeout 2663 pcb->timo = timeout(ng_btsocket_l2cap_process_timeout, pcb,
n ng_btsocket_rfcomm.c ng_btsocket_rfcomm_timeou 3449 pcb->timo = timeout(ng_btsocket_rfcomm_process_timeout, pcb,
o ng_fec.c             ng_fec_init               642  priv->fec_ch = timeout(ng_fec_tick, priv, hz);
p ng_fec.c             ng_fec_tick               717  priv->fec_ch = timeout(ng_fec_tick, priv, hz);
q key.c                key_timehandler           4551 (void )timeout((void *)key_timehandler, (void *)0, hz);
r key.c                key_init                  7776 timeout((void *)key_timehandler, (void *)0, hz);
s ncp_subr.c           ncp_init                  107  ncp_timer_handle = timeout(ncp_timer, NULL, NCP_TIMER_TICK);
t fdc.c                fd_turnon                 1186 timeout(fd_motor_on, fd, hz);
u fdc.c                fdstate                   1786 fd->toffhandle = timeout(fd_turnoff, fd, 4 * hz);
v fdc.c                fdstate                   1877 timeout(fd_pseudointr, fdc, hz / 16);
w fdc.c                fdstate                   2092 fd->tohandle = timeout(fd_iotimeout, fdc, hz);
x fdc.c                fdstate                   2101 fd->tohandle = timeout(fd_iotimeout, fdc, hz);
y fdc.c                fdstate                   2218 timeout(fd_pseudointr, fdc, hz / 8);

* Lines 71-106 of 115, 10 more - press the space bar to display more *

Functions calling this function: timeout

  File                 Function                  Line
0 olpt.c               lptopen                   421  timeout (lptout, (caddr_t)sc,
1 olpt.c               lptout                    440  timeout (lptout, (caddr_t)sc, sc->sc_backoff);
2 pckbd.c              pckbd_timeout             260  timeout(pckbd_timeout, arg, hz/10);
3 sio.c                sioattach                 2012 timeout(siobusycheck, com, hz / 100);
4 sio.c                sioattach                 2696 timeout(siobusycheck, com, hz / 100);
5 sio.c                sioattach                 3330 sio_timeout_handle = timeout(comwakeup, (void *)NULL,
6 sio.c                sioattach                 3347 sio_timeout_handle = timeout(comwakeup, (void *)NULL, sio_timeout);
7 sio.c                sioattach                 3933 timeout(pc98_check_msr, (caddr_t)dev,
8 sio.c                sioattach                 3951 timeout(pc98_check_msr, (caddr_t)dev,
9 ncr.c                ncr_timeout               5171 timeout (ncr_timeout, (caddr_t) np, step ? step : 1);

* Press the space bar to display the first lines again *

-Alfred