From: Adrian Chadd
To: Harm Weites
Cc: freebsd-embedded@freebsd.org, Bernhard Schmidt
Subject: Re: TP-Link wr1043nd out of swap space
Date: Thu, 19 Jul 2012 01:28:46 -0700

Hi,

re: MALLOC_PRODUCTION: It's not a kernel thing. It's part of the
userland build configuration. Check out what my build scripts do
(specifically build_freebsd) - I set it correctly in the relevant
place.

I'm more interested in any changes in available memory reported during
kernel startup. Those need to be tracked down immediately. Same deal
with any change in the post-boot malloc/slab allocator pools.


Adrian

On 18 July 2012 12:21, Harm Weites wrote:
> No luck, both immediately crash with the out-of-swap message.
>
> I've checked out r234855, deleted ./root and ./obj and then did the
> make-steps. Flashed the device, observed the error and noticed this
> (nothing is started at this point, not even networking):
>
> # vmstat
> procs         memory   page                       disk         faults      cpu
> r b w     avm    fre   flt  re  pi  po    fr    sr fl0   in   sy   cs us sy id
> 0 0 0  25340k  1776k   872   4   5   0   624  2604  29    0  103  119  3 27 70
>
> # ps fauxwww
> USER PID %CPU %MEM   VSZ  RSS TT STAT STARTED    TIME COMMAND
> [..]
> 0     20  0.7 29.6 10780 9712 u0 Ss    6:54PM 0:00.19 -sh (sh)
> 0     22  0.0 28.4 10516 9296 u0 R+    6:55PM 0:00.15 ps fauxwww
> [..]
>
> The processes listed didn't use that much %MEM before... in r231714
> those were both at 4-5.
>
> Sadly, an svn update of the tree to r234941 did not bring any
> improvements.
>
> I did put "MALLOC_PRODUCTION=YES" in /etc/make.conf, though I'm not sure
> if that is correct; the example (in /usr/share/examples/) does not list
> it.
> Hence I configured it with makeoptions as well.
>
> My kernel config:
> --- sys/mips/conf/TP-WN1043ND (revision 234941)
> +++ sys/mips/conf/TP-WN1043ND (working copy)
> @@ -15,6 +15,9 @@
>  # Force the board memory - 32mb
>  options AR71XX_REALMEM=32*1024*1024
>
> +makeoptions MODULES_OVERRIDE="random gpio ar71xx if_gif if_gre if_bridge bridgestp wlan wlan_xauth wlan_acl wlan_tkip wlan_ccmp wlan_rssadapt wlan_amrr ath ath_ahb hwpmc pf if_vlan"
> +makeoptions MALLOC_PRODUCTION
> +
>  # read MSDOS formatted disks - USB
>  options MSDOSFS
>  options GEOM_PART_BSD
> @@ -33,3 +36,15 @@
>
>  # Boot off of the rootfs, as defined in the geom_map setup.
>  options ROOTDEVNAME=\"ufs:map/rootfs.uzip\"
> +
> +nooptions INVARIANTS
> +nooptions INVARIANT_SUPPORT
> +nooptions WITNESS
> +nooptions WITNESS_SKIPSPIN
> +
> +options NBUF=128
> +
> +device pf
> +device gif
> +device vlan
> +
>
> regards
>
> Adrian Chadd wrote on Wed 18-07-2012 at 10:08 [-0700]:
>> .. christ, has this really broken so significantly?
>>
>> I haven't updated my 1043nd in a couple months (as I have other test
>> devices); are you sure you're correctly defining MALLOC_PRODUCTION?
>>
>> I'll see if I can/should just add NBUF=128 to the kernel configuration
>> files, to save a little extra RAM. Thanks for that pointer.
>>
>> FWIW, I'm running this:
>>
>> FreeBSD home-11bg-ap 10.0-CURRENT FreeBSD 10.0-CURRENT #2
>> r234855:234941M: Wed Dec 31 16:00:00 PST 1969
>> adrian@dummy:/home/adrian/work/freebsd/svn/obj/mipseb/mips.mips/usr/home/adrian/work/freebsd/svn/src/sys/TP-WN1043ND
>> mips
>>
>> .. so maybe try updating to that revision and see if you still see
>> good/bad memory usage?
>>
>> I'd really appreciate it if both of you could build some kernel/world
>> revisions and help me track down where this memory usage went up. I
>> need the help. :)
>>
>> Thanks!
>>
>>
>>
>> Adrian
>>
>>
>> On 16 July 2012 14:02, Harm Weites wrote:
>> > Hi,
>> >
>> > setting NBUF to 128 didn't bring any noticeable change.
>> >
>> > I've changed /etc/rc to just start /bin/sh to make it easier to run some
>> > diagnostics right after kernel boot, here are some of my findings.
>> >
>> > r238194
>> > this is quite interesting since there are no (user) processes running,
>> > apart from /bin/sh.
>> > ----------------
>> > rtl8366rb0port0: link state changed to UP
>> > *** Start /bin/sh
>> > pid 18 (sh), uid 0, was killed: out of swap space
>> > Jul 16 11:58:11 init: /bin/sh on /etc/rc terminated abnormally, going to
>> > single user mode
>> > Enter full pathname of shell or RETURN for /bin/sh:
>> > # vmstat
>> > pid 20 (sh), uid 0, was killed: out of swap space
>> > Jul 16 11:59:41 init: single user shell terminated
>> >
>> >
>> > r235767
>> > ----------------
>> > # vmstat
>> > procs         memory   page                          disks         faults      cpu
>> > r b w     avm    fre   flt  re  pi  po    fr    sr fl0 md0   in   sy   cs us sy id
>> > 0 0 0  25360k  1796k   564   5   3   0   436  1420  29   0    0   63   83  2 18 80
>> >
>> > r228268
>> > Right after kernel boot (so without active networking/services):
>> > ----------------
>> > # vmstat
>> > procs         memory   page                       disk         faults      cpu
>> > r b w     avm    fre   flt  re  pi  po    fr    sr fl0   in   sy   cs us sy id
>> > 0 0 0  35088k    17M    36   0   2   0    57     0  30    0   18   72  0  9 91
>> >
>> > And after initializing networking (and starting hostapd):
>> > # vmstat
>> > procs         memory   page                          disks         faults      cpu
>> > r b w     avm    fre   flt  re  pi  po    fr    sr fl0 md0   in   sy   cs us sy id
>> > 0 0 0  49548k  8620k   140   0   1   0    89     0   0   0    0   74   93  1  5 94
>> >
>> > Furthermore, after manually starting all scripts and observing vmstat
>> > after each step, I noticed a decrease from 13M to 10M after starting the
>> > wifi script (which starts hostapd).
>> >
>> > r231714 with the following processes:
>> > -hostapd
>> > -dropbear
>> > -dhcprelay
>> > -syslogd
>> > -rtadvd
>> > -dhclient
>> > ----------------
>> > # vmstat
>> > procs         memory   page                          disks         faults      cpu
>> > r b w     avm    fre   flt  re  pi  po    fr    sr fl0 md0   in   sy   cs us sy id
>> > 1 1 0    176M  3080k    58   0   0   0    47    30   0   0    0   83  223  1  2 97
>> >
>> > Starting bsnmpd/ntpd takes away another 2500k, which mostly resulted in
>> > the 'out of swap space' error. Hopefully I can at least tweak those
>> > services a little, or perhaps there is something with a smaller
>> > footprint already in ports :)
>> >
>> > I can only hope ~ 3000k is enough to route traffic...
>> >
>> > r228256:228258
>> > This is from the image Adrian put online, where hostapd isn't running;
>> > just inetd.
>> > ----------------
>> > # vmstat
>> > procs         memory   page                          disks         faults      cpu
>> > r b w     avm    fre   flt  re  pi  po    fr    sr fl0 md0   in   sy   cs us sy id
>> > 0 0 0  49928k    11M   255   1   3   0   186     0   0   0    0  144  122  2 18 80
>> >
>> > I am by no means a kernel adept, so I can't do much but show my
>> > observations upon different kernel/userland configurations.
>> >
>> > Any tips/pointers to aid in the dig are greatly appreciated.
>> >
>> > Perhaps someone else with a 1043ND can offer his/her findings with any
>> > particular kernel revision.
>> >
>> > regards
>> >
>> > Ian Lepore wrote on Sun 15-07-2012 at 06:39 [-0600]:
>> >> On Sun, 2012-07-15 at 03:31 -0700, Adrian Chadd wrote:
>> >> > Hi,
>> >> >
>> >> > I would really appreciate it if people (read: not me) would be able to
>> >> > do the digging needed to get to the bottom of user/kernel memory
>> >> > usage.
>> >> >
>> >> > I really need to focus on just the net80211/wifi stack side of things.
>> >> > I'm going to focus on getting the ath(4) memory usage down over the
>> >> > next few months so it remains feasible to run on 32MB platforms, as
>> >> > those still ship. But I can't keep the rest of the kernel and userland
>> >> > in check.
>> >> >
>> >> > Thanks,
>> >> >
>> >> >
>> >> > Adrian
>> >>
>> >> I had to chase down "out of swap space" aborts on an ARM platform with
>> >> 64MB not long ago, and I discovered that the kernel by default allocates
>> >> 1/4 of available ram for vfs buffers (up to some limit, then it's 1/10
>> >> after that). I added "options NBUF=128" to our kernel config and that
>> >> limited wired vfs buffer space to about 2MB, which seems much more
>> >> reasonable for an embedded platform that does relatively little disk IO.
>> >>
>> >> I suspect the NBUF value could go even lower, but I'm also afraid that
>> >> making it too low will lead to other problems; I don't really know
>> >> enough to make an informed decision. So far the 128 value is working
>> >> well in testing, but we haven't actually put any units in the field with
>> >> that setting (I think we will pretty soon).
>> >>
>> >> -- Ian
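
The NBUF sizing Ian describes in the quoted message can be sketched as
simple arithmetic. This is a rough illustration, not the kernel's actual
sizing code. The 16 KiB per-buffer size is an assumption, chosen only
because it reproduces the "about 2MB" figure quoted for NBUF=128. The
1/4-of-RAM rule is Ian's approximation, the upper limit he mentions is
ignored, total RAM stands in for "available ram", and all function names
are hypothetical.

```python
KIB = 1024
MIB = 1024 * KIB

# Assumption: each vfs buffer accounts for 16 KiB. This value is picked
# because 128 * 16 KiB matches the "about 2MB" reported in the thread.
ASSUMED_BUF_BYTES = 16 * KIB


def default_buffer_bytes(ram_bytes):
    """Approximate default: a quarter of RAM (the upper limit is ignored)."""
    return ram_bytes // 4


def nbuf_buffer_bytes(nbuf, buf_bytes=ASSUMED_BUF_BYTES):
    """Buffer space when the buffer count is pinned with 'options NBUF=n'."""
    return nbuf * buf_bytes


def report(ram_mib, nbuf=128):
    """Compare the two estimates for a board with ram_mib MiB of RAM."""
    default = default_buffer_bytes(ram_mib * MIB)
    pinned = nbuf_buffer_bytes(nbuf)
    return {
        "ram_mib": ram_mib,
        "default_mib": default / MIB,
        "nbuf_mib": pinned / MIB,
        "difference_mib": (default - pinned) / MIB,
    }


if __name__ == "__main__":
    # 32 MiB is the 1043ND in this thread; 64 MiB is Ian's ARM platform.
    for ram_mib in (32, 64):
        print(report(ram_mib))
```

Under these assumptions the estimate for a 32 MiB board is 8 MiB by
default versus 2 MiB with NBUF=128. The thread reports no noticeable
change on the 1043ND with that setting, so the real accounting evidently
differs from this simple model.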