Date: Mon, 25 Jul 2011 10:07:36 +0800
From: Adrian Chadd <adrian@freebsd.org>
To: Peter Ross <Peter.Ross@bogen.in-berlin.de>
Cc: Yong-Hyeon Pyun <pyunyh@gmail.com>, freebsd-stable List <freebsd-stable@freebsd.org>, "Vogel, Jack" <jack.vogel@intel.com>, Scott Sipe <cscotts@gmail.com>, davidch@freebsd.org, Jeremy Chadwick <freebsd@jdc.parodius.com>
Subject: Re: scp: Write Failed: Cannot allocate memory
Message-ID: <CAJ-Vmom19KbU0kki0KVTSyzmq-CTKh-j7g%2BmFcRVscb%2B0XPxhQ@mail.gmail.com>
In-Reply-To: <20110711115947.51686v4930s7ze37@webmail.in-berlin.de>
References: <20110706122339.61453nlqra1vqsrv@webmail.in-berlin.de> <20110706023234.GA72048@icarus.home.lan> <20110706130753.182053f3ellasn0p@webmail.in-berlin.de> <20110706032425.GA72757@icarus.home.lan> <20110706135412.15276i0fxavg09k4@webmail.in-berlin.de> <20110706041504.GA73698@icarus.home.lan> <20110706143129.10696235ldx9bjmp@webmail.in-berlin.de> <20110706173242.23404ffbhkxz6mqi@webmail.in-berlin.de> <20110706182141.13056plxp148y61h@webmail.in-berlin.de> <CA%2B30O_O8b8O29rc6BLnnGVTY3cWzpuKQ1q8FTG1idJKM5ykrvA@mail.gmail.com> <20110711115947.51686v4930s7ze37@webmail.in-berlin.de>

Has someone asked for the output of netstat -mb?

That error message is mbuf related, so I bet it's something to do with
mbuf allocation. Is it possible that the system is incorrectly tuned
when virtualbox is enabled?

Adrian

On 11 July 2011 09:59, Peter Ross <Peter.Ross@bogen.in-berlin.de> wrote:
> Quoting "Scott Sipe" <cscotts@gmail.com>:
>
>> On Wed, Jul 6, 2011 at 4:21 AM, Peter Ross <Peter.Ross@bogen.in-berlin.de> wrote:
>>
>>> Quoting "Peter Ross" <Peter.Ross@bogen.in-berlin.de>:
>>>
>>>> Quoting "Peter Ross" <Peter.Ross@bogen.in-berlin.de>:
>>>>
>>>>> Quoting "Jeremy Chadwick" <freebsd@jdc.parodius.com>:
>>>>>
>>>>>> On Wed, Jul 06, 2011 at 01:54:12PM +1000, Peter Ross wrote:
>>>>>>
>>>>>>> Quoting "Jeremy Chadwick" <freebsd@jdc.parodius.com>:
>>>>>>>
>>>>>>>> On Wed, Jul 06, 2011 at 01:07:53PM +1000, Peter Ross wrote:
>>>>>>>>
>>>>>>>>> Quoting "Jeremy Chadwick" <freebsd@jdc.parodius.com>:
>>>>>>>>>
>>>>>>>>>> On Wed, Jul 06, 2011 at 12:23:39PM +1000, Peter Ross wrote:
>>>>>>>>>>
>>>>>>>>>>> Quoting "Jeremy Chadwick" <freebsd@jdc.parodius.com>:
>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Jul 05, 2011 at 01:03:20PM -0400, Scott Sipe wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I'm running virtualbox 3.2.12_1 if that has anything to do with it.
>>>>>>>>>>>>>
>>>>>>>>>>>>> sysctl vfs.zfs.arc_max: 6200000000
>>>>>>>>>>>>>
>>>>>>>>>>>>> While I'm trying to scp, kstat.zfs.misc.arcstats.size is
>>>>>>>>>>>>> hovering right around that value, sometimes above, sometimes
>>>>>>>>>>>>> below (that's as it should be, right?). I don't think that it
>>>>>>>>>>>>> dies when crossing over arc_max. I can run the same scp 10 times
>>>>>>>>>>>>> and it might fail 1-3 times, with no correlation to the
>>>>>>>>>>>>> arcstats.size being above/below arc_max that I can see.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Scott
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Jul 5, 2011, at 3:00 AM, Peter Ross wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> just as an addition: an upgrade to last Friday's
>>>>>>>>>>>>>> FreeBSD-Stable and to VirtualBox 4.0.8 does not fix the
>>>>>>>>>>>>>> problem.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I will experiment a bit more tomorrow after hours and grab
>>>>>>>>>>>>>> some statistics.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Regards
>>>>>>>>>>>>>> Peter
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Quoting "Peter Ross" <Peter.Ross@bogen.in-berlin.de>:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I noticed a similar problem last week. It is also very
>>>>>>>>>>>>>>> similar to one reported last year:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/058708.html
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> My server is a Dell T410 server with the same bge card (the
>>>>>>>>>>>>>>> same pciconf -lvc output as described by Mahlon:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/058711.html
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yours, Scott, is a em(4)..
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Another similarity: In all cases we are using VirtualBox. I
>>>>>>>>>>>>>>> just want to mention it, in case it matters. I am still
>>>>>>>>>>>>>>> running VirtualBox 3.2.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Most of the time kstat.zfs.misc.arcstats.size was reaching
>>>>>>>>>>>>>>> vfs.zfs.arc_max then, but I could catch one or two cases
>>>>>>>>>>>>>>> then the value was still below.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I added vfs.zfs.prefetch_disable=1 to sysctl.conf but it
>>>>>>>>>>>>>>> does not help.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> BTW: It looks as ARC only gives back the memory when I
>>>>>>>>>>>>>>> destroy the ZFS (a cloned snapshot containing virtual
>>>>>>>>>>>>>>> machines). Even if nothing happens for hours the buffer
>>>>>>>>>>>>>>> isn't released..
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> My machine was still running 8.2-PRERELEASE so I am
>>>>>>>>>>>>>>> upgrading.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I am happy to give information gathered on old/new kernel if
>>>>>>>>>>>>>>> it helps.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Regards
>>>>>>>>>>>>>>> Peter
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Quoting "Scott Sipe" <cscotts@gmail.com>:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Jul 2, 2011, at 12:54 AM, jhell wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Fri, Jul 01, 2011 at 03:22:32PM -0700, Jeremy Chadwick wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Fri, Jul 01, 2011 at 03:13:17PM -0400, Scott Sipe wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I'm running 8.2-RELEASE and am having new problems
>>>>>>>>>>>>>>>>>>> with scp. When scping files to a ZFS directory on the
>>>>>>>>>>>>>>>>>>> FreeBSD server -- most notably large files -- the
>>>>>>>>>>>>>>>>>>> transfer frequently dies after just a few seconds. In
>>>>>>>>>>>>>>>>>>> my last test, I tried to scp an 800mb file to the
>>>>>>>>>>>>>>>>>>> FreeBSD system and the transfer died after 200mb. It
>>>>>>>>>>>>>>>>>>> completely copied the next 4 times I tried, and then
>>>>>>>>>>>>>>>>>>> died again on the next attempt.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On the client side:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> "Connection to home closed by remote host.
>>>>>>>>>>>>>>>>>>> lost connection"
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> In /var/log/auth.log:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Jul  1 14:54:42 freebsd sshd[18955]: fatal: Write failed: Cannot allocate memory
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I've never seen this before and have used scp before
>>>>>>>>>>>>>>>>>>> to transfer large files without problems. This
>>>>>>>>>>>>>>>>>>> computer has been used in production for months and
>>>>>>>>>>>>>>>>>>> has a current uptime of 36 days. I have not been able
>>>>>>>>>>>>>>>>>>> to notice any problems copying files to the server via
>>>>>>>>>>>>>>>>>>> samba or netatalk, or any problems in apache.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Uname:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> FreeBSD xeon 8.2-RELEASE FreeBSD 8.2-RELEASE #0: Sat Feb 19 01:02:54 EST 2011     root@xeon:/usr/obj/usr/src/sys/GENERIC  amd64
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I've attached my dmesg and output of vmstat -z.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I have not restarted the sshd daemon or rebooted the
>>>>>>>>>>>>>>>>>>> computer.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Am glad to provide any other information or test
>>>>>>>>>>>>>>>>>>> anything else.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> {snip vmstat -z and dmesg}
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> You didn't provide details about your networking setup
>>>>>>>>>>>>>>>>>> (rc.conf, ifconfig -a, etc.).  netstat -m would be useful too.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Next, please see this thread circa September 2010, titled
>>>>>>>>>>>>>>>>>> "Network memory allocation failures":
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/thread.html#58708
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The user in that thread is using rsync, which relies on
>>>>>>>>>>>>>>>>>> scp by default. I believe this problem is similar, if
>>>>>>>>>>>>>>>>>> not identical, to yours.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Please also provide your output of ( /usr/bin/limits -a )
>>>>>>>>>>>>>>>>> for the server end and the client.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I am not quite sure I agree with the need for ifconfig -a
>>>>>>>>>>>>>>>>> but some information about the networking driver you're
>>>>>>>>>>>>>>>>> using for the interface would be helpful, uptime of the
>>>>>>>>>>>>>>>>> boxes. And configuration of the pool. e.g.
>>>>>>>>>>>>>>>>> ( zpool status -a ;zfs get all <poolname> ) You should
>>>>>>>>>>>>>>>>> probably prop this information up somewhere so you can
>>>>>>>>>>>>>>>>> reference by URL whenever needed.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> rsync(1) does not rely on scp(1) whatsoever but rsync(1)
>>>>>>>>>>>>>>>>> can be made to use ssh(1) instead of rsh(1) and I believe
>>>>>>>>>>>>>>>>> that is what Jeremy is stating here but correct me if I
>>>>>>>>>>>>>>>>> am wrong. It does use ssh(1) by default.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> It's a possibility as well that if using tmpfs(5) or
>>>>>>>>>>>>>>>>> mdmfs(8) for /tmp type filesystems that rsync(1) may be
>>>>>>>>>>>>>>>>> just filling up your temp ram area and causing the
>>>>>>>>>>>>>>>>> connection abort which would be expected. ( df -h ) would
>>>>>>>>>>>>>>>>> help here.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I'm not using tmpfs/mdmfs at all. The clients yesterday
>>>>>>>>>>>>>>>> were 3 different OSX computers (over gigabit). The FreeBSD
>>>>>>>>>>>>>>>> server has 12gb of ram and no bce adapter. For what it's
>>>>>>>>>>>>>>>> worth, the server is backed up remotely every night with
>>>>>>>>>>>>>>>> rsync (remote FreeBSD uses rsync to pull) to an offsite
>>>>>>>>>>>>>>>> (slow cable connection) FreeBSD computer, and I have not
>>>>>>>>>>>>>>>> seen any errors in the nightly rsync.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Sorry for the omission of networking info, here's the
>>>>>>>>>>>>>>>> output of the requested commands and some that popped up
>>>>>>>>>>>>>>>> in the other thread:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> http://www.cap-press.com/misc/
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> In rc.conf:  ifconfig_em1="inet 10.1.1.1 netmask 255.255.0.0"
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Scott
>>>>>>>>>>>>
>>>>>>>>>>>> Just to make it crystal clear to everyone:
>>>>>>>>>>>>
>>>>>>>>>>>> There is no correlation between this problem and use of ZFS.
>>>>>>>>>>>> People are attempting to correlate "cannot allocate memory"
>>>>>>>>>>>> messages with "anything on the system that uses memory".  The
>>>>>>>>>>>> VM is much more complex than that.
>>>>>>>>>>>>
>>>>>>>>>>>> Given the nature of this problem, it's much more likely the
>>>>>>>>>>>> issue is "somewhere" within a networking layer within FreeBSD,
>>>>>>>>>>>> whether it be driver-level or some sort of intermediary layer.
>>>>>>>>>>>>
>>>>>>>>>>>> Two people who have this issue in this thread are both using
>>>>>>>>>>>> VirtualBox. Can one, or both, of you remove VirtualBox from
>>>>>>>>>>>> the configuration entirely (kernel, etc. -- not sure what is
>>>>>>>>>>>> required) and then see if the issue goes away?
>>>>>>>>>>>
>>>>>>>>>>> On the machine in question I only can do it after hours so I
>>>>>>>>>>> will do it tonight.
>>>>>>>>>>>
>>>>>>>>>>> I was _successfully_ sending the file over the loopback
>>>>>>>>>>> interface using
>>>>>>>>>>>
>>>>>>>>>>> cat /zpool/temp/zimbra_oldroot.vdi | ssh localhost "cat > /dev/null"
>>>>>>>>>>>
>>>>>>>>>>> I did it, btw, with the IPv6 localhost address first
>>>>>>>>>>> (accidentally), and then using IPv4. Both worked.
>>>>>>>>>>>
>>>>>>>>>>> It always fails if I am sending it through the bce(4)
>>>>>>>>>>> interface, even if my target is the VirtualBox bridged to the
>>>>>>>>>>> bce card (so it does not "leave" the computer physically).
>>>>>>>>>>>
>>>>>>>>>>> Below the uname -a, ifconfig -a, netstat -rn, pciconf -lv and
>>>>>>>>>>> kldstat output.
>>>>>>>>>>>
>>>>>>>>>>> I have another box where I do not see that problem. It copies
>>>>>>>>>>> files happily over the net using ssh.
>>>>>>>>>>>
>>>>>>>>>>> It is an older HP ML 150 with 3GB RAM only but with a bge(4)
>>>>>>>>>>> driver instead. It runs the same last week's RELENG_8. I
>>>>>>>>>>> installed VirtualBox and enabled vboxnet (so it loads the
>>>>>>>>>>> kernel modules). But I do not run VirtualBox on it (because it
>>>>>>>>>>> hasn't enough RAM).
>>>>>>>>>>>
>>>>>>>>>>> Regards
>>>>>>>>>>> Peter
>>>>>>>>>>>
>>>>>>>>>>> DellT410one# uname -a
>>>>>>>>>>> FreeBSD DellT410one.vv.fda 8.2-STABLE FreeBSD 8.2-STABLE #1: Thu Jun 30 17:07:18 EST 2011 root@DellT410one.vv.fda:/usr/obj/usr/src/sys/GENERIC  amd64
>>>>>>>>>>> DellT410one# ifconfig -a
>>>>>>>>>>> bce0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
>>>>>>>>>>>       options=c01bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO,LINKSTATE>
>>>>>>>>>>>       ether 84:2b:2b:68:64:e4
>>>>>>>>>>>       inet 192.168.50.220 netmask 0xffffff00 broadcast 192.168.50.255
>>>>>>>>>>>       inet 192.168.50.221 netmask 0xffffff00 broadcast 192.168.50.255
>>>>>>>>>>>       inet 192.168.50.223 netmask 0xffffff00 broadcast 192.168.50.255
>>>>>>>>>>>       inet 192.168.50.224 netmask 0xffffff00 broadcast 192.168.50.255
>>>>>>>>>>>       inet 192.168.50.225 netmask 0xffffff00 broadcast 192.168.50.255
>>>>>>>>>>>       inet 192.168.50.226 netmask 0xffffff00 broadcast 192.168.50.255
>>>>>>>>>>>       inet 192.168.50.227 netmask 0xffffff00 broadcast 192.168.50.255
>>>>>>>>>>>       inet 192.168.50.219 netmask 0xffffff00 broadcast 192.168.50.255
>>>>>>>>>>>       media: Ethernet autoselect (1000baseT <full-duplex>)
>>>>>>>>>>>       status: active
>>>>>>>>>>> bce1: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
>>>>>>>>>>>       options=c01bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO,LINKSTATE>
>>>>>>>>>>>       ether 84:2b:2b:68:64:e5
>>>>>>>>>>>       media: Ethernet autoselect
>>>>>>>>>>> lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
>>>>>>>>>>>       options=3<RXCSUM,TXCSUM>
>>>>>>>>>>>       inet6 fe80::1%lo0 prefixlen 64 scopeid 0xb
>>>>>>>>>>>       inet6 ::1 prefixlen 128
>>>>>>>>>>>       inet 127.0.0.1 netmask 0xff000000
>>>>>>>>>>>       nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
>>>>>>>>>>> vboxnet0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
>>>>>>>>>>>       ether 0a:00:27:00:00:00
>>>>>>>>>>> DellT410one# netstat -rn
>>>>>>>>>>> Routing tables
>>>>>>>>>>>
>>>>>>>>>>> Internet:
>>>>>>>>>>> Destination        Gateway            Flags    Refs      Use  Netif Expire
>>>>>>>>>>> default            192.168.50.201     UGS         0    52195   bce0
>>>>>>>>>>> 127.0.0.1          link#11            UH          0        6    lo0
>>>>>>>>>>> 192.168.50.0/24    link#1             U           0  1118212   bce0
>>>>>>>>>>> 192.168.50.219     link#1             UHS         0     9670    lo0
>>>>>>>>>>> 192.168.50.220     link#1             UHS         0     8347    lo0
>>>>>>>>>>> 192.168.50.221     link#1             UHS         0   103024    lo0
>>>>>>>>>>> 192.168.50.223     link#1             UHS         0    43614    lo0
>>>>>>>>>>> 192.168.50.224     link#1             UHS         0     8358    lo0
>>>>>>>>>>> 192.168.50.225     link#1             UHS         0     8438    lo0
>>>>>>>>>>> 192.168.50.226     link#1             UHS         0     8338    lo0
>>>>>>>>>>> 192.168.50.227     link#1             UHS         0     8333    lo0
>>>>>>>>>>> 192.168.165.0/24   192.168.50.200     UGS         0     3311   bce0
>>>>>>>>>>> 192.168.166.0/24   192.168.50.200     UGS         0      699   bce0
>>>>>>>>>>> 192.168.167.0/24   192.168.50.200     UGS         0     3012   bce0
>>>>>>>>>>> 192.168.168.0/24   192.168.50.200     UGS         0      552   bce0
>>>>>>>>>>>
>>>>>>>>>>> Internet6:
>>>>>>>>>>> Destination                       Gateway                       Flags      Netif Expire
>>>>>>>>>>> ::1                               ::1                           UH          lo0
>>>>>>>>>>> fe80::%lo0/64                     link#11                       U           lo0
>>>>>>>>>>> fe80::1%lo0                       link#11                       UHS         lo0
>>>>>>>>>>> ff01::%lo0/32                     fe80::1%lo0                   U           lo0
>>>>>>>>>>> ff02::%lo0/32                     fe80::1%lo0                   U           lo0
>>>>>>>>>>> DellT410one# kldstat
>>>>>>>>>>> Id Refs Address            Size     Name
>>>>>>>>>>>  1   19 0xffffffff80100000 dbf5d0   kernel
>>>>>>>>>>>  2    3 0xffffffff80ec0000 4c358    vboxdrv.ko
>>>>>>>>>>>  3    1 0xffffffff81012000 131998   zfs.ko
>>>>>>>>>>>  4    1 0xffffffff81144000 1ff1     opensolaris.ko
>>>>>>>>>>>  5    2 0xffffffff81146000 2940     vboxnetflt.ko
>>>>>>>>>>>  6    2 0xffffffff81149000 8e38     netgraph.ko
>>>>>>>>>>>  7    1 0xffffffff81152000 153c     ng_ether.ko
>>>>>>>>>>>  8    1 0xffffffff81154000 e70      vboxnetadp.ko
>>>>>>>>>>> DellT410one# pciconf -lv
>>>>>>>>>>> ..
>>>>>>>>>>> bce0@pci0:1:0:0:        class=0x020000 card=0x028d1028 chip=0x163b14e4 rev=0x20 hdr=0x00
>>>>>>>>>>>     vendor     = 'Broadcom Corporation'
>>>>>>>>>>>     class      = network
>>>>>>>>>>>     subclass   = ethernet
>>>>>>>>>>> bce1@pci0:1:0:1:        class=0x020000 card=0x028d1028 chip=0x163b14e4 rev=0x20 hdr=0x00
>>>>>>>>>>>     vendor     = 'Broadcom Corporation'
>>>>>>>>>>>     class      = network
>>>>>>>>>>>     subclass   = ethernet
>>>>>>>>>>
>>>>>>>>>> Could you please provide "pciconf -lvcb" output instead,
>>>>>>>>>> specific to the bce chips?  Thanks.
>>>>>>>>>
>>>>>>>>> Here it is:
>>>>>>>>>
>>>>>>>>> bce0@pci0:1:0:0:        class=0x020000 card=0x028d1028 chip=0x163b14e4 rev=0x20 hdr=0x00
>>>>>>>>>     vendor     = 'Broadcom Corporation'
>>>>>>>>>     class      = network
>>>>>>>>>     subclass   = ethernet
>>>>>>>>>     bar   [10] = type Memory, range 64, base 0xda000000, size 33554432, enabled
>>>>>>>>>     cap 01[48] = powerspec 3  supports D0 D3  current D0
>>>>>>>>>     cap 03[50] = VPD
>>>>>>>>>     cap 05[58] = MSI supports 16 messages, 64 bit enabled with 1 message
>>>>>>>>>     cap 11[a0] = MSI-X supports 9 messages in map 0x10
>>>>>>>>>     cap 10[ac] = PCI-Express 2 endpoint max data 256(512) link x4(x4)
>>>>>>>>> ecap 0003[100] = Serial 1 842b2bfffe6864e4
>>>>>>>>> ecap 0001[110] = AER 1 0 fatal 0 non-fatal 1 corrected
>>>>>>>>> ecap 0004[150] = unknown 1
>>>>>>>>> ecap 0002[160] = VC 1 max VC0
>>>>>>>>
>>>>>>>> Thanks Peter.
>>>>>>>>
>>>>>>>> Adding Yong-Hyeon and David to the discussion, since they've both
>>>>>>>> worked on the bce(4) driver in recent months (most of the changes
>>>>>>>> made recently are only in HEAD), and also adding Jack Vogel of
>>>>>>>> Intel who maintains em(4).  Brief history for the devs:
>>>>>>>>
>>>>>>>> The issue is described "Network memory allocation failures" and
>>>>>>>> was reported last year, but two users recently (Scott and Peter)
>>>>>>>> have reported the issue again:
>>>>>>>>
>>>>>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/thread.html#58708
>>>>>>>>
>>>>>>>> And was mentioned again by Scott here, which also contains some
>>>>>>>> technical details:
>>>>>>>>
>>>>>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2011-July/063172.html
>>>>>>>>
>>>>>>>> What's interesting is that Scott's issue is identical in form but
>>>>>>>> he's using em(4), which isn't known to behave like this.  Both
>>>>>>>> individuals are using VirtualBox, though we're not sure at this
>>>>>>>> point if that is the piece which is causing the anomaly.
>>>>>>>>
>>>>>>>> Relevant details of Scott's system (em-based):
>>>>>>>>
>>>>>>>> http://www.cap-press.com/misc/
>>>>>>>>
>>>>>>>> Relevant details of Peter's system (bce-based):
>>>>>>>>
>>>>>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2011-July/063221.html
>>>>>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2011-July/063223.html
>>>>>>>>
>>>>>>>> I think the biggest complexity right now is figuring out how/why
>>>>>>>> scp fails intermittently in this nature.  The errno probably
>>>>>>>> "trickles down" to userland from the kernel, but the condition
>>>>>>>> regarding why it happens is unknown.
>>>>>>>
>>>>>>> BTW: I also saw 2 of the errors coming from a BIND9 running in a
>>>>>>> jail on that box.
>>>>>>>
>>>>>>> DellT410one# fgrep -i allocate /jails/bind/20110315/var/log/messages
>>>>>>> Apr 13 05:17:41 bind named[23534]: internal_send: 192.168.50.145#65176: Cannot allocate memory
>>>>>>> Jun 21 23:30:44 bind named[39864]: internal_send: 192.168.50.251#36155: Cannot allocate memory
>>>>>>> Jun 24 15:28:00 bind named[39864]: internal_send: 192.168.50.251#28651: Cannot allocate memory
>>>>>>> Jun 28 12:57:52 bind named[2462]: internal_send: 192.168.165.154#1201: Cannot allocate memory
>>>>>>>
>>>>>>> My initial guess: it happens sooner or later somehow - whether it
>>>>>>> is a lot of traffic in one go (ssh/scp copies of virtual disks) or
>>>>>>> a lot of traffic over a longer period (a nameserver gets asked
>>>>>>> again and again).
>>>>>>
>>>>>> Scott, are you also using jails?  If both of you are: is there any
>>>>>> possibility you can remove use of those?
>>>>>> I'm not sure how VirtualBox fits into the picture (jails +
>>>>>> VirtualBox that is), but I can imagine jails having different
>>>>>> environmental constraints that might cause this.
>>>>>>
>>>>>> Basically the troubleshooting process here is to remove pieces of
>>>>>> the puzzle until you figure out which piece is causing the issue.
>>>>>> I don't want to get the NIC driver devs all spun up for something
>>>>>> that, for example, might be an issue with the jail implementation.
>>>>>
>>>>> I understand this. As said, I do some afterhours debugging tonight.
>>>>>
>>>>> The scp/ssh problems are happening _outside_ the jails. The bind
>>>>> runs _inside_ the jail.
>>>>>
>>>>> I wanted to use the _host_ system to send VirtualBox virtual disks
>>>>> and filesystems used by jails to archive them and/or having them
>>>>> available on other FreeBSD systems (as a cold standby solution).
>>>>
>>>> I just switched off the VirtualBox (without removing the kernel
>>>> modules).
>>>>
>>>> The copy succeeds now.
>>>>
>>>> Well, it could be a VirtualBox related problem, or is the server just
>>>> relieved to have 2GB more memory at hands now?
>>>>
>>>> Do you have a quick idea to "emulate" the 2GB memory load usually
>>>> delivered by VirtualBox?
>>>
>>> Well, managed that (using lookbusy)
>>>
>>> Interestingly I could copy a large file (30GB) without problems, as
>>> soon as I switched off the VirtualBox. As said, the kernel modules
>>> weren't unloaded, they are still there.
>>>
>>> The copy crashes seconds after I started the VirtualBox. According to
>>> vmstat and top I had more free memory (ca. 1.5GB) as I had without
>>> VirtualBox and lookbusy (ca. 350MB).
>>>
>>> So, it looks (to me, at least) as I have a VirtualBox related problem,
>>> somehow.
>>>
>>> Any ideas?
>>> I am happy to play a bit more to get it sorted although it has
>>> some limits (it is running the company mailserver, after all)
>>>
>>> Regards
>>> Peter
>>
>> This is it -- I'm seeing the exact same thing.
>>
>> Scp dies reliably with VirtualBox running. Quit VirtualBox and I was
>> able to scp about 30 large files with no errors. Once I started
>> VirtualBox an in-progress scp died within seconds.
>>
>> Ditto that the Kernel modules merely being loaded don't seem to make a
>> difference, it's VirtualBox actually running.
>>
>> virtualbox-ose-3.2.12_1
>
> Hi,
>
> I wonder whether anyone has new ideas.
>
> I am puzzled that it happens when VirtualBoxes are running, while the
> load or unload of the VirtualBox kernel modules doesn't seem to have an
> effect.
>
> Should I describe the case at the -emulation mailing list to get some
> ideas from the engineers working on VirtualBox?
>
> I do not want to create too much noise so I would like to know your
> thoughts on it first.
>
> I experimented a little bit with the ssh code and know which write(2) in
> /usr/src/crypto/openssh/roaming_common.c (in function roaming_write)
> returns the ENOMEM (an error it should never return, according to the
> man page ;-)
>
> but unfortunately I am lost to track it further down in the kernel. I do
> not know enough about it, to be frank.
>
> Are there any memory stats inside the kernel that could help?
>
> Thank you for all ideas
> Peter
>
> _______________________________________________
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
>
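
The thread describes a test of the form "run the same scp 10 times and it might fail 1-3 times", and asks for netstat -m output around the failures. A minimal sh sketch of that procedure might look like the following. The function name repeat_and_count and the variable MBUF_LOG are hypothetical, chosen only for illustration. The sketch assumes that netstat -m, as requested earlier in the thread, is the right thing to capture after a failed run.

```shell
#!/bin/sh
# Sketch only: repeat_and_count and MBUF_LOG are hypothetical names,
# not part of FreeBSD or OpenSSH.

# Run a command N times and print the number of failed runs on stdout.
# After each failure, append `netstat -m` output to $MBUF_LOG so the
# mbuf counters are captured close to the moment of the error.
repeat_and_count() {
    runs=$1
    shift
    failures=0
    i=1
    while [ "$i" -le "$runs" ]; do
        # Send the command's own stdout to stderr so that only the
        # final failure count appears on stdout.
        if ! "$@" >&2; then
            failures=$((failures + 1))
            {
                echo "--- run $i failed at $(date)"
                netstat -m
            } >> "${MBUF_LOG:-mbuf-failures.log}" 2>&1 || true
        fi
        i=$((i + 1))
    done
    echo "$failures"
}
```

It could be used as, for example, `repeat_and_count 10 scp /zpool/temp/zimbra_oldroot.vdi user@otherhost:/tmp/`, where user@otherhost is a placeholder. Running it once with VirtualBox stopped and once with it running would give comparable failure counts and mbuf snapshots for the two cases discussed above.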