Date: Wed, 06 Jul 2011 13:07:53 +1000
From: "Peter Ross" <Peter.Ross@bogen.in-berlin.de>
To: "Jeremy Chadwick" <freebsd@jdc.parodius.com>
Cc: freebsd-stable List <freebsd-stable@freebsd.org>, Scott Sipe <cscotts@gmail.com>
Subject: Re: scp: Write Failed: Cannot allocate memory
Message-ID: <20110706130753.182053f3ellasn0p@webmail.in-berlin.de>
In-Reply-To: <20110706023234.GA72048@icarus.home.lan>
References: <20110706122339.61453nlqra1vqsrv@webmail.in-berlin.de> <20110706023234.GA72048@icarus.home.lan>
Quoting "Jeremy Chadwick" <freebsd@jdc.parodius.com>:

> On Wed, Jul 06, 2011 at 12:23:39PM +1000, Peter Ross wrote:
>> Quoting "Jeremy Chadwick" <freebsd@jdc.parodius.com>:
>>
>> >On Tue, Jul 05, 2011 at 01:03:20PM -0400, Scott Sipe wrote:
>> >>I'm running virtualbox 3.2.12_1 if that has anything to do with it.
>> >>
>> >>sysctl vfs.zfs.arc_max: 6200000000
>> >>
>> >>While I'm trying to scp, kstat.zfs.misc.arcstats.size is
>> >>hovering right around that value, sometimes above, sometimes
>> >>below (that's as it should be, right?). I don't think that it
>> >>dies when crossing over arc_max. I can run the same scp 10 times
>> >>and it might fail 1-3 times, with no correlation to the
>> >>arcstats.size being above/below arc_max that I can see.
>> >>
>> >>Scott
>> >>
>> >>On Jul 5, 2011, at 3:00 AM, Peter Ross wrote:
>> >>
>> >>>Hi all,
>> >>>
>> >>>just as an addition: an upgrade to last Friday's
>> >>>FreeBSD-Stable and to VirtualBox 4.0.8 does not fix the
>> >>>problem.
>> >>>
>> >>>I will experiment a bit more tomorrow after hours and grab some statistics.
>> >>>
>> >>>Regards
>> >>>Peter
>> >>>
>> >>>Quoting "Peter Ross" <Peter.Ross@bogen.in-berlin.de>:
>> >>>
>> >>>>Hi all,
>> >>>>
>> >>>>I noticed a similar problem last week. It is also very
>> >>>>similar to one reported last year:
>> >>>>
>> >>>>http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/058708.html
>> >>>>
>> >>>>My server is a Dell T410 server with the same bge card (the
>> >>>>same pciconf -lvc output as described by Mahlon:
>> >>>>
>> >>>>http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/058711.html
>> >>>>
>> >>>>Yours, Scott, is a em(4)..
>> >>>>
>> >>>>Another similarity: In all cases we are using VirtualBox. I
>> >>>>just want to mention it, in case it matters. I am still
>> >>>>running VirtualBox 3.2.
>> >>>>
>> >>>>Most of the time kstat.zfs.misc.arcstats.size was reaching
>> >>>>vfs.zfs.arc_max then, but I could catch one or two cases
>> >>>>then the value was still below.
>> >>>>
>> >>>>I added vfs.zfs.prefetch_disable=1 to sysctl.conf but it does not help.
>> >>>>
>> >>>>BTW: It looks as ARC only gives back the memory when I
>> >>>>destroy the ZFS (a cloned snapshot containing virtual
>> >>>>machines). Even if nothing happens for hours the buffer
>> >>>>isn't released..
>> >>>>
>> >>>>My machine was still running 8.2-PRERELEASE so I am upgrading.
>> >>>>
>> >>>>I am happy to give information gathered on old/new kernel if it helps.
>> >>>>
>> >>>>Regards
>> >>>>Peter
>> >>>>
>> >>>>Quoting "Scott Sipe" <cscotts@gmail.com>:
>> >>>>
>> >>>>>
>> >>>>>On Jul 2, 2011, at 12:54 AM, jhell wrote:
>> >>>>>
>> >>>>>>On Fri, Jul 01, 2011 at 03:22:32PM -0700, Jeremy Chadwick wrote:
>> >>>>>>>On Fri, Jul 01, 2011 at 03:13:17PM -0400, Scott Sipe wrote:
>> >>>>>>>>I'm running 8.2-RELEASE and am having new problems
>> >>>>>>>>with scp. When scping
>> >>>>>>>>files to a ZFS directory on the FreeBSD server --
>> >>>>>>>>most notably large files
>> >>>>>>>>-- the transfer frequently dies after just a few
>> >>>>>>>>seconds. In my last test, I
>> >>>>>>>>tried to scp an 800mb file to the FreeBSD system and
>> >>>>>>>>the transfer died after
>> >>>>>>>>200mb. It completely copied the next 4 times I
>> >>>>>>>>tried, and then died again on
>> >>>>>>>>the next attempt.
>> >>>>>>>>
>> >>>>>>>>On the client side:
>> >>>>>>>>
>> >>>>>>>>"Connection to home closed by remote host.
>> >>>>>>>>lost connection"
>> >>>>>>>>
>> >>>>>>>>In /var/log/auth.log:
>> >>>>>>>>
>> >>>>>>>>Jul 1 14:54:42 freebsd sshd[18955]: fatal: Write
>> >>>>>>>>failed: Cannot allocate
>> >>>>>>>>memory
>> >>>>>>>>
>> >>>>>>>>I've never seen this before and have used scp before
>> >>>>>>>>to transfer large files
>> >>>>>>>>without problems. This computer has been used in
>> >>>>>>>>production for months and
>> >>>>>>>>has a current uptime of 36 days. I have not been
>> >>>>>>>>able to notice any problems
>> >>>>>>>>copying files to the server via samba or netatalk, or any problems in
>> >>>>>>>>apache.
>> >>>>>>>>
>> >>>>>>>>Uname:
>> >>>>>>>>
>> >>>>>>>>FreeBSD xeon 8.2-RELEASE FreeBSD 8.2-RELEASE #0: Sat
>> >>>>>>>>Feb 19 01:02:54 EST
>> >>>>>>>>2011 root@xeon:/usr/obj/usr/src/sys/GENERIC amd64
>> >>>>>>>>
>> >>>>>>>>I've attached my dmesg and output of vmstat -z.
>> >>>>>>>>
>> >>>>>>>>I have not restarted the sshd daemon or rebooted the computer.
>> >>>>>>>>
>> >>>>>>>>Am glad to provide any other information or test anything else.
>> >>>>>>>>
>> >>>>>>>>{snip vmstat -z and dmesg}
>> >>>>>>>
>> >>>>>>>You didn't provide details about your networking setup (rc.conf,
>> >>>>>>>ifconfig -a, etc.). netstat -m would be useful too.
>> >>>>>>>
>> >>>>>>>Next, please see this thread circa September 2010, titled "Network
>> >>>>>>>memory allocation failures":
>> >>>>>>>
>> >>>>>>>http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/thread.html#58708
>> >>>>>>>
>> >>>>>>>The user in that thread is using rsync, which relies on scp by default.
>> >>>>>>>I believe this problem is similar, if not identical, to yours.
>> >>>>>>>
>> >>>>>>
>> >>>>>>Please also provide your output of ( /usr/bin/limits -a ) for the server
>> >>>>>>end and the client.
>> >>>>>>
>> >>>>>>I am not quite sure I agree with the need for ifconfig -a but some
>> >>>>>>information about the networking driver your using for the interface
>> >>>>>>would be helpful, uptime of the boxes. And configuration of the pool.
>> >>>>>>e.g. ( zpool status -a ;zfs get all <poolname> ) You should probably
>> >>>>>>prop this information up somewhere so you can reference by URL whenever
>> >>>>>>needed.
>> >>>>>>
>> >>>>>>rsync(1) does not rely on scp(1) whatsoever but rsync(1) can be made to
>> >>>>>>use ssh(1) instead of rsh(1) and I believe that is what Jeremy is
>> >>>>>>stating here but correct me if I am wrong. It does use ssh(1) by
>> >>>>>>default.
>> >>>>>>
>> >>>>>>Its a possiblity as well that if using tmpfs(5) or mdmfs(8) for /tmp
>> >>>>>>type filesystems that rsync(1) may be just filling up your temp ram area
>> >>>>>>and causing the connection abort which would be
>> >>>>>>expected. ( df -h ) would
>> >>>>>>help here.
>> >>>>>
>> >>>>>Hello,
>> >>>>>
>> >>>>>I'm not using tmpfs/mdmfs at all. The clients yesterday
>> >>>>>were 3 different OSX computers (over gigabit). The FreeBSD
>> >>>>>server has 12gb of ram and no bce adapter. For what it's
>> >>>>>worth, the server is backed up remotely every night with
>> >>>>>rsync (remote FreeBSD uses rsync to pull) to an offsite
>> >>>>>(slow cable connection) FreeBSD computer, and I have not
>> >>>>>seen any errors in the nightly rsync.
>> >>>>>
>> >>>>>Sorry for the omission of networking info, here's the
>> >>>>>output of the requested commands and some that popped up
>> >>>>>in the other thread:
>> >>>>>
>> >>>>>http://www.cap-press.com/misc/
>> >>>>>
>> >>>>>In rc.conf: ifconfig_em1="inet 10.1.1.1 netmask 255.255.0.0"
>> >>>>>
>> >>>>>Scott
>> >
>> >Just to make it crystal clear to everyone:
>> >
>> >There is no correlation between this problem and use of ZFS. People are
>> >attempting to correlate "cannot allocate memory" messages with "anything
>> >on the system that uses memory". The VM is much more complex than that.
>> >
>> >Given the nature of this problem, it's much more likely the issue is
>> >"somewhere" within a networking layer within FreeBSD, whether it be
>> >driver-level or some sort of intermediary layer.
>> >
>> >Two people who have this issue in this thread are both using VirtualBox.
>> >Can one, or both, of you remove VirtualBox from the configuration
>> >entirely (kernel, etc. -- not sure what is required) and then see if the
>> >issue goes away?
>>
>> On the machine in question I only can do it after hours so I will do
>> it tonight.
>>
>> I was _successfully_ sending the file over the loopback interface using
>>
>> cat /zpool/temp/zimbra_oldroot.vdi | ssh localhost "cat > /dev/null"
>>
>> I did it, btw, with the IPv6 localhost address first (accidently),
>> and then using IPv4. Both worked.
>>
>> It always fails if I am sending it through the bce(4) interface,
>> even if my target is the VirtualBox bridged to the bce card (so it
>> does not "leave" the computer physically).
>>
>> Below the uname -a, ifconfig -a, netstat -rn, pciconf -lv and kldstat output.
>>
>> I have another box where I do not see that problem. It copies files
>> happily over the net using ssh.
>>
>> It is an an older HP ML 150 with 3GB RAM only but with a bge(4)
>> driver instead. It runs the same last week's RELENG_8. I installed
>> VirtualBox and enabled vboxnet (so it loads the kernel modules). But
>> I do not run VirtualBox on it (because it hasn't enough RAM).
>>
>> Regards
>> Peter
>>
>> DellT410one# uname -a
>> FreeBSD DellT410one.vv.fda 8.2-STABLE FreeBSD 8.2-STABLE #1: Thu Jun
>> 30 17:07:18 EST 2011
>> root@DellT410one.vv.fda:/usr/obj/usr/src/sys/GENERIC amd64
>> DellT410one# ifconfig -a
>> bce0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST>
>> metric 0 mtu 1500
>>         options=c01bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO,LINKSTATE>
>>         ether 84:2b:2b:68:64:e4
>>         inet 192.168.50.220 netmask 0xffffff00 broadcast 192.168.50.255
>>         inet 192.168.50.221 netmask 0xffffff00 broadcast 192.168.50.255
>>         inet 192.168.50.223 netmask 0xffffff00 broadcast 192.168.50.255
>>         inet 192.168.50.224 netmask 0xffffff00 broadcast 192.168.50.255
>>         inet 192.168.50.225 netmask 0xffffff00 broadcast 192.168.50.255
>>         inet 192.168.50.226 netmask 0xffffff00 broadcast 192.168.50.255
>>         inet 192.168.50.227 netmask 0xffffff00 broadcast 192.168.50.255
>>         inet 192.168.50.219 netmask 0xffffff00 broadcast 192.168.50.255
>>         media: Ethernet autoselect (1000baseT <full-duplex>)
>>         status: active
>> bce1: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
>>         options=c01bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO,LINKSTATE>
>>         ether 84:2b:2b:68:64:e5
>>         media: Ethernet autoselect
>> lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
>>         options=3<RXCSUM,TXCSUM>
>>         inet6 fe80::1%lo0 prefixlen 64 scopeid 0xb
>>         inet6 ::1 prefixlen 128
>>         inet 127.0.0.1 netmask 0xff000000
>>         nd6 options=3<PERFORMNUD,ACCEPT_RTADV>
>> vboxnet0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
>>         ether 0a:00:27:00:00:00
>> DellT410one# netstat -rn
>> Routing tables
>>
>> Internet:
>> Destination        Gateway            Flags    Refs      Use  Netif Expire
>> default            192.168.50.201     UGS         0    52195   bce0
>> 127.0.0.1          link#11            UH          0        6    lo0
>> 192.168.50.0/24    link#1             U           0  1118212   bce0
>> 192.168.50.219     link#1             UHS         0     9670    lo0
>> 192.168.50.220     link#1             UHS         0     8347    lo0
>> 192.168.50.221     link#1             UHS         0   103024    lo0
>> 192.168.50.223     link#1             UHS         0    43614    lo0
>> 192.168.50.224     link#1             UHS         0     8358    lo0
>> 192.168.50.225     link#1             UHS         0     8438    lo0
>> 192.168.50.226     link#1             UHS         0     8338    lo0
>> 192.168.50.227     link#1             UHS         0     8333    lo0
>> 192.168.165.0/24   192.168.50.200     UGS         0     3311   bce0
>> 192.168.166.0/24   192.168.50.200     UGS         0      699   bce0
>> 192.168.167.0/24   192.168.50.200     UGS         0     3012   bce0
>> 192.168.168.0/24   192.168.50.200     UGS         0      552   bce0
>>
>> Internet6:
>> Destination        Gateway            Flags   Netif Expire
>> ::1                ::1                UH      lo0
>> fe80::%lo0/64      link#11            U       lo0
>> fe80::1%lo0        link#11            UHS     lo0
>> ff01::%lo0/32      fe80::1%lo0        U       lo0
>> ff02::%lo0/32      fe80::1%lo0        U       lo0
>> DellT410one# kldstat
>> Id Refs Address            Size     Name
>>  1   19 0xffffffff80100000 dbf5d0   kernel
>>  2    3 0xffffffff80ec0000 4c358    vboxdrv.ko
>>  3    1 0xffffffff81012000 131998   zfs.ko
>>  4    1 0xffffffff81144000 1ff1     opensolaris.ko
>>  5    2 0xffffffff81146000 2940     vboxnetflt.ko
>>  6    2 0xffffffff81149000 8e38     netgraph.ko
>>  7    1 0xffffffff81152000 153c     ng_ether.ko
>>  8    1 0xffffffff81154000 e70      vboxnetadp.ko
>> DellT410one# pciconf -lv
>> ..
>> bce0@pci0:1:0:0: class=0x020000 card=0x028d1028
>> chip=0x163b14e4 rev=0x20 hdr=0x00
>>     vendor     = 'Broadcom Corporation'
>>     class      = network
>>     subclass   = ethernet
>> bce1@pci0:1:0:1: class=0x020000 card=0x028d1028
>> chip=0x163b14e4 rev=0x20 hdr=0x00
>>     vendor     = 'Broadcom Corporation'
>>     class      = network
>>     subclass   = ethernet
>
> Could you please provide "pciconf -lvcb" output instead, specific to the
> bce chips? Thanks.

Here it is:

bce0@pci0:1:0:0: class=0x020000 card=0x028d1028 chip=0x163b14e4 rev=0x20 hdr=0x00
    vendor     = 'Broadcom Corporation'
    class      = network
    subclass   = ethernet
    bar   [10] = type Memory, range 64, base 0xda000000, size 33554432, enabled
    cap 01[48] = powerspec 3 supports D0 D3 current D0
    cap 03[50] = VPD
    cap 05[58] = MSI supports 16 messages, 64 bit enabled with 1 message
    cap 11[a0] = MSI-X supports 9 messages in map 0x10
    cap 10[ac] = PCI-Express 2 endpoint max data 256(512) link x4(x4)
    ecap 0003[100] = Serial 1 842b2bfffe6864e4
    ecap 0001[110] = AER 1 0 fatal 0 non-fatal 1 corrected
    ecap 0004[150] = unknown 1
    ecap 0002[160] = VC 1 max VC0

Regards
Peter