From owner-freebsd-stable@FreeBSD.ORG  Wed Jul  6 07:32:45 2011
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id A47FE1065675;
	Wed,  6 Jul 2011 07:32:45 +0000 (UTC)
	(envelope-from Peter.Ross@bogen.in-berlin.de)
Received: from einhorn.in-berlin.de (einhorn.in-berlin.de [192.109.42.8])
	by mx1.freebsd.org (Postfix) with ESMTP id 640578FC12;
	Wed,  6 Jul 2011 07:32:44 +0000 (UTC)
X-Envelope-From: Peter.Ross@bogen.in-berlin.de
Received: from localhost (okapi.in-berlin.de [192.109.42.117])
	by einhorn.in-berlin.de (8.13.6/8.13.6/Debian-1) with ESMTP id
	p667WgpP016550; Wed, 6 Jul 2011 09:32:42 +0200
Received: from 124-254-118-24-static.bb.ispone.net.au
	(124-254-118-24-static.bb.ispone.net.au [124.254.118.24]) by
	webmail.in-berlin.de (Horde Framework) with HTTP;
	Wed, 06 Jul 2011 17:32:42 +1000
Message-ID: <20110706173242.23404ffbhkxz6mqi@webmail.in-berlin.de>
Date: Wed, 06 Jul 2011 17:32:42 +1000
From: "Peter Ross" <Peter.Ross@bogen.in-berlin.de>
To: "Peter Ross" <Peter.Ross@bogen.in-berlin.de>
References: <20110706122339.61453nlqra1vqsrv@webmail.in-berlin.de>
	<20110706023234.GA72048@icarus.home.lan>
	<20110706130753.182053f3ellasn0p@webmail.in-berlin.de>
	<20110706032425.GA72757@icarus.home.lan>
	<20110706135412.15276i0fxavg09k4@webmail.in-berlin.de>
	<20110706041504.GA73698@icarus.home.lan>
	<20110706143129.10696235ldx9bjmp@webmail.in-berlin.de>
In-Reply-To: <20110706143129.10696235ldx9bjmp@webmail.in-berlin.de>
MIME-Version: 1.0
Content-Type: text/plain;
	charset=ISO-8859-1;
	DelSp="Yes";
	format="flowed"
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
User-Agent: Internet Messaging Program (IMP) 4.3.3
X-Scanned-By: MIMEDefang_at_IN-Berlin_e.V. on 192.109.42.8
Cc: Yong-Hyeon Pyun <pyunyh@gmail.com>,
	freebsd-stable List <freebsd-stable@freebsd.org>, "Vogel,
	Jack" <jack.vogel@intel.com>, davidch@freebsd.org,
	Scott Sipe <cscotts@gmail.com>, Jeremy Chadwick <freebsd@jdc.parodius.com>
Subject: Re: scp: Write Failed: Cannot allocate memory
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code <freebsd-stable.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>, 
	<mailto:freebsd-stable-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-stable>
List-Post: <mailto:freebsd-stable@freebsd.org>
List-Help: <mailto:freebsd-stable-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>,
	<mailto:freebsd-stable-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 06 Jul 2011 07:32:45 -0000

Quoting "Peter Ross" <Peter.Ross@bogen.in-berlin.de>:

> Quoting "Jeremy Chadwick" <freebsd@jdc.parodius.com>:
>
>> On Wed, Jul 06, 2011 at 01:54:12PM +1000, Peter Ross wrote:
>>> Quoting "Jeremy Chadwick" <freebsd@jdc.parodius.com>:
>>>
>>>> On Wed, Jul 06, 2011 at 01:07:53PM +1000, Peter Ross wrote:
>>>>> Quoting "Jeremy Chadwick" <freebsd@jdc.parodius.com>:
>>>>>
>>>>>> On Wed, Jul 06, 2011 at 12:23:39PM +1000, Peter Ross wrote:
>>>>>>> Quoting "Jeremy Chadwick" <freebsd@jdc.parodius.com>:
>>>>>>>
>>>>>>>> On Tue, Jul 05, 2011 at 01:03:20PM -0400, Scott Sipe wrote:
>>>>>>>>> I'm running virtualbox 3.2.12_1 if that has anything to do with it=
.
>>>>>>>>>
>>>>>>>>> sysctl vfs.zfs.arc_max: 6200000000
>>>>>>>>>
>>>>>>>>> While I'm trying to scp, kstat.zfs.misc.arcstats.size is
>>>>>>>>> hovering right around that value, sometimes above, sometimes
>>>>>>>>> below (that's as it should be, right?). I don't think that it
>>>>>>>>> dies when crossing over arc_max. I can run the same scp 10 times
>>>>>>>>> and it might fail 1-3 times, with no correlation to the
>>>>>>>>> arcstats.size being above/below arc_max that I can see.
>>>>>>>>>
>>>>>>>>> Scott
>>>>>>>>>
>>>>>>>>> On Jul 5, 2011, at 3:00 AM, Peter Ross wrote:
>>>>>>>>>
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> just as an addition: an upgrade to last Friday's
>>>>>>>>>> FreeBSD-Stable and to VirtualBox 4.0.8 does not fix the
>>>>>>>>>> problem.
>>>>>>>>>>
>>>>>>>>>> I will experiment a bit more tomorrow after hours and grab
>>>>>>> some statistics.
>>>>>>>>>>
>>>>>>>>>> Regards
>>>>>>>>>> Peter
>>>>>>>>>>
>>>>>>>>>> Quoting "Peter Ross" <Peter.Ross@bogen.in-berlin.de>:
>>>>>>>>>>
>>>>>>>>>>> Hi all,
>>>>>>>>>>>
>>>>>>>>>>> I noticed a similar problem last week. It is also very
>>>>>>>>>>> similar to one reported last year:
>>>>>>>>>>>
>>>>>>>>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2010-September=
/058708.html
>>>>>>>>>>>
>>>>>>>>>>> My server is a Dell T410 server with the same bge card (the
>>>>>>>>>>> same pciconf -lvc output as described by Mahlon:
>>>>>>>>>>>
>>>>>>>>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2010-September=
/058711.html
>>>>>>>>>>>
>>>>>>>>>>> Yours, Scott, is a em(4)..
>>>>>>>>>>>
>>>>>>>>>>> Another similarity: In all cases we are using VirtualBox. I
>>>>>>>>>>> just want to mention it, in case it matters. I am still
>>>>>>>>>>> running VirtualBox 3.2.
>>>>>>>>>>>
>>>>>>>>>>> Most of the time kstat.zfs.misc.arcstats.size was reaching
>>>>>>>>>>> vfs.zfs.arc_max then, but I could catch one or two cases
>>>>>>>>>>> then the value was still below.
>>>>>>>>>>>
>>>>>>>>>>> I added vfs.zfs.prefetch_disable=3D1 to sysctl.conf but it
>>>>> does not help.
>>>>>>>>>>>
>>>>>>>>>>> BTW: It looks as ARC only gives back the memory when I
>>>>>>>>>>> destroy the ZFS (a cloned snapshot containing virtual
>>>>>>>>>>> machines). Even if nothing happens for hours the buffer
>>>>>>>>>>> isn't released..
>>>>>>>>>>>
>>>>>>>>>>> My machine was still running 8.2-PRERELEASE so I am upgrading.
>>>>>>>>>>>
>>>>>>>>>>> I am happy to give information gathered on old/new kernel =20
>>>>>>>>>>> if it helps.
>>>>>>>>>>>
>>>>>>>>>>> Regards
>>>>>>>>>>> Peter
>>>>>>>>>>>
>>>>>>>>>>> Quoting "Scott Sipe" <cscotts@gmail.com>:
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Jul 2, 2011, at 12:54 AM, jhell wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Jul 01, 2011 at 03:22:32PM -0700, Jeremy Chadwick wrot=
e:
>>>>>>>>>>>>>> On Fri, Jul 01, 2011 at 03:13:17PM -0400, Scott Sipe wrote:
>>>>>>>>>>>>>>> I'm running 8.2-RELEASE and am having new problems
>>>>>>>>>>>>>>> with scp. When scping
>>>>>>>>>>>>>>> files to a ZFS directory on the FreeBSD server --
>>>>>>>>>>>>>>> most notably large files
>>>>>>>>>>>>>>> -- the transfer frequently dies after just a few
>>>>>>>>>>>>>>> seconds. In my last test, I
>>>>>>>>>>>>>>> tried to scp an 800mb file to the FreeBSD system and
>>>>>>>>>>>>>>> the transfer died after
>>>>>>>>>>>>>>> 200mb. It completely copied the next 4 times I
>>>>>>>>>>>>>>> tried, and then died again on
>>>>>>>>>>>>>>> the next attempt.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On the client side:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> "Connection to home closed by remote host.
>>>>>>>>>>>>>>> lost connection"
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In /var/log/auth.log:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Jul  1 14:54:42 freebsd sshd[18955]: fatal: Write
>>>>>>>>>>>>>>> failed: Cannot allocate
>>>>>>>>>>>>>>> memory
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I've never seen this before and have used scp before
>>>>>>>>>>>>>>> to transfer large files
>>>>>>>>>>>>>>> without problems. This computer has been used in
>>>>>>>>>>>>>>> production for months and
>>>>>>>>>>>>>>> has a current uptime of 36 days. I have not been
>>>>>>>>>>>>>>> able to notice any problems
>>>>>>>>>>>>>>> copying files to the server via samba or netatalk, or
>>>>>>> any problems in
>>>>>>>>>>>>>>> apache.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Uname:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> FreeBSD xeon 8.2-RELEASE FreeBSD 8.2-RELEASE #0: Sat
>>>>>>>>>>>>>>> Feb 19 01:02:54 EST
>>>>>>>>>>>>>>> 2011     root@xeon:/usr/obj/usr/src/sys/GENERIC  amd64
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I've attached my dmesg and output of vmstat -z.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I have not restarted the sshd daemon or rebooted the compute=
r.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Am glad to provide any other information or test anything el=
se.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> {snip vmstat -z and dmesg}
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> You didn't provide details about your networking setup (rc.co=
nf,
>>>>>>>>>>>>>> ifconfig -a, etc.).  netstat -m would be useful too.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Next, please see this thread circa September 2010, =20
>>>>>>>>>>>>>> titled "Network
>>>>>>>>>>>>>> memory allocation failures":
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2010-Septem=
ber/thread.html#58708
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The user in that thread is using rsync, which relies on
>>>>>>> scp by default.
>>>>>>>>>>>>>> I believe this problem is similar, if not identical, to yours=
.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please also provide your output of ( /usr/bin/limits -a )
>>>>>>> for the server
>>>>>>>>>>>>> end and the client.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am not quite sure I agree with the need for ifconfig =20
>>>>>>>>>>>>> -a but some
>>>>>>>>>>>>> information about the networking driver your using for =20
>>>>>>>>>>>>> the interface
>>>>>>>>>>>>> would be helpful, uptime of the boxes. And configuration
>>>>> of the pool.
>>>>>>>>>>>>> e.g. ( zpool status -a ;zfs get all <poolname> ) You =20
>>>>>>>>>>>>> should probably
>>>>>>>>>>>>> prop this information up somewhere so you can reference by
>>>>>>> URL whenever
>>>>>>>>>>>>> needed.
>>>>>>>>>>>>>
>>>>>>>>>>>>> rsync(1) does not rely on scp(1) whatsoever but rsync(1)
>>>>>>> can be made to
>>>>>>>>>>>>> use ssh(1) instead of rsh(1) and I believe that is what Jeremy=
 is
>>>>>>>>>>>>> stating here but correct me if I am wrong. It does use ssh(1) =
by
>>>>>>>>>>>>> default.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Its a possiblity as well that if using tmpfs(5) or =20
>>>>>>>>>>>>> mdmfs(8) for /tmp
>>>>>>>>>>>>> type filesystems that rsync(1) may be just filling up your
>>>>>>> temp ram area
>>>>>>>>>>>>> and causing the connection abort which would be
>>>>>>>>>>>>> expected. ( df -h ) would
>>>>>>>>>>>>> help here.
>>>>>>>>>>>>
>>>>>>>>>>>> Hello,
>>>>>>>>>>>>
>>>>>>>>>>>> I'm not using tmpfs/mdmfs at all. The clients yesterday
>>>>>>>>>>>> were 3 different OSX computers (over gigabit). The FreeBSD
>>>>>>>>>>>> server has 12gb of ram and no bce adapter. For what it's
>>>>>>>>>>>> worth, the server is backed up remotely every night with
>>>>>>>>>>>> rsync (remote FreeBSD uses rsync to pull) to an offsite
>>>>>>>>>>>> (slow cable connection) FreeBSD computer, and I have not
>>>>>>>>>>>> seen any errors in the nightly rsync.
>>>>>>>>>>>>
>>>>>>>>>>>> Sorry for the omission of networking info, here's the
>>>>>>>>>>>> output of the requested commands and some that popped up
>>>>>>>>>>>> in the other thread:
>>>>>>>>>>>>
>>>>>>>>>>>> http://www.cap-press.com/misc/
>>>>>>>>>>>>
>>>>>>>>>>>> In rc.conf:  ifconfig_em1=3D"inet 10.1.1.1 netmask 255.255.0.0"
>>>>>>>>>>>>
>>>>>>>>>>>> Scott
>>>>>>>>
>>>>>>>> Just to make it crystal clear to everyone:
>>>>>>>>
>>>>>>>> There is no correlation between this problem and use of ZFS.  =20
>>>>>>>> People are
>>>>>>>> attempting to correlate "cannot allocate memory" messages =20
>>>>>>>> with "anything
>>>>>>>> on the system that uses memory".  The VM is much more complex =20
>>>>>>>> than that.
>>>>>>>>
>>>>>>>> Given the nature of this problem, it's much more likely the issue i=
s
>>>>>>>> "somewhere" within a networking layer within FreeBSD, whether it be
>>>>>>>> driver-level or some sort of intermediary layer.
>>>>>>>>
>>>>>>>> Two people who have this issue in this thread are both using =20
>>>>>>>> VirtualBox.
>>>>>>>> Can one, or both, of you remove VirtualBox from the configuration
>>>>>>>> entirely (kernel, etc. -- not sure what is required) and then =20
>>>>>>>> see if the
>>>>>>>> issue goes away?
>>>>>>>
>>>>>>> On the machine in question I only can do it after hours so I will do
>>>>>>> it tonight.
>>>>>>>
>>>>>>> I was _successfully_ sending the file over the loopback interface us=
ing
>>>>>>>
>>>>>>> cat /zpool/temp/zimbra_oldroot.vdi | ssh localhost "cat > /dev/null"
>>>>>>>
>>>>>>> I did it, btw, with the IPv6 localhost address first (accidently),
>>>>>>> and then using IPv4. Both worked.
>>>>>>>
>>>>>>> It always fails if I am sending it through the bce(4) interface,
>>>>>>> even if my target is the VirtualBox bridged to the bce card (so it
>>>>>>> does not "leave" the computer physically).
>>>>>>>
>>>>>>> Below the uname -a, ifconfig -a, netstat -rn, pciconf -lv and
>>>>>>> kldstat output.
>>>>>>>
>>>>>>> I have another box where I do not see that problem. It copies files
>>>>>>> happily over the net using ssh.
>>>>>>>
>>>>>>> It is an an older HP ML 150 with 3GB RAM only but with a bge(4)
>>>>>>> driver instead. It runs the same last week's RELENG_8. I installed
>>>>>>> VirtualBox and enabled vboxnet (so it loads the kernel modules). But
>>>>>>> I do not run VirtualBox on it (because it hasn't enough RAM).
>>>>>>>
>>>>>>> Regards
>>>>>>> Peter
>>>>>>>
>>>>>>> DellT410one# uname -a
>>>>>>> FreeBSD DellT410one.vv.fda 8.2-STABLE FreeBSD 8.2-STABLE #1: Thu Jun
>>>>>>> 30 17:07:18 EST 2011
>>>>>>> root@DellT410one.vv.fda:/usr/obj/usr/src/sys/GENERIC  amd64
>>>>>>> DellT410one# ifconfig -a
>>>>>>> bce0: flags=3D8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST>
>>>>>>> metric 0 mtu 1500
>>>>>>> =09options=3Dc01bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,V=
LAN_HWCSUM,TSO4,VLAN_HWTSO,LINKSTATE>
>>>>>>> =09ether 84:2b:2b:68:64:e4
>>>>>>> =09inet 192.168.50.220 netmask 0xffffff00 broadcast 192.168.50.255
>>>>>>> =09inet 192.168.50.221 netmask 0xffffff00 broadcast 192.168.50.255
>>>>>>> =09inet 192.168.50.223 netmask 0xffffff00 broadcast 192.168.50.255
>>>>>>> =09inet 192.168.50.224 netmask 0xffffff00 broadcast 192.168.50.255
>>>>>>> =09inet 192.168.50.225 netmask 0xffffff00 broadcast 192.168.50.255
>>>>>>> =09inet 192.168.50.226 netmask 0xffffff00 broadcast 192.168.50.255
>>>>>>> =09inet 192.168.50.227 netmask 0xffffff00 broadcast 192.168.50.255
>>>>>>> =09inet 192.168.50.219 netmask 0xffffff00 broadcast 192.168.50.255
>>>>>>> =09media: Ethernet autoselect (1000baseT <full-duplex>)
>>>>>>> =09status: active
>>>>>>> bce1: flags=3D8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
>>>>>>> =09options=3Dc01bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,V=
LAN_HWCSUM,TSO4,VLAN_HWTSO,LINKSTATE>
>>>>>>> =09ether 84:2b:2b:68:64:e5
>>>>>>> =09media: Ethernet autoselect
>>>>>>> lo0: flags=3D8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
>>>>>>> =09options=3D3<RXCSUM,TXCSUM>
>>>>>>> =09inet6 fe80::1%lo0 prefixlen 64 scopeid 0xb
>>>>>>> =09inet6 ::1 prefixlen 128
>>>>>>> =09inet 127.0.0.1 netmask 0xff000000
>>>>>>> =09nd6 options=3D3<PERFORMNUD,ACCEPT_RTADV>
>>>>>>> vboxnet0: flags=3D8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 150=
0
>>>>>>> =09ether 0a:00:27:00:00:00
>>>>>>> DellT410one# netstat -rn
>>>>>>> Routing tables
>>>>>>>
>>>>>>> Internet:
>>>>>>> Destination        Gateway            Flags    Refs      Use  =20
>>>>>>> Netif Expire
>>>>>>> default            192.168.50.201     UGS         0    52195   bce0
>>>>>>> 127.0.0.1          link#11            UH          0        6    lo0
>>>>>>> 192.168.50.0/24    link#1             U           0  1118212   bce0
>>>>>>> 192.168.50.219     link#1             UHS         0     9670    lo0
>>>>>>> 192.168.50.220     link#1             UHS         0     8347    lo0
>>>>>>> 192.168.50.221     link#1             UHS         0   103024    lo0
>>>>>>> 192.168.50.223     link#1             UHS         0    43614    lo0
>>>>>>> 192.168.50.224     link#1             UHS         0     8358    lo0
>>>>>>> 192.168.50.225     link#1             UHS         0     8438    lo0
>>>>>>> 192.168.50.226     link#1             UHS         0     8338    lo0
>>>>>>> 192.168.50.227     link#1             UHS         0     8333    lo0
>>>>>>> 192.168.165.0/24   192.168.50.200     UGS         0     3311   bce0
>>>>>>> 192.168.166.0/24   192.168.50.200     UGS         0      699   bce0
>>>>>>> 192.168.167.0/24   192.168.50.200     UGS         0     3012   bce0
>>>>>>> 192.168.168.0/24   192.168.50.200     UGS         0      552   bce0
>>>>>>>
>>>>>>> Internet6:
>>>>>>> Destination                       Gateway
>>>>>>> Flags      Netif Expire
>>>>>>> ::1                               ::1                           UH
>>>>>>> lo0
>>>>>>> fe80::%lo0/64                     link#11                       U
>>>>>>> lo0
>>>>>>> fe80::1%lo0                       link#11                       UHS
>>>>>>> lo0
>>>>>>> ff01::%lo0/32                     fe80::1%lo0                   U
>>>>>>> lo0
>>>>>>> ff02::%lo0/32                     fe80::1%lo0                   U
>>>>>>> lo0
>>>>>>> DellT410one# kldstat
>>>>>>> Id Refs Address            Size     Name
>>>>>>> 1   19 0xffffffff80100000 dbf5d0   kernel
>>>>>>> 2    3 0xffffffff80ec0000 4c358    vboxdrv.ko
>>>>>>> 3    1 0xffffffff81012000 131998   zfs.ko
>>>>>>> 4    1 0xffffffff81144000 1ff1     opensolaris.ko
>>>>>>> 5    2 0xffffffff81146000 2940     vboxnetflt.ko
>>>>>>> 6    2 0xffffffff81149000 8e38     netgraph.ko
>>>>>>> 7    1 0xffffffff81152000 153c     ng_ether.ko
>>>>>>> 8    1 0xffffffff81154000 e70      vboxnetadp.ko
>>>>>>> DellT410one# pciconf -lv
>>>>>>> ..
>>>>>>> bce0@pci0:1:0:0:        class=3D0x020000 card=3D0x028d1028
>>>>>>> chip=3D0x163b14e4 rev=3D0x20 hdr=3D0x00
>>>>>>>    vendor     =3D 'Broadcom Corporation'
>>>>>>>    class      =3D network
>>>>>>>    subclass   =3D ethernet
>>>>>>> bce1@pci0:1:0:1:        class=3D0x020000 card=3D0x028d1028
>>>>>>> chip=3D0x163b14e4 rev=3D0x20 hdr=3D0x00
>>>>>>>    vendor     =3D 'Broadcom Corporation'
>>>>>>>    class      =3D network
>>>>>>>    subclass   =3D ethernet
>>>>>>
>>>>>> Could you please provide "pciconf -lvcb" output instead, specific to =
the
>>>>>> bce chips?  Thanks.
>>>>>
>>>>> Her it is:
>>>>>
>>>>> bce0@pci0:1:0:0:        class=3D0x020000 card=3D0x028d1028
>>>>> chip=3D0x163b14e4 rev=3D0x20 hdr=3D0x00
>>>>>    vendor     =3D 'Broadcom Corporation'
>>>>>    class      =3D network
>>>>>    subclass   =3D ethernet
>>>>>    bar   [10] =3D type Memory, range 64, base 0xda000000, size
>>>>> 33554432, enabled
>>>>>    cap 01[48] =3D powerspec 3  supports D0 D3  current D0
>>>>>    cap 03[50] =3D VPD
>>>>>    cap 05[58] =3D MSI supports 16 messages, 64 bit enabled with 1 mess=
age
>>>>>    cap 11[a0] =3D MSI-X supports 9 messages in map 0x10
>>>>>    cap 10[ac] =3D PCI-Express 2 endpoint max data 256(512) link x4(x4)
>>>>> ecap 0003[100] =3D Serial 1 842b2bfffe6864e4
>>>>> ecap 0001[110] =3D AER 1 0 fatal 0 non-fatal 1 corrected
>>>>> ecap 0004[150] =3D unknown 1
>>>>> ecap 0002[160] =3D VC 1 max VC0
>>>>
>>>> Thanks Peter.
>>>>
>>>> Adding Yong-Hyeon and David to the discussion, since they've both worke=
d
>>>> on the bce(4) driver in recent months (most of the changes made recentl=
y
>>>> are only in HEAD), and also adding Jack Vogel of Intel who maintains
>>>> em(4).  Brief history for the devs:
>>>>
>>>> The issue is described "Network memory allocation failures" and was
>>>> reported last year, but two users recently (Scott and Peter) have
>>>> reported the issue again:
>>>>
>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/thread=
.html#58708
>>>>
>>>> And was mentioned again by Scott here, which also contains some
>>>> technical details:
>>>>
>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2011-July/063172.html
>>>>
>>>> What's interesting is that Scott's issue is identical in form but he's
>>>> using em(4), which isn't known to behave like this.  Both individuals
>>>> are using VirtualBox, though we're not sure at this point if that is th=
e
>>>> piece which is causing the anomaly.
>>>>
>>>> Relevant details of Scott's system (em-based):
>>>>
>>>> http://www.cap-press.com/misc/
>>>>
>>>> Relevant details of Peter's system (bce-based):
>>>>
>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2011-July/063221.html
>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2011-July/063223.html
>>>>
>>>> I think the biggest complexity right now is figuring out how/why scp
>>>> fails intermittently in this nature.  The errno probably "trickles down=
"
>>>> to userland from the kernel, but the condition regarding why it happens
>>>> is unknown.
>>>
>>> BTW: I also saw 2 of the errors coming from a BIND9 running in a
>>> jail on that box.
>>>
>>> DellT410one# fgrep -i allocate /jails/bind/20110315/var/log/messages
>>> Apr 13 05:17:41 bind named[23534]: internal_send:
>>> 192.168.50.145#65176: Cannot allocate memory
>>> Jun 21 23:30:44 bind named[39864]: internal_send:
>>> 192.168.50.251#36155: Cannot allocate memory
>>> Jun 24 15:28:00 bind named[39864]: internal_send:
>>> 192.168.50.251#28651: Cannot allocate memory
>>> Jun 28 12:57:52 bind named[2462]: internal_send:
>>> 192.168.165.154#1201: Cannot allocate memory
>>>
>>> My initial guess: it happens sooner or later somehow - whether it is
>>> a lot of traffic in one go (ssh/scp copies of virtual disks) or a
>>> lot of traffic over a longer period (a nameserver gets asked again
>>> and again).
>>
>> Scott, are you also using jails?  If both of you are: is there any
>> possibility you can remove use of those?  I'm not sure how VirtualBox
>> fits into the picture (jails + VirtualBox that is), but I can imagine
>> jails having different environmental constraints that might cause this.
>>
>> Basically the troubleshooting process here is to remove pieces of the
>> puzzle until you figure out which piece is causing the issue.  I don't
>> want to get the NIC driver devs all spun up for something that, for
>> example, might be an issue with the jail implementation.
>
> I understand this. As said, I do some afterhours debugging tonight.
>
> The scp/ssh problems are happening _outside_ the jails. The bind =20
> runs _inside_ the jail.
>
> I wanted to use the _host_ system to send VirtualBox virtual disks =20
> and  filesystems used by jails to archive them and/or having them =20
> available on other FreeBSD systems (as a cold standby solution).

I just switched off the VirtualBox (without removing the kernel modules).

The copy succeeds now.

Well, it could be a VirtualBox related problem, or is the server just =20
relieved to have 2GB more memory at hands now?

Do you have a quick idea to "emulate" the 2GB memory load usually =20
delivered by VirtualBox?

Regards
Peter