From owner-freebsd-stable@FreeBSD.ORG Sun Jul 3 00:29:07 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E7569106566B for ; Sun, 3 Jul 2011 00:29:07 +0000 (UTC) (envelope-from kob6558@gmail.com) Received: from mail-gy0-f182.google.com (mail-gy0-f182.google.com [209.85.160.182]) by mx1.freebsd.org (Postfix) with ESMTP id A9FA28FC1E for ; Sun, 3 Jul 2011 00:29:07 +0000 (UTC) Received: by gyf3 with SMTP id 3so2151473gyf.13 for ; Sat, 02 Jul 2011 17:29:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:cc:content-type; bh=hb3lU0eyFYF/tF+HDQG4zrJmK+/64Oy9uHOHyZwfBVA=; b=DOQODU3Rz6nQWzhGJrEP73Xrsv3Y9DXb6AN0258Wiu5WmR/7Ih8DxzZHvex302TeEs N63l6WM23maEbJgTElFDHf3DQipgVSrEeBoQ4W8E1zt4xR1tG39Kb1fOwXsKBR/+1n0w cZ8u/ITi/bYO3GanRYiy26jQhq80n/GV3EHRI= MIME-Version: 1.0 Received: by 10.150.62.1 with SMTP id k1mr4388584yba.196.1309652946538; Sat, 02 Jul 2011 17:29:06 -0700 (PDT) Received: by 10.150.220.20 with HTTP; Sat, 2 Jul 2011 17:29:06 -0700 (PDT) Received: by 10.150.220.20 with HTTP; Sat, 2 Jul 2011 17:29:06 -0700 (PDT) Date: Sat, 2 Jul 2011 14:29:06 -1000 Message-ID: From: Kevin Oberman To: Zoran Kolic Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-stable@freebsd.org Subject: Re: dell latitude 13 X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 03 Jul 2011 00:29:08 -0000 On Jul 2, 2011 4:16 AM, "Zoran Kolic" wrote: > > Thanks for answering my question. > > > Since the Intel chip is also still unsupported by FreeBSD, you will be > > limited to VEDA support which is very limited. 
>
> I found that dell sells ubuntu on mentioned laptop in some parts
> of the globe. Further, my old laptop (HP nx9020) has intel chip
> and works under intel driver. I need simple 2d, no frills. Does
> newer chip run for plain graphics at all? No acceleration? Fine.

Linux has the required GEM/KMS support, but FreeBSD does not. It is
being worked on, but I don't think it's near ready.

> I don't use it neither on my NVIDIA GeForce 6200 card with "nv"
> driver.

The Optimus is a GPU designed in conjunction with Intel Sandy Bridge
CPUs which have integrated Intel 5000 graphics. Optimus acts as an
external accelerator using the Intel frame buffer. It will not work
with any NVIDIA driver; nv, the NVIDIA binary driver, or the nouveau
driver. NVIDIA has been clear that there is no plan to ever support
Optimus on non-Windows systems.

> > For laptops you may check the following site :
> > http://laptop.bsdgroup.de/freebsd/
>
> Old and no newer machines on the site.
>
> > Also check FreeBSD Hardware lists such as
> > ftp://ftp.freebsd.org/pub/FreeBSD/releases/amd64/8.2-RELEASE/HARDWARE.HTM
>
> Intel 5100 wifi should work with "device iwn5000fw" or "5150"?
>
> > It is NOT possible to know in advance without any prior knowledge whether a
> > computer is working with FreeDOS will work with FreeBSD because these are
> > very different systems .
>
> It is sold with freedos to be cheap enough. I appreciate that.
>
> > The producer/seller can install FreeBSD or Linux without cost
>
> Not where I live. When I bought my previous laptop, live cd was the
> guide if it worked at all. It is a matter of "buy it as it is" now.
> More, no cd on latitude 13, which is fine, since it keeps it at
> 3 lb.
>
> My two fears are acpi and graphics 4500mhd. I found links that show
> 4500 supported.
> http://freebsd.1045724.n5.nabble.com/subnotebooks-once-again-td4197868.html > If the list thinks I should wait, it is a metter of adding the code > as on this link: > http://forums.freebsd.org/archive/index.php/t-21852.html If it is a 4500, you will have at least minimal graphics support. You should be able to disable the NVIDIA daughter card on BIOS, but I would not want to guarantee it. Good luck! R. Kevin Oberman, Network Engineer Retired kob6558@gmail.com From owner-freebsd-stable@FreeBSD.ORG Sun Jul 3 03:50:36 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B51ED106564A for ; Sun, 3 Jul 2011 03:50:36 +0000 (UTC) (envelope-from cscotts@gmail.com) Received: from mail-vx0-f182.google.com (mail-vx0-f182.google.com [209.85.220.182]) by mx1.freebsd.org (Postfix) with ESMTP id 6AF558FC16 for ; Sun, 3 Jul 2011 03:50:36 +0000 (UTC) Received: by vxg33 with SMTP id 33so4250422vxg.13 for ; Sat, 02 Jul 2011 20:50:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=subject:mime-version:content-type:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to:x-mailer; bh=tZVaDOJhSxffSb6MiupggrnXBLuGkZTJ8jeeAEXedsc=; b=w4E1PluDcWDJpI6PoytVZQfhy9NyGs3niTI2iev3fqs7GFYiU8A1QRy2nKEr4qQ93d cqLgoXORI152llOeVSY/izZhMaRn0b3I7i1QqZIBTjJn7CjE9Ig/YlDfAQ8AK1U5yABg MB+3w9VWC4ts85zeAN+NeWtyMaD4P47oUOrN0= Received: by 10.220.176.3 with SMTP id bc3mr1887331vcb.26.1309665035472; Sat, 02 Jul 2011 20:50:35 -0700 (PDT) Received: from sahibkuran.kjlms (user-0c2hi2t.cable.mindspring.com [24.40.200.93]) by mx.google.com with ESMTPS id bh5sm92931vcb.27.2011.07.02.20.50.33 (version=TLSv1/SSLv3 cipher=OTHER); Sat, 02 Jul 2011 20:50:34 -0700 (PDT) Mime-Version: 1.0 (Apple Message framework v1084) Content-Type: text/plain; charset=us-ascii From: Scott Sipe In-Reply-To: <20110702045435.GA81502@DataIX.net> Date: Sat, 2 
Jul 2011 23:50:32 -0400
Content-Transfer-Encoding: quoted-printable
Message-Id: <54D65EC5-9A9B-4F96-BB45-1904F2147CBA@gmail.com>
References: <20110701222232.GA33935@icarus.home.lan> <20110702045435.GA81502@DataIX.net>
To: jhell , Jeremy Chadwick
X-Mailer: Apple Mail (2.1084)
Cc: freebsd-stable List
Subject: Re: scp: Write Failed: Cannot allocate memory
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 03 Jul 2011 03:50:36 -0000

On Jul 2, 2011, at 12:54 AM, jhell wrote:

> On Fri, Jul 01, 2011 at 03:22:32PM -0700, Jeremy Chadwick wrote:
>> On Fri, Jul 01, 2011 at 03:13:17PM -0400, Scott Sipe wrote:
>>> I'm running 8.2-RELEASE and am having new problems with scp. When scping
>>> files to a ZFS directory on the FreeBSD server -- most notably large files
>>> -- the transfer frequently dies after just a few seconds. In my last test, I
>>> tried to scp an 800mb file to the FreeBSD system and the transfer died after
>>> 200mb. It completely copied the next 4 times I tried, and then died again on
>>> the next attempt.
>>>
>>> On the client side:
>>>
>>> "Connection to home closed by remote host.
>>> lost connection"
>>>
>>> In /var/log/auth.log:
>>>
>>> Jul 1 14:54:42 freebsd sshd[18955]: fatal: Write failed: Cannot allocate
>>> memory
>>>
>>> I've never seen this before and have used scp before to transfer large files
>>> without problems. This computer has been used in production for months and
>>> has a current uptime of 36 days. I have not been able to notice any problems
>>> copying files to the server via samba or netatalk, or any problems in
>>> apache.
>>>
>>> Uname:
>>>
>>> FreeBSD xeon 8.2-RELEASE FreeBSD 8.2-RELEASE #0: Sat Feb 19 01:02:54 EST
>>> 2011 root@xeon:/usr/obj/usr/src/sys/GENERIC amd64
>>>
>>> I've attached my dmesg and output of vmstat -z.
>>>
>>> I have not restarted the sshd daemon or rebooted the computer.
>>>
>>> Am glad to provide any other information or test anything else.
>>>
>>> {snip vmstat -z and dmesg}
>>
>> You didn't provide details about your networking setup (rc.conf,
>> ifconfig -a, etc.). netstat -m would be useful too.
>>
>> Next, please see this thread circa September 2010, titled "Network
>> memory allocation failures":
>>
>> http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/thread.html#58708
>>
>> The user in that thread is using rsync, which relies on scp by default.
>> I believe this problem is similar, if not identical, to yours.
>>
>
> Please also provide your output of ( /usr/bin/limits -a ) for the server
> end and the client.
>
> I am not quite sure I agree with the need for ifconfig -a but some
> information about the networking driver you're using for the interface
> would be helpful, uptime of the boxes. And configuration of the pool.
> e.g. ( zpool status -a ;zfs get all ) You should probably
> prop this information up somewhere so you can reference by URL whenever
> needed.
>
> rsync(1) does not rely on scp(1) whatsoever but rsync(1) can be made to
> use ssh(1) instead of rsh(1) and I believe that is what Jeremy is
> stating here but correct me if I am wrong. It does use ssh(1) by
> default.
>
> It's a possibility as well that if using tmpfs(5) or mdmfs(8) for /tmp
> type filesystems that rsync(1) may be just filling up your temp ram area
> and causing the connection abort which would be expected. ( df -h ) would
> help here.

Hello,

I'm not using tmpfs/mdmfs at all. The clients yesterday were 3 different
OSX computers (over gigabit). The FreeBSD server has 12gb of ram and no
bce adapter. For what it's worth, the server is backed up remotely every
night with rsync (remote FreeBSD uses rsync to pull) to an offsite (slow
cable connection) FreeBSD computer, and I have not seen any errors in
the nightly rsync.

Sorry for the omission of networking info, here's the output of the
requested commands and some that popped up in the other thread:

http://www.cap-press.com/misc/

In rc.conf:
ifconfig_em1="inet 10.1.1.1 netmask 255.255.0.0"

Scott

From owner-freebsd-stable@FreeBSD.ORG Sun Jul 3 09:15:52 2011
Return-Path:
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 60735106564A; Sun, 3 Jul 2011 09:15:52 +0000 (UTC) (envelope-from ohartman@zedat.fu-berlin.de)
Received: from outpost1.zedat.fu-berlin.de (outpost1.zedat.fu-berlin.de [130.133.4.66]) by mx1.freebsd.org (Postfix) with ESMTP id 1C1808FC08; Sun, 3 Jul 2011 09:15:52 +0000 (UTC)
Received: from inpost2.zedat.fu-berlin.de ([130.133.4.69]) by outpost1.zedat.fu-berlin.de (Exim 4.69) with esmtp (envelope-from ) id <1QdImN-0006YT-3X>; Sun, 03 Jul 2011 11:15:51 +0200
Received: from e178033186.adsl.alicedsl.de ([85.178.33.186] helo=thor.walstatt.dyndns.org) by inpost2.zedat.fu-berlin.de (Exim 4.69) with esmtpsa (envelope-from ) id <1QdImN-0003rn-0i>; Sun, 03 Jul 2011 11:15:51 +0200
Message-ID: <4E103346.1060206@zedat.fu-berlin.de>
Date: Sun, 03 Jul 2011 11:15:50 +0200
From: "Hartmann, O."
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:5.0) Gecko/20110630 Thunderbird/5.0 MIME-Version: 1.0 To: Alexander Kabaev References: <4E0EC86E.9050608@zedat.fu-berlin.de> <20110702144513.5c5b9f75@kan.dnsalias.net> In-Reply-To: <20110702144513.5c5b9f75@kan.dnsalias.net> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Originating-IP: 85.178.33.186 Cc: FreeBSD Current , FreeBSD Stable Subject: Re: devel/subversion: svn: Couldn't perform atomic initialization X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 03 Jul 2011 09:15:52 -0000 On 07/02/11 20:45, Alexander Kabaev wrote: > On Sat, 02 Jul 2011 09:27:42 +0200 > "Hartmann, O." wrote: > >> Hello. >> Since two days now I realize on several recently ports-updated >> servers a failure of the subversion server running on those servers. >> Sneaking around the internet I found several issues exactly targeting >> this error with an sqlite 3.7.7/3.7.7.1 issue, which has been fixed >> in sqlite-3.7.7.2. At this very moment, our subversion servers in >> question has all recently been updated and it seems, they all fail >> the same way. Does anyone also realize this behaviour shown below >> when commiting? >> >> Is there a workaround? Any help or hint is appreciated. >> >> Thanks in advance, >> Oliver >> >> Transmitting file data .svn: Commit failed (details follow): >> svn: Couldn't perform atomic initialization >> svn: database schema has changed >> svn: Your commit message was left in a temporary file: >> > Update database/sqlite3 port to the 3.7.7.1 version committed today. > Done - and it works fine. Thanks. But why 3.7.7.1 and not 3.7.7.2? Thanks, anyway. 
Regards, Oliver From owner-freebsd-stable@FreeBSD.ORG Sun Jul 3 10:20:30 2011 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 46C7C106566B; Sun, 3 Jul 2011 10:20:30 +0000 (UTC) (envelope-from wjw@digiware.nl) Received: from mail.digiware.nl (mail.ip6.digiware.nl [IPv6:2001:4cb8:1:106::2]) by mx1.freebsd.org (Postfix) with ESMTP id 7FB398FC0A; Sun, 3 Jul 2011 10:20:29 +0000 (UTC) Received: from rack1.digiware.nl (localhost.digiware.nl [127.0.0.1]) by mail.digiware.nl (Postfix) with ESMTP id 17849153434; Sun, 3 Jul 2011 12:20:27 +0200 (CEST) X-Virus-Scanned: amavisd-new at digiware.nl Received: from mail.digiware.nl ([127.0.0.1]) by rack1.digiware.nl (rack1.digiware.nl [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id C+XH8oh6kJdz; Sun, 3 Jul 2011 12:20:22 +0200 (CEST) Received: from [IPv6:2001:4cb8:3:1:159e:ff44:b28c:3bee] (unknown [IPv6:2001:4cb8:3:1:159e:ff44:b28c:3bee]) by mail.digiware.nl (Postfix) with ESMTP id 9D6B1153445; Sun, 3 Jul 2011 01:56:27 +0200 (CEST) Message-ID: <4E0FB02B.60301@digiware.nl> Date: Sun, 03 Jul 2011 01:56:27 +0200 From: Willem Jan Withagen User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.18) Gecko/20110616 Thunderbird/3.1.11 MIME-Version: 1.0 To: =?UTF-8?B?6buD5riF6ZqG?= References: <4E0674ED.5090502@digiware.nl> <4E073982.8050601@sentex.net> In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Cc: stable@freebsd.org, delphij@freebsd.org Subject: Re: arcmsr panic runnig 8.2 of 2011 X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 03 Jul 2011 10:20:30 -0000 On 2011-06-27 4:49, 黃清隆 wrote: > Hi Mike, > Thanks for your bug report. 
> Please compile the new driver in attached zip file and try again. > Thanks, > Ching Hi Ching, So I did, and it did boot the server. However upon reboot it again paniced. This time in arcmsc.c: 1298 with: mtx_lock_sleep recursed on non-recursive mutex Is this enough for you to figure out what went wrong? Otherwise tell me how to diagnose the problem. Regards, --WjW From owner-freebsd-stable@FreeBSD.ORG Sun Jul 3 14:25:04 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1D8D51065672; Sun, 3 Jul 2011 14:25:04 +0000 (UTC) (envelope-from kabaev@gmail.com) Received: from mail-qy0-f182.google.com (mail-qy0-f182.google.com [209.85.216.182]) by mx1.freebsd.org (Postfix) with ESMTP id ACBE58FC20; Sun, 3 Jul 2011 14:25:03 +0000 (UTC) Received: by qyk38 with SMTP id 38so2958897qyk.13 for ; Sun, 03 Jul 2011 07:25:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=date:from:to:cc:subject:message-id:in-reply-to:references:x-mailer :mime-version:content-type; bh=iWSHp/NyNlrkvgOJx1Awql2e09jri+6EuR6aZdCNfDc=; b=YBkvSv4pioCZneP0K7QYfw23EArd7VVDsw48TN/IOBgUXz+s1/GWaw4gDJ6eyeizE6 KgS0NfNnn9Xn8nOAKWbPza6RM+WRK4o/DnJZD/4yRpuqx/T/w03j6lcq+P9YF62xNOQc SNH/U6wl2V72bUyl98ZnDDrZiS+2Dj/ABnp5k= Received: by 10.224.106.1 with SMTP id v1mr4103906qao.195.1309703102783; Sun, 03 Jul 2011 07:25:02 -0700 (PDT) Received: from kan.dnsalias.net (c-24-63-226-98.hsd1.ma.comcast.net [24.63.226.98]) by mx.google.com with ESMTPS id g7sm4064203qck.44.2011.07.03.07.25.01 (version=SSLv3 cipher=OTHER); Sun, 03 Jul 2011 07:25:01 -0700 (PDT) Date: Sun, 3 Jul 2011 10:24:54 -0400 From: Alexander Kabaev To: "Hartmann, O." 
Message-ID: <20110703102454.5c96cbc9@kan.dnsalias.net>
In-Reply-To: <4E103346.1060206@zedat.fu-berlin.de>
References: <4E0EC86E.9050608@zedat.fu-berlin.de> <20110702144513.5c5b9f75@kan.dnsalias.net> <4E103346.1060206@zedat.fu-berlin.de>
X-Mailer: Claws Mail 3.7.9 (GTK+ 2.22.1; amd64-portbld-freebsd9.0)
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=PGP-SHA1; boundary="Sig_/fOJJt/wm1fX=QF0oCDrMpXA"; protocol="application/pgp-signature"
Cc: FreeBSD Current , FreeBSD Stable
Subject: Re: devel/subversion: svn: Couldn't perform atomic initialization
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 03 Jul 2011 14:25:04 -0000

--Sig_/fOJJt/wm1fX=QF0oCDrMpXA
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: quoted-printable

On Sun, 03 Jul 2011 11:15:50 +0200
"Hartmann, O." wrote:

> On 07/02/11 20:45, Alexander Kabaev wrote:
> > On Sat, 02 Jul 2011 09:27:42 +0200
> > "Hartmann, O." wrote:
> >
> >>
> > Update database/sqlite3 port to the 3.7.7.1 version committed today.
> >
>
> Done - and it works fine. Thanks. But why
> 3.7.7.1 and not 3.7.7.2?
>
> Thanks, anyway.
>
> Regards,
> Oliver

No reason other than 3.7.7.1 appears to be the latest version currently
in ports.
--=20 Alexander Kabaev --Sig_/fOJJt/wm1fX=QF0oCDrMpXA Content-Type: application/pgp-signature; name=signature.asc Content-Disposition: attachment; filename=signature.asc -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.17 (FreeBSD) iD8DBQFOEHu8Q6z1jMm+XZYRAm/nAKDPVX66aw7R+uJ6i5QoH4iWdxHwMgCdE/IQ iXT/CNkExRWX++ltgFc+Zg4= =dPTt -----END PGP SIGNATURE----- --Sig_/fOJJt/wm1fX=QF0oCDrMpXA-- From owner-freebsd-stable@FreeBSD.ORG Sun Jul 3 15:55:13 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 964371065672; Sun, 3 Jul 2011 15:55:13 +0000 (UTC) (envelope-from to.my.trociny@gmail.com) Received: from mail-fx0-f44.google.com (mail-fx0-f44.google.com [209.85.161.44]) by mx1.freebsd.org (Postfix) with ESMTP id 68CCA8FC0A; Sun, 3 Jul 2011 15:55:11 +0000 (UTC) Received: by fxe6 with SMTP id 6so3726703fxe.17 for ; Sun, 03 Jul 2011 08:55:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=from:to:cc:subject:references:x-comment-to:sender:date:in-reply-to :message-id:user-agent:mime-version:content-type; bh=BHkEFEnVDPkt4PRx84BrgGWVGlDajdmIm9n9Gt8ZN+Y=; b=V+zgVwQ3coNeybuuILS6DrrtV00a3v0Ni32Ei3jzMHvriteQfMYBzOJpqp46mNZ0mm G2sENPyCWCRqrG8easO4F0Z5UD138Fl316Gz4MRsQ/Qzt/Fjklp7e+/AFv4tdK6NIvAd zDCx7aZ+kuey6knIRKSJWC1inhS/Ib/Be9Nz8= Received: by 10.223.55.8 with SMTP id s8mr8028415fag.141.1309708511134; Sun, 03 Jul 2011 08:55:11 -0700 (PDT) Received: from localhost ([95.69.173.122]) by mx.google.com with ESMTPS id k26sm3875893fak.0.2011.07.03.08.55.08 (version=TLSv1/SSLv3 cipher=OTHER); Sun, 03 Jul 2011 08:55:08 -0700 (PDT) From: Mikolaj Golub To: Timothy Smith References: <8639ioadji.fsf@kopusha.home.net> X-Comment-To: Timothy Smith Sender: Mikolaj Golub Date: Sun, 03 Jul 2011 18:55:06 +0300 In-Reply-To: (Timothy Smith's message of "Sat, 2 Jul 2011 14:43:15 -0700") Message-ID: <861uy7xsth.fsf@kopusha.home.net> 
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.2 (berkeley-unix)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Pawel Jakub Dawidek , freebsd-stable@freebsd.org
Subject: Re: HAST + ZFS: no action on drive failure
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 03 Jul 2011 15:55:13 -0000

On Sat, 2 Jul 2011 14:43:15 -0700 Timothy Smith wrote:

TS> Hello Mikolaj,

TS> So, just to be clear, if a local drive fails in my pool, but the
TS> corresponding remote drive remains available, then hastd will both write to
TS> and read from the remote drive? That's really very cool!

Yes.

TS> I looked more closely at the hastd(8) man page. There is some indication of
TS> what you say, but not so clear:

TS> "Read operations (BIO_READ) are handled locally unless I/O error occurs or local
TS> version of the data is not up-to-date yet (synchronization is in progress)."

This is about READ operations, and for WRITE we have just above:

  Every write, delete and flush operation (BIO_WRITE, BIO_DELETE, BIO_FLUSH)
  is send to local component and synchronously replicated to the remote
  (secondary) node if it is available.

There might be things that should be improved in documentation but I don't
feel capable to do this :-)

TS> Perhaps this can be modified a bit? Adding, "or the local disk is
TS> unavailable. In such a case, the I/O operation will be handled by the remote
TS> resource."

TS> It does make sense however, since HAST is based on the idea of raid. This
TS> feature increases the redundancy of the system greatly. My boss will be
TS> very impressed, as am I!
TS> I did notice however that when the pulled drive is reinserted, I need to
TS> change the associated hast resource to init, then back to primary to allow
TS> hastd to once again use it (perhaps the same if the secondary drive is
TS> failed?). Unless it will do this on its own after some time? I did not wait
TS> more than a few minutes. But this is easy enough to script or to monitor the
TS> log and present a notification to admin at such a time.

When you are reinserting the drive the resource should be in init state.
Remember, some data was updated on secondary only, so the right sequence of
operations could be:

1) Failover (switch primary to init and secondary to primary).

2) Fix the disk issue.

3) If this is a new drive, recreate HAST metadata on it with hastctl utility.

4) Switch the repaired resource to secondary and wait until the new primary
connects to it and updates metadata. After this, synchronization starts.

5) You can switch to the previous primary before the synchronization is
complete -- it will continue in right direction, but then you should expect
performance degradation until the synchronization is complete -- the READ
requests will go to remote node. So it might be better to wait until the
synchronization is complete before switching back.
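
The five steps above can be sketched as a small dry-run helper. This is only
an illustration under stated assumptions, not a tested procedure: "nodeA"
(the node whose disk failed) and "nodeB" (its peer) are hypothetical labels,
the resource name is whatever is defined in hast.conf, and the function only
prints the hastctl commands instead of running them. Exporting and importing
the ZFS pool around the role changes is not shown.

```sh
#!/bin/sh
# Dry-run sketch: print the hastctl commands for the recovery sequence.
# Nothing is executed; "nodeA"/"nodeB" are hypothetical node labels.
hast_recovery_plan() {
    res="$1"        # HAST resource name from hast.conf (placeholder, e.g. disk0)
    new_drive="$2"  # "yes" if the failed disk was replaced by a blank drive

    # 1) Failover: old primary goes to init, the peer becomes primary.
    echo "nodeA# hastctl role init $res"
    echo "nodeB# hastctl role primary $res"
    # 2) Fix the disk issue (manual step).
    echo "nodeA# (repair or replace the disk)"
    # 3) Only for a new drive: recreate the HAST metadata on it.
    if [ "$new_drive" = "yes" ]; then
        echo "nodeA# hastctl create $res"
    fi
    # 4) Rejoin as secondary; the new primary connects and synchronizes.
    echo "nodeA# hastctl role secondary $res"
    echo "nodeB# hastctl status $res"
    # 5) Switch back, preferably after synchronization is complete.
    echo "nodeB# hastctl role secondary $res"
    echo "nodeA# hastctl role primary $res"
}
```

For example, `hast_recovery_plan disk0 yes` prints the full sequence for a
replaced drive so that it can be reviewed before anything is typed on either
node.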
-- Mikolaj Golub From owner-freebsd-stable@FreeBSD.ORG Sun Jul 3 19:29:00 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 832191065672 for ; Sun, 3 Jul 2011 19:29:00 +0000 (UTC) (envelope-from marcus@freebsd.org) Received: from av-tac-rtp.cisco.com (hen.cisco.com [64.102.19.198]) by mx1.freebsd.org (Postfix) with ESMTP id 5BBD38FC08 for ; Sun, 3 Jul 2011 19:29:00 +0000 (UTC) X-TACSUNS: Virus Scanned Received: from rooster.cisco.com (localhost.cisco.com [127.0.0.1]) by av-tac-rtp.cisco.com (8.13.8+Sun/8.13.8) with ESMTP id p63JCPBn011497 for ; Sun, 3 Jul 2011 15:12:25 -0400 (EDT) Received: from fruit-rollup.marcuscom.com (jclarke-pc.cisco.com [172.18.254.236]) by rooster.cisco.com (8.13.8+Sun/8.13.8) with ESMTP id p63JCLm3007198 for ; Sun, 3 Jul 2011 15:12:21 -0400 (EDT) Message-ID: <4E10BF14.2090507@freebsd.org> Date: Sun, 03 Jul 2011 15:12:20 -0400 From: Joe Marcus Clarke Organization: FreeBSD, Inc. User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:5.0) Gecko/20110624 Thunderbird/5.0 MIME-Version: 1.0 To: freebsd-stable@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Subject: Unable to attach USB disks at boot time X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 03 Jul 2011 19:29:00 -0000 I have a VMware ESX 4.1 Update 1 server (underlying hardware is a Cisco UCS C210) to which I have connected two WD My Book 1130 drives. I have allocated both drives to my FreeBSD RELENG_8 VM (amd64). 
At boot time, I see: Root mount waiting for: usbus1 usb_alloc_device: set address 2 failed (USB_ERR_TIMEOUT, ignored) Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 usbd_req_re_enumerate: addr=2, set address failed! (USB_ERR_TIMEOUT, ignored) Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 usbd_req_re_enumerate: addr=2, set address failed! (USB_ERR_TIMEOUT, ignored) Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 ugen1.2: at usbus1 (disconnected) uhub_reattach_port: could not allocate new device Root mount waiting for: usbus1 Root mount waiting for: usbus1 usb_alloc_device: set address 2 failed (USB_ERR_TIMEOUT, ignored) Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 usbd_req_re_enumerate: addr=2, set address failed! 
(USB_ERR_TIMEOUT, ignored) Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 usbd_req_re_enumerate: addr=2, set address failed! (USB_ERR_TIMEOUT, ignored) Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 Root mount waiting for: usbus1 ugen1.2: at usbus1 (disconnected) uhub_reattach_port: could not allocate new device However, once FreeBSD is fully booted, I can unattach then reattach the drives (though VIC), and they attach just fine: ugen1.2: at usbus1 umass0: on usbus1 umass0: SCSI over Bulk-Only; quirks = 0x0000 umass0:1:0:-1: Attached to scbus1 da1 at umass-sim0 bus 0 scbus1 target 0 lun 0 da1: Fixed Direct Access SCSI-6 device da1: 40.000MB/s transfers da1: 1907697MB (3906963456 512 byte sectors: 255H 63S/T 243197C) ses0 at umass-sim0 bus 0 scbus1 target 0 lun 1 ses0: Fixed Enclosure Services SCSI-6 device ses0: 40.000MB/s transfers ses0: SCSI-3 SES Device ugen1.3: at usbus1 umass1: on usbus1 umass1: SCSI over Bulk-Only; quirks = 0x0000 umass1:2:1:-1: Attached to scbus2 da2 at umass-sim1 bus 1 scbus2 target 0 lun 0 da2: Fixed Direct Access SCSI-6 device da2: 40.000MB/s transfers da2: 1907697MB (3906963456 512 byte sectors: 255H 63S/T 243197C) ses1 at umass-sim1 bus 1 scbus2 target 0 lun 1 ses1: Fixed Enclosure Services SCSI-6 device ses1: 40.000MB/s transfers ses1: SCSI-3 SES Device I'm running FreeBSD RELENG_8 from Sat Jul 2 17:40:20 EDT 2011. I had an older Maxtor drive connected to this VM previously, and it was working fine. 
These WD drives are USB 3, but operating under USB 2 mode. Any advice? Thanks. Joe -- Joe Marcus Clarke FreeBSD GNOME Team :: gnome@FreeBSD.org FreeNode / #freebsd-gnome http://www.FreeBSD.org/gnome From owner-freebsd-stable@FreeBSD.ORG Mon Jul 4 00:38:40 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DD2D7106566C for ; Mon, 4 Jul 2011 00:38:40 +0000 (UTC) (envelope-from Peter.Ross@bogen.in-berlin.de) Received: from einhorn.in-berlin.de (einhorn.in-berlin.de [192.109.42.8]) by mx1.freebsd.org (Postfix) with ESMTP id 6510A8FC08 for ; Mon, 4 Jul 2011 00:38:40 +0000 (UTC) X-Envelope-From: Peter.Ross@bogen.in-berlin.de Received: from localhost (okapi.in-berlin.de [192.109.42.117]) by einhorn.in-berlin.de (8.13.6/8.13.6/Debian-1) with ESMTP id p6408jrI017071; Mon, 4 Jul 2011 02:08:45 +0200 Received: from 124-254-118-24-static.bb.ispone.net.au (124-254-118-24-static.bb.ispone.net.au [124.254.118.24]) by webmail.in-berlin.de (Horde Framework) with HTTP; Mon, 04 Jul 2011 10:08:45 +1000 Message-ID: <20110704100845.94513n3znbabpthp@webmail.in-berlin.de> Date: Mon, 04 Jul 2011 10:08:45 +1000 From: "Peter Ross" To: "Scott Sipe" References: <20110701222232.GA33935@icarus.home.lan> <20110702045435.GA81502@DataIX.net> <54D65EC5-9A9B-4F96-BB45-1904F2147CBA@gmail.com> In-Reply-To: <54D65EC5-9A9B-4F96-BB45-1904F2147CBA@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; DelSp="Yes"; format="flowed" Content-Disposition: inline Content-Transfer-Encoding: quoted-printable User-Agent: Internet Messaging Program (IMP) 4.3.3 X-Scanned-By: MIMEDefang_at_IN-Berlin_e.V. 
on 192.109.42.8 Cc: freebsd-stable List , Jeremy Chadwick Subject: Re: scp: Write Failed: Cannot allocate memory X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 Jul 2011 00:38:40 -0000 Hi all, I noticed a similar problem last week. It is also very similar to one =20 reported last year: http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/058708.html My server is a Dell T410 server with the same bge card (the same =20 pciconf -lvc output as described by Mahlon: http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/058711.html Yours, Scott, is a em(4).. Another similarity: In all cases we are using VirtualBox. I just want =20 to mention it, in case it matters. I am still running VirtualBox 3.2. Most of the time kstat.zfs.misc.arcstats.size was reaching =20 vfs.zfs.arc_max then, but I could catch one or two cases then the =20 value was still below. I added vfs.zfs.prefetch_disable=3D1 to sysctl.conf but it does not help. BTW: It looks as ARC only gives back the memory when I destroy the ZFS =20 (a cloned snapshot containing virtual machines). Even if nothing =20 happens for hours the buffer isn't released.. My machine was still running 8.2-PRERELEASE so I am upgrading. I am happy to give information gathered on old/new kernel if it helps. Regards Peter Quoting "Scott Sipe" : > > On Jul 2, 2011, at 12:54 AM, jhell wrote: > >> On Fri, Jul 01, 2011 at 03:22:32PM -0700, Jeremy Chadwick wrote: >>> On Fri, Jul 01, 2011 at 03:13:17PM -0400, Scott Sipe wrote: >>>> I'm running 8.2-RELEASE and am having new problems with scp. When scpin= g >>>> files to a ZFS directory on the FreeBSD server -- most notably large fi= les >>>> -- the transfer frequently dies after just a few seconds. 
In my >>>> last test, I >>>> tried to scp an 800mb file to the FreeBSD system and the transfer >>>> died after >>>> 200mb. It completely copied the next 4 times I tried, and then >>>> died again on >>>> the next attempt. >>>> >>>> On the client side: >>>> >>>> "Connection to home closed by remote host. >>>> lost connection" >>>> >>>> In /var/log/auth.log: >>>> >>>> Jul 1 14:54:42 freebsd sshd[18955]: fatal: Write failed: Cannot allocate >>>> memory >>>> >>>> I've never seen this before and have used scp before to transfer >>>> large files >>>> without problems. This computer has been used in production for months and >>>> has a current uptime of 36 days. I have not been able to notice >>>> any problems >>>> copying files to the server via samba or netatalk, or any problems in >>>> apache. >>>> >>>> Uname: >>>> >>>> FreeBSD xeon 8.2-RELEASE FreeBSD 8.2-RELEASE #0: Sat Feb 19 01:02:54 EST >>>> 2011 root@xeon:/usr/obj/usr/src/sys/GENERIC amd64 >>>> >>>> I've attached my dmesg and output of vmstat -z. >>>> >>>> I have not restarted the sshd daemon or rebooted the computer. >>>> >>>> Am glad to provide any other information or test anything else. >>>> >>>> {snip vmstat -z and dmesg} >>> >>> You didn't provide details about your networking setup (rc.conf, >>> ifconfig -a, etc.). netstat -m would be useful too. >>> >>> Next, please see this thread circa September 2010, titled "Network >>> memory allocation failures": >>> >>> http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/thread.html#58708 >>> >>> The user in that thread is using rsync, which relies on scp by default. >>> I believe this problem is similar, if not identical, to yours. >>> >> >> Please also provide your output of ( /usr/bin/limits -a ) for the server >> end and the client. >> >> I am not quite sure I agree with the need for ifconfig -a but some >> information about the networking driver you're using for the interface >> would be helpful, uptime of the boxes.
And configuration of the pool. >> e.g. ( zpool status -a ;zfs get all ) You should probably >> prop this information up somewhere so you can reference by URL whenever >> needed. >> >> rsync(1) does not rely on scp(1) whatsoever but rsync(1) can be made to >> use ssh(1) instead of rsh(1) and I believe that is what Jeremy is >> stating here but correct me if I am wrong. It does use ssh(1) by >> default. >> >> It's a possibility as well that if using tmpfs(5) or mdmfs(8) for /tmp >> type filesystems that rsync(1) may be just filling up your temp ram area >> and causing the connection abort which would be expected. ( df -h ) would >> help here. > > Hello, > > I'm not using tmpfs/mdmfs at all. The clients yesterday were 3 > different OSX computers (over gigabit). The FreeBSD server has 12gb > of ram and no bce adapter. For what it's worth, the server is backed > up remotely every night with rsync (remote FreeBSD uses rsync to > pull) to an offsite (slow cable connection) FreeBSD computer, and I > have not seen any errors in the nightly rsync.
> > Sorry for the omission of networking info, here's the output of the > requested commands and some that popped up in the other thread: > > http://www.cap-press.com/misc/ > > In rc.conf: ifconfig_em1="inet 10.1.1.1 netmask 255.255.0.0" > > Scott > > _______________________________________________ > freebsd-stable@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-stable > To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org" > From owner-freebsd-stable@FreeBSD.ORG Mon Jul 4 08:52:11 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id ABFD6106564A; Mon, 4 Jul 2011 08:52:11 +0000 (UTC) (envelope-from vermaden@interia.pl) Received: from smtpo.poczta.interia.pl (smtpo.poczta.interia.pl [217.74.65.206]) by mx1.freebsd.org (Postfix) with ESMTP id 0BC648FC17; Mon, 4 Jul 2011 08:52:10 +0000 (UTC) Date: Mon, 04 Jul 2011 10:35:28 +0200 From: vermaden To: freebsd-current@freebsd.org X-Mailer: interia.pl/pf09 X-Originating-IP: 194.0.181.128 Message-Id: MIME-Version: 1.0 X-EMID: aeb9e484 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Cc: freebsd-stable@freebsd.org, freebsd-questions@freebsd.org Subject: 8.2-STABLE: audio stopped working properly after upgrade to today's sources X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 Jul 2011 08:52:11 -0000 Hi, I have just upgraded to 8.2-STABLE (sources from today) and now my audio does not work as it should: everything is very, very quiet, even at levels 100/100 for PCM/VOL in mixer. Also, when I plug in the headphones they stay silent and sound still plays on the speakers. Below are some details of my hardware.
I did not make any modifications to the GENERIC config, just built it 'as is'. Thanks in advance for any help, vermaden Generally: Dell Latitude E6400 (laptop) % uname -a FreeBSD e6400 8.2-STABLE FreeBSD 8.2-STABLE #0: Mon Jul 4 09:34:04 CEST 2011 root@e6400:/usr/obj/usr/src/sys/GENERIC amd64 % cat /dev/sndstat FreeBSD Audio Driver (newpcm: 64bit 2009061500/amd64) Installed devices: pcm0: (play/rec) default pcm1: (play) pcm2: (play) pcm3: (play) % mixer Mixer vol is currently set to 100:100 Mixer bass is currently set to 50:50 Mixer treble is currently set to 50:50 Mixer pcm is currently set to 100:100 Mixer mic is currently set to 0:0 Recording source: mic % pciconf -lv hostb0@pci0:0:0:0: class=0x060000 card=0x02331028 chip=0x2a408086 rev=0x07 hdr=0x00 vendor = 'Intel Corporation' device = 'Mobile Memory Controller Hub' class = bridge subclass = HOST-PCI vgapci0@pci0:0:2:0: class=0x030000 card=0x02331028 chip=0x2a428086 rev=0x07 hdr=0x00 vendor = 'Intel Corporation' device = 'Intel Mobile Graphic (Mobile Intel 4 Series Chipset Family)' class = display subclass = VGA vgapci1@pci0:0:2:1: class=0x038000 card=0x02331028 chip=0x2a438086 rev=0x07 hdr=0x00 vendor = 'Intel Corporation' device = 'Intel Mobile Graphic (Mobile Intel 4 Series Chipset Family)' class = display em0@pci0:0:25:0: class=0x020000 card=0x02331028 chip=0x10f58086 rev=0x03 hdr=0x00 vendor = 'Intel Corporation' device = 'Intel 82567LM-2 Gigabit Network Connection (82567LM)' class = network subclass = ethernet uhci0@pci0:0:26:0: class=0x0c0300 card=0x02331028 chip=0x29378086 rev=0x03 hdr=0x00 vendor = 'Intel Corporation' device = '82801IB/IR/IH (ICH9 Family) USB Universal Host Controller' class = serial bus subclass = USB uhci1@pci0:0:26:1: class=0x0c0300 card=0x02331028 chip=0x29388086 rev=0x03 hdr=0x00 vendor = 'Intel Corporation' device = '82801IB/IR/IH (ICH9 Family) USB Universal Host
Controller' class = serial bus subclass = USB uhci2@pci0:0:26:2: class=0x0c0300 card=0x02331028 chip=0x29398086 rev=0x03 hdr=0x00 vendor = 'Intel Corporation' device = '82801IB/IR/IH (ICH9 Family) USB Universal Host Controller' class = serial bus subclass = USB ehci0@pci0:0:26:7: class=0x0c0320 card=0x02331028 chip=0x293c8086 rev=0x03 hdr=0x00 vendor = 'Intel Corporation' device = '82801IB/IR/IH (ICH9 Family) USB2 Enhanced Host Controller' class = serial bus subclass = USB hdac0@pci0:0:27:0: class=0x040300 card=0x02331028 chip=0x293e8086 rev=0x03 hdr=0x00 vendor = 'Intel Corporation' device = '82801IB/IR/IH (ICH9 Family) HD Audio Controller' class = multimedia subclass = HDA pcib1@pci0:0:28:0: class=0x060400 card=0x02331028 chip=0x29408086 rev=0x03 hdr=0x01 vendor = 'Intel Corporation' device = '82801IB/IR/IH (ICH9 Family) PCIe Root Port 1' class = bridge subclass = PCI-PCI pcib2@pci0:0:28:1: class=0x060400 card=0x02331028 chip=0x29428086 rev=0x03 hdr=0x01 vendor = 'Intel Corporation' device = '82801IB/IR/IH (ICH9 Family) PCIe Root Port 2' class = bridge subclass = PCI-PCI pcib3@pci0:0:28:2: class=0x060400 card=0x02331028 chip=0x29448086 rev=0x03 hdr=0x01 vendor = 'Intel Corporation' device = '82801IB/IR/IH (ICH9 Family) PCIe Root Port 3' class = bridge subclass = PCI-PCI uhci3@pci0:0:29:0: class=0x0c0300 card=0x02331028 chip=0x29348086 rev=0x03 hdr=0x00 vendor = 'Intel Corporation' device = '82801IB/IR/IH (ICH9 Family) USB Universal Host Controller' class = serial bus subclass = USB uhci4@pci0:0:29:1: class=0x0c0300 card=0x02331028 chip=0x29358086 rev=0x03 hdr=0x00 vendor = 'Intel Corporation' device = '82801IB/IR/IH (ICH9 Family) USB Universal Host Controller' class = serial bus subclass = USB uhci5@pci0:0:29:2: class=0x0c0300 card=0x02331028 chip=0x29368086 rev=0x03 hdr=0x00 vendor =
'Intel Corporation' device = '82801IB/IR/IH (ICH9 Family) USB Universal Host Controller' class = serial bus subclass = USB ehci1@pci0:0:29:7: class=0x0c0320 card=0x02331028 chip=0x293a8086 rev=0x03 hdr=0x00 vendor = 'Intel Corporation' device = '82801IB/IR/IH (ICH9 Family) USB2 Enhanced Host Controller' class = serial bus subclass = USB pcib4@pci0:0:30:0: class=0x060401 card=0x02331028 chip=0x24488086 rev=0x93 hdr=0x01 vendor = 'Intel Corporation' device = '82801 Family (ICH2/3/4/5/6/7/8/9-M) Hub Interface to PCI Bridge' class = bridge subclass = PCI-PCI isab0@pci0:0:31:0: class=0x060100 card=0x02331028 chip=0x29178086 rev=0x03 hdr=0x00 vendor = 'Intel Corporation' device = 'ICH9M-E LPC Interface Controller' class = bridge subclass = PCI-ISA ahci0@pci0:0:31:2: class=0x010601 card=0x02331028 chip=0x29298086 rev=0x03 hdr=0x00 vendor = 'Intel Corporation' device = '82801IB/IR/IH (ICH9 Family) Mobile SATA AHCI Controller' class = mass storage subclass = SATA none0@pci0:0:31:3: class=0x0c0500 card=0x02331028 chip=0x29308086 rev=0x03 hdr=0x00 vendor = 'Intel Corporation' device = 'Intel(R) ICH9 Family SMBus Controller working fine with http://download.cnet.com/Chipset-Driver-Inte (8086)' class = serial bus subclass = SMBus iwn0@pci0:12:0:0: class=0x028000 card=0x11218086 chip=0x42358086 rev=0x00 hdr=0x00 vendor = 'Intel Corporation' device = 'Intel WiFi Link 5300 AGN (5300AGN)' class = network From owner-freebsd-stable@FreeBSD.ORG Mon Jul 4 08:52:14 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E84FD106566B for ; Mon, 4 Jul 2011 08:52:14 +0000 (UTC) (envelope-from
vermaden@interia.pl) Received: from smtpo.poczta.interia.pl (smtpo.poczta.interia.pl [217.74.65.206]) by mx1.freebsd.org (Postfix) with ESMTP id A96E78FC08 for ; Mon, 4 Jul 2011 08:52:14 +0000 (UTC) Date: Mon, 04 Jul 2011 10:33:09 +0200 From: vermaden To: zkolic@sbb.rs X-Mailer: interia.pl/pf09 X-Originating-IP: 194.0.181.128 Message-Id: MIME-Version: 1.0 X-EMID: 8b4ae3f8 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit Cc: freebsd-stable@freebsd.org Subject: dell latitude 13 X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 Jul 2011 08:52:15 -0000 > Intel Core#2 Duo SU7300 1.3GHz, 3MB L2 cache, FSB 800MHz > Intel GS45 Express + Intel ICH9M-Enhanced (chipset) > Intel GMA 4500 MHD (graphics) > Intel Link 5100 IEEE 802.11a/b/g/n WiFi (wifi) Hi, I have a laptop with a GM45 chipset and GMA 4500 MHD graphics, plus a WiFi Link 5100 card, and all that hardware works like a charm, with the i915.ko driver for the GMA 4500 and if_iwn.ko for the wireless 5100 chip. Regards, vermaden
From owner-freebsd-stable@FreeBSD.ORG Mon Jul 4 08:54:41 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 431C0106566C for ; Mon, 4 Jul 2011 08:54:41 +0000 (UTC) (envelope-from hans@beastielabs.net) Received: from mail.beastielabs.net (beasties.demon.nl [82.161.3.114]) by mx1.freebsd.org (Postfix) with ESMTP id C3E738FC12 for ; Mon, 4 Jul 2011 08:54:40 +0000 (UTC) Received: from testsoekris.hotsoft.nl (localhost [127.0.0.1]) by mail.beastielabs.net (8.14.4/8.14.4) with ESMTP id p648sc8E018178 for ; Mon, 4 Jul 2011 10:54:38 +0200 (CEST) (envelope-from hans@testsoekris.hotsoft.nl) Received: (from hans@localhost) by testsoekris.hotsoft.nl (8.14.4/8.14.4/Submit) id p648scTD018177 for freebsd-stable@freebsd.org; Mon, 4 Jul 2011 10:54:38 +0200 (CEST) (envelope-from hans) Date: Mon, 4 Jul 2011 10:54:38 +0200 From: Hans Ottevanger To: freebsd-stable@freebsd.org Message-ID: <20110704085438.GA93119@testsoekris.hotsoft.nl> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.2.3i Subject: NFS related include files and make delete-old X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 Jul 2011 08:54:41 -0000 Hi, For a few months now, during the usual make delete-old after make installworld the files /usr/include/nfs/krpc.h and /usr/include/nfs/nfsdiskless.h turn up time and again. I have them deleted, but they get reinstalled during the next make installworld. This is a fairly old installation, but running an up-to-date 8.2-STABLE and these header files are also present in the directory /usr/include/nfsclient.
Could it be that either the wrong files are specified in /usr/src/ObsoleteFiles.inc or the headers are installed in the wrong directory during make installworld? On my 9.0-CURRENT systems I also have the headers at both locations, but there only those in /usr/include/nfsclient get reinstalled and there is no entry in /usr/src/ObsoleteFiles.inc. Kind regards, Hans Ottevanger From owner-freebsd-stable@FreeBSD.ORG Mon Jul 4 09:16:01 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6CC1B106564A for ; Mon, 4 Jul 2011 09:16:01 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta05.westchester.pa.mail.comcast.net (qmta05.westchester.pa.mail.comcast.net [76.96.62.48]) by mx1.freebsd.org (Postfix) with ESMTP id 19F418FC12 for ; Mon, 4 Jul 2011 09:16:00 +0000 (UTC) Received: from omta16.westchester.pa.mail.comcast.net ([76.96.62.88]) by qmta05.westchester.pa.mail.comcast.net with comcast id 3l0F1h0021uE5Es55lG15Z; Mon, 04 Jul 2011 09:16:01 +0000 Received: from koitsu.dyndns.org ([67.180.84.87]) by omta16.westchester.pa.mail.comcast.net with comcast id 3lFz1h00P1t3BNj3clG024; Mon, 04 Jul 2011 09:16:01 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 61816102C36; Mon, 4 Jul 2011 02:15:58 -0700 (PDT) Date: Mon, 4 Jul 2011 02:15:58 -0700 From: Jeremy Chadwick To: vermaden Message-ID: <20110704091558.GA32271@icarus.home.lan> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-stable@freebsd.org Subject: Re: 8.2-STABLE: audio stopped working properly after upgrade to today's sources X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 
Jul 2011 09:16:01 -0000 On Mon, Jul 04, 2011 at 10:35:28AM +0200, vermaden wrote: > I have just upgraded to 8.2-STABLE (sources from today) and now my > audio does not work as it should, its all very, very quiet, even at levels > 100/100 for PCM/VOL with mixer, also when I plug in the headphones > they are deaf and sound still plays on the speakers, below are some > details of my hardware. > > I did not done any modifications to GENERIC config, just build it 'as > is'. > > {snip} You're going to need to disclose what source and build date your previous kernel/system was, or else it'll be extremely difficult to tell what commit may have caused your problem. Furthermore, you cross-posted this to -stable, -questions, as well as -current. I don't know why you picked -current; 8.2-STABLE isn't HEAD, and as such I've removed it from the CC list, as well as -questions since I imagine more people follow -stable. -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, US | | Making life hard for others since 1977. 
PGP 4BD6C0CB | From owner-freebsd-stable@FreeBSD.ORG Mon Jul 4 09:17:08 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CA5421065679 for ; Mon, 4 Jul 2011 09:17:08 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta13.westchester.pa.mail.comcast.net (qmta13.westchester.pa.mail.comcast.net [76.96.59.243]) by mx1.freebsd.org (Postfix) with ESMTP id 76DDF8FC17 for ; Mon, 4 Jul 2011 09:17:08 +0000 (UTC) Received: from omta19.westchester.pa.mail.comcast.net ([76.96.62.98]) by qmta13.westchester.pa.mail.comcast.net with comcast id 3lFX1h00227AodY5DlH8CE; Mon, 04 Jul 2011 09:17:08 +0000 Received: from koitsu.dyndns.org ([67.180.84.87]) by omta19.westchester.pa.mail.comcast.net with comcast id 3lH61h0021t3BNj3flH71q; Mon, 04 Jul 2011 09:17:08 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id AA8A6102C36; Mon, 4 Jul 2011 02:17:05 -0700 (PDT) Date: Mon, 4 Jul 2011 02:17:05 -0700 From: Jeremy Chadwick To: Hans Ottevanger Message-ID: <20110704091705.GA32332@icarus.home.lan> References: <20110704085438.GA93119@testsoekris.hotsoft.nl> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20110704085438.GA93119@testsoekris.hotsoft.nl> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: Rick Macklem , freebsd-stable@freebsd.org Subject: Re: NFS related include files and make delete-old X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 Jul 2011 09:17:08 -0000 On Mon, Jul 04, 2011 at 10:54:38AM +0200, Hans Ottevanger wrote: > For a few months now, during the usual make delete-old after > make installworld the files > > /usr/include/nfs/krpc.h > > and > > /usr/include/nfs/nfsdiskless.h > > turn up time 
and again. I have them deleted, but they get reinstalled > during the next make installworld. This is a fairly old installation, > but running an up-to-date 8.2-STABLE and these header files are also > present in the directory /usr/include/nfsclient. > > Could it be that either the wrong files are specified in > /usr/src/ObsoleteFiles.inc or the headers are installed in the wrong > directory during make installworld? I can confirm this problem on many (6-7) different systems. It's specific to RELENG_8. Rick, any insights? -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, US | | Making life hard for others since 1977. PGP 4BD6C0CB | From owner-freebsd-stable@FreeBSD.ORG Mon Jul 4 09:22:16 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1E228106566B for ; Mon, 4 Jul 2011 09:22:16 +0000 (UTC) (envelope-from vermaden@interia.pl) Received: from smtpo.poczta.interia.pl (smtpo.poczta.interia.pl [217.74.65.206]) by mx1.freebsd.org (Postfix) with ESMTP id 855AE8FC16 for ; Mon, 4 Jul 2011 09:22:15 +0000 (UTC) Date: Mon, 04 Jul 2011 11:22:14 +0200 From: vermaden To: Jeremy Chadwick X-Mailer: interia.pl/pf09 In-Reply-To: <20110704091558.GA32271@icarus.home.lan> References: <20110704091558.GA32271@icarus.home.lan> X-Originating-IP: 194.0.181.128 Message-Id: MIME-Version: 1.0 X-EMID: a632d846 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Cc: freebsd-stable@freebsd.org Subject: Re: 8.2-STABLE: audio stopped working properly after upgrade to today's sources X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 Jul 2011 09:22:16 -0000 I have found a 
solution, here: http://forums.freebsd.org/showpost.php?p=73828&postcount=3 To be precise, this solved my problem: # sysctl hw.snd.default_unit=1 Sorry for the needless CCs. Regards, vermaden "Jeremy Chadwick" wrote: > On Mon, Jul 04, 2011 at 10:35:28AM +0200, vermaden wrote: > > I have just upgraded to 8.2-STABLE (sources from today) and now my > > audio does not work as it should, its all very, very quiet, even at levels > > 100/100 for PCM/VOL with mixer, also when I plug in the headphones > > they are deaf and sound still plays on the speakers, below are some > > details of my hardware. > > > > I did not done any modifications to GENERIC config, just build it 'as > > is'. > > > > {snip} > > You're going to need to disclose what source and build date your > previous kernel/system was, or else it'll be extremely difficult to tell > what commit may have caused your problem. > > Furthermore, you cross-posted this to -stable, -questions, as well as > -current. I don't know why you picked -current; 8.2-STABLE isn't HEAD, > and as such I've removed it from the CC list, as well as -questions > since I imagine more people follow -stable. > > -- > | Jeremy Chadwick jdc at parodius.com | > | Parodius Networking http://www.parodius.com/ | > | UNIX Systems Administrator Mountain View, CA, US | > | Making life hard for others since 1977. PGP 4BD6C0CB | > >
From owner-freebsd-stable@FreeBSD.ORG Mon Jul 4 12:18:59 2011 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4356B1065686 for ; Mon, 4 Jul 2011 12:18:59 +0000 (UTC) (envelope-from astralblue@gmail.com) Received: from mail-yi0-f54.google.com (mail-yi0-f54.google.com [209.85.218.54]) by mx1.freebsd.org (Postfix) with ESMTP id 09FEC8FC08 for ; Mon, 4 Jul 2011 12:18:58 +0000 (UTC) Received: by yic13 with SMTP id 13so559741yic.13 for ; Mon, 04 Jul 2011 05:18:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:content-type; bh=rPn824ZGpCvLZfMf4j677gza2RbbnJPwiVycBnS9zhM=; b=jigSjDjtyyiLsJOea4ua6bVLQh16t7fg0bezbW3xB7ZK804rB3ouu53HPECmbCHXSG JUJeHJkZa0SISTUc7zGRkAv2whjXL5/Bpcj9CyYnkFpHoPy/8Q674HNyIObpXEGm3yN7 BSQM0z5x0XZlfvOJFzh7HX2byZaThLZKPm0q4= MIME-Version: 1.0 Received: by 10.147.96.9 with SMTP id y9mr26717yal.1.1309780430468; Mon, 04 Jul 2011 04:53:50 -0700 (PDT) Received: by 10.147.99.10 with HTTP; Mon, 4 Jul 2011 04:53:50 -0700 (PDT) Date: Mon, 4 Jul 2011 04:53:50 -0700 Message-ID: From: Eugene Kim To: stable@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 Cc: Subject: Request for MFC r215299: Echoing asterisks for GELI passphrase X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 Jul 2011 12:18:59 -0000 Greetings, Could someone please MFC r215299? This commit enhances a workaround for a long-standing bug (kern/105368) and is pretty much required for any production system affected by the bug. (Without the patch, anyone that can run dmesg can see the passphrase entered for the root filesystem.
;_;) Regards, Eugene Kim From owner-freebsd-stable@FreeBSD.ORG Mon Jul 4 12:44:53 2011 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 58CA6106564A for ; Mon, 4 Jul 2011 12:44:53 +0000 (UTC) (envelope-from oliver.pntr@gmail.com) Received: from mail-wy0-f196.google.com (mail-wy0-f196.google.com [74.125.82.196]) by mx1.freebsd.org (Postfix) with ESMTP id E0B858FC08 for ; Mon, 4 Jul 2011 12:44:52 +0000 (UTC) Received: by wyh11 with SMTP id 11so1232070wyh.7 for ; Mon, 04 Jul 2011 05:44:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=LDrJO4mpHPhR2SW5UirhQNletWTWbjGGM22qL7Lj95c=; b=miVu1k1dDr9wJh4Ns1ppd24zeqLlDBibCGLV5tXZ1o7WY5+JNjmGE1Gz5L3TRyfGy7 /QNyzupxBps9BdJ6hAXtpM0P+R1km5e0vXH3MTNiX1YQ74WtoPB/aE4wM5ez8m4CnOXD ZfXiYzZhU2IVtR3EFn/SegVmC8r+XpTh1vL0w= MIME-Version: 1.0 Received: by 10.227.2.81 with SMTP id 17mr5468154wbi.15.1309782644769; Mon, 04 Jul 2011 05:30:44 -0700 (PDT) Received: by 10.227.209.209 with HTTP; Mon, 4 Jul 2011 05:30:44 -0700 (PDT) In-Reply-To: References: Date: Mon, 4 Jul 2011 14:30:44 +0200 Message-ID: From: Oliver Pinter To: Eugene Kim Content-Type: text/plain; charset=ISO-8859-1 Cc: stable@freebsd.org Subject: Re: Request for MFC r215299: Echoing asterisks for GELI passphrase X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 Jul 2011 12:44:53 -0000 Hi Eugene, Just a note: The /var/run/dmesg.boot file is world readable by default ;) opv@pandora-d ~> dmesg dmesg: sysctl kern.msgbuf: Operation not permitted opv@pandora-d ~> tail -n 5 /var/run/dmesg.boot pcm4: at cad 0 nid 1 on hdac1 SMP: AP CPU #1 Launched! 
SMP: AP CPU #2 Launched! SMP: AP CPU #3 Launched! Trying to mount root from ufs:/dev/ufs/deskroot opv@pandora-d ~> ll /var/run/dmesg.boot -rw-r--r-- 1 root wheel 10619 Jul 1 19:30 /var/run/dmesg.boot On 7/4/11, Eugene Kim wrote: > Greetings, > > Could someone please MFC r215299? This commit enhances a workaround > for a long-standing bug (kern/105368) and is pretty much required for > any production system affected by the bug. (Without the patch, anyone > that can run dmesg can see the passphrase entered for the root > filesystem. ;_;) > > Regards, > Eugene Kim > _______________________________________________ > freebsd-stable@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-stable > To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org" > From owner-freebsd-stable@FreeBSD.ORG Mon Jul 4 13:18:48 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 53F74106566C for ; Mon, 4 Jul 2011 13:18:48 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.mail.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id 11F648FC1A for ; Mon, 4 Jul 2011 13:18:47 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Ap0EAIa9EU6DaFvO/2dsb2JhbABShEKkMoh6sFWQKYErg3+BDASSNpBS X-IronPort-AV: E=Sophos;i="4.65,472,1304308800"; d="scan'208";a="125956811" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-annu-pri.mail.uoguelph.ca with ESMTP; 04 Jul 2011 09:18:47 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 2DB7EB3F28; Mon, 4 Jul 2011 09:18:47 -0400 (EDT) Date: Mon, 4 Jul 2011 09:18:47 -0400 (EDT) From: Rick Macklem To: Hans Ottevanger Message-ID: <1457531349.170391.1309785527174.JavaMail.root@erie.cs.uoguelph.ca> 
In-Reply-To: <20110704085438.GA93119@testsoekris.hotsoft.nl> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.201] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - IE7 (Win)/6.0.10_GA_2692) Cc: freebsd-stable@freebsd.org Subject: Re: NFS related include files and make delete-old X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 Jul 2011 13:18:48 -0000 > Hi, > > For a few months now, during the usual make delete-old after > make installworld the files > > /usr/include/nfs/krpc.h > > and > > /usr/include/nfs/nfsdiskless.h > > turn up time and again. I have them deleted, but they get reinstalled > during the next make installworld. This is a fairly old installation, > but running an up-to-date 8.2-STABLE and these header files are also > present in the directory /usr/include/nfsclient. > I moved them from sys/nfsclient to sys/nfs, so that it would be more obvious that they are shared by both NFS clients (in sys/nfsclient and sys/fs/nfsclient). So the ones at the new location /usr/include/nfs would not be deleted, the entry in ObsoleteFiles.inc that removed them from /usr/include/nfs was deleted (by someone else, after discussing it with me). I felt that they should remain in the old location for backwards compatibility. (The "userland" contents of the two copies are identical, so it shouldn't matter which copy any userland app includes. One problem here is that I have no idea if any software outside of /usr/src includes these.) > Could it be that either the wrong files are specified in > /usr/src/ObsoleteFiles.inc or the headers are installed in the wrong > directory during make installworld? 
> > On my 9.0-CURRENT systems I also have the headers at both locations, > but there only those in /usr/include/nfsclient get reinstalled and > there is no entry in /usr/src/ObsoleteFiles.inc. > Actually, only the ones in /usr/include/nfs should get updated, because they now live in sys/nfs and not sys/nfsclient. I plan on adding an entry to ObsoleteFiles.inc in head/current for the /usr/include/nfsclient ones. (Thanks for the reminder w.r.t. this.) Should I MFC this to stable/8? (I had assumed that I should leave them in the old location for backwards compatibility and therefore wasn't going to MFC deletion of them in /usr/include/nfsclient. If I MFC that, the entries for them in ObsoleteFiles.inc for /usr/include/nfs need to be deleted, so they remain in the new location.) rick ps: Maybe I shouldn't have MFC'd the changes for making the two NFS clients use the shared diskless boot code, but that would have made later MFCs difficult. From owner-freebsd-stable@FreeBSD.ORG Tue Jul 5 07:00:26 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1E63E106566B for ; Tue, 5 Jul 2011 07:00:26 +0000 (UTC) (envelope-from Peter.Ross@bogen.in-berlin.de) Received: from einhorn.in-berlin.de (einhorn.in-berlin.de [192.109.42.8]) by mx1.freebsd.org (Postfix) with ESMTP id 8215A8FC0A for ; Tue, 5 Jul 2011 07:00:25 +0000 (UTC) X-Envelope-From: Peter.Ross@bogen.in-berlin.de Received: from localhost (okapi.in-berlin.de [192.109.42.117]) by einhorn.in-berlin.de (8.13.6/8.13.6/Debian-1) with ESMTP id p6570Nr7017139; Tue, 5 Jul 2011 09:00:23 +0200 Received: from 124-254-118-24-static.bb.ispone.net.au (124-254-118-24-static.bb.ispone.net.au [124.254.118.24]) by webmail.in-berlin.de (Horde Framework) with HTTP; Tue, 05 Jul 2011 17:00:23 +1000 Message-ID: <20110705170023.87183xz6ehxbeft3@webmail.in-berlin.de> Date: Tue, 05 Jul 2011 17:00:23 +1000 From: "Peter Ross" To:
"freebsd-stable List" References: <20110701222232.GA33935@icarus.home.lan> <20110702045435.GA81502@DataIX.net> <54D65EC5-9A9B-4F96-BB45-1904F2147CBA@gmail.com> <20110704100845.94513n3znbabpthp@webmail.in-berlin.de> In-Reply-To: <20110704100845.94513n3znbabpthp@webmail.in-berlin.de> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; DelSp="Yes"; format="flowed" Content-Disposition: inline Content-Transfer-Encoding: quoted-printable User-Agent: Internet Messaging Program (IMP) 4.3.3 X-Scanned-By: MIMEDefang_at_IN-Berlin_e.V. on 192.109.42.8 Cc: Scott Sipe , Jeremy Chadwick Subject: Re: scp: Write Failed: Cannot allocate memory X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Jul 2011 07:00:26 -0000 Hi all, just as an addition: an upgrade to last Friday's FreeBSD-Stable and to VirtualBox 4.0.8 does not fix the problem. I will experiment a bit more tomorrow after hours and grab some statistics. Regards Peter Quoting "Peter Ross" : > Hi all, > > I noticed a similar problem last week. It is also very similar to > one reported last year: > > http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/058708.html > > My server is a Dell T410 server with the same bge card (the same > pciconf -lvc output as described by Mahlon: > > http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/058711.html > > Yours, Scott, is a em(4).. > > Another similarity: In all cases we are using VirtualBox. I just > want to mention it, in case it matters. I am still running > VirtualBox 3.2. > > Most of the time kstat.zfs.misc.arcstats.size was reaching > vfs.zfs.arc_max then, but I could catch one or two cases then the > value was still below. > > I added vfs.zfs.prefetch_disable=1 to sysctl.conf but it does not help.
> > BTW: It looks as if ARC only gives back the memory when I destroy the ZFS (a cloned snapshot containing virtual machines). Even if nothing happens for hours the buffer isn't released. > > My machine was still running 8.2-PRERELEASE so I am upgrading. > > I am happy to give information gathered on old/new kernel if it helps. > > Regards > Peter > > Quoting "Scott Sipe" : > >> >> On Jul 2, 2011, at 12:54 AM, jhell wrote: >> >>> On Fri, Jul 01, 2011 at 03:22:32PM -0700, Jeremy Chadwick wrote: >>>> On Fri, Jul 01, 2011 at 03:13:17PM -0400, Scott Sipe wrote: >>>>> I'm running 8.2-RELEASE and am having new problems with scp. When scping >>>>> files to a ZFS directory on the FreeBSD server -- most notably large files >>>>> -- the transfer frequently dies after just a few seconds. In my last test, I >>>>> tried to scp an 800mb file to the FreeBSD system and the transfer died after >>>>> 200mb. It completely copied the next 4 times I tried, and then died again on >>>>> the next attempt. >>>>> >>>>> On the client side: >>>>> >>>>> "Connection to home closed by remote host. >>>>> lost connection" >>>>> >>>>> In /var/log/auth.log: >>>>> >>>>> Jul 1 14:54:42 freebsd sshd[18955]: fatal: Write failed: Cannot allocate >>>>> memory >>>>> >>>>> I've never seen this before and have used scp before to transfer large files >>>>> without problems. This computer has been used in production for months and >>>>> has a current uptime of 36 days. I have not been able to notice any problems >>>>> copying files to the server via samba or netatalk, or any problems in >>>>> apache. >>>>> >>>>> Uname: >>>>> >>>>> FreeBSD xeon 8.2-RELEASE FreeBSD 8.2-RELEASE #0: Sat Feb 19 01:02:54 EST >>>>> 2011 root@xeon:/usr/obj/usr/src/sys/GENERIC amd64 >>>>> >>>>> I've attached my dmesg and output of vmstat -z. >>>>> >>>>> I have not restarted the sshd daemon or rebooted the computer.
>>>>> >>>>> Am glad to provide any other information or test anything else. >>>>> >>>>> {snip vmstat -z and dmesg} >>>> >>>> You didn't provide details about your networking setup (rc.conf, >>>> ifconfig -a, etc.). netstat -m would be useful too. >>>> >>>> Next, please see this thread circa September 2010, titled "Network >>>> memory allocation failures": >>>> >>>> http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/thread.html#58708 >>>> >>>> The user in that thread is using rsync, which relies on scp by default. >>>> I believe this problem is similar, if not identical, to yours. >>>> >>> >>> Please also provide your output of ( /usr/bin/limits -a ) for the server >>> end and the client. >>> >>> I am not quite sure I agree with the need for ifconfig -a but some >>> information about the networking driver you're using for the interface >>> would be helpful, uptime of the boxes. And configuration of the pool. >>> e.g. ( zpool status -a ;zfs get all ) You should probably >>> prop this information up somewhere so you can reference by URL whenever >>> needed. >>> >>> rsync(1) does not rely on scp(1) whatsoever but rsync(1) can be made to >>> use ssh(1) instead of rsh(1) and I believe that is what Jeremy is >>> stating here but correct me if I am wrong. It does use ssh(1) by >>> default. >>> >>> It's a possibility as well that if using tmpfs(5) or mdmfs(8) for /tmp >>> type filesystems that rsync(1) may be just filling up your temp ram area >>> and causing the connection abort which would be expected. ( df -h ) would >>> help here. >> >> Hello, >> >> I'm not using tmpfs/mdmfs at all. The clients yesterday were 3 different OSX computers (over gigabit). The FreeBSD server has 12gb of ram and no bce adapter. For what it's worth, the server is backed up remotely every night with rsync (remote FreeBSD uses rsync to pull) to an offsite (slow cable connection) FreeBSD computer, and I have not seen any errors in the nightly rsync.
>> >> Sorry for the omission of networking info, here's the output of the requested commands and some that popped up in the other thread: >> >> http://www.cap-press.com/misc/ >> >> In rc.conf: ifconfig_em1="inet 10.1.1.1 netmask 255.255.0.0" >> >> Scott >> >> _______________________________________________ >> freebsd-stable@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-stable >> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org" >> > > > _______________________________________________ > freebsd-stable@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-stable > To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org" > From owner-freebsd-stable@FreeBSD.ORG Tue Jul 5 08:26:26 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3E5B81065673 for ; Tue, 5 Jul 2011 08:26:26 +0000 (UTC) (envelope-from joerg_surmann@snafu.de) Received: from waikiki.ops.eusc.inter.net (waikiki.ops.eusc.inter.net [84.23.254.155]) by mx1.freebsd.org (Postfix) with ESMTP id F19EF8FC0C for ; Tue, 5 Jul 2011 08:26:25 +0000 (UTC) X-Trace: 507c73757269697c38352e3137382e3138342e3132367c3151653078632d303030 337a432d43527c31333039383534333834 Received: from waikiki.ops.eusc.inter.net ([10.155.10.19] helo=localhost) by waikiki.ops.eusc.inter.net with esmtpsa (Exim 4.72) id 1Qe0xc-0003zC-CR; Tue, 05 Jul 2011 10:26:24 +0200 Message-ID: <4E12CAA7.8040409@snafu.de> Date: Tue, 05 Jul 2011 10:26:15 +0200 From: joerg_surmann User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; de; rv:1.9.2.9) Gecko/20100915 Thunderbird/3.1.4 MIME-Version: 1.0 To: Doug Barton , FreeBSD_mailiglist_KERNEL References: <4E11FB6E.4080203@bsdforen.de> <4E12388E.6090707@gmx.de> <4E125CC0.3000400@bsdforen.de> <4E125DA6.6020306@FreeBSD.org> In-Reply-To: <4E125DA6.6020306@FreeBSD.org> 
X-Enigmail-Version: 1.1.1 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-SA-Exim-Connect-IP: 85.178.184.126 X-SA-Exim-Mail-From: joerg_surmann@snafu.de X-SA-Exim-Scanned: No (on waikiki.ops.eusc.inter.net); SAEximRunCond expanded to false Cc: Subject: Re: enigmail 1.2 X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Jul 2011 08:26:26 -0000 -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hello, I have problems to install thunderbird: /usr/ports/mail/enigmail-thunderbird# make ===> thunderbird-enigmail-1.2 doesn't currently build. *** Error code 1 Stop in /usr/ports/mail/enigmail-thunderbird. Thanks for help. Am 05.07.11 02:41, schrieb Doug Barton: > On 07/04/2011 17:37, Dominic Fandrey wrote: >> OK, I got it, solution at the end. >> >> On 05/07/2011 00:02, Matthias Andree wrote: >>> Am 04.07.2011 19:42, schrieb Dominic Fandrey: >>>> Hello, >>>> >>>> I got enigmail 1.2 to build by changing the gecko extract dependency >>>> target from configure to build. >>>> >>>> It installs and runs, but it always claims the gpg agent cannot >>>> be found. Even when I configure it manually. >>>> >>>> I'm out of ideas. Any suggestions? >>> >>> Asking the obvious: do you have ${LOCALBASE}/bin/gpg-agent (from >>> security/gnupg)? >> >> Yes. >> >>> ... >> >> The solution was to start gpg-agent and populate the >> GPG_AGENT_INFO environment variable. 
The most convenient way for >> me to do this was to change my .xsession: >> >> exec /usr/local/bin/gpg-agent --daemon /usr/local/bin/mywm > > You may find this useful: > > http://dougbarton.us/PGP/gpg-agent.html > > > hth, > > Doug > -----BEGIN PGP SIGNATURE----- Version: GnuPG/MacGPG2 v2.0.12 (Darwin) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk4SyqcACgkQcEHvP2uxrTOZAgCfYPHRmWOmhAoRM9kTkL20gpzW pmIAoJjuVX9jYKUa/6qW9Slx0XLT49Ka =i2Q1 -----END PGP SIGNATURE----- From owner-freebsd-stable@FreeBSD.ORG Tue Jul 5 14:36:05 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6FF9E106564A for ; Tue, 5 Jul 2011 14:36:05 +0000 (UTC) (envelope-from hans@beastielabs.net) Received: from mail.beastielabs.net (beasties.demon.nl [82.161.3.114]) by mx1.freebsd.org (Postfix) with ESMTP id 08D6D8FC20 for ; Tue, 5 Jul 2011 14:36:04 +0000 (UTC) Received: from testsoekris.hotsoft.nl (localhost [127.0.0.1]) by mail.beastielabs.net (8.14.4/8.14.4) with ESMTP id p65Ea32W093627; Tue, 5 Jul 2011 16:36:03 +0200 (CEST) (envelope-from hans@testsoekris.hotsoft.nl) Received: (from hans@localhost) by testsoekris.hotsoft.nl (8.14.4/8.14.4/Submit) id p65Ea2eR093626; Tue, 5 Jul 2011 16:36:02 +0200 (CEST) (envelope-from hans) Date: Tue, 5 Jul 2011 16:36:02 +0200 From: Hans Ottevanger To: Rick Macklem Message-ID: <20110705143602.GA93412@testsoekris.hotsoft.nl> References: <20110704085438.GA93119@testsoekris.hotsoft.nl> <1457531349.170391.1309785527174.JavaMail.root@erie.cs.uoguelph.ca> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1457531349.170391.1309785527174.JavaMail.root@erie.cs.uoguelph.ca> User-Agent: Mutt/1.4.2.3i Cc: freebsd-stable@freebsd.org Subject: Re: NFS related include files and make delete-old X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list 
List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Jul 2011 14:36:05 -0000 On Mon, Jul 04, 2011 at 09:18:47AM -0400, Rick Macklem wrote: > > Hi, > > > > For a few months now, during the usual make delete-old after > > make installworld the files > > > > /usr/include/nfs/krpc.h > > > > and > > > > /usr/include/nfs/nfsdiskless.h > > > > turn up time and again. I have them deleted, but they get reinstalled > > during the next make installworld. This is a fairly old installation, > > but running an up-to-date 8.2-STABLE and these header files are also > > present in the directory /usr/include/nfsclient. > > > I moved them from sys/nfsclient to sys/nfs, so that it would be more > obvious that they are shared by both NFS clients (in sys/nfsclient and > sys/fs/nfsclient). So the ones at the new location /usr/include/nfs would > not be deleted, the entry in ObsoleteFiles.inc that removed them from > /usr/include/nfs was deleted (by someone else, after discussing it with me). > At this moment make installworld only installs the headers in the new location, on both 8/stable and head/current. On 8/stable they are immediately removed again when running make delete-old, because they are in ObsoleteFiles.inc. On head/current they are left alone, they are not in ObsoleteFiles.inc (i.e. not anymore). If the files at the old location are still there, it is as a leftover from previous installations. On a freshly installed /usr/include hierarchy they will be missing. > I felt that they should remain in the old location for backwards compatibility. > (The "userland" contents of the two copies are identical, so it shouldn't matter > which copy any userland app includes. One problem here is that I have no idea > if any software outside of /usr/src includes these.) > I can confirm that the copies are identical (if they are present), apart from version control information. 
I think that you have to install the copies explicitly if you want them to be there, also on fresh installs, for compatibility with 8.2-RELEASE and earlier. I would only do that for 8/stable, if at all. > > Could it be that either the wrong files are specified in > > /usr/src/ObsoleteFiles.inc or the headers are installed in the wrong > > directory during make installworld? > > > > On my 9.0-CURRENT systems I also have the headers at both locations, > > but there only those in /usr/include/nfsclient get reinstalled and > > there is no entry in /usr/src/ObsoleteFiles.inc. > > > Actually, only the ones in /usr/include/nfs should get updated, because they > now live in sys/nfs and not sys/nfsclient. I plan on adding an entry to > ObsoleteFiles.inc in head/current for the /usr/include/nfsclient ones. > (Thanks for the reminder w.r.t. this.) > Even as a relative outsider to the FreeBSD project I am all for it. I don't know the schedule for 9.0, but if anything breaks (e.g. in ports, not in /usr/src) it had better break now. > Should I MFC this to stable/8? > (I had assumed that I should leave them in > the old location for backwards compatibility and therefore wasn't going to > MFC deletion of them in /usr/include/nfsclient. If I MFC that, the entries > for them in ObsoleteFiles.inc for /usr/include/nfs need to be deleted, so > they remain in the new location.) > It would save you the effort of finding a way to actually install the copies at the old location. However, in a sense it would change the API, and I do not know how the keepers of the code tree think about that 8-) And, as already explained above, there is already an entry in ObsoleteFiles.inc on stable/8, but probably for the wrong directory. 
Kind regards, Hans Ottevanger From owner-freebsd-stable@FreeBSD.ORG Tue Jul 5 14:57:11 2011 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E870B106564A; Tue, 5 Jul 2011 14:57:10 +0000 (UTC) (envelope-from tinderbox@freebsd.org) Received: from freebsd-stable.sentex.ca (freebsd-stable.sentex.ca [IPv6:2607:f3e0:0:3::6502:9b]) by mx1.freebsd.org (Postfix) with ESMTP id AD10B8FC15; Tue, 5 Jul 2011 14:57:10 +0000 (UTC) Received: from freebsd-stable.sentex.ca (localhost [127.0.0.1]) by freebsd-stable.sentex.ca (8.14.4/8.14.4) with ESMTP id p65Ev9wQ060671; Tue, 5 Jul 2011 14:57:09 GMT (envelope-from tinderbox@freebsd.org) Received: (from tinderbox@localhost) by freebsd-stable.sentex.ca (8.14.4/8.14.4/Submit) id p65Ev9Q4060667; Tue, 5 Jul 2011 14:57:09 GMT (envelope-from tinderbox@freebsd.org) Date: Tue, 5 Jul 2011 14:57:09 GMT Message-Id: <201107051457.p65Ev9Q4060667@freebsd-stable.sentex.ca> X-Authentication-Warning: freebsd-stable.sentex.ca: tinderbox set sender to FreeBSD Tinderbox using -f Sender: FreeBSD Tinderbox From: FreeBSD Tinderbox To: FreeBSD Tinderbox , , Precedence: bulk Cc: Subject: [releng_8 tinderbox] failure on i386/i386 X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Jul 2011 14:57:11 -0000 TB --- 2011-07-05 13:34:40 - tinderbox 2.6 running on freebsd-stable.sentex.ca TB --- 2011-07-05 13:34:40 - starting RELENG_8 tinderbox run for i386/i386 TB --- 2011-07-05 13:34:40 - cleaning the object tree TB --- 2011-07-05 13:35:02 - cvsupping the source tree TB --- 2011-07-05 13:35:02 - /usr/bin/csup -z -r 3 -g -L 1 -h cvsup.sentex.ca /tinderbox/RELENG_8/i386/i386/supfile TB --- 2011-07-05 13:36:12 - building world TB --- 2011-07-05 13:36:12 - MAKEOBJDIRPREFIX=/obj TB --- 2011-07-05 
13:36:12 - PATH=/usr/bin:/usr/sbin:/bin:/sbin TB --- 2011-07-05 13:36:12 - TARGET=i386 TB --- 2011-07-05 13:36:12 - TARGET_ARCH=i386 TB --- 2011-07-05 13:36:12 - TZ=UTC TB --- 2011-07-05 13:36:12 - __MAKE_CONF=/dev/null TB --- 2011-07-05 13:36:12 - cd /src TB --- 2011-07-05 13:36:12 - /usr/bin/make -B buildworld >>> World build started on Tue Jul 5 13:36:13 UTC 2011 >>> Rebuilding the temporary build tree >>> stage 1.1: legacy release compatibility shims >>> stage 1.2: bootstrap tools >>> stage 2.1: cleaning up the object tree >>> stage 2.2: rebuilding the object tree >>> stage 2.3: build tools >>> stage 3: cross tools >>> stage 4.1: building includes >>> stage 4.2: building libraries >>> stage 4.3: make dependencies >>> stage 4.4: building everything >>> World build completed on Tue Jul 5 14:41:08 UTC 2011 TB --- 2011-07-05 14:41:08 - generating LINT kernel config TB --- 2011-07-05 14:41:08 - cd /src/sys/i386/conf TB --- 2011-07-05 14:41:08 - /usr/bin/make -B LINT TB --- 2011-07-05 14:41:08 - building LINT kernel TB --- 2011-07-05 14:41:08 - MAKEOBJDIRPREFIX=/obj TB --- 2011-07-05 14:41:08 - PATH=/usr/bin:/usr/sbin:/bin:/sbin TB --- 2011-07-05 14:41:08 - TARGET=i386 TB --- 2011-07-05 14:41:08 - TARGET_ARCH=i386 TB --- 2011-07-05 14:41:08 - TZ=UTC TB --- 2011-07-05 14:41:08 - __MAKE_CONF=/dev/null TB --- 2011-07-05 14:41:08 - cd /src TB --- 2011-07-05 14:41:08 - /usr/bin/make -B buildkernel KERNCONF=LINT >>> Kernel build for LINT started on Tue Jul 5 14:41:08 UTC 2011 >>> stage 1: configuring the kernel >>> stage 2.1: cleaning up the object tree >>> stage 2.2: rebuilding the object tree >>> stage 2.3: build tools >>> stage 3.1: making dependencies >>> stage 3.2: building everything [...] 
:> hack.c cc -shared -nostdlib hack.c -o hack.So rm -f hack.c MAKE=/usr/bin/make sh /src/sys/conf/newvers.sh LINT cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -ffreestanding -fstack-protector -Werror -pg -mprofiler-epilogue vers.c linking kernel ld: kernel: Not enough room for program headers (allocated 5, need 6) ld: final link failed: Bad value *** Error code 1 Stop in /obj/i386/src/sys/LINT. *** Error code 1 Stop in /src. *** Error code 1 Stop in /src. TB --- 2011-07-05 14:57:09 - WARNING: /usr/bin/make returned exit code 1 TB --- 2011-07-05 14:57:09 - ERROR: failed to build lint kernel TB --- 2011-07-05 14:57:09 - 3603.08 user 485.99 system 4949.15 real http://tinderbox.freebsd.org/tinderbox-releng_8-RELENG_8-i386-i386.full From owner-freebsd-stable@FreeBSD.ORG Tue Jul 5 15:14:33 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 38FA01065670 for ; Tue, 5 Jul 2011 15:14:33 +0000 (UTC) (envelope-from freebsd-stable@m.gmane.org) Received: from lo.gmane.org (lo.gmane.org [80.91.229.12]) by mx1.freebsd.org (Postfix) with ESMTP id 1D4F48FC16 for ; Tue, 5 Jul 2011 15:14:31 +0000 (UTC) Received: from list by lo.gmane.org with local (Exim 4.69) (envelope-from ) id 1Qe7KY-0001gA-MV for freebsd-stable@freebsd.org; Tue, 05 Jul 2011 17:14:30 +0200 Received: from dtmd-4db2d4d7.pool.mediaways.net ([77.178.212.215]) by 
main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Tue, 05 Jul 2011 17:14:30 +0200 Received: from christian.baer by dtmd-4db2d4d7.pool.mediaways.net with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Tue, 05 Jul 2011 17:14:30 +0200 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-stable@freebsd.org From: Christian Baer Date: Tue, 05 Jul 2011 17:14:18 +0200 Lines: 25 Message-ID: References: <52F39CE0-EEC7-4180-8186-BF8696AF279D@lassitu.de> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Complaints-To: usenet@dough.gmane.org X-Gmane-NNTP-Posting-Host: dtmd-4db2d4d7.pool.mediaways.net User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.16) Gecko/20101125 Lightning/1.0b1 Thunderbird/3.0.11 In-Reply-To: Subject: Re: Crashes with Promise controller X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Jul 2011 15:14:33 -0000 On 01.07.2011 12:18, Tom Evans wrote: Hey guys! Thanks for all the answers! I'm sorry, mine take a while. This server is my personal machine which I work on in my spare time. As it seems, the last few weeks I was getting very little time to spare. :-/ >> A serial console is easy enough to set up on a Sun for example, but in >> this case, I am running a simple AthlonXP, which has nothing for that >> sort of help. I would need a special card for that and those cost quite >> a bit. :-( > Not that special - you just need a serial port on two computers. If > your computers don't have serial ports, USB serial adapters work fine, > and are cheap, as are (single port) PCI serial cards. Yeah, the post you are referring to is a little awkward for me because I was just too thick to get what you guys were on about. 
I was thinking along the lines of a card that also allows BIOS access (what Intel calls RMM, IIRC). After the fact that the serial console within the kernel would suffice struck me like lightning, it was set up within about 2 minutes - including the reboot. :-) Best regards, Chris From owner-freebsd-stable@FreeBSD.ORG Tue Jul 5 15:41:31 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4608D106566B for ; Tue, 5 Jul 2011 15:41:30 +0000 (UTC) (envelope-from freebsd-stable@m.gmane.org) Received: from lo.gmane.org (lo.gmane.org [80.91.229.12]) by mx1.freebsd.org (Postfix) with ESMTP id CAF7D8FC13 for ; Tue, 5 Jul 2011 15:41:29 +0000 (UTC) Received: from list by lo.gmane.org with local (Exim 4.69) (envelope-from ) id 1Qe7kb-0006oK-Vl for freebsd-stable@freebsd.org; Tue, 05 Jul 2011 17:41:25 +0200 Received: from dtmd-4db2d4d7.pool.mediaways.net ([77.178.212.215]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Tue, 05 Jul 2011 17:41:25 +0200 Received: from christian.baer by dtmd-4db2d4d7.pool.mediaways.net with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Tue, 05 Jul 2011 17:41:25 +0200 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-stable@freebsd.org From: Christian Baer Date: Tue, 05 Jul 2011 17:41:11 +0200 Lines: 89 Message-ID: References: <52F39CE0-EEC7-4180-8186-BF8696AF279D@lassitu.de> <20110618175215.GA18645@icarus.home.lan> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Complaints-To: usenet@dough.gmane.org X-Gmane-NNTP-Posting-Host: dtmd-4db2d4d7.pool.mediaways.net User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.16) Gecko/20101125 Lightning/1.0b1 Thunderbird/3.0.11 In-Reply-To: <20110618175215.GA18645@icarus.home.lan> Subject: Re: Crashes with Promise controller X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 
Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Jul 2011 15:41:31 -0000 On 18.06.2011 19:52, Jeremy Chadwick wrote: > It may be that the kernel is panic'ing and auto-rebooting before he can > see the message in question. I would advocate he put the following > directives in his kernel configuration and rebuild/reinstall kernel and > wait for it to happen again. I have now changed the power setup slightly and the problems have *reduced* and slightly changed in themselves. Reproducing a panic is a lot harder, which I consider a good thing at the moment. Since I changed the power configuration, the system has been running for about 4 days and had only two crashes (traps) since then, despite quite heavy traffic on the drives. Because the system rebooted very quickly before I set up the serial console, I only ever got to see one panic (not a trap) in the past. But it was gone too quickly for me to write anything down about it. On a side-note: I did find out during my testing (before changing the power) that two drives were actually causing the problems and I could even make the system crash while only reading from one of those drives. Crashes while reading felt less frequent (no statistics collected though) but happened just the same. Because I formatted the two drives in question with rather strange values (rather large block sizes), I have decided to copy everything off them, re-partition them with gpt and create both the encryption-system on them as well as the file system over. During this copying, I managed to crash the system twice. 
The first time was yesterday, where I got this: --- snip --- Fatal trap 12: page fault while in kernel mode fault virtual address = 0x1f8 fault code = supervisor read, page not present instruction pointer = 0x20:0xc3d2120c stack pointer = 0x28:0xc3697bf4 frame pointer = 0x28:0xc3697c4c code segment = base 0x0, limit 0xfffff, type 0x1b = DPL 0, pres 1, def32 1, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 2 (g_event) [thread pid 2 tid 100007 ] Stopped at g_eli_access+0x7c: testl $0x10008,0x1f8(%ebx) --- snap --- About 25 minutes ago, the system crashed again. This time, I had the "known" errors prior to the actual trap: --- snip --- ata6: SIGNATURE: ffffffff ata6: timeout waiting to issue command ata6: error issuing SETFEATURES SET TRANSFER MODE command ata6: timeout waiting to issue command ata6: error issuing SETFEATURES ENABLE RCACHE command ata6: timeout waiting to issue command ata6: error issuing SETFEATURES ENABLE WCACHE command ata6: timeout waiting to issue command ata6: error issuing SET_MULTI command ad12: FAILURE - device detached GEOM_ELI: g_eli_read_done() failed ad12d.eli[READ(offset=403810975744, length=32768)] g_vfs_done():ad12d.eli[READ(offset=403810975744, length=32768)]error = 6 Fatal trap 12: page fault while in kernel mode fault virtual address = 0x1f8 fault code = supervisor read, page not present instruction pointer = 0x20:0xc3d2420c stack pointer = 0x28:0xc3697bf4 frame pointer = 0x28:0xc3697c4c code segment = base 0x0, limit 0xfffff, type 0x1b = DPL 0, pres 1, def32 1, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 2 (g_event) [thread pid 2 tid 100007 ] Stopped at g_eli_access+0x7c: testl $0x10008,0x1f8(%ebx) --- snap --- The strange thing is that I wasn't actually accessing ad12 at the time. I was running a "-t long" on it, but no more. That test had been running for over two hours at the time of the crash. 
Does this still somehow point to a power problem (since ad12 seems to get detached)? Or could it be something a bit more fundamental? Best regards, Chris From owner-freebsd-stable@FreeBSD.ORG Tue Jul 5 15:48:35 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CE5BC106566B for ; Tue, 5 Jul 2011 15:48:35 +0000 (UTC) (envelope-from gkontos.mail@gmail.com) Received: from mail-iw0-f182.google.com (mail-iw0-f182.google.com [209.85.214.182]) by mx1.freebsd.org (Postfix) with ESMTP id 9A6068FC08 for ; Tue, 5 Jul 2011 15:48:34 +0000 (UTC) Received: by iwr19 with SMTP id 19so7121647iwr.13 for ; Tue, 05 Jul 2011 08:48:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; bh=ZwSbB6sdodYZ3BYGV8JUtWV1v0FTcRxU47rdYraxI6o=; b=oUxm1OZSLPqVPV8t6ru3fwNg4+cN8USGgFZBqBm9nmPD6qI3Y9vBJA6VwD0tuaO3Um GR4Tk6zBMU1Ncqrudyr+UTml3jGI4Lz2dWiJNHGUl/QmMZqur6y8dJKzgiSBikHdK9r2 vFd1p7dfmCqS6pEnBiHaLTQDUM/LBNa2cr088= MIME-Version: 1.0 Received: by 10.231.127.142 with SMTP id g14mr2543786ibs.163.1309880913387; Tue, 05 Jul 2011 08:48:33 -0700 (PDT) Received: by 10.231.15.205 with HTTP; Tue, 5 Jul 2011 08:48:33 -0700 (PDT) In-Reply-To: References: <52F39CE0-EEC7-4180-8186-BF8696AF279D@lassitu.de> <20110618175215.GA18645@icarus.home.lan> Date: Tue, 5 Jul 2011 18:48:33 +0300 Message-ID: From: George Kontostanos To: Christian Baer Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-stable@freebsd.org Subject: Re: Crashes with Promise controller X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Jul 2011 
15:48:35 -0000 On Tue, Jul 5, 2011 at 6:41 PM, Christian Baer wrote: > On 18.06.2011 19:52, Jeremy Chadwick wrote: > >> It may be that the kernel is panic'ing and auto-rebooting before he can >> see the message in question. I would advocate he put the following >> directives in his kernel configuration and rebuild/reinstall kernel and >> wait for it to happen again. > > I have now changed the power setup slightly and the problems have > *reduced* and slightly changed in themselves. Reproducing a panic is a > lot harder, which I consider a good thing at the moment. > > Since I changed the power configuration, the system has been running for > about 4 days and had only two crashes (traps) since then, despite quite > heavy traffic on the drives. Because the system rebooted very quickly > before I set up the serial console, I only ever got to see one panic > (not a trap) in the past. But it was gone too quickly for me to write > anything down about it. > > On a side-note: > I did find out during my testing (before changing the power) that two > drives were actually causing the problems and I could even make the > system crash while only reading from one of those drives. Crashes while > reading felt less frequent (no statistics collected though) but happened > just the same. > > Because I formatted the two drives in question with rather strange > values (rather large block sizes), I have decided to copy everything off > them, re-partition them with gpt and create both the encryption-system > on them as well as the file system over. > > During this copying, I managed to crash the system twice. 
The first time
> was yesterday, where I got this:
>
> --- snip ---
> Fatal trap 12: page fault while in kernel mode
> fault virtual address   = 0x1f8
> fault code              = supervisor read, page not present
> instruction pointer     = 0x20:0xc3d2120c
> stack pointer           = 0x28:0xc3697bf4
> frame pointer           = 0x28:0xc3697c4c
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 2 (g_event)
> [thread pid 2 tid 100007 ]
> Stopped at      g_eli_access+0x7c:      testl   $0x10008,0x1f8(%ebx)
> --- snap ---
>
> About 25 minutes ago, the system crashed again. This time, I had the
> "known" errors prior to the actual trap:
>
> --- snip ---
> ata6: SIGNATURE: ffffffff
> ata6: timeout waiting to issue command
> ata6: error issuing SETFEATURES SET TRANSFER MODE command
> ata6: timeout waiting to issue command
> ata6: error issuing SETFEATURES ENABLE RCACHE command
> ata6: timeout waiting to issue command
> ata6: error issuing SETFEATURES ENABLE WCACHE command
> ata6: timeout waiting to issue command
> ata6: error issuing SET_MULTI command
> ad12: FAILURE - device detached
> GEOM_ELI: g_eli_read_done() failed ad12d.eli[READ(offset=403810975744,
> length=32768)]
> g_vfs_done():ad12d.eli[READ(offset=403810975744, length=32768)]error = 6
>
> Fatal trap 12: page fault while in kernel mode
> fault virtual address   = 0x1f8
> fault code              = supervisor read, page not present
> instruction pointer     = 0x20:0xc3d2420c
> stack pointer           = 0x28:0xc3697bf4
> frame pointer           = 0x28:0xc3697c4c
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>           
             = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 2 (g_event)
> [thread pid 2 tid 100007 ]
> Stopped at      g_eli_access+0x7c:      testl   $0x10008,0x1f8(%ebx)
> --- snap ---
>
> The strange thing is that I wasn't actually accessing ad12 at the time.
> I was running a "-t long" on it, but no more. That test had been running
> for over two hours at the time of the crash.
>
> Does this still somehow point to a power problem (since ad12 seems to
> get detached)? Or could it be something a bit more fundamental?
>
> Best regards,
> Chris
>

I am not sure if it is the same controller:

http://www.freebsd.org/cgi/query-pr.cgi?pr=158268

-- 
George Kontostanos
aisecure.net

From owner-freebsd-stable@FreeBSD.ORG Tue Jul 5 17:03:24 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0F7FF106566B for ; Tue, 5 Jul 2011 17:03:24 +0000 (UTC) (envelope-from cscotts@gmail.com) Received: from mail-qw0-f54.google.com (mail-qw0-f54.google.com [209.85.216.54]) by mx1.freebsd.org (Postfix) with ESMTP id BD8C68FC08 for ; Tue, 5 Jul 2011 17:03:23 +0000 (UTC) Received: by qwc9 with SMTP id 9so4051732qwc.13 for ; Tue, 05 Jul 2011 10:03:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=subject:mime-version:content-type:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to:x-mailer; bh=v11ZcKbcNsR/TXw2y/6NyaHbZ079P/xM1klt4cXVxTo=; b=h8b2yuN7vsNzbzZ/mAQ2ZJiQvrCNVje4vmEGQBcVgVPQF11q62qdSmu3qKgOQBfVBI ctPpet9cljietUyjp3YUbRKM0hv8VyZv5HjNCLossIH6p0nfSZOo8jbgykCjet5Yo/sL uqYaVjodne84R/Ucaz4msulcA5uMCLrjnJzCo= Received: by 10.224.17.144 with SMTP id s16mr5393523qaa.233.1309885402891; Tue, 05 Jul 2011 10:03:22 -0700 (PDT) Received: from sahibkuran.cap-press.com 
(rrcs-24-172-62-94.midsouth.biz.rr.com [24.172.62.94]) by mx.google.com with ESMTPS id r33sm5679106qcs.6.2011.07.05.10.03.21 (version=TLSv1/SSLv3 cipher=OTHER); Tue, 05 Jul 2011 10:03:21 -0700 (PDT) Mime-Version: 1.0 (Apple Message framework v1084) Content-Type: text/plain; charset=us-ascii From: Scott Sipe In-Reply-To: <20110705170023.87183xz6ehxbeft3@webmail.in-berlin.de> Date: Tue, 5 Jul 2011 13:03:20 -0400 Content-Transfer-Encoding: quoted-printable Message-Id: <5BA0546B-DE70-4753-B445-551FD5336787@gmail.com> References: <20110701222232.GA33935@icarus.home.lan> <20110702045435.GA81502@DataIX.net> <54D65EC5-9A9B-4F96-BB45-1904F2147CBA@gmail.com> <20110704100845.94513n3znbabpthp@webmail.in-berlin.de> <20110705170023.87183xz6ehxbeft3@webmail.in-berlin.de> To: Peter Ross X-Mailer: Apple Mail (2.1084) Cc: freebsd-stable List , Jeremy Chadwick Subject: Re: scp: Write Failed: Cannot allocate memory X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Jul 2011 17:03:24 -0000

I'm running virtualbox 3.2.12_1 if that has anything to do with it.

sysctl vfs.zfs.arc_max: 6200000000

While I'm trying to scp, kstat.zfs.misc.arcstats.size is hovering right around that value, sometimes above, sometimes below (that's as it should be, right?). I don't think that it dies when crossing over arc_max. I can run the same scp 10 times and it might fail 1-3 times, with no correlation to the arcstats.size being above/below arc_max that I can see.

Scott

On Jul 5, 2011, at 3:00 AM, Peter Ross wrote:

> Hi all,
>
> just as an addition: an upgrade to last Friday's FreeBSD-Stable and to VirtualBox 4.0.8 does not fix the problem.
>
> I will experiment a bit more tomorrow after hours and grab some statistics.
>
> Regards
> Peter
>
> Quoting "Peter Ross" :
>
>> Hi all,
>>
>> I noticed a similar problem last week. It is also very similar to one reported last year:
>>
>> http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/058708.html
>>
>> My server is a Dell T410 server with the same bge card (the same pciconf -lvc output as described by Mahlon:
>>
>> http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/058711.html
>>
>> Yours, Scott, is a em(4)..
>>
>> Another similarity: In all cases we are using VirtualBox. I just want to mention it, in case it matters. I am still running VirtualBox 3.2.
>>
>> Most of the time kstat.zfs.misc.arcstats.size was reaching vfs.zfs.arc_max then, but I could catch one or two cases then the value was still below.
>>
>> I added vfs.zfs.prefetch_disable=1 to sysctl.conf but it does not help.
>>
>> BTW: It looks as ARC only gives back the memory when I destroy the ZFS (a cloned snapshot containing virtual machines). Even if nothing happens for hours the buffer isn't released..
>>
>> My machine was still running 8.2-PRERELEASE so I am upgrading.
>>
>> I am happy to give information gathered on old/new kernel if it helps.
>>
>> Regards
>> Peter
>>
>> Quoting "Scott Sipe" :
>>
>>>
>>> On Jul 2, 2011, at 12:54 AM, jhell wrote:
>>>
>>>> On Fri, Jul 01, 2011 at 03:22:32PM -0700, Jeremy Chadwick wrote:
>>>>> On Fri, Jul 01, 2011 at 03:13:17PM -0400, Scott Sipe wrote:
>>>>>> I'm running 8.2-RELEASE and am having new problems with scp. When scping
>>>>>> files to a ZFS directory on the FreeBSD server -- most notably large files
>>>>>> -- the transfer frequently dies after just a few seconds. In my last test, I
>>>>>> tried to scp an 800mb file to the FreeBSD system and the transfer died after
>>>>>> 200mb. It completely copied the next 4 times I tried, and then died again on
>>>>>> the next attempt.
>>>>>>
>>>>>> On the client side:
>>>>>>
>>>>>> "Connection to home closed by remote host.
>>>>>> lost connection"
>>>>>>
>>>>>> In /var/log/auth.log:
>>>>>>
>>>>>> Jul 1 14:54:42 freebsd sshd[18955]: fatal: Write failed: Cannot allocate
>>>>>> memory
>>>>>>
>>>>>> I've never seen this before and have used scp before to transfer large files
>>>>>> without problems. This computer has been used in production for months and
>>>>>> has a current uptime of 36 days. I have not been able to notice any problems
>>>>>> copying files to the server via samba or netatalk, or any problems in
>>>>>> apache.
>>>>>>
>>>>>> Uname:
>>>>>>
>>>>>> FreeBSD xeon 8.2-RELEASE FreeBSD 8.2-RELEASE #0: Sat Feb 19 01:02:54 EST
>>>>>> 2011 root@xeon:/usr/obj/usr/src/sys/GENERIC amd64
>>>>>>
>>>>>> I've attached my dmesg and output of vmstat -z.
>>>>>>
>>>>>> I have not restarted the sshd daemon or rebooted the computer.
>>>>>>
>>>>>> Am glad to provide any other information or test anything else.
>>>>>>
>>>>>> {snip vmstat -z and dmesg}
>>>>>
>>>>> You didn't provide details about your networking setup (rc.conf,
>>>>> ifconfig -a, etc.). netstat -m would be useful too.
>>>>>
>>>>> Next, please see this thread circa September 2010, titled "Network
>>>>> memory allocation failures":
>>>>>
>>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/thread.html#58708
>>>>>
>>>>> The user in that thread is using rsync, which relies on scp by default.
>>>>> I believe this problem is similar, if not identical, to yours.
>>>>>
>>>>
>>>> Please also provide your output of ( /usr/bin/limits -a ) for the server
>>>> end and the client.
>>>>
>>>> I am not quite sure I agree with the need for ifconfig -a but some
>>>> information about the networking driver your using for the interface
>>>> would be helpful, uptime of the boxes. And configuration of the pool.
>>>> e.g. 
( zpool status -a ;zfs get all ) You should probably
>>>> prop this information up somewhere so you can reference by URL whenever
>>>> needed.
>>>>
>>>> rsync(1) does not rely on scp(1) whatsoever but rsync(1) can be made to
>>>> use ssh(1) instead of rsh(1) and I believe that is what Jeremy is
>>>> stating here but correct me if I am wrong. It does use ssh(1) by
>>>> default.
>>>>
>>>> Its a possiblity as well that if using tmpfs(5) or mdmfs(8) for /tmp
>>>> type filesystems that rsync(1) may be just filling up your temp ram area
>>>> and causing the connection abort which would be expected. ( df -h ) would
>>>> help here.
>>>
>>> Hello,
>>>
>>> I'm not using tmpfs/mdmfs at all. The clients yesterday were 3 different OSX computers (over gigabit). The FreeBSD server has 12gb of ram and no bce adapter. For what it's worth, the server is backed up remotely every night with rsync (remote FreeBSD uses rsync to pull) to an offsite (slow cable connection) FreeBSD computer, and I have not seen any errors in the nightly rsync.
>>>
>>> Sorry for the omission of networking info, here's the output of the requested commands and some that popped up in the other thread:
>>>
>>> http://www.cap-press.com/misc/
>>>
>>> In rc.conf: ifconfig_em1="inet 10.1.1.1 netmask 255.255.0.0"
>>>
>>> Scott
>>>
>>> _______________________________________________
>>> freebsd-stable@freebsd.org mailing list
>>> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
>>> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
>>>
>>
>>
>> _______________________________________________
>> freebsd-stable@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
>> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
>>
>
>

From owner-freebsd-stable@FreeBSD.ORG Tue Jul 5 17:53:57 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D4BBD1065674 for ; Tue, 5 Jul 2011 17:53:57 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id 86C3F8FC24 for ; Tue, 5 Jul 2011 17:53:57 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Ap0EAJVPE06DaFvO/2dsb2JhbABThEKkPoh6sViQX4Erg3+BDASQL4IHkFI X-IronPort-AV: E=Sophos;i="4.65,480,1304308800"; d="scan'208";a="129900005" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-jnhn-pri.mail.uoguelph.ca with ESMTP; 05 Jul 2011 13:53:37 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id DB4AAB3F0A; Tue, 5 Jul 2011 13:53:37 -0400 (EDT) Date: Tue, 5 Jul 2011 13:53:37 -0400 (EDT) From: Rick Macklem To: Hans Ottevanger Message-ID: <1078921668.230598.1309888417887.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: 
<20110705143602.GA93412@testsoekris.hotsoft.nl> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.203] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - IE7 (Win)/6.0.10_GA_2692) Cc: freebsd-stable@freebsd.org Subject: Re: NFS related include files and make delete-old X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Jul 2011 17:53:58 -0000 > > At this moment make installworld only installs the headers in the new > location, > on both 8/stable and head/current. On 8/stable they are immediately > removed > again when running make delete-old, because they are in > ObsoleteFiles.inc. > On head/current they are left alone, they are not in ObsoleteFiles.inc > (i.e. not anymore). > > If the files at the old location are still there, it is as a leftover > from > previous installations. On a freshly installed /usr/include hierarchy > they > will be missing. > Yes. I am working on MFC'ing the patch (r221333) to stable/8 so that it doesn't delete the ones in /usr/include/nfs. Having said that, to the best of my knowledge (I looked a while back), nothing in /usr/src outside of the kernel includes them. Also, I can't think of any reason why a third party app. would have any use for what is in them. As such, I doubt it matters if they exist under /usr/include or where they end up. Do you have software that includes either of these files? If so, I would like to hear what that software is and why it includes them. (Even bootstraps for diskless NFS root systems shouldn't need what's in them, as far as I understand how it works.) > > I felt that they should remain in the old location for backwards > > compatibility. 
> > (The "userland" contents of the two copies are identical, so it > > shouldn't matter > > which copy any userland app includes. One problem here is that I > > have no idea > > if any software outside of /usr/src includes these.) > > > > I can confirm that the copies are identical (if they are present), > apart from > version control information. I think that you have to install the > copies explicitly > if you want them to be there, also on a fresh installs, for > compatibility with > 8.2-RELEASE and earlier. I would only do that for 8/stable, if at all. > > > > Could it be that either the wrong files are specified in > > > /usr/src/ObsoleteFiles.inc or the headers are installed in the > > > wrong > > > directory during make installworld? > > > > > > On my 9.0-CURRENT systems I also have the headers at both > > > locations, > > > but there only those in /usr/include/nfsclient get reinstalled and > > > there is no entry in /usr/src/ObsoleteFiles.inc. > > > > > Actually, only the ones in /usr/include/nfs should get updated, > > because they > > now live in sys/nfs and not sys/nfsclient. I plan on adding an entry > > to > > ObsoleteFiles.inc in head/current for the /usr/include/nfsclient > > ones. > > (Thanks for the reminder w.r.t. this.) > > > > Even as a relative outsider to the FreeBSD project I am all for it. I > don't > know the schedule for 9.0, but if anything breaks (e.g. in ports, not > in /usr/src) > it had better break now. > > > Should I MFC this to stable/8? > > (I had assumed that I should leave them in > > the old location for backwards compatibility and therefore wasn't > > going to > > MFC deletion of them in /usr/include/nfsclient. If I MFC that, the > > entries > > for them in ObsoleteFiles.inc for /usr/include/nfs need to be > > deleted, so > > they remain in the new location.) > > > > It would save you the effort of finding a way to actually install the > copies > at the old location. 
However, in a sense it would change the API, and > I do not > know how the keepers of the code tree think about that 8-) > > And, as already explained above, there is already an entry in > ObsoleteFiles.inc on > stable/8, but probably for the wrong directory. > > Kind regards, > > Hans Ottevanger > _______________________________________________ > freebsd-stable@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-stable > To unsubscribe, send any mail to > "freebsd-stable-unsubscribe@freebsd.org" From owner-freebsd-stable@FreeBSD.ORG Tue Jul 5 20:32:14 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 24C0F106564A for ; Tue, 5 Jul 2011 20:32:14 +0000 (UTC) (envelope-from hans@beastielabs.net) Received: from mail.beastielabs.net (beasties.demon.nl [82.161.3.114]) by mx1.freebsd.org (Postfix) with ESMTP id A41708FC15 for ; Tue, 5 Jul 2011 20:32:12 +0000 (UTC) Received: from testsoekris.hotsoft.nl (localhost [127.0.0.1]) by mail.beastielabs.net (8.14.4/8.14.4) with ESMTP id p65KWAau095140; Tue, 5 Jul 2011 22:32:10 +0200 (CEST) (envelope-from hans@testsoekris.hotsoft.nl) Received: (from hans@localhost) by testsoekris.hotsoft.nl (8.14.4/8.14.4/Submit) id p65KWAls095139; Tue, 5 Jul 2011 22:32:10 +0200 (CEST) (envelope-from hans) Date: Tue, 5 Jul 2011 22:32:10 +0200 From: Hans Ottevanger To: Rick Macklem Message-ID: <20110705203210.GA95046@testsoekris.hotsoft.nl> References: <20110705143602.GA93412@testsoekris.hotsoft.nl> <1078921668.230598.1309888417887.JavaMail.root@erie.cs.uoguelph.ca> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1078921668.230598.1309888417887.JavaMail.root@erie.cs.uoguelph.ca> User-Agent: Mutt/1.4.2.3i Cc: freebsd-stable@freebsd.org Subject: Re: NFS related include files and make delete-old X-BeenThere: freebsd-stable@freebsd.org 
X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Jul 2011 20:32:14 -0000 On Tue, Jul 05, 2011 at 01:53:37PM -0400, Rick Macklem wrote: > > [...] > > Having said that, to the best of my knowledge (I looked a while back), > nothing in /usr/src outside of the kernel includes them. Also, I can't > think of any reason why a third party app. would have any use for what > is in them. As such, I doubt it matters if they exist under /usr/include > or where they end up. > > Do you have software that includes either of these files? > > If so, I would like to hear what that software is and why it includes > them. (Even bootstraps for diskless NFS root systems shouldn't need > what's in them, as far as I understand how it works.) > No Rick, I do not have any code that uses the internals of the NFS implementation, neither am I aware of any other software doing that. There could be some in the ports collection, but even a quick scan there yields nothing obvious. My interest in this matter stems mainly from stumbling over the same headers to be deleted during every make delete-old. 
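For anyone who wants to repeat such a scan on their own tree, a minimal sketch follows. It is not something that was run for this thread. It assumes that the headers in question would be pulled in through an #include path starting with nfs/ or nfsclient/ (the two directories discussed above), and the function name and the list of file suffixes are made up for illustration.

```python
import re
import sys
from pathlib import Path

# Assumption: a consumer of these headers uses an include path that starts
# with "nfs/" or "nfsclient/", e.g.  #include <nfs/example.h>
INCLUDE_RE = re.compile(
    r'^\s*#\s*include\s*[<"]((?:nfs|nfsclient)/[^>"]+)[>"]'
)

# Hypothetical choice of suffixes to look at; extend as needed.
SOURCE_SUFFIXES = {".c", ".h", ".cc", ".cpp"}


def find_nfs_includes(root):
    """Return sorted (file, line number, header) tuples found under root."""
    hits = []
    for path in Path(root).rglob("*"):
        if path.suffix not in SOURCE_SUFFIXES or not path.is_file():
            continue
        try:
            text = path.read_text(errors="replace")
        except OSError:
            continue  # unreadable file, skip it
        for lineno, line in enumerate(text.splitlines(), start=1):
            match = INCLUDE_RE.match(line)
            if match:
                hits.append((str(path), lineno, match.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    # Usage: python scan_nfs_includes.py /path/to/tree [more trees...]
    for tree in sys.argv[1:]:
        for filename, lineno, header in find_nfs_includes(tree):
            print(f"{filename}:{lineno}: {header}")
```

An empty result only says that no file with one of the listed suffixes includes the headers by that path; it does not rule out generated code or unusual include paths.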
Kind regards, Hans Ottevanger From owner-freebsd-stable@FreeBSD.ORG Tue Jul 5 21:36:18 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9A2C5106566C for ; Tue, 5 Jul 2011 21:36:18 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta04.emeryville.ca.mail.comcast.net (qmta04.emeryville.ca.mail.comcast.net [76.96.30.40]) by mx1.freebsd.org (Postfix) with ESMTP id 7D2418FC08 for ; Tue, 5 Jul 2011 21:36:18 +0000 (UTC) Received: from omta15.emeryville.ca.mail.comcast.net ([76.96.30.71]) by qmta04.emeryville.ca.mail.comcast.net with comcast id 4MVn1h0011Y3wxoA4McFVk; Tue, 05 Jul 2011 21:36:15 +0000 Received: from koitsu.dyndns.org ([67.180.84.87]) by omta15.emeryville.ca.mail.comcast.net with comcast id 4Maf1h0021t3BNj8bMafYC; Tue, 05 Jul 2011 21:34:41 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id D6910102C36; Tue, 5 Jul 2011 14:36:13 -0700 (PDT) Date: Tue, 5 Jul 2011 14:36:13 -0700 From: Jeremy Chadwick To: Scott Sipe Message-ID: <20110705213613.GA67476@icarus.home.lan> References: <20110701222232.GA33935@icarus.home.lan> <20110702045435.GA81502@DataIX.net> <54D65EC5-9A9B-4F96-BB45-1904F2147CBA@gmail.com> <20110704100845.94513n3znbabpthp@webmail.in-berlin.de> <20110705170023.87183xz6ehxbeft3@webmail.in-berlin.de> <5BA0546B-DE70-4753-B445-551FD5336787@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <5BA0546B-DE70-4753-B445-551FD5336787@gmail.com> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: Peter Ross , freebsd-stable List Subject: Re: scp: Write Failed: Cannot allocate memory X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Jul 2011 21:36:18 -0000 On Tue, Jul 05, 
2011 at 01:03:20PM -0400, Scott Sipe wrote: > I'm running virtualbox 3.2.12_1 if that has anything to do with it. > > sysctl vfs.zfs.arc_max: 6200000000 > > While I'm trying to scp, kstat.zfs.misc.arcstats.size is hovering right around that value, sometimes above, sometimes below (that's as it should be, right?). I don't think that it dies when crossing over arc_max. I can run the same scp 10 times and it might fail 1-3 times, with no correlation to the arcstats.size being above/below arc_max that I can see. > > Scott > > On Jul 5, 2011, at 3:00 AM, Peter Ross wrote: > > > Hi all, > > > > just as an addition: an upgrade to last Friday's FreeBSD-Stable and to VirtualBox 4.0.8 does not fix the problem. > > > > I will experiment a bit more tomorrow after hours and grab some statistics. > > > > Regards > > Peter > > > > Quoting "Peter Ross" : > > > >> Hi all, > >> > >> I noticed a similar problem last week. It is also very similar to one reported last year: > >> > >> http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/058708.html > >> > >> My server is a Dell T410 server with the same bge card (the same pciconf -lvc output as described by Mahlon: > >> > >> http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/058711.html > >> > >> Yours, Scott, is a em(4).. > >> > >> Another similarity: In all cases we are using VirtualBox. I just want to mention it, in case it matters. I am still running VirtualBox 3.2. > >> > >> Most of the time kstat.zfs.misc.arcstats.size was reaching vfs.zfs.arc_max then, but I could catch one or two cases then the value was still below. > >> > >> I added vfs.zfs.prefetch_disable=1 to sysctl.conf but it does not help. > >> > >> BTW: It looks as ARC only gives back the memory when I destroy the ZFS (a cloned snapshot containing virtual machines). Even if nothing happens for hours the buffer isn't released.. > >> > >> My machine was still running 8.2-PRERELEASE so I am upgrading. 
> >> > >> I am happy to give information gathered on old/new kernel if it helps. > >> > >> Regards > >> Peter > >> > >> Quoting "Scott Sipe" : > >> > >>> > >>> On Jul 2, 2011, at 12:54 AM, jhell wrote: > >>> > >>>> On Fri, Jul 01, 2011 at 03:22:32PM -0700, Jeremy Chadwick wrote: > >>>>> On Fri, Jul 01, 2011 at 03:13:17PM -0400, Scott Sipe wrote: > >>>>>> I'm running 8.2-RELEASE and am having new problems with scp. When scping > >>>>>> files to a ZFS directory on the FreeBSD server -- most notably large files > >>>>>> -- the transfer frequently dies after just a few seconds. In my last test, I > >>>>>> tried to scp an 800mb file to the FreeBSD system and the transfer died after > >>>>>> 200mb. It completely copied the next 4 times I tried, and then died again on > >>>>>> the next attempt. > >>>>>> > >>>>>> On the client side: > >>>>>> > >>>>>> "Connection to home closed by remote host. > >>>>>> lost connection" > >>>>>> > >>>>>> In /var/log/auth.log: > >>>>>> > >>>>>> Jul 1 14:54:42 freebsd sshd[18955]: fatal: Write failed: Cannot allocate > >>>>>> memory > >>>>>> > >>>>>> I've never seen this before and have used scp before to transfer large files > >>>>>> without problems. This computer has been used in production for months and > >>>>>> has a current uptime of 36 days. I have not been able to notice any problems > >>>>>> copying files to the server via samba or netatalk, or any problems in > >>>>>> apache. > >>>>>> > >>>>>> Uname: > >>>>>> > >>>>>> FreeBSD xeon 8.2-RELEASE FreeBSD 8.2-RELEASE #0: Sat Feb 19 01:02:54 EST > >>>>>> 2011 root@xeon:/usr/obj/usr/src/sys/GENERIC amd64 > >>>>>> > >>>>>> I've attached my dmesg and output of vmstat -z. > >>>>>> > >>>>>> I have not restarted the sshd daemon or rebooted the computer. > >>>>>> > >>>>>> Am glad to provide any other information or test anything else. > >>>>>> > >>>>>> {snip vmstat -z and dmesg} > >>>>> > >>>>> You didn't provide details about your networking setup (rc.conf, > >>>>> ifconfig -a, etc.). 
netstat -m would be useful too. > >>>>> > >>>>> Next, please see this thread circa September 2010, titled "Network > >>>>> memory allocation failures": > >>>>> > >>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/thread.html#58708 > >>>>> > >>>>> The user in that thread is using rsync, which relies on scp by default. > >>>>> I believe this problem is similar, if not identical, to yours. > >>>>> > >>>> > >>>> Please also provide your output of ( /usr/bin/limits -a ) for the server > >>>> end and the client. > >>>> > >>>> I am not quite sure I agree with the need for ifconfig -a but some > >>>> information about the networking driver your using for the interface > >>>> would be helpful, uptime of the boxes. And configuration of the pool. > >>>> e.g. ( zpool status -a ;zfs get all ) You should probably > >>>> prop this information up somewhere so you can reference by URL whenever > >>>> needed. > >>>> > >>>> rsync(1) does not rely on scp(1) whatsoever but rsync(1) can be made to > >>>> use ssh(1) instead of rsh(1) and I believe that is what Jeremy is > >>>> stating here but correct me if I am wrong. It does use ssh(1) by > >>>> default. > >>>> > >>>> Its a possiblity as well that if using tmpfs(5) or mdmfs(8) for /tmp > >>>> type filesystems that rsync(1) may be just filling up your temp ram area > >>>> and causing the connection abort which would be expected. ( df -h ) would > >>>> help here. > >>> > >>> Hello, > >>> > >>> I'm not using tmpfs/mdmfs at all. The clients yesterday were 3 different OSX computers (over gigabit). The FreeBSD server has 12gb of ram and no bce adapter. For what it's worth, the server is backed up remotely every night with rsync (remote FreeBSD uses rsync to pull) to an offsite (slow cable connection) FreeBSD computer, and I have not seen any errors in the nightly rsync. 
> >>> > >>> Sorry for the omission of networking info, here's the output of the requested commands and some that popped up in the other thread: > >>> > >>> http://www.cap-press.com/misc/ > >>> > >>> In rc.conf: ifconfig_em1="inet 10.1.1.1 netmask 255.255.0.0" > >>> > >>> Scott Just to make it crystal clear to everyone: There is no correlation between this problem and use of ZFS. People are attempting to correlate "cannot allocate memory" messages with "anything on the system that uses memory". The VM is much more complex than that. Given the nature of this problem, it's much more likely the issue is "somewhere" within a networking layer within FreeBSD, whether it be driver-level or some sort of intermediary layer. Two people who have this issue in this thread are both using VirtualBox. Can one, or both, of you remove VirtualBox from the configuration entirely (kernel, etc. -- not sure what is required) and then see if the issue goes away? Thanks. -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, US | | Making life hard for others since 1977. 
PGP 4BD6C0CB | From owner-freebsd-stable@FreeBSD.ORG Tue Jul 5 22:47:26 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 61C3B1065675 for ; Tue, 5 Jul 2011 22:47:26 +0000 (UTC) (envelope-from freebsd-stable@m.gmane.org) Received: from lo.gmane.org (lo.gmane.org [80.91.229.12]) by mx1.freebsd.org (Postfix) with ESMTP id E3CED8FC13 for ; Tue, 5 Jul 2011 22:47:25 +0000 (UTC) Received: from list by lo.gmane.org with local (Exim 4.69) (envelope-from ) id 1QeEOq-0005BX-7w for freebsd-stable@freebsd.org; Wed, 06 Jul 2011 00:47:24 +0200 Received: from dtmd-4db2d4d7.pool.mediaways.net ([77.178.212.215]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Wed, 06 Jul 2011 00:47:24 +0200 Received: from christian.baer by dtmd-4db2d4d7.pool.mediaways.net with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Wed, 06 Jul 2011 00:47:24 +0200 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-stable@freebsd.org From: Christian Baer Date: Wed, 06 Jul 2011 00:47:12 +0200 Lines: 53 Message-ID: References: <52F39CE0-EEC7-4180-8186-BF8696AF279D@lassitu.de> <20110618175215.GA18645@icarus.home.lan> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Complaints-To: usenet@dough.gmane.org X-Gmane-NNTP-Posting-Host: dtmd-4db2d4d7.pool.mediaways.net User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.16) Gecko/20101125 Lightning/1.0b1 Thunderbird/3.0.11 In-Reply-To: Subject: Re: Crashes with Promise controller X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Jul 2011 22:47:26 -0000 On 05.07.2011 17:48, George Kontostanos wrote: > I am not sure if it is the same controller: > 
http://www.freebsd.org/cgi/query-pr.cgi?pr=158268 Sure is. Here from my dmesg: atapci1: port 0xdc00-0xdc7f,0xe000-0xe0ff mem 0xd3461000-0xd3461fff,0xd3420000-0xd343ffff irq 11 at device 8.0 on pci0 atapci1: [ITHREAD] atapci1: [ITHREAD] ata2: on atapci1 ata2: SIGNATURE: 00000101 ata2: [ITHREAD] ata3: on atapci1 ata3: SIGNATURE: 00000101 ata3: [ITHREAD] ata4: on atapci1 ata4: SIGNATURE: 00000101 ata4: [ITHREAD] ata5: on atapci1 ata5: SIGNATURE: 00000101 ata5: [ITHREAD] fxp0: port 0xe400-0xe43f mem 0xd3460000-0xd3460fff,0xd3440000-0xd345ffff irq 10 at device 9.0 on pci0 miibus0: on fxp0 inphy0: PHY 1 on miibus0 inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto, auto-flow fxp0: Ethernet address: 00:02:b3:ad:ad:d7 fxp0: [ITHREAD] atapci2: port 0xe800-0xe87f,0xec00-0xecff mem 0xd3462000-0xd3462fff,0xd3400000-0xd341ffff irq 11 at device 12.0 on pci0 atapci2: [ITHREAD] atapci2: [ITHREAD] ata6: on atapci2 ata6: SIGNATURE: 00000101 ata6: [ITHREAD] ata7: on atapci2 ata7: SIGNATURE: 00000101 ata7: [ITHREAD] ata8: on atapci2 ata8: SIGNATURE: 00000101 ata8: [ITHREAD] ata9: on atapci2 ata9: SIGNATURE: 00000101 ata9: [ITHREAD] Best regards, Chris From owner-freebsd-stable@FreeBSD.ORG Tue Jul 5 23:00:40 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B0B8E106566B for ; Tue, 5 Jul 2011 23:00:40 +0000 (UTC) (envelope-from gkontos.mail@gmail.com) Received: from mail-iy0-f182.google.com (mail-iy0-f182.google.com [209.85.210.182]) by mx1.freebsd.org (Postfix) with ESMTP id 7D9278FC14 for ; Tue, 5 Jul 2011 23:00:40 +0000 (UTC) Received: by iyb11 with SMTP id 11so7525306iyb.13 for ; Tue, 05 Jul 2011 16:00:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; 
bh=vdViezRM0YrJVI3zLAkTOqWZRIIF8U07j2ByhDKzO9c=; b=XLBWLw6S/p8ianxR4NCFa1TjgzOjImsK8S575TB4ne5gvlTP7WUx9DoUYw86j06kvG IdA8ovarDZt+lKKIFJXmLBxhGK9nQxQYeejj/7u6N13kObySe0B3Z9kl6dPYIQwbV9lL V044bYLMt8X7nk6vdcAS5QYUNwHILqljqSO+0= MIME-Version: 1.0 Received: by 10.231.31.129 with SMTP id y1mr6917802ibc.138.1309906839336; Tue, 05 Jul 2011 16:00:39 -0700 (PDT) Received: by 10.231.15.205 with HTTP; Tue, 5 Jul 2011 16:00:39 -0700 (PDT) In-Reply-To: References: <52F39CE0-EEC7-4180-8186-BF8696AF279D@lassitu.de> <20110618175215.GA18645@icarus.home.lan> Date: Wed, 6 Jul 2011 02:00:39 +0300 Message-ID: From: George Kontostanos To: Christian Baer Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-stable@freebsd.org Subject: Re: Crashes with Promise controller X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Jul 2011 23:00:40 -0000 On Wed, Jul 6, 2011 at 1:47 AM, Christian Baer wrote: > On 05.07.2011 17:48, George Kontostanos wrote: > >> I am not sure if it is the same controller: >> http://www.freebsd.org/cgi/query-pr.cgi?pr=158268 > > Sure is.
Here from my dmesg: > > atapci1: port > 0xdc00-0xdc7f,0xe000-0xe0ff mem > 0xd3461000-0xd3461fff,0xd3420000-0xd343ffff irq 11 at device 8.0 on > pci0 > atapci1: [ITHREAD] > atapci1: [ITHREAD] > ata2: on atapci1 > ata2: SIGNATURE: 00000101 > ata2: [ITHREAD] > ata3: on atapci1 > ata3: SIGNATURE: 00000101 > ata3: [ITHREAD] > ata4: on atapci1 > ata4: SIGNATURE: 00000101 > ata4: [ITHREAD] > ata5: on atapci1 > ata5: SIGNATURE: 00000101 > ata5: [ITHREAD] > fxp0: port 0xe400-0xe43f mem > 0xd3460000-0xd3460fff,0xd3440000-0xd345ffff irq 10 at device 9.0 on pci0 > miibus0: on fxp0 > inphy0: PHY 1 on miibus0 > inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto, auto-flow > fxp0: Ethernet address: 00:02:b3:ad:ad:d7 > fxp0: [ITHREAD] > atapci2: port > 0xe800-0xe87f,0xec00-0xecff mem > 0xd3462000-0xd3462fff,0xd3400000-0xd341ffff irq 11 at device 12.0 on > pci0 > atapci2: [ITHREAD] > atapci2: [ITHREAD] > ata6: on atapci2 > ata6: SIGNATURE: 00000101 > ata6: [ITHREAD] > ata7: on atapci2 > ata7: SIGNATURE: 00000101 > ata7: [ITHREAD] > ata8: on atapci2 > ata8: SIGNATURE: 00000101 > ata8: [ITHREAD] > ata9: on atapci2 > ata9: SIGNATURE: 00000101 > ata9: [ITHREAD] > > Best regards, > Chris > > _______________________________________________ > freebsd-stable@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-stable > To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org" > There are a lot of people I know that have similar issues. It has caused me to replace 3 disks so far. I am afraid that this controller should be marked as junk.
Regards, -- George Kontostanos aisecure.net From owner-freebsd-stable@FreeBSD.ORG Wed Jul 6 01:11:48 2011 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9FEE5106566B; Wed, 6 Jul 2011 01:11:48 +0000 (UTC) (envelope-from tinderbox@freebsd.org) Received: from freebsd-stable.sentex.ca (freebsd-stable.sentex.ca [IPv6:2607:f3e0:0:3::6502:9b]) by mx1.freebsd.org (Postfix) with ESMTP id 64A478FC08; Wed, 6 Jul 2011 01:11:48 +0000 (UTC) Received: from freebsd-stable.sentex.ca (localhost [127.0.0.1]) by freebsd-stable.sentex.ca (8.14.4/8.14.4) with ESMTP id p661BlZV079433; Wed, 6 Jul 2011 01:11:47 GMT (envelope-from tinderbox@freebsd.org) Received: (from tinderbox@localhost) by freebsd-stable.sentex.ca (8.14.4/8.14.4/Submit) id p661BlTc079428; Wed, 6 Jul 2011 01:11:47 GMT (envelope-from tinderbox@freebsd.org) Date: Wed, 6 Jul 2011 01:11:47 GMT Message-Id: <201107060111.p661BlTc079428@freebsd-stable.sentex.ca> X-Authentication-Warning: freebsd-stable.sentex.ca: tinderbox set sender to FreeBSD Tinderbox using -f Sender: FreeBSD Tinderbox From: FreeBSD Tinderbox To: FreeBSD Tinderbox , , Precedence: bulk Cc: Subject: [releng_8 tinderbox] failure on i386/i386 X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Jul 2011 01:11:48 -0000 TB --- 2011-07-05 23:50:08 - tinderbox 2.6 running on freebsd-stable.sentex.ca TB --- 2011-07-05 23:50:08 - starting RELENG_8 tinderbox run for i386/i386 TB --- 2011-07-05 23:50:08 - cleaning the object tree TB --- 2011-07-05 23:50:31 - cvsupping the source tree TB --- 2011-07-05 23:50:31 - /usr/bin/csup -z -r 3 -g -L 1 -h cvsup.sentex.ca /tinderbox/RELENG_8/i386/i386/supfile TB --- 2011-07-05 23:51:18 - building world TB --- 2011-07-05 23:51:18 - MAKEOBJDIRPREFIX=/obj
TB --- 2011-07-05 23:51:18 - PATH=/usr/bin:/usr/sbin:/bin:/sbin TB --- 2011-07-05 23:51:18 - TARGET=i386 TB --- 2011-07-05 23:51:18 - TARGET_ARCH=i386 TB --- 2011-07-05 23:51:18 - TZ=UTC TB --- 2011-07-05 23:51:18 - __MAKE_CONF=/dev/null TB --- 2011-07-05 23:51:18 - cd /src TB --- 2011-07-05 23:51:18 - /usr/bin/make -B buildworld >>> World build started on Tue Jul 5 23:51:19 UTC 2011 >>> Rebuilding the temporary build tree >>> stage 1.1: legacy release compatibility shims >>> stage 1.2: bootstrap tools >>> stage 2.1: cleaning up the object tree >>> stage 2.2: rebuilding the object tree >>> stage 2.3: build tools >>> stage 3: cross tools >>> stage 4.1: building includes >>> stage 4.2: building libraries >>> stage 4.3: make dependencies >>> stage 4.4: building everything >>> World build completed on Wed Jul 6 00:55:48 UTC 2011 TB --- 2011-07-06 00:55:48 - generating LINT kernel config TB --- 2011-07-06 00:55:48 - cd /src/sys/i386/conf TB --- 2011-07-06 00:55:48 - /usr/bin/make -B LINT TB --- 2011-07-06 00:55:48 - building LINT kernel TB --- 2011-07-06 00:55:48 - MAKEOBJDIRPREFIX=/obj TB --- 2011-07-06 00:55:48 - PATH=/usr/bin:/usr/sbin:/bin:/sbin TB --- 2011-07-06 00:55:48 - TARGET=i386 TB --- 2011-07-06 00:55:48 - TARGET_ARCH=i386 TB --- 2011-07-06 00:55:48 - TZ=UTC TB --- 2011-07-06 00:55:48 - __MAKE_CONF=/dev/null TB --- 2011-07-06 00:55:48 - cd /src TB --- 2011-07-06 00:55:48 - /usr/bin/make -B buildkernel KERNCONF=LINT >>> Kernel build for LINT started on Wed Jul 6 00:55:48 UTC 2011 >>> stage 1: configuring the kernel >>> stage 2.1: cleaning up the object tree >>> stage 2.2: rebuilding the object tree >>> stage 2.3: build tools >>> stage 3.1: making dependencies >>> stage 3.2: building everything [...] 
:> hack.c cc -shared -nostdlib hack.c -o hack.So rm -f hack.c MAKE=/usr/bin/make sh /src/sys/conf/newvers.sh LINT cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -ffreestanding -fstack-protector -Werror -pg -mprofiler-epilogue vers.c linking kernel ld: kernel: Not enough room for program headers (allocated 5, need 6) ld: final link failed: Bad value *** Error code 1 Stop in /obj/i386/src/sys/LINT. *** Error code 1 Stop in /src. *** Error code 1 Stop in /src. TB --- 2011-07-06 01:11:47 - WARNING: /usr/bin/make returned exit code 1 TB --- 2011-07-06 01:11:47 - ERROR: failed to build lint kernel TB --- 2011-07-06 01:11:47 - 3589.88 user 482.71 system 4899.65 real http://tinderbox.freebsd.org/tinderbox-releng_8-RELENG_8-i386-i386.full From owner-freebsd-stable@FreeBSD.ORG Wed Jul 6 02:23:41 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B6B42106564A for ; Wed, 6 Jul 2011 02:23:41 +0000 (UTC) (envelope-from Peter.Ross@bogen.in-berlin.de) Received: from einhorn.in-berlin.de (einhorn.in-berlin.de [192.109.42.8]) by mx1.freebsd.org (Postfix) with ESMTP id 2634C8FC0A for ; Wed, 6 Jul 2011 02:23:40 +0000 (UTC) X-Envelope-From: Peter.Ross@bogen.in-berlin.de Received: from localhost (okapi.in-berlin.de [192.109.42.117]) by einhorn.in-berlin.de (8.13.6/8.13.6/Debian-1) with ESMTP id p662NdJM031914; Wed, 6 Jul 2011 04:23:39 +0200 
Received: from 124-254-118-24-static.bb.ispone.net.au (124-254-118-24-static.bb.ispone.net.au [124.254.118.24]) by webmail.in-berlin.de (Horde Framework) with HTTP; Wed, 06 Jul 2011 12:23:39 +1000 Message-ID: <20110706122339.61453nlqra1vqsrv@webmail.in-berlin.de> Date: Wed, 06 Jul 2011 12:23:39 +1000 From: "Peter Ross" To: "Jeremy Chadwick" MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; DelSp="Yes"; format="flowed" Content-Disposition: inline Content-Transfer-Encoding: quoted-printable User-Agent: Internet Messaging Program (IMP) 4.3.3 X-Scanned-By: MIMEDefang_at_IN-Berlin_e.V. on 192.109.42.8 Cc: freebsd-stable List , Scott Sipe Subject: Re: scp: Write Failed: Cannot allocate memory X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Jul 2011 02:23:41 -0000 Quoting "Jeremy Chadwick" : > On Tue, Jul 05, 2011 at 01:03:20PM -0400, Scott Sipe wrote: >> I'm running virtualbox 3.2.12_1 if that has anything to do with it. >> >> sysctl vfs.zfs.arc_max: 6200000000 >> >> While I'm trying to scp, kstat.zfs.misc.arcstats.size is hovering >> right around that value, sometimes above, sometimes below (that's >> as it should be, right?). I don't think that it dies when crossing >> over arc_max. I can run the same scp 10 times and it might fail 1-3 >> times, with no correlation to the arcstats.size being above/below >> arc_max that I can see. >> >> Scott >> >> On Jul 5, 2011, at 3:00 AM, Peter Ross wrote: >> >>> Hi all, >>> >>> just as an addition: an upgrade to last Friday's FreeBSD-Stable >>> and to VirtualBox 4.0.8 does not fix the problem. >>> >>> I will experiment a bit more tomorrow after hours and grab some statistics. >>> >>> Regards >>> Peter >>> >>> Quoting "Peter Ross" : >>> >>>> Hi all, >>>> >>>> I noticed a similar problem last week.
It is also very similar to >>>> one reported last year: >>>> >>>> http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/058708.html >>>> >>>> My server is a Dell T410 server with the same bge card (the same >>>> pciconf -lvc output as described by Mahlon: >>>> >>>> http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/058711.html >>>> >>>> Yours, Scott, is a em(4).. >>>> >>>> Another similarity: In all cases we are using VirtualBox. I just >>>> want to mention it, in case it matters. I am still running >>>> VirtualBox 3.2. >>>> >>>> Most of the time kstat.zfs.misc.arcstats.size was reaching >>>> vfs.zfs.arc_max then, but I could catch one or two cases then the >>>> value was still below. >>>> >>>> I added vfs.zfs.prefetch_disable=1 to sysctl.conf but it does not help. >>>> >>>> BTW: It looks as ARC only gives back the memory when I destroy >>>> the ZFS (a cloned snapshot containing virtual machines). Even if >>>> nothing happens for hours the buffer isn't released.. >>>> >>>> My machine was still running 8.2-PRERELEASE so I am upgrading. >>>> >>>> I am happy to give information gathered on old/new kernel if it helps. >>>> >>>> Regards >>>> Peter >>>> >>>> Quoting "Scott Sipe" : >>>> >>>>> >>>>> On Jul 2, 2011, at 12:54 AM, jhell wrote: >>>>> >>>>>> On Fri, Jul 01, 2011 at 03:22:32PM -0700, Jeremy Chadwick wrote: >>>>>>> On Fri, Jul 01, 2011 at 03:13:17PM -0400, Scott Sipe wrote: >>>>>>>> I'm running 8.2-RELEASE and am having new problems with scp. >>>>>>>> When scping >>>>>>>> files to a ZFS directory on the FreeBSD server -- most >>>>>>>> notably large files >>>>>>>> -- the transfer frequently dies after just a few seconds. In >>>>>>>> my last test, I >>>>>>>> tried to scp an 800mb file to the FreeBSD system and the >>>>>>>> transfer died after >>>>>>>> 200mb. It completely copied the next 4 times I tried, and >>>>>>>> then died again on >>>>>>>> the next attempt.
>>>>>>>> >>>>>>>> On the client side: >>>>>>>> >>>>>>>> "Connection to home closed by remote host. >>>>>>>> lost connection" >>>>>>>> >>>>>>>> In /var/log/auth.log: >>>>>>>> >>>>>>>> Jul 1 14:54:42 freebsd sshd[18955]: fatal: Write failed: >>>>>>>> Cannot allocate >>>>>>>> memory >>>>>>>> >>>>>>>> I've never seen this before and have used scp before to >>>>>>>> transfer large files >>>>>>>> without problems. This computer has been used in production >>>>>>>> for months and >>>>>>>> has a current uptime of 36 days. I have not been able to >>>>>>>> notice any problems >>>>>>>> copying files to the server via samba or netatalk, or any problems in >>>>>>>> apache. >>>>>>>> >>>>>>>> Uname: >>>>>>>> >>>>>>>> FreeBSD xeon 8.2-RELEASE FreeBSD 8.2-RELEASE #0: Sat Feb 19 >>>>>>>> 01:02:54 EST >>>>>>>> 2011 root@xeon:/usr/obj/usr/src/sys/GENERIC amd64 >>>>>>>> >>>>>>>> I've attached my dmesg and output of vmstat -z. >>>>>>>> >>>>>>>> I have not restarted the sshd daemon or rebooted the computer. >>>>>>>> >>>>>>>> Am glad to provide any other information or test anything else. >>>>>>>> >>>>>>>> {snip vmstat -z and dmesg} >>>>>>> >>>>>>> You didn't provide details about your networking setup (rc.conf, >>>>>>> ifconfig -a, etc.). netstat -m would be useful too. >>>>>>> >>>>>>> Next, please see this thread circa September 2010, titled "Network >>>>>>> memory allocation failures": >>>>>>> >>>>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/thread.html#58708 >>>>>>> >>>>>>> The user in that thread is using rsync, which relies on scp by default. >>>>>>> I believe this problem is similar, if not identical, to yours. >>>>>>> >>>>>> >>>>>> Please also provide your output of ( /usr/bin/limits -a ) for the server >>>>>> end and the client. >>>>>> >>>>>> I am not quite sure I agree with the need for ifconfig -a but some >>>>>> information about the networking driver your using for the interface >>>>>> would be helpful, uptime of the boxes.
And configuration of the pool. >>>>>> e.g. ( zpool status -a ;zfs get all ) You should probably >>>>>> prop this information up somewhere so you can reference by URL whenever >>>>>> needed. >>>>>> >>>>>> rsync(1) does not rely on scp(1) whatsoever but rsync(1) can be made to >>>>>> use ssh(1) instead of rsh(1) and I believe that is what Jeremy is >>>>>> stating here but correct me if I am wrong. It does use ssh(1) by >>>>>> default. >>>>>> >>>>>> Its a possiblity as well that if using tmpfs(5) or mdmfs(8) for /tmp >>>>>> type filesystems that rsync(1) may be just filling up your temp ram area >>>>>> and causing the connection abort which would be expected. ( df >>>>>> -h ) would >>>>>> help here. >>>>> >>>>> Hello, >>>>> >>>>> I'm not using tmpfs/mdmfs at all. The clients yesterday were 3 >>>>> different OSX computers (over gigabit). The FreeBSD server has >>>>> 12gb of ram and no bce adapter. For what it's worth, the server >>>>> is backed up remotely every night with rsync (remote FreeBSD >>>>> uses rsync to pull) to an offsite (slow cable connection) >>>>> FreeBSD computer, and I have not seen any errors in the nightly >>>>> rsync. >>>>> >>>>> Sorry for the omission of networking info, here's the output of >>>>> the requested commands and some that popped up in the other >>>>> thread: >>>>> >>>>> http://www.cap-press.com/misc/ >>>>> >>>>> In rc.conf: ifconfig_em1="inet 10.1.1.1 netmask 255.255.0.0" >>>>> >>>>> Scott > > Just to make it crystal clear to everyone: > > There is no correlation between this problem and use of ZFS. People are > attempting to correlate "cannot allocate memory" messages with "anything > on the system that uses memory". The VM is much more complex than that. > > Given the nature of this problem, it's much more likely the issue is > "somewhere" within a networking layer within FreeBSD, whether it be > driver-level or some sort of intermediary layer.
> > Two people who have this issue in this thread are both using VirtualBox. > Can one, or both, of you remove VirtualBox from the configuration > entirely (kernel, etc. -- not sure what is required) and then see if the > issue goes away? On the machine in question I only can do it after hours so I will do it tonight. I was _successfully_ sending the file over the loopback interface using cat /zpool/temp/zimbra_oldroot.vdi | ssh localhost "cat > /dev/null" I did it, btw, with the IPv6 localhost address first (accidently), and then using IPv4. Both worked. It always fails if I am sending it through the bce(4) interface, even if my target is the VirtualBox bridged to the bce card (so it does not "leave" the computer physically). Below the uname -a, ifconfig -a, netstat -rn, pciconf -lv and kldstat output. I have another box where I do not see that problem. It copies files happily over the net using ssh. It is an an older HP ML 150 with 3GB RAM only but with a bge(4) driver instead. It runs the same last week's RELENG_8. I installed VirtualBox and enabled vboxnet (so it loads the kernel modules). But I do not run VirtualBox on it (because it hasn't enough RAM).
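The loopback-versus-bce(4) comparison above could be repeated in a loop to get failure counts rather than single observations. A minimal sketch, assuming a POSIX sh; the function name run_trials and the host name otherhost are hypothetical and not taken from the thread:

```sh
#!/bin/sh
# run_trials N CMD [ARGS...]
# Run CMD N times, discard its output, and print a tally of
# successes and failures based on the exit status of each run.
# (run_trials is a hypothetical helper, not an existing tool.)
run_trials() {
    n=$1
    shift
    ok=0
    fail=0
    i=0
    while [ "$i" -lt "$n" ]; do
        if "$@" >/dev/null 2>&1; then
            ok=$((ok + 1))
        else
            fail=$((fail + 1))
        fi
        i=$((i + 1))
    done
    echo "ok=$ok fail=$fail"
}

# Example invocations (left commented out; "otherhost" is a placeholder
# for a machine reached through the bce(4) interface):
# run_trials 10 sh -c 'cat /zpool/temp/zimbra_oldroot.vdi | ssh localhost "cat > /dev/null"'
# run_trials 10 sh -c 'cat /zpool/temp/zimbra_oldroot.vdi | ssh otherhost "cat > /dev/null"'
```

Capturing netstat -m before and after each batch, as requested earlier in the thread, would show whether mbuf requests were denied while the transfers ran.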
Regards Peter DellT410one# uname -a FreeBSD DellT410one.vv.fda 8.2-STABLE FreeBSD 8.2-STABLE #1: Thu Jun 30 17:07:18 EST 2011 root@DellT410one.vv.fda:/usr/obj/usr/src/sys/GENERIC amd64 DellT410one# ifconfig -a bce0: flags=8943 metric 0 mtu 1500 options=c01bb ether 84:2b:2b:68:64:e4 inet 192.168.50.220 netmask 0xffffff00 broadcast 192.168.50.255 inet 192.168.50.221 netmask 0xffffff00 broadcast 192.168.50.255 inet 192.168.50.223 netmask 0xffffff00 broadcast 192.168.50.255 inet 192.168.50.224 netmask 0xffffff00 broadcast 192.168.50.255 inet 192.168.50.225 netmask 0xffffff00 broadcast 192.168.50.255 inet 192.168.50.226 netmask 0xffffff00 broadcast 192.168.50.255 inet 192.168.50.227 netmask 0xffffff00 broadcast 192.168.50.255 inet 192.168.50.219 netmask 0xffffff00 broadcast 192.168.50.255 media: Ethernet autoselect (1000baseT ) status: active bce1: flags=8802 metric 0 mtu 1500 options=c01bb ether 84:2b:2b:68:64:e5 media: Ethernet autoselect lo0: flags=8049 metric 0 mtu 16384 options=3 inet6 fe80::1%lo0 prefixlen 64 scopeid 0xb inet6 ::1 prefixlen 128 inet 127.0.0.1 netmask 0xff000000 nd6 options=3 vboxnet0: flags=8802 metric 0 mtu 1500 ether 0a:00:27:00:00:00 DellT410one# netstat -rn Routing tables Internet: Destination Gateway Flags Refs Use Netif Expire default 192.168.50.201 UGS 0 52195 bce0 127.0.0.1 link#11 UH 0 6 lo0 192.168.50.0/24 link#1 U 0 1118212 bce0 192.168.50.219 link#1 UHS 0 9670 lo0 192.168.50.220 link#1 UHS 0 8347 lo0 192.168.50.221 link#1 UHS 0 103024 lo0 192.168.50.223 link#1 UHS 0 43614 lo0 192.168.50.224 link#1 UHS 0 8358 lo0 192.168.50.225 link#1 UHS 0 8438 lo0 192.168.50.226 link#1 UHS 0 8338 lo0 192.168.50.227 link#1 UHS 0 8333 lo0 192.168.165.0/24 192.168.50.200 UGS 0 3311 bce0 192.168.166.0/24 192.168.50.200 UGS 0 699 bce0 192.168.167.0/24 192.168.50.200 UGS 0 3012 bce0 192.168.168.0/24 192.168.50.200 UGS 0 552 bce0 Internet6: Destination Gateway Flags
Netif Expire ::1 ::1 UH lo0 fe80::%lo0/64 link#11 U lo0 fe80::1%lo0 link#11 UHS lo0 ff01::%lo0/32 fe80::1%lo0 U lo0 ff02::%lo0/32 fe80::1%lo0 U lo0 DellT410one# kldstat Id Refs Address Size Name 1 19 0xffffffff80100000 dbf5d0 kernel 2 3 0xffffffff80ec0000 4c358 vboxdrv.ko 3 1 0xffffffff81012000 131998 zfs.ko 4 1 0xffffffff81144000 1ff1 opensolaris.ko 5 2 0xffffffff81146000 2940 vboxnetflt.ko 6 2 0xffffffff81149000 8e38 netgraph.ko 7 1 0xffffffff81152000 153c ng_ether.ko 8 1 0xffffffff81154000 e70 vboxnetadp.ko DellT410one# pciconf -lv .. bce0@pci0:1:0:0: class=0x020000 card=0x028d1028 chip=0x163b14e4 rev=0x20 hdr=0x00 vendor = 'Broadcom Corporation' class = network subclass = ethernet bce1@pci0:1:0:1: class=0x020000 card=0x028d1028 chip=0x163b14e4 rev=0x20 hdr=0x00 vendor = 'Broadcom Corporation' class = network subclass = ethernet From owner-freebsd-stable@FreeBSD.ORG Wed Jul 6 02:33:01 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BD4671065673 for ; Wed, 6 Jul 2011 02:33:01 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta14.emeryville.ca.mail.comcast.net (qmta14.emeryville.ca.mail.comcast.net [76.96.27.212]) by mx1.freebsd.org (Postfix) with ESMTP id A3AC48FC08 for ; Wed, 6 Jul 2011 02:33:01 +0000 (UTC) Received: from omta11.emeryville.ca.mail.comcast.net ([76.96.30.36]) by qmta14.emeryville.ca.mail.comcast.net with comcast id 4S781h0050mlR8UAESYxml; Wed, 06 Jul 2011 02:32:59 +0000 Received: from koitsu.dyndns.org ([67.180.84.87]) by omta11.emeryville.ca.mail.comcast.net with comcast id 4SYj1h0261t3BNj8XSYqn7; Wed, 06 Jul 2011 02:32:56 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 9F007102C36; Tue, 5 Jul 2011 19:32:34 -0700 (PDT) Date: Tue, 5 Jul 2011 19:32:34 -0700 From: Jeremy Chadwick To: Peter Ross Message-ID:
<20110706023234.GA72048@icarus.home.lan> References: <20110706122339.61453nlqra1vqsrv@webmail.in-berlin.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20110706122339.61453nlqra1vqsrv@webmail.in-berlin.de> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-stable List , Scott Sipe Subject: Re: scp: Write Failed: Cannot allocate memory X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Jul 2011 02:33:01 -0000 On Wed, Jul 06, 2011 at 12:23:39PM +1000, Peter Ross wrote: > Quoting "Jeremy Chadwick" : > > >On Tue, Jul 05, 2011 at 01:03:20PM -0400, Scott Sipe wrote: > >>I'm running virtualbox 3.2.12_1 if that has anything to do with it. > >> > >>sysctl vfs.zfs.arc_max: 6200000000 > >> > >>While I'm trying to scp, kstat.zfs.misc.arcstats.size is > >>hovering right around that value, sometimes above, sometimes > >>below (that's as it should be, right?). I don't think that it > >>dies when crossing over arc_max. I can run the same scp 10 times > >>and it might fail 1-3 times, with no correlation to the > >>arcstats.size being above/below arc_max that I can see. > >> > >>Scott > >> > >>On Jul 5, 2011, at 3:00 AM, Peter Ross wrote: > >> > >>>Hi all, > >>> > >>>just as an addition: an upgrade to last Friday's > >>>FreeBSD-Stable and to VirtualBox 4.0.8 does not fix the > >>>problem. > >>> > >>>I will experiment a bit more tomorrow after hours and grab some statistics. > >>> > >>>Regards > >>>Peter > >>> > >>>Quoting "Peter Ross" : > >>> > >>>>Hi all, > >>>> > >>>>I noticed a similar problem last week. 
It is also very > >>>>similar to one reported last year: > >>>> > >>>>http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/058708.html > >>>> > >>>>My server is a Dell T410 server with the same bge card (the > >>>>same pciconf -lvc output as described by Mahlon: > >>>> > >>>>http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/058711.html > >>>> > >>>>Yours, Scott, is a em(4).. > >>>> > >>>>Another similarity: In all cases we are using VirtualBox. I > >>>>just want to mention it, in case it matters. I am still > >>>>running VirtualBox 3.2. > >>>> > >>>>Most of the time kstat.zfs.misc.arcstats.size was reaching > >>>>vfs.zfs.arc_max then, but I could catch one or two cases > >>>>then the value was still below. > >>>> > >>>>I added vfs.zfs.prefetch_disable=1 to sysctl.conf but it does not help. > >>>> > >>>>BTW: It looks as ARC only gives back the memory when I > >>>>destroy the ZFS (a cloned snapshot containing virtual > >>>>machines). Even if nothing happens for hours the buffer > >>>>isn't released.. > >>>> > >>>>My machine was still running 8.2-PRERELEASE so I am upgrading. > >>>> > >>>>I am happy to give information gathered on old/new kernel if it helps. > >>>> > >>>>Regards > >>>>Peter > >>>> > >>>>Quoting "Scott Sipe" : > >>>> > >>>>> > >>>>>On Jul 2, 2011, at 12:54 AM, jhell wrote: > >>>>> > >>>>>>On Fri, Jul 01, 2011 at 03:22:32PM -0700, Jeremy Chadwick wrote: > >>>>>>>On Fri, Jul 01, 2011 at 03:13:17PM -0400, Scott Sipe wrote: > >>>>>>>>I'm running 8.2-RELEASE and am having new problems > >>>>>>>>with scp. When scping > >>>>>>>>files to a ZFS directory on the FreeBSD server -- > >>>>>>>>most notably large files > >>>>>>>>-- the transfer frequently dies after just a few > >>>>>>>>seconds. In my last test, I > >>>>>>>>tried to scp an 800mb file to the FreeBSD system and > >>>>>>>>the transfer died after > >>>>>>>>200mb. It completely copied the next 4 times I > >>>>>>>>tried, and then died again on > >>>>>>>>the next attempt. 
> >>>>>>>> > >>>>>>>>On the client side: > >>>>>>>> > >>>>>>>>"Connection to home closed by remote host. > >>>>>>>>lost connection" > >>>>>>>> > >>>>>>>>In /var/log/auth.log: > >>>>>>>> > >>>>>>>>Jul 1 14:54:42 freebsd sshd[18955]: fatal: Write > >>>>>>>>failed: Cannot allocate > >>>>>>>>memory > >>>>>>>> > >>>>>>>>I've never seen this before and have used scp before > >>>>>>>>to transfer large files > >>>>>>>>without problems. This computer has been used in > >>>>>>>>production for months and > >>>>>>>>has a current uptime of 36 days. I have not been > >>>>>>>>able to notice any problems > >>>>>>>>copying files to the server via samba or netatalk, or any problems in > >>>>>>>>apache. > >>>>>>>> > >>>>>>>>Uname: > >>>>>>>> > >>>>>>>>FreeBSD xeon 8.2-RELEASE FreeBSD 8.2-RELEASE #0: Sat > >>>>>>>>Feb 19 01:02:54 EST > >>>>>>>>2011 root@xeon:/usr/obj/usr/src/sys/GENERIC amd64 > >>>>>>>> > >>>>>>>>I've attached my dmesg and output of vmstat -z. > >>>>>>>> > >>>>>>>>I have not restarted the sshd daemon or rebooted the computer. > >>>>>>>> > >>>>>>>>Am glad to provide any other information or test anything else. > >>>>>>>> > >>>>>>>>{snip vmstat -z and dmesg} > >>>>>>> > >>>>>>>You didn't provide details about your networking setup (rc.conf, > >>>>>>>ifconfig -a, etc.). netstat -m would be useful too. > >>>>>>> > >>>>>>>Next, please see this thread circa September 2010, titled "Network > >>>>>>>memory allocation failures": > >>>>>>> > >>>>>>>http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/thread.html#58708 > >>>>>>> > >>>>>>>The user in that thread is using rsync, which relies on scp by default. > >>>>>>>I believe this problem is similar, if not identical, to yours. > >>>>>>> > >>>>>> > >>>>>>Please also provide your output of ( /usr/bin/limits -a ) for the server > >>>>>>end and the client. 
> >>>>>> > >>>>>>I am not quite sure I agree with the need for ifconfig -a but some > >>>>>>information about the networking driver your using for the interface > >>>>>>would be helpful, uptime of the boxes. And configuration of the pool. > >>>>>>e.g. ( zpool status -a ;zfs get all ) You should probably > >>>>>>prop this information up somewhere so you can reference by URL whenever > >>>>>>needed. > >>>>>> > >>>>>>rsync(1) does not rely on scp(1) whatsoever but rsync(1) can be made to > >>>>>>use ssh(1) instead of rsh(1) and I believe that is what Jeremy is > >>>>>>stating here but correct me if I am wrong. It does use ssh(1) by > >>>>>>default. > >>>>>> > >>>>>>Its a possiblity as well that if using tmpfs(5) or mdmfs(8) for /tmp > >>>>>>type filesystems that rsync(1) may be just filling up your temp ram area > >>>>>>and causing the connection abort which would be > >>>>>>expected. ( df -h ) would > >>>>>>help here. > >>>>> > >>>>>Hello, > >>>>> > >>>>>I'm not using tmpfs/mdmfs at all. The clients yesterday > >>>>>were 3 different OSX computers (over gigabit). The FreeBSD > >>>>>server has 12gb of ram and no bce adapter. For what it's > >>>>>worth, the server is backed up remotely every night with > >>>>>rsync (remote FreeBSD uses rsync to pull) to an offsite > >>>>>(slow cable connection) FreeBSD computer, and I have not > >>>>>seen any errors in the nightly rsync. > >>>>> > >>>>>Sorry for the omission of networking info, here's the > >>>>>output of the requested commands and some that popped up > >>>>>in the other thread: > >>>>> > >>>>>http://www.cap-press.com/misc/ > >>>>> > >>>>>In rc.conf: ifconfig_em1="inet 10.1.1.1 netmask 255.255.0.0" > >>>>> > >>>>>Scott > > > >Just to make it crystal clear to everyone: > > > >There is no correlation between this problem and use of ZFS. People are > >attempting to correlate "cannot allocate memory" messages with "anything > >on the system that uses memory". The VM is much more complex than that. 
> > > >Given the nature of this problem, it's much more likely the issue is > >"somewhere" within a networking layer within FreeBSD, whether it be > >driver-level or some sort of intermediary layer. > > > >Two people who have this issue in this thread are both using VirtualBox. > >Can one, or both, of you remove VirtualBox from the configuration > >entirely (kernel, etc. -- not sure what is required) and then see if the > >issue goes away? > > On the machine in question I only can do it after hours so I will do > it tonight. > > I was _successfully_ sending the file over the loopback interface using > > cat /zpool/temp/zimbra_oldroot.vdi | ssh localhost "cat > /dev/null" > > I did it, btw, with the IPv6 localhost address first (accidently), > and then using IPv4. Both worked. > > It always fails if I am sending it through the bce(4) interface, > even if my target is the VirtualBox bridged to the bce card (so it > does not "leave" the computer physically). > > Below the uname -a, ifconfig -a, netstat -rn, pciconf -lv and kldstat output. > > I have another box where I do not see that problem. It copies files > happily over the net using ssh. > > It is an an older HP ML 150 with 3GB RAM only but with a bge(4) > driver instead. It runs the same last week's RELENG_8. I installed > VirtualBox and enabled vboxnet (so it loads the kernel modules). But > I do not run VirtualBox on it (because it hasn't enough RAM). 
> > Regards > Peter > > DellT410one# uname -a > FreeBSD DellT410one.vv.fda 8.2-STABLE FreeBSD 8.2-STABLE #1: Thu Jun > 30 17:07:18 EST 2011 > root@DellT410one.vv.fda:/usr/obj/usr/src/sys/GENERIC amd64 > DellT410one# ifconfig -a > bce0: flags=8943 > metric 0 mtu 1500 > options=c01bb > ether 84:2b:2b:68:64:e4 > inet 192.168.50.220 netmask 0xffffff00 broadcast 192.168.50.255 > inet 192.168.50.221 netmask 0xffffff00 broadcast 192.168.50.255 > inet 192.168.50.223 netmask 0xffffff00 broadcast 192.168.50.255 > inet 192.168.50.224 netmask 0xffffff00 broadcast 192.168.50.255 > inet 192.168.50.225 netmask 0xffffff00 broadcast 192.168.50.255 > inet 192.168.50.226 netmask 0xffffff00 broadcast 192.168.50.255 > inet 192.168.50.227 netmask 0xffffff00 broadcast 192.168.50.255 > inet 192.168.50.219 netmask 0xffffff00 broadcast 192.168.50.255 > media: Ethernet autoselect (1000baseT ) > status: active > bce1: flags=8802 metric 0 mtu 1500 > options=c01bb > ether 84:2b:2b:68:64:e5 > media: Ethernet autoselect > lo0: flags=8049 metric 0 mtu 16384 > options=3 > inet6 fe80::1%lo0 prefixlen 64 scopeid 0xb > inet6 ::1 prefixlen 128 > inet 127.0.0.1 netmask 0xff000000 > nd6 options=3 > vboxnet0: flags=8802 metric 0 mtu 1500 > ether 0a:00:27:00:00:00 > DellT410one# netstat -rn > Routing tables > > Internet: > Destination Gateway Flags Refs Use Netif Expire > default 192.168.50.201 UGS 0 52195 bce0 > 127.0.0.1 link#11 UH 0 6 lo0 > 192.168.50.0/24 link#1 U 0 1118212 bce0 > 192.168.50.219 link#1 UHS 0 9670 lo0 > 192.168.50.220 link#1 UHS 0 8347 lo0 > 192.168.50.221 link#1 UHS 0 103024 lo0 > 192.168.50.223 link#1 UHS 0 43614 lo0 > 192.168.50.224 link#1 UHS 0 8358 lo0 > 192.168.50.225 link#1 UHS 0 8438 lo0 > 192.168.50.226 link#1 UHS 0 8338 lo0 > 192.168.50.227 link#1 UHS 0 8333 lo0 > 192.168.165.0/24 192.168.50.200 UGS 0 3311 bce0 > 192.168.166.0/24 192.168.50.200 UGS 0 699 bce0 > 192.168.167.0/24 192.168.50.200 UGS 0 3012 bce0 > 192.168.168.0/24 192.168.50.200 UGS 0 552 bce0 > > Internet6: > 
Destination Gateway > Flags Netif Expire > ::1 ::1 UH > lo0 > fe80::%lo0/64 link#11 U > lo0 > fe80::1%lo0 link#11 UHS > lo0 > ff01::%lo0/32 fe80::1%lo0 U > lo0 > ff02::%lo0/32 fe80::1%lo0 U > lo0 > DellT410one# kldstat > Id Refs Address Size Name > 1 19 0xffffffff80100000 dbf5d0 kernel > 2 3 0xffffffff80ec0000 4c358 vboxdrv.ko > 3 1 0xffffffff81012000 131998 zfs.ko > 4 1 0xffffffff81144000 1ff1 opensolaris.ko > 5 2 0xffffffff81146000 2940 vboxnetflt.ko > 6 2 0xffffffff81149000 8e38 netgraph.ko > 7 1 0xffffffff81152000 153c ng_ether.ko > 8 1 0xffffffff81154000 e70 vboxnetadp.ko > DellT410one# pciconf -lv > .. > bce0@pci0:1:0:0: class=0x020000 card=0x028d1028 > chip=0x163b14e4 rev=0x20 hdr=0x00 > vendor = 'Broadcom Corporation' > class = network > subclass = ethernet > bce1@pci0:1:0:1: class=0x020000 card=0x028d1028 > chip=0x163b14e4 rev=0x20 hdr=0x00 > vendor = 'Broadcom Corporation' > class = network > subclass = ethernet Could you please provide "pciconf -lvcb" output instead, specific to the bce chips? Thanks. -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, US | | Making life hard for others since 1977. 
PGP 4BD6C0CB | From owner-freebsd-stable@FreeBSD.ORG Wed Jul 6 03:07:55 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E7687106566B for ; Wed, 6 Jul 2011 03:07:55 +0000 (UTC) (envelope-from Peter.Ross@bogen.in-berlin.de) Received: from einhorn.in-berlin.de (einhorn.in-berlin.de [192.109.42.8]) by mx1.freebsd.org (Postfix) with ESMTP id 285D98FC0C for ; Wed, 6 Jul 2011 03:07:54 +0000 (UTC) X-Envelope-From: Peter.Ross@bogen.in-berlin.de Received: from localhost (okapi.in-berlin.de [192.109.42.117]) by einhorn.in-berlin.de (8.13.6/8.13.6/Debian-1) with ESMTP id p6637rj5002190; Wed, 6 Jul 2011 05:07:53 +0200 Received: from 124-254-118-24-static.bb.ispone.net.au (124-254-118-24-static.bb.ispone.net.au [124.254.118.24]) by webmail.in-berlin.de (Horde Framework) with HTTP; Wed, 06 Jul 2011 13:07:53 +1000 Message-ID: <20110706130753.182053f3ellasn0p@webmail.in-berlin.de> Date: Wed, 06 Jul 2011 13:07:53 +1000 From: "Peter Ross" To: "Jeremy Chadwick" References: <20110706122339.61453nlqra1vqsrv@webmail.in-berlin.de> <20110706023234.GA72048@icarus.home.lan> In-Reply-To: <20110706023234.GA72048@icarus.home.lan> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; DelSp="Yes"; format="flowed" Content-Disposition: inline Content-Transfer-Encoding: quoted-printable User-Agent: Internet Messaging Program (IMP) 4.3.3 X-Scanned-By: MIMEDefang_at_IN-Berlin_e.V. 
on 192.109.42.8 Cc: freebsd-stable List , Scott Sipe Subject: Re: scp: Write Failed: Cannot allocate memory X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Jul 2011 03:07:56 -0000 Quoting "Jeremy Chadwick" : > On Wed, Jul 06, 2011 at 12:23:39PM +1000, Peter Ross wrote: >> Quoting "Jeremy Chadwick" : >> >> >On Tue, Jul 05, 2011 at 01:03:20PM -0400, Scott Sipe wrote: >> >>I'm running virtualbox 3.2.12_1 if that has anything to do with it. >> >> >> >>sysctl vfs.zfs.arc_max: 6200000000 >> >> >> >>While I'm trying to scp, kstat.zfs.misc.arcstats.size is >> >>hovering right around that value, sometimes above, sometimes >> >>below (that's as it should be, right?). I don't think that it >> >>dies when crossing over arc_max. I can run the same scp 10 times >> >>and it might fail 1-3 times, with no correlation to the >> >>arcstats.size being above/below arc_max that I can see. >> >> >> >>Scott >> >> >> >>On Jul 5, 2011, at 3:00 AM, Peter Ross wrote: >> >> >> >>>Hi all, >> >>> >> >>>just as an addition: an upgrade to last Friday's >> >>>FreeBSD-Stable and to VirtualBox 4.0.8 does not fix the >> >>>problem. >> >>> >> >>>I will experiment a bit more tomorrow after hours and grab some >> statistics. >> >>> >> >>>Regards >> >>>Peter >> >>> >> >>>Quoting "Peter Ross" : >> >>> >> >>>>Hi all, >> >>>> >> >>>>I noticed a similar problem last week. It is also very >> >>>>similar to one reported last year: >> >>>> >> >>>>http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/058708.html >> >>>> >> >>>>My server is a Dell T410 server with the same bge card (the >> >>>>same pciconf -lvc output as described by Mahlon: >> >>>> >> >>>>http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/058711.html >> >>>> >> >>>>Yours, Scott, is a em(4)..
>> >>>> >> >>>>Another similarity: In all cases we are using VirtualBox. I >> >>>>just want to mention it, in case it matters. I am still >> >>>>running VirtualBox 3.2. >> >>>> >> >>>>Most of the time kstat.zfs.misc.arcstats.size was reaching >> >>>>vfs.zfs.arc_max then, but I could catch one or two cases >> >>>>then the value was still below. >> >>>> >> >>>>I added vfs.zfs.prefetch_disable=1 to sysctl.conf but it does not help. >> >>>> >> >>>>BTW: It looks as ARC only gives back the memory when I >> >>>>destroy the ZFS (a cloned snapshot containing virtual >> >>>>machines). Even if nothing happens for hours the buffer >> >>>>isn't released.. >> >>>> >> >>>>My machine was still running 8.2-PRERELEASE so I am upgrading. >> >>>> >> >>>>I am happy to give information gathered on old/new kernel if it helps. >> >>>> >> >>>>Regards >> >>>>Peter >> >>>> >> >>>>Quoting "Scott Sipe" : >> >>>> >> >>>>> >> >>>>>On Jul 2, 2011, at 12:54 AM, jhell wrote: >> >>>>> >> >>>>>>On Fri, Jul 01, 2011 at 03:22:32PM -0700, Jeremy Chadwick wrote: >> >>>>>>>On Fri, Jul 01, 2011 at 03:13:17PM -0400, Scott Sipe wrote: >> >>>>>>>>I'm running 8.2-RELEASE and am having new problems >> >>>>>>>>with scp. When scping >> >>>>>>>>files to a ZFS directory on the FreeBSD server -- >> >>>>>>>>most notably large files >> >>>>>>>>-- the transfer frequently dies after just a few >> >>>>>>>>seconds. In my last test, I >> >>>>>>>>tried to scp an 800mb file to the FreeBSD system and >> >>>>>>>>the transfer died after >> >>>>>>>>200mb. It completely copied the next 4 times I >> >>>>>>>>tried, and then died again on >> >>>>>>>>the next attempt. >> >>>>>>>> >> >>>>>>>>On the client side: >> >>>>>>>> >> >>>>>>>>"Connection to home closed by remote host.
>> >>>>>>>>lost connection" >> >>>>>>>> >> >>>>>>>>In /var/log/auth.log: >> >>>>>>>> >> >>>>>>>>Jul 1 14:54:42 freebsd sshd[18955]: fatal: Write >> >>>>>>>>failed: Cannot allocate >> >>>>>>>>memory >> >>>>>>>> >> >>>>>>>>I've never seen this before and have used scp before >> >>>>>>>>to transfer large files >> >>>>>>>>without problems. This computer has been used in >> >>>>>>>>production for months and >> >>>>>>>>has a current uptime of 36 days. I have not been >> >>>>>>>>able to notice any problems >> >>>>>>>>copying files to the server via samba or netatalk, or any >> problems in >> >>>>>>>>apache. >> >>>>>>>> >> >>>>>>>>Uname: >> >>>>>>>> >> >>>>>>>>FreeBSD xeon 8.2-RELEASE FreeBSD 8.2-RELEASE #0: Sat >> >>>>>>>>Feb 19 01:02:54 EST >> >>>>>>>>2011 root@xeon:/usr/obj/usr/src/sys/GENERIC amd64 >> >>>>>>>> >> >>>>>>>>I've attached my dmesg and output of vmstat -z. >> >>>>>>>> >> >>>>>>>>I have not restarted the sshd daemon or rebooted the computer. >> >>>>>>>> >> >>>>>>>>Am glad to provide any other information or test anything else. >> >>>>>>>> >> >>>>>>>>{snip vmstat -z and dmesg} >> >>>>>>> >> >>>>>>>You didn't provide details about your networking setup (rc.conf, >> >>>>>>>ifconfig -a, etc.). netstat -m would be useful too. >> >>>>>>> >> >>>>>>>Next, please see this thread circa September 2010, titled "Network >> >>>>>>>memory allocation failures": >> >>>>>>> >> >>>>>>>http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/thread.html#58708 >> >>>>>>> >> >>>>>>>The user in that thread is using rsync, which relies on scp >> by default. >> >>>>>>>I believe this problem is similar, if not identical, to yours. >> >>>>>>> >> >>>>>> >> >>>>>>Please also provide your output of ( /usr/bin/limits -a ) for >> the server >> >>>>>>end and the client.
>> >>>>>> >> >>>>>>I am not quite sure I agree with the need for ifconfig -a but some >> >>>>>>information about the networking driver your using for the interface >> >>>>>>would be helpful, uptime of the boxes. And configuration of the pool. >> >>>>>>e.g. ( zpool status -a ;zfs get all ) You should probably >> >>>>>>prop this information up somewhere so you can reference by >> URL whenever >> >>>>>>needed. >> >>>>>> >> >>>>>>rsync(1) does not rely on scp(1) whatsoever but rsync(1) can >> be made to >> >>>>>>use ssh(1) instead of rsh(1) and I believe that is what Jeremy is >> >>>>>>stating here but correct me if I am wrong. It does use ssh(1) by >> >>>>>>default. >> >>>>>> >> >>>>>>Its a possiblity as well that if using tmpfs(5) or mdmfs(8) for /tmp >> >>>>>>type filesystems that rsync(1) may be just filling up your >> temp ram area >> >>>>>>and causing the connection abort which would be >> >>>>>>expected. ( df -h ) would >> >>>>>>help here. >> >>>>> >> >>>>>Hello, >> >>>>> >> >>>>>I'm not using tmpfs/mdmfs at all. The clients yesterday >> >>>>>were 3 different OSX computers (over gigabit). The FreeBSD >> >>>>>server has 12gb of ram and no bce adapter. For what it's >> >>>>>worth, the server is backed up remotely every night with >> >>>>>rsync (remote FreeBSD uses rsync to pull) to an offsite >> >>>>>(slow cable connection) FreeBSD computer, and I have not >> >>>>>seen any errors in the nightly rsync. >> >>>>> >> >>>>>Sorry for the omission of networking info, here's the >> >>>>>output of the requested commands and some that popped up >> >>>>>in the other thread: >> >>>>> >> >>>>>http://www.cap-press.com/misc/ >> >>>>> >> >>>>>In rc.conf: ifconfig_em1="inet 10.1.1.1 netmask 255.255.0.0" >> >>>>> >> >>>>>Scott >> > >> >Just to make it crystal clear to everyone: >> > >> >There is no correlation between this problem and use of ZFS.
People are >> >attempting to correlate "cannot allocate memory" messages with "anything >> >on the system that uses memory". The VM is much more complex than that. >> > >> >Given the nature of this problem, it's much more likely the issue is >> >"somewhere" within a networking layer within FreeBSD, whether it be >> >driver-level or some sort of intermediary layer. >> > >> >Two people who have this issue in this thread are both using VirtualBox. >> >Can one, or both, of you remove VirtualBox from the configuration >> >entirely (kernel, etc. -- not sure what is required) and then see if the >> >issue goes away? >> >> On the machine in question I only can do it after hours so I will do >> it tonight. >> >> I was _successfully_ sending the file over the loopback interface using >> >> cat /zpool/temp/zimbra_oldroot.vdi | ssh localhost "cat > /dev/null" >> >> I did it, btw, with the IPv6 localhost address first (accidently), >> and then using IPv4. Both worked. >> >> It always fails if I am sending it through the bce(4) interface, >> even if my target is the VirtualBox bridged to the bce card (so it >> does not "leave" the computer physically). >> >> Below the uname -a, ifconfig -a, netstat -rn, pciconf -lv and >> kldstat output. >> >> I have another box where I do not see that problem. It copies files >> happily over the net using ssh. >> >> It is an an older HP ML 150 with 3GB RAM only but with a bge(4) >> driver instead. It runs the same last week's RELENG_8. I installed >> VirtualBox and enabled vboxnet (so it loads the kernel modules). But >> I do not run VirtualBox on it (because it hasn't enough RAM).
>> >> Regards >> Peter >> >> DellT410one# uname -a >> FreeBSD DellT410one.vv.fda 8.2-STABLE FreeBSD 8.2-STABLE #1: Thu Jun >> 30 17:07:18 EST 2011 >> root@DellT410one.vv.fda:/usr/obj/usr/src/sys/GENERIC amd64 >> DellT410one# ifconfig -a >> bce0: flags=8943 >> metric 0 mtu 1500 >> options=c01bb >> ether 84:2b:2b:68:64:e4 >> inet 192.168.50.220 netmask 0xffffff00 broadcast 192.168.50.255 >> inet 192.168.50.221 netmask 0xffffff00 broadcast 192.168.50.255 >> inet 192.168.50.223 netmask 0xffffff00 broadcast 192.168.50.255 >> inet 192.168.50.224 netmask 0xffffff00 broadcast 192.168.50.255 >> inet 192.168.50.225 netmask 0xffffff00 broadcast 192.168.50.255 >> inet 192.168.50.226 netmask 0xffffff00 broadcast 192.168.50.255 >> inet 192.168.50.227 netmask 0xffffff00 broadcast 192.168.50.255 >> inet 192.168.50.219 netmask 0xffffff00 broadcast 192.168.50.255 >> media: Ethernet autoselect (1000baseT ) >> status: active >> bce1: flags=8802 metric 0 mtu 1500 >> options=c01bb >> ether 84:2b:2b:68:64:e5 >> media: Ethernet autoselect >> lo0: flags=8049 metric 0 mtu 16384 >> options=3 >> inet6 fe80::1%lo0 prefixlen 64 scopeid 0xb >> inet6 ::1 prefixlen 128 >> inet 127.0.0.1 netmask 0xff000000 >> nd6 options=3 >> vboxnet0: flags=8802 metric 0 mtu 1500 >> ether 0a:00:27:00:00:00 >> DellT410one# netstat -rn >> Routing tables >> >> Internet: >> Destination Gateway Flags Refs Use Netif Expire >> default 192.168.50.201 UGS 0 52195 bce0 >> 127.0.0.1 link#11 UH 0 6 lo0 >> 192.168.50.0/24 link#1 U 0 1118212 bce0 >> 192.168.50.219 link#1 UHS 0 9670 lo0 >> 192.168.50.220 link#1 UHS 0 8347 lo0 >> 192.168.50.221 link#1 UHS 0 103024 lo0 >> 192.168.50.223 link#1 UHS 0 43614 lo0 >> 192.168.50.224 link#1 UHS 0 8358 lo0 >> 192.168.50.225 link#1 UHS 0 8438 lo0 >> 192.168.50.226 link#1 UHS 0 8338 lo0 >> 192.168.50.227 link#1 UHS 0 8333 lo0 >> 192.168.165.0/24 192.168.50.200 UGS 0 3311 bce0 >> 192.168.166.0/24 192.168.50.200
UGS 0 699 bce0 >> 192.168.167.0/24 192.168.50.200 UGS 0 3012 bce0 >> 192.168.168.0/24 192.168.50.200 UGS 0 552 bce0 >> >> Internet6: >> Destination Gateway >> Flags Netif Expire >> ::1 ::1 UH >> lo0 >> fe80::%lo0/64 link#11 U >> lo0 >> fe80::1%lo0 link#11 UHS >> lo0 >> ff01::%lo0/32 fe80::1%lo0 U >> lo0 >> ff02::%lo0/32 fe80::1%lo0 U >> lo0 >> DellT410one# kldstat >> Id Refs Address Size Name >> 1 19 0xffffffff80100000 dbf5d0 kernel >> 2 3 0xffffffff80ec0000 4c358 vboxdrv.ko >> 3 1 0xffffffff81012000 131998 zfs.ko >> 4 1 0xffffffff81144000 1ff1 opensolaris.ko >> 5 2 0xffffffff81146000 2940 vboxnetflt.ko >> 6 2 0xffffffff81149000 8e38 netgraph.ko >> 7 1 0xffffffff81152000 153c ng_ether.ko >> 8 1 0xffffffff81154000 e70 vboxnetadp.ko >> DellT410one# pciconf -lv >> .. >> bce0@pci0:1:0:0: class=0x020000 card=0x028d1028 >> chip=0x163b14e4 rev=0x20 hdr=0x00 >> vendor = 'Broadcom Corporation' >> class = network >> subclass = ethernet >> bce1@pci0:1:0:1: class=0x020000 card=0x028d1028 >> chip=0x163b14e4 rev=0x20 hdr=0x00 >> vendor = 'Broadcom Corporation' >> class = network >> subclass = ethernet > > Could you please provide "pciconf -lvcb" output instead, specific to the > bce chips? Thanks.
Here it is:

bce0@pci0:1:0:0: class=0x020000 card=0x028d1028 chip=0x163b14e4 rev=0x20 hdr=0x00
    vendor = 'Broadcom Corporation'
    class = network
    subclass = ethernet
    bar [10] = type Memory, range 64, base 0xda000000, size 33554432, enabled
    cap 01[48] = powerspec 3 supports D0 D3 current D0
    cap 03[50] = VPD
    cap 05[58] = MSI supports 16 messages, 64 bit enabled with 1 message
    cap 11[a0] = MSI-X supports 9 messages in map 0x10
    cap 10[ac] = PCI-Express 2 endpoint max data 256(512) link x4(x4)
    ecap 0003[100] = Serial 1 842b2bfffe6864e4
    ecap 0001[110] = AER 1 0 fatal 0 non-fatal 1 corrected
    ecap 0004[150] = unknown 1
    ecap 0002[160] = VC 1 max VC0

Regards
Peter

From owner-freebsd-stable@FreeBSD.ORG Wed Jul 6 03:24:28 2011
Return-Path: Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 401DA106566B for ; Wed, 6 Jul 2011 03:24:28 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org)
Received: from qmta09.emeryville.ca.mail.comcast.net (qmta09.emeryville.ca.mail.comcast.net [76.96.30.96]) by mx1.freebsd.org (Postfix) with ESMTP id 239488FC14 for ; Wed, 6 Jul 2011 03:24:27 +0000 (UTC)
Received: from omta19.emeryville.ca.mail.comcast.net ([76.96.30.76]) by qmta09.emeryville.ca.mail.comcast.net with comcast id 4SdH1h0071eYJf8A9TQRLx; Wed, 06 Jul 2011 03:24:25 +0000
Received: from koitsu.dyndns.org ([67.180.84.87]) by omta19.emeryville.ca.mail.comcast.net with comcast id 4TQX1h00w1t3BNj01TQXJr; Wed, 06 Jul 2011 03:24:32 +0000
Received: by icarus.home.lan (Postfix, from userid 1000) id 0F2C3102C36; Tue, 5 Jul 2011 20:24:25 -0700 (PDT)
Date: Tue, 5 Jul 2011 20:24:25 -0700
From: Jeremy Chadwick
To: Peter Ross
Message-ID: <20110706032425.GA72757@icarus.home.lan>
References: <20110706122339.61453nlqra1vqsrv@webmail.in-berlin.de> <20110706023234.GA72048@icarus.home.lan> <20110706130753.182053f3ellasn0p@webmail.in-berlin.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20110706130753.182053f3ellasn0p@webmail.in-berlin.de> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: Yong-Hyeon Pyun , "Vogel, Jack" , freebsd-stable List , Scott Sipe , davidch@freebsd.org Subject: Re: scp: Write Failed: Cannot allocate memory X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Jul 2011 03:24:28 -0000 On Wed, Jul 06, 2011 at 01:07:53PM +1000, Peter Ross wrote: > Quoting "Jeremy Chadwick" : > > >On Wed, Jul 06, 2011 at 12:23:39PM +1000, Peter Ross wrote: > >>Quoting "Jeremy Chadwick" : > >> > >>>On Tue, Jul 05, 2011 at 01:03:20PM -0400, Scott Sipe wrote: > >>>>I'm running virtualbox 3.2.12_1 if that has anything to do with it. > >>>> > >>>>sysctl vfs.zfs.arc_max: 6200000000 > >>>> > >>>>While I'm trying to scp, kstat.zfs.misc.arcstats.size is > >>>>hovering right around that value, sometimes above, sometimes > >>>>below (that's as it should be, right?). I don't think that it > >>>>dies when crossing over arc_max. I can run the same scp 10 times > >>>>and it might fail 1-3 times, with no correlation to the > >>>>arcstats.size being above/below arc_max that I can see. > >>>> > >>>>Scott > >>>> > >>>>On Jul 5, 2011, at 3:00 AM, Peter Ross wrote: > >>>> > >>>>>Hi all, > >>>>> > >>>>>just as an addition: an upgrade to last Friday's > >>>>>FreeBSD-Stable and to VirtualBox 4.0.8 does not fix the > >>>>>problem. > >>>>> > >>>>>I will experiment a bit more tomorrow after hours and grab > >>some statistics. > >>>>> > >>>>>Regards > >>>>>Peter > >>>>> > >>>>>Quoting "Peter Ross" : > >>>>> > >>>>>>Hi all, > >>>>>> > >>>>>>I noticed a similar problem last week. 
It is also very > >>>>>>similar to one reported last year: > >>>>>> > >>>>>>http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/058708.html > >>>>>> > >>>>>>My server is a Dell T410 server with the same bge card (the > >>>>>>same pciconf -lvc output as described by Mahlon: > >>>>>> > >>>>>>http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/058711.html > >>>>>> > >>>>>>Yours, Scott, is a em(4).. > >>>>>> > >>>>>>Another similarity: In all cases we are using VirtualBox. I > >>>>>>just want to mention it, in case it matters. I am still > >>>>>>running VirtualBox 3.2. > >>>>>> > >>>>>>Most of the time kstat.zfs.misc.arcstats.size was reaching > >>>>>>vfs.zfs.arc_max then, but I could catch one or two cases > >>>>>>then the value was still below. > >>>>>> > >>>>>>I added vfs.zfs.prefetch_disable=1 to sysctl.conf but it does not help. > >>>>>> > >>>>>>BTW: It looks as ARC only gives back the memory when I > >>>>>>destroy the ZFS (a cloned snapshot containing virtual > >>>>>>machines). Even if nothing happens for hours the buffer > >>>>>>isn't released.. > >>>>>> > >>>>>>My machine was still running 8.2-PRERELEASE so I am upgrading. > >>>>>> > >>>>>>I am happy to give information gathered on old/new kernel if it helps. > >>>>>> > >>>>>>Regards > >>>>>>Peter > >>>>>> > >>>>>>Quoting "Scott Sipe" : > >>>>>> > >>>>>>> > >>>>>>>On Jul 2, 2011, at 12:54 AM, jhell wrote: > >>>>>>> > >>>>>>>>On Fri, Jul 01, 2011 at 03:22:32PM -0700, Jeremy Chadwick wrote: > >>>>>>>>>On Fri, Jul 01, 2011 at 03:13:17PM -0400, Scott Sipe wrote: > >>>>>>>>>>I'm running 8.2-RELEASE and am having new problems > >>>>>>>>>>with scp. When scping > >>>>>>>>>>files to a ZFS directory on the FreeBSD server -- > >>>>>>>>>>most notably large files > >>>>>>>>>>-- the transfer frequently dies after just a few > >>>>>>>>>>seconds. In my last test, I > >>>>>>>>>>tried to scp an 800mb file to the FreeBSD system and > >>>>>>>>>>the transfer died after > >>>>>>>>>>200mb. 
It completely copied the next 4 times I > >>>>>>>>>>tried, and then died again on > >>>>>>>>>>the next attempt. > >>>>>>>>>> > >>>>>>>>>>On the client side: > >>>>>>>>>> > >>>>>>>>>>"Connection to home closed by remote host. > >>>>>>>>>>lost connection" > >>>>>>>>>> > >>>>>>>>>>In /var/log/auth.log: > >>>>>>>>>> > >>>>>>>>>>Jul 1 14:54:42 freebsd sshd[18955]: fatal: Write > >>>>>>>>>>failed: Cannot allocate > >>>>>>>>>>memory > >>>>>>>>>> > >>>>>>>>>>I've never seen this before and have used scp before > >>>>>>>>>>to transfer large files > >>>>>>>>>>without problems. This computer has been used in > >>>>>>>>>>production for months and > >>>>>>>>>>has a current uptime of 36 days. I have not been > >>>>>>>>>>able to notice any problems > >>>>>>>>>>copying files to the server via samba or netatalk, or > >>any problems in > >>>>>>>>>>apache. > >>>>>>>>>> > >>>>>>>>>>Uname: > >>>>>>>>>> > >>>>>>>>>>FreeBSD xeon 8.2-RELEASE FreeBSD 8.2-RELEASE #0: Sat > >>>>>>>>>>Feb 19 01:02:54 EST > >>>>>>>>>>2011 root@xeon:/usr/obj/usr/src/sys/GENERIC amd64 > >>>>>>>>>> > >>>>>>>>>>I've attached my dmesg and output of vmstat -z. > >>>>>>>>>> > >>>>>>>>>>I have not restarted the sshd daemon or rebooted the computer. > >>>>>>>>>> > >>>>>>>>>>Am glad to provide any other information or test anything else. > >>>>>>>>>> > >>>>>>>>>>{snip vmstat -z and dmesg} > >>>>>>>>> > >>>>>>>>>You didn't provide details about your networking setup (rc.conf, > >>>>>>>>>ifconfig -a, etc.). netstat -m would be useful too. > >>>>>>>>> > >>>>>>>>>Next, please see this thread circa September 2010, titled "Network > >>>>>>>>>memory allocation failures": > >>>>>>>>> > >>>>>>>>>http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/thread.html#58708 > >>>>>>>>> > >>>>>>>>>The user in that thread is using rsync, which relies on > >>scp by default. > >>>>>>>>>I believe this problem is similar, if not identical, to yours. 
> >>>>>>>>> > >>>>>>>> > >>>>>>>>Please also provide your output of ( /usr/bin/limits -a ) > >>for the server > >>>>>>>>end and the client. > >>>>>>>> > >>>>>>>>I am not quite sure I agree with the need for ifconfig -a but some > >>>>>>>>information about the networking driver your using for the interface > >>>>>>>>would be helpful, uptime of the boxes. And configuration of the pool. > >>>>>>>>e.g. ( zpool status -a ;zfs get all ) You should probably > >>>>>>>>prop this information up somewhere so you can reference by > >>URL whenever > >>>>>>>>needed. > >>>>>>>> > >>>>>>>>rsync(1) does not rely on scp(1) whatsoever but rsync(1) > >>can be made to > >>>>>>>>use ssh(1) instead of rsh(1) and I believe that is what Jeremy is > >>>>>>>>stating here but correct me if I am wrong. It does use ssh(1) by > >>>>>>>>default. > >>>>>>>> > >>>>>>>>Its a possiblity as well that if using tmpfs(5) or mdmfs(8) for /tmp > >>>>>>>>type filesystems that rsync(1) may be just filling up your > >>temp ram area > >>>>>>>>and causing the connection abort which would be > >>>>>>>>expected. ( df -h ) would > >>>>>>>>help here. > >>>>>>> > >>>>>>>Hello, > >>>>>>> > >>>>>>>I'm not using tmpfs/mdmfs at all. The clients yesterday > >>>>>>>were 3 different OSX computers (over gigabit). The FreeBSD > >>>>>>>server has 12gb of ram and no bce adapter. For what it's > >>>>>>>worth, the server is backed up remotely every night with > >>>>>>>rsync (remote FreeBSD uses rsync to pull) to an offsite > >>>>>>>(slow cable connection) FreeBSD computer, and I have not > >>>>>>>seen any errors in the nightly rsync. 
> >>>>>>> > >>>>>>>Sorry for the omission of networking info, here's the > >>>>>>>output of the requested commands and some that popped up > >>>>>>>in the other thread: > >>>>>>> > >>>>>>>http://www.cap-press.com/misc/ > >>>>>>> > >>>>>>>In rc.conf: ifconfig_em1="inet 10.1.1.1 netmask 255.255.0.0" > >>>>>>> > >>>>>>>Scott > >>> > >>>Just to make it crystal clear to everyone: > >>> > >>>There is no correlation between this problem and use of ZFS. People are > >>>attempting to correlate "cannot allocate memory" messages with "anything > >>>on the system that uses memory". The VM is much more complex than that. > >>> > >>>Given the nature of this problem, it's much more likely the issue is > >>>"somewhere" within a networking layer within FreeBSD, whether it be > >>>driver-level or some sort of intermediary layer. > >>> > >>>Two people who have this issue in this thread are both using VirtualBox. > >>>Can one, or both, of you remove VirtualBox from the configuration > >>>entirely (kernel, etc. -- not sure what is required) and then see if the > >>>issue goes away? > >> > >>On the machine in question I only can do it after hours so I will do > >>it tonight. > >> > >>I was _successfully_ sending the file over the loopback interface using > >> > >>cat /zpool/temp/zimbra_oldroot.vdi | ssh localhost "cat > /dev/null" > >> > >>I did it, btw, with the IPv6 localhost address first (accidently), > >>and then using IPv4. Both worked. > >> > >>It always fails if I am sending it through the bce(4) interface, > >>even if my target is the VirtualBox bridged to the bce card (so it > >>does not "leave" the computer physically). > >> > >>Below the uname -a, ifconfig -a, netstat -rn, pciconf -lv and > >>kldstat output. > >> > >>I have another box where I do not see that problem. It copies files > >>happily over the net using ssh. > >> > >>It is an an older HP ML 150 with 3GB RAM only but with a bge(4) > >>driver instead. It runs the same last week's RELENG_8. 
I installed > >>VirtualBox and enabled vboxnet (so it loads the kernel modules). But > >>I do not run VirtualBox on it (because it hasn't enough RAM). > >> > >>Regards > >>Peter > >> > >>DellT410one# uname -a > >>FreeBSD DellT410one.vv.fda 8.2-STABLE FreeBSD 8.2-STABLE #1: Thu Jun > >>30 17:07:18 EST 2011 > >>root@DellT410one.vv.fda:/usr/obj/usr/src/sys/GENERIC amd64 > >>DellT410one# ifconfig -a > >>bce0: flags=8943 > >>metric 0 mtu 1500 > >> options=c01bb > >> ether 84:2b:2b:68:64:e4 > >> inet 192.168.50.220 netmask 0xffffff00 broadcast 192.168.50.255 > >> inet 192.168.50.221 netmask 0xffffff00 broadcast 192.168.50.255 > >> inet 192.168.50.223 netmask 0xffffff00 broadcast 192.168.50.255 > >> inet 192.168.50.224 netmask 0xffffff00 broadcast 192.168.50.255 > >> inet 192.168.50.225 netmask 0xffffff00 broadcast 192.168.50.255 > >> inet 192.168.50.226 netmask 0xffffff00 broadcast 192.168.50.255 > >> inet 192.168.50.227 netmask 0xffffff00 broadcast 192.168.50.255 > >> inet 192.168.50.219 netmask 0xffffff00 broadcast 192.168.50.255 > >> media: Ethernet autoselect (1000baseT ) > >> status: active > >>bce1: flags=8802 metric 0 mtu 1500 > >> options=c01bb > >> ether 84:2b:2b:68:64:e5 > >> media: Ethernet autoselect > >>lo0: flags=8049 metric 0 mtu 16384 > >> options=3 > >> inet6 fe80::1%lo0 prefixlen 64 scopeid 0xb > >> inet6 ::1 prefixlen 128 > >> inet 127.0.0.1 netmask 0xff000000 > >> nd6 options=3 > >>vboxnet0: flags=8802 metric 0 mtu 1500 > >> ether 0a:00:27:00:00:00 > >>DellT410one# netstat -rn > >>Routing tables > >> > >>Internet: > >>Destination Gateway Flags Refs Use Netif Expire > >>default 192.168.50.201 UGS 0 52195 bce0 > >>127.0.0.1 link#11 UH 0 6 lo0 > >>192.168.50.0/24 link#1 U 0 1118212 bce0 > >>192.168.50.219 link#1 UHS 0 9670 lo0 > >>192.168.50.220 link#1 UHS 0 8347 lo0 > >>192.168.50.221 link#1 UHS 0 103024 lo0 > >>192.168.50.223 link#1 UHS 0 43614 lo0 > >>192.168.50.224 link#1 UHS 0 8358 lo0 > >>192.168.50.225 link#1 UHS 0 8438 lo0 > >>192.168.50.226 
link#1 UHS 0 8338 lo0 > >>192.168.50.227 link#1 UHS 0 8333 lo0 > >>192.168.165.0/24 192.168.50.200 UGS 0 3311 bce0 > >>192.168.166.0/24 192.168.50.200 UGS 0 699 bce0 > >>192.168.167.0/24 192.168.50.200 UGS 0 3012 bce0 > >>192.168.168.0/24 192.168.50.200 UGS 0 552 bce0 > >> > >>Internet6: > >>Destination Gateway > >>Flags Netif Expire > >>::1 ::1 UH > >>lo0 > >>fe80::%lo0/64 link#11 U > >>lo0 > >>fe80::1%lo0 link#11 UHS > >>lo0 > >>ff01::%lo0/32 fe80::1%lo0 U > >>lo0 > >>ff02::%lo0/32 fe80::1%lo0 U > >>lo0 > >>DellT410one# kldstat > >>Id Refs Address Size Name > >> 1 19 0xffffffff80100000 dbf5d0 kernel > >> 2 3 0xffffffff80ec0000 4c358 vboxdrv.ko > >> 3 1 0xffffffff81012000 131998 zfs.ko > >> 4 1 0xffffffff81144000 1ff1 opensolaris.ko > >> 5 2 0xffffffff81146000 2940 vboxnetflt.ko > >> 6 2 0xffffffff81149000 8e38 netgraph.ko > >> 7 1 0xffffffff81152000 153c ng_ether.ko > >> 8 1 0xffffffff81154000 e70 vboxnetadp.ko > >>DellT410one# pciconf -lv > >>.. > >>bce0@pci0:1:0:0: class=0x020000 card=0x028d1028 > >>chip=0x163b14e4 rev=0x20 hdr=0x00 > >> vendor = 'Broadcom Corporation' > >> class = network > >> subclass = ethernet > >>bce1@pci0:1:0:1: class=0x020000 card=0x028d1028 > >>chip=0x163b14e4 rev=0x20 hdr=0x00 > >> vendor = 'Broadcom Corporation' > >> class = network > >> subclass = ethernet > > > >Could you please provide "pciconf -lvcb" output instead, specific to the > >bce chips? Thanks.
>
> Here it is:
>
> bce0@pci0:1:0:0: class=0x020000 card=0x028d1028
> chip=0x163b14e4 rev=0x20 hdr=0x00
> vendor = 'Broadcom Corporation'
> class = network
> subclass = ethernet
> bar [10] = type Memory, range 64, base 0xda000000, size
> 33554432, enabled
> cap 01[48] = powerspec 3 supports D0 D3 current D0
> cap 03[50] = VPD
> cap 05[58] = MSI supports 16 messages, 64 bit enabled with 1 message
> cap 11[a0] = MSI-X supports 9 messages in map 0x10
> cap 10[ac] = PCI-Express 2 endpoint max data 256(512) link x4(x4)
> ecap 0003[100] = Serial 1 842b2bfffe6864e4
> ecap 0001[110] = AER 1 0 fatal 0 non-fatal 1 corrected
> ecap 0004[150] = unknown 1
> ecap 0002[160] = VC 1 max VC0

Thanks Peter.

Adding Yong-Hyeon and David to the discussion, since they've both worked on the bce(4) driver in recent months (most of the changes made recently are only in HEAD), and also adding Jack Vogel of Intel, who maintains em(4).

Brief history for the devs:

The issue is described in the thread "Network memory allocation failures", reported last year, and two users (Scott and Peter) have recently reported it again:

http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/thread.html#58708

It was mentioned again by Scott in this message, which also contains some technical details:

http://lists.freebsd.org/pipermail/freebsd-stable/2011-July/063172.html

What's interesting is that Scott's issue is identical in form but he's using em(4), which isn't known to behave like this. Both individuals are using VirtualBox, though we're not sure at this point whether that is the piece causing the anomaly.

Relevant details of Scott's system (em-based):

http://www.cap-press.com/misc/

Relevant details of Peter's system (bce-based):

http://lists.freebsd.org/pipermail/freebsd-stable/2011-July/063221.html
http://lists.freebsd.org/pipermail/freebsd-stable/2011-July/063223.html

I think the biggest complexity right now is figuring out how/why scp fails intermittently in this manner.
The errno probably "trickles down" to userland from the kernel, but the condition regarding why it happens is unknown. -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, US | | Making life hard for others since 1977. PGP 4BD6C0CB | From owner-freebsd-stable@FreeBSD.ORG Wed Jul 6 03:54:15 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 16398106566C for ; Wed, 6 Jul 2011 03:54:15 +0000 (UTC) (envelope-from Peter.Ross@bogen.in-berlin.de) Received: from einhorn.in-berlin.de (einhorn.in-berlin.de [192.109.42.8]) by mx1.freebsd.org (Postfix) with ESMTP id 4B6DB8FC0A for ; Wed, 6 Jul 2011 03:54:13 +0000 (UTC) X-Envelope-From: Peter.Ross@bogen.in-berlin.de Received: from localhost (okapi.in-berlin.de [192.109.42.117]) by einhorn.in-berlin.de (8.13.6/8.13.6/Debian-1) with ESMTP id p663sCXB005080; Wed, 6 Jul 2011 05:54:12 +0200 Received: from 124-254-118-24-static.bb.ispone.net.au (124-254-118-24-static.bb.ispone.net.au [124.254.118.24]) by webmail.in-berlin.de (Horde Framework) with HTTP; Wed, 06 Jul 2011 13:54:12 +1000 Message-ID: <20110706135412.15276i0fxavg09k4@webmail.in-berlin.de> Date: Wed, 06 Jul 2011 13:54:12 +1000 From: "Peter Ross" To: "Jeremy Chadwick" References: <20110706122339.61453nlqra1vqsrv@webmail.in-berlin.de> <20110706023234.GA72048@icarus.home.lan> <20110706130753.182053f3ellasn0p@webmail.in-berlin.de> <20110706032425.GA72757@icarus.home.lan> In-Reply-To: <20110706032425.GA72757@icarus.home.lan> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; DelSp="Yes"; format="flowed" Content-Disposition: inline Content-Transfer-Encoding: quoted-printable User-Agent: Internet Messaging Program (IMP) 4.3.3 X-Scanned-By: MIMEDefang_at_IN-Berlin_e.V. 
on 192.109.42.8 Cc: Yong-Hyeon Pyun , "Vogel, Jack" , freebsd-stable List , Scott Sipe , davidch@freebsd.org Subject: Re: scp: Write Failed: Cannot allocate memory X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Jul 2011 03:54:15 -0000 Quoting "Jeremy Chadwick" : > On Wed, Jul 06, 2011 at 01:07:53PM +1000, Peter Ross wrote: >> Quoting "Jeremy Chadwick" : >> >> >On Wed, Jul 06, 2011 at 12:23:39PM +1000, Peter Ross wrote: >> >>Quoting "Jeremy Chadwick" : >> >> >> >>>On Tue, Jul 05, 2011 at 01:03:20PM -0400, Scott Sipe wrote: >> >>>>I'm running virtualbox 3.2.12_1 if that has anything to do with it. >> >>>> >> >>>>sysctl vfs.zfs.arc_max: 6200000000 >> >>>> >> >>>>While I'm trying to scp, kstat.zfs.misc.arcstats.size is >> >>>>hovering right around that value, sometimes above, sometimes >> >>>>below (that's as it should be, right?). I don't think that it >> >>>>dies when crossing over arc_max. I can run the same scp 10 times >> >>>>and it might fail 1-3 times, with no correlation to the >> >>>>arcstats.size being above/below arc_max that I can see. >> >>>> >> >>>>Scott >> >>>> >> >>>>On Jul 5, 2011, at 3:00 AM, Peter Ross wrote: >> >>>> >> >>>>>Hi all, >> >>>>> >> >>>>>just as an addition: an upgrade to last Friday's >> >>>>>FreeBSD-Stable and to VirtualBox 4.0.8 does not fix the >> >>>>>problem. >> >>>>> >> >>>>>I will experiment a bit more tomorrow after hours and grab >> >>some statistics. >> >>>>> >> >>>>>Regards >> >>>>>Peter >> >>>>> >> >>>>>Quoting "Peter Ross" : >> >>>>> >> >>>>>>Hi all, >> >>>>>> >> >>>>>>I noticed a similar problem last week. 
It is also very >> >>>>>>similar to one reported last year: >> >>>>>> >> >>>>>>http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/058708.html >> >>>>>> >> >>>>>>My server is a Dell T410 server with the same bge card (the >> >>>>>>same pciconf -lvc output as described by Mahlon: >> >>>>>> >> >>>>>>http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/058711.html >> >>>>>> >> >>>>>>Yours, Scott, is a em(4).. >> >>>>>> >> >>>>>>Another similarity: In all cases we are using VirtualBox. I >> >>>>>>just want to mention it, in case it matters. I am still >> >>>>>>running VirtualBox 3.2. >> >>>>>> >> >>>>>>Most of the time kstat.zfs.misc.arcstats.size was reaching >> >>>>>>vfs.zfs.arc_max then, but I could catch one or two cases >> >>>>>>then the value was still below. >> >>>>>> >> >>>>>>I added vfs.zfs.prefetch_disable=1 to sysctl.conf but it does >> not help. >> >>>>>> >> >>>>>>BTW: It looks as ARC only gives back the memory when I >> >>>>>>destroy the ZFS (a cloned snapshot containing virtual >> >>>>>>machines). Even if nothing happens for hours the buffer >> >>>>>>isn't released.. >> >>>>>> >> >>>>>>My machine was still running 8.2-PRERELEASE so I am upgrading. >> >>>>>> >> >>>>>>I am happy to give information gathered on old/new kernel if it helps. >> >>>>>> >> >>>>>>Regards >> >>>>>>Peter >> >>>>>> >> >>>>>>Quoting "Scott Sipe" : >> >>>>>> >> >>>>>>> >> >>>>>>>On Jul 2, 2011, at 12:54 AM, jhell wrote: >> >>>>>>> >> >>>>>>>>On Fri, Jul 01, 2011 at 03:22:32PM -0700, Jeremy Chadwick wrote: >> >>>>>>>>>On Fri, Jul 01, 2011 at 03:13:17PM -0400, Scott Sipe wrote: >> >>>>>>>>>>I'm running 8.2-RELEASE and am having new problems >> >>>>>>>>>>with scp. When scping >> >>>>>>>>>>files to a ZFS directory on the FreeBSD server -- >> >>>>>>>>>>most notably large files >> >>>>>>>>>>-- the transfer frequently dies after just a few >> >>>>>>>>>>seconds.
In my last test, I >> >>>>>>>>>>tried to scp an 800mb file to the FreeBSD system and >> >>>>>>>>>>the transfer died after >> >>>>>>>>>>200mb. It completely copied the next 4 times I >> >>>>>>>>>>tried, and then died again on >> >>>>>>>>>>the next attempt. >> >>>>>>>>>> >> >>>>>>>>>>On the client side: >> >>>>>>>>>> >> >>>>>>>>>>"Connection to home closed by remote host. >> >>>>>>>>>>lost connection" >> >>>>>>>>>> >> >>>>>>>>>>In /var/log/auth.log: >> >>>>>>>>>> >> >>>>>>>>>>Jul 1 14:54:42 freebsd sshd[18955]: fatal: Write >> >>>>>>>>>>failed: Cannot allocate >> >>>>>>>>>>memory >> >>>>>>>>>> >> >>>>>>>>>>I've never seen this before and have used scp before >> >>>>>>>>>>to transfer large files >> >>>>>>>>>>without problems. This computer has been used in >> >>>>>>>>>>production for months and >> >>>>>>>>>>has a current uptime of 36 days. I have not been >> >>>>>>>>>>able to notice any problems >> >>>>>>>>>>copying files to the server via samba or netatalk, or >> >>any problems in >> >>>>>>>>>>apache. >> >>>>>>>>>> >> >>>>>>>>>>Uname: >> >>>>>>>>>> >> >>>>>>>>>>FreeBSD xeon 8.2-RELEASE FreeBSD 8.2-RELEASE #0: Sat >> >>>>>>>>>>Feb 19 01:02:54 EST >> >>>>>>>>>>2011 root@xeon:/usr/obj/usr/src/sys/GENERIC amd64 >> >>>>>>>>>> >> >>>>>>>>>>I've attached my dmesg and output of vmstat -z. >> >>>>>>>>>> >> >>>>>>>>>>I have not restarted the sshd daemon or rebooted the computer. >> >>>>>>>>>> >> >>>>>>>>>>Am glad to provide any other information or test anything else. >> >>>>>>>>>> >> >>>>>>>>>>{snip vmstat -z and dmesg} >> >>>>>>>>> >> >>>>>>>>>You didn't provide details about your networking setup (rc.conf, >> >>>>>>>>>ifconfig -a, etc.). netstat -m would be useful too. 
>> >>>>>>>>> >> >>>>>>>>>Next, please see this thread circa September 2010, titled "Network >> >>>>>>>>>memory allocation failures": >> >>>>>>>>> >> >>>>>>>>>http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/thread.html#58708 >> >>>>>>>>> >> >>>>>>>>>The user in that thread is using rsync, which relies on >> >>scp by default. >> >>>>>>>>>I believe this problem is similar, if not identical, to yours. >> >>>>>>>>> >> >>>>>>>> >> >>>>>>>>Please also provide your output of ( /usr/bin/limits -a ) >> >>for the server >> >>>>>>>>end and the client. >> >>>>>>>> >> >>>>>>>>I am not quite sure I agree with the need for ifconfig -a but some >> >>>>>>>>information about the networking driver your using for the interface >> >>>>>>>>would be helpful, uptime of the boxes. And configuration of >> the pool. >> >>>>>>>>e.g. ( zpool status -a ;zfs get all ) You should probably >> >>>>>>>>prop this information up somewhere so you can reference by >> >>URL whenever >> >>>>>>>>needed. >> >>>>>>>> >> >>>>>>>>rsync(1) does not rely on scp(1) whatsoever but rsync(1) >> >>can be made to >> >>>>>>>>use ssh(1) instead of rsh(1) and I believe that is what Jeremy is >> >>>>>>>>stating here but correct me if I am wrong. It does use ssh(1) by >> >>>>>>>>default. >> >>>>>>>> >> >>>>>>>>Its a possiblity as well that if using tmpfs(5) or mdmfs(8) for /tmp >> >>>>>>>>type filesystems that rsync(1) may be just filling up your >> >>temp ram area >> >>>>>>>>and causing the connection abort which would be >> >>>>>>>>expected. ( df -h ) would >> >>>>>>>>help here. >> >>>>>>> >> >>>>>>>Hello, >> >>>>>>> >> >>>>>>>I'm not using tmpfs/mdmfs at all. The clients yesterday >> >>>>>>>were 3 different OSX computers (over gigabit). The FreeBSD >> >>>>>>>server has 12gb of ram and no bce adapter.
For what it's >> >>>>>>>worth, the server is backed up remotely every night with >> >>>>>>>rsync (remote FreeBSD uses rsync to pull) to an offsite >> >>>>>>>(slow cable connection) FreeBSD computer, and I have not >> >>>>>>>seen any errors in the nightly rsync. >> >>>>>>> >> >>>>>>>Sorry for the omission of networking info, here's the >> >>>>>>>output of the requested commands and some that popped up >> >>>>>>>in the other thread: >> >>>>>>> >> >>>>>>>http://www.cap-press.com/misc/ >> >>>>>>> >> >>>>>>>In rc.conf: ifconfig_em1="inet 10.1.1.1 netmask 255.255.0.0" >> >>>>>>> >> >>>>>>>Scott >> >>> >> >>>Just to make it crystal clear to everyone: >> >>> >> >>>There is no correlation between this problem and use of ZFS. People are >> >>>attempting to correlate "cannot allocate memory" messages with "anything >> >>>on the system that uses memory". The VM is much more complex than that. >> >>> >> >>>Given the nature of this problem, it's much more likely the issue is >> >>>"somewhere" within a networking layer within FreeBSD, whether it be >> >>>driver-level or some sort of intermediary layer. >> >>> >> >>>Two people who have this issue in this thread are both using VirtualBox. >> >>>Can one, or both, of you remove VirtualBox from the configuration >> >>>entirely (kernel, etc. -- not sure what is required) and then see if the >> >>>issue goes away? >> >> >> >>On the machine in question I only can do it after hours so I will do >> >>it tonight. >> >> >> >>I was _successfully_ sending the file over the loopback interface using >> >> >> >>cat /zpool/temp/zimbra_oldroot.vdi | ssh localhost "cat > /dev/null" >> >> >> >>I did it, btw, with the IPv6 localhost address first (accidently), >> >>and then using IPv4. Both worked. >> >> >> >>It always fails if I am sending it through the bce(4) interface, >> >>even if my target is the VirtualBox bridged to the bce card (so it >> >>does not "leave" the computer physically).
>> >> >> >>Below the uname -a, ifconfig -a, netstat -rn, pciconf -lv and >> >>kldstat output. >> >> >> >>I have another box where I do not see that problem. It copies files >> >>happily over the net using ssh. >> >> >> >>It is an an older HP ML 150 with 3GB RAM only but with a bge(4) >> >>driver instead. It runs the same last week's RELENG_8. I installed >> >>VirtualBox and enabled vboxnet (so it loads the kernel modules). But >> >>I do not run VirtualBox on it (because it hasn't enough RAM). >> >> >> >>Regards >> >>Peter >> >> >> >>DellT410one# uname -a >> >>FreeBSD DellT410one.vv.fda 8.2-STABLE FreeBSD 8.2-STABLE #1: Thu Jun >> >>30 17:07:18 EST 2011 >> >>root@DellT410one.vv.fda:/usr/obj/usr/src/sys/GENERIC amd64 >> >>DellT410one# ifconfig -a >> >>bce0: flags=8943 >> >>metric 0 mtu 1500 >> >> options=c01bb >> >> ether 84:2b:2b:68:64:e4 >> >> inet 192.168.50.220 netmask 0xffffff00 broadcast 192.168.50.255 >> >> inet 192.168.50.221 netmask 0xffffff00 broadcast 192.168.50.255 >> >> inet 192.168.50.223 netmask 0xffffff00 broadcast 192.168.50.255 >> >> inet 192.168.50.224 netmask 0xffffff00 broadcast 192.168.50.255 >> >> inet 192.168.50.225 netmask 0xffffff00 broadcast 192.168.50.255 >> >> inet 192.168.50.226 netmask 0xffffff00 broadcast 192.168.50.255 >> >> inet 192.168.50.227 netmask 0xffffff00 broadcast 192.168.50.255 >> >> inet 192.168.50.219 netmask 0xffffff00 broadcast 192.168.50.255 >> >> media: Ethernet autoselect (1000baseT ) >> >> status: active >> >>bce1: flags=8802 metric 0 mtu 1500 >> >> options=c01bb >> >> ether 84:2b:2b:68:64:e5 >> >> media: Ethernet autoselect >> >>lo0: flags=8049 metric 0 mtu 16384 >> >> options=3 >> >> inet6 fe80::1%lo0 prefixlen 64 scopeid 0xb >> >> inet6 ::1 prefixlen 128 >> >> inet 127.0.0.1 netmask 0xff000000 >> >> nd6 options=3 >> >>vboxnet0: flags=8802 metric 0 mtu 1500 >> >> ether 0a:00:27:00:00:00 >> >>DellT410one# netstat -rn >> >>Routing tables >> >> >> >>Internet:
>> >>Destination Gateway Flags Refs Use Netif Expire >> >>default 192.168.50.201 UGS 0 52195 bce0 >> >>127.0.0.1 link#11 UH 0 6 lo0 >> >>192.168.50.0/24 link#1 U 0 1118212 bce0 >> >>192.168.50.219 link#1 UHS 0 9670 lo0 >> >>192.168.50.220 link#1 UHS 0 8347 lo0 >> >>192.168.50.221 link#1 UHS 0 103024 lo0 >> >>192.168.50.223 link#1 UHS 0 43614 lo0 >> >>192.168.50.224 link#1 UHS 0 8358 lo0 >> >>192.168.50.225 link#1 UHS 0 8438 lo0 >> >>192.168.50.226 link#1 UHS 0 8338 lo0 >> >>192.168.50.227 link#1 UHS 0 8333 lo0 >> >>192.168.165.0/24 192.168.50.200 UGS 0 3311 bce0 >> >>192.168.166.0/24 192.168.50.200 UGS 0 699 bce0 >> >>192.168.167.0/24 192.168.50.200 UGS 0 3012 bce0 >> >>192.168.168.0/24 192.168.50.200 UGS 0 552 bce0 >> >> >> >>Internet6: >> >>Destination Gateway >> >>Flags Netif Expire >> >>::1 ::1 UH >> >>lo0 >> >>fe80::%lo0/64 link#11 U >> >>lo0 >> >>fe80::1%lo0 link#11 UHS >> >>lo0 >> >>ff01::%lo0/32 fe80::1%lo0 U >> >>lo0 >> >>ff02::%lo0/32 fe80::1%lo0 U >> >>lo0 >> >>DellT410one# kldstat >> >>Id Refs Address Size Name >> >> 1 19 0xffffffff80100000 dbf5d0 kernel >> >> 2 3 0xffffffff80ec0000 4c358 vboxdrv.ko >> >> 3 1 0xffffffff81012000 131998 zfs.ko >> >> 4 1 0xffffffff81144000 1ff1 opensolaris.ko >> >> 5 2 0xffffffff81146000 2940 vboxnetflt.ko >> >> 6 2 0xffffffff81149000 8e38 netgraph.ko >> >> 7 1 0xffffffff81152000 153c ng_ether.ko >> >> 8 1 0xffffffff81154000 e70 vboxnetadp.ko >> >>DellT410one# pciconf -lv >> >>.. >> >>bce0@pci0:1:0:0: class=0x020000 card=0x028d1028 >> >>chip=0x163b14e4 rev=0x20 hdr=0x00 >> >> vendor = 'Broadcom Corporation' >> >> class = network >> >> subclass = ethernet >> >>bce1@pci0:1:0:1: class=0x020000 card=0x028d1028 >> >>chip=0x163b14e4 rev=0x20 hdr=0x00 >> >> vendor = 'Broadcom Corporation' >> >> class = network >> >> subclass = ethernet >> > >> >Could you please provide "pciconf -lvcb" output instead, specific to the >> >bce chips? Thanks.
>> >> Her it is: >> >> bce0@pci0:1:0:0: class=0x020000 card=0x028d1028 >> chip=0x163b14e4 rev=0x20 hdr=0x00 >> vendor = 'Broadcom Corporation' >> class = network >> subclass = ethernet >> bar [10] = type Memory, range 64, base 0xda000000, size >> 33554432, enabled >> cap 01[48] = powerspec 3 supports D0 D3 current D0 >> cap 03[50] = VPD >> cap 05[58] = MSI supports 16 messages, 64 bit enabled with 1 message >> cap 11[a0] = MSI-X supports 9 messages in map 0x10 >> cap 10[ac] = PCI-Express 2 endpoint max data 256(512) link x4(x4) >> ecap 0003[100] = Serial 1 842b2bfffe6864e4 >> ecap 0001[110] = AER 1 0 fatal 0 non-fatal 1 corrected >> ecap 0004[150] = unknown 1 >> ecap 0002[160] = VC 1 max VC0 > > Thanks Peter. > > Adding Yong-Hyeon and David to the discussion, since they've both worked > on the bce(4) driver in recent months (most of the changes made recently > are only in HEAD), and also adding Jack Vogel of Intel who maintains > em(4). Brief history for the devs: > > The issue is described "Network memory allocation failures" and was > reported last year, but two users recently (Scott and Peter) have > reported the issue again: > > http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/thread.html#58708 > > And was mentioned again by Scott here, which also contains some > technical details: > > http://lists.freebsd.org/pipermail/freebsd-stable/2011-July/063172.html > > What's interesting is that Scott's issue is identical in form but he's > using em(4), which isn't known to behave like this. Both individuals > are using VirtualBox, though we're not sure at this point if that is the > piece which is causing the anomaly.
> > Relevant details of Scott's system (em-based): > > http://www.cap-press.com/misc/ > > Relevant details of Peter's system (bce-based): > > http://lists.freebsd.org/pipermail/freebsd-stable/2011-July/063221.html > http://lists.freebsd.org/pipermail/freebsd-stable/2011-July/063223.html > > I think the biggest complexity right now is figuring out how/why scp > fails intermittently in this nature. The errno probably "trickles down" > to userland from the kernel, but the condition regarding why it happens > is unknown. BTW: I also saw 2 of the errors coming from a BIND9 running in a jail on that box. DellT410one# fgrep -i allocate /jails/bind/20110315/var/log/messages Apr 13 05:17:41 bind named[23534]: internal_send: 192.168.50.145#65176: Cannot allocate memory Jun 21 23:30:44 bind named[39864]: internal_send: 192.168.50.251#36155: Cannot allocate memory Jun 24 15:28:00 bind named[39864]: internal_send: 192.168.50.251#28651: Cannot allocate memory Jun 28 12:57:52 bind named[2462]: internal_send: 192.168.165.154#1201: Cannot allocate memory My initial guess: it happens sooner or later somehow - whether it is a lot of traffic in one go (ssh/scp copies of virtual disks) or a lot of traffic over a longer period (a nameserver gets asked again and again).
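Editorial aside, not part of the original message: the log evidence above could be tallied with something like the following Python sketch. It counts "Cannot allocate memory" lines per program and per month, which may help show whether the failures cluster around one daemon or one period. The sample lines are copied from this thread (the sshd line comes from Scott's auth.log on a different host); the function name tally_enomem and the regular expression are assumptions for a typical syslog layout, not something anyone in the thread ran.

```python
import re
from collections import Counter

# Sample lines copied from this thread. The first four are from the BIND jail
# on Peter's box; the sshd line is from Scott's auth.log on a different host.
SAMPLE_LOG = """\
Apr 13 05:17:41 bind named[23534]: internal_send: 192.168.50.145#65176: Cannot allocate memory
Jun 21 23:30:44 bind named[39864]: internal_send: 192.168.50.251#36155: Cannot allocate memory
Jun 24 15:28:00 bind named[39864]: internal_send: 192.168.50.251#28651: Cannot allocate memory
Jun 28 12:57:52 bind named[2462]: internal_send: 192.168.165.154#1201: Cannot allocate memory
Jul  1 14:54:42 freebsd sshd[18955]: fatal: Write failed: Cannot allocate memory
"""

# Assumed layout: month day time host program[pid]: message
LINE_RE = re.compile(
    r"^(?P<month>\w{3})\s+(?P<day>\d+)\s+\S+\s+\S+\s+"
    r"(?P<prog>[\w.-]+)\[\d+\]:\s+(?P<msg>.*)$"
)


def tally_enomem(lines):
    """Count 'Cannot allocate memory' lines per program and per month."""
    by_prog = Counter()
    by_month = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # not a syslog-style line
        if "cannot allocate memory" not in match["msg"].lower():
            continue  # some other message
        by_prog[match["prog"]] += 1
        by_month[match["month"]] += 1
    return by_prog, by_month


if __name__ == "__main__":
    by_prog, by_month = tally_enomem(SAMPLE_LOG.splitlines())
    print(dict(by_prog))
    print(dict(by_month))
```

To use it on a real log, pass the lines of the file (for example, the messages file inside the jail) to tally_enomem instead of SAMPLE_LOG.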
Regards Peter From owner-freebsd-stable@FreeBSD.ORG Wed Jul 6 04:15:09 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 00472106564A for ; Wed, 6 Jul 2011 04:15:08 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta09.westchester.pa.mail.comcast.net (qmta09.westchester.pa.mail.comcast.net [76.96.62.96]) by mx1.freebsd.org (Postfix) with ESMTP id A03848FC19 for ; Wed, 6 Jul 2011 04:15:08 +0000 (UTC) Received: from omta13.westchester.pa.mail.comcast.net ([76.96.62.52]) by qmta09.westchester.pa.mail.comcast.net with comcast id 4UDX1h00317dt5G59UF9zJ; Wed, 06 Jul 2011 04:15:09 +0000 Received: from koitsu.dyndns.org ([67.180.84.87]) by omta13.westchester.pa.mail.comcast.net with comcast id 4UF61h00C1t3BNj3ZUF7vT; Wed, 06 Jul 2011 04:15:08 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id AF9AB102C36; Tue, 5 Jul 2011 21:15:04 -0700 (PDT) Date: Tue, 5 Jul 2011 21:15:04 -0700 From: Jeremy Chadwick To: Peter Ross Message-ID: <20110706041504.GA73698@icarus.home.lan> References: <20110706122339.61453nlqra1vqsrv@webmail.in-berlin.de> <20110706023234.GA72048@icarus.home.lan> <20110706130753.182053f3ellasn0p@webmail.in-berlin.de> <20110706032425.GA72757@icarus.home.lan> <20110706135412.15276i0fxavg09k4@webmail.in-berlin.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20110706135412.15276i0fxavg09k4@webmail.in-berlin.de> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: Yong-Hyeon Pyun , davidch@freebsd.org, freebsd-stable List , "Vogel, Jack" , Scott Sipe Subject: Re: scp: Write Failed: Cannot allocate memory X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Jul 2011 04:15:09 -0000 On Wed, 
Jul 06, 2011 at 01:54:12PM +1000, Peter Ross wrote: > Quoting "Jeremy Chadwick" : > > >On Wed, Jul 06, 2011 at 01:07:53PM +1000, Peter Ross wrote: > >>Quoting "Jeremy Chadwick" : > >> > >>>On Wed, Jul 06, 2011 at 12:23:39PM +1000, Peter Ross wrote: > >>>>Quoting "Jeremy Chadwick" : > >>>> > >>>>>On Tue, Jul 05, 2011 at 01:03:20PM -0400, Scott Sipe wrote: > >>>>>>I'm running virtualbox 3.2.12_1 if that has anything to do with it. > >>>>>> > >>>>>>sysctl vfs.zfs.arc_max: 6200000000 > >>>>>> > >>>>>>While I'm trying to scp, kstat.zfs.misc.arcstats.size is > >>>>>>hovering right around that value, sometimes above, sometimes > >>>>>>below (that's as it should be, right?). I don't think that it > >>>>>>dies when crossing over arc_max. I can run the same scp 10 times > >>>>>>and it might fail 1-3 times, with no correlation to the > >>>>>>arcstats.size being above/below arc_max that I can see. > >>>>>> > >>>>>>Scott > >>>>>> > >>>>>>On Jul 5, 2011, at 3:00 AM, Peter Ross wrote: > >>>>>> > >>>>>>>Hi all, > >>>>>>> > >>>>>>>just as an addition: an upgrade to last Friday's > >>>>>>>FreeBSD-Stable and to VirtualBox 4.0.8 does not fix the > >>>>>>>problem. > >>>>>>> > >>>>>>>I will experiment a bit more tomorrow after hours and grab > >>>>some statistics. > >>>>>>> > >>>>>>>Regards > >>>>>>>Peter > >>>>>>> > >>>>>>>Quoting "Peter Ross" : > >>>>>>> > >>>>>>>>Hi all, > >>>>>>>> > >>>>>>>>I noticed a similar problem last week. It is also very > >>>>>>>>similar to one reported last year: > >>>>>>>> > >>>>>>>>http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/058708.html > >>>>>>>> > >>>>>>>>My server is a Dell T410 server with the same bge card (the > >>>>>>>>same pciconf -lvc output as described by Mahlon: > >>>>>>>> > >>>>>>>>http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/058711.html > >>>>>>>> > >>>>>>>>Yours, Scott, is a em(4).. > >>>>>>>> > >>>>>>>>Another similarity: In all cases we are using VirtualBox. 
I > >>>>>>>>just want to mention it, in case it matters. I am still > >>>>>>>>running VirtualBox 3.2. > >>>>>>>> > >>>>>>>>Most of the time kstat.zfs.misc.arcstats.size was reaching > >>>>>>>>vfs.zfs.arc_max then, but I could catch one or two cases > >>>>>>>>then the value was still below. > >>>>>>>> > >>>>>>>>I added vfs.zfs.prefetch_disable=1 to sysctl.conf but it > >>does not help. > >>>>>>>> > >>>>>>>>BTW: It looks as ARC only gives back the memory when I > >>>>>>>>destroy the ZFS (a cloned snapshot containing virtual > >>>>>>>>machines). Even if nothing happens for hours the buffer > >>>>>>>>isn't released.. > >>>>>>>> > >>>>>>>>My machine was still running 8.2-PRERELEASE so I am upgrading. > >>>>>>>> > >>>>>>>>I am happy to give information gathered on old/new kernel if it helps. > >>>>>>>> > >>>>>>>>Regards > >>>>>>>>Peter > >>>>>>>> > >>>>>>>>Quoting "Scott Sipe" : > >>>>>>>> > >>>>>>>>> > >>>>>>>>>On Jul 2, 2011, at 12:54 AM, jhell wrote: > >>>>>>>>> > >>>>>>>>>>On Fri, Jul 01, 2011 at 03:22:32PM -0700, Jeremy Chadwick wrote: > >>>>>>>>>>>On Fri, Jul 01, 2011 at 03:13:17PM -0400, Scott Sipe wrote: > >>>>>>>>>>>>I'm running 8.2-RELEASE and am having new problems > >>>>>>>>>>>>with scp. When scping > >>>>>>>>>>>>files to a ZFS directory on the FreeBSD server -- > >>>>>>>>>>>>most notably large files > >>>>>>>>>>>>-- the transfer frequently dies after just a few > >>>>>>>>>>>>seconds. In my last test, I > >>>>>>>>>>>>tried to scp an 800mb file to the FreeBSD system and > >>>>>>>>>>>>the transfer died after > >>>>>>>>>>>>200mb. It completely copied the next 4 times I > >>>>>>>>>>>>tried, and then died again on > >>>>>>>>>>>>the next attempt. > >>>>>>>>>>>> > >>>>>>>>>>>>On the client side: > >>>>>>>>>>>> > >>>>>>>>>>>>"Connection to home closed by remote host. 
> >>>>>>>>>>>>lost connection" > >>>>>>>>>>>> > >>>>>>>>>>>>In /var/log/auth.log: > >>>>>>>>>>>> > >>>>>>>>>>>>Jul 1 14:54:42 freebsd sshd[18955]: fatal: Write > >>>>>>>>>>>>failed: Cannot allocate > >>>>>>>>>>>>memory > >>>>>>>>>>>> > >>>>>>>>>>>>I've never seen this before and have used scp before > >>>>>>>>>>>>to transfer large files > >>>>>>>>>>>>without problems. This computer has been used in > >>>>>>>>>>>>production for months and > >>>>>>>>>>>>has a current uptime of 36 days. I have not been > >>>>>>>>>>>>able to notice any problems > >>>>>>>>>>>>copying files to the server via samba or netatalk, or > >>>>any problems in > >>>>>>>>>>>>apache. > >>>>>>>>>>>> > >>>>>>>>>>>>Uname: > >>>>>>>>>>>> > >>>>>>>>>>>>FreeBSD xeon 8.2-RELEASE FreeBSD 8.2-RELEASE #0: Sat > >>>>>>>>>>>>Feb 19 01:02:54 EST > >>>>>>>>>>>>2011 root@xeon:/usr/obj/usr/src/sys/GENERIC amd64 > >>>>>>>>>>>> > >>>>>>>>>>>>I've attached my dmesg and output of vmstat -z. > >>>>>>>>>>>> > >>>>>>>>>>>>I have not restarted the sshd daemon or rebooted the computer. > >>>>>>>>>>>> > >>>>>>>>>>>>Am glad to provide any other information or test anything else. > >>>>>>>>>>>> > >>>>>>>>>>>>{snip vmstat -z and dmesg} > >>>>>>>>>>> > >>>>>>>>>>>You didn't provide details about your networking setup (rc.conf, > >>>>>>>>>>>ifconfig -a, etc.). netstat -m would be useful too. > >>>>>>>>>>> > >>>>>>>>>>>Next, please see this thread circa September 2010, titled "Network > >>>>>>>>>>>memory allocation failures": > >>>>>>>>>>> > >>>>>>>>>>>http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/thread.html#58708 > >>>>>>>>>>> > >>>>>>>>>>>The user in that thread is using rsync, which relies on > >>>>scp by default. > >>>>>>>>>>>I believe this problem is similar, if not identical, to yours. > >>>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>>Please also provide your output of ( /usr/bin/limits -a ) > >>>>for the server > >>>>>>>>>>end and the client. 
> >>>>>>>>>> > >>>>>>>>>>I am not quite sure I agree with the need for ifconfig -a but some > >>>>>>>>>>information about the networking driver your using for the interface > >>>>>>>>>>would be helpful, uptime of the boxes. And configuration > >>of the pool. > >>>>>>>>>>e.g. ( zpool status -a ;zfs get all ) You should probably > >>>>>>>>>>prop this information up somewhere so you can reference by > >>>>URL whenever > >>>>>>>>>>needed. > >>>>>>>>>> > >>>>>>>>>>rsync(1) does not rely on scp(1) whatsoever but rsync(1) > >>>>can be made to > >>>>>>>>>>use ssh(1) instead of rsh(1) and I believe that is what Jeremy is > >>>>>>>>>>stating here but correct me if I am wrong. It does use ssh(1) by > >>>>>>>>>>default. > >>>>>>>>>> > >>>>>>>>>>Its a possiblity as well that if using tmpfs(5) or mdmfs(8) for /tmp > >>>>>>>>>>type filesystems that rsync(1) may be just filling up your > >>>>temp ram area > >>>>>>>>>>and causing the connection abort which would be > >>>>>>>>>>expected. ( df -h ) would > >>>>>>>>>>help here. > >>>>>>>>> > >>>>>>>>>Hello, > >>>>>>>>> > >>>>>>>>>I'm not using tmpfs/mdmfs at all. The clients yesterday > >>>>>>>>>were 3 different OSX computers (over gigabit). The FreeBSD > >>>>>>>>>server has 12gb of ram and no bce adapter. For what it's > >>>>>>>>>worth, the server is backed up remotely every night with > >>>>>>>>>rsync (remote FreeBSD uses rsync to pull) to an offsite > >>>>>>>>>(slow cable connection) FreeBSD computer, and I have not > >>>>>>>>>seen any errors in the nightly rsync. 
> >>>>>>>>> > >>>>>>>>>Sorry for the omission of networking info, here's the > >>>>>>>>>output of the requested commands and some that popped up > >>>>>>>>>in the other thread: > >>>>>>>>> > >>>>>>>>>http://www.cap-press.com/misc/ > >>>>>>>>> > >>>>>>>>>In rc.conf: ifconfig_em1="inet 10.1.1.1 netmask 255.255.0.0" > >>>>>>>>> > >>>>>>>>>Scott > >>>>> > >>>>>Just to make it crystal clear to everyone: > >>>>> > >>>>>There is no correlation between this problem and use of ZFS. People are > >>>>>attempting to correlate "cannot allocate memory" messages with "anything > >>>>>on the system that uses memory". The VM is much more complex than that. > >>>>> > >>>>>Given the nature of this problem, it's much more likely the issue is > >>>>>"somewhere" within a networking layer within FreeBSD, whether it be > >>>>>driver-level or some sort of intermediary layer. > >>>>> > >>>>>Two people who have this issue in this thread are both using VirtualBox. > >>>>>Can one, or both, of you remove VirtualBox from the configuration > >>>>>entirely (kernel, etc. -- not sure what is required) and then see if the > >>>>>issue goes away? > >>>> > >>>>On the machine in question I only can do it after hours so I will do > >>>>it tonight. > >>>> > >>>>I was _successfully_ sending the file over the loopback interface using > >>>> > >>>>cat /zpool/temp/zimbra_oldroot.vdi | ssh localhost "cat > /dev/null" > >>>> > >>>>I did it, btw, with the IPv6 localhost address first (accidently), > >>>>and then using IPv4. Both worked. > >>>> > >>>>It always fails if I am sending it through the bce(4) interface, > >>>>even if my target is the VirtualBox bridged to the bce card (so it > >>>>does not "leave" the computer physically). > >>>> > >>>>Below the uname -a, ifconfig -a, netstat -rn, pciconf -lv and > >>>>kldstat output. > >>>> > >>>>I have another box where I do not see that problem. It copies files > >>>>happily over the net using ssh. 
> >>>> > >>>>It is an an older HP ML 150 with 3GB RAM only but with a bge(4) > >>>>driver instead. It runs the same last week's RELENG_8. I installed > >>>>VirtualBox and enabled vboxnet (so it loads the kernel modules). But > >>>>I do not run VirtualBox on it (because it hasn't enough RAM). > >>>> > >>>>Regards > >>>>Peter > >>>> > >>>>DellT410one# uname -a > >>>>FreeBSD DellT410one.vv.fda 8.2-STABLE FreeBSD 8.2-STABLE #1: Thu Jun > >>>>30 17:07:18 EST 2011 > >>>>root@DellT410one.vv.fda:/usr/obj/usr/src/sys/GENERIC amd64 > >>>>DellT410one# ifconfig -a > >>>>bce0: flags=8943 > >>>>metric 0 mtu 1500 > >>>> options=c01bb > >>>> ether 84:2b:2b:68:64:e4 > >>>> inet 192.168.50.220 netmask 0xffffff00 broadcast 192.168.50.255 > >>>> inet 192.168.50.221 netmask 0xffffff00 broadcast 192.168.50.255 > >>>> inet 192.168.50.223 netmask 0xffffff00 broadcast 192.168.50.255 > >>>> inet 192.168.50.224 netmask 0xffffff00 broadcast 192.168.50.255 > >>>> inet 192.168.50.225 netmask 0xffffff00 broadcast 192.168.50.255 > >>>> inet 192.168.50.226 netmask 0xffffff00 broadcast 192.168.50.255 > >>>> inet 192.168.50.227 netmask 0xffffff00 broadcast 192.168.50.255 > >>>> inet 192.168.50.219 netmask 0xffffff00 broadcast 192.168.50.255 > >>>> media: Ethernet autoselect (1000baseT ) > >>>> status: active > >>>>bce1: flags=8802 metric 0 mtu 1500 > >>>> options=c01bb > >>>> ether 84:2b:2b:68:64:e5 > >>>> media: Ethernet autoselect > >>>>lo0: flags=8049 metric 0 mtu 16384 > >>>> options=3 > >>>> inet6 fe80::1%lo0 prefixlen 64 scopeid 0xb > >>>> inet6 ::1 prefixlen 128 > >>>> inet 127.0.0.1 netmask 0xff000000 > >>>> nd6 options=3 > >>>>vboxnet0: flags=8802 metric 0 mtu 1500 > >>>> ether 0a:00:27:00:00:00 > >>>>DellT410one# netstat -rn > >>>>Routing tables > >>>> > >>>>Internet: > >>>>Destination Gateway Flags Refs Use Netif Expire > >>>>default 192.168.50.201 UGS 0 52195 bce0 > >>>>127.0.0.1 link#11 UH 0 6 lo0 > >>>>192.168.50.0/24 link#1 U 0 1118212 bce0 > >>>>192.168.50.219 link#1 UHS 0 9670 lo0 > 
>>>>192.168.50.220 link#1 UHS 0 8347 lo0 > >>>>192.168.50.221 link#1 UHS 0 103024 lo0 > >>>>192.168.50.223 link#1 UHS 0 43614 lo0 > >>>>192.168.50.224 link#1 UHS 0 8358 lo0 > >>>>192.168.50.225 link#1 UHS 0 8438 lo0 > >>>>192.168.50.226 link#1 UHS 0 8338 lo0 > >>>>192.168.50.227 link#1 UHS 0 8333 lo0 > >>>>192.168.165.0/24 192.168.50.200 UGS 0 3311 bce0 > >>>>192.168.166.0/24 192.168.50.200 UGS 0 699 bce0 > >>>>192.168.167.0/24 192.168.50.200 UGS 0 3012 bce0 > >>>>192.168.168.0/24 192.168.50.200 UGS 0 552 bce0 > >>>> > >>>>Internet6: > >>>>Destination Gateway > >>>>Flags Netif Expire > >>>>::1 ::1 UH > >>>>lo0 > >>>>fe80::%lo0/64 link#11 U > >>>>lo0 > >>>>fe80::1%lo0 link#11 UHS > >>>>lo0 > >>>>ff01::%lo0/32 fe80::1%lo0 U > >>>>lo0 > >>>>ff02::%lo0/32 fe80::1%lo0 U > >>>>lo0 > >>>>DellT410one# kldstat > >>>>Id Refs Address Size Name > >>>> 1 19 0xffffffff80100000 dbf5d0 kernel > >>>> 2 3 0xffffffff80ec0000 4c358 vboxdrv.ko > >>>> 3 1 0xffffffff81012000 131998 zfs.ko > >>>> 4 1 0xffffffff81144000 1ff1 opensolaris.ko > >>>> 5 2 0xffffffff81146000 2940 vboxnetflt.ko > >>>> 6 2 0xffffffff81149000 8e38 netgraph.ko > >>>> 7 1 0xffffffff81152000 153c ng_ether.ko > >>>> 8 1 0xffffffff81154000 e70 vboxnetadp.ko > >>>>DellT410one# pciconf -lv > >>>>.. > >>>>bce0@pci0:1:0:0: class=0x020000 card=0x028d1028 > >>>>chip=0x163b14e4 rev=0x20 hdr=0x00 > >>>> vendor = 'Broadcom Corporation' > >>>> class = network > >>>> subclass = ethernet > >>>>bce1@pci0:1:0:1: class=0x020000 card=0x028d1028 > >>>>chip=0x163b14e4 rev=0x20 hdr=0x00 > >>>> vendor = 'Broadcom Corporation' > >>>> class = network > >>>> subclass = ethernet > >>> > >>>Could you please provide "pciconf -lvcb" output instead, specific to the > >>>bce chips? Thanks. 
> >> > >>Her it is: > >> > >>bce0@pci0:1:0:0: class=0x020000 card=0x028d1028 > >>chip=0x163b14e4 rev=0x20 hdr=0x00 > >> vendor = 'Broadcom Corporation' > >> class = network > >> subclass = ethernet > >> bar [10] = type Memory, range 64, base 0xda000000, size > >>33554432, enabled > >> cap 01[48] = powerspec 3 supports D0 D3 current D0 > >> cap 03[50] = VPD > >> cap 05[58] = MSI supports 16 messages, 64 bit enabled with 1 message > >> cap 11[a0] = MSI-X supports 9 messages in map 0x10 > >> cap 10[ac] = PCI-Express 2 endpoint max data 256(512) link x4(x4) > >>ecap 0003[100] = Serial 1 842b2bfffe6864e4 > >>ecap 0001[110] = AER 1 0 fatal 0 non-fatal 1 corrected > >>ecap 0004[150] = unknown 1 > >>ecap 0002[160] = VC 1 max VC0 > > > >Thanks Peter. > > > >Adding Yong-Hyeon and David to the discussion, since they've both worked > >on the bce(4) driver in recent months (most of the changes made recently > >are only in HEAD), and also adding Jack Vogel of Intel who maintains > >em(4). Brief history for the devs: > > > >The issue is described "Network memory allocation failures" and was > >reported last year, but two users recently (Scott and Peter) have > >reported the issue again: > > > >http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/thread.html#58708 > > > >And was mentioned again by Scott here, which also contains some > >technical details: > > > >http://lists.freebsd.org/pipermail/freebsd-stable/2011-July/063172.html > > > >What's interesting is that Scott's issue is identical in form but he's > >using em(4), which isn't known to behave like this. Both individuals > >are using VirtualBox, though we're not sure at this point if that is the > >piece which is causing the anomaly. 
> > > >Relevant details of Scott's system (em-based): > > > >http://www.cap-press.com/misc/ > > > >Relevant details of Peter's system (bce-based): > > > >http://lists.freebsd.org/pipermail/freebsd-stable/2011-July/063221.html > >http://lists.freebsd.org/pipermail/freebsd-stable/2011-July/063223.html > > > >I think the biggest complexity right now is figuring out how/why scp > >fails intermittently in this nature. The errno probably "trickles down" > >to userland from the kernel, but the condition regarding why it happens > >is unknown. > > BTW: I also saw 2 of the errors coming from a BIND9 running in a > jail on that box. > > DellT410one# fgrep -i allocate /jails/bind/20110315/var/log/messages > Apr 13 05:17:41 bind named[23534]: internal_send: > 192.168.50.145#65176: Cannot allocate memory > Jun 21 23:30:44 bind named[39864]: internal_send: > 192.168.50.251#36155: Cannot allocate memory > Jun 24 15:28:00 bind named[39864]: internal_send: > 192.168.50.251#28651: Cannot allocate memory > Jun 28 12:57:52 bind named[2462]: internal_send: > 192.168.165.154#1201: Cannot allocate memory > > My initial guess: it happens sooner or later somehow - whether it is > a lot of traffic in one go (ssh/scp copies of virtual disks) or a > lot of traffic over a longer period (a nameserver gets asked again > and again). Scott, are you also using jails? If both of you are: is there any possibility you can remove use of those? I'm not sure how VirtualBox fits into the picture (jails + VirtualBox that is), but I can imagine jails having different environmental constraints that might cause this. Basically the troubleshooting process here is to remove pieces of the puzzle until you figure out which piece is causing the issue. I don't want to get the NIC driver devs all spun up for something that, for example, might be an issue with the jail implementation. 
-- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, US | | Making life hard for others since 1977. PGP 4BD6C0CB | From owner-freebsd-stable@FreeBSD.ORG Wed Jul 6 04:31:32 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6967F106566B; Wed, 6 Jul 2011 04:31:32 +0000 (UTC) (envelope-from Peter.Ross@bogen.in-berlin.de) Received: from einhorn.in-berlin.de (einhorn.in-berlin.de [192.109.42.8]) by mx1.freebsd.org (Postfix) with ESMTP id 6B3928FC08; Wed, 6 Jul 2011 04:31:31 +0000 (UTC) X-Envelope-From: Peter.Ross@bogen.in-berlin.de Received: from localhost (okapi.in-berlin.de [192.109.42.117]) by einhorn.in-berlin.de (8.13.6/8.13.6/Debian-1) with ESMTP id p664VTr2007197; Wed, 6 Jul 2011 06:31:30 +0200 Received: from 124-254-118-24-static.bb.ispone.net.au (124-254-118-24-static.bb.ispone.net.au [124.254.118.24]) by webmail.in-berlin.de (Horde Framework) with HTTP; Wed, 06 Jul 2011 14:31:29 +1000 Message-ID: <20110706143129.10696235ldx9bjmp@webmail.in-berlin.de> Date: Wed, 06 Jul 2011 14:31:29 +1000 From: "Peter Ross" To: "Jeremy Chadwick" References: <20110706122339.61453nlqra1vqsrv@webmail.in-berlin.de> <20110706023234.GA72048@icarus.home.lan> <20110706130753.182053f3ellasn0p@webmail.in-berlin.de> <20110706032425.GA72757@icarus.home.lan> <20110706135412.15276i0fxavg09k4@webmail.in-berlin.de> <20110706041504.GA73698@icarus.home.lan> In-Reply-To: <20110706041504.GA73698@icarus.home.lan> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; DelSp="Yes"; format="flowed" Content-Disposition: inline Content-Transfer-Encoding: quoted-printable User-Agent: Internet Messaging Program (IMP) 4.3.3 X-Scanned-By: MIMEDefang_at_IN-Berlin_e.V. 
on 192.109.42.8 Cc: Yong-Hyeon Pyun , freebsd-stable List , davidch@freebsd.org, Scott Sipe , "Vogel, Jack" Subject: Re: scp: Write Failed: Cannot allocate memory X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Jul 2011 04:31:32 -0000 Quoting "Jeremy Chadwick" : > On Wed, Jul 06, 2011 at 01:54:12PM +1000, Peter Ross wrote: >> Quoting "Jeremy Chadwick" : >> >> >On Wed, Jul 06, 2011 at 01:07:53PM +1000, Peter Ross wrote: >> >>Quoting "Jeremy Chadwick" : >> >> >> >>>On Wed, Jul 06, 2011 at 12:23:39PM +1000, Peter Ross wrote: >> >>>>Quoting "Jeremy Chadwick" : >> >>>> >> >>>>>On Tue, Jul 05, 2011 at 01:03:20PM -0400, Scott Sipe wrote: >> >>>>>>I'm running virtualbox 3.2.12_1 if that has anything to do with it. >> >>>>>> >> >>>>>>sysctl vfs.zfs.arc_max: 6200000000 >> >>>>>> >> >>>>>>While I'm trying to scp, kstat.zfs.misc.arcstats.size is >> >>>>>>hovering right around that value, sometimes above, sometimes >> >>>>>>below (that's as it should be, right?). I don't think that it >> >>>>>>dies when crossing over arc_max. I can run the same scp 10 times >> >>>>>>and it might fail 1-3 times, with no correlation to the >> >>>>>>arcstats.size being above/below arc_max that I can see. >> >>>>>> >> >>>>>>Scott >> >>>>>> >> >>>>>>On Jul 5, 2011, at 3:00 AM, Peter Ross wrote: >> >>>>>> >> >>>>>>>Hi all, >> >>>>>>> >> >>>>>>>just as an addition: an upgrade to last Friday's >> >>>>>>>FreeBSD-Stable and to VirtualBox 4.0.8 does not fix the >> >>>>>>>problem. >> >>>>>>> >> >>>>>>>I will experiment a bit more tomorrow after hours and grab >> >>>>some statistics. >> >>>>>>> >> >>>>>>>Regards >> >>>>>>>Peter >> >>>>>>> >> >>>>>>>Quoting "Peter Ross" : >> >>>>>>> >> >>>>>>>>Hi all, >> >>>>>>>> >> >>>>>>>>I noticed a similar problem last week. 
It is also very >> >>>>>>>>similar to one reported last year: >> >>>>>>>> >> >>>>>>>>http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/058708.html >> >>>>>>>> >> >>>>>>>>My server is a Dell T410 server with the same bge card (the >> >>>>>>>>same pciconf -lvc output as described by Mahlon: >> >>>>>>>> >> >>>>>>>>http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/058711.html >> >>>>>>>> >> >>>>>>>>Yours, Scott, is a em(4).. >> >>>>>>>> >> >>>>>>>>Another similarity: In all cases we are using VirtualBox. I >> >>>>>>>>just want to mention it, in case it matters. I am still >> >>>>>>>>running VirtualBox 3.2. >> >>>>>>>> >> >>>>>>>>Most of the time kstat.zfs.misc.arcstats.size was reaching >> >>>>>>>>vfs.zfs.arc_max then, but I could catch one or two cases >> >>>>>>>>then the value was still below. >> >>>>>>>> >> >>>>>>>>I added vfs.zfs.prefetch_disable=1 to sysctl.conf but it >> >>does not help. >> >>>>>>>> >> >>>>>>>>BTW: It looks as ARC only gives back the memory when I >> >>>>>>>>destroy the ZFS (a cloned snapshot containing virtual >> >>>>>>>>machines). Even if nothing happens for hours the buffer >> >>>>>>>>isn't released.. >> >>>>>>>> >> >>>>>>>>My machine was still running 8.2-PRERELEASE so I am upgrading. >> >>>>>>>> >> >>>>>>>>I am happy to give information gathered on old/new kernel if it helps. >> >>>>>>>> >> >>>>>>>>Regards >> >>>>>>>>Peter >> >>>>>>>> >> >>>>>>>>Quoting "Scott Sipe" : >> >>>>>>>> >> >>>>>>>>> >> >>>>>>>>>On Jul 2, 2011, at 12:54 AM, jhell wrote: >> >>>>>>>>> >> >>>>>>>>>>On Fri, Jul 01, 2011 at 03:22:32PM -0700, Jeremy Chadwick wrote: >> >>>>>>>>>>>On Fri, Jul 01, 2011 at 03:13:17PM -0400, Scott Sipe wrote: >> >>>>>>>>>>>>I'm running 8.2-RELEASE and am having new problems >> >>>>>>>>>>>>with scp. When scping >> >>>>>>>>>>>>files to a ZFS directory on the FreeBSD server -- >> >>>>>>>>>>>>most notably large files >> >>>>>>>>>>>>-- the transfer frequently dies after just a few >> >>>>>>>>>>>>seconds.
In my last test, I >> >>>>>>>>>>>>tried to scp an 800mb file to the FreeBSD system and >> >>>>>>>>>>>>the transfer died after >> >>>>>>>>>>>>200mb. It completely copied the next 4 times I >> >>>>>>>>>>>>tried, and then died again on >> >>>>>>>>>>>>the next attempt. >> >>>>>>>>>>>> >> >>>>>>>>>>>>On the client side: >> >>>>>>>>>>>> >> >>>>>>>>>>>>"Connection to home closed by remote host. >> >>>>>>>>>>>>lost connection" >> >>>>>>>>>>>> >> >>>>>>>>>>>>In /var/log/auth.log: >> >>>>>>>>>>>> >> >>>>>>>>>>>>Jul 1 14:54:42 freebsd sshd[18955]: fatal: Write >> >>>>>>>>>>>>failed: Cannot allocate >> >>>>>>>>>>>>memory >> >>>>>>>>>>>> >> >>>>>>>>>>>>I've never seen this before and have used scp before >> >>>>>>>>>>>>to transfer large files >> >>>>>>>>>>>>without problems. This computer has been used in >> >>>>>>>>>>>>production for months and >> >>>>>>>>>>>>has a current uptime of 36 days. I have not been >> >>>>>>>>>>>>able to notice any problems >> >>>>>>>>>>>>copying files to the server via samba or netatalk, or >> >>>>any problems in >> >>>>>>>>>>>>apache. >> >>>>>>>>>>>> >> >>>>>>>>>>>>Uname: >> >>>>>>>>>>>> >> >>>>>>>>>>>>FreeBSD xeon 8.2-RELEASE FreeBSD 8.2-RELEASE #0: Sat >> >>>>>>>>>>>>Feb 19 01:02:54 EST >> >>>>>>>>>>>>2011 root@xeon:/usr/obj/usr/src/sys/GENERIC amd64 >> >>>>>>>>>>>> >> >>>>>>>>>>>>I've attached my dmesg and output of vmstat -z. >> >>>>>>>>>>>> >> >>>>>>>>>>>>I have not restarted the sshd daemon or rebooted the computer= . >> >>>>>>>>>>>> >> >>>>>>>>>>>>Am glad to provide any other information or test anything els= e. >> >>>>>>>>>>>> >> >>>>>>>>>>>>{snip vmstat -z and dmesg} >> >>>>>>>>>>> >> >>>>>>>>>>>You didn't provide details about your networking setup (rc.con= f, >> >>>>>>>>>>>ifconfig -a, etc.). netstat -m would be useful too. 
>> >>>>>>>>>>> >> >>>>>>>>>>>Next, please see this thread circa September 2010, titled "Network >> >>>>>>>>>>>memory allocation failures": >> >>>>>>>>>>> >> >>>>>>>>>>>http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/thread.html#58708 >> >>>>>>>>>>> >> >>>>>>>>>>>The user in that thread is using rsync, which relies on >> >>>>scp by default. >> >>>>>>>>>>>I believe this problem is similar, if not identical, to yours. >> >>>>>>>>>>> >> >>>>>>>>>> >> >>>>>>>>>>Please also provide your output of ( /usr/bin/limits -a ) >> >>>>for the server >> >>>>>>>>>>end and the client. >> >>>>>>>>>> >> >>>>>>>>>>I am not quite sure I agree with the need for ifconfig -a but some >> >>>>>>>>>>information about the networking driver your using for the interface >> >>>>>>>>>>would be helpful, uptime of the boxes. And configuration >> >>of the pool. >> >>>>>>>>>>e.g. ( zpool status -a ;zfs get all ) You should probably >> >>>>>>>>>>prop this information up somewhere so you can reference by >> >>>>URL whenever >> >>>>>>>>>>needed. >> >>>>>>>>>> >> >>>>>>>>>>rsync(1) does not rely on scp(1) whatsoever but rsync(1) >> >>>>can be made to >> >>>>>>>>>>use ssh(1) instead of rsh(1) and I believe that is what Jeremy is >> >>>>>>>>>>stating here but correct me if I am wrong. It does use ssh(1) by >> >>>>>>>>>>default. >> >>>>>>>>>> >> >>>>>>>>>>Its a possiblity as well that if using tmpfs(5) or mdmfs(8) for /tmp >> >>>>>>>>>>type filesystems that rsync(1) may be just filling up your >> >>>>temp ram area >> >>>>>>>>>>and causing the connection abort which would be >> >>>>>>>>>>expected. ( df -h ) would >> >>>>>>>>>>help here. >> >>>>>>>>> >> >>>>>>>>>Hello, >> >>>>>>>>> >> >>>>>>>>>I'm not using tmpfs/mdmfs at all. The clients yesterday >> >>>>>>>>>were 3 different OSX computers (over gigabit). The FreeBSD >> >>>>>>>>>server has 12gb of ram and no bce adapter.
For what it's >> >>>>>>>>>worth, the server is backed up remotely every night with >> >>>>>>>>>rsync (remote FreeBSD uses rsync to pull) to an offsite >> >>>>>>>>>(slow cable connection) FreeBSD computer, and I have not >> >>>>>>>>>seen any errors in the nightly rsync. >> >>>>>>>>> >> >>>>>>>>>Sorry for the omission of networking info, here's the >> >>>>>>>>>output of the requested commands and some that popped up >> >>>>>>>>>in the other thread: >> >>>>>>>>> >> >>>>>>>>>http://www.cap-press.com/misc/ >> >>>>>>>>> >> >>>>>>>>>In rc.conf: ifconfig_em1=3D"inet 10.1.1.1 netmask 255.255.0.0" >> >>>>>>>>> >> >>>>>>>>>Scott >> >>>>> >> >>>>>Just to make it crystal clear to everyone: >> >>>>> >> >>>>>There is no correlation between this problem and use of ZFS. =20 >> People are >> >>>>>attempting to correlate "cannot allocate memory" messages with =20 >> "anything >> >>>>>on the system that uses memory". The VM is much more complex =20 >> than that. >> >>>>> >> >>>>>Given the nature of this problem, it's much more likely the issue is >> >>>>>"somewhere" within a networking layer within FreeBSD, whether it be >> >>>>>driver-level or some sort of intermediary layer. >> >>>>> >> >>>>>Two people who have this issue in this thread are both using =20 >> VirtualBox. >> >>>>>Can one, or both, of you remove VirtualBox from the configuration >> >>>>>entirely (kernel, etc. -- not sure what is required) and then =20 >> see if the >> >>>>>issue goes away? >> >>>> >> >>>>On the machine in question I only can do it after hours so I will do >> >>>>it tonight. >> >>>> >> >>>>I was _successfully_ sending the file over the loopback interface usi= ng >> >>>> >> >>>>cat /zpool/temp/zimbra_oldroot.vdi | ssh localhost "cat > /dev/null" >> >>>> >> >>>>I did it, btw, with the IPv6 localhost address first (accidently), >> >>>>and then using IPv4. Both worked. 
>> >>>> >> >>>>It always fails if I am sending it through the bce(4) interface, >> >>>>even if my target is the VirtualBox bridged to the bce card (so it >> >>>>does not "leave" the computer physically). >> >>>> >> >>>>Below the uname -a, ifconfig -a, netstat -rn, pciconf -lv and >> >>>>kldstat output. >> >>>> >> >>>>I have another box where I do not see that problem. It copies files >> >>>>happily over the net using ssh. >> >>>> >> >>>>It is an an older HP ML 150 with 3GB RAM only but with a bge(4) >> >>>>driver instead. It runs the same last week's RELENG_8. I installed >> >>>>VirtualBox and enabled vboxnet (so it loads the kernel modules). But >> >>>>I do not run VirtualBox on it (because it hasn't enough RAM). >> >>>> >> >>>>Regards >> >>>>Peter >> >>>> >> >>>>DellT410one# uname -a >> >>>>FreeBSD DellT410one.vv.fda 8.2-STABLE FreeBSD 8.2-STABLE #1: Thu Jun >> >>>>30 17:07:18 EST 2011 >> >>>>root@DellT410one.vv.fda:/usr/obj/usr/src/sys/GENERIC amd64 >> >>>>DellT410one# ifconfig -a >> >>>>bce0: flags=3D8943 >> >>>>metric 0 mtu 1500 >> >>>>=09options=3Dc01bb >> >>>>=09ether 84:2b:2b:68:64:e4 >> >>>>=09inet 192.168.50.220 netmask 0xffffff00 broadcast 192.168.50.255 >> >>>>=09inet 192.168.50.221 netmask 0xffffff00 broadcast 192.168.50.255 >> >>>>=09inet 192.168.50.223 netmask 0xffffff00 broadcast 192.168.50.255 >> >>>>=09inet 192.168.50.224 netmask 0xffffff00 broadcast 192.168.50.255 >> >>>>=09inet 192.168.50.225 netmask 0xffffff00 broadcast 192.168.50.255 >> >>>>=09inet 192.168.50.226 netmask 0xffffff00 broadcast 192.168.50.255 >> >>>>=09inet 192.168.50.227 netmask 0xffffff00 broadcast 192.168.50.255 >> >>>>=09inet 192.168.50.219 netmask 0xffffff00 broadcast 192.168.50.255 >> >>>>=09media: Ethernet autoselect (1000baseT ) >> >>>>=09status: active >> >>>>bce1: flags=3D8802 metric 0 mtu 1500 >> >>>>=09options=3Dc01bb >> >>>>=09ether 84:2b:2b:68:64:e5 >> >>>>=09media: Ethernet autoselect >> >>>>lo0: flags=3D8049 metric 0 mtu 16384 >> >>>>=09options=3D3 >> >>>>=09inet6 
fe80::1%lo0 prefixlen 64 scopeid 0xb >> >>>>=09inet6 ::1 prefixlen 128 >> >>>>=09inet 127.0.0.1 netmask 0xff000000 >> >>>>=09nd6 options=3D3 >> >>>>vboxnet0: flags=3D8802 metric 0 mtu 1500 >> >>>>=09ether 0a:00:27:00:00:00 >> >>>>DellT410one# netstat -rn >> >>>>Routing tables >> >>>> >> >>>>Internet: >> >>>>Destination Gateway Flags Refs Use =20 >> Netif Expire >> >>>>default 192.168.50.201 UGS 0 52195 bce0 >> >>>>127.0.0.1 link#11 UH 0 6 lo0 >> >>>>192.168.50.0/24 link#1 U 0 1118212 bce0 >> >>>>192.168.50.219 link#1 UHS 0 9670 lo0 >> >>>>192.168.50.220 link#1 UHS 0 8347 lo0 >> >>>>192.168.50.221 link#1 UHS 0 103024 lo0 >> >>>>192.168.50.223 link#1 UHS 0 43614 lo0 >> >>>>192.168.50.224 link#1 UHS 0 8358 lo0 >> >>>>192.168.50.225 link#1 UHS 0 8438 lo0 >> >>>>192.168.50.226 link#1 UHS 0 8338 lo0 >> >>>>192.168.50.227 link#1 UHS 0 8333 lo0 >> >>>>192.168.165.0/24 192.168.50.200 UGS 0 3311 bce0 >> >>>>192.168.166.0/24 192.168.50.200 UGS 0 699 bce0 >> >>>>192.168.167.0/24 192.168.50.200 UGS 0 3012 bce0 >> >>>>192.168.168.0/24 192.168.50.200 UGS 0 552 bce0 >> >>>> >> >>>>Internet6: >> >>>>Destination Gateway >> >>>>Flags Netif Expire >> >>>>::1 ::1 UH >> >>>>lo0 >> >>>>fe80::%lo0/64 link#11 U >> >>>>lo0 >> >>>>fe80::1%lo0 link#11 UHS >> >>>>lo0 >> >>>>ff01::%lo0/32 fe80::1%lo0 U >> >>>>lo0 >> >>>>ff02::%lo0/32 fe80::1%lo0 U >> >>>>lo0 >> >>>>DellT410one# kldstat >> >>>>Id Refs Address Size Name >> >>>> 1 19 0xffffffff80100000 dbf5d0 kernel >> >>>> 2 3 0xffffffff80ec0000 4c358 vboxdrv.ko >> >>>> 3 1 0xffffffff81012000 131998 zfs.ko >> >>>> 4 1 0xffffffff81144000 1ff1 opensolaris.ko >> >>>> 5 2 0xffffffff81146000 2940 vboxnetflt.ko >> >>>> 6 2 0xffffffff81149000 8e38 netgraph.ko >> >>>> 7 1 0xffffffff81152000 153c ng_ether.ko >> >>>> 8 1 0xffffffff81154000 e70 vboxnetadp.ko >> >>>>DellT410one# pciconf -lv >> >>>>.. 
>> >>>>bce0@pci0:1:0:0: class=0x020000 card=0x028d1028 >> >>>>chip=0x163b14e4 rev=0x20 hdr=0x00 >> >>>> vendor = 'Broadcom Corporation' >> >>>> class = network >> >>>> subclass = ethernet >> >>>>bce1@pci0:1:0:1: class=0x020000 card=0x028d1028 >> >>>>chip=0x163b14e4 rev=0x20 hdr=0x00 >> >>>> vendor = 'Broadcom Corporation' >> >>>> class = network >> >>>> subclass = ethernet >> >>> >> >>>Could you please provide "pciconf -lvcb" output instead, specific to the >> >>>bce chips? Thanks. >> >> >> >>Her it is: >> >> >> >>bce0@pci0:1:0:0: class=0x020000 card=0x028d1028 >> >>chip=0x163b14e4 rev=0x20 hdr=0x00 >> >> vendor = 'Broadcom Corporation' >> >> class = network >> >> subclass = ethernet >> >> bar [10] = type Memory, range 64, base 0xda000000, size >> >>33554432, enabled >> >> cap 01[48] = powerspec 3 supports D0 D3 current D0 >> >> cap 03[50] = VPD >> >> cap 05[58] = MSI supports 16 messages, 64 bit enabled with 1 message >> >> cap 11[a0] = MSI-X supports 9 messages in map 0x10 >> >> cap 10[ac] = PCI-Express 2 endpoint max data 256(512) link x4(x4) >> >>ecap 0003[100] = Serial 1 842b2bfffe6864e4 >> >>ecap 0001[110] = AER 1 0 fatal 0 non-fatal 1 corrected >> >>ecap 0004[150] = unknown 1 >> >>ecap 0002[160] = VC 1 max VC0 >> > >> >Thanks Peter. >> > >> >Adding Yong-Hyeon and David to the discussion, since they've both worked >> >on the bce(4) driver in recent months (most of the changes made recently >> >are only in HEAD), and also adding Jack Vogel of Intel who maintains >> >em(4).
Brief history for the devs: >> > >> >The issue is described "Network memory allocation failures" and was >> >reported last year, but two users recently (Scott and Peter) have >> >reported the issue again: >> > >> >http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/thread.html#58708 >> > >> >And was mentioned again by Scott here, which also contains some >> >technical details: >> > >> >http://lists.freebsd.org/pipermail/freebsd-stable/2011-July/063172.html >> > >> >What's interesting is that Scott's issue is identical in form but he's >> >using em(4), which isn't known to behave like this. Both individuals >> >are using VirtualBox, though we're not sure at this point if that is the >> >piece which is causing the anomaly. >> > >> >Relevant details of Scott's system (em-based): >> > >> >http://www.cap-press.com/misc/ >> > >> >Relevant details of Peter's system (bce-based): >> > >> >http://lists.freebsd.org/pipermail/freebsd-stable/2011-July/063221.html >> >http://lists.freebsd.org/pipermail/freebsd-stable/2011-July/063223.html >> > >> >I think the biggest complexity right now is figuring out how/why scp >> >fails intermittently in this nature. The errno probably "trickles down" >> >to userland from the kernel, but the condition regarding why it happens >> >is unknown. >> >> BTW: I also saw 2 of the errors coming from a BIND9 running in a >> jail on that box.
>> >> DellT410one# fgrep -i allocate /jails/bind/20110315/var/log/messages >> Apr 13 05:17:41 bind named[23534]: internal_send: >> 192.168.50.145#65176: Cannot allocate memory >> Jun 21 23:30:44 bind named[39864]: internal_send: >> 192.168.50.251#36155: Cannot allocate memory >> Jun 24 15:28:00 bind named[39864]: internal_send: >> 192.168.50.251#28651: Cannot allocate memory >> Jun 28 12:57:52 bind named[2462]: internal_send: >> 192.168.165.154#1201: Cannot allocate memory >> >> My initial guess: it happens sooner or later somehow - whether it is >> a lot of traffic in one go (ssh/scp copies of virtual disks) or a >> lot of traffic over a longer period (a nameserver gets asked again >> and again). > > Scott, are you also using jails? If both of you are: is there any > possibility you can remove use of those? I'm not sure how VirtualBox > fits into the picture (jails + VirtualBox that is), but I can imagine > jails having different environmental constraints that might cause this. > > Basically the troubleshooting process here is to remove pieces of the > puzzle until you figure out which piece is causing the issue. I don't > want to get the NIC driver devs all spun up for something that, for > example, might be an issue with the jail implementation. I understand this. As I said, I will do some after-hours debugging tonight. The scp/ssh problems are happening _outside_ the jails. The bind runs _inside_ the jail. I wanted to use the _host_ system to send VirtualBox virtual disks and filesystems used by jails, to archive them and/or have them available on other FreeBSD systems (as a cold standby solution).
Regards Peter From owner-freebsd-stable@FreeBSD.ORG Wed Jul 6 04:34:16 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C4AD91065677 for ; Wed, 6 Jul 2011 04:34:16 +0000 (UTC) (envelope-from cscotts@gmail.com) Received: from mail-qy0-f175.google.com (mail-qy0-f175.google.com [209.85.216.175]) by mx1.freebsd.org (Postfix) with ESMTP id 529F18FC08 for ; Wed, 6 Jul 2011 04:34:15 +0000 (UTC) Received: by qyk30 with SMTP id 30so2034601qyk.13 for ; Tue, 05 Jul 2011 21:34:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=subject:mime-version:content-type:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to:x-mailer; bh=itlhCh593vYw/LuN24r5OhiAocwo1ZmY7WkuVK/M0Zg=; b=a5chm8ZMYXl+jdvVtbY491wbaGvD5cAqfC2f5nXNaT8ZnTGXir4fzWKQg0bxJzQTo2 2k7HGUU8TjprQ+W75dmAh4e/gNBi5xPRPlT21ivzYfrRcW2PMIHLaNqSHi60viEgcewj Snq5RvRrFsf9W+FpxPh44+yw5ka1WometJQEo= Received: by 10.224.137.68 with SMTP id v4mr5719966qat.234.1309926855172; Tue, 05 Jul 2011 21:34:15 -0700 (PDT) Received: from sahibkuran.kjlms (user-0c2hi2t.cable.mindspring.com [24.40.200.93]) by mx.google.com with ESMTPS id u15sm6109038qcq.36.2011.07.05.21.34.13 (version=TLSv1/SSLv3 cipher=OTHER); Tue, 05 Jul 2011 21:34:13 -0700 (PDT) Mime-Version: 1.0 (Apple Message framework v1084) Content-Type: text/plain; charset=us-ascii From: Scott Sipe In-Reply-To: <20110706041504.GA73698@icarus.home.lan> Date: Wed, 6 Jul 2011 00:34:12 -0400 Content-Transfer-Encoding: quoted-printable Message-Id: <345CD069-BAC6-4E4E-A963-38C77CC74345@gmail.com> References: <20110706122339.61453nlqra1vqsrv@webmail.in-berlin.de> <20110706023234.GA72048@icarus.home.lan> <20110706130753.182053f3ellasn0p@webmail.in-berlin.de> <20110706032425.GA72757@icarus.home.lan> <20110706135412.15276i0fxavg09k4@webmail.in-berlin.de> <20110706041504.GA73698@icarus.home.lan> To: Jeremy 
Chadwick X-Mailer: Apple Mail (2.1084) Cc: Peter Ross , Yong-Hyeon Pyun , freebsd-stable List , "Vogel, Jack" , davidch@freebsd.org Subject: Re: scp: Write Failed: Cannot allocate memory X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Jul 2011 04:34:16 -0000 On Jul 6, 2011, at 12:15 AM, Jeremy Chadwick wrote: > On Wed, Jul 06, 2011 at 01:54:12PM +1000, Peter Ross wrote: >> Quoting "Jeremy Chadwick" : >> >>> On Wed, Jul 06, 2011 at 01:07:53PM +1000, Peter Ross wrote: >>>> Quoting "Jeremy Chadwick" : >>>> >>>>> On Wed, Jul 06, 2011 at 12:23:39PM +1000, Peter Ross wrote: >>>>>> Quoting "Jeremy Chadwick" : >>>>>> >>>>>>> On Tue, Jul 05, 2011 at 01:03:20PM -0400, Scott Sipe wrote: >>>>>>>> I'm running virtualbox 3.2.12_1 if that has anything to do with it. >>>>>>>> >>>>>>>> sysctl vfs.zfs.arc_max: 6200000000 >>>>>>>> >>>>>>>> While I'm trying to scp, kstat.zfs.misc.arcstats.size is >>>>>>>> hovering right around that value, sometimes above, sometimes >>>>>>>> below (that's as it should be, right?). I don't think that it >>>>>>>> dies when crossing over arc_max. I can run the same scp 10 times >>>>>>>> and it might fail 1-3 times, with no correlation to the >>>>>>>> arcstats.size being above/below arc_max that I can see. >>>>>>>> >>>>>>>> Scott >>>>>>>> >>>>>>>> On Jul 5, 2011, at 3:00 AM, Peter Ross wrote: >>>>>>>> >>>>>>>>> Hi all, >>>>>>>>> >>>>>>>>> just as an addition: an upgrade to last Friday's >>>>>>>>> FreeBSD-Stable and to VirtualBox 4.0.8 does not fix the >>>>>>>>> problem. >>>>>>>>> >>>>>>>>> I will experiment a bit more tomorrow after hours and grab >>>>>> some statistics.
>>>>>>>>> >>>>>>>>> Regards >>>>>>>>> Peter >>>>>>>>> >>>>>>>>> Quoting "Peter Ross" : >>>>>>>>> >>>>>>>>>> Hi all, >>>>>>>>>> >>>>>>>>>> I noticed a similar problem last week. It is also very >>>>>>>>>> similar to one reported last year: >>>>>>>>>> >>>>>>>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/058708.html >>>>>>>>>> >>>>>>>>>> My server is a Dell T410 server with the same bge card (the >>>>>>>>>> same pciconf -lvc output as described by Mahlon: >>>>>>>>>> >>>>>>>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/058711.html >>>>>>>>>> >>>>>>>>>> Yours, Scott, is a em(4).. >>>>>>>>>> >>>>>>>>>> Another similarity: In all cases we are using VirtualBox. I >>>>>>>>>> just want to mention it, in case it matters. I am still >>>>>>>>>> running VirtualBox 3.2. >>>>>>>>>> >>>>>>>>>> Most of the time kstat.zfs.misc.arcstats.size was reaching >>>>>>>>>> vfs.zfs.arc_max then, but I could catch one or two cases >>>>>>>>>> then the value was still below. >>>>>>>>>> >>>>>>>>>> I added vfs.zfs.prefetch_disable=1 to sysctl.conf but it >>>> does not help. >>>>>>>>>> >>>>>>>>>> BTW: It looks as ARC only gives back the memory when I >>>>>>>>>> destroy the ZFS (a cloned snapshot containing virtual >>>>>>>>>> machines). Even if nothing happens for hours the buffer >>>>>>>>>> isn't released.. >>>>>>>>>> >>>>>>>>>> My machine was still running 8.2-PRERELEASE so I am upgrading. >>>>>>>>>> >>>>>>>>>> I am happy to give information gathered on old/new kernel if it helps.
>>>>>>>>>>=20 >>>>>>>>>> Regards >>>>>>>>>> Peter >>>>>>>>>>=20 >>>>>>>>>> Quoting "Scott Sipe" : >>>>>>>>>>=20 >>>>>>>>>>>=20 >>>>>>>>>>> On Jul 2, 2011, at 12:54 AM, jhell wrote: >>>>>>>>>>>=20 >>>>>>>>>>>> On Fri, Jul 01, 2011 at 03:22:32PM -0700, Jeremy Chadwick = wrote: >>>>>>>>>>>>> On Fri, Jul 01, 2011 at 03:13:17PM -0400, Scott Sipe = wrote: >>>>>>>>>>>>>> I'm running 8.2-RELEASE and am having new problems >>>>>>>>>>>>>> with scp. When scping >>>>>>>>>>>>>> files to a ZFS directory on the FreeBSD server -- >>>>>>>>>>>>>> most notably large files >>>>>>>>>>>>>> -- the transfer frequently dies after just a few >>>>>>>>>>>>>> seconds. In my last test, I >>>>>>>>>>>>>> tried to scp an 800mb file to the FreeBSD system and >>>>>>>>>>>>>> the transfer died after >>>>>>>>>>>>>> 200mb. It completely copied the next 4 times I >>>>>>>>>>>>>> tried, and then died again on >>>>>>>>>>>>>> the next attempt. >>>>>>>>>>>>>>=20 >>>>>>>>>>>>>> On the client side: >>>>>>>>>>>>>>=20 >>>>>>>>>>>>>> "Connection to home closed by remote host. >>>>>>>>>>>>>> lost connection" >>>>>>>>>>>>>>=20 >>>>>>>>>>>>>> In /var/log/auth.log: >>>>>>>>>>>>>>=20 >>>>>>>>>>>>>> Jul 1 14:54:42 freebsd sshd[18955]: fatal: Write >>>>>>>>>>>>>> failed: Cannot allocate >>>>>>>>>>>>>> memory >>>>>>>>>>>>>>=20 >>>>>>>>>>>>>> I've never seen this before and have used scp before >>>>>>>>>>>>>> to transfer large files >>>>>>>>>>>>>> without problems. This computer has been used in >>>>>>>>>>>>>> production for months and >>>>>>>>>>>>>> has a current uptime of 36 days. I have not been >>>>>>>>>>>>>> able to notice any problems >>>>>>>>>>>>>> copying files to the server via samba or netatalk, or >>>>>> any problems in >>>>>>>>>>>>>> apache. 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> Uname: >>>>>>>>>>>>>> >>>>>>>>>>>>>> FreeBSD xeon 8.2-RELEASE FreeBSD 8.2-RELEASE #0: Sat >>>>>>>>>>>>>> Feb 19 01:02:54 EST >>>>>>>>>>>>>> 2011 root@xeon:/usr/obj/usr/src/sys/GENERIC amd64 >>>>>>>>>>>>>> >>>>>>>>>>>>>> I've attached my dmesg and output of vmstat -z. >>>>>>>>>>>>>> >>>>>>>>>>>>>> I have not restarted the sshd daemon or rebooted the computer. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Am glad to provide any other information or test anything else. >>>>>>>>>>>>>> >>>>>>>>>>>>>> {snip vmstat -z and dmesg} >>>>>>>>>>>>> >>>>>>>>>>>>> You didn't provide details about your networking setup (rc.conf, >>>>>>>>>>>>> ifconfig -a, etc.). netstat -m would be useful too. >>>>>>>>>>>>> >>>>>>>>>>>>> Next, please see this thread circa September 2010, titled "Network >>>>>>>>>>>>> memory allocation failures": >>>>>>>>>>>>> >>>>>>>>>>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/thread.html#58708 >>>>>>>>>>>>> >>>>>>>>>>>>> The user in that thread is using rsync, which relies on >>>>>> scp by default. >>>>>>>>>>>>> I believe this problem is similar, if not identical, to yours. >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Please also provide your output of ( /usr/bin/limits -a ) >>>>>> for the server >>>>>>>>>>>> end and the client. >>>>>>>>>>>> >>>>>>>>>>>> I am not quite sure I agree with the need for ifconfig -a but some >>>>>>>>>>>> information about the networking driver your using for the interface >>>>>>>>>>>> would be helpful, uptime of the boxes. And configuration >>>> of the pool. >>>>>>>>>>>> e.g. ( zpool status -a ;zfs get all ) You should probably >>>>>>>>>>>> prop this information up somewhere so you can reference by >>>>>> URL whenever >>>>>>>>>>>> needed.
>>>>>>>>>>>>=20 >>>>>>>>>>>> rsync(1) does not rely on scp(1) whatsoever but rsync(1) >>>>>> can be made to >>>>>>>>>>>> use ssh(1) instead of rsh(1) and I believe that is what = Jeremy is >>>>>>>>>>>> stating here but correct me if I am wrong. It does use = ssh(1) by >>>>>>>>>>>> default. >>>>>>>>>>>>=20 >>>>>>>>>>>> Its a possiblity as well that if using tmpfs(5) or mdmfs(8) = for /tmp >>>>>>>>>>>> type filesystems that rsync(1) may be just filling up your >>>>>> temp ram area >>>>>>>>>>>> and causing the connection abort which would be >>>>>>>>>>>> expected. ( df -h ) would >>>>>>>>>>>> help here. >>>>>>>>>>>=20 >>>>>>>>>>> Hello, >>>>>>>>>>>=20 >>>>>>>>>>> I'm not using tmpfs/mdmfs at all. The clients yesterday >>>>>>>>>>> were 3 different OSX computers (over gigabit). The FreeBSD >>>>>>>>>>> server has 12gb of ram and no bce adapter. For what it's >>>>>>>>>>> worth, the server is backed up remotely every night with >>>>>>>>>>> rsync (remote FreeBSD uses rsync to pull) to an offsite >>>>>>>>>>> (slow cable connection) FreeBSD computer, and I have not >>>>>>>>>>> seen any errors in the nightly rsync. >>>>>>>>>>>=20 >>>>>>>>>>> Sorry for the omission of networking info, here's the >>>>>>>>>>> output of the requested commands and some that popped up >>>>>>>>>>> in the other thread: >>>>>>>>>>>=20 >>>>>>>>>>> http://www.cap-press.com/misc/ >>>>>>>>>>>=20 >>>>>>>>>>> In rc.conf: ifconfig_em1=3D"inet 10.1.1.1 netmask = 255.255.0.0" >>>>>>>>>>>=20 >>>>>>>>>>> Scott >>>>>>>=20 >>>>>>> Just to make it crystal clear to everyone: >>>>>>>=20 >>>>>>> There is no correlation between this problem and use of ZFS. = People are >>>>>>> attempting to correlate "cannot allocate memory" messages with = "anything >>>>>>> on the system that uses memory". The VM is much more complex = than that. 
>>>>>>>=20 >>>>>>> Given the nature of this problem, it's much more likely the = issue is >>>>>>> "somewhere" within a networking layer within FreeBSD, whether it = be >>>>>>> driver-level or some sort of intermediary layer. >>>>>>>=20 >>>>>>> Two people who have this issue in this thread are both using = VirtualBox. >>>>>>> Can one, or both, of you remove VirtualBox from the = configuration >>>>>>> entirely (kernel, etc. -- not sure what is required) and then = see if the >>>>>>> issue goes away? >>>>>>=20 >>>>>> On the machine in question I only can do it after hours so I will = do >>>>>> it tonight. >>>>>>=20 >>>>>> I was _successfully_ sending the file over the loopback interface = using >>>>>>=20 >>>>>> cat /zpool/temp/zimbra_oldroot.vdi | ssh localhost "cat > = /dev/null" >>>>>>=20 >>>>>> I did it, btw, with the IPv6 localhost address first = (accidently), >>>>>> and then using IPv4. Both worked. >>>>>>=20 >>>>>> It always fails if I am sending it through the bce(4) interface, >>>>>> even if my target is the VirtualBox bridged to the bce card (so = it >>>>>> does not "leave" the computer physically). >>>>>>=20 >>>>>> Below the uname -a, ifconfig -a, netstat -rn, pciconf -lv and >>>>>> kldstat output. >>>>>>=20 >>>>>> I have another box where I do not see that problem. It copies = files >>>>>> happily over the net using ssh. >>>>>>=20 >>>>>> It is an an older HP ML 150 with 3GB RAM only but with a bge(4) >>>>>> driver instead. It runs the same last week's RELENG_8. I = installed >>>>>> VirtualBox and enabled vboxnet (so it loads the kernel modules). = But >>>>>> I do not run VirtualBox on it (because it hasn't enough RAM). 
>>>>>>=20 >>>>>> Regards >>>>>> Peter >>>>>>=20 >>>>>> DellT410one# uname -a >>>>>> FreeBSD DellT410one.vv.fda 8.2-STABLE FreeBSD 8.2-STABLE #1: Thu = Jun >>>>>> 30 17:07:18 EST 2011 >>>>>> root@DellT410one.vv.fda:/usr/obj/usr/src/sys/GENERIC amd64 >>>>>> DellT410one# ifconfig -a >>>>>> bce0: flags=3D8943 >>>>>> metric 0 mtu 1500 >>>>>> = options=3Dc01bb >>>>>> ether 84:2b:2b:68:64:e4 >>>>>> inet 192.168.50.220 netmask 0xffffff00 broadcast 192.168.50.255 >>>>>> inet 192.168.50.221 netmask 0xffffff00 broadcast 192.168.50.255 >>>>>> inet 192.168.50.223 netmask 0xffffff00 broadcast 192.168.50.255 >>>>>> inet 192.168.50.224 netmask 0xffffff00 broadcast 192.168.50.255 >>>>>> inet 192.168.50.225 netmask 0xffffff00 broadcast 192.168.50.255 >>>>>> inet 192.168.50.226 netmask 0xffffff00 broadcast 192.168.50.255 >>>>>> inet 192.168.50.227 netmask 0xffffff00 broadcast 192.168.50.255 >>>>>> inet 192.168.50.219 netmask 0xffffff00 broadcast 192.168.50.255 >>>>>> media: Ethernet autoselect (1000baseT ) >>>>>> status: active >>>>>> bce1: flags=3D8802 metric 0 mtu 1500 >>>>>> = options=3Dc01bb >>>>>> ether 84:2b:2b:68:64:e5 >>>>>> media: Ethernet autoselect >>>>>> lo0: flags=3D8049 metric 0 mtu = 16384 >>>>>> options=3D3 >>>>>> inet6 fe80::1%lo0 prefixlen 64 scopeid 0xb >>>>>> inet6 ::1 prefixlen 128 >>>>>> inet 127.0.0.1 netmask 0xff000000 >>>>>> nd6 options=3D3 >>>>>> vboxnet0: flags=3D8802 metric 0 mtu = 1500 >>>>>> ether 0a:00:27:00:00:00 >>>>>> DellT410one# netstat -rn >>>>>> Routing tables >>>>>>=20 >>>>>> Internet: >>>>>> Destination Gateway Flags Refs Use = Netif Expire >>>>>> default 192.168.50.201 UGS 0 52195 = bce0 >>>>>> 127.0.0.1 link#11 UH 0 6 = lo0 >>>>>> 192.168.50.0/24 link#1 U 0 1118212 = bce0 >>>>>> 192.168.50.219 link#1 UHS 0 9670 = lo0 >>>>>> 192.168.50.220 link#1 UHS 0 8347 = lo0 >>>>>> 192.168.50.221 link#1 UHS 0 103024 = lo0 >>>>>> 192.168.50.223 link#1 UHS 0 43614 = lo0 >>>>>> 192.168.50.224 link#1 UHS 0 8358 = lo0 >>>>>> 192.168.50.225 link#1 UHS 0 8438 = 
lo0 >>>>>> 192.168.50.226 link#1 UHS 0 8338 = lo0 >>>>>> 192.168.50.227 link#1 UHS 0 8333 = lo0 >>>>>> 192.168.165.0/24 192.168.50.200 UGS 0 3311 = bce0 >>>>>> 192.168.166.0/24 192.168.50.200 UGS 0 699 = bce0 >>>>>> 192.168.167.0/24 192.168.50.200 UGS 0 3012 = bce0 >>>>>> 192.168.168.0/24 192.168.50.200 UGS 0 552 = bce0 >>>>>>=20 >>>>>> Internet6: >>>>>> Destination Gateway >>>>>> Flags Netif Expire >>>>>> ::1 ::1 = UH >>>>>> lo0 >>>>>> fe80::%lo0/64 link#11 U >>>>>> lo0 >>>>>> fe80::1%lo0 link#11 = UHS >>>>>> lo0 >>>>>> ff01::%lo0/32 fe80::1%lo0 U >>>>>> lo0 >>>>>> ff02::%lo0/32 fe80::1%lo0 U >>>>>> lo0 >>>>>> DellT410one# kldstat >>>>>> Id Refs Address Size Name >>>>>> 1 19 0xffffffff80100000 dbf5d0 kernel >>>>>> 2 3 0xffffffff80ec0000 4c358 vboxdrv.ko >>>>>> 3 1 0xffffffff81012000 131998 zfs.ko >>>>>> 4 1 0xffffffff81144000 1ff1 opensolaris.ko >>>>>> 5 2 0xffffffff81146000 2940 vboxnetflt.ko >>>>>> 6 2 0xffffffff81149000 8e38 netgraph.ko >>>>>> 7 1 0xffffffff81152000 153c ng_ether.ko >>>>>> 8 1 0xffffffff81154000 e70 vboxnetadp.ko >>>>>> DellT410one# pciconf -lv >>>>>> .. >>>>>> bce0@pci0:1:0:0: class=3D0x020000 card=3D0x028d1028 >>>>>> chip=3D0x163b14e4 rev=3D0x20 hdr=3D0x00 >>>>>> vendor =3D 'Broadcom Corporation' >>>>>> class =3D network >>>>>> subclass =3D ethernet >>>>>> bce1@pci0:1:0:1: class=3D0x020000 card=3D0x028d1028 >>>>>> chip=3D0x163b14e4 rev=3D0x20 hdr=3D0x00 >>>>>> vendor =3D 'Broadcom Corporation' >>>>>> class =3D network >>>>>> subclass =3D ethernet >>>>>=20 >>>>> Could you please provide "pciconf -lvcb" output instead, specific = to the >>>>> bce chips? Thanks. 
>>>>=20 >>>> Her it is: >>>>=20 >>>> bce0@pci0:1:0:0: class=3D0x020000 card=3D0x028d1028 >>>> chip=3D0x163b14e4 rev=3D0x20 hdr=3D0x00 >>>> vendor =3D 'Broadcom Corporation' >>>> class =3D network >>>> subclass =3D ethernet >>>> bar [10] =3D type Memory, range 64, base 0xda000000, size >>>> 33554432, enabled >>>> cap 01[48] =3D powerspec 3 supports D0 D3 current D0 >>>> cap 03[50] =3D VPD >>>> cap 05[58] =3D MSI supports 16 messages, 64 bit enabled with 1 = message >>>> cap 11[a0] =3D MSI-X supports 9 messages in map 0x10 >>>> cap 10[ac] =3D PCI-Express 2 endpoint max data 256(512) link = x4(x4) >>>> ecap 0003[100] =3D Serial 1 842b2bfffe6864e4 >>>> ecap 0001[110] =3D AER 1 0 fatal 0 non-fatal 1 corrected >>>> ecap 0004[150] =3D unknown 1 >>>> ecap 0002[160] =3D VC 1 max VC0 >>>=20 >>> Thanks Peter. >>>=20 >>> Adding Yong-Hyeon and David to the discussion, since they've both = worked >>> on the bce(4) driver in recent months (most of the changes made = recently >>> are only in HEAD), and also adding Jack Vogel of Intel who maintains >>> em(4). Brief history for the devs: >>>=20 >>> The issue is described "Network memory allocation failures" and was >>> reported last year, but two users recently (Scott and Peter) have >>> reported the issue again: >>>=20 >>> = http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/thread.ht= ml#58708 >>>=20 >>> And was mentioned again by Scott here, which also contains some >>> technical details: >>>=20 >>> = http://lists.freebsd.org/pipermail/freebsd-stable/2011-July/063172.html >>>=20 >>> What's interesting is that Scott's issue is identical in form but = he's >>> using em(4), which isn't known to behave like this. Both = individuals >>> are using VirtualBox, though we're not sure at this point if that is = the >>> piece which is causing the anomaly. 
>>>=20 >>> Relevant details of Scott's system (em-based): >>>=20 >>> http://www.cap-press.com/misc/ >>>=20 >>> Relevant details of Peter's system (bce-based): >>>=20 >>> = http://lists.freebsd.org/pipermail/freebsd-stable/2011-July/063221.html >>> = http://lists.freebsd.org/pipermail/freebsd-stable/2011-July/063223.html >>>=20 >>> I think the biggest complexity right now is figuring out how/why scp >>> fails intermittently in this nature. The errno probably "trickles = down" >>> to userland from the kernel, but the condition regarding why it = happens >>> is unknown. >>=20 >> BTW: I also saw 2 of the errors coming from a BIND9 running in a >> jail on that box. >>=20 >> DellT410one# fgrep -i allocate /jails/bind/20110315/var/log/messages >> Apr 13 05:17:41 bind named[23534]: internal_send: >> 192.168.50.145#65176: Cannot allocate memory >> Jun 21 23:30:44 bind named[39864]: internal_send: >> 192.168.50.251#36155: Cannot allocate memory >> Jun 24 15:28:00 bind named[39864]: internal_send: >> 192.168.50.251#28651: Cannot allocate memory >> Jun 28 12:57:52 bind named[2462]: internal_send: >> 192.168.165.154#1201: Cannot allocate memory >>=20 >> My initial guess: it happens sooner or later somehow - whether it is >> a lot of traffic in one go (ssh/scp copies of virtual disks) or a >> lot of traffic over a longer period (a nameserver gets asked again >> and again). >=20 > Scott, are you also using jails? If both of you are: is there any > possibility you can remove use of those? I'm not sure how VirtualBox > fits into the picture (jails + VirtualBox that is), but I can imagine > jails having different environmental constraints that might cause = this. >=20 > Basically the troubleshooting process here is to remove pieces of the > puzzle until you figure out which piece is causing the issue. I don't > want to get the NIC driver devs all spun up for something that, for > example, might be an issue with the jail implementation. No jails here. 
I do have one bind error message in all my logs:

daemon:Jun 20 10:52:28 xeon named[399]: internal_send: 10.1.2.95#51946: Cannot allocate memory

Grepping my logs for "allocate" turned up a handful of memory allocation errors with netatalk too.

afpd.log:Jul 01 16:13:04.828835 afpd[18303] {dsi_stream.c:427} (E:DSI): dsi_stream_send: Cannot allocate memory
afpd.log:Jun 23 13:34:01.000987 afpd[17970] {fork.c:980} (E:AFPDaemon): afp_read(final file.pdf): Cannot allocate memory

And a handful from samba:

[2011/07/05 23:43:22.483224, 0] lib/util_sock.c:675(write_data)
  write_data: write failure in writing to client 10.1.1.10. Error Cannot allocate memory
[2011/07/05 23:43:22.493839, 0] smbd/process.c:79(srv_send_smb)
  Error writing 51 bytes to client. -1. (Cannot allocate memory)

I haven't personally seen any errors on the client side with samba/netatalk (and when scp was failing regularly I transferred the same files over netatalk+samba without error), nor have I had any reports of problems, but I guess there's a good chance all these log messages are related.

I've been trying to trigger the scp failure remotely tonight with no luck. I was triggering it regularly during the work day today, but not tonight. I will try to experiment tomorrow during the day with stopping VirtualBox and removing the kernel modules and seeing what happens.

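For anyone who wants to tally these messages per daemon rather than eyeball grep output, here is a minimal sketch in Python. It is not from this thread: the function name count_enomem and the process-tag pattern are assumptions made for illustration, and log lines without a "name[pid]" tag (such as the samba ones above) are simply counted as "unknown".

```python
import re
from collections import Counter

# Hypothetical pattern: a process tag such as "named[399]" or "afpd[18303]".
PROC_RE = re.compile(r"\b([A-Za-z_][\w.-]*)\[\d+\]")


def count_enomem(lines):
    """Count 'Cannot allocate memory' log lines per process name."""
    counts = Counter()
    for line in lines:
        if "cannot allocate memory" not in line.lower():
            continue  # unrelated log line
        match = PROC_RE.search(line)
        counts[match.group(1) if match else "unknown"] += 1
    return counts


# Sample lines taken from the messages quoted in this thread,
# plus one unrelated (made-up) sshd line that must be ignored.
sample = [
    "Jun 20 10:52:28 xeon named[399]: internal_send: 10.1.2.95#51946: Cannot allocate memory",
    "Jul 01 16:13:04.828835 afpd[18303] {dsi_stream.c:427} (E:DSI): dsi_stream_send: Cannot allocate memory",
    "Jul  1 14:54:42 freebsd sshd[18955]: fatal: Write failed: Cannot allocate memory",
    "Jul  1 14:55:00 freebsd sshd[18960]: Accepted publickey for scott",
]

if __name__ == "__main__":
    for name, hits in sorted(count_enomem(sample).items()):
        print(name, hits)
```

The same function accepts an open file object, so something like count_enomem(open("/var/log/messages")) should work on a live log, assuming the file is readable and uses the usual syslog layout.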
Scott

From owner-freebsd-stable@FreeBSD.ORG Wed Jul 6 05:36:35 2011 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2AA98106566B for ; Wed, 6 Jul 2011 05:36:35 +0000 (UTC) (envelope-from eugen@grosbein.pp.ru) Received: from eg.sd.rdtc.ru (unknown [IPv6:2a03:3100:c:13::5]) by mx1.freebsd.org (Postfix) with ESMTP id 91EB78FC0C for ; Wed, 6 Jul 2011 05:36:33 +0000 (UTC) Received: from eg.sd.rdtc.ru (localhost [127.0.0.1]) by eg.sd.rdtc.ru (8.14.4/8.14.4) with ESMTP id p665aUs7053899 for ; Wed, 6 Jul 2011 12:36:30 +0700 (NOVST) (envelope-from eugen@grosbein.pp.ru) Message-ID: <4E13F459.30502@grosbein.pp.ru> Date: Wed, 06 Jul 2011 12:36:25 +0700 From: Eugene Grosbein User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; ru-RU; rv:1.9.2.13) Gecko/20110112 Thunderbird/3.1.7 MIME-Version: 1.0 To: stable@freebsd.org Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Cc: Subject: Builworld is broken for RELENG_8 when CPUTYPE?=core2 X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Jul 2011 05:36:35 -0000

Hi!

Yesterday I updated the sources of my RELENG_8 box using csup and ran "make -j5 buildworld" with CPUTYPE?=core2 in /etc/make.conf.

It failed:

===> kerberos5/tools/asn1_compile (depend)
cd /home/src/kerberos5/tools/asn1_compile/../make-print-version && make
cd /home/src/kerberos5/tools/asn1_compile/../make-roken && make
lex -t /home/src/kerberos5/tools/asn1_compile/../../../crypto/heimdal/lib/asn1/lex.l > lex.c
sed -e '96s/"/"#ifdef __PARSE_UNITS_H__\\n/;' -e '96s/",/\\n#endif\\n",/' /home/src/kerberos5/tools/asn1_compile/../../../crypto/heimdal/lib/asn1/gen_glue.c > gen_glue-fixed.c
yacc -d -o parse.c /home/src/kerberos5/tools/asn1_compile/../../../crypto/heimdal/lib/asn1/parse.y
yacc: 4 shift/reduce conflicts
cc -O2 -pipe -march=core2 -DHAVE_CONFIG_H -I/home/src/kerberos5/tools/make-print-version/../../include -std=gnu99 -c /home/src/kerberos5/tools/make-print-version/../../../crypto/heimdal/lib/vers/make-print-version.c
cc -O2 -pipe -march=core2 -DHAVE_CONFIG_H -I/home/src/kerberos5/tools/make-roken/../../include -std=gnu99 -c make-roken.c
/home/src/kerberos5/tools/make-print-version/../../../crypto/heimdal/lib/vers/make-print-version.c:1: error: bad value (core2) for -march= switch
/home/src/kerberos5/tools/make-print-version/../../../crypto/heimdal/lib/vers/make-print-version.c:1: error: bad value (core2) for -mtune= switch
make-roken.c:1: error: bad value (core2) for -march= switch
make-roken.c:1: error: bad value (core2) for -mtune= switch
*** Error code 1
1 error
*** Error code 2
*** Error code 1
1 error
*** Error code 2
2 errors
*** Error code 2
1 error
*** Error code 2
1 error
*** Error code 2
1 error

Please take a look.

Eugene Grosbein From owner-freebsd-stable@FreeBSD.ORG Wed Jul 6 05:45:09 2011 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4E976106566C for ; Wed, 6 Jul 2011 05:45:09 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta02.westchester.pa.mail.comcast.net (qmta02.westchester.pa.mail.comcast.net [76.96.62.24]) by mx1.freebsd.org (Postfix) with ESMTP id F05238FC1C for ; Wed, 6 Jul 2011 05:45:08 +0000 (UTC) Received: from omta15.westchester.pa.mail.comcast.net ([76.96.62.87]) by qmta02.westchester.pa.mail.comcast.net with comcast id 4VjE1h0021swQuc52Vl9va; Wed, 06 Jul 2011 05:45:09 +0000 Received: from koitsu.dyndns.org ([67.180.84.87]) by omta15.westchester.pa.mail.comcast.net with comcast id 4Vl71h00G1t3BNj3bVl83H; Wed, 06 Jul 2011 05:45:09 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 2113D102C36; Tue, 5 Jul 2011 22:45:06 -0700 (PDT) Date: Tue, 5 Jul 2011 22:45:06 -0700 From: Jeremy Chadwick To: Eugene Grosbein Message-ID: <20110706054506.GA75147@icarus.home.lan> References: <4E13F459.30502@grosbein.pp.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4E13F459.30502@grosbein.pp.ru> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: stable@freebsd.org Subject: Re: Builworld is broken for RELENG_8 when CPUTYPE?=core2 X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Jul 2011 05:45:09 -0000 On Wed, Jul 06, 2011 at 12:36:25PM +0700, Eugene Grosbein wrote: > Yesterday I've updated sources of my RELENG_8 box using csup > and ran "make -j5 buildworld" while having CPUTYPE?=core2 in /etc/make.conf > > It failed: > > ===> kerberos5/tools/asn1_compile (depend) > cd 
/home/src/kerberos5/tools/asn1_compile/../make-print-version && make
> cd /home/src/kerberos5/tools/asn1_compile/../make-roken && make
> lex -t /home/src/kerberos5/tools/asn1_compile/../../../crypto/heimdal/lib/asn1/lex.l > lex.c
> sed -e '96s/"/"#ifdef __PARSE_UNITS_H__\\n/;' -e '96s/",/\\n#endif\\n",/' /home/src/kerberos5/tools/asn1_compile/../../../crypto/heimdal/lib/asn1/gen_glue.c > gen_glue-fixed.c
> yacc -d -o parse.c /home/src/kerberos5/tools/asn1_compile/../../../crypto/heimdal/lib/asn1/parse.y
> yacc: 4 shift/reduce conflicts
> cc -O2 -pipe -march=core2 -DHAVE_CONFIG_H -I/home/src/kerberos5/tools/make-print-version/../../include -std=gnu99 -c /home/src/kerberos5/tools/make-print-version/../../../crypto/heimdal/lib/vers/make-print-version.c
> cc -O2 -pipe -march=core2 -DHAVE_CONFIG_H -I/home/src/kerberos5/tools/make-roken/../../include -std=gnu99 -c make-roken.c
> /home/src/kerberos5/tools/make-print-version/../../../crypto/heimdal/lib/vers/make-print-version.c:1: error: bad value (core2) for -march= switch
> /home/src/kerberos5/tools/make-print-version/../../../crypto/heimdal/lib/vers/make-print-version.c:1: error: bad value (core2) for -mtune= switch
> make-roken.c:1: error: bad value (core2) for -march= switch
> make-roken.c:1: error: bad value (core2) for -mtune= switch
> *** Error code 1
> 1 error
> *** Error code 2
> *** Error code 1
> 1 error
> *** Error code 2
> 2 errors
> *** Error code 2
> 1 error
> *** Error code 2
> 1 error
> *** Error code 2
> 1 error
>
> Please take a look.

Please take a look at this thread from a little over a month ago, titled "RELENG_8 does not build with CPUTYPE=core2". The answer to the problem is within there (absolutely 100% certain, trust me):

http://lists.freebsd.org/pipermail/freebsd-stable/2011-May/thread.html#62655

--
| Jeremy Chadwick jdc at parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, US |
| Making life hard for others since 1977.
PGP 4BD6C0CB | From owner-freebsd-stable@FreeBSD.ORG Wed Jul 6 07:32:45 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A47FE1065675; Wed, 6 Jul 2011 07:32:45 +0000 (UTC) (envelope-from Peter.Ross@bogen.in-berlin.de) Received: from einhorn.in-berlin.de (einhorn.in-berlin.de [192.109.42.8]) by mx1.freebsd.org (Postfix) with ESMTP id 640578FC12; Wed, 6 Jul 2011 07:32:44 +0000 (UTC) X-Envelope-From: Peter.Ross@bogen.in-berlin.de Received: from localhost (okapi.in-berlin.de [192.109.42.117]) by einhorn.in-berlin.de (8.13.6/8.13.6/Debian-1) with ESMTP id p667WgpP016550; Wed, 6 Jul 2011 09:32:42 +0200 Received: from 124-254-118-24-static.bb.ispone.net.au (124-254-118-24-static.bb.ispone.net.au [124.254.118.24]) by webmail.in-berlin.de (Horde Framework) with HTTP; Wed, 06 Jul 2011 17:32:42 +1000 Message-ID: <20110706173242.23404ffbhkxz6mqi@webmail.in-berlin.de> Date: Wed, 06 Jul 2011 17:32:42 +1000 From: "Peter Ross" To: "Peter Ross" References: <20110706122339.61453nlqra1vqsrv@webmail.in-berlin.de> <20110706023234.GA72048@icarus.home.lan> <20110706130753.182053f3ellasn0p@webmail.in-berlin.de> <20110706032425.GA72757@icarus.home.lan> <20110706135412.15276i0fxavg09k4@webmail.in-berlin.de> <20110706041504.GA73698@icarus.home.lan> <20110706143129.10696235ldx9bjmp@webmail.in-berlin.de> In-Reply-To: <20110706143129.10696235ldx9bjmp@webmail.in-berlin.de> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; DelSp="Yes"; format="flowed" Content-Disposition: inline Content-Transfer-Encoding: quoted-printable User-Agent: Internet Messaging Program (IMP) 4.3.3 X-Scanned-By: MIMEDefang_at_IN-Berlin_e.V. 
on 192.109.42.8 Cc: Yong-Hyeon Pyun , freebsd-stable List , "Vogel, Jack" , davidch@freebsd.org, Scott Sipe , Jeremy Chadwick Subject: Re: scp: Write Failed: Cannot allocate memory X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Jul 2011 07:32:45 -0000 Quoting "Peter Ross" : > Quoting "Jeremy Chadwick" : > >> On Wed, Jul 06, 2011 at 01:54:12PM +1000, Peter Ross wrote: >>> Quoting "Jeremy Chadwick" : >>> >>>> On Wed, Jul 06, 2011 at 01:07:53PM +1000, Peter Ross wrote: >>>>> Quoting "Jeremy Chadwick" : >>>>> >>>>>> On Wed, Jul 06, 2011 at 12:23:39PM +1000, Peter Ross wrote: >>>>>>> Quoting "Jeremy Chadwick" : >>>>>>> >>>>>>>> On Tue, Jul 05, 2011 at 01:03:20PM -0400, Scott Sipe wrote: >>>>>>>>> I'm running virtualbox 3.2.12_1 if that has anything to do with it= . >>>>>>>>> >>>>>>>>> sysctl vfs.zfs.arc_max: 6200000000 >>>>>>>>> >>>>>>>>> While I'm trying to scp, kstat.zfs.misc.arcstats.size is >>>>>>>>> hovering right around that value, sometimes above, sometimes >>>>>>>>> below (that's as it should be, right?). I don't think that it >>>>>>>>> dies when crossing over arc_max. I can run the same scp 10 times >>>>>>>>> and it might fail 1-3 times, with no correlation to the >>>>>>>>> arcstats.size being above/below arc_max that I can see. >>>>>>>>> >>>>>>>>> Scott >>>>>>>>> >>>>>>>>> On Jul 5, 2011, at 3:00 AM, Peter Ross wrote: >>>>>>>>> >>>>>>>>>> Hi all, >>>>>>>>>> >>>>>>>>>> just as an addition: an upgrade to last Friday's >>>>>>>>>> FreeBSD-Stable and to VirtualBox 4.0.8 does not fix the >>>>>>>>>> problem. >>>>>>>>>> >>>>>>>>>> I will experiment a bit more tomorrow after hours and grab >>>>>>> some statistics. 
>>>>>>>>>> >>>>>>>>>> Regards >>>>>>>>>> Peter >>>>>>>>>> >>>>>>>>>> Quoting "Peter Ross" : >>>>>>>>>> >>>>>>>>>>> Hi all, >>>>>>>>>>> >>>>>>>>>>> I noticed a similar problem last week. It is also very >>>>>>>>>>> similar to one reported last year: >>>>>>>>>>> >>>>>>>>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2010-September= /058708.html >>>>>>>>>>> >>>>>>>>>>> My server is a Dell T410 server with the same bge card (the >>>>>>>>>>> same pciconf -lvc output as described by Mahlon: >>>>>>>>>>> >>>>>>>>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2010-September= /058711.html >>>>>>>>>>> >>>>>>>>>>> Yours, Scott, is a em(4).. >>>>>>>>>>> >>>>>>>>>>> Another similarity: In all cases we are using VirtualBox. I >>>>>>>>>>> just want to mention it, in case it matters. I am still >>>>>>>>>>> running VirtualBox 3.2. >>>>>>>>>>> >>>>>>>>>>> Most of the time kstat.zfs.misc.arcstats.size was reaching >>>>>>>>>>> vfs.zfs.arc_max then, but I could catch one or two cases >>>>>>>>>>> then the value was still below. >>>>>>>>>>> >>>>>>>>>>> I added vfs.zfs.prefetch_disable=3D1 to sysctl.conf but it >>>>> does not help. >>>>>>>>>>> >>>>>>>>>>> BTW: It looks as ARC only gives back the memory when I >>>>>>>>>>> destroy the ZFS (a cloned snapshot containing virtual >>>>>>>>>>> machines). Even if nothing happens for hours the buffer >>>>>>>>>>> isn't released.. >>>>>>>>>>> >>>>>>>>>>> My machine was still running 8.2-PRERELEASE so I am upgrading. >>>>>>>>>>> >>>>>>>>>>> I am happy to give information gathered on old/new kernel =20 >>>>>>>>>>> if it helps. 
>>>>>>>>>>> >>>>>>>>>>> Regards >>>>>>>>>>> Peter >>>>>>>>>>> >>>>>>>>>>> Quoting "Scott Sipe" : >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Jul 2, 2011, at 12:54 AM, jhell wrote: >>>>>>>>>>>> >>>>>>>>>>>>> On Fri, Jul 01, 2011 at 03:22:32PM -0700, Jeremy Chadwick wrot= e: >>>>>>>>>>>>>> On Fri, Jul 01, 2011 at 03:13:17PM -0400, Scott Sipe wrote: >>>>>>>>>>>>>>> I'm running 8.2-RELEASE and am having new problems >>>>>>>>>>>>>>> with scp. When scping >>>>>>>>>>>>>>> files to a ZFS directory on the FreeBSD server -- >>>>>>>>>>>>>>> most notably large files >>>>>>>>>>>>>>> -- the transfer frequently dies after just a few >>>>>>>>>>>>>>> seconds. In my last test, I >>>>>>>>>>>>>>> tried to scp an 800mb file to the FreeBSD system and >>>>>>>>>>>>>>> the transfer died after >>>>>>>>>>>>>>> 200mb. It completely copied the next 4 times I >>>>>>>>>>>>>>> tried, and then died again on >>>>>>>>>>>>>>> the next attempt. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On the client side: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> "Connection to home closed by remote host. >>>>>>>>>>>>>>> lost connection" >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> In /var/log/auth.log: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Jul 1 14:54:42 freebsd sshd[18955]: fatal: Write >>>>>>>>>>>>>>> failed: Cannot allocate >>>>>>>>>>>>>>> memory >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I've never seen this before and have used scp before >>>>>>>>>>>>>>> to transfer large files >>>>>>>>>>>>>>> without problems. This computer has been used in >>>>>>>>>>>>>>> production for months and >>>>>>>>>>>>>>> has a current uptime of 36 days. I have not been >>>>>>>>>>>>>>> able to notice any problems >>>>>>>>>>>>>>> copying files to the server via samba or netatalk, or >>>>>>> any problems in >>>>>>>>>>>>>>> apache. 
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Uname: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> FreeBSD xeon 8.2-RELEASE FreeBSD 8.2-RELEASE #0: Sat >>>>>>>>>>>>>>> Feb 19 01:02:54 EST >>>>>>>>>>>>>>> 2011 root@xeon:/usr/obj/usr/src/sys/GENERIC amd64 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I've attached my dmesg and output of vmstat -z. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I have not restarted the sshd daemon or rebooted the compute= r. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Am glad to provide any other information or test anything el= se. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> {snip vmstat -z and dmesg} >>>>>>>>>>>>>> >>>>>>>>>>>>>> You didn't provide details about your networking setup (rc.co= nf, >>>>>>>>>>>>>> ifconfig -a, etc.). netstat -m would be useful too. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Next, please see this thread circa September 2010, =20 >>>>>>>>>>>>>> titled "Network >>>>>>>>>>>>>> memory allocation failures": >>>>>>>>>>>>>> >>>>>>>>>>>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2010-Septem= ber/thread.html#58708 >>>>>>>>>>>>>> >>>>>>>>>>>>>> The user in that thread is using rsync, which relies on >>>>>>> scp by default. >>>>>>>>>>>>>> I believe this problem is similar, if not identical, to yours= . >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Please also provide your output of ( /usr/bin/limits -a ) >>>>>>> for the server >>>>>>>>>>>>> end and the client. >>>>>>>>>>>>> >>>>>>>>>>>>> I am not quite sure I agree with the need for ifconfig =20 >>>>>>>>>>>>> -a but some >>>>>>>>>>>>> information about the networking driver your using for =20 >>>>>>>>>>>>> the interface >>>>>>>>>>>>> would be helpful, uptime of the boxes. And configuration >>>>> of the pool. >>>>>>>>>>>>> e.g. ( zpool status -a ;zfs get all ) You =20 >>>>>>>>>>>>> should probably >>>>>>>>>>>>> prop this information up somewhere so you can reference by >>>>>>> URL whenever >>>>>>>>>>>>> needed. 
>>>>>>>>>>>>> >>>>>>>>>>>>> rsync(1) does not rely on scp(1) whatsoever but rsync(1) >>>>>>> can be made to >>>>>>>>>>>>> use ssh(1) instead of rsh(1) and I believe that is what Jeremy= is >>>>>>>>>>>>> stating here but correct me if I am wrong. It does use ssh(1) = by >>>>>>>>>>>>> default. >>>>>>>>>>>>> >>>>>>>>>>>>> Its a possiblity as well that if using tmpfs(5) or =20 >>>>>>>>>>>>> mdmfs(8) for /tmp >>>>>>>>>>>>> type filesystems that rsync(1) may be just filling up your >>>>>>> temp ram area >>>>>>>>>>>>> and causing the connection abort which would be >>>>>>>>>>>>> expected. ( df -h ) would >>>>>>>>>>>>> help here. >>>>>>>>>>>> >>>>>>>>>>>> Hello, >>>>>>>>>>>> >>>>>>>>>>>> I'm not using tmpfs/mdmfs at all. The clients yesterday >>>>>>>>>>>> were 3 different OSX computers (over gigabit). The FreeBSD >>>>>>>>>>>> server has 12gb of ram and no bce adapter. For what it's >>>>>>>>>>>> worth, the server is backed up remotely every night with >>>>>>>>>>>> rsync (remote FreeBSD uses rsync to pull) to an offsite >>>>>>>>>>>> (slow cable connection) FreeBSD computer, and I have not >>>>>>>>>>>> seen any errors in the nightly rsync. >>>>>>>>>>>> >>>>>>>>>>>> Sorry for the omission of networking info, here's the >>>>>>>>>>>> output of the requested commands and some that popped up >>>>>>>>>>>> in the other thread: >>>>>>>>>>>> >>>>>>>>>>>> http://www.cap-press.com/misc/ >>>>>>>>>>>> >>>>>>>>>>>> In rc.conf: ifconfig_em1=3D"inet 10.1.1.1 netmask 255.255.0.0" >>>>>>>>>>>> >>>>>>>>>>>> Scott >>>>>>>> >>>>>>>> Just to make it crystal clear to everyone: >>>>>>>> >>>>>>>> There is no correlation between this problem and use of ZFS. =20 >>>>>>>> People are >>>>>>>> attempting to correlate "cannot allocate memory" messages =20 >>>>>>>> with "anything >>>>>>>> on the system that uses memory". The VM is much more complex =20 >>>>>>>> than that. 
>>>>>>>> >>>>>>>> Given the nature of this problem, it's much more likely the issue i= s >>>>>>>> "somewhere" within a networking layer within FreeBSD, whether it be >>>>>>>> driver-level or some sort of intermediary layer. >>>>>>>> >>>>>>>> Two people who have this issue in this thread are both using =20 >>>>>>>> VirtualBox. >>>>>>>> Can one, or both, of you remove VirtualBox from the configuration >>>>>>>> entirely (kernel, etc. -- not sure what is required) and then =20 >>>>>>>> see if the >>>>>>>> issue goes away? >>>>>>> >>>>>>> On the machine in question I only can do it after hours so I will do >>>>>>> it tonight. >>>>>>> >>>>>>> I was _successfully_ sending the file over the loopback interface us= ing >>>>>>> >>>>>>> cat /zpool/temp/zimbra_oldroot.vdi | ssh localhost "cat > /dev/null" >>>>>>> >>>>>>> I did it, btw, with the IPv6 localhost address first (accidently), >>>>>>> and then using IPv4. Both worked. >>>>>>> >>>>>>> It always fails if I am sending it through the bce(4) interface, >>>>>>> even if my target is the VirtualBox bridged to the bce card (so it >>>>>>> does not "leave" the computer physically). >>>>>>> >>>>>>> Below the uname -a, ifconfig -a, netstat -rn, pciconf -lv and >>>>>>> kldstat output. >>>>>>> >>>>>>> I have another box where I do not see that problem. It copies files >>>>>>> happily over the net using ssh. >>>>>>> >>>>>>> It is an an older HP ML 150 with 3GB RAM only but with a bge(4) >>>>>>> driver instead. It runs the same last week's RELENG_8. I installed >>>>>>> VirtualBox and enabled vboxnet (so it loads the kernel modules). But >>>>>>> I do not run VirtualBox on it (because it hasn't enough RAM). 
>>>>>>> >>>>>>> Regards >>>>>>> Peter >>>>>>> >>>>>>> DellT410one# uname -a >>>>>>> FreeBSD DellT410one.vv.fda 8.2-STABLE FreeBSD 8.2-STABLE #1: Thu Jun >>>>>>> 30 17:07:18 EST 2011 >>>>>>> root@DellT410one.vv.fda:/usr/obj/usr/src/sys/GENERIC amd64 >>>>>>> DellT410one# ifconfig -a >>>>>>> bce0: flags=3D8943 >>>>>>> metric 0 mtu 1500 >>>>>>> =09options=3Dc01bb >>>>>>> =09ether 84:2b:2b:68:64:e4 >>>>>>> =09inet 192.168.50.220 netmask 0xffffff00 broadcast 192.168.50.255 >>>>>>> =09inet 192.168.50.221 netmask 0xffffff00 broadcast 192.168.50.255 >>>>>>> =09inet 192.168.50.223 netmask 0xffffff00 broadcast 192.168.50.255 >>>>>>> =09inet 192.168.50.224 netmask 0xffffff00 broadcast 192.168.50.255 >>>>>>> =09inet 192.168.50.225 netmask 0xffffff00 broadcast 192.168.50.255 >>>>>>> =09inet 192.168.50.226 netmask 0xffffff00 broadcast 192.168.50.255 >>>>>>> =09inet 192.168.50.227 netmask 0xffffff00 broadcast 192.168.50.255 >>>>>>> =09inet 192.168.50.219 netmask 0xffffff00 broadcast 192.168.50.255 >>>>>>> =09media: Ethernet autoselect (1000baseT ) >>>>>>> =09status: active >>>>>>> bce1: flags=3D8802 metric 0 mtu 1500 >>>>>>> =09options=3Dc01bb >>>>>>> =09ether 84:2b:2b:68:64:e5 >>>>>>> =09media: Ethernet autoselect >>>>>>> lo0: flags=3D8049 metric 0 mtu 16384 >>>>>>> =09options=3D3 >>>>>>> =09inet6 fe80::1%lo0 prefixlen 64 scopeid 0xb >>>>>>> =09inet6 ::1 prefixlen 128 >>>>>>> =09inet 127.0.0.1 netmask 0xff000000 >>>>>>> =09nd6 options=3D3 >>>>>>> vboxnet0: flags=3D8802 metric 0 mtu 150= 0 >>>>>>> =09ether 0a:00:27:00:00:00 >>>>>>> DellT410one# netstat -rn >>>>>>> Routing tables >>>>>>> >>>>>>> Internet: >>>>>>> Destination Gateway Flags Refs Use =20 >>>>>>> Netif Expire >>>>>>> default 192.168.50.201 UGS 0 52195 bce0 >>>>>>> 127.0.0.1 link#11 UH 0 6 lo0 >>>>>>> 192.168.50.0/24 link#1 U 0 1118212 bce0 >>>>>>> 192.168.50.219 link#1 UHS 0 9670 lo0 >>>>>>> 192.168.50.220 link#1 UHS 0 8347 lo0 >>>>>>> 192.168.50.221 link#1 UHS 0 103024 lo0 >>>>>>> 192.168.50.223 link#1 UHS 0 43614 lo0 
>>>>>>> 192.168.50.224 link#1 UHS 0 8358 lo0 >>>>>>> 192.168.50.225 link#1 UHS 0 8438 lo0 >>>>>>> 192.168.50.226 link#1 UHS 0 8338 lo0 >>>>>>> 192.168.50.227 link#1 UHS 0 8333 lo0 >>>>>>> 192.168.165.0/24 192.168.50.200 UGS 0 3311 bce0 >>>>>>> 192.168.166.0/24 192.168.50.200 UGS 0 699 bce0 >>>>>>> 192.168.167.0/24 192.168.50.200 UGS 0 3012 bce0 >>>>>>> 192.168.168.0/24 192.168.50.200 UGS 0 552 bce0 >>>>>>> >>>>>>> Internet6: >>>>>>> Destination Gateway >>>>>>> Flags Netif Expire >>>>>>> ::1 ::1 UH >>>>>>> lo0 >>>>>>> fe80::%lo0/64 link#11 U >>>>>>> lo0 >>>>>>> fe80::1%lo0 link#11 UHS >>>>>>> lo0 >>>>>>> ff01::%lo0/32 fe80::1%lo0 U >>>>>>> lo0 >>>>>>> ff02::%lo0/32 fe80::1%lo0 U >>>>>>> lo0 >>>>>>> DellT410one# kldstat >>>>>>> Id Refs Address Size Name >>>>>>> 1 19 0xffffffff80100000 dbf5d0 kernel >>>>>>> 2 3 0xffffffff80ec0000 4c358 vboxdrv.ko >>>>>>> 3 1 0xffffffff81012000 131998 zfs.ko >>>>>>> 4 1 0xffffffff81144000 1ff1 opensolaris.ko >>>>>>> 5 2 0xffffffff81146000 2940 vboxnetflt.ko >>>>>>> 6 2 0xffffffff81149000 8e38 netgraph.ko >>>>>>> 7 1 0xffffffff81152000 153c ng_ether.ko >>>>>>> 8 1 0xffffffff81154000 e70 vboxnetadp.ko >>>>>>> DellT410one# pciconf -lv >>>>>>> .. >>>>>>> bce0@pci0:1:0:0: class=3D0x020000 card=3D0x028d1028 >>>>>>> chip=3D0x163b14e4 rev=3D0x20 hdr=3D0x00 >>>>>>> vendor =3D 'Broadcom Corporation' >>>>>>> class =3D network >>>>>>> subclass =3D ethernet >>>>>>> bce1@pci0:1:0:1: class=3D0x020000 card=3D0x028d1028 >>>>>>> chip=3D0x163b14e4 rev=3D0x20 hdr=3D0x00 >>>>>>> vendor =3D 'Broadcom Corporation' >>>>>>> class =3D network >>>>>>> subclass =3D ethernet >>>>>> >>>>>> Could you please provide "pciconf -lvcb" output instead, specific to = the >>>>>> bce chips? Thanks. 
>>>>> >>>>> Her it is: >>>>> >>>>> bce0@pci0:1:0:0: class=3D0x020000 card=3D0x028d1028 >>>>> chip=3D0x163b14e4 rev=3D0x20 hdr=3D0x00 >>>>> vendor =3D 'Broadcom Corporation' >>>>> class =3D network >>>>> subclass =3D ethernet >>>>> bar [10] =3D type Memory, range 64, base 0xda000000, size >>>>> 33554432, enabled >>>>> cap 01[48] =3D powerspec 3 supports D0 D3 current D0 >>>>> cap 03[50] =3D VPD >>>>> cap 05[58] =3D MSI supports 16 messages, 64 bit enabled with 1 mess= age >>>>> cap 11[a0] =3D MSI-X supports 9 messages in map 0x10 >>>>> cap 10[ac] =3D PCI-Express 2 endpoint max data 256(512) link x4(x4) >>>>> ecap 0003[100] =3D Serial 1 842b2bfffe6864e4 >>>>> ecap 0001[110] =3D AER 1 0 fatal 0 non-fatal 1 corrected >>>>> ecap 0004[150] =3D unknown 1 >>>>> ecap 0002[160] =3D VC 1 max VC0 >>>> >>>> Thanks Peter. >>>> >>>> Adding Yong-Hyeon and David to the discussion, since they've both worke= d >>>> on the bce(4) driver in recent months (most of the changes made recentl= y >>>> are only in HEAD), and also adding Jack Vogel of Intel who maintains >>>> em(4). Brief history for the devs: >>>> >>>> The issue is described "Network memory allocation failures" and was >>>> reported last year, but two users recently (Scott and Peter) have >>>> reported the issue again: >>>> >>>> http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/thread= .html#58708 >>>> >>>> And was mentioned again by Scott here, which also contains some >>>> technical details: >>>> >>>> http://lists.freebsd.org/pipermail/freebsd-stable/2011-July/063172.html >>>> >>>> What's interesting is that Scott's issue is identical in form but he's >>>> using em(4), which isn't known to behave like this. Both individuals >>>> are using VirtualBox, though we're not sure at this point if that is th= e >>>> piece which is causing the anomaly. 
>>>> >>>> Relevant details of Scott's system (em-based): >>>> >>>> http://www.cap-press.com/misc/ >>>> >>>> Relevant details of Peter's system (bce-based): >>>> >>>> http://lists.freebsd.org/pipermail/freebsd-stable/2011-July/063221.html >>>> http://lists.freebsd.org/pipermail/freebsd-stable/2011-July/063223.html >>>> >>>> I think the biggest complexity right now is figuring out how/why scp >>>> fails intermittently in this nature. The errno probably "trickles down= " >>>> to userland from the kernel, but the condition regarding why it happens >>>> is unknown. >>> >>> BTW: I also saw 2 of the errors coming from a BIND9 running in a >>> jail on that box. >>> >>> DellT410one# fgrep -i allocate /jails/bind/20110315/var/log/messages >>> Apr 13 05:17:41 bind named[23534]: internal_send: >>> 192.168.50.145#65176: Cannot allocate memory >>> Jun 21 23:30:44 bind named[39864]: internal_send: >>> 192.168.50.251#36155: Cannot allocate memory >>> Jun 24 15:28:00 bind named[39864]: internal_send: >>> 192.168.50.251#28651: Cannot allocate memory >>> Jun 28 12:57:52 bind named[2462]: internal_send: >>> 192.168.165.154#1201: Cannot allocate memory >>> >>> My initial guess: it happens sooner or later somehow - whether it is >>> a lot of traffic in one go (ssh/scp copies of virtual disks) or a >>> lot of traffic over a longer period (a nameserver gets asked again >>> and again). >> >> Scott, are you also using jails? If both of you are: is there any >> possibility you can remove use of those? I'm not sure how VirtualBox >> fits into the picture (jails + VirtualBox that is), but I can imagine >> jails having different environmental constraints that might cause this. >> >> Basically the troubleshooting process here is to remove pieces of the >> puzzle until you figure out which piece is causing the issue. I don't >> want to get the NIC driver devs all spun up for something that, for >> example, might be an issue with the jail implementation. > > I understand this. 
As said, I do some afterhours debugging tonight.
>
> The scp/ssh problems are happening _outside_ the jails. The bind
> runs _inside_ the jail.
>
> I wanted to use the _host_ system to send VirtualBox virtual disks
> and filesystems used by jails to archive them and/or having them
> available on other FreeBSD systems (as a cold standby solution).

I just switched off the VirtualBox (without removing the kernel modules).

The copy succeeds now.

Well, it could be a VirtualBox related problem, or is the server just
relieved to have 2GB more memory at hands now?

Do you have a quick idea to "emulate" the 2GB memory load usually
delivered by VirtualBox?

Regards
Peter

From owner-freebsd-stable@FreeBSD.ORG Wed Jul 6 08:21:44 2011
Return-Path:
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2EB28106564A; Wed, 6 Jul 2011 08:21:44 +0000 (UTC) (envelope-from Peter.Ross@bogen.in-berlin.de)
Received: from einhorn.in-berlin.de (einhorn.in-berlin.de [192.109.42.8]) by mx1.freebsd.org (Postfix) with ESMTP id 2E80C8FC1B; Wed, 6 Jul 2011 08:21:42 +0000 (UTC)
X-Envelope-From: Peter.Ross@bogen.in-berlin.de
Received: from localhost (okapi.in-berlin.de [192.109.42.117]) by einhorn.in-berlin.de (8.13.6/8.13.6/Debian-1) with ESMTP id p668LfL9019576; Wed, 6 Jul 2011 10:21:42 +0200
Received: from 124-254-118-24-static.bb.ispone.net.au (124-254-118-24-static.bb.ispone.net.au [124.254.118.24]) by webmail.in-berlin.de (Horde Framework) with HTTP; Wed, 06 Jul 2011 18:21:41 +1000
Message-ID: <20110706182141.13056plxp148y61h@webmail.in-berlin.de>
Date: Wed, 06 Jul 2011 18:21:41 +1000
From: "Peter Ross"
To: "Jeremy Chadwick"
References: <20110706122339.61453nlqra1vqsrv@webmail.in-berlin.de> <20110706023234.GA72048@icarus.home.lan> <20110706130753.182053f3ellasn0p@webmail.in-berlin.de> <20110706032425.GA72757@icarus.home.lan>
<20110706135412.15276i0fxavg09k4@webmail.in-berlin.de> <20110706041504.GA73698@icarus.home.lan> <20110706143129.10696235ldx9bjmp@webmail.in-berlin.de> <20110706173242.23404ffbhkxz6mqi@webmail.in-berlin.de> In-Reply-To: <20110706173242.23404ffbhkxz6mqi@webmail.in-berlin.de> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; DelSp="Yes"; format="flowed" Content-Disposition: inline Content-Transfer-Encoding: quoted-printable User-Agent: Internet Messaging Program (IMP) 4.3.3 X-Scanned-By: MIMEDefang_at_IN-Berlin_e.V. on 192.109.42.8 Cc: Yong-Hyeon Pyun , "Vogel, Jack" , freebsd-stable List , davidch@freebsd.org, Scott Sipe Subject: Re: scp: Write Failed: Cannot allocate memory X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Jul 2011 08:21:44 -0000 Quoting "Peter Ross" : > Quoting "Peter Ross" : > >> Quoting "Jeremy Chadwick" : >> >>> On Wed, Jul 06, 2011 at 01:54:12PM +1000, Peter Ross wrote: >>>> Quoting "Jeremy Chadwick" : >>>> >>>>> On Wed, Jul 06, 2011 at 01:07:53PM +1000, Peter Ross wrote: >>>>>> Quoting "Jeremy Chadwick" : >>>>>> >>>>>>> On Wed, Jul 06, 2011 at 12:23:39PM +1000, Peter Ross wrote: >>>>>>>> Quoting "Jeremy Chadwick" : >>>>>>>> >>>>>>>>> On Tue, Jul 05, 2011 at 01:03:20PM -0400, Scott Sipe wrote: >>>>>>>>>> I'm running virtualbox 3.2.12_1 if that has anything to do with i= t. >>>>>>>>>> >>>>>>>>>> sysctl vfs.zfs.arc_max: 6200000000 >>>>>>>>>> >>>>>>>>>> While I'm trying to scp, kstat.zfs.misc.arcstats.size is >>>>>>>>>> hovering right around that value, sometimes above, sometimes >>>>>>>>>> below (that's as it should be, right?). I don't think that it >>>>>>>>>> dies when crossing over arc_max. 
I can run the same scp 10 times >>>>>>>>>> and it might fail 1-3 times, with no correlation to the >>>>>>>>>> arcstats.size being above/below arc_max that I can see. >>>>>>>>>> >>>>>>>>>> Scott >>>>>>>>>> >>>>>>>>>> On Jul 5, 2011, at 3:00 AM, Peter Ross wrote: >>>>>>>>>> >>>>>>>>>>> Hi all, >>>>>>>>>>> >>>>>>>>>>> just as an addition: an upgrade to last Friday's >>>>>>>>>>> FreeBSD-Stable and to VirtualBox 4.0.8 does not fix the >>>>>>>>>>> problem. >>>>>>>>>>> >>>>>>>>>>> I will experiment a bit more tomorrow after hours and grab >>>>>>>> some statistics. >>>>>>>>>>> >>>>>>>>>>> Regards >>>>>>>>>>> Peter >>>>>>>>>>> >>>>>>>>>>> Quoting "Peter Ross" : >>>>>>>>>>> >>>>>>>>>>>> Hi all, >>>>>>>>>>>> >>>>>>>>>>>> I noticed a similar problem last week. It is also very >>>>>>>>>>>> similar to one reported last year: >>>>>>>>>>>> >>>>>>>>>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2010-Septembe= r/058708.html >>>>>>>>>>>> >>>>>>>>>>>> My server is a Dell T410 server with the same bge card (the >>>>>>>>>>>> same pciconf -lvc output as described by Mahlon: >>>>>>>>>>>> >>>>>>>>>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2010-Septembe= r/058711.html >>>>>>>>>>>> >>>>>>>>>>>> Yours, Scott, is a em(4).. >>>>>>>>>>>> >>>>>>>>>>>> Another similarity: In all cases we are using VirtualBox. I >>>>>>>>>>>> just want to mention it, in case it matters. I am still >>>>>>>>>>>> running VirtualBox 3.2. >>>>>>>>>>>> >>>>>>>>>>>> Most of the time kstat.zfs.misc.arcstats.size was reaching >>>>>>>>>>>> vfs.zfs.arc_max then, but I could catch one or two cases >>>>>>>>>>>> then the value was still below. >>>>>>>>>>>> >>>>>>>>>>>> I added vfs.zfs.prefetch_disable=3D1 to sysctl.conf but it >>>>>> does not help. >>>>>>>>>>>> >>>>>>>>>>>> BTW: It looks as ARC only gives back the memory when I >>>>>>>>>>>> destroy the ZFS (a cloned snapshot containing virtual >>>>>>>>>>>> machines). Even if nothing happens for hours the buffer >>>>>>>>>>>> isn't released.. 
>>>>>>>>>>>> >>>>>>>>>>>> My machine was still running 8.2-PRERELEASE so I am upgrading. >>>>>>>>>>>> >>>>>>>>>>>> I am happy to give information gathered on old/new kernel =20 >>>>>>>>>>>> if it helps. >>>>>>>>>>>> >>>>>>>>>>>> Regards >>>>>>>>>>>> Peter >>>>>>>>>>>> >>>>>>>>>>>> Quoting "Scott Sipe" : >>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Jul 2, 2011, at 12:54 AM, jhell wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> On Fri, Jul 01, 2011 at 03:22:32PM -0700, Jeremy Chadwick wro= te: >>>>>>>>>>>>>>> On Fri, Jul 01, 2011 at 03:13:17PM -0400, Scott Sipe wrote: >>>>>>>>>>>>>>>> I'm running 8.2-RELEASE and am having new problems >>>>>>>>>>>>>>>> with scp. When scping >>>>>>>>>>>>>>>> files to a ZFS directory on the FreeBSD server -- >>>>>>>>>>>>>>>> most notably large files >>>>>>>>>>>>>>>> -- the transfer frequently dies after just a few >>>>>>>>>>>>>>>> seconds. In my last test, I >>>>>>>>>>>>>>>> tried to scp an 800mb file to the FreeBSD system and >>>>>>>>>>>>>>>> the transfer died after >>>>>>>>>>>>>>>> 200mb. It completely copied the next 4 times I >>>>>>>>>>>>>>>> tried, and then died again on >>>>>>>>>>>>>>>> the next attempt. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On the client side: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> "Connection to home closed by remote host. >>>>>>>>>>>>>>>> lost connection" >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> In /var/log/auth.log: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Jul 1 14:54:42 freebsd sshd[18955]: fatal: Write >>>>>>>>>>>>>>>> failed: Cannot allocate >>>>>>>>>>>>>>>> memory >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I've never seen this before and have used scp before >>>>>>>>>>>>>>>> to transfer large files >>>>>>>>>>>>>>>> without problems. This computer has been used in >>>>>>>>>>>>>>>> production for months and >>>>>>>>>>>>>>>> has a current uptime of 36 days. I have not been >>>>>>>>>>>>>>>> able to notice any problems >>>>>>>>>>>>>>>> copying files to the server via samba or netatalk, or >>>>>>>> any problems in >>>>>>>>>>>>>>>> apache. 
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Uname: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> FreeBSD xeon 8.2-RELEASE FreeBSD 8.2-RELEASE #0: Sat >>>>>>>>>>>>>>>> Feb 19 01:02:54 EST >>>>>>>>>>>>>>>> 2011 root@xeon:/usr/obj/usr/src/sys/GENERIC amd64 >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I've attached my dmesg and output of vmstat -z. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I have not restarted the sshd daemon or rebooted the comput= er. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Am glad to provide any other information or test =20 >>>>>>>>>>>>>>>> anything else. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> {snip vmstat -z and dmesg} >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> You didn't provide details about your networking setup =20 >>>>>>>>>>>>>>> (rc.conf, >>>>>>>>>>>>>>> ifconfig -a, etc.). netstat -m would be useful too. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Next, please see this thread circa September 2010, =20 >>>>>>>>>>>>>>> titled "Network >>>>>>>>>>>>>>> memory allocation failures": >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2010-Septe= mber/thread.html#58708 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The user in that thread is using rsync, which relies on >>>>>>>> scp by default. >>>>>>>>>>>>>>> I believe this problem is similar, if not identical, to your= s. >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Please also provide your output of ( /usr/bin/limits -a ) >>>>>>>> for the server >>>>>>>>>>>>>> end and the client. >>>>>>>>>>>>>> >>>>>>>>>>>>>> I am not quite sure I agree with the need for ifconfig =20 >>>>>>>>>>>>>> -a but some >>>>>>>>>>>>>> information about the networking driver your using for =20 >>>>>>>>>>>>>> the interface >>>>>>>>>>>>>> would be helpful, uptime of the boxes. And configuration >>>>>> of the pool. >>>>>>>>>>>>>> e.g. ( zpool status -a ;zfs get all ) You =20 >>>>>>>>>>>>>> should probably >>>>>>>>>>>>>> prop this information up somewhere so you can reference by >>>>>>>> URL whenever >>>>>>>>>>>>>> needed. 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> rsync(1) does not rely on scp(1) whatsoever but rsync(1) >>>>>>>> can be made to >>>>>>>>>>>>>> use ssh(1) instead of rsh(1) and I believe that is what =20 >>>>>>>>>>>>>> Jeremy is >>>>>>>>>>>>>> stating here but correct me if I am wrong. It does use ssh(1)= by >>>>>>>>>>>>>> default. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Its a possiblity as well that if using tmpfs(5) or =20 >>>>>>>>>>>>>> mdmfs(8) for /tmp >>>>>>>>>>>>>> type filesystems that rsync(1) may be just filling up your >>>>>>>> temp ram area >>>>>>>>>>>>>> and causing the connection abort which would be >>>>>>>>>>>>>> expected. ( df -h ) would >>>>>>>>>>>>>> help here. >>>>>>>>>>>>> >>>>>>>>>>>>> Hello, >>>>>>>>>>>>> >>>>>>>>>>>>> I'm not using tmpfs/mdmfs at all. The clients yesterday >>>>>>>>>>>>> were 3 different OSX computers (over gigabit). The FreeBSD >>>>>>>>>>>>> server has 12gb of ram and no bce adapter. For what it's >>>>>>>>>>>>> worth, the server is backed up remotely every night with >>>>>>>>>>>>> rsync (remote FreeBSD uses rsync to pull) to an offsite >>>>>>>>>>>>> (slow cable connection) FreeBSD computer, and I have not >>>>>>>>>>>>> seen any errors in the nightly rsync. >>>>>>>>>>>>> >>>>>>>>>>>>> Sorry for the omission of networking info, here's the >>>>>>>>>>>>> output of the requested commands and some that popped up >>>>>>>>>>>>> in the other thread: >>>>>>>>>>>>> >>>>>>>>>>>>> http://www.cap-press.com/misc/ >>>>>>>>>>>>> >>>>>>>>>>>>> In rc.conf: ifconfig_em1=3D"inet 10.1.1.1 netmask 255.255.0.0= " >>>>>>>>>>>>> >>>>>>>>>>>>> Scott >>>>>>>>> >>>>>>>>> Just to make it crystal clear to everyone: >>>>>>>>> >>>>>>>>> There is no correlation between this problem and use of ZFS. =20 >>>>>>>>> People are >>>>>>>>> attempting to correlate "cannot allocate memory" messages =20 >>>>>>>>> with "anything >>>>>>>>> on the system that uses memory". The VM is much more =20 >>>>>>>>> complex than that. 
>>>>>>>>> >>>>>>>>> Given the nature of this problem, it's much more likely the issue = is >>>>>>>>> "somewhere" within a networking layer within FreeBSD, whether it b= e >>>>>>>>> driver-level or some sort of intermediary layer. >>>>>>>>> >>>>>>>>> Two people who have this issue in this thread are both using =20 >>>>>>>>> VirtualBox. >>>>>>>>> Can one, or both, of you remove VirtualBox from the configuration >>>>>>>>> entirely (kernel, etc. -- not sure what is required) and =20 >>>>>>>>> then see if the >>>>>>>>> issue goes away? >>>>>>>> >>>>>>>> On the machine in question I only can do it after hours so I will d= o >>>>>>>> it tonight. >>>>>>>> >>>>>>>> I was _successfully_ sending the file over the loopback =20 >>>>>>>> interface using >>>>>>>> >>>>>>>> cat /zpool/temp/zimbra_oldroot.vdi | ssh localhost "cat > /dev/null= " >>>>>>>> >>>>>>>> I did it, btw, with the IPv6 localhost address first (accidently), >>>>>>>> and then using IPv4. Both worked. >>>>>>>> >>>>>>>> It always fails if I am sending it through the bce(4) interface, >>>>>>>> even if my target is the VirtualBox bridged to the bce card (so it >>>>>>>> does not "leave" the computer physically). >>>>>>>> >>>>>>>> Below the uname -a, ifconfig -a, netstat -rn, pciconf -lv and >>>>>>>> kldstat output. >>>>>>>> >>>>>>>> I have another box where I do not see that problem. It copies files >>>>>>>> happily over the net using ssh. >>>>>>>> >>>>>>>> It is an an older HP ML 150 with 3GB RAM only but with a bge(4) >>>>>>>> driver instead. It runs the same last week's RELENG_8. I installed >>>>>>>> VirtualBox and enabled vboxnet (so it loads the kernel modules). Bu= t >>>>>>>> I do not run VirtualBox on it (because it hasn't enough RAM). 
>>>>>>>> >>>>>>>> Regards >>>>>>>> Peter >>>>>>>> >>>>>>>> DellT410one# uname -a >>>>>>>> FreeBSD DellT410one.vv.fda 8.2-STABLE FreeBSD 8.2-STABLE #1: Thu Ju= n >>>>>>>> 30 17:07:18 EST 2011 >>>>>>>> root@DellT410one.vv.fda:/usr/obj/usr/src/sys/GENERIC amd64 >>>>>>>> DellT410one# ifconfig -a >>>>>>>> bce0: flags=3D8943 >>>>>>>> metric 0 mtu 1500 >>>>>>>> =09options=3Dc01bb >>>>>>>> =09ether 84:2b:2b:68:64:e4 >>>>>>>> =09inet 192.168.50.220 netmask 0xffffff00 broadcast 192.168.50.255 >>>>>>>> =09inet 192.168.50.221 netmask 0xffffff00 broadcast 192.168.50.255 >>>>>>>> =09inet 192.168.50.223 netmask 0xffffff00 broadcast 192.168.50.255 >>>>>>>> =09inet 192.168.50.224 netmask 0xffffff00 broadcast 192.168.50.255 >>>>>>>> =09inet 192.168.50.225 netmask 0xffffff00 broadcast 192.168.50.255 >>>>>>>> =09inet 192.168.50.226 netmask 0xffffff00 broadcast 192.168.50.255 >>>>>>>> =09inet 192.168.50.227 netmask 0xffffff00 broadcast 192.168.50.255 >>>>>>>> =09inet 192.168.50.219 netmask 0xffffff00 broadcast 192.168.50.255 >>>>>>>> =09media: Ethernet autoselect (1000baseT ) >>>>>>>> =09status: active >>>>>>>> bce1: flags=3D8802 metric 0 mtu 1500 >>>>>>>> =09options=3Dc01bb >>>>>>>> =09ether 84:2b:2b:68:64:e5 >>>>>>>> =09media: Ethernet autoselect >>>>>>>> lo0: flags=3D8049 metric 0 mtu 16384 >>>>>>>> =09options=3D3 >>>>>>>> =09inet6 fe80::1%lo0 prefixlen 64 scopeid 0xb >>>>>>>> =09inet6 ::1 prefixlen 128 >>>>>>>> =09inet 127.0.0.1 netmask 0xff000000 >>>>>>>> =09nd6 options=3D3 >>>>>>>> vboxnet0: flags=3D8802 metric 0 mtu 15= 00 >>>>>>>> =09ether 0a:00:27:00:00:00 >>>>>>>> DellT410one# netstat -rn >>>>>>>> Routing tables >>>>>>>> >>>>>>>> Internet: >>>>>>>> Destination Gateway Flags Refs Use =20 >>>>>>>> Netif Expire >>>>>>>> default 192.168.50.201 UGS 0 52195 bce0 >>>>>>>> 127.0.0.1 link#11 UH 0 6 lo0 >>>>>>>> 192.168.50.0/24 link#1 U 0 1118212 bce0 >>>>>>>> 192.168.50.219 link#1 UHS 0 9670 lo0 >>>>>>>> 192.168.50.220 link#1 UHS 0 8347 lo0 >>>>>>>> 192.168.50.221 link#1 UHS 0 103024 
lo0 >>>>>>>> 192.168.50.223 link#1 UHS 0 43614 lo0 >>>>>>>> 192.168.50.224 link#1 UHS 0 8358 lo0 >>>>>>>> 192.168.50.225 link#1 UHS 0 8438 lo0 >>>>>>>> 192.168.50.226 link#1 UHS 0 8338 lo0 >>>>>>>> 192.168.50.227 link#1 UHS 0 8333 lo0 >>>>>>>> 192.168.165.0/24 192.168.50.200 UGS 0 3311 bce0 >>>>>>>> 192.168.166.0/24 192.168.50.200 UGS 0 699 bce0 >>>>>>>> 192.168.167.0/24 192.168.50.200 UGS 0 3012 bce0 >>>>>>>> 192.168.168.0/24 192.168.50.200 UGS 0 552 bce0 >>>>>>>> >>>>>>>> Internet6: >>>>>>>> Destination Gateway >>>>>>>> Flags Netif Expire >>>>>>>> ::1 ::1 UH >>>>>>>> lo0 >>>>>>>> fe80::%lo0/64 link#11 U >>>>>>>> lo0 >>>>>>>> fe80::1%lo0 link#11 UHS >>>>>>>> lo0 >>>>>>>> ff01::%lo0/32 fe80::1%lo0 U >>>>>>>> lo0 >>>>>>>> ff02::%lo0/32 fe80::1%lo0 U >>>>>>>> lo0 >>>>>>>> DellT410one# kldstat >>>>>>>> Id Refs Address Size Name >>>>>>>> 1 19 0xffffffff80100000 dbf5d0 kernel >>>>>>>> 2 3 0xffffffff80ec0000 4c358 vboxdrv.ko >>>>>>>> 3 1 0xffffffff81012000 131998 zfs.ko >>>>>>>> 4 1 0xffffffff81144000 1ff1 opensolaris.ko >>>>>>>> 5 2 0xffffffff81146000 2940 vboxnetflt.ko >>>>>>>> 6 2 0xffffffff81149000 8e38 netgraph.ko >>>>>>>> 7 1 0xffffffff81152000 153c ng_ether.ko >>>>>>>> 8 1 0xffffffff81154000 e70 vboxnetadp.ko >>>>>>>> DellT410one# pciconf -lv >>>>>>>> .. >>>>>>>> bce0@pci0:1:0:0: class=3D0x020000 card=3D0x028d1028 >>>>>>>> chip=3D0x163b14e4 rev=3D0x20 hdr=3D0x00 >>>>>>>> vendor =3D 'Broadcom Corporation' >>>>>>>> class =3D network >>>>>>>> subclass =3D ethernet >>>>>>>> bce1@pci0:1:0:1: class=3D0x020000 card=3D0x028d1028 >>>>>>>> chip=3D0x163b14e4 rev=3D0x20 hdr=3D0x00 >>>>>>>> vendor =3D 'Broadcom Corporation' >>>>>>>> class =3D network >>>>>>>> subclass =3D ethernet >>>>>>> >>>>>>> Could you please provide "pciconf -lvcb" output instead, =20 >>>>>>> specific to the >>>>>>> bce chips? Thanks. 
>>>>>> >>>>>> Her it is: >>>>>> >>>>>> bce0@pci0:1:0:0: class=3D0x020000 card=3D0x028d1028 >>>>>> chip=3D0x163b14e4 rev=3D0x20 hdr=3D0x00 >>>>>> vendor =3D 'Broadcom Corporation' >>>>>> class =3D network >>>>>> subclass =3D ethernet >>>>>> bar [10] =3D type Memory, range 64, base 0xda000000, size >>>>>> 33554432, enabled >>>>>> cap 01[48] =3D powerspec 3 supports D0 D3 current D0 >>>>>> cap 03[50] =3D VPD >>>>>> cap 05[58] =3D MSI supports 16 messages, 64 bit enabled with 1 mess= age >>>>>> cap 11[a0] =3D MSI-X supports 9 messages in map 0x10 >>>>>> cap 10[ac] =3D PCI-Express 2 endpoint max data 256(512) link x4(x4) >>>>>> ecap 0003[100] =3D Serial 1 842b2bfffe6864e4 >>>>>> ecap 0001[110] =3D AER 1 0 fatal 0 non-fatal 1 corrected >>>>>> ecap 0004[150] =3D unknown 1 >>>>>> ecap 0002[160] =3D VC 1 max VC0 >>>>> >>>>> Thanks Peter. >>>>> >>>>> Adding Yong-Hyeon and David to the discussion, since they've both work= ed >>>>> on the bce(4) driver in recent months (most of the changes made recent= ly >>>>> are only in HEAD), and also adding Jack Vogel of Intel who maintains >>>>> em(4). Brief history for the devs: >>>>> >>>>> The issue is described "Network memory allocation failures" and was >>>>> reported last year, but two users recently (Scott and Peter) have >>>>> reported the issue again: >>>>> >>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/threa= d.html#58708 >>>>> >>>>> And was mentioned again by Scott here, which also contains some >>>>> technical details: >>>>> >>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2011-July/063172.htm= l >>>>> >>>>> What's interesting is that Scott's issue is identical in form but he's >>>>> using em(4), which isn't known to behave like this. Both individuals >>>>> are using VirtualBox, though we're not sure at this point if that is t= he >>>>> piece which is causing the anomaly. 
>>>>> >>>>> Relevant details of Scott's system (em-based): >>>>> >>>>> http://www.cap-press.com/misc/ >>>>> >>>>> Relevant details of Peter's system (bce-based): >>>>> >>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2011-July/063221.htm= l >>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2011-July/063223.htm= l >>>>> >>>>> I think the biggest complexity right now is figuring out how/why scp >>>>> fails intermittently in this nature. The errno probably "trickles dow= n" >>>>> to userland from the kernel, but the condition regarding why it happen= s >>>>> is unknown. >>>> >>>> BTW: I also saw 2 of the errors coming from a BIND9 running in a >>>> jail on that box. >>>> >>>> DellT410one# fgrep -i allocate /jails/bind/20110315/var/log/messages >>>> Apr 13 05:17:41 bind named[23534]: internal_send: >>>> 192.168.50.145#65176: Cannot allocate memory >>>> Jun 21 23:30:44 bind named[39864]: internal_send: >>>> 192.168.50.251#36155: Cannot allocate memory >>>> Jun 24 15:28:00 bind named[39864]: internal_send: >>>> 192.168.50.251#28651: Cannot allocate memory >>>> Jun 28 12:57:52 bind named[2462]: internal_send: >>>> 192.168.165.154#1201: Cannot allocate memory >>>> >>>> My initial guess: it happens sooner or later somehow - whether it is >>>> a lot of traffic in one go (ssh/scp copies of virtual disks) or a >>>> lot of traffic over a longer period (a nameserver gets asked again >>>> and again). >>> >>> Scott, are you also using jails? If both of you are: is there any >>> possibility you can remove use of those? I'm not sure how VirtualBox >>> fits into the picture (jails + VirtualBox that is), but I can imagine >>> jails having different environmental constraints that might cause this. >>> >>> Basically the troubleshooting process here is to remove pieces of the >>> puzzle until you figure out which piece is causing the issue. 
I don't
>>> want to get the NIC driver devs all spun up for something that, for
>>> example, might be an issue with the jail implementation.
>>
>> I understand this. As said, I do some afterhours debugging tonight.
>>
>> The scp/ssh problems are happening _outside_ the jails. The bind
>> runs _inside_ the jail.
>>
>> I wanted to use the _host_ system to send VirtualBox virtual disks
>> and filesystems used by jails to archive them and/or having them
>> available on other FreeBSD systems (as a cold standby solution).
>
> I just switched off the VirtualBox (without removing the kernel modules).
>
> The copy succeeds now.
>
> Well, it could be a VirtualBox related problem, or is the server
> just relieved to have 2GB more memory at hands now?
>
> Do you have a quick idea to "emulate" the 2GB memory load usually
> delivered by VirtualBox?

Well, managed that (using lookbusy).

Interestingly I could copy a large file (30GB) without problems, as
soon as I switched off the VirtualBox. As said, the kernel modules
weren't unloaded, they are still there.

The copy crashes seconds after I started the VirtualBox. According to
vmstat and top I had more free memory (ca. 1.5GB) than I had without
VirtualBox and lookbusy (ca. 350MB).

So, it looks (to me, at least) as if I have a VirtualBox related problem,
somehow.

Any ideas?
I am happy to play a bit more to get it sorted, although it has some
limits (it is running the company mailserver, after all).

Regards
Peter

From owner-freebsd-stable@FreeBSD.ORG Wed Jul 6 15:13:19 2011
Return-Path:
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1603E1065670 for ; Wed, 6 Jul 2011 15:13:19 +0000 (UTC) (envelope-from zkolic@sbb.rs)
Received: from smtp9.sbb.rs (smtp9.sbb.rs [89.216.2.41]) by mx1.freebsd.org (Postfix) with ESMTP id 87F378FC0A for ; Wed, 6 Jul 2011 15:13:18 +0000 (UTC)
Received: from faust (cable-94-189-181-176.dynamic.sbb.rs [94.189.181.176]) by smtp9.sbb.rs (8.14.0/8.14.0) with ESMTP id p66FDEZr010097 for ; Wed, 6 Jul 2011 17:13:14 +0200
Received: by faust (Postfix, from userid 1001) id 899C11701D; Wed, 6 Jul 2011 17:13:18 +0200 (CEST)
Date: Wed, 6 Jul 2011 17:13:18 +0200
From: Zoran Kolic
To: freebsd-stable@freebsd.org
Message-ID: <20110706151318.GA1100@faust>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
X-SMTP-Vilter-Version: 1.3.2
X-SBB-Virus-Status: clean
X-SBB-Spam-Score: -1.8
Subject: Re: Status of support for Intel 3000 (Sandybridge) (and VESA help)
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 06 Jul 2011 15:13:19 -0000

> Does any one know what the state of support for this might be? ATM,
> the Intel driver simply does not recognize it.

I just saw HP 4330s in the store. One post about a year ago mentioned
3000/3100 working perfectly. Could you try out all drivers, i.e.
i915.ko?

http://forums.freebsd.org/showthread.php?t=16395

I'm not sure if it is the same generation of graphics, regarding the
very name.
There are just a few reviews with details, but it seems that Linux works
perfectly on the same hardware, according to this Russian site (if you
can manage the lingo):

http://retera.ru/reviews/hp-probook-4330s.html

If you have time, post how you manage with your hardware.

Zoran

From owner-freebsd-stable@FreeBSD.ORG Wed Jul 6 16:22:43 2011
Return-Path:
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9A926106564A; Wed, 6 Jul 2011 16:22:43 +0000 (UTC) (envelope-from cscotts@gmail.com)
Received: from mail-fx0-f44.google.com (mail-fx0-f44.google.com [209.85.161.44]) by mx1.freebsd.org (Postfix) with ESMTP id 691238FC12; Wed, 6 Jul 2011 16:22:40 +0000 (UTC)
Received: by fxe6 with SMTP id 6so232489fxe.17 for ; Wed, 06 Jul 2011 09:22:39 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=ppK2LbNwtTgPBfpi6Kz2BlnBD/EEam7C+9Q5RK5t6u8=; b=ta5Rs/KupBUAF+oJuB2/+yufbqX57YwLhJ9nAPTk8f8SQld1r9XbJMdSRcZp/VJdSs 1nTLijBh0RiatwxlrwSv4rt1xwxhV2TBk3lPdwNuOxNmzeODLpvH7AzLHZicMA28C731 3swgqDeB9hCrG6bChaBH4a8sZaq+buMPt25TU=
MIME-Version: 1.0
Received: by 10.223.102.67 with SMTP id f3mr2621378fao.32.1309969359489; Wed, 06 Jul 2011 09:22:39 -0700 (PDT)
Received: by 10.223.125.201 with HTTP; Wed, 6 Jul 2011 09:22:39 -0700 (PDT)
In-Reply-To: <20110706182141.13056plxp148y61h@webmail.in-berlin.de>
References: <20110706122339.61453nlqra1vqsrv@webmail.in-berlin.de> <20110706023234.GA72048@icarus.home.lan> <20110706130753.182053f3ellasn0p@webmail.in-berlin.de> <20110706032425.GA72757@icarus.home.lan> <20110706135412.15276i0fxavg09k4@webmail.in-berlin.de> <20110706041504.GA73698@icarus.home.lan> <20110706143129.10696235ldx9bjmp@webmail.in-berlin.de> <20110706173242.23404ffbhkxz6mqi@webmail.in-berlin.de> <20110706182141.13056plxp148y61h@webmail.in-berlin.de>
Date: Wed, 6 Jul 2011
12:22:39 -0400 Message-ID: From: Scott Sipe To: Peter Ross Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: Yong-Hyeon Pyun , freebsd-stable List , davidch@freebsd.org, Jeremy Chadwick , "Vogel, Jack" Subject: Re: scp: Write Failed: Cannot allocate memory X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Jul 2011 16:22:43 -0000 On Wed, Jul 6, 2011 at 4:21 AM, Peter Ross wrote: > Quoting "Peter Ross" : > > Quoting "Peter Ross" : >> >> Quoting "Jeremy Chadwick" : >>> >>> On Wed, Jul 06, 2011 at 01:54:12PM +1000, Peter Ross wrote: >>>> >>>>> Quoting "Jeremy Chadwick" : >>>>> >>>>> On Wed, Jul 06, 2011 at 01:07:53PM +1000, Peter Ross wrote: >>>>>> >>>>>>> Quoting "Jeremy Chadwick" : >>>>>>> >>>>>>> On Wed, Jul 06, 2011 at 12:23:39PM +1000, Peter Ross wrote: >>>>>>>> >>>>>>>>> Quoting "Jeremy Chadwick" : >>>>>>>>> >>>>>>>>> On Tue, Jul 05, 2011 at 01:03:20PM -0400, Scott Sipe wrote: >>>>>>>>>> >>>>>>>>>>> I'm running virtualbox 3.2.12_1 if that has anything to do with >>>>>>>>>>> it. >>>>>>>>>>> >>>>>>>>>>> sysctl vfs.zfs.arc_max: 6200000000 >>>>>>>>>>> >>>>>>>>>>> While I'm trying to scp, kstat.zfs.misc.arcstats.size is >>>>>>>>>>> hovering right around that value, sometimes above, sometimes >>>>>>>>>>> below (that's as it should be, right?). I don't think that it >>>>>>>>>>> dies when crossing over arc_max. I can run the same scp 10 times >>>>>>>>>>> and it might fail 1-3 times, with no correlation to the >>>>>>>>>>> arcstats.size being above/below arc_max that I can see. 
>>>>>>>>>>> >>>>>>>>>>> Scott >>>>>>>>>>> >>>>>>>>>>> On Jul 5, 2011, at 3:00 AM, Peter Ross wrote: >>>>>>>>>>> >>>>>>>>>>> Hi all, >>>>>>>>>>>> >>>>>>>>>>>> just as an addition: an upgrade to last Friday's >>>>>>>>>>>> FreeBSD-Stable and to VirtualBox 4.0.8 does not fix the >>>>>>>>>>>> problem. >>>>>>>>>>>> >>>>>>>>>>>> I will experiment a bit more tomorrow after hours and grab >>>>>>>>>>>> >>>>>>>>>>> some statistics. >>>>>>>>> >>>>>>>>>> >>>>>>>>>>>> Regards >>>>>>>>>>>> Peter >>>>>>>>>>>> >>>>>>>>>>>> Quoting "Peter Ross" : >>>>>>>>>>>> >>>>>>>>>>>> Hi all, >>>>>>>>>>>>> >>>>>>>>>>>>> I noticed a similar problem last week. It is also very >>>>>>>>>>>>> similar to one reported last year: >>>>>>>>>>>>> >>>>>>>>>>>>> http://lists.freebsd.org/**pipermail/freebsd-stable/2010-** >>>>>>>>>>>>> September/058708.html >>>>>>>>>>>>> >>>>>>>>>>>>> My server is a Dell T410 server with the same bge card (the >>>>>>>>>>>>> same pciconf -lvc output as described by Mahlon: >>>>>>>>>>>>> >>>>>>>>>>>>> http://lists.freebsd.org/**pipermail/freebsd-stable/2010-** >>>>>>>>>>>>> September/058711.html >>>>>>>>>>>>> >>>>>>>>>>>>> Yours, Scott, is a em(4).. >>>>>>>>>>>>> >>>>>>>>>>>>> Another similarity: In all cases we are using VirtualBox. I >>>>>>>>>>>>> just want to mention it, in case it matters. I am still >>>>>>>>>>>>> running VirtualBox 3.2. >>>>>>>>>>>>> >>>>>>>>>>>>> Most of the time kstat.zfs.misc.arcstats.size was reaching >>>>>>>>>>>>> vfs.zfs.arc_max then, but I could catch one or two cases >>>>>>>>>>>>> then the value was still below. >>>>>>>>>>>>> >>>>>>>>>>>>> I added vfs.zfs.prefetch_disable=1 to sysctl.conf but it >>>>>>>>>>>>> >>>>>>>>>>>> does not help. >>>>>>> >>>>>>>> >>>>>>>>>>>>> BTW: It looks as ARC only gives back the memory when I >>>>>>>>>>>>> destroy the ZFS (a cloned snapshot containing virtual >>>>>>>>>>>>> machines). Even if nothing happens for hours the buffer >>>>>>>>>>>>> isn't released.. 
>>>>>>>>>>>>> >>>>>>>>>>>>> My machine was still running 8.2-PRERELEASE so I am upgrading. >>>>>>>>>>>>> >>>>>>>>>>>>> I am happy to give information gathered on old/new kernel if it >>>>>>>>>>>>> helps. >>>>>>>>>>>>> >>>>>>>>>>>>> Regards >>>>>>>>>>>>> Peter >>>>>>>>>>>>> >>>>>>>>>>>>> Quoting "Scott Sipe" : >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> On Jul 2, 2011, at 12:54 AM, jhell wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Fri, Jul 01, 2011 at 03:22:32PM -0700, Jeremy Chadwick >>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Fri, Jul 01, 2011 at 03:13:17PM -0400, Scott Sipe wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I'm running 8.2-RELEASE and am having new problems >>>>>>>>>>>>>>>>> with scp. When scping >>>>>>>>>>>>>>>>> files to a ZFS directory on the FreeBSD server -- >>>>>>>>>>>>>>>>> most notably large files >>>>>>>>>>>>>>>>> -- the transfer frequently dies after just a few >>>>>>>>>>>>>>>>> seconds. In my last test, I >>>>>>>>>>>>>>>>> tried to scp an 800mb file to the FreeBSD system and >>>>>>>>>>>>>>>>> the transfer died after >>>>>>>>>>>>>>>>> 200mb. It completely copied the next 4 times I >>>>>>>>>>>>>>>>> tried, and then died again on >>>>>>>>>>>>>>>>> the next attempt. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On the client side: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> "Connection to home closed by remote host. >>>>>>>>>>>>>>>>> lost connection" >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> In /var/log/auth.log: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Jul 1 14:54:42 freebsd sshd[18955]: fatal: Write >>>>>>>>>>>>>>>>> failed: Cannot allocate >>>>>>>>>>>>>>>>> memory >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I've never seen this before and have used scp before >>>>>>>>>>>>>>>>> to transfer large files >>>>>>>>>>>>>>>>> without problems. This computer has been used in >>>>>>>>>>>>>>>>> production for months and >>>>>>>>>>>>>>>>> has a current uptime of 36 days. 
I have not been >>>>>>>>>>>>>>>>> able to notice any problems >>>>>>>>>>>>>>>>> copying files to the server via samba or netatalk, or >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> any problems in >>>>>>>>> >>>>>>>>>> apache. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Uname: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> FreeBSD xeon 8.2-RELEASE FreeBSD 8.2-RELEASE #0: Sat >>>>>>>>>>>>>>>>> Feb 19 01:02:54 EST >>>>>>>>>>>>>>>>> 2011 root@xeon:/usr/obj/usr/src/sys/GENERIC amd64 >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I've attached my dmesg and output of vmstat -z. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I have not restarted the sshd daemon or rebooted the >>>>>>>>>>>>>>>>> computer. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Am glad to provide any other information or test anything >>>>>>>>>>>>>>>>> else. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> {snip vmstat -z and dmesg} >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> You didn't provide details about your networking setup >>>>>>>>>>>>>>>> (rc.conf, >>>>>>>>>>>>>>>> ifconfig -a, etc.). netstat -m would be useful too. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Next, please see this thread circa September 2010, titled >>>>>>>>>>>>>>>> "Network >>>>>>>>>>>>>>>> memory allocation failures": >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/thread.html#58708 >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> The user in that thread is using rsync, which relies on >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> scp by default. >>>>>>>>> >>>>>>>>>> I believe this problem is similar, if not identical, to yours. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Please also provide your output of ( /usr/bin/limits -a ) >>>>>>>>>>>>>>> >>>>>>>>>>>>>> for the server >>>>>>>>> >>>>>>>>>> end and the client.
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I am not quite sure I agree with the need for ifconfig -a but >>>>>>>>>>>>>>> some >>>>>>>>>>>>>>> information about the networking driver your using for the >>>>>>>>>>>>>>> interface >>>>>>>>>>>>>>> would be helpful, uptime of the boxes. And configuration >>>>>>>>>>>>>>> >>>>>>>>>>>>>> of the pool. >>>>>>> >>>>>>>> e.g. ( zpool status -a ;zfs get all ) You should probably >>>>>>>>>>>>>>> prop this information up somewhere so you can reference by >>>>>>>>>>>>>>> >>>>>>>>>>>>>> URL whenever >>>>>>>>> >>>>>>>>>> needed. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> rsync(1) does not rely on scp(1) whatsoever but rsync(1) >>>>>>>>>>>>>>> >>>>>>>>>>>>>> can be made to >>>>>>>>> >>>>>>>>>> use ssh(1) instead of rsh(1) and I believe that is what Jeremy is >>>>>>>>>>>>>>> stating here but correct me if I am wrong. It does use ssh(1) >>>>>>>>>>>>>>> by >>>>>>>>>>>>>>> default. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Its a possiblity as well that if using tmpfs(5) or mdmfs(8) >>>>>>>>>>>>>>> for /tmp >>>>>>>>>>>>>>> type filesystems that rsync(1) may be just filling up your >>>>>>>>>>>>>>> >>>>>>>>>>>>>> temp ram area >>>>>>>>> >>>>>>>>>> and causing the connection abort which would be >>>>>>>>>>>>>>> expected. ( df -h ) would >>>>>>>>>>>>>>> help here. >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Hello, >>>>>>>>>>>>>> >>>>>>>>>>>>>> I'm not using tmpfs/mdmfs at all. The clients yesterday >>>>>>>>>>>>>> were 3 different OSX computers (over gigabit). The FreeBSD >>>>>>>>>>>>>> server has 12gb of ram and no bce adapter. For what it's >>>>>>>>>>>>>> worth, the server is backed up remotely every night with >>>>>>>>>>>>>> rsync (remote FreeBSD uses rsync to pull) to an offsite >>>>>>>>>>>>>> (slow cable connection) FreeBSD computer, and I have not >>>>>>>>>>>>>> seen any errors in the nightly rsync. 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> Sorry for the omission of networking info, here's the >>>>>>>>>>>>>> output of the requested commands and some that popped up >>>>>>>>>>>>>> in the other thread: >>>>>>>>>>>>>> >>>>>>>>>>>>>> http://www.cap-press.com/misc/ >>>>>>>>>>>>>> >>>>>>>>>>>>>> In rc.conf: ifconfig_em1="inet 10.1.1.1 netmask 255.255.0.0" >>>>>>>>>>>>>> >>>>>>>>>>>>>> Scott >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>> Just to make it crystal clear to everyone: >>>>>>>>>> >>>>>>>>>> There is no correlation between this problem and use of ZFS. >>>>>>>>>> People are >>>>>>>>>> attempting to correlate "cannot allocate memory" messages with >>>>>>>>>> "anything >>>>>>>>>> on the system that uses memory". The VM is much more complex than >>>>>>>>>> that. >>>>>>>>>> >>>>>>>>>> Given the nature of this problem, it's much more likely the issue >>>>>>>>>> is >>>>>>>>>> "somewhere" within a networking layer within FreeBSD, whether it >>>>>>>>>> be >>>>>>>>>> driver-level or some sort of intermediary layer. >>>>>>>>>> >>>>>>>>>> Two people who have this issue in this thread are both using >>>>>>>>>> VirtualBox. >>>>>>>>>> Can one, or both, of you remove VirtualBox from the configuration >>>>>>>>>> entirely (kernel, etc. -- not sure what is required) and then see >>>>>>>>>> if the >>>>>>>>>> issue goes away? >>>>>>>>>> >>>>>>>>> >>>>>>>>> On the machine in question I only can do it after hours so I will >>>>>>>>> do >>>>>>>>> it tonight. >>>>>>>>> >>>>>>>>> I was _successfully_ sending the file over the loopback interface >>>>>>>>> using >>>>>>>>> >>>>>>>>> cat /zpool/temp/zimbra_oldroot.vdi | ssh localhost "cat > >>>>>>>>> /dev/null" >>>>>>>>> >>>>>>>>> I did it, btw, with the IPv6 localhost address first (accidently), >>>>>>>>> and then using IPv4. Both worked. >>>>>>>>> >>>>>>>>> It always fails if I am sending it through the bce(4) interface, >>>>>>>>> even if my target is the VirtualBox bridged to the bce card (so it >>>>>>>>> does not "leave" the computer physically). 
>>>>>>>>> >>>>>>>>> Below the uname -a, ifconfig -a, netstat -rn, pciconf -lv and >>>>>>>>> kldstat output. >>>>>>>>> >>>>>>>>> I have another box where I do not see that problem. It copies files >>>>>>>>> happily over the net using ssh. >>>>>>>>> >>>>>>>>> It is an an older HP ML 150 with 3GB RAM only but with a bge(4) >>>>>>>>> driver instead. It runs the same last week's RELENG_8. I installed >>>>>>>>> VirtualBox and enabled vboxnet (so it loads the kernel modules). >>>>>>>>> But >>>>>>>>> I do not run VirtualBox on it (because it hasn't enough RAM). >>>>>>>>> >>>>>>>>> Regards >>>>>>>>> Peter >>>>>>>>> >>>>>>>>> DellT410one# uname -a >>>>>>>>> FreeBSD DellT410one.vv.fda 8.2-STABLE FreeBSD 8.2-STABLE #1: Thu >>>>>>>>> Jun >>>>>>>>> 30 17:07:18 EST 2011 >>>>>>>>> root@DellT410one.vv.fda:/usr/**obj/usr/src/sys/GENERIC amd64 >>>>>>>>> DellT410one# ifconfig -a >>>>>>>>> bce0: flags=8943>>>>>>>> MULTICAST> >>>>>>>>> metric 0 mtu 1500 >>>>>>>>> options=c01bb>>>>>>>> VLAN_MTU,VLAN_HWTAGGING,JUMBO_**MTU,VLAN_HWCSUM,TSO4,VLAN_** >>>>>>>>> HWTSO,LINKSTATE> >>>>>>>>> ether 84:2b:2b:68:64:e4 >>>>>>>>> inet 192.168.50.220 netmask 0xffffff00 broadcast >>>>>>>>> 192.168.50.255 >>>>>>>>> inet 192.168.50.221 netmask 0xffffff00 broadcast >>>>>>>>> 192.168.50.255 >>>>>>>>> inet 192.168.50.223 netmask 0xffffff00 broadcast >>>>>>>>> 192.168.50.255 >>>>>>>>> inet 192.168.50.224 netmask 0xffffff00 broadcast >>>>>>>>> 192.168.50.255 >>>>>>>>> inet 192.168.50.225 netmask 0xffffff00 broadcast >>>>>>>>> 192.168.50.255 >>>>>>>>> inet 192.168.50.226 netmask 0xffffff00 broadcast >>>>>>>>> 192.168.50.255 >>>>>>>>> inet 192.168.50.227 netmask 0xffffff00 broadcast >>>>>>>>> 192.168.50.255 >>>>>>>>> inet 192.168.50.219 netmask 0xffffff00 broadcast >>>>>>>>> 192.168.50.255 >>>>>>>>> media: Ethernet autoselect (1000baseT ) >>>>>>>>> status: active >>>>>>>>> bce1: flags=8802 metric 0 mtu 1500 >>>>>>>>> options=c01bb>>>>>>>> VLAN_MTU,VLAN_HWTAGGING,JUMBO_**MTU,VLAN_HWCSUM,TSO4,VLAN_** >>>>>>>>> 
HWTSO,LINKSTATE> >>>>>>>>> ether 84:2b:2b:68:64:e5 >>>>>>>>> media: Ethernet autoselect >>>>>>>>> lo0: flags=8049 metric 0 mtu >>>>>>>>> 16384 >>>>>>>>> options=3 >>>>>>>>> inet6 fe80::1%lo0 prefixlen 64 scopeid 0xb >>>>>>>>> inet6 ::1 prefixlen 128 >>>>>>>>> inet 127.0.0.1 netmask 0xff000000 >>>>>>>>> nd6 options=3 >>>>>>>>> vboxnet0: flags=8802 metric 0 mtu >>>>>>>>> 1500 >>>>>>>>> ether 0a:00:27:00:00:00 >>>>>>>>> DellT410one# netstat -rn >>>>>>>>> Routing tables >>>>>>>>> >>>>>>>>> Internet: >>>>>>>>> Destination Gateway Flags Refs Use Netif >>>>>>>>> Expire >>>>>>>>> default 192.168.50.201 UGS 0 52195 bce0 >>>>>>>>> 127.0.0.1 link#11 UH 0 6 lo0 >>>>>>>>> 192.168.50.0/24 link#1 U 0 1118212 >>>>>>>>> bce0 >>>>>>>>> 192.168.50.219 link#1 UHS 0 9670 lo0 >>>>>>>>> 192.168.50.220 link#1 UHS 0 8347 lo0 >>>>>>>>> 192.168.50.221 link#1 UHS 0 103024 lo0 >>>>>>>>> 192.168.50.223 link#1 UHS 0 43614 lo0 >>>>>>>>> 192.168.50.224 link#1 UHS 0 8358 lo0 >>>>>>>>> 192.168.50.225 link#1 UHS 0 8438 lo0 >>>>>>>>> 192.168.50.226 link#1 UHS 0 8338 lo0 >>>>>>>>> 192.168.50.227 link#1 UHS 0 8333 lo0 >>>>>>>>> 192.168.165.0/24 192.168.50.200 UGS 0 3311 >>>>>>>>> bce0 >>>>>>>>> 192.168.166.0/24 192.168.50.200 UGS 0 699 >>>>>>>>> bce0 >>>>>>>>> 192.168.167.0/24 192.168.50.200 UGS 0 3012 >>>>>>>>> bce0 >>>>>>>>> 192.168.168.0/24 192.168.50.200 UGS 0 552 >>>>>>>>> bce0 >>>>>>>>> >>>>>>>>> Internet6: >>>>>>>>> Destination Gateway >>>>>>>>> Flags Netif Expire >>>>>>>>> ::1 ::1 UH >>>>>>>>> lo0 >>>>>>>>> fe80::%lo0/64 link#11 U >>>>>>>>> lo0 >>>>>>>>> fe80::1%lo0 link#11 UHS >>>>>>>>> lo0 >>>>>>>>> ff01::%lo0/32 fe80::1%lo0 U >>>>>>>>> lo0 >>>>>>>>> ff02::%lo0/32 fe80::1%lo0 U >>>>>>>>> lo0 >>>>>>>>> DellT410one# kldstat >>>>>>>>> Id Refs Address Size Name >>>>>>>>> 1 19 0xffffffff80100000 dbf5d0 kernel >>>>>>>>> 2 3 0xffffffff80ec0000 4c358 vboxdrv.ko >>>>>>>>> 3 1 0xffffffff81012000 131998 zfs.ko >>>>>>>>> 4 1 0xffffffff81144000 1ff1 opensolaris.ko >>>>>>>>> 5 2 0xffffffff81146000 2940 
vboxnetflt.ko >>>>>>>>> 6 2 0xffffffff81149000 8e38 netgraph.ko >>>>>>>>> 7 1 0xffffffff81152000 153c ng_ether.ko >>>>>>>>> 8 1 0xffffffff81154000 e70 vboxnetadp.ko >>>>>>>>> DellT410one# pciconf -lv >>>>>>>>> .. >>>>>>>>> bce0@pci0:1:0:0: class=0x020000 card=0x028d1028 >>>>>>>>> chip=0x163b14e4 rev=0x20 hdr=0x00 >>>>>>>>> vendor = 'Broadcom Corporation' >>>>>>>>> class = network >>>>>>>>> subclass = ethernet >>>>>>>>> bce1@pci0:1:0:1: class=0x020000 card=0x028d1028 >>>>>>>>> chip=0x163b14e4 rev=0x20 hdr=0x00 >>>>>>>>> vendor = 'Broadcom Corporation' >>>>>>>>> class = network >>>>>>>>> subclass = ethernet >>>>>>>>> >>>>>>>> >>>>>>>> Could you please provide "pciconf -lvcb" output instead, specific to >>>>>>>> the >>>>>>>> bce chips? Thanks. >>>>>>>> >>>>>>> >>>>>>> Her it is: >>>>>>> >>>>>>> bce0@pci0:1:0:0: class=0x020000 card=0x028d1028 >>>>>>> chip=0x163b14e4 rev=0x20 hdr=0x00 >>>>>>> vendor = 'Broadcom Corporation' >>>>>>> class = network >>>>>>> subclass = ethernet >>>>>>> bar [10] = type Memory, range 64, base 0xda000000, size >>>>>>> 33554432, enabled >>>>>>> cap 01[48] = powerspec 3 supports D0 D3 current D0 >>>>>>> cap 03[50] = VPD >>>>>>> cap 05[58] = MSI supports 16 messages, 64 bit enabled with 1 message >>>>>>> cap 11[a0] = MSI-X supports 9 messages in map 0x10 >>>>>>> cap 10[ac] = PCI-Express 2 endpoint max data 256(512) link x4(x4) >>>>>>> ecap 0003[100] = Serial 1 842b2bfffe6864e4 >>>>>>> ecap 0001[110] = AER 1 0 fatal 0 non-fatal 1 corrected >>>>>>> ecap 0004[150] = unknown 1 >>>>>>> ecap 0002[160] = VC 1 max VC0 >>>>>>> >>>>>> >>>>>> Thanks Peter. >>>>>> >>>>>> Adding Yong-Hyeon and David to the discussion, since they've both >>>>>> worked >>>>>> on the bce(4) driver in recent months (most of the changes made >>>>>> recently >>>>>> are only in HEAD), and also adding Jack Vogel of Intel who maintains >>>>>> em(4). 
Brief history for the devs: >>>>>> >>>>>> The issue is described "Network memory allocation failures" and was >>>>>> reported last year, but two users recently (Scott and Peter) have >>>>>> reported the issue again: >>>>>> >>>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/thread.html#58708 >>>>>> >>>>>> And was mentioned again by Scott here, which also contains some >>>>>> technical details: >>>>>> >>>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2011-July/063172.html >>>>>> >>>>>> What's interesting is that Scott's issue is identical in form but he's >>>>>> using em(4), which isn't known to behave like this. Both individuals >>>>>> are using VirtualBox, though we're not sure at this point if that is >>>>>> the >>>>>> piece which is causing the anomaly. >>>>>> >>>>>> Relevant details of Scott's system (em-based): >>>>>> >>>>>> http://www.cap-press.com/misc/ >>>>>> >>>>>> Relevant details of Peter's system (bce-based): >>>>>> >>>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2011-July/063221.html >>>>>> http://lists.freebsd.org/pipermail/freebsd-stable/2011-July/063223.html >>>>>> >>>>>> I think the biggest complexity right now is figuring out how/why scp >>>>>> fails intermittently in this nature. The errno probably "trickles >>>>>> down" >>>>>> to userland from the kernel, but the condition regarding why it >>>>>> happens >>>>>> is unknown. >>>>>> >>>>> >>>>> BTW: I also saw 2 of the errors coming from a BIND9 running in a >>>>> jail on that box.
>>>>> >>>>> DellT410one# fgrep -i allocate /jails/bind/20110315/var/log/messages >>>>> Apr 13 05:17:41 bind named[23534]: internal_send: >>>>> 192.168.50.145#65176: Cannot allocate memory >>>>> Jun 21 23:30:44 bind named[39864]: internal_send: >>>>> 192.168.50.251#36155: Cannot allocate memory >>>>> Jun 24 15:28:00 bind named[39864]: internal_send: >>>>> 192.168.50.251#28651: Cannot allocate memory >>>>> Jun 28 12:57:52 bind named[2462]: internal_send: >>>>> 192.168.165.154#1201: Cannot allocate memory >>>>> >>>>> My initial guess: it happens sooner or later somehow - whether it is >>>>> a lot of traffic in one go (ssh/scp copies of virtual disks) or a >>>>> lot of traffic over a longer period (a nameserver gets asked again >>>>> and again). >>>>> >>>> >>>> Scott, are you also using jails? If both of you are: is there any >>>> possibility you can remove use of those? I'm not sure how VirtualBox >>>> fits into the picture (jails + VirtualBox that is), but I can imagine >>>> jails having different environmental constraints that might cause this. >>>> >>>> Basically the troubleshooting process here is to remove pieces of the >>>> puzzle until you figure out which piece is causing the issue. I don't >>>> want to get the NIC driver devs all spun up for something that, for >>>> example, might be an issue with the jail implementation. >>>> >>> >>> I understand this. As said, I do some afterhours debugging tonight. >>> >>> The scp/ssh problems are happening _outside_ the jails. The bind runs >>> _inside_ the jail. >>> >>> I wanted to use the _host_ system to send VirtualBox virtual disks and >>> filesystems used by jails to archive them and/or having them available on >>> other FreeBSD systems (as a cold standby solution). >>> >> >> I just switched off the VirtualBox (without removing the kernel modules). >> >> The copy succeeds now. >> >> Well, it could be a VirtualBox related problem, or is the server just >> relieved to have 2GB more memory at hands now?
>> >> Do you have a quick idea to "emulate" the 2GB memory load usually >> delivered by VirtualBox? >> > > Well, managed that (using lookbusy) > > Interestingly I could copy a large file (30GB) without problems, as soon as > I switched off the VirtualBox. As said, the kernel modules weren't unloaded, > they are still there. > > The copy crashes seconds after I started the VirtualBox. According to > vmstat and top I had more free memory (ca. 1.5GB) as I had without > VirtualBox and lookbusy (ca. 350MB). > > So, it looks (to me, at least) as I have a VirtualBox related problem, > somehow. > > Any ideas? I am happy to play a bit more to get it sorted although it has > some limits (it is running the company mailserver, after all) > > Regards > Peter > This is it -- I'm seeing the exact same thing. Scp dies reliably with VirtualBox running. Quit VirtualBox and I was able to scp about 30 large files with no errors. Once I started VirtualBox an in-progress scp died within seconds. Ditto that the Kernel modules merely being loaded don't seem to make a difference, it's VirtualBox actually running. virtualbox-ose-3.2.12_1 If anybody has any additional tests to run, outputs to send, etc, I'm glad to muck around. 
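The with/without-VirtualBox comparison described above could be made repeatable with something like the following. This is only a minimal sketch, assuming Python is available on the sending host; the helper names (run_trials, failure_rate, summarize) are hypothetical and not from this thread. It runs the same transfer command several times and reports how often it exits non-zero, so a batch started while VirtualBox is running can be compared with a batch started after it was stopped.

```python
import subprocess
import sys


def run_trials(cmd, trials):
    """Run `cmd` (an argv list) `trials` times and return the exit codes."""
    codes = []
    for _ in range(trials):
        # Output is discarded; only the exit status matters for this comparison.
        proc = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                              stderr=subprocess.DEVNULL)
        codes.append(proc.returncode)
    return codes


def failure_rate(codes):
    """Fraction of runs that exited non-zero (0.0 for an empty list)."""
    if not codes:
        return 0.0
    return sum(1 for c in codes if c != 0) / len(codes)


def summarize(label, codes):
    """One-line summary such as 'vbox running: 2/4 failed (50%)'."""
    failed = sum(1 for c in codes if c != 0)
    return "%s: %d/%d failed (%.0f%%)" % (
        label, failed, len(codes), 100.0 * failure_rate(codes))


if __name__ == "__main__" and len(sys.argv) > 2:
    # Usage: python trials.py COUNT CMD [ARG...]
    count = int(sys.argv[1])
    print(summarize("result", run_trials(sys.argv[2:], count)))
```

Invoked with a count followed by the scp command line that fails (file and target are whatever was used in the manual tests), once with the VM running and once with it stopped, this gives two failure counts that can be posted side by side. It does not explain the ENOMEM; it only makes the "fails 1-3 times out of 10" observation easier to reproduce.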
Thanks, Scott From owner-freebsd-stable@FreeBSD.ORG Thu Jul 7 05:56:17 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1D4531065786 for ; Thu, 7 Jul 2011 05:56:17 +0000 (UTC) (envelope-from mike@sentex.net) Received: from smarthost1.sentex.ca (smarthost1-6.sentex.ca [IPv6:2607:f3e0:0:1::12]) by mx1.freebsd.org (Postfix) with ESMTP id BF1668FC2A for ; Thu, 7 Jul 2011 05:56:16 +0000 (UTC) Received: from [IPv6:2607:f3e0:0:4:f025:8813:7603:7e4a] (saphire3.sentex.ca [IPv6:2607:f3e0:0:4:f025:8813:7603:7e4a]) by smarthost1.sentex.ca (8.14.4/8.14.4) with ESMTP id p675uEOa040108 for ; Thu, 7 Jul 2011 01:56:14 -0400 (EDT) (envelope-from mike@sentex.net) Message-ID: <4E154A63.90600@sentex.net> Date: Thu, 07 Jul 2011 01:55:47 -0400 From: Mike Tancsa Organization: Sentex Communications User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.13) Gecko/20101207 Thunderbird/3.1.7 MIME-Version: 1.0 To: FreeBSD-STABLE Mailing List X-Enigmail-Version: 1.1.1 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.67 on IPv6:2607:f3e0:0:1::12 Subject: panic: spin lock held too long (RELENG_8 from today) X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Jul 2011 05:56:17 -0000 I did a buildworld on this box to bring it up to RELENG_8 for the BIND fixes. Unfortunately, the formerly solid box (April 13th kernel) panic'd tonight with Unread portion of the kernel message buffer: spin lock 0xc0b1d200 (sched lock 1) held by 0xc5dac8a0 (tid 100107) too long panic: spin lock held too long cpuid = 0 Uptime: 13h30m4s Physical memory: 2035 MB Its a somewhat busy box taking in mail as well as backups for a few servers over nfs. 
At the time, it would have been getting about 250Mb/s inbound on its gigabit interface. Full core.txt file at http://www.tancsa.com/core-jul8-2011.txt #0 doadump () at pcpu.h:231 231 pcpu.h: No such file or directory. in pcpu.h (kgdb) #0 doadump () at pcpu.h:231 #1 0xc06fd6d3 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:429 #2 0xc06fd937 in panic (fmt=Variable "fmt" is not available. ) at /usr/src/sys/kern/kern_shutdown.c:602 #3 0xc06ed95f in _mtx_lock_spin_failed (m=0x0) at /usr/src/sys/kern/kern_mutex.c:490 #4 0xc06ed9e5 in _mtx_lock_spin (m=0xc0b1d200, tid=3312388992, opts=0, file=0x0, line=0) at /usr/src/sys/kern/kern_mutex.c:526 #5 0xc0720254 in sched_add (td=0xc5dac5c0, flags=0) at /usr/src/sys/kern/sched_ule.c:1119 #6 0xc07203f9 in sched_wakeup (td=0xc5dac5c0) at /usr/src/sys/kern/sched_ule.c:1950 #7 0xc07061f8 in setrunnable (td=0xc5dac5c0) at /usr/src/sys/kern/kern_synch.c:499 #8 0xc07362af in sleepq_resume_thread (sq=0xca0da300, td=0xc5dac5c0, pri=Variable "pri" is not available. ) at /usr/src/sys/kern/subr_sleepqueue.c:751 #9 0xc0736e18 in sleepq_signal (wchan=0xc5fafe50, flags=1, pri=0, queue=0) at /usr/src/sys/kern/subr_sleepqueue.c:825 #10 0xc06b6764 in cv_signal (cvp=0xc5fafe50) at /usr/src/sys/kern/kern_condvar.c:422 #11 0xc08eaa0d in xprt_assignthread (xprt=Variable "xprt" is not available. 
) at /usr/src/sys/rpc/svc.c:342 #12 0xc08ec502 in xprt_active (xprt=0xc95d9600) at /usr/src/sys/rpc/svc.c:378 #13 0xc08ee051 in svc_vc_soupcall (so=0xc6372ce0, arg=0xc95d9600, waitflag=1) at /usr/src/sys/rpc/svc_vc.c:747 #14 0xc075bbb1 in sowakeup (so=0xc6372ce0, sb=0xc6372d34) at /usr/src/sys/kern/uipc_sockbuf.c:191 #15 0xc08447bc in tcp_do_segment (m=0xcaa8d200, th=0xca6aa824, so=0xc6372ce0, tp=0xc63b4d20, drop_hdrlen=52, tlen=1448, iptos=0 '\0', ti_locked=2) at /usr/src/sys/netinet/tcp_input.c:1775 #16 0xc0847930 in tcp_input (m=0xcaa8d200, off0=20) at /usr/src/sys/netinet/tcp_input.c:1329 #17 0xc07ddaf7 in ip_input (m=0xcaa8d200) at /usr/src/sys/netinet/ip_input.c:787 #18 0xc07b8859 in netisr_dispatch_src (proto=1, source=0, m=0xcaa8d200) at /usr/src/sys/net/netisr.c:859 #19 0xc07b8af0 in netisr_dispatch (proto=1, m=0xcaa8d200) at /usr/src/sys/net/netisr.c:946 #20 0xc07ae5e1 in ether_demux (ifp=0xc56ed800, m=0xcaa8d200) at /usr/src/sys/net/if_ethersubr.c:894 #21 0xc07aeb5f in ether_input (ifp=0xc56ed800, m=0xcaa8d200) at /usr/src/sys/net/if_ethersubr.c:753 #22 0xc09977b2 in nfe_int_task (arg=0xc56ff000, pending=1) at /usr/src/sys/dev/nfe/if_nfe.c:2187 #23 0xc07387ca in taskqueue_run_locked (queue=0xc5702440) at /usr/src/sys/kern/subr_taskqueue.c:248 #24 0xc073895c in taskqueue_thread_loop (arg=0xc56ff130) at /usr/src/sys/kern/subr_taskqueue.c:385 #25 0xc06d1027 in fork_exit (callout=0xc07388a0 , arg=0xc56ff130, frame=0xc538ed28) at /usr/src/sys/kern/kern_fork.c:861 #26 0xc09a5c24 in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:275 (kgdb) -- ------------------- Mike Tancsa, tel +1 519 651 3400 Sentex Communications, mike@sentex.net Providing Internet services since 1994 www.sentex.net Cambridge, Ontario Canada http://www.tancsa.com/ From owner-freebsd-stable@FreeBSD.ORG Thu Jul 7 07:36:51 2011 Return-Path: Delivered-To: freebsd-stable@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) 
with ESMTP id 75120106566C for ; Thu, 7 Jul 2011 07:36:51 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id B680C8FC08 for ; Thu, 7 Jul 2011 07:36:50 +0000 (UTC) Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id KAA14049; Thu, 07 Jul 2011 10:36:44 +0300 (EEST) (envelope-from avg@FreeBSD.org) Received: from localhost ([127.0.0.1]) by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1Qej8e-00094T-1R; Thu, 07 Jul 2011 10:36:44 +0300 Message-ID: <4E15620A.9030608@FreeBSD.org> Date: Thu, 07 Jul 2011 10:36:42 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.18) Gecko/20110626 Lightning/1.0b2 Thunderbird/3.1.11 MIME-Version: 1.0 To: Mike Tancsa References: <4E154A63.90600@sentex.net> In-Reply-To: <4E154A63.90600@sentex.net> X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: FreeBSD-STABLE Mailing List Subject: Re: panic: spin lock held too long (RELENG_8 from today) X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Jul 2011 07:36:51 -0000 on 07/07/2011 08:55 Mike Tancsa said the following: > I did a buildworld on this box to bring it up to RELENG_8 for the BIND > fixes. Unfortunately, the formerly solid box (April 13th kernel) > panic'd tonight with > > Unread portion of the kernel message buffer: > spin lock 0xc0b1d200 (sched lock 1) held by 0xc5dac8a0 (tid 100107) too long > panic: spin lock held too long > cpuid = 0 > Uptime: 13h30m4s > Physical memory: 2035 MB > > > Its a somewhat busy box taking in mail as well as backups for a few > servers over nfs. 
At the time, it would have been getting about 250Mb/s > inbound on its gigabit interface. Full core.txt file at > > http://www.tancsa.com/core-jul8-2011.txt I thought that this was supposed to contain output of 'thread apply all bt' in kgdb. Anyway, I think that stacktrace for tid 100107 may have some useful information. > #0 doadump () at pcpu.h:231 > 231 pcpu.h: No such file or directory. > in pcpu.h > (kgdb) #0 doadump () at pcpu.h:231 > #1 0xc06fd6d3 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:429 > #2 0xc06fd937 in panic (fmt=Variable "fmt" is not available. > ) at /usr/src/sys/kern/kern_shutdown.c:602 > #3 0xc06ed95f in _mtx_lock_spin_failed (m=0x0) > at /usr/src/sys/kern/kern_mutex.c:490 > #4 0xc06ed9e5 in _mtx_lock_spin (m=0xc0b1d200, tid=3312388992, opts=0, > file=0x0, line=0) at /usr/src/sys/kern/kern_mutex.c:526 > #5 0xc0720254 in sched_add (td=0xc5dac5c0, flags=0) > at /usr/src/sys/kern/sched_ule.c:1119 > #6 0xc07203f9 in sched_wakeup (td=0xc5dac5c0) > at /usr/src/sys/kern/sched_ule.c:1950 > #7 0xc07061f8 in setrunnable (td=0xc5dac5c0) > at /usr/src/sys/kern/kern_synch.c:499 > #8 0xc07362af in sleepq_resume_thread (sq=0xca0da300, td=0xc5dac5c0, > pri=Variable "pri" is not available. > ) > at /usr/src/sys/kern/subr_sleepqueue.c:751 > #9 0xc0736e18 in sleepq_signal (wchan=0xc5fafe50, flags=1, pri=0, queue=0) > at /usr/src/sys/kern/subr_sleepqueue.c:825 > #10 0xc06b6764 in cv_signal (cvp=0xc5fafe50) > at /usr/src/sys/kern/kern_condvar.c:422 > #11 0xc08eaa0d in xprt_assignthread (xprt=Variable "xprt" is not available. 
> ) at /usr/src/sys/rpc/svc.c:342 > #12 0xc08ec502 in xprt_active (xprt=0xc95d9600) at > /usr/src/sys/rpc/svc.c:378 > #13 0xc08ee051 in svc_vc_soupcall (so=0xc6372ce0, arg=0xc95d9600, > waitflag=1) > at /usr/src/sys/rpc/svc_vc.c:747 > #14 0xc075bbb1 in sowakeup (so=0xc6372ce0, sb=0xc6372d34) > at /usr/src/sys/kern/uipc_sockbuf.c:191 > #15 0xc08447bc in tcp_do_segment (m=0xcaa8d200, th=0xca6aa824, > so=0xc6372ce0, > tp=0xc63b4d20, drop_hdrlen=52, tlen=1448, iptos=0 '\0', ti_locked=2) > at /usr/src/sys/netinet/tcp_input.c:1775 > #16 0xc0847930 in tcp_input (m=0xcaa8d200, off0=20) > at /usr/src/sys/netinet/tcp_input.c:1329 > #17 0xc07ddaf7 in ip_input (m=0xcaa8d200) > at /usr/src/sys/netinet/ip_input.c:787 > #18 0xc07b8859 in netisr_dispatch_src (proto=1, source=0, m=0xcaa8d200) > at /usr/src/sys/net/netisr.c:859 > #19 0xc07b8af0 in netisr_dispatch (proto=1, m=0xcaa8d200) > at /usr/src/sys/net/netisr.c:946 > #20 0xc07ae5e1 in ether_demux (ifp=0xc56ed800, m=0xcaa8d200) > at /usr/src/sys/net/if_ethersubr.c:894 > #21 0xc07aeb5f in ether_input (ifp=0xc56ed800, m=0xcaa8d200) > at /usr/src/sys/net/if_ethersubr.c:753 > #22 0xc09977b2 in nfe_int_task (arg=0xc56ff000, pending=1) > at /usr/src/sys/dev/nfe/if_nfe.c:2187 > #23 0xc07387ca in taskqueue_run_locked (queue=0xc5702440) > at /usr/src/sys/kern/subr_taskqueue.c:248 > #24 0xc073895c in taskqueue_thread_loop (arg=0xc56ff130) > at /usr/src/sys/kern/subr_taskqueue.c:385 > #25 0xc06d1027 in fork_exit (callout=0xc07388a0 , > arg=0xc56ff130, frame=0xc538ed28) at /usr/src/sys/kern/kern_fork.c:861 > #26 0xc09a5c24 in fork_trampoline () at > /usr/src/sys/i386/i386/exception.s:275 > (kgdb) > -- Andriy Gapon From owner-freebsd-stable@FreeBSD.ORG Thu Jul 7 08:20:33 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B79B41065670; Thu, 7 Jul 2011 08:20:33 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: 
from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200]) by mx1.freebsd.org (Postfix) with ESMTP id 4B1798FC1A; Thu, 7 Jul 2011 08:20:32 +0000 (UTC) Received: from deviant.kiev.zoral.com.ua (root@deviant.kiev.zoral.com.ua [10.1.1.148]) by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id p678KROA030405 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 7 Jul 2011 11:20:28 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.14.4/8.14.4) with ESMTP id p678KRMr041132; Thu, 7 Jul 2011 11:20:27 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.14.4/8.14.4/Submit) id p678KRUj041131; Thu, 7 Jul 2011 11:20:27 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f Date: Thu, 7 Jul 2011 11:20:27 +0300 From: Kostik Belousov To: Andriy Gapon Message-ID: <20110707082027.GX48734@deviant.kiev.zoral.com.ua> References: <4E154A63.90600@sentex.net> <4E15620A.9030608@FreeBSD.org> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="g0LfaOww9mCf9g0e" Content-Disposition: inline In-Reply-To: <4E15620A.9030608@FreeBSD.org> User-Agent: Mutt/1.4.2.3i X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua X-Virus-Status: Clean X-Spam-Status: No, score=-3.3 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00, DNS_FROM_OPENWHOIS autolearn=no version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on skuns.kiev.zoral.com.ua Cc: FreeBSD-STABLE Mailing List Subject: Re: panic: spin lock held too long (RELENG_8 from today) X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
X-List-Received-Date: Thu, 07 Jul 2011 08:20:33 -0000 --g0LfaOww9mCf9g0e Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Jul 07, 2011 at 10:36:42AM +0300, Andriy Gapon wrote: > on 07/07/2011 08:55 Mike Tancsa said the following: > > I did a buildworld on this box to bring it up to RELENG_8 for the BIND > > fixes. Unfortunately, the formerly solid box (April 13th kernel) > > panic'd tonight with > > > > Unread portion of the kernel message buffer: > > spin lock 0xc0b1d200 (sched lock 1) held by 0xc5dac8a0 (tid 100107) too long > > panic: spin lock held too long > > cpuid = 0 > > Uptime: 13h30m4s > > Physical memory: 2035 MB > > > > > > Its a somewhat busy box taking in mail as well as backups for a few > > servers over nfs. At the time, it would have been getting about 250Mb/s > > inbound on its gigabit interface. Full core.txt file at > > > > http://www.tancsa.com/core-jul8-2011.txt > > I thought that this was supposed to contain output of 'thread apply all bt' in > kgdb. Anyway, I think that stacktrace for tid 100107 may have some useful > information. > > > #0 doadump () at pcpu.h:231 > > 231 pcpu.h: No such file or directory. > > in pcpu.h > > (kgdb) #0 doadump () at pcpu.h:231 > > #1 0xc06fd6d3 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:429 > > #2 0xc06fd937 in panic (fmt=Variable "fmt" is not available.
> > ) at /usr/src/sys/kern/kern_shutdown.c:602 > > #3 0xc06ed95f in _mtx_lock_spin_failed (m=3D0x0) > > at /usr/src/sys/kern/kern_mutex.c:490 > > #4 0xc06ed9e5 in _mtx_lock_spin (m=3D0xc0b1d200, tid=3D3312388992, opt= s=3D0, > > file=3D0x0, line=3D0) at /usr/src/sys/kern/kern_mutex.c:526 > > #5 0xc0720254 in sched_add (td=3D0xc5dac5c0, flags=3D0) > > at /usr/src/sys/kern/sched_ule.c:1119 > > #6 0xc07203f9 in sched_wakeup (td=3D0xc5dac5c0) > > at /usr/src/sys/kern/sched_ule.c:1950 > > #7 0xc07061f8 in setrunnable (td=3D0xc5dac5c0) > > at /usr/src/sys/kern/kern_synch.c:499 > > #8 0xc07362af in sleepq_resume_thread (sq=3D0xca0da300, td=3D0xc5dac5c= 0, > > pri=3DVariable "pri" is not available. > > ) > > at /usr/src/sys/kern/subr_sleepqueue.c:751 > > #9 0xc0736e18 in sleepq_signal (wchan=3D0xc5fafe50, flags=3D1, pri=3D0= , queue=3D0) > > at /usr/src/sys/kern/subr_sleepqueue.c:825 > > #10 0xc06b6764 in cv_signal (cvp=3D0xc5fafe50) > > at /usr/src/sys/kern/kern_condvar.c:422 > > #11 0xc08eaa0d in xprt_assignthread (xprt=3DVariable "xprt" is not avai= lable. 
> > ) at /usr/src/sys/rpc/svc.c:342 > > #12 0xc08ec502 in xprt_active (xprt=3D0xc95d9600) at > > /usr/src/sys/rpc/svc.c:378 > > #13 0xc08ee051 in svc_vc_soupcall (so=3D0xc6372ce0, arg=3D0xc95d9600, > > waitflag=3D1) > > at /usr/src/sys/rpc/svc_vc.c:747 > > #14 0xc075bbb1 in sowakeup (so=3D0xc6372ce0, sb=3D0xc6372d34) > > at /usr/src/sys/kern/uipc_sockbuf.c:191 > > #15 0xc08447bc in tcp_do_segment (m=3D0xcaa8d200, th=3D0xca6aa824, > > so=3D0xc6372ce0, > > tp=3D0xc63b4d20, drop_hdrlen=3D52, tlen=3D1448, iptos=3D0 '\0', ti_= locked=3D2) > > at /usr/src/sys/netinet/tcp_input.c:1775 > > #16 0xc0847930 in tcp_input (m=3D0xcaa8d200, off0=3D20) > > at /usr/src/sys/netinet/tcp_input.c:1329 > > #17 0xc07ddaf7 in ip_input (m=3D0xcaa8d200) > > at /usr/src/sys/netinet/ip_input.c:787 > > #18 0xc07b8859 in netisr_dispatch_src (proto=3D1, source=3D0, m=3D0xcaa= 8d200) > > at /usr/src/sys/net/netisr.c:859 > > #19 0xc07b8af0 in netisr_dispatch (proto=3D1, m=3D0xcaa8d200) > > at /usr/src/sys/net/netisr.c:946 > > #20 0xc07ae5e1 in ether_demux (ifp=3D0xc56ed800, m=3D0xcaa8d200) > > at /usr/src/sys/net/if_ethersubr.c:894 > > #21 0xc07aeb5f in ether_input (ifp=3D0xc56ed800, m=3D0xcaa8d200) > > at /usr/src/sys/net/if_ethersubr.c:753 > > #22 0xc09977b2 in nfe_int_task (arg=3D0xc56ff000, pending=3D1) > > at /usr/src/sys/dev/nfe/if_nfe.c:2187 > > #23 0xc07387ca in taskqueue_run_locked (queue=3D0xc5702440) > > at /usr/src/sys/kern/subr_taskqueue.c:248 > > #24 0xc073895c in taskqueue_thread_loop (arg=3D0xc56ff130) > > at /usr/src/sys/kern/subr_taskqueue.c:385 > > #25 0xc06d1027 in fork_exit (callout=3D0xc07388a0 , > > arg=3D0xc56ff130, frame=3D0xc538ed28) at /usr/src/sys/kern/kern_for= k.c:861 > > #26 0xc09a5c24 in fork_trampoline () at > > /usr/src/sys/i386/i386/exception.s:275 > > (kgdb) > >=20 BTW, we had a similar panic, "spinlock held too long", the spinlock is the sched lock N, on busy 8-core box recently upgraded to the stable/8. 
Unfortunately, machine hung dumping core, so the stack trace for the owner thread was not available. I was unable to make any conclusion from the data that was present. If the situation is reproducible, you could try to revert r221937. This is pure speculation, though. --g0LfaOww9mCf9g0e Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (FreeBSD) iEYEARECAAYFAk4VbEsACgkQC3+MBN1Mb4iU9gCgxnDJw+3nI7TIfBHBKi2QCTev DwIAn2Zpb3dOwCkYNf03tBahoyOVYIfB =jWgx -----END PGP SIGNATURE----- --g0LfaOww9mCf9g0e-- From owner-freebsd-stable@FreeBSD.ORG Thu Jul 7 09:51:01 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9DD4F1065670 for ; Thu, 7 Jul 2011 09:51:01 +0000 (UTC) (envelope-from ari@ish.com.au) Received: from fish.ish.com.au (eth5921.nsw.adsl.internode.on.net [59.167.240.32]) by mx1.freebsd.org (Postfix) with ESMTP id 583178FC16 for ; Thu, 7 Jul 2011 09:51:01 +0000 (UTC) Received: from ip-136.ish.com.au ([203.29.62.136]:57529) by fish.ish.com.au with esmtpsa (TLSv1:CAMELLIA256-SHA:256) (Exim 4.69) (envelope-from ) id 1Qel34-0000uQ-0r; Thu, 07 Jul 2011 19:39:06 +1000 X-CTCH-RefID: str=0001.0A150201.4E157EBA.0123:SCFSTAT13512334,ss=1,fgs=0 Message-ID: <4E157EB9.4030201@ish.com.au> Date: Thu, 07 Jul 2011 19:39:05 +1000 From: Aristedes Maniatis User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:5.0) Gecko/20110624 Thunderbird/5.0 MIME-Version: 1.0 To: freebsd-stable Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: system internal timer runs 10 times too slow X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Jul 2011 09:51:01 -0000 We 
upgraded an existing system to a new motherboard/CPU and found that timing in various programs is very odd. For example "top" only updates every 10 seconds instead of every second. And this confirms the oddness: # while true; do echo `date`; sleep 1; done Thu Jul 7 19:09:01 EST 2011 Thu Jul 7 19:09:11 EST 2011 Thu Jul 7 19:09:21 EST 2011 10 seconds instead of 1. So I looked first at the kernel timers: # dmesg | grep -i time Timecounter "i8254" frequency 1193182 Hz quality 0 Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000 acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0 pci3: at device 0.1 (no driver attached) atrtc0: port 0x70-0x71 irq 8 on acpi0 acpi_hpet0: iomem 0xfed00000-0xfed003ff on acpi0 Timecounter "HPET" frequency 14318180 Hz quality 900 Timecounters tick every 1.000 msec I switched i8254 and then to HPET. No difference. # sysctl -w kern.timecounter.hardware=i8254 kern.timecounter.hardware: ACPI-fast -> i8254 # while true; do echo `date`; sleep 1; done Thu Jul 7 19:09:40 EST 2011 Thu Jul 7 19:09:41 EST 2011 I switched to TSC: # sysctl -w kern.timecounter.hardware=TSC kern.timecounter.hardware: HPET -> TSC # while true; do echo `date`; sleep 1; done Thu Jul 7 19:25:56 EST 2011 Thu Jul 7 19:25:57 EST 2011 Thu Jul 7 19:25:58 EST 2011 Now this looks like it fixed the problem, but actually it is worse. Now the clock matches what you'd expect, but there is still 10 seconds in real time between those date entries. That is, now the system clock is running 10 times too slow as well. # uname -a FreeBSD delish.ish.com.au 8.2-RELEASE FreeBSD 8.2-RELEASE #0: Thu Feb 17 02:41:51 UTC 2011 root@mason.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64 Base board information Manufacturer: ASUSTeK Computer INC. Product Name: P6X58D-E BIOS information Vendor: American Megatrends Inc. Version: 0502 Release Date: 11/16/2010 BIOS Revision: 8.15 CPU Model: Intel(R) Core(TM) i7 CPU 960 @ 3.20GHz Thanks in advance for any help. 
Ari -- --------------------------> Aristedes Maniatis ish http://www.ish.com.au Level 1, 30 Wilson Street Newtown 2042 Australia phone +61 2 9550 5001 fax +61 2 9550 4001 GPG fingerprint CBFB 84B4 738D 4E87 5E5C 5EFA EF6A 7D2E 3E49 102A From owner-freebsd-stable@FreeBSD.ORG Thu Jul 7 09:58:02 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DA57A1065672 for ; Thu, 7 Jul 2011 09:58:02 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from QMTA11.westchester.pa.mail.comcast.net (qmta11.westchester.pa.mail.comcast.net [76.96.59.211]) by mx1.freebsd.org (Postfix) with ESMTP id 8918D8FC13 for ; Thu, 7 Jul 2011 09:58:02 +0000 (UTC) Received: from omta11.westchester.pa.mail.comcast.net ([76.96.62.36]) by QMTA11.westchester.pa.mail.comcast.net with comcast id 4xwl1h0010mv7h05Bxy20d; Thu, 07 Jul 2011 09:58:02 +0000 Received: from koitsu.dyndns.org ([67.180.84.87]) by omta11.westchester.pa.mail.comcast.net with comcast id 4xy11h00A1t3BNj3Xxy2S7; Thu, 07 Jul 2011 09:58:02 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 4DEC8102C36; Thu, 7 Jul 2011 02:58:00 -0700 (PDT) Date: Thu, 7 Jul 2011 02:58:00 -0700 From: Jeremy Chadwick To: Aristedes Maniatis Message-ID: <20110707095800.GA6295@icarus.home.lan> References: <4E157EB9.4030201@ish.com.au> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4E157EB9.4030201@ish.com.au> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-stable Subject: Re: system internal timer runs 10 times too slow X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Jul 2011 09:58:03 -0000 On Thu, Jul 07, 2011 at 07:39:05PM +1000, Aristedes Maniatis wrote: > We upgraded an 
existing system to a new motherboard/CPU and found that timing in various programs is very odd. For example "top" only updates every 10 seconds instead of every second. And this confirms the oddness: > > # while true; do echo `date`; sleep 1; done > Thu Jul 7 19:09:01 EST 2011 > Thu Jul 7 19:09:11 EST 2011 > Thu Jul 7 19:09:21 EST 2011 > > 10 seconds instead of 1. > > > So I looked first at the kernel timers: > > # dmesg | grep -i time > Timecounter "i8254" frequency 1193182 Hz quality 0 > Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000 > acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0 > pci3: at device 0.1 (no driver attached) > atrtc0: port 0x70-0x71 irq 8 on acpi0 > acpi_hpet0: iomem 0xfed00000-0xfed003ff on acpi0 > Timecounter "HPET" frequency 14318180 Hz quality 900 > Timecounters tick every 1.000 msec > > > I switched i8254 and then to HPET. No difference. > > # sysctl -w kern.timecounter.hardware=i8254 > kern.timecounter.hardware: ACPI-fast -> i8254 > # while true; do echo `date`; sleep 1; done > Thu Jul 7 19:09:40 EST 2011 > Thu Jul 7 19:09:41 EST 2011 > > I switched to TSC: > > # sysctl -w kern.timecounter.hardware=TSC > kern.timecounter.hardware: HPET -> TSC > # while true; do echo `date`; sleep 1; done > Thu Jul 7 19:25:56 EST 2011 > Thu Jul 7 19:25:57 EST 2011 > Thu Jul 7 19:25:58 EST 2011 > > Now this looks like it fixed the problem, but actually it is worse. Now the clock matches what you'd expect, but there is still 10 seconds in real time between those date entries. That is, now the system clock is running 10 times too slow as well. > > > # uname -a > FreeBSD delish.ish.com.au 8.2-RELEASE FreeBSD 8.2-RELEASE #0: Thu Feb 17 02:41:51 UTC 2011 root@mason.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64 > > Base board information > Manufacturer: ASUSTeK Computer INC. > Product Name: P6X58D-E > > BIOS information > Vendor: American Megatrends Inc. 
> Version: 0502 > Release Date: 11/16/2010 > BIOS Revision: 8.15 > > CPU Model: Intel(R) Core(TM) i7 CPU 960 @ 3.20GHz Do you have anything like powerd(8) enabled, or EIST / Intel SpeedStep technology enabled in your system BIOS? If so, can you try disabling powerd and/or disabling EIST/SS? Alternately, and this isn't to say FreeBSD doesn't have a problem, do you have a replacement/spare motherboard you can try? There's always the possibility that you have a bad crystal on the motherboard and a replacement board would rule that out. -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, US | | Making life hard for others since 1977. PGP 4BD6C0CB | From owner-freebsd-stable@FreeBSD.ORG Thu Jul 7 11:33:14 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2ECD31065677; Thu, 7 Jul 2011 11:33:14 +0000 (UTC) (envelope-from mike@sentex.net) Received: from smarthost1.sentex.ca (smarthost1-6.sentex.ca [IPv6:2607:f3e0:0:1::12]) by mx1.freebsd.org (Postfix) with ESMTP id 663D18FC1C; Thu, 7 Jul 2011 11:33:13 +0000 (UTC) Received: from [IPv6:2607:f3e0:0:4:f025:8813:7603:7e4a] (saphire3.sentex.ca [IPv6:2607:f3e0:0:4:f025:8813:7603:7e4a]) by smarthost1.sentex.ca (8.14.4/8.14.4) with ESMTP id p67BX8Hp058109; Thu, 7 Jul 2011 07:33:08 -0400 (EDT) (envelope-from mike@sentex.net) Message-ID: <4E159959.2070401@sentex.net> Date: Thu, 07 Jul 2011 07:32:41 -0400 From: Mike Tancsa Organization: Sentex Communications User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.13) Gecko/20101207 Thunderbird/3.1.7 MIME-Version: 1.0 To: Kostik Belousov References: <4E154A63.90600@sentex.net> <4E15620A.9030608@FreeBSD.org> <20110707082027.GX48734@deviant.kiev.zoral.com.ua> In-Reply-To: <20110707082027.GX48734@deviant.kiev.zoral.com.ua> X-Enigmail-Version: 1.1.1 Content-Type: 
text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.67 on IPv6:2607:f3e0:0:1::12 Cc: FreeBSD-STABLE Mailing List , Andriy Gapon Subject: Re: panic: spin lock held too long (RELENG_8 from today) X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Jul 2011 11:33:14 -0000 On 7/7/2011 4:20 AM, Kostik Belousov wrote: > > BTW, we had a similar panic, "spinlock held too long", the spinlock > is the sched lock N, on busy 8-core box recently upgraded to the > stable/8. Unfortunately, machine hung dumping core, so the stack trace > for the owner thread was not available. > > I was unable to make any conclusion from the data that was present. > If the situation is reproducible, you could try to revert r221937. This > is pure speculation, though. Another crash just now after 5hrs uptime. I will try and revert r221937 unless there is any extra debugging you want me to add to the kernel instead ? This is an inbound mail server so a little disruption is possible kgdb /usr/obj/usr/src/sys/recycle/kernel.debug vmcore.13 GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i386-marcel-freebsd"... 
Unread portion of the kernel message buffer: spin lock 0xc0b1d200 (sched lock 1) held by 0xc5dac2e0 (tid 100109) too long panic: spin lock held too long cpuid = 0 Uptime: 5h37m43s Physical memory: 2035 MB Dumping 260 MB: 245 229 213 197 181 165 149 133 117 101 85 69 53 37 21 5 Reading symbols from /boot/kernel/amdsbwd.ko...Reading symbols from /boot/kernel/amdsbwd.ko.symbols...done. done. Loaded symbols for /boot/kernel/amdsbwd.ko #0 doadump () at pcpu.h:231 231 pcpu.h: No such file or directory. in pcpu.h (kgdb) bt #0 doadump () at pcpu.h:231 #1 0xc06fd6d3 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:429 #2 0xc06fd937 in panic (fmt=Variable "fmt" is not available. ) at /usr/src/sys/kern/kern_shutdown.c:602 #3 0xc06ed95f in _mtx_lock_spin_failed (m=0x0) at /usr/src/sys/kern/kern_mutex.c:490 #4 0xc06ed9e5 in _mtx_lock_spin (m=0xc0b1d200, tid=3312388992, opts=0, file=0x0, line=0) at /usr/src/sys/kern/kern_mutex.c:526 #5 0xc0720254 in sched_add (td=0xc61892e0, flags=0) at /usr/src/sys/kern/sched_ule.c:1119 #6 0xc07203f9 in sched_wakeup (td=0xc61892e0) at /usr/src/sys/kern/sched_ule.c:1950 #7 0xc07061f8 in setrunnable (td=0xc61892e0) at /usr/src/sys/kern/kern_synch.c:499 #8 0xc07362af in sleepq_resume_thread (sq=0xc55311c0, td=0xc61892e0, pri=Variable "pri" is not available. ) at /usr/src/sys/kern/subr_sleepqueue.c:751 #9 0xc0736e18 in sleepq_signal (wchan=0xc60386d0, flags=1, pri=0, queue=0) at /usr/src/sys/kern/subr_sleepqueue.c:825 #10 0xc06b6764 in cv_signal (cvp=0xc60386d0) at /usr/src/sys/kern/kern_condvar.c:422 #11 0xc08eaa0d in xprt_assignthread (xprt=Variable "xprt" is not available. 
) at /usr/src/sys/rpc/svc.c:342 #12 0xc08ec502 in xprt_active (xprt=0xc5db8a00) at /usr/src/sys/rpc/svc.c:378 #13 0xc08ee051 in svc_vc_soupcall (so=0xc618a19c, arg=0xc5db8a00, waitflag=1) at /usr/src/sys/rpc/svc_vc.c:747 #14 0xc075bbb1 in sowakeup (so=0xc618a19c, sb=0xc618a1f0) at /usr/src/sys/kern/uipc_sockbuf.c:191 #15 0xc08447bc in tcp_do_segment (m=0xc6567a00, th=0xc6785824, so=0xc618a19c, tp=0xc617e000, drop_hdrlen=52, tlen=1448, iptos=0 '\0', ti_locked=2) at /usr/src/sys/netinet/tcp_input.c:1775 #16 0xc0847930 in tcp_input (m=0xc6567a00, off0=20) at /usr/src/sys/netinet/tcp_input.c:1329 #17 0xc07ddaf7 in ip_input (m=0xc6567a00) at /usr/src/sys/netinet/ip_input.c:787 #18 0xc07b8859 in netisr_dispatch_src (proto=1, source=0, m=0xc6567a00) at /usr/src/sys/net/netisr.c:859 #19 0xc07b8af0 in netisr_dispatch (proto=1, m=0xc6567a00) at /usr/src/sys/net/netisr.c:946 #20 0xc07ae5e1 in ether_demux (ifp=0xc56ed800, m=0xc6567a00) at /usr/src/sys/net/if_ethersubr.c:894 #21 0xc07aeb5f in ether_input (ifp=0xc56ed800, m=0xc6567a00) at /usr/src/sys/net/if_ethersubr.c:753 #22 0xc09977b2 in nfe_int_task (arg=0xc56ff000, pending=1) at /usr/src/sys/dev/nfe/if_nfe.c:2187 #23 0xc07387ca in taskqueue_run_locked (queue=0xc5702440) at /usr/src/sys/kern/subr_taskqueue.c:248 #24 0xc073895c in taskqueue_thread_loop (arg=0xc56ff130) at /usr/src/sys/kern/subr_taskqueue.c:385 #25 0xc06d1027 in fork_exit (callout=0xc07388a0 , arg=0xc56ff130, frame=0xc538ed28) at /usr/src/sys/kern/kern_fork.c:861 #26 0xc09a5c24 in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:275 -- ------------------- Mike Tancsa, tel +1 519 651 3400 Sentex Communications, mike@sentex.net Providing Internet services since 1994 www.sentex.net Cambridge, Ontario Canada http://www.tancsa.com/ From owner-freebsd-stable@FreeBSD.ORG Thu Jul 7 11:41:24 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with 
ESMTP id 744501065675 for ; Thu, 7 Jul 2011 11:41:24 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta01.emeryville.ca.mail.comcast.net (qmta01.emeryville.ca.mail.comcast.net [76.96.30.16]) by mx1.freebsd.org (Postfix) with ESMTP id 59D058FC1A for ; Thu, 7 Jul 2011 11:41:24 +0000 (UTC) Received: from omta20.emeryville.ca.mail.comcast.net ([76.96.30.87]) by qmta01.emeryville.ca.mail.comcast.net with comcast id 4zYu1h0021smiN4A1zhMC6; Thu, 07 Jul 2011 11:41:21 +0000 Received: from koitsu.dyndns.org ([67.180.84.87]) by omta20.emeryville.ca.mail.comcast.net with comcast id 4zhL1h00Z1t3BNj8gzhMzA; Thu, 07 Jul 2011 11:41:21 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id C431F102C36; Thu, 7 Jul 2011 04:41:22 -0700 (PDT) Date: Thu, 7 Jul 2011 04:41:22 -0700 From: Jeremy Chadwick To: Mike Tancsa Message-ID: <20110707114122.GA8459@icarus.home.lan> References: <4E154A63.90600@sentex.net> <4E15620A.9030608@FreeBSD.org> <20110707082027.GX48734@deviant.kiev.zoral.com.ua> <4E159959.2070401@sentex.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4E159959.2070401@sentex.net> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: Kostik Belousov , FreeBSD-STABLE Mailing List , Andriy Gapon Subject: Re: panic: spin lock held too long (RELENG_8 from today) X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Jul 2011 11:41:24 -0000 On Thu, Jul 07, 2011 at 07:32:41AM -0400, Mike Tancsa wrote: > On 7/7/2011 4:20 AM, Kostik Belousov wrote: > > > > BTW, we had a similar panic, "spinlock held too long", the spinlock > > is the sched lock N, on busy 8-core box recently upgraded to the > > stable/8. Unfortunately, machine hung dumping core, so the stack trace > > for the owner thread was not available. 
> > > > I was unable to make any conclusion from the data that was present. > > If the situation is reproducible, you could try to revert r221937. This > > is pure speculation, though. > > Another crash just now after 5hrs uptime. I will try and revert r221937 > unless there is any extra debugging you want me to add to the kernel > instead ? > > This is an inbound mail server so a little disruption is possible > > kgdb /usr/obj/usr/src/sys/recycle/kernel.debug vmcore.13 > GNU gdb 6.1.1 [FreeBSD] > Copyright 2004 Free Software Foundation, Inc. > GDB is free software, covered by the GNU General Public License, and you are > welcome to change it and/or distribute copies of it under certain > conditions. > Type "show copying" to see the conditions. > There is absolutely no warranty for GDB. Type "show warranty" for details. > This GDB was configured as "i386-marcel-freebsd"... > > Unread portion of the kernel message buffer: > spin lock 0xc0b1d200 (sched lock 1) held by 0xc5dac2e0 (tid 100109) too long > panic: spin lock held too long > cpuid = 0 > Uptime: 5h37m43s > Physical memory: 2035 MB > Dumping 260 MB: 245 229 213 197 181 165 149 133 117 101 85 69 53 37 21 5 > > Reading symbols from /boot/kernel/amdsbwd.ko...Reading symbols from > /boot/kernel/amdsbwd.ko.symbols...done. > done. > Loaded symbols for /boot/kernel/amdsbwd.ko > #0 doadump () at pcpu.h:231 > 231 pcpu.h: No such file or directory. > in pcpu.h > (kgdb) bt > #0 doadump () at pcpu.h:231 > #1 0xc06fd6d3 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:429 > #2 0xc06fd937 in panic (fmt=Variable "fmt" is not available. 
> ) at /usr/src/sys/kern/kern_shutdown.c:602 > #3 0xc06ed95f in _mtx_lock_spin_failed (m=0x0) at > /usr/src/sys/kern/kern_mutex.c:490 > #4 0xc06ed9e5 in _mtx_lock_spin (m=0xc0b1d200, tid=3312388992, opts=0, > file=0x0, line=0) > at /usr/src/sys/kern/kern_mutex.c:526 > #5 0xc0720254 in sched_add (td=0xc61892e0, flags=0) at > /usr/src/sys/kern/sched_ule.c:1119 > #6 0xc07203f9 in sched_wakeup (td=0xc61892e0) at > /usr/src/sys/kern/sched_ule.c:1950 > #7 0xc07061f8 in setrunnable (td=0xc61892e0) at > /usr/src/sys/kern/kern_synch.c:499 > #8 0xc07362af in sleepq_resume_thread (sq=0xc55311c0, td=0xc61892e0, > pri=Variable "pri" is not available. > ) > at /usr/src/sys/kern/subr_sleepqueue.c:751 > #9 0xc0736e18 in sleepq_signal (wchan=0xc60386d0, flags=1, pri=0, queue=0) > at /usr/src/sys/kern/subr_sleepqueue.c:825 > #10 0xc06b6764 in cv_signal (cvp=0xc60386d0) at > /usr/src/sys/kern/kern_condvar.c:422 > #11 0xc08eaa0d in xprt_assignthread (xprt=Variable "xprt" is not available. > ) at /usr/src/sys/rpc/svc.c:342 > #12 0xc08ec502 in xprt_active (xprt=0xc5db8a00) at > /usr/src/sys/rpc/svc.c:378 > #13 0xc08ee051 in svc_vc_soupcall (so=0xc618a19c, arg=0xc5db8a00, > waitflag=1) at /usr/src/sys/rpc/svc_vc.c:747 > #14 0xc075bbb1 in sowakeup (so=0xc618a19c, sb=0xc618a1f0) at > /usr/src/sys/kern/uipc_sockbuf.c:191 > #15 0xc08447bc in tcp_do_segment (m=0xc6567a00, th=0xc6785824, > so=0xc618a19c, tp=0xc617e000, drop_hdrlen=52, > tlen=1448, iptos=0 '\0', ti_locked=2) at > /usr/src/sys/netinet/tcp_input.c:1775 > #16 0xc0847930 in tcp_input (m=0xc6567a00, off0=20) at > /usr/src/sys/netinet/tcp_input.c:1329 > #17 0xc07ddaf7 in ip_input (m=0xc6567a00) at > /usr/src/sys/netinet/ip_input.c:787 > #18 0xc07b8859 in netisr_dispatch_src (proto=1, source=0, m=0xc6567a00) > at /usr/src/sys/net/netisr.c:859 > #19 0xc07b8af0 in netisr_dispatch (proto=1, m=0xc6567a00) at > /usr/src/sys/net/netisr.c:946 > #20 0xc07ae5e1 in ether_demux (ifp=0xc56ed800, m=0xc6567a00) at > 
/usr/src/sys/net/if_ethersubr.c:894 > #21 0xc07aeb5f in ether_input (ifp=0xc56ed800, m=0xc6567a00) at > /usr/src/sys/net/if_ethersubr.c:753 > #22 0xc09977b2 in nfe_int_task (arg=0xc56ff000, pending=1) at > /usr/src/sys/dev/nfe/if_nfe.c:2187 > #23 0xc07387ca in taskqueue_run_locked (queue=0xc5702440) at > /usr/src/sys/kern/subr_taskqueue.c:248 > #24 0xc073895c in taskqueue_thread_loop (arg=0xc56ff130) at > /usr/src/sys/kern/subr_taskqueue.c:385 > #25 0xc06d1027 in fork_exit (callout=0xc07388a0 , > arg=0xc56ff130, frame=0xc538ed28) > at /usr/src/sys/kern/kern_fork.c:861 > #26 0xc09a5c24 in fork_trampoline () at > /usr/src/sys/i386/i386/exception.s:275 1. info threads 2. Find the index value that matches the tid in question (in the above spin lock panic, that'd be tid 100109). The index value will be the first number shown on the left 3. thread {index} 4. bt If this doesn't work, alternatively you can try (from the beginning) "thread apply all bt" and provide the output from that. (It will be quite lengthy, and at this point I think tid 100109 is the one of interest in this crash, based on what Andriy said earlier) -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, US | | Making life hard for others since 1977. 
PGP 4BD6C0CB | From owner-freebsd-stable@FreeBSD.ORG Thu Jul 7 11:53:06 2011 Return-Path: Delivered-To: freebsd-stable@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id ED0211065670 for ; Thu, 7 Jul 2011 11:53:06 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 418358FC1A for ; Thu, 7 Jul 2011 11:53:06 +0000 (UTC) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id OAA19961; Thu, 07 Jul 2011 14:52:56 +0300 (EEST) (envelope-from avg@FreeBSD.org) Message-ID: <4E159E18.7070903@FreeBSD.org> Date: Thu, 07 Jul 2011 14:52:56 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:5.0) Gecko/20110705 Thunderbird/5.0 MIME-Version: 1.0 To: Jeremy Chadwick References: <4E154A63.90600@sentex.net> <4E15620A.9030608@FreeBSD.org> <20110707082027.GX48734@deviant.kiev.zoral.com.ua> <4E159959.2070401@sentex.net> <20110707114122.GA8459@icarus.home.lan> In-Reply-To: <20110707114122.GA8459@icarus.home.lan> X-Enigmail-Version: 1.2pre Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Cc: Kostik Belousov , FreeBSD-STABLE Mailing List Subject: Re: panic: spin lock held too long (RELENG_8 from today) X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Jul 2011 11:53:07 -0000 on 07/07/2011 14:41 Jeremy Chadwick said the following: > 1. info threads > 2. Find the index value that matches the tid in question (in the above > spin lock panic, that'd be tid 100109). The index value will be > the first number shown on the left > 3. 
thread {index} Just in case, in kgdb there is a command 'tid' that does all of the above steps in one go. > 4. bt > > If this doesn't work, alternatively you can try (from the beginning) > "thread apply all bt" and provide the output from that. (It will be > quite lengthy, and at this point I think tid 100109 is the one of > interest in this crash, based on what Andriy said earlier) -- Andriy Gapon From owner-freebsd-stable@FreeBSD.ORG Thu Jul 7 12:03:56 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 588641065675; Thu, 7 Jul 2011 12:03:56 +0000 (UTC) (envelope-from mike@sentex.net) Received: from smarthost1.sentex.ca (smarthost1-6.sentex.ca [IPv6:2607:f3e0:0:1::12]) by mx1.freebsd.org (Postfix) with ESMTP id CC07A8FC1B; Thu, 7 Jul 2011 12:03:53 +0000 (UTC) Received: from [IPv6:2607:f3e0:0:4:f025:8813:7603:7e4a] (saphire3.sentex.ca [IPv6:2607:f3e0:0:4:f025:8813:7603:7e4a]) by smarthost1.sentex.ca (8.14.4/8.14.4) with ESMTP id p67C3pf1061169; Thu, 7 Jul 2011 08:03:51 -0400 (EDT) (envelope-from mike@sentex.net) Message-ID: <4E15A08C.6090407@sentex.net> Date: Thu, 07 Jul 2011 08:03:24 -0400 From: Mike Tancsa Organization: Sentex Communications User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.13) Gecko/20101207 Thunderbird/3.1.7 MIME-Version: 1.0 To: Kostik Belousov References: <4E154A63.90600@sentex.net> <4E15620A.9030608@FreeBSD.org> <20110707082027.GX48734@deviant.kiev.zoral.com.ua> <4E159959.2070401@sentex.net> In-Reply-To: <4E159959.2070401@sentex.net> X-Enigmail-Version: 1.1.1 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit X-Scanned-By: MIMEDefang 2.67 on IPv6:2607:f3e0:0:1::12 Cc: FreeBSD-STABLE Mailing List , Andriy Gapon Subject: Re: panic: spin lock held too long (RELENG_8 from today) X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production 
branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Jul 2011 12:03:56 -0000 On 7/7/2011 7:32 AM, Mike Tancsa wrote: > On 7/7/2011 4:20 AM, Kostik Belousov wrote: >> >> BTW, we had a similar panic, "spinlock held too long", the spinlock >> is the sched lock N, on busy 8-core box recently upgraded to the >> stable/8. Unfortunately, machine hung dumping core, so the stack trace >> for the owner thread was not available. >> >> I was unable to make any conclusion from the data that was present. >> If the situation is reproducible, you could try to revert r221937. This >> is pure speculation, though. > > Another crash just now after 5hrs uptime. I will try and revert r221937 > unless there is any extra debugging you want me to add to the kernel > instead ? > > This is an inbound mail server so a little disruption is possible > > kgdb /usr/obj/usr/src/sys/recycle/kernel.debug vmcore.13 > GNU gdb 6.1.1 [FreeBSD] > Copyright 2004 Free Software Foundation, Inc. > GDB is free software, covered by the GNU General Public License, and you are > welcome to change it and/or distribute copies of it under certain > conditions. > Type "show copying" to see the conditions. > There is absolutely no warranty for GDB. Type "show warranty" for details. > This GDB was configured as "i386-marcel-freebsd"... > > Unread portion of the kernel message buffer: > spin lock 0xc0b1d200 (sched lock 1) held by 0xc5dac2e0 (tid 100109) too long > panic: spin lock held too long > cpuid = 0 > Uptime: 5h37m43s > Physical memory: 2035 MB > Dumping 260 MB: 245 229 213 197 181 165 149 133 117 101 85 69 53 37 21 5 And the second crash from today kgdb /usr/obj/usr/src/sys/recycle/kernel.debug vmcore.13 GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. 
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd"...

Unread portion of the kernel message buffer:
spin lock 0xc0b1d200 (sched lock 1) held by 0xc5dac2e0 (tid 100109) too long
panic: spin lock held too long
cpuid = 0
Uptime: 5h37m43s
Physical memory: 2035 MB
Dumping 260 MB: 245 229 213 197 181 165 149 133 117 101 85 69 53 37 21 5

Reading symbols from /boot/kernel/amdsbwd.ko...Reading symbols from /boot/kernel/amdsbwd.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/amdsbwd.ko
#0  doadump () at pcpu.h:231
231     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) tid 100109
[Switching to thread 82 (Thread 100109)]#0  sched_switch (td=0xc5dac2e0, newtd=0xc553c5c0, flags=260)
    at /usr/src/sys/kern/sched_ule.c:1866
1866            cpuid = PCPU_GET(cpuid);
(kgdb) list
1861            /*
1862             * We may return from cpu_switch on a different cpu. However,
1863             * we always return with td_lock pointing to the current cpu's
1864             * run queue lock.
1865             */
1866            cpuid = PCPU_GET(cpuid);
1867            tdq = TDQ_CPU(cpuid);
1868            lock_profile_obtain_lock_success(
1869                &TDQ_LOCKPTR(tdq)->lock_object, 0, 0, __FILE__, __LINE__);
1870    #ifdef HWPMC_HOOKS
(kgdb) p *td
$1 = {td_lock = 0xc0b1d200, td_proc = 0xc5db4000, td_plist = {tqe_next = 0xc5dac5c0, tqe_prev = 0xc5dac008},
  td_runq = {tqe_next = 0x0, tqe_prev = 0xc0b1d334}, td_slpq = {tqe_next = 0x0, tqe_prev = 0xc65d3b00},
  td_lockq = {tqe_next = 0x0, tqe_prev = 0xc51f6b38}, td_cpuset = 0xc5533e38, td_sel = 0x0,
  td_sleepqueue = 0xc65d3b00, td_turnstile = 0xc63ceb80, td_umtxq = 0xc5d229c0, td_tid = 100109,
  td_sigqueue = {sq_signals = {__bits = {0, 0, 0, 0}}, sq_kill = {__bits = {0, 0, 0, 0}},
    sq_list = {tqh_first = 0x0, tqh_last = 0xc5dac340}, sq_proc = 0xc5db4000, sq_flags = 1},
  td_flags = 4, td_inhibitors = 0, td_pflags = 2097152, td_dupfd = 0, td_sqqueue = 0, td_wchan = 0x0,
  td_wmesg = 0x0, td_lastcpu = 0 '\0', td_oncpu = 1 '\001', td_owepreempt = 0 '\0', td_tsqueue = 0 '\0',
  td_locks = -291, td_rw_rlocks = 0, td_lk_slocks = 0, td_blocked = 0x0, td_lockname = 0x0,
  td_contested = {lh_first = 0x0}, td_sleeplocks = 0x0, td_intr_nesting_level = 0, td_pinned = 0,
  td_ucred = 0xc5538100, td_estcpu = 0, td_slptick = 0, td_blktick = 0,
  td_ru = {ru_utime = {tv_sec = 0, tv_usec = 0}, ru_stime = {tv_sec = 0, tv_usec = 0}, ru_maxrss = 1048,
    ru_ixrss = 85216, ru_idrss = 3834720, ru_isrss = 681728, ru_minflt = 0, ru_majflt = 0, ru_nswap = 0,
    ru_inblock = 82, ru_oublock = 271222, ru_msgsnd = 135625, ru_msgrcv = 2427350, ru_nsignals = 0,
    ru_nvcsw = 2076938, ru_nivcsw = 731134},
  td_incruntime = 852332612, td_runtime = 88202475877, td_pticks = 5326, td_sticks = 48, td_iticks = 0,
  td_uticks = 0, td_intrval = 0, td_oldsigmask = {__bits = {0, 0, 0, 0}}, td_sigmask = {__bits = {0, 0, 0, 0}},
  td_generation = 2808072, td_sigstk = {ss_sp = 0x0, ss_size = 0, ss_flags = 0}, td_xsig = 0,
  td_profil_addr = 0, td_profil_ticks = 0, td_name = "nfsd: service\000\000\000\000\000\000", td_fpop = 0x0,
  td_dbgflags = 0, td_dbgksi = {ksi_link = {tqe_next = 0x0, tqe_prev = 0x0},
    ksi_info = {si_signo = 0, si_errno = 0, si_code = 0, si_pid = 0, si_uid = 0, si_status = 0, si_addr = 0x0,
      si_value = {sival_int = 0, sival_ptr = 0x0, sigval_int = 0, sigval_ptr = 0x0},
      _reason = {_fault = {_trapno = 0}, _timer = {_timerid = 0, _overrun = 0}, _mesgq = {_mqd = 0},
        _poll = {_band = 0}, __spare__ = {__spare1__ = 0, __spare2__ = {0, 0, 0, 0, 0, 0, 0}}}},
    ksi_flags = 0, ksi_sigq = 0x0},
  td_ng_outbound = 0, td_osd = {osd_nslots = 0, osd_slots = 0x0, osd_next = {le_next = 0x0, le_prev = 0x0}},
  td_rqindex = 32 ' ', td_base_pri = 160 ' ', td_priority = 128 '\200', td_pri_class = 3 '\003',
  td_user_pri = 128 '\200', td_base_user_pri = 128 '\200', td_pcb = 0xe7d14d80, td_state = TDS_RUNNING,
  td_retval = {0, 0}, td_slpcallout = {c_links = {sle = {sle_next = 0xc5d704e0},
      tqe = {tqe_next = 0xc5d704e0, tqe_prev = 0xc55bb1f0}}, c_time = 20246590, c_arg = 0xc5dac2e0,
    c_func = 0xc0736bc0 , c_lock = 0x0, c_flags = 18, c_cpu = 32},
  td_frame = 0xe7d14d28, td_kstack_obj = 0xc6182088, td_kstack = 3889246208, td_kstack_pages = 2,
  td_unused1 = 0x0, td_unused2 = 0, td_unused3 = 0, td_critnest = 1,
  td_md = {md_spinlock_count = 1, md_saved_flags = 582}, td_sched = 0xc5dac58c, td_ar = 0x0,
  td_syscalls = 0, td_lprof = {{lh_first = 0x0}, {lh_first = 0x0}}, td_dtrace = 0x0, td_errno = 0,
  td_vnet = 0x0, td_vnet_lpush = 0x0,
  td_rux = {rux_runtime = 87350143265, rux_uticks = 0, rux_sticks = 5278, rux_iticks = 0, rux_uu = 0,
    rux_su = 0, rux_tu = 0}, td_map_def_user = 0x0, td_dbg_forked = 0}
(kgdb) p *newtd
$2 = {td_lock = 0xc0b1cb80, td_proc = 0xc553a810, td_plist = {tqe_next = 0xc553c8a0, tqe_prev = 0xc553a818},
  td_runq = {tqe_next = 0x0, tqe_prev = 0x0}, td_slpq = {tqe_next = 0x0, tqe_prev = 0x0},
  td_lockq = {tqe_next = 0x0, tqe_prev = 0x0}, td_cpuset = 0xc5533e38, td_sel = 0x0,
  td_sleepqueue = 0xc5531e00, td_turnstile = 0xc553d000, td_umtxq = 0xc5527ac0, td_tid = 100004,
  td_sigqueue = {sq_signals = {__bits = {0, 0, 0, 0}}, sq_kill = {__bits = {0, 0, 0, 0}},
    sq_list = {tqh_first = 0x0, tqh_last = 0xc553c620}, sq_proc = 0xc553a810, sq_flags = 1},
  td_flags = 262180, td_inhibitors = 0, td_pflags = 2097152, td_dupfd = 0, td_sqqueue = 0, td_wchan = 0x0,
  td_wmesg = 0x0, td_lastcpu = 0 '\0', td_oncpu = 255 'ÿ', td_owepreempt = 0 '\0', td_tsqueue = 0 '\0',
  td_locks = 0, td_rw_rlocks = 0, td_lk_slocks = 0, td_blocked = 0x0, td_lockname = 0x0,
  td_contested = {lh_first = 0x0}, td_sleeplocks = 0x0, td_intr_nesting_level = 0, td_pinned = 0,
  td_ucred = 0xc5535600, td_estcpu = 0, td_slptick = 0, td_blktick = 0,
  td_ru = {ru_utime = {tv_sec = 0, tv_usec = 0}, ru_stime = {tv_sec = 0, tv_usec = 0}, ru_maxrss = 0,
    ru_ixrss = 0, ru_idrss = 0, ru_isrss = 0, ru_minflt = 0, ru_majflt = 0, ru_nswap = 0, ru_inblock = 0,
    ru_oublock = 0, ru_msgsnd = 0, ru_msgrcv = 0, ru_nsignals = 0, ru_nvcsw = 33962290,
    ru_nivcsw = 40323696},
  td_incruntime = 370201469879, td_runtime = 41685199750119, td_pticks = 2502607, td_sticks = 22282,
  td_iticks = 0, td_uticks = 0, td_intrval = 0, td_oldsigmask = {__bits = {0, 0, 0, 0}},
  td_sigmask = {__bits = {0, 0, 0, 0}}, td_generation = 74285986,
  td_sigstk = {ss_sp = 0x0, ss_size = 0, ss_flags = 0}, td_xsig = 0, td_profil_addr = 0,
  td_profil_ticks = 0, td_name = "idle: cpu0\000\000\000\000\000\000\000\000\000", td_fpop = 0x0,
  td_dbgflags = 0, td_dbgksi = {ksi_link = {tqe_next = 0x0, tqe_prev = 0x0},
    ksi_info = {si_signo = 0, si_errno = 0, si_code = 0, si_pid = 0, si_uid = 0, si_status = 0, si_addr = 0x0,
      si_value = {sival_int = 0, sival_ptr = 0x0, sigval_int = 0, sigval_ptr = 0x0},
      _reason = {_fault = {_trapno = 0}, _timer = {_timerid = 0, _overrun = 0}, _mesgq = {_mqd = 0},
        _poll = {_band = 0}, __spare__ = {__spare1__ = 0, __spare2__ = {0, 0, 0, 0, 0, 0, 0}}}},
    ksi_flags = 0, ksi_sigq = 0x0},
  td_ng_outbound = 0, td_osd = {osd_nslots = 0, osd_slots = 0x0, osd_next = {le_next = 0x0, le_prev = 0x0}},
  td_rqindex = 0 '\0', td_base_pri = 255 'ÿ', td_priority = 255 'ÿ', td_pri_class = 4 '\004',
  td_user_pri = 160 ' ', td_base_user_pri = 160 ' ', td_pcb = 0xc51e3d80, td_state = TDS_CAN_RUN,
  td_retval = {0, 0}, td_slpcallout = {c_links = {sle = {sle_next = 0x0},
      tqe = {tqe_next = 0x0, tqe_prev = 0x0}}, c_time = 0, c_arg = 0x0, c_func = 0, c_lock = 0x0,
    c_flags = 16, c_cpu = 0},
  td_frame = 0xc51e3d28, td_kstack_obj = 0xc157ddd0, td_kstack = 3307085824, td_kstack_pages = 2,
  td_unused1 = 0x0, td_unused2 = 0, td_unused3 = 0, td_critnest = 1,
  td_md = {md_spinlock_count = 1, md_saved_flags = 582}, td_sched = 0xc553c86c, td_ar = 0x0,
  td_syscalls = 0, td_lprof = {{lh_first = 0x0}, {lh_first = 0x0}}, td_dtrace = 0x0, td_errno = 0,
  td_vnet = 0x0, td_vnet_lpush = 0x0,
  td_rux = {rux_runtime = 41315105445882, rux_uticks = 0, rux_sticks = 2480325, rux_iticks = 0, rux_uu = 0,
    rux_su = 0, rux_tu = 0}, td_map_def_user = 0x0, td_dbg_forked = 0}
(kgdb) p *mtx
$3 = {lock_object = {lo_name = 0xc0a3af04 "sleepq chain", lo_flags = 720896, lo_data = 0, lo_witness = 0x0},
  mtx_lock = 4}
(kgdb) disassemble
Dump of assembler code for function sched_switch:
0xc07206c0 : push %ebp
0xc07206c1 : mov %esp,%ebp
0xc07206c3 : push %edi
0xc07206c4 : push %esi
0xc07206c5 : push %ebx
0xc07206c6 : sub $0x24,%esp
0xc07206c9 : mov 0x8(%ebp),%esi
0xc07206cc : mov (%esi),%eax
0xc07206ce : mov %fs:0x20,%eax
0xc07206d4 : mov %eax,0xfffffff0(%ebp)
0xc07206d7 : mov %eax,0xffffffec(%ebp)
0xc07206da : imul $0x680,%eax,%eax
0xc07206e0 : lea 0xc0b1cb80(%eax),%edi
0xc07206e6 : mov 0x248(%esi),%ebx
0xc07206ec : mov (%esi),%eax
0xc07206ee : mov %eax,0xffffffe4(%ebp)
0xc07206f1 : mov 0xc0b16e8c,%eax
0xc07206f6 : mov %eax,0x8(%ebx)
0xc07206f9 : movzbl 0x8d(%esi),%eax
0xc0720700 : mov %al,0x8c(%esi)
0xc0720706 : movb $0xff,0x8d(%esi)
0xc072070d : mov 0x10(%ebp),%eax
0xc0720710 : and $0x400,%eax
0xc0720715 : jne 0xc072071e
0xc0720717 : andl $0xfffeffff,0x70(%esi)
0xc072071e : movb $0x0,0x8e(%esi)
0xc0720725 : addw $0x1,0x24(%edi)
0xc072072a : testb $0x20,0x70(%esi)
0xc072072e : je 0xc0720740
0xc0720730 : movl $0x2,0x1f4(%esi)
0xc072073a : jmp 0xc0720924
0xc072073f : nop
0xc0720740 : cmpl $0x4,0x1f4(%esi)
0xc0720747 : jne 0xc07208c0
0xc072074d : cmp $0x1,%eax
0xc0720750 : sbb %edx,%edx
0xc0720752 : and $0xfffffff8,%edx
0xc0720755 : add $0xb,%edx
0xc0720758 : mov %edx,0xffffffe8(%ebp)
0xc072075b : cmpl $0x0,0xac(%esi)
0xc0720762 : jne 0xc0720790
0xc0720764 : mov 0x28(%esi),%edx
0xc0720767 : movzbl 0x6(%ebx),%ecx
0xc072076b : mov %ecx,%eax
0xc072076d : shr $0x5,%al
0xc0720770 : movzbl %al,%eax
0xc0720773 : mov (%edx,%eax,4),%eax
0xc0720776 : and $0x1f,%ecx
0xc0720779 : sar %cl,%eax
0xc072077b : test $0x1,%al
0xc072077d : jne 0xc0720790
0xc072077f : mov $0x0,%edx
0xc0720784 : mov %esi,%eax
0xc0720786 : call 0xc071ff90
0xc072078b : mov %al,0x6(%ebx)
0xc072078e : mov %esi,%esi
0xc0720790 : movzbl 0x6(%ebx),%eax
0xc0720794 : cmp 0xffffffec(%ebp),%eax
0xc0720797 : jne 0xc0720850
0xc072079d : mov (%esi),%eax
0xc072079f : movzbl 0x1ea(%esi),%ebx
0xc07207a6 : mov 0x248(%esi),%ecx
0xc07207ac : movl $0x3,0x1f4(%esi)
0xc07207b6 : cmpl $0x0,0xac(%esi)
0xc07207bd : jne 0xc07207c8
0xc07207bf : addl $0x1,0x20(%edi)
0xc07207c3 : orw $0x2,0x4(%ecx)
0xc07207c8 : cmp $0x9f,%bl
0xc07207cb : ja 0xc07207d4
0xc07207cd : lea 0x2c(%edi),%eax
0xc07207d0 : mov %eax,(%ecx)
0xc07207d2 : jmp 0xc0720833
0xc07207d4 : cmp $0xdf,%bl
0xc07207d7 : ja 0xc072082b
0xc07207d9 : lea 0x234(%edi),%eax
0xc07207df : mov %eax,(%ecx)
0xc07207e1 : testb $0x18,0xffffffe8(%ebp)
0xc07207e5 : jne 0xc0720806
0xc07207e7 : movzbl 0x2a(%edi),%edx
0xc07207eb : lea 0x60(%ebx,%edx,1),%eax
0xc07207ef : and $0x3f,%eax
0xc07207f2 : movzbl 0x2b(%edi),%ebx
0xc07207f6 : cmp %dl,%bl
0xc07207f8 : je 0xc072080a
0xc07207fa : cmp %al,%bl
0xc07207fc : jne 0xc072080a
0xc07207fe : sub $0x1,%eax
0xc0720801 : and $0x3f,%eax
0xc0720804 : jmp 0xc072080a
0xc0720806 : movzbl 0x2b(%edi),%eax
0xc072080a : movzbl %al,%eax
0xc072080d : mov (%ecx),%edx
0xc072080f : mov 0xffffffe8(%ebp),%ecx
0xc0720812 : mov %ecx,0xc(%esp)
0xc0720816 : mov %eax,0x8(%esp)
0xc072081a : mov %esi,0x4(%esp)
0xc072081e : mov %edx,(%esp)
0xc0720821 : call 0xc07051a0
0xc0720826 : jmp 0xc0720924
0xc072082b : lea 0x43c(%edi),%eax
0xc0720831 : mov %eax,(%ecx)
0xc0720833 : mov (%ecx),%eax
0xc0720835 : mov 0xffffffe8(%ebp),%edx
0xc0720838 : mov %edx,0x8(%esp)
0xc072083c : mov %esi,0x4(%esp)
0xc0720840 : mov %eax,(%esp)
0xc0720843 : call 0xc0704dc0
0xc0720848 : jmp 0xc0720924
0xc072084d : lea 0x0(%esi),%esi
0xc0720850 : mov 0x248(%esi),%eax
0xc0720856 : movzbl 0x6(%eax),%eax
0xc072085a : imul $0x680,%eax,%eax
0xc0720860 : lea 0xc0b1cb80(%eax),%ebx
0xc0720866 : mov %esi,%edx
0xc0720868 : mov %edi,%eax
0xc072086a : call 0xc071da30
0xc072086f : call 0xc09af910
0xc0720874 : mov %esi,(%esp)
0xc0720877 : call 0xc06edf20
0xc072087c : mov %edi,%edx
0xc072087e : mov %ebx,%eax
0xc0720880 : call 0xc071e080
0xc0720885 : mov 0xffffffe8(%ebp),%ecx
0xc0720888 : mov %esi,%edx
0xc072088a : mov %ebx,%eax
0xc072088c : call 0xc071ef50
0xc0720891 : mov %esi,%edx
0xc0720893 : mov %ebx,%eax
0xc0720895 : call 0xc071e990
0xc072089a : mov 0x8(%ebx),%eax
0xc072089d : test %eax,%eax
0xc072089f : je 0xc07208a9
0xc07208a1 : sub $0x1,%eax
0xc07208a4 : mov %eax,0x8(%ebx)
0xc07208a7 : jmp 0xc07208b1
0xc07208a9 : mov $0x4,%eax
0xc07208ae : xchg %eax,0x10(%ebx)
0xc07208b1 : call 0xc09afae0
0xc07208b6 : call 0xc09afae0
0xc07208bb : mov %ebx,0xffffffe4(%ebp)
0xc07208be : jmp 0xc0720924
0xc07208c0 : mov %fs:0x0,%ebx
0xc07208c7 : call 0xc09af910
0xc07208cc : mov $0x4,%eax
0xc07208d1 : lock cmpxchg %ebx,0x10(%edi)
0xc07208d6 : sete %al
0xc07208d9 : test %al,%al
0xc07208db : jne 0xc0720910
0xc07208dd : mov 0x10(%edi),%eax
0xc07208e0 : cmp %ebx,%eax
0xc07208e2 : jne 0xc07208ea
0xc07208e4 : addl $0x1,0x8(%edi)
0xc07208e8 : jmp 0xc0720910
0xc07208ea : movl $0x0,0x10(%esp)
0xc07208f2 : movl $0x0,0xc(%esp)
0xc07208fa : movl $0x0,0x8(%esp)
0xc0720902 : mov %ebx,0x4(%esp)
0xc0720906 : mov %edi,(%esp)
0xc0720909 : call 0xc06ed970 <_mtx_lock_spin>
0xc072090e : mov %esi,%esi
0xc0720910 : mov %esi,(%esp)
0xc0720913 : call 0xc06edf20
0xc0720918 : mov %eax,0xffffffe4(%ebp)
0xc072091b : mov %esi,%edx
0xc072091d : mov %edi,%eax
0xc072091f : call 0xc071da30
0xc0720924 : call 0xc0705140
0xc0720929 : mov %eax,%ebx
0xc072092b : cmp %eax,%esi
0xc072092d : je 0xc07209b1
0xc0720933 : mov 0x4(%esi),%ecx
0xc0720936 : lock cmpxchg %eax,0x5c(%ecx)
0xc072093b : test $0x800000,%eax
0xc0720940 : je 0xc0720960
0xc0720942 : mov 0xc0b184dc,%eax
0xc0720947 : test %eax,%eax
0xc0720949 : je 0xc0720960
0xc072094b : movl $0x0,0x8(%esp)
0xc0720953 : movl $0x3,0x4(%esp)
0xc072095b : mov %esi,(%esp)
0xc072095e : call *%eax
0xc0720960 : mov %ebx,0x10(%edi)
0xc0720963 : mov 0xffffffe4(%ebp),%edx
0xc0720966 : mov %edx,0x8(%esp)
0xc072096a : mov %ebx,0x4(%esp)
0xc072096e : mov %esi,(%esp)
0xc0720971 : call 0xc09bcdc4
0xc0720976 : mov %fs:0x20,%eax
0xc072097c : mov %eax,0xfffffff0(%ebp)
0xc072097f : mov %eax,0xffffffec(%ebp)
0xc0720982 : mov 0x4(%esi),%ecx
0xc0720985 : lock cmpxchg %eax,0x5c(%ecx)
0xc072098a : test $0x800000,%eax
0xc072098f : je 0xc07209b6
0xc0720991 : mov 0xc0b184dc,%eax
0xc0720996 : test %eax,%eax
0xc0720998 : je 0xc07209b6
0xc072099a : movl $0x0,0x8(%esp)
0xc07209a2 : movl $0x2,0x4(%esp)
0xc07209aa : mov %esi,(%esp)
0xc07209ad : call *%eax
0xc07209af : jmp 0xc07209b6
0xc07209b1 : mov 0xffffffe4(%ebp),%eax
0xc07209b4 : xchg %eax,(%esi)
0xc07209b6 : movzbl 0xffffffec(%ebp),%edx
0xc07209ba : mov %dl,0x8d(%esi)
0xc07209c0 : add $0x24,%esp
0xc07209c3 : pop %ebx
0xc07209c4 : pop %esi
0xc07209c5 : pop %edi
0xc07209c6 : pop %ebp
0xc07209c7 : ret
End of assembler dump.
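
One way to cross-check the dump above: near the top of sched_switch the code computes the run queue address as 0xc0b1cb80 + cpuid * 0x680 (the "imul $0x680,%eax,%eax" / "lea 0xc0b1cb80(%eax),%edi" pair), so the td_lock pointers can be mapped back to CPU numbers. A minimal Python sketch of that arithmetic follows. It assumes the queue's lock sits at offset 0 of each per-CPU entry, the constants apply to this particular i386 kernel only, and the function names are made up for illustration.

```python
import re

# Constants read off the disassembly in this thread (this kernel build only):
#   imul $0x680,%eax,%eax ; lea 0xc0b1cb80(%eax),%edi
TDQ_BASE = 0xC0B1CB80    # address of CPU 0's run queue entry
TDQ_STRIDE = 0x680       # size of one per-CPU entry

PANIC_RE = re.compile(
    r"spin lock (0x[0-9a-f]+) \((.+?)\) held by (0x[0-9a-f]+) \(tid (\d+)\) too long"
)


def parse_spinlock_panic(line):
    """Split the panic message into lock address, lock name, owner and tid."""
    m = PANIC_RE.search(line)
    if m is None:
        return None
    return {
        "lock": int(m.group(1), 16),
        "name": m.group(2),
        "owner_td": int(m.group(3), 16),
        "tid": int(m.group(4)),
    }


def tdq_cpu(lock_addr):
    """Map a run queue lock address to a CPU number, or None if it does not fit.

    Assumption: the lock is the first member of each per-CPU entry.
    """
    offset = lock_addr - TDQ_BASE
    if offset < 0 or offset % TDQ_STRIDE != 0:
        return None
    return offset // TDQ_STRIDE


msg = "spin lock 0xc0b1d200 (sched lock 1) held by 0xc5dac2e0 (tid 100109) too long"
info = parse_spinlock_panic(msg)
print(info["tid"], tdq_cpu(info["lock"]))  # prints: 100109 1
print(tdq_cpu(0xC0B1CB80))                 # newtd->td_lock, prints: 0
```

With the addresses from this dump, td->td_lock (0xc0b1d200) maps to CPU 1, which matches both the "sched lock 1" name in the panic message and td_oncpu = 1, while newtd->td_lock (0xc0b1cb80) maps to CPU 0, the queue of the "idle: cpu0" thread.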
-- ------------------- Mike Tancsa, tel +1 519 651 3400 Sentex Communications, mike@sentex.net Providing Internet services since 1994 www.sentex.net Cambridge, Ontario Canada http://www.tancsa.com/ From owner-freebsd-stable@FreeBSD.ORG Thu Jul 7 12:09:36 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7EF561065673 for ; Thu, 7 Jul 2011 12:09:36 +0000 (UTC) (envelope-from freebsd-stable@m.gmane.org) Received: from lo.gmane.org (lo.gmane.org [80.91.229.12]) by mx1.freebsd.org (Postfix) with ESMTP id 36AD38FC08 for ; Thu, 7 Jul 2011 12:09:36 +0000 (UTC) Received: from list by lo.gmane.org with local (Exim 4.69) (envelope-from ) id 1QenOh-0006Z7-32 for freebsd-stable@freebsd.org; Thu, 07 Jul 2011 14:09:35 +0200 Received: from dtmd-4db2c633.pool.mediaways.net ([77.178.198.51]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Thu, 07 Jul 2011 14:09:35 +0200 Received: from christian.baer by dtmd-4db2c633.pool.mediaways.net with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Thu, 07 Jul 2011 14:09:35 +0200 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-stable@freebsd.org From: Christian Baer Date: Thu, 07 Jul 2011 14:09:20 +0200 Lines: 13 Message-ID: References: <52F39CE0-EEC7-4180-8186-BF8696AF279D@lassitu.de> <20110618175215.GA18645@icarus.home.lan> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Complaints-To: usenet@dough.gmane.org X-Gmane-NNTP-Posting-Host: dtmd-4db2c633.pool.mediaways.net User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.16) Gecko/20101125 Lightning/1.0b1 Thunderbird/3.0.11 In-Reply-To: Subject: Re: Crashes with Promise controller X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: 
List-Subscribe: , X-List-Received-Date: Thu, 07 Jul 2011 12:09:36 -0000 On 06.07.2011 01:00, George Kontostanos wrote: [Promise PDC40718 SATA300 controller] > There are a lot of people I know that have similar issues. It has > caused me to replace 3 disks so far. I am afraid that this controller > should be marked as junk. Do you have an alternative controller in mind? Preferably I mean one that doesn't cost ten times as much. :-) Cheers! Chris From owner-freebsd-stable@FreeBSD.ORG Thu Jul 7 12:15:18 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 881C9106566C for ; Thu, 7 Jul 2011 12:15:18 +0000 (UTC) (envelope-from thomas@ronner.org) Received: from mail.knopje.net (unknown [IPv6:2001:470:1f15:a0::10]) by mx1.freebsd.org (Postfix) with ESMTP id ACC0A8FC12 for ; Thu, 7 Jul 2011 12:15:17 +0000 (UTC) Received: from localhost (localhost.localdomain [127.0.0.1]) by mail.knopje.net (Postfix) with ESMTP id 6FD84380BE; Thu, 7 Jul 2011 14:15:13 +0200 (CEST) X-Virus-Scanned: Debian amavisd-new at knopje.net Received: from mail.knopje.net ([127.0.0.1]) by localhost (hal.knopje.net [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id uIYdz11LLbVz; Thu, 7 Jul 2011 14:15:09 +0200 (CEST) Received: from appelflap.local (rtutr01.ic-s.nl [213.214.96.4]) by mail.knopje.net (Postfix) with ESMTPSA id C07763806F; Thu, 7 Jul 2011 14:15:09 +0200 (CEST) Message-ID: <4E15A34E.7090205@ronner.org> Date: Thu, 07 Jul 2011 14:15:10 +0200 From: Thomas Ronner User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:5.0) Gecko/20110624 Thunderbird/5.0 MIME-Version: 1.0 To: Christian Baer References: <52F39CE0-EEC7-4180-8186-BF8696AF279D@lassitu.de> <20110618175215.GA18645@icarus.home.lan> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-stable@freebsd.org Subject: Re: Crashes with Promise controller 
X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Jul 2011 12:15:18 -0000 On 7/7/11 2:09 PM, Christian Baer wrote: > Do you have an alternative controller in mind? Preferably I mean one > that doesn't cost ten times as much. :-) I suggest an LSI 1068 based SAS controller. With the SAS->SATA cables included it will cost at most five times as much :) Regards, Thomas From owner-freebsd-stable@FreeBSD.ORG Thu Jul 7 12:26:08 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id F1F941065674 for ; Thu, 7 Jul 2011 12:26:08 +0000 (UTC) (envelope-from freebsd-stable@m.gmane.org) Received: from lo.gmane.org (lo.gmane.org [80.91.229.12]) by mx1.freebsd.org (Postfix) with ESMTP id A96888FC12 for ; Thu, 7 Jul 2011 12:26:08 +0000 (UTC) Received: from list by lo.gmane.org with local (Exim 4.69) (envelope-from ) id 1Qeneh-0000QV-Jh for freebsd-stable@freebsd.org; Thu, 07 Jul 2011 14:26:07 +0200 Received: from dtmd-4db2c633.pool.mediaways.net ([77.178.198.51]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Thu, 07 Jul 2011 14:26:07 +0200 Received: from christian.baer by dtmd-4db2c633.pool.mediaways.net with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Thu, 07 Jul 2011 14:26:07 +0200 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-stable@freebsd.org From: Christian Baer Date: Thu, 07 Jul 2011 14:25:55 +0200 Lines: 38 Message-ID: References: <52F39CE0-EEC7-4180-8186-BF8696AF279D@lassitu.de> <20110618175215.GA18645@icarus.home.lan> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Complaints-To: usenet@dough.gmane.org X-Gmane-NNTP-Posting-Host: dtmd-4db2c633.pool.mediaways.net User-Agent: 
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.16) Gecko/20101125 Lightning/1.0b1 Thunderbird/3.0.11 In-Reply-To: Subject: Re: Crashes with Promise controller X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Jul 2011 12:26:09 -0000

On 06.07.2011 01:00, George Kontostanos wrote:

> There are a lot of people I know that have similar issues. It has
> caused me to replace 3 disks so far. I am afraid that this controller
> should be marked as junk.

Just got another crash. This one is different...

--- snip ---
dev = ad16p1.eli, ino = 4, fs = /archive/drives/archive01
panic: ffs_freefile: freeing free inode
KDB: stack backtrace:
db_trace_self_wrapper(c0a1552c,6972642f,2f736576,68637261,30657669,a0d31,0,0,c485b0a4,100,c3ad05c0,d124c000,d96ceb7c,c06eb07b,c485b0a4,100,d96ceb7c,c06eb32b,c0a13393,d96ceb88) at db_trace_self_wrapper+0x26
kdb_backtrace(c0a13393,c0ace8c0,c0a2edcc,d96cebac,d96cebac,...) at kdb_backtrace+0x2b
panic(c0a2edcc,c3cd7a78,4,c3b070d4,0,...) at panic+0xf8
ffs_freefile(c3b0b400,c3b07000,c3d8b000,4,41c0,...) at ffs_freefile+0x357
handle_workitem_freefile(0,d96cec6c,2,d96cec78,c06f1ea0,...) at handle_workitem_freefile+0xc7
process_worklist_item(c0a3012c,0,0,0,c3ad05c0,...) at process_worklist_item+0x2bc
softdep_process_worklist(c3b1a284,0,44,c0a3012c,3e8,...) at softdep_process_worklist+0xc2
softdep_flush(0,d96ced28,0,0,0,...) at softdep_flush+0x161
fork_exit(c08fb600,0,d96ced28) at fork_exit+0x86
fork_trampoline() at fork_trampoline+0x8
--- trap 0, eip = 0, esp = 0xd96ced60, ebp = 0 ---
KDB: enter: panic
[thread pid 18 tid 100038 ]
Stopped at kdb_enter+0x3b: movl $0,kdb_why
--- snap ---

I got it while trying to delete some files (5 or 6 in all).

Does that help anyone here?

Cheers!
Chris From owner-freebsd-stable@FreeBSD.ORG Thu Jul 7 13:25:57 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id EAC1B106564A for ; Thu, 7 Jul 2011 13:25:57 +0000 (UTC) (envelope-from gkontos.mail@gmail.com) Received: from mail-iw0-f182.google.com (mail-iw0-f182.google.com [209.85.214.182]) by mx1.freebsd.org (Postfix) with ESMTP id B57288FC13 for ; Thu, 7 Jul 2011 13:25:57 +0000 (UTC) Received: by iwr19 with SMTP id 19so1106815iwr.13 for ; Thu, 07 Jul 2011 06:25:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=oRYgKl5s3hSkBsGRRDvlof2PvTm15PrypHnSuPY97Uo=; b=He7Lu1P9C6nXQ6DHLTQan5tB5uJOheG2eI+MSduC78NGa6VKy+hMTm8H2IGrrP6PIg urpoOLgD49XoMaj3qs1ztrrC/PSZPPf+RJUcq3Czkp7Lw7xml/fdlIBOD/L70zijqydR yCJAXa/c4Dv3iLzqNCIwe8CN31+io+KVrhoZM= MIME-Version: 1.0 Received: by 10.231.19.201 with SMTP id c9mr672166ibb.188.1310045156944; Thu, 07 Jul 2011 06:25:56 -0700 (PDT) Received: by 10.231.15.205 with HTTP; Thu, 7 Jul 2011 06:25:56 -0700 (PDT) In-Reply-To: <4E15A34E.7090205@ronner.org> References: <52F39CE0-EEC7-4180-8186-BF8696AF279D@lassitu.de> <20110618175215.GA18645@icarus.home.lan> <4E15A34E.7090205@ronner.org> Date: Thu, 7 Jul 2011 16:25:56 +0300 Message-ID: From: George Kontostanos To: Thomas Ronner Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-stable@freebsd.org, Christian Baer Subject: Re: Crashes with Promise controller X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Jul 2011 13:25:58 -0000 On Thu, Jul 7, 2011 at 3:15 PM, Thomas Ronner wrote: > On 7/7/11 2:09 PM, Christian Baer wrote: >> >> Do you have an alternative 
controller in mind? Preferably I mean one
>> that doesn't cost ten times as much. :-)
>
> I suggest an LSI 1068 based SAS controller. With the SAS->SATA cables
> included it will cost at most five times as much :)
>
>
>
> Regards,
> Thomas
> _______________________________________________
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
>

Hi Thomas,

I have found a used LSI SAS3041E-HP 4-port PCI-E SAS/SATA RAID
controller on eBay. It costs around $70, which is pretty much the same
price I paid for the junk I am using now.

Do you know if it is supported? I can't find anything relevant in the
HCL, though I have read that some people use it without problems.

Thanks

--
George Kontostanos
aisecure.net

From owner-freebsd-stable@FreeBSD.ORG Thu Jul 7 16:55:11 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id F2F531065670 for ; Thu, 7 Jul 2011 16:55:11 +0000 (UTC) (envelope-from ykirill@yahoo.com) Received: from nm24-vm1.bullet.mail.ne1.yahoo.com (nm24-vm1.bullet.mail.ne1.yahoo.com [98.138.90.45]) by mx1.freebsd.org (Postfix) with SMTP id 98D368FC21 for ; Thu, 7 Jul 2011 16:55:11 +0000 (UTC) Received: from [98.138.90.48] by nm24.bullet.mail.ne1.yahoo.com with NNFMP; 07 Jul 2011 16:41:56 -0000 Received: from [98.138.87.2] by tm1.bullet.mail.ne1.yahoo.com with NNFMP; 07 Jul 2011 16:41:56 -0000 Received: from [127.0.0.1] by omp1002.mail.ne1.yahoo.com with NNFMP; 07 Jul 2011 16:41:56 -0000 X-Yahoo-Newman-Property: ymail-3 X-Yahoo-Newman-Id: 493786.61826.bm@omp1002.mail.ne1.yahoo.com Received: (qmail 71381 invoked by uid 60001); 7 Jul 2011 16:41:56 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024; t=1310056916; bh=RFeVeE9Acbm9LPWrGBgIAZoFuGbXI1v68zSSbBqnydQ=;
h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Subject:To:In-Reply-To:MIME-Version:Content-Type:Content-Transfer-Encoding; b=QMbTrR19ayaXG7xNtan7SHttjiERRjFdFxV2ZDepc4dcwM1+4JF3tUbMwnRLOD8b2OWuDhtM/J5uVlzBsKO3KKoTt+PGKtGGBYzbHpUVvm5rJdW2murjVBFQ+HYNRswn3kAU3TKs+fZRoMbZwDrGCfTo/VPz6+SlxselMJ1JKF4= DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Subject:To:In-Reply-To:MIME-Version:Content-Type:Content-Transfer-Encoding; b=WhwcwfiphdThqWSCwFPWLVX6RohNxXaTNnV3eugFIX6VJvOOxWjWWfZcv/BSrRxSq/3K+TVHN6P56ozCzL3VIHaxwi0qX1L8pkpj4tIYv/kup6Jjs32UfATg6XgbVhRVm2U4NntbhxFZAixPtEjxiGmjTzk/OqO/tGaq4+OgVoo=; Message-ID: <77666.71118.qm@web120527.mail.ne1.yahoo.com> X-YMail-OSG: yzVXkHMVM1k8vtoBVdJ6SabVPbgjEAQsAzEu6zBvbqd.i.s z.6cpJfjoawCeLL.BdkbproyHthW7GZ9ean4bjHaPVCVYxrKaJN3Ob04pBQQ zyNtwGn.Z_HY9MPUorarTERpY_f2MHhoJ9rLwEWQvJ3ihDAoWZa5vl9EYh9X i5BkOEqLokdTYhZL07OfsF4hOKgy2Lkb3V_lS6o58NecmNtTntdQQ5BE1STd iJHJ0N.m8YVUoWGm5u5THwB0v8NijvcPNKrJAv9mFaYjN9G7TeQpjAqghM02 vdghONAGMU.9wupf9fJXa4XdSE2FUGIEs95z2tAsZnDSeVPg3SRdhtGEstnw 6_GvAgC3LLxrFx332EbhU7CBHzFMWNgDNC9RRLV_aq9pa0Pk19LGcHm5DHiu wl0n6oCXhYxhejiGc Received: from [212.45.22.73] by web120527.mail.ne1.yahoo.com via HTTP; Thu, 07 Jul 2011 09:41:54 PDT X-Mailer: YahooMailClassic/14.0.3 YahooMailWebService/0.8.111.304355 Date: Thu, 7 Jul 2011 09:41:54 -0700 (PDT) From: Kirill Yelizarov To: freebsd-stable@freebsd.org In-Reply-To: <20110707095800.GA6295@icarus.home.lan> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: quoted-printable Subject: Re: system internal timer runs 10 times too slow X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Jul 2011 16:55:12 -0000

--- On Thu, 7/7/11, Jeremy Chadwick wrote:

> From: Jeremy Chadwick
> Subject: Re: system internal timer runs 10 times too slow
> To: "Aristedes Maniatis"
> Cc: "freebsd-stable"
> Date: Thursday, July 7, 2011, 1:58 PM
> On Thu, Jul 07, 2011 at 07:39:05PM +1000, Aristedes Maniatis wrote:
> > We upgraded an existing system to a new motherboard/CPU and found that
> > timing in various programs is very odd. For example "top" only updates
> > every 10 seconds instead of every second. And this confirms the oddness:
> >
> > # while true; do echo `date`; sleep 1; done
> > Thu Jul 7 19:09:01 EST 2011
> > Thu Jul 7 19:09:11 EST 2011
> > Thu Jul 7 19:09:21 EST 2011
> >
> > 10 seconds instead of 1.
> >
> >
> > So I looked first at the kernel timers:
> >
> > # dmesg | grep -i time
> > Timecounter "i8254" frequency 1193182 Hz quality 0
> > Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
> > acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
> > pci3: at device 0.1 (no driver attached)
> > atrtc0: port 0x70-0x71 irq 8 on acpi0
> > acpi_hpet0: iomem 0xfed00000-0xfed003ff on acpi0
> > Timecounter "HPET" frequency 14318180 Hz quality 900
> > Timecounters tick every 1.000 msec
> >
> >
> > I switched i8254 and then to HPET. No difference.
> >
> > # sysctl -w kern.timecounter.hardware=i8254
> > kern.timecounter.hardware: ACPI-fast -> i8254
> > # while true; do echo `date`; sleep 1; done
> > Thu Jul 7 19:09:40 EST 2011
> > Thu Jul 7 19:09:41 EST 2011
> >
> > I switched to TSC:
> >
> > # sysctl -w kern.timecounter.hardware=TSC
> > kern.timecounter.hardware: HPET -> TSC
> > # while true; do echo `date`; sleep 1; done
> > Thu Jul 7 19:25:56 EST 2011
> > Thu Jul 7 19:25:57 EST 2011
> > Thu Jul 7 19:25:58 EST 2011
> >
> > Now this looks like it fixed the problem, but actually it is worse.
> > Now the clock matches what you'd expect, but there is still 10 seconds
> > in real time between those date entries. That is, now the system clock
> > is running 10 times too slow as well.
> >
> >
> > # uname -a
> > FreeBSD delish.ish.com.au 8.2-RELEASE FreeBSD 8.2-RELEASE #0: Thu Feb 17 02:41:51 UTC 2011
> >     root@mason.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64
> >
> > Base board information
> > Manufacturer: ASUSTeK Computer INC.
> > Product Name: P6X58D-E
> >
> > BIOS information
> > Vendor: American Megatrends Inc.
> > Version: 0502
> > Release Date: 11/16/2010
> > BIOS Revision: 8.15
> >
> > CPU Model: Intel(R) Core(TM) i7 CPU 960 @ 3.20GHz
>
> Do you have anything like powerd(8) enabled, or EIST / Intel SpeedStep
> technology enabled in your system BIOS? If so, can you try disabling
> powerd and/or disabling EIST/SS?
>
> Alternately, and this isn't to say FreeBSD doesn't have a problem, do
> you have a replacement/spare motherboard you can try? There's always
> the possibility that you have a bad crystal on the motherboard and a
> replacement board would rule that out.

I also suggest checking your C-state setting, hw.acpi.cpu.cx_lowest. I had
the same behavior on my notebook some time ago. It happened when I tried
to use C3, so I stayed at C2.

Kirill
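
The `date` / `sleep 1` loop quoted above can be reduced to a single slowdown number, which makes it easier to compare runs before and after changing hw.acpi.cpu.cx_lowest or the timecounter. A rough Python sketch follows. The function names are made up for this example, and it only looks at the HH:MM:SS field of each line, so it ignores the small extra time taken by `date` itself.

```python
import re

_TIME_RE = re.compile(r"\b(\d{2}):(\d{2}):(\d{2})\b")


def seconds_of_day(line):
    """Return the HH:MM:SS field of a `date` output line as seconds since midnight."""
    m = _TIME_RE.search(line)
    if m is None:
        raise ValueError("no HH:MM:SS field in %r" % line)
    hours, minutes, seconds = (int(g) for g in m.groups())
    return hours * 3600 + minutes * 60 + seconds


def slowdown_factors(date_lines, requested_sleep=1.0):
    """Ratio of observed to requested interval for each consecutive pair of lines.

    The modulo keeps the result sane if the samples straddle midnight.
    """
    stamps = [seconds_of_day(line) for line in date_lines]
    return [((b - a) % 86400) / requested_sleep for a, b in zip(stamps, stamps[1:])]


# Samples copied from the quoted message (ACPI-fast, then i8254).
acpi_fast = [
    "Thu Jul 7 19:09:01 EST 2011",
    "Thu Jul 7 19:09:11 EST 2011",
    "Thu Jul 7 19:09:21 EST 2011",
]
i8254 = [
    "Thu Jul 7 19:09:40 EST 2011",
    "Thu Jul 7 19:09:41 EST 2011",
]
print(slowdown_factors(acpi_fast))  # prints: [10.0, 10.0]
print(slowdown_factors(i8254))      # prints: [1.0]
```

This only tells the truth while the wall clock itself can be trusted. As the TSC run in the quoted message shows, once the clock slows down as well the stamps are 1 second apart again, and the intervals have to be timed against an external reference such as a watch.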