From owner-freebsd-stable@FreeBSD.ORG Tue Jul 17 12:38:51 2012
Return-Path:
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 1CC52106564A for ; Tue, 17 Jul 2012 12:38:51 +0000 (UTC) (envelope-from gmx@ross.cx)
Received: from www81.your-server.de (www81.your-server.de [213.133.104.81]) by mx1.freebsd.org (Postfix) with ESMTP id CCFAE8FC0C for ; Tue, 17 Jul 2012 12:38:50 +0000 (UTC)
Received: from [92.76.69.32] (helo=michael-think) by www81.your-server.de with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.74) (envelope-from ) id 1Sr73B-0008Qu-29; Tue, 17 Jul 2012 14:38:49 +0200
Content-Type: text/plain; charset=iso-8859-15; format=flowed; delsp=yes
To: freebsd-stable@freebsd.org, "Steven Hartland"
References: <6BF15045231C4F399F10BBBEA66A2CF8@multiplay.co.uk>
Date: Tue, 17 Jul 2012 14:38:43 +0200
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
From: "Michael Ross"
Message-ID:
In-Reply-To: <6BF15045231C4F399F10BBBEA66A2CF8@multiplay.co.uk>
User-Agent: Opera Mail/12.00 (Win32)
X-Authenticated-Sender: gmx@ross.cx
X-Virus-Scanned: Clear (ClamAV 0.97.3/15144/Tue Jul 17 10:52:44 2012)
Cc:
Subject: Re: 8.2 ->8.3 regression on disk writes
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 17 Jul 2012 12:38:51 -0000

On 16.07.2012, 15:34, Steven Hartland wrote:

>
> ----- Original Message ----- From: "Michael Ross"
> To:
> Sent: Monday, July 16, 2012 2:23 PM
> Subject: 8.2 ->8.3 regression on disk writes
>
>
>> Hello,
>> using 8.2 the machine runs fine,
>> using 8.3 or higher, not so much.
>> In layman's terms,
>> if I do "too many" writes/time just once, the machine can't do any
>> disk access for a couple of hours.
>> As in: What's already running stays running, no crashes or anything,
>> but as soon as I need to read from disk (login, start program not
>> cached in memory from previous run),
>> I'm all out of luck.
>> I killed the testing ftp-transfer about 15 seconds after the transfer
>> speed dropped,
>> now I'm waiting for 10+ minutes for ``top'' to start.
>> I can install ports and kernels and world fine,
>> but "ezjail-admin install" or transferring a few GB of files from
>> another machine sends it to limbo.
>> The next step would likely be to go through the kernel changes between
>> 8.2 and 8.3 to narrow it down,
>> I'd appreciate pointers as to what kernel changes to look out for,
>> or other suggestions on what to do.
>> Verbose dmesg: http://gurder.ross.cx/misc/dmesg.txt
>
> You have some strangeness there:
>
> I see:
>
> "real memory = 17179869184 (16384 MB)"
> "avail memory = 2050920448 (1955 MB)"
>
> So even though you have 16GB RAM you're only using 2GB of it, which
> will likely cause slowness under ZFS, including disabling prefetch
>
> "ZFS NOTICE: Prefetch is disabled by default if less than 4GB of
> RAM is present; to enable, add "vfs.zfs.prefetch_disable=0" to
> /boot/loader.conf."
>
> Is this a VM or something?
>
> Regards
> Steve

The machine has 2GB of RAM installed -- at least that's what the
hosting provider says and what the Debian system it comes with reports.
Not a VM.

I was mentally set for "disk trouble" and didn't even spot that,
nor did I spot the little gem above:

"WARNING: This architecture revision has known SMP hardware bugs which may cause random instability"

Kernel source has this to say:

/*
 * Opteron Rev E shows a bug as in very rare occasions a read memory
 * barrier is not performed as expected if it is followed by a
 * non-atomic read-modify-write instruction.
 * As long as that bug pops up very rarely (intensive machine usage
 * on other operating systems generally generates one unexplainable
 * crash any 2 months) and as long as a model specific fix would be
 * impratical at this stage, print out a warning string if the broken
 * model and family are identified.
 */

"Very rare" "random instability", read: Crappy hardware.

Plus, after about 1 week of running tests with 8.2, 8.3, 9.0 and 9-STABLE
kernels I managed to provoke the problem on 8.2 as well,
so I'm embarrassed now for calling it a regression where there is none.

In short, I'm giving up on this machine;
thanks for the help, sorry for the noise,

Michael