Message-Id: <201409162244.s8GMibax038105@gw.catspoiler.org>
Date: Tue, 16 Sep 2014 15:44:37 -0700 (PDT)
From: Don Lewis
Subject: Re: zpool: multiple IDs, CURRENT drops all pools after reboot
To: ohartman@zedat.fu-berlin.de
In-Reply-To: <20140917003433.47f4318b.ohartman@zedat.fu-berlin.de>
Cc: freebsd-current@FreeBSD.org, killing@multiplay.co.uk

On 17 Sep, O. Hartmann wrote:
> On Tue, 16 Sep 2014 22:06:36 +0100
> "Steven Hartland" wrote:

>> All that said, you shouldn't end up with corrupt data no matter
>> what.
>>
>> Are there any other symptoms? Has memory been checked for
>> faults etc?
>>
>> Regards
>> Steve
>
> The reason my desktop has only 4 GB left is that I discovered memory
> corruption when it was equipped with 8 GB - a strange bit flip
> occurred. I cannot be sure that removing 4 GB (2 times 2 GB, it is an
> old C2D/P45 based box) has made the problem go away. I suspect a dying
> chipset - when it overheats (at the moment the BIOS shows 80 degrees
> Celsius), the problem is more frequent.

You might want to try relaxing the memory timing in the BIOS. In
particular, try adding some cycles to the CAS latency. I've had a couple
of systems with ECC RAM that showed excessive ECC errors at the default
RAM timing and got much better when I increased the CAS latency.

You might also want to try rigging an extra fan to blow on the RAM
and/or chipset.