From owner-freebsd-stable@FreeBSD.ORG Tue Nov 12 02:40:15 2013
Return-Path:
Delivered-To: freebsd-stable@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1])
    (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
    (No client certificate requested)
    by hub.freebsd.org (Postfix) with ESMTPS id D9368A49;
    Tue, 12 Nov 2013 02:40:14 +0000 (UTC)
Received: from smtp1.multiplay.co.uk (smtp1.multiplay.co.uk [85.236.96.35])
    by mx1.freebsd.org (Postfix) with ESMTP id 6FC3837FB;
    Tue, 12 Nov 2013 02:40:14 +0000 (UTC)
Received: by smtp1.multiplay.co.uk (Postfix, from userid 65534)
    id B761220E7088A; Tue, 12 Nov 2013 02:32:30 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on smtp1.multiplay.co.uk
X-Spam-Level:
X-Spam-Status: No, score=-2.3 required=8.0 tests=ALL_TRUSTED,AWL,BAYES_00,
    STOX_REPLY_TYPE autolearn=no version=3.3.1
Received: from r2d2 (82-69-141-170.dsl.in-addr.zen.co.uk [82.69.141.170])
    by smtp1.multiplay.co.uk (Postfix) with ESMTPA id ED78220E70885;
    Tue, 12 Nov 2013 02:32:28 +0000 (UTC)
Message-ID: <4EB902F80CE84DD2BF36C85EF4CE8EF8@multiplay.co.uk>
From: "Steven Hartland"
To: , , "Andriy Gapon"
References: <20967.760.95825.310085@gargle.gargle.HOWL><51E80B30.1090004@FreeBSD.org><20968.10645.880772.30501@gargle.gargle.HOWL><520202E5.30300@FreeBSD.org><20994.55913.93606.436124@gargle.gargle.HOWL>
    <21111.12085.958991.356982@gargle.gargle.HOWL>
Subject: Re: Help with filing a [maybe] ZFS/mmap bug.
Date: Tue, 12 Nov 2013 02:32:25 -0000
MIME-Version: 1.0
Content-Type: text/plain; format=flowed; charset="iso-8859-1"; reply-type=original
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 6.00.2900.5931
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.6157
Cc: re@freebsd.org
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.16
Precedence: list
List-Id: Production branch of FreeBSD source code
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 12 Nov 2013 02:40:15 -0000

----- Original Message -----
From: "George Hartzell"

> > > Andriy Gapon writes:
> > > > on 18/07/2013 20:44 George Hartzell said the following:
> > > > > Andriy Gapon writes:
> > > > > > on 17/07/2013 23:47 George Hartzell said the following:
> > > > > > > How should I move forward with this?
> > > > > >
> > > > > > Could you please try to reproduce this problem using a kernel built with
> > > > > > INVARIANTS options?
> > > > >
> > > > > I added INVARIANT_SUPPORT and INVARIANTS options to the GENERIC
> > > > > kernel, rebuilt it, installed it and running through my "test case"
> > > > > generated a lot of invalid flac files. I'm not sure what the options
> > > > > are/were supposed to do though, it looks like they generally lead to
> > > > > KASSERTS, which lead to abort()'s. Nothing in /var/log/messages or on
> > > > > the console.
> > > >
> > > > George,
> > > >
> > > > do you have anything new on this issue?
> > >
> > > Since the message that you quoted I narrowed down my "test case"
> > > somewhat but I have not yet produced a stand-alone tool that
> > > reproduces it (you still have to go through picard et al.).
> > >
> > > > Could you please try the following patch?
> > > > http://people.freebsd.org/~avg/zfs-putpages.diff
> > > >
> > > > I expect it to not really fix the issue, but it may help to narrow it down.
> > > > Please keep INVARIANTS.
> > >
> > > Absolutely.
> > > Probably not until the weekend, but I'll give it a go.
> > >
> > > Thanks for following up.
> >
> > Did you manage to make any progress with this?
> >
> > We're seeing a problem where rrdcached corrupts rrd files and remembering
> > this thread and knowing it uses mmap and we're on ZFS I was wondering
> > if this may be the cause for this issue too.
> >
> > I've just recompiled rrdtool without mmap support and am clearing down
> > all corrupted files but it would be good to know if any progress was
> > made on this?
> >
> > Regards
> > Steve
>
> I was able to recreate the problem on a 10-BETA-something-or-other
> recently (I'd only been using 9 up until then). Andriy's patches
> didn't make a difference. I haven't heard anything since reporting
> back to him.

I've pretty much confirmed that mmap support is causing the corruption
when running rrdcached: since rebuilding with mmap disabled I've had
no further corruption.

@George when you got corruption what did the files look like? I ask
because here I see lots of zeros, as though the file size was correct
but the contents were pretty much blanked.

@avg what was your thinking on what may be the issue here?

If this is an mmap bug in ZFS it's a pretty serious one, given the
amount of silent corruption you can get.

@re Although reported incidents appear to be rare, as it's silent data
corruption users may be blissfully unaware it's happening.

Given that, my gut feeling is this is serious enough that we need to
get something in place before the 10 release, even if that is just
making ZFS report ENOTSUP for mmap calls. Would you agree?

    Regards
    Steve