From: David Cross
Date: Fri, 8 Jul 2016 21:43:12 -0400
Subject: Re: Reproducable panic in FFS with softupdates and no journaling (10.3-RELEASE-pLATEST) FOUND IT, including reproduction steps
To: Konstantin Belousov
Cc: freebsd-stable@freebsd.org, freebsd-hackers@freebsd.org

Ok... I found it.

All of the writes go through ffs_write (including VOP_RECLAIM, so my statement that VOP_RECLAIM couldn't handle things that vinvalbuf left behind is obviously incorrect). Sometimes it worked, sometimes it panicked. I started putting more debugging into it and noticed the following. The problem file would balloc twice, as follows:

    attempting to balloc inode 18237205
    softdep_setup_allocdirect(18237205, 1, 72834400, 0, 8192, 0, 0xfffffe00f76a6d88)
    balloc at 291337600, flags: 50000
    attempting to balloc inode 18237205
    softdep_setup_allocdirect(18237205, 0, 72834448, 0, 16384, 0, 0xfffffe00f7749970)
    balloc at 291337792, flags: 7f040080
    panic: softdep_deallocate_dependencies: dangling deps

Further reading of ffs_write, to figure out why it worked sometimes and not others, pointed me at the IO_SYNC flag: if it is passed in, ffs_write dispatches to bwrite, which gives the panic; otherwise it goes to bawrite, which does not.

However, the problem is in ufs_balloc, around line 778 (which I saw in the earlier newbuf dump): there is NO call to any write method for that buffer. If we compare this to the other calls to softdep_setup_allocdirect in that function (lines 148, 264, 708, 828), we see that each of them is followed by some call to bwrite, bdwrite, or bawrite. (A number of the other calls do not make any direct calls to b*write either; I do not know nearly enough to say whether those are correct or incorrect.) I tried adding a bwrite around those lines, conditional on IO_SYNC, and I only made it panic earlier. I just don't know the semantics of this well enough.

That being said, I was finally able to isolate a set of reproduction steps that anyone can run. As it stands, it relies on a set of filesystem options that are no longer standard (but were, not that long ago), but I definitely believe the steps could be trivially modified to work on *any* UFS1/UFS2 filesystem. For that reason I am NOT including them here. I will reply individually with the exploit code and instructions to reproduce: if you want them, and you have an appropriate commit history or other credentials, I will forward them on.

Thanks, and I eagerly look forward to the patch, or to assisting where I can in development.