From owner-freebsd-hackers@freebsd.org  Tue Jun 11 21:47:02 2019
MIME-Version: 1.0
References: <CAOtMX2gPHy1GWkLyOm5sF=e0zgnj0UEKijFbOnPk6sRo9K4Yew@mail.gmail.com>
 <20190611203018.GC75280@kib.kiev.ua>
In-Reply-To: <20190611203018.GC75280@kib.kiev.ua>
From: Alan Somers <asomers@freebsd.org>
Date: Tue, 11 Jun 2019 15:46:42 -0600
Message-ID: <CAOtMX2gQKz+w+kTO5zAk32S3Xz7O68c=Fd9nth+AHzDy-_JL1w@mail.gmail.com>
Subject: Re: panic: vm_fault_hold: fault on nofault entry in fusefs
To: Konstantin Belousov <kostikbel@gmail.com>
Cc: FreeBSD Hackers <freebsd-hackers@freebsd.org>
Content-Type: text/plain; charset="UTF-8"
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>

On Tue, Jun 11, 2019 at 2:30 PM Konstantin Belousov <kostikbel@gmail.com> wrote:
>
> On Tue, Jun 11, 2019 at 02:12:22PM -0600, Alan Somers wrote:
> > Can somebody please help me to debug a fusefs problem?  I have a 100%
> > reproducible panic with the above message.  Evidently there's
> > something I don't know about buf(9) and uiomove(9).  The good news is
> > that the panic is sufficiently reproducible and sufficiently
> > instrumented that I know exactly what's happening; I just don't know
> > why.  Here's a summary of what happens.
> >
> > 1) fusefs's VOP_WRITE method gets called with a buffer that spans a
> > logical block boundary, but does not extend the size of the file.
> > 2) It splits the write into two parts.  Each one calls getblk to
> > allocate a struct buf, fills in the old data with a read, and fills
> > the new data with uiomove.
> > 3) After the file gets close()ed, VOP_INACTIVE calls vn_fsync_buf to
> > flush dirty buffers.
> > 4) VOP_STRATEGY successfully writes the first buffer and frees it with
> > bufdone().
> > 5) VOP_STRATEGY tries to write the second buffer, but panics during
> > uiomove.  The address that caused the panic is always exactly 4KB into
> > the buffer.
> >
> > So what am I doing wrong?  The address that causes the panic in step 5
> > was successfully accessed in step 2, so this isn't some kind of buffer
> > overrun.  Does it have something to do with the fact that the read
> > operation in step 2 called bufdone()?  Seems unlikely because it did
> > that for both buffers, yet only the second one panics.  Or does the
> > address actually fault during both VOP_WRITE and VOP_STRATEGY, but
> > something low down handles the fault in the first case?  I'd be
> > grateful for any help that anyone can offer.
> > -Alan
> >
> > P.S.
> > Here's the panic's stack
> > panic: vm_fault_hold: fault on nofault entry, addr: 0xfffffe0004591000
> > cpuid = 1
> > time = 1560283621
> > KDB: stack backtrace:
> > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0031c21f80
> > vpanic() at vpanic+0x19d/frame 0xfffffe0031c21fd0
> > panic() at panic+0x43/frame 0xfffffe0031c22030
> > vm_fault_hold() at vm_fault_hold+0x2064/frame 0xfffffe0031c22170
> > vm_fault() at vm_fault+0x60/frame 0xfffffe0031c221b0
> > trap_pfault() at trap_pfault+0x188/frame 0xfffffe0031c22200
> > trap() at trap+0x2b4/frame 0xfffffe0031c22310
> > calltrap() at calltrap+0x8/frame 0xfffffe0031c22310
> > --- trap 0xc, rip = 0xffffffff8108c9e6, rsp = 0xfffffe0031c223e0, rbp
> > = 0xfffffe0031c223e0 ---
> > memmove_erms() at memmove_erms+0x116/frame 0xfffffe0031c223e0
> > uiomove_faultflag() at uiomove_faultflag+0x146/frame 0xfffffe0031c22420
> > fuse_write_directbackend() at fuse_write_directbackend+0x1cd/frame
> > 0xfffffe0031c224f0
> > fuse_io_strategy() at fuse_io_strategy+0x24d/frame 0xfffffe0031c22590
> > fuse_vnop_strategy() at fuse_vnop_strategy+0x2a/frame 0xfffffe0031c225a0
> > VOP_STRATEGY_APV() at VOP_STRATEGY_APV+0x63/frame 0xfffffe0031c225c0
> > bufstrategy() at bufstrategy+0x44/frame 0xfffffe0031c225f0
> > bufwrite() at bufwrite+0x259/frame 0xfffffe0031c22640
> > vn_fsync_buf() at vn_fsync_buf+0x23e/frame 0xfffffe0031c226a0
> > fuse_vnop_inactive() at fuse_vnop_inactive+0x7e/frame 0xfffffe0031c226e0
> > VOP_INACTIVE_APV() at VOP_INACTIVE_APV+0x63/frame 0xfffffe0031c22700
> > vinactive() at vinactive+0xcd/frame 0xfffffe0031c22750
> > vputx() at vputx+0x2d0/frame 0xfffffe0031c227b0
> > vn_close1() at vn_close1+0x116/frame 0xfffffe0031c22820
> > vn_closefile() at vn_closefile+0x4c/frame 0xfffffe0031c228a0
> > _fdrop() at _fdrop+0x1a/frame 0xfffffe0031c228c0
> > closef() at closef+0x1ec/frame 0xfffffe0031c22950
> > closefp() at closefp+0x9c/frame 0xfffffe0031c22990
> > amd64_syscall() at amd64_syscall+0x276/frame 0xfffffe0031c22ab0
> > fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe0031c22ab0
> > --- syscall (6, FreeBSD ELF64, sys_close), rip = 0x8006842ba, rsp =
> > 0x7fffffffe748, rbp = 0x7fffffffe760 ---
> > KDB: enter: panic
> Start with dumping core.  Then print out the struct buf and show it.

Thanks for the tip.  I think I've figured it out: after VOP_WRITE but
before VOP_INACTIVE, a VOP_SETATTR was truncating the file.  And a
legacy of fuse_io.c's origins as a copy/paste of the NFS client is
that it has two different ways to track the valid region of a buf.
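
To make the failure mode concrete, here is a minimal userland sketch,
not the actual fuse_io.c code.  It assumes a toy "struct blk" with two
independent length fields; the names struct blk, dirty_end, valid_len,
split_write, truncate_blk and flush_len are all hypothetical and exist
only for illustration.  It splits a write at logical block boundaries,
then applies a truncation that updates only one of the two fields.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a cached block; NOT the kernel's struct buf. */
struct blk {
	long	lbn;		/* logical block number */
	size_t	dirty_end;	/* bookkeeping #1: end of dirty bytes in block */
	size_t	valid_len;	/* bookkeeping #2: bytes still backed by the file */
};

/*
 * Split the byte range [off, off + len) into per-block pieces.
 * Returns the number of pieces stored in out[] (at most max).
 * Simplification: both fields start out equal to the end of the write.
 */
static int
split_write(long off, size_t len, size_t bsize, struct blk *out, int max)
{
	int n = 0;

	assert(bsize > 0);
	while (len > 0 && n < max) {
		size_t boff = (size_t)(off % (long)bsize);	/* offset in block */
		size_t chunk = bsize - boff;			/* room left in block */

		if (chunk > len)
			chunk = len;
		out[n].lbn = off / (long)bsize;
		out[n].dirty_end = boff + chunk;
		out[n].valid_len = boff + chunk;
		n++;
		off += (long)chunk;
		len -= chunk;
	}
	return (n);
}

/*
 * Model a truncation to new_size that adjusts only valid_len.
 * dirty_end is deliberately left untouched, so the two fields can disagree.
 */
static void
truncate_blk(struct blk *b, long new_size, size_t bsize)
{
	long start = b->lbn * (long)bsize;

	if (new_size <= start)
		b->valid_len = 0;
	else if ((size_t)(new_size - start) < b->valid_len)
		b->valid_len = (size_t)(new_size - start);
}

/* A flush that trusts neither field alone writes the smaller of the two. */
static size_t
flush_len(const struct blk *b)
{
	return (b->dirty_end < b->valid_len ? b->dirty_end : b->valid_len);
}
```

With an assumed 8192-byte block size, split_write(6144, 4096, 8192, ...)
yields two pieces: block 0 with dirty_end 8192 and block 1 with
dirty_end 2048.  After truncate_blk() on block 1 with a new size of
9216, valid_len drops to 1024 while dirty_end still says 2048.  A flush
that consults only dirty_end would touch bytes the other field no
longer considers valid, which is the kind of disagreement described
above.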
-Alan