In-Reply-To: <20190611203018.GC75280@kib.kiev.ua>
From: Alan Somers
Date: Tue, 11 Jun 2019
 15:46:42 -0600
Subject: Re: panic: vm_fault_hold: fault on nofault entry in fusefs
To: Konstantin Belousov
Cc: FreeBSD Hackers

On Tue, Jun 11, 2019 at 2:30 PM Konstantin Belousov wrote:
>
> On Tue, Jun 11, 2019 at 02:12:22PM -0600, Alan Somers wrote:
> > Can somebody please help me to debug a fusefs problem?  I have a 100%
> > reproducible panic with the above message.  Evidently there's
> > something I don't know about buf(9) and uiomove(9).  The good news is
> > that the panic is sufficiently reproducible and sufficiently
> > instrumented that I know exactly what's happening; I just don't know
> > why.  Here's a summary of what happens.
> >
> > 1) fusefs's VOP_WRITE method gets called with a buffer that spans a
> > logical block boundary, but does not extend the size of the file.
> > 2) It splits the write into two parts.  Each one calls getblk to
> > allocate a struct buf, fills in the old data with a read, and fills
> > the new data with uiomove.
> > 3) After the file gets close()ed, VOP_INACTIVE calls vn_fsync_buf to
> > flush dirty buffers.
> > 4) VOP_STRATEGY successfully writes the first buffer and frees it with
> > bufdone().
> > 5) VOP_STRATEGY tries to write the second buffer, but panics during
> > uiomove.  The address that caused the panic is always exactly 4KB into
> > the buffer.
> >
> > So what am I doing wrong?  The address that causes the panic in step 5
> > was successfully accessed in step 2, so this isn't some kind of buffer
> > overrun.  Does it have something to do with the fact that the read
> > operation in step 2 called bufdone()?  Seems unlikely because it did
> > that for both buffers, yet only the second one panics.  Or does the
> > address actually fault during both VOP_WRITE and VOP_STRATEGY, but
> > something low down handles the fault in the first case?  I'd be
> > grateful for any help that anyone can offer.
> > -Alan
> >
> > P.S.
> > Here's the panic's stack
> > panic: vm_fault_hold: fault on nofault entry, addr: 0xfffffe0004591000
> > cpuid = 1
> > time = 1560283621
> > KDB: stack backtrace:
> > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0031c21f80
> > vpanic() at vpanic+0x19d/frame 0xfffffe0031c21fd0
> > panic() at panic+0x43/frame 0xfffffe0031c22030
> > vm_fault_hold() at vm_fault_hold+0x2064/frame 0xfffffe0031c22170
> > vm_fault() at vm_fault+0x60/frame 0xfffffe0031c221b0
> > trap_pfault() at trap_pfault+0x188/frame 0xfffffe0031c22200
> > trap() at trap+0x2b4/frame 0xfffffe0031c22310
> > calltrap() at calltrap+0x8/frame 0xfffffe0031c22310
> > --- trap 0xc, rip = 0xffffffff8108c9e6, rsp = 0xfffffe0031c223e0, rbp = 0xfffffe0031c223e0 ---
> > memmove_erms() at memmove_erms+0x116/frame 0xfffffe0031c223e0
> > uiomove_faultflag() at uiomove_faultflag+0x146/frame 0xfffffe0031c22420
> > fuse_write_directbackend() at fuse_write_directbackend+0x1cd/frame 0xfffffe0031c224f0
> > fuse_io_strategy() at fuse_io_strategy+0x24d/frame 0xfffffe0031c22590
> > fuse_vnop_strategy() at fuse_vnop_strategy+0x2a/frame 0xfffffe0031c225a0
> > VOP_STRATEGY_APV() at VOP_STRATEGY_APV+0x63/frame 0xfffffe0031c225c0
> > bufstrategy() at bufstrategy+0x44/frame 0xfffffe0031c225f0
> > bufwrite() at bufwrite+0x259/frame 0xfffffe0031c22640
> > vn_fsync_buf() at vn_fsync_buf+0x23e/frame 0xfffffe0031c226a0
> > fuse_vnop_inactive() at fuse_vnop_inactive+0x7e/frame 0xfffffe0031c226e0
> > VOP_INACTIVE_APV() at VOP_INACTIVE_APV+0x63/frame 0xfffffe0031c22700
> > vinactive() at vinactive+0xcd/frame 0xfffffe0031c22750
> > vputx() at vputx+0x2d0/frame 0xfffffe0031c227b0
> > vn_close1() at vn_close1+0x116/frame 0xfffffe0031c22820
> > vn_closefile() at vn_closefile+0x4c/frame 0xfffffe0031c228a0
> > _fdrop() at _fdrop+0x1a/frame 0xfffffe0031c228c0
> > closef() at closef+0x1ec/frame 0xfffffe0031c22950
> > closefp() at closefp+0x9c/frame 0xfffffe0031c22990
> > amd64_syscall() at amd64_syscall+0x276/frame 0xfffffe0031c22ab0
> > fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe0031c22ab0
> > --- syscall (6, FreeBSD ELF64, sys_close), rip = 0x8006842ba, rsp = 0x7fffffffe748, rbp = 0x7fffffffe760 ---
> > KDB: enter: panic
> Start with dumping core.  Then print out the struct buf and show it.

Thanks for the tip.  I think I've figured it out: after VOP_WRITE but
before VOP_INACTIVE, a VOP_SETATTR was truncating the file.  And a
legacy of fuse_io.c's origins as a copy/paste of the NFS client is that
it has two different ways to track the valid region of a buf, so the
truncation left them disagreeing about how much of the buffer could
still be written.
-Alan
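
For readers following along, the failure mode described above can be modeled in user space. The sketch below is not code from fuse_io.c and does not use FreeBSD's struct buf. Every name in it (toy_buf, backed_len, dirty_end, and the toy_* functions) is invented for illustration, and the 8192/4096 byte sizes are assumptions chosen to echo the "exactly 4KB into the buffer" symptom. It only shows how two independent trackers of a buffer's valid region can disagree after a truncate, so that a later flush tries to copy past the memory that is still backed.

```c
#include <stddef.h>

/* Hypothetical stand-in for a buffer; NOT FreeBSD's struct buf. */
struct toy_buf {
    size_t backed_len; /* tracker A: bytes still backed by memory */
    size_t dirty_end;  /* tracker B: end of the byte range to flush */
};

/* Write path: a write ending at offset `end` updates both trackers. */
static void toy_write(struct toy_buf *bp, size_t end)
{
    if (end > bp->backed_len)
        bp->backed_len = end;
    if (end > bp->dirty_end)
        bp->dirty_end = end;
}

/* Buggy truncate: shrinks tracker A only and leaves tracker B stale. */
static void toy_truncate_buggy(struct toy_buf *bp, size_t new_len)
{
    if (new_len < bp->backed_len)
        bp->backed_len = new_len;
}

/* Fixed truncate: shrinks tracker A, then clamps tracker B to match. */
static void toy_truncate_fixed(struct toy_buf *bp, size_t new_len)
{
    toy_truncate_buggy(bp, new_len);
    if (bp->dirty_end > bp->backed_len)
        bp->dirty_end = bp->backed_len;
}

/*
 * Flush path: report how many bytes the copy would read beyond the
 * backed region. Zero means the flush is safe; in a kernel, a nonzero
 * overrun would correspond to touching an unmapped address.
 */
static size_t toy_flush_overrun(const struct toy_buf *bp)
{
    if (bp->dirty_end > bp->backed_len)
        return bp->dirty_end - bp->backed_len;
    return 0;
}
```

With an 8192-byte write followed by a truncate to 4096 bytes, the buggy variant reports a 4096-byte overrun at flush time, while the fixed variant reports none. Whether the real fix clamps the stale range or removes one of the two trackers is not stated in this thread.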