From nobody Tue Jun  8 12:48:19 2021
X-Original-To: freebsd-current@mlmmj.nyi.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
	by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 5DD815D5CA9
	for <freebsd-current@mlmmj.nyi.freebsd.org>; Tue,  8 Jun 2021 12:48:32 +0000 (UTC)
	(envelope-from wlosh@bsdimp.com)
Received: from mail-qv1-xf2c.google.com (mail-qv1-xf2c.google.com [IPv6:2607:f8b0:4864:20::f2c])
	(using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits)
	 key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256
	 client-signature RSA-PSS (2048 bits) client-digest SHA256)
	(Client CN "smtp.gmail.com", Issuer "GTS CA 1O1" (verified OK))
	by mx1.freebsd.org (Postfix) with ESMTPS id 4FzqmH4g0Qz4jJx
	for <freebsd-current@freebsd.org>; Tue,  8 Jun 2021 12:48:31 +0000 (UTC)
	(envelope-from wlosh@bsdimp.com)
Received: by mail-qv1-xf2c.google.com with SMTP id x6so5755166qvx.4
        for <freebsd-current@freebsd.org>; Tue, 08 Jun 2021 05:48:31 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=bsdimp-com.20150623.gappssmtp.com; s=20150623;
        h=mime-version:references:in-reply-to:from:date:message-id:subject:to
         :cc;
        bh=xxT8kkuHVndnQEpYQ666wygQ41S+B+tg+ywqdqv6y78=;
        b=CTIGy1C4glhK4VfDegr5QWu0rgfHsqLMKkRNbUJRjh8gbOV8qyGTD67lCN2dBIP+lN
         tYPwazJW8mESR+VFhp4BLOGQ03WJK53TdL/eLNNM1Tx5VLInuD3vpb9+jlijyV82NUrb
         nW0ammwtzj1s0ljn94lge6B1nAB9Tdd/3Z+Uk7cBcqxDwTnXK0tXaSJWtxyo3MbCXC+S
         pbp5HbllPFlzwU0+KKgO5dT6eBs4UzM4l8gOTgFWWJYLlUB/OmZva3hubf6M6w5A11Kn
         CO1HWD233em79TP4PezQNKvas6zDYLh1JhlwS4KjP/JzHPRbV+DG54fsfQUwYH6J8hfi
         hIXw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20161025;
        h=x-gm-message-state:mime-version:references:in-reply-to:from:date
         :message-id:subject:to:cc;
        bh=xxT8kkuHVndnQEpYQ666wygQ41S+B+tg+ywqdqv6y78=;
        b=WAstNQUfBmCZTg51XtFSuPVlGc7G6dzWzP5ti7pit/z/vuz4kHoBeRcvjt2TZuxFeQ
         uI7+Hef07kcjOrVy9Nsah9U12/7LSJZMxU4bpLiS1uO2taWA2PYj8gPuM/8AEqOCDiLD
         8jB4zLSGH4Z4dSPkM5p2rMXgy3+wGxdDkmjmziMHLG+E2VHz8BrTFvYjrXiAGzLzajYZ
         XtIud+eU/csyoTFArkZtpW5vGevLqBFSa3oRrVQFUfkG59K95QTqxslUOcvnOOPvmh51
         DF1Qn7QEjun4kOn2lE/xQXIVWvLx7hNtr6OkFWnvGUtOWmtp6YaYUk7Ql21r2lJwf1WH
         JSCg==
X-Gm-Message-State: AOAM531h2yvBvsM4uB08gweAL+7/quPdbdpBrOS67Hg0diMq5vW68JcI
	DUiKW7DKlQJfXRjrUJBjoso57ujKiFEG8dgThqEzPA==
X-Google-Smtp-Source: ABdhPJz6OC63PEFm1X3sTLu6KKj7tLRUt/BY2ulsabqzibhzjUUwpIBWB8UJyjtOnrd/nB4T5ADAPK9O9FUbPDM6XZY=
X-Received: by 2002:a0c:f542:: with SMTP id p2mr23132195qvm.24.1623156510388;
 Tue, 08 Jun 2021 05:48:30 -0700 (PDT)
List-Id: Discussions about the use of FreeBSD-current <freebsd-current.freebsd.org>
List-Archive: https://lists.freebsd.org/archives/freebsd-current
List-Help: <mailto:freebsd-current+help@freebsd.org>
List-Post: <mailto:freebsd-current@freebsd.org>
List-Subscribe: <mailto:freebsd-current+subscribe@freebsd.org>
List-Unsubscribe: <mailto:freebsd-current+unsubscribe@freebsd.org>
Sender: owner-freebsd-current@freebsd.org
MIME-Version: 1.0
References: <20210607090109.08ecb130@ernst.home> <YL6Hc6XmEIIx76gt@nuc>
 <20210608084646.6a7e1bc7@ernst.home> <CANCZdfpODXLP9B+AAXSd5dTS4-V_9r+Eg90z7sz18wTaE8ELZA@mail.gmail.com>
In-Reply-To: <CANCZdfpODXLP9B+AAXSd5dTS4-V_9r+Eg90z7sz18wTaE8ELZA@mail.gmail.com>
From: Warner Losh <imp@bsdimp.com>
Date: Tue, 8 Jun 2021 06:48:19 -0600
Message-ID: <CANCZdfpPjNn7+AT+OzkAGks1a52z9EX7-_49tJLLd0bweMF+aQ@mail.gmail.com>
Subject: Re: kernel panic while copying files
To: Gary Jennejohn <gljennjohn@gmail.com>
Cc: Mark Johnston <markj@freebsd.org>, FreeBSD Current <freebsd-current@freebsd.org>, 
	=?UTF-8?Q?Edward_Tomasz_Napiera=C5=82a?= <trasz@freebsd.org>
Content-Type: multipart/alternative; boundary="00000000000037a20505c44093e0"
X-Rspamd-Queue-Id: 4FzqmH4g0Qz4jJx
X-Spamd-Bar: -
Authentication-Results: mx1.freebsd.org;
	dkim=pass header.d=bsdimp-com.20150623.gappssmtp.com header.s=20150623 header.b=CTIGy1C4;
	dmarc=none;
	spf=none (mx1.freebsd.org: domain of wlosh@bsdimp.com has no SPF policy when checking 2607:f8b0:4864:20::f2c) smtp.mailfrom=wlosh@bsdimp.com
X-Spamd-Result: default: False [-1.00 / 15.00];
	 RCVD_TLS_ALL(0.00)[];
	 ARC_NA(0.00)[];
	 R_DKIM_ALLOW(-0.20)[bsdimp-com.20150623.gappssmtp.com:s=20150623];
	 NEURAL_HAM_MEDIUM(-1.00)[-1.000];
	 FROM_HAS_DN(0.00)[];
	 RCPT_COUNT_THREE(0.00)[4];
	 NEURAL_SPAM_SHORT(1.00)[0.999];
	 NEURAL_HAM_LONG(-1.00)[-1.000];
	 MIME_GOOD(-0.10)[multipart/alternative,text/plain];
	 PREVIOUSLY_DELIVERED(0.00)[freebsd-current@freebsd.org];
	 DMARC_NA(0.00)[bsdimp.com];
	 SPAMHAUS_ZRD(0.00)[2607:f8b0:4864:20::f2c:from:127.0.2.255];
	 TO_MATCH_ENVRCPT_SOME(0.00)[];
	 TO_DN_ALL(0.00)[];
	 DKIM_TRACE(0.00)[bsdimp-com.20150623.gappssmtp.com:+];
	 RCVD_IN_DNSWL_NONE(0.00)[2607:f8b0:4864:20::f2c:from];
	 R_SPF_NA(0.00)[no SPF record];
	 FREEMAIL_TO(0.00)[gmail.com];
	 FORGED_SENDER(0.30)[imp@bsdimp.com,wlosh@bsdimp.com];
	 MIME_TRACE(0.00)[0:+,1:+,2:~];
	 RBL_DBL_DONT_QUERY_IPS(0.00)[2607:f8b0:4864:20::f2c:from];
	 ASN(0.00)[asn:15169, ipnet:2607:f8b0::/32, country:US];
	 FROM_NEQ_ENVFROM(0.00)[imp@bsdimp.com,wlosh@bsdimp.com];
	 MAILMAN_DEST(0.00)[freebsd-current];
	 RCVD_COUNT_TWO(0.00)[2]
X-ThisMailContainsUnwantedMimeParts: Y

--00000000000037a20505c44093e0
Content-Type: text/plain; charset="UTF-8"

Sorry to reply to myself... had a thought as my brain rested while making
tea...

I think we may need to consider reverting (or at least not yet enabling)
the uma stuff.

On Tue, Jun 8, 2021 at 6:27 AM Warner Losh <imp@bsdimp.com> wrote:

>
>
> On Tue, Jun 8, 2021 at 2:47 AM Gary Jennejohn <gljennjohn@gmail.com>
> wrote:
>
>> On Mon, 7 Jun 2021 16:54:11 -0400
>> Mark Johnston <markj@freebsd.org> wrote:
>>
>> > On Mon, Jun 07, 2021 at 11:01:09AM +0200, Gary Jennejohn wrote:
>> > > I've seen this panic three times in the last two days:
>> > >
>> > > [first panic]
>> > > Unread portion of the kernel message buffer:
>> > >
>> > >
>> > > Fatal trap 12: page fault while in kernel mode
>> > > cpuid = 3; apic id = 03
>> > > fault virtual address   = 0x801118000
>> > > fault code              = supervisor write data, page not present
>> > > instruction pointer     = 0x20:0xffffffff808d2212
>> > > stack pointer           = 0x28:0xfffffe00dbc8c760
>> > > frame pointer           = 0x28:0xfffffe00dbc8c7a0
>> > > code segment            = base 0x0, limit 0xfffff, type 0x1b
>> > >                         = DPL 0, pres 1, long 1, def32 0, gran 1
>> > > processor eflags        = interrupt enabled, resume, IOPL = 0
>> > > current process         = 28 (dom0)
>> > > trap number             = 12
>> > > panic: page fault
>> > > cpuid = 3
>> > > time = 1622963058
>> > > KDB: stack backtrace:
>> > > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
>> 0xfffffe00dbc8c410
>> > > vpanic() at vpanic+0x181/frame 0xfffffe00dbc8c460
>> > > panic() at panic+0x43/frame 0xfffffe00dbc8c4c0
>> > > trap_fatal() at trap_fatal+0x387/frame 0xfffffe00dbc8c520
>> > > trap_pfault() at trap_pfault+0x4f/frame 0xfffffe00dbc8c580
>> > > trap() at trap+0x253/frame 0xfffffe00dbc8c690
>> > > calltrap() at calltrap+0x8/frame 0xfffffe00dbc8c690
>> > > --- trap 0xc, rip = 0xffffffff808d2212, rsp = 0xfffffe00dbc8c760, rbp
>> = 0xfffffe00dbc8c7a0 ---
>> > > zone_release() at zone_release+0x1f2/frame 0xfffffe00dbc8c7a0
>> > > bucket_drain() at bucket_drain+0xda/frame 0xfffffe00dbc8c7d0
>> > > bucket_cache_reclaim_domain() at
>> bucket_cache_reclaim_domain+0x30a/frame 0xfffffe00dbc8c830
>> > > zone_reclaim() at zone_reclaim+0x162/frame 0xfffffe00dbc8c880
>> > > uma_reclaim_domain() at uma_reclaim_domain+0xa2/frame
>> 0xfffffe00dbc8c8b0
>> > > vm_pageout_worker() at vm_pageout_worker+0x41e/frame
>> 0xfffffe00dbc8cb70
>> > > vm_pageout() at vm_pageout+0x21e/frame 0xfffffe00dbc8cbb0
>> > > fork_exit() at fork_exit+0x7e/frame 0xfffffe00dbc8cbf0
>> > > fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00dbc8cbf0
>> > > --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
>> > > KDB: enter: panic
>> > >
>> > > __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
>> > > 55       __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct
>> pcpu,
>> > >        pc_curthread)));
>> > >
>> > > One difference was that in the second and third panics the fault
>> virtual
>> > > address was 0x0.  But the backtrace was the same.
>> > >
>> > > Relevant info from the info.x files:
>> > > Architecture: amd64
>> > > Architecture Version: 2
>> > > Version String: FreeBSD 14.0-CURRENT #33 main-n247184-1970d693039:
>> Sat Jun
>> > > 5 09:58:55 CEST 2021
>> > >
>> > > CPU: AMD Ryzen 5 1600 Six-Core Processor             (3194.09-MHz
>> K8-class CPU)
>> > >   Origin="AuthenticAMD"  Id=0x800f11  Family=0x17  Model=0x1
>> Stepping=1
>> > >   AMD Features=0x2e500800<SYSCALL,NX,MMX+,FFXSR,Page1GB,RDTSCP,LM>
>> > >   AMD
>> Features2=0x35c233ff<LAHF,CMP,SVM,ExtAPIC,CR8,ABM,SSE4A,MAS,Prefetch,OSVW,SKINIT,WDT,TCE,Topology,PCXC,PNXC,DBE,PL2I,MWAITX>
>> > >   AMD Extended Feature Extensions ID
>> EBX=0x1007<CLZERO,IRPerf,XSaveErPtr,IBPB>
>> > >
>> > > I have 16GiB of memory in the box.
>> > >
>> > > The panic occurred while copying files from an internal SATA SSD to a
>> > > SATA 8TB disk in an external USB3 docking station.  The panic seems to
>> > > occur quite quickly, after only a few files have been copied.
>> > >
>> > > swap is on a different internal disk.
>> > >
>> > > I can poke around in the crash dumps with kgdb if anyone wants more
>> > > information.
>> >
>> > Are you running with invariants configured in the kernel?  If not,
>> > please try to reproduce this in a kernel with
>> >
>> > options INVARIANT_SUPPORT
>> > options INVARIANTS
>> >
>> > configured.
>> >
>> > A stack trace with line numbers would also be helpful.
>>
>> Thanks for the hint.  After enabling INVARIANTS the kernel panics as
>> soon I turn on the external USB3 disk.  No user disk access required.
>>
>> Version String: FreeBSD 14.0-CURRENT #34 main-n247239-f570a6723e1: Tue Jun
>> 8 09:34:32 CEST 2021
>>
>> Here the kgdb backtrace:
>>
>> Unread portion of the kernel message buffer:
>> panic: Duplicate free of 0xfffff800356b9000 from zone
>> 0xfffffe00dcbdd800(da_ccb) slab 0xfffff800356b9fd8(0)
>> cpuid = 8
>> time = 1623140519
>> KDB: stack backtrace:
>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
>> 0xfffffe00c5f398c0
>> vpanic() at vpanic+0x181/frame 0xfffffe00c5f39910
>> panic() at panic+0x43/frame 0xfffffe00c5f39970
>> uma_dbg_free() at uma_dbg_free+0x1e1/frame 0xfffffe00c5f399b0
>> uma_zfree_arg() at uma_zfree_arg+0x147/frame 0xfffffe00c5f39a00
>> camperiphdone() at camperiphdone+0x1b7/frame 0xfffffe00c5f39b20
>> xpt_done_process() at xpt_done_process+0x3dd/frame 0xfffffe00c5f39b60
>> xpt_done_td() at xpt_done_td+0xf5/frame 0xfffffe00c5f39bb0
>> fork_exit() at fork_exit+0x80/frame 0xfffffe00c5f39bf0
>> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00c5f39bf0
>> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
>> KDB: enter: panic
>>
>> __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
>> 55              __asm("movq %%gs:%P1,%0" : "=r" (td) : "n"
>> (offsetof(struct pcpu,
>> (kgdb) bt
>> #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
>> #1  doadump (textdump=textdump@entry=0)
>>     at /usr/src/sys/kern/kern_shutdown.c:399
>> #2  0xffffffff8040c39a in db_dump (dummy=<optimized out>,
>>     dummy2=<unavailable>, dummy3=<unavailable>, dummy4=<unavailable>)
>>     at /usr/src/sys/ddb/db_command.c:575
>> #3  0xffffffff8040c192 in db_command (last_cmdp=<optimized out>,
>>     cmd_table=<optimized out>, dopager=dopager@entry=1)
>>     at /usr/src/sys/ddb/db_command.c:482
>> #4  0xffffffff8040beed in db_command_loop ()
>>     at /usr/src/sys/ddb/db_command.c:535
>> #5  0xffffffff8040f616 in db_trap (type=<optimized out>, code=<optimized
>> out>)
>>     at /usr/src/sys/ddb/db_main.c:270
>> #6  0xffffffff8066b1c4 in kdb_trap (type=type@entry=3, code=code@entry=0,
>>     tf=<optimized out>, tf@entry=0xfffffe00c5f397f0)
>>     at /usr/src/sys/kern/subr_kdb.c:727
>> #7  0xffffffff809a4e96 in trap (frame=0xfffffe00c5f397f0)
>>     at /usr/src/sys/amd64/amd64/trap.c:604
>> #8  <signal handler called>
>> #9  kdb_enter (why=0xffffffff80a61a23 "panic", msg=<optimized out>)
>>     at /usr/src/sys/kern/subr_kdb.c:506
>> #10 0xffffffff806207a2 in vpanic (fmt=<optimized out>, ap=<optimized out>,
>>     ap@entry=0xfffffe00c5f39950) at /usr/src/sys/kern/kern_shutdown.c:907
>> #11 0xffffffff80620533 in panic (
>>     fmt=0xffffffff80d635c8 <cnputs_mtx> ".\024\244\200\377\377\377\377")
>>     at /usr/src/sys/kern/kern_shutdown.c:843
>> #12 0xffffffff808e12b1 in uma_dbg_free (zone=0xfffffe00dcbdd800,
>>     slab=0xfffff800356b9fd8, item=0xfffff800356b9000)
>>     at /usr/src/sys/vm/uma_core.c:5664
>> #13 0xffffffff808d9de7 in item_dtor (zone=0xfffffe00dcbdd800,
>>     item=0xfffff800356b9000, size=544, udata=0x0, skip=SKIP_NONE)
>>     at /usr/src/sys/vm/uma_core.c:3418
>> #14 uma_zfree_arg (zone=0xfffffe00dcbdd800, item=0xfffff800356b9000,
>>     udata=udata@entry=0x0) at /usr/src/sys/vm/uma_core.c:4374
>> #15 0xffffffff802da503 in uma_zfree (zone=0xffffffff80d635c8 <cnputs_mtx>,
>>     item=0x200) at /usr/src/sys/vm/uma.h:404
>>
>
> OK. This is a bad stack trace. camperiphdone doesn't call uma_zfree()...
> It does call
> xpt_free_ccb, though, and that's likely what's going wrong. And that
> matches the line
> numbers. Most likely this is llvm's tail call optimizations...  Can you
> compile the kernel
> either -O0 or with -fno-optimize-sibling-calls? That will give a better
> call stack.
>
> However, it's likely the new UMA stuff trasz committed (or it's providing
> better
> diagnostics than the old malloc based code which seems more likely) that
> can be
> disabled by the tunable kern.cam.da.enable_uma_ccbs=0.
>
> The lines in question:
>         saved_ccb = (union ccb *)done_ccb->ccb_h.saved_ccb_ptr;
>         bcopy(saved_ccb, done_ccb, sizeof(*done_ccb));
>         xpt_free_ccb(saved_ccb);
>
> So we overwrite the done_ccb with the saved_ccb's contents and then free
> the saved ccb.
> That's likely OKish, though.
>
> We copy entire CCBs around in this code a lot, and I've not traced through
> it. But we're
> sending a scsi start unit in response to some error that is being reported
> via cam_periph_error()
>

So if we're copying entire CCBs around, and the way they were allocated
varies, then
we'll copy over the 'how to free' flags as well, and that might cause us to
free using
the wrong routines leading to a situation like this.... I've not puzzled
through all the
paths, because there seem to be other places where we copy entire CCBs in
this
recovery code, but I think that needs to have a good hard look.

Warner


> #16 0xffffffff802d9117 in camperiphdone (periph=0xfffff800061e2c00,
>>     done_ccb=0xfffff800355d6cc0) at /usr/src/sys/cam/cam_periph.c:1427
>> #17 0xffffffff802dfebd in xpt_done_process (ccb_h=0xfffff800355d6cc0)
>>     at /usr/src/sys/cam/cam_xpt.c:5491
>> #18 0xffffffff802e1ec5 in xpt_done_td (
>>     arg=arg@entry=0xffffffff80d33d80 <cam_doneqs>)
>>     at /usr/src/sys/cam/cam_xpt.c:5546
>> #19 0xffffffff805dad80 in fork_exit (callout=0xffffffff802e1dd0
>> <xpt_done_td>,
>>     arg=0xffffffff80d33d80 <cam_doneqs>, frame=0xfffffe00c5f39c00)
>>     at /usr/src/sys/kern/kern_fork.c:1083
>> #20 <signal handler called>
>>
>> Apparently caused by recent changes to CAM.
>>
>> Let me know if you want more information.
>>
>
> what's *periph say?
>
> camperiphdone is only use when we send a scsi start/stop is sent to try to
> recover
> in a couple of cases:
> switch(ccb->csio.scsi_status) { ...
>         case SCSI_STATUS_CMD_TERMINATED:
>         case SCSI_STATUS_CHECK_COND:
>                 error = camperiphscsisenseerror(ccb, orig_ccb,
> ...
> is the path. It might be profitable to use the debugger to trap here to
> see where we're
> kicking off this error recovery process.
>
> Warner
>

--00000000000037a20505c44093e0--