From: Warner Losh
Date: Wed, 1 Dec 2021 13:56:33 -0700
Subject: Re: ZFS deadlocks triggered by HDD timeouts
To: Alan Somers
Cc: FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-stable

On Wed, Dec 1, 2021 at 1:47 PM Alan Somers wrote:

> On Wed, Dec 1, 2021 at 1:37 PM Warner Losh wrote:
> >
> > On Wed, Dec 1, 2021 at 1:28 PM Alan Somers wrote:
> >>
> >> On Wed, Dec 1, 2021 at 11:25 AM Warner Losh wrote:
> >> >
> >> > On Wed, Dec 1, 2021, 11:16 AM Alan Somers wrote:
> >>
> >> >> On a stable/13 build from 16-Sep-2021 I see frequent ZFS deadlocks
> >> >> triggered by HDD timeouts. The timeouts are probably caused by
> >> >> genuine hardware faults, but they didn't lead to deadlocks in
> >> >> 12.2-RELEASE or 13.0-RELEASE. Unfortunately I don't have much
> >> >> additional information. ZFS's stack traces aren't very informative,
> >> >> and dmesg doesn't show anything besides the usual information about
> >> >> the disk timeout. I don't see anything obviously related in the
> >> >> commit history for that time range, either.
> >> >>
> >> >> Has anybody else observed this phenomenon? Or does anybody have a
> >> >> good way to deliberately inject timeouts? CAM makes it easy enough
> >> >> to inject an error, but not a timeout. If it did, then I could
> >> >> bisect the problem. As it is I can only reproduce it on production
> >> >> servers.
> >> >
> >> > What SIM? Timeouts are tricky because they have many sources, some
> >> > of which are nonlocal...
> >> >
> >> > Warner
> >>
> >> mpr(4)
> >
> > Is this just a single drive that's acting up, or is the controller
> > initialized as part of the error recovery?
>
> I'm not doing anything fancy with mprutil or sas3flash, if that's what
> you're asking.

No. I'm asking if you've enabled debugging on the recovery messages and
see that we enter any kind of controller reset when the timeouts occur.

> > If a single drive, are there multiple timeouts that happen at the same
> > time such that we time out a request while we're waiting for the abort
> > command we send to the firmware to be acknowledged?
>
> I don't know.

OK.

> > Would you be able to run a kgdb script to see if you're hitting a
> > situation that I fixed in mpr that would cause I/O to never complete
> > in this rather odd circumstance? If you can, and if it is, then
> > there's a change I can MFC :).
>
> Possibly. When would I run this kgdb script? Before ZFS locks up,
> after, or while the problematic timeout happens?

After the timeouts. I've been doing 'kgdb' followed by 'source
mpr-hang.gdb' to run this. What you are looking for is anything with a
qfrozen_cnt > 0. The script is imperfect and racy with normal operations
(but not in a bad way), so you may need to run it a couple of times to
get consistent data. On my systems, there'd be one or two devices with a
frozen count > 1, no I/O happened on those drives, and processes hung.
That might not be any different than a deadlock :)

Warner

P.S. here's the mpr-hang.gdb script.
Not sure if I can make an attachment survive the mailing lists :)

# Dump the components of a struct cam_path.
define cam_path
    set $path = (struct cam_path *)$arg0
    printf " Periph: %p\n", $path->periph
    printf " Bus: %p\n", $path->bus
    printf " Target: %p\n", $path->target
    printf " Device: %p\n", $path->device
end

# Dump one cam_periph, including the qfrozen_cnt we're hunting for.
define periph
    set $periph = (struct cam_periph *)$arg0
    printf "%s%d:\n", $periph->periph_name, $periph->unit_number
    printf "softc: %p\n", $periph->softc
    printf "sim: %p\n", $periph->sim
    printf "flags: 0x%x\n", $periph->flags
    cam_path $periph->path
    printf "priority: sched %d immed %d\n", $periph->scheduled_priority, $periph->immediate_priority
    printf "allocated %d allocating %d\n", $periph->periph_allocated, $periph->periph_allocating
    printf "refcount: %d\n", $periph->refcount
    printf "qfrozen_cnt: %d\n", $periph->path->device->ccbq.queue.qfrozen_cnt
end

# Walk all units of one periph driver and report any that look stuck.
define periphunits
    set $count = 0
    set $driver = $arg0
    set $periph = $driver.units.tqh_first
    while ($periph != 0)
        if $periph->periph_allocated != 0 || $periph->periph_allocating != 0 || $periph->path->device->ccbq.queue.qfrozen_cnt != 0
            periph $periph
            set $count = $count + 1
        end
        set $periph = $periph->unit_links.tqe_next
    end
    if ($count == 0)
        printf "No problems found for periph %s\n", $driver->driver_name
    end
end

# Iterate over every registered periph driver.
define periphs
    set $i = 0
    while (periph_drivers[$i] != 0)
        set $p = periph_drivers[$i]
        periphunits $p
        set $i = $i + 1
    end
end

periphs
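
A minimal sketch of running it, assuming the script above is saved as
mpr-hang.gdb on the affected machine and that kgdb, run with no
arguments as root, attaches to the running kernel with its default
symbols:

    # run after the timeouts have fired
    kgdb
    (kgdb) source mpr-hang.gdb

Sourcing the file runs periphs once, since the last line of the script
invokes it. Because the walk is racy against normal operation, repeating
the periphs command at the (kgdb) prompt a few times should give more
consistent data, per the note above.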