From nobody Wed Dec 1 20:28:14 2021
List-Id: Production branch of FreeBSD source code
List-Archive: https://lists.freebsd.org/archives/freebsd-stable
From: Alan Somers
Date: Wed, 1 Dec 2021 13:28:14 -0700
Subject: Re: ZFS deadlocks triggered by HDD timeouts
To: Warner Losh
Cc: FreeBSD
Content-Type: text/plain; charset="UTF-8"

On Wed, Dec 1, 2021 at 11:25 AM Warner Losh wrote:
>
>
>
> On Wed, Dec 1, 2021, 11:16 AM Alan Somers wrote:
>>
>> On a stable/13 build from 16-Sep-2021 I see frequent ZFS deadlocks
>> triggered by HDD timeouts. The timeouts are probably caused by
>> genuine hardware faults, but they didn't lead to deadlocks in
>> 12.2-RELEASE or 13.0-RELEASE. Unfortunately I don't have much
>> additional information. ZFS's stack traces aren't very informative,
>> and dmesg doesn't show anything besides the usual information about
>> the disk timeout. I don't see anything obviously related in the
>> commit history for that time range, either.
>>
>> Has anybody else observed this phenomenon? Or does anybody have a
>> good way to deliberately inject timeouts? CAM makes it easy enough to
>> inject an error, but not a timeout. If it did, then I could bisect
>> the problem. As it is I can only reproduce it on production servers.
>
>
> What SIM? Timeouts are tricky because they have many sources, some of which are nonlocal...
>
> Warner

mpr(4)