Date: Mon, 6 Jun 2022 14:22:25 -0400
From: Mark Johnston
To: Jan Mikkelsen
Cc: Paul Floyd, FreeBSD Hackers
Subject: Re: Hang ast / pipelk / piperd
List-Archive: https://lists.freebsd.org/archives/freebsd-hackers

On Thu, Jun 02, 2022 at 12:49:45PM +0200, Jan Mikkelsen wrote:
> All these mi_switch+0xc2 hangs reminded me of something I saw once on 13.1-RC2 back in April. The machine was running five concurrent “make -j32 installworld” processes.
>
> The machine hung, disk activity stopped.
> Results of ^T on various running commands:
>
> ^T on a “tail -F” command:
>
> load: 1.93 cmd: tail 27541 [zfs teardown inactive] 393.65r 0.06u 0.10s 0% 2548k
> mi_switch+0xc2 _sleep+0x1fc rms_rlock_fallback+0x90 zfs_freebsd_reclaim+0x26 VOP_RECLAIM_APV+0x1f vgonel+0x342 vnlru_free_impl+0x2f7 vn_alloc_hard+0xc8 getnewvnode_reserve+0x93 zfs_zget+0x22 zfs_dirent_lookup+0x16b zfs_dirlook+0x7a zfs_lookup+0x3d0 zfs_cache_lookup+0xa9 VOP_LOOKUP+0x30 cache_fplookup_noentry+0x1a3 cache_fplookup+0x366 namei+0x12a
>
> ^T on a zsh doing a cd to a UFS directory:
>
> load: 0.48 cmd: zsh 86937 [zfs teardown inactive] 84663.01r 0.06u 0.01s 0% 6412k
> mi_switch+0xc2 _sleep+0x1fc rms_rlock_fallback+0x90 zfs_freebsd_reclaim+0x26 VOP_RECLAIM_APV+0x1f vgonel+0x342 vnlru_free_impl+0x2f7 vn_alloc_hard+0xc8 getnewvnode_reserve+0x93 zfs_zget+0x22 zfs_dirent_lookup+0x16b zfs_dirlook+0x7a zfs_lookup+0x3d0 zfs_cache_lookup+0xa9 lookup+0x45c namei+0x259 kern_statat+0xf3 sys_fstatat+0x2f

This looks very similar to the problem described here:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=261448

Though, in my case I did not see any deadlocks. In other words, the hang always ended after some time (typically a few seconds).

> ^T on an attempt to start gstat
>
> load: 0.17 cmd: gstat 63307 [ufs] 298.29r 0.00u 0.00s 0% 228k
> mi_switch+0xc2 sleeplk+0xf6 lockmgr_slock_hard+0x3e7 ffs_lock+0x6c _vn_lock+0x48 vget_finish+0x21 cache_lookup+0x26c vfs_cache_lookup+0x7b lookup+0x45c namei+0x259 vn_open_cred+0x533 kern_openat+0x283 amd64_syscall+0x10c fast_syscall_common+0xf8
>
> A short press of the system power button did nothing.
>
> The installworld target directories were on a ZFS filesystem with a single mirror of two SATA SSDs.
>
> Unsure if it’s related because the rest of the stack traces are different. However, the mi_switch+0xc2 triggered a memory.

mi_switch() is the main entry point into the CPU scheduler, so pretty much any thread which isn't on a CPU will have mi_switch() appear in its backtrace.
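
To make that concrete, here is a minimal Python sketch that compares the leading frames of the backtraces quoted above. The helper names (frames, shared_prefix) are made up for this illustration, and only the first five frames of each ^T trace are included. The frames after mi_switch() are the ones that say what a thread is actually waiting on.

```python
# First five frames of each backtrace, copied from the ^T output quoted above.
TRACES = {
    "tail": "mi_switch+0xc2 _sleep+0x1fc rms_rlock_fallback+0x90 "
            "zfs_freebsd_reclaim+0x26 VOP_RECLAIM_APV+0x1f",
    "zsh": "mi_switch+0xc2 _sleep+0x1fc rms_rlock_fallback+0x90 "
           "zfs_freebsd_reclaim+0x26 VOP_RECLAIM_APV+0x1f",
    "gstat": "mi_switch+0xc2 sleeplk+0xf6 lockmgr_slock_hard+0x3e7 "
             "ffs_lock+0x6c _vn_lock+0x48",
}


def frames(trace):
    """Split a one-line backtrace into function names, dropping the +0x offsets."""
    return [token.split("+", 1)[0] for token in trace.split()]


def shared_prefix(a, b):
    """Return the leading frames that two backtraces have in common."""
    common = []
    for fa, fb in zip(frames(a), frames(b)):
        if fa != fb:
            break
        common.append(fa)
    return common


if __name__ == "__main__":
    # tail and zsh are blocked in the same ZFS reclaim path.
    print(shared_prefix(TRACES["tail"], TRACES["zsh"]))
    # tail and gstat only share the scheduler entry point.
    print(shared_prefix(TRACES["tail"], TRACES["gstat"]))  # ['mi_switch']
```

The tail and zsh traces agree on every frame listed here, while the gstat trace diverges right after mi_switch(), so the shared mi_switch+0xc2 frame by itself says nothing about whether two hangs are related.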