From owner-freebsd-current@freebsd.org Sun May 10 10:02:51 2020 Return-Path: Delivered-To: freebsd-current@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 3BD072E07F9 for ; Sun, 10 May 2020 10:02:51 +0000 (UTC) (envelope-from agapon@gmail.com) Received: from mail-lf1-f47.google.com (mail-lf1-f47.google.com [209.85.167.47]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1O1" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 49Kfkx4dgvz41Pp for ; Sun, 10 May 2020 10:02:49 +0000 (UTC) (envelope-from agapon@gmail.com) Received: by mail-lf1-f47.google.com with SMTP id u4so4921301lfm.7 for ; Sun, 10 May 2020 03:02:49 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:openpgp:autocrypt :message-id:date:user-agent:mime-version:in-reply-to :content-language:content-transfer-encoding; bh=jHz/3X2OH7kV3XuYcY8OA8Eq2H6uX9Yg4SNeEynaBuE=; b=fp0PkIpDGI6kQRgbdTJuPFXZat6wmONRpCmXoNi2l446PjUadD0iaD+sv3zE+K+P6Z kckqSOII8/w2LwV2bk3beJoHaanMDRSInuw5Mb0CW4PYeo6fThDxTvQrzEBhMGpDJinL GTc3KIuTcGbMgPmLHKzdslIBFxNp0lBtb+U7SbC7t3i+E3XUK5pf2XFMjh0QkdO415Zb wwYcOE57RnubuhlJSMNhk3ycPIZzK9KYrKzxab3nk5zkvvUmJtJjk08sjbw9XunsI94z RRKYOTBRx9g+jprdzAqRepCDBZD7xkfYr6ubSEJawhD/Pwtx+Kky8y9svy+89wFto9Rv h4rg== X-Gm-Message-State: AOAM533wOEBJiT5nSUgAbVySMIR3kdlzMom0LHcLguMtDBc5G904ltaL aUcVtcwdH6gZRxG7dblhZ407Ba3Zeac= X-Google-Smtp-Source: ABdhPJxXdM4/u7vzLrocNyu6eqETvZN+C9l/fccq96fVCpcfPVO3kbjKlxX0Nfk38ca1+/F4hwm8MA== X-Received: by 2002:ac2:5f73:: with SMTP id c19mr7336786lfc.135.1589104967492; Sun, 10 May 2020 03:02:47 -0700 (PDT) Received: from [192.168.0.88] (east.meadow.volia.net. 
[93.72.151.96]) by smtp.googlemail.com with ESMTPSA id u4sm7232843lfu.81.2020.05.10.03.02.45 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Sun, 10 May 2020 03:02:46 -0700 (PDT) Subject: Re: CHANGE_PV_LIST_LOCK_TO_PHYS is not correct when !NUMA ? To: Konstantin Belousov Cc: FreeBSD Current References: <0d7db402-621e-cc6b-2918-2078f63e2a9b@FreeBSD.org> <20200508161500.GC44519@kib.kiev.ua> <6485ab77-a3d0-8916-9431-74e4da1e3ea7@FreeBSD.org> <20200509161325.GH44519@kib.kiev.ua> <20200509165010.GI44519@kib.kiev.ua> From: Andriy Gapon Openpgp: preference=signencrypt Autocrypt: addr=avg@FreeBSD.org; prefer-encrypt=mutual; keydata= mQINBFm4LIgBEADNB/3lT7f15UKeQ52xCFQx/GqHkSxEdVyLFZTmY3KyNPQGBtyvVyBfprJ7 mAeXZWfhat6cKNRAGZcL5EmewdQuUfQfBdYmKjbw3a9GFDsDNuhDA2QwFt8BmkiVMRYyvI7l N0eVzszWCUgdc3qqM6qqcgBaqsVmJluwpvwp4ZBXmch5BgDDDb1MPO8AZ2QZfIQmplkj8Y6Z AiNMknkmgaekIINSJX8IzRzKD5WwMsin70psE8dpL/iBsA2cpJGzWMObVTtCxeDKlBCNqM1i gTXta1ukdUT7JgLEFZk9ceYQQMJJtUwzWu1UHfZn0Fs29HTqawfWPSZVbulbrnu5q55R4PlQ /xURkWQUTyDpqUvb4JK371zhepXiXDwrrpnyyZABm3SFLkk2bHlheeKU6Yql4pcmSVym1AS4 dV8y0oHAfdlSCF6tpOPf2+K9nW1CFA8b/tw4oJBTtfZ1kxXOMdyZU5fiG7xb1qDgpQKgHUX8 7Rd2T1UVLVeuhYlXNw2F+a2ucY+cMoqz3LtpksUiBppJhw099gEXehcN2JbUZ2TueJdt1FdS ztnZmsHUXLxrRBtGwqnFL7GSd6snpGIKuuL305iaOGODbb9c7ne1JqBbkw1wh8ci6vvwGlzx rexzimRaBzJxlkjNfMx8WpCvYebGMydNoeEtkWldtjTNVsUAtQARAQABtB5BbmRyaXkgR2Fw b24gPGF2Z0BGcmVlQlNELm9yZz6JAlQEEwEIAD4WIQS+LEO7ngQnXA4Bjr538m7TUc1yjwUC WbgsiAIbIwUJBaOagAULCQgHAgYVCAkKCwIEFgIDAQIeAQIXgAAKCRB38m7TUc1yj+JAEACV l9AK/nOWAt/9cufV2fRj0hdOqB1aCshtSrwHk/exXsDa4/FkmegxXQGY+3GWX3deIyesbVRL rYdtdK0dqJyT1SBqXK1h3/at9rxr9GQA6KWOxTjUFURsU7ok/6SIlm8uLRPNKO+yq0GDjgaO LzN+xykuBA0FlhQAXJnpZLcVfPJdWv7sSHGedL5ln8P8rxR+XnmsA5TUaaPcbhTB+mG+iKFj GghASDSfGqLWFPBlX/fpXikBDZ1gvOr8nyMY9nXhgfXpq3B6QCRYKPy58ChrZ5weeJZ29b7/ QdEO8NFNWHjSD9meiLdWQaqo9Y7uUxN3wySc/YUZxtS0bhAd8zJdNPsJYG8sXgKjeBQMVGuT eCAJFEYJqbwWvIXMfVWop4+O4xB+z2YE3jAbG/9tB/GSnQdVSj3G8MS80iLS58frnt+RSEw/ 
psahrfh0dh6SFHttE049xYiC+cM8J27Aaf0i9RflyITq57NuJm+AHJoU9SQUkIF0nc6lfA+o JRiyRlHZHKoRQkIg4aiKaZSWjQYRl5Txl0IZUP1dSWMX4s3XTMurC/pnja45dge/4ESOtJ9R 8XuIWg45Oq6MeIWdjKddGhRj3OohsltKgkEU3eLKYtB6qRTQypHHUawCXz88uYt5e3w4V16H lCpSTZV/EVHnNe45FVBlvK7k7HFfDDkryLkCDQRZuCyIARAAlq0slcsVboY/+IUJdcbEiJRW be9HKVz4SUchq0z9MZPX/0dcnvz/gkyYA+OuM78dNS7Mbby5dTvOqfpLJfCuhaNYOhlE0wY+ 1T6Tf1f4c/uA3U/YiadukQ3+6TJuYGAdRZD5EqYFIkreARTVWg87N9g0fT9BEqLw9lJtEGDY EWUE7L++B8o4uu3LQFEYxcrb4K/WKmgtmFcm77s0IKDrfcX4doV92QTIpLiRxcOmCC/OCYuO jB1oaaqXQzZrCutXRK0L5XN1Y1PYjIrEzHMIXmCDlLYnpFkK+itlXwlE2ZQxkfMruCWdQXye syl2fynAe8hvp7Mms9qU2r2K9EcJiR5N1t1C2/kTKNUhcRv7Yd/vwusK7BqJbhlng5ZgRx0m WxdntU/JLEntz3QBsBsWM9Y9wf2V4tLv6/DuDBta781RsCB/UrU2zNuOEkSixlUiHxw1dccI 6CVlaWkkJBxmHX22GdDFrcjvwMNIbbyfQLuBq6IOh8nvu9vuItup7qemDG3Ms6TVwA7BD3j+ 3fGprtyW8Fd/RR2bW2+LWkMrqHffAr6Y6V3h5kd2G9Q8ZWpEJk+LG6Mk3fhZhmCnHhDu6CwN MeUvxXDVO+fqc3JjFm5OxhmfVeJKrbCEUJyM8ESWLoNHLqjywdZga4Q7P12g8DUQ1mRxYg/L HgZY3zfKOqcAEQEAAYkCPAQYAQgAJhYhBL4sQ7ueBCdcDgGOvnfybtNRzXKPBQJZuCyIAhsM BQkFo5qAAAoJEHfybtNRzXKPBVwQAKfFy9P7N3OsLDMB56A4Kf+ZT+d5cIx0Yiaf4n6w7m3i ImHHHk9FIetI4Xe54a2IXh4Bq5UkAGY0667eIs+Z1Ea6I2i27Sdo7DxGwq09Qnm/Y65ADvXs 3aBvokCcm7FsM1wky395m8xUos1681oV5oxgqeRI8/76qy0hD9WR65UW+HQgZRIcIjSel9vR XDaD2HLGPTTGr7u4v00UeTMs6qvPsa2PJagogrKY8RXdFtXvweQFz78NbXhluwix2Tb9ETPk LIpDrtzV73CaE2aqBG/KrboXT2C67BgFtnk7T7Y7iKq4/XvEdDWscz2wws91BOXuMMd4c/c4 OmGW9m3RBLufFrOag1q5yUS9QbFfyqL6dftJP3Zq/xe+mr7sbWbhPVCQFrH3r26mpmy841ym dwQnNcsbIGiBASBSKksOvIDYKa2Wy8htPmWFTEOPRpFXdGQ27awcjjnB42nngyCK5ukZDHi6 w0qK5DNQQCkiweevCIC6wc3p67jl1EMFY5+z+zdTPb3h7LeVnGqW0qBQl99vVFgzLxchKcl0 R/paSFgwqXCZhAKMuUHncJuynDOP7z5LirUeFI8qsBAJi1rXpQoLJTVcW72swZ42IdPiboqx NbTMiNOiE36GqMcTPfKylCbF45JNX4nF9ElM0E+Y8gi4cizJYBRr2FBJgay0b9Cp Message-ID: <881b2aa7-b9df-da64-25b6-783f076115c4@FreeBSD.org> Date: Sun, 10 May 2020 13:02:45 +0300 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:60.0) Gecko/20100101 Firefox/60.0 Thunderbird/60.9.0 MIME-Version: 1.0 In-Reply-To: 
<20200509165010.GI44519@kib.kiev.ua> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit X-Rspamd-Queue-Id: 49Kfkx4dgvz41Pp X-Spamd-Bar: - Authentication-Results: mx1.freebsd.org; dkim=none; dmarc=none; spf=pass (mx1.freebsd.org: domain of agapon@gmail.com designates 209.85.167.47 as permitted sender) smtp.mailfrom=agapon@gmail.com X-Spamd-Result: default: False [-1.25 / 15.00]; RCVD_VIA_SMTP_AUTH(0.00)[]; R_SPF_ALLOW(-0.20)[+ip4:209.85.128.0/17]; RCVD_COUNT_THREE(0.00)[3]; TO_DN_ALL(0.00)[]; SUBJECT_HAS_EXCLAIM(0.00)[]; RCPT_COUNT_TWO(0.00)[2]; FORGED_SENDER(0.30)[avg@FreeBSD.org,agapon@gmail.com]; FREEMAIL_TO(0.00)[gmail.com]; RECEIVED_SPAMHAUS_PBL(0.00)[96.151.72.93.khpj7ygk5idzvmvt5x4ziurxhy.zen.dq.spamhaus.net : 127.0.0.10]; SUBJECT_ENDS_QUESTION(1.00)[]; FREEMAIL_ENVFROM(0.00)[gmail.com]; ASN(0.00)[asn:15169, ipnet:209.85.128.0/17, country:US]; R_DKIM_NA(0.00)[]; FROM_NEQ_ENVFROM(0.00)[avg@FreeBSD.org,agapon@gmail.com]; ARC_NA(0.00)[]; NEURAL_HAM_MEDIUM(-0.99)[-0.990,0]; MID_RHS_MATCH_FROM(0.00)[]; FROM_HAS_DN(0.00)[]; NEURAL_HAM_LONG(-1.00)[-0.999,0]; MIME_GOOD(-0.10)[text/plain]; PREVIOUSLY_DELIVERED(0.00)[freebsd-current@freebsd.org]; DMARC_NA(0.00)[FreeBSD.org]; MIME_TRACE(0.00)[0:+]; TO_MATCH_ENVRCPT_SOME(0.00)[]; RCVD_IN_DNSWL_NONE(0.00)[47.167.85.209.list.dnswl.org : 127.0.5.0]; IP_SCORE(-0.26)[ip: (-0.43), ipnet: 209.85.128.0/17(-0.39), asn: 15169(-0.43), country: US(-0.05)]; RWL_MAILSPIKE_POSSIBLE(0.00)[47.167.85.209.rep.mailspike.net : 127.0.0.17]; RCVD_TLS_ALL(0.00)[] X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.33 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 10 May 2020 10:02:51 -0000 On 09/05/2020 19:50, Konstantin Belousov wrote: > On Sat, May 09, 2020 at 07:16:27PM +0300, Andriy Gapon wrote: >> On 09/05/2020 19:13, Konstantin Belousov wrote: >>> On Sat, May 
09, 2020 at 06:52:24PM +0300, Andriy Gapon wrote: >>>> I tried this change: >>>> diff --git a/sys/amd64/amd64/pmap.c b/sys/amd64/amd64/pmap.c >>>> index 4deed86a76d1a..b834b7f0388b7 100644 >>>> --- a/sys/amd64/amd64/pmap.c >>>> +++ b/sys/amd64/amd64/pmap.c >>>> @@ -345,7 +345,7 @@ pmap_pku_mask_bit(pmap_t pmap) >>>> #define NPV_LIST_LOCKS MAXCPU >>>> >>>> #define PHYS_TO_PV_LIST_LOCK(pa) \ >>>> - (&pv_list_locks[pa_index(pa) % NPV_LIST_LOCKS]) >>>> + (&pv_list_locks[((pa) >> PDRSHIFT) % NPV_LIST_LOCKS]) >>>> #endif >>>> >>>> #define CHANGE_PV_LIST_LOCK_TO_PHYS(lockp, pa) do { \ >>>> >>>> It fixed the original problem, but I got a new panic. >>>> "DI already started" in pmap_remove() -> pmap_delayed_invl_start_u(). >>>> I guess that !NUMA variant does not get much testing, so I'll probably just >>>> stick with the default. >>> Why didn't you just removed the KASSERT from pa_index ? >> >> Well, I thought it might be useful in the NUMA case. >> pa_index() definition is shared between both cases. > Might be define the macro two times, for NUMA/non-NUMA. non-NUMA case > does not need the assert, because users take it mod NPV_LIST_LOCKS. Yes, this works. Thank you! 
diff --git a/sys/amd64/amd64/pmap.c b/sys/amd64/amd64/pmap.c
index 4deed86a76d1a..8dd236acc8205 100644
--- a/sys/amd64/amd64/pmap.c
+++ b/sys/amd64/amd64/pmap.c
@@ -323,12 +323,12 @@ pmap_pku_mask_bit(pmap_t pmap)
 #endif

 #undef pa_index
+#ifdef NUMA
 #define pa_index(pa) ({ \
     KASSERT((pa) <= vm_phys_segs[vm_phys_nsegs - 1].end, \
         ("address %lx beyond the last segment", (pa))); \
     (pa) >> PDRSHIFT; \
 })
-#ifdef NUMA
 #define pa_to_pmdp(pa) (&pv_table[pa_index(pa)])
 #define pa_to_pvh(pa) (&(pa_to_pmdp(pa)->pv_page))
 #define PHYS_TO_PV_LIST_LOCK(pa) ({ \
@@ -340,6 +340,7 @@ pmap_pku_mask_bit(pmap_t pmap)
     _lock; \
 })
 #else
+#define pa_index(pa) ((pa) >> PDRSHIFT)
 #define pa_to_pvh(pa) (&pv_table[pa_index(pa)])

 #define NPV_LIST_LOCKS MAXCPU

--
Andriy Gapon

From owner-freebsd-current@freebsd.org Sun May 10 10:25:44 2020 Return-Path: Delivered-To: freebsd-current@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 91E5A2E21A9 for ; Sun, 10 May 2020 10:25:44 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 49KgFN2SRNz434W; Sun, 10 May 2020 10:25:44 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kib@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id 04AAPbRX072155 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NO); Sun, 10 May 2020 13:25:40 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua 04AAPbRX072155 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id 04AAPbla072154; Sun, 10 May 2020 13:25:37 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to
kostikbel@gmail.com using -f Date: Sun, 10 May 2020 13:25:37 +0300 From: Konstantin Belousov To: Andriy Gapon Cc: FreeBSD Current Subject: Re: CHANGE_PV_LIST_LOCK_TO_PHYS is not correct when !NUMA ? Message-ID: <20200510102537.GE68906@kib.kiev.ua> References: <0d7db402-621e-cc6b-2918-2078f63e2a9b@FreeBSD.org> <20200508161500.GC44519@kib.kiev.ua> <6485ab77-a3d0-8916-9431-74e4da1e3ea7@FreeBSD.org> <20200509161325.GH44519@kib.kiev.ua> <20200509165010.GI44519@kib.kiev.ua> <881b2aa7-b9df-da64-25b6-783f076115c4@FreeBSD.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <881b2aa7-b9df-da64-25b6-783f076115c4@FreeBSD.org> X-Spam-Status: No, score=-0.9 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FORGED_GMAIL_RCVD,FREEMAIL_FROM, NML_ADSP_CUSTOM_MED,PLING_QUERY autolearn=no autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on tom.home X-Rspamd-Queue-Id: 49KgFN2SRNz434W X-Spamd-Bar: ----- Authentication-Results: mx1.freebsd.org; none X-Spamd-Result: default: False [-6.00 / 15.00]; NEURAL_HAM_MEDIUM(-1.00)[-0.999,0]; REPLY(-4.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000,0] X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.33 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 10 May 2020 10:25:44 -0000 On Sun, May 10, 2020 at 01:02:45PM +0300, Andriy Gapon wrote: > On 09/05/2020 19:50, Konstantin Belousov wrote: > > On Sat, May 09, 2020 at 07:16:27PM +0300, Andriy Gapon wrote: > >> On 09/05/2020 19:13, Konstantin Belousov wrote: > >>> On Sat, May 09, 2020 at 06:52:24PM +0300, Andriy Gapon wrote: > >>>> I tried this change: > >>>> diff --git a/sys/amd64/amd64/pmap.c b/sys/amd64/amd64/pmap.c > >>>> index 4deed86a76d1a..b834b7f0388b7 100644 > >>>> --- a/sys/amd64/amd64/pmap.c > >>>> +++ b/sys/amd64/amd64/pmap.c > >>>> @@ -345,7 +345,7 @@ 
pmap_pku_mask_bit(pmap_t pmap) > >>>> #define NPV_LIST_LOCKS MAXCPU > >>>> > >>>> #define PHYS_TO_PV_LIST_LOCK(pa) \ > >>>> - (&pv_list_locks[pa_index(pa) % NPV_LIST_LOCKS]) > >>>> + (&pv_list_locks[((pa) >> PDRSHIFT) % NPV_LIST_LOCKS]) > >>>> #endif > >>>> > >>>> #define CHANGE_PV_LIST_LOCK_TO_PHYS(lockp, pa) do { \ > >>>> > >>>> It fixed the original problem, but I got a new panic. > >>>> "DI already started" in pmap_remove() -> pmap_delayed_invl_start_u(). > >>>> I guess that !NUMA variant does not get much testing, so I'll probably just > >>>> stick with the default. > >>> Why didn't you just removed the KASSERT from pa_index ? > >> > >> Well, I thought it might be useful in the NUMA case. > >> pa_index() definition is shared between both cases. > > Might be define the macro two times, for NUMA/non-NUMA. non-NUMA case > > does not need the assert, because users take it mod NPV_LIST_LOCKS. > > Yes, this works. > Thank you! > > diff --git a/sys/amd64/amd64/pmap.c b/sys/amd64/amd64/pmap.c > index 4deed86a76d1a..8dd236acc8205 100644 > --- a/sys/amd64/amd64/pmap.c > +++ b/sys/amd64/amd64/pmap.c > @@ -323,12 +323,12 @@ pmap_pku_mask_bit(pmap_t pmap) > #endif > > #undef pa_index > +#ifdef NUMA > #define pa_index(pa) ({ \ > KASSERT((pa) <= vm_phys_segs[vm_phys_nsegs - 1].end, \ > ("address %lx beyond the last segment", (pa))); \ > (pa) >> PDRSHIFT; \ > }) > -#ifdef NUMA > #define pa_to_pmdp(pa) (&pv_table[pa_index(pa)]) > #define pa_to_pvh(pa) (&(pa_to_pmdp(pa)->pv_page)) > #define PHYS_TO_PV_LIST_LOCK(pa) ({ \ > @@ -340,6 +340,7 @@ pmap_pku_mask_bit(pmap_t pmap) > _lock; \ > }) > #else > +#define pa_index(pa) ((pa) >> PDRSHIFT) > #define pa_to_pvh(pa) (&pv_table[pa_index(pa)]) > > #define NPV_LIST_LOCKS MAXCPU Looks good to me. 
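
For anyone following the thread, here is a minimal user-space sketch of why the non-NUMA case can do without the assert: whatever pa_index() returns, the lock lookup reduces it modulo NPV_LIST_LOCKS, so the array index stays in range. The values chosen for PDRSHIFT and NPV_LIST_LOCKS are assumptions for illustration only, and pv_lock_slot() is a hypothetical helper, not something in pmap.c.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only; the kernel gets these from its headers. */
#define PDRSHIFT        21      /* log2 of a 2 MB superpage */
#define NPV_LIST_LOCKS  256     /* stand-in for MAXCPU */

/* Non-NUMA variant from the patch: a plain shift, no KASSERT. */
#define pa_index(pa)    ((pa) >> PDRSHIFT)

/*
 * Hypothetical helper (not a kernel function): the pv_list_locks[] slot
 * that PHYS_TO_PV_LIST_LOCK() would select for a physical address.
 */
static unsigned
pv_lock_slot(uint64_t pa)
{
	unsigned slot = (unsigned)(pa_index(pa) % NPV_LIST_LOCKS);

	/* The modulo bounds the slot for any pa, even past the last segment. */
	assert(slot < NPV_LIST_LOCKS);
	return (slot);
}
```

With these assumed values, pv_lock_slot(0x200000) is 1 and pv_lock_slot(0x20000000) wraps around to 0, since 0x20000000 >> 21 is exactly 256. An address beyond the last physical segment still lands on a valid slot.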
From owner-freebsd-current@freebsd.org Sun May 10 14:55:24 2020 Return-Path: Delivered-To: freebsd-current@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id D5FDB2E993F; Sun, 10 May 2020 14:55:24 +0000 (UTC) (envelope-from carpeddiem@gmail.com) Received: from mail-il1-f177.google.com (mail-il1-f177.google.com [209.85.166.177]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1O1" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 49KnDW58nWz4HMG; Sun, 10 May 2020 14:55:23 +0000 (UTC) (envelope-from carpeddiem@gmail.com) Received: by mail-il1-f177.google.com with SMTP id x2so6001728ilp.13; Sun, 10 May 2020 07:55:23 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:from:date:message-id:subject:to; bh=cvoN9VjTXi4D2Mt2zCUU7BMhhgsVaYhoQxSpmGDoVas=; b=cif8GBHTWYVckEugdqUch9y8lUju83MIqzjBBucnjMv4f3ihvjm31ElSoicf8VWUMF mi9lxZycY0S0I1vG7Q0Rd+bYHyP8hHROF8b0COsDUHGKHZNK23q64SVmfj73nbXLKRp0 WwsegMsaD59yT+LJ+VFq8sxvJOYhoPpJz6WQTRbngTe+SWk04rvV44DElrHSZzm3XXeR lYLujiidC4Mo5XH+Ff20Kd7xs2Tsj82H6okoTy3J1vrBmWJo4OHBfgONiGqqG91W9DC0 paWkrxY5gAuaKLAuQUwRSxrUWiaQLDx8iGPbj6/RxE910oRpCp5s3907j2AEEDYQCtue z1ag== X-Gm-Message-State: AGi0PuYPxNo1KRXIHZlEkkuJcJWBOBcoAtns10pJOQnPiB36yrhaRA91 7pbaRKWs+rsvr/mxyyoY5vlPbX6188OCYhIC5O3nmZXSRgg= X-Google-Smtp-Source: APiQypKcK0215EOluJHdlVCMp0syIfOGyasSk6y+f4ZuzYngHB5JZsWKrFpHp9q0Djqc4WYLIl46eTyIXeiyOw5832o= X-Received: by 2002:a05:6e02:141:: with SMTP id j1mr12622288ilr.100.1589122521378; Sun, 10 May 2020 07:55:21 -0700 (PDT) MIME-Version: 1.0 From: Ed Maste Date: Sun, 10 May 2020 10:55:10 -0400 Message-ID: Subject: HEADS-UP: obsolete GNU as 2.17.50 retirement for FreeBSD 13, expected 2020-05-31 To: 
FreeBSD Ports , FreeBSD Current Content-Type: text/plain; charset="UTF-8" X-Rspamd-Queue-Id: 49KnDW58nWz4HMG X-Spamd-Bar: --- Authentication-Results: mx1.freebsd.org; dkim=none; dmarc=none; spf=pass (mx1.freebsd.org: domain of carpeddiem@gmail.com designates 209.85.166.177 as permitted sender) smtp.mailfrom=carpeddiem@gmail.com X-Spamd-Result: default: False [-3.53 / 15.00]; ARC_NA(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; RCVD_TLS_ALL(0.00)[]; FROM_HAS_DN(0.00)[]; R_SPF_ALLOW(-0.20)[+ip4:209.85.128.0/17]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_GOOD(-0.10)[text/plain]; DMARC_NA(0.00)[freebsd.org]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; IP_SCORE(-1.53)[ip: (-6.78), ipnet: 209.85.128.0/17(-0.39), asn: 15169(-0.43), country: US(-0.05)]; TO_DN_ALL(0.00)[]; RCPT_COUNT_TWO(0.00)[2]; RCVD_IN_DNSWL_NONE(0.00)[177.166.85.209.list.dnswl.org : 127.0.5.0]; FORGED_SENDER(0.30)[emaste@freebsd.org,carpeddiem@gmail.com]; RWL_MAILSPIKE_POSSIBLE(0.00)[177.166.85.209.rep.mailspike.net : 127.0.0.17]; MIME_TRACE(0.00)[0:+]; R_DKIM_NA(0.00)[]; FREEMAIL_ENVFROM(0.00)[gmail.com]; ASN(0.00)[asn:15169, ipnet:209.85.128.0/17, country:US]; FROM_NEQ_ENVFROM(0.00)[emaste@freebsd.org,carpeddiem@gmail.com]; RCVD_COUNT_TWO(0.00)[2] X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.33 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 10 May 2020 14:55:24 -0000

All architectures supported by FreeBSD are now using Clang and lld, and tools from the obsolete GNU binutils 2.17.50 have been retired one by one - most recently objdump, in r360698. There is one binutils tool left: GNU as. I plan to disable it at the end of the month, and then remove all of binutils some weeks later if no additional issues are found.

We have two src.conf build knobs for as - WITH_BINUTILS, which determines whether /usr/bin/as exists in the built system, and WITH_BINUTILS_BOOTSTRAP, which controls whether as is built as a bootstrap tool to build the rest of the system.

1. WITH_BINUTILS (default on i386 and amd64)

Turning this off means the installed system will not have /usr/bin/as. A ports exp-run without as is in PR 205250. On i386 the most recent run failed in:

    comms/libfec
    comms/syncterm
    devel/plan9port
    emulators/vmw
    games/dxx-rebirth
    graphics/vulkan-loader
    lang/hla
    lang/mit-scheme
    lang/ocaml
    lang/ocaml-nox11
    math/ldouble
    multimedia/mencoder
    multimedia/mplayer

amd64 had similar failures in an earlier run.

There are generally two ways to address a port's need for an assembler:

a) Add a dependency on devel/binutils to the port's Makefile:
   BUILD_DEPENDS+=as:devel/binutils

b) Use Clang's integrated assembler, via the compiler driver (cc). This is what we do for all but one file in the base system.

Option (b) is the nicer choice, as it doesn't introduce a new dependency, but it takes some more work. Option (a) is straightforward, and I have a proposed initial set of patches at https://reviews.freebsd.org/D24739.

My hope is that individual maintainers of affected ports can prepare their port for the upcoming as retirement in whichever way they find most suitable; I'll have BUILD_DEPENDS patches ready for whichever ones are not ready at the end of the month.

2. WITH_BINUTILS_BOOTSTRAP

GNU as is built only on amd64, and is used to assemble one file, the optimized assembly version of the Skein cryptographic hash - skein_block_asm.s. Unfortunately it makes extensive use of macro features that are not yet supported by Clang's integrated assembler. There are some proposed methods of addressing this (for example, committing an assembled .o, or translating the assembly into something that IAS can handle) but none of them are particularly palatable.
Right now I expect the most likely path forward is that we revert to the somewhat slower C-language skein implementation. Please let me know if you have any questions or concerns; I'm happy to help maintainers of affected ports find and test a solution. From owner-freebsd-current@freebsd.org Sun May 10 16:25:44 2020 Return-Path: Delivered-To: freebsd-current@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id D6F2A2EBFE8; Sun, 10 May 2020 16:25:44 +0000 (UTC) (envelope-from grahamperrin@gmail.com) Received: from mail-wm1-x32b.google.com (mail-wm1-x32b.google.com [IPv6:2a00:1450:4864:20::32b]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1O1" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 49KqDl2V5Bz4N41; Sun, 10 May 2020 16:25:42 +0000 (UTC) (envelope-from grahamperrin@gmail.com) Received: by mail-wm1-x32b.google.com with SMTP id d207so2059712wmd.0; Sun, 10 May 2020 09:25:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=to:from:subject:message-id:date:user-agent:mime-version :content-transfer-encoding:content-language; bh=PcIQjh5bO/J/uHePZJafl7Hr8i2kMdXcCuEJb4o/h/c=; b=tyfyg1GFopeqcZW/QG0Tw/IrFChCzi7c9iyk49ItzmltO96zWuBff/4cnzL4gOHE19 5wVBG6+Ege5L+sZGDgHgBFCAL+EqONFHbww/c/J4/QWB51lPnOX2TMwR0beogZUpxT8S zv7vBZzYOqQ9QGv2u5iptMNw8klgLMmTIuXREXanCXO0etsePgp860nspqK8hPAFdGnB J6/WYVemq07L5Uo4ID7OzIRw1uLDiZrS0lZ7ZkkT7gSTs6vhd9UwL8K7FFOPBETpktH1 sitgqWj4eqYMmMGwMyh9ISSF2zUfVdL5KM1W9qZoljT5IgsQFGqG0N8lDwwXYFDZcMIx MNhw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:to:from:subject:message-id:date:user-agent :mime-version:content-transfer-encoding:content-language; 
bh=PcIQjh5bO/J/uHePZJafl7Hr8i2kMdXcCuEJb4o/h/c=; b=T4Emc47KtTjiwTivleC1OYyWAkRtb2Kkt6iEN1txheNZpaV6xGzWA2StSmyP1R62RF 7+OSZzYrMEYd8U9xg8sVEbIMBlgqYBbEB6iUT7sRy0GGG5U4YliRFNIbxZV0JiMZfnqj gWiVlpMGeruXoOy/15zggoRRmssNpNo8QjFVOMNUNyzcuy8ncj+dhBZmHloKZzalx69m IPhVj4iJeY+Pcn+8tMOVXmdZ5gbY+gf+vGLhNAXt8DtdJHZOeCeyWKKfLvY2Xq59jZTc vHhwDIlU1uSSna6/RB6CAsx3T31gs/Iarlqfdyiy5QcqKydn4+4bgImAwcPmcrPMg/sU 09+A== X-Gm-Message-State: AGi0PuYDasSvQfID2qlVhgsfvkMH8z+PN2F0bSl2yJ5RjTXjC8WuCTnF 9o/nVF5hnJefE0h8uWuXIRmHrrAVKAc= X-Google-Smtp-Source: APiQypKFuwzq5r3FVhYaXgM1+ALotNM8Rl6SctaiZl23NMZtK0S+5szFnr4+xijkoYLjDRMVvEENgA== X-Received: by 2002:a05:600c:22d6:: with SMTP id 22mr26683120wmg.121.1589127940608; Sun, 10 May 2020 09:25:40 -0700 (PDT) Received: from [192.168.1.7] (79-66-147-78.dynamic.dsl.as9105.com. [79.66.147.78]) by smtp.gmail.com with ESMTPSA id b12sm14573011wro.18.2020.05.10.09.25.39 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Sun, 10 May 2020 09:25:39 -0700 (PDT) To: FreeBSD CURRENT , multimedia@FreeBSD.org From: Graham Perrin Subject: multimedia/vlc playback: blackness Message-ID: <11bcac93-7176-b0ce-43f8-08d724c4213f@gmail.com> Date: Sun, 10 May 2020 17:25:38 +0100 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:68.0) Gecko/20100101 Thunderbird/68.8.0 MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Content-Language: en-US X-Rspamd-Queue-Id: 49KqDl2V5Bz4N41 X-Spamd-Bar: -- Authentication-Results: mx1.freebsd.org; dkim=pass header.d=gmail.com header.s=20161025 header.b=tyfyg1GF; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (mx1.freebsd.org: domain of grahamperrin@gmail.com designates 2a00:1450:4864:20::32b as permitted sender) smtp.mailfrom=grahamperrin@gmail.com X-Spamd-Result: default: False [-3.00 / 15.00]; RCVD_VIA_SMTP_AUTH(0.00)[]; TO_DN_SOME(0.00)[]; R_SPF_ALLOW(-0.20)[+ip6:2a00:1450:4000::/36]; FREEMAIL_FROM(0.00)[gmail.com]; RCVD_COUNT_THREE(0.00)[3]; 
DKIM_TRACE(0.00)[gmail.com:+]; RCPT_COUNT_TWO(0.00)[2]; DMARC_POLICY_ALLOW(-0.50)[gmail.com,none]; RECEIVED_SPAMHAUS_PBL(0.00)[78.147.66.79.khpj7ygk5idzvmvt5x4ziurxhy.zen.dq.spamhaus.net : 127.0.0.10]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; FREEMAIL_ENVFROM(0.00)[gmail.com]; ASN(0.00)[asn:15169, ipnet:2a00:1450::/32, country:US]; MID_RHS_MATCH_FROM(0.00)[]; DWL_DNSWL_NONE(0.00)[gmail.com.dwl.dnswl.org : 127.0.5.0]; ARC_NA(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; R_DKIM_ALLOW(-0.20)[gmail.com:s=20161025]; FROM_HAS_DN(0.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; MIME_GOOD(-0.10)[text/plain]; IP_SCORE(0.00)[ip: (-9.09), ipnet: 2a00:1450::/32(-2.29), asn: 15169(-0.43), country: US(-0.05)]; IP_SCORE_FREEMAIL(0.00)[]; RCVD_IN_DNSWL_NONE(0.00)[b.2.3.0.0.0.0.0.0.0.0.0.0.0.0.0.0.2.0.0.4.6.8.4.0.5.4.1.0.0.a.2.list.dnswl.org : 127.0.5.0]; RCVD_TLS_ALL(0.00)[] X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.33 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 10 May 2020 16:25:44 -0000

A few weeks ago, VLC ceased to show video. I hear the audio track but there's blackness.

Debug log, for brief playback of a silent screen recording:

I tried resetting preferences, no improvement. Any other suggestions?

TIA

----

no matching bug, so I assume that the issue is specific to my environment.

grahamperrin@momh167-gjp4-8570p:~ % date ; uname -v Sun 10 May 2020 17:23:40 BST FreeBSD 13.0-CURRENT #55 r360732: Thu May  7 09:31:34 BST 2020 root@momh167-gjp4-8570p:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG grahamperrin@momh167-gjp4-8570p:~ % pkg query '%o %v %R' vlc multimedia/vlc 3.0.10_2,4 FreeBSD grahamperrin@momh167-gjp4-8570p:~ % From owner-freebsd-current@freebsd.org Sun May 10 18:59:39 2020 Return-Path: Delivered-To: freebsd-current@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 7EC262F0CBA; Sun, 10 May 2020 18:59:39 +0000 (UTC) (envelope-from manu@bidouilliste.com) Received: from mx.blih.net (mx.blih.net [212.83.155.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "mx.blih.net", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 49KtfL0V5Jz4X3B; Sun, 10 May 2020 18:59:37 +0000 (UTC) (envelope-from manu@bidouilliste.com) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bidouilliste.com; s=mx; t=1589137169; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=LFQMWArz9p2+aVHoJG3h16JNkiwrA854qlpevQuWhoY=; b=n9Z/UM6J9sYBJVpkYH7XTRcnM/ze4M43/4sh/IKGqF7yYHA26xf/QdzXwzvUqwuOeXmux+ D0cE8p+El3ZzcAdBurcLnrwtuEfe8iZPwLF3n71UhdKrG1nIxvEt8ExK6omcUITYp7uKYr YnGgsVQidhnwkrJQSW433POcGMksqgw= Received: from tails.home (lfbn-idf2-1-900-181.w86-238.abo.wanadoo.fr [86.238.131.181]) by mx.blih.net (OpenSMTPD) with ESMTPSA id 099a2823 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Sun, 10 May 2020 18:59:29 +0000 (UTC) Date: Sun, 10 May 2020 20:59:29 +0200 From: Emmanuel Vadot To: freebsd-current@freebsd.org, freebsd-x11@freebsd.org Subject: drm drivers project 
report (week of May 4th) Message-Id: <20200510205929.0dcf23d0b278bc242718a564@bidouilliste.com> X-Mailer: Sylpheed 3.7.0 (GTK+ 2.24.32; amd64-portbld-freebsd13.0) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Rspamd-Queue-Id: 49KtfL0V5Jz4X3B X-Spamd-Bar: --- Authentication-Results: mx1.freebsd.org; dkim=pass header.d=bidouilliste.com header.s=mx header.b=n9Z/UM6J; dmarc=pass (policy=none) header.from=bidouilliste.com; spf=pass (mx1.freebsd.org: domain of manu@bidouilliste.com designates 212.83.155.74 as permitted sender) smtp.mailfrom=manu@bidouilliste.com X-Spamd-Result: default: False [-3.96 / 15.00]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; R_DKIM_ALLOW(-0.20)[bidouilliste.com:s=mx]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; FROM_HAS_DN(0.00)[]; R_SPF_ALLOW(-0.20)[+mx]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_GOOD(-0.10)[text/plain]; TO_DN_NONE(0.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; MV_CASE(0.50)[]; DKIM_TRACE(0.00)[bidouilliste.com:+]; RCPT_COUNT_TWO(0.00)[2]; DMARC_POLICY_ALLOW(-0.50)[bidouilliste.com,none]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; IP_SCORE(-1.46)[ip: (-9.44), ipnet: 212.83.128.0/19(1.73), asn: 12876(0.40), country: FR(-0.00)]; ASN(0.00)[asn:12876, ipnet:212.83.128.0/19, country:FR]; MID_RHS_MATCH_FROM(0.00)[]; RCVD_TLS_ALL(0.00)[]; RCVD_COUNT_TWO(0.00)[2] X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.33 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 10 May 2020 18:59:39 -0000

Hello all,

This is the first report for this project of updating our drm drivers, as it started this week.

I started the week by cleaning up our drm-v5.0 port and comparing it with Linux: the smaller the diff, the easier it is to apply patches from Linux.
One commit was made to base linuxkpi so that we have fewer patches in the DRM drivers themselves: https://svnweb.freebsd.org/changeset/base/360787

The next part was finding the best way to update without too much hassle. I had some somewhat hacky scripts lying around from the drm on ARM project, so I made them a bit more generic and committed them here: https://github.com/freebsd/drm-kmod/tree/master/scripts . I don't think this is the best way yet. Because of how Linux is "constructed", patches can come from different branches and aren't necessarily in the correct order. There are also some weird commits that appear in two releases of Linux; I think this is because they arrive late in a fixes merge and then the maintainer of one of the drm branches does work in the other branch for the next Linux release. If anyone has a lot of git skill, I'm interested in how they would approach this problem. I also have to see whether it would make more sense to track the different commits in the merge branch from the drm repos (https://cgit.freedesktop.org/drm/) or not.

After finding an almost good way to update, I proceeded with updating to Linux 5.1. This update was almost too easy, thanks to Austin Shafer, who updated linuxkpi to include a few more helpers and functions needed by 5.1. There are no ports for now, but if you want to test on your system you can check out the source or download a tarball here: https://github.com/freebsd/drm-kmod/releases
Only FreeBSD 13-CURRENT is supported.
Also, if you are using one of the drm-kmod ports, be sure to delete:
  /boot/kernel/drm.ko
  /boot/kernel/amdgpu.ko
  /boot/kernel/i915kms.ko
  /boot/kernel/linuxkpi_gplv2.ko
  /boot/kernel/radeon.ko
The drm-kmod ports install their sources on the system, so a kernel compilation rebuilds those modules and installs them in /boot/kernel, whereas using https://github.com/freebsd/drm-kmod/ installs everything in /boot/modules.

This branch was tested with:
i915 on:
 - i7-5600U - HD Graphics 5500 (Broadwell)
 - i7-7500U - Skylake GT2 [HD Graphics 520] (Skylake)
amdgpu on:
 - Ryzen 7 3700U (Picasso/Vega10)

Except for Skylake, which doesn't have working Vulkan (but doesn't with drm-v5.0 either), everything works great.

I then started the update to 5.2. It is still WIP and only has i915 enabled, as I haven't fixed everything for amdgpu/radeon or vmware/vbox, but a WIP branch is available here: https://github.com/freebsd/drm-kmod/tree/5.2-wip

A few commits were made in base linuxkpi:
https://svnweb.freebsd.org/changeset/base/360851
https://svnweb.freebsd.org/changeset/base/360870
https://svnweb.freebsd.org/changeset/base/360871

I don't suggest that you track this branch, as I will fix up the commits and force-push to it. This is because I want every commit that is buildable for Linux to be buildable for us, which is very convenient for bisecting problems. I'd appreciate it if people would test and report to me directly or by replying to this thread.
I plan to create a USB/ISO image that people could burn and test, but that will not happen for a few weeks.

This was tested on the same Intel systems as above with the addition of:
i915 on:
 - Silver N5000 CPU (Thanks to Austin Shafer)
 - i7-6700K (Skylake) (Thanks to jbeich@)
 - i7-8665U (Whiskeylake) (Thanks to db@)

The goal for next week is obviously to continue the update to 5.2 and then update to 5.3.
The goal is to get to 5.4 as quickly as possible and stabilise there in a branch, while going further on the master branch, as 5.4 is an LTS branch in Linux and receives a lot of fix updates.

Regards,

--
Emmanuel Vadot

From owner-freebsd-current@freebsd.org Sun May 10 22:32:50 2020
From: Eric van Gyzen
To: freebsd-current@freebsd.org
Subject: Re: ${COMPILER_VERSION} < 40300
Date: Sun, 10 May 2020 17:32:36 -0500
Message-ID: <65a954bf-22c3-c1be-5d42-b8bdc5b99023@vangyzen.net>
In-Reply-To: <2f15c981-8846-ddef-6593-dadc14933cc5@vangyzen.net>
References: <2f15c981-8846-ddef-6593-dadc14933cc5@vangyzen.net>

> If I were to clean up obsolete ${COMPILER_VERSION} tests in the tree,
> which ones should I keep?  I would probably confine it to head, so I
> could prune quite a few.

Thanks for the feedback, everyone.
If you're interested: https://reviews.freebsd.org/D24802

Eric

From owner-freebsd-current@freebsd.org Mon May 11 02:53:29 2020
From: Mark Millard
To: "vangyzen@freebsd.org", svn-src-head@freebsd.org, FreeBSD Current, FreeBSD Hackers, FreeBSD PowerPC ML
Cc: Brandon Bergren, Justin Hibbits
Subject: Re: svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311
Date: Sun, 10 May 2020 19:53:19 -0700
References: <8479DD58-44F6-446A-9CA5-D01F0F7C1B38@yahoo.com> <17ACDA02-D7EF-4F26-874A-BB3E935CD072@yahoo.com> <695E6836-F860-4557-B7DE-CC1EDB347F18@yahoo.com> <121B9B09-141B-4DC3-918B-1E7CFB99E779@yahoo.com> <8AAB0462-3FA8-490C-8D8D-7C15B1C9E2DE@yahoo.com> <18E62746-80DB-4195-977D-4FF32D0129EE@yahoo.com>

[A new kind of experiment and partial results.]

Given the zeroed memory page(s), which for some of the example contexts include a page that should not be changing after initialization in my context (jemalloc global variables), I have attempted the following for such examples:

A) Run gdb.
B) Attach to one of the live example processes.
C) Check that the page is not zeroed yet.
   (print/x __je_sz_size2index_tab)
D) Protect the page containing the start of __je_sz_size2index_tab, using 0x1 as the PROT_READ mask.
   (print (int)mprotect(ADDRESS,1,0x1))
E) Detach.

The hope was to discover which of the following was involved:

A) User-space code trying to write the page should get a SIGSEGV. In this case I'd likely be able to see what code was attempting the write.

B) Kernel code doing something odd to the content or mapping of memory would not (or need not) lead to SIGSEGV. In this case I'd be unlikely to see what code led to the zeros on the page.

So far I've gotten only one failure example, nfsd during its handling of a SIGUSR1. Previous nfs mounts and dismounts worked fine, not asserting, indicating that at the time the page was not zeroed.

I got no evidence of SIGSEGV from an attempted user space write to the page. But the nfsd.core shows the page as zeroed and the assert having caused abort(). That suggests the kernel side of things for what leads to the zeros.

It turns out that just before the "unregistration()" activity is "killchildren()" activity:

(gdb) list
971
972     static void
973     nfsd_exit(int status)
974     {
975             killchildren();
976             unregistration();
977             exit(status);
978     }

(frame #12) used via:

(gdb) list cleanup
954     /*
955      * Cleanup master after SIGUSR1.
956      */
957     static void
958     cleanup(__unused int signo)
959     {
960             nfsd_exit(0);
961     }

. . . and (for master):

        (void)signal(SIGUSR1, cleanup);

This suggests the possibility that the zero'd pages could be associated with killing the child processes.
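
[Editorial illustration, not part of the original message.] Hypothesis A above relies on a user-space write to a PROT_READ page being trapped with a signal. The following is only a minimal sketch of that behaviour, assuming a Unix-like system where Python's ctypes can reach mprotect through CDLL(None); it uses Python in place of the gdb call from the message, and the function name write_to_readonly_page_signal is hypothetical. It protects a scratch page, not __je_sz_size2index_tab.

```python
import ctypes
import mmap
import os
import signal


def write_to_readonly_page_signal():
    """Return the signal that kills a child writing to a read-only page (0 if none)."""
    # Assumption: the C library of the running process exports mprotect.
    libc = ctypes.CDLL(None, use_errno=True)
    libc.mprotect.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int]
    libc.mprotect.restype = ctypes.c_int

    page_size = mmap.PAGESIZE
    region = mmap.mmap(-1, page_size)        # anonymous, page-aligned, read/write
    region[:4] = b"\x01\x02\x03\x04"         # stand-in for initialized table data
    anchor = ctypes.c_char.from_buffer(region)
    address = ctypes.addressof(anchor)

    # Same idea as step D: leave the page readable but not writable.
    if libc.mprotect(address, page_size, mmap.PROT_READ) != 0:
        raise OSError(ctypes.get_errno(), "mprotect failed")

    pid = os.fork()
    if pid == 0:
        # Child: a user-space write to the protected page should fault here.
        ctypes.memset(address, 0, 1)
        os._exit(0)                          # reached only if the write did not fault

    _, status = os.waitpid(pid, 0)

    # Parent: restore write access and release the mapping.
    libc.mprotect(address, page_size, mmap.PROT_READ | mmap.PROT_WRITE)
    del anchor
    region.close()

    return os.WTERMSIG(status) if os.WIFSIGNALED(status) else 0


if __name__ == "__main__":
    print("child terminated by signal", write_to_readonly_page_signal())
```

If the function reports SIGSEGV (or SIGBUS on platforms that use it for protection faults), the user-space write was trapped, as hypothesis A expects. A protected page that ends up zeroed without any such signal would not fit hypothesis A, which is the reasoning the message follows.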
(I've had a past aarch64 context where forking had problems with pages that were initially common to parent and child processes. In that context having the processes swap out [not just mostly paged out] and then swap back in was involved in showing the problem. The issue was fixed and was aarch64 specific. But it leaves me willing to consider fork-related memory management as possibly odd in some way for 32-bit powerpc.)

Notes . . .

Another possible kind of evidence:

I've gone far longer with the machine doing just normal background processing with nothing failing on its own. This suggests that the (int)mprotect(ADDRESS,1,0x1) might be changing the context --or just doing the attach and detach in gdb does. I've nothing solid in this area so I'll ignore it, other than this note.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)

From owner-freebsd-current@freebsd.org Mon May 11 21:33:09 2020
From: Pete Wright
To: Filippo Moretti, FreeBSD Current
Subject: Re: Xorg question
Date: Mon, 11 May 2020 14:33:00 -0700
In-Reply-To: <324127642.580017.1589016944774@mail.yahoo.com>
References: <324127642.580017.1589016944774.ref@mail.yahoo.com> <324127642.580017.1589016944774@mail.yahoo.com>

On 5/9/20 2:35 AM, Filippo Moretti wrote:
> I run the latest current and I have the following packages installed
> xf86-input-keyboard-1.9.0_4
> xf86-input-libinput-0.28.2_1
> xf86-input-mouse-1.9.3_3
> Should I keep all of them or may I keep xf86-input-libinput

I don't think there is any harm in having all three of those packages installed. In fact I have all three on my system, but allow Xorg to auto configure itself, which picks up libinput by default.

-p

--
Pete Wright
pete@nomadlogic.org
@nomadlogicLA

From owner-freebsd-current@freebsd.org Mon May 11 22:16:17 2020
From: Pete Wright
To: FreeBSD Current
Subject: lockups on lenovo p43s under current
Date: Mon, 11 May 2020 15:16:14 -0700

hello,
i have a lenovo thinkpad P43s that exhibits lockups under CURRENT but behaves fine when running STABLE. i've tried to find a fully reproducible situation to get this system to lock up but haven't found anything yet.
i am starting to suspect that the changes implemented in this review may be the issue though:

https://reviews.freebsd.org/D23728

my reasoning is that i've observed issues:
- when removing AC power from the laptop, or inserting AC power
- when the system display has gone to sleep
- randomly hanging during boot with this as last line:
  battery0: battery enitialization start

unfortunately, while the above seem to be cases where this has happened, i haven't been able to reproduce it 100% of the time yet.

so my first question is - would it be possible to just revert the changes in that diff, or has too much time gone past to just back out that single change? alternatively, is there any debugging information i can get on my end that might help figure out what the root cause is?

cheers,
-pete

--
Pete Wright
pete@nomadlogic.org
@nomadlogicLA

From owner-freebsd-current@freebsd.org Mon May 11 22:28:08 2020
From: Yuri Pankov
To: Pete Wright, FreeBSD Current
Subject: Re: lockups on lenovo p43s under current
Date: Tue, 12 May 2020 01:28:04 +0300
Message-ID: <7cd71bcc-5d3c-594f-9c06-3aea48aedc63@fastmail.com>

Pete Wright wrote:
> hello,
> i have a lenovo thinkpad P43s that exhibits lockups under CURRENT but
> behaves fine when running STABLE.  i've tried to find a fully
> reproducible situation to get this system to lockup but haven't found
> anything yet.  i am starting to suspect that the changes implemented in
> this review may be the issue though:
>
> https://reviews.freebsd.org/D23728
>
> my reasoning is that i've observed issues when:
> - removing AC power from the laptop, or inserting AC power
> - when the system display has gone to sleep
> - randomly hanging during boot with this as last line:
> battery0: battery enitialization start
>
> unfortunately while the above seem to be cases where this has happened i
> haven't been able to %100 reproduce yet.
>
> so my first question is - would it be possible to just revert the
> changes in that diff, or has too much time gone past to just back out
> that single change.  alternatively, is there any debugging information i
> can get on my end that might help figure out what the root cause is?
Not really what you are asking, but it's possible to disable ACPI subdevices, so you could check if disabling cmbat completely helps and it's indeed the suspect: debug.acpi.disabled="cmbat" From owner-freebsd-current@freebsd.org Mon May 11 23:21:57 2020 Return-Path: Delivered-To: freebsd-current@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id CA8572DA710 for ; Mon, 11 May 2020 23:21:57 +0000 (UTC) (envelope-from pete@nomadlogic.org) Received: from mail.nomadlogic.org (mail.nomadlogic.org [174.136.98.114]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "mail.nomadlogic.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 49LcQX4lM1z4NCb for ; Mon, 11 May 2020 23:21:56 +0000 (UTC) (envelope-from pete@nomadlogic.org) Received: from [192.168.1.159] (cpe-23-243-161-111.socal.res.rr.com [23.243.161.111]) by mail.nomadlogic.org (OpenSMTPD) with ESMTPSA id 41c9c03d (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Mon, 11 May 2020 23:21:54 +0000 (UTC) Subject: Re: lockups on lenovo p43s under current To: Yuri Pankov , FreeBSD Current References: <7cd71bcc-5d3c-594f-9c06-3aea48aedc63@fastmail.com> From: Pete Wright Message-ID: Date: Mon, 11 May 2020 16:21:54 -0700 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:68.0) Gecko/20100101 Thunderbird/68.8.0 MIME-Version: 1.0 In-Reply-To: <7cd71bcc-5d3c-594f-9c06-3aea48aedc63@fastmail.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Content-Language: en-US X-Rspamd-Queue-Id: 49LcQX4lM1z4NCb X-Spamd-Bar: ----- Authentication-Results: mx1.freebsd.org; dkim=none; dmarc=none; spf=pass (mx1.freebsd.org: domain of pete@nomadlogic.org designates 174.136.98.114 as permitted sender) smtp.mailfrom=pete@nomadlogic.org X-Spamd-Result: 
default: False [-5.10 / 15.00]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; RECEIVED_SPAMHAUS_PBL(0.00)[111.161.243.23.khpj7ygk5idzvmvt5x4ziurxhy.zen.dq.spamhaus.net : 127.0.0.10]; FROM_HAS_DN(0.00)[]; R_SPF_ALLOW(-0.20)[+mx]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; MIME_GOOD(-0.10)[text/plain]; DMARC_NA(0.00)[nomadlogic.org]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; TO_MATCH_ENVRCPT_SOME(0.00)[]; TO_DN_ALL(0.00)[]; RCPT_COUNT_TWO(0.00)[2]; IP_SCORE(-2.80)[ip: (-9.28), ipnet: 174.136.96.0/20(-4.25), asn: 25795(-0.44), country: US(-0.05)]; FREEMAIL_TO(0.00)[fastmail.com]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:25795, ipnet:174.136.96.0/20, country:US]; MID_RHS_MATCH_FROM(0.00)[]; RCVD_TLS_ALL(0.00)[]; RCVD_COUNT_TWO(0.00)[2] X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.33 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 11 May 2020 23:21:57 -0000 On 5/11/20 3:28 PM, Yuri Pankov wrote: > Pete Wright wrote: >> hello, >> i have a lenovo thinkpad P43s that exhibits lockups under CURRENT but >> behaves fine when running STABLE.  i've tried to find a fully >> reproducible situation to get this system to lockup but haven't found >> anything yet.  i am starting to suspect that the changes implemented >> in this review may be the issue though: >> >> https://reviews.freebsd.org/D23728 >> >> my reasoning is that i've observed issues when: >> - removing AC power from the laptop, or inserting AC power >> - when the system display has gone to sleep >> - randomly hanging during boot with this as last line: >> battery0: battery enitialization start >> >> unfortunately while the above seem to be cases where this has >> happened i haven't been able to %100 reproduce yet. 
>> >> so my first question is - would it be possible to just revert the >> changes in that diff, or has too much time gone past to just back out >> that single change.  alternatively, is there any debugging >> information i can get on my end that might help figure out what the >> root cause is? > > Not really what you are asking, but it's possible to disable ACPI > subdevices, so you could check if disabling cmbat completely helps and > it's indeed the suspect: > > debug.acpi.disabled="cmbat" Thanks Yuri, So I was able to boot my system once via batter with this set, but unfortunately it crashed after I tried to suspend/resume.  Realizing that was a bit optimistic I attempted to reboot the system and wasn't able to get it to fully boot after several attempts. I believe what the next step at this point is checkout the code right before this commit and see if I can get it to successfully boot.  I'll report back if I find anything after that test. -pete -- Pete Wright pete@nomadlogic.org @nomadlogicLA From owner-freebsd-current@freebsd.org Tue May 12 22:10:01 2020 Return-Path: Delivered-To: freebsd-current@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 7F3A72DD7D6 for ; Tue, 12 May 2020 22:10:01 +0000 (UTC) (envelope-from pete@nomadlogic.org) Received: from mail.nomadlogic.org (mail.nomadlogic.org [174.136.98.114]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "mail.nomadlogic.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 49MBn42kWwz42rV for ; Tue, 12 May 2020 22:09:59 +0000 (UTC) (envelope-from pete@nomadlogic.org) Received: from [192.168.1.160] (cpe-23-243-161-111.socal.res.rr.com [23.243.161.111]) by mail.nomadlogic.org (OpenSMTPD) with ESMTPSA id 885a5784 
(TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Tue, 12 May 2020 22:09:53 +0000 (UTC) Subject: Re: lockups on lenovo p43s under current From: Pete Wright To: Yuri Pankov , FreeBSD Current References: <7cd71bcc-5d3c-594f-9c06-3aea48aedc63@fastmail.com> Message-ID: Date: Tue, 12 May 2020 15:09:53 -0700 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:68.0) Gecko/20100101 Thunderbird/68.8.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Content-Language: en-US X-Rspamd-Queue-Id: 49MBn42kWwz42rV X-Spamd-Bar: ----- Authentication-Results: mx1.freebsd.org; dkim=none; dmarc=none; spf=pass (mx1.freebsd.org: domain of pete@nomadlogic.org designates 174.136.98.114 as permitted sender) smtp.mailfrom=pete@nomadlogic.org X-Spamd-Result: default: False [-5.11 / 15.00]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; RECEIVED_SPAMHAUS_PBL(0.00)[111.161.243.23.khpj7ygk5idzvmvt5x4ziurxhy.zen.dq.spamhaus.net : 127.0.0.10]; FROM_HAS_DN(0.00)[]; R_SPF_ALLOW(-0.20)[+mx]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; MIME_GOOD(-0.10)[text/plain]; DMARC_NA(0.00)[nomadlogic.org]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; TO_MATCH_ENVRCPT_SOME(0.00)[]; TO_DN_ALL(0.00)[]; RCPT_COUNT_TWO(0.00)[2]; IP_SCORE(-2.81)[ip: (-9.29), ipnet: 174.136.96.0/20(-4.27), asn: 25795(-0.45), country: US(-0.05)]; FREEMAIL_TO(0.00)[fastmail.com]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:25795, ipnet:174.136.96.0/20, country:US]; MID_RHS_MATCH_FROM(0.00)[]; RCVD_TLS_ALL(0.00)[]; RCVD_COUNT_TWO(0.00)[2] X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.33 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 12 May 2020 22:10:01 -0000 On 5/11/20 4:21 PM, Pete Wright wrote: > > > On 5/11/20 3:28 PM, Yuri Pankov wrote: >> Pete Wright wrote: >>> hello, >>> i have a lenovo thinkpad P43s that exhibits 
lockups under CURRENT
>>> but behaves fine when running STABLE.  i've tried to find a fully
>>> reproducible situation to get this system to lockup but haven't
>>> found anything yet.  i am starting to suspect that the changes
>>> implemented in this review may be the issue though:
>>>
>>> https://reviews.freebsd.org/D23728
>>>
>>> my reasoning is that i've observed issues when:
>>> - removing AC power from the laptop, or inserting AC power
>>> - when the system display has gone to sleep
>>> - randomly hanging during boot with this as last line:
>>> battery0: battery enitialization start
>>>
>>> unfortunately while the above seem to be cases where this has
>>> happened i haven't been able to 100% reproduce yet.
>>>
>>> so my first question is - would it be possible to just revert the
>>> changes in that diff, or has too much time gone past to just back
>>> out that single change.  alternatively, is there any debugging
>>> information i can get on my end that might help figure out what the
>>> root cause is?
>>
>> Not really what you are asking, but it's possible to disable ACPI
>> subdevices, so you could check if disabling cmbat completely helps
>> and it's indeed the suspect:
>>
>> debug.acpi.disabled="cmbat"
>
> Thanks Yuri,
> So I was able to boot my system once via battery with this set, but
> unfortunately it crashed after I tried to suspend/resume. Realizing
> that was a bit optimistic I attempted to reboot the system and wasn't
> able to get it to fully boot after several attempts.
>
> I believe the next step at this point is to check out the code right
> before this commit and see if I can get it to successfully boot.  I'll
> report back if I find anything after that test.
>

To follow up on this, I believe the updates in the above review may be the culprit.  I built a memstick.img from the commit right before the changes in D23728 were merged.  Running this image, I can boot my system and disconnect and reconnect AC power without any issues.
i then booted from a memstick using the latest snapshot of current. i can disconnect AC power without issues, but reconnecting hangs the system immediately. i've tested this a couple times and it seems pretty reproducible, not sure what the best next step would be though.  would someone here be willing to help me debug this, or would it be best to file a PR along with a dmesg and output from acpiconf? cheers! -pete -- Pete Wright pete@nomadlogic.org @nomadlogicLA From owner-freebsd-current@freebsd.org Tue May 12 22:14:13 2020 Return-Path: Delivered-To: freebsd-current@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 172572DDE0A for ; Tue, 12 May 2020 22:14:13 +0000 (UTC) (envelope-from bdrewery@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2610:1c1:1:6074::16:84]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "freefall.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 49MBsw6p5rz43Jq for ; Tue, 12 May 2020 22:14:12 +0000 (UTC) (envelope-from bdrewery@FreeBSD.org) Received: from mail.xzibition.com (unknown [127.0.1.132]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by freefall.freebsd.org (Postfix) with ESMTPS id 56A285D0D for ; Tue, 12 May 2020 22:14:12 +0000 (UTC) (envelope-from bdrewery@FreeBSD.org) Received: from mail.xzibition.com (localhost [172.31.3.2]) by mail.xzibition.com (Postfix) with ESMTP id 777011AF72 for ; Tue, 12 May 2020 22:14:11 +0000 (UTC) X-Virus-Scanned: amavisd-new at mail.xzibition.com Received: from mail.xzibition.com ([172.31.3.2]) by mail.xzibition.com (mail.xzibition.com [172.31.3.2]) 
(amavisd-new, port 10026) with LMTP id UaqklSpjPevP for ; Tue, 12 May 2020 22:14:01 +0000 (UTC) To: freebsd-current@FreeBSD.org DKIM-Filter: OpenDKIM Filter v2.10.3 mail.xzibition.com 42F111AF68 From: Bryan Drewery Autocrypt: addr=bdrewery@FreeBSD.org; prefer-encrypt=mutual; keydata= mQENBFJphmsBCADiFgmS4bIzwZijrS31SjEMzg+n5zNellgM+HkShwehpqCiyhXdWrvH6dTZ a6u50pbUIX7doTR7W7PQHCjCTqtpwvcj0eulZva+iHFp+XrbgSFHn+VVXgkYP2MFySyZRFab D2qqzJBEJofhpv4HvY6uQI5K99pMqKr1Z/lHqsijYYu4RH2OfwB5PinId7xeldzWEonVoCr+ rfxzO/UrgA6v/3layGZcKNHFjmc3NqoN1DXtdaEHqtjIozzbndVkH6lkFvIpIrI6i5ox8pwp VxsxLCr/4Musd5CWgHiet5kSw2SzNeA8FbxdLYCpXNVu+uBACEbCUP+CSNy3NVfEUxsBABEB AAG0JEJyeWFuIERyZXdlcnkgPGJkcmV3ZXJ5QEZyZWVCU0Qub3JnPokBVwQTAQoAQQIbAwUL CQgHAwUVCgkICwUWAwIBAAIeAQIXgAIZARYhBPkXPLLDqup6XIofCTXXcbtuRpfPBQJb5hLu BQkNPvODAAoJEDXXcbtuRpfP9rMH/3f7cfX5rzyEV5QNfV/wS4jFukLoPZ4+nCM/TKxH3pEX 2bLbeQbkk6La8cueQ5Lpoht5XFZ18Y5TbMittngltrlNzoDD0h9are24OkDFGim3afJU7tkj IGQa1if+re+vI5BhzYwRhj0oKXzBi39M5oePd3L1dXfx83rg2FPyZBdIejsz6fR74T3JVkbd 6k2l5/3Zk2uiNMy+eBfDRgYE1E6CP28kV0nCeGKZgSVso0kGUUHud7voKqGVpMvbd0mE4pp4 PE5YJaFPjrll9miaDAvdU8LGIq5n6+aXPLKoQ/QNl6mg6ifgI6FfKILOkTizLW8E5PBSNnCm NapQ55yjm125AQ0EUmmGawEIAKJUU9+Q19oW1RK5jTf3m56j+szIc8Y9DaLC8REUKl4UZJBK BqCl6c0cukVApOD92XoU6hJPm2rLEyp/IcYcPPNTnVu8D8h9oag2L8EiFN7+2hk0xG+lwjc8 uOIZycme7AIJsBU4AZ1v63lxm2k104hwpiatgbe71GIGl7p1MX6ousP/wGzXCOF25Dx9w02C eRe7zEMfhnFjSUhzdCC9han2+KaVB7qIqNR3b8NfbwRNlwPmHqlhXffUow9OsQjSnTK8WKNR lx7xzVccXIvWP2wECFrmqmzMmXpSrmIuiWEpFwZ9x2a0Pva8dCNRiCVTK51IlRXKjaAxiN1u IUrMm6UAEQEAAYkBPAQYAQoAJgIbDBYhBPkXPLLDqup6XIofCTXXcbtuRpfPBQJb5hL4BQkN PvONAAoJEDXXcbtuRpfPCjcH/ivBsOpdpebpgLizSNU5/X4yWN5Aixsc9VBnQhGKAKnMINJQ VMpA55sD2JSPwloXYM/B3qyPJRS/9cwIuX5LDNKKOZU3Qp+TzleynM15/xea14orWYRGRict YHBM3Cnqp7OD8K6Q1uhs0fTxyJP7PZ/G0+7Corlf1DlHhDt6C2HldRPFvAvAgl6sR9Wzgcb7 rzub2HVtbJgl6YHbgyAG7x9NpXFqzx1JLAMdpt2DIYwoi+oMdRQlBIwNuKjQjCGzuXHandd3 kGvBAsyJpQ+coEep9UzwANaV28cXrFr2R4FSOcR50rBA2Nh/vqUYfpsvBvJlwuKAoV1djVHa ihNeL5E= Organization: FreeBSD 
Subject: zfs deadlock on r360452 relating to busy vm page Message-ID: <2bdc8563-283b-32cc-8a1a-85ff52aca99e@FreeBSD.org> Date: Tue, 12 May 2020 15:13:55 -0700 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101 Thunderbird/68.8.0 MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="aIvlquH7auoI17NOayHMofyL3SITm4BOn" X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.33 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 12 May 2020 22:14:13 -0000 This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --aIvlquH7auoI17NOayHMofyL3SITm4BOn Content-Type: multipart/mixed; boundary="XElGlYQBYiLZJf7vq6TsZf2eItNVgQjfg"; protected-headers="v1" From: Bryan Drewery To: freebsd-current@FreeBSD.org Message-ID: <2bdc8563-283b-32cc-8a1a-85ff52aca99e@FreeBSD.org> Subject: zfs deadlock on r360452 relating to busy vm page --XElGlYQBYiLZJf7vq6TsZf2eItNVgQjfg Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: quoted-printable

> panic: deadlres_td_sleep_q: possible deadlock detected for 0xfffffe25eefa2e00 (find), blocked for 1802392 ticks

2 stuck processes from procstat -kk before panic
> 72559 101698 tail - mi_switch+0x155 sleepq_switch+0x11a _cv_wait+0x15a rangelock_enter+0x306 zfs_freebsd_getpages+0x14f VOP_GETPAGES_APV+0x59 vnode_pager_getpages+0x37 vm_pager_get_pages+0x4f vm_fault+0x780 vm_fault_trap+0x6e trap_pfault+0x1ee
> 72985 107378 find - mi_switch+0x155 sleepq_switch+0x11a sleeplk+0x106 lockmgr_slock_hard+0x1f5 VOP_LOCK1_APV+0x40 _vn_lock+0x54 lookup+0xdd namei+0x524 vn_open_cred+0x32b kern_openat+0x1fa filemon_wrapper_openat+0x15 amd64_syscall+0x73d

The only find running was thread 107378

I couldn't record much from ddb but got locked vnodes.
>
> db> show lockedvnods
> Locked vnodes
> vnode 0xfffff804de66e500: type VDIR
> usecount 3, writecount 0, refcount 2 mountedhere 0
> flags ()
> v_object 0xfffff809459cb420 ref 0 pages 0 cleanbuf 0 dirtybuf 0
> lock type zfs: SHARED (count 1)
> #0 0xffffffff80b94a0f at lockmgr_slock+0xdf
> #1 0xffffffff810e2a40 at VOP_LOCK1_APV+0x40
> #2 0xffffffff80cb14f4 at _vn_lock+0x54
> #3 0xffffffff80c9b3ec at vget_finish+0x6c
> #4 0xffffffff80c8051c at cache_lookup+0x57c
> #5 0xffffffff80c84dad at vfs_cache_lookup+0x7d
> #6 0xffffffff810df996 at VOP_LOOKUP_APV+0x56
> #7 0xffffffff80c8ee61 at lookup+0x601
> #8 0xffffffff80c8e374 at namei+0x524
> #9 0xffffffff80caa83f at kern_statat+0x7f
> #10 0xffffffff80caafff at sys_fstatat+0x2f
> #11 0xffffffff81065c40 at amd64_syscall+0x140
> #12 0xffffffff8103b2a0 at fast_syscall_common+0x101
> vnode 0xfffff808a08f0a00: type VDIR
> usecount 6, writecount 0, refcount 2 mountedhere 0
> flags ()
> v_object 0xfffff801eb930000 ref 0 pages 0 cleanbuf 0 dirtybuf 0
> lock type zfs: EXCL by thread 0xfffffe24aadb6100 (pid 72267, gmake, tid 104356)
> with shared waiters pending
> #0 0xffffffff80b94a0f at lockmgr_slock+0xdf
> #1 0xffffffff810e2a40 at VOP_LOCK1_APV+0x40
> #2 0xffffffff80cb14f4 at _vn_lock+0x54
> #3 0xffffffff80c8e93d at lookup+0xdd
> #4 0xffffffff80c8e374 at namei+0x524
> #5 0xffffffff80ca9e69 at kern_funlinkat+0xa9
> #6 0xffffffff80ca9db8 at sys_unlink+0x28
> #7 0xffffffff82780586 at filemon_wrapper_unlink+0x16
> #8 0xffffffff8106623d at amd64_syscall+0x73d
> #9 0xffffffff8103b2a0 at fast_syscall_common+0x101
>
> vnode 0xfffff80571f29500: type VREG
> usecount 6, writecount 1, refcount 2
> flags ()
> v_object 0xfffff806cb637c60 ref 2 pages 1 cleanbuf 0 dirtybuf 0
> lock type zfs: SHARED (count 2)
> with exclusive waiters pending
> #0 0xffffffff80b94a0f at lockmgr_slock+0xdf
> #1 0xffffffff810e2a40 at VOP_LOCK1_APV+0x40
> #2 0xffffffff80cb14f4 at _vn_lock+0x54
> #3 0xffffffff8243af40 at zfs_lookup+0x610
> #4 0xffffffff8243b61e
at zfs_freebsd_cachedlookup+0x8e
> #5 0xffffffff810dfb46 at VOP_CACHEDLOOKUP_APV+0x56
> #6 0xffffffff80c84dd8 at vfs_cache_lookup+0xa8
> #7 0xffffffff810df996 at VOP_LOOKUP_APV+0x56
> #8 0xffffffff80c8ee61 at lookup+0x601
> #9 0xffffffff80c8e374 at namei+0x524
> #10 0xffffffff80caa83f at kern_statat+0x7f
> #11 0xffffffff80caafff at sys_fstatat+0x2f
> #12 0xffffffff8106623d at amd64_syscall+0x73d
> #13 0xffffffff8103b2a0 at fast_syscall_common+0x101

It's nice how recent threads are at the top in gdb...

> (kgdb) info threads
> Id Target Id Frame
> 1 Thread 107952 (PID=79390: zfs) sched_switch (td=0xfffffe26ebb36000, flags=) at /usr/src/sys/kern/sched_ule.c:2147
> 2 Thread 102764 (PID=73218: zfs) sched_switch (td=0xfffffe2490a12300, flags=) at /usr/src/sys/kern/sched_ule.c:2147
> 3 Thread 107378 (PID=72985: find) sched_switch (td=0xfffffe25eefa2e00, flags=) at /usr/src/sys/kern/sched_ule.c:2147
> 4 Thread 103940 (PID=72980: rm) sched_switch (td=0xfffffe2451932500, flags=) at /usr/src/sys/kern/sched_ule.c:2147
> 5 Thread 101698 (PID=72559: tail) sched_switch (td=0xfffffe255eac0000, flags=) at /usr/src/sys/kern/sched_ule.c:2147
> 6 Thread 103660 (PID=72280: timestamp) sched_switch (td=0xfffffe25f948aa00, flags=) at /usr/src/sys/kern/sched_ule.c:2147
> 7 Thread 101249 (PID=72280: timestamp/prefix_stdout) sched_switch (td=0xfffffe264412a100, flags=) at /usr/src/sys/kern/sched_ule.c:2147
> 8 Thread 101255 (PID=72280: timestamp/prefix_stderr) sched_switch (td=0xfffffe25c8e9bc00, flags=) at /usr/src/sys/kern/sched_ule.c:2147
> 9 Thread 104356 (PID=72267: gmake) sched_switch (td=0xfffffe24aadb6100, flags=) at /usr/src/sys/kern/sched_ule.c:2147
> 10 Thread 108476 (PID=66957: vim) sched_switch (td=0xfffffe26c8601500, flags=) at /usr/src/sys/kern/sched_ule.c:2147

The 2 threads holding shared lock on vnode 0xfffff80571f29500:

The tail thread (101698) is waiting for a zfs rangelock
getting pages for vnode 0xfffff80571f29500 > (kgdb) thread 5 > [Switching to thread 5 (Thread 101698)] > #0 sched_switch (td=3D0xfffffe255eac0000, flags=3D) at = /usr/src/sys/kern/sched_ule.c:2147 > 2147 cpuid =3D td->td_oncpu =3D PCPU_GET(cpuid); > (kgdb) backtrace > #0 sched_switch (td=3D0xfffffe255eac0000, flags=3D) at = /usr/src/sys/kern/sched_ule.c:2147 > #1 0xffffffff80bce615 in mi_switch (flags=3D260) at /usr/src/sys/kern/= kern_synch.c:542 > #2 0xffffffff80c1cfea in sleepq_switch (wchan=3D0xfffff810fb57dd48, pr= i=3D0) at /usr/src/sys/kern/subr_sleepqueue.c:625 > #3 0xffffffff80b57f0a in _cv_wait (cvp=3D0xfffff810fb57dd48, lock=3D0x= fffff80049a99040) at /usr/src/sys/kern/kern_condvar.c:146 > #4 0xffffffff82434ab6 in rangelock_enter_reader (rl=3D0xfffff80049a990= 18, new=3D0xfffff8022cadb100) at /usr/src/sys/cddl/contrib/opensolaris/ut= s/common/fs/zfs/zfs_rlock.c:429 > #5 rangelock_enter (rl=3D0xfffff80049a99018, off=3D, le= n=3D, type=3D) at /usr/src/sys/cddl/contrib= /opensolaris/uts/common/fs/zfs/zfs_rlock.c:477 > #6 0xffffffff82443d3f in zfs_getpages (vp=3D, ma=3D0xff= fffe259f204b18, count=3D, rbehind=3D0xfffffe259f204ac4, ra= head=3D0xfffffe259f204ad0) at /usr/src/sys/cddl/contrib/opensolaris/uts/c= ommon/fs/zfs/zfs_vnops.c:4500 > #7 zfs_freebsd_getpages (ap=3D) at /usr/src/sys/cddl/co= ntrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4567 > #8 0xffffffff810e3ab9 in VOP_GETPAGES_APV (vop=3D0xffffffff8250a1e0 , a=3D0xfffffe259f2049f0) at vnode_if.c:2644 > #9 0xffffffff80f349e7 in VOP_GETPAGES (vp=3D, m=3D, count=3D, rbehind=3D, rahead=3D) at ./vnode_if.h:1171 > #10 vnode_pager_getpages (object=3D, m=3D= , count=3D, rbehind=3D, rahead=3D)= at /usr/src/sys/vm/vnode_pager.c:743 > #11 0xffffffff80f2a93f in vm_pager_get_pages (object=3D0xfffff806cb637c= 60, m=3D0xfffffe259f204b18, count=3D1, rbehind=3D, rahead=3D= ) at /usr/src/sys/vm/vm_pager.c:305 > #12 0xffffffff80f054b0 in vm_fault_getpages (fs=3D, nera= =3D0, behindp=3D, aheadp=3D) at /usr/src/sy= 
s/vm/vm_fault.c:1163 > #13 vm_fault (map=3D, vaddr=3D, fault_typ= e=3D, fault_flags=3D, m_hold=3D) at /usr/src/sys/vm/vm_fault.c:1394 > #14 0xffffffff80f04bde in vm_fault_trap (map=3D0xfffffe25653949e8, vadd= r=3D, fault_type=3D, fault_flags=3D0, signo= =3D0xfffffe259f204d04, ucode=3D0xfffffe259f204d00) at /usr/src/sys/vm/vm_= fault.c:589 > #15 0xffffffff8106544e in trap_pfault (frame=3D0xfffffe259f204d40, user= mode=3D, signo=3D, ucode=3D) a= t /usr/src/sys/amd64/amd64/trap.c:821 > #16 0xffffffff81064a9c in trap (frame=3D0xfffffe259f204d40) at /usr/src= /sys/amd64/amd64/trap.c:340 > #17 > #18 0x00000000002034fc in ?? () > (kgdb) frame 11 > #11 0xffffffff80f2a93f in vm_pager_get_pages (object=3D0xfffff806cb637c= 60, m=3D0xfffffe259f204b18, count=3D1, rbehind=3D, rahead=3D= ) at /usr/src/sys/vm/vm_pager.c:305 > 305 r =3D (*pagertab[object->type]->pgo_getpages)(object, m= , count, rbehind, > (kgdb) p *object > $10 =3D {lock =3D {lock_object =3D {lo_name =3D 0xffffffff8114fa30 "vm = object", lo_flags =3D 627245056, lo_data =3D 0, lo_witness =3D 0x0}, rw_l= ock =3D 1}, object_list =3D {tqe_next =3D 0xfffff806cb637d68, tqe_prev =3D= 0xfffff806cb637b78}, shadow_head =3D {lh_first =3D 0x0}, shadow_list =3D= {le_next =3D 0xffffffffffffffff, > le_prev =3D 0xffffffffffffffff}, memq =3D {tqh_first =3D 0xfffffe00= 1cbca850, tqh_last =3D 0xfffffe001cbca860}, rtree =3D {rt_root =3D 184467= 41875168421969}, size =3D 1099, domain =3D {dr_policy =3D 0x0, dr_iter =3D= 0}, generation =3D 1, cleangeneration =3D 1, ref_count =3D 2, shadow_cou= nt =3D 0, memattr =3D 6 '\006', type =3D 2 '\002', > flags =3D 4096, pg_color =3D 0, paging_in_progress =3D {__count =3D 2= }, busy =3D {__count =3D 0}, resident_page_count =3D 1, backing_object =3D= 0x0, backing_object_offset =3D 0, pager_object_list =3D {tqe_next =3D 0x= 0, tqe_prev =3D 0x0}, rvq =3D {lh_first =3D 0x0}, handle =3D 0xfffff80571= f29500, un_pager =3D {vnp =3D {vnp_size =3D 4499568, > writemappings =3D 0}, devp =3D {devp_pglist 
=3D {tqh_first =3D 0x= 44a870, tqh_last =3D 0x0}, ops =3D 0x0, dev =3D 0x0}, sgp =3D {sgp_pglist= =3D {tqh_first =3D 0x44a870, tqh_last =3D 0x0}}, swp =3D {swp_tmpfs =3D = 0x44a870, swp_blks =3D {pt_root =3D 0}, writemappings =3D 0}}, cred =3D 0= x0, charge =3D 0, umtx_data =3D 0x0} > (kgdb) p object->handle > $11 =3D (void *) 0xfffff80571f29500 > (kgdb) p *(struct vnode *) 0xfffff80571f29500 > $18 =3D {v_type =3D VREG, v_irflag =3D 0, v_op =3D 0xffffffff8250a1e0 <= zfs_vnodeops>, v_data =3D 0xfffff80049a99000, v_mount =3D 0xfffffe247e5f3= 700, v_nmntvnodes =3D {tqe_next =3D 0xfffff8086eb38a00, tqe_prev =3D 0xff= fff80461c2d7a0}, {v_mountedhere =3D 0x0, v_unpcb =3D 0x0, v_rdev =3D 0x0,= v_fifoinfo =3D 0x0}, v_hashlist =3D {le_next =3D 0x0, > le_prev =3D 0x0}, v_cache_src =3D {lh_first =3D 0x0}, v_cache_dst =3D= {tqh_first =3D 0x0, tqh_last =3D 0xfffff80571f29550}, v_cache_dd =3D 0x0= , v_lock =3D {lock_object =3D {lo_name =3D 0xffffffff82486a37 "zfs", lo_f= lags =3D 117112832, lo_data =3D 0, lo_witness =3D 0x0}, lk_lock =3D 37, l= k_exslpfail =3D 0, lk_timo =3D 51, lk_pri =3D 96, > lk_stack =3D {depth =3D 14, pcs =3D {18446744071574211087, 18446744= 071579773504, 18446744071575377140, 18446744071600058176, 184467440716000= 59934, 18446744071579761478, 18446744071575195096, 18446744071579761046, = 18446744071575236193, 18446744071575233396, 18446744071575349311, 1844674= 4071575351295, > 18446744071579263549, 18446744071579087520, 0, 0, 0, 0}}}, v_in= terlock =3D {lock_object =3D {lo_name =3D 0xffffffff8123c142 "vnode inter= lock", lo_flags =3D 16973824, lo_data =3D 0, lo_witness =3D 0xfffff8123fd= 73600}, mtx_lock =3D 0}, v_vnlock =3D 0xfffff80571f29568, v_vnodelist =3D= {tqe_next =3D 0xfffff8064bd0dc80, > tqe_prev =3D 0xfffff80e250788d8}, v_lazylist =3D {tqe_next =3D 0x0,= tqe_prev =3D 0x0}, v_bufobj =3D {bo_lock =3D {lock_object =3D {lo_name =3D= 0xffffffff811fb7ab "bufobj interlock", lo_flags =3D 86179840, lo_data =3D= 0, lo_witness =3D 0xfffff8123fd7dd80}, 
rw_lock = 1}, bo_ops = 0xffffffff8191ead0 ,
> bo_object = 0xfffff806cb637c60, bo_synclist = {le_next = 0x0, le_prev = 0x0}, bo_private = 0xfffff80571f29500, bo_clean = {bv_hd = {tqh_first = 0x0, tqh_last = 0xfffff80571f296c0}, bv_root = {pt_root = 0}, bv_cnt = 0}, bo_dirty = {bv_hd = {tqh_first = 0x0, tqh_last = 0xfffff80571f296e0}, bv_root = {pt_root = 0},
> bv_cnt = 0}, bo_numoutput = 0, bo_flag = 0, bo_domain = 5, bo_bsize = 131072}, v_pollinfo = 0x0, v_label = 0x0, v_lockf = 0x0, v_rl = {rl_waiters = {tqh_first = 0xfffff80f2cc12708, tqh_last = 0xfffff80f2cc12708}, rl_currdep = 0x0}, v_cstart = 0, v_lasta = 0, v_lastw = 0, v_clen = 0, v_holdcnt = 2,
> v_usecount = 6, v_iflag = 0, v_vflag = 0, v_mflag = 0, v_dbatchcpu = -1, v_writecount = 1, v_hash = 45676874}

Is that thread busying the vm object?

Thread 101255 (timestamp/prefix_stderr) is also acting on vnode 0xfffff80571f29500, the same vnode the tail thread (101698) is working on.
> (kgdb) thread > [Current thread is 8 (Thread 101255)] > (kgdb) backtrace > #0 sched_switch (td=3D0xfffffe25c8e9bc00, flags=3D) at = /usr/src/sys/kern/sched_ule.c:2147 > #1 0xffffffff80bce615 in mi_switch (flags=3D260) at /usr/src/sys/kern/= kern_synch.c:542 > #2 0xffffffff80c1cfea in sleepq_switch (wchan=3D0xfffffe001cbca850, pr= i=3D84) at /usr/src/sys/kern/subr_sleepqueue.c:625 > #3 0xffffffff80f1de50 in _vm_page_busy_sleep (obj=3D, m= =3D0xfffffe001cbca850, pindex=3D, wmesg=3D,= allocflags=3D21504, locked=3Dfalse) at /usr/src/sys/vm/vm_page.c:1094 > #4 0xffffffff80f241f7 in vm_page_grab_sleep (object=3D0xfffff806cb637c= 60, m=3D, pindex=3D, wmesg=3D, allocflags=3D21504, locked=3D) at /usr/src/sys/vm/vm_page.c:4326 > #5 vm_page_acquire_unlocked (object=3D0xfffff806cb637c60, pindex=3D109= 8, prev=3D, mp=3D0xfffffe2717fc6730, allocflags=3D21504) a= t /usr/src/sys/vm/vm_page.c:4469 > #6 0xffffffff80f24c61 in vm_page_grab_valid_unlocked (mp=3D0xfffffe271= 7fc6730, object=3D0xfffff806cb637c60, pindex=3D1098, allocflags=3D21504) = at /usr/src/sys/vm/vm_page.c:4645 > #7 0xffffffff82440246 in page_busy (vp=3D0xfffff80571f29500, start=3D4= 497408, off=3D, nbytes=3D) at /usr/src/sys/= cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:414 > #8 update_pages (vp=3D0xfffff80571f29500, start=3D4497408, len=3D32, o= s=3D0xfffff8096a277400, oid=3D2209520, segflg=3D, tx=3D) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/= zfs_vnops.c:482 > #9 zfs_write (vp=3D, uio=3D, ioflag=3D0,= cr=3D, ct=3D) at /usr/src/sys/cddl/contrib= /opensolaris/uts/common/fs/zfs/zfs_vnops.c:1071 > #10 zfs_freebsd_write (ap=3D) at /usr/src/sys/cddl/contr= ib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4838 > #11 0xffffffff810e0eaf in VOP_WRITE_APV (vop=3D0xffffffff8250a1e0 , a=3D0xfffffe2717fc68c8) at vnode_if.c:925 > #12 0xffffffff80cb574c in VOP_WRITE (vp=3D0xfffff80571f29500, uio=3D0xf= ffffe2717fc6bb0, ioflag=3D8323073, cred=3D) at ./vnode_if.= h:413 > #13 vn_write 
(fp=3D0xfffff8048195e8c0, uio=3D, active_cr= ed=3D, flags=3D, td=3D) at /= usr/src/sys/kern/vfs_vnops.c:894 > #14 0xffffffff80cb50c3 in vn_io_fault_doio (args=3D0xfffffe2717fc6af0, = uio=3D0xfffffe2717fc6bb0, td=3D0xfffffe25c8e9bc00) at /usr/src/sys/kern/v= fs_vnops.c:959 > #15 0xffffffff80cb1c8c in vn_io_fault1 (vp=3D, uio=3D0xf= ffffe2717fc6bb0, args=3D0xfffffe2717fc6af0, td=3D0xfffffe25c8e9bc00) at /= usr/src/sys/kern/vfs_vnops.c:1077 > #16 0xffffffff80cafa32 in vn_io_fault (fp=3D0xfffff8048195e8c0, uio=3D0= xfffffe2717fc6bb0, active_cred=3D0xfffff80f2cc12708, flags=3D0, td=3D) at /usr/src/sys/kern/vfs_vnops.c:1181 > #17 0xffffffff80c34331 in fo_write (fp=3D0xfffff8048195e8c0, uio=3D0xff= fffe2717fc6bb0, active_cred=3D, flags=3D, td=3D= 0xfffffe25c8e9bc00) at /usr/src/sys/sys/file.h:326 > #18 dofilewrite (td=3D0xfffffe25c8e9bc00, fd=3D2, fp=3D0xfffff8048195e8= c0, auio=3D0xfffffe2717fc6bb0, offset=3D, flags=3D) at /usr/src/sys/kern/sys_generic.c:564 > #19 0xffffffff80c33eb0 in kern_writev (td=3D0xfffffe25c8e9bc00, fd=3D2,= auio=3D) at /usr/src/sys/kern/sys_generic.c:491 > #20 sys_write (td=3D0xfffffe25c8e9bc00, uap=3D) at /usr/= src/sys/kern/sys_generic.c:406 > #21 0xffffffff8106623d in syscallenter (td=3D) at /usr/s= rc/sys/amd64/amd64/../../kern/subr_syscall.c:150 > #22 amd64_syscall (td=3D0xfffffe25c8e9bc00, traced=3D0) at /usr/src/sys= /amd64/amd64/trap.c:1161 > #23 > #24 0x000000080043d53a in ?? () Maybe r358443 is related? 
> (kgdb) frame 4 > #4 0xffffffff80f241f7 in vm_page_grab_sleep (object=3D0xfffff806cb637c= 60, m=3D, pindex=3D, wmesg=3D, allocflags=3D21504, locked=3D) at /usr/src/sys/vm/vm_page.c:4326 > 4326 if (_vm_page_busy_sleep(object, m, m->pindex, wmesg, al= locflags, > (kgdb) p *object > $8 =3D {lock =3D {lock_object =3D {lo_name =3D 0xffffffff8114fa30 "vm o= bject", lo_flags =3D 627245056, lo_data =3D 0, lo_witness =3D 0x0}, rw_lo= ck =3D 1}, object_list =3D {tqe_next =3D 0xfffff806cb637d68, tqe_prev =3D= 0xfffff806cb637b78}, shadow_head =3D {lh_first =3D 0x0}, shadow_list =3D= {le_next =3D 0xffffffffffffffff, > le_prev =3D 0xffffffffffffffff}, memq =3D {tqh_first =3D 0xfffffe00= 1cbca850, tqh_last =3D 0xfffffe001cbca860}, rtree =3D {rt_root =3D 184467= 41875168421969}, size =3D 1099, domain =3D {dr_policy =3D 0x0, dr_iter =3D= 0}, generation =3D 1, cleangeneration =3D 1, ref_count =3D 2, shadow_cou= nt =3D 0, memattr =3D 6 '\006', type =3D 2 '\002', > flags =3D 4096, pg_color =3D 0, paging_in_progress =3D {__count =3D 2= }, busy =3D {__count =3D 0}, resident_page_count =3D 1, backing_object =3D= 0x0, backing_object_offset =3D 0, pager_object_list =3D {tqe_next =3D 0x= 0, tqe_prev =3D 0x0}, rvq =3D {lh_first =3D 0x0}, handle =3D 0xfffff80571= f29500, un_pager =3D {vnp =3D {vnp_size =3D 4499568, > writemappings =3D 0}, devp =3D {devp_pglist =3D {tqh_first =3D 0x= 44a870, tqh_last =3D 0x0}, ops =3D 0x0, dev =3D 0x0}, sgp =3D {sgp_pglist= =3D {tqh_first =3D 0x44a870, tqh_last =3D 0x0}}, swp =3D {swp_tmpfs =3D = 0x44a870, swp_blks =3D {pt_root =3D 0}, writemappings =3D 0}}, cred =3D 0= x0, charge =3D 0, umtx_data =3D 0x0} > (kgdb) frame 5 > #5 vm_page_acquire_unlocked (object=3D0xfffff806cb637c60, pindex=3D109= 8, prev=3D, mp=3D0xfffffe2717fc6730, allocflags=3D21504) a= t /usr/src/sys/vm/vm_page.c:4469 > 4469 if (!vm_page_grab_sleep(object, m, pindex, "pgn= slp", > (kgdb) p *m > $9 =3D {plinks =3D {q =3D {tqe_next =3D 0xffffffffffffffff, tqe_prev =3D= 0xffffffffffffffff}, s 
= {ss = {sle_next = 0xffffffffffffffff}}, memguard = {p = 18446744073709551615, v = 18446744073709551615}, uma = {slab = 0xffffffffffffffff, zone = 0xffffffffffffffff}}, listq = {tqe_next = 0x0, tqe_prev = 0xfffff806cb637ca8},
> object = 0xfffff806cb637c60, pindex = 1098, phys_addr = 18988408832, md = {pv_list = {tqh_first = 0x0, tqh_last = 0xfffffe001cbca888}, pv_gen = 44682, pat_mode = 6}, ref_count = 2147483648, busy_lock = 1588330502, a = {{flags = 0, queue = 255 '\377', act_count = 0 '\000'}, _bits = 16711680}, order = 13 '\r',
> pool = 0 '\000', flags = 1 '\001', oflags = 0 '\000', psind = 0 '\000', segind = 6 '\006', valid = 0 '\000', dirty = 0 '\000'}

Pretty sure this thread is holding the rangelock from zfs_write() that tail is waiting on.

So what exactly is this thread (101255) waiting on? I'm not sure how to track down what is using vm object 0xfffff806cb637c60.

If the tail thread busied the page, then I guess they are waiting on each other. If that's true, then r358443 removing the write lock on the object in update_pages() could be a problem.

I'm not sure the rest is interesting. I think these threads are just waiting on the locked vnode, but I include them here in case I missed something.
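To make the suspected cycle concrete, here is a minimal sketch (plain Python, not kernel code) that models the two threads as a wait-for graph and looks for a cycle. The thread labels and the find_cycle helper are made up for illustration. The edge from 101255 back to tail rests on the unconfirmed assumption that tail is the thread that busied the page; the dump above does not show the busy owner.

```python
# Wait-for graph: each thread maps to the threads it is blocked behind.
WAITS_FOR = {
    # tail sleeps in rangelock_enter(); the holder is believed
    # ("pretty sure") to be the zfs_write() in thread 101255.
    "tail/101698": ["prefix_stderr/101255"],
    # 101255 sleeps in _vm_page_busy_sleep() on pindex 1098.
    # ASSUMPTION: tail busied that page; this is a hypothesis only.
    "prefix_stderr/101255": ["tail/101698"],
}


def find_cycle(graph):
    """Return one cycle as a list of nodes, or None if there is none."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on current path, finished
    color = {}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for nxt in graph.get(node, []):
            state = color.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the current path: that is a cycle.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for node in graph:
        if color.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


if __name__ == "__main__":
    print(find_cycle(WAITS_FOR))
    # Same graph without the assumed edge: no cycle is found.
    print(find_cycle({"tail/101698": ["prefix_stderr/101255"],
                      "prefix_stderr/101255": []}))
```

With the assumed edge in place the helper reports a tail -> prefix_stderr -> tail cycle; without it the graph is acyclic. So whether this is a two-thread deadlock comes down to confirming who busied the page at pindex 1098.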
thread 101249 (timestamp/prefix_stdout) is also acting on vnode 0xfffff80571f29500 > (kgdb) thread 7 > [Switching to thread 7 (Thread 101249)] > #0 sched_switch (td=3D0xfffffe264412a100, flags=3D) at = /usr/src/sys/kern/sched_ule.c:2147 > 2147 cpuid =3D td->td_oncpu =3D PCPU_GET(cpuid); > (kgdb) backtrace > #0 sched_switch (td=3D0xfffffe264412a100, flags=3D) at = /usr/src/sys/kern/sched_ule.c:2147 > #1 0xffffffff80bce615 in mi_switch (flags=3D260) at /usr/src/sys/kern/= kern_synch.c:542 > #2 0xffffffff80c1cfea in sleepq_switch (wchan=3D0xfffff8048195e8e2, pr= i=3D119) at /usr/src/sys/kern/subr_sleepqueue.c:625 > #3 0xffffffff80bcdb6d in _sleep (ident=3D0xfffff8048195e8e2, lock=3D, priority=3D119, wmesg=3D0xffffffff8123c694 "vofflock", sbt= =3D, pr=3D0, flags=3D256) at /usr/src/sys/kern/kern_synch.= c:221 > #4 0xffffffff80cb203a in foffset_lock (fp=3D0xfffff8048195e8c0, flags=3D= ) at /usr/src/sys/kern/vfs_vnops.c:700 > #5 0xffffffff80caf909 in foffset_lock_uio (fp=3D, uio=3D= , flags=3D) at /usr/src/sys/kern/vfs_vnops.= c:748 > #6 vn_io_fault (fp=3D0xfffff8048195e8c0, uio=3D0xfffffe2719d9cbb0, act= ive_cred=3D0xfffff80786ecad00, flags=3D0, td=3D0xfffffe264412a100) at /us= r/src/sys/kern/vfs_vnops.c:1163 > #7 0xffffffff80c34331 in fo_write (fp=3D0xfffff8048195e8c0, uio=3D0xff= fffe2719d9cbb0, active_cred=3D, flags=3D, td=3D= 0xfffffe264412a100) at /usr/src/sys/sys/file.h:326 > #8 dofilewrite (td=3D0xfffffe264412a100, fd=3D1, fp=3D0xfffff8048195e8= c0, auio=3D0xfffffe2719d9cbb0, offset=3D, flags=3D) at /usr/src/sys/kern/sys_generic.c:564 > #9 0xffffffff80c33eb0 in kern_writev (td=3D0xfffffe264412a100, fd=3D1,= auio=3D) at /usr/src/sys/kern/sys_generic.c:491 > #10 sys_write (td=3D0xfffffe264412a100, uap=3D) at /usr/= src/sys/kern/sys_generic.c:406 > #11 0xffffffff8106623d in syscallenter (td=3D) at /usr/s= rc/sys/amd64/amd64/../../kern/subr_syscall.c:150 > #12 amd64_syscall (td=3D0xfffffe264412a100, traced=3D0) at /usr/src/sys= /amd64/amd64/trap.c:1161 > #13 > #14 
0x000000080043d53a in ?? () > Backtrace stopped: Cannot access memory at address 0x7fffdfffddd8 > (kgdb) frame 6 > #6 vn_io_fault (fp=3D0xfffff8048195e8c0, uio=3D0xfffffe2719d9cbb0, act= ive_cred=3D0xfffff80786ecad00, flags=3D0, td=3D0xfffffe264412a100) at /us= r/src/sys/kern/vfs_vnops.c:1163 > 1163 foffset_lock_uio(fp, uio, flags); > (kgdb) p *fp > $22 =3D {f_data =3D 0xfffff80571f29500, f_ops =3D 0xffffffff81923a10 , f_cred =3D 0xfffff80786ecad00, f_vnode =3D 0xfffff80571f29500, f_t= ype =3D 1, f_vnread_flags =3D 3, f_flag =3D 2, f_count =3D 4, {f_seqcount= =3D 127, f_pipegen =3D 127}, f_nextoff =3D 4499536, f_vnun =3D {fvn_cdev= priv =3D 0x0, fvn_advice =3D 0x0}, f_offset =3D 4499536, > f_label =3D 0x0} thread 104356 (gmake) is just waiting on the lock for vnode 0xfffff80571f29500 It also is holding exclusive lock on directory vnode 0xfffff808a08f0a00 > (kgdb) thread 9 > [Switching to thread 9 (Thread 104356)] > #0 sched_switch (td=3D0xfffffe24aadb6100, flags=3D) at = /usr/src/sys/kern/sched_ule.c:2147 > 2147 cpuid =3D td->td_oncpu =3D PCPU_GET(cpuid); > (kgdb) backtrace > #0 sched_switch (td=3D0xfffffe24aadb6100, flags=3D) at = /usr/src/sys/kern/sched_ule.c:2147 > #1 0xffffffff80bce615 in mi_switch (flags=3D260) at /usr/src/sys/kern/= kern_synch.c:542 > #2 0xffffffff80c1cfea in sleepq_switch (wchan=3D0xfffff80571f29568, pr= i=3D96) at /usr/src/sys/kern/subr_sleepqueue.c:625 > #3 0xffffffff80b954f6 in sleeplk (lk=3D0xfffff80571f29568, flags=3D532= 480, ilk=3D, wmesg=3D, pri=3D, timo=3D51, queue=3D0) at /usr/src/sys/kern/kern_lock.c:295 > #4 0xffffffff80b93a1e in lockmgr_xlock_hard (lk=3D0xfffff80571f29568, = flags=3D, ilk=3D0x0, file=3D, line=3D1432, lw= a=3D0xfffff80571f29568) at /usr/src/sys/kern/kern_lock.c:841 > #5 0xffffffff810e2a40 in VOP_LOCK1_APV (vop=3D0xffffffff8250a1e0 , a=3D0xfffffe271833f4d8) at vnode_if.c:1989 > #6 0xffffffff80cb14f4 in VOP_LOCK1 (vp=3D0xfffff80571f29500, flags=3D5= 32480, file=3D0xffffffff82472ac9 
"/usr/src/sys/cddl/contrib/opensolaris/u= ts/common/fs/zfs/zfs_vnops.c", line=3D1432) at ./vnode_if.h:879 > #7 _vn_lock (vp=3D0xfffff80571f29500, flags=3D532480, file=3D0xfffffff= f82472ac9 "/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vn= ops.c", line=3D1432) at /usr/src/sys/kern/vfs_vnops.c:1613 > #8 0xffffffff8243af40 in zfs_lookup_lock (dvp=3D0xfffff808a08f0a00, vp= =3D0xfffff80571f29500, name=3D0xfffffe271833f630 "copool-basic.sh.log", l= kflags=3D532480) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/z= fs/zfs_vnops.c:1432 > #9 zfs_lookup (dvp=3D0xfffff808a08f0a00, nm=3D0xfffffe271833f630 "copo= ol-basic.sh.log", vpp=3D, cnp=3D0xfffffe271833faf0, nameio= p=3D2, cr=3D, td=3D, flags=3D0, cached=3D1)= at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:1= 606 > #10 0xffffffff8243b61e in zfs_freebsd_lookup (ap=3D0xfffffe271833f780, = cached=3D) a= t /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:490= 0 > #11 zfs_freebsd_cachedlookup (ap=3D0xfffffe271833f780) at /usr/src/sys/= cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4908 > #12 0xffffffff810dfb46 in VOP_CACHEDLOOKUP_APV (vop=3D0xffffffff8250a1e= 0 , a=3D0xfffffe271833f780) at vnode_if.c:180 > #13 0xffffffff80c84dd8 in VOP_CACHEDLOOKUP (dvp=3D0xfffff808a08f0a00, v= pp=3D0xfffffe271833fac0, cnp=3D0xfffffe271833faf0) at ./vnode_if.h:80 > #14 vfs_cache_lookup (ap=3D) at /usr/src/sys/kern/vfs_ca= che.c:2149 > #15 0xffffffff810df996 in VOP_LOOKUP_APV (vop=3D0xffffffff8250a1e0 , a=3D0xfffffe271833f820) at vnode_if.c:117 > #16 0xffffffff80c8ee61 in VOP_LOOKUP (dvp=3D0xfffff808a08f0a00, vpp=3D0= xfffffe271833fac0, cnp=3D0xfffffe271833faf0) at ./vnode_if.h:54 > #17 lookup (ndp=3D0xfffffe271833fa60) at /usr/src/sys/kern/vfs_lookup.c= :951 > #18 0xffffffff80c8e374 in namei (ndp=3D0xfffffe271833fa60) at /usr/src/= sys/kern/vfs_lookup.c:512 > #19 0xffffffff80ca9e69 in kern_funlinkat (td=3D0xfffffe24aadb6100, dfd=3D= -100, path=3D0x800a3982e , fd=3D, 
pathseg=3DUIO_USERSPACE, flag=3D, oldinum=3D0) at /usr/src/sys/kern/vfs_syscalls.c:1819 > #20 0xffffffff80ca9db8 in sys_unlink (td=3D, uap=3D) at /usr/src/sys/kern/vfs_syscalls.c:1747 > #21 0xffffffff82780586 in filemon_wrapper_unlink (td=3D, u= ap=3D0xfffffe24aadb64d8) at /usr/src/sys/dev/filemon/filemon_wrapper.c:35= 0 > #22 0xffffffff8106623d in syscallenter (td=3D) at /usr/s= rc/sys/amd64/amd64/../../kern/subr_syscall.c:150 > #23 amd64_syscall (td=3D0xfffffe24aadb6100, traced=3D0) at /usr/src/sys= /amd64/amd64/trap.c:1161 thread 108476 (vim) is waiting to lock the directory vnode 0xfffff808a08f0a00 > (kgdb) thread 10 > [Switching to thread 10 (Thread 108476)] > #0 sched_switch (td=3D0xfffffe26c8601500, flags=3D) at = /usr/src/sys/kern/sched_ule.c:2147 > 2147 cpuid =3D td->td_oncpu =3D PCPU_GET(cpuid); > (kgdb) backtrace > #0 sched_switch (td=3D0xfffffe26c8601500, flags=3D) at = /usr/src/sys/kern/sched_ule.c:2147 > #1 0xffffffff80bce615 in mi_switch (flags=3D260) at /usr/src/sys/kern/= kern_synch.c:542 > #2 0xffffffff80c1cfea in sleepq_switch (wchan=3D0xfffff808a08f0a68, pr= i=3D96) at /usr/src/sys/kern/subr_sleepqueue.c:625 > #3 0xffffffff80b954f6 in sleeplk (lk=3D0xfffff808a08f0a68, flags=3D210= 5344, ilk=3D, wmesg=3D, pri=3D, timo=3D51, queue=3D1) at /usr/src/sys/kern/kern_lock.c:295 > #4 0xffffffff80b93525 in lockmgr_slock_hard (lk=3D0xfffff808a08f0a68, = flags=3D2105344, ilk=3D, file=3D0xffffffff811fb967 "/usr/s= rc/sys/kern/vfs_subr.c", line=3D2930, lwa=3D) at /usr/src/= sys/kern/kern_lock.c:649 > #5 0xffffffff810e2a40 in VOP_LOCK1_APV (vop=3D0xffffffff8250a1e0 , a=3D0xfffffe271d46d6b8) at vnode_if.c:1989 > #6 0xffffffff80cb14f4 in VOP_LOCK1 (vp=3D0xfffff808a08f0a00, flags=3D2= 105344, file=3D0xffffffff811fb967 "/usr/src/sys/kern/vfs_subr.c", line=3D= 2930) at ./vnode_if.h:879 > #7 _vn_lock (vp=3D0xfffff808a08f0a00, flags=3D2105344, file=3D0xffffff= ff811fb967 "/usr/src/sys/kern/vfs_subr.c", line=3D2930) at /usr/src/sys/k= ern/vfs_vnops.c:1613 > #8 
0xffffffff80c9b3ec in vget_finish (vp=3D0xfffff808a08f0a00, flags=3D= 2105344, vs=3DVGET_USECOUNT) at /usr/src/sys/kern/vfs_subr.c:2930 > #9 0xffffffff80c8051c in cache_lookup (dvp=3D, vpp=3D, cnp=3D, tsp=3D, ticksp=3D) at /usr/src/sys/kern/vfs_cache.c:1407 > #10 0xffffffff80c84dad in vfs_cache_lookup (ap=3D) at /u= sr/src/sys/kern/vfs_cache.c:2147 > #11 0xffffffff810df996 in VOP_LOOKUP_APV (vop=3D0xffffffff8250a1e0 , a=3D0xfffffe271d46d8a0) at vnode_if.c:117 > #12 0xffffffff80c8ee61 in VOP_LOOKUP (dvp=3D0xfffff804de66e500, vpp=3D0= xfffffe271d46da60, cnp=3D0xfffffe271d46da90) at ./vnode_if.h:54 > #13 lookup (ndp=3D0xfffffe271d46da00) at /usr/src/sys/kern/vfs_lookup.c= :951 > #14 0xffffffff80c8e374 in namei (ndp=3D0xfffffe271d46da00) at /usr/src/= sys/kern/vfs_lookup.c:512 > #15 0xffffffff80caa83f in kern_statat (td=3D0xfffffe26c8601500, flag=3D= , fd=3D, path=3D0x8049d12c0 , pathseg=3DUIO_USERSPACE, sbp=3D0xf= ffffe271d46db28, hook=3D0x0) at /usr/src/sys/kern/vfs_syscalls.c:2340 > #16 0xffffffff80caafff in sys_fstatat (td=3D, uap=3D0xffff= fe26c86018d8) at /usr/src/sys/kern/vfs_syscalls.c:2317 > #17 0xffffffff81065c40 in syscallenter (td=3D) at /usr/s= rc/sys/amd64/amd64/../../kern/subr_syscall.c:162 > #18 amd64_syscall (td=3D0xfffffe26c8601500, traced=3D0) at /usr/src/sys= /amd64/amd64/trap.c:1161 > #19 > #20 0x00000008020ba75a in ?? 
() Lastly the find thread (107378) is waiting to lock the same directory vnode 0xfffff808a08f0a00 > (kgdb) thread 3 > [Switching to thread 3 (Thread 107378)] > #0 sched_switch (td=3D0xfffffe25eefa2e00, flags=3D) at = /usr/src/sys/kern/sched_ule.c:2147 > 2147 cpuid =3D td->td_oncpu =3D PCPU_GET(cpuid); > (kgdb) backtrace > #0 sched_switch (td=3D0xfffffe25eefa2e00, flags=3D) at = /usr/src/sys/kern/sched_ule.c:2147 > #1 0xffffffff80bce615 in mi_switch (flags=3D260) at /usr/src/sys/kern/= kern_synch.c:542 > #2 0xffffffff80c1cfea in sleepq_switch (wchan=3D0xfffff808a08f0a68, pr= i=3D96) at /usr/src/sys/kern/subr_sleepqueue.c:625 > #3 0xffffffff80b954f6 in sleeplk (lk=3D0xfffff808a08f0a68, flags=3D210= 6368, ilk=3D, wmesg=3D, pri=3D, timo=3D51, queue=3D1) at /usr/src/sys/kern/kern_lock.c:295 > #4 0xffffffff80b93525 in lockmgr_slock_hard (lk=3D0xfffff808a08f0a68, = flags=3D2106368, ilk=3D, file=3D0xffffffff811f0ff4 "/usr/s= rc/sys/kern/vfs_lookup.c", line=3D737, lwa=3D) at /usr/src= /sys/kern/kern_lock.c:649 > #5 0xffffffff810e2a40 in VOP_LOCK1_APV (vop=3D0xffffffff8250a1e0 , a=3D0xfffffe271bee5748) at vnode_if.c:1989 > #6 0xffffffff80cb14f4 in VOP_LOCK1 (vp=3D0xfffff808a08f0a00, flags=3D2= 106368, file=3D0xffffffff811f0ff4 "/usr/src/sys/kern/vfs_lookup.c", line=3D= 737) at ./vnode_if.h:879 > #7 _vn_lock (vp=3D0xfffff808a08f0a00, flags=3D2106368, file=3D0xffffff= ff811f0ff4 "/usr/src/sys/kern/vfs_lookup.c", line=3D737) at /usr/src/sys/= kern/vfs_vnops.c:1613 > #8 0xffffffff80c8e93d in lookup (ndp=3D0xfffffe271bee5a88) at /usr/src= /sys/kern/vfs_lookup.c:735 > #9 0xffffffff80c8e374 in namei (ndp=3D0xfffffe271bee5a88) at /usr/src/= sys/kern/vfs_lookup.c:512 > #10 0xffffffff80cb0bdb in vn_open_cred (ndp=3D0xfffffe271bee5a88, flagp= =3D0xfffffe271bee5bb4, cmode=3D0, vn_open_flags=3D0, cred=3D0xfffff80786e= cad00, fp=3D0xfffff802a8627690) at /usr/src/sys/kern/vfs_vnops.c:288 > #11 0xffffffff80ca8a8a in kern_openat (td=3D0xfffffe25eefa2e00, fd=3D, path=3D, pathseg=3D, flags=3D= 
1048577, mode=3D) at /usr/src/sys/kern/vfs_syscalls.c:1083= > #12 0xffffffff82780415 in filemon_wrapper_openat (td=3D0xfffffe25eefa2e= 00, uap=3D0xfffffe25eefa31d8) at /usr/src/sys/dev/filemon/filemon_wrapper= =2Ec:232 > #13 0xffffffff8106623d in syscallenter (td=3D) at /usr/s= rc/sys/amd64/amd64/../../kern/subr_syscall.c:150 > #14 amd64_syscall (td=3D0xfffffe25eefa2e00, traced=3D0) at /usr/src/sys= /amd64/amd64/trap.c:1161 --=20 Regards, Bryan Drewery --XElGlYQBYiLZJf7vq6TsZf2eItNVgQjfg-- --aIvlquH7auoI17NOayHMofyL3SITm4BOn Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- iQGTBAEBCgB9FiEE+Rc8ssOq6npcih8JNddxu25Gl88FAl67H6lfFIAAAAAALgAo aXNzdWVyLWZwckBub3RhdGlvbnMub3BlbnBncC5maWZ0aGhvcnNlbWFuLm5ldEY5 MTczQ0IyQzNBQUVBN0E1QzhBMUYwOTM1RDc3MUJCNkU0Njk3Q0YACgkQNddxu25G l88jNAgAn03Ne8XDWCpYOCRxn2Lxycotkn2u5F6TEFlVUFHPzdpSZMRiLzk7XQe0 A68xi5kEb6/lgG5gxgQ39AhwaabProrwnRvm/Jh3+a/GxCIW63a6Ir6RD30AH25W NBPPtECP3hVf+9aQDYCNdmOZ+bYH7aQVd1aM58Uk5cCYeLGH7dTmlFLIC7qVYyn4 gqQeC5cHJ8A8tXAPj3Z+J8FrmYq1ApYwXrWpfNlmYMTBqv+iJG0Z5ve+IqS2wIz2 v/34teOUSPkZyV+aRmBwdS90WZgFhc4aw6Y5BjW2+mGxOsuxsf87BLv6XzbDJRMP 8W6UNXqBBmE9ef6RxVryhFV6JBWuKA== =n3ag -----END PGP SIGNATURE----- --aIvlquH7auoI17NOayHMofyL3SITm4BOn-- From owner-freebsd-current@freebsd.org Tue May 12 22:47:20 2020 Return-Path: Delivered-To: freebsd-current@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 57FE52DFDF0 for ; Tue, 12 May 2020 22:47:20 +0000 (UTC) (envelope-from bdrewery@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2610:1c1:1:6074::16:84]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "freefall.freebsd.org", Issuer 
"Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 49MCc81RDTz4Bhr for ; Tue, 12 May 2020 22:47:20 +0000 (UTC) (envelope-from bdrewery@FreeBSD.org) Received: from mail.xzibition.com (unknown [127.0.1.132]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by freefall.freebsd.org (Postfix) with ESMTPS id 987A894B1 for ; Tue, 12 May 2020 22:47:19 +0000 (UTC) (envelope-from bdrewery@FreeBSD.org) Received: from mail.xzibition.com (localhost [172.31.3.2]) by mail.xzibition.com (Postfix) with ESMTP id D909B1B269 for ; Tue, 12 May 2020 22:47:18 +0000 (UTC) X-Virus-Scanned: amavisd-new at mail.xzibition.com Received: from mail.xzibition.com ([172.31.3.2]) by mail.xzibition.com (mail.xzibition.com [172.31.3.2]) (amavisd-new, port 10026) with LMTP id 5dZXVP3LbuEQ for ; Tue, 12 May 2020 22:47:12 +0000 (UTC) Subject: Re: zfs deadlock on r360452 relating to busy vm page DKIM-Filter: OpenDKIM Filter v2.10.3 mail.xzibition.com 8CA4A1B25F From: Bryan Drewery To: freebsd-current@FreeBSD.org References: <2bdc8563-283b-32cc-8a1a-85ff52aca99e@FreeBSD.org> Autocrypt: addr=bdrewery@FreeBSD.org; prefer-encrypt=mutual; keydata= mQENBFJphmsBCADiFgmS4bIzwZijrS31SjEMzg+n5zNellgM+HkShwehpqCiyhXdWrvH6dTZ a6u50pbUIX7doTR7W7PQHCjCTqtpwvcj0eulZva+iHFp+XrbgSFHn+VVXgkYP2MFySyZRFab D2qqzJBEJofhpv4HvY6uQI5K99pMqKr1Z/lHqsijYYu4RH2OfwB5PinId7xeldzWEonVoCr+ rfxzO/UrgA6v/3layGZcKNHFjmc3NqoN1DXtdaEHqtjIozzbndVkH6lkFvIpIrI6i5ox8pwp VxsxLCr/4Musd5CWgHiet5kSw2SzNeA8FbxdLYCpXNVu+uBACEbCUP+CSNy3NVfEUxsBABEB AAG0JEJyeWFuIERyZXdlcnkgPGJkcmV3ZXJ5QEZyZWVCU0Qub3JnPokBVwQTAQoAQQIbAwUL CQgHAwUVCgkICwUWAwIBAAIeAQIXgAIZARYhBPkXPLLDqup6XIofCTXXcbtuRpfPBQJb5hLu BQkNPvODAAoJEDXXcbtuRpfP9rMH/3f7cfX5rzyEV5QNfV/wS4jFukLoPZ4+nCM/TKxH3pEX 2bLbeQbkk6La8cueQ5Lpoht5XFZ18Y5TbMittngltrlNzoDD0h9are24OkDFGim3afJU7tkj 
IGQa1if+re+vI5BhzYwRhj0oKXzBi39M5oePd3L1dXfx83rg2FPyZBdIejsz6fR74T3JVkbd 6k2l5/3Zk2uiNMy+eBfDRgYE1E6CP28kV0nCeGKZgSVso0kGUUHud7voKqGVpMvbd0mE4pp4 PE5YJaFPjrll9miaDAvdU8LGIq5n6+aXPLKoQ/QNl6mg6ifgI6FfKILOkTizLW8E5PBSNnCm NapQ55yjm125AQ0EUmmGawEIAKJUU9+Q19oW1RK5jTf3m56j+szIc8Y9DaLC8REUKl4UZJBK BqCl6c0cukVApOD92XoU6hJPm2rLEyp/IcYcPPNTnVu8D8h9oag2L8EiFN7+2hk0xG+lwjc8 uOIZycme7AIJsBU4AZ1v63lxm2k104hwpiatgbe71GIGl7p1MX6ousP/wGzXCOF25Dx9w02C eRe7zEMfhnFjSUhzdCC9han2+KaVB7qIqNR3b8NfbwRNlwPmHqlhXffUow9OsQjSnTK8WKNR lx7xzVccXIvWP2wECFrmqmzMmXpSrmIuiWEpFwZ9x2a0Pva8dCNRiCVTK51IlRXKjaAxiN1u IUrMm6UAEQEAAYkBPAQYAQoAJgIbDBYhBPkXPLLDqup6XIofCTXXcbtuRpfPBQJb5hL4BQkN PvONAAoJEDXXcbtuRpfPCjcH/ivBsOpdpebpgLizSNU5/X4yWN5Aixsc9VBnQhGKAKnMINJQ VMpA55sD2JSPwloXYM/B3qyPJRS/9cwIuX5LDNKKOZU3Qp+TzleynM15/xea14orWYRGRict YHBM3Cnqp7OD8K6Q1uhs0fTxyJP7PZ/G0+7Corlf1DlHhDt6C2HldRPFvAvAgl6sR9Wzgcb7 rzub2HVtbJgl6YHbgyAG7x9NpXFqzx1JLAMdpt2DIYwoi+oMdRQlBIwNuKjQjCGzuXHandd3 kGvBAsyJpQ+coEep9UzwANaV28cXrFr2R4FSOcR50rBA2Nh/vqUYfpsvBvJlwuKAoV1djVHa ihNeL5E= Organization: FreeBSD Message-ID: Date: Tue, 12 May 2020 15:47:12 -0700 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101 Thunderbird/68.8.0 MIME-Version: 1.0 In-Reply-To: <2bdc8563-283b-32cc-8a1a-85ff52aca99e@FreeBSD.org> Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="Y3Kw281vOe14dImLo9fNoysaIpgYWAafe" X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.33 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 12 May 2020 22:47:20 -0000 This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --Y3Kw281vOe14dImLo9fNoysaIpgYWAafe Content-Type: multipart/mixed; boundary="WGx0QEJgJvn7xIqcMq1W4goqrKA31Rpc3"; protected-headers="v1" From: Bryan Drewery To: freebsd-current@FreeBSD.org Message-ID: Subject: Re: zfs deadlock on r360452 relating to busy vm page 
References: <2bdc8563-283b-32cc-8a1a-85ff52aca99e@FreeBSD.org>
In-Reply-To: <2bdc8563-283b-32cc-8a1a-85ff52aca99e@FreeBSD.org>

--WGx0QEJgJvn7xIqcMq1W4goqrKA31Rpc3
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: quoted-printable

Trivial repro:

dd if=/dev/zero of=blah & tail -F blah
^C
load: 0.21 cmd: tail 72381 [prev->lr_read_cv] 2.17r 0.00u 0.01s 0% 2600k
#0 0xffffffff80bce615 at mi_switch+0x155
#1 0xffffffff80c1cfea at sleepq_switch+0x11a
#2 0xffffffff80b57f0a at _cv_wait+0x15a
#3 0xffffffff829ddab6 at rangelock_enter+0x306
#4 0xffffffff829ecd3f at zfs_freebsd_getpages+0x14f
#5 0xffffffff810e3ab9 at VOP_GETPAGES_APV+0x59
#6 0xffffffff80f349e7 at vnode_pager_getpages+0x37
#7 0xffffffff80f2a93f at vm_pager_get_pages+0x4f
#8 0xffffffff80f054b0 at vm_fault+0x780
#9 0xffffffff80f04bde at vm_fault_trap+0x6e
#10 0xffffffff8106544e at trap_pfault+0x1ee
#11 0xffffffff81064a9c at trap+0x44c
#12 0xffffffff8103a978 at calltrap+0x8

On 5/12/2020 3:13 PM, Bryan Drewery wrote:
>> panic: deadlres_td_sleep_q: possible deadlock detected for 0xfffffe25eefa2e00 (find), blocked for 1802392 ticks
>
> 2 stuck processes from procstat -kk before panic
>> 72559 101698 tail - mi_switch+0x155 sleepq_switch+0x11a _cv_wait+0x15a rangelock_enter+0x306 zfs_freebsd_getpages+0x14f VOP_GETPAGES_APV+0x59 vnode_pager_getpages+0x37 vm_pager_get_pages+0x4f vm_fault+0x780 vm_fault_trap+0x6e trap_pfault+0x1ee
>
>> 72985 107378 find - mi_switch+0x155 sleepq_switch+0x11a sleeplk+0x106 lockmgr_slock_hard+0x1f5 VOP_LOCK1_APV+0x40 _vn_lock+0x54 lookup+0xdd namei+0x524 vn_open_cred+0x32b kern_openat+0x1fa filemon_wrapper_openat+0x15 amd64_syscall+0x73d
>
>
> The only find running was thread 107378
>
> I couldn't record much from ddb but got locked vnodes.
>> >> db> show lockedvnods >> Locked vnodes >> vnode 0xfffff804de66e500: type VDIR >> usecount 3, writecount 0, refcount 2 mountedhere 0 >> flags () >> v_object 0xfffff809459cb420 ref 0 pages 0 cleanbuf 0 dirtybuf 0 >> lock type zfs: SHARED (count 1) >> #0 0xffffffff80b94a0f at lockmgr_slock+0xdf >> #1 0xffffffff810e2a40 at VOP_LOCK1_APV+0x40 >> #2 0xffffffff80cb14f4 at _vn_lock+0x54 >> #3 0xffffffff80c9b3ec at vget_finish+0x6c >> #4 0xffffffff80c8051c at cache_lookup+0x57c >> #5 0xffffffff80c84dad at vfs_cache_lookup+0x7d >> #6 0xffffffff810df996 at VOP_LOOKUP_APV+0x56 >> #7 0xffffffff80c8ee61 at lookup+0x601 >> #8 0xffffffff80c8e374 at namei+0x524 >> #9 0xffffffff80caa83f at kern_statat+0x7f >> #10 0xffffffff80caafff at sys_fstatat+0x2f >> #11 0xffffffff81065c40 at amd64_syscall+0x140 >> #12 0xffffffff8103b2a0 at fast_syscall_common+0x101 >> vnode 0xfffff808a08f0a00: type VDIR >> usecount 6, writecount 0, refcount 2 mountedhere 0 >> flags () >> v_object 0xfffff801eb930000 ref 0 pages 0 cleanbuf 0 dirtybuf 0 >> lock type zfs: EXCL by thread 0xfffffe24aadb6100 (pid 72267, gmake= , tid 104356) >> with shared waiters pending >> #0 0xffffffff80b94a0f at lockmgr_slock+0xdf >> #1 0xffffffff810e2a40 at VOP_LOCK1_APV+0x40 >> #2 0xffffffff80cb14f4 at _vn_lock+0x54 >> #3 0xffffffff80c8e93d at lookup+0xdd >> #4 0xffffffff80c8e374 at namei+0x524 >> #5 0xffffffff80ca9e69 at kern_funlinkat+0xa9 >> #6 0xffffffff80ca9db8 at sys_unlink+0x28 >> #7 0xffffffff82780586 at filemon_wrapper_unlink+0x16 >> #8 0xffffffff8106623d at amd64_syscall+0x73d >> #9 0xffffffff8103b2a0 at fast_syscall_common+0x101 >> >> vnode 0xfffff80571f29500: type VREG >> usecount 6, writecount 1, refcount 2 >> flags () >> v_object 0xfffff806cb637c60 ref 2 pages 1 cleanbuf 0 dirtybuf 0 >> lock type zfs: SHARED (count 2) >> with exclusive waiters pending >> #0 0xffffffff80b94a0f at lockmgr_slock+0xdf >> #1 0xffffffff810e2a40 at VOP_LOCK1_APV+0x40 >> #2 0xffffffff80cb14f4 at _vn_lock+0x54 >> #3 0xffffffff8243af40 
at zfs_lookup+0x610 >> #4 0xffffffff8243b61e at zfs_freebsd_cachedlookup+0x8e >> #5 0xffffffff810dfb46 at VOP_CACHEDLOOKUP_APV+0x56 >> #6 0xffffffff80c84dd8 at vfs_cache_lookup+0xa8 >> #7 0xffffffff810df996 at VOP_LOOKUP_APV+0x56 >> #8 0xffffffff80c8ee61 at lookup+0x601 >> #9 0xffffffff80c8e374 at namei+0x524 >> #10 0xffffffff80caa83f at kern_statat+0x7f >> #11 0xffffffff80caafff at sys_fstatat+0x2f >> #12 0xffffffff8106623d at amd64_syscall+0x73d >> #13 0xffffffff8103b2a0 at fast_syscall_common+0x101 >=20 > It's nice how recent threads are at the top in gdb... >> (kgdb) info threads >> Id Target Id Frame= >> 1 Thread 107952 (PID=3D79390: zfs) sch= ed_switch (td=3D0xfffffe26ebb36000, flags=3D) at /usr/src/= sys/kern/sched_ule.c:2147 >> 2 Thread 102764 (PID=3D73218: zfs) sch= ed_switch (td=3D0xfffffe2490a12300, flags=3D) at /usr/src/= sys/kern/sched_ule.c:2147 >> 3 Thread 107378 (PID=3D72985: find) sch= ed_switch (td=3D0xfffffe25eefa2e00, flags=3D) at /usr/src/= sys/kern/sched_ule.c:2147 >> 4 Thread 103940 (PID=3D72980: rm) sch= ed_switch (td=3D0xfffffe2451932500, flags=3D) at /usr/src/= sys/kern/sched_ule.c:2147 >> 5 Thread 101698 (PID=3D72559: tail) sch= ed_switch (td=3D0xfffffe255eac0000, flags=3D) at /usr/src/= sys/kern/sched_ule.c:2147 >> 6 Thread 103660 (PID=3D72280: timestamp) sch= ed_switch (td=3D0xfffffe25f948aa00, flags=3D) at /usr/src/= sys/kern/sched_ule.c:2147 >> 7 Thread 101249 (PID=3D72280: timestamp/prefix_stdout) sch= ed_switch (td=3D0xfffffe264412a100, flags=3D) at /usr/src/= sys/kern/sched_ule.c:2147 >> 8 Thread 101255 (PID=3D72280: timestamp/prefix_stderr) sch= ed_switch (td=3D0xfffffe25c8e9bc00, flags=3D) at /usr/src/= sys/kern/sched_ule.c:2147 >> 9 Thread 104356 (PID=3D72267: gmake) sch= ed_switch (td=3D0xfffffe24aadb6100, flags=3D) at /usr/src/= sys/kern/sched_ule.c:2147 >> 10 Thread 108476 (PID=3D66957: vim) sch= ed_switch (td=3D0xfffffe26c8601500, flags=3D) at /usr/src/= sys/kern/sched_ule.c:2147 >=20 > The 2 threads holding shared lock on 
vnode 0xfffff80571f29500: >=20 > The tail thread (101698) is waiting for a zfs rangelock getting pages > for vnode 0xfffff80571f29500 >=20 >> (kgdb) thread 5 >> [Switching to thread 5 (Thread 101698)] >> #0 sched_switch (td=3D0xfffffe255eac0000, flags=3D) at= /usr/src/sys/kern/sched_ule.c:2147 >> 2147 cpuid =3D td->td_oncpu =3D PCPU_GET(cpuid); >> (kgdb) backtrace >> #0 sched_switch (td=3D0xfffffe255eac0000, flags=3D) at= /usr/src/sys/kern/sched_ule.c:2147 >> #1 0xffffffff80bce615 in mi_switch (flags=3D260) at /usr/src/sys/kern= /kern_synch.c:542 >> #2 0xffffffff80c1cfea in sleepq_switch (wchan=3D0xfffff810fb57dd48, p= ri=3D0) at /usr/src/sys/kern/subr_sleepqueue.c:625 >> #3 0xffffffff80b57f0a in _cv_wait (cvp=3D0xfffff810fb57dd48, lock=3D0= xfffff80049a99040) at /usr/src/sys/kern/kern_condvar.c:146 >> #4 0xffffffff82434ab6 in rangelock_enter_reader (rl=3D0xfffff80049a99= 018, new=3D0xfffff8022cadb100) at /usr/src/sys/cddl/contrib/opensolaris/u= ts/common/fs/zfs/zfs_rlock.c:429 >> #5 rangelock_enter (rl=3D0xfffff80049a99018, off=3D, l= en=3D, type=3D) at /usr/src/sys/cddl/contri= b/opensolaris/uts/common/fs/zfs/zfs_rlock.c:477 >> #6 0xffffffff82443d3f in zfs_getpages (vp=3D, ma=3D0xf= ffffe259f204b18, count=3D, rbehind=3D0xfffffe259f204ac4, r= ahead=3D0xfffffe259f204ad0) at /usr/src/sys/cddl/contrib/opensolaris/uts/= common/fs/zfs/zfs_vnops.c:4500 >> #7 zfs_freebsd_getpages (ap=3D) at /usr/src/sys/cddl/c= ontrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4567 >> #8 0xffffffff810e3ab9 in VOP_GETPAGES_APV (vop=3D0xffffffff8250a1e0 <= zfs_vnodeops>, a=3D0xfffffe259f2049f0) at vnode_if.c:2644 >> #9 0xffffffff80f349e7 in VOP_GETPAGES (vp=3D, m=3D, count=3D, rbehind=3D, rahead=3D) at ./vnode_if.h:1171 >> #10 vnode_pager_getpages (object=3D, m=3D, count=3D, rbehind=3D, rahead=3D= ) at /usr/src/sys/vm/vnode_pager.c:743 >> #11 0xffffffff80f2a93f in vm_pager_get_pages (object=3D0xfffff806cb637= c60, m=3D0xfffffe259f204b18, count=3D1, rbehind=3D, rahead=3D= ) at 
/usr/src/sys/vm/vm_pager.c:305 >> #12 0xffffffff80f054b0 in vm_fault_getpages (fs=3D, ner= a=3D0, behindp=3D, aheadp=3D) at /usr/src/s= ys/vm/vm_fault.c:1163 >> #13 vm_fault (map=3D, vaddr=3D, fault_ty= pe=3D, fault_flags=3D, m_hold=3D) at /usr/src/sys/vm/vm_fault.c:1394 >> #14 0xffffffff80f04bde in vm_fault_trap (map=3D0xfffffe25653949e8, vad= dr=3D, fault_type=3D, fault_flags=3D0, sign= o=3D0xfffffe259f204d04, ucode=3D0xfffffe259f204d00) at /usr/src/sys/vm/vm= _fault.c:589 >> #15 0xffffffff8106544e in trap_pfault (frame=3D0xfffffe259f204d40, use= rmode=3D, signo=3D, ucode=3D) = at /usr/src/sys/amd64/amd64/trap.c:821 >> #16 0xffffffff81064a9c in trap (frame=3D0xfffffe259f204d40) at /usr/sr= c/sys/amd64/amd64/trap.c:340 >> #17 >> #18 0x00000000002034fc in ?? () >> (kgdb) frame 11 >> #11 0xffffffff80f2a93f in vm_pager_get_pages (object=3D0xfffff806cb637= c60, m=3D0xfffffe259f204b18, count=3D1, rbehind=3D, rahead=3D= ) at /usr/src/sys/vm/vm_pager.c:305 >> 305 r =3D (*pagertab[object->type]->pgo_getpages)(object, = m, count, rbehind, >> (kgdb) p *object >> $10 =3D {lock =3D {lock_object =3D {lo_name =3D 0xffffffff8114fa30 "vm= object", lo_flags =3D 627245056, lo_data =3D 0, lo_witness =3D 0x0}, rw_= lock =3D 1}, object_list =3D {tqe_next =3D 0xfffff806cb637d68, tqe_prev =3D= 0xfffff806cb637b78}, shadow_head =3D {lh_first =3D 0x0}, shadow_list =3D= {le_next =3D 0xffffffffffffffff, >> le_prev =3D 0xffffffffffffffff}, memq =3D {tqh_first =3D 0xfffffe0= 01cbca850, tqh_last =3D 0xfffffe001cbca860}, rtree =3D {rt_root =3D 18446= 741875168421969}, size =3D 1099, domain =3D {dr_policy =3D 0x0, dr_iter =3D= 0}, generation =3D 1, cleangeneration =3D 1, ref_count =3D 2, shadow_cou= nt =3D 0, memattr =3D 6 '\006', type =3D 2 '\002', >> flags =3D 4096, pg_color =3D 0, paging_in_progress =3D {__count =3D = 2}, busy =3D {__count =3D 0}, resident_page_count =3D 1, backing_object =3D= 0x0, backing_object_offset =3D 0, pager_object_list =3D {tqe_next =3D 0x= 0, tqe_prev =3D 0x0}, rvq 
=3D {lh_first =3D 0x0}, handle =3D 0xfffff80571= f29500, un_pager =3D {vnp =3D {vnp_size =3D 4499568, >> writemappings =3D 0}, devp =3D {devp_pglist =3D {tqh_first =3D 0= x44a870, tqh_last =3D 0x0}, ops =3D 0x0, dev =3D 0x0}, sgp =3D {sgp_pglis= t =3D {tqh_first =3D 0x44a870, tqh_last =3D 0x0}}, swp =3D {swp_tmpfs =3D= 0x44a870, swp_blks =3D {pt_root =3D 0}, writemappings =3D 0}}, cred =3D = 0x0, charge =3D 0, umtx_data =3D 0x0} >> (kgdb) p object->handle >> $11 =3D (void *) 0xfffff80571f29500 >=20 >> (kgdb) p *(struct vnode *) 0xfffff80571f29500 >> $18 =3D {v_type =3D VREG, v_irflag =3D 0, v_op =3D 0xffffffff8250a1e0 = , v_data =3D 0xfffff80049a99000, v_mount =3D 0xfffffe247e5f= 3700, v_nmntvnodes =3D {tqe_next =3D 0xfffff8086eb38a00, tqe_prev =3D 0xf= ffff80461c2d7a0}, {v_mountedhere =3D 0x0, v_unpcb =3D 0x0, v_rdev =3D 0x0= , v_fifoinfo =3D 0x0}, v_hashlist =3D {le_next =3D 0x0, >> le_prev =3D 0x0}, v_cache_src =3D {lh_first =3D 0x0}, v_cache_dst = =3D {tqh_first =3D 0x0, tqh_last =3D 0xfffff80571f29550}, v_cache_dd =3D = 0x0, v_lock =3D {lock_object =3D {lo_name =3D 0xffffffff82486a37 "zfs", l= o_flags =3D 117112832, lo_data =3D 0, lo_witness =3D 0x0}, lk_lock =3D 37= , lk_exslpfail =3D 0, lk_timo =3D 51, lk_pri =3D 96, >> lk_stack =3D {depth =3D 14, pcs =3D {18446744071574211087, 1844674= 4071579773504, 18446744071575377140, 18446744071600058176, 18446744071600= 059934, 18446744071579761478, 18446744071575195096, 18446744071579761046,= 18446744071575236193, 18446744071575233396, 18446744071575349311, 184467= 44071575351295, >> 18446744071579263549, 18446744071579087520, 0, 0, 0, 0}}}, v_i= nterlock =3D {lock_object =3D {lo_name =3D 0xffffffff8123c142 "vnode inte= rlock", lo_flags =3D 16973824, lo_data =3D 0, lo_witness =3D 0xfffff8123f= d73600}, mtx_lock =3D 0}, v_vnlock =3D 0xfffff80571f29568, v_vnodelist =3D= {tqe_next =3D 0xfffff8064bd0dc80, >> tqe_prev =3D 0xfffff80e250788d8}, v_lazylist =3D {tqe_next =3D 0x0= , tqe_prev =3D 0x0}, v_bufobj =3D {bo_lock =3D 
{lock_object =3D {lo_name = =3D 0xffffffff811fb7ab "bufobj interlock", lo_flags =3D 86179840, lo_data= =3D 0, lo_witness =3D 0xfffff8123fd7dd80}, rw_lock =3D 1}, bo_ops =3D 0x= ffffffff8191ead0 , >> bo_object =3D 0xfffff806cb637c60, bo_synclist =3D {le_next =3D 0x0= , le_prev =3D 0x0}, bo_private =3D 0xfffff80571f29500, bo_clean =3D {bv_h= d =3D {tqh_first =3D 0x0, tqh_last =3D 0xfffff80571f296c0}, bv_root =3D {= pt_root =3D 0}, bv_cnt =3D 0}, bo_dirty =3D {bv_hd =3D {tqh_first =3D 0x0= , tqh_last =3D 0xfffff80571f296e0}, bv_root =3D {pt_root =3D 0}, >> bv_cnt =3D 0}, bo_numoutput =3D 0, bo_flag =3D 0, bo_domain =3D = 5, bo_bsize =3D 131072}, v_pollinfo =3D 0x0, v_label =3D 0x0, v_lockf =3D= 0x0, v_rl =3D {rl_waiters =3D {tqh_first =3D 0xfffff80f2cc12708, tqh_las= t =3D 0xfffff80f2cc12708}, rl_currdep =3D 0x0}, v_cstart =3D 0, v_lasta =3D= 0, v_lastw =3D 0, v_clen =3D 0, v_holdcnt =3D 2, >> v_usecount =3D 6, v_iflag =3D 0, v_vflag =3D 0, v_mflag =3D 0, v_dba= tchcpu =3D -1, v_writecount =3D 1, v_hash =3D 45676874} >=20 > Is that thread busying the vm object? >=20 >=20 > thread 101255 (timestamp/prefix_stderr) which is also acting on vnode > 0xfffff80571f29500 that the tail thread 101698 was. 
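A side note on the `lk_lock = 37` value in the `v_lock` dump above: it can be decoded by hand. The sketch below is only an illustration. The bit layout (four flag bits, sharer count shifted by 4) is an assumption about sys/sys/lockmgr.h as of this revision; newer trees may use a different shift. `decode_lk_lock` is a hypothetical helper name, not a kernel or kgdb function.

```python
# Assumed lockmgr bit layout for this kernel revision (check sys/sys/lockmgr.h).
LK_SHARE = 0x01
LK_SHARED_WAITERS = 0x02
LK_EXCLUSIVE_WAITERS = 0x04
LK_EXCLUSIVE_SPINNERS = 0x08
LK_FLAGMASK = 0x0F
LK_SHARERS_SHIFT = 4  # assumption: sharer count lives above the 4 flag bits


def decode_lk_lock(value):
    """Split a raw lk_lock word into its flag bits and sharer count or owner."""
    info = {
        "shared": bool(value & LK_SHARE),
        "shared_waiters": bool(value & LK_SHARED_WAITERS),
        "exclusive_waiters": bool(value & LK_EXCLUSIVE_WAITERS),
        "exclusive_spinners": bool(value & LK_EXCLUSIVE_SPINNERS),
    }
    if info["shared"]:
        # Shared mode: the upper bits count the current shared holders.
        info["sharers"] = value >> LK_SHARERS_SHIFT
    else:
        # Exclusive mode: the upper bits are the owning thread pointer.
        info["owner"] = value & ~LK_FLAGMASK
    return info


if __name__ == "__main__":
    print(decode_lk_lock(37))
```

Under those assumptions, 37 decodes to a shared lock with two holders and exclusive waiters, which agrees with the "SHARED (count 2) with exclusive waiters pending" line that ddb printed for vnode 0xfffff80571f29500.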
>=20 >> (kgdb) thread >> [Current thread is 8 (Thread 101255)] >> (kgdb) backtrace >> #0 sched_switch (td=3D0xfffffe25c8e9bc00, flags=3D) at= /usr/src/sys/kern/sched_ule.c:2147 >> #1 0xffffffff80bce615 in mi_switch (flags=3D260) at /usr/src/sys/kern= /kern_synch.c:542 >> #2 0xffffffff80c1cfea in sleepq_switch (wchan=3D0xfffffe001cbca850, p= ri=3D84) at /usr/src/sys/kern/subr_sleepqueue.c:625 >> #3 0xffffffff80f1de50 in _vm_page_busy_sleep (obj=3D, = m=3D0xfffffe001cbca850, pindex=3D, wmesg=3D= , allocflags=3D21504, locked=3Dfalse) at /usr/src/sys/vm/vm_page.c:1094 >> #4 0xffffffff80f241f7 in vm_page_grab_sleep (object=3D0xfffff806cb637= c60, m=3D, pindex=3D, wmesg=3D, allocflags=3D21504, locked=3D) at /usr/src/sys/vm/vm_page.c:4326 >> #5 vm_page_acquire_unlocked (object=3D0xfffff806cb637c60, pindex=3D10= 98, prev=3D, mp=3D0xfffffe2717fc6730, allocflags=3D21504) = at /usr/src/sys/vm/vm_page.c:4469 >> #6 0xffffffff80f24c61 in vm_page_grab_valid_unlocked (mp=3D0xfffffe27= 17fc6730, object=3D0xfffff806cb637c60, pindex=3D1098, allocflags=3D21504)= at /usr/src/sys/vm/vm_page.c:4645 >> #7 0xffffffff82440246 in page_busy (vp=3D0xfffff80571f29500, start=3D= 4497408, off=3D, nbytes=3D) at /usr/src/sys= /cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:414 >> #8 update_pages (vp=3D0xfffff80571f29500, start=3D4497408, len=3D32, = os=3D0xfffff8096a277400, oid=3D2209520, segflg=3D, tx=3D) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs= /zfs_vnops.c:482 >> #9 zfs_write (vp=3D, uio=3D, ioflag=3D0= , cr=3D, ct=3D) at /usr/src/sys/cddl/contri= b/opensolaris/uts/common/fs/zfs/zfs_vnops.c:1071 >> #10 zfs_freebsd_write (ap=3D) at /usr/src/sys/cddl/cont= rib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4838 >> #11 0xffffffff810e0eaf in VOP_WRITE_APV (vop=3D0xffffffff8250a1e0 , a=3D0xfffffe2717fc68c8) at vnode_if.c:925 >> #12 0xffffffff80cb574c in VOP_WRITE (vp=3D0xfffff80571f29500, uio=3D0x= fffffe2717fc6bb0, ioflag=3D8323073, cred=3D) at ./vnode_if= =2Eh:413 >> #13 
vn_write (fp=3D0xfffff8048195e8c0, uio=3D, active_c= red=3D, flags=3D, td=3D) at = /usr/src/sys/kern/vfs_vnops.c:894 >> #14 0xffffffff80cb50c3 in vn_io_fault_doio (args=3D0xfffffe2717fc6af0,= uio=3D0xfffffe2717fc6bb0, td=3D0xfffffe25c8e9bc00) at /usr/src/sys/kern/= vfs_vnops.c:959 >> #15 0xffffffff80cb1c8c in vn_io_fault1 (vp=3D, uio=3D0x= fffffe2717fc6bb0, args=3D0xfffffe2717fc6af0, td=3D0xfffffe25c8e9bc00) at = /usr/src/sys/kern/vfs_vnops.c:1077 >> #16 0xffffffff80cafa32 in vn_io_fault (fp=3D0xfffff8048195e8c0, uio=3D= 0xfffffe2717fc6bb0, active_cred=3D0xfffff80f2cc12708, flags=3D0, td=3D) at /usr/src/sys/kern/vfs_vnops.c:1181 >> #17 0xffffffff80c34331 in fo_write (fp=3D0xfffff8048195e8c0, uio=3D0xf= ffffe2717fc6bb0, active_cred=3D, flags=3D, td=3D= 0xfffffe25c8e9bc00) at /usr/src/sys/sys/file.h:326 >> #18 dofilewrite (td=3D0xfffffe25c8e9bc00, fd=3D2, fp=3D0xfffff8048195e= 8c0, auio=3D0xfffffe2717fc6bb0, offset=3D, flags=3D) at /usr/src/sys/kern/sys_generic.c:564 >> #19 0xffffffff80c33eb0 in kern_writev (td=3D0xfffffe25c8e9bc00, fd=3D2= , auio=3D) at /usr/src/sys/kern/sys_generic.c:491 >> #20 sys_write (td=3D0xfffffe25c8e9bc00, uap=3D) at /usr= /src/sys/kern/sys_generic.c:406 >> #21 0xffffffff8106623d in syscallenter (td=3D) at /usr/= src/sys/amd64/amd64/../../kern/subr_syscall.c:150 >> #22 amd64_syscall (td=3D0xfffffe25c8e9bc00, traced=3D0) at /usr/src/sy= s/amd64/amd64/trap.c:1161 >> #23 >> #24 0x000000080043d53a in ?? () >=20 > Maybe r358443 is related? 
>=20 >=20 >> (kgdb) frame 4 >> #4 0xffffffff80f241f7 in vm_page_grab_sleep (object=3D0xfffff806cb637= c60, m=3D, pindex=3D, wmesg=3D, allocflags=3D21504, locked=3D) at /usr/src/sys/vm/vm_page.c:4326 >> 4326 if (_vm_page_busy_sleep(object, m, m->pindex, wmesg, a= llocflags, >> (kgdb) p *object >> $8 =3D {lock =3D {lock_object =3D {lo_name =3D 0xffffffff8114fa30 "vm = object", lo_flags =3D 627245056, lo_data =3D 0, lo_witness =3D 0x0}, rw_l= ock =3D 1}, object_list =3D {tqe_next =3D 0xfffff806cb637d68, tqe_prev =3D= 0xfffff806cb637b78}, shadow_head =3D {lh_first =3D 0x0}, shadow_list =3D= {le_next =3D 0xffffffffffffffff, >> le_prev =3D 0xffffffffffffffff}, memq =3D {tqh_first =3D 0xfffffe0= 01cbca850, tqh_last =3D 0xfffffe001cbca860}, rtree =3D {rt_root =3D 18446= 741875168421969}, size =3D 1099, domain =3D {dr_policy =3D 0x0, dr_iter =3D= 0}, generation =3D 1, cleangeneration =3D 1, ref_count =3D 2, shadow_cou= nt =3D 0, memattr =3D 6 '\006', type =3D 2 '\002', >> flags =3D 4096, pg_color =3D 0, paging_in_progress =3D {__count =3D = 2}, busy =3D {__count =3D 0}, resident_page_count =3D 1, backing_object =3D= 0x0, backing_object_offset =3D 0, pager_object_list =3D {tqe_next =3D 0x= 0, tqe_prev =3D 0x0}, rvq =3D {lh_first =3D 0x0}, handle =3D 0xfffff80571= f29500, un_pager =3D {vnp =3D {vnp_size =3D 4499568, >> writemappings =3D 0}, devp =3D {devp_pglist =3D {tqh_first =3D 0= x44a870, tqh_last =3D 0x0}, ops =3D 0x0, dev =3D 0x0}, sgp =3D {sgp_pglis= t =3D {tqh_first =3D 0x44a870, tqh_last =3D 0x0}}, swp =3D {swp_tmpfs =3D= 0x44a870, swp_blks =3D {pt_root =3D 0}, writemappings =3D 0}}, cred =3D = 0x0, charge =3D 0, umtx_data =3D 0x0} >> (kgdb) frame 5 >> #5 vm_page_acquire_unlocked (object=3D0xfffff806cb637c60, pindex=3D10= 98, prev=3D, mp=3D0xfffffe2717fc6730, allocflags=3D21504) = at /usr/src/sys/vm/vm_page.c:4469 >> 4469 if (!vm_page_grab_sleep(object, m, pindex, "pg= nslp", >> (kgdb) p *m >> $9 =3D {plinks =3D {q =3D {tqe_next =3D 0xffffffffffffffff, tqe_prev =3D= 
0xffffffffffffffff}, s = {ss = {sle_next = 0xffffffffffffffff}}, memguard = {p = 18446744073709551615, v = 18446744073709551615}, uma = {slab = 0xffffffffffffffff, zone = 0xffffffffffffffff}}, listq = {tqe_next = 0x0, tqe_prev = 0xfffff806cb637ca8},
>> object = 0xfffff806cb637c60, pindex = 1098, phys_addr = 18988408832, md = {pv_list = {tqh_first = 0x0, tqh_last = 0xfffffe001cbca888}, pv_gen = 44682, pat_mode = 6}, ref_count = 2147483648, busy_lock = 1588330502, a = {{flags = 0, queue = 255 '\377', act_count = 0 '\000'}, _bits = 16711680}, order = 13 '\r',
>> pool = 0 '\000', flags = 1 '\001', oflags = 0 '\000', psind = 0 '\000', segind = 6 '\006', valid = 0 '\000', dirty = 0 '\000'}
>
> Pretty sure this thread is holding the rangelock from zfs_write() that
> tail is waiting on. So what exactly is this thread (101255) waiting on?
> I'm not sure how to track down what is using vm object
> 0xfffff806cb637c60. If the tail thread busied the page then they are
> waiting on each other, I guess. If that's true then r358443 removing the
> write lock on the object in update_pages() could be a problem.
>
>
> Not sure the rest is interesting. I think they are just waiting on the
> locked vnode, but I include it here in case I missed something.
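To make the suspected cycle concrete, here is a small wait-for-graph sketch. It is a model of the report, not kernel code. Thread ids, names and wait channels come from the backtraces in this message; who holds what is partly inference and is marked as such in the comments. In particular, "tail busied the page" is only the guess made above, so it is a switch. `holders` and `find_cycle` are hypothetical helper names.

```python
# Thread ids and names as listed by kgdb "info threads".
THREADS = {
    101698: "tail",
    101255: "timestamp/prefix_stderr",
    101249: "timestamp/prefix_stdout",
    104356: "gmake",
    108476: "vim",
    107378: "find",
}

# What each blocked thread sleeps on, read off the backtraces.
WAITS_FOR = {
    101698: "rangelock",    # rangelock_enter_reader() under zfs_getpages()
    101255: "page 1098",    # _vm_page_busy_sleep() under update_pages()
    101249: "file offset",  # foffset_lock() on fp 0xfffff8048195e8c0
    104356: "file vnode",   # exclusive lock on vnode 0xfffff80571f29500
    108476: "dir vnode",    # shared lock on vnode 0xfffff808a08f0a00
    107378: "dir vnode",    # shared lock on vnode 0xfffff808a08f0a00
}


def holders(tail_busied_page):
    """Return resource -> set of holding thread ids (partly inferred)."""
    held = {
        "rangelock": {101255},           # suspected: taken in zfs_write()
        "file offset": {101255},         # inferred: same fp in vn_io_fault()
        "file vnode": {101698, 101255},  # inferred: the two shared holders
        "dir vnode": {104356},           # EXCL owner per "show lockedvnods"
    }
    if tail_busied_page:
        held["page 1098"] = {101698}     # ASSUMPTION, not shown by the dump
    return held


def find_cycle(waits_for, held_by):
    """Depth-first search of the wait-for graph; return one cycle or None."""
    def visit(thread, path):
        if thread in path:
            return path[path.index(thread):]
        resource = waits_for.get(thread)
        if resource is None:
            return None
        for owner in sorted(held_by.get(resource, ())):
            cycle = visit(owner, path + [thread])
            if cycle:
                return cycle
        return None

    for thread in sorted(waits_for):
        cycle = visit(thread, [])
        if cycle:
            return cycle
    return None


if __name__ == "__main__":
    for guess in (True, False):
        cycle = find_cycle(WAITS_FOR, holders(guess))
        names = [THREADS[t] for t in cycle] if cycle else None
        print("tail busied page:", guess, "-> cycle:", names)
```

With the assumption switched on, the only cycle is between 101255 and tail (rangelock versus busy page); everything else (prefix_stdout, gmake, vim, find) is just queued behind those two. With it switched off the model has no cycle, so the page's busy owner is the piece that still needs confirming from the dump.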
>=20 > thread 101249 (timestamp/prefix_stdout) is also acting on vnode > 0xfffff80571f29500 >=20 >> (kgdb) thread 7 >> [Switching to thread 7 (Thread 101249)] >> #0 sched_switch (td=3D0xfffffe264412a100, flags=3D) at= /usr/src/sys/kern/sched_ule.c:2147 >> 2147 cpuid =3D td->td_oncpu =3D PCPU_GET(cpuid); >> (kgdb) backtrace >> #0 sched_switch (td=3D0xfffffe264412a100, flags=3D) at= /usr/src/sys/kern/sched_ule.c:2147 >> #1 0xffffffff80bce615 in mi_switch (flags=3D260) at /usr/src/sys/kern= /kern_synch.c:542 >> #2 0xffffffff80c1cfea in sleepq_switch (wchan=3D0xfffff8048195e8e2, p= ri=3D119) at /usr/src/sys/kern/subr_sleepqueue.c:625 >> #3 0xffffffff80bcdb6d in _sleep (ident=3D0xfffff8048195e8e2, lock=3D<= optimized out>, priority=3D119, wmesg=3D0xffffffff8123c694 "vofflock", sb= t=3D, pr=3D0, flags=3D256) at /usr/src/sys/kern/kern_synch= =2Ec:221 >> #4 0xffffffff80cb203a in foffset_lock (fp=3D0xfffff8048195e8c0, flags= =3D) at /usr/src/sys/kern/vfs_vnops.c:700 >> #5 0xffffffff80caf909 in foffset_lock_uio (fp=3D, uio=3D= , flags=3D) at /usr/src/sys/kern/vfs_vnops.= c:748 >> #6 vn_io_fault (fp=3D0xfffff8048195e8c0, uio=3D0xfffffe2719d9cbb0, ac= tive_cred=3D0xfffff80786ecad00, flags=3D0, td=3D0xfffffe264412a100) at /u= sr/src/sys/kern/vfs_vnops.c:1163 >> #7 0xffffffff80c34331 in fo_write (fp=3D0xfffff8048195e8c0, uio=3D0xf= ffffe2719d9cbb0, active_cred=3D, flags=3D, td=3D= 0xfffffe264412a100) at /usr/src/sys/sys/file.h:326 >> #8 dofilewrite (td=3D0xfffffe264412a100, fd=3D1, fp=3D0xfffff8048195e= 8c0, auio=3D0xfffffe2719d9cbb0, offset=3D, flags=3D) at /usr/src/sys/kern/sys_generic.c:564 >> #9 0xffffffff80c33eb0 in kern_writev (td=3D0xfffffe264412a100, fd=3D1= , auio=3D) at /usr/src/sys/kern/sys_generic.c:491 >> #10 sys_write (td=3D0xfffffe264412a100, uap=3D) at /usr= /src/sys/kern/sys_generic.c:406 >> #11 0xffffffff8106623d in syscallenter (td=3D) at /usr/= src/sys/amd64/amd64/../../kern/subr_syscall.c:150 >> #12 amd64_syscall (td=3D0xfffffe264412a100, traced=3D0) at 
/usr/src/sy= s/amd64/amd64/trap.c:1161 >> #13 >> #14 0x000000080043d53a in ?? () >> Backtrace stopped: Cannot access memory at address 0x7fffdfffddd8 >> (kgdb) frame 6 >> #6 vn_io_fault (fp=3D0xfffff8048195e8c0, uio=3D0xfffffe2719d9cbb0, ac= tive_cred=3D0xfffff80786ecad00, flags=3D0, td=3D0xfffffe264412a100) at /u= sr/src/sys/kern/vfs_vnops.c:1163 >> 1163 foffset_lock_uio(fp, uio, flags); >> (kgdb) p *fp >> $22 =3D {f_data =3D 0xfffff80571f29500, f_ops =3D 0xffffffff81923a10 <= vnops>, f_cred =3D 0xfffff80786ecad00, f_vnode =3D 0xfffff80571f29500, f_= type =3D 1, f_vnread_flags =3D 3, f_flag =3D 2, f_count =3D 4, {f_seqcoun= t =3D 127, f_pipegen =3D 127}, f_nextoff =3D 4499536, f_vnun =3D {fvn_cde= vpriv =3D 0x0, fvn_advice =3D 0x0}, f_offset =3D 4499536, >> f_label =3D 0x0} >=20 > thread 104356 (gmake) is just waiting on the lock for vnode > 0xfffff80571f29500 > It also is holding exclusive lock on directory vnode 0xfffff808a08f0a00= >=20 >> (kgdb) thread 9 >> [Switching to thread 9 (Thread 104356)] >> #0 sched_switch (td=3D0xfffffe24aadb6100, flags=3D) at= /usr/src/sys/kern/sched_ule.c:2147 >> 2147 cpuid =3D td->td_oncpu =3D PCPU_GET(cpuid); >> (kgdb) backtrace >> #0 sched_switch (td=3D0xfffffe24aadb6100, flags=3D) at= /usr/src/sys/kern/sched_ule.c:2147 >> #1 0xffffffff80bce615 in mi_switch (flags=3D260) at /usr/src/sys/kern= /kern_synch.c:542 >> #2 0xffffffff80c1cfea in sleepq_switch (wchan=3D0xfffff80571f29568, p= ri=3D96) at /usr/src/sys/kern/subr_sleepqueue.c:625 >> #3 0xffffffff80b954f6 in sleeplk (lk=3D0xfffff80571f29568, flags=3D53= 2480, ilk=3D, wmesg=3D, pri=3D, timo=3D51, queue=3D0) at /usr/src/sys/kern/kern_lock.c:295 >> #4 0xffffffff80b93a1e in lockmgr_xlock_hard (lk=3D0xfffff80571f29568,= flags=3D, ilk=3D0x0, file=3D, line=3D1432, l= wa=3D0xfffff80571f29568) at /usr/src/sys/kern/kern_lock.c:841 >> #5 0xffffffff810e2a40 in VOP_LOCK1_APV (vop=3D0xffffffff8250a1e0 , a=3D0xfffffe271833f4d8) at vnode_if.c:1989 >> #6 0xffffffff80cb14f4 in VOP_LOCK1 
(vp=0xfffff80571f29500, flags=532480, file=0xffffffff82472ac9 "/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c", line=1432) at ./vnode_if.h:879
>> #7 _vn_lock (vp=0xfffff80571f29500, flags=532480, file=0xffffffff82472ac9 "/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c", line=1432) at /usr/src/sys/kern/vfs_vnops.c:1613
>> #8 0xffffffff8243af40 in zfs_lookup_lock (dvp=0xfffff808a08f0a00, vp=0xfffff80571f29500, name=0xfffffe271833f630 "copool-basic.sh.log", lkflags=532480) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:1432
>> #9 zfs_lookup (dvp=0xfffff808a08f0a00, nm=0xfffffe271833f630 "copool-basic.sh.log", vpp=, cnp=0xfffffe271833faf0, nameiop=2, cr=, td=, flags=0, cached=1) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:1606
>> #10 0xffffffff8243b61e in zfs_freebsd_lookup (ap=0xfffffe271833f780, cached=) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4900
>> #11 zfs_freebsd_cachedlookup (ap=0xfffffe271833f780) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4908
>> #12 0xffffffff810dfb46 in VOP_CACHEDLOOKUP_APV (vop=0xffffffff8250a1e0 , a=0xfffffe271833f780) at vnode_if.c:180
>> #13 0xffffffff80c84dd8 in VOP_CACHEDLOOKUP (dvp=0xfffff808a08f0a00, vpp=0xfffffe271833fac0, cnp=0xfffffe271833faf0) at ./vnode_if.h:80
>> #14 vfs_cache_lookup (ap=) at /usr/src/sys/kern/vfs_cache.c:2149
>> #15 0xffffffff810df996 in VOP_LOOKUP_APV (vop=0xffffffff8250a1e0 , a=0xfffffe271833f820) at vnode_if.c:117
>> #16 0xffffffff80c8ee61 in VOP_LOOKUP (dvp=0xfffff808a08f0a00, vpp=0xfffffe271833fac0, cnp=0xfffffe271833faf0) at ./vnode_if.h:54
>> #17 lookup (ndp=0xfffffe271833fa60) at /usr/src/sys/kern/vfs_lookup.c:951
>> #18 0xffffffff80c8e374 in namei (ndp=0xfffffe271833fa60) at /usr/src/sys/kern/vfs_lookup.c:512
>> #19 0xffffffff80ca9e69 in
kern_funlinkat (td=0xfffffe24aadb6100, dfd=-100, path=0x800a3982e , fd=, pathseg=UIO_USERSPACE, flag=, oldinum=0) at /usr/src/sys/kern/vfs_syscalls.c:1819
>> #20 0xffffffff80ca9db8 in sys_unlink (td=, uap=) at /usr/src/sys/kern/vfs_syscalls.c:1747
>> #21 0xffffffff82780586 in filemon_wrapper_unlink (td=, uap=0xfffffe24aadb64d8) at /usr/src/sys/dev/filemon/filemon_wrapper.c:350
>> #22 0xffffffff8106623d in syscallenter (td=) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:150
>> #23 amd64_syscall (td=0xfffffe24aadb6100, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1161
>
> thread 108476 (vim) is waiting to lock the directory vnode
> 0xfffff808a08f0a00
>
>> (kgdb) thread 10
>> [Switching to thread 10 (Thread 108476)]
>> #0 sched_switch (td=0xfffffe26c8601500, flags=) at /usr/src/sys/kern/sched_ule.c:2147
>> 2147 cpuid = td->td_oncpu = PCPU_GET(cpuid);
>> (kgdb) backtrace
>> #0 sched_switch (td=0xfffffe26c8601500, flags=) at /usr/src/sys/kern/sched_ule.c:2147
>> #1 0xffffffff80bce615 in mi_switch (flags=260) at /usr/src/sys/kern/kern_synch.c:542
>> #2 0xffffffff80c1cfea in sleepq_switch (wchan=0xfffff808a08f0a68, pri=96) at /usr/src/sys/kern/subr_sleepqueue.c:625
>> #3 0xffffffff80b954f6 in sleeplk (lk=0xfffff808a08f0a68, flags=2105344, ilk=, wmesg=, pri=, timo=51, queue=1) at /usr/src/sys/kern/kern_lock.c:295
>> #4 0xffffffff80b93525 in lockmgr_slock_hard (lk=0xfffff808a08f0a68, flags=2105344, ilk=, file=0xffffffff811fb967 "/usr/src/sys/kern/vfs_subr.c", line=2930, lwa=) at /usr/src/sys/kern/kern_lock.c:649
>> #5 0xffffffff810e2a40 in VOP_LOCK1_APV (vop=0xffffffff8250a1e0 , a=0xfffffe271d46d6b8) at vnode_if.c:1989
>> #6 0xffffffff80cb14f4 in VOP_LOCK1 (vp=0xfffff808a08f0a00, flags=2105344, file=0xffffffff811fb967 "/usr/src/sys/kern/vfs_subr.c", line=2930) at ./vnode_if.h:879
>> #7 _vn_lock (vp=0xfffff808a08f0a00, flags=2105344,
file=0xffffffff811fb967 "/usr/src/sys/kern/vfs_subr.c", line=2930) at /usr/src/sys/kern/vfs_vnops.c:1613
>> #8 0xffffffff80c9b3ec in vget_finish (vp=0xfffff808a08f0a00, flags=2105344, vs=VGET_USECOUNT) at /usr/src/sys/kern/vfs_subr.c:2930
>> #9 0xffffffff80c8051c in cache_lookup (dvp=, vpp=<optimized out>, cnp=, tsp=, ticksp=) at /usr/src/sys/kern/vfs_cache.c:1407
>> #10 0xffffffff80c84dad in vfs_cache_lookup (ap=) at /usr/src/sys/kern/vfs_cache.c:2147
>> #11 0xffffffff810df996 in VOP_LOOKUP_APV (vop=0xffffffff8250a1e0 , a=0xfffffe271d46d8a0) at vnode_if.c:117
>> #12 0xffffffff80c8ee61 in VOP_LOOKUP (dvp=0xfffff804de66e500, vpp=0xfffffe271d46da60, cnp=0xfffffe271d46da90) at ./vnode_if.h:54
>> #13 lookup (ndp=0xfffffe271d46da00) at /usr/src/sys/kern/vfs_lookup.c:951
>> #14 0xffffffff80c8e374 in namei (ndp=0xfffffe271d46da00) at /usr/src/sys/kern/vfs_lookup.c:512
>> #15 0xffffffff80caa83f in kern_statat (td=0xfffffe26c8601500, flag=, fd=, path=0x8049d12c0 , pathseg=UIO_USERSPACE, sbp=0xfffffe271d46db28, hook=0x0) at /usr/src/sys/kern/vfs_syscalls.c:2340
>> #16 0xffffffff80caafff in sys_fstatat (td=, uap=0xfffffe26c86018d8) at /usr/src/sys/kern/vfs_syscalls.c:2317
>> #17 0xffffffff81065c40 in syscallenter (td=) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:162
>> #18 amd64_syscall (td=0xfffffe26c8601500, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1161
>> #19
>> #20 0x00000008020ba75a in ??
()
>
> Lastly the find thread (107378) is waiting to lock the same directory
> vnode 0xfffff808a08f0a00
>
>> (kgdb) thread 3
>> [Switching to thread 3 (Thread 107378)]
>> #0 sched_switch (td=0xfffffe25eefa2e00, flags=) at /usr/src/sys/kern/sched_ule.c:2147
>> 2147 cpuid = td->td_oncpu = PCPU_GET(cpuid);
>> (kgdb) backtrace
>> #0 sched_switch (td=0xfffffe25eefa2e00, flags=) at /usr/src/sys/kern/sched_ule.c:2147
>> #1 0xffffffff80bce615 in mi_switch (flags=260) at /usr/src/sys/kern/kern_synch.c:542
>> #2 0xffffffff80c1cfea in sleepq_switch (wchan=0xfffff808a08f0a68, pri=96) at /usr/src/sys/kern/subr_sleepqueue.c:625
>> #3 0xffffffff80b954f6 in sleeplk (lk=0xfffff808a08f0a68, flags=2106368, ilk=, wmesg=, pri=, timo=51, queue=1) at /usr/src/sys/kern/kern_lock.c:295
>> #4 0xffffffff80b93525 in lockmgr_slock_hard (lk=0xfffff808a08f0a68, flags=2106368, ilk=, file=0xffffffff811f0ff4 "/usr/src/sys/kern/vfs_lookup.c", line=737, lwa=) at /usr/src/sys/kern/kern_lock.c:649
>> #5 0xffffffff810e2a40 in VOP_LOCK1_APV (vop=0xffffffff8250a1e0 , a=0xfffffe271bee5748) at vnode_if.c:1989
>> #6 0xffffffff80cb14f4 in VOP_LOCK1 (vp=0xfffff808a08f0a00, flags=2106368, file=0xffffffff811f0ff4 "/usr/src/sys/kern/vfs_lookup.c", line=737) at ./vnode_if.h:879
>> #7 _vn_lock (vp=0xfffff808a08f0a00, flags=2106368, file=0xffffffff811f0ff4 "/usr/src/sys/kern/vfs_lookup.c", line=737) at /usr/src/sys/kern/vfs_vnops.c:1613
>> #8 0xffffffff80c8e93d in lookup (ndp=0xfffffe271bee5a88) at /usr/src/sys/kern/vfs_lookup.c:735
>> #9 0xffffffff80c8e374 in namei (ndp=0xfffffe271bee5a88) at /usr/src/sys/kern/vfs_lookup.c:512
>> #10 0xffffffff80cb0bdb in vn_open_cred (ndp=0xfffffe271bee5a88, flagp=0xfffffe271bee5bb4, cmode=0, vn_open_flags=0, cred=0xfffff80786ecad00, fp=0xfffff802a8627690) at /usr/src/sys/kern/vfs_vnops.c:288
>> #11 0xffffffff80ca8a8a in kern_openat (td=0xfffffe25eefa2e00, fd=<optimized out>, path=, pathseg=, flags=1048577, mode=) at /usr/src/sys/kern/vfs_syscalls.c:1083
>> #12 0xffffffff82780415 in filemon_wrapper_openat (td=0xfffffe25eefa2e00, uap=0xfffffe25eefa31d8) at /usr/src/sys/dev/filemon/filemon_wrapper.c:232
>> #13 0xffffffff8106623d in syscallenter (td=) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:150
>> #14 amd64_syscall (td=0xfffffe25eefa2e00, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1161
>
>

-- 
Regards,
Bryan Drewery

--WGx0QEJgJvn7xIqcMq1W4goqrKA31Rpc3--

--Y3Kw281vOe14dImLo9fNoysaIpgYWAafe
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----

iQGTBAEBCgB9FiEE+Rc8ssOq6npcih8JNddxu25Gl88FAl67J3FfFIAAAAAALgAo
aXNzdWVyLWZwckBub3RhdGlvbnMub3BlbnBncC5maWZ0aGhvcnNlbWFuLm5ldEY5
MTczQ0IyQzNBQUVBN0E1QzhBMUYwOTM1RDc3MUJCNkU0Njk3Q0YACgkQNddxu25G
l8/ELQf/fKEZ4D6Vt5ewznAe4HuX2JYe76OKkQ2W86GJ0Poc3ypv5fmZCf8r29sZ
HRuEIHd2nbd9wV7EfRvtRXeREUmJ2peL2ISnZ/aJEVVynfZLkCPSEyRkCTjDnUw5
NVImJnTEetdqbDfofanWbVGKQo+w5QB2mvo0a13U5mRvC2fw9Mw/nOqHawGZg6oI
XuhCvXCRqkdv8X52oRply8on+mbfn5LlFIQCb/OOro7KXg/PbkQC616RCbaPDh63
BqEsQKbhC3zRG4g0hnY5w336iua8cBJxeZpv6ou57jlhY7mIYM7Z6WJYeZpeWvo6
RJNtUhZ4owYUhZgrCNuGWEz3jMKo2g==
=IAV4
-----END PGP SIGNATURE-----

--Y3Kw281vOe14dImLo9fNoysaIpgYWAafe--

From owner-freebsd-current@freebsd.org Wed May 13 04:52:15 2020 Return-Path: Delivered-To: freebsd-current@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 0EED32E701B for ; Wed, 13 May 2020 04:52:15 +0000 (UTC) (envelope-from marklmi@yahoo.com) Received: from sonic306-21.consmr.mail.gq1.yahoo.com (sonic306-21.consmr.mail.gq1.yahoo.com [98.137.68.84]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client did not present a certificate) by mx1.freebsd.org
(Postfix) with ESMTPS id 49MMj94FRtz4V9y for ; Wed, 13 May 2020 04:52:13 +0000 (UTC) (envelope-from marklmi@yahoo.com) X-YMail-OSG: hxTqXNoVM1krm1NdDyZZ5_3FVc.XfgWwVGVJChm5FDiLq78IW._fK0th4eou6bD C_sOD87eZ8lB9WxU5zMAC7Ib20OlNIX4OL0rUy0iDBYhLFyhSWQ_NCIwSUX9rWUloiLUw.otrXAY vur3B0W59dLxVCWw3JHHANWx9lr2RrpgnDWNWLzNwWJ3uOI1yU5r__vW7Hk1J4DER20UwUatIqFc 1OrlQgiHHn2RfY8v6EIgakn4w8lNc1ALP7MCYBikUyDeabWfXaAczmOjsKdAGFz8y0t8X3yFoFbt K1unfDk_oxmk3RR6it7hJv2lVi5vxv12sxgMYMtBLaMFAJeohreEifp.fPrnhOiLVncqMhe7uBfj O5WofNmJt3Ziw6YHx35ID7qA8sPlU3ybxFaOKetRvupxsm..eYw0fQCTZIw8TskhinvHnKoGYh65 .sCktndbEbgb7s9W74N9GxUoAq22HJfyUnRMZ1KwR1779vbgJdyi2ZC3fr1tjOIyUYFZ0Agu98kg Nqtjd9lVTW8r2YH5f3FDY9fk8eytxbot_PRGRNTIg8oEPgiOU8g8RXBBJGcKbvM06LDFYBzIp26K HmO.PukWZii2LWUMEcluMrUVIeIUzyiZsSAAAy7AE.2Jr81tEfFe9d9nByF9RkQxPjjrMh6P9Idp Ks2ypfMnemHfjvwlQCMXgd9AH6dEPt0OigpFHh.RwP.yr4dk4xKiPmrtWJEaqNGU_KzRSzfsMlqs DjpjnvcJOlJ1qmL_agoI2vEmmftvG0Nm7ba8fHO3zB25s4YptFRwIT.zxG6u.sYGLbQ6nt.f7NQG BcEIJFiODbB1i7OeYI0Ycjv4Sdai.RNTsCLpCH_xP3_KpozdQCCFyCRdIkAqvv7MC6y4uC8tTIOL FmX7mI5BTb2tYQWZmvxLDLPf2QiTaOVSxNT08ZtcfmE7vGqM16k9.yHHoEEs04Vvmexq4VyOe802 z9H5Kv3w2TJ9_O6jcqrtvaHUrVLeckceTFsAvKG1EnPxp7rqQBoi3YQ4sl4bQbzmK6oasZL8x39X o00IHDHmSIh9AytfQEEYFW_Y9f5Vj8QtHMgK.mxlsbr1rn7ROb6ODpTWlt1WfPPSdjyGHTwBi9vz ZbYMOBYmOsWYRDlvlPqldOPD9jyaFfWqVjD2CwDm819GrCPHNUoaIi5wBljd2iBgLhqkARRwZ.uN VdYQ4fzuAupTQL8vsabWhs.XDOZHVinXrkV2ZoF797d1cdTPV6KwbIYsvN0yWqU2KHTwnBozqLdF TCkggOM7q55tN12BAhS9KRI6JS.F8HQx_cYO10Lgfh2bJpLLEZLIm_.04HGUYraB.eGQwapP4jWp VBipvEN4jfkDYSBBZnVNKV2CQpS0bVNzunzTsw_35z5poEW9LEDrThIWHdAeXpHLZ05LM7DbHyhk i42GiepvWtj.IrrYNTKN5jDkTYBLQFQ-- Received: from sonic.gate.mail.ne1.yahoo.com by sonic306.consmr.mail.gq1.yahoo.com with HTTP; Wed, 13 May 2020 04:52:10 +0000 Received: by smtp422.mail.bf1.yahoo.com (VZM Hermes SMTP Server) with ESMTPA ID 43ddc5f4b8c93e8a5b6a17dbda3e0423; Wed, 13 May 2020 04:52:09 +0000 (UTC) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 13.4 \(3608.80.23.2.2\)) 
Subject: Re: svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311 From: Mark Millard In-Reply-To: Date: Tue, 12 May 2020 21:52:06 -0700 Cc: Brandon Bergren , Justin Hibbits Content-Transfer-Encoding: quoted-printable Message-Id: <9562EEE4-62EF-4164-91C0-948CC0432984@yahoo.com> References: <8479DD58-44F6-446A-9CA5-D01F0F7C1B38@yahoo.com> <17ACDA02-D7EF-4F26-874A-BB3E935CD072@yahoo.com> <695E6836-F860-4557-B7DE-CC1EDB347F18@yahoo.com> <121B9B09-141B-4DC3-918B-1E7CFB99E779@yahoo.com> <8AAB0462-3FA8-490C-8D8D-7C15B1C9E2DE@yahoo.com> <18E62746-80DB-4195-977D-4FF32D0129EE@yahoo.com> To: "vangyzen@freebsd.org" , svn-src-head@freebsd.org, FreeBSD Current , FreeBSD Hackers , FreeBSD PowerPC ML X-Mailer: Apple Mail (2.3608.80.23.2.2) X-Rspamd-Queue-Id: 49MMj94FRtz4V9y X-Spamd-Bar: / X-Spamd-Result: default: False [-0.97 / 15.00]; TO_DN_EQ_ADDR_SOME(0.00)[]; TO_DN_SOME(0.00)[]; R_SPF_ALLOW(-0.20)[+ptr:yahoo.com]; FREEMAIL_FROM(0.00)[yahoo.com]; MV_CASE(0.50)[]; DKIM_TRACE(0.00)[yahoo.com:+]; DMARC_POLICY_ALLOW(-0.50)[yahoo.com,reject]; RCPT_COUNT_SEVEN(0.00)[7]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; RCVD_TLS_LAST(0.00)[]; FREEMAIL_ENVFROM(0.00)[yahoo.com]; ASN(0.00)[asn:36647, ipnet:98.137.64.0/21, country:US]; MID_RHS_MATCH_FROM(0.00)[]; DWL_DNSWL_NONE(0.00)[yahoo.com.dwl.dnswl.org : 127.0.5.0]; ARC_NA(0.00)[]; R_DKIM_ALLOW(-0.20)[yahoo.com:s=s2048]; FROM_HAS_DN(0.00)[]; NEURAL_HAM_LONG(-0.57)[-0.570,0]; MIME_GOOD(-0.10)[text/plain]; IP_SCORE(0.00)[ip: (4.89), ipnet: 98.137.64.0/21(0.83), asn: 36647(0.66), country: US(-0.05)]; NEURAL_SPAM_MEDIUM(0.10)[0.097,0]; IP_SCORE_FREEMAIL(0.00)[]; TO_MATCH_ENVRCPT_SOME(0.00)[]; RCVD_IN_DNSWL_NONE(0.00)[84.68.137.98.list.dnswl.org : 127.0.5.0]; RWL_MAILSPIKE_POSSIBLE(0.00)[84.68.137.98.rep.mailspike.net : 127.0.0.17]; RCVD_COUNT_TWO(0.00)[2] X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.33 Precedence: list List-Id: 
Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 13 May 2020 04:52:15 -0000

[Yet another new kind of experiment. But this looks like I can cause problems in fairly short order on demand now. Finally! And with that I've much better evidence for kernel vs. user-space process for making the zeroed memory appear in, for example, nfsd.]

I've managed to get:

: /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h:258: Failed assertion: "slab == extent_slab_get(extent)"
: /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h:258: Failed assertion: "slab == extent_slab_get(extent)"

and eventually:

[1] Segmentation fault (core dumped) stress -m 2 --vm-bytes 1700M

from a user program (stress) while another machine was attempting an nfs mount during the stress activity:

# mount -onoatime,soft ...:/ /mnt && umount /mnt && rpcinfo -s ...
[tcp] ...:/: RPCPROG_MNT: RPC: Timed out

(To get a failure I may have to run the commands multiple times. Timing details against stress's activity seem to matter.)

That failure led to:

# ls -ldT /*.core*
-rw------- 1 root wheel 3899392 May 12 19:52:26 2020 /mountd.core

# ls -ldT *.core*
-rw------- 1 root wheel 2682880 May 12 20:00:26 2020 stress.core

(Note: which of nfsd, mountd, or rpcbind fails need not be fully repeatable. stress.core seems to be written twice, probably because of the -m 2 in use.)

The context that let me do this was to first run (on the 2-socket G4 with a full 2048 MiByte RAM configuration):

stress -m 2 --vm-bytes 1700M &

Note that the stress command keeps the memory busy and causes paging to the swap/page space. I've not tried to make it just fit without paging or just barely paging or such. The original context did not involve paging or low RAM, so I do not expect paging to be required, though it can be involved.
The stress program backtrace is different:

4827 return (tls_get_addr_slow(dtvp, index, offset));
4828 }
(gdb) bt -full
#0 0x41831b04 in tls_get_addr_common (dtvp=0x4186c010, index=2, offset=4294937444) at /usr/src/libexec/rtld-elf/rtld.c:4824
 dtv = 0x0
#1 0x4182bfcc in __tls_get_addr (ti=) at /usr/src/libexec/rtld-elf/powerpc/reloc.c:848
 tp =
 p =
#2 0x41a83464 in __get_locale () at /usr/src/lib/libc/locale/xlocale_private.h:199
No locals.
#3 fprintf (fp=0x41b355f8, fmt=0x1804cbc "%s: FAIL: [%lli] (%d) ") at /usr/src/lib/libc/stdio/fprintf.c:57
 ap = {{gpr = 2 '\002', fpr = 0 '\000', reserved = 20731, overflow_arg_area = 0xffffdb78, reg_save_area = 0xffffdae8}}
 ret =
#4 0x01802348 in main (argc=, argv=) at stress.c:415
 status =
 ret = 6
 do_dryrun = 0
 retval = 0
 children = 1
 do_backoff =
 do_hdd_bytes =
 do_hdd =
 do_vm_keep = 0
 do_vm_hang = -1
 do_vm_stride = 4096
 do_vm_bytes = 1782579200
 do_vm = 108174317627375616
 do_io =
 do_cpu =
 do_timeout = 108176117243859333
 starttime = 1589338322
 i =
 forks =
 pid = 6140
 stoptime =
 runtime =

Apparently the asserts did not stop the code and it ran until a failure occurred (via dtv=0x0).

Stress uses a mutex stored on a page that gets the "turns into zeros" problem, preventing the mprotect(ADDR,1,1) type of test because stress will write on the page. (I've not tried to find a minimal form of stress run.)

The same sort of globals are again zeroed, such as:

(gdb) print/x __je_sz_size2index_tab
$1 = {0x0 }

Another attempt lost rpcbind instead of mountd:

# ls -ldT /*.core
-rw------- 1 root wheel 3899392 May 12 19:52:26 2020 /mountd.core
-rw------- 1 root wheel 3170304 May 12 20:03:00 2020 /rpcbind.core

I again find that when I use gdb 3 times to:

attach ???
x/x __je_sz_size2index_tab
print (int)mprotect(ADDRESS,1,1)
quit

for each of rpcbind, mountd, and the nfsd master, those processes no longer fail during the mount/umount/rpcinfo sequence (or are far less likely to).

But it turns out that later, when I "service nfsd stop", nfsd does get the zeroed-memory-based assert and core dumps. (I'd done a bunch of the mount/umount/rpcinfo sequences before the stop.)

That the failure is during SIGUSR1-based shutdown leads me to wonder if killing off some child process(es) is involved in the problem.

There was *no* evidence of a signal for an attempt to write the page from the user process. It appears that the kernel is doing something that changes what the process sees, instead of the user-space program stomping on its own memory content.

I've no clue how to track down the kernel activity that changes what the process sees on some page(s) of memory. (Prior testing with a debug kernel did not report problems, despite getting an example failure. So that seems insufficient.)

At least a procedure is now known that does not involve waiting hours or days.

The procedure (adjusted for how much RAM is present and number of cpus/cores?) could be appropriate to run in other contexts than the 32-bit powerpc G4. Part of the context likely should be not using MALLOC_PRODUCTION, so problems would be detected sooner via the asserts in jemalloc.
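
For anyone wanting to repeat the three-command gdb check above (attach, x/x, mprotect) on several daemons, here is a minimal sketch that only builds the command line for one non-interactive gdb pass. It assumes a gdb new enough to accept -batch, -p and -ex; the function name, the pid and the ADDRESS value are hypothetical placeholders, since the message does not give them.

```python
import shlex

# Symbol examined in the message; everything else below is illustrative.
SYMBOL = "__je_sz_size2index_tab"


def gdb_check_argv(pid, address):
    """Build argv for one batch-mode gdb pass over a running process.

    pid and address are supplied by the caller: the message leaves both
    unspecified ("attach ???" and "ADDRESS").
    """
    return [
        "gdb", "-batch",
        "-p", str(pid),                                 # attach to the process
        "-ex", f"x/x {SYMBOL}",                         # examine the table
        "-ex", f"print (int)mprotect({address},1,1)",   # 1 byte, prot value 1
    ]


if __name__ == "__main__":
    # Hypothetical pid and address, shown only to illustrate the output shape.
    print(shlex.join(gdb_check_argv(1234, "0x50000000")))
```

The sketch deliberately stops at printing the command rather than running it, so the pid and address can be checked by eye before gdb touches a live daemon.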
=3D=3D=3D Mark Millard marklmi at yahoo.com ( dsl-only.net went away in early 2018-Mar) From owner-freebsd-current@freebsd.org Wed May 13 06:32:28 2020 Return-Path: Delivered-To: freebsd-current@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 9B7BD2EB0FD for ; Wed, 13 May 2020 06:32:28 +0000 (UTC) (envelope-from agapon@gmail.com) Received: from mailman.nyi.freebsd.org (unknown [127.0.1.3]) by mx1.freebsd.org (Postfix) with ESMTP id 49MPwr2jBsz4bHt for ; Wed, 13 May 2020 06:32:28 +0000 (UTC) (envelope-from agapon@gmail.com) Received: by mailman.nyi.freebsd.org (Postfix) id 5CA532EB0FB; Wed, 13 May 2020 06:32:28 +0000 (UTC) Delivered-To: current@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 5C5772EB0F8; Wed, 13 May 2020 06:32:28 +0000 (UTC) (envelope-from agapon@gmail.com) Received: from mail-lf1-f48.google.com (mail-lf1-f48.google.com [209.85.167.48]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1O1" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 49MPwq3w76z4bHn; Wed, 13 May 2020 06:32:27 +0000 (UTC) (envelope-from agapon@gmail.com) Received: by mail-lf1-f48.google.com with SMTP id c21so7992910lfb.3; Tue, 12 May 2020 23:32:27 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:to:from:subject:openpgp:autocrypt:message-id :date:user-agent:mime-version:content-language :content-transfer-encoding; bh=Mbo5G+POyWI/dgErMnyuXZ86nTTQ1Xn4vDqAOXQ23eg=; b=t9PxsPc3KmPiTW+PAWOtlMl8qOVBMUVbTkY2QTto/G8A/ZVmjMtZwytIg0a9NwWMoi /dx45tkPiURmVTv5Mfvfq4aVExTtA/LB1ovA3pcc+pmVm0CrOn9tYQMkc9UAHy94ul7P 
O1E7OplDya7c8voJXe5/6LKCDr204YUsmkzdvWT1+saZg9jN0fWKdbJd3fOGJq07S314 OlaEon7R2EawwkRBj2OvV7kAtAzpDTk/hiZkMYOjhs38hMChDJj6hvRH8Ae8I2yWzwx7 87hTUaBDtNzIhI8Gt/mV2kYC0pUKL0TYxuTytRxNA0s/R9m1dcjElc98EJcsPm7/bHM1 GLfQ== X-Gm-Message-State: AOAM530IO+/5qhYWnLGWKog8ea5X6QHdFm1hx6KYJ0eO3BOurg5rRrbQ psB8Dr2gNaDXB0+1duQWABVjpfl9siU= X-Google-Smtp-Source: ABdhPJzAg9kJ3Vy3gWiGpYsMycSXUQ/rcO+0k2y8ELDqv9G56BHXzCCl1AqPeClNsJVqbLwgvvQlfg== X-Received: by 2002:a05:6512:108f:: with SMTP id j15mr17082262lfg.19.1589351545479; Tue, 12 May 2020 23:32:25 -0700 (PDT) Received: from [192.168.0.88] (east.meadow.volia.net. [93.72.151.96]) by smtp.googlemail.com with ESMTPSA id r19sm13964358ljp.68.2020.05.12.23.32.24 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 12 May 2020 23:32:24 -0700 (PDT) To: FreeBSD Current , FreeBSD Hackers From: Andriy Gapon Subject: lkpi: print stack trace in WARN_ON ? Openpgp: preference=signencrypt Autocrypt: addr=avg@FreeBSD.org; prefer-encrypt=mutual; keydata= mQINBFm4LIgBEADNB/3lT7f15UKeQ52xCFQx/GqHkSxEdVyLFZTmY3KyNPQGBtyvVyBfprJ7 mAeXZWfhat6cKNRAGZcL5EmewdQuUfQfBdYmKjbw3a9GFDsDNuhDA2QwFt8BmkiVMRYyvI7l N0eVzszWCUgdc3qqM6qqcgBaqsVmJluwpvwp4ZBXmch5BgDDDb1MPO8AZ2QZfIQmplkj8Y6Z AiNMknkmgaekIINSJX8IzRzKD5WwMsin70psE8dpL/iBsA2cpJGzWMObVTtCxeDKlBCNqM1i gTXta1ukdUT7JgLEFZk9ceYQQMJJtUwzWu1UHfZn0Fs29HTqawfWPSZVbulbrnu5q55R4PlQ /xURkWQUTyDpqUvb4JK371zhepXiXDwrrpnyyZABm3SFLkk2bHlheeKU6Yql4pcmSVym1AS4 dV8y0oHAfdlSCF6tpOPf2+K9nW1CFA8b/tw4oJBTtfZ1kxXOMdyZU5fiG7xb1qDgpQKgHUX8 7Rd2T1UVLVeuhYlXNw2F+a2ucY+cMoqz3LtpksUiBppJhw099gEXehcN2JbUZ2TueJdt1FdS ztnZmsHUXLxrRBtGwqnFL7GSd6snpGIKuuL305iaOGODbb9c7ne1JqBbkw1wh8ci6vvwGlzx rexzimRaBzJxlkjNfMx8WpCvYebGMydNoeEtkWldtjTNVsUAtQARAQABtB5BbmRyaXkgR2Fw b24gPGF2Z0BGcmVlQlNELm9yZz6JAlQEEwEIAD4WIQS+LEO7ngQnXA4Bjr538m7TUc1yjwUC WbgsiAIbIwUJBaOagAULCQgHAgYVCAkKCwIEFgIDAQIeAQIXgAAKCRB38m7TUc1yj+JAEACV l9AK/nOWAt/9cufV2fRj0hdOqB1aCshtSrwHk/exXsDa4/FkmegxXQGY+3GWX3deIyesbVRL 
rYdtdK0dqJyT1SBqXK1h3/at9rxr9GQA6KWOxTjUFURsU7ok/6SIlm8uLRPNKO+yq0GDjgaO LzN+xykuBA0FlhQAXJnpZLcVfPJdWv7sSHGedL5ln8P8rxR+XnmsA5TUaaPcbhTB+mG+iKFj GghASDSfGqLWFPBlX/fpXikBDZ1gvOr8nyMY9nXhgfXpq3B6QCRYKPy58ChrZ5weeJZ29b7/ QdEO8NFNWHjSD9meiLdWQaqo9Y7uUxN3wySc/YUZxtS0bhAd8zJdNPsJYG8sXgKjeBQMVGuT eCAJFEYJqbwWvIXMfVWop4+O4xB+z2YE3jAbG/9tB/GSnQdVSj3G8MS80iLS58frnt+RSEw/ psahrfh0dh6SFHttE049xYiC+cM8J27Aaf0i9RflyITq57NuJm+AHJoU9SQUkIF0nc6lfA+o JRiyRlHZHKoRQkIg4aiKaZSWjQYRl5Txl0IZUP1dSWMX4s3XTMurC/pnja45dge/4ESOtJ9R 8XuIWg45Oq6MeIWdjKddGhRj3OohsltKgkEU3eLKYtB6qRTQypHHUawCXz88uYt5e3w4V16H lCpSTZV/EVHnNe45FVBlvK7k7HFfDDkryLkCDQRZuCyIARAAlq0slcsVboY/+IUJdcbEiJRW be9HKVz4SUchq0z9MZPX/0dcnvz/gkyYA+OuM78dNS7Mbby5dTvOqfpLJfCuhaNYOhlE0wY+ 1T6Tf1f4c/uA3U/YiadukQ3+6TJuYGAdRZD5EqYFIkreARTVWg87N9g0fT9BEqLw9lJtEGDY EWUE7L++B8o4uu3LQFEYxcrb4K/WKmgtmFcm77s0IKDrfcX4doV92QTIpLiRxcOmCC/OCYuO jB1oaaqXQzZrCutXRK0L5XN1Y1PYjIrEzHMIXmCDlLYnpFkK+itlXwlE2ZQxkfMruCWdQXye syl2fynAe8hvp7Mms9qU2r2K9EcJiR5N1t1C2/kTKNUhcRv7Yd/vwusK7BqJbhlng5ZgRx0m WxdntU/JLEntz3QBsBsWM9Y9wf2V4tLv6/DuDBta781RsCB/UrU2zNuOEkSixlUiHxw1dccI 6CVlaWkkJBxmHX22GdDFrcjvwMNIbbyfQLuBq6IOh8nvu9vuItup7qemDG3Ms6TVwA7BD3j+ 3fGprtyW8Fd/RR2bW2+LWkMrqHffAr6Y6V3h5kd2G9Q8ZWpEJk+LG6Mk3fhZhmCnHhDu6CwN MeUvxXDVO+fqc3JjFm5OxhmfVeJKrbCEUJyM8ESWLoNHLqjywdZga4Q7P12g8DUQ1mRxYg/L HgZY3zfKOqcAEQEAAYkCPAQYAQgAJhYhBL4sQ7ueBCdcDgGOvnfybtNRzXKPBQJZuCyIAhsM BQkFo5qAAAoJEHfybtNRzXKPBVwQAKfFy9P7N3OsLDMB56A4Kf+ZT+d5cIx0Yiaf4n6w7m3i ImHHHk9FIetI4Xe54a2IXh4Bq5UkAGY0667eIs+Z1Ea6I2i27Sdo7DxGwq09Qnm/Y65ADvXs 3aBvokCcm7FsM1wky395m8xUos1681oV5oxgqeRI8/76qy0hD9WR65UW+HQgZRIcIjSel9vR XDaD2HLGPTTGr7u4v00UeTMs6qvPsa2PJagogrKY8RXdFtXvweQFz78NbXhluwix2Tb9ETPk LIpDrtzV73CaE2aqBG/KrboXT2C67BgFtnk7T7Y7iKq4/XvEdDWscz2wws91BOXuMMd4c/c4 OmGW9m3RBLufFrOag1q5yUS9QbFfyqL6dftJP3Zq/xe+mr7sbWbhPVCQFrH3r26mpmy841ym dwQnNcsbIGiBASBSKksOvIDYKa2Wy8htPmWFTEOPRpFXdGQ27awcjjnB42nngyCK5ukZDHi6 w0qK5DNQQCkiweevCIC6wc3p67jl1EMFY5+z+zdTPb3h7LeVnGqW0qBQl99vVFgzLxchKcl0 
R/paSFgwqXCZhAKMuUHncJuynDOP7z5LirUeFI8qsBAJi1rXpQoLJTVcW72swZ42IdPiboqx NbTMiNOiE36GqMcTPfKylCbF45JNX4nF9ElM0E+Y8gi4cizJYBRr2FBJgay0b9Cp Message-ID: <99bdcbbc-402d-8ab0-262d-0e9732ecf995@FreeBSD.org> Date: Wed, 13 May 2020 09:32:23 +0300 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:60.0) Gecko/20100101 Firefox/60.0 Thunderbird/60.9.0 MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit X-Rspamd-Queue-Id: 49MPwq3w76z4bHn X-Spamd-Bar: - Authentication-Results: mx1.freebsd.org; dkim=none; dmarc=none; spf=pass (mx1.freebsd.org: domain of agapon@gmail.com designates 209.85.167.48 as permitted sender) smtp.mailfrom=agapon@gmail.com X-Spamd-Result: default: False [-1.22 / 15.00]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; MID_RHS_MATCH_FROM(0.00)[]; FROM_HAS_DN(0.00)[]; R_SPF_ALLOW(-0.20)[+ip4:209.85.128.0/17:c]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_GOOD(-0.10)[text/plain]; SUBJECT_ENDS_QUESTION(1.00)[]; DMARC_NA(0.00)[FreeBSD.org]; RCVD_TLS_ALL(0.00)[]; NEURAL_HAM_LONG(-0.99)[-0.994,0]; RCVD_COUNT_THREE(0.00)[3]; IP_SCORE(-0.24)[ip: (-0.32), ipnet: 209.85.128.0/17(-0.39), asn: 15169(-0.42), country: US(-0.05)]; TO_DN_ALL(0.00)[]; RCPT_COUNT_TWO(0.00)[2]; RCVD_IN_DNSWL_NONE(0.00)[48.167.85.209.list.dnswl.org : 127.0.5.0]; NEURAL_HAM_MEDIUM(-0.98)[-0.984,0]; FORGED_SENDER(0.30)[avg@FreeBSD.org,agapon@gmail.com]; RWL_MAILSPIKE_POSSIBLE(0.00)[48.167.85.209.rep.mailspike.net : 127.0.0.17]; MIME_TRACE(0.00)[0:+]; R_DKIM_NA(0.00)[]; FREEMAIL_ENVFROM(0.00)[gmail.com]; ASN(0.00)[asn:15169, ipnet:209.85.128.0/17, country:US]; FROM_NEQ_ENVFROM(0.00)[avg@FreeBSD.org,agapon@gmail.com]; RECEIVED_SPAMHAUS_PBL(0.00)[96.151.72.93.khpj7ygk5idzvmvt5x4ziurxhy.zen.dq.spamhaus.net : 127.0.0.10] X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.33 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 
13 May 2020 06:32:28 -0000 Just to get a bigger exposure: https://reviews.freebsd.org/D24779 I think that this is a good idea and, if I am not mistaken, it should match the Linux behavior. -- Andriy Gapon From owner-freebsd-current@freebsd.org Wed May 13 07:29:15 2020 Return-Path: Delivered-To: freebsd-current@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 7C9DD2EC835 for ; Wed, 13 May 2020 07:29:15 +0000 (UTC) (envelope-from marklmi@yahoo.com) Received: from sonic306-19.consmr.mail.gq1.yahoo.com (sonic306-19.consmr.mail.gq1.yahoo.com [98.137.68.82]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 49MRBL1pnrz4fR9 for ; Wed, 13 May 2020 07:29:13 +0000 (UTC) (envelope-from marklmi@yahoo.com) X-YMail-OSG: 8E92OWQVM1kjx.V0psqp37JR0A6OiadqLBqZxQTMpzvpvQd1SkZB.qC7qV8g7ol 8gaY51lNKSWHU4BNFKBk3YJb3lfPFUbLWRZizXuHFUqmTspoEWdZvTJU2eRbEWyfSgNYzCh2iBTF DTfgsd6xX8KaxR8yel7fDrqmtXOZGSSIHWKpxzkh2YkzQutzyxBoqAcnGL7IzB_hsVmhZxgb1169 bATqjm8K0dz636t3ZfeodD4qSzT34FaSQuPO1N_e3CMNLby9XF5.8spuXMqMkhFHtuwhzTqkgSjJ QKzBUqswkTIRElp1gRjwRZf4p.aa_ycWsbVKHCEI5rw7qOOce4pUuPBUd6UMF.N6WV6jForvheAY LRYzxp_88bf_2LKZr9I_NXZpewnBWkj4pcpdZOV0tuliy8RlS1vm4V5hZrU724sq9u4c1Gs59pKS G313r_3uCAnRVhKCH6TsGOx6MVovfSySqDj3AcBRn1f.oKotJggpaLJQ6I1RO2yRBkXoW0B4VVoN Y38XsFDWo3uB8qSXg5Ml10dTQuP9s_EGw5vXC_LoxloopBgnPo2F1IAT0VkyeZoJV6IISecLX1QF tWkiImhP4H0SQeQ0HzeaWbOhuplX8QwBy915kkdNKo_wDaT9KNmO9ac1vrm2qhZLGYqj5w3M2tsY KdolnuE_iVdIvqFkpl2mC6lezmTAkKR0TvVneUChgYdes9UHz0v3D9MDfJ2dYt5OVFdoXcDk0_M1 6QwavDlXXvIY5SmYBlCgn1NImFsG6poFajjyrIrK6N4ocBolHCcfirzT5hk.Sc2j6sd8DlyAfLOR qjTMvD0Bdvb02QeA7S6zW8UgGsF.jz8i5Z9xqzqDOC9FS0CCt73Ui1eiN.zmruUTo7rer_WDWtNI 73wtd339a66u4H9tjkagzG3laydaw72wyz9950mW9pOABNj3jq2ohaR2mliHtgemrUnCPsRPEE6. 9wTebdW224Taq7ObWJ55Se1AsgSqH2IAwqDSp3bCwJ2D3vS7nV.C9699ilm9jXQ3YeZTnr5e898. 
VZ4s_2KQ1XRFcsoZg9r6pRDjprtuxCLt1UbB_T.aKzIla7EPZwVGoZ7S8ZRfv9PxEulXh096qcD4 wgPeDCi_3rTxfPR8AxoTVe68nQmIlk3CcqagKTIt8L8eUwHn5fZ.8a0Y1Yxa4gyLSI51bCNxpicQ Uj3gF1LYfB.O3eqaG2bQOBNFDWAIPyNtI3WIjtiR4VmCoTm9H63kyCNc0LDhPU0uGjwF1Mq8aOQ. Uu0EgnIRqcyb9mb4n3R2Ggr7pPBa6KT9.cUwHwmkrffdDp5a.LTsPc6GieVm5B5QhN3ibcFgZM9h BE3mvXu0UUiRyzsXS6ytAADOIvvS8gtf9ExIw_jXWeSCbeCHrS.upSmanMQkoyioF5fy5IEKMnHw mLDtAYnez5Ll3Bn0QhdwMJMXiYkC2Xr2Ln7It8l7fXHW_Ll61rrrPHaP6jP514H_FPw-- Received: from sonic.gate.mail.ne1.yahoo.com by sonic306.consmr.mail.gq1.yahoo.com with HTTP; Wed, 13 May 2020 07:29:12 +0000 Received: by smtp412.mail.ne1.yahoo.com (VZM Hermes SMTP Server) with ESMTPA ID 8bcb3d014077aa2d68dc334f30fdedbe; Wed, 13 May 2020 07:29:08 +0000 (UTC) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 13.4 \(3608.80.23.2.2\)) Subject: Re: svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311 From: Mark Millard In-Reply-To: <9562EEE4-62EF-4164-91C0-948CC0432984@yahoo.com> Date: Wed, 13 May 2020 00:29:07 -0700 Cc: Brandon Bergren , Justin Hibbits Content-Transfer-Encoding: quoted-printable Message-Id: <9B68839B-AEC8-43EE-B3B6-B696A4A57DAE@yahoo.com> References: <8479DD58-44F6-446A-9CA5-D01F0F7C1B38@yahoo.com> <17ACDA02-D7EF-4F26-874A-BB3E935CD072@yahoo.com> <695E6836-F860-4557-B7DE-CC1EDB347F18@yahoo.com> <121B9B09-141B-4DC3-918B-1E7CFB99E779@yahoo.com> <8AAB0462-3FA8-490C-8D8D-7C15B1C9E2DE@yahoo.com> <18E62746-80DB-4195-977D-4FF32D0129EE@yahoo.com> <9562EEE4-62EF-4164-91C0-948CC0432984@yahoo.com> To: "vangyzen@freebsd.org" , svn-src-head@freebsd.org, FreeBSD Current , FreeBSD Hackers , FreeBSD PowerPC ML X-Mailer: Apple Mail (2.3608.80.23.2.2) X-Rspamd-Queue-Id: 49MRBL1pnrz4fR9 X-Spamd-Bar: - X-Spamd-Result: default: False [-1.10 / 15.00]; TO_DN_EQ_ADDR_SOME(0.00)[]; TO_DN_SOME(0.00)[]; R_SPF_ALLOW(-0.20)[+ptr:yahoo.com]; FREEMAIL_FROM(0.00)[yahoo.com]; MV_CASE(0.50)[]; 
DKIM_TRACE(0.00)[yahoo.com:+]; DMARC_POLICY_ALLOW(-0.50)[yahoo.com,reject]; RCPT_COUNT_SEVEN(0.00)[7]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; RCVD_TLS_LAST(0.00)[]; FREEMAIL_ENVFROM(0.00)[yahoo.com]; ASN(0.00)[asn:36647, ipnet:98.137.64.0/21, country:US]; MID_RHS_MATCH_FROM(0.00)[]; DWL_DNSWL_NONE(0.00)[yahoo.com.dwl.dnswl.org : 127.0.5.0]; ARC_NA(0.00)[]; NEURAL_HAM_MEDIUM(-0.21)[-0.208,0]; R_DKIM_ALLOW(-0.20)[yahoo.com:s=s2048]; FROM_HAS_DN(0.00)[]; NEURAL_HAM_LONG(-0.40)[-0.395,0]; MIME_GOOD(-0.10)[text/plain]; IP_SCORE(0.00)[ip: (5.37), ipnet: 98.137.64.0/21(0.83), asn: 36647(0.66), country: US(-0.05)]; IP_SCORE_FREEMAIL(0.00)[]; TO_MATCH_ENVRCPT_SOME(0.00)[]; RCVD_IN_DNSWL_NONE(0.00)[82.68.137.98.list.dnswl.org : 127.0.5.0]; RWL_MAILSPIKE_POSSIBLE(0.00)[82.68.137.98.rep.mailspike.net : 127.0.0.17]; RCVD_COUNT_TWO(0.00)[2] X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.33 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 13 May 2020 07:29:15 -0000 [stress alone is sufficient to have jemalloc asserts fail in stress, no need for a multi-socket G4 either. No need to involve nfsd, mountd, rpcbind or the like. This is not a claim that I know all the problems to be the same, just that a jemalloc reported failure in this simpler context happens and zeroed pages are involved.] Reminder: head -r360311 based context. First I show a single CPU/core PowerMac G4 context failing in stress. (I actually did this later, but it is the simpler context.) I simply moved the media from the 2-socket G4 to this slower, single-cpu/core one. 
cpu0: Motorola PowerPC 7400 revision 2.9, 466.42 MHz
cpu0: Features 9c000000
cpu0: HID0 8094c0a4
real memory = 1577857024 (1504 MB)
avail memory = 1527508992 (1456 MB)

# stress -m 1 --vm-bytes 1792M
stress: info: [1024] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd
: /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h:258: Failed assertion: "slab == extent_slab_get(extent)"
stress: FAIL: [1024] (415) <-- worker 1025 got signal 6
stress: WARN: [1024] (417) now reaping child worker processes
stress: FAIL: [1024] (451) failed run completed in 243s

(Note: 1792 is the biggest it allowed with M.)

The following still pages in and out and fails:

# stress -m 1 --vm-bytes 1290M
stress: info: [1163] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd
: /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h:258: Failed assertion: "slab == extent_slab_get(extent)"
. . .

By contrast, the following had no problem for as long as I let it run --and did not page in or out:

# stress -m 1 --vm-bytes 1280M
stress: info: [1181] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd

The 2 socket PowerMac G4 context with 2048 MiByte of RAM . . .

stress -m 1 --vm-bytes 1792M

did not (quickly?) fail or page. 1792 is as large as it would allow with M.

The following also did not (quickly?) fail (and were not paging):

stress -m 2 --vm-bytes 896M
stress -m 4 --vm-bytes 448M
stress -m 8 --vm-bytes 224M

(Only 1 example was run at a time.)

But the following all did quickly fail (and were paging):

stress -m 8 --vm-bytes 225M
stress -m 4 --vm-bytes 449M
stress -m 2 --vm-bytes 897M

(Only 1 example was run at a time.)

I'll note that when I exited an su process I ended up with a:

: /usr/src/contrib/jemalloc/include/jemalloc/internal/sz.h:200: Failed assertion: "ret == sz_index2size_compute(index)"
Abort trap (core dumped)

and a matching su.core file.
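The runs above only show the zeroed-page symptom indirectly, through jemalloc's assertions. A more direct check might look like the following minimal sketch in C. It is not taken from stress or from jemalloc: the helper names `stamp_pages` and `count_bad_pages` and the fixed 4096-byte page size are assumptions made for illustration. The idea is to write a nonzero marker at the start of every page of a large buffer and later count the pages whose marker no longer reads back.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SZ 4096u /* assumed page size; real code could ask sysconf(_SC_PAGESIZE) */

/* Write a nonzero marker (page number + 1) at the start of every page. */
static void stamp_pages(unsigned char *buf, size_t npages)
{
	assert(buf != NULL);
	for (size_t i = 0; i < npages; i++) {
		unsigned long marker = (unsigned long)i + 1; /* never zero */
		memcpy(buf + i * PAGE_SZ, &marker, sizeof(marker));
	}
}

/* Count pages whose marker no longer matches, e.g. because it reads back as zero. */
static size_t count_bad_pages(const unsigned char *buf, size_t npages)
{
	size_t bad = 0;

	assert(buf != NULL);
	for (size_t i = 0; i < npages; i++) {
		unsigned long marker;

		memcpy(&marker, buf + i * PAGE_SZ, sizeof(marker));
		if (marker != (unsigned long)i + 1)
			bad++;
	}
	return bad;
}
```

A caller would allocate a buffer somewhat larger than the size at which paging starts, call `stamp_pages()` once, and then call `count_bad_pages()` repeatedly while the machine is under memory pressure. Any nonzero count would point at page contents being lost rather than at the allocator itself.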
It appears that stress's activity leads to other processes also seeing examples of the zeroed-page(s) problem (probably su had paged some or had been fully swapped out):

(gdb) bt
#0 thr_kill () at thr_kill.S:4
#1 0x503821d0 in __raise (s=6) at /usr/src/lib/libc/gen/raise.c:52
#2 0x502e1d20 in abort () at /usr/src/lib/libc/stdlib/abort.c:67
#3 0x502d6144 in sz_index2size_lookup (index=) at /usr/src/contrib/jemalloc/include/jemalloc/internal/sz.h:200
#4 sz_index2size (index=) at /usr/src/contrib/jemalloc/include/jemalloc/internal/sz.h:207
#5 ifree (tsd=0x5008b018, ptr=0x50041460, tcache=0x5008b138, slow_path=) at jemalloc_jemalloc.c:2583
#6 0x502d5cec in __je_free_default (ptr=0x50041460) at jemalloc_jemalloc.c:2784
#7 0x502d62d4 in __free (ptr=0x50041460) at jemalloc_jemalloc.c:2852
#8 0x501050cc in openpam_destroy_chain (chain=0x50041480) at /usr/src/contrib/openpam/lib/libpam/openpam_load.c:113
#9 0x50105094 in openpam_destroy_chain (chain=0x500413c0) at /usr/src/contrib/openpam/lib/libpam/openpam_load.c:111
#10 0x50105094 in openpam_destroy_chain (chain=0x50041320) at /usr/src/contrib/openpam/lib/libpam/openpam_load.c:111
#11 0x50105094 in openpam_destroy_chain (chain=0x50041220) at /usr/src/contrib/openpam/lib/libpam/openpam_load.c:111
#12 0x50105094 in openpam_destroy_chain (chain=0x50041120) at /usr/src/contrib/openpam/lib/libpam/openpam_load.c:111
#13 0x50105094 in openpam_destroy_chain (chain=0x50041100) at /usr/src/contrib/openpam/lib/libpam/openpam_load.c:111
#14 0x50105014 in openpam_clear_chains (policy=0x50600004) at /usr/src/contrib/openpam/lib/libpam/openpam_load.c:130
#15 0x50101230 in pam_end (pamh=0x50600000, status=) at /usr/src/contrib/openpam/lib/libpam/pam_end.c:83
#16 0x1001225c in main (argc=, argv=0x0) at /usr/src/usr.bin/su/su.c:477

(gdb) print/x __je_sz_size2index_tab
$1 = {0x0 }

Notes:

Given that the original problem did not involve paging to the swap partition, maybe
just making it to the Laundry list or some such is sufficient, something that is also involved when the swap space is partially in use (according to top). Or sitting in the inactive list for a long time, if that has some special status.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)

From owner-freebsd-current@freebsd.org Wed May 13 07:35:49 2020
Subject: Re: zfs deadlock on r360452 relating to busy vm page
To: Bryan Drewery, freebsd-current@FreeBSD.org
References: <2bdc8563-283b-32cc-8a1a-85ff52aca99e@FreeBSD.org>
From: Andriy Gapon
Message-ID: <0e9cceba-84d0-ec4f-8784-36703452201d@FreeBSD.org>
Date: Wed, 13 May 2020 10:35:43 +0300

On 13/05/2020 01:47, Bryan Drewery wrote:
> Trivial repro:
>
> dd if=/dev/zero of=blah & tail -F blah
> ^C
> load: 0.21 cmd: tail 72381 [prev->lr_read_cv] 2.17r 0.00u 0.01s 0% 2600k
> #0 0xffffffff80bce615 at mi_switch+0x155
> #1 0xffffffff80c1cfea at sleepq_switch+0x11a
> #2 0xffffffff80b57f0a at _cv_wait+0x15a
> #3 0xffffffff829ddab6 at rangelock_enter+0x306
> #4 0xffffffff829ecd3f at zfs_freebsd_getpages+0x14f
> #5 0xffffffff810e3ab9 at VOP_GETPAGES_APV+0x59
> #6 0xffffffff80f349e7 at vnode_pager_getpages+0x37
> #7 0xffffffff80f2a93f at vm_pager_get_pages+0x4f
> #8 0xffffffff80f054b0 at vm_fault+0x780
> #9 0xffffffff80f04bde at vm_fault_trap+0x6e
> #10 0xffffffff8106544e at trap_pfault+0x1ee
> #11 0xffffffff81064a9c at trap+0x44c
> #12 0xffffffff8103a978 at calltrap+0x8

In r329363 I re-worked zfs_getpages and introduced range locking to it.
At the time I believed that it was safe and maybe it was, please see the commit message.
There, indeed, have been many performance / concurrency improvements to the VM system and r358443 is one of them.

I am not sure how to resolve the problem best. Maybe someone who knows the latest VM code better than me can comment on my assumptions stated in the commit message.

In illumos (and, I think, in OpenZFS/ZoL) they don't have the range locking in this corner of the code because of a similar deadlock a long time ago.

> On 5/12/2020 3:13 PM, Bryan Drewery wrote:
>>> panic: deadlres_td_sleep_q: possible deadlock detected for 0xfffffe25eefa2e00 (find), blocked for 1802392 ticks
...
>>> (kgdb) backtrace >>> #0 sched_switch (td=0xfffffe255eac0000, flags=) at /usr/src/sys/kern/sched_ule.c:2147 >>> #1 0xffffffff80bce615 in mi_switch (flags=260) at /usr/src/sys/kern/kern_synch.c:542 >>> #2 0xffffffff80c1cfea in sleepq_switch (wchan=0xfffff810fb57dd48, pri=0) at /usr/src/sys/kern/subr_sleepqueue.c:625 >>> #3 0xffffffff80b57f0a in _cv_wait (cvp=0xfffff810fb57dd48, lock=0xfffff80049a99040) at /usr/src/sys/kern/kern_condvar.c:146 >>> #4 0xffffffff82434ab6 in rangelock_enter_reader (rl=0xfffff80049a99018, new=0xfffff8022cadb100) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_rlock.c:429 >>> #5 rangelock_enter (rl=0xfffff80049a99018, off=, len=, type=) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_rlock.c:477 >>> #6 0xffffffff82443d3f in zfs_getpages (vp=, ma=0xfffffe259f204b18, count=, rbehind=0xfffffe259f204ac4, rahead=0xfffffe259f204ad0) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4500 >>> #7 zfs_freebsd_getpages (ap=) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4567 >>> #8 0xffffffff810e3ab9 in VOP_GETPAGES_APV (vop=0xffffffff8250a1e0 , a=0xfffffe259f2049f0) at vnode_if.c:2644 >>> #9 0xffffffff80f349e7 in VOP_GETPAGES (vp=, m=, count=, rbehind=, rahead=) at ./vnode_if.h:1171 >>> #10 vnode_pager_getpages (object=, m=, count=, rbehind=, rahead=) at /usr/src/sys/vm/vnode_pager.c:743 >>> #11 0xffffffff80f2a93f in vm_pager_get_pages (object=0xfffff806cb637c60, m=0xfffffe259f204b18, count=1, rbehind=, rahead=) at /usr/src/sys/vm/vm_pager.c:305 >>> #12 0xffffffff80f054b0 in vm_fault_getpages (fs=, nera=0, behindp=, aheadp=) at /usr/src/sys/vm/vm_fault.c:1163 >>> #13 vm_fault (map=, vaddr=, fault_type=, fault_flags=, m_hold=) at /usr/src/sys/vm/vm_fault.c:1394 >>> #14 0xffffffff80f04bde in vm_fault_trap (map=0xfffffe25653949e8, vaddr=, fault_type=, fault_flags=0, signo=0xfffffe259f204d04, ucode=0xfffffe259f204d00) at /usr/src/sys/vm/vm_fault.c:589 >>> #15 
0xffffffff8106544e in trap_pfault (frame=0xfffffe259f204d40, usermode=, signo=, ucode=) at /usr/src/sys/amd64/amd64/trap.c:821 >>> #16 0xffffffff81064a9c in trap (frame=0xfffffe259f204d40) at /usr/src/sys/amd64/amd64/trap.c:340 >>> #17 >>> #18 0x00000000002034fc in ?? () ... >>> (kgdb) thread >>> [Current thread is 8 (Thread 101255)] >>> (kgdb) backtrace >>> #0 sched_switch (td=0xfffffe25c8e9bc00, flags=) at /usr/src/sys/kern/sched_ule.c:2147 >>> #1 0xffffffff80bce615 in mi_switch (flags=260) at /usr/src/sys/kern/kern_synch.c:542 >>> #2 0xffffffff80c1cfea in sleepq_switch (wchan=0xfffffe001cbca850, pri=84) at /usr/src/sys/kern/subr_sleepqueue.c:625 >>> #3 0xffffffff80f1de50 in _vm_page_busy_sleep (obj=, m=0xfffffe001cbca850, pindex=, wmesg=, allocflags=21504, locked=false) at /usr/src/sys/vm/vm_page.c:1094 >>> #4 0xffffffff80f241f7 in vm_page_grab_sleep (object=0xfffff806cb637c60, m=, pindex=, wmesg=, allocflags=21504, locked=) at /usr/src/sys/vm/vm_page.c:4326 >>> #5 vm_page_acquire_unlocked (object=0xfffff806cb637c60, pindex=1098, prev=, mp=0xfffffe2717fc6730, allocflags=21504) at /usr/src/sys/vm/vm_page.c:4469 >>> #6 0xffffffff80f24c61 in vm_page_grab_valid_unlocked (mp=0xfffffe2717fc6730, object=0xfffff806cb637c60, pindex=1098, allocflags=21504) at /usr/src/sys/vm/vm_page.c:4645 >>> #7 0xffffffff82440246 in page_busy (vp=0xfffff80571f29500, start=4497408, off=, nbytes=) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:414 >>> #8 update_pages (vp=0xfffff80571f29500, start=4497408, len=32, os=0xfffff8096a277400, oid=2209520, segflg=, tx=) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:482 >>> #9 zfs_write (vp=, uio=, ioflag=0, cr=, ct=) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:1071 >>> #10 zfs_freebsd_write (ap=) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4838 >>> #11 0xffffffff810e0eaf in VOP_WRITE_APV (vop=0xffffffff8250a1e0 , a=0xfffffe2717fc68c8) at 
vnode_if.c:925 >>> #12 0xffffffff80cb574c in VOP_WRITE (vp=0xfffff80571f29500, uio=0xfffffe2717fc6bb0, ioflag=8323073, cred=) at ./vnode_if.h:413 >>> #13 vn_write (fp=0xfffff8048195e8c0, uio=, active_cred=, flags=, td=) at /usr/src/sys/kern/vfs_vnops.c:894 >>> #14 0xffffffff80cb50c3 in vn_io_fault_doio (args=0xfffffe2717fc6af0, uio=0xfffffe2717fc6bb0, td=0xfffffe25c8e9bc00) at /usr/src/sys/kern/vfs_vnops.c:959 >>> #15 0xffffffff80cb1c8c in vn_io_fault1 (vp=, uio=0xfffffe2717fc6bb0, args=0xfffffe2717fc6af0, td=0xfffffe25c8e9bc00) at /usr/src/sys/kern/vfs_vnops.c:1077 >>> #16 0xffffffff80cafa32 in vn_io_fault (fp=0xfffff8048195e8c0, uio=0xfffffe2717fc6bb0, active_cred=0xfffff80f2cc12708, flags=0, td=) at /usr/src/sys/kern/vfs_vnops.c:1181 >>> #17 0xffffffff80c34331 in fo_write (fp=0xfffff8048195e8c0, uio=0xfffffe2717fc6bb0, active_cred=, flags=, td=0xfffffe25c8e9bc00) at /usr/src/sys/sys/file.h:326 >>> #18 dofilewrite (td=0xfffffe25c8e9bc00, fd=2, fp=0xfffff8048195e8c0, auio=0xfffffe2717fc6bb0, offset=, flags=) at /usr/src/sys/kern/sys_generic.c:564 >>> #19 0xffffffff80c33eb0 in kern_writev (td=0xfffffe25c8e9bc00, fd=2, auio=) at /usr/src/sys/kern/sys_generic.c:491 >>> #20 sys_write (td=0xfffffe25c8e9bc00, uap=) at /usr/src/sys/kern/sys_generic.c:406 >>> #21 0xffffffff8106623d in syscallenter (td=) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:150 >>> #22 amd64_syscall (td=0xfffffe25c8e9bc00, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1161 >>> #23 >>> #24 0x000000080043d53a in ?? () >> >> Maybe r358443 is related? 
>> >> >>> (kgdb) frame 4 >>> #4 0xffffffff80f241f7 in vm_page_grab_sleep (object=0xfffff806cb637c60, m=, pindex=, wmesg=, allocflags=21504, locked=) at /usr/src/sys/vm/vm_page.c:4326 >>> 4326 if (_vm_page_busy_sleep(object, m, m->pindex, wmesg, allocflags, >>> (kgdb) p *object >>> $8 = {lock = {lock_object = {lo_name = 0xffffffff8114fa30 "vm object", lo_flags = 627245056, lo_data = 0, lo_witness = 0x0}, rw_lock = 1}, object_list = {tqe_next = 0xfffff806cb637d68, tqe_prev = 0xfffff806cb637b78}, shadow_head = {lh_first = 0x0}, shadow_list = {le_next = 0xffffffffffffffff, >>> le_prev = 0xffffffffffffffff}, memq = {tqh_first = 0xfffffe001cbca850, tqh_last = 0xfffffe001cbca860}, rtree = {rt_root = 18446741875168421969}, size = 1099, domain = {dr_policy = 0x0, dr_iter = 0}, generation = 1, cleangeneration = 1, ref_count = 2, shadow_count = 0, memattr = 6 '\006', type = 2 '\002', >>> flags = 4096, pg_color = 0, paging_in_progress = {__count = 2}, busy = {__count = 0}, resident_page_count = 1, backing_object = 0x0, backing_object_offset = 0, pager_object_list = {tqe_next = 0x0, tqe_prev = 0x0}, rvq = {lh_first = 0x0}, handle = 0xfffff80571f29500, un_pager = {vnp = {vnp_size = 4499568, >>> writemappings = 0}, devp = {devp_pglist = {tqh_first = 0x44a870, tqh_last = 0x0}, ops = 0x0, dev = 0x0}, sgp = {sgp_pglist = {tqh_first = 0x44a870, tqh_last = 0x0}}, swp = {swp_tmpfs = 0x44a870, swp_blks = {pt_root = 0}, writemappings = 0}}, cred = 0x0, charge = 0, umtx_data = 0x0} >>> (kgdb) frame 5 >>> #5 vm_page_acquire_unlocked (object=0xfffff806cb637c60, pindex=1098, prev=, mp=0xfffffe2717fc6730, allocflags=21504) at /usr/src/sys/vm/vm_page.c:4469 >>> 4469 if (!vm_page_grab_sleep(object, m, pindex, "pgnslp", >>> (kgdb) p *m >>> $9 = {plinks = {q = {tqe_next = 0xffffffffffffffff, tqe_prev = 0xffffffffffffffff}, s = {ss = {sle_next = 0xffffffffffffffff}}, memguard = {p = 18446744073709551615, v = 18446744073709551615}, uma = {slab = 0xffffffffffffffff, zone = 0xffffffffffffffff}}, 
listq = {tqe_next = 0x0, tqe_prev = 0xfffff806cb637ca8}, >>> object = 0xfffff806cb637c60, pindex = 1098, phys_addr = 18988408832, md = {pv_list = {tqh_first = 0x0, tqh_last = 0xfffffe001cbca888}, pv_gen = 44682, pat_mode = 6}, ref_count = 2147483648, busy_lock = 1588330502, a = {{flags = 0, queue = 255 '\377', act_count = 0 '\000'}, _bits = 16711680}, order = 13 '\r', >>> pool = 0 '\000', flags = 1 '\001', oflags = 0 '\000', psind = 0 '\000', segind = 6 '\006', valid = 0 '\000', dirty = 0 '\000'} >> >> Pretty sure this thread is holding the rangelock from zfs_write() that >> tail is waiting on. So what is this thread (101255) waiting on exactly >> for? I'm not sure the way to track down what is using vm object >> 0xfffff806cb637c60. If the tail thread busied the page then they are >> waiting on each other I guess. If that's true then r358443 removing the >> write lock on the object in update_pages() could be a problem. >> >> >> Not sure the rest is interesting. I think they are just waiting on the >> locked vnode but I give it here in case I missed something. 
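The two backtraces quoted above suggest a classic lock-order inversion: the writer holds the ZFS range lock and sleeps on the page's busy lock, while the faulting thread sleeps in rangelock_enter. The following is a minimal userland sketch in C of that wait-for cycle, not kernel code. The types `resource` and `toy_thread` and all function names are made up for illustration, and the step where the faulting thread already has the page busied is the assumption Bryan raises above, not something the dumps prove.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { NO_OWNER = -1 };

/* A toy resource: at most one owner, identified by a small thread id. */
struct resource {
	const char *name;
	int owner;
};

/* A toy thread: it can be blocked on at most one resource at a time. */
struct toy_thread {
	const char *name;
	struct resource *waiting_for;
};

/* Try to take a resource; on failure record what the thread is blocked on. */
static bool acquire(struct toy_thread *threads, int tid, struct resource *r)
{
	if (r->owner == NO_OWNER) {
		r->owner = tid;
		threads[tid].waiting_for = NULL;
		return true;
	}
	threads[tid].waiting_for = r;
	return false;
}

/* Follow "waits for resource, owned by thread" edges; coming back to start is a cycle. */
static bool in_deadlock(const struct toy_thread *threads, int nthreads, int start)
{
	int tid = start;

	for (int hops = 0; hops < nthreads; hops++) {
		const struct resource *r = threads[tid].waiting_for;

		if (r == NULL || r->owner == NO_OWNER)
			return false;
		tid = r->owner;
		if (tid == start)
			return true;
	}
	return false;
}

/* Replay the interleaving suggested by the backtraces; true if both threads are stuck. */
static bool replay_reported_interleaving(void)
{
	enum { WRITER = 0, FAULTER = 1 };
	struct resource rangelock = { "zfs range lock", NO_OWNER };
	struct resource page_busy = { "vm page busy", NO_OWNER };
	struct toy_thread threads[2] = {
		{ "dd (zfs_write)", NULL },
		{ "tail (vm_fault)", NULL },
	};

	acquire(threads, WRITER, &rangelock);  /* zfs_write takes the range lock */
	acquire(threads, FAULTER, &page_busy); /* assumed: the fault path busies the page first */
	acquire(threads, WRITER, &page_busy);  /* update_pages -> page_busy would sleep */
	acquire(threads, FAULTER, &rangelock); /* zfs_getpages -> rangelock_enter would sleep */

	return in_deadlock(threads, 2, WRITER) && in_deadlock(threads, 2, FAULTER);
}
```

In this model either ordering on its own is harmless. The cycle only appears because the two paths take the same pair of resources in opposite order, which is consistent with illumos avoiding the range lock in this path.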
-- 
Andriy Gapon

From owner-freebsd-current@freebsd.org Wed May 13 07:45:29 2020
Subject: Re: zfs deadlock on r360452 relating to busy vm page
From: Andriy Gapon
To: Bryan Drewery, freebsd-current@FreeBSD.org
References: <2bdc8563-283b-32cc-8a1a-85ff52aca99e@FreeBSD.org> <0e9cceba-84d0-ec4f-8784-36703452201d@FreeBSD.org>
Message-ID: <889cb93b-85c7-3ec4-4ccf-5fb56ec38fa5@FreeBSD.org>
Date: Wed, 13 May 2020 10:45:24 +0300
In-Reply-To: <0e9cceba-84d0-ec4f-8784-36703452201d@FreeBSD.org>

On 13/05/2020 10:35, Andriy Gapon wrote:
> On 13/05/2020 01:47, Bryan Drewery wrote:
>> Trivial repro:
>>
>> dd if=/dev/zero of=blah & tail -F blah
>> ^C
>> load: 0.21 cmd: tail 72381 [prev->lr_read_cv] 2.17r 0.00u 0.01s 0% 2600k
>> #0 0xffffffff80bce615 at mi_switch+0x155
>> #1 0xffffffff80c1cfea at sleepq_switch+0x11a
>> #2 0xffffffff80b57f0a at _cv_wait+0x15a
>> #3 0xffffffff829ddab6 at rangelock_enter+0x306
>> #4 0xffffffff829ecd3f at zfs_freebsd_getpages+0x14f
>> #5 0xffffffff810e3ab9 at VOP_GETPAGES_APV+0x59
>> #6 0xffffffff80f349e7 at vnode_pager_getpages+0x37
>> #7 0xffffffff80f2a93f at vm_pager_get_pages+0x4f
>> #8 0xffffffff80f054b0 at vm_fault+0x780
>> #9 0xffffffff80f04bde at vm_fault_trap+0x6e
>> #10 0xffffffff8106544e at trap_pfault+0x1ee
>> #11 0xffffffff81064a9c at trap+0x44c
>> #12 0xffffffff8103a978 at calltrap+0x8
>
> In r329363 I re-worked zfs_getpages and introduced range locking to it.
> At the time I believed that it was safe and maybe it was, please see the commit
> message.
> There, indeed, have been many performance / concurrency improvements to the VM
> system and r358443 is one of them.

Thinking more about it, it could be r352176.
I think that vm_page_grab_valid (and later vm_page_grab_valid_unlocked) are not equivalent to the code that they replaced.
The original code would check the valid field before any locking and it would not attempt any locking / busying if a page is invalid. The object was required to be locked though.
The new code tries to busy the page in any case.

> I am not sure how to resolve the problem best. Maybe someone who knows the
> latest VM code better than me can comment on my assumptions stated in the commit
> message.
>
> In illumos (and, I think, in OpenZFS/ZoL) they don't have the range locking in
> this corner of the code because of a similar deadlock a long time ago.
>
>> On 5/12/2020 3:13 PM, Bryan Drewery wrote:
>>>> panic: deadlres_td_sleep_q: possible deadlock detected for 0xfffffe25eefa2e00 (find), blocked for 1802392 ticks
> ...
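The difference between the old and new page lookup described above can be sketched as a toy model in C. This is not the kernel's code: `struct toy_page`, the result enum and both function names are hypothetical, and the old behaviour is modeled on the reading that invalid pages were skipped without being busied. The quoted `p *m` output (valid = 0, busy lock held) is the case of interest.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a VM page; not the kernel's struct vm_page. */
struct toy_page {
	bool valid;           /* page contents are valid */
	bool busied_by_other; /* another thread holds the busy lock */
};

enum grab_result {
	GRAB_SKIPPED,    /* page ignored, caller continues without it */
	GRAB_BUSIED,     /* page busied by the caller */
	GRAB_WOULD_SLEEP /* caller would sleep waiting for the busy lock */
};

/* Older behaviour as modeled here: check validity first, never wait on invalid pages. */
static enum grab_result grab_check_valid_first(const struct toy_page *p)
{
	if (p == NULL || !p->valid)
		return GRAB_SKIPPED;
	if (p->busied_by_other)
		return GRAB_WOULD_SLEEP;
	return GRAB_BUSIED;
}

/* Newer behaviour as modeled here: take the busy lock first, check validity afterwards. */
static enum grab_result grab_busy_first(const struct toy_page *p)
{
	if (p == NULL)
		return GRAB_SKIPPED;
	if (p->busied_by_other)
		return GRAB_WOULD_SLEEP; /* sleeps even if the page is invalid */
	if (!p->valid)
		return GRAB_SKIPPED;     /* busied, found invalid, released again */
	return GRAB_BUSIED;
}
```

For an invalid page that a faulting thread has busied, the first function returns `GRAB_SKIPPED` and the writer carries on, while the second returns `GRAB_WOULD_SLEEP`. Under this model that is the point where the writer, still holding the range lock, can end up waiting on the faulting thread.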
>>>> (kgdb) backtrace >>>> #0 sched_switch (td=0xfffffe255eac0000, flags=) at /usr/src/sys/kern/sched_ule.c:2147 >>>> #1 0xffffffff80bce615 in mi_switch (flags=260) at /usr/src/sys/kern/kern_synch.c:542 >>>> #2 0xffffffff80c1cfea in sleepq_switch (wchan=0xfffff810fb57dd48, pri=0) at /usr/src/sys/kern/subr_sleepqueue.c:625 >>>> #3 0xffffffff80b57f0a in _cv_wait (cvp=0xfffff810fb57dd48, lock=0xfffff80049a99040) at /usr/src/sys/kern/kern_condvar.c:146 >>>> #4 0xffffffff82434ab6 in rangelock_enter_reader (rl=0xfffff80049a99018, new=0xfffff8022cadb100) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_rlock.c:429 >>>> #5 rangelock_enter (rl=0xfffff80049a99018, off=, len=, type=) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_rlock.c:477 >>>> #6 0xffffffff82443d3f in zfs_getpages (vp=, ma=0xfffffe259f204b18, count=, rbehind=0xfffffe259f204ac4, rahead=0xfffffe259f204ad0) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4500 >>>> #7 zfs_freebsd_getpages (ap=) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4567 >>>> #8 0xffffffff810e3ab9 in VOP_GETPAGES_APV (vop=0xffffffff8250a1e0 , a=0xfffffe259f2049f0) at vnode_if.c:2644 >>>> #9 0xffffffff80f349e7 in VOP_GETPAGES (vp=, m=, count=, rbehind=, rahead=) at ./vnode_if.h:1171 >>>> #10 vnode_pager_getpages (object=, m=, count=, rbehind=, rahead=) at /usr/src/sys/vm/vnode_pager.c:743 >>>> #11 0xffffffff80f2a93f in vm_pager_get_pages (object=0xfffff806cb637c60, m=0xfffffe259f204b18, count=1, rbehind=, rahead=) at /usr/src/sys/vm/vm_pager.c:305 >>>> #12 0xffffffff80f054b0 in vm_fault_getpages (fs=, nera=0, behindp=, aheadp=) at /usr/src/sys/vm/vm_fault.c:1163 >>>> #13 vm_fault (map=, vaddr=, fault_type=, fault_flags=, m_hold=) at /usr/src/sys/vm/vm_fault.c:1394 >>>> #14 0xffffffff80f04bde in vm_fault_trap (map=0xfffffe25653949e8, vaddr=, fault_type=, fault_flags=0, signo=0xfffffe259f204d04, ucode=0xfffffe259f204d00) at /usr/src/sys/vm/vm_fault.c:589 >>>> #15 
0xffffffff8106544e in trap_pfault (frame=0xfffffe259f204d40, usermode=, signo=, ucode=) at /usr/src/sys/amd64/amd64/trap.c:821 >>>> #16 0xffffffff81064a9c in trap (frame=0xfffffe259f204d40) at /usr/src/sys/amd64/amd64/trap.c:340 >>>> #17 >>>> #18 0x00000000002034fc in ?? () > ... >>>> (kgdb) thread >>>> [Current thread is 8 (Thread 101255)] >>>> (kgdb) backtrace >>>> #0 sched_switch (td=0xfffffe25c8e9bc00, flags=) at /usr/src/sys/kern/sched_ule.c:2147 >>>> #1 0xffffffff80bce615 in mi_switch (flags=260) at /usr/src/sys/kern/kern_synch.c:542 >>>> #2 0xffffffff80c1cfea in sleepq_switch (wchan=0xfffffe001cbca850, pri=84) at /usr/src/sys/kern/subr_sleepqueue.c:625 >>>> #3 0xffffffff80f1de50 in _vm_page_busy_sleep (obj=, m=0xfffffe001cbca850, pindex=, wmesg=, allocflags=21504, locked=false) at /usr/src/sys/vm/vm_page.c:1094 >>>> #4 0xffffffff80f241f7 in vm_page_grab_sleep (object=0xfffff806cb637c60, m=, pindex=, wmesg=, allocflags=21504, locked=) at /usr/src/sys/vm/vm_page.c:4326 >>>> #5 vm_page_acquire_unlocked (object=0xfffff806cb637c60, pindex=1098, prev=, mp=0xfffffe2717fc6730, allocflags=21504) at /usr/src/sys/vm/vm_page.c:4469 >>>> #6 0xffffffff80f24c61 in vm_page_grab_valid_unlocked (mp=0xfffffe2717fc6730, object=0xfffff806cb637c60, pindex=1098, allocflags=21504) at /usr/src/sys/vm/vm_page.c:4645 >>>> #7 0xffffffff82440246 in page_busy (vp=0xfffff80571f29500, start=4497408, off=, nbytes=) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:414 >>>> #8 update_pages (vp=0xfffff80571f29500, start=4497408, len=32, os=0xfffff8096a277400, oid=2209520, segflg=, tx=) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:482 >>>> #9 zfs_write (vp=, uio=, ioflag=0, cr=, ct=) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:1071 >>>> #10 zfs_freebsd_write (ap=) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4838 >>>> #11 0xffffffff810e0eaf in VOP_WRITE_APV (vop=0xffffffff8250a1e0 , 
a=0xfffffe2717fc68c8) at vnode_if.c:925 >>>> #12 0xffffffff80cb574c in VOP_WRITE (vp=0xfffff80571f29500, uio=0xfffffe2717fc6bb0, ioflag=8323073, cred=) at ./vnode_if.h:413 >>>> #13 vn_write (fp=0xfffff8048195e8c0, uio=, active_cred=, flags=, td=) at /usr/src/sys/kern/vfs_vnops.c:894 >>>> #14 0xffffffff80cb50c3 in vn_io_fault_doio (args=0xfffffe2717fc6af0, uio=0xfffffe2717fc6bb0, td=0xfffffe25c8e9bc00) at /usr/src/sys/kern/vfs_vnops.c:959 >>>> #15 0xffffffff80cb1c8c in vn_io_fault1 (vp=, uio=0xfffffe2717fc6bb0, args=0xfffffe2717fc6af0, td=0xfffffe25c8e9bc00) at /usr/src/sys/kern/vfs_vnops.c:1077 >>>> #16 0xffffffff80cafa32 in vn_io_fault (fp=0xfffff8048195e8c0, uio=0xfffffe2717fc6bb0, active_cred=0xfffff80f2cc12708, flags=0, td=) at /usr/src/sys/kern/vfs_vnops.c:1181 >>>> #17 0xffffffff80c34331 in fo_write (fp=0xfffff8048195e8c0, uio=0xfffffe2717fc6bb0, active_cred=, flags=, td=0xfffffe25c8e9bc00) at /usr/src/sys/sys/file.h:326 >>>> #18 dofilewrite (td=0xfffffe25c8e9bc00, fd=2, fp=0xfffff8048195e8c0, auio=0xfffffe2717fc6bb0, offset=, flags=) at /usr/src/sys/kern/sys_generic.c:564 >>>> #19 0xffffffff80c33eb0 in kern_writev (td=0xfffffe25c8e9bc00, fd=2, auio=) at /usr/src/sys/kern/sys_generic.c:491 >>>> #20 sys_write (td=0xfffffe25c8e9bc00, uap=) at /usr/src/sys/kern/sys_generic.c:406 >>>> #21 0xffffffff8106623d in syscallenter (td=) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:150 >>>> #22 amd64_syscall (td=0xfffffe25c8e9bc00, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1161 >>>> #23 >>>> #24 0x000000080043d53a in ?? () >>> >>> Maybe r358443 is related? 
>>> >>> >>>> (kgdb) frame 4 >>>> #4 0xffffffff80f241f7 in vm_page_grab_sleep (object=0xfffff806cb637c60, m=, pindex=, wmesg=, allocflags=21504, locked=) at /usr/src/sys/vm/vm_page.c:4326 >>>> 4326 if (_vm_page_busy_sleep(object, m, m->pindex, wmesg, allocflags, >>>> (kgdb) p *object >>>> $8 = {lock = {lock_object = {lo_name = 0xffffffff8114fa30 "vm object", lo_flags = 627245056, lo_data = 0, lo_witness = 0x0}, rw_lock = 1}, object_list = {tqe_next = 0xfffff806cb637d68, tqe_prev = 0xfffff806cb637b78}, shadow_head = {lh_first = 0x0}, shadow_list = {le_next = 0xffffffffffffffff, >>>> le_prev = 0xffffffffffffffff}, memq = {tqh_first = 0xfffffe001cbca850, tqh_last = 0xfffffe001cbca860}, rtree = {rt_root = 18446741875168421969}, size = 1099, domain = {dr_policy = 0x0, dr_iter = 0}, generation = 1, cleangeneration = 1, ref_count = 2, shadow_count = 0, memattr = 6 '\006', type = 2 '\002', >>>> flags = 4096, pg_color = 0, paging_in_progress = {__count = 2}, busy = {__count = 0}, resident_page_count = 1, backing_object = 0x0, backing_object_offset = 0, pager_object_list = {tqe_next = 0x0, tqe_prev = 0x0}, rvq = {lh_first = 0x0}, handle = 0xfffff80571f29500, un_pager = {vnp = {vnp_size = 4499568, >>>> writemappings = 0}, devp = {devp_pglist = {tqh_first = 0x44a870, tqh_last = 0x0}, ops = 0x0, dev = 0x0}, sgp = {sgp_pglist = {tqh_first = 0x44a870, tqh_last = 0x0}}, swp = {swp_tmpfs = 0x44a870, swp_blks = {pt_root = 0}, writemappings = 0}}, cred = 0x0, charge = 0, umtx_data = 0x0} >>>> (kgdb) frame 5 >>>> #5 vm_page_acquire_unlocked (object=0xfffff806cb637c60, pindex=1098, prev=, mp=0xfffffe2717fc6730, allocflags=21504) at /usr/src/sys/vm/vm_page.c:4469 >>>> 4469 if (!vm_page_grab_sleep(object, m, pindex, "pgnslp", >>>> (kgdb) p *m >>>> $9 = {plinks = {q = {tqe_next = 0xffffffffffffffff, tqe_prev = 0xffffffffffffffff}, s = {ss = {sle_next = 0xffffffffffffffff}}, memguard = {p = 18446744073709551615, v = 18446744073709551615}, uma = {slab = 0xffffffffffffffff, zone = 
0xffffffffffffffff}}, listq = {tqe_next = 0x0, tqe_prev = 0xfffff806cb637ca8}, >>>> object = 0xfffff806cb637c60, pindex = 1098, phys_addr = 18988408832, md = {pv_list = {tqh_first = 0x0, tqh_last = 0xfffffe001cbca888}, pv_gen = 44682, pat_mode = 6}, ref_count = 2147483648, busy_lock = 1588330502, a = {{flags = 0, queue = 255 '\377', act_count = 0 '\000'}, _bits = 16711680}, order = 13 '\r', >>>> pool = 0 '\000', flags = 1 '\001', oflags = 0 '\000', psind = 0 '\000', segind = 6 '\006', valid = 0 '\000', dirty = 0 '\000'} >>> >>> Pretty sure this thread is holding the rangelock from zfs_write() that >>> tail is waiting on. So what is this thread (101255) waiting on exactly >>> for? I'm not sure the way to track down what is using vm object >>> 0xfffff806cb637c60. If the tail thread busied the page then they are >>> waiting on each other I guess. If that's true then r358443 removing the >>> write lock on the object in update_pages() could be a problem. >>> >>> >>> Not sure the rest is interesting. I think they are just waiting on the >>> locked vnode but I give it here in case I missed something. 
> > -- Andriy Gapon From owner-freebsd-current@freebsd.org Wed May 13 08:43:29 2020 Return-Path: Delivered-To: freebsd-current@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id E11502EEE79 for ; Wed, 13 May 2020 08:43:29 +0000 (UTC) (envelope-from marklmi@yahoo.com) Received: from sonic312-23.consmr.mail.gq1.yahoo.com (sonic312-23.consmr.mail.gq1.yahoo.com [98.137.69.204]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 49MSr01lCWz3HXq for ; Wed, 13 May 2020 08:43:27 +0000 (UTC) (envelope-from marklmi@yahoo.com) X-YMail-OSG: JcuqEyYVM1n16L7Heovzfmq3h815crg5hv6fPDYHSgs3lYKXCy8a1eN2_Mi220S gJc9X_k0O.CEoYEzkCzEDyKdhOrnJ90g85NRM_h9r.V7TrP7YUa7_LisQE.o4zZFwsNHaYO1gbUe CPkdim9V713ydBMaSqY7lDePBz_6PK4aN6Dakwf1RvuxlrJkIbzhYRkfEhJ40Thq.k7mxGDYMAuC TUD0C_CccVPYBmqWUBAAFINIUylfbjOtovyxsdNy4jsNfTJR7QbdI0be0K5NVKyQs2CJ3CSHLPtd Q2VkYoRYn9YnxBl_W26auilF9PAT7u3V1.BD.RvpedUqHbUngsTDqpp6Kocv6FV1P01ehsYwMu9h McNzQLqg68Ku9ensOfg2wL3m4Tm0LOsQzsnCWd4mfRqMxmUztNzaJ7OJMqBaMju0KJbwScRkYTdo ZYtFigJWxSQgs.HXwV6PxC3Owya9UuzySnTRdqQURm60ZA0y2fRCl8vH7D0ebMcTtot5kvOineU0 gjHZocjZDLRbgj6hfQYB3ESVfFUNPLW1v0LD4ltC_u6QydxZJ.6hvq6xZMYpQeMpqoVvB22LnhLY Q1GOYNCvpY.7ZpWHA529itWO7J3bV3lMTwe7jTV7OErQTasHp5bZFD2AYhRtoqlJ4gxEL0tyzE.3 MepGbWH0.IYZMaUlXMksZnuOBJdvWywdWprjstKQrnpTaR2fBb2.FQZ5Qt1CsKi6b5E8B5aITvv4 WnZnghA9tDeYeVaC5pkC.8OJ5MDCWMnhSGcKTOi8oI7W7_iAK2daAYfyE5TNk9zaI7dVRIsku5D3 9seL1Z_92ReB1PfYtMUfzfnScbt5h2BsbTSk7FkVDbCzcThtsZBh9j37lrlpP.3XjYjY_GPD9sCC vh2dmeWa4b.OLsXVBj1uU9CqkBgcQHiHM0ZluYF.2fNABtcCuxE4Wv7nAQD5Ghwxloucdbl4636b nTK2vlvE1j3Fdg5RPLVjqMW5Vr9wPawK46HV9G.QPf1C2joZ32_z9mOjdqJ5KiQdXkc3M7TvnDh6 SX_yPgjAjLydeDx8bVty6mX3c93i5wHLcVI8uVEezaIt8IYafcepOq8v9tc5umKY2Or0Pr0tKVhc ifaE6Degc4s9L8Sb.rLOlgG7ofUk2_a6rz1sRQAyRL7AKeqRtJqLbBknf_Z1Ts1JfzRwSanh56Pz 
ygElbWOpjUF6PvFATIdsO21haNucI25G1nNCLXrxcnv0z3JIC1UTLRviMdWNjETbjakv_OMMnObC Ibp6JFKTbqLAMi3T0gTKfytgjV1OAzAeJmRGYfrmv8pFyNvF.wNwExAsL4kl80ogHXitYrb42M0a Yq2SiEkHQx3UTSEPVCEAtKmoS5nBcpr8G.PNFOgK.5gCtMGsrGpiG20bgDDurGxnCbOO0.4uIQEK u_IqjEFJp5StoiUGIaKG1gOXI9mldbmQ- Received: from sonic.gate.mail.ne1.yahoo.com by sonic312.consmr.mail.gq1.yahoo.com with HTTP; Wed, 13 May 2020 08:43:26 +0000 Received: by smtp429.mail.gq1.yahoo.com (VZM Hermes SMTP Server) with ESMTPA ID 31a8d8875d7f20b3e1c3e41b2146484f; Wed, 13 May 2020 08:43:24 +0000 (UTC) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 13.4 \(3608.80.23.2.2\)) Subject: Re: svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311 From: Mark Millard In-Reply-To: <9B68839B-AEC8-43EE-B3B6-B696A4A57DAE@yahoo.com> Date: Wed, 13 May 2020 01:43:23 -0700 Cc: Brandon Bergren , Justin Hibbits Content-Transfer-Encoding: quoted-printable Message-Id: <359C9C7D-4106-42B5-AAB5-08EF995B8100@yahoo.com> References: <8479DD58-44F6-446A-9CA5-D01F0F7C1B38@yahoo.com> <17ACDA02-D7EF-4F26-874A-BB3E935CD072@yahoo.com> <695E6836-F860-4557-B7DE-CC1EDB347F18@yahoo.com> <121B9B09-141B-4DC3-918B-1E7CFB99E779@yahoo.com> <8AAB0462-3FA8-490C-8D8D-7C15B1C9E2DE@yahoo.com> <18E62746-80DB-4195-977D-4FF32D0129EE@yahoo.com> <9562EEE4-62EF-4164-91C0-948CC0432984@yahoo.com> <9B68839B-AEC8-43EE-B3B6-B696A4A57DAE@yahoo.com> To: "vangyzen@freebsd.org" , svn-src-head@freebsd.org, FreeBSD Current , FreeBSD Hackers , FreeBSD PowerPC ML X-Mailer: Apple Mail (2.3608.80.23.2.2) X-Rspamd-Queue-Id: 49MSr01lCWz3HXq X-Spamd-Bar: - X-Spamd-Result: default: False [-1.04 / 15.00]; TO_DN_EQ_ADDR_SOME(0.00)[]; TO_DN_SOME(0.00)[]; R_SPF_ALLOW(-0.20)[+ptr:yahoo.com]; FREEMAIL_FROM(0.00)[yahoo.com]; MV_CASE(0.50)[]; DKIM_TRACE(0.00)[yahoo.com:+]; DMARC_POLICY_ALLOW(-0.50)[yahoo.com,reject]; RCPT_COUNT_SEVEN(0.00)[7]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; 
RCVD_TLS_LAST(0.00)[]; FREEMAIL_ENVFROM(0.00)[yahoo.com]; ASN(0.00)[asn:36647, ipnet:98.137.64.0/21, country:US]; MID_RHS_MATCH_FROM(0.00)[]; DWL_DNSWL_NONE(0.00)[yahoo.com.dwl.dnswl.org : 127.0.5.0]; ARC_NA(0.00)[]; NEURAL_HAM_MEDIUM(-0.18)[-0.176,0]; R_DKIM_ALLOW(-0.20)[yahoo.com:s=s2048]; FROM_HAS_DN(0.00)[]; NEURAL_HAM_LONG(-0.36)[-0.364,0]; MIME_GOOD(-0.10)[text/plain]; IP_SCORE(0.00)[ip: (5.09), ipnet: 98.137.64.0/21(0.83), asn: 36647(0.66), country: US(-0.05)]; IP_SCORE_FREEMAIL(0.00)[]; TO_MATCH_ENVRCPT_SOME(0.00)[]; RCVD_IN_DNSWL_NONE(0.00)[204.69.137.98.list.dnswl.org : 127.0.5.0]; RWL_MAILSPIKE_POSSIBLE(0.00)[204.69.137.98.rep.mailspike.net : 127.0.0.17]; RCVD_COUNT_TWO(0.00)[2] X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.33 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 13 May 2020 08:43:29 -0000 [I'm adding a reference to an old arm64/aarch64 bug that had pages turning to zero, in case this 32-bit powerpc issue is somewhat analogous.] On 2020-May-13, at 00:29, Mark Millard wrote: > [stress alone is sufficient to have jemalloc asserts fail > in stress, no need for a multi-socket G4 either. No need > to involve nfsd, mountd, rpcbind or the like. This is not > a claim that I know all the problems to be the same, just > that a jemalloc reported failure in this simpler context > happens and zeroed pages are involved.] > > Reminder: head -r360311 based context. > > > First I show a single CPU/core PowerMac G4 context failing > in stress. (I actually did this later, but it is the > simpler context.) I simply moved the media from the > 2-socket G4 to this slower, single-cpu/core one.
> > cpu0: Motorola PowerPC 7400 revision 2.9, 466.42 MHz > cpu0: Features 9c000000 > cpu0: HID0 8094c0a4 > real memory = 1577857024 (1504 MB) > avail memory = 1527508992 (1456 MB) > > # stress -m 1 --vm-bytes 1792M > stress: info: [1024] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd > : /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h:258: Failed assertion: "slab == extent_slab_get(extent)" > stress: FAIL: [1024] (415) <-- worker 1025 got signal 6 > stress: WARN: [1024] (417) now reaping child worker processes > stress: FAIL: [1024] (451) failed run completed in 243s > > (Note: 1792 is the biggest it allowed with M.) > > The following still pages in and out and fails: > > # stress -m 1 --vm-bytes 1290M > stress: info: [1163] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd > : /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h:258: Failed assertion: "slab == extent_slab_get(extent)" > . . . > > By contrast, the following had no problem for as > long as I let it run --and did not page in or out: > > # stress -m 1 --vm-bytes 1280M > stress: info: [1181] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd > > > > > The 2 socket PowerMac G4 context with 2048 MiByte of RAM . . . > > stress -m 1 --vm-bytes 1792M > > did not (quickly?) fail or page. 1792 > is as large as it would allow with M. > > The following also did not (quickly?) fail > (and were not paging): > > stress -m 2 --vm-bytes 896M > stress -m 4 --vm-bytes 448M > stress -m 8 --vm-bytes 224M > > (Only 1 example was run at a time.) > > But the following all did quickly fail (and were > paging): > > stress -m 8 --vm-bytes 225M > stress -m 4 --vm-bytes 449M > stress -m 2 --vm-bytes 897M > > (Only 1 example was run at a time.)
> > I'll note that when I exited an su process > I ended up with a: > > : /usr/src/contrib/jemalloc/include/jemalloc/internal/sz.h:200: Failed assertion: "ret == sz_index2size_compute(index)" > Abort trap (core dumped) > > and a matching su.core file. It appears > that stress's activity leads to other > processes also seeing examples of the > zeroed-page(s) problem (probably su had > paged some or had been fully swapped > out): > > (gdb) bt > #0 thr_kill () at thr_kill.S:4 > #1 0x503821d0 in __raise (s=6) at /usr/src/lib/libc/gen/raise.c:52 > #2 0x502e1d20 in abort () at /usr/src/lib/libc/stdlib/abort.c:67 > #3 0x502d6144 in sz_index2size_lookup (index=) at /usr/src/contrib/jemalloc/include/jemalloc/internal/sz.h:200 > #4 sz_index2size (index=) at /usr/src/contrib/jemalloc/include/jemalloc/internal/sz.h:207 > #5 ifree (tsd=0x5008b018, ptr=0x50041460, tcache=0x5008b138, slow_path=) at jemalloc_jemalloc.c:2583 > #6 0x502d5cec in __je_free_default (ptr=0x50041460) at jemalloc_jemalloc.c:2784 > #7 0x502d62d4 in __free (ptr=0x50041460) at jemalloc_jemalloc.c:2852 > #8 0x501050cc in openpam_destroy_chain (chain=0x50041480) at /usr/src/contrib/openpam/lib/libpam/openpam_load.c:113 > #9 0x50105094 in openpam_destroy_chain (chain=0x500413c0) at /usr/src/contrib/openpam/lib/libpam/openpam_load.c:111 > #10 0x50105094 in openpam_destroy_chain (chain=0x50041320) at /usr/src/contrib/openpam/lib/libpam/openpam_load.c:111 > #11 0x50105094 in openpam_destroy_chain (chain=0x50041220) at /usr/src/contrib/openpam/lib/libpam/openpam_load.c:111 > #12 0x50105094 in openpam_destroy_chain (chain=0x50041120) at /usr/src/contrib/openpam/lib/libpam/openpam_load.c:111 > #13 0x50105094 in openpam_destroy_chain (chain=0x50041100) at /usr/src/contrib/openpam/lib/libpam/openpam_load.c:111 > #14 0x50105014 in openpam_clear_chains (policy=0x50600004) at /usr/src/contrib/openpam/lib/libpam/openpam_load.c:130 > #15 0x50101230
in pam_end (pamh=0x50600000, status=) at /usr/src/contrib/openpam/lib/libpam/pam_end.c:83 > #16 0x1001225c in main (argc=, argv=0x0) at /usr/src/usr.bin/su/su.c:477 > > (gdb) print/x __je_sz_size2index_tab > $1 = {0x0 } > > > Notes: > > Given that the original problem did not involve > paging to the swap partition, maybe just making > it to the Laundry list or some such is sufficient, > something that is also involved when the swap > space is partially in use (according to top). Or > sitting in the inactive list for a long time, if > that has some special status. > The following was a fix for a "pages magically turn into zeros" problem on arm64/aarch64. The original 32-bit powerpc context did not seem a match to me --but the stress test behavior that I've just observed seems closer from an external-test point of view: swapping is involved. Maybe this will suggest something to someone that knows what they are doing. (Note: dsl-only.net closed down, so the E-mail address reference is no longer valid.) Author: kib Date: Mon Apr 10 15:32:26 2017 New Revision: 316679 URL: https://svnweb.freebsd.org/changeset/base/316679 Log: Do not lose dirty bits for removing PROT_WRITE on arm64. Arm64 pmap interprets accessed writable ptes as modified, since ARMv8.0 does not track Dirty Bit Modifier in hardware. If writable bit is removed, page must be marked as dirty for MI VM. This change is most important for COW, where fork caused losing content of the dirty pages which were not yet scanned by pagedaemon.
Reviewed by: alc, andrew Reported and tested by: Mark Millard PR: 217138, 217239 Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Modified: head/sys/arm64/arm64/pmap.c Modified: head/sys/arm64/arm64/pmap.c ============================================================================== --- head/sys/arm64/arm64/pmap.c Mon Apr 10 12:35:58 2017 (r316678) +++ head/sys/arm64/arm64/pmap.c Mon Apr 10 15:32:26 2017 (r316679) @@ -2481,6 +2481,11 @@ pmap_protect(pmap_t pmap, vm_offset_t sv sva += L3_SIZE) { l3 = pmap_load(l3p); if (pmap_l3_valid(l3)) { + if ((l3 & ATTR_SW_MANAGED) && + pmap_page_dirty(l3)) { + vm_page_dirty(PHYS_TO_VM_PAGE(l3 & + ~ATTR_MASK)); + } pmap_set(l3p, ATTR_AP(ATTR_AP_RO)); PTE_SYNC(l3p); /* XXX: Use pmap_invalidate_range */ === Mark Millard marklmi at yahoo.com ( dsl-only.net went away in early 2018-Mar) From owner-freebsd-current@freebsd.org Wed May 13 14:43:03 2020 Return-Path: Delivered-To: freebsd-current@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id E80AB2F81D8 for ; Wed, 13 May 2020 14:43:03 +0000 (UTC) (envelope-from markjdb@gmail.com) Received: from mail-qv1-xf2f.google.com (mail-qv1-xf2f.google.com [IPv6:2607:f8b0:4864:20::f2f]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1O1" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 49Mcpv5ssnz480G; Wed, 13 May 2020 14:43:03 +0000 (UTC) (envelope-from markjdb@gmail.com) Received: by mail-qv1-xf2f.google.com with SMTP id r3so7704qvm.1; Wed, 13 May 2020 07:43:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com;
s=20161025; h=sender:date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=8jlDi2hFH758/jt7B8L2DAFs3gBPnU+r0mUjy3pOOjY=; b=H75wRirjZbEmWnL9tfzcudg5MWDTAJg7iXzqA1yBSXLJ3YJmYuRmLL5nzSGhHmDNp3 WU+KFgVdhzN1eJPoj+kV2+t06zKrXPvZAf+Uv28rEfmT486+6D34CYTUJDYyCukHOGYf N0vH4vGqKKFhKGj9dyJkaCbiQXPIGO7/Ni1bWL4CbNA3833YEb+p+4XCV+INjpcMQFsL 7Tzip/nbKoQ9UOUw47TwD04OD/SbaaYBMnnxVBCN9NqbeucyI0n56mbq2eonUJBJNVRV tfHYpxRGlv3Hs7IKffaneRN2AsDJpyH3n9G8nmjfv9eNPGZFqJ9A3Ozkm+WSCOgm78dk Ucdg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:from:to:cc:subject:message-id :references:mime-version:content-disposition:in-reply-to; bh=8jlDi2hFH758/jt7B8L2DAFs3gBPnU+r0mUjy3pOOjY=; b=V+aYT/zwmMxZo+UJYCI0tOdmlA5mCtia/0FqihxGRhr2SsYVxBPk5Iew41tX+85luH VvDFYPE8UTX31PiRFYdMM+4fTvtMVZSHj3LTMG5XUmsiQ0y6TKkpL/hfieCgqAAERtbl d26FYOMlzhgEFzKZPARf+zvQps0x0CKVP3yX0+4abna/CjlMi0igomSRHAf23BoyJKhy RUn6C6A675Rety33/VpeoQJQWKRwrD1M94jIVBUlcR7sP7wrfufwy8ahmtPCcd2YFwD1 pOvfSpPn5yiWZ0n63LDgM5thVdjnAan+iT3RFvjas5CDxiVopAspAxIh+oPUMOuceub7 xPCQ== X-Gm-Message-State: AOAM530Ke9hyR21xd1YvX1f5++CqVJe5cjpkz8TQFDy0J+GuFbFKUOg3 0FJfOaRtztONQFtqIADEOMjj5jNP X-Google-Smtp-Source: ABdhPJwIaOIg+yGtGca9v7wjgbHVubpNR13f6qLSwxJf2C6ohxMdA+RQIDSaVc4a1iMrPmz2A1q2Ig== X-Received: by 2002:a05:6214:1427:: with SMTP id o7mr7180qvx.104.1589380982530; Wed, 13 May 2020 07:43:02 -0700 (PDT) Received: from raichu (toroon0560w-lp130-15-184-144-87-103.dsl.bell.ca. 
[184.144.87.103]) by smtp.gmail.com with ESMTPSA id n20sm13913469qkk.53.2020.05.13.07.43.01 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 13 May 2020 07:43:01 -0700 (PDT) Sender: Mark Johnston Date: Wed, 13 May 2020 10:42:57 -0400 From: Mark Johnston To: Andriy Gapon Cc: Bryan Drewery , freebsd-current@freebsd.org Subject: Re: zfs deadlock on r360452 relating to busy vm page Message-ID: <20200513144257.GA24239@raichu> References: <2bdc8563-283b-32cc-8a1a-85ff52aca99e@FreeBSD.org> <0e9cceba-84d0-ec4f-8784-36703452201d@FreeBSD.org> <889cb93b-85c7-3ec4-4ccf-5fb56ec38fa5@FreeBSD.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <889cb93b-85c7-3ec4-4ccf-5fb56ec38fa5@FreeBSD.org> X-Rspamd-Queue-Id: 49Mcpv5ssnz480G X-Spamd-Bar: ----- Authentication-Results: mx1.freebsd.org; none X-Spamd-Result: default: False [-6.00 / 15.00]; NEURAL_HAM_MEDIUM(-1.00)[-0.999,0]; REPLY(-4.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000,0] X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.33 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 13 May 2020 14:43:04 -0000 On Wed, May 13, 2020 at 10:45:24AM +0300, Andriy Gapon wrote: > On 13/05/2020 10:35, Andriy Gapon wrote: > > On 13/05/2020 01:47, Bryan Drewery wrote: > >> Trivial repro: > >> > >> dd if=/dev/zero of=blah & tail -F blah > >> ^C > >> load: 0.21 cmd: tail 72381 [prev->lr_read_cv] 2.17r 0.00u 0.01s 0% 2600k > >> #0 0xffffffff80bce615 at mi_switch+0x155 > >> #1 0xffffffff80c1cfea at sleepq_switch+0x11a > >> #2 0xffffffff80b57f0a at _cv_wait+0x15a > >> #3 0xffffffff829ddab6 at rangelock_enter+0x306 > >> #4 0xffffffff829ecd3f at zfs_freebsd_getpages+0x14f > >> #5 0xffffffff810e3ab9 at VOP_GETPAGES_APV+0x59 > >> #6 0xffffffff80f349e7 at vnode_pager_getpages+0x37 > >> #7 0xffffffff80f2a93f at vm_pager_get_pages+0x4f > >> #8 
0xffffffff80f054b0 at vm_fault+0x780 > >> #9 0xffffffff80f04bde at vm_fault_trap+0x6e > >> #10 0xffffffff8106544e at trap_pfault+0x1ee > >> #11 0xffffffff81064a9c at trap+0x44c > >> #12 0xffffffff8103a978 at calltrap+0x8 > > > > In r329363 I re-worked zfs_getpages and introduced range locking to it. > > At the time I believed that it was safe and maybe it was, please see the commit > > message. > > There, indeed, have been many performance / concurrency improvements to the VM > > system and r358443 is one of them. > > Thinking more about it, it could be r352176. > I think that vm_page_grab_valid (and later vm_page_grab_valid_unlocked) are not > equivalent to the code that they replaced. > The original code would check the valid field before any locking and it would > attempt any locking / busying only if a page is invalid. The object was required to > be locked though. > The new code tries to busy the page in any case. > > > I am not sure how to resolve the problem best. Maybe someone who knows the > > latest VM code better than me can comment on my assumptions stated in the commit > > message. The general trend has been to use the page busy lock as the single point of synchronization for per-page state. As you noted, updates to the valid bits were previously interlocked by the object lock, but this is coarse-grained and hurts concurrency. I think you are right that the range locking in getpages was ok before the recent change, but it seems preferable to try and address this in ZFS. > > In illumos (and, I think, in OpenZFS/ZoL) they don't have the range locking in > > this corner of the code because of a similar deadlock a long time ago. Do they just not implement readahead? Can you explain exactly what the range lock accomplishes here - is it entirely to ensure that znode block size remains stable?
From owner-freebsd-current@freebsd.org Wed May 13 15:56:37 2020 Return-Path: Delivered-To: freebsd-current@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id E48032F9A18; Wed, 13 May 2020 15:56:37 +0000 (UTC) (envelope-from chmeeedalf@gmail.com) Received: from mail-qk1-x741.google.com (mail-qk1-x741.google.com [IPv6:2607:f8b0:4864:20::741]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1O1" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 49MfRm6J8hz4Cjp; Wed, 13 May 2020 15:56:36 +0000 (UTC) (envelope-from chmeeedalf@gmail.com) Received: by mail-qk1-x741.google.com with SMTP id y22so5219351qki.3; Wed, 13 May 2020 08:56:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=date:from:to:cc:subject:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=AorPKGANpreDu+Uw1/aaP88YkeF7NB6uV/b4eKDQVhM=; b=k/Jk0ssqncUGxCMPu4IbpADHW9LkIeL34iOrXY+5Biyx5dpZcFFQQkDlVmdZuQMiwK gm2hU67eOn95FWgA4URmJNUVXu2wtwxVKtn7NqBnRCUilvHTScxG+02l7bYRY0L3iXC0 Ndsl+We+OBBmD9hxfKVys770l/BPGIO5hctvOtgealtnLeAx6j2TQ+DHYw9mLSTR9lRV 2lHaGpvNpCtirWL+3/ksD8C4ZaPhGJUUmEAlUM56sASYB0x/0dQNJnd/9KXaATB4Na74 q3rAxUB9AsevB+DoWb2pVV/ceJg18ha99k+l+XmtjNAJUBP7U0IskDMbNteNFb1LeQkR XzSw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=AorPKGANpreDu+Uw1/aaP88YkeF7NB6uV/b4eKDQVhM=; b=gY2LTAaP49eLwPgVK7iDhYp4M2ijXpY2lg5L3pQkIRtdybExHSLGl3GuyCUEDucUhv gS1vzJdntkFHIpcdn5DzF9wV3umCshMoCzp+y9Bss4bawwLOs30ud0SwDVH0Xy5WuWT0 3ZpmjpbFfiOouzFFtC6BGh2ybrMNCVE/a9aB2AMmfpU/7p8Vooxev+jIRjoK92xh3hp9 
IGSYFUrY8iXpHRbaqN8kUfHZq/f9hurJ+vA2gO37piNMW1SML07AJwuOKjstAlf3ZC4w NgMY9GMOg9Ght7mD98aJbr9Ls4GD8gVlBaSmJ0r0qZGopzkG4RHFe7XLAlaZIyLf6f+C F1WA== X-Gm-Message-State: AOAM531fgmCnqeK/NQYPeWTHoglY8cKGB3SjPucJgndPq6KdzcPQKgsm 09hSGpocpwl3D2MJncxrl5I= X-Google-Smtp-Source: ABdhPJy32QdNkjYh2lRrNzio3NWbm85r3Q1aP+/yT6mGNVhkcHOzzmYpla1TZNMld0yrVLw0/eRKNA== X-Received: by 2002:a37:a687:: with SMTP id p129mr343629qke.45.1589385395999; Wed, 13 May 2020 08:56:35 -0700 (PDT) Received: from titan.knownspace (173-19-125-130.client.mchsi.com. [173.19.125.130]) by smtp.gmail.com with ESMTPSA id v78sm133399qkb.62.2020.05.13.08.56.34 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 13 May 2020 08:56:35 -0700 (PDT) Date: Wed, 13 May 2020 10:56:32 -0500 From: Justin Hibbits To: Mark Millard Cc: "vangyzen@freebsd.org" , svn-src-head@freebsd.org, FreeBSD Current , FreeBSD Hackers , FreeBSD PowerPC ML , Brandon Bergren Subject: Re: svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311 Message-ID: <20200513105632.06db9e21@titan.knownspace> In-Reply-To: <359C9C7D-4106-42B5-AAB5-08EF995B8100@yahoo.com> References: <8479DD58-44F6-446A-9CA5-D01F0F7C1B38@yahoo.com> <17ACDA02-D7EF-4F26-874A-BB3E935CD072@yahoo.com> <695E6836-F860-4557-B7DE-CC1EDB347F18@yahoo.com> <121B9B09-141B-4DC3-918B-1E7CFB99E779@yahoo.com> <8AAB0462-3FA8-490C-8D8D-7C15B1C9E2DE@yahoo.com> <18E62746-80DB-4195-977D-4FF32D0129EE@yahoo.com> <9562EEE4-62EF-4164-91C0-948CC0432984@yahoo.com> <9B68839B-AEC8-43EE-B3B6-B696A4A57DAE@yahoo.com> <359C9C7D-4106-42B5-AAB5-08EF995B8100@yahoo.com> X-Mailer: Claws Mail 3.17.5 (GTK+ 2.24.32; powerpc64-portbld-freebsd13.0) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Rspamd-Queue-Id: 49MfRm6J8hz4Cjp X-Spamd-Bar: -- Authentication-Results: mx1.freebsd.org; dkim=pass header.d=gmail.com header.s=20161025 header.b=k/Jk0ssq; dmarc=pass 
(policy=none) header.from=gmail.com; spf=pass (mx1.freebsd.org: domain of chmeeedalf@gmail.com designates 2607:f8b0:4864:20::741 as permitted sender) smtp.mailfrom=chmeeedalf@gmail.com X-Spamd-Result: default: False [-3.00 / 15.00]; TO_DN_EQ_ADDR_SOME(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; TO_DN_SOME(0.00)[]; FREEMAIL_FROM(0.00)[gmail.com]; R_SPF_ALLOW(-0.20)[+ip6:2607:f8b0:4000::/36:c]; RCVD_COUNT_THREE(0.00)[3]; DKIM_TRACE(0.00)[gmail.com:+]; DMARC_POLICY_ALLOW(-0.50)[gmail.com,none]; RCPT_COUNT_SEVEN(0.00)[7]; FREEMAIL_TO(0.00)[yahoo.com]; RECEIVED_SPAMHAUS_PBL(0.00)[130.125.19.173.khpj7ygk5idzvmvt5x4ziurxhy.zen.dq.spamhaus.net : 127.0.0.11]; IP_SCORE(0.00)[ip: (0.01), ipnet: 2607:f8b0::/32(-0.33), asn: 15169(-0.42), country: US(-0.05)]; MIME_TRACE(0.00)[0:+]; FREEMAIL_ENVFROM(0.00)[gmail.com]; ASN(0.00)[asn:15169, ipnet:2607:f8b0::/32, country:US]; FROM_EQ_ENVFROM(0.00)[]; ARC_NA(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; R_DKIM_ALLOW(-0.20)[gmail.com:s=20161025]; FROM_HAS_DN(0.00)[]; DWL_DNSWL_NONE(0.00)[gmail.com.dwl.dnswl.org : 127.0.5.0]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; MIME_GOOD(-0.10)[text/plain]; IP_SCORE_FREEMAIL(0.00)[]; TO_MATCH_ENVRCPT_SOME(0.00)[]; RCVD_IN_DNSWL_NONE(0.00)[1.4.7.0.0.0.0.0.0.0.0.0.0.0.0.0.0.2.0.0.4.6.8.4.0.b.8.f.7.0.6.2.list.dnswl.org : 127.0.5.0]; RCVD_TLS_ALL(0.00)[] X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.33 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 13 May 2020 15:56:38 -0000 Hi Mark, On Wed, 13 May 2020 01:43:23 -0700 Mark Millard wrote: > [I'm adding a reference to an old arm64/aarch64 bug that had > pages turning to zero, in case this 32-bit powerpc issue is > somewhat analogous.] > > On 2020-May-13, at 00:29, Mark Millard wrote: > > > [stress alone is sufficient to have jemalloc asserts fail > > in stress, no need for a multi-socket G4 either. 
No need > > to involve nfsd, mountd, rpcbind or the like. This is not > > a claim that I know all the problems to be the same, just > > that a jemalloc reported failure in this simpler context > > happens and zeroed pages are involved.] > > > > Reminder: head -r360311 based context. > > > > > > First I show a single CPU/core PowerMac G4 context failing > > in stress. (I actually did this later, but it is the > > simpler context.) I simply moved the media from the > > 2-socket G4 to this slower, single-cpu/core one. > > > > cpu0: Motorola PowerPC 7400 revision 2.9, 466.42 MHz > > cpu0: Features 9c000000 > > cpu0: HID0 8094c0a4 > > real memory = 1577857024 (1504 MB) > > avail memory = 1527508992 (1456 MB) > > > > # stress -m 1 --vm-bytes 1792M > > stress: info: [1024] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd > > : > > /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h:258: > > Failed assertion: "slab == extent_slab_get(extent)" stress: FAIL: > > [1024] (415) <-- worker 1025 got signal 6 stress: WARN: [1024] > > (417) now reaping child worker processes stress: FAIL: [1024] (451) > > failed run completed in 243s > > > > (Note: 1792 is the biggest it allowed with M.) > > > > The following still pages in and out and fails: > > > > # stress -m 1 --vm-bytes 1290M > > stress: info: [1163] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd > > : > > /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h:258: > > Failed assertion: "slab == extent_slab_get(extent)" . . . > > > > By contrast, the following had no problem for as > > long as I let it run --and did not page in or out: > > > > # stress -m 1 --vm-bytes 1280M > > stress: info: [1181] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd > > ... > The following was a fix for a "pages magically > turn into zeros" problem on arm64/aarch64.
The > original 32-bit powerpc context did not seem a > match to me --but the stress test behavior that > I've just observed seems closer from an > external-test point of view: swapping is involved. > > Maybe this will suggest something to someone that > knows what they are doing. > > (Note: dsl-only.net closed down, so the E-mail > address reference is no longer valid.) > > Author: kib > Date: Mon Apr 10 15:32:26 2017 > New Revision: 316679 > URL: > https://svnweb.freebsd.org/changeset/base/316679 > > > Log: > Do not lose dirty bits for removing PROT_WRITE on arm64. > > Arm64 pmap interprets accessed writable ptes as modified, since > ARMv8.0 does not track Dirty Bit Modifier in hardware. If writable > bit is removed, page must be marked as dirty for MI VM. > > This change is most important for COW, where fork caused losing > content of the dirty pages which were not yet scanned by pagedaemon. > > Reviewed by: alc, andrew > Reported and tested by: Mark Millard > PR: 217138, 217239 > Sponsored by: The FreeBSD Foundation > MFC after: 2 weeks > > Modified: > head/sys/arm64/arm64/pmap.c > > Modified: head/sys/arm64/arm64/pmap.c > ============================================================================== > --- head/sys/arm64/arm64/pmap.c Mon Apr 10 12:35:58 > 2017 (r316678) +++ head/sys/arm64/arm64/pmap.c Mon Apr > 10 15:32:26 2017 (r316679) @@ -2481,6 +2481,11 @@ > pmap_protect(pmap_t pmap, vm_offset_t sv sva += L3_SIZE) { > l3 = pmap_load(l3p); > if (pmap_l3_valid(l3)) { > + if ((l3 & ATTR_SW_MANAGED) && > + pmap_page_dirty(l3)) { > + > vm_page_dirty(PHYS_TO_VM_PAGE(l3 & > + ~ATTR_MASK)); > + } > pmap_set(l3p, ATTR_AP(ATTR_AP_RO)); > PTE_SYNC(l3p); > /* XXX: Use pmap_invalidate_range */ > > > === > Mark Millard > marklmi at yahoo.com > ( dsl-only.net went > away in early 2018-Mar) > Thanks for this reference.
I took a quick look at the 3 pmap implementations we have (haven't checked the new radix pmap yet), and it looks like only mmu_oea.c (32-bit AIM pmap, for G3 and G4) is missing vm_page_dirty() calls in its pmap_protect() implementation, analogous to the change you posted right above. Given this, I think it's safe to say that this missing piece is necessary. We'll work on a fix for this; looking at moea64_protect(), there may be additional work needed to support this as well, so it may take a few days. - Justin From owner-freebsd-current@freebsd.org Thu May 14 11:16:21 2020 Return-Path: Delivered-To: freebsd-current@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id DC2832ECA79 for ; Thu, 14 May 2020 11:16:21 +0000 (UTC) (envelope-from agapon@gmail.com) Received: from mail-lj1-f195.google.com (mail-lj1-f195.google.com [209.85.208.195]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1O1" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 49N89w50n8z4BwL; Thu, 14 May 2020 11:16:20 +0000 (UTC) (envelope-from agapon@gmail.com) Received: by mail-lj1-f195.google.com with SMTP id a21so3012216ljj.11; Thu, 14 May 2020 04:16:20 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:openpgp:autocrypt :message-id:date:user-agent:mime-version:in-reply-to :content-language:content-transfer-encoding; bh=G3if1PWIeU7usf/8IaYxXbb+kedeZ22vOXTzOeVr4nM=; b=oKs3w/xGo9dVtxUJ6vj2J0in8bs5Ynw/dWdw3yMpvo95roSz3xIs0Chj2IwVrw8S5A 57iS9SynQrn7SGIFdg8aF+p1pOfqswReoMTF2FDVC9CU3/XEyVd0lEoL1e7nDk/rr3h5 hegoexQFXJFbbnwHaARQ6i2GiDMN/rOsLNgqq3WWsAzY/2IbaZLMt13VzCWptiXTLvzk dOObL1ljI6wJP4xeQdwuAIR6riXkpheWFctU44UthO2vlB/pq4pLD4Wwj0gP287scRo8
TRfSuMEZukXGQdVmCdXatWH8PxQ/6aGF0nqOPrz0CQkGxosNuydv7x7i0SW3vptBHAqZ SbZQ== X-Gm-Message-State: AOAM533e++P7Hqm9fHo4vAzopSHJa5kVemcf8q5cNMUxse2MEv0FgiAD hOzKYRJI5ClrUSchSFqff0Zfa23BQgo= X-Google-Smtp-Source: ABdhPJz8AMJtgKsAx+4NmxiRbJWJI4vR5AZ0rv4JoV8EgPTXAprrPGMg90bLTAr3+J7u2mB8Ov86QA== X-Received: by 2002:a2e:7e04:: with SMTP id z4mr2518189ljc.50.1589454978224; Thu, 14 May 2020 04:16:18 -0700 (PDT) Received: from [192.168.0.88] (east.meadow.volia.net. [93.72.151.96]) by smtp.googlemail.com with ESMTPSA id o14sm1636656lfn.56.2020.05.14.04.16.16 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 14 May 2020 04:16:17 -0700 (PDT) Subject: Re: zfs deadlock on r360452 relating to busy vm page To: Mark Johnston Cc: Bryan Drewery , freebsd-current@freebsd.org References: <2bdc8563-283b-32cc-8a1a-85ff52aca99e@FreeBSD.org> <0e9cceba-84d0-ec4f-8784-36703452201d@FreeBSD.org> <889cb93b-85c7-3ec4-4ccf-5fb56ec38fa5@FreeBSD.org> <20200513144257.GA24239@raichu> From: Andriy Gapon Openpgp: preference=signencrypt Autocrypt: addr=avg@FreeBSD.org; prefer-encrypt=mutual; keydata= mQINBFm4LIgBEADNB/3lT7f15UKeQ52xCFQx/GqHkSxEdVyLFZTmY3KyNPQGBtyvVyBfprJ7 mAeXZWfhat6cKNRAGZcL5EmewdQuUfQfBdYmKjbw3a9GFDsDNuhDA2QwFt8BmkiVMRYyvI7l N0eVzszWCUgdc3qqM6qqcgBaqsVmJluwpvwp4ZBXmch5BgDDDb1MPO8AZ2QZfIQmplkj8Y6Z AiNMknkmgaekIINSJX8IzRzKD5WwMsin70psE8dpL/iBsA2cpJGzWMObVTtCxeDKlBCNqM1i gTXta1ukdUT7JgLEFZk9ceYQQMJJtUwzWu1UHfZn0Fs29HTqawfWPSZVbulbrnu5q55R4PlQ /xURkWQUTyDpqUvb4JK371zhepXiXDwrrpnyyZABm3SFLkk2bHlheeKU6Yql4pcmSVym1AS4 dV8y0oHAfdlSCF6tpOPf2+K9nW1CFA8b/tw4oJBTtfZ1kxXOMdyZU5fiG7xb1qDgpQKgHUX8 7Rd2T1UVLVeuhYlXNw2F+a2ucY+cMoqz3LtpksUiBppJhw099gEXehcN2JbUZ2TueJdt1FdS ztnZmsHUXLxrRBtGwqnFL7GSd6snpGIKuuL305iaOGODbb9c7ne1JqBbkw1wh8ci6vvwGlzx rexzimRaBzJxlkjNfMx8WpCvYebGMydNoeEtkWldtjTNVsUAtQARAQABtB5BbmRyaXkgR2Fw b24gPGF2Z0BGcmVlQlNELm9yZz6JAlQEEwEIAD4WIQS+LEO7ngQnXA4Bjr538m7TUc1yjwUC WbgsiAIbIwUJBaOagAULCQgHAgYVCAkKCwIEFgIDAQIeAQIXgAAKCRB38m7TUc1yj+JAEACV 
l9AK/nOWAt/9cufV2fRj0hdOqB1aCshtSrwHk/exXsDa4/FkmegxXQGY+3GWX3deIyesbVRL rYdtdK0dqJyT1SBqXK1h3/at9rxr9GQA6KWOxTjUFURsU7ok/6SIlm8uLRPNKO+yq0GDjgaO LzN+xykuBA0FlhQAXJnpZLcVfPJdWv7sSHGedL5ln8P8rxR+XnmsA5TUaaPcbhTB+mG+iKFj GghASDSfGqLWFPBlX/fpXikBDZ1gvOr8nyMY9nXhgfXpq3B6QCRYKPy58ChrZ5weeJZ29b7/ QdEO8NFNWHjSD9meiLdWQaqo9Y7uUxN3wySc/YUZxtS0bhAd8zJdNPsJYG8sXgKjeBQMVGuT eCAJFEYJqbwWvIXMfVWop4+O4xB+z2YE3jAbG/9tB/GSnQdVSj3G8MS80iLS58frnt+RSEw/ psahrfh0dh6SFHttE049xYiC+cM8J27Aaf0i9RflyITq57NuJm+AHJoU9SQUkIF0nc6lfA+o JRiyRlHZHKoRQkIg4aiKaZSWjQYRl5Txl0IZUP1dSWMX4s3XTMurC/pnja45dge/4ESOtJ9R 8XuIWg45Oq6MeIWdjKddGhRj3OohsltKgkEU3eLKYtB6qRTQypHHUawCXz88uYt5e3w4V16H lCpSTZV/EVHnNe45FVBlvK7k7HFfDDkryLkCDQRZuCyIARAAlq0slcsVboY/+IUJdcbEiJRW be9HKVz4SUchq0z9MZPX/0dcnvz/gkyYA+OuM78dNS7Mbby5dTvOqfpLJfCuhaNYOhlE0wY+ 1T6Tf1f4c/uA3U/YiadukQ3+6TJuYGAdRZD5EqYFIkreARTVWg87N9g0fT9BEqLw9lJtEGDY EWUE7L++B8o4uu3LQFEYxcrb4K/WKmgtmFcm77s0IKDrfcX4doV92QTIpLiRxcOmCC/OCYuO jB1oaaqXQzZrCutXRK0L5XN1Y1PYjIrEzHMIXmCDlLYnpFkK+itlXwlE2ZQxkfMruCWdQXye syl2fynAe8hvp7Mms9qU2r2K9EcJiR5N1t1C2/kTKNUhcRv7Yd/vwusK7BqJbhlng5ZgRx0m WxdntU/JLEntz3QBsBsWM9Y9wf2V4tLv6/DuDBta781RsCB/UrU2zNuOEkSixlUiHxw1dccI 6CVlaWkkJBxmHX22GdDFrcjvwMNIbbyfQLuBq6IOh8nvu9vuItup7qemDG3Ms6TVwA7BD3j+ 3fGprtyW8Fd/RR2bW2+LWkMrqHffAr6Y6V3h5kd2G9Q8ZWpEJk+LG6Mk3fhZhmCnHhDu6CwN MeUvxXDVO+fqc3JjFm5OxhmfVeJKrbCEUJyM8ESWLoNHLqjywdZga4Q7P12g8DUQ1mRxYg/L HgZY3zfKOqcAEQEAAYkCPAQYAQgAJhYhBL4sQ7ueBCdcDgGOvnfybtNRzXKPBQJZuCyIAhsM BQkFo5qAAAoJEHfybtNRzXKPBVwQAKfFy9P7N3OsLDMB56A4Kf+ZT+d5cIx0Yiaf4n6w7m3i ImHHHk9FIetI4Xe54a2IXh4Bq5UkAGY0667eIs+Z1Ea6I2i27Sdo7DxGwq09Qnm/Y65ADvXs 3aBvokCcm7FsM1wky395m8xUos1681oV5oxgqeRI8/76qy0hD9WR65UW+HQgZRIcIjSel9vR XDaD2HLGPTTGr7u4v00UeTMs6qvPsa2PJagogrKY8RXdFtXvweQFz78NbXhluwix2Tb9ETPk LIpDrtzV73CaE2aqBG/KrboXT2C67BgFtnk7T7Y7iKq4/XvEdDWscz2wws91BOXuMMd4c/c4 OmGW9m3RBLufFrOag1q5yUS9QbFfyqL6dftJP3Zq/xe+mr7sbWbhPVCQFrH3r26mpmy841ym dwQnNcsbIGiBASBSKksOvIDYKa2Wy8htPmWFTEOPRpFXdGQ27awcjjnB42nngyCK5ukZDHi6 
w0qK5DNQQCkiweevCIC6wc3p67jl1EMFY5+z+zdTPb3h7LeVnGqW0qBQl99vVFgzLxchKcl0 R/paSFgwqXCZhAKMuUHncJuynDOP7z5LirUeFI8qsBAJi1rXpQoLJTVcW72swZ42IdPiboqx NbTMiNOiE36GqMcTPfKylCbF45JNX4nF9ElM0E+Y8gi4cizJYBRr2FBJgay0b9Cp Message-ID: Date: Thu, 14 May 2020 14:16:15 +0300 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:60.0) Gecko/20100101 Firefox/60.0 Thunderbird/60.9.0 MIME-Version: 1.0 In-Reply-To: <20200513144257.GA24239@raichu> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit X-Rspamd-Queue-Id: 49N89w50n8z4BwL X-Spamd-Bar: -- Authentication-Results: mx1.freebsd.org; dkim=none; dmarc=none; spf=pass (mx1.freebsd.org: domain of agapon@gmail.com designates 209.85.208.195 as permitted sender) smtp.mailfrom=agapon@gmail.com X-Spamd-Result: default: False [-2.06 / 15.00]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; RCVD_TLS_ALL(0.00)[]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; R_SPF_ALLOW(-0.20)[+ip4:209.85.128.0/17:c]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_GOOD(-0.10)[text/plain]; MIME_TRACE(0.00)[0:+]; DMARC_NA(0.00)[FreeBSD.org]; TO_DN_SOME(0.00)[]; NEURAL_HAM_LONG(-1.00)[-0.998,0]; RCVD_COUNT_THREE(0.00)[3]; IP_SCORE(-0.09)[ip: (0.40), ipnet: 209.85.128.0/17(-0.39), asn: 15169(-0.42), country: US(-0.05)]; RCVD_IN_DNSWL_NONE(0.00)[195.208.85.209.list.dnswl.org : 127.0.5.0]; NEURAL_HAM_MEDIUM(-0.97)[-0.971,0]; FORGED_SENDER(0.30)[avg@FreeBSD.org,agapon@gmail.com]; RWL_MAILSPIKE_POSSIBLE(0.00)[195.208.85.209.rep.mailspike.net : 127.0.0.17]; R_DKIM_NA(0.00)[]; FREEMAIL_ENVFROM(0.00)[gmail.com]; ASN(0.00)[asn:15169, ipnet:209.85.128.0/17, country:US]; FROM_NEQ_ENVFROM(0.00)[avg@FreeBSD.org,agapon@gmail.com]; MID_RHS_MATCH_FROM(0.00)[]; RECEIVED_SPAMHAUS_PBL(0.00)[96.151.72.93.khpj7ygk5idzvmvt5x4ziurxhy.zen.dq.spamhaus.net : 127.0.0.10] X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.33 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: 
List-Subscribe: , X-List-Received-Date: Thu, 14 May 2020 11:16:21 -0000 On 13/05/2020 17:42, Mark Johnston wrote: > On Wed, May 13, 2020 at 10:45:24AM +0300, Andriy Gapon wrote: >> On 13/05/2020 10:35, Andriy Gapon wrote: >>> In r329363 I re-worked zfs_getpages and introduced range locking to it. >>> At the time I believed that it was safe and maybe it was, please see the commit >>> message. >>> There, indeed, have been many performance / concurrency improvements to the VM >>> system and r358443 is one of them. >> >> Thinking more about it, it could be r352176. >> I think that vm_page_grab_valid (and later vm_page_grab_valid_unlocked) are not >> equivalent to the code that they replaced. >> The original code would check valid field before any locking and it would >> attempt any locking / busying if a page is invalid. The object was required to >> be locked though. >> The new code tries to busy the page in any case. >> >>> I am not sure how to resolve the problem best. Maybe someone who knows the >>> latest VM code better than me can comment on my assumptions stated in the commit >>> message. > > The general trend has been to use the page busy lock as the single point > of synchronization for per-page state. As you noted, updates to the > valid bits were previously interlocked by the object lock, but this is > coarse-grained and hurts concurrency. I think you are right that the > range locking in getpages was ok before the recent change, but it seems > preferable to try and address this in ZFS. > >>> In illumos (and, I think, in OpenZFS/ZoL) they don't have the range locking in >>> this corner of the code because of a similar deadlock a long time ago. > > Do they just not implement readahead? I think so, but not 100% sure. I recall seeing a comment in illumos code that they do not care about read-ahead because there is ZFS prefetch and the data will be cached in ARC. That makes sense from the I/O point of view, but it does not help with page faults.
> Can you explain exactly what the > range lock accomplishes here - is it entirely to ensure that znode block > size remains stable? As far as I can recall, this is the reason indeed. -- Andriy Gapon From owner-freebsd-current@freebsd.org Thu May 14 14:29:32 2020 Return-Path: Delivered-To: freebsd-current@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id C80EB2F3616 for ; Thu, 14 May 2020 14:29:32 +0000 (UTC) (envelope-from ctuffli@gmail.com) Received: from mail-pg1-f182.google.com (mail-pg1-f182.google.com [209.85.215.182]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1O1" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 49NDSr0Gcfz4PR3; Thu, 14 May 2020 14:29:31 +0000 (UTC) (envelope-from ctuffli@gmail.com) Received: by mail-pg1-f182.google.com with SMTP id b8so1303429pgi.11; Thu, 14 May 2020 07:29:31 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=wChxNqs9TtufNlGp5TbWSOfPHSxfolVSZer5edZZb8g=; b=NmdxCTm6FBQtALgleyqIiwM6lbKP42WxnNuUZWnXw7GNSwhp9bMsK7tvHjJTOGzee9 ARayc9F+MIZbGkTwHKfohPcoUPzmkvwuGple3kmU03N+EQ9znSlwZjbM1veitpWZnjyH UBNNleQlu1fVJtiAryG5WV59CdWM/Lu8hJmDShNnEmNIHyikz160mTLScpF2/XJc7TtH LsosxxhUd5YMCvGcXQllxsNMhPW8FTdfM0pFqdh3WJHd8C2QWkO0eEcqGmLx8nE05Ug7 qQrsWaZy2lxhZE8d2AxnywRjYxfz6b9QFNqTGcAcK5ClMsuqxfEXaBpYIYj3LPDSQzJ1 N5+w== X-Gm-Message-State: AOAM533TnL16fHA0+78571xBZs5OJ4W47dqMjE7jd5MXHxSdlVk61sBK KsTMgPPDD+2Lys0FYPThJ6nVInLh X-Google-Smtp-Source: ABdhPJwhq4GvGUmGWdgypJoGRMa3upP8QEO9BNzv3HBdS/XDhYxpRmBlkArKh/c/8iEJy+l3Itn2vQ== X-Received: by 2002:a62:8888:: with SMTP id l130mr4610176pfd.140.1589466570174; Thu, 14 May 2020 07:29:30 
-0700 (PDT) Received: from mail-pl1-f173.google.com (mail-pl1-f173.google.com. [209.85.214.173]) by smtp.gmail.com with ESMTPSA id ce21sm18216948pjb.51.2020.05.14.07.29.29 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 14 May 2020 07:29:29 -0700 (PDT) Received: by mail-pl1-f173.google.com with SMTP id q16so8854plr.2; Thu, 14 May 2020 07:29:29 -0700 (PDT) X-Received: by 2002:a17:90a:21c9:: with SMTP id q67mr9794374pjc.166.1589466569363; Thu, 14 May 2020 07:29:29 -0700 (PDT) MIME-Version: 1.0 References: <0F8BCB8C-DE60-4A34-A4D8-F1BB4B9F906A@samsco.org> <9EF043C1-FF8F-4997-B59A-EC3BF7D1CEEE@samsco.org> <31E8B2BE-BED2-4084-868D-32C48CB3CD6E@samsco.org> <573f5fab-1ef6-151f-18ca-58d3a4a89c72@quip.cz> <07B6763F-C23B-4B7C-B76A-26267AC35453@samsco.org> <20200417194431.GD39563@home.opsec.eu> <148dcdf7-f185-f14f-52ee-d4df3a2a1dc7@quip.cz> <8D8E1F62-AB66-47E1-8444-3D66F8EADA74@samsco.org> <015c7aa8-9385-4219-1bf1-0137f65ed80d@quip.cz> <90C35FEF-690C-4D04-A0D8-D3E5A448C744@samsco.org> <4736a46d-716d-7860-ff56-6c1d7391dbeb@quip.cz> <6e6461d1-fbc4-c306-f71f-54767f2849cb@quip.cz> In-Reply-To: <6e6461d1-fbc4-c306-f71f-54767f2849cb@quip.cz> From: Chuck Tuffli Date: Thu, 14 May 2020 07:29:18 -0700 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: PCIe NVME drives not detected on Dell R6515 To: Miroslav Lachman <000.fbsd@quip.cz> Cc: Scott Long , Kurt Jaeger , Warner Losh , FreeBSD-Current Content-Type: text/plain; charset="UTF-8" X-Rspamd-Queue-Id: 49NDSr0Gcfz4PR3 X-Spamd-Bar: ---- Authentication-Results: mx1.freebsd.org; dkim=none; dmarc=none; spf=pass (mx1.freebsd.org: domain of ctuffli@gmail.com designates 209.85.215.182 as permitted sender) smtp.mailfrom=ctuffli@gmail.com X-Spamd-Result: default: False [-4.11 / 15.00]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; FROM_HAS_DN(0.00)[]; RWL_MAILSPIKE_GOOD(0.00)[182.215.85.209.rep.mailspike.net : 127.0.0.18]; R_SPF_ALLOW(-0.20)[+ip4:209.85.128.0/17:c]; 
NEURAL_HAM_LONG(-1.00)[-1.000,0]; MIME_GOOD(-0.10)[text/plain]; MIME_TRACE(0.00)[0:+]; DMARC_NA(0.00)[freebsd.org]; RCPT_COUNT_FIVE(0.00)[5]; RCVD_COUNT_THREE(0.00)[4]; TO_MATCH_ENVRCPT_SOME(0.00)[]; TO_DN_ALL(0.00)[]; RCVD_IN_DNSWL_NONE(0.00)[182.215.85.209.list.dnswl.org : 127.0.5.0]; IP_SCORE(-2.11)[ip: (-9.67), ipnet: 209.85.128.0/17(-0.39), asn: 15169(-0.42), country: US(-0.05)]; FORGED_SENDER(0.30)[chuck@freebsd.org,ctuffli@gmail.com]; R_DKIM_NA(0.00)[]; FREEMAIL_ENVFROM(0.00)[gmail.com]; ASN(0.00)[asn:15169, ipnet:209.85.128.0/17, country:US]; FROM_NEQ_ENVFROM(0.00)[chuck@freebsd.org,ctuffli@gmail.com]; RCVD_TLS_ALL(0.00)[] X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.33 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 14 May 2020 14:29:32 -0000 On Mon, May 4, 2020 at 11:12 AM Miroslav Lachman <000.fbsd@quip.cz> wrote: > > On 2020-04-27 08:02, Miroslav Lachman wrote: > > I don't know what is with Scott. I hope he is well. > > Is there somebody else who can help me with this issue? > > Scott wrote there are hotplug PCIe buses not probed during boot process. > > I am not a developer so I cannot move forward alone. > > The problem is with PCIe Hot Plug. > Hot Plug bus was not enumerated thus no NVME detected. I may have just been bitten by this as well when running FreeBSD under qemu. The q35 machine type with PCIe emulation enables PCIe hot plug on all the root ports, but I am not seeing any downstream devices (either emulated like e1000 or passed through by the host) because of a check in pcib_hotplug_present(): /* * Require the Electromechanical Interlock to be engaged if * present. 
*/ if (sc->pcie_slot_cap & PCIEM_SLOT_CAP_EIP && (sc->pcie_slot_sta & PCIEM_SLOT_STA_EIS) == 0) return (0); Under qemu, the slot indicates an Electromechanical Interlock is Present in the capabilities register, but it does not set the Electromechanical Interlock Status bit. This causes the PCI driver to not probe any children. Commenting out the above code made both emulated PCIe devices as well as host devices passed through appear in FreeBSD. As a data point, I'm not seeing similar checks in the Linux kernel. Miroslav, would it be possible to comment out/delete the above code in your kernel and retest to see if that helps your case as well? --chuck From owner-freebsd-current@freebsd.org Thu May 14 15:26:47 2020 Return-Path: Delivered-To: freebsd-current@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id D3E1A2F49A3 for ; Thu, 14 May 2020 15:26:47 +0000 (UTC) (envelope-from markjdb@gmail.com) Received: from mail-qv1-xf42.google.com (mail-qv1-xf42.google.com [IPv6:2607:f8b0:4864:20::f42]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1O1" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 49NFkv5Df2z4SKv; Thu, 14 May 2020 15:26:47 +0000 (UTC) (envelope-from markjdb@gmail.com) Received: by mail-qv1-xf42.google.com with SMTP id z9so1836213qvi.12; Thu, 14 May 2020 08:26:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=sender:date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=YtioxKq85PWeY7Ea5uAEcaCAuvzeHlH1tmrypRinKFQ=; b=WNMGeM7/KQxqo8gmRiPx+j//s1fWV0yetSCAnSp0y0RP2uwY0LKorX7s9PdNPWgjCt Ww+s7zpKbWJk6SvLOn7+59yQ7TRcxHePoL2ViMa4EIUlO4lNB1X2HPGZP3tRET5MODJp 
DVxeZ32cElHYekmGrgiIer2LyrK4wjGX3NfZkHJo56TXnINVQ6fPEuyWTh6UEAhzC1qn eUoBkI2uOsMul0UA23tlGia+rWBBc6AJ0wcfSxEsAqwLwNkwAkVFzH75IExydD4jStm7 HLU+griiW75I0TJCfSW1rzm3XPOpgZjL2u81bqtkvNjviJzxTJ1zcZzIY1uNHPyezocG hixw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:from:to:cc:subject:message-id :references:mime-version:content-disposition:in-reply-to; bh=YtioxKq85PWeY7Ea5uAEcaCAuvzeHlH1tmrypRinKFQ=; b=o82jn5fEP/YO3N8nUEH8HmTvnoGnWOIgX9XbfTR0DWNUucFcmDb7/oUyCW8C8vZ9x6 VI8IdmI0ECDFyxOe9Xh9fauA2cdOCNWVm1P6B+z0ykYa3Yug92I/MCJ8ZIyKdl177dAo BSeAx9RoB9pNtTwm3LSctjLwiXRu/Z3TZVxsrjWC7EMFQgoIYnSWEbgSv2PkWFhgFKKW y2bXqKKK0wpInycEoeewJren0rfeZu5LmIvcJgRlfBjyvHT8veQ7XTy18fFqnuarU3yi O/RTrP1Q+DHHAUi/Cuo0kuIsfSpWHh20wSfCo9swX4jT1hD4/T3QPoL7MSpfSz1o0B7r kyZQ== X-Gm-Message-State: AOAM530hEAE6ArxnJvzuCjDaTd0424zUlWHYyNuJJjeNBTHHJyThJONS URG7ZDa4YVm4srWwmx8qDEbzNCag X-Google-Smtp-Source: ABdhPJwTVDETUCl4J1q3fpXL1GQLH5i7uBfw1azdn5OPTKsR9rAVrDMpfENutYwYxcaHTkV1Fi/MrA== X-Received: by 2002:ad4:4c92:: with SMTP id bs18mr5427873qvb.67.1589470006268; Thu, 14 May 2020 08:26:46 -0700 (PDT) Received: from raichu (toroon0560w-lp130-15-184-144-87-103.dsl.bell.ca. 
[184.144.87.103]) by smtp.gmail.com with ESMTPSA id x19sm2667229qkb.136.2020.05.14.08.26.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 14 May 2020 08:26:45 -0700 (PDT) Sender: Mark Johnston Date: Thu, 14 May 2020 11:26:43 -0400 From: Mark Johnston To: Andriy Gapon Cc: Bryan Drewery , freebsd-current@freebsd.org Subject: Re: zfs deadlock on r360452 relating to busy vm page Message-ID: <20200514152643.GC4268@raichu> References: <2bdc8563-283b-32cc-8a1a-85ff52aca99e@FreeBSD.org> <0e9cceba-84d0-ec4f-8784-36703452201d@FreeBSD.org> <889cb93b-85c7-3ec4-4ccf-5fb56ec38fa5@FreeBSD.org> <20200513144257.GA24239@raichu> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Rspamd-Queue-Id: 49NFkv5Df2z4SKv X-Spamd-Bar: ----- Authentication-Results: mx1.freebsd.org; none X-Spamd-Result: default: False [-5.98 / 15.00]; NEURAL_HAM_MEDIUM(-0.98)[-0.983,0]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; REPLY(-4.00)[] X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.33 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 14 May 2020 15:26:47 -0000 On Thu, May 14, 2020 at 02:16:15PM +0300, Andriy Gapon wrote: > On 13/05/2020 17:42, Mark Johnston wrote: > > On Wed, May 13, 2020 at 10:45:24AM +0300, Andriy Gapon wrote: > >> On 13/05/2020 10:35, Andriy Gapon wrote: > >>> In r329363 I re-worked zfs_getpages and introduced range locking to it. > >>> At the time I believed that it was safe and maybe it was, please see the commit > >>> message. > >>> There, indeed, have been many performance / concurrency improvements to the VM > >>> system and r358443 is one of them. > >> > >> Thinking more about it, it could be r352176. > >> I think that vm_page_grab_valid (and later vm_page_grab_valid_unlocked) are not > >> equivalent to the code that they replaced. 
> >> The original code would check valid field before any locking and it would > >> attempt any locking / busying if a page is invalid. The object was required to > >> be locked though. > >> The new code tries to busy the page in any case. > >> > >>> I am not sure how to resolve the problem best. Maybe someone who knows the > >>> latest VM code better than me can comment on my assumptions stated in the commit > >>> message. > > > > The general trend has been to use the page busy lock as the single point > > of synchronization for per-page state. As you noted, updates to the > > valid bits were previously interlocked by the object lock, but this is > > coarse-grained and hurts concurrency. I think you are right that the > > range locking in getpages was ok before the recent change, but it seems > > preferable to try and address this in ZFS. > > > >>> In illumos (and, I think, in OpenZFS/ZoL) they don't have the range locking in > >>> this corner of the code because of a similar deadlock a long time ago. > > > > Do they just not implement readahead? > > I think so, but not 100% sure. > I recall seeing a comment in illumos code that they do not care about read-ahead > because there is ZFS prefetch and the data will be cached in ARC. That makes > sense from the I/O point of view, but it does not help with page faults. > > > Can you explain exactly what the > > range lock accomplishes here - is it entirely to ensure that znode block > > size remains stable? > > As far as I can recall, this is the reason indeed. It seems to me that zfs_getpages() could use a non-blocking rangelock_enter() operation to avoid the deadlock. The ZFS rangelock implementation doesn't have one, but it is easy to add.
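
The non-blocking idea can be illustrated with a minimal userland sketch, assuming a C11 atomic flag stands in for the range lock. The names range_lock, range_lock_tryenter(), range_lock_exit() and getpages_sketch() are hypothetical; this is neither the ZFS rangelock API nor the patch under review, and skipping read-ahead on contention is only one possible fallback:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a range lock; NOT the ZFS rangelock API. */
struct range_lock {
	atomic_flag held;
};

/* Try to take the lock without sleeping; true means we now hold it. */
static bool
range_lock_tryenter(struct range_lock *rl)
{
	return (!atomic_flag_test_and_set_explicit(&rl->held,
	    memory_order_acquire));
}

/* Release a lock obtained through range_lock_tryenter(). */
static void
range_lock_exit(struct range_lock *rl)
{
	atomic_flag_clear_explicit(&rl->held, memory_order_release);
}

/*
 * Hypothetical page-in path.  The caller is assumed to already hold the
 * busy state of the faulting page, so sleeping on the range lock here
 * could deadlock against a thread that holds the range lock and waits
 * for that page.  Returns the number of pages that would be read.
 */
static int
getpages_sketch(struct range_lock *rl, int readahead)
{
	int npages = 1;		/* the faulting page is always read */

	assert(readahead >= 0);
	if (range_lock_tryenter(rl)) {
		/* Lock obtained without sleeping: read-ahead is allowed. */
		npages += readahead;
		range_lock_exit(rl);
	}
	/* Otherwise the lock is contended: skip read-ahead, never sleep. */
	return (npages);
}
```

With the lock free, getpages_sketch(&rl, 3) returns 4; while some other holder has the lock, the same call returns 1 instead of sleeping, so the lock-order cycle cannot form.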
I'm not able to trigger the deadlock with this patch: https://reviews.freebsd.org/D24839 From owner-freebsd-current@freebsd.org Fri May 15 09:36:06 2020 Return-Path: Delivered-To: freebsd-current@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id DD19C2EEE3D for ; Fri, 15 May 2020 09:36:06 +0000 (UTC) (envelope-from timp87@gmail.com) Received: from mail-ej1-x62f.google.com (mail-ej1-x62f.google.com [IPv6:2a00:1450:4864:20::62f]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1O1" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 49Njvn5P1qz4ZtD for ; Fri, 15 May 2020 09:36:05 +0000 (UTC) (envelope-from timp87@gmail.com) Received: by mail-ej1-x62f.google.com with SMTP id x20so1661240ejb.11 for ; Fri, 15 May 2020 02:36:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:from:date:message-id:subject:to; bh=/+VMv4TAT8UcTu5a2t2A7W9RzxBQwgu4aXBvuGnMy00=; b=Dx8qcG1rjHoLX5KT0rXRJU+grdHbgB6u0wiFysd9w+UhM/+nmxHfi3LW4ashdYxP8C +HUYifE0UY1OaP17Cfuxfjk+1VuKMursxKrZdw5lq7LBBhoy5bkgwxsg163yRagwQ045 aTmm8ekkrdivmwAUFEfAFbFddtGCJQ2u3tUJrlKrSwV8ZPqgKUGviX4Dj6THUzG3Ppy7 G6mBJMFAekuxpsDS/PAYAsmGVI3fFvQ2nXsSNTSH3YqtLd743aaNthoAXhGqsjHnlQWQ z6SHVatfTOmwo5IyhyOdjEOjcshIGXHwWgcYcpFESu1iGswctI5vA0+/3cVxIrLnE5zy qXWA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:from:date:message-id:subject:to; bh=/+VMv4TAT8UcTu5a2t2A7W9RzxBQwgu4aXBvuGnMy00=; b=p4lBYIs3/6d7yiS/eJwzY/HtfV2Fj6KHCOWIP4lK8bpwIM/ZLTnd9As7n2Dfj3qaqx tokf5dCiSTqFs3ZLgt8zupCaadD14uoCpgFlGsPvSZSdIC62m9ZltgqX2aRkCzc+x0i8 GPOWR3X7PZFp+ap6kUVUOALSkq83PrMUwRQKd63rZeF2ZBGZhngybnZcfsH4gAoo8bMc 
X6c0B8hYOKKodPuz6aJAZUJ8WlVJFqjgxJA/4jB9wv29vNU4XmZbUSzWWl9jbxrA23lF mb3yyVBjDwByCjlG5onnG7aG60+Qhi0FtuwgB25rF0h3UcRtLk8q447ckk8B+CsZESZM y4dg== X-Gm-Message-State: AOAM53363ncEln47fK92xrC73D8Hrc+r6HZbrnj8U8zUayKXYd0hlFx+ F03fxxo2JTOJVL7aedSdMAX+ttu9bwzm8RQmfd4Elpw1 X-Google-Smtp-Source: ABdhPJxseCDTjMzOSqbNqUGv00kd/3pZsVvPXES1kd/3EJ+ukQlkkpk3X3JVp2dRKO+miD4KzZ7AGWotTfWztFNsE7w= X-Received: by 2002:a17:906:3607:: with SMTP id q7mr1757422ejb.81.1589535363685; Fri, 15 May 2020 02:36:03 -0700 (PDT) MIME-Version: 1.0 From: Pavel Timofeev Date: Fri, 15 May 2020 12:35:52 +0300 Message-ID: Subject: bhyve(8) missing new feature description about netgraph integration To: freebsd-current X-Rspamd-Queue-Id: 49Njvn5P1qz4ZtD X-Spamd-Bar: - Authentication-Results: mx1.freebsd.org; dkim=pass header.d=gmail.com header.s=20161025 header.b=Dx8qcG1r; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (mx1.freebsd.org: domain of timp87@gmail.com designates 2a00:1450:4864:20::62f as permitted sender) smtp.mailfrom=timp87@gmail.com X-Spamd-Result: default: False [-2.00 / 15.00]; ARC_NA(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-0.996,0]; R_DKIM_ALLOW(-0.20)[gmail.com:s=20161025]; FROM_HAS_DN(0.00)[]; R_SPF_ALLOW(-0.20)[+ip6:2a00:1450:4000::/36]; FREEMAIL_FROM(0.00)[gmail.com]; MIME_GOOD(-0.10)[multipart/alternative,text/plain]; PREVIOUSLY_DELIVERED(0.00)[freebsd-current@freebsd.org]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; IP_SCORE_FREEMAIL(0.00)[]; URI_COUNT_ODD(1.00)[3]; RCPT_COUNT_ONE(0.00)[1]; IP_SCORE(0.00)[ipnet: 2a00:1450::/32(-2.27), asn: 15169(-0.42), country: US(-0.05)]; TO_DN_ALL(0.00)[]; DKIM_TRACE(0.00)[gmail.com:+]; DMARC_POLICY_ALLOW(-0.50)[gmail.com,none]; RCVD_IN_DNSWL_NONE(0.00)[f.2.6.0.0.0.0.0.0.0.0.0.0.0.0.0.0.2.0.0.4.6.8.4.0.5.4.1.0.0.a.2.list.dnswl.org : 127.0.5.0]; TO_MATCH_ENVRCPT_ALL(0.00)[]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+,1:+,2:~]; FREEMAIL_ENVFROM(0.00)[gmail.com]; ASN(0.00)[asn:15169, ipnet:2a00:1450::/32, country:US]; RCVD_COUNT_TWO(0.00)[2]; 
RCVD_TLS_ALL(0.00)[]; DWL_DNSWL_NONE(0.00)[gmail.com.dwl.dnswl.org : 127.0.5.0] Content-Type: text/plain; charset="UTF-8" X-Content-Filtered-By: Mailman/MimeDel 2.1.33 X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.33 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 15 May 2020 09:36:06 -0000 Hi, I saw this nice change in svn - "Add a new bhyve network backend that allow to connect the VM to the netgraph(4) network" https://svnweb.freebsd.org/base?view=revision&revision=360958 Does anybody know if there is any prepared but not-committed yet review or something to reflect this change in bhyve(8) man page? From owner-freebsd-current@freebsd.org Fri May 15 10:26:38 2020 Return-Path: Delivered-To: freebsd-current@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id A17C32F13DB for ; Fri, 15 May 2020 10:26:38 +0000 (UTC) (envelope-from aleksandr.fedorov@itglobal.com) Received: from relay01.itglobal.com (relay01.itglobal.com [185.255.76.12]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 49Nl235jsPz4dW8 for ; Fri, 15 May 2020 10:26:35 +0000 (UTC) (envelope-from aleksandr.fedorov@itglobal.com) X-Virus-Scanned: by SpamTitan at itglobal.com DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=itglobal.com; s=relay; t=1589538377; bh=ixLwwpSmcnxzB4R3DObR+8HQingCvn+Tv5j303gTp+4=; h=From:To:CC:Subject:Date; b=qn/Fm1N0whpS2UQUvSb2436DQcYtV25AROOzlHoqjoarsppeVIunA+JdExnXoiUTS +Jnz4tECkjLWELEyCxSUeF+wDrsdgzCp1Fs/KXAAxaV41uzNezWWKYIBC8Io2ZbgOS PWqkiMtX0fJPoK3mDkQ2WQS4pWsyds3PFvpMaQhvLTeQStAzwe9tcY8p9UST++dF5x 1NgShoYaszqdlnwS73d5DFkXpq9M4jf9vkUfr0GwawVQHLiDzg8UHHKHTQxS8j5gpQ 
qnzcz5ywop0/BTBkpNVvupERfMatPTbv85E9Z0npQLx3vr2GZapGwgm/5g9yg9UE6p SmiqV/XsAhhLw== From: "Fedorov, Aleksandr" To: "timp87@gmail.com" CC: "freebsd-current@freebsd.org" Subject: Re: bhyve(8) missing new feature description about netgraph integration Thread-Topic: bhyve(8) missing new feature description about netgraph integration Thread-Index: AQHWKqG45wacpPT2IkqDdYRpSb5oaQ== Date: Fri, 15 May 2020 10:26:10 +0000 Message-ID: <98e554642f0e4ba7bbd5059f236a3cb2@itglobal.com> Accept-Language: ru-RU, en-US Content-Language: ru-RU X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.32.254.12] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-Rspamd-Queue-Id: 49Nl235jsPz4dW8 X-Spamd-Bar: - Authentication-Results: mx1.freebsd.org; dkim=pass header.d=itglobal.com header.s=relay header.b=qn/Fm1N0; dmarc=pass (policy=none) header.from=itglobal.com; spf=pass (mx1.freebsd.org: domain of aleksandr.fedorov@itglobal.com designates 185.255.76.12 as permitted sender) smtp.mailfrom=aleksandr.fedorov@itglobal.com X-Spamd-Result: default: False [-1.91 / 15.00]; ARC_NA(0.00)[]; FAKE_REPLY(1.00)[]; R_DKIM_ALLOW(-0.20)[itglobal.com:s=relay]; HAS_XOIP(0.00)[]; FROM_HAS_DN(0.00)[]; R_SPF_ALLOW(-0.20)[+mx]; DMARC_POLICY_ALLOW(-0.50)[itglobal.com,none]; MIME_GOOD(-0.10)[text/plain]; NEURAL_HAM_LONG(-0.96)[-0.959,0]; NEURAL_HAM_MEDIUM(-0.97)[-0.974,0]; TO_MATCH_ENVRCPT_SOME(0.00)[]; DKIM_TRACE(0.00)[itglobal.com:+]; RCPT_COUNT_TWO(0.00)[2]; RCVD_IN_DNSWL_NONE(0.00)[12.76.255.185.list.dnswl.org : 127.0.10.0]; TO_DN_EQ_ADDR_ALL(0.00)[]; FREEMAIL_TO(0.00)[gmail.com]; RCVD_COUNT_ZERO(0.00)[0]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; IP_SCORE(0.02)[country: BY(0.10)]; ASN(0.00)[asn:209283, ipnet:185.255.76.0/22, country:BY]; MID_RHS_MATCH_FROM(0.00)[] X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.33 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: 
List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 15 May 2020 10:26:38 -0000 I sent the changes for review only today: https://reviews.freebsd.org/D24846 You can see some examples in the original review: https://reviews.freebsd.org/D24620