From: Rick Macklem <rmacklem@uoguelph.ca>
To: Konstantin Belousov
CC: "freebsd-current@freebsd.org", Alexander Motin, Doug Rabson
Subject: Re: nfsd kernel threads won't die via SIGKILL
Date: Mon, 25 Jun 2018 02:04:32 +0000
Konstantin Belousov wrote:
>On Sat, Jun 23, 2018 at 09:03:02PM +0000, Rick Macklem wrote:
>> During testing of the pNFS server I have been frequently killing/restarting the nfsd.
>> Once in a while, the "slave" nfsd process doesn't terminate and a "ps axHl" shows:
>>   0 48889     1   0  20  0  5884   812 svcexit  D  -   0:00.01 nfsd: server
>>   0 48889     1   0  40  0  5884   812 rpcsvc   I  -   0:00.00 nfsd: server
>> ... more of the same
>>   0 48889     1   0  40  0  5884   812 rpcsvc   I  -   0:00.00 nfsd: server
>>   0 48889     1   0  -8  0  5884   812 rpcsvc   I  -   1:51.78 nfsd: server
>>   0 48889     1   0  -8  0  5884   812 rpcsvc   I  -   2:27.75 nfsd: server
>>
>> You can see that the top thread (the one that was created with the process) is
>> stuck in "D" on "svcexit".
>> The rest of the threads are still servicing NFS RPCs. If you still have an NFS mount on
>> the server, the mount continues to work and the CPU time for the last two threads
>> slowly climbs, due to NFS RPC activity. A SIGKILL was posted for the process and
>> these threads (created by kthread_add) are here, but
>> cv_wait_sig()/cv_timedwait_sig() never seems to return EINTR for these other
>> threads.
>>
>>              if (ismaster || (!ismaster &&
>> 1207             grp->sg_threadcount > grp->sg_minthreads))
>> 1208                 error = cv_timedwait_sig(&st->st_cond,
>> 1209                     &grp->sg_lock, 5 * hz);
>> 1210         else
>> 1211                 error = cv_wait_sig(&st->st_cond,
>> 1212                     &grp->sg_lock);
>>
>> The top thread (referred to in svc.c as "ismaster") did return from here with EINTR
>> and has now done an msleep() here, waiting for the other threads to terminate.
>>
>>      /* Waiting for threads to stop. */
>> 1387 for (g = 0; g < pool->sp_groupcount; g++) {
>> 1388         grp = &pool->sp_groups[g];
>> 1389         mtx_lock(&grp->sg_lock);
>> 1390         while (grp->sg_threadcount > 0)
>> 1391                 msleep(grp, &grp->sg_lock, 0, "svcexit", 0);
>> 1392         mtx_unlock(&grp->sg_lock);
>> 1393 }
>>
>> Although I can't be sure if this patch has fixed the problem because it happens
>> intermittently, I have not seen the problem since applying this patch:
>> --- rpc/svc.c.sav	2018-06-21 22:52:11.623955000 -0400
>> +++ rpc/svc.c	2018-06-22 09:01:40.271803000 -0400
>> @@ -1388,7 +1388,7 @@ svc_run(SVCPOOL *pool)
>>  		grp = &pool->sp_groups[g];
>>  		mtx_lock(&grp->sg_lock);
>>  		while (grp->sg_threadcount > 0)
>> -			msleep(grp, &grp->sg_lock, 0, "svcexit", 0);
>> +			msleep(grp, &grp->sg_lock, 0, "svcexit", 1);
>>  		mtx_unlock(&grp->sg_lock);
>>  	}
>>  }
>>
>> As you can see, all it does is add a timeout to the msleep().
>> I am not familiar with the signal delivery code in sleepqueue, so it probably
>> isn't correct, but my theory is along the lines of...
>>
>> Since the msleep() doesn't have PCATCH, it does not set TDF_SINTR,
>> and if that happens before the other threads return EINTR from cv_wait_sig(),
>> they no longer do so?
>> And I thought that waking up from the msleep() via timeouts would maybe allow
>> the other threads to return EINTR from cv_wait_sig()?
>>
>> Does this make sense? rick
>> ps: I'll post if I see the problem again with the patch applied.
>> pss: This is a single core i386 system, just in case that might affect this.
>
>No, the patch does not make sense. I think it was just coincidental that
>with the patch you did not get the hang.
>
>Signals are delivered to a thread, which should take the appropriate
>actions. For the kernel process like rpc pool, the signals are never
>delivered, they are queued in the randomly selected thread's signal queue
>and sit there. The interruptible sleeps are aborted in the context
>of that thread, but nothing else happens. So if you need to make svc
>pools properly killable, all threads must check at least for EINTR and
>instruct other threads to exit as well.
I'm not sure I understand what the "randomly selected thread's signal queue" means,
but it seems strange that this usually works. (The code is at least 10 years old,
originally committed by dfr@. I've added him to the cc list in case he understands this.)
Is it that, usually, the other threads will all return EINTR before the master one
gets to the msleep() (which only happens if the count is > 0)?

>Your description at the start of the message of the behaviour after
>SIGKILL, where other threads continued to serve RPCs, exactly matches
>above explanation. You need to add some global 'stop' flag, if it is not
>yet present, and recheck it after each RPC handled. Any thread which
>notes EINTR or does a direct check for the pending signal, should set
>the flag and wake up every other thread in the pool.
Ok, I'll code up a patch with a global "stop" flag and test it for a while.
If it seems ok, I'll put it up in phabricator and ask you to review it.

Thanks, rick
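ps: Just to make sure I have the shape of what you are suggesting right, here is a
quick userland pthreads analogue of the idea (all names below are made up for the
sketch, not the actual fields/functions in sys/rpc/svc.c): every worker re-checks a
shared "stop" flag after each RPC, the first worker that sees the EINTR sets the
flag and wakes everyone, and the master just waits for the thread count to reach
zero. The real patch would presumably live in svc_run_internal()/svc_run() and use
sg_lock and the condvars, but the control flow would be the same:

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

#define NWORKERS 4

static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  pool_cv = PTHREAD_COND_INITIALIZER;
static bool pool_stop = false;          /* the global "stop" flag */
static int  pool_threadcount = 0;

/*
 * Stand-in for handling one RPC.  Returns true when this thread is the one
 * that "notices the signal" (the analogue of cv_wait_sig() returning EINTR).
 */
static bool
handle_one_rpc(long id)
{
	usleep(100 * 1000);
	return (id == 0);	/* pretend only thread 0 sees the EINTR */
}

static void *
svc_worker(void *arg)
{
	long id = (long)arg;

	for (;;) {
		/* Re-check the stop flag after every RPC. */
		pthread_mutex_lock(&pool_lock);
		if (pool_stop) {
			pool_threadcount--;
			/* Wake the master waiting in its exit loop. */
			pthread_cond_broadcast(&pool_cv);
			pthread_mutex_unlock(&pool_lock);
			return (NULL);
		}
		pthread_mutex_unlock(&pool_lock);

		if (handle_one_rpc(id)) {
			/*
			 * This thread saw the "EINTR": set the global flag and
			 * wake up every other thread in the pool (the analogue
			 * of a cv_broadcast() on each group's condvar).
			 */
			pthread_mutex_lock(&pool_lock);
			pool_stop = true;
			pthread_cond_broadcast(&pool_cv);
			pthread_mutex_unlock(&pool_lock);
		}
	}
}

int
main(void)
{
	pthread_t workers[NWORKERS];
	long i;

	pool_threadcount = NWORKERS;
	for (i = 0; i < NWORKERS; i++)
		pthread_create(&workers[i], NULL, svc_worker, (void *)i);

	/* The master's exit loop: the analogue of the "svcexit" msleep(). */
	pthread_mutex_lock(&pool_lock);
	while (pool_threadcount > 0)
		pthread_cond_wait(&pool_cv, &pool_lock);
	pthread_mutex_unlock(&pool_lock);

	for (i = 0; i < NWORKERS; i++)
		pthread_join(workers[i], NULL);
	printf("all %d workers stopped\n", NWORKERS);
	return (0);
}

(Builds with "cc -o poolstop poolstop.c -lpthread"; obviously just a toy to check
the logic, not the actual fix.)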