Date: Thu, 11 Apr 2019 06:56:22 +0200 From: Mateusz Guzik <mjguzik@gmail.com> To: Rick Macklem <rmacklem@uoguelph.ca> Cc: "kib@freebsd.org" <kib@freebsd.org>, "freebsd-current@FreeBSD.org" <freebsd-current@freebsd.org> Subject: Re: Do the pidhashtbl_locks added by r340742 need to be sx locks? Message-ID: <CAGudoHHANiZQN_2BtYDB9vr2W0rez8pFeZjCbW_tPj3028avcg@mail.gmail.com> In-Reply-To: <QB1PR01MB35374204FA4B6764F34BFA23DD2F0@QB1PR01MB3537.CANPRD01.PROD.OUTLOOK.COM> References: <QB1PR01MB35371458E0E88BA357633D50DD2F0@QB1PR01MB3537.CANPRD01.PROD.OUTLOOK.COM> <CAGudoHGOyQVjrxt=555wsC_VQruZBnrA_jMjnXpLN8N9V%2B7-Xw@mail.gmail.com> <QB1PR01MB35374204FA4B6764F34BFA23DD2F0@QB1PR01MB3537.CANPRD01.PROD.OUTLOOK.COM>
next in thread | previous in thread | raw e-mail | index | archive | help
On 4/11/19, Rick Macklem <rmacklem@uoguelph.ca> wrote: > Mateusz Guzik wrote: >>On 4/11/19, Rick Macklem <rmacklem@uoguelph.ca> wrote: >>> Hi, >>> >>> I finally got around to looking at what effect replacing pfind_locked() >>> with >>> pfind() has for the NFSv4 client and it is broken. >>> >>> The problem is that the NFS code needs to call some variant of "pfind()" >>> while >>> holding a mutex lock. The current _pfind() code uses the >>> pidhashtbl_locks, >>> which are "sx" locks. >>> >>> There are a few ways to fix this: >>> 1 - Create a custom version of _pfind() for the NFS client with the >>> sx_X() >>> calls >>> removed, plus replace the locking of allproc_lock with locking of >>> all >>> the >>> pidhashtbl_locks, so that the "sx" locks are acquired before the >>> mutex. >>> --> Not very efficient, but since it is only done once/sec, I can >>> live >>> with it. >>> 2 - Similar to the above, but still lock the allproc_lock and use a loop >>> of >>> FOREACH_PROC_IN_SYSTEM(p) instead of a hash list for the pid in the >>> custom pfind(). (I don't know if this would be preferable to >>> locking >>> all >>> the pidhashtbl_locks for other users of pfind()?) >>> 3 - Convert the pidhashtbl_locks to mutexes. Then the NFS client doesn't >>> need >>> to acquire any proc related locks and it just works. >>> I can't see anywhere that "sleeps" while holding the >>> pidhashtbl_locks, >>> so I >>> think they can be converted, although I haven't tried it yet? >>> >>> From my perspective, #3 seems the better solution. >>> What do others think? >>> >> >>Changing the lock type to rwlock may be doable and worthwhile on its own, >>but I don't think it would constitute the right fix. >> >>Preferably there would be an easy to use mechanism which allows >>registering per-process callbacks. Currently it can be somewhat emulated >>with EVENTHANDLERs, but it would give calls for all exiting processes, not >>only the ones of interest. Then there would be no need to periodically >>scan as you would always get notified on process exit. > Long ago when I first did the NFSv4 code for OpenBSD2.6, I had a callback > function > pointer in "struct proc" which the NFS code set non-null to get a callback. > { The code still has remnants of that because it still has > nfscl_cleanup_common(), > which was code shared by that callback and this approach which was used > for > the Mac OS X port, where I couldn't change "struct proc". } > I have never added anything like that for FreeBSD, but I suppose we could > look > at doing it that way. > To be honest, since the current code works fine and can be difficult to test > well, > I hesitate to change to using a callback. > >>Note the current code does not ref processes it is interested in any >>manner and just performs a timestamp check to see if it got the one it >>expected (with pid reuse in mind). >> >>So I think a temporary hack which will do the trick will take the current >>approach further: rely on struct proc being type-stable (i.e. never being >>freed) and also store the pointer. You can always safely PROC_LOCK it, do >>checks to see the proc is alive and has the right timestamp... > Hmm, so you are saying that every element of the proc_zone always has a > valid > p_mtx field in it that can be safely PROC_LOCK()'d no matter if the element > refers to a process at that time? > I would also need help with the code to determine if the structure refers > to > a process that currently exists with the same pid and creation time. You can just: PROC_LOCK(p); if (p->p_state != PRS_NORMAL) { /* it's not the one you want, bail */ } /* the current timestamp check goes here */ PROC_UNLOCK(p); > > I suppose saving "p" with the lock/open owner string and then doing what > you > suggest is possible, but it would take some work. > > For now, I can just grab all the pidhashtbl_locks once/sec and fix head so > it works. > -- Mateusz Guzik <mjguzik gmail.com>
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAGudoHHANiZQN_2BtYDB9vr2W0rez8pFeZjCbW_tPj3028avcg>