Date: Thu, 15 Jan 2015 13:14:39 +0100 From: Hans Petter Selasky <hps@selasky.org> To: Konstantin Belousov <kostikbel@gmail.com> Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org Subject: Re: svn commit: r277199 - in head/sys: fs/devfs kern Message-ID: <54B7AF2F.3080802@selasky.org> In-Reply-To: <20150115115123.GA42409@kib.kiev.ua> References: <201501142207.t0EM7Dfn041543@svn.freebsd.org> <20150115033109.GM42409@kib.kiev.ua> <54B76F2B.8040106@selasky.org> <20150115093841.GX42409@kib.kiev.ua> <54B79B25.3070707@selasky.org> <20150115115123.GA42409@kib.kiev.ua>
next in thread | previous in thread | raw e-mail | index | archive | help
On 01/15/15 12:51, Konstantin Belousov wrote: > On Thu, Jan 15, 2015 at 11:49:09AM +0100, Hans Petter Selasky wrote: >> >> I see no leakage in that case either! > Because these cases come through destroy_dev(). > >> >> Are there more cases which I don't see? > You are breaking existig devfs KPI by your hack. You introduce yet another > reference on the device, which is not supposed to be there. Hi Konstantin, I need a non-sleeping way to say a character device is no longer supposed to be used and be able to re-use the device name right away creating a new device. I guess you got that. > > If some code calls delist_dev(), it could be said that it is a contract > of the new function that destroy_dev() must be called eventually on > the cdev. Then, the reference could be said to be shared-owned by > delist_dev() and destroy_dev(). But, for arbitrary devfs user this new > reference is unacceptable and breaks interface. delist_dev() changes no references. It can be called multiple times even, also inside destroy_devl(). Also I think that the "destroy_dev_sched_cbl()" function should call delist_dev() first so that we don't have a time from when the "destroy_dev_sched_cbl()" function is called where the device entry still exists in devfs mounts until the final destroy_devl() is done by a taskqueue. >>>> In the case of direct free through #1, the reference count is ignored >>>> and it doesn't matter if it is one or zero. Only in the case of >>>> destruction through destroy_dev() it matters. >>>> >>>> Like the comment says in destroy_devl(): >>>> >>>> /* Avoid race with dev_rel() */ >>>> >>>> The problem is that the "cdev->si_refcount" is zero when the initial >>>> devfs_create() is called. Then one ref is made. When we clear the >>>> CDP_ACTIVE flag in devfs_destroy() it instructs a !parallel! running >>>> process to destroy all the FS related structures and the reference count >>>> goes back to zero when the "cdp" is removed from the "cdevp_list". Then >>>> the cdev is freed too early. This happens because destroy_devl() is >>>> dropping the dev_lock() to sleep waiting for pending references. >>> Basically, this is very good explanation why your delist hack is wrong, >>> for one of the reason. Another reason is explained below. >>> You are trying to cover it with additional reference, but this is wrong >>> as well. >>> >>>> >>>> Do you see something else? >>> >>> I think that what you are trying to do with the CDP_ACTIVE hack is doomed >>> anyway, because you are allowing for devfs directory to have two entries >>> with the same name, until the populate loop cleans up the inactive one. >>> In the meantime, any access to the directory operates on random entry. >> >> The entry will not be random, because upon an open() call to a character >> device, I believe the devfs_lookup() function will be called, which >> always populate the devfs tree at first by calls to >> devfs_populate_xxx(). Any delisted devices which don't have the >> "CDP_ACTIVE" bit set, will never be seen by any open. > Entry can be random, since after the populate loop is ran, your code in > other thread could start and create duplicate entry. There is a window > in the lookup where both directory vnode lock and mount point sx locks > are dropped. So running the populate does not guarantee anything. If there is such a race, it is already there! My patch changes nothing in that area: Thread1: Calls destroy_dev() and clears CDP_ACTIVE, after dev_unlock() it goes waiting for some refs for xxx milliseconds. Thread2: Tries to create create a new character device having the same name like the one in thread1. Device name duplication check is missed because CDP_ACTIVE is cleared. Still thread1 is waiting. Thread3: Tries to open character device created by thread2 while thread1 is still waiting for some ref held by a userspace app to go away. This can happen already before my patches! What do you think? > >> >> Regarding leftover filedescriptors which still access the old "cdev" >> this is not a problem, and these will be closed when the si_refcount >> goes to zero after the destroy_devl() call. >> >>> >>> The checks for existent names in make_dev() are performed for the reason, >>> and you makes the rounds to effectively ignore it. >>> >> >> These checks are still correct and don't conflict with my patch from >> what I can see. Else the existing destroy_devl() would also be broken >> even before my patch with regard to the "random" selection of character >> devices at open() from userspace. > > The checks are done to avoid duplicate names. Your patch makes these > checks ineffective (i.e. broken). At what level do you mean duplicate names, I don't get this fully? At the directory level (DE nodes)? Or inside the list of character devices (LIST_XXX)? > Let me summarize: > - the extra reference on arbitrary cdev should be eliminated. The > delist_dev_locked() may add the ref and set some CDP_ flag to > indicate to destroy_dev() that it should do dev_rel(). It is possible to do this. I thought about this before doing my patch, but decided to try to avoid adding yet another cdev flag. > - the existence of the duplicated entries should be either eliminated > (I am not sure it is possible with your code), or we must ensure > that only one name with CDP_ACTIVE flag set and given name exists. > Then, all lookup code must be audited to take CDP_ACTIVE into account > when accessing names. I see at least devfs_find() and devfs_mknod() > which must be changed. I did not performed full audit. I will check this path out aswell. --HPS
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?54B7AF2F.3080802>