Date: Thu, 17 Dec 2009 09:03:54 +0100 From: Alexander Leidinger <Alexander@Leidinger.net> To: fs@freebsd.org, stable@freebsd.org Cc: pjd@freebsd.org Subject: 57 ZFS patches not merged to RELENG_7 Message-ID: <20091217090354.98634ouizsftffk0@webmail.leidinger.net>
next in thread | raw e-mail | index | archive | help
Hi, I identified at least 57 patches which are in 8-stable, but not in 7-stable. Does someone know the status of those patches (listed below)? Maybe not all are applicable, but some of them should really get merged. I also have seen some things which I think are mismerges to 7-stable, but I do not have a list/diff of them at hand (I diffed 7-stable and 8-stable and those things where "in the noise" of the stuff which is not merged yet, one of the things I find directly now is two times the same line with kmem_cache_create of the zio_cache in cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c). If those people (ok, mostly pjd) which handle ZFS stuff normally do not have time to take care about this, are there people interested in helping me merge those things? Basically this means trying to apply the patches listed below on 7-stable, and testing them (or patches from other people, in case you are not able to merge a patch yourself) on a scratch box (I do not have a scratch-box with 7-stable around). It also means reviewing the patches regarding possible issues (I already identified some which need investigation, see below). Suggestions where those things should be coordinated in this case (on fs@ or stable@ or in the FreeBSD Wiki)? Below is the list of patches which I identified. I didn't had a look which one depends on which one, but there are for sure dependencies. The format is URL of the patch in head committer which committed it - my comment about the patch commit log Bye, Alexander. http://svn.freebsd.org/viewvc/base?view=revision&revision=185310 ganbold - very easy merge ---snip--- Remove unused variable. Found with: Coverity Prevent(tm) CID: 3669,3671 ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=185319 pjd - applicable to RELENG_7? ---snip--- Fix locking (file descriptor table and Giant around VFS). ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=192689 trasz - very easy merge ---snip--- Fix comment. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=193110 kmacy - easy merge ---snip--- work around snapshot shutdown race reported by Henri Hennebert ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=193128 kmacy - first probably, second and 3rd to check ---snip--- fix xdrmem_control to be safe in an if statement fix zfs to depend on krpc remove xdr from zfs makefile ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=193440 ps - shared vnode locks available in RELENG_7? vn_lock same syntax? ---snip--- Support shared vnode locks for write operations when the offset is provided on filesystems that support it. This really improves mysql + innodb performance on ZFS. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=194043 kmacy - sysctl API change? ---snip--- pjd has requested that I keep the tunable as zfs_prefetch_disable to minimize gratuitous differences with Opensolaris' ZFS ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=195627 marcel - easy merge ---snip--- In nvpair_native_embedded_array(), meaningless pointers are zeroed. The programmer was aware that alignment was not guaranteed in the packed structure and used bzero() to NULL out the pointers. However, on ia64, the compiler is quite agressive in finding ILP and calls to bzero() are often replaced by simple assignments (i.e. stores). Especially when the width or size in question corresponds with a store instruction (i.e. st1, st2, st4 or st8). The problem here is not a compiler bug. The address of the memory to zero-out was given by '&packed->nvl_priv' and given the type of the 'packed' pointer the compiler could assume proper alignment for the replacement of bzero() with an 8-byte wide store to be valid. The problem is with the programmer. The programmer knew that the address did not have the alignment guarantees needed for a regular assignment, but failed to inform the compiler of that fact. In fact, the programmer told the compiler the opposite: alignment is guaranteed. The fix is to avoid using a pointer of type "nvlist_t *" and instead use a "char *" pointer as the basis for calculating the address. This tells the compiler that only 1-byte alignment can be assumed and the compiler will either keep the bzero() call or instead replace it with a sequence of byte-wise stores. Both are valid. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=195785 trasz - extattr_check_cred same sytax in RELENG_7? Necessary? ---snip--- Fix permission handling for extended attributes in ZFS. Without this change, ZFS uses SunOS Alternate Data Streams semantics - each EA has its own permissions, which are set at EA creation time and - unlike SunOS - invisible to the user and impossible to change. From the user point of view, it's just broken: sometimes access is granted when it shouldn't be, sometimes it's denied when it shouldn't be. This patch makes it behave just like UFS, i.e. depend on current file permissions. Also, it fixes returned error codes (ENOATTR instead of ENOENT) and makes listextattr(2) return 0 instead of EPERM where there is no EA directory (i.e. the file never had any EA). ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=195822 trasz - easy merge ---snip--- Fix extattr_list_file(2) on ZFS in case the attribute directory doesn't exist and user doesn't have write access to the file. Without this fix, it returns bogus value instead of 0. For some reason this didn't manifest on my kernel compiled with -O0. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=195909 pjd - very easy merge ---snip--- We don't support ephemeral IDs in FreeBSD and without this fix ZFS can panic when in zfs_fuid_create_cred() when userid is negative. It is converted to unsigned value which makes IS_EPHEMERAL() macro to incorrectly report that this is ephemeral ID. The most reasonable solution for now is to always report that the given ID is not ephemeral. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196291 pjd - probably easy merge ---snip--- - Fix a race where /dev/zfs control device is created before ZFS is fully initialized. Also destroy /dev/zfs before doing other deinitializations. - Initialization through taskq is no longer needed and there is a race where one of the zpool/zfs command loads zfs.ko and tries to do some work immediately, but /dev/zfs is not there yet. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196269 marcel - easy merge ---snip--- Fix misalignment in nvpair_native_embedded() caused by the compiler replacing the bzero(). See also revision 195627, which fixed the misalignment in nvpair_native_embedded_array(). ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196295 pjd - the added stuff needs to be reviewed, taskqueue available & same syntax? ---snip--- Remove OpenSolaris taskq port (it performs very poorly in our kernel) and replace it with wrappers around our taskqueue(9). To make it possible implement taskqueue_member() function which returns 1 if the given thread was created by the given taskqueue. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196297 pjd - easy merge ---snip--- Fix panic in zfs recv code. The last vnode (mountpoint's vnode) can have 0 usecount. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196299 pjd - VI_UNLOCK same syntax in RELENG_7? ---snip--- - We need to recycle vnode instead of freeing znode. Submitted by: avg - Add missing vnode interlock unlock. - Remove redundant znode locking. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196301 pjd - probably easy merge ---snip--- If z_buf is NULL, we should free znode immediately. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196307 pjd - to be reviewed ---snip--- Manage asynchronous vnode release just like Solaris. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196309 pjd - vhold/VN_RELE/zfsctl_root_lookup/vop_vptocnp same syntax in RELENG_7? ---snip--- getcwd() (when __getcwd() fails) works by stating current directory, going up (..), calling readdir and looking for previous directory inode. In case of .zfs/ directory this doesn't work, because .zfs/ is hidden by default, so it won't be visible in readdir output. Fix this by implementing VPTOCNP for snapshot directories, so __getcwd() doesn't fail and getcwd() doesn't have to use readdir method. This fixes /bin/pwd from within .zfs/snapshot/<name>/. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196395 pjd - applicable to RELENG_7 (same XDR code?)? ---snip--- Our libc doesn't implement control method for XDR (only kernel does) and it will always return failure. Fix this by bringing userland implementation of xdrmem_control() back. This allow 'zpool import' to work again. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196456 pjd - kproc_create the same syntax on RELENG_7? ---snip--- - Give minclsyspri and maxclsyspri real values (consulted with kmacy). - Honour 'pri' argument for thread_create(). ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196457 pjd - easy merge, probably depends upon 196457 ---snip--- Set priority of vdev_geom threads and zvol threads to PRIBIO. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196458 pjd - kproc_kthread_add available in RELENG_7? same syntax? ---snip--- - Hide ZFS kernel threads under zfskern process. - Use better (shorter) threads names: 'zvol:worker zvol/tank/vol00' -> 'zvol tank/vol00' 'vdev:worker da0' -> 'vdev da0' ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196662 pjd - vn_lock/VOP_UNLOCK syntax the same in RELENG_7 (add curthread?)? ---snip--- Add missing mountpoint vnode locking. This fixes panic on assertion with DEBUG_VFS_LOCKS and vfs.usermount=1 when regular user tries to mount dataset owned by him. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196703 pjd - probably easy merge, KBI change (dnode)? ---snip--- Backport the 'dirtying dbuf' panic fix from newer ZFS version. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196919 pjd - very easy merge ---snip--- bzero() on-stack argument, so mutex_init() won't misinterpret that the lock is already initialized if we have some garbage on the stack. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196927 pjd - easy merge ---snip--- Changing provider size is not really supported by GEOM, but doing so when provider is closed should be ok. When administrator requests to change ZVOL size do it immediately if ZVOL is closed or do it on last ZVOL close. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196943 pjd - MNT_* same syntax in RELENG_7? ---snip--- - Avoid holding mutex around M_WAITOK allocations. - Add locking for mnt_opt field. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196944 pjd - easy merge ---snip--- Don't recheck ownership on update mount. This will eliminate LOR between vfs_busy() and mount mutex. We check ownership in vfs_domount() anyway. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196954 pjd - easy merge ---snip--- If we have to use avl_find(), optimize a bit and use avl_insert() instead of avl_add() (the latter is actually a wrapper around avl_find() + avl_insert()). Fix similar case in the code that is currently commented out. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196965 pjd - easy merge ---snip--- Fix reference count leak for a case where snapshot's mount point is updated. Such situation is not supported. This problem was triggered by something like this: # zpool create tank da0 # zfs snapshot tank@snap # cd /tank/.zfs/snapshot/snap (this will mount the snapshot) # cd # mount -u nosuid /tank/.zfs/snapshot/snap (refcount leak) # zpool export tank cannot export 'tank': pool is busy ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196980 pjd - VFS_ROOT/VN_RELE same syntax in RELENG_7? ---snip--- When we automatically mount snapshot we want to return vnode of the mount point from the lookup and not covered vnode. This is one of the fixes for using .zfs/ over NFS. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196982 pjd - vfs_checkexp/vfs_stdcheckexp same syntax in RELENG_7? ---snip--- We don't export individual snapshots, so mnt_export field in snapshot's mount point is NULL. That's why when we try to access snapshots over NFS use mnt_export field from the parent file system. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196985 pjd - easy merge ---snip--- Only log successful commands! Without this fix we log even unsuccessful commands executed by unprivileged users. Action is not really taken, but it is logged to pool history, which might be confusing. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=197133 pjd - do we have rw-locks in RELENG_7? ---snip--- - Protect reclaim with z_teardown_inactive_lock. - Be prepared for dbuf to disappear in zfs_reclaim_complete() and check if z_dbuf field is NULL - this might happen in case of rollback or forced unmount between zfs_freebsd_reclaim() and zfs_reclaim_complete(). - On forced unmount wait for all znodes to be destroyed - destruction can be done asynchronously via zfs_reclaim_complete(). ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=197151 pjd - easy merge ---snip--- Be sure not to overflow struct fid. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=197152 pjd - easy merge ---snip--- Extend scope of the z_teardown_lock lock for consistency and "just in case". ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=197153 pjd - easy merge ---snip--- When zfs.ko is compiled with debug, make sure that znode and vnode point at each other. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=197167 pjd - easy merge ---snip--- Work-around READDIRPLUS problem with .zfs/ and .zfs/snapshot/ directories by just returning EOPNOTSUPP. This will allow NFS server to fall back to regular READDIR. Note that converting inode number to snapshot's vnode is expensive operation. Snapshots are stored in AVL tree, but based on their names, not inode numbers, so to convert inode to snapshot vnode we have to interate over all snalshots. This is not a problem in OpenSolaris, because in their READDIRPLUS implementation they use VOP_LOOKUP() on d_name, instead of VFS_VGET() on d_fileno as we do. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=197177 pjd - easy merge ---snip--- Support both case: when snapshot is already mounted and when it is not yet mounted. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=197201 pjd - VOP_UNLOCK/VI_LOCK/VI_UNLOCK same syntax in RELENG_7 (add curthread?)? ---snip--- - Mount ZFS snapshots with MNT_IGNORE flag, so they are not visible in regular df(1) and mount(8) output. This is a bit smilar to OpenSolaris and follows ZFS route of not listing snapshots by default with 'zfs list' command. - Add UPDATING entry to note that ZFS snapshots are no longer visible in mount(8) and df(1) output by default. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=197351 pjd - easy merge ---snip--- Purge namecache in the same place OpenSolaris does. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=197458 pjd - probably easy merge ---snip--- Close race in zfs_zget(). We have to increase usecount first and then check for VI_DOOMED flag. Before this change vnode could be reclaimed between checking for the flag and increasing usecount. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=197459 pjd - probably easy merge ---snip--- Before calling vflush(FORCECLOSE) mark file system as unmounted so the following vnops will fail. This is very important, because without this change vnode could be reclaimed at any point, even if we increased usecount. The only way to ensure that vnode won't be reclaimed was to lock it, which would be very hard to do in ZFS without changing a lot of code. With this change simply increasing usecount is enough to be sure vnode won't be reclaimed from under us. To be precise it can still be reclaimed but we won't be able to see it, because every try to enter ZFS through VFS will result in EIO. The only function that cannot return EIO, because it is needed for vflush() is zfs_root(). Introduce ZFS_ENTER_NOERROR() macro that only locks z_teardown_lock and never returns EIO. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=197497 pjd - easy merge, implications for existing pools/data? ---snip--- Switch to fletcher4 as the default checksum algorithm. Fletcher2 was proven to be a bit weak and OpenSolaris also switched to fletcher4. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=197512 pjd - VI_UNLOCK same syntax in RELENG_8? ---snip--- - Don't depend on value returned by gfs_*_inactive(), it doesn't work well with forced unmounts when GFS vnodes are referenced. - Make other preparations to GFS for forced unmounts. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=197513 pjd - traverse/VN_RELE same syntax in RELENG_7? ---snip--- Use traverse() function to find and return mount point's vnode instead of covered vnode when snapshot is already mounted. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=197515 pjd - probably easy merge ---snip--- Handle cases where virtual (GFS) vnodes are referenced when doing forced unmount. In that case we cannot depend on the proper order of invalidating vnodes, so we have to free resources when we have a chance. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=197683 delphij - easy merge ---snip--- Return EOPNOTSUPP instead of EINVAL when doing chflags(2) over an old format ZFS, as defined in the manual page. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=197831 pjd - does the added variable cause a KBI change? ---snip--- Fix situation where Mac OS X NFS client creates a file and when it tries to set ownership and mode in the same setattr operation, the mode was overwritten by secpolicy_vnode_setattr(). ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=197861 pjd - priv_check available / VOP_ACCESS same syntax in RELENG_7? ---snip--- Allow file system owner to modify system flags if securelevel permits. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=198703 pjd - vaccess same syntax on RELENG_7? ---snip--- - zfs_zaccess() can handle VAPPEND too, so map V_APPEND to VAPPEND and call zfs_access() instead of vaccess() in this case as well. - If VADMIN is specified with another V* flag (unlikely) call both zfs_access() and vaccess() after spliting V* flags. This fixes "dirtying snapshot!" panic. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=199156 pjd - probably easy merge (struct change, KBI implications?) ---snip--- Avoid passing invalid mountpoint to getnewvnode(). ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=199157 pjd - important, maybe security, probably easy merge ---snip--- Be careful which vattr fields are set during setattr replay. Without this fix strange things can appear after unclean shutdown like files with mode set to 07777. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=200124 pjd - very easy merge ---snip--- Avoid using additional variable for storing an error if we are not going to do anything with it. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=200126 pjd - easy merge, prevents ZFS pools on ZVOLs ---snip--- Fix deadlock when ZVOLs are present and we are replacing dead component or calling scrub when pool is in a degraded state. It will try to taste ZVOLs, which will lead to deadlock, as ZVOL will try to acquire the same locks as replace/scrub is holding already. We can't simply skip provider based on their GEOM class, because ZVOL can have providers build on top of it and we need to skip those as well. We do it by asking for ZFS::iszvol attribute. Any ZVOL-based provider will give us positive answer and we have to skip those providers. This way we remove possibility to create ZFS pools on top of ZVOLs, but it is not very useful anyway. I believe deadlock is still possible in some very complex situations like when we have MD provider on top of UFS file on top of ZVOL. When we try to replace dead component in the pool mentioned ZVOL is based on, there might be a deadlock when ZFS will try to taste MD provider. There is no easy way to detect that, but it isn't very common. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=200158 pjd - easy merge ---snip--- We have to eventually look for provider without checking guid as this is need for attaching when there is no metadata yet. Before r200125 the order of looking for providers was wrong. It was: 1. Find provider by name. 2. Find provider by guid. 3. Find provider by name and guid. Where it should have been: 1. Find provider by name and guid. 2. Find provider by guid. 3. Find provider by name. ---snip--- -- http://www.Leidinger.net Alexander @ Leidinger.net: PGP ID = B0063FE7 http://www.FreeBSD.org netchild @ FreeBSD.org : PGP ID = 72077137
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20091217090354.98634ouizsftffk0>