From owner-freebsd-fs@FreeBSD.ORG Fri Aug 29 00:02:55 2008
From: swell.k@gmail.com
To: freebsd-fs@FreeBSD.org
Cc: Attilio Rao, Pawel Jakub Dawidek
Subject: Re: ZFS patches.
References: <20080727125413.GG1345@garage.freebsd.pl>
Date: Fri, 29 Aug 2008 03:29:58 +0400
Message-ID: <86tzd490qx.fsf@gmail.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.0.60 (berkeley-unix)
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="

--=-=-=

(CC'ing Attilio, who made the commits)

Pawel Jakub Dawidek writes:

> Hi.
>
> http://people.freebsd.org/~pjd/patches/zfs_20080727.patch.bz2
>
> The patch above contains the most recent ZFS version that could be found
> in OpenSolaris as of today. Apart from a large amount of new functionality,
> I believe there are many stability (and also performance) improvements
> compared to the version from the base system.
[...]

After r182371 and r182383 the patch produces three more rejections, namely:

  cddl/contrib/opensolaris/lib/libzpool/common/sys/zfs_context.h.rej
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_file.c.rej
  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_replay.c.rej

I'm attaching them in case someone has a quick fix or an idea of how to
resolve them, especially the `+' lines. In the meantime I'm reverting those
two revisions locally and hoping that does me no harm. If that fails, I will
stay at r182370, since I have already upgraded my pools to version 11 and
can't easily go back.

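For anyone looking at the `+' lines in zfs_context.h.rej: the userland
VOP_GETATTR stub there takes four arguments (vp, vap, cr, td), never
evaluates the last two, copies v_size into va_size and yields 0 through the
comma operator. Below is a minimal standalone sketch of that behaviour. The
cut-down vnode_t/vattr_t types and the helper stub_file_size() are
hypothetical and only exist for this illustration; this is not the libzpool
header itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down types: only the fields the stub macro touches. */
typedef struct vnode {
	uint64_t v_size;	/* file size recorded on the vnode */
} vnode_t;

typedef struct vattr {
	unsigned int va_mask;	/* bit-mask of attributes */
	uint64_t va_size;	/* file size in bytes */
} vattr_t;

#define	AT_SIZE	0x00080		/* value taken from the `+' lines */

/*
 * Four-argument stub as it appears in the `+' lines of the reject.
 * cr and td are accepted but never evaluated; the comma operator makes
 * the whole expression evaluate to 0 ("no error").
 */
#define	VOP_GETATTR(vp, vap, cr, td)	((vap)->va_size = (vp)->v_size, 0)

/*
 * Hypothetical helper: ask for the size through the stub, using the same
 * argument order as the call in the vdev_file.c.rej `+' lines.
 */
uint64_t
stub_file_size(uint64_t size)
{
	vnode_t vn = { .v_size = size };
	vattr_t vattr = { .va_mask = AT_SIZE, .va_size = 0 };
	int error;

	error = VOP_GETATTR(&vn, &vattr, NULL, NULL);
	assert(error == 0);	/* the stub always reports success */
	return (vattr.va_size);
}
```

Calling stub_file_size(4096) returns 4096. Since the stub ignores its last
two arguments, a call site only has to agree with the macro on the number of
arguments, which is where the old and new forms in the rejects differ.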
--=-=-=
Content-Disposition: attachment; filename=zfs_context.h.rej
Content-Description: zfs_context.h.rej

***************
*** 331,374 ****
  	char *v_path;
  } vnode_t;
  typedef struct vattr {
  	uint_t va_mask; /* bit-mask of attributes */
  	u_offset_t va_size; /* file size in bytes */
  } vattr_t;
- #define AT_TYPE 0x0001
- #define AT_MODE 0x0002
- #define AT_UID 0x0004
- #define AT_GID 0x0008
- #define AT_FSID 0x0010
- #define AT_NODEID 0x0020
- #define AT_NLINK 0x0040
- #define AT_SIZE 0x0080
- #define AT_ATIME 0x0100
- #define AT_MTIME 0x0200
- #define AT_CTIME 0x0400
- #define AT_RDEV 0x0800
- #define AT_BLKSIZE 0x1000
- #define AT_NBLOCKS 0x2000
- #define AT_SEQ 0x8000
  #define CRCREAT 0
- #define VOP_CLOSE(vp, f, c, o, cr) 0
- #define VOP_PUTPAGE(vp, of, sz, fl, cr) 0
- #define VOP_GETATTR(vp, vap, fl, cr) ((vap)->va_size = (vp)->v_size, 0)
- #define VOP_FSYNC(vp, f, cr) fsync((vp)->v_fd)
- #define VN_RELE(vp) vn_close(vp)
  extern int vn_open(char *path, int x1, int oflags, int mode, vnode_t **vpp,
      int x2, int x3);
  extern int vn_openat(char *path, int x1, int oflags, int mode, vnode_t **vpp,
-     int x2, int x3, vnode_t *vp);
  extern int vn_rdwr(int uio, vnode_t *vp, void *addr, ssize_t len,
      offset_t offset, int x1, int x2, rlim64_t x3, void *x4, ssize_t *residp);
- extern void vn_close(vnode_t *vp);
  #define vn_remove(path, x1, x2) remove(path)
  #define vn_rename(from, to, seg) rename((from), (to))
--- 347,439 ----
  	char *v_path;
  } vnode_t;
+
+ typedef struct xoptattr {
+ 	timestruc_t xoa_createtime; /* Create time of file */
+ 	uint8_t xoa_archive;
+ 	uint8_t xoa_system;
+ 	uint8_t xoa_readonly;
+ 	uint8_t xoa_hidden;
+ 	uint8_t xoa_nounlink;
+ 	uint8_t xoa_immutable;
+ 	uint8_t xoa_appendonly;
+ 	uint8_t xoa_nodump;
+ 	uint8_t xoa_settable;
+ 	uint8_t xoa_opaque;
+ 	uint8_t xoa_av_quarantined;
+ 	uint8_t xoa_av_modified;
+ } xoptattr_t;
+
  typedef struct vattr {
  	uint_t va_mask; /* bit-mask of attributes */
  	u_offset_t va_size; /* file size in bytes */
  } vattr_t;
+
+ typedef struct xvattr {
+ 	vattr_t xva_vattr; /* Embedded vattr structure */
+ 	uint32_t xva_magic; /* Magic Number */
+ 	uint32_t xva_mapsize; /* Size of attr bitmap (32-bit words) */
+ 	uint32_t *xva_rtnattrmapp; /* Ptr to xva_rtnattrmap[] */
+ 	uint32_t xva_reqattrmap[XVA_MAPSIZE]; /* Requested attrs */
+ 	uint32_t xva_rtnattrmap[XVA_MAPSIZE]; /* Returned attrs */
+ 	xoptattr_t xva_xoptattrs; /* Optional attributes */
+ } xvattr_t;
+
+ typedef struct vsecattr {
+ 	uint_t vsa_mask; /* See below */
+ 	int vsa_aclcnt; /* ACL entry count */
+ 	void *vsa_aclentp; /* pointer to ACL entries */
+ 	int vsa_dfaclcnt; /* default ACL entry count */
+ 	void *vsa_dfaclentp; /* pointer to default ACL entries */
+ 	size_t vsa_aclentsz; /* ACE size in bytes of vsa_aclentp */
+ } vsecattr_t;
+
+ #define AT_TYPE 0x00001
+ #define AT_MODE 0x00002
+ #define AT_UID 0x00004
+ #define AT_GID 0x00008
+ #define AT_FSID 0x00010
+ #define AT_NODEID 0x00020
+ #define AT_NLINK 0x00040
+ #define AT_SIZE 0x00080
+ #define AT_ATIME 0x00100
+ #define AT_MTIME 0x00200
+ #define AT_CTIME 0x00400
+ #define AT_RDEV 0x00800
+ #define AT_BLKSIZE 0x01000
+ #define AT_NBLOCKS 0x02000
+ #define AT_SEQ 0x08000
+ #define AT_XVATTR 0x10000
  #define CRCREAT 0
+ #define VOP_CLOSE(vp, f, c, o, cr, ct) 0
+ #define VOP_PUTPAGE(vp, of, sz, fl, cr, ct) 0
+ #define VOP_GETATTR(vp, vap, cr, td) ((vap)->va_size = (vp)->v_size, 0)
+
+ #define VOP_FSYNC(vp, f, cr, ct) fsync((vp)->v_fd)
+ #define VN_RELE(vp) vn_close(vp, 0, NULL, NULL)
+ #define vn_lock(vp, type)
+ #define VOP_UNLOCK(vp, type)
+ #ifdef VFS_LOCK_GIANT
+ #undef VFS_LOCK_GIANT
+ #endif
+ #define VFS_LOCK_GIANT(mp) 0
+ #ifdef VFS_UNLOCK_GIANT
+ #undef VFS_UNLOCK_GIANT
+ #endif
+ #define VFS_UNLOCK_GIANT(vfslocked)
  extern int vn_open(char *path, int x1, int oflags, int mode, vnode_t **vpp,
      int x2, int x3);
  extern int vn_openat(char *path, int x1, int oflags, int mode, vnode_t **vpp,
+     int x2, int x3, vnode_t *vp, int fd);
  extern int vn_rdwr(int uio, vnode_t *vp, void *addr, ssize_t len,
      offset_t offset, int x1, int x2, rlim64_t x3, void *x4, ssize_t *residp);
+ extern void vn_close(vnode_t *vp, int openflag, cred_t *cr, kthread_t *td);
  #define vn_remove(path, x1, x2) remove(path)
  #define vn_rename(from, to, seg) rename((from), (to))

--=-=-=
Content-Disposition: attachment; filename=vdev_file.c.rej
Content-Description: vdev_file.c.rej

***************
*** 81,91 ****
  	}
  #endif
  	/*
  	 * Determine the physical size of the file.
  	 */
  	vattr.va_mask = AT_SIZE;
- 	error = VOP_GETATTR(vp, &vattr, 0, kcred);
  	if (error) {
  		vd->vdev_stat.vs_aux = VDEV_AUX_OPEN_FAILED;
  		return (error);
--- 81,110 ----
  	}
  #endif
+ 	return (0);
+ }
+
+ static int
+ vdev_file_open(vdev_t *vd, uint64_t *psize, uint64_t *ashift)
+ {
+ 	vdev_file_t *vf;
+ 	vattr_t vattr;
+ 	vnode_t *vp;
+ 	int error;
+
+ 	if ((error = vdev_file_open_common(vd)) != 0)
+ 		return (error);
+
+ 	vf = vd->vdev_tsd;
+ 	vp = vf->vf_vnode;
+
  	/*
  	 * Determine the physical size of the file.
  	 */
  	vattr.va_mask = AT_SIZE;
+ 	vn_lock(vp, LK_SHARED | LK_RETRY);
+ 	error = VOP_GETATTR(vp, &vattr, kcred, curthread);
+ 	VOP_UNLOCK(vp, 0);
  	if (error) {
  		vd->vdev_stat.vs_aux = VDEV_AUX_OPEN_FAILED;
  		return (error);

--=-=-=
Content-Disposition: attachment; filename=zfs_replay.c.rej
Content-Description: zfs_replay.c.rej

***************
*** 352,386 ****
  		return (error);
  	}
- 	zfs_init_vattr(&va, lr->lr_mask, lr->lr_mode,
  	    lr->lr_uid, lr->lr_gid, 0, lr->lr_foid);
- 	va.va_size = lr->lr_size;
- 	ZFS_TIME_DECODE(&va.va_atime, lr->lr_atime);
- 	ZFS_TIME_DECODE(&va.va_mtime, lr->lr_mtime);
  	vp = ZTOV(zp);
  	vn_lock(vp, LK_EXCLUSIVE | LK_RETRY);
- 	error = VOP_SETATTR(vp, &va, kcred, curthread);
  	VOP_UNLOCK(vp, 0);
  	VN_RELE(vp);
  	return (error);
  }
  static int
- zfs_replay_acl(zfsvfs_t *zfsvfs, lr_acl_t *lr, boolean_t byteswap)
  {
  	ace_t *ace = (ace_t *)(lr + 1); /* ace array follows lr_acl_t */
  #ifdef TODO
  	vsecattr_t vsa;
- #endif
  	znode_t *zp;
  	int error;
  	if (byteswap) {
  		byteswap_uint64_array(lr, sizeof (*lr));
- 		zfs_ace_byteswap(ace, lr->lr_aclcnt);
  	}
  	if ((error = zfs_zget(zfsvfs, lr->lr_foid, &zp)) != 0) {
--- 766,877 ----
  		return (error);
  	}
+ 	zfs_init_vattr(vap, lr->lr_mask, lr->lr_mode,
  	    lr->lr_uid, lr->lr_gid, 0, lr->lr_foid);
+ 	vap->va_size = lr->lr_size;
+ 	ZFS_TIME_DECODE(&vap->va_atime, lr->lr_atime);
+ 	ZFS_TIME_DECODE(&vap->va_mtime, lr->lr_mtime);
+
+ 	/*
+ 	 * Fill in xvattr_t portions if necessary.
+ 	 */
+
+ 	start = (lr_setattr_t *)(lr + 1);
+ 	if (vap->va_mask & AT_XVATTR) {
+ 		zfs_replay_xvattr((lr_attr_t *)start, &xva);
+ 		start = (caddr_t)start +
+ 		    ZIL_XVAT_SIZE(((lr_attr_t *)start)->lr_attr_masksize);
+ 	} else
+ 		xva.xva_vattr.va_mask &= ~AT_XVATTR;
+
+ 	zfsvfs->z_fuid_replay = zfs_replay_fuid_domain(start, &start,
+ 	    lr->lr_uid, lr->lr_gid);
  	vp = ZTOV(zp);
  	vn_lock(vp, LK_EXCLUSIVE | LK_RETRY);
+ 	error = VOP_SETATTR(vp, vap, kcred, curthread);
  	VOP_UNLOCK(vp, 0);
+
+ 	zfs_fuid_info_free(zfsvfs->z_fuid_replay);
+ 	zfsvfs->z_fuid_replay = NULL;
  	VN_RELE(vp);
  	return (error);
  }
  static int
+ zfs_replay_acl_v0(zfsvfs_t *zfsvfs, lr_acl_v0_t *lr, boolean_t byteswap)
  {
  	ace_t *ace = (ace_t *)(lr + 1); /* ace array follows lr_acl_t */
+ 	vsecattr_t vsa;
+ 	znode_t *zp;
+ 	int error;
+
+ 	if (byteswap) {
+ 		byteswap_uint64_array(lr, sizeof (*lr));
+ 		zfs_oldace_byteswap(ace, lr->lr_aclcnt);
+ 	}
+
+ 	if ((error = zfs_zget(zfsvfs, lr->lr_foid, &zp)) != 0) {
+ 		/*
+ 		 * As we can log acls out of order, it's possible the
+ 		 * file has been removed. In this case just drop the acl
+ 		 * and return success.
+ 		 */
+ 		if (error == ENOENT)
+ 			error = 0;
+ 		return (error);
+ 	}
+
+ 	bzero(&vsa, sizeof (vsa));
+ 	vsa.vsa_mask = VSA_ACE | VSA_ACECNT;
+ 	vsa.vsa_aclcnt = lr->lr_aclcnt;
+ 	vsa.vsa_aclentsz = sizeof (ace_t) * vsa.vsa_aclcnt;
+ 	vsa.vsa_aclflags = 0;
+ 	vsa.vsa_aclentp = ace;
+ #ifdef TODO
+ 	error = VOP_SETSECATTR(ZTOV(zp), &vsa, 0, kcred, NULL);
+ #else
+ 	panic("%s:%u: unsupported condition", __func__, __LINE__);
+ #endif
+
+ 	VN_RELE(ZTOV(zp));
+
+ 	return (error);
+ }
+
+ /*
+  * Replaying ACLs is complicated by FUID support.
+  * The log record may contain some optional data
+  * to be used for replaying FUID's. These pieces
+  * are the actual FUIDs that were created initially.
+  * The FUID table index may no longer be valid and
+  * during zfs_create() a new index may be assigned.
+  * Because of this the log will contain the original
+  * doman+rid in order to create a new FUID.
+  *
+  * The individual ACEs may contain an ephemeral uid/gid which is no
+  * longer valid and will need to be replaced with an actual FUID.
+  *
+  */
+ static int
+ zfs_replay_acl(zfsvfs_t *zfsvfs, lr_acl_t *lr, boolean_t byteswap)
+ {
+ 	ace_t *ace = (ace_t *)(lr + 1);
  	vsecattr_t vsa;
  	znode_t *zp;
  	int error;
  	if (byteswap) {
  		byteswap_uint64_array(lr, sizeof (*lr));
+ 		zfs_ace_byteswap(ace, lr->lr_acl_bytes, B_FALSE);
+ 		if (lr->lr_fuidcnt) {
+ 			byteswap_uint64_array((caddr_t)ace +
+ 			    ZIL_ACE_LENGTH(lr->lr_acl_bytes),
+ 			    lr->lr_fuidcnt * sizeof (uint64_t));
+ 		}
  	}
  	if ((error = zfs_zget(zfsvfs, lr->lr_foid, &zp)) != 0) {

--=-=-=--