From owner-freebsd-arch@FreeBSD.ORG Tue Nov 21 10:58:48 2006 Return-Path: X-Original-To: freebsd-arch@freebsd.org Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 28AFD16A403 for ; Tue, 21 Nov 2006 10:58:48 +0000 (UTC) (envelope-from yar@comp.chem.msu.su) Received: from comp.chem.msu.su (comp.chem.msu.su [158.250.32.97]) by mx1.FreeBSD.org (Postfix) with ESMTP id 9998243D4C for ; Tue, 21 Nov 2006 10:58:25 +0000 (GMT) (envelope-from yar@comp.chem.msu.su) Received: from comp.chem.msu.su (localhost [127.0.0.1]) by comp.chem.msu.su (8.13.4/8.13.3) with ESMTP id kALAwhx1033800 for ; Tue, 21 Nov 2006 13:58:43 +0300 (MSK) (envelope-from yar@comp.chem.msu.su) Received: (from yar@localhost) by comp.chem.msu.su (8.13.4/8.13.3/Submit) id kALAwhRJ033799 for freebsd-arch@freebsd.org; Tue, 21 Nov 2006 13:58:43 +0300 (MSK) (envelope-from yar) Date: Tue, 21 Nov 2006 13:58:42 +0300 From: Yar Tikhiy To: freebsd-arch@freebsd.org Message-ID: <20061121105842.GD13289@comp.chem.msu.su> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="SUOF0GtieIMvvwua" Content-Disposition: inline User-Agent: Mutt/1.5.9i Subject: Limitations in fts(3) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 21 Nov 2006 10:58:48 -0000 --SUOF0GtieIMvvwua Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Hi folks, Our FTSENT appears to have unjustified limits as it uses u_short for fts_pathlen and short for fts_level. Consequently, our fts(3) fails miserably on deep trees although there are plenty of resources (memory) available to it--see PR bin/104458. (Traversing unlimited trees using limited resources is a different topic.) Note that the fts_pathlen range is not limited by PATH_MAX because fts(3) constructs long paths by itself. To begin with, paths longer than PATH_MAX are quite useless because they can't be passed to syscalls. Bruce Evans proposed that we could have a new flag FTS_NOPATH to disable generating full pathnames at all. It can be good if used without FTS_NOCHDIR, as fts(3) will take the process to the proper directory so that it can access its entries by their relative pathnames. This way can relieve us of the fts_pathlen limitation without touching the structures from fts.h. However, there still is the fts_level limit. This field begs to be extended to int64_t, as we face all those yottabyte file systems. (But at least it could be int to match FTW.level from ftw(3).) NetBSD folks have already extended fts_pathlen and fts_namelen to u_int, but they forgot about fts_level. They also extended fts_number to int64_t. It looks like time for us to change our fts.h, too. Symbol versioning now even allows us to have both new and old fts(3) in the same libc if we want to. Here's a prototype patch extending fts fields. I tested it on i386, and it just worked. Of course, I'll test it on other arches if we decide this is the way to go. The string/array length fields become size_t because it's the only proper type for them. Note that the patch also extends flag fields to u_int in order to get rid of short types in fts.h entirely, since on modern arches they buy us neither memory economy nor computing speed. It's also a chance to extend fts_number to int64_t. According to Google Code, fts_bignum isn't used outside our src tree, so we can eventually drop it in favour of the extended fts_number. What do you think about all this? Thanks! -- Yar --SUOF0GtieIMvvwua Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename=fts-patch --- include/fts.h.orig Fri Jan 7 03:06:20 2005 +++ include/fts.h Mon Nov 20 17:04:48 2006 @@ -44,8 +44,8 @@ dev_t fts_dev; /* starting device # */ char *fts_path; /* path for this descent */ int fts_rfd; /* fd for root */ - int fts_pathlen; /* sizeof(path) */ - int fts_nitems; /* elements in the sort array */ + size_t fts_pathlen; /* sizeof(path) */ + size_t fts_nitems; /* elements in the sort array */ int (*fts_compar) /* compare function */ (const struct _ftsent * const *, const struct _ftsent * const *); @@ -69,22 +69,15 @@ struct _ftsent *fts_cycle; /* cycle node */ struct _ftsent *fts_parent; /* parent directory */ struct _ftsent *fts_link; /* next file in directory */ - union { - struct { - long __fts_number; /* local numeric value */ - void *__fts_pointer; /* local address value */ - } __struct_ftsent; - int64_t __fts_bignum; - } __union_ftsent; -#define fts_number __union_ftsent.__struct_ftsent.__fts_number -#define fts_pointer __union_ftsent.__struct_ftsent.__fts_pointer -#define fts_bignum __union_ftsent.__fts_bignum + int64_t fts_number; /* local numeric value */ +#define fts_bignum fts_number /* XXX non-std, should go away */ + void *fts_pointer; /* local address value */ char *fts_accpath; /* access path */ char *fts_path; /* root path */ int fts_errno; /* errno for this node */ int fts_symfd; /* fd for symlink */ - u_short fts_pathlen; /* strlen(fts_path) */ - u_short fts_namelen; /* strlen(fts_name) */ + size_t fts_pathlen; /* strlen(fts_path) */ + size_t fts_namelen; /* strlen(fts_name) */ ino_t fts_ino; /* inode */ dev_t fts_dev; /* device */ @@ -92,7 +85,7 @@ #define FTS_ROOTPARENTLEVEL -1 #define FTS_ROOTLEVEL 0 - short fts_level; /* depth (-1 to N) */ + int64_t fts_level; /* depth (-1 to N) */ #define FTS_D 1 /* preorder directory */ #define FTS_DC 2 /* directory that causes cycles */ @@ -108,18 +101,18 @@ #define FTS_SL 12 /* symbolic link */ #define FTS_SLNONE 13 /* symbolic link without target */ #define FTS_W 14 /* whiteout object */ - u_short fts_info; /* user flags for FTSENT structure */ + u_int fts_info; /* user flags for FTSENT structure */ #define FTS_DONTCHDIR 0x01 /* don't chdir .. to the parent */ #define FTS_SYMFOLLOW 0x02 /* followed a symlink to get here */ #define FTS_ISW 0x04 /* this is a whiteout object */ - u_short fts_flags; /* private flags for FTSENT structure */ + u_int fts_flags; /* private flags for FTSENT structure */ #define FTS_AGAIN 1 /* read node again */ #define FTS_FOLLOW 2 /* follow symbolic link */ #define FTS_NOINSTR 3 /* no instructions */ #define FTS_SKIP 4 /* discard node */ - u_short fts_instr; /* fts_set() instructions */ + u_int fts_instr; /* fts_set() instructions */ struct stat *fts_statp; /* stat(2) information */ char *fts_name; /* file name */ --- lib/libc/gen/fts.c.orig Tue Jun 8 10:23:23 2004 +++ lib/libc/gen/fts.c Mon Nov 20 17:17:39 2006 @@ -56,15 +56,15 @@ #include #include "un-namespace.h" -static FTSENT *fts_alloc(FTS *, char *, int); +static FTSENT *fts_alloc(FTS *, char *, size_t); static FTSENT *fts_build(FTS *, int); static void fts_lfree(FTSENT *); static void fts_load(FTS *, FTSENT *); static size_t fts_maxarglen(char * const *); static void fts_padjust(FTS *, FTSENT *); static int fts_palloc(FTS *, size_t); -static FTSENT *fts_sort(FTS *, FTSENT *, int); -static u_short fts_stat(FTS *, FTSENT *, int); +static FTSENT *fts_sort(FTS *, FTSENT *, size_t); +static u_int fts_stat(FTS *, FTSENT *, int); static int fts_safe_changedir(FTS *, FTSENT *, int, char *); static int fts_ufslinks(FTS *, const FTSENT *); @@ -119,9 +119,8 @@ struct _fts_private *priv; FTS *sp; FTSENT *p, *root; - int nitems; FTSENT *parent, *tmp; - int len; + size_t len, nitems; /* Options check. */ if (options & ~FTS_OPTIONMASK) { @@ -228,7 +227,7 @@ FTS *sp; FTSENT *p; { - int len; + size_t len; char *cp; /* @@ -630,12 +629,12 @@ { struct dirent *dp; FTSENT *p, *head; - int nitems; FTSENT *cur, *tail; DIR *dirp; void *oldaddr; - size_t dnamlen; - int cderrno, descend, len, level, maxlen, nlinks, oflag, saved_errno, + size_t dnamlen, len, maxlen, nitems; + int64_t level; + int cderrno, descend, nlinks, oflag, saved_errno, nostat, doadjust; char *cp; @@ -745,7 +744,7 @@ if (!ISSET(FTS_SEEDOT) && ISDOT(dp->d_name)) continue; - if ((p = fts_alloc(sp, dp->d_name, (int)dnamlen)) == NULL) + if ((p = fts_alloc(sp, dp->d_name, dnamlen)) == NULL) goto mem1; if (dnamlen >= maxlen) { /* include space for NUL */ oldaddr = sp->fts_path; @@ -774,21 +773,6 @@ maxlen = sp->fts_pathlen - len; } - if (len + dnamlen >= USHRT_MAX) { - /* - * In an FTSENT, fts_pathlen is a u_short so it is - * possible to wraparound here. If we do, free up - * the current structure and the structures already - * allocated, then error out with ENAMETOOLONG. - */ - free(p); - fts_lfree(head); - (void)closedir(dirp); - cur->fts_info = FTS_ERR; - SET(FTS_STOP); - errno = ENAMETOOLONG; - return (NULL); - } p->fts_level = level; p->fts_parent = sp->fts_cur; p->fts_pathlen = len + dnamlen; @@ -889,7 +873,7 @@ return (head); } -static u_short +static u_int fts_stat(sp, p, follow) FTS *sp; FTSENT *p; @@ -991,7 +975,7 @@ fts_sort(sp, head, nitems) FTS *sp; FTSENT *head; - int nitems; + size_t nitems; { FTSENT **ap, *p; @@ -1023,7 +1007,7 @@ fts_alloc(sp, name, namelen) FTS *sp; char *name; - int namelen; + size_t namelen; { FTSENT *p; size_t len; @@ -1095,18 +1079,6 @@ { sp->fts_pathlen += more + 256; - /* - * Check for possible wraparound. In an FTS, fts_pathlen is - * a signed int but in an FTSENT it is an unsigned short. - * We limit fts_pathlen to USHRT_MAX to be safe in both cases. - */ - if (sp->fts_pathlen < 0 || sp->fts_pathlen >= USHRT_MAX) { - if (sp->fts_path) - free(sp->fts_path); - sp->fts_path = NULL; - errno = ENAMETOOLONG; - return (1); - } sp->fts_path = reallocf(sp->fts_path, sp->fts_pathlen); return (sp->fts_path == NULL); } --SUOF0GtieIMvvwua--