From owner-freebsd-hackers@FreeBSD.ORG Fri Nov 22 17:04:20 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id A223A6F8; Fri, 22 Nov 2013 17:04:20 +0000 (UTC) Received: from mail.ambrisko.com (mail.ambrisko.com [70.91.206.90]) by mx1.freebsd.org (Postfix) with ESMTP id 7A5442124; Fri, 22 Nov 2013 17:04:20 +0000 (UTC) X-Ambrisko-Me: Yes Received: from server2.ambrisko.com (HELO internal.ambrisko.com) ([192.168.1.2]) by ironport.ambrisko.com with ESMTP; 22 Nov 2013 09:08:12 -0800 Received: from ambrisko.com (localhost [127.0.0.1]) by internal.ambrisko.com (8.14.4/8.14.4) with ESMTP id rAMH4Ji4070672; Fri, 22 Nov 2013 09:04:19 -0800 (PST) (envelope-from ambrisko@ambrisko.com) Received: (from ambrisko@localhost) by ambrisko.com (8.14.4/8.14.4/Submit) id rAMH4JRa070670; Fri, 22 Nov 2013 09:04:19 -0800 (PST) (envelope-from ambrisko) Date: Fri, 22 Nov 2013 09:04:19 -0800 From: Doug Ambrisko To: Konstantin Belousov Subject: Re: Re: Fix MNAMELEN or reimplement struct statfs Message-ID: <20131122170419.GA60910@ambrisko.com> References: <201306101152.17966.jhb@freebsd.org> <52854161.6080104@FreeBSD.org> <20131115010854.GA76106@ambrisko.com> <20131116183129.GD59496@kib.kiev.ua> <20131118190142.GA28210@ambrisko.com> <20131119074922.GY59496@kib.kiev.ua> <20131119174216.GA80753@ambrisko.com> <20131120075531.GE59496@kib.kiev.ua> <20131121174028.GA80520@ambrisko.com> <20131122074228.GT59496@kib.kiev.ua> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20131122074228.GT59496@kib.kiev.ua> User-Agent: Mutt/1.4.2.3i Cc: freebsd-hackers@freebsd.org, Dirk Engling , Jase Thew , mdf@freebsd.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.16 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Nov 2013 17:04:20 -0000 On Fri, Nov 22, 2013 at 09:42:28AM +0200, Konstantin Belousov wrote: | On Thu, Nov 21, 2013 at 09:40:28AM -0800, Doug Ambrisko wrote: | > On Wed, Nov 20, 2013 at 09:55:31AM +0200, Konstantin Belousov wrote: | > | On Tue, Nov 19, 2013 at 09:42:16AM -0800, Doug Ambrisko wrote: | > | > I was talking about the more general case since the system tries to keep | > | > the path in the stat structure. My prior approach which had more issues | > | > was to modify the stat structure of which I was pointed to NetBSD and their | > | > change to statvfs which doesn't really solve the problem. They don't | > | > have the check to see if the mount is longer then VFS_MNAMELEN (in their case) | > | > and just truncate things. | > | > | > | > If we are just talking about adding it to the mount structure that | > | > would be okay since it isn't exposed to user land. I can add that. | > | | > | Yes, this is exactly what I mean. Add a struct mount field, and use | > | it for kernel only. In fact, it only matters for sys_unmount() and | > | kern_jail.c, other locations in kernel use the path for warnings, and | > | this could be postponed if you prefer to minimize the patch. | > | > Okay, I went through all of the occurances and compile tested (except | > for #DEBUG). I united a few things but should do more once I get | > consensus on the approach. I found a few spots that should be updated as | > well and made the length check more consistant. Some were doing >= and others | > >. So this should be better, however, a lot larger. On the plus side | > when we figure out how to return the longer path length to user land | > that can be more flexible since the kernel is tracking the longer length. | > Probably things to note are changes in: | > ZFS to mount snapshot | > cd9660 for symlinks | > fuse to return full path | > jail to check statfs and mount | > mount/umount to save and check full path | > mountroot to save new field for full path | > | > Just in case it doesn't make it in email the full patch is at: | > http://people.freebsd.org/~ambrisko/mount_bigger.patch | > | Yes, this is closer to the patch I can agree with :). | | Two notes, one was already made, about off by one. The off by one, I want to revisit so that it is consistant. We have places in which there was checks if (strlen(fstype) >= MFSNAMELEN || strlen(fspath) >= MNAMELEN) and if (strlen(fstype) >= MFSNAMELEN - 1 || strlen(fspath) >= MNAMELEN - 1) both with the same comment of "Be ultra-paranoid". Unless something is special they should have been the same and whatever is right should be carried forward. If there is a special case then it should be clearly commented. Since this check has moved into other code we need to get it hashed out once and for all IMHO. I mainly did this current change to make sure attention is drawn to this for now until it is resolved. | Second is, I suggest to make the mnt_path member a char *. Usually, the | mount point path length is quite short, so 1024 bytes for the buffer is | excessive. You can allocate exact needed buffer, which would save in | around 10KB of kernel memory even for relatively modest amount of 10 | mounts. Okay, I thought you wanted it a const char to potential guard against some mis use of the field in that this should be a read only value. I had actually planned to do the malloc since I was concerned about if this structure got allocated on the stack then it could explode the kernel's stack. It seems most of the consumers access the mount structure as a pointer so then I wasn't as concerned. | For additional cleverness, mnt_path could point to f_mntonname when | the length is less than MNAMELEN. Since mp deallocation is centralized, | code for the trick should be not too hard. I'll look to see if I can change the other places that update mnt_path to use the vfs_mount_alloc type function. Since then we could get more sophisticated about the mnt_path allocater/reference as you mention. In nfs_mounroot.c it probably doesn't matter much since it should be a short path but it could be more of an issue with zfs snapshots. It looks like we are converging. I'll make some more changes to make sure we are getting on a good path port another patch. Once that looks okay in concept then I'll start looking into testing the various file systems since unfortuanately it touches a lot of code even though it is mostly mechanical. I don't have a lot of time to work on this so I want to optimize various things as once. If someone can help unit test corner cases that would be great with the various file systems. Atleast I have VirtualBox netbooting so I can test things quicker. However, that required some debugging and changes to pxeboot to send the Client ID so isc-dhcpd didn't get upset with it. I need to check that doesn't break the non-ipxe boot stuff that doesn't require the Client ID field to be set. I've only run into this issue with ipxe in VirtualBox and qemu. I also have some pxe boot robustness and caching fixes that I should get in as well. Thanks, Doug A.