From owner-freebsd-fs Thu Sep 11 23:00:13 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.7/8.8.7) id XAA06792 for fs-outgoing; Thu, 11 Sep 1997 23:00:13 -0700 (PDT) Received: from nemesis.idirect.com (root@nemesis.idirect.com [207.136.80.40]) by hub.freebsd.org (8.8.7/8.8.7) with ESMTP id XAA06787; Thu, 11 Sep 1997 23:00:10 -0700 (PDT) Received: from thor.idirect.com (jlixfeld@thor.idirect.com [207.136.80.105]) by nemesis.idirect.com (8.8.5/8.8.4) with SMTP id CAA02337; Fri, 12 Sep 1997 02:00:06 -0400 (EDT) Received: from localhost (jlixfeld@localhost) by thor.idirect.com (8.6.12/8.6.12) with SMTP id CAA02908; Fri, 12 Sep 1997 02:00:04 -0400 X-Authentication-Warning: thor.idirect.com: jlixfeld owned process doing -bs Date: Fri, 12 Sep 1997 02:00:03 -0400 (EDT) From: Jason Lixfeld Reply-To: Jason Lixfeld To: freebsd-fs@freebsd.org, freebsd-hackers@freebsd.org Subject: Installing a new disk.. Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-fs@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Hey! I'm a devoted Linux user (due to an already existing Linux network), but I am discovering very quickly some of Linux' limitations which FreeBSD seems to handle better, or period! I'm playing with a new 4 machine proxy concept which will hopefully (depending on the support I get) consist of one Master Proxy machine, and 3 diskless* proxy slaves (* = diskless except for either 2 x 2.5GB or 1 x 5GB local cache drive(s)). Currently, I am still in the process of setting up the master machine. I have the system installed (It was not the first one I have installed, so that went off without a hitch), however in attempting to add a second drive, I am getting somewhat confused as to the order in which to run certain commands to create & newfs two partitions on the cache drives. After reading much documentation, I decided to take the easy way out, and re-run /stand/sysinstall to try to partition the disk. 
Here is what I did: 0. /stand/sysinstall 1. Selected option 7 2. Selected option 2 3. Selected wd2 4. (C)reated slice using all space on disk (used 165 partition type) 5. (W)rote changes 6. (N)o boot manager 7. "Wrote FDISK partition information out successfully." 8. (Q)uit back to "Choose Custom Installation Options" 9. Selected option 3 10. (C)reated wd2s1e partition. 11. Selected (F)ile system 12. Selected /mnt2 as mount point 13. Repeated steps 10 - 12 for wd2s1f partition (mounted on /mnt3). 14. (W)rote the changes Received the following error for each partition: Error mounting /dev/wd2s1e on /mnt2 : Invalid argument Error mounting /dev/wd2s1e on /mnt3 : Invalid argument What am I doing wrong?! Thanks in advance, Jason Lixfeld From owner-freebsd-fs Fri Sep 12 00:19:45 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.7/8.8.7) id AAA12220 for fs-outgoing; Fri, 12 Sep 1997 00:19:45 -0700 (PDT) Received: from usr08.primenet.com (tlambert@usr08.primenet.com [206.165.6.208]) by hub.freebsd.org (8.8.7/8.8.7) with ESMTP id AAA12191; Fri, 12 Sep 1997 00:19:38 -0700 (PDT) Received: (from tlambert@localhost) by usr08.primenet.com (8.8.5/8.8.5) id AAA12728; Fri, 12 Sep 1997 00:19:36 -0700 (MST) From: Terry Lambert Message-Id: <199709120719.AAA12728@usr08.primenet.com> Subject: Re: Installing a new disk.. To: jlixfeld@idirect.com Date: Fri, 12 Sep 1997 07:19:35 +0000 (GMT) Cc: freebsd-fs@FreeBSD.ORG, freebsd-hackers@FreeBSD.ORG In-Reply-To: from "Jason Lixfeld" at Sep 12, 97 02:00:03 am X-Mailer: ELM [version 2.4 PL23] Content-Type: text Sender: owner-freebsd-fs@FreeBSD.ORG X-Loop: FreeBSD.org Precedence: bulk > Received the following error for each partition: > > Error mounting /dev/wd2s1e on /mnt2 : Invalid argument > Error mounting /dev/wd2s1e on /mnt3 : Invalid argument > > What am I doing wrong?! mkdir /mnt2 /mnt3 8-). Terry Lambert terry@lambert.org --- Any opinions in this posting are my own and not those of my present or previous employers. 
From owner-freebsd-fs Fri Sep 12 00:40:50 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.7/8.8.7) id AAA13447 for fs-outgoing; Fri, 12 Sep 1997 00:40:50 -0700 (PDT) Received: from time.cdrom.com (root@time.cdrom.com [204.216.27.226]) by hub.freebsd.org (8.8.7/8.8.7) with ESMTP id AAA13426; Fri, 12 Sep 1997 00:40:42 -0700 (PDT) Received: from time.cdrom.com (jkh@localhost.cdrom.com [127.0.0.1]) by time.cdrom.com (8.8.7/8.6.9) with ESMTP id AAA12792; Fri, 12 Sep 1997 00:40:51 -0700 (PDT) To: Jason Lixfeld cc: freebsd-fs@FreeBSD.ORG, freebsd-hackers@FreeBSD.ORG Subject: Re: Installing a new disk.. In-reply-to: Your message of "Fri, 12 Sep 1997 02:00:03 EDT." Date: Fri, 12 Sep 1997 00:40:51 -0700 Message-ID: <12789.874050051@time.cdrom.com> From: "Jordan K. Hubbard" Sender: owner-freebsd-fs@FreeBSD.ORG X-Loop: FreeBSD.org Precedence: bulk > 0. /stand/sysinstall > 1. Selected option 7 > 2. Selected option 2 > 3. Selected wd2 4. (C)reated slice using all space on disk (used 165 partition type) > 5. (W)rote changes << -- ELIMINATE THIS STEP > 6. (N)o boot manager > 7. "Wrote FDISK partition information out successfully." > 8. (Q)uit back to "Choose Custom Installation Options" > 9. Selected option 3 > 10. (C)reated wd2s1e partition. > 11. Selected (F)ile system > 12. Selected /mnt2 as mount point > 13. Repeated steps 10 - 12 for wd2s1f partition (mounted on /mnt3). > 14. (W)rote the changes << -- YOU WANT TO DO IT HERE. 
From owner-freebsd-fs Fri Sep 12 08:18:49 1997
Return-Path:
Received: (from root@localhost) by hub.freebsd.org (8.8.7/8.8.7) id IAA10041 for fs-outgoing; Fri, 12 Sep 1997 08:18:49 -0700 (PDT)
Received: from roguetrader.com (brandon@cold.org [206.81.134.103]) by hub.freebsd.org (8.8.7/8.8.7) with ESMTP id IAA10019; Fri, 12 Sep 1997 08:18:44 -0700 (PDT)
Received: from localhost (brandon@localhost) by roguetrader.com (8.8.5/8.8.5) with SMTP id JAA17650; Fri, 12 Sep 1997 09:18:32 -0600 (MDT)
Date: Fri, 12 Sep 1997 09:18:32 -0600 (MDT)
From: Brandon Gillespie
To: Terry Lambert
cc: jlixfeld@idirect.com, freebsd-fs@FreeBSD.ORG, freebsd-hackers@FreeBSD.ORG
Subject: Re: Installing a new disk..
In-Reply-To: <199709120719.AAA12728@usr08.primenet.com>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-fs@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

On Fri, 12 Sep 1997, Terry Lambert wrote:
> > Received the following error for each partition:
> >
> > Error mounting /dev/wd2s1e on /mnt2 : Invalid argument
> > Error mounting /dev/wd2s1e on /mnt3 : Invalid argument
> >
> > What am I doing wrong?!
>
> mkdir /mnt2 /mnt3

Uhrm, I thought /stand/sysinstall does this already. Besides, I've
also seen this same problem with sysinstall (in that for some reason
when trying to add a disk later it gives the above errors)--and I ended
up manually adding the disk with disklabel/newfs.

-Brandon From owner-freebsd-fs Fri Sep 12 08:28:11 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.7/8.8.7) id IAA10487 for fs-outgoing; Fri, 12 Sep 1997 08:28:11 -0700 (PDT) Received: from nemesis.idirect.com (root@nemesis.idirect.com [207.136.80.40]) by hub.freebsd.org (8.8.7/8.8.7) with ESMTP id IAA10482; Fri, 12 Sep 1997 08:28:07 -0700 (PDT) Received: from thor.idirect.com (jlixfeld@thor.idirect.com [207.136.80.105]) by nemesis.idirect.com (8.8.5/8.8.4) with SMTP id LAA11544; Fri, 12 Sep 1997 11:28:05 -0400 (EDT) Received: from localhost (jlixfeld@localhost) by thor.idirect.com (8.6.12/8.6.12) with SMTP id LAA07912; Fri, 12 Sep 1997 11:28:04 -0400 X-Authentication-Warning: thor.idirect.com: jlixfeld owned process doing -bs Date: Fri, 12 Sep 1997 11:28:04 -0400 (EDT) From: Jason Lixfeld To: Terry Lambert cc: freebsd-fs@FreeBSD.ORG, freebsd-hackers@FreeBSD.ORG Subject: Re: Installing a new disk.. In-Reply-To: <199709120719.AAA12728@usr08.primenet.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-fs@FreeBSD.ORG X-Loop: FreeBSD.org Precedence: bulk Those directories already exist! =( Any other options?! On Fri, 12 Sep 1997, Terry Lambert wrote: > > Received the following error for each partition: > > > > Error mounting /dev/wd2s1e on /mnt2 : Invalid argument > > Error mounting /dev/wd2s1e on /mnt3 : Invalid argument > > > > What am I doing wrong?! > > mkdir /mnt2 /mnt3 > > 8-). > > > Terry Lambert > terry@lambert.org > --- > Any opinions in this posting are my own and not those of my present > or previous employers. 
> From owner-freebsd-fs Fri Sep 12 08:29:21 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.7/8.8.7) id IAA10599 for fs-outgoing; Fri, 12 Sep 1997 08:29:21 -0700 (PDT) Received: from nemesis.idirect.com (root@nemesis.idirect.com [207.136.80.40]) by hub.freebsd.org (8.8.7/8.8.7) with ESMTP id IAA10590; Fri, 12 Sep 1997 08:29:17 -0700 (PDT) Received: from thor.idirect.com (jlixfeld@thor.idirect.com [207.136.80.105]) by nemesis.idirect.com (8.8.5/8.8.4) with SMTP id LAA11725; Fri, 12 Sep 1997 11:29:12 -0400 (EDT) Received: from localhost (jlixfeld@localhost) by thor.idirect.com (8.6.12/8.6.12) with SMTP id LAA07930; Fri, 12 Sep 1997 11:29:11 -0400 X-Authentication-Warning: thor.idirect.com: jlixfeld owned process doing -bs Date: Fri, 12 Sep 1997 11:29:11 -0400 (EDT) From: Jason Lixfeld To: Dan Cross cc: freebsd-fs@freebsd.org, freebsd-hackers@freebsd.org Subject: Re: Installing a new disk.. In-Reply-To: <19970912073925.1451.qmail@spitfire.ecsel.psu.edu> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-fs@freebsd.org X-Loop: FreeBSD.org Precedence: bulk I tried that aswell. I got the error: su-2.00# newfs /dev/rwd2s1e newfs: /dev/rwd2s1e: `e' partition is unavailable su-2.00# WTF?! =) On Fri, 12 Sep 1997, Dan Cross wrote: > I think that you forgot to newfs the partitions. That part is > easy to do outside of sysinstall: > > > 10. (C)reated wd2s1e partition. > > # newfs /dev/rwd2s1e > # mount /dev/wd2s1e /mnt2 > > Should work if the fdisk information is correct. Hope this helps! > > - Dan C. 
> From owner-freebsd-fs Fri Sep 12 08:32:17 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.7/8.8.7) id IAA10951 for fs-outgoing; Fri, 12 Sep 1997 08:32:17 -0700 (PDT) Received: from nemesis.idirect.com (root@nemesis.idirect.com [207.136.80.40]) by hub.freebsd.org (8.8.7/8.8.7) with ESMTP id IAA10933; Fri, 12 Sep 1997 08:32:10 -0700 (PDT) Received: from thor.idirect.com (jlixfeld@thor.idirect.com [207.136.80.105]) by nemesis.idirect.com (8.8.5/8.8.4) with SMTP id LAA12319; Fri, 12 Sep 1997 11:32:05 -0400 (EDT) Received: from localhost (jlixfeld@localhost) by thor.idirect.com (8.6.12/8.6.12) with SMTP id LAA07981; Fri, 12 Sep 1997 11:32:04 -0400 X-Authentication-Warning: thor.idirect.com: jlixfeld owned process doing -bs Date: Fri, 12 Sep 1997 11:32:04 -0400 (EDT) From: Jason Lixfeld To: "Jordan K. Hubbard" cc: freebsd-fs@FreeBSD.ORG, freebsd-hackers@FreeBSD.ORG Subject: Re: Installing a new disk.. In-Reply-To: <12789.874050051@time.cdrom.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-fs@FreeBSD.ORG X-Loop: FreeBSD.org Precedence: bulk This appeared to work! Thank you very much! =) Weird though! I would have though it required a write from the Partition area.. wow! Thanks again! :) On Fri, 12 Sep 1997, Jordan K. Hubbard wrote: > > 0. /stand/sysinstall > > 1. Selected option 7 > > 2. Selected option 2 > > 3. Selected wd2 > 4. (C)reated slice using all space on disk (used 165 partition type) > > 5. (W)rote changes << -- ELIMINATE THIS STEP > > 6. (N)o boot manager > > 7. "Wrote FDISK partition information out successfully." > > 8. (Q)uit back to "Choose Custom Installation Options" > > 9. Selected option 3 > > 10. (C)reated wd2s1e partition. > > 11. Selected (F)ile system > > 12. Selected /mnt2 as mount point > > 13. Repeated steps 10 - 12 for wd2s1f partition (mounted on /mnt3). > > 14. (W)rote the changes << -- YOU WANT TO DO IT HERE. 
> From owner-freebsd-fs Fri Sep 12 18:21:01 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.7/8.8.7) id SAA23754 for fs-outgoing; Fri, 12 Sep 1997 18:21:01 -0700 (PDT) Received: from pluto.plutotech.com (root@mail.plutotech.com [206.168.67.137]) by hub.freebsd.org (8.8.7/8.8.7) with ESMTP id SAA23749; Fri, 12 Sep 1997 18:20:58 -0700 (PDT) Received: from shane.plutotech.com (shane.plutotech.com [206.168.67.149]) by pluto.plutotech.com (8.8.5/8.8.5) with ESMTP id TAA03040; Fri, 12 Sep 1997 19:19:31 -0600 (MDT) Message-Id: <199709130119.TAA03040@pluto.plutotech.com> From: "Mike Durian" To: Terry Lambert cc: hackers@FreeBSD.ORG, fs@FreeBSD.ORG Subject: Re: VFS/NFS client wedging problem In-reply-to: Your message of "Sat, 13 Sep 1997 00:03:16 -0000." Date: Fri, 12 Sep 1997 19:19:31 -0600 Sender: owner-freebsd-fs@FreeBSD.ORG X-Loop: FreeBSD.org Precedence: bulk On Sat, 13 Sep 1997 00:03:16 -0000, Terry Lambert wrote: > >How do you serialize upcalls through the socket, such that you >don't have overlapped requests? Or do you have seperate request >contexts so that this doesn't cause problems? > >If you don't have seperate contexts, eventually you'll make a request >before the previous one completes. I serialize. I used to keep a list of N available sockets and use one socket per request, but since I handle commands atomically in the user process figured it was silly and dropped down to one socket. The user process is one big select loop, and doesn't call select again until it has completed all commands on the readable sockets (which is now just one socket). >The NFS export stuff is a bit problematic. I don't know what to >say about it, except that it should be in the common mount code >instead of being duplicated per FS. > >If you can give more architectural data about your FS, and you can >give the FS you used as a model of how a VFS should be written, I >might be able to give you more detailed help. 
> >This is probably something that should be taken off the general >-hackers list, and onto fs@freebsd.org It's really a mish-mash of other file systems. I grabbed some from cd9660 and msdosfs for NFS, socket stuff from portal and then nullfs and other miscfs filesystem for general stuff. I'll take all the detailed stuff off this list and move it to freebsd-fs. I didn't know the fs list existed. >That's not strange. It's a request context that's wedged. When a >request context would be slept, the nfsd on the server isn't slept, >the context is. The nfsd provides an execution context for a different >request context at that point. Try nfsstat instead, and/or iostat, >on the server. I didn't realize that. I did use nfsstat, but didn't know what to look for. The only thing that seemed interesting to me was the 190 server faults. But I didn't know if that was normal or not. >This proves to us that it isn't async requests over the wire that are >hosing you. That the server is an NFSv3 capable server argues that >the v2 protocol is implemented by a v3 engine, which would explain >the blockages. > >Have you tried bot TCP and UDP based mounts? Yes. UDP died locked up faster than TCP (though that is a subjective measurement, I didn't actually time things). TCP had the "server not responding"/"responding again" messages. 
mike From owner-freebsd-fs Fri Sep 12 18:59:42 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.7/8.8.7) id SAA25738 for fs-outgoing; Fri, 12 Sep 1997 18:59:42 -0700 (PDT) Received: from pluto.plutotech.com (root@mail.plutotech.com [206.168.67.137]) by hub.freebsd.org (8.8.7/8.8.7) with ESMTP id SAA25733 for ; Fri, 12 Sep 1997 18:59:40 -0700 (PDT) Received: from shane.plutotech.com (shane.plutotech.com [206.168.67.149]) by pluto.plutotech.com (8.8.5/8.8.5) with ESMTP id TAA03613; Fri, 12 Sep 1997 19:59:37 -0600 (MDT) Message-Id: <199709130159.TAA03613@pluto.plutotech.com> From: "Mike Durian" To: Terry Lambert cc: fs@FreeBSD.ORG Subject: Re: VFS/NFS client wedging problem In-reply-to: Your message of "Sat, 13 Sep 1997 00:03:16 -0000." Date: Fri, 12 Sep 1997 19:59:37 -0600 Sender: owner-freebsd-fs@FreeBSD.ORG X-Loop: FreeBSD.org Precedence: bulk On Sat, 13 Sep 1997 00:03:16 -0000, Terry Lambert wrote: > >If you can give more architectural data about your FS, and you can >give the FS you used as a model of how a VFS should be written, I >might be able to give you more detailed help. > >This is probably something that should be taken off the general >-hackers list, and onto fs@freebsd.org I'm not sure what you want when you ask for the architectural data. The thing to keep in mind is that it truly is a "virtual" filesystem and not one that would be used for everyday use. You cannot create or delete files. All the files access a common set of data on our RAID system that we record and play in real time using standard VTR function. The purpose of the file system is to allow graphics type people, those who render or modify video data, a painless way of getting the frames (and eventually audio). Using NFS, rcp, samba, appletalk or ftp they can get the frame they want in the format they want and write the same. All without installing additional software on their workstations. 
The filesystem layout is controlled from a configuration file with lines
like:

#
# This is a sample PFS config file
# The first component is a possible path in the PFS virtual file system.
# The rest of the line is a program to run to convert the video data
# found via the corresponding path.  Comments are started with a #
# and continue to the end of the line.
#
# Path components can consist of string literals or wildcards.
# Wildcards begin with a ${ and end with a }.  Literals are all
# other characters.  Literals must be matched exactly, but wildcards
# will expand to different values depending on the wildcard type and
# clip meta data.
#
#   Wildcard       Expands to
#   ${name}        Names of all clips on system
#   ${hour}        number of hours in a clip
#   ${minute}      number of minutes in a clip
#   ${second}      number of seconds in a clip
#   ${frame}       number of frames in a clip not zero padded
#   ${pad_frame}   number of frames in a clip zero padded
#
# ${hour}, ${minute}, ${second}, ${frame} will
# be further restricted if other wildcards exist.  In a path
# ${hour}/${minute}/${second}/${frame}, ${frame} will only
# expand to the number of frames in the specified hour:minute:second.
#
# Timecode access
HMSF/tiff/${name}/hour${hour}/minute${minute}/second${second}/\
${hour}.${minute}.${second}.${pad_frame}.tiff \
    data_type=vframe \
    converter_type=tiff
HMSF/tga/${name}/hour${hour}/minute${minute}/second${second}/\
${hour}.${minute}.${second}.${pad_frame}.tga \
    data_type=vframe \
    converter_type=tga
HMSF/pluto/${name}/hour${hour}/minute${minute}/second${second}/\
${hour}.${minute}.${second}.${pad_frame}.plt \
    data_type=vframe \
    converter_type=pluto
HMSF/abekas/${name}/hour${hour}/minute${minute}/second${second}/\
${hour}.${minute}.${second}.${pad_frame}.yuv \
    data_type=vframe \
    converter_type=abekas
HMSF/sgi/${name}/hour${hour}/minute${minute}/second${second}/\
${hour}.${minute}.${second}.${pad_frame}.rgb \
    data_type=vframe \
    converter_type=sgi

# flat frame access
frames/tiff/${name}/${pad_frame}.tiff \
    data_type=vframe \
    converter_type=tiff
frames/tga/${name}/${pad_frame}.tga \
    data_type=vframe \
    converter_type=tga
frames/pluto/${name}/${pad_frame}.plt \
    data_type=vframe \
    converter_type=pluto
frames/abekas/${name}/${pad_frame}.yuv \
    data_type=vframe \
    converter_type=abekas
frames/sgi/${name}/${pad_frame}.rgb \
    data_type=vframe \
    converter_type=sgi

I implemented the filesystem partially in user space for a number of
reasons. Our box has realtime constraints when playing and recording
CCIR-601 video, so I didn't want our custom filesystem (PFS, Pluto File
System) to get in the way. I thought using the scheduler in the normal
way would be an easy way to do this (the playback and record stuff runs
at a realtime priority). The ability to use normal debugging techniques
was also a major factor.

So when I first mount the filesystem I create a bunch (where bunch is
currently defined as 1) of unix domain sockets for communicating
commands from the kernel to the user process. If all sockets are
currently in use by commands the next command will sleep until one is
available.
The user process is a big loop that selects on the sockets and then
processes the commands on the sockets one at a time in series and then
selects again. So on the user side, commands do not overlap.

There is basically a one to one correspondence between VFS entry points
and my commands. Where there are vnodes and pfsnodes in the kernel,
there are pfs_states in the user process.

For me the file path is the important part, not the data. In a way all
my files are like hard links. /pfs/frames/tiff/0.tiff and
/pfs/frames/tga/0.tga both access the same data on the raid disk.
However, they need to be translated differently. So in lookup, I not
only create a new pfsnode, but also issue a command to make sure the
user process also knows the full canonical path.

I did notice a deadlock situation when using only one socket. Both
synch and reclaim could lock the system if they were forced to sleep
waiting for the socket to become available. So I delay those
operations. In the synch case I synch immediately if the socket is not
in use, but will delay the operation if it is. For reclaim I free all
the resources in the kernel, but always delay the reclaim command for
the user process. The delayed commands get executed immediately before
a socket is returned to the available list or right before one is
pulled off the available list. This keeps the operations in order.

I should also mention that I can't really behave like an NFS server
since my filesystem needs to save state. Due to some video issues
having to do with converting from a 4:4:4 RGB color space to a 4:2:2
YUV color space, I need to process lines of video atomically. Since I
can't just return short counts on writes (imagine the person writing a
program that loops around a write without adding new data until the
current data gets out), I need to process what I can and then hang onto
the rest until the rest of the line of video shows up.
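The "process what I can and hang onto the rest" behaviour described
above could look roughly like the following. This is only a sketch
under stated assumptions: the names lb_write, struct line_buf and
LINE_BYTES are hypothetical and not taken from PFS, the line size is
tiny so the example is easy to follow, and "processing" a line is
reduced to counting it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LINE_BYTES 8	/* hypothetical; a real video line is far larger */

struct line_buf {
	unsigned char pending[LINE_BYTES];	/* partial line kept between writes */
	size_t        have;			/* bytes currently held in pending */
	size_t        lines_done;		/* complete lines handled so far */
};

/*
 * Accept a write of any size.  Each time a full line has been gathered
 * it is counted as processed (a real converter would do its colour
 * space work at that point).  A trailing partial line stays in
 * lb->pending until a later write completes it.  All n bytes are
 * always consumed, so the caller never sees a short count.
 */
static size_t
lb_write(struct line_buf *lb, const unsigned char *src, size_t n)
{
	size_t used = 0;

	while (used < n) {
		size_t want = LINE_BYTES - lb->have;
		size_t take = (n - used < want) ? n - used : want;

		memcpy(lb->pending + lb->have, src + used, take);
		lb->have += take;
		used += take;
		if (lb->have == LINE_BYTES) {
			lb->lines_done++;	/* whole line available: handle it */
			lb->have = 0;
		}
	}
	return (used);
}
```

With a zeroed struct line_buf, a 5 byte write leaves 5 bytes pending
and no lines done; a following 20 byte write completes three lines and
leaves 1 byte pending.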
I also sometimes need to hold onto various bits of the file header
information so I can parse the data correctly when it arrives. This is
an annoying issue for me, but not really what I wanted to ask about.

As a final note, here are the vnops entry points I support (with
locking type stuff basically no-op'd for my pfsnodes and just the
standard boilerplate from the VOP_LOCK man page; I'm still running
-current from a month ago until I can get a working version of pmap.c):

static int pfs_abortop __P((struct vop_abortop_args *));
static int pfs_access __P((struct vop_access_args *));
static int pfs_badop __P((void));
static int pfs_close __P((struct vop_close_args *));
static int pfs_getattr __P((struct vop_getattr_args *));
static int pfs_inactive __P((struct vop_inactive_args *));
static int pfs_ioctl __P((struct vop_ioctl_args *));
static int pfs_lookup __P((struct vop_lookup_args *));
static int pfs_open __P((struct vop_open_args *));
static int pfs_pathconf __P((struct vop_pathconf_args *ap));
static int pfs_print __P((struct vop_print_args *));
static int pfs_read __P((struct vop_read_args *));
static int pfs_write __P((struct vop_write_args *));
static int pfs_readdir __P((struct vop_readdir_args *));
static int pfs_reclaim __P((struct vop_reclaim_args *));
static int pfs_setattr __P((struct vop_setattr_args *));
static int pfs_fsync __P((struct vop_fsync_args *));
static int pfs_advlock __P((struct vop_advlock_args *));
static int pfs_lock __P((struct vop_lock_args *));
static int pfs_unlock __P((struct vop_unlock_args *));
static int pfs_islocked __P((struct vop_islocked_args *));
static int pfs_truncate __P((struct vop_truncate_args *));

pfs_truncate is a no-op but keeps atalkd happy. Same with advlock.

So, is this the sort of detail you wanted? Or did I completely miss all
the relevant information?

mike

From owner-freebsd-fs Fri Sep 12 20:26:46 1997
Return-Path:
Received: (from root@localhost) by hub.freebsd.org (8.8.7/8.8.7) id UAA29528 for fs-outgoing; Fri, 12 Sep 1997 20:26:46 -0700 (PDT)
Received: from usr02.primenet.com (tlambert@usr02.primenet.com [206.165.6.202]) by hub.freebsd.org (8.8.7/8.8.7) with ESMTP id UAA29523; Fri, 12 Sep 1997 20:26:44 -0700 (PDT)
Received: (from tlambert@localhost) by usr02.primenet.com (8.8.5/8.8.5) id UAA07078; Fri, 12 Sep 1997 20:26:25 -0700 (MST)
From: Terry Lambert
Message-Id: <199709130326.UAA07078@usr02.primenet.com>
Subject: Re: VFS/NFS client wedging problem
To: durian@plutotech.com (Mike Durian)
Date: Sat, 13 Sep 1997 03:26:24 +0000 (GMT)
Cc: tlambert@primenet.com, hackers@FreeBSD.ORG, fs@FreeBSD.ORG
In-Reply-To: <199709130119.TAA03040@pluto.plutotech.com> from "Mike Durian" at Sep 12, 97 07:19:31 pm
X-Mailer: ELM [version 2.4 PL23]
Content-Type: text
Sender: owner-freebsd-fs@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

> >If you don't have separate contexts, eventually you'll make a request
> >before the previous one completes.
>
> I serialize.

This is what I figured you had to do, or you'd really be in trouble.

> I used to keep a list of N available sockets and
> use one socket per request, but since I handle commands atomically
> in the user process figured it was silly and dropped down to one
> socket.

If this is a UNIX domain socket, then it's like a pipe. A pipe does
not guarantee to keep data together over the pipe block size, so if you
are doing larger writes, this could be your problem. You could write:

    AAAAABBBBBCCCCC

And get the data out of order:

    AAAABABBBCBCCCC

Which would account for the failures. Typically, when I do this, I
write data as:

    aAaAaAaAaAbBbBbBbBbBcCcCcCcCcC

where a, b, and c are channel identification tokens. Then you can
decode:

    aAaAaAaAbBaAbBbBbBcCbBcCcCcCcC

Back into atomic units. The channel identifiers are per byte.
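The per-byte tagging scheme above can be sketched as a small decoder.
This is an illustration only, assuming lower-case tokens 'a'..'z' name
the channels and every token is followed by exactly one data byte. The
names demux, struct demux_state, MAX_CHAN and CHAN_BUF are hypothetical
and not from any existing code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CHAN 26	/* hypothetical: one channel per token 'a'..'z' */
#define CHAN_BUF 64	/* hypothetical per-channel buffer size */

struct demux_state {
	char   data[MAX_CHAN][CHAN_BUF];	/* reassembled bytes, NUL terminated */
	size_t len[MAX_CHAN];			/* bytes stored per channel */
};

/*
 * Split an interleaved stream of (token, byte) pairs back into one
 * contiguous byte string per channel.  Returns 0 on success and -1 on
 * a malformed stream (odd length, unknown token, or a channel that
 * would overflow its buffer).
 */
static int
demux(struct demux_state *st, const char *in, size_t n)
{
	size_t i;

	memset(st, 0, sizeof(*st));
	if (n % 2 != 0)
		return (-1);
	for (i = 0; i < n; i += 2) {
		int chan = in[i] - 'a';

		if (chan < 0 || chan >= MAX_CHAN)
			return (-1);
		if (st->len[chan] >= CHAN_BUF - 1)	/* keep room for the NUL */
			return (-1);
		st->data[chan][st->len[chan]++] = in[i + 1];
	}
	return (0);
}
```

Feeding it the interleaved example stream from the message yields
"AAAAA" for channel a, "BBBBB" for channel b and "CCCCC" for channel c,
whatever order the pairs arrived in.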
This is only one possibility, and depends on the write buffer size.

> The user process is one big select loop, and doesn't
> call select again until it has completed all commands on the
> readable sockets (which is now just one socket).

Did this failure occur when you had separate sockets? How hard would
it be to go back to a socket per channel as a test case?

> >The NFS export stuff is a bit problematic. I don't know what to
> >say about it, except that it should be in the common mount code
> >instead of being duplicated per FS.
> >
> >If you can give more architectural data about your FS, and you can
> >give the FS you used as a model of how a VFS should be written, I
> >might be able to give you more detailed help.
> >
> >This is probably something that should be taken off the general
> >-hackers list, and onto fs@freebsd.org
>
> It's really a mish-mash of other file systems. I grabbed some
> from cd9660 and msdosfs for NFS, socket stuff from portal and
> then nullfs and other miscfs filesystem for general stuff.

This is not going to be a pleasant revelation, I'm afraid. These are
the worst places to get NFS and VOP_LOCK examples, unfortunately. The
best place is the ffs/ufs two layer stack, but it's very complicated
and hard to understand.

The directory stuff in the msdosfs, particularly, is bad. There is a
race window after unlocking the parent to locking the child of the
child. This is pretty much unavoidable (at present) because of the
VOP_LOOKUP code structure pushing some things better left up top down
into the per FS code (the msdosfs would be able to deal with it if it
didn't have the VOP_ABORTOP issues on create and rename to contend
with).

> I'll take all the detailed stuff off this list and move it to
> freebsd-fs. I didn't know the fs list existed.

Heh. Most people don't. It doesn't see much action because it requires
huge code shifts to modify interfaces. Anything that needs to do that
touches every FS at the same time.

> >That's not strange.
> >It's a request context that's wedged. When a
> >request context would be slept, the nfsd on the server isn't slept,
> >the context is. The nfsd provides an execution context for a different
> >request context at that point. Try nfsstat instead, and/or iostat,
> >on the server.
>
> I didn't realize that. I did use nfsstat, but didn't know what
> to look for. The only thing that seemed interesting to me was
> the 190 server faults. But I didn't know if that was normal or not.

I have 0 here, but then my stuff is pretty hacked up compared to the
standard distribution, so I have no way of knowing if faults are the
normal state of affairs or not. Doug Rabson would know.

> >This proves to us that it isn't async requests over the wire that are
> >hosing you. That the server is an NFSv3 capable server argues that
> >the v2 protocol is implemented by a v3 engine, which would explain
> >the blockages.
> >
> >Have you tried both TCP and UDP based mounts?
>
> Yes. UDP died locked up faster than TCP (though that is a subjective
> measurement, I didn't actually time things). TCP had the "server not
> responding"/"responding again" messages.

This lets out "source host not equal to mount host" errors. It's a
good data point for eliminating an obvious case... even negative data
is still data. 8-(.

Terry Lambert
terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

From owner-freebsd-fs Sat Sep 13 14:27:35 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.7/8.8.7) id OAA00721 for fs-outgoing; Sat, 13 Sep 1997 14:27:35 -0700 (PDT) Received: from critter.freebsd.dk (critter.freebsd.dk [195.8.129.26]) by hub.freebsd.org (8.8.7/8.8.7) with ESMTP id OAA00712 for ; Sat, 13 Sep 1997 14:27:31 -0700 (PDT) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.8.7/8.8.7) with ESMTP id WAA05466 for ; Sat, 13 Sep 1997 22:47:01 +0200 (CEST) To: fs@freebsd.org Subject: getcwd() as syscall... From: Poul-Henning Kamp Date: Sat, 13 Sep 1997 22:47:00 +0200 Message-ID: <5464.874183620@critter.freebsd.dk> Sender: owner-freebsd-fs@freebsd.org X-Loop: FreeBSD.org Precedence: bulk Now that we have v_dd and v_ddid in the vnodes, and since the namecache is quite reluctant to throw out directories, it would be possible to make a "getcwdifyoucan()" syscall, which would run a lot faster than the getcwd() library routine. The "ifyoucan" bit reflects that it may actually not be able to if the required information isn't in the namecache, in which case I would prefer to have userland fall back to the usual method, rather than have to implement it in the kernel. I have no idea how much this would or wouldn't save in time. I know that during a "make world", 13 % of the name cache lookups are for "..", and quite a number of these probably come from getcwd(), so there is something to go aim for. Comments ? -- Poul-Henning Kamp FreeBSD coreteam member phk@FreeBSD.ORG "Real hackers run -current on their laptop." 
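The "try the kernel, fall back to the usual method" split proposed
above might look like this from userland. This is a sketch only:
getcwd_ifyoucan() does not exist, so a stub that always reports a
namecache miss stands in for it, and my_getcwd is a hypothetical
wrapper name. The fallback is the ordinary getcwd(3) library routine.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for the proposed syscall.  It always fails
 * here, the way the kernel would when the required entries are not in
 * the namecache.
 */
static int
getcwd_ifyoucan(char *buf, size_t len)
{
	(void)buf;
	(void)len;
	errno = ENOENT;
	return (-1);
}

/*
 * Try the fast path first.  If the kernel cannot answer from the
 * namecache, fall back to the existing library routine, so the slow
 * walk up through ".." stays in userland.
 */
static char *
my_getcwd(char *buf, size_t len)
{
	if (getcwd_ifyoucan(buf, len) == 0)
		return (buf);
	return (getcwd(buf, len));
}
```

Because the stub always misses, my_getcwd() here returns exactly what
getcwd(3) returns; with a real fast path only the cost would change,
not the result.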
From owner-freebsd-fs Sat Sep 13 15:01:54 1997 Return-Path: Received: (from root@localhost) by hub.freebsd.org (8.8.7/8.8.7) id PAA02397 for fs-outgoing; Sat, 13 Sep 1997 15:01:54 -0700 (PDT) Received: from usr03.primenet.com (tlambert@usr03.primenet.com [206.165.6.203]) by hub.freebsd.org (8.8.7/8.8.7) with ESMTP id PAA02391; Sat, 13 Sep 1997 15:01:50 -0700 (PDT) Received: (from tlambert@localhost) by usr03.primenet.com (8.8.5/8.8.5) id PAA29443; Sat, 13 Sep 1997 15:01:50 -0700 (MST) From: Terry Lambert Message-Id: <199709132201.PAA29443@usr03.primenet.com> Subject: Re: getcwd() as syscall... To: phk@FreeBSD.ORG (Poul-Henning Kamp) Date: Sat, 13 Sep 1997 22:01:50 +0000 (GMT) Cc: fs@FreeBSD.ORG In-Reply-To: <5464.874183620@critter.freebsd.dk> from "Poul-Henning Kamp" at Sep 13, 97 10:47:00 pm X-Mailer: ELM [version 2.4 PL23] Content-Type: text Sender: owner-freebsd-fs@FreeBSD.ORG X-Loop: FreeBSD.org Precedence: bulk > Now that we have v_dd and v_ddid in the vnodes, and since the namecache > is quite reluctant to throw out directories, it would be possible to > make a "getcwdifyoucan()" syscall, which would run a lot faster than > the getcwd() library routine. > > The "ifyoucan" bit reflects that it may actually not be able to if the > required information isn't in the namecache, in which case I would > prefer to have userland fall back to the usual method, rather than > have to implement it in the kernel. > > I have no idea how much this would or wouldn't save in time. I know > that during a "make world", 13 % of the name cache lookups are for > "..", and quite a number of these probably come from getcwd(), so > there is something to go aim for. > > Comments ? I'd say try the cache first, and then do the work in the kernel to make it work. The work is going to be done on the users dime (quantum) anyway. Knowing the parent will save looking up "..". Otherwise, it won't be used much because it won't work on anything but UFS. 
There may be a problem with this, however, given that the parent may not know about the child. If the child has been deleted while it was someone's current directory, then the parent value needs to be set to 0 (or whatever other marker value) so that people with it open can distinguish an orphaned open directory from an unorphaned one. You can't use the link count because multiple people might have it open, and you won't be able to distinguish that from "I have it open and the other links are '.' and the one in the parent". Note: you need different markers for "deleted" and "root of FS"; I think "root of FS" should have itself (2) as the parent. Other than that, it's a *great* idea, and the purpose behind the parent pointers in the first place. 8-) 8-). Oh... is there an option to fsck to set these on a legacy FS? And has newfs been fixed to fill the values in for the default director{y/ies}? I know the code has a "create lost+found" option, even if it is never exercised... Finally, I'd be *real* tempted to make it an fcntl() instead of a system call, and make "-1" mean "cwd", so it can be used on fd's pointing to directories (saves on the system call name space, too). Terry Lambert terry@lambert.org --- Any opinions in this posting are my own and not those of my present or previous employers.