From: Roman Divacky
To: arch@freebsd.org
Date: Tue, 18 Dec 2007 10:22:22 +0100
Subject: final decision about *at syscalls
Message-ID: <20071218092222.GA9695@freebsd.org>

Dear arch@

Over this summer I was working (among other things) on the *at family of syscalls, kindly sponsored by Google (in their Summer of Code). The resulting patch is almost finished, but I need to settle one design question.
If you are not interested in *at/namei, feel free to skip this mail.

The *at syscalls are a thread-oriented extension to the basic path-based file syscalls (think of open(), stat(), etc.) that adds the possibility to specify where the lookup of a relative path should start. Imagine that we have /tmp/foo/bar, the CWD is set to "/tmp/", and the process has opened "foo" as dirfd. With the ordinary open() syscall you have to do either

    chdir("/tmp/foo"); open("./bar");

or

    open("/tmp/foo/bar");

The first approach is problematic because it changes the CWD for all threads in the process; the second is prone to race conditions, as some of the components of the path can change in parallel with the open().

So POSIX introduced a new API, called "Extended API set part 2, ISBN: 1-931624-67-4" (at least this was the latest when I last looked), which solves this by introducing "*at" syscalls that take an fd of a previously opened directory, which is used instead of the CWD for looking up relative paths, i.e. the previous example becomes

    dirfd = open("/tmp/foo"); openat(dirfd, "bar");

I implemented the whole API as native FreeBSD syscalls and in the linuxulator emulation layer.

Here's the problem: there are two approaches to the translation from "filedescriptor" to "vnode":

1) we can do it in the kern_fooat() syscall and pass namei() the resulting vnode
2) we can pass namei() the filedescriptor and do the translation there

PROs of #1:
  o namei() does not need to know about the curthread, you can use this *at ability for different purposes, it's cleaner (imho)

PROs of #2:
  o raceless implementation
  o no code duplication

CONs of #1:
  o some very small code duplication (the translation is done in every kern_fooat() function)
  o there is a race between the name translation and the actual use of the result of the translation that needs to be handled; the "path_to_file" string is copied to kernel space twice, hence a race

CONs of #2:
  o namei() is made thread dependent

Please tell me which approach you like more.
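For readers who have not used the API, the userland side of the openat() example above can be sketched roughly as follows. This is only a minimal sketch, assuming POSIX.1-2008 openat()/unlinkat() semantics; open_in_dir() and demo_openat() are hypothetical helper names made up for illustration and are not part of the patch, and nothing here shows the kernel-side namei() question.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for openat(), unlinkat(), mkdtemp() */
#endif

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: open "name" relative to the directory "dirpath"
 * without touching the process-wide CWD. Returns an fd, or -1 on error.
 */
int
open_in_dir(const char *dirpath, const char *name, int flags, mode_t mode)
{
	int dirfd, fd, saved_errno;

	/* openat() ignores dirfd for absolute paths, so insist on a relative one. */
	assert(name[0] != '/');

	/* Pin the directory once; the lookup below starts from this fd. */
	dirfd = open(dirpath, O_RDONLY | O_DIRECTORY);
	if (dirfd == -1)
		return (-1);

	/* Relative lookup starts at dirfd instead of the CWD. */
	fd = openat(dirfd, name, flags, mode);

	/* close() may change errno, so preserve the value from openat(). */
	saved_errno = errno;
	close(dirfd);
	errno = saved_errno;
	return (fd);
}

/*
 * Hypothetical demo: create "bar" inside a fresh temporary directory and
 * check that the CWD is the same before and after. Returns 0 on success.
 */
int
demo_openat(void)
{
	char dir[] = "/tmp/atdemo.XXXXXX";
	char before[4096], after[4096];
	int dirfd, fd, ok;

	if (getcwd(before, sizeof(before)) == NULL || mkdtemp(dir) == NULL)
		return (-1);

	fd = open_in_dir(dir, "bar", O_CREAT | O_WRONLY, 0600);
	ok = (fd != -1);
	if (fd != -1)
		close(fd);

	/* No chdir() was needed, so the CWD must be unchanged. */
	ok = ok && getcwd(after, sizeof(after)) != NULL &&
	    strcmp(before, after) == 0;

	/* Clean up, again via a directory fd (unlinkat() is also an *at call). */
	dirfd = open(dir, O_RDONLY | O_DIRECTORY);
	if (dirfd != -1) {
		unlinkat(dirfd, "bar", 0);
		close(dirfd);
	}
	rmdir(dir);
	return (ok ? 0 : -1);
}
```

Calling demo_openat() from any thread leaves the CWD of the other threads alone, which is the point of the API; the chdir() variant cannot give that guarantee.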
I personally favour #1 because I don't like namei() being thread dependent; Kostik Belousov prefers #2. I'd like to change the current patch to whatever you decide is best (currently I implement #1) and finally submit it for committing.

Thank you,
Roman Divacky