From owner-freebsd-hackers@freebsd.org Sun Jan 31 03:24:01 2016
Sender: asomers@gmail.com
Date: Sat, 30 Jan 2016 20:23:59 -0700
Subject: Re: aesni doesn't play nice with krb5
From: Alan Somers
To: cem@freebsd.org
Cc: "freebsd-hackers@freebsd.org"
Content-Type: text/plain; charset=UTF-8
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Technical Discussions relating to FreeBSD

Fast work!  I should be able to test it by Tuesday.  By inspection,
the code looks good.

-Alan

On Sat, Jan 30, 2016 at 4:01 PM, Conrad Meyer wrote:
> I have an untested patch to fix this issue:
> https://reviews.freebsd.org/D5146 . If you have time, please review
> or test the patch.
>
> Thanks,
> Conrad
>
> On Wed, Jan 27, 2016 at 3:55 PM, Alan Somers wrote:
>> I'm experimenting with Kerberized NFS, but my performance sucks when I
>> use krb5p. I tracked the problem down to an interaction between aesni
>> and krb5: aes_set_key in kcrypto_aes.c registers for a crypto session
>> and requests support for two algorithms: CRYPTO_SHA1_HMAC and
>> CRYPTO_AES_CBC. aesni(4) supports the latter, but not the former. So
>> crypto_select_driver returns cryptosoft and krb5 uses software for
>> both algorithms.
>>
>> It's too bad that aesni doesn't support SHA1, but other software like
>> OpenSSL deals with it by using hardware for AES and software for SHA1.
>> It seems to me like krb5 could be made to do the same by registering
>> for two sessions, one for each algorithm. In fact, it seems like it
>> would be pretty easy to do. The changes would probably be confined
>> strictly to crypto_aes.c. Is there any reason why this wouldn't work?
>>
>> -Alan

From owner-freebsd-hackers@freebsd.org Sun Jan 31 12:55:21 2016
To: FreeBSD Hackers
References: <56A86D91.3040709@freebsd.org>
From: Jonathan de Boyne Pollard
Subject: Re: syslogd(8) with OOM Killer protection
Message-ID: <56AE03E9.80908@NTLWorld.com>
Date: Sun, 31 Jan 2016 12:54:01 +0000
In-Reply-To: <56A86D91.3040709@freebsd.org>

Allan Jude:
> someapp_protect=YES (and maybe syslogd has this enabled by default in
> /etc/defaults/rc.conf) and it prefixes the start command with
> protect -i.

Should all children inherit it?  One of the things that the Linux OOM
Killer does is motivated by the idea that child processes are "more
killable" than the main service processes that spawned them; on the
presumption that the arrangement is going to be like an SSH daemon
spawning per-connection children, in the commonest case.

From owner-freebsd-hackers@freebsd.org Sun Jan 31 13:35:40 2016
To: FreeBSD Hackers
References: <56A8B300.5080503@toco-domains.de>
From: Jonathan de Boyne Pollard
Subject: Re: syslogd(8) with OOM Killer protection
Message-ID: <56AE0DA2.8030908@NTLWorld.com>
Date: Sun, 31 Jan 2016 13:35:30 +0000
In-Reply-To: <56A8B300.5080503@toco-domains.de>

Torsten Zuehlsdorff:
> I would like a generalized way too. The first i thought to protect is
> my database. I never want to get it killed.

You might enjoy this view on OOM Killers, then:

* http://thoughts.davisjeff.com/2009/11/29/linux-oom-killer/

A widely circulated postgresql.service unit for systemd gained this a
couple of years later:

> # Due to PostgreSQL's use of shared memory, OOM killer is often overzealous in
> # killing Postgres, so adjust it downward
> OOMScoreAdjust=-200

Discounting 200 permil of a process' memory use, when the known problem
is that that process shares a lot of its memory with its children, seems
slightly conservative.
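To put rough numbers on why -200 may be conservative, here is a minimal
sketch of the arithmetic. It is a simplified model, not the kernel's code:
it assumes (one reading of the Linux documentation) that the adjustment is
scaled in permil of total allowed memory and added to the pages charged to
the task. The function name and all page counts are hypothetical.

```python
def oom_badness(task_pages, total_pages, oom_score_adj):
    """Simplified model of the Linux OOM badness score, in pages.

    Assumption: the adjustment is scaled in permil of total allowed
    memory (RAM plus swap) and added to the pages the task is charged.
    """
    if not -1000 <= oom_score_adj <= 1000:
        raise ValueError("oom_score_adj must be between -1000 and 1000")
    return task_pages + (oom_score_adj * total_pages) // 1000


# Hypothetical numbers, chosen only for illustration.
TOTAL_PAGES = 4_000_000  # all allowed memory on the machine

# A database parent charged with a large shared-memory segment,
# running with OOMScoreAdjust=-200.
postmaster = oom_badness(1_500_000, TOTAL_PAGES, -200)

# An unrelated job with no adjustment at all.
other_job = oom_badness(600_000, TOTAL_PAGES, 0)
```

With these made-up figures the adjusted database process still scores
higher than the unadjusted job, which is the sense in which a 200 permil
discount can fall short when shared memory dominates the accounting.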

From owner-freebsd-hackers@freebsd.org Sun Jan 31 14:06:20 2016
To: FreeBSD Hackers
References: <9A913658-E90C-4B85-B73B-F3F7D3004344@panasas.com>
From: Jonathan de Boyne Pollard
Subject: Re: syslogd(8) with OOM Killer protection
Message-ID: <56AE14CB.2030905@NTLWorld.com>
Date: Sun, 31 Jan 2016 14:06:03 +0000
In-Reply-To: <9A913658-E90C-4B85-B73B-F3F7D3004344@panasas.com>

Jan Bramkamp:
> I would prefer to implement [a] flag keeping cron (and all other
> base system daemons) from double-forking and run it under a
> process supervisor like daemontools.

Ravi Pokala:
> We've recently added that to `syslogd' (r279567, 2015-03-03).
> I think we also have internal changes (not committed to -HEAD
> yet) which adds a "run in foreground" option to a few other
> daemons.

As I've noted:

* http://homepage.ntlworld.com./jonathan.deboynepollard/FGA/unix-daemon-design-mistakes-to-avoid.html#ForegroundDoesNotImplyDebugging

This has been gradually going on for a decade, now.  It's not something
new that we're only just getting.  Vixie cron (still determinedly
"daemonizing" itself) is largely *not* representative of current
practice, now.  It's not even representative of crons in particular.
GNU cron has a -f flag.  Matt Dillon's dcron has a -f flag.  Thibault
Godouet's fcron has a -f flag.  Bruce Guenter's bcron was designed from
the start to be run under a service manager.

* http://untroubled.org./bcron/bcron.html
* http://www.jimpryor.net./linux/dcron.html
* http://fcron.free.fr./

My thanks (and I suspect that of a lot of other people) to those who are
and have been, program by program, enabling us to be rid of all of the
unnecessary forking.
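As an illustration of what such a flag amounts to, here is a minimal
sketch of a daemon start-up path with a -f option. The program name
"exampled" and the function names are hypothetical; this is not the code
of any of the crons named above. The flag only decides whether the
process forks, and changes nothing about logging or debugging.

```python
import argparse
import os


def parse_args(argv):
    """Parse the command line of a hypothetical daemon named exampled."""
    parser = argparse.ArgumentParser(prog="exampled")
    parser.add_argument("-f", "--foreground", action="store_true",
                        help="stay in the foreground; do not fork")
    return parser.parse_args(argv)


def daemonize():
    """Legacy double fork, for when no service manager is in use."""
    if os.fork() > 0:
        os._exit(0)        # first parent exits
    os.setsid()            # become a session leader
    if os.fork() > 0:
        os._exit(0)        # second parent exits; grandchild carries on


def startup(argv, serve):
    """Run serve() either directly or after the legacy double fork."""
    args = parse_args(argv)
    if not args.foreground:
        daemonize()
    return serve()
```

Under a service manager the run program would invoke something like
"exampled -f", so that the supervised process is the daemon itself.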

From owner-freebsd-hackers@freebsd.org Sun Jan 31 14:45:23 2016
To: FreeBSD Hackers
From: Jonathan de Boyne Pollard
Subject: Re: syslogd(8) with OOM Killer protection
Message-ID: <56AE1DF7.9020402@NTLWorld.com>
Date: Sun, 31 Jan 2016 14:45:11 +0000

Warren Block:
> Possibly simpler to provide a list in one setting than an individual
> setting for each daemon. With ideas from other posters:
>
> oomprotect_daemons="crond syslogd"

Let me add my voice to Ian Lepore, Willem Jan Withagen, Allan Jude, and
Alan Somers.  M. Withagen makes a good point that this is difficult to
machine-update and manage with system and package management tools.  I
add to that, from the point of view of one whose programs will have to
parse this, it's difficult to machine-parse.  One has to process two
levels of quoting (if one is doing it safely).  This really is not a
simpler mechanism for the computers.

The simpler rc.conf mechanism definitely is the straightforward
per-service yes/no ${service}_oomprotect flag.  And that's what I have
just implemented in my toolset.  Yes, I've used the name "oomprotect",
M. Lepore.  (-:

root # rcctl get syslogd
flags=-c -ss
root # rcctl set syslogd oomprotect YES
root # rcctl get syslogd
flags=-c -ss
oomprotect=YES
root # cat "`rcctl find syslogd`"/service/run
#!/bin/nosh
#Run file generated from services/syslogd.service
#Vanilla BSD syslog daemon
oom-kill-protect -- fromenv
envdir env
sh -c "exec syslogd ${flags}"
root #

From owner-freebsd-hackers@freebsd.org Sun Jan 31 15:08:12 2016
To: FreeBSD Hackers
From: Jonathan de Boyne Pollard
Subject: Re: syslogd(8) with OOM Killer protection
Message-ID: <56AE2351.6030000@NTLWorld.com>
Date: Sun, 31 Jan 2016 15:08:01 +0000

Willem Jan Withagen:
> I'm trying to keep settings per deamon together in a file in
> /etc/rc.conf.d/, and load configs from there.
> This makes daemon managment from external tools (puppet etc) a LOT
> easier.
> It can just copy a default file into /etc/rc.conf.d if it wants a
> daemon available on a server.

Perhaps I can interest you in a system where the settings, the daemon
start/run/restart/stop programs, and whatever ancillaries the daemon may
care to have in its working directory, are all contained in a single
directory hierarchy, such as /var/sv/syslogd for a
(non-socket-inheriting) syslogd service.

JdeBP /var/sv/syslogd $ ls
after      conflicts  required-by  stopped-by  wants
before     log        service      wanted-by
JdeBP /var/sv/syslogd $ ls service
down  env  restart  run  start  stop
JdeBP /var/sv/syslogd $ ls service/env
flags  oomprotect
JdeBP /var/sv/syslogd $

It is a service bundle, and part of the idea is that it is just a
directory tree that can be archived up and copied around.  The
nosh-bundles package contains 739 such service bundles.

* http://homepage.ntlworld.com./jonathan.deboynepollard/Softwares/nosh/freebsd-binary-packages.html#Bundles

From owner-freebsd-hackers@freebsd.org Sun Jan 31 15:27:58 2016
In-Reply-To: <20160130071346.31022.37189@wrigleys.postgresql.org>
References: <20160130071346.31022.37189@wrigleys.postgresql.org>
Date: Sun, 31 Jan 2016 23:27:56 +0800
Subject: Fwd: [BUGS] BUG #13900: stop standby failed with writer process hang(happen 3 times in 2 days)
From: Jov
To: freebsd-hackers@freebsd.org

How can I help to debug this problem?  The process is still there in
the Ds state.

---------- Forwarded message ----------
From:
Date: January 30, 2016 3:14 PM
Subject: [BUGS] BUG #13900: stop standby failed with writer process hang(happen 3 times in 2 days)
To:
Cc:

The following bug has been logged on the website:

Bug reference:      13900
Logged by:          Jov
Email address:      amutu@amutu.com
PostgreSQL version: 9.3.7
Operating system:   FreeBSD 10.2 amd64
Description:

I am upgrading my 3 databases from pg9.3 to pg9.5, but may have found a
bug in the bgwriter of pg9.3.  I can't stop all the standby processes:
even with immediate stop mode and kill -9, the writer process is still
there, with ps state "Ds" (D marks a process in disk (or other short
term, uninterruptible) wait).  Google says the only method to clean up
the "Ds" process is rebooting the system.

truss shows no info for the process, and procstat says the process is
calling the poll system call in the kernel.

Here is the detailed info:

pg_ctl -D ./slave stop -m fast
waiting for server to shut down............................................................... failed
pg_ctl: server does not shut down

psql postgres
psql: FATAL: the database system is shutting down

pg_ctl -D ./slave stop -m immediate
waiting for server to shut down.... done
server stopped

ps auxwww | grep postgres
jovz 976 0.0 0.3 28840 5232 - Is 17 116 0:00.04 postgres: logger process (postgres)
jovz 979 0.0 0.7 196940 13552 - Ds 17 116 0:06.03 postgres: writer process (postgres)

log:
2016-01-30 14:23:22.350 CST,,,947,,569b1bc2.3b3,3,,2016-01-17 12:42:42 CST,,0,LOG,00000,"received fast shutdown request",,,,,,,,,""
2016-01-30 14:23:22.350 CST,,,947,,569b1bc2.3b3,4,,2016-01-17 12:42:42 CST,,0,LOG,00000,"aborting any active transactions",,,,,,,,,""
2016-01-30 14:25:35.271 CST,,,64815,"",56ac575f.fd2f,1,"",2016-01-30 14:25:35 CST,,0,LOG,00000,"connection received: host=[local]",,,,,,,,,""
2016-01-30 14:25:35.274 CST,"jovz","f",64815,"[local]",56ac575f.fd2f,2,"",2016-01-30 14:25:35 CST,,0,FATAL,57P03,"the database system is shutting down",,,,,,,,,""
2016-01-30 14:25:38.324 CST,,,64817,"",56ac5762.fd31,1,"",2016-01-30 14:25:38 CST,,0,LOG,00000,"connection received: host=[local]",,,,,,,,,""
2016-01-30 14:25:38.324 CST,"jovz","f",64817,"[local]",56ac5762.fd31,2,"",2016-01-30 14:25:38 CST,,0,FATAL,57P03,"the database system is shutting down",,,,,,,,,""
2016-01-30 14:47:36.727 CST,,,65457,"",56ac5c88.ffb1,1,"",2016-01-30 14:47:36 CST,,0,LOG,00000,"connection received: host=[local]",,,,,,,,,""
2016-01-30 14:47:36.727 CST,"jovz","postgres",65457,"[local]",56ac5c88.ffb1,2,"",2016-01-30 14:47:36 CST,,0,FATAL,57P03,"the database system is shutting down",,,,,,,,,""
2016-01-30 14:50:04.564 CST,,,947,,569b1bc2.3b3,5,,2016-01-17 12:42:42 CST,,0,LOG,00000,"received immediate shutdown request",,,,,,,,,""

truss -p 979
^Ctruss: Unexpect stop in waitpid: Interrupted system call

root@fblax:~ # procstat -kk 979
  PID    TID COMM     TDNAME KSTACK
  979 100688 postgres -      mi_switch+0xe1 sleepq_timedwait_sig+0x8b _cv_timedwait_sig_sbt+0x18b seltdwait+0xa4 kern_poll+0x464 sys_poll+0x61 amd64_syscall+0x357 Xfast_syscall+0xfb
root@fb:~ # kill -9 979
root@fb:~ # procstat -kk 979
  PID    TID COMM     TDNAME KSTACK
  979 100688 postgres -      mi_switch+0xe1 sleepq_timedwait_sig+0x8b _cv_timedwait_sig_sbt+0x18b seltdwait+0xa4 kern_poll+0x464 sys_poll+0x61 amd64_syscall+0x357 Xfast_syscall+0xfb

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

From owner-freebsd-hackers@freebsd.org Sun Jan 31 15:33:08 2016
To: FreeBSD Hackers
References: <56AA047D.8070807@digiware.nl>
From: Jonathan de Boyne Pollard
Subject: Re: syslogd(8) with OOM Killer protection
Message-ID: <56AE2929.304@NTLWorld.com>
Date: Sun, 31 Jan 2016 15:32:57 +0000
In-Reply-To: <56AA047D.8070807@digiware.nl>

Eugene Grosbein:
> protection of single process is meaningless because it forks to become
> daemon and that ceases protection;

This premise is erroneous, and the conclusion that you've based upon it
is erroneous too.  Daemons that run under service managers do not need
to fork "to become [a] daemon".  Indeed, they *already are* daemons
right from the start.  As Jan Bramkamp said elsewhere:

> I would prefer to implement [a] flag keeping cron (and all other base
> system daemons) from double-forking and run it under a process
> supervisor like daemontools.

And as I have pointed out, this is already the case over a wide range of
daemon softwares nowadays.  Thus the use of "protect" is feasible, since
proper service-manager-managed daemons end up as the same process as the
process that ran "protect".

Indeed, chain-loading utilities like "protect" are the basics of the
daemontools way of doing things.  There is a broad range of tools whose
purpose is to affect process state in one particular aspect and then
chain to another program using what's left in the argument vector.

Eugene Grosbein:
> Perhaps, we could have kernel facility [...]

There's no need for new kernel facilities here if one uses a service
manager and throws away the wrongheaded idea that daemons need to
*become* daemons under their own steam.
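The chain-loading pattern described above can be sketched in a few lines.
This is a hypothetical tool, here called set-umask, written only to show
the shape of the technique; it is not the source of "protect" or of any
daemontools or nosh utility. It changes one aspect of process state and
then replaces itself with whatever program is named by the rest of the
argument vector, so no fork happens and the process ID stays the same.

```python
import os


def split_chain(argv):
    """Split ['set-umask', MASK, prog, args...] into (mask, [prog, args...])."""
    if len(argv) < 3:
        raise SystemExit("usage: set-umask MASK prog [args...]")
    return int(argv[1], 8), argv[2:]


def main(argv):
    """Apply the umask, then chain to the next program in the vector."""
    mask, rest = split_chain(argv)
    os.umask(mask)             # the one aspect of process state changed
    os.execvp(rest[0], rest)   # replaces this process image; does not return
```

Because each such tool ends in an exec rather than a fork, several can be
stacked on one command line, and the daemon at the end of the chain is
still the process that the service manager started.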
(-:

From owner-freebsd-hackers@freebsd.org Sun Jan 31 16:29:38 2016
Date: Sun, 31 Jan 2016 18:29:29 +0200
From: Konstantin Belousov
To: Jov
Cc: freebsd-hackers@freebsd.org
Subject: Re: Fwd: [BUGS] BUG #13900: stop standby failed with writer process hang(happen 3 times in 2 days)
Message-ID: <20160131162929.GI91220@kib.kiev.ua>
References: <20160130071346.31022.37189@wrigleys.postgresql.org>

On Sun, Jan 31, 2016 at 11:27:56PM +0800, Jov wrote:
> root@fb:~ # procstat -kk 979
> PID TID COMM TDNAME KSTACK
>
> 979 100688 postgres - mi_switch+0xe1
> sleepq_timedwait_sig+0x8b _cv_timedwait_sig_sbt+0x18b seltdwait+0xa4
> kern_poll+0x464 sys_poll+0x61 amd64_syscall+0x357 Xfast_syscall+0xfb
>
Show me 'ps axlHww | grep 979' output.

From owner-freebsd-hackers@freebsd.org Mon Feb 1 01:03:20 2016
Subject: nosh version 1.25
To: "supervision@list.skarnet.org", FreeBSD Hackers,
 debian-user@lists.debian.org
References: <54430B41.3010301@NTLWorld.com> <54B86FD5.3090203@NTLWorld.com>
 <554E53EF.4080600@NTLWorld.com> <554E93AF.3070709@NTLWorld.com>
 <556BA130.50708@NTLWorld.com> <55902328.8080602@NTLWorld.com>
 <55D5CFA2.5010402@NTLWorld.com> <55D8B9AC.6010209@NTLWorld.com>
 <56089268.6080007@NTLWorld.com> <56120D11.4080506@NTLWorld.com>
 <5636C75B.70000@NTLWorld.com> <5672BD8C.50303@NTLWorld.com>
 <569617F3.8000101@NTLWorld.com>
From: Jonathan de Boyne Pollard
Message-ID: <56AEAED5.4010606@NTLWorld.com>
Date: Mon, 1 Feb 2016 01:03:17 +0000
In-Reply-To: <569617F3.8000101@NTLWorld.com>

The nosh package is now up to version 1.25 .

* http://homepage.ntlworld.com./jonathan.deboynepollard/Softwares/nosh.html
* https://www.freebsd.org/news/status/report-2015-07-2015-09.html#The-nosh-Project

As you may have noticed from discussions elsewhere, a new
oom-kill-protect utility has snuck in at the last moment.  This takes
Linux-style OOM Killer score adjustments (an integer between -1000 and
1000), BSD-style binary YES/NO settings, or a special setting for
querying the "oomprotect" environment variable; and tries to do the
closest matching thing for each platform.  Details are in the manual, of
course.  With this, the OOMScoreAdjust setting is now converted by the
convert-systemd-units utility.

The local-syslogd, udp-syslogd, and syslogd service bundles make use of
oom-kill-protect with the special environment variable setting in their
run programs.  So FreeBSD bug #204741 is addressed in a more general
fashion that can be easily used in other service bundles.  "rcctl set
syslogd oomprotect YES" and "rcctl set syslogd oomprotect NO" can be
used to turn OOM Killer protection on and off.
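To make the three kinds of setting concrete, here is a rough sketch of
how such a setting might be normalized. It is not the oom-kill-protect
source, and the manual remains the authority. The "fromenv" keyword and
the "oomprotect" variable name come from the run file and announcement;
everything else is an assumption made for illustration, in particular
the default for an unset variable and the mappings between YES/NO and
score adjustments.

```python
import os

ENV_VAR = "oomprotect"  # variable name taken from the announcement


def resolve(setting, environ=None):
    """Turn a setting into ('protect', bool) or ('score', int)."""
    environ = os.environ if environ is None else environ
    if setting == "fromenv":
        # Assumption: an unset variable means "no protection".
        setting = environ.get(ENV_VAR, "NO")
    if setting.upper() in ("YES", "NO"):
        return ("protect", setting.upper() == "YES")
    score = int(setting)  # raises ValueError for anything non-numeric
    if not -1000 <= score <= 1000:
        raise ValueError("score adjustment out of range")
    return ("score", score)


def linux_score(setting, environ=None):
    """Closest Linux-style adjustment (assumed mapping: YES is -1000)."""
    kind, value = resolve(setting, environ)
    if kind == "protect":
        return -1000 if value else 0
    return value


def bsd_protect(setting, environ=None):
    """Closest BSD-style flag (assumed mapping: any negative score protects)."""
    kind, value = resolve(setting, environ)
    if kind == "score":
        return value < 0
    return value
```

A real tool would then apply the result on the running platform and
chain to the next program; that part is deliberately left out here.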
Other things in this version include: * More configuration import utilities, covering ip6addrctl, webcamd, and NFS settings. * A fix for a problem with configuration import on Linux in version 1.24. * Two minor utilities for querying the fstab database, get-mount-what and get-mount-where, needed by the configuration import for mdconfig (but generally usable). * New binary "run-" packages for OpenSSH server, syslog on a local socket, and klog. * The new syslog and klog packages provide the Debian package manager's virtual package names "linux-kernel-log-daemon" and "system-log-daemon" (per Debian Bug #67604). As can be seen from the roadmap, we are nearing the end of the rc.d conversion for FreeBSD. Additions in this release include nfsserver, gptboot, rtadvd, virecover, and pcdm. Almost all of mdconfig is actually done, bar some after/before orderings. * http://homepage.ntlworld.com./jonathan.deboynepollard/Softwares/nosh/roadmap.html#FreeBSDrc.d From owner-freebsd-hackers@freebsd.org Mon Feb 1 02:07:39 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 5DBEBA74E2F for ; Mon, 1 Feb 2016 02:07:39 +0000 (UTC) (envelope-from zhao6014@gmail.com) Received: from mail-yk0-x236.google.com (mail-yk0-x236.google.com [IPv6:2607:f8b0:4002:c07::236]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 1F3F911A9 for ; Mon, 1 Feb 2016 02:07:39 +0000 (UTC) (envelope-from zhao6014@gmail.com) Received: by mail-yk0-x236.google.com with SMTP id r207so87103250ykd.2 for ; Sun, 31 Jan 2016 18:07:39 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type; 
bh=aTeCZ6nwT7OpwIh4rtdLVNs6SZTJ8ph/TgmUb+ZmS3Y=; b=hKqybCRThyMf0l2w3WoGOfV43nntfBx9iNC827kVfyGH+nvIhYumqfIyiSnMEd+zeT XmAn7EYPYeirbEDpRa7iUre52Co+N7qDJiwSMOqfml1LQXfy+2JpSmdP98VpKJJPqB1o z212K9in5+cCbjwj52r43zwnYdkhqcX6ZEl/nPDAsHa3RhsOVyoMdGTJE+0MzwlD8djd OU6yy0JzS3Gqxbh69Soo3GISzH1EMiciRz6O5EeSSePBbynXQqKrH1UlMGyPFsAxyiMp Pyxw27ytGgyu7gIu9EX6Z3lwqUXNmwALzO/QdfLZpFrLOdoUnU24nO4zo6QJ6EUVI9fj R4pg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc:content-type; bh=aTeCZ6nwT7OpwIh4rtdLVNs6SZTJ8ph/TgmUb+ZmS3Y=; b=caIlS64uE5uYQY26r90XNeb6hc8u7RyKh422ptInQbK5z0K6EEzZD8qtW4ew1lQ93Y 9ngPgFIMdhCAXMfJUXifM1yGRQ1/MbVq50v8aQNNiEDB/ehi9lHRlvY5DjaApiPAqKxI zl24os+9iO/I5qrzje+g/LlBxU4tCvXbWZFsrCGfbM2FAL4x4A+b8YQq/RhxNhWkurxP rWk/c2PYRRh8FfKMZM++fDv8p291Rl/kFDnrYWUEDKuhchBh4y65pTMvLRvbjT/PV+fM NvFVGMXYzm6n1wjd1pZ2ae8NXyp6eoZ/YmY75yrdATv96uuXx5mORNlJtLOvaL1FM0Gj bjOA== X-Gm-Message-State: AG10YOSeM5+RuTKK6GF4MLi0IbY4s0bO2bk6gYnAml/5+xJdt25C0FOh64F1nB8vJ9YA3lZmw5tTXqNdCIFHmg== X-Received: by 10.129.91.132 with SMTP id p126mr10354678ywb.188.1454292458185; Sun, 31 Jan 2016 18:07:38 -0800 (PST) MIME-Version: 1.0 Received: by 10.37.79.6 with HTTP; Sun, 31 Jan 2016 18:07:18 -0800 (PST) In-Reply-To: <20160131162929.GI91220@kib.kiev.ua> References: <20160130071346.31022.37189@wrigleys.postgresql.org> <20160131162929.GI91220@kib.kiev.ua> From: Jov Date: Mon, 1 Feb 2016 10:07:18 +0800 Message-ID: Subject: Re: Fwd: [BUGS] BUG #13900: stop standby failed with writer process hang(happen 3 times in 2 days) To: Konstantin Belousov Cc: freebsd-hackers@freebsd.org Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.20 X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
X-List-Received-Date: Mon, 01 Feb 2016 02:07:39 -0000

Thanks, Belousov! This is the output:

root@fb:~ # ps axlHww | grep 979
1001 979 1 0 4 0 196940 8236 - Ds - 0:06.03 postgres: writer process (postgres)
0 48614 48587 0 20 0 18824 2252 piperd S+ 0 0:00.00 grep 979

I have used PostgreSQL on FreeBSD for more than 3 years, and this is the first time I have had problems. I suspect the problem is related to the security hardening I applied to FreeBSD; I added these entries to all of my servers about one month ago:

security.bsd.see_other_uids=0
security.bsd.see_other_gids=0
security.bsd.unprivileged_read_msgbuf=0
security.bsd.unprivileged_proc_debug=0
security.bsd.hardlink_check_uid=1
security.bsd.hardlink_check_gid=1
security.bsd.stack_guard_page=1
security.bsd.unprivileged_mlock=0

Jov
blog: http:amutu.com/blog

2016-02-01 0:29 GMT+08:00 Konstantin Belousov :

> On Sun, Jan 31, 2016 at 11:27:56PM +0800, Jov wrote:
> > root@fb:~ # procstat -kk 979
> > PID TID COMM TDNAME KSTACK
> >
> > 979 100688 postgres - mi_switch+0xe1
> > sleepq_timedwait_sig+0x8b _cv_timedwait_sig_sbt+0x18b seltdwait+0xa4
> > kern_poll+0x464 sys_poll+0x61 amd64_syscall+0x357 Xfast_syscall+0xfb
> >
> Show me 'ps axlHww | grep 979' output.
>

From owner-freebsd-hackers@freebsd.org Mon Feb 1 02:28:09 2016
Return-Path:
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C8708A754D0; Mon, 1 Feb 2016 02:28:09 +0000 (UTC) (envelope-from jceel@FreeBSD.org)
Received: from mail1.uj.edu.pl (mail1.uj.edu.pl [149.156.89.193]) by mx1.freebsd.org (Postfix) with ESMTP id 8AC8119DC; Mon, 1 Feb 2016 02:28:06 +0000 (UTC) (envelope-from jceel@FreeBSD.org)
MIME-version: 1.0
Content-transfer-encoding: 7BIT
Content-type: text/plain; CHARSET=US-ASCII
Received: from [192.168.0.2] ([89.79.116.100]) by mta.uoks.uj.edu.pl (Oracle Communications Messaging Server 7u4-27.01 (7.0.4.27.0) 64bit (built Aug 30 2012)) with ESMTPSA id <0O1U00BCXJYAWR10@mta.uoks.uj.edu.pl>; Mon, 01 Feb 2016 03:22:59 +0100 (CET)
From: jceel@FreeBSD.org
Subject: VirtFS support in bhyve
Message-id: <0E724C32-17FB-489A-B6E0-119CE17470E6@FreeBSD.org>
Date: Mon, 01 Feb 2016 03:22:58 +0100
To: freebsd-hackers@FreeBSD.org, freebsd-virtualization@FreeBSD.org
X-Mailer: Apple Mail (2.3094)
X-Virus-Status: Clean
X-Virus-Scanned: clamav-milter 0.98.6 at clamav1
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 01 Feb 2016 02:28:09 -0000

Hello,

I'm working on virtio-9p (so-called VirtFS) support in bhyve. The project consists of two parts: the BSD-licensed lib9p library and the actual virtio-9p driver. Right now it is able to do filesystem passthrough to Linux guests using the 9P2000.u protocol.
You can check it out here: https://github.com/jceel/freebsd/tree/virtfs

Syntax:

bhyve side: append `-s ,virtio-9p,sharename=/host/path`
Linux side: `mount -t 9p -o trans=virtio -o version=9p2000.u sharename /mnt/guest/path`

Using 9p as the root filesystem for Linux guests should work too.

Plans:

- Definitely in-kernel 9pfs filesystem support for FreeBSD guests using the same lib9p library
- 9P2000.L support (adds ACLs, extattrs, file locks, atomic reads/writes and so on)
- Filesystem backend using AIO
- Ability to export multiple trees for different "aname" values using one virtio-9p device (that's actually low-hanging fruit)

I'm looking forward to your feedback - keep in mind that this is totally experimental/incomplete/nonworking code.

Jakub.

From owner-freebsd-hackers@freebsd.org Mon Feb 1 05:13:08 2016
Return-Path:
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 30A44A901F4 for ; Mon, 1 Feb 2016 05:13:08 +0000 (UTC) (envelope-from mjguzik@gmail.com)
Received: from mail-wm0-x242.google.com (mail-wm0-x242.google.com [IPv6:2a00:1450:400c:c09::242]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id C03CE1649; Mon, 1 Feb 2016 05:13:07 +0000 (UTC) (envelope-from mjguzik@gmail.com)
Received: by mail-wm0-x242.google.com with SMTP id 128so7385718wmz.3; Sun, 31 Jan 2016 21:13:07 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=from:to:cc:subject:date:message-id; bh=DfztBr18ciYGbYcV9W/qr/f2a5CAhkFVKolSovxnLTc=; b=TzS+6YE3MXOXAqAfqr6kXWn2QmlktfhvqG4yByXKqQRY+feCQWioXD0KbvF6/b2jlk Amop4DUdjw76S8hCsSofC1cCOukDrfrMHxA1HCZbF11xY6i2hDJRKBKn+mxLfJHjjChv fwrpITwgYg52lac8VM5/PGXTR4gn/NAwiIFndpqjD8/ve7zphjtDb44zv69PgK02GNow
q0ayf1Yy9OzbfMpOzBjC6xFsAusppM56jirY2oMmobZ74IBkvWeBipaEk3q483SGN82j ufH5aDK1Sriq6jugtupozVbzh0SSZe1Cp+cP8e/cumisx2rXekOSexo8uJQnOLbn4jMN dIQA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:to:cc:subject:date:message-id; bh=DfztBr18ciYGbYcV9W/qr/f2a5CAhkFVKolSovxnLTc=; b=TfRmQ+s5sD1FHJaGdK8MkB53Q393wlw5ichXtBkKuYm5OlTy49c4IZyZ7W2xyQaXVe uWFtrErJtcNvBX7cYPJhvczxiOteJIelSMPyjuBxv10Z4swbG5/uplWQlxlZSE0UYV1g cfWnOg53tkiGfy2zlNTn2xO/+mQfCdFXSEDlOnzy9dh+YUJWsxt5TQIAbvn3W4srkmxJ 3ITZWrsb1AmtmjFQp8Wi708tR0PeZNwTk2HDh/gUHUDpkSjQgDlMvE4jN58HzOEWUn93 oLjzmhMcjXFuBK+fmtJABltqIIq40o9sDpYDwcnjxDX8VvKxmyVLis/fRnXWVtvVC6gP bb4A==
X-Gm-Message-State: AG10YORWlaiy8Y/ov+tyzvmDcYAn7Jys11m8fjSfNCQN+8V54HUbEsjm+16UJcQAze7G/Q==
X-Received: by 10.194.201.134 with SMTP id ka6mr20467056wjc.116.1454303586215; Sun, 31 Jan 2016 21:13:06 -0800 (PST)
Received: from mguzik.localdomain (ip-62-245-66-110.net.upcbroadband.cz. [62.245.66.110]) by smtp.gmail.com with ESMTPSA id x6sm27239437wje.38.2016.01.31.21.13.05 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sun, 31 Jan 2016 21:13:05 -0800 (PST)
From: Mateusz Guzik
To: freebsd-hackers@freebsd.org
Cc: kib@freebsd.org, Mateusz Guzik
Subject: [PATCH 0/2] plug fork use-after-free
Date: Mon, 1 Feb 2016 06:13:02 +0100
Message-Id: <1454303584-20941-1-git-send-email-mjguzik@gmail.com>
X-Mailer: git-send-email 2.4.3
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 01 Feb 2016 05:13:08 -0000

From: Mateusz Guzik

Quite some time ago I reported a problem with fork and provided a half-assed patch, see: https://lists.freebsd.org/pipermail/freebsd-hackers/2014-October/046212.html

Now I got around to fixing the problem in a less hackish manner.
Note that despite the new process possibly exiting immediately and being waited on, returning its (possibly now reused) PID is fine - that's the PID it possibly saw by other means, and in the worst case the process is racing with itself.

To reiterate, as it is, the code has a use-after-free in procdesc and racct handling.

The first patch is a small cleanup to reduce the number of arguments to fork1, which was getting out of hand. I don't feel strongly about the name of the structure used in there.

Mateusz Guzik (2):
  fork: move procdesc-related parameters into a dedicated struct
  fork: plug a use after free of the returned process pointer

 sys/compat/cloudabi/cloudabi_proc.c |  11 ++--
 sys/compat/linux/linux_fork.c       |   6 +-
 sys/kern/init_main.c                |   2 +-
 sys/kern/kern_fork.c                | 125 ++++++++++++++++++++----------------
 sys/kern/kern_kthread.c             |   2 +-
 sys/sys/proc.h                      |   5 +-
 sys/sys/procdesc.h                  |   6 ++
 7 files changed, 91 insertions(+), 66 deletions(-)

--
2.7.0

From owner-freebsd-hackers@freebsd.org Mon Feb 1 05:13:09 2016
Return-Path:
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 105F1A901FD for ; Mon, 1 Feb 2016 05:13:09 +0000 (UTC) (envelope-from mjguzik@gmail.com)
Received: from mail-wm0-x243.google.com (mail-wm0-x243.google.com [IPv6:2a00:1450:400c:c09::243]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 931591721; Mon, 1 Feb 2016 05:13:08 +0000 (UTC) (envelope-from mjguzik@gmail.com)
Received: by mail-wm0-x243.google.com with SMTP id r129so7399556wmr.0; Sun, 31 Jan 2016 21:13:08 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=iO718Z7SXmzwsSX+wqOtiHa/mszOn8Soc7LzKqlX2DQ=;
b=cPtLqytbn8Qu1SyiFigd9IX6YWzeAlECJJO4NGMMz23tuXOfjsDzz15IWsBGenWA9G yM+VmSoFtnLlKZM/b6isJ1drZAQNwmbJIrvqCv9R8yn/WKueRX0UIRVdtVipiUj277/S AUdtYi1HWCPTeJR2vS1MCLdnSOF1oYmh2t2mr86t75l3TPbTnp3jqptUzXrLUyMaa2De Mosxn71UDoGgEpaQ4kN+ROjrEh3BUjztZvgzx5oAN7jGalv8IJtnzpvQi22klhRkXUr+ NP91K9rX2KnOjZ11O4TkJk0lodt+t7fmJt1V4JKPAFDTKRrYDgxt3F1bD9wBj1JsxlSp KbPQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=iO718Z7SXmzwsSX+wqOtiHa/mszOn8Soc7LzKqlX2DQ=; b=V4Q2BvkizbyTZt7Zw9wrU2vcR6gYj9qi0hAV07GUCSB9GN9g6H+ZcUiFFfplBb+wGZ MjHX7xC0VzVOI4a20WZ1AZfHTpR8L0k17DYkwBRcVare9w0GhbD2MK0LtT4NrAECLlc5 5g5LEtY+gnfdePRmP5HPNBVjVxb//NSIpIow+uz9GTpyuUTsqimYjCbvk8B1bJ78eOE8 dDIFL/2dxoo3jef9oh+I/KPZM2UIY1lhWg2KMZhctNO1VK9wXbzHzdixqxhWkc11SP3a 648qm188LgfsGYLE0MBSxUeBOxSawHgLA0Nl242YFJfUKgXw167vy0+dKh4AyxaI+zmP YiKQ== X-Gm-Message-State: AG10YOQPocyMoNUzzRO9cTtgpBumGGHlH7n+8IO7ZcAtHpGZzrwekVww79Tk9tI/8IpuEg== X-Received: by 10.28.97.135 with SMTP id v129mr9659144wmb.78.1454303587113; Sun, 31 Jan 2016 21:13:07 -0800 (PST) Received: from mguzik.localdomain (ip-62-245-66-110.net.upcbroadband.cz. 
[62.245.66.110]) by smtp.gmail.com with ESMTPSA id x6sm27239437wje.38.2016.01.31.21.13.06 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sun, 31 Jan 2016 21:13:06 -0800 (PST) From: Mateusz Guzik To: freebsd-hackers@freebsd.org Cc: kib@freebsd.org, Mateusz Guzik Subject: [PATCH 1/2] fork: move procdesc-related parameters into a dedicated struct Date: Mon, 1 Feb 2016 06:13:03 +0100 Message-Id: <1454303584-20941-2-git-send-email-mjguzik@gmail.com> X-Mailer: git-send-email 2.4.3 In-Reply-To: <1454303584-20941-1-git-send-email-mjguzik@gmail.com> References: <1454303584-20941-1-git-send-email-mjguzik@gmail.com> X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 01 Feb 2016 05:13:09 -0000 From: Mateusz Guzik This reduces the number of arguments to fork1. --- sys/compat/cloudabi/cloudabi_proc.c | 10 +++++++--- sys/compat/linux/linux_fork.c | 7 +++---- sys/kern/init_main.c | 2 +- sys/kern/kern_fork.c | 29 +++++++++++++++++------------ sys/kern/kern_kthread.c | 2 +- sys/sys/proc.h | 5 +++-- sys/sys/procdesc.h | 6 ++++++ 7 files changed, 38 insertions(+), 23 deletions(-) diff --git a/sys/compat/cloudabi/cloudabi_proc.c b/sys/compat/cloudabi/cloudabi_proc.c index d917337..e98471b 100644 --- a/sys/compat/cloudabi/cloudabi_proc.c +++ b/sys/compat/cloudabi/cloudabi_proc.c @@ -28,6 +28,7 @@ __FBSDID("$FreeBSD$"); #include #include +#include #include #include #include @@ -75,16 +76,19 @@ int cloudabi_sys_proc_fork(struct thread *td, struct cloudabi_sys_proc_fork_args *uap) { + struct procdesc_req pdr; struct filecaps fcaps = {}; struct proc *p2; - int error, fd; + int error; + pdr.pdr_flags = 0; + pdr.pdr_fcaps = &fcaps; cap_rights_init(&fcaps.fc_rights, CAP_FSTAT, CAP_EVENT); - error = fork1(td, RFFDG | RFPROC | RFPROCDESC, 0, &p2, &fd, 0, &fcaps); + error = fork1(td, RFFDG | 
RFPROC | RFPROCDESC, 0, &p2, &pdr); if (error != 0) return (error); /* Return the file descriptor to the parent process. */ - td->td_retval[0] = fd; + td->td_retval[0] = pdr.pdr_fd; return (0); } diff --git a/sys/compat/linux/linux_fork.c b/sys/compat/linux/linux_fork.c index d0f73ad..7cbe216 100644 --- a/sys/compat/linux/linux_fork.c +++ b/sys/compat/linux/linux_fork.c @@ -73,8 +73,7 @@ linux_fork(struct thread *td, struct linux_fork_args *args) printf(ARGS(fork, "")); #endif - if ((error = fork1(td, RFFDG | RFPROC | RFSTOPPED, 0, &p2, NULL, 0, - NULL)) != 0) + if ((error = fork1(td, RFFDG | RFPROC | RFSTOPPED, 0, &p2, NULL)) != 0) return (error); td2 = FIRST_THREAD_IN_PROC(p2); @@ -107,7 +106,7 @@ linux_vfork(struct thread *td, struct linux_vfork_args *args) #endif if ((error = fork1(td, RFFDG | RFPROC | RFMEM | RFPPWAIT | RFSTOPPED, - 0, &p2, NULL, 0, NULL)) != 0) + 0, &p2, NULL)) != 0) return (error); td2 = FIRST_THREAD_IN_PROC(p2); @@ -170,7 +169,7 @@ linux_clone_proc(struct thread *td, struct linux_clone_args *args) if (args->flags & LINUX_CLONE_VFORK) ff |= RFPPWAIT; - error = fork1(td, ff, 0, &p2, NULL, 0, NULL); + error = fork1(td, ff, 0, &p2, NULL); if (error) return (error); diff --git a/sys/kern/init_main.c b/sys/kern/init_main.c index 8d5580b..7d0443a 100644 --- a/sys/kern/init_main.c +++ b/sys/kern/init_main.c @@ -833,7 +833,7 @@ create_init(const void *udata __unused) int error; error = fork1(&thread0, RFFDG | RFPROC | RFSTOPPED, 0, &initproc, - NULL, 0, NULL); + NULL); if (error) panic("cannot fork init: %d\n", error); KASSERT(initproc->p_pid == 1, ("create_init: initproc->p_pid != 1")); diff --git a/sys/kern/kern_fork.c b/sys/kern/kern_fork.c index e7d7276..8cc56b7 100644 --- a/sys/kern/kern_fork.c +++ b/sys/kern/kern_fork.c @@ -104,7 +104,7 @@ sys_fork(struct thread *td, struct fork_args *uap) int error; struct proc *p2; - error = fork1(td, RFFDG | RFPROC, 0, &p2, NULL, 0, NULL); + error = fork1(td, RFFDG | RFPROC, 0, &p2, NULL); if (error == 0) { 
td->td_retval[0] = p2->p_pid; td->td_retval[1] = 0; @@ -118,20 +118,22 @@ sys_pdfork(td, uap) struct thread *td; struct pdfork_args *uap; { - int error, fd; + struct procdesc_req pdr; struct proc *p2; + int error; /* * It is necessary to return fd by reference because 0 is a valid file * descriptor number, and the child needs to be able to distinguish * itself from the parent using the return value. */ - error = fork1(td, RFFDG | RFPROC | RFPROCDESC, 0, &p2, - &fd, uap->flags, NULL); + pdr.pdr_flags = uap->flags; + pdr.pdr_fcaps = NULL; + error = fork1(td, RFFDG | RFPROC | RFPROCDESC, 0, &p2, &pdr); if (error == 0) { td->td_retval[0] = p2->p_pid; td->td_retval[1] = 0; - error = copyout(&fd, uap->fdp, sizeof(fd)); + error = copyout(&pdr.pdr_fd, uap->fdp, sizeof(pdr.pdr_fd)); } return (error); } @@ -144,7 +146,7 @@ sys_vfork(struct thread *td, struct vfork_args *uap) struct proc *p2; flags = RFFDG | RFPROC | RFPPWAIT | RFMEM; - error = fork1(td, flags, 0, &p2, NULL, 0, NULL); + error = fork1(td, flags, 0, &p2, NULL); if (error == 0) { td->td_retval[0] = p2->p_pid; td->td_retval[1] = 0; @@ -163,7 +165,7 @@ sys_rfork(struct thread *td, struct rfork_args *uap) return (EINVAL); AUDIT_ARG_FFLAGS(uap->flags); - error = fork1(td, uap->flags, 0, &p2, NULL, 0, NULL); + error = fork1(td, uap->flags, 0, &p2, NULL); if (error == 0) { td->td_retval[0] = p2 ? 
p2->p_pid : 0; td->td_retval[1] = 0; @@ -762,14 +764,14 @@ do_fork(struct thread *td, int flags, struct proc *p2, struct thread *td2, int fork1(struct thread *td, int flags, int pages, struct proc **procp, - int *procdescp, int pdflags, struct filecaps *fcaps) + struct procdesc_req *pdr) { struct proc *p1, *newproc; struct thread *td2; struct vmspace *vm2; struct file *fp_procdesc; vm_ooffset_t mem_charged; - int error, nprocs_new, ok; + int error, nprocs_new, ok, pdflags; static int curfail; static struct timeval lastfail; @@ -789,14 +791,16 @@ fork1(struct thread *td, int flags, int pages, struct proc **procp, if ((flags & RFTSIGZMB) != 0 && (u_int)RFTSIGNUM(flags) > _SIG_MAXSIG) return (EINVAL); + pdflags = 0; if ((flags & RFPROCDESC) != 0) { /* Can't not create a process yet get a process descriptor. */ if ((flags & RFPROC) == 0) return (EINVAL); /* Must provide a place to put a procdesc if creating one. */ - if (procdescp == NULL) + if (pdr == NULL) return (EINVAL); + pdflags = pdr->pdr_flags; } p1 = td->td_proc; @@ -845,7 +849,8 @@ fork1(struct thread *td, int flags, int pages, struct proc **procp, * later. 
*/ if (flags & RFPROCDESC) { - error = falloc_caps(td, &fp_procdesc, procdescp, 0, fcaps); + error = falloc_caps(td, &fp_procdesc, &pdr->pdr_fd, 0, + pdr->pdr_fcaps); if (error != 0) goto fail2; } @@ -962,7 +967,7 @@ fail2: vmspace_free(vm2); uma_zfree(proc_zone, newproc); if ((flags & RFPROCDESC) != 0 && fp_procdesc != NULL) { - fdclose(td, fp_procdesc, *procdescp); + fdclose(td, fp_procdesc, pdr->pdr_fd); fdrop(fp_procdesc, td); } atomic_add_int(&nprocs, -1); diff --git a/sys/kern/kern_kthread.c b/sys/kern/kern_kthread.c index 2072dc7..0673f68 100644 --- a/sys/kern/kern_kthread.c +++ b/sys/kern/kern_kthread.c @@ -89,7 +89,7 @@ kproc_create(void (*func)(void *), void *arg, panic("kproc_create called too soon"); error = fork1(&thread0, RFMEM | RFFDG | RFPROC | RFSTOPPED | flags, - pages, &p2, NULL, 0, NULL); + pages, &p2, NULL); if (error) return error; diff --git a/sys/sys/proc.h b/sys/sys/proc.h index f2f4a9d..60efdcd 100644 --- a/sys/sys/proc.h +++ b/sys/sys/proc.h @@ -171,6 +171,7 @@ struct nlminfo; struct p_sched; struct proc; struct procdesc; +struct procdesc_req; struct racct; struct sbuf; struct sleepqueue; @@ -930,8 +931,8 @@ int enterpgrp(struct proc *p, pid_t pgid, struct pgrp *pgrp, int enterthispgrp(struct proc *p, struct pgrp *pgrp); void faultin(struct proc *p); void fixjobc(struct proc *p, struct pgrp *pgrp, int entering); -int fork1(struct thread *, int, int, struct proc **, int *, int, - struct filecaps *); +int fork1(struct thread *, int, int, struct proc **, + struct procdesc_req *); void fork_exit(void (*)(void *, struct trapframe *), void *, struct trapframe *); void fork_return(struct thread *, struct trapframe *); diff --git a/sys/sys/procdesc.h b/sys/sys/procdesc.h index 1a3bc98..ee7abf8 100644 --- a/sys/sys/procdesc.h +++ b/sys/sys/procdesc.h @@ -73,6 +73,12 @@ struct procdesc { struct mtx pd_lock; /* Protect data + events. 
*/ }; +struct procdesc_req { + int pdr_fd; + int pdr_flags; + struct filecaps *pdr_fcaps; +}; + /* * Locking macros for the procdesc itself. */ -- 2.7.0 From owner-freebsd-hackers@freebsd.org Mon Feb 1 05:13:10 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id DBCC9A90207 for ; Mon, 1 Feb 2016 05:13:09 +0000 (UTC) (envelope-from mjguzik@gmail.com) Received: from mail-wm0-x241.google.com (mail-wm0-x241.google.com [IPv6:2a00:1450:400c:c09::241]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 6A7B7172A; Mon, 1 Feb 2016 05:13:09 +0000 (UTC) (envelope-from mjguzik@gmail.com) Received: by mail-wm0-x241.google.com with SMTP id r129so7399585wmr.0; Sun, 31 Jan 2016 21:13:09 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=ymSBW8BFAjMPDajg056Yhg3alZEtpSgsEBbc8Etwm8c=; b=F7bUbp65horYjIDmTGPIl7k3++0AFhgAdYo+UseFdK8959buGgWG0p/UASwEry8hf+ Ecl9nH7M5ObrqT1YqjDATsO7CWw6PMh/iAGWAe+2laIzPleTAk9C7p6oz5r7JPVrdN5Y SW0RkoZEiZsURdi6R7n4pZJ1XTuyLThJkpThsl7iDjhyUG8bJK7ble5l6TKxF9LtGnWx PuRf6YV+KY0oAGoK72q456XVJ2seBZt5nCoFXgQipN3H01QTU/mCuyloesl5eJksSPDj jFutMFVEgq6dcS8HHcj7B/y36vTgjSBa1or+t6CZ5I2kv7lZeBzr+CIXCvA3PIABBNvy G7qg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=ymSBW8BFAjMPDajg056Yhg3alZEtpSgsEBbc8Etwm8c=; b=fj9AqswVBGLEZHm4QVPKst7mbjMCZQLNx2XeRmyBLoNIBar18zLIRW4+BOE58M6442 yPf+iXHEKkf4HoR0+PaulVFSdBOO07aNRlqT+y+0X22LOzvVGDJdyCBMGyRbtaWueViF RlObaGXgM4FXPWIzskqankkuh33mFjMKUxwC5ECpRLx22zFgpdGKG2+yCI6UcN18MPO9 
5cAzE0/CMAShx3yGK2UWrWXZBSP9++qu8k9Km9BzKkYpwtyIid3+eBIwTw8bycdtbxBG 7xm+C9Au3Oeqc4DufhHX7i72AlykwNuBkhkxgSHrn26JdyNrUBpRoilZWfexCzVg1VX8 txhg==
X-Gm-Message-State: AG10YOTPspq6Up+IAn4S5eQUCR4/d+G/pmQI29wIwTdt4QHaKxp+bccG4JSX5Tn9Yl0A8A==
X-Received: by 10.28.64.131 with SMTP id n125mr8799306wma.65.1454303587891; Sun, 31 Jan 2016 21:13:07 -0800 (PST)
Received: from mguzik.localdomain (ip-62-245-66-110.net.upcbroadband.cz. [62.245.66.110]) by smtp.gmail.com with ESMTPSA id x6sm27239437wje.38.2016.01.31.21.13.07 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sun, 31 Jan 2016 21:13:07 -0800 (PST)
From: Mateusz Guzik
To: freebsd-hackers@freebsd.org
Cc: kib@freebsd.org, Mateusz Guzik
Subject: [PATCH 2/2] fork: plug a use after free of the returned process pointer
Date: Mon, 1 Feb 2016 06:13:04 +0100
Message-Id: <1454303584-20941-3-git-send-email-mjguzik@gmail.com>
X-Mailer: git-send-email 2.4.3
In-Reply-To: <1454303584-20941-1-git-send-email-mjguzik@gmail.com>
References: <1454303584-20941-1-git-send-email-mjguzik@gmail.com>
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 01 Feb 2016 05:13:10 -0000

From: Mateusz Guzik

fork1 required its callers to pass a pointer to a struct proc *, which would be set to the new process (if any). procdesc and racct manipulation also used said pointer.

However, the process could have exited prior to do_fork returning and been automatically reaped.

Fix the problem by changing the API to let callers indicate whether they want the pid or the struct proc, returning the process in a stopped state in the latter case. The process is kept held across the point where its thread is marked as runnable, which prevents it from exiting until all work is done.
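
As a rough illustration of the hold/release idea in the description above, here is a toy userspace model in C. It is only a sketch and not the kernel code from this patch: `toy_proc`, `toy_hold`, `toy_rele`, `toy_exit` and `toy_fork` are hypothetical names, and the hold count here merely delays freeing the object, whereas the patch describes the kernel hold as keeping the process from exiting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-in for struct proc. Every name in this file is hypothetical. */
struct toy_proc {
	int   pid;
	int   holds;   /* number of outstanding holds on the object */
	bool  exited;  /* the "child" has asked to go away */
	bool *freed;   /* set to true when the object is released */
};

static struct toy_proc *
toy_proc_new(int pid, bool *freed)
{
	struct toy_proc *p = malloc(sizeof(*p));

	if (p == NULL)
		return (NULL);
	p->pid = pid;
	p->holds = 0;
	p->exited = false;
	p->freed = freed;
	*freed = false;
	return (p);
}

/* Free the object only once it has exited and nobody holds it. */
static void
toy_try_reap(struct toy_proc *p)
{
	if (p->exited && p->holds == 0) {
		*p->freed = true;
		free(p);
	}
}

static void
toy_hold(struct toy_proc *p)
{
	p->holds++;
}

static void
toy_rele(struct toy_proc *p)
{
	assert(p->holds > 0);
	p->holds--;
	toy_try_reap(p);	/* p may be gone after this call */
}

static void
toy_exit(struct toy_proc *p)
{
	p->exited = true;
	toy_try_reap(p);	/* deferred while a hold is outstanding */
}

/*
 * Fork-like helper: take a hold before the child can run, copy out the
 * pid while the hold is in place, then drop the hold as the last step.
 * A child that does not exit simply stays allocated in this toy model.
 */
static int
toy_fork(int pid, bool child_exits_early, bool *freed, int *pidp)
{
	struct toy_proc *p = toy_proc_new(pid, freed);

	if (p == NULL)
		return (-1);
	toy_hold(p);
	if (child_exits_early)
		toy_exit(p);	/* simulates the child racing ahead */
	*pidp = p->pid;		/* safe: the hold keeps p alive */
	toy_rele(p);		/* do not touch p past this point */
	return (0);
}
```

Calling `toy_fork(42, true, &freed, &pid)` stores 42 in `pid` even though the simulated child exits first, because the object is only released by the final `toy_rele`. Without the hold, the read of `p->pid` would touch freed memory.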
--- sys/compat/cloudabi/cloudabi_proc.c | 3 +- sys/compat/linux/linux_fork.c | 7 ++- sys/kern/init_main.c | 2 +- sys/kern/kern_fork.c | 108 ++++++++++++++++++++---------------- sys/kern/kern_kthread.c | 2 +- sys/sys/proc.h | 2 +- 6 files changed, 67 insertions(+), 57 deletions(-) diff --git a/sys/compat/cloudabi/cloudabi_proc.c b/sys/compat/cloudabi/cloudabi_proc.c index e98471b..d8c3bef 100644 --- a/sys/compat/cloudabi/cloudabi_proc.c +++ b/sys/compat/cloudabi/cloudabi_proc.c @@ -78,13 +78,12 @@ cloudabi_sys_proc_fork(struct thread *td, { struct procdesc_req pdr; struct filecaps fcaps = {}; - struct proc *p2; int error; pdr.pdr_flags = 0; pdr.pdr_fcaps = &fcaps; cap_rights_init(&fcaps.fc_rights, CAP_FSTAT, CAP_EVENT); - error = fork1(td, RFFDG | RFPROC | RFPROCDESC, 0, &p2, &pdr); + error = fork1(td, RFFDG | RFPROC | RFPROCDESC, 0, NULL, NULL, &pdr); if (error != 0) return (error); /* Return the file descriptor to the parent process. */ diff --git a/sys/compat/linux/linux_fork.c b/sys/compat/linux/linux_fork.c index 7cbe216..0c0f800 100644 --- a/sys/compat/linux/linux_fork.c +++ b/sys/compat/linux/linux_fork.c @@ -73,7 +73,8 @@ linux_fork(struct thread *td, struct linux_fork_args *args) printf(ARGS(fork, "")); #endif - if ((error = fork1(td, RFFDG | RFPROC | RFSTOPPED, 0, &p2, NULL)) != 0) + if ((error = fork1(td, RFFDG | RFPROC | RFSTOPPED, 0, &p2, NULL, + NULL)) != 0) return (error); td2 = FIRST_THREAD_IN_PROC(p2); @@ -106,7 +107,7 @@ linux_vfork(struct thread *td, struct linux_vfork_args *args) #endif if ((error = fork1(td, RFFDG | RFPROC | RFMEM | RFPPWAIT | RFSTOPPED, - 0, &p2, NULL)) != 0) + 0, &p2, NULL, NULL)) != 0) return (error); td2 = FIRST_THREAD_IN_PROC(p2); @@ -169,7 +170,7 @@ linux_clone_proc(struct thread *td, struct linux_clone_args *args) if (args->flags & LINUX_CLONE_VFORK) ff |= RFPPWAIT; - error = fork1(td, ff, 0, &p2, NULL); + error = fork1(td, ff, 0, &p2, NULL, NULL); if (error) return (error); diff --git a/sys/kern/init_main.c 
b/sys/kern/init_main.c index 7d0443a..d4fdcc0 100644 --- a/sys/kern/init_main.c +++ b/sys/kern/init_main.c @@ -833,7 +833,7 @@ create_init(const void *udata __unused) int error; error = fork1(&thread0, RFFDG | RFPROC | RFSTOPPED, 0, &initproc, - NULL); + NULL, NULL); if (error) panic("cannot fork init: %d\n", error); KASSERT(initproc->p_pid == 1, ("create_init: initproc->p_pid != 1")); diff --git a/sys/kern/kern_fork.c b/sys/kern/kern_fork.c index 8cc56b7..d15f517 100644 --- a/sys/kern/kern_fork.c +++ b/sys/kern/kern_fork.c @@ -101,12 +101,11 @@ struct fork_args { int sys_fork(struct thread *td, struct fork_args *uap) { - int error; - struct proc *p2; + int error, pid; - error = fork1(td, RFFDG | RFPROC, 0, &p2, NULL); + error = fork1(td, RFFDG | RFPROC, 0, NULL, &pid, NULL); if (error == 0) { - td->td_retval[0] = p2->p_pid; + td->td_retval[0] = pid; td->td_retval[1] = 0; } return (error); @@ -119,8 +118,7 @@ sys_pdfork(td, uap) struct pdfork_args *uap; { struct procdesc_req pdr; - struct proc *p2; - int error; + int error, pid; /* * It is necessary to return fd by reference because 0 is a valid file @@ -129,9 +127,9 @@ sys_pdfork(td, uap) */ pdr.pdr_flags = uap->flags; pdr.pdr_fcaps = NULL; - error = fork1(td, RFFDG | RFPROC | RFPROCDESC, 0, &p2, &pdr); + error = fork1(td, RFFDG | RFPROC | RFPROCDESC, 0, NULL, &pid, &pdr); if (error == 0) { - td->td_retval[0] = p2->p_pid; + td->td_retval[0] = pid; td->td_retval[1] = 0; error = copyout(&pdr.pdr_fd, uap->fdp, sizeof(pdr.pdr_fd)); } @@ -142,13 +140,12 @@ sys_pdfork(td, uap) int sys_vfork(struct thread *td, struct vfork_args *uap) { - int error, flags; - struct proc *p2; + int error, pid; - flags = RFFDG | RFPROC | RFPPWAIT | RFMEM; - error = fork1(td, flags, 0, &p2, NULL); + error = fork1(td, RFFDG | RFPROC | RFPPWAIT | RFMEM, 0, NULL, &pid, + NULL); if (error == 0) { - td->td_retval[0] = p2->p_pid; + td->td_retval[0] = pid; td->td_retval[1] = 0; } return (error); @@ -157,17 +154,16 @@ sys_vfork(struct thread *td, 
struct vfork_args *uap) int sys_rfork(struct thread *td, struct rfork_args *uap) { - struct proc *p2; - int error; + int error, pid; /* Don't allow kernel-only flags. */ if ((uap->flags & RFKERNELONLY) != 0) return (EINVAL); AUDIT_ARG_FFLAGS(uap->flags); - error = fork1(td, uap->flags, 0, &p2, NULL); + error = fork1(td, uap->flags, 0, NULL, &pid, NULL); if (error == 0) { - td->td_retval[0] = p2 ? p2->p_pid : 0; + td->td_retval[0] = pid; td->td_retval[1] = 0; } return (error); @@ -369,10 +365,11 @@ fail: static void do_fork(struct thread *td, int flags, struct proc *p2, struct thread *td2, - struct vmspace *vm2, int pdflags) + struct vmspace *vm2, struct proc **procp, int *procpid, int pdflags, + struct file *fp_procdesc) { struct proc *p1, *pptr; - int p2_held, trypid; + int trypid; struct filedesc *fd; struct filedesc_to_leader *fdtol; struct sigacts *newsigacts; @@ -380,7 +377,6 @@ do_fork(struct thread *td, int flags, struct proc *p2, struct thread *td2, sx_assert(&proctree_lock, SX_SLOCKED); sx_assert(&allproc_lock, SX_XLOCKED); - p2_held = 0; p1 = td->td_proc; trypid = fork_findpid(flags); @@ -708,6 +704,11 @@ do_fork(struct thread *td, int flags, struct proc *p2, struct thread *td2, if ((flags & RFMEM) == 0 && dtrace_fasttrap_fork) dtrace_fasttrap_fork(p1, p2); #endif + /* + * Hold the process so that it cannot exit after we make it runnable, + * but before we wait for the debugger. + */ + _PHOLD(p2); if ((p1->p_flag & (P_TRACED | P_FOLLOWFORK)) == (P_TRACED | P_FOLLOWFORK)) { /* @@ -720,25 +721,12 @@ do_fork(struct thread *td, int flags, struct proc *p2, struct thread *td2, td->td_dbgflags |= TDB_FORK; td->td_dbg_forked = p2->p_pid; td2->td_dbgflags |= TDB_STOPATFORK; - _PHOLD(p2); - p2_held = 1; } if (flags & RFPPWAIT) { td->td_pflags |= TDP_RFPPWAIT; td->td_rfppwait_p = p2; } PROC_UNLOCK(p2); - if ((flags & RFSTOPPED) == 0) { - /* - * If RFSTOPPED not requested, make child runnable and - * add to run queue. 
- */ - thread_lock(td2); - TD_SET_CAN_RUN(td2); - sched_add(td2, SRQ_BORING); - thread_unlock(td2); - } - /* * Now can be swapped. */ @@ -751,20 +739,43 @@ do_fork(struct thread *td, int flags, struct proc *p2, struct thread *td2, knote_fork(&p1->p_klist, p2->p_pid); SDT_PROBE3(proc, , , create, p2, p1, flags); + if (flags & RFPROCDESC) { + procdesc_finit(p2->p_procdesc, fp_procdesc); + fdrop(fp_procdesc, td); + } + + if (procpid != NULL) + *procpid = p2->p_pid; + if ((flags & RFSTOPPED) == 0) { + /* + * If RFSTOPPED not requested, make child runnable and + * add to run queue. + */ + thread_lock(td2); + TD_SET_CAN_RUN(td2); + sched_add(td2, SRQ_BORING); + thread_unlock(td2); + } else { + /* + * Return child proc pointer to parent. + */ + *procp = p2; + } + + PROC_LOCK(p2); /* * Wait until debugger is attached to child. */ - PROC_LOCK(p2); while ((td2->td_dbgflags & TDB_STOPATFORK) != 0) cv_wait(&p2->p_dbgwait, &p2->p_mtx); - if (p2_held) - _PRELE(p2); + _PRELE(p2); + racct_proc_fork_done(p2); PROC_UNLOCK(p2); } int fork1(struct thread *td, int flags, int pages, struct proc **procp, - struct procdesc_req *pdr) + int *procpid, struct procdesc_req *pdr) { struct proc *p1, *newproc; struct thread *td2; @@ -775,6 +786,11 @@ fork1(struct thread *td, int flags, int pages, struct proc **procp, static int curfail; static struct timeval lastfail; + if ((flags & RFSTOPPED) != 0) + MPASS(procp != NULL && procpid == NULL); + else + MPASS(procp == NULL); + /* Check for the undefined or unimplemented flags. */ if ((flags & ~(RFFLAGS | RFTSIGFLAGS(RFTSIGMASK))) != 0) return (EINVAL); @@ -810,7 +826,10 @@ fork1(struct thread *td, int flags, int pages, struct proc **procp, * certain parts of a process from itself. 
*/ if ((flags & RFPROC) == 0) { - *procp = NULL; + if (procp != NULL) + *procp = NULL; + if (procpid != NULL) + *procpid = 0; return (fork_norfproc(td, flags)); } @@ -938,17 +957,8 @@ fork1(struct thread *td, int flags, int pages, struct proc **procp, lim_cur(td, RLIMIT_NPROC)); } if (ok) { - do_fork(td, flags, newproc, td2, vm2, pdflags); - - /* - * Return child proc pointer to parent. - */ - *procp = newproc; - if (flags & RFPROCDESC) { - procdesc_finit(newproc->p_procdesc, fp_procdesc); - fdrop(fp_procdesc, td); - } - racct_proc_fork_done(newproc); + do_fork(td, flags, newproc, td2, vm2, procp, procpid, pdflags, + fp_procdesc); return (0); } diff --git a/sys/kern/kern_kthread.c b/sys/kern/kern_kthread.c index 0673f68..67f6cc2 100644 --- a/sys/kern/kern_kthread.c +++ b/sys/kern/kern_kthread.c @@ -89,7 +89,7 @@ kproc_create(void (*func)(void *), void *arg, panic("kproc_create called too soon"); error = fork1(&thread0, RFMEM | RFFDG | RFPROC | RFSTOPPED | flags, - pages, &p2, NULL); + pages, &p2, NULL, NULL); if (error) return error; diff --git a/sys/sys/proc.h b/sys/sys/proc.h index 60efdcd..b072e10 100644 --- a/sys/sys/proc.h +++ b/sys/sys/proc.h @@ -931,7 +931,7 @@ int enterpgrp(struct proc *p, pid_t pgid, struct pgrp *pgrp, int enterthispgrp(struct proc *p, struct pgrp *pgrp); void faultin(struct proc *p); void fixjobc(struct proc *p, struct pgrp *pgrp, int entering); -int fork1(struct thread *, int, int, struct proc **, +int fork1(struct thread *, int, int, struct proc **, int *, struct procdesc_req *); void fork_exit(void (*)(void *, struct trapframe *), void *, struct trapframe *); -- 2.7.0 From owner-freebsd-hackers@freebsd.org Mon Feb 1 05:28:13 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 9201BA8E7BA for ; Mon, 1 Feb 2016 05:28:13 +0000 (UTC) (envelope-from mjguzik@gmail.com) Received: from 
mail-wm0-x243.google.com (mail-wm0-x243.google.com [IPv6:2a00:1450:400c:c09::243]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 1C8C11CD3; Mon, 1 Feb 2016 05:28:13 +0000 (UTC) (envelope-from mjguzik@gmail.com) Received: by mail-wm0-x243.google.com with SMTP id p63so7427798wmp.1; Sun, 31 Jan 2016 21:28:13 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=date:from:to:cc:subject:message-id:mail-followup-to:references :mime-version:content-type:content-disposition:in-reply-to :user-agent; bh=7C0JqcVNqhWEi07eZMmhs1RNTndQgEyXwcJrW6CU3Fk=; b=Frg+kIXIYUv8cRwyNFBo7jeOq6q1Gb+mYwJXuDJG86rrxefczNmyh+tbGHeGAd7HAs 907kMCHo55j0wC4Sx5UxEkE/+tWmYtf3hGZIarOGdXK7Ycq6I/yVd6IYuy66coHBO/mp q/B23N+P/JcPMEvNwZ0p3nr8TKUG7i5wnmCJRr/ALOk5khBPiqc4Xnub1AtKK39IoHW/ Aknkc3bNK6sAqkwAcPhC90mPWlGWBV5a0anAIHhvbn+ISpcagYF752ibaIpXkdAgPYNG EdK/hogsAR1KR83dfrmrcv0FvRVlpu8nvKShFRip6oAZtF1hZSd1mTAl1CUYx3zE+T2i /ujA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:date:from:to:cc:subject:message-id :mail-followup-to:references:mime-version:content-type :content-disposition:in-reply-to:user-agent; bh=7C0JqcVNqhWEi07eZMmhs1RNTndQgEyXwcJrW6CU3Fk=; b=i50sWYxgn6++5z0Hm3jTbzSFh0ImQH/H9XlPgrpybZcreti48ixsCEcMcwC3d8TeTx KYIxTwIcdhTRHkjkrW28L+vfQwHfrj+JwAW3ex83qq58Q3PO66zVsrPrp2IaXR8B9Zly NsjxJHUQ36qve1PGpjUTHRqUzuZsM/QtLg3GELYmprE4kYY4uBbvnWwzNISCmTkUTPTh JEAN99YOooZspnTaXpwoDj0E6+Qdfo66RRPv3VVlXyXBcwyJpxooatEDjLFDiuZIs+iS KZKteXJ6vi8ts1Duuvm8r7fzdw7YnFL7DmPBCXEI+hB0jTaI28Z3xCSwABSWaFIogJXp 4iQg== X-Gm-Message-State: AG10YOQ//xsk9mRSxNP+fE79fdnpWhR8fHgdBzsqGi0Tlld3d+ceYZ4pV6KCgB0LEgl8qQ== X-Received: by 10.28.132.146 with SMTP id g140mr9851274wmd.49.1454304491684; Sun, 31 Jan 2016 21:28:11 -0800 (PST) Received: from dft-labs.eu 
(n1x0n-1-pt.tunnel.tserv5.lon1.ipv6.he.net. [2001:470:1f08:1f7::2]) by smtp.gmail.com with ESMTPSA id m206sm9569272wmf.16.2016.01.31.21.28.10 (version=TLS1_2 cipher=AES128-SHA bits=128/128); Sun, 31 Jan 2016 21:28:11 -0800 (PST) Date: Mon, 1 Feb 2016 06:28:09 +0100 From: Mateusz Guzik To: freebsd-hackers@freebsd.org Cc: kib@freebsd.org Subject: Re: [PATCH 2/2] fork: plug a use after free of the returned process pointer Message-ID: <20160201052809.GA7127@dft-labs.eu> Mail-Followup-To: Mateusz Guzik , freebsd-hackers@freebsd.org, kib@freebsd.org References: <1454303584-20941-1-git-send-email-mjguzik@gmail.com> <1454303584-20941-3-git-send-email-mjguzik@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <1454303584-20941-3-git-send-email-mjguzik@gmail.com> User-Agent: Mutt/1.5.21 (2010-09-15) X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 01 Feb 2016 05:28:13 -0000

On Mon, Feb 01, 2016 at 06:13:04AM +0100, Mateusz Guzik wrote:

Oops, the diff excluded the racct change (see the end).

> From: Mateusz Guzik
>
> fork1 required its callers to pass a pointer to struct proc * which would
> be set to the new process (if any). procdesc and racct manipulation also
> used said pointer.
>
> However, the process could have exited prior to do_fork return and be
> automatically reaped.
>
> Fix the problem by changing the API to let callers indicate whether they
> want the pid or the struct proc, return the process in stopped state for
> the latter case.
>
> The process is held after its thread is marked as runnable to prevent it
> from exiting until all work is done.
> --- > sys/compat/cloudabi/cloudabi_proc.c | 3 +- > sys/compat/linux/linux_fork.c | 7 ++- > sys/kern/init_main.c | 2 +- > sys/kern/kern_fork.c | 108 ++++++++++++++++++++---------------- > sys/kern/kern_kthread.c | 2 +- > sys/sys/proc.h | 2 +- > 6 files changed, 67 insertions(+), 57 deletions(-) > > diff --git a/sys/compat/cloudabi/cloudabi_proc.c b/sys/compat/cloudabi/cloudabi_proc.c > index e98471b..d8c3bef 100644 > --- a/sys/compat/cloudabi/cloudabi_proc.c > +++ b/sys/compat/cloudabi/cloudabi_proc.c > @@ -78,13 +78,12 @@ cloudabi_sys_proc_fork(struct thread *td, > { > struct procdesc_req pdr; > struct filecaps fcaps = {}; > - struct proc *p2; > int error; > > pdr.pdr_flags = 0; > pdr.pdr_fcaps = &fcaps; > cap_rights_init(&fcaps.fc_rights, CAP_FSTAT, CAP_EVENT); > - error = fork1(td, RFFDG | RFPROC | RFPROCDESC, 0, &p2, &pdr); > + error = fork1(td, RFFDG | RFPROC | RFPROCDESC, 0, NULL, NULL, &pdr); > if (error != 0) > return (error); > /* Return the file descriptor to the parent process. 
*/ > diff --git a/sys/compat/linux/linux_fork.c b/sys/compat/linux/linux_fork.c > index 7cbe216..0c0f800 100644 > --- a/sys/compat/linux/linux_fork.c > +++ b/sys/compat/linux/linux_fork.c > @@ -73,7 +73,8 @@ linux_fork(struct thread *td, struct linux_fork_args *args) > printf(ARGS(fork, "")); > #endif > > - if ((error = fork1(td, RFFDG | RFPROC | RFSTOPPED, 0, &p2, NULL)) != 0) > + if ((error = fork1(td, RFFDG | RFPROC | RFSTOPPED, 0, &p2, NULL, > + NULL)) != 0) > return (error); > > td2 = FIRST_THREAD_IN_PROC(p2); > @@ -106,7 +107,7 @@ linux_vfork(struct thread *td, struct linux_vfork_args *args) > #endif > > if ((error = fork1(td, RFFDG | RFPROC | RFMEM | RFPPWAIT | RFSTOPPED, > - 0, &p2, NULL)) != 0) > + 0, &p2, NULL, NULL)) != 0) > return (error); > > td2 = FIRST_THREAD_IN_PROC(p2); > @@ -169,7 +170,7 @@ linux_clone_proc(struct thread *td, struct linux_clone_args *args) > if (args->flags & LINUX_CLONE_VFORK) > ff |= RFPPWAIT; > > - error = fork1(td, ff, 0, &p2, NULL); > + error = fork1(td, ff, 0, &p2, NULL, NULL); > if (error) > return (error); > > diff --git a/sys/kern/init_main.c b/sys/kern/init_main.c > index 7d0443a..d4fdcc0 100644 > --- a/sys/kern/init_main.c > +++ b/sys/kern/init_main.c > @@ -833,7 +833,7 @@ create_init(const void *udata __unused) > int error; > > error = fork1(&thread0, RFFDG | RFPROC | RFSTOPPED, 0, &initproc, > - NULL); > + NULL, NULL); > if (error) > panic("cannot fork init: %d\n", error); > KASSERT(initproc->p_pid == 1, ("create_init: initproc->p_pid != 1")); > diff --git a/sys/kern/kern_fork.c b/sys/kern/kern_fork.c > index 8cc56b7..d15f517 100644 > --- a/sys/kern/kern_fork.c > +++ b/sys/kern/kern_fork.c > @@ -101,12 +101,11 @@ struct fork_args { > int > sys_fork(struct thread *td, struct fork_args *uap) > { > - int error; > - struct proc *p2; > + int error, pid; > > - error = fork1(td, RFFDG | RFPROC, 0, &p2, NULL); > + error = fork1(td, RFFDG | RFPROC, 0, NULL, &pid, NULL); > if (error == 0) { > - td->td_retval[0] = p2->p_pid; > + 
td->td_retval[0] = pid; > td->td_retval[1] = 0; > } > return (error); > @@ -119,8 +118,7 @@ sys_pdfork(td, uap) > struct pdfork_args *uap; > { > struct procdesc_req pdr; > - struct proc *p2; > - int error; > + int error, pid; > > /* > * It is necessary to return fd by reference because 0 is a valid file > @@ -129,9 +127,9 @@ sys_pdfork(td, uap) > */ > pdr.pdr_flags = uap->flags; > pdr.pdr_fcaps = NULL; > - error = fork1(td, RFFDG | RFPROC | RFPROCDESC, 0, &p2, &pdr); > + error = fork1(td, RFFDG | RFPROC | RFPROCDESC, 0, NULL, &pid, &pdr); > if (error == 0) { > - td->td_retval[0] = p2->p_pid; > + td->td_retval[0] = pid; > td->td_retval[1] = 0; > error = copyout(&pdr.pdr_fd, uap->fdp, sizeof(pdr.pdr_fd)); > } > @@ -142,13 +140,12 @@ sys_pdfork(td, uap) > int > sys_vfork(struct thread *td, struct vfork_args *uap) > { > - int error, flags; > - struct proc *p2; > + int error, pid; > > - flags = RFFDG | RFPROC | RFPPWAIT | RFMEM; > - error = fork1(td, flags, 0, &p2, NULL); > + error = fork1(td, RFFDG | RFPROC | RFPPWAIT | RFMEM, 0, NULL, &pid, > + NULL); > if (error == 0) { > - td->td_retval[0] = p2->p_pid; > + td->td_retval[0] = pid; > td->td_retval[1] = 0; > } > return (error); > @@ -157,17 +154,16 @@ sys_vfork(struct thread *td, struct vfork_args *uap) > int > sys_rfork(struct thread *td, struct rfork_args *uap) > { > - struct proc *p2; > - int error; > + int error, pid; > > /* Don't allow kernel-only flags. */ > if ((uap->flags & RFKERNELONLY) != 0) > return (EINVAL); > > AUDIT_ARG_FFLAGS(uap->flags); > - error = fork1(td, uap->flags, 0, &p2, NULL); > + error = fork1(td, uap->flags, 0, NULL, &pid, NULL); > if (error == 0) { > - td->td_retval[0] = p2 ? 
p2->p_pid : 0; > + td->td_retval[0] = pid; > td->td_retval[1] = 0; > } > return (error); > @@ -369,10 +365,11 @@ fail: > > static void > do_fork(struct thread *td, int flags, struct proc *p2, struct thread *td2, > - struct vmspace *vm2, int pdflags) > + struct vmspace *vm2, struct proc **procp, int *procpid, int pdflags, > + struct file *fp_procdesc) > { > struct proc *p1, *pptr; > - int p2_held, trypid; > + int trypid; > struct filedesc *fd; > struct filedesc_to_leader *fdtol; > struct sigacts *newsigacts; > @@ -380,7 +377,6 @@ do_fork(struct thread *td, int flags, struct proc *p2, struct thread *td2, > sx_assert(&proctree_lock, SX_SLOCKED); > sx_assert(&allproc_lock, SX_XLOCKED); > > - p2_held = 0; > p1 = td->td_proc; > > trypid = fork_findpid(flags); > @@ -708,6 +704,11 @@ do_fork(struct thread *td, int flags, struct proc *p2, struct thread *td2, > if ((flags & RFMEM) == 0 && dtrace_fasttrap_fork) > dtrace_fasttrap_fork(p1, p2); > #endif > + /* > + * Hold the process so that it cannot exit after we make it runnable, > + * but before we wait for the debugger. > + */ > + _PHOLD(p2); > if ((p1->p_flag & (P_TRACED | P_FOLLOWFORK)) == (P_TRACED | > P_FOLLOWFORK)) { > /* > @@ -720,25 +721,12 @@ do_fork(struct thread *td, int flags, struct proc *p2, struct thread *td2, > td->td_dbgflags |= TDB_FORK; > td->td_dbg_forked = p2->p_pid; > td2->td_dbgflags |= TDB_STOPATFORK; > - _PHOLD(p2); > - p2_held = 1; > } > if (flags & RFPPWAIT) { > td->td_pflags |= TDP_RFPPWAIT; > td->td_rfppwait_p = p2; > } > PROC_UNLOCK(p2); > - if ((flags & RFSTOPPED) == 0) { > - /* > - * If RFSTOPPED not requested, make child runnable and > - * add to run queue. > - */ > - thread_lock(td2); > - TD_SET_CAN_RUN(td2); > - sched_add(td2, SRQ_BORING); > - thread_unlock(td2); > - } > - > /* > * Now can be swapped. 
> */ > @@ -751,20 +739,43 @@ do_fork(struct thread *td, int flags, struct proc *p2, struct thread *td2, > knote_fork(&p1->p_klist, p2->p_pid); > SDT_PROBE3(proc, , , create, p2, p1, flags); > > + if (flags & RFPROCDESC) { > + procdesc_finit(p2->p_procdesc, fp_procdesc); > + fdrop(fp_procdesc, td); > + } > + > + if (procpid != NULL) > + *procpid = p2->p_pid; > + if ((flags & RFSTOPPED) == 0) { > + /* > + * If RFSTOPPED not requested, make child runnable and > + * add to run queue. > + */ > + thread_lock(td2); > + TD_SET_CAN_RUN(td2); > + sched_add(td2, SRQ_BORING); > + thread_unlock(td2); > + } else { > + /* > + * Return child proc pointer to parent. > + */ > + *procp = p2; > + } > + > + PROC_LOCK(p2); > /* > * Wait until debugger is attached to child. > */ > - PROC_LOCK(p2); > while ((td2->td_dbgflags & TDB_STOPATFORK) != 0) > cv_wait(&p2->p_dbgwait, &p2->p_mtx); > - if (p2_held) > - _PRELE(p2); > + _PRELE(p2); > + racct_proc_fork_done(p2); > PROC_UNLOCK(p2); > } > > int > fork1(struct thread *td, int flags, int pages, struct proc **procp, > - struct procdesc_req *pdr) > + int *procpid, struct procdesc_req *pdr) > { > struct proc *p1, *newproc; > struct thread *td2; > @@ -775,6 +786,11 @@ fork1(struct thread *td, int flags, int pages, struct proc **procp, > static int curfail; > static struct timeval lastfail; > > + if ((flags & RFSTOPPED) != 0) > + MPASS(procp != NULL && procpid == NULL); > + else > + MPASS(procp == NULL); > + > /* Check for the undefined or unimplemented flags. */ > if ((flags & ~(RFFLAGS | RFTSIGFLAGS(RFTSIGMASK))) != 0) > return (EINVAL); > @@ -810,7 +826,10 @@ fork1(struct thread *td, int flags, int pages, struct proc **procp, > * certain parts of a process from itself. 
> */ > if ((flags & RFPROC) == 0) { > - *procp = NULL; > + if (procp != NULL) > + *procp = NULL; > + if (procpid != NULL) > + *procpid = 0; > return (fork_norfproc(td, flags)); > } > > @@ -938,17 +957,8 @@ fork1(struct thread *td, int flags, int pages, struct proc **procp, > lim_cur(td, RLIMIT_NPROC)); > } > if (ok) { > - do_fork(td, flags, newproc, td2, vm2, pdflags); > - > - /* > - * Return child proc pointer to parent. > - */ > - *procp = newproc; > - if (flags & RFPROCDESC) { > - procdesc_finit(newproc->p_procdesc, fp_procdesc); > - fdrop(fp_procdesc, td); > - } > - racct_proc_fork_done(newproc); > + do_fork(td, flags, newproc, td2, vm2, procp, procpid, pdflags, > + fp_procdesc); > return (0); > } > > diff --git a/sys/kern/kern_kthread.c b/sys/kern/kern_kthread.c > index 0673f68..67f6cc2 100644 > --- a/sys/kern/kern_kthread.c > +++ b/sys/kern/kern_kthread.c > @@ -89,7 +89,7 @@ kproc_create(void (*func)(void *), void *arg, > panic("kproc_create called too soon"); > > error = fork1(&thread0, RFMEM | RFFDG | RFPROC | RFSTOPPED | flags, > - pages, &p2, NULL); > + pages, &p2, NULL, NULL); > if (error) > return error; > > diff --git a/sys/sys/proc.h b/sys/sys/proc.h > index 60efdcd..b072e10 100644 > --- a/sys/sys/proc.h > +++ b/sys/sys/proc.h > @@ -931,7 +931,7 @@ int enterpgrp(struct proc *p, pid_t pgid, struct pgrp *pgrp, > int enterthispgrp(struct proc *p, struct pgrp *pgrp); > void faultin(struct proc *p); > void fixjobc(struct proc *p, struct pgrp *pgrp, int entering); > -int fork1(struct thread *, int, int, struct proc **, > +int fork1(struct thread *, int, int, struct proc **, int *, > struct procdesc_req *); > void fork_exit(void (*)(void *, struct trapframe *), void *, > struct trapframe *); > -- > 2.7.0 > diff --git a/sys/kern/kern_racct.c b/sys/kern/kern_racct.c index 0c7c0c4..ce7e2a4 100644 --- a/sys/kern/kern_racct.c +++ b/sys/kern/kern_racct.c @@ -957,16 +957,15 @@ void racct_proc_fork_done(struct proc *child) { + PROC_LOCK_ASSERT(child, MA_OWNED); 
#ifdef RCTL if (!racct_enable) return; - PROC_LOCK(child); mtx_lock(&racct_lock); rctl_enforce(child, RACCT_NPROC, 0); rctl_enforce(child, RACCT_NTHR, 0); mtx_unlock(&racct_lock); - PROC_UNLOCK(child); #endif } -- Mateusz Guzik From owner-freebsd-hackers@freebsd.org Mon Feb 1 10:36:40 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 090AFA9719B for ; Mon, 1 Feb 2016 10:36:40 +0000 (UTC) (envelope-from kib@freebsd.org) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id A8EAE26C; Mon, 1 Feb 2016 10:36:39 +0000 (UTC) (envelope-from kib@freebsd.org) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id u11AaXoh078184 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Mon, 1 Feb 2016 12:36:33 +0200 (EET) (envelope-from kib@freebsd.org) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua u11AaXoh078184 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id u11AaXw4078183; Mon, 1 Feb 2016 12:36:33 +0200 (EET) (envelope-from kib@freebsd.org) X-Authentication-Warning: tom.home: kostik set sender to kib@freebsd.org using -f Date: Mon, 1 Feb 2016 12:36:32 +0200 From: Konstantin Belousov To: Mateusz Guzik Cc: freebsd-hackers@freebsd.org, kib@freebsd.org, Mateusz Guzik Subject: Re: [PATCH 0/2] plug fork use-after-free Message-ID: <20160201103632.GL91220@kib.kiev.ua> References: <1454303584-20941-1-git-send-email-mjguzik@gmail.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="BwCQnh7xodEAoBMC" Content-Disposition: inline In-Reply-To: <1454303584-20941-1-git-send-email-mjguzik@gmail.com> User-Agent: Mutt/1.5.24 
(2015-08-30) X-Spam-Status: No, score=-2.9 required=5.0 tests=ALL_TRUSTED,BAYES_00 autolearn=ham autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 01 Feb 2016 10:36:40 -0000

--BwCQnh7xodEAoBMC Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

On Mon, Feb 01, 2016 at 06:13:02AM +0100, Mateusz Guzik wrote:
> From: Mateusz Guzik
>
> Quite some time ago I reported a problem with fork and provided a half-assed
> patch, see:
> https://lists.freebsd.org/pipermail/freebsd-hackers/2014-October/046212.html
>
> Now I got around to fixing the problem in a less hackish manner.
>
> Note that despite the new process possibly immediately exiting and being
> waited on, returning its (possibly now reused) PID is fine - that's the
> pid it possibly saw by other means and in worst case the process is racing
> with itself.
>
> To reiterate, as it is, the code has use-after-free in procdesc and racct
> handling.
>
> The first patch is a small cleanup to reduce the number of arguments to
> fork1, which was getting out of hand. I don't feel strongly about the
> name of the structure used in there.
>
> Mateusz Guzik (2):
>   fork: move procdesc-related parameters into a dedicated struct
>   fork: plug a use after free of the returned process pointer
>
>  sys/compat/cloudabi/cloudabi_proc.c |  11 ++--
>  sys/compat/linux/linux_fork.c       |   6 +-
>  sys/kern/init_main.c                |   2 +-
>  sys/kern/kern_fork.c                | 125 ++++++++++++++++++++----------------
>  sys/kern/kern_kthread.c             |   2 +-
>  sys/sys/proc.h                      |   5 +-
>  sys/sys/procdesc.h                  |   6 ++
>  7 files changed, 91 insertions(+), 66 deletions(-)

I agree with the fix, but I want the approach to be pushed further.

First, please pack all arguments to fork1() into the struct. I think
everything except the curthread pointer should be packed into the
argument structure. You have to touch all fork1() callers anyway, and
with the structure approach you could avoid doing the second pass over
all the callers (in the second patch), esp. if the structure is bzeroed
before being filled.

Second, it puzzles me that do_fork() takes both the p2 and procp
arguments. Wouldn't it be cleaner to assign to *procp (or
fork_req->procp) in fork1? I understand why this cannot be done with
*procpid.
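
The request-structure idea above can be sketched in userland C roughly as
follows. This is only an illustration under stated assumptions: the names
struct fork_req_sketch, struct proc_sketch, fork1_sketch(),
fork_pid_sketch(), the SK_RF* flag values and every field except procp are
hypothetical and are not taken from either patch, so the real kernel
interface may well look different. The sketch shows how zeroing the
structure first lets a caller fill in only the fields it needs, and how the
"stopped child by pointer, running child by pid" rule from the second patch
could be checked in one place.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag values, for this sketch only (not the kernel's). */
#define SK_RFPROC    0x01
#define SK_RFSTOPPED 0x02

struct proc_sketch { int p_pid; };	/* stand-in for struct proc */

/* Hypothetical request structure: everything except the thread pointer. */
struct fork_req_sketch {
	int flags;			/* rfork-style flags */
	int pages;			/* kernel stack pages, 0 = default */
	struct proc_sketch **procp;	/* out: child, only with SK_RFSTOPPED */
	int *pidp;			/* out: child pid, only without it */
};

static struct proc_sketch fake_child = { 1234 };

/*
 * Mock of a fork1()-like entry point. It mirrors the rule asserted in the
 * patch: a stopped child is returned by pointer, a running one by pid.
 * Here a violation yields EINVAL instead of tripping an assertion.
 */
static int
fork1_sketch(struct fork_req_sketch *fr)
{
	if ((fr->flags & SK_RFSTOPPED) != 0) {
		if (fr->procp == NULL || fr->pidp != NULL)
			return (EINVAL);
		*fr->procp = &fake_child;
	} else {
		if (fr->procp != NULL)
			return (EINVAL);
		if (fr->pidp != NULL)
			*fr->pidp = fake_child.p_pid;
	}
	return (0);
}

/* A sys_fork()-like caller: zero the request, then fill what is needed. */
static int
fork_pid_sketch(int *pidp)
{
	struct fork_req_sketch fr;

	assert(pidp != NULL);
	memset(&fr, 0, sizeof(fr));	/* unset fields default to 0/NULL */
	fr.flags = SK_RFPROC;
	fr.pidp = pidp;
	return (fork1_sketch(&fr));
}
```

With this shape, adding another output or option later would mean adding a
field rather than touching every caller again, which seems to be the point
of the suggestion.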
--BwCQnh7xodEAoBMC Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAEBCAAGBQJWrzUwAAoJEJDCuSvBvK1BSYgQAIra0v0O2Bn2lfFg1dtFB4zH d3SrF9tuhQY2F/Cg/imPFC5Fhgce7eum0UfLznjiYSxks2tC3hdhEKe4T3mGTIT8 3r19zuwv6bGAprLHK7uzBJ0VU+Vy03FFWVCgeIm4XUtECTbOGqLho6iFklh4pA3W uKJ8YhTa6hoVLx90/8V0Gjp7Nmo5THZsQpLAD3NfLs3SKAeW4hy6Imiue804T3r/ b/u0dDBH1b/rW+v8VggvmVL8TGcyqAU8+11C41Lbrdy0pEeiA3DechBa5+2KClrB PKbgt0jqLXr8MwtdwjbqYdtFv66HCnzYWnt5IBEvkpodFyWVYH0CtAGO87s7rJ+2 6uS22OqgPhebd4Fwq2kyP0W6y3l1yt3nE6GJNnR/Srgx4JzXIkn8is1t7keqVhuH 1PeWlFC3UYet5xuKHNU2ejJOnKtSKk4xMHbDQYa7u7IEAWD5NHDW3eSQ3kPIwjNe Q1wq09FkqBnji4GYw3nWiEOeZL83PuwThC9pttbvkHe6LXbrBnPm4e4Ap7HTtY+A +8+sDThZ56nCeL88jbZ9cO5Y3Cdj/tXPuvmRHQzWGl3BDNNjZgnUHgDLcGRerlgw f/+p0uQ6F7amd0541PfgddVctAEAcXvldzXfUveAOfhnAKXT8OtDv43Qe90Pr1E2 7FZNDq1TXI6YtxCr1AHO =sKWa -----END PGP SIGNATURE----- --BwCQnh7xodEAoBMC-- From owner-freebsd-hackers@freebsd.org Mon Feb 1 16:10:56 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id EA3CAA97D68 for ; Mon, 1 Feb 2016 16:10:56 +0000 (UTC) (envelope-from Don.whY@gmx.com) Received: from mout.gmx.net (mout.gmx.net [212.227.17.20]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mout.gmx.net", Issuer "TeleSec ServerPass DE-1" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 60F01C62 for ; Mon, 1 Feb 2016 16:10:56 +0000 (UTC) (envelope-from Don.whY@gmx.com) Received: from [192.168.1.115] ([67.212.197.98]) by mail.gmx.com (mrgmx101) with ESMTPSA (Nemesis) id 0MXDo1-1acDqy0Jbp-00WE5H for ; Mon, 01 Feb 2016 17:10:53 +0100 Message-ID: <56AF8381.9040307@gmx.com> Date: Mon, 01 Feb 2016 09:10:41 -0700 From: Don whY User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:31.0) Gecko/20100101 Thunderbird/31.1.1 MIME-Version: 1.0 To: freebsd-hackers@freebsd.org Subject: 
Re: IoT OS References: <56A10892.2090308@rlwinm.de> In-Reply-To: Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-Provags-ID: V03:K0:UI8ZZXt/AAn7+pH1/qo8Cb04g0ERi93Oo8ngqoCdr/z7Agb/i4A 8BmstHY7yGO2nbnK3Qy3byNsmIUtjc7y3dXejPJIfgKOdWi7KHTBPR83L2lQNSX4EkXTeDr IModczf7lgH3J53J102ewK+Tx3BhbxQYImv8sxT9X2wFNTgybxL0BszWC5kCZ4bCphKzVwl LNLByyvcs/HaVRkYy6V7w== X-UI-Out-Filterresults: notjunk:1;V01:K0:7Utm9397maM=:2nzBusCSlyLpNd8PHhEYwD +5Whr1gTxvH41CX55lZEeExePWO7XOla4rf+EDa9c5m8dW43GOX3KhVWpfQTaRJJ2TGMlKxZD jumOvDZLB2MW+A5qRpMEFXvcX8UKTnW3JqAHkfwJ5Rjo6uQC4envsUuwaMTMCzMlsmcjLLrkr 3qOdm/nwGYXSmOevabrUnXpEVySlpQ6g2F6PWNUMQuf5pWJEoOWxtfq5FwP96Tp5iLk+S2avW yrIIk/LEtOAAJl648+9JT4nIitbooiosTRUXGKx6+xqJDQPRMdFFk0IMTJG1EcFQOdI5dI5tq /a4BzDOcyG3ZQS5emPPob6mV/s9RjShu/KiYUQqfBexLTjoqd3ezObrjV/X4ti+DhnsEYcrxz e20QPugPTzz7jSgUj3QfkoqAP8qwnuqx7faWOnB7Id8uhqLTyDuK7wpfoAg60lhTwdP4c0xyP ImGtwUXCQkuQyL6lpIUQiKS58bi9GXwehIbUWPPdGTnOg1E6RAEo5g9QXRCRlWd8+Fq5cCQv6 7LmiWRZgDa6GYKnGyKJ7vh3jIFmL32m59lIbN+cHyzTM30i9yXlIj7B2Oy4ClovfqQTjCGIqS xlEVcXMWzY2JSKSjqkuTg+wvWhxXeIg2yg5MInZk6Dxe0HGRfGi/Bdj83FpKgKDDaxEaMeP1b QWWFhl8hmn52rr2XZ9VTJDmJWQWI7Go+Xl1KdbpXueut2hNm4Hf+ELSlrHfvDUVAKT9E/1aZT 9Sfq8+eRlTH5MpXa1pucLuanOynHm4XDpxod1CthvsX0dmcDwNATvo1Ni1Gt9q76/3KJNGvlE 1h30B4Q X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 01 Feb 2016 16:10:57 -0000 [stuff elided, recipients trimmed] On 1/21/2016 11:04 AM, Mathieu Prevot wrote: >>>> On 21/01/16 17:19, Mathieu Prevot wrote: >>>> I would like to connect several connected object (with homogeneous or >>>> heterogenous hardare: intel edison, samsung artik, apple AX, intel core, >>>> etc) so the calculation needs, the storage/memory, the connection, etc are >>>> decoupled; hence we can reach an ecosystem with several clouds. 
Most of these are reasonably heavyweight nodes. I.e., you aren't talking
about lightweight sensor motes.

>>>> How do you recommend to reach that ? from the kernel, a module, or
>>>> eventually a software ?

> Say all objects are connected peer to peer with wifi, some of them are
> connected to internet through gsm network or wifi to a box. These objects
> are moving in space, and for some reasons, connections are dynamical and
> can be severely impaired or lost.

So, you are expecting NONE of them to be precious nodes? What guarantees
do you propose regarding interconnectivity? Will there *always* be a path
through the mesh to The Internet? Or, to some_precious_node? How do you
expect to handle the absence of that path? Or, the mesh partitioning into
two or more disjoint meshes?

> They have incoming local streams of data (eg HD videos, accelerometer, GPS,
> other wifi and gsm signals, etc).
>
> I would like to abstract the CPU layer, storage layer, and internet
> connection so that in realtime results of one of my objects are saved if
> this object dies, so that if one of the objects giving internet access to
> the group loses its connection, the redundancy allows the group of objects
> not to lose internet connection.

Nothing is going to magically do this *for* you. If you want to have
reliability through redundancy, you need to provide that redundancy,
somewhere and by some means.

You also casually slip the term "realtime" into the description. What are
your timeliness constraints on the data and its handling? Especially in
conditions of "fault"? Do you expect the "system" to enforce these
constraints? Or, do you just assume "realtime" means "current"?

> Can I consider these as different load balancing layers ? Do you recommend
> to implement this at the kernel layer or at an API layer ? Can I see that
> as a lightweight cluster ?
I've been proceeding along the lines of a homogeneous set of hardware
(simply because it means I have less hardware to *design*/build) of
reasonably heavyweight capabilities (i.e., no motes). I've built an RTOS
that lets me dynamically move "tasks" between nodes at run time.

In my world, everything is wired (cuz I don't want to have to deal with
the potential for trivial DoS attacks based on remote RF jamming) and PoE
powered. This lets me hide devices in places where a "wall wart" wouldn't
be convenient. It also lets me power nodes up/down as needed. E.g., if I
need more computational horsepower, I can bring a node on-line solely for
its "CPU" -- even though its particular I/Os may not be needed at this
time. (And, because I can move a "task" at will, I can then push arbitrary
"tasks" off to that node in the "balancing act".)

I've split the design into three distinct layers:
- RTOS (to provide RT guarantees, mechanisms, etc.)
- core services (things that "applications" would tend to require)
- applications (intended to be greatly abstracted)

A homogeneous environment makes the design of the RTOS simpler. I don't
have to deal with endian-ness issues, different data representations,
on-the-fly marshalling conversions in the RPC mechanism, etc. It lets the
*distributed* system more closely resemble an SMP implementation. Running
on bare metal, the RTOS can eke out a bit more performance (to compensate
for some of the inefficiency of a distributed system).

The core services run atop the RTOS and can benefit from its rich API.
E.g., the details of implementing redundant tasks can be hidden from the
application layer so it just looks like a "system call" -- the application
need not worry about *where* the redundant task gets spawned... nor how
the data streams get routed/coordinated.

The application layer runs in VMs on each node.
So, I can just "move" the contents of a particular "application" to a VM
running on another node and resume execution there (the system deals with
ensuring all the communication paths move *with* the application).

Timeliness guarantees get harder to satisfy as you move up from the metal.
E.g., communication delays vary for local IPC vs. RPC -- and what was an
IPC a moment ago may become an RPC if one endpoint happens to be relocated
as part of the load shifting. As a result, most of the true "real-time"
issues are handled in the core services -- as they are designed to be
aware of these run-time issues in a way that "application developers" are
(deliberately) not.

My system/hardware design implicitly assumes some nodes will always be
accessible (e.g., they are all connected/powered from the network switch
so the switch, at the very least, is present in all configurations!). This
allows other "key services" to be guaranteed. E.g., I have a single
persistent store that satisfies all of the needs of the nodes in the
system (with redundancy implemented *within*). Likewise, precise clock
synchronization so the movement of "tasks" doesn't introduce relative
timing anomalies.

If you can't make any such guarantees, then you'll have to ensure each
node is capable enough to provide any guarantees that *it* requires along
with those of the other nodes in whichever mesh (or portion thereof) it
resides in.

> I think the API is more flexible, especially if I have an heterogeneous (by
> CPU, OS) set of connected object. However, working at the kernel level
> allows existing programs not to be rewritten. What are your thoughts ?

What "existing programs"? Are you envisioning desktop applications
migrating into your mesh? Or, the data sourced from the mesh integrating
with desktop apps? Or, desktop apps sourcing data that the mesh *sinks*?

> Do you recommend another list ?

Probably -- though I'm not sure which one...

Good luck!
From owner-freebsd-hackers@freebsd.org Mon Feb 1 19:04:21 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 3C14BA97577 for ; Mon, 1 Feb 2016 19:04:21 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id A8FCA8A7 for ; Mon, 1 Feb 2016 19:04:20 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id u11J4FeO032423 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Mon, 1 Feb 2016 21:04:16 +0200 (EET) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua u11J4FeO032423 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id u11J4FPm032422; Mon, 1 Feb 2016 21:04:15 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Mon, 1 Feb 2016 21:04:15 +0200 From: Konstantin Belousov To: Jov Cc: freebsd-hackers@freebsd.org Subject: Re: Fwd: [BUGS] BUG #13900: stop standby failed with writer process hang(happen 3 times in 2 days) Message-ID: <20160201190415.GO91220@kib.kiev.ua> References: <20160130071346.31022.37189@wrigleys.postgresql.org> <20160131162929.GI91220@kib.kiev.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.24 (2015-08-30) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: 
freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 01 Feb 2016 19:04:21 -0000

On Mon, Feb 01, 2016 at 10:07:18AM +0800, Jov wrote:
> Thanks Belousov! this is the output:
> root@fb:~ # ps axlHww | grep 979
> 1001 979 1 0 4 0 196940 8236 - Ds - 0:06.03
> postgres: writer process (postgres)
> 0 48614 48587 0 20 0 18824 2252 piperd S+ 0 0:00.00
> grep 979
>
>
> I use PostgreSQL on FreeBSD more than 3 years and this is the first time I
> have problems.
> I am suspicious of this problem related with the harden of the security of
> the FreeBSD, I add these entry to all of my server about one month ago:

I doubt that this is related.

The problem you reported is somewhat known; I am not sure about the state of it on stable/9. There were some fixes on HEAD and stable/10, after many attempted changes. I currently believe that it is mostly fixed, and the only pending fix is a minor patch I provided to several people. But this is only for HEAD and 10.

>
> security.bsd.see_other_uids=0
> security.bsd.see_other_gids=0
> security.bsd.unprivileged_read_msgbuf=0
> security.bsd.unprivileged_proc_debug=0
> security.bsd.hardlink_check_uid=1
> security.bsd.hardlink_check_gid=1

> security.bsd.stack_guard_page=1

This setting would do nothing for you.

> security.bsd.unprivileged_mlock=0

And this one causes less 'security' since programs like gpg no longer can allocate non-swappable memory for sensitive data without setuid.
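
To illustrate the last point, here is a minimal userland sketch of how a program might keep key material out of swap with mlock(2). The struct secret_buf type and the secret_buf_alloc()/secret_buf_free() helpers are hypothetical names made up for this note and are not taken from gpg or from the FreeBSD tree. The assumption, based on the remark above, is that with security.bsd.unprivileged_mlock=0 the mlock() call would be refused for an unprivileged process; the sketch records that failure instead of aborting, so a caller can decide whether to continue with swappable memory.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper type for this sketch; not taken from gpg or FreeBSD. */
struct secret_buf {
	void	*sb_addr;	/* page-aligned buffer, NULL when unused */
	size_t	 sb_len;	/* usable length, rounded up to whole pages */
	int	 sb_locked;	/* 1 if mlock(2) succeeded, 0 otherwise */
	int	 sb_errno;	/* errno from a failed mlock(2), else 0 */
};

/*
 * Allocate a page-aligned buffer and try to pin it in RAM so that it is
 * not written to swap.  Returns 0 when the buffer was allocated (whether
 * locking worked is recorded in sb_locked/sb_errno) and -1 when the
 * request was invalid or the allocation itself failed.
 */
static int
secret_buf_alloc(struct secret_buf *sb, size_t len)
{
	long pagesz = sysconf(_SC_PAGESIZE);
	size_t psz;

	sb->sb_addr = NULL;
	sb->sb_len = 0;
	sb->sb_locked = 0;
	sb->sb_errno = 0;
	if (pagesz <= 0 || len == 0)
		return (-1);

	psz = (size_t)pagesz;
	sb->sb_len = ((len + psz - 1) / psz) * psz;
	if (posix_memalign(&sb->sb_addr, psz, sb->sb_len) != 0) {
		sb->sb_addr = NULL;
		sb->sb_len = 0;
		return (-1);
	}

	/* May be refused by policy or by RLIMIT_MEMLOCK; remember why. */
	if (mlock(sb->sb_addr, sb->sb_len) == 0)
		sb->sb_locked = 1;
	else
		sb->sb_errno = errno;
	return (0);
}

/* Wipe the buffer, unlock it if it was locked, and release it. */
static void
secret_buf_free(struct secret_buf *sb)
{
	volatile unsigned char *p = sb->sb_addr;
	size_t i;

	if (sb->sb_addr == NULL)
		return;
	/* volatile stores so the wipe is not optimized away */
	for (i = 0; i < sb->sb_len; i++)
		p[i] = 0;
	if (sb->sb_locked)
		(void)munlock(sb->sb_addr, sb->sb_len);
	free(sb->sb_addr);
	sb->sb_addr = NULL;
	sb->sb_len = 0;
	sb->sb_locked = 0;
}
```

A caller would check sb_locked after secret_buf_alloc() and, if it is 0, either warn that secrets may reach swap or refuse to proceed, which is roughly the choice a program is left with when locking is denied and it is not installed setuid.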
From owner-freebsd-hackers@freebsd.org Tue Feb 2 01:36:40 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 77B05A9851A for ; Tue, 2 Feb 2016 01:36:40 +0000 (UTC) (envelope-from zhao6014@gmail.com) Received: from mail-yk0-x229.google.com (mail-yk0-x229.google.com [IPv6:2607:f8b0:4002:c07::229]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 33A941B39 for ; Tue, 2 Feb 2016 01:36:40 +0000 (UTC) (envelope-from zhao6014@gmail.com) Received: by mail-yk0-x229.google.com with SMTP id z13so68539431ykd.0 for ; Mon, 01 Feb 2016 17:36:40 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type; bh=VvPpj7lYqyddGkiPnpSaNWLTkIAvOvwmbWBo2Evdy5k=; b=RQlwPNp6YSjmWTB5xENWg7XYJ+ff7ab+e97RXL79F6rDl7dx1lxbx0zep8SJ3I2Ww7 jb6MPfVLQVofgkcuTR44NOfjr0XuPpZvHcczooUbaHHeSfj+rmz9jBBCHJPnFRMhR/as BcUwYPh3yfgSdhwl1v/uQ6HpSz+6QLKF8yFu6Y3lwmimZuLJQi0ujdha4ANXI6/JqVQ5 Mt0WzjuM2/CBzJaKpoStghwBVZknOgA/isBwxXKvo7TEI3SMxV0CrihnFWl42Uy9AcBT TzoERf586XvEqxSq6TyGE9NBUwKYy3M5w0/pW0yOMZBbAlA1IMxje2n6Q2aGR38DpMJB +qUg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc:content-type; bh=VvPpj7lYqyddGkiPnpSaNWLTkIAvOvwmbWBo2Evdy5k=; b=aURvBHHrlLjAJydf1sOx9BK8Jv+ZkcYUYXWdJDakqjjRuX6dXL6YD/hrieHWpFRW5Q xnwHzD1io1MQNxZu/QBfNEl2ZgL/HD4ry1nnSc3r5IFMDRN89O5Xes++8PVTikYPNs9K RwMz3CfTBmCAidVAes26Lb+LJJG7ENWlsc18WgMVUV/O1pxc8IshaJ2wIYD5MTSsFFMd j8NxYEI385xCvdyR/ATc6proiBGtk535ixPreFPpe7Uza9/njnkxxxQtFpjleU3p8JG2 
6hVZnX5v4nY6/haWgbjUI7CgR520qML4U3kLAHlmBKi9UY6lRfDtjlGCPjHrfQIFTXGs afew== X-Gm-Message-State: AG10YOQraNwo4nVZ74TNshhTSVo6grF9S5kLmKaPS09DrWoeDLHg68zWAMa84t2Na6aacIhlQb0ytsdEIC0dqA== X-Received: by 10.37.4.146 with SMTP id 140mr15584308ybe.18.1454376998610; Mon, 01 Feb 2016 17:36:38 -0800 (PST) MIME-Version: 1.0 Received: by 10.37.79.6 with HTTP; Mon, 1 Feb 2016 17:36:19 -0800 (PST) In-Reply-To: <20160201190415.GO91220@kib.kiev.ua> References: <20160130071346.31022.37189@wrigleys.postgresql.org> <20160131162929.GI91220@kib.kiev.ua> <20160201190415.GO91220@kib.kiev.ua> From: Jov Date: Tue, 2 Feb 2016 09:36:19 +0800 Message-ID: Subject: Re: Fwd: [BUGS] BUG #13900: stop standby failed with writer process hang(happen 3 times in 2 days) To: Konstantin Belousov Cc: freebsd-hackers@freebsd.org Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.20 X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Feb 2016 01:36:40 -0000

Thanks for your info, Belousov.

Would you please point me to the PR or commit I can examine? I cannot move to stable/10 now; maybe I can avoid triggering this bug or add some monitoring.

Jov
blog: http:amutu.com/blog

2016-02-02 3:04 GMT+08:00 Konstantin Belousov :

> On Mon, Feb 01, 2016 at 10:07:18AM +0800, Jov wrote:
> > Thanks Belousov! this is the output:
> > root@fb:~ # ps axlHww | grep 979
> > 1001 979 1 0 4 0 196940 8236 - Ds - 0:06.03
> > postgres: writer process (postgres)
> > 0 48614 48587 0 20 0 18824 2252 piperd S+ 0 0:00.00
> > grep 979
> >
> >
> > I use PostgreSQL on FreeBSD more than 3 years and this is the first time I
> > have problems.
> > I am suspicious of this problem related with the harden of the security of
> > the FreeBSD, I add these entry to all of my server about one month ago:
> I doubt that this is related.
>
> The problem you reported is somewhat known; I am not sure about the state
> of it on stable/9. There were some fixes on HEAD and stable/10, after
> many attempted changes. I currently believe that it is mostly
> fixed, and the only pending fix is a minor patch I provided to several
> people. But this is only for HEAD and 10.
>
> >
> > security.bsd.see_other_uids=0
> > security.bsd.see_other_gids=0
> > security.bsd.unprivileged_read_msgbuf=0
> > security.bsd.unprivileged_proc_debug=0
> > security.bsd.hardlink_check_uid=1
> > security.bsd.hardlink_check_gid=1
>
> > security.bsd.stack_guard_page=1
> This setting would do nothing for you.
>
> > security.bsd.unprivileged_mlock=0
> And this one causes less 'security' since programs like gpg no longer can
> allocate non-swappable memory for sensitive data without setuid.
> From owner-freebsd-hackers@freebsd.org Tue Feb 2 04:07:55 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 91F4EA978E4 for ; Tue, 2 Feb 2016 04:07:55 +0000 (UTC) (envelope-from mjguzik@gmail.com) Received: from mail-wm0-x243.google.com (mail-wm0-x243.google.com [IPv6:2a00:1450:400c:c09::243]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 1E2E51C4E; Tue, 2 Feb 2016 04:07:55 +0000 (UTC) (envelope-from mjguzik@gmail.com) Received: by mail-wm0-x243.google.com with SMTP id 128so483448wmz.3; Mon, 01 Feb 2016 20:07:55 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=yLFxl4l7A+oYW5XPuv16uObuhE/RhPfCOX87b7d2MeQ=; b=waUJGWkNG2NRmCJbdRFuGSxNFvNIVVAmSOVOV4JVW+V094R4EnsELpfniwaA+K+qyW K+kjRbZWMhh4Xl+OopBn+Nsd/hg5GFyZTeQmvM/YzmkRHW1DH0DhpYhIVmS+8ZnFb8ll R7C6m1BqWsEPZ503Q8qtOAyOwI4Ycn2xhoX+5EZtOhWRglbhaXrqDr6RMli/wloreNDY V9FpUTjP+jBi4xtWCjPDJeab0cW0+0FwPN/vC/gNbhAZ9LRHG1dnvCrpm+h4K1sevQ+H bm3dXgzsZPTkaDiD4n4u7ARPk83qvPNxgw5aVC0mH8pRB21TWdRWXZJGUpiZ/Dk/oXVm K7ZQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=yLFxl4l7A+oYW5XPuv16uObuhE/RhPfCOX87b7d2MeQ=; b=YtRVjzfIip8Rn6qLIsY5DY7Bb/o3OIPTU6AHb3J1b4ItgPblgRbe4TpOTBXJK89QaP 1lcFO5+6GR+1FUGTnqec6WesJzJ00jjqZJ9Qx44fdyQ4TsDsmGtqFSaIGv4q3SSnL3yj A6bmfJEcC2m3Q/TfU5Rcs8PK71bccmEN/Bv4/EpxyyMiYBxfhhnHdTBNgmKbjDFQAfty DtBniuRcQvfjl2FSwU1LtJCx5DrPPQY4raO4MVPylpdaoAo0fjycPcCZjNiWE/G2XfA+ whEfXoZzZfiApYTFJDD4K63FiM/TgXZM3upGPaLHkiLVfe2KxWFt2Q1V4wrHk/JUxa6p 8Skg== X-Gm-Message-State: 
AG10YOS0rLUot1N9jOgdWFqdubZvCxuU6xPkPJwqz1aqkKtMynJrcSymfKkEpW9Xw2H/bg== X-Received: by 10.28.63.200 with SMTP id m191mr3082380wma.21.1454386073421; Mon, 01 Feb 2016 20:07:53 -0800 (PST) Received: from mguzik.localdomain (ip-62-245-66-110.net.upcbroadband.cz. [62.245.66.110]) by smtp.gmail.com with ESMTPSA id n131sm550333wmf.9.2016.02.01.20.07.52 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 01 Feb 2016 20:07:52 -0800 (PST) From: Mateusz Guzik To: kib@freebsd.org Cc: freebsd-hackers@freebsd.org, Mateusz Guzik Subject: [PATCH 2/2] fork: plug a use after free of the returned process pointer Date: Tue, 2 Feb 2016 05:07:49 +0100 Message-Id: <1454386069-29657-3-git-send-email-mjguzik@gmail.com> X-Mailer: git-send-email 2.4.3 In-Reply-To: <1454386069-29657-1-git-send-email-mjguzik@gmail.com> References: <20160201103632.GL91220@kib.kiev.ua> <1454386069-29657-1-git-send-email-mjguzik@gmail.com> X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Feb 2016 04:07:55 -0000 From: Mateusz Guzik fork1 required its callers to pass a pointer to struct proc * which would be set to the new process (if any). procdesc and racct manipulation also used said pointer. However, the process could have exited prior to do_fork return and be automatically reaped, thus making this a use-after-free. Fix the problem by letting callers indicate whether they want the pid or the struct proc, return the process in stopped state for the latter case. 
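
The calling convention described above can be sketched in userland roughly as follows. This is only an illustration of the contract the patch appears to enforce (RFSTOPPED callers receive the process pointer, everyone else may ask only for the pid); it is not kernel code. The sk_ names, the SK_RF* flag values, and the cut-down structures are placeholders invented for this sketch; the real definitions are in sys/sys/proc.h and sys/sys/unistd.h.

```c
#include <stdbool.h>
#include <stddef.h>

/* Placeholder flag values for this sketch only, not the kernel's RF* bits. */
#define	SK_RFPROC	(1 << 0)
#define	SK_RFSTOPPED	(1 << 1)

/* Cut-down stand-in for struct proc. */
struct sk_proc {
	int	p_pid;
};

/* Cut-down stand-in for struct fork_req. */
struct sk_fork_req {
	int		 fr_flags;
	int		*fr_pidp;	/* caller wants only the child's pid */
	struct sk_proc	**fr_procp;	/* caller wants the stopped child */
};

/*
 * Mirrors the MPASS() checks added to fork1(): a process pointer is only
 * handed out when the child is left stopped, since a running child could
 * exit and be reaped before the caller looks at it.
 */
static bool
sk_fork_req_valid(const struct sk_fork_req *fr)
{
	if ((fr->fr_flags & SK_RFSTOPPED) != 0)
		return (fr->fr_procp != NULL && fr->fr_pidp == NULL);
	return (fr->fr_procp == NULL);
}

/*
 * Mirrors the tail of do_fork(): copy out the pid for a running child,
 * or the pointer for a stopped one.
 */
static void
sk_fork_report(const struct sk_fork_req *fr, struct sk_proc *child)
{
	if ((fr->fr_flags & SK_RFSTOPPED) == 0) {
		if (fr->fr_pidp != NULL)
			*fr->fr_pidp = child->p_pid;
	} else {
		*fr->fr_procp = child;
	}
}
```

In this sketch a sys_fork()-style caller would set fr_pidp and leave fr_procp NULL, while a caller such as linux_fork() or kproc_create(), which passes RFSTOPPED and finishes setting up the child itself, would set fr_procp instead.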
--- sys/compat/cloudabi/cloudabi_proc.c | 2 - sys/kern/kern_fork.c | 104 +++++++++++++++++++----------------- sys/kern/kern_racct.c | 3 +- sys/sys/proc.h | 1 + 4 files changed, 58 insertions(+), 52 deletions(-) diff --git a/sys/compat/cloudabi/cloudabi_proc.c b/sys/compat/cloudabi/cloudabi_proc.c index 010efca..2bc50ca 100644 --- a/sys/compat/cloudabi/cloudabi_proc.c +++ b/sys/compat/cloudabi/cloudabi_proc.c @@ -77,12 +77,10 @@ cloudabi_sys_proc_fork(struct thread *td, { struct fork_req fr = {}; struct filecaps fcaps = {}; - struct proc *p2; int error, fd; cap_rights_init(&fcaps.fc_rights, CAP_FSTAT, CAP_EVENT); fr.fr_flags = RFFDG | RFPROC | RFPROCDESC; - fr.fr_procp = &p2; fr.fr_pd_fd = &fd; fr.fr_pd_fcaps = &fcaps; error = fork1(td, &fr); diff --git a/sys/kern/kern_fork.c b/sys/kern/kern_fork.c index 3b51b7f..d0c3837 100644 --- a/sys/kern/kern_fork.c +++ b/sys/kern/kern_fork.c @@ -102,14 +102,13 @@ int sys_fork(struct thread *td, struct fork_args *uap) { struct fork_req fr = {}; - int error; - struct proc *p2; + int error, pid; fr.fr_flags = RFFDG | RFPROC; - fr.fr_procp = &p2; + fr.fr_pidp = &pid; error = fork1(td, &fr); if (error == 0) { - td->td_retval[0] = p2->p_pid; + td->td_retval[0] = pid; td->td_retval[1] = 0; } return (error); @@ -122,11 +121,10 @@ sys_pdfork(td, uap) struct pdfork_args *uap; { struct fork_req fr = {}; - int error, fd; - struct proc *p2; + int error, fd, pid; fr.fr_flags = RFFDG | RFPROC | RFPROCDESC; - fr.fr_procp = &p2; + fr.fr_pidp = &pid; fr.fr_pd_fd = &fd; fr.fr_pd_flags = uap->flags; /* @@ -136,7 +134,7 @@ sys_pdfork(td, uap) */ error = fork1(td, &fr); if (error == 0) { - td->td_retval[0] = p2->p_pid; + td->td_retval[0] = pid; td->td_retval[1] = 0; error = copyout(&fd, uap->fdp, sizeof(fd)); } @@ -148,14 +146,13 @@ int sys_vfork(struct thread *td, struct vfork_args *uap) { struct fork_req fr = {}; - int error; - struct proc *p2; + int error, pid; fr.fr_flags = RFFDG | RFPROC | RFPPWAIT | RFMEM; - fr.fr_procp = &p2; + fr.fr_pidp = 
&pid; error = fork1(td, &fr); if (error == 0) { - td->td_retval[0] = p2->p_pid; + td->td_retval[0] = pid; td->td_retval[1] = 0; } return (error); @@ -165,8 +162,7 @@ int sys_rfork(struct thread *td, struct rfork_args *uap) { struct fork_req fr = {}; - struct proc *p2; - int error; + int error, pid; /* Don't allow kernel-only flags. */ if ((uap->flags & RFKERNELONLY) != 0) @@ -174,10 +170,10 @@ sys_rfork(struct thread *td, struct rfork_args *uap) AUDIT_ARG_FFLAGS(uap->flags); fr.fr_flags = uap->flags; - fr.fr_procp = &p2; + fr.fr_pidp = &pid; error = fork1(td, &fr); if (error == 0) { - td->td_retval[0] = p2 ? p2->p_pid : 0; + td->td_retval[0] = pid; td->td_retval[1] = 0; } return (error); @@ -378,20 +374,21 @@ fail: } static void -do_fork(struct thread *td, int flags, struct proc *p2, struct thread *td2, - struct vmspace *vm2, int pdflags) +do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread *td2, + struct vmspace *vm2, struct file *fp_procdesc) { struct proc *p1, *pptr; - int p2_held, trypid; + int trypid; struct filedesc *fd; struct filedesc_to_leader *fdtol; struct sigacts *newsigacts; + int flags; sx_assert(&proctree_lock, SX_SLOCKED); sx_assert(&allproc_lock, SX_XLOCKED); - p2_held = 0; p1 = td->td_proc; + flags = fr->fr_flags; trypid = fork_findpid(flags); @@ -690,7 +687,7 @@ do_fork(struct thread *td, int flags, struct proc *p2, struct thread *td2, * However, don't do this until after fork(2) can no longer fail. */ if (flags & RFPROCDESC) - procdesc_new(p2, pdflags); + procdesc_new(p2, fr->fr_pd_flags); /* * Both processes are set up, now check if any loadable modules want @@ -718,6 +715,11 @@ do_fork(struct thread *td, int flags, struct proc *p2, struct thread *td2, if ((flags & RFMEM) == 0 && dtrace_fasttrap_fork) dtrace_fasttrap_fork(p1, p2); #endif + /* + * Hold the process so that it cannot exit after we make it runnable, + * but before we wait for the debugger. 
+ */ + _PHOLD(p2); if ((p1->p_flag & (P_TRACED | P_FOLLOWFORK)) == (P_TRACED | P_FOLLOWFORK)) { /* @@ -730,24 +732,12 @@ do_fork(struct thread *td, int flags, struct proc *p2, struct thread *td2, td->td_dbgflags |= TDB_FORK; td->td_dbg_forked = p2->p_pid; td2->td_dbgflags |= TDB_STOPATFORK; - _PHOLD(p2); - p2_held = 1; } if (flags & RFPPWAIT) { td->td_pflags |= TDP_RFPPWAIT; td->td_rfppwait_p = p2; } PROC_UNLOCK(p2); - if ((flags & RFSTOPPED) == 0) { - /* - * If RFSTOPPED not requested, make child runnable and - * add to run queue. - */ - thread_lock(td2); - TD_SET_CAN_RUN(td2); - sched_add(td2, SRQ_BORING); - thread_unlock(td2); - } /* * Now can be swapped. @@ -761,14 +751,34 @@ do_fork(struct thread *td, int flags, struct proc *p2, struct thread *td2, knote_fork(&p1->p_klist, p2->p_pid); SDT_PROBE3(proc, , , create, p2, p1, flags); + if (flags & RFPROCDESC) { + procdesc_finit(p2->p_procdesc, fp_procdesc); + fdrop(fp_procdesc, td); + } + + if ((flags & RFSTOPPED) == 0) { + /* + * If RFSTOPPED not requested, make child runnable and + * add to run queue. + */ + thread_lock(td2); + TD_SET_CAN_RUN(td2); + sched_add(td2, SRQ_BORING); + thread_unlock(td2); + if (fr->fr_pidp != NULL) + *fr->fr_pidp = p2->p_pid; + } else { + *fr->fr_procp = p2; + } + + PROC_LOCK(p2); /* * Wait until debugger is attached to child. */ - PROC_LOCK(p2); while ((td2->td_dbgflags & TDB_STOPATFORK) != 0) cv_wait(&p2->p_dbgwait, &p2->p_mtx); - if (p2_held) - _PRELE(p2); + _PRELE(p2); + racct_proc_fork_done(p2); PROC_UNLOCK(p2); } @@ -788,6 +798,11 @@ fork1(struct thread *td, struct fork_req *fr) flags = fr->fr_flags; pages = fr->fr_pages; + if ((flags & RFSTOPPED) != 0) + MPASS(fr->fr_procp != NULL && fr->fr_pidp == NULL); + else + MPASS(fr->fr_procp == NULL); + /* Check for the undefined or unimplemented flags. */ if ((flags & ~(RFFLAGS | RFTSIGFLAGS(RFTSIGMASK))) != 0) return (EINVAL); @@ -821,7 +836,10 @@ fork1(struct thread *td, struct fork_req *fr) * certain parts of a process from itself. 
*/ if ((flags & RFPROC) == 0) { - *fr->fr_procp = NULL; + if (fr->fr_procp != NULL) + *fr->fr_procp = NULL; + else if (fr->fr_pidp != NULL) + *fr->fr_pidp = 0; return (fork_norfproc(td, flags)); } @@ -949,17 +967,7 @@ fork1(struct thread *td, struct fork_req *fr) lim_cur(td, RLIMIT_NPROC)); } if (ok) { - do_fork(td, flags, newproc, td2, vm2, fr->fr_pd_flags); - - /* - * Return child proc pointer to parent. - */ - *fr->fr_procp = newproc; - if (flags & RFPROCDESC) { - procdesc_finit(newproc->p_procdesc, fp_procdesc); - fdrop(fp_procdesc, td); - } - racct_proc_fork_done(newproc); + do_fork(td, fr, newproc, td2, vm2, fp_procdesc); return (0); } diff --git a/sys/kern/kern_racct.c b/sys/kern/kern_racct.c index 0c7c0c4..ce7e2a4 100644 --- a/sys/kern/kern_racct.c +++ b/sys/kern/kern_racct.c @@ -957,16 +957,15 @@ void racct_proc_fork_done(struct proc *child) { + PROC_LOCK_ASSERT(child, MA_OWNED); #ifdef RCTL if (!racct_enable) return; - PROC_LOCK(child); mtx_lock(&racct_lock); rctl_enforce(child, RACCT_NPROC, 0); rctl_enforce(child, RACCT_NTHR, 0); mtx_unlock(&racct_lock); - PROC_UNLOCK(child); #endif } diff --git a/sys/sys/proc.h b/sys/sys/proc.h index ac96566..039fd39 100644 --- a/sys/sys/proc.h +++ b/sys/sys/proc.h @@ -910,6 +910,7 @@ struct proc *zpfind(pid_t); /* Find zombie process by id. 
*/ struct fork_req { int fr_flags; int fr_pages; + int *fr_pidp; struct proc **fr_procp; int *fr_pd_fd; int fr_pd_flags; -- 2.7.0 From owner-freebsd-hackers@freebsd.org Tue Feb 2 04:07:54 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8D248A978D6 for ; Tue, 2 Feb 2016 04:07:54 +0000 (UTC) (envelope-from mjguzik@gmail.com) Received: from mail-wm0-x242.google.com (mail-wm0-x242.google.com [IPv6:2a00:1450:400c:c09::242]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 190C31C48; Tue, 2 Feb 2016 04:07:54 +0000 (UTC) (envelope-from mjguzik@gmail.com) Received: by mail-wm0-x242.google.com with SMTP id p63so493187wmp.1; Mon, 01 Feb 2016 20:07:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=Q5UVUj9RLeQGinQvNojj+g10laQqR5okl6sEhon9WRk=; b=m7JwyibyhhIQz5Fz/EEA7rpXmXKpl75ULnnLve+2FO24c/BNd97bpLVoPB9NFy1cX8 NIi1kaWy6NFoItL+myROxuycO0DgDhxivR1Rh7IW4i39usZz9r1l+SriTXdrHU4iyewM QA77GnAQDiwMHftD5C+c0G2GC384RprsYlLpAcQE2C+gDegfvf/k4JJFKXqEIvWQOz40 qV8O+S49nNY2T3YEL1NS7PxT2U25vYL7VchaEbBWgI8iPbPp8Qo4nf/R3Y5TxZdGmpfg SzxyO57u+LUJe91OlLWCDJDLEEAtdF/u6/cNUjVLZsZDAFx6v3c5vZyGinqtHpO+BzEf 3xqQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=Q5UVUj9RLeQGinQvNojj+g10laQqR5okl6sEhon9WRk=; b=ToCzzLK+n9BtTrCoAN+/cjSan5gU5QBrFjtnxyRUDu7OovUQwXsLQSnhpnCCTWWI5y eu55AZblfcHZudLVrN3c+cRuWF8lrIcHBVOgzN0NJLWZ9CE1rqW8SIxR4lSgTEmq099u 69jrwGp53gK6s3EkS50yi0Fcf5HdBEixYvpoOghi5czbRjqPxoTvYmShSnzigq9ZS8tr 
8LFMBGOsBtWMy4/qYbbo6FM3ySmkH6QdnU+rpwvAla4mnvmxCFfu7HdB4A6zLfT73sG7 ufNiwIA1BPugZHLAyI4dT7IEM3aqxr6WmOTL+ZFnbJahAXGo07vCXgZ6zbrO7elS7WG5 7kaQ== X-Gm-Message-State: AG10YORT2HB9dRiqXgYot57b6AjpO3RXmayUWUUi1vKaCsqH7sQQUpxV/7cN+viX9am2Qg== X-Received: by 10.28.173.208 with SMTP id w199mr14193707wme.45.1454386072626; Mon, 01 Feb 2016 20:07:52 -0800 (PST) Received: from mguzik.localdomain (ip-62-245-66-110.net.upcbroadband.cz. [62.245.66.110]) by smtp.gmail.com with ESMTPSA id n131sm550333wmf.9.2016.02.01.20.07.51 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 01 Feb 2016 20:07:52 -0800 (PST) From: Mateusz Guzik To: kib@freebsd.org Cc: freebsd-hackers@freebsd.org, Mateusz Guzik Subject: [PATCH 1/2] fork: pass arguments to fork1 in a dedicated structure Date: Tue, 2 Feb 2016 05:07:48 +0100 Message-Id: <1454386069-29657-2-git-send-email-mjguzik@gmail.com> X-Mailer: git-send-email 2.4.3 In-Reply-To: <1454386069-29657-1-git-send-email-mjguzik@gmail.com> References: <20160201103632.GL91220@kib.kiev.ua> <1454386069-29657-1-git-send-email-mjguzik@gmail.com> X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Feb 2016 04:07:54 -0000 From: Mateusz Guzik --- sys/compat/cloudabi/cloudabi_proc.c | 7 +++++- sys/compat/linux/linux_fork.c | 17 ++++++++++---- sys/kern/init_main.c | 6 +++-- sys/kern/kern_fork.c | 46 +++++++++++++++++++++++++------------ sys/kern/kern_kthread.c | 7 ++++-- sys/sys/proc.h | 12 ++++++++-- 6 files changed, 68 insertions(+), 27 deletions(-) diff --git a/sys/compat/cloudabi/cloudabi_proc.c b/sys/compat/cloudabi/cloudabi_proc.c index d917337..010efca 100644 --- a/sys/compat/cloudabi/cloudabi_proc.c +++ b/sys/compat/cloudabi/cloudabi_proc.c @@ -75,12 +75,17 @@ int cloudabi_sys_proc_fork(struct thread *td, struct cloudabi_sys_proc_fork_args *uap) { + 
struct fork_req fr = {}; struct filecaps fcaps = {}; struct proc *p2; int error, fd; cap_rights_init(&fcaps.fc_rights, CAP_FSTAT, CAP_EVENT); - error = fork1(td, RFFDG | RFPROC | RFPROCDESC, 0, &p2, &fd, 0, &fcaps); + fr.fr_flags = RFFDG | RFPROC | RFPROCDESC; + fr.fr_procp = &p2; + fr.fr_pd_fd = &fd; + fr.fr_pd_fcaps = &fcaps; + error = fork1(td, &fr); if (error != 0) return (error); /* Return the file descriptor to the parent process. */ diff --git a/sys/compat/linux/linux_fork.c b/sys/compat/linux/linux_fork.c index d0f73ad..4cd521b 100644 --- a/sys/compat/linux/linux_fork.c +++ b/sys/compat/linux/linux_fork.c @@ -64,6 +64,7 @@ __FBSDID("$FreeBSD$"); int linux_fork(struct thread *td, struct linux_fork_args *args) { + struct fork_req fr = {}; int error; struct proc *p2; struct thread *td2; @@ -73,8 +74,9 @@ linux_fork(struct thread *td, struct linux_fork_args *args) printf(ARGS(fork, "")); #endif - if ((error = fork1(td, RFFDG | RFPROC | RFSTOPPED, 0, &p2, NULL, 0, - NULL)) != 0) + fr.fr_flags = RFFDG | RFPROC | RFSTOPPED; + fr.fr_procp = &p2; + if ((error = fork1(td, &fr)) != 0) return (error); td2 = FIRST_THREAD_IN_PROC(p2); @@ -97,6 +99,7 @@ linux_fork(struct thread *td, struct linux_fork_args *args) int linux_vfork(struct thread *td, struct linux_vfork_args *args) { + struct fork_req fr = {}; int error; struct proc *p2; struct thread *td2; @@ -106,8 +109,9 @@ linux_vfork(struct thread *td, struct linux_vfork_args *args) printf(ARGS(vfork, "")); #endif - if ((error = fork1(td, RFFDG | RFPROC | RFMEM | RFPPWAIT | RFSTOPPED, - 0, &p2, NULL, 0, NULL)) != 0) + fr.fr_flags = RFFDG | RFPROC | RFMEM | RFPPWAIT | RFSTOPPED; + fr.fr_procp = &p2; + if ((error = fork1(td, &fr)) != 0) return (error); td2 = FIRST_THREAD_IN_PROC(p2); @@ -130,6 +134,7 @@ linux_vfork(struct thread *td, struct linux_vfork_args *args) static int linux_clone_proc(struct thread *td, struct linux_clone_args *args) { + struct fork_req fr = {}; int error, ff = RFPROC | RFSTOPPED; struct proc *p2; 
struct thread *td2; @@ -170,7 +175,9 @@ linux_clone_proc(struct thread *td, struct linux_clone_args *args) if (args->flags & LINUX_CLONE_VFORK) ff |= RFPPWAIT; - error = fork1(td, ff, 0, &p2, NULL, 0, NULL); + fr.fr_flags = ff; + fr.fr_procp = &p2; + error = fork1(td, &fr); if (error) return (error); diff --git a/sys/kern/init_main.c b/sys/kern/init_main.c index 8d5580b..a63d4e0 100644 --- a/sys/kern/init_main.c +++ b/sys/kern/init_main.c @@ -828,12 +828,14 @@ start_init(void *dummy) static void create_init(const void *udata __unused) { + struct fork_req fr = {}; struct ucred *newcred, *oldcred; struct thread *td; int error; - error = fork1(&thread0, RFFDG | RFPROC | RFSTOPPED, 0, &initproc, - NULL, 0, NULL); + fr.fr_flags = RFFDG | RFPROC | RFSTOPPED; + fr.fr_procp = &initproc; + error = fork1(&thread0, &fr); if (error) panic("cannot fork init: %d\n", error); KASSERT(initproc->p_pid == 1, ("create_init: initproc->p_pid != 1")); diff --git a/sys/kern/kern_fork.c b/sys/kern/kern_fork.c index e7d7276..3b51b7f 100644 --- a/sys/kern/kern_fork.c +++ b/sys/kern/kern_fork.c @@ -101,10 +101,13 @@ struct fork_args { int sys_fork(struct thread *td, struct fork_args *uap) { + struct fork_req fr = {}; int error; struct proc *p2; - error = fork1(td, RFFDG | RFPROC, 0, &p2, NULL, 0, NULL); + fr.fr_flags = RFFDG | RFPROC; + fr.fr_procp = &p2; + error = fork1(td, &fr); if (error == 0) { td->td_retval[0] = p2->p_pid; td->td_retval[1] = 0; @@ -118,16 +121,20 @@ sys_pdfork(td, uap) struct thread *td; struct pdfork_args *uap; { + struct fork_req fr = {}; int error, fd; struct proc *p2; + fr.fr_flags = RFFDG | RFPROC | RFPROCDESC; + fr.fr_procp = &p2; + fr.fr_pd_fd = &fd; + fr.fr_pd_flags = uap->flags; /* * It is necessary to return fd by reference because 0 is a valid file * descriptor number, and the child needs to be able to distinguish * itself from the parent using the return value. 
*/ - error = fork1(td, RFFDG | RFPROC | RFPROCDESC, 0, &p2, - &fd, uap->flags, NULL); + error = fork1(td, &fr); if (error == 0) { td->td_retval[0] = p2->p_pid; td->td_retval[1] = 0; @@ -140,11 +147,13 @@ sys_pdfork(td, uap) int sys_vfork(struct thread *td, struct vfork_args *uap) { - int error, flags; + struct fork_req fr = {}; + int error; struct proc *p2; - flags = RFFDG | RFPROC | RFPPWAIT | RFMEM; - error = fork1(td, flags, 0, &p2, NULL, 0, NULL); + fr.fr_flags = RFFDG | RFPROC | RFPPWAIT | RFMEM; + fr.fr_procp = &p2; + error = fork1(td, &fr); if (error == 0) { td->td_retval[0] = p2->p_pid; td->td_retval[1] = 0; @@ -155,6 +164,7 @@ sys_vfork(struct thread *td, struct vfork_args *uap) int sys_rfork(struct thread *td, struct rfork_args *uap) { + struct fork_req fr = {}; struct proc *p2; int error; @@ -163,7 +173,9 @@ sys_rfork(struct thread *td, struct rfork_args *uap) return (EINVAL); AUDIT_ARG_FFLAGS(uap->flags); - error = fork1(td, uap->flags, 0, &p2, NULL, 0, NULL); + fr.fr_flags = uap->flags; + fr.fr_procp = &p2; + error = fork1(td, &fr); if (error == 0) { td->td_retval[0] = p2 ? p2->p_pid : 0; td->td_retval[1] = 0; @@ -761,8 +773,7 @@ do_fork(struct thread *td, int flags, struct proc *p2, struct thread *td2, } int -fork1(struct thread *td, int flags, int pages, struct proc **procp, - int *procdescp, int pdflags, struct filecaps *fcaps) +fork1(struct thread *td, struct fork_req *fr) { struct proc *p1, *newproc; struct thread *td2; @@ -772,6 +783,10 @@ fork1(struct thread *td, int flags, int pages, struct proc **procp, int error, nprocs_new, ok; static int curfail; static struct timeval lastfail; + int flags, pages; + + flags = fr->fr_flags; + pages = fr->fr_pages; /* Check for the undefined or unimplemented flags. */ if ((flags & ~(RFFLAGS | RFTSIGFLAGS(RFTSIGMASK))) != 0) @@ -795,7 +810,7 @@ fork1(struct thread *td, int flags, int pages, struct proc **procp, return (EINVAL); /* Must provide a place to put a procdesc if creating one. 
*/ - if (procdescp == NULL) + if (fr->fr_pd_fd == NULL) return (EINVAL); } @@ -806,7 +821,7 @@ fork1(struct thread *td, int flags, int pages, struct proc **procp, * certain parts of a process from itself. */ if ((flags & RFPROC) == 0) { - *procp = NULL; + *fr->fr_procp = NULL; return (fork_norfproc(td, flags)); } @@ -845,7 +860,8 @@ fork1(struct thread *td, int flags, int pages, struct proc **procp, * later. */ if (flags & RFPROCDESC) { - error = falloc_caps(td, &fp_procdesc, procdescp, 0, fcaps); + error = falloc_caps(td, &fp_procdesc, fr->fr_pd_fd, 0, + fr->fr_pd_fcaps); if (error != 0) goto fail2; } @@ -933,12 +949,12 @@ fork1(struct thread *td, int flags, int pages, struct proc **procp, lim_cur(td, RLIMIT_NPROC)); } if (ok) { - do_fork(td, flags, newproc, td2, vm2, pdflags); + do_fork(td, flags, newproc, td2, vm2, fr->fr_pd_flags); /* * Return child proc pointer to parent. */ - *procp = newproc; + *fr->fr_procp = newproc; if (flags & RFPROCDESC) { procdesc_finit(newproc->p_procdesc, fp_procdesc); fdrop(fp_procdesc, td); @@ -962,7 +978,7 @@ fail2: vmspace_free(vm2); uma_zfree(proc_zone, newproc); if ((flags & RFPROCDESC) != 0 && fp_procdesc != NULL) { - fdclose(td, fp_procdesc, *procdescp); + fdclose(td, fp_procdesc, *fr->fr_pd_fd); fdrop(fp_procdesc, td); } atomic_add_int(&nprocs, -1); diff --git a/sys/kern/kern_kthread.c b/sys/kern/kern_kthread.c index 2072dc7..970c33f 100644 --- a/sys/kern/kern_kthread.c +++ b/sys/kern/kern_kthread.c @@ -80,6 +80,7 @@ int kproc_create(void (*func)(void *), void *arg, struct proc **newpp, int flags, int pages, const char *fmt, ...) 
{ + struct fork_req fr = {}; int error; va_list ap; struct thread *td; @@ -88,8 +89,10 @@ kproc_create(void (*func)(void *), void *arg, if (!proc0.p_stats) panic("kproc_create called too soon"); - error = fork1(&thread0, RFMEM | RFFDG | RFPROC | RFSTOPPED | flags, - pages, &p2, NULL, 0, NULL); + fr.fr_flags = RFMEM | RFFDG | RFPROC | RFSTOPPED | flags; + fr.fr_pages = pages; + fr.fr_procp = &p2; + error = fork1(&thread0, &fr); if (error) return error; diff --git a/sys/sys/proc.h b/sys/sys/proc.h index f2f4a9d..ac96566 100644 --- a/sys/sys/proc.h +++ b/sys/sys/proc.h @@ -907,6 +907,15 @@ struct proc *pfind_locked(pid_t pid); struct pgrp *pgfind(pid_t); /* Find process group by id. */ struct proc *zpfind(pid_t); /* Find zombie process by id. */ +struct fork_req { + int fr_flags; + int fr_pages; + struct proc **fr_procp; + int *fr_pd_fd; + int fr_pd_flags; + struct filecaps *fr_pd_fcaps; +}; + /* * pget() flags. */ @@ -930,8 +939,7 @@ int enterpgrp(struct proc *p, pid_t pgid, struct pgrp *pgrp, int enterthispgrp(struct proc *p, struct pgrp *pgrp); void faultin(struct proc *p); void fixjobc(struct proc *p, struct pgrp *pgrp, int entering); -int fork1(struct thread *, int, int, struct proc **, int *, int, - struct filecaps *); +int fork1(struct thread *, struct fork_req *); void fork_exit(void (*)(void *, struct trapframe *), void *, struct trapframe *); void fork_return(struct thread *, struct trapframe *); -- 2.7.0 From owner-freebsd-hackers@freebsd.org Tue Feb 2 04:07:54 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id AAC34A978D8 for ; Tue, 2 Feb 2016 04:07:54 +0000 (UTC) (envelope-from mjguzik@gmail.com) Received: from mail-wm0-x241.google.com (mail-wm0-x241.google.com [IPv6:2a00:1450:400c:c09::241]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer 
"Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 44FB71C49; Tue, 2 Feb 2016 04:07:54 +0000 (UTC) (envelope-from mjguzik@gmail.com) Received: by mail-wm0-x241.google.com with SMTP id 128so483375wmz.3; Mon, 01 Feb 2016 20:07:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=QoBxuQvVEuBRl1hO/OLw7OZjvaPhcWnoiTIkgDJsPuI=; b=jo0+QZ7BV4psQvZhTDK0q84ebRrtIqd6dnfADKmaEebIKZ3J0ukv5KUSHl5JwhykUS qpTgWPLC4FFTWH+M460pcwcoz0Wq3NBek1Itf5uoEpkFH3ez5m/+Ztt2sR9qs6+P93nF VMvJcf4zKdAVvHEOI2wAas0qWoUeOQ1N0+Qf9udSNFHh1QiZrNuTlG2xMi2JzwuBuelC bShrTd8NHHbUrihM5bcmkckovRB0TlaeNgAUIcgyIuSKhOrNi/JHADsvY074jmsbgZfC HRYcPKcupa2ZTrKkgLOPNZ1T+0rMt5OHL8v3b6qL26ZdpzF40KsD+MefptpCU2dVXWP7 BvAQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=QoBxuQvVEuBRl1hO/OLw7OZjvaPhcWnoiTIkgDJsPuI=; b=LLzQWXP9mmlV6j/pSledFojTLr1pbv6u/EX1p5SHw9JoJdEcOzTszHdswDU8zjdE2J 1B+vqIdu85Q9cGgezpwihZ8Gyxdfb3b1OBR9CL40wIa5t5O0OGKGoCZOL5wzj0sxHTWA HD/iq5AhxWXkmLlenz27WrMfnrMoPhg/kdWobqP6rv9LIL4XY25ZIvAg7o9Rru1s/3KQ J7UqP3Tz0XeQkSfMmogfy0UX50OL1DTAhCjSyGvbXF7tbEwMnpTcfVtR/LaeRKlBeSL/ 4UZ++fO0oEUKpLsWnMBBWB2QyYDzN9jRIt8/UZJguzBVJJpPeqfQWzsqgh3L7vPWjW5F i/bg== X-Gm-Message-State: AG10YOTOkx5jxMS3DueOB29ec9+cuZYK0Em/px3SOhzervbP1q5wzxlzg/MrnpsZJyO9Og== X-Received: by 10.194.20.5 with SMTP id j5mr25863534wje.71.1454386071816; Mon, 01 Feb 2016 20:07:51 -0800 (PST) Received: from mguzik.localdomain (ip-62-245-66-110.net.upcbroadband.cz. 
[62.245.66.110]) by smtp.gmail.com with ESMTPSA id n131sm550333wmf.9.2016.02.01.20.07.50 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 01 Feb 2016 20:07:51 -0800 (PST) From: Mateusz Guzik To: kib@freebsd.org Cc: freebsd-hackers@freebsd.org, Mateusz Guzik Subject: Re: [PATCH 0/2] plug fork use-after-free Date: Tue, 2 Feb 2016 05:07:47 +0100 Message-Id: <1454386069-29657-1-git-send-email-mjguzik@gmail.com> X-Mailer: git-send-email 2.4.3 In-Reply-To: <20160201103632.GL91220@kib.kiev.ua> References: <20160201103632.GL91220@kib.kiev.ua> X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Feb 2016 04:07:54 -0000 From: Mateusz Guzik On Mon, Feb 01, 2016 at 12:36:32PM +0200, Konstantin Belousov wrote: > On Mon, Feb 01, 2016 at 06:13:02AM +0100, Mateusz Guzik wrote: > > From: Mateusz Guzik > > > > Quite some time ago I reported a problem with fork and provided a half-assed > > patch, see: > > https://lists.freebsd.org/pipermail/freebsd-hackers/2014-October/046212.html > > > > Now I got around to fixing the problem in a less hackish manner. > > > > Note that despite the new process possibly immediately exiting and being > > waited on, returning its (possibly now reused) PID is fine - that's the > > pid it possibly saw by other means and in the worst case the process is racing > > with itself. > > > > To reiterate, as it is, the code has a use-after-free in procdesc and racct > > handling. > > > > The first patch is a small cleanup to reduce the number of arguments to > > fork1, which was getting out of hand. I don't feel strongly about the > > name of the structure used in there. > > > > I agree with the fix, but I want the approach to be pushed further. > > First, please pack all arguments to fork1() into the struct.
I think > everything except the curthread pointer should be packed into the > argument structure. You have to touch all fork1() callers anyway, and > with the structure approach you could avoid doing the second pass over > all the callers (in the second patch), esp. if the structure is bzeroed > before being filled. Done. There is a local 'int pid' var passed around, I can change that to use td_retval[0]. > > Second, it puzzles me that do_fork() takes both the p2 and > procp arguments. Wouldn't it be cleaner to assign to *procp (or > fork_req->procp) in fork1 ? I understand why this cannot be done with > *procpid. I would say it's cleaner to keep both assignments close, but I don't care that much. In general, as was discussed some time ago, the code should be restructured anyway to not put PRS_NEW processes on the list, and once that happens both assignments will likely be handled prior to entering do_fork. -- Mateusz Guzik Mateusz Guzik (2): fork: pass arguments to fork1 in a dedicated structure fork: plug a use after free of the returned process pointer sys/compat/cloudabi/cloudabi_proc.c | 7 +- sys/compat/linux/linux_fork.c | 17 +++-- sys/kern/init_main.c | 6 +- sys/kern/kern_fork.c | 134 +++++++++++++++++++++--------------- sys/kern/kern_kthread.c | 7 +- sys/kern/kern_racct.c | 3 +- sys/sys/proc.h | 13 +++- 7 files changed, 117 insertions(+), 70 deletions(-) -- 2.7.0 From owner-freebsd-hackers@freebsd.org Tue Feb 2 13:11:52 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 39776A97554 for ; Tue, 2 Feb 2016 13:11:52 +0000 (UTC) (envelope-from kib@freebsd.org) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 13CA71ECF; Tue, 2 Feb 2016 13:11:50
+0000 (UTC) (envelope-from kib@freebsd.org) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id u12DBkqY083706 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Tue, 2 Feb 2016 15:11:46 +0200 (EET) (envelope-from kib@freebsd.org) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua u12DBkqY083706 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id u12DBj3Z083705; Tue, 2 Feb 2016 15:11:45 +0200 (EET) (envelope-from kib@freebsd.org) X-Authentication-Warning: tom.home: kostik set sender to kib@freebsd.org using -f Date: Tue, 2 Feb 2016 15:11:45 +0200 From: Konstantin Belousov To: Mateusz Guzik Cc: freebsd-hackers@freebsd.org, Mateusz Guzik Subject: Re: [PATCH 1/2] fork: pass arguments to fork1 in a dedicated structure Message-ID: <20160202131145.GT91220@kib.kiev.ua> References: <20160201103632.GL91220@kib.kiev.ua> <1454386069-29657-1-git-send-email-mjguzik@gmail.com> <1454386069-29657-2-git-send-email-mjguzik@gmail.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="ZwgA9U+XZDXt4+m+" Content-Disposition: inline In-Reply-To: <1454386069-29657-2-git-send-email-mjguzik@gmail.com> User-Agent: Mutt/1.5.24 (2015-08-30) X-Spam-Status: No, score=-2.9 required=5.0 tests=ALL_TRUSTED,BAYES_00 autolearn=ham autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Feb 2016 13:11:52 -0000 --ZwgA9U+XZDXt4+m+ Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Tue, Feb 02, 2016 at 05:07:48AM +0100, Mateusz Guzik wrote: > From: Mateusz Guzik >=20 > --- > sys/compat/cloudabi/cloudabi_proc.c | 7 
+++++- > sys/compat/linux/linux_fork.c | 17 ++++++++++---- > sys/kern/init_main.c | 6 +++-- > sys/kern/kern_fork.c | 46 +++++++++++++++++++++++++------= ------ > sys/kern/kern_kthread.c | 7 ++++-- > sys/sys/proc.h | 12 ++++++++-- > 6 files changed, 68 insertions(+), 27 deletions(-) >=20 > diff --git a/sys/compat/cloudabi/cloudabi_proc.c b/sys/compat/cloudabi/cl= oudabi_proc.c > index d917337..010efca 100644 > --- a/sys/compat/cloudabi/cloudabi_proc.c > +++ b/sys/compat/cloudabi/cloudabi_proc.c > @@ -75,12 +75,17 @@ int > cloudabi_sys_proc_fork(struct thread *td, > struct cloudabi_sys_proc_fork_args *uap) > { > + struct fork_req fr =3D {}; > struct filecaps fcaps =3D {}; > struct proc *p2; > int error, fd; > =20 > cap_rights_init(&fcaps.fc_rights, CAP_FSTAT, CAP_EVENT); > - error =3D fork1(td, RFFDG | RFPROC | RFPROCDESC, 0, &p2, &fd, 0, &fcaps= ); > + fr.fr_flags =3D RFFDG | RFPROC | RFPROCDESC; > + fr.fr_procp =3D &p2; > + fr.fr_pd_fd =3D &fd; > + fr.fr_pd_fcaps =3D &fcaps; > + error =3D fork1(td, &fr); > if (error !=3D 0) > return (error); The patch is fine. Would be great to not use initializer in declaration, i.e. use bzero() instead of c99 designated initializers. 
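[Annotation, not part of the original message.] The two zeroing styles under discussion can be compared in a minimal userspace sketch. Everything below is an assumption made for illustration: `struct fork_req_sketch` is a hypothetical stand-in that only borrows field names from the patch, `memset()` stands in for the kernel's `bzero()`, and `{ 0 }` is used instead of the patch's empty braces because `= {}` is a GNU extension before C23. It is not the kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for the patch's struct fork_req. */
struct fork_req_sketch {
	int	fr_flags;
	int	fr_pages;
	void	**fr_procp;	/* optional output, NULL when unused */
	int	*fr_pd_fd;	/* optional output, NULL when unused */
	int	fr_pd_flags;
};

/* Style used in the patch: zero the structure in its declaration. */
static struct fork_req_sketch
make_req_initializer(int flags)
{
	struct fork_req_sketch fr = { 0 };

	fr.fr_flags = flags;
	return (fr);
}

/* Style suggested in the review: zero explicitly, then fill. */
static struct fork_req_sketch
make_req_memset(int flags)
{
	struct fork_req_sketch fr;

	memset(&fr, 0, sizeof(fr));	/* userspace analogue of bzero() */
	fr.fr_flags = flags;
	return (fr);
}

/* A callee can treat every field the caller left alone as "not requested". */
static int
count_requested_outputs(const struct fork_req_sketch *fr)
{
	assert(fr != NULL);
	return ((fr->fr_procp != NULL) + (fr->fr_pd_fd != NULL));
}
```

Either way, a caller sets only the fields it cares about and the rest read as 0 or NULL, which is what lets new fields be added to the structure without touching every caller again.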
--ZwgA9U+XZDXt4+m+ Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAEBCAAGBQJWsKsRAAoJEJDCuSvBvK1BAuwP+wfZssh5Krj5Oxqp4RbyVRqU RxZ44ZwWprSHsU0vnnpMwTWp0R/YbRHq+6Cb3pieY2UZEtqUv76/Iu4v2gi3Lfyt VEbn02wNDHIbaV5rzwkB1xGj2mwwcgKxIjz42OLQ2F4iYkie/i9oUbF4m9tFhKJ2 rWQqXul4+qFaPVGIzdK8amr9J5DvoSzOG3dCXH7X+VkoOf+QEs5YCrxim34gsX3O xGzX3zf+l/N5i1QTH/awWX2q5Ah1FyOjjgGu0MEestvRNiNi1KK0NDJ7OD5UNcvW bHRqMSkiP2tUdkjl2hg6etHRy1MfipJ6TAOcZjaeKhf4ffz50bT0Am+b0avgRBTg 3UNx0D41NYVal3zY8/HGXbYsUHyC3IOIEgnxILXNH6K7w4n/5FCC2K7MIAreYZ/A MV9tRqWPn2i+M4PZ7s1Ugq1PlSq1jqigofTGOFnDQ0l51EaVxyDFA5xqMwDRQFx5 169RXBS67j0JucGIDQ1kVTkQOoMG3pZmciZm1VND4KQKfXk1DWWFVugSe3TNORmH 0bERq/w9rVlUug17XbbRK0cidftu64DbhoVwbZ4/GHbZonNBMqlY6iW1H1bSBDtY 5SwKBY+jhHqpvWO6kMWff8YcTt/sWvu40EM1+O3xQkLTZfukJbQbzKF25v8sC+MJ 9qfJuED9qB9E8PBBaXea =V/nc -----END PGP SIGNATURE----- --ZwgA9U+XZDXt4+m+-- From owner-freebsd-hackers@freebsd.org Tue Feb 2 13:17:27 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 532F4A97936 for ; Tue, 2 Feb 2016 13:17:27 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from zxy.spb.ru (zxy.spb.ru [195.70.199.98]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 116C738C; Tue, 2 Feb 2016 13:17:27 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from slw by zxy.spb.ru with local (Exim 4.86 (FreeBSD)) (envelope-from ) id 1aQapZ-000EMz-5F; Tue, 02 Feb 2016 16:17:17 +0300 Date: Tue, 2 Feb 2016 16:17:17 +0300 From: Slawa Olhovchenkov To: Konstantin Belousov Cc: Mateusz Guzik , freebsd-hackers@freebsd.org, Mateusz Guzik Subject: Re: [PATCH 1/2] fork: pass arguments to fork1 in a dedicated structure Message-ID: <20160202131717.GO88527@zxy.spb.ru> References: 
<20160201103632.GL91220@kib.kiev.ua> <1454386069-29657-1-git-send-email-mjguzik@gmail.com> <1454386069-29657-2-git-send-email-mjguzik@gmail.com> <20160202131145.GT91220@kib.kiev.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160202131145.GT91220@kib.kiev.ua> User-Agent: Mutt/1.5.24 (2015-08-30) X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: slw@zxy.spb.ru X-SA-Exim-Scanned: No (on zxy.spb.ru); SAEximRunCond expanded to false X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Feb 2016 13:17:27 -0000 On Tue, Feb 02, 2016 at 03:11:45PM +0200, Konstantin Belousov wrote: > On Tue, Feb 02, 2016 at 05:07:48AM +0100, Mateusz Guzik wrote: > > From: Mateusz Guzik > > > > --- > > sys/compat/cloudabi/cloudabi_proc.c | 7 +++++- > > sys/compat/linux/linux_fork.c | 17 ++++++++++---- > > sys/kern/init_main.c | 6 +++-- > > sys/kern/kern_fork.c | 46 +++++++++++++++++++++++++------------ > > sys/kern/kern_kthread.c | 7 ++++-- > > sys/sys/proc.h | 12 ++++++++-- > > 6 files changed, 68 insertions(+), 27 deletions(-) > > > > diff --git a/sys/compat/cloudabi/cloudabi_proc.c b/sys/compat/cloudabi/cloudabi_proc.c > > index d917337..010efca 100644 > > --- a/sys/compat/cloudabi/cloudabi_proc.c > > +++ b/sys/compat/cloudabi/cloudabi_proc.c > > @@ -75,12 +75,17 @@ int > > cloudabi_sys_proc_fork(struct thread *td, > > struct cloudabi_sys_proc_fork_args *uap) > > { > > + struct fork_req fr = {}; > > struct filecaps fcaps = {}; > > struct proc *p2; > > int error, fd; > > > > cap_rights_init(&fcaps.fc_rights, CAP_FSTAT, CAP_EVENT); > > - error = fork1(td, RFFDG | RFPROC | RFPROCDESC, 0, &p2, &fd, 0, &fcaps); > > + fr.fr_flags = RFFDG | RFPROC | RFPROCDESC; > > + fr.fr_procp = &p2; > > + fr.fr_pd_fd = &fd; > > + fr.fr_pd_fcaps = &fcaps; > > + error = 
fork1(td, &fr); > > if (error != 0) > > return (error); > The patch is fine. > > Would be great to not use initializer in declaration, i.e. use bzero() > instead of c99 designated initializers. For what purpose? The compiler may use more efficient zeroing with c99 designated initializers. From owner-freebsd-hackers@freebsd.org Tue Feb 2 13:23:30 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 95D57A97BF5 for ; Tue, 2 Feb 2016 13:23:30 +0000 (UTC) (envelope-from kib@freebsd.org) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 2DDC3A17; Tue, 2 Feb 2016 13:23:30 +0000 (UTC) (envelope-from kib@freebsd.org) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id u12DNMP2086056 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Tue, 2 Feb 2016 15:23:23 +0200 (EET) (envelope-from kib@freebsd.org) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua u12DNMP2086056 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id u12DNMI8086055; Tue, 2 Feb 2016 15:23:22 +0200 (EET) (envelope-from kib@freebsd.org) X-Authentication-Warning: tom.home: kostik set sender to kib@freebsd.org using -f Date: Tue, 2 Feb 2016 15:23:22 +0200 From: Konstantin Belousov To: Mateusz Guzik Cc: freebsd-hackers@freebsd.org, Mateusz Guzik Subject: Re: [PATCH 2/2] fork: plug a use after free of the returned process pointer Message-ID: <20160202132322.GU91220@kib.kiev.ua> References: <20160201103632.GL91220@kib.kiev.ua> <1454386069-29657-1-git-send-email-mjguzik@gmail.com> <1454386069-29657-3-git-send-email-mjguzik@gmail.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha256;
protocol="application/pgp-signature"; boundary="YToU2i3Vx8H2dn7O" Content-Disposition: inline In-Reply-To: <1454386069-29657-3-git-send-email-mjguzik@gmail.com> User-Agent: Mutt/1.5.24 (2015-08-30) X-Spam-Status: No, score=-2.9 required=5.0 tests=ALL_TRUSTED,BAYES_00 autolearn=ham autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Feb 2016 13:23:30 -0000 --YToU2i3Vx8H2dn7O Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Tue, Feb 02, 2016 at 05:07:49AM +0100, Mateusz Guzik wrote: > From: Mateusz Guzik >=20 > fork1 required its callers to pass a pointer to struct proc * which would > be set to the new process (if any). procdesc and racct manipulation also > used said pointer. >=20 > However, the process could have exited prior to do_fork return and be > automatically reaped, thus making this a use-after-free. >=20 > Fix the problem by letting callers indicate whether they want the pid or > the struct proc, return the process in stopped state for the latter case. Patch looks fine, some style notes and one question is below. 
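[Annotation, not part of the original message.] The caller contract described in the quoted commit message (a pid for children that are made runnable, a struct proc pointer only for children left stopped) can be sketched in userspace roughly as follows. The names `SK_RFSTOPPED`, `proc_sketch`, `fork_req_sketch`, `fork_req_valid` and `fork_report` are hypothetical, the flag value is made up, and `assert()` stands in for the kernel's `MPASS()`; this is an illustration of the idea, not the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>

#define	SK_RFSTOPPED	0x1	/* hypothetical value, stands in for RFSTOPPED */

/* Hypothetical stand-in for struct proc. */
struct proc_sketch {
	int	p_pid;
};

/* Hypothetical subset of the patch's struct fork_req. */
struct fork_req_sketch {
	int			fr_flags;
	int			*fr_pidp;	/* wanted by ordinary fork callers */
	struct proc_sketch	**fr_procp;	/* only valid with SK_RFSTOPPED */
};

/* Returns 1 when the request follows the contract the patch asserts. */
static int
fork_req_valid(const struct fork_req_sketch *fr)
{
	if ((fr->fr_flags & SK_RFSTOPPED) != 0)
		return (fr->fr_procp != NULL && fr->fr_pidp == NULL);
	return (fr->fr_procp == NULL);
}

/*
 * Report the child to the caller.  A runnable child may exit and be
 * reaped at any time, so only its pid (a plain integer) is handed back;
 * a stopped child cannot go away, so handing back the pointer is safe.
 */
static void
fork_report(const struct fork_req_sketch *fr, struct proc_sketch *child)
{
	assert(fork_req_valid(fr));
	if ((fr->fr_flags & SK_RFSTOPPED) == 0) {
		if (fr->fr_pidp != NULL)
			*fr->fr_pidp = child->p_pid;
	} else {
		*fr->fr_procp = child;
	}
}
```

The point of the split is that the pid is copied out while the child is still known to exist, so the caller never dereferences a pointer to a process that may already have been freed.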
> --- > sys/compat/cloudabi/cloudabi_proc.c | 2 - > sys/kern/kern_fork.c | 104 +++++++++++++++++++-----------= ------ > sys/kern/kern_racct.c | 3 +- > sys/sys/proc.h | 1 + > 4 files changed, 58 insertions(+), 52 deletions(-) >=20 > diff --git a/sys/compat/cloudabi/cloudabi_proc.c b/sys/compat/cloudabi/cl= oudabi_proc.c > index 010efca..2bc50ca 100644 > --- a/sys/compat/cloudabi/cloudabi_proc.c > +++ b/sys/compat/cloudabi/cloudabi_proc.c > @@ -77,12 +77,10 @@ cloudabi_sys_proc_fork(struct thread *td, > { > struct fork_req fr =3D {}; > struct filecaps fcaps =3D {}; > - struct proc *p2; > int error, fd; > =20 > cap_rights_init(&fcaps.fc_rights, CAP_FSTAT, CAP_EVENT); > fr.fr_flags =3D RFFDG | RFPROC | RFPROCDESC; > - fr.fr_procp =3D &p2; > fr.fr_pd_fd =3D &fd; > fr.fr_pd_fcaps =3D &fcaps; > error =3D fork1(td, &fr); > diff --git a/sys/kern/kern_fork.c b/sys/kern/kern_fork.c > index 3b51b7f..d0c3837 100644 > --- a/sys/kern/kern_fork.c > +++ b/sys/kern/kern_fork.c > @@ -102,14 +102,13 @@ int > sys_fork(struct thread *td, struct fork_args *uap) > { > struct fork_req fr =3D {}; > - int error; > - struct proc *p2; > + int error, pid; > =20 > fr.fr_flags =3D RFFDG | RFPROC; > - fr.fr_procp =3D &p2; > + fr.fr_pidp =3D &pid; > error =3D fork1(td, &fr); > if (error =3D=3D 0) { > - td->td_retval[0] =3D p2->p_pid; > + td->td_retval[0] =3D pid; > td->td_retval[1] =3D 0; > } > return (error); > @@ -122,11 +121,10 @@ sys_pdfork(td, uap) > struct pdfork_args *uap; > { > struct fork_req fr =3D {}; > - int error, fd; > - struct proc *p2; > + int error, fd, pid; > =20 > fr.fr_flags =3D RFFDG | RFPROC | RFPROCDESC; > - fr.fr_procp =3D &p2; > + fr.fr_pidp =3D &pid; > fr.fr_pd_fd =3D &fd; > fr.fr_pd_flags =3D uap->flags; > /* > @@ -136,7 +134,7 @@ sys_pdfork(td, uap) > */ > error =3D fork1(td, &fr); > if (error =3D=3D 0) { > - td->td_retval[0] =3D p2->p_pid; > + td->td_retval[0] =3D pid; > td->td_retval[1] =3D 0; > error =3D copyout(&fd, uap->fdp, sizeof(fd)); > } > @@ -148,14 +146,13 @@ 
int > sys_vfork(struct thread *td, struct vfork_args *uap) > { > struct fork_req fr =3D {}; > - int error; > - struct proc *p2; > + int error, pid; > =20 > fr.fr_flags =3D RFFDG | RFPROC | RFPPWAIT | RFMEM; > - fr.fr_procp =3D &p2; > + fr.fr_pidp =3D &pid; > error =3D fork1(td, &fr); > if (error =3D=3D 0) { > - td->td_retval[0] =3D p2->p_pid; > + td->td_retval[0] =3D pid; > td->td_retval[1] =3D 0; > } > return (error); > @@ -165,8 +162,7 @@ int > sys_rfork(struct thread *td, struct rfork_args *uap) > { > struct fork_req fr =3D {}; > - struct proc *p2; > - int error; > + int error, pid; > =20 > /* Don't allow kernel-only flags. */ > if ((uap->flags & RFKERNELONLY) !=3D 0) > @@ -174,10 +170,10 @@ sys_rfork(struct thread *td, struct rfork_args *uap) > =20 > AUDIT_ARG_FFLAGS(uap->flags); > fr.fr_flags =3D uap->flags; > - fr.fr_procp =3D &p2; > + fr.fr_pidp =3D &pid; > error =3D fork1(td, &fr); > if (error =3D=3D 0) { > - td->td_retval[0] =3D p2 ? p2->p_pid : 0; > + td->td_retval[0] =3D pid; > td->td_retval[1] =3D 0; > } > return (error); > @@ -378,20 +374,21 @@ fail: > } > =20 > static void > -do_fork(struct thread *td, int flags, struct proc *p2, struct thread *td= 2, > - struct vmspace *vm2, int pdflags) > +do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct = thread *td2, > + struct vmspace *vm2, struct file *fp_procdesc) Style, line is too long. > { > struct proc *p1, *pptr; > - int p2_held, trypid; > + int trypid; > struct filedesc *fd; > struct filedesc_to_leader *fdtol; > struct sigacts *newsigacts; > + int flags; > =20 > sx_assert(&proctree_lock, SX_SLOCKED); > sx_assert(&allproc_lock, SX_XLOCKED); > =20 > - p2_held =3D 0; > p1 =3D td->td_proc; > + flags =3D fr->fr_flags; Why not use fr->fr_flags directly ? It is slightly more churn, but IMO it is worth it. 
> =20 > trypid =3D fork_findpid(flags); > =20 > @@ -690,7 +687,7 @@ do_fork(struct thread *td, int flags, struct proc *p2= , struct thread *td2, > * However, don't do this until after fork(2) can no longer fail. > */ > if (flags & RFPROCDESC) > - procdesc_new(p2, pdflags); > + procdesc_new(p2, fr->fr_pd_flags); > =20 > /* > * Both processes are set up, now check if any loadable modules want > @@ -718,6 +715,11 @@ do_fork(struct thread *td, int flags, struct proc *p= 2, struct thread *td2, > if ((flags & RFMEM) =3D=3D 0 && dtrace_fasttrap_fork) > dtrace_fasttrap_fork(p1, p2); > #endif > + /* > + * Hold the process so that it cannot exit after we make it runnable, > + * but before we wait for the debugger. Is this possible ? The forked child must execute through fork_return(), and there we do ptracestop() before the child has a chance to ever return to usermode. Do you mean a scenario where the debugger detaches before child executes fork_return() and TDP_STOPATFORK is cleared in advance ? > + */ > + _PHOLD(p2); > if ((p1->p_flag & (P_TRACED | P_FOLLOWFORK)) =3D=3D (P_TRACED | > P_FOLLOWFORK)) { > /* > @@ -730,24 +732,12 @@ do_fork(struct thread *td, int flags, struct proc *= p2, struct thread *td2, > td->td_dbgflags |=3D TDB_FORK; > td->td_dbg_forked =3D p2->p_pid; > td2->td_dbgflags |=3D TDB_STOPATFORK; > - _PHOLD(p2); > - p2_held =3D 1; > } > if (flags & RFPPWAIT) { > td->td_pflags |=3D TDP_RFPPWAIT; > td->td_rfppwait_p =3D p2; > } > PROC_UNLOCK(p2); > - if ((flags & RFSTOPPED) =3D=3D 0) { > - /* > - * If RFSTOPPED not requested, make child runnable and > - * add to run queue. > - */ > - thread_lock(td2); > - TD_SET_CAN_RUN(td2); > - sched_add(td2, SRQ_BORING); > - thread_unlock(td2); > - } > =20 > /* > * Now can be swapped. 
> @@ -761,14 +751,34 @@ do_fork(struct thread *td, int flags, struct proc *= p2, struct thread *td2, > knote_fork(&p1->p_klist, p2->p_pid); > SDT_PROBE3(proc, , , create, p2, p1, flags); > =20 > + if (flags & RFPROCDESC) { > + procdesc_finit(p2->p_procdesc, fp_procdesc); > + fdrop(fp_procdesc, td); > + } > + > + if ((flags & RFSTOPPED) =3D=3D 0) { > + /* > + * If RFSTOPPED not requested, make child runnable and > + * add to run queue. > + */ > + thread_lock(td2); > + TD_SET_CAN_RUN(td2); > + sched_add(td2, SRQ_BORING); > + thread_unlock(td2); > + if (fr->fr_pidp !=3D NULL) > + *fr->fr_pidp =3D p2->p_pid; > + } else { > + *fr->fr_procp =3D p2; > + } > + > + PROC_LOCK(p2); > /* > * Wait until debugger is attached to child. > */ > - PROC_LOCK(p2); > while ((td2->td_dbgflags & TDB_STOPATFORK) !=3D 0) > cv_wait(&p2->p_dbgwait, &p2->p_mtx); > - if (p2_held) > - _PRELE(p2); > + _PRELE(p2); > + racct_proc_fork_done(p2); > PROC_UNLOCK(p2); > } > =20 > @@ -788,6 +798,11 @@ fork1(struct thread *td, struct fork_req *fr) > flags =3D fr->fr_flags; > pages =3D fr->fr_pages; > =20 > + if ((flags & RFSTOPPED) !=3D 0) > + MPASS(fr->fr_procp !=3D NULL && fr->fr_pidp =3D=3D NULL); > + else > + MPASS(fr->fr_procp =3D=3D NULL); > + > /* Check for the undefined or unimplemented flags. */ > if ((flags & ~(RFFLAGS | RFTSIGFLAGS(RFTSIGMASK))) !=3D 0) > return (EINVAL); > @@ -821,7 +836,10 @@ fork1(struct thread *td, struct fork_req *fr) > * certain parts of a process from itself. > */ > if ((flags & RFPROC) =3D=3D 0) { > - *fr->fr_procp =3D NULL; > + if (fr->fr_procp !=3D NULL) > + *fr->fr_procp =3D NULL; > + else if (fr->fr_pidp !=3D NULL) > + *fr->fr_pidp =3D 0; > return (fork_norfproc(td, flags)); > } > =20 > @@ -949,17 +967,7 @@ fork1(struct thread *td, struct fork_req *fr) > lim_cur(td, RLIMIT_NPROC)); > } > if (ok) { > - do_fork(td, flags, newproc, td2, vm2, fr->fr_pd_flags); > - > - /* > - * Return child proc pointer to parent. 
> - */ > - *fr->fr_procp =3D newproc; > - if (flags & RFPROCDESC) { > - procdesc_finit(newproc->p_procdesc, fp_procdesc); > - fdrop(fp_procdesc, td); > - } > - racct_proc_fork_done(newproc); > + do_fork(td, fr, newproc, td2, vm2, fp_procdesc); > return (0); > } > =20 > diff --git a/sys/kern/kern_racct.c b/sys/kern/kern_racct.c > index 0c7c0c4..ce7e2a4 100644 > --- a/sys/kern/kern_racct.c > +++ b/sys/kern/kern_racct.c > @@ -957,16 +957,15 @@ void > racct_proc_fork_done(struct proc *child) > { > =20 > + PROC_LOCK_ASSERT(child, MA_OWNED); > #ifdef RCTL > if (!racct_enable) > return; > =20 > - PROC_LOCK(child); > mtx_lock(&racct_lock); > rctl_enforce(child, RACCT_NPROC, 0); > rctl_enforce(child, RACCT_NTHR, 0); > mtx_unlock(&racct_lock); > - PROC_UNLOCK(child); > #endif > } > =20 > diff --git a/sys/sys/proc.h b/sys/sys/proc.h > index ac96566..039fd39 100644 > --- a/sys/sys/proc.h > +++ b/sys/sys/proc.h > @@ -910,6 +910,7 @@ struct proc *zpfind(pid_t); /* Find zombie process b= y id. */ > struct fork_req { > int fr_flags; > int fr_pages; > + int *fr_pidp; > struct proc **fr_procp; > int *fr_pd_fd; > int fr_pd_flags; > --=20 > 2.7.0 --YToU2i3Vx8H2dn7O Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAEBCAAGBQJWsK3KAAoJEJDCuSvBvK1Bs2cP/2+e9oBdarmZxy4bGjzF0xbY CHWjs3Hpt5x/vrMxMXzEwB0ggEK2vA3HBbtvjLn5mDUX0UBode3OnUh2aGBqqwRG XdS9fQdbjIxl0qA5GdEp27jxvugDi2JxZ7Xom4JErS/bKJBeHKcutBo89n3KEMCA a6OEivYcdN8qJPd5TfOYDZVJDUHf98IwhZgKZ0O94toZybIWEFLnMZJBlHG0Y92V hKe469h3R6uK2SPpHKYZ8bkbrUCubqm+ROB404sRBiuahTYv6Iea726v7kEU5yjC nxPp+Jhly++lWCtVqh9ymZXGP1bIaZV+Qv/IsvtExDgFL/NSK6rvQQ52cRrxoaS4 CILlISeuV+2KjQh3Z9ceKyqn1hhO3xyb7dv56y+ASI0a0tT5hMHZYYZmPVR0uR3J QYEz3jR9VK9JyCrDRLY3Ny/JtyawJfKl92Od98OdLMRrh6UePdCNy1D2/wWUR7Gt KVWbTGhhy9UDp/yfQVMrdXc1GUrtL9CfpN/aGiJboEOMAaHEc81hLzDPMIRr5sGS vSX6T5b2LOmRUEqzTWqys3h1UIpWN2Bs9tYGZv1C1yIOXmZWlhv2pgc5xoRfC2YK 0LXK9n1onsX6EgJzyhfrmw6JBpW2OXIyXQZI8Y6a2RorKp4CemGQdWCfvoE1ITQB 
FOUAUKHyO0Y5hdEIBIKV =vt4+ -----END PGP SIGNATURE----- --YToU2i3Vx8H2dn7O-- From owner-freebsd-hackers@freebsd.org Tue Feb 2 13:01:20 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 7C811A970FE for ; Tue, 2 Feb 2016 13:01:20 +0000 (UTC) (envelope-from dewaynegeraghty@gmail.com) Received: from mail-lf0-x22a.google.com (mail-lf0-x22a.google.com [IPv6:2a00:1450:4010:c07::22a]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 07E831897 for ; Tue, 2 Feb 2016 13:01:20 +0000 (UTC) (envelope-from dewaynegeraghty@gmail.com) Received: by mail-lf0-x22a.google.com with SMTP id j78so47934415lfb.1 for ; Tue, 02 Feb 2016 05:01:19 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:from:date:message-id:subject:to:content-type; bh=IAUA+g9x6sFv6c7CJx+vD3opl0d/HQlrxRJaVXxvR1Y=; b=DNhLAjxi+85ywkJMYlVNewZZZNWaw/0ZlR0ROmTIffMgOmhD3IM4sMX61lXbtVbXKd RFMwNoRIH0VYVL2psKQL270fVev/O4MbXXhCIERMTXw+8NGyJMtl9O0nD0uLsl9nIQGh nEvMdRhKfOj0FmFqhUOb/OEJp2X6n3FXd1VSHAOlI28Xw3dXMYexjlE55kBhdZCL4Upi X+/MNqMyjJ2jaqSFxYHgYuPMp1w3ad2Pmz/kQ1JUKR2RO2mEc50NA3zGpWDvdBkjjonK eXQwkVDFEjJTOIqwvSPrLnV7PohAN1tA2pGWX2dFXcGD+fDS1+U80XYJI5GJB+pUfqCy wAyg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:from:date:message-id:subject:to :content-type; bh=IAUA+g9x6sFv6c7CJx+vD3opl0d/HQlrxRJaVXxvR1Y=; b=L68WMIogmkurjRB2yQDLRF+fqYf3BEvZt7fLEK4GTC/uFwqvR06s5drLlW3ueGLuCs MYGUTAf2vAHPpJvAaS/mTQlC0fvhPDCSfq4X3S4xPD6IfdLTs/9Jj5rf5VfK5/67lau/ mlTO7WHw1n4p3Ls7JR2gbl/DPjUk7B4scLw3GtLrSA6uCzuK55XKgyvUltnFw293SuvC /XwwuGgY4BmcAXX+XUYAQ7rC3J1F9kOfPartivkMz0PDbzgtjCH45j6/gwC2jU/W/YXf 
VPzMGW/6Uo5/pOFtJGKg91IzVli/9WlP6Stbi3bkYJW1QJjoaHs275wRsROhvlrZ/i+q s0og== X-Gm-Message-State: AG10YORwPHNh1Jeh3FuwUjadp/01AejTCDq5UV9b/oqB6JZnl6kwhsz10vuPQGD3r988dcV2gfWTP0PXKNyX7g== X-Received: by 10.25.170.129 with SMTP id t123mr11094945lfe.103.1454418078142; Tue, 02 Feb 2016 05:01:18 -0800 (PST) MIME-Version: 1.0 Received: by 10.25.155.145 with HTTP; Tue, 2 Feb 2016 05:00:48 -0800 (PST) From: Dewayne Geraghty Date: Wed, 3 Feb 2016 00:00:48 +1100 Message-ID: Subject: Priority settings for gmirror,cam,geom are at the right priority -8? To: freebsd-hackers@freebsd.org X-Mailman-Approved-At: Tue, 02 Feb 2016 13:49:15 +0000 Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.20 X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Feb 2016 13:01:20 -0000 While waiting for a bunch of build jails to be cleaned, I glanced at top -PjSuz -mio -o ivcsw and noticed that g_mirror, geom, cam and rm were experiencing involuntary context switches. I could understand rm, but not the system daemons. All of these had a priority of -8. I'm wondering if the system daemons should be set at a higher priority, or should I expect a speedier rm if I niced it? The system is almost idle, load averages being { 0.15, 0.48, 0.87 } PID JID UID VCSW IVCSW READ WRITE FAULT TOTAL PERCENT COMMAND 7 0 0 3594 15 0 0 0 0 0.00% g_mirror gm0 13 0 0 3512 12 0 0 0 0 0.00% geom 37223 0 0 1528 7 1523 3 0 1526 90.67% rm 4 0 0 1938 5 0 0 0 0 0.00% cam 12 0 0 2018 1 0 0 0 0 0.00% intr This is on an 8-core Xeon with SSDs whose sole purpose is to build FreeBSD stable and the packages that we require. The scheduler is 4BSD and kern.sched.slice=64. As such I'm stumped as to why "intr" would run out of its quantum (and it's running at priority -72)?
My apologies for any apparent ignorance, I'm a physicist by training and security manager via necessity. :) Kind regards, Dewayne From owner-freebsd-hackers@freebsd.org Tue Feb 2 14:28:26 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id E6F69A9967F for ; Tue, 2 Feb 2016 14:28:26 +0000 (UTC) (envelope-from ian@freebsd.org) Received: from pmta2.delivery6.ore.mailhop.org (pmta2.delivery6.ore.mailhop.org [54.200.129.228]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id B230BDED for ; Tue, 2 Feb 2016 14:28:26 +0000 (UTC) (envelope-from ian@freebsd.org) Received: from ilsoft.org (unknown [73.34.117.227]) by outbound2.ore.mailhop.org (Halon Mail Gateway) with ESMTPSA; Tue, 2 Feb 2016 14:29:26 +0000 (UTC) Received: from rev (rev [172.22.42.240]) by ilsoft.org (8.15.2/8.14.9) with ESMTP id u12ESJEN008802; Tue, 2 Feb 2016 07:28:19 -0700 (MST) (envelope-from ian@freebsd.org) Message-ID: <1454423299.11162.27.camel@freebsd.org> Subject: Re: [PATCH 1/2] fork: pass arguments to fork1 in a dedicated structure From: Ian Lepore To: Konstantin Belousov , Mateusz Guzik Cc: freebsd-hackers@freebsd.org, Mateusz Guzik Date: Tue, 02 Feb 2016 07:28:19 -0700 In-Reply-To: <20160202131145.GT91220@kib.kiev.ua> References: <20160201103632.GL91220@kib.kiev.ua> <1454386069-29657-1-git-send-email-mjguzik@gmail.com> <1454386069-29657-2-git-send-email-mjguzik@gmail.com> <20160202131145.GT91220@kib.kiev.ua> Content-Type: text/plain; charset="us-ascii" X-Mailer: Evolution 3.16.5 FreeBSD GNOME Team Port Mime-Version: 1.0 Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: 
List-Subscribe: , X-List-Received-Date: Tue, 02 Feb 2016 14:28:27 -0000 On Tue, 2016-02-02 at 15:11 +0200, Konstantin Belousov wrote: > On Tue, Feb 02, 2016 at 05:07:48AM +0100, Mateusz Guzik wrote: > > From: Mateusz Guzik > > > > --- > > sys/compat/cloudabi/cloudabi_proc.c | 7 +++++- > > sys/compat/linux/linux_fork.c | 17 ++++++++++---- > > sys/kern/init_main.c | 6 +++-- > > sys/kern/kern_fork.c | 46 +++++++++++++++++++++++++ > > ------------ > > sys/kern/kern_kthread.c | 7 ++++-- > > sys/sys/proc.h | 12 ++++++++-- > > 6 files changed, 68 insertions(+), 27 deletions(-) > > > > diff --git a/sys/compat/cloudabi/cloudabi_proc.c > > b/sys/compat/cloudabi/cloudabi_proc.c > > index d917337..010efca 100644 > > --- a/sys/compat/cloudabi/cloudabi_proc.c > > +++ b/sys/compat/cloudabi/cloudabi_proc.c > > @@ -75,12 +75,17 @@ int > > cloudabi_sys_proc_fork(struct thread *td, > > struct cloudabi_sys_proc_fork_args *uap) > > { > > + struct fork_req fr = {}; > > struct filecaps fcaps = {}; > > struct proc *p2; > > int error, fd; > > > > cap_rights_init(&fcaps.fc_rights, CAP_FSTAT, CAP_EVENT); > > - error = fork1(td, RFFDG | RFPROC | RFPROCDESC, 0, &p2, > > &fd, 0, &fcaps); > > + fr.fr_flags = RFFDG | RFPROC | RFPROCDESC; > > + fr.fr_procp = &p2; > > + fr.fr_pd_fd = &fd; > > + fr.fr_pd_fcaps = &fcaps; > > + error = fork1(td, &fr); > > if (error != 0) > > return (error); > The patch is fine. > > Would be great to not use initializer in declaration, i.e. use > bzero() > instead of c99 designated initializers. This would be a better suggestion if the compiler recognized and optimized bzero() with inline code like it does for those forbidden assignments. Or, to put it more basically: it's the 21st century, can't we please please please start using the tools we've got properly instead of slavishly following some silly rules from 1985? 
-- Ian From owner-freebsd-hackers@freebsd.org Tue Feb 2 14:31:40 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id D569DA99746; Tue, 2 Feb 2016 14:31:40 +0000 (UTC) (envelope-from nvass@gmx.com) Received: from mout.gmx.net (mout.gmx.net [212.227.17.20]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mout.gmx.net", Issuer "TeleSec ServerPass DE-1" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 3A831107A; Tue, 2 Feb 2016 14:31:40 +0000 (UTC) (envelope-from nvass@gmx.com) Received: from moby.local ([79.107.62.61]) by mail.gmx.com (mrgmx103) with ESMTPSA (Nemesis) id 0M4GND-1a9llH092Z-00rqm5; Tue, 02 Feb 2016 15:31:32 +0100 Subject: Re: VirtFS support in bhyve To: jceel@FreeBSD.org, freebsd-hackers@FreeBSD.org, freebsd-virtualization@FreeBSD.org References: <0E724C32-17FB-489A-B6E0-119CE17470E6@FreeBSD.org> From: Nikos Vassiliadis Message-ID: <56B0BD7E.8080909@gmx.com> Date: Tue, 2 Feb 2016 16:30:22 +0200 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:38.0) Gecko/20100101 Thunderbird/38.5.0 MIME-Version: 1.0 In-Reply-To: <0E724C32-17FB-489A-B6E0-119CE17470E6@FreeBSD.org> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-Provags-ID: V03:K0:N3x9Ogr++o6Nbt7wKKTVPGja9biOSvFG13L+Acraw6BdTJx0j6F xoGiPL/CDtI0FJiRmzvM6kA7ZhgdjGPTkpRFJDVxqd8BUgyqw4Lq2wlWBdo4X1QJ8/akrhI ZCeINQGeUqva1RmL2jOxXkvdizPf+HAijrd7doNSoFc5BnzJGmVmOgwiLiTl38ncO2Zf+r6 X9FV10GW5I/WkY6mQHorg== X-UI-Out-Filterresults: notjunk:1;V01:K0:uO2hPVc4ztc=:i4KXBhA7BO5ssUzAOmwHHJ IHt4r7Rx2l+wlLX1mPVf0q31VZQSO1O8KIAysQLZSFFORfojppme8uqDSlFWXFtljntCusVu/ FMAgOakADh9eF8j74eAz7f49omrnWC9YZ0avdpKesyRk8Tv3gaT902dXn1KY1hoYibqO3vd3a lbq6E/hXAx9nR+bo/ESpE4Tjbdb5JK5TeHIGK0imEfn9hrWFJ7BcUZ4lRXvhb/6BjR4/5Ly9B andNwPe5NqSuor64u42gk883/RJnDMXqBb7ooeG3b/ojuM77exZ1cwoPGwqzDfN+6ZIsBS594 
DrSD0UsTPA5LMhiwcY0HsXqWpUa7DaZsUqFFDpRlrqMXBetv8f868d2yag0S28yFEkz1/bR3k POb1RAtTDoLXMkGlB81/BU19dT7gpYXchpQmS7rc3LpMJqM8IagzHJRVTnG9ZJo+uqagYixUR qXocWR9d0MQB09YW8WHHnpxltFBfO/EMLcByz3S8KKzMj9Ffu9oolAQOV/b7tA/VQtsVbOwVW XhHxzN3oq8NCsX0UDlcnSyAfa2QQ215Sd0HxRHYaqTvJ6Kp6P8rMAB73RTNTSnxZugYPgge3D NnOSVlN3GYYPcvZeo4rleTswzkJNKH3qZQUTXm8n2L71Yz70tG18BDKTUec6jGxXyBzoCWzib HtAcU7hgIHsHGYLOAqZYmdqALqjICAB2ChsmSNBE+0GmBeqC2yi/VuV0zIjX4OjfYEQ7DuFYt PxaHdqEH3yeWShd5YvrlA6TSiLEB98s/k5lYEQRxHBq7aosaZ2hIXcq1PVD9yuADRGBesDVhP GpF4Eoh X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Feb 2016 14:31:40 -0000 On 02/01/16 04:22, jceel@FreeBSD.org wrote: > Hello, > > I'm working on virtio-9p (so-called VirtFS) support in bhyve. Project consists of two parts: BSD-licensed lib9p library and actual virtio-9p driver. Right now it's able to do filesystem passthrough using 9P2000.u protocol to Linux guests. > > You can check it out here: https://github.com/jceel/freebsd/tree/virtfs > > Syntax: > bhyve side: append `-s ,virtio-9p,sharename=/host/path` > linux side: `mount -t 9p -o trans=virtio -o version=9p2000.u sharename /mnt/guest/path` > > Using 9p as root filesystem for Linux guests should work too. > > Plans: > - Definitely in-kernel 9pfs filesystem support for FreeBSD guests using same lib9p library > - 9P2000.L support (adds ACLs, extattrs, file locks, atomic reads/writes and so on) > - Filesystem backend using AIO > - Ability to export multiple trees for different "aname" values using one virtio-9p device (that's actually a low-hanging fruit) > > I'm looking forward to your feedback - keep in mind that's totally experimental/incomplete/nonworking code. > > Jakub. Hi Jakub, This is a very cool project! Does this apply to 10-STABLE also? 
Could you provide a patch for the people who are not familiar with github? Thanks, Nikos From owner-freebsd-hackers@freebsd.org Tue Feb 2 14:39:39 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8182AA99A5F for ; Tue, 2 Feb 2016 14:39:39 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from zxy.spb.ru (zxy.spb.ru [195.70.199.98]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 4176D14B9; Tue, 2 Feb 2016 14:39:39 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from slw by zxy.spb.ru with local (Exim 4.86 (FreeBSD)) (envelope-from ) id 1aQc7D-000GUF-6S; Tue, 02 Feb 2016 17:39:35 +0300 Date: Tue, 2 Feb 2016 17:39:35 +0300 From: Slawa Olhovchenkov To: Ian Lepore Cc: Konstantin Belousov , Mateusz Guzik , freebsd-hackers@freebsd.org, Mateusz Guzik Subject: Re: [PATCH 1/2] fork: pass arguments to fork1 in a dedicated structure Message-ID: <20160202143935.GP88527@zxy.spb.ru> References: <20160201103632.GL91220@kib.kiev.ua> <1454386069-29657-1-git-send-email-mjguzik@gmail.com> <1454386069-29657-2-git-send-email-mjguzik@gmail.com> <20160202131145.GT91220@kib.kiev.ua> <1454423299.11162.27.camel@freebsd.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1454423299.11162.27.camel@freebsd.org> User-Agent: Mutt/1.5.24 (2015-08-30) X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: slw@zxy.spb.ru X-SA-Exim-Scanned: No (on zxy.spb.ru); SAEximRunCond expanded to false X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Feb 2016 14:39:39 -0000 On Tue, Feb 02, 2016 at 07:28:19AM -0700, 
Ian Lepore wrote: > On Tue, 2016-02-02 at 15:11 +0200, Konstantin Belousov wrote: > > On Tue, Feb 02, 2016 at 05:07:48AM +0100, Mateusz Guzik wrote: > > > From: Mateusz Guzik > > > > > > --- > > > sys/compat/cloudabi/cloudabi_proc.c | 7 +++++- > > > sys/compat/linux/linux_fork.c | 17 ++++++++++---- > > > sys/kern/init_main.c | 6 +++-- > > > sys/kern/kern_fork.c | 46 +++++++++++++++++++++++++ > > > ------------ > > > sys/kern/kern_kthread.c | 7 ++++-- > > > sys/sys/proc.h | 12 ++++++++-- > > > 6 files changed, 68 insertions(+), 27 deletions(-) > > > > > > diff --git a/sys/compat/cloudabi/cloudabi_proc.c > > > b/sys/compat/cloudabi/cloudabi_proc.c > > > index d917337..010efca 100644 > > > --- a/sys/compat/cloudabi/cloudabi_proc.c > > > +++ b/sys/compat/cloudabi/cloudabi_proc.c > > > @@ -75,12 +75,17 @@ int > > > cloudabi_sys_proc_fork(struct thread *td, > > > struct cloudabi_sys_proc_fork_args *uap) > > > { > > > + struct fork_req fr = {}; > > > struct filecaps fcaps = {}; > > > struct proc *p2; > > > int error, fd; > > > > > > cap_rights_init(&fcaps.fc_rights, CAP_FSTAT, CAP_EVENT); > > > - error = fork1(td, RFFDG | RFPROC | RFPROCDESC, 0, &p2, > > > &fd, 0, &fcaps); > > > + fr.fr_flags = RFFDG | RFPROC | RFPROCDESC; > > > + fr.fr_procp = &p2; > > > + fr.fr_pd_fd = &fd; > > > + fr.fr_pd_fcaps = &fcaps; > > > + error = fork1(td, &fr); > > > if (error != 0) > > > return (error); > > The patch is fine. > > > > Would be great to not use initializer in declaration, i.e. use > > bzero() > > instead of c99 designated initializers. > > This would be a better suggestion if the compiler recognized and > optimized bzero() with inline code like it does for those forbidden > assignments. > > Or, to put it more basically: it's the 21st century, can't we please > please please start using the tools we've got properly instead of > slavishly following some silly rules from 1985? 
struct fork_req fr = { .fr_flags = RFFDG | RFPROC | RFPROCDESC, .fr_procp = &p2, .fr_pd_fd = &fd, .fr_pd_fcaps = &fcaps }; Like this? From owner-freebsd-hackers@freebsd.org Tue Feb 2 14:53:28 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 11A54A9811C for ; Tue, 2 Feb 2016 14:53:28 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 9719181 for ; Tue, 2 Feb 2016 14:53:27 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id u12ErMFM062640 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Tue, 2 Feb 2016 16:53:22 +0200 (EET) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua u12ErMFM062640 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id u12ErMgi062639; Tue, 2 Feb 2016 16:53:22 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Tue, 2 Feb 2016 16:53:22 +0200 From: Konstantin Belousov To: Jov Cc: freebsd-hackers@freebsd.org Subject: Re: Fwd: [BUGS] BUG #13900: stop standby failed with writer process hang(happen 3 times in 2 days) Message-ID: <20160202145322.GA91220@kib.kiev.ua> References: <20160130071346.31022.37189@wrigleys.postgresql.org> <20160131162929.GI91220@kib.kiev.ua> <20160201190415.GO91220@kib.kiev.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.24 (2015-08-30) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, 
DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Feb 2016 14:53:28 -0000 On Tue, Feb 02, 2016 at 09:36:19AM +0800, Jov wrote: > ???Thanks for your info,Belousov. > Would you please point which PR or which commit I can exam???I can not move > to 10 stable now,maybe I can avoid trigger this bug or add some monitor???. stable/10 is not fixed as well. I do not have a good suggestion rather then to sync kern_timeout.c with head. After that, you need one more patch. diff --git a/sys/kern/kern_timeout.c b/sys/kern/kern_timeout.c index 9c9d25f..e827665 100644 --- a/sys/kern/kern_timeout.c +++ b/sys/kern/kern_timeout.c @@ -1162,7 +1162,7 @@ _callout_stop_safe(struct callout *c, int safe, void (*drain)(void *)) int direct, sq_locked, use_lock; int not_on_a_list; - if (safe) + if ((safe & CS_DRAIN) != 0) WITNESS_WARN(WARN_GIANTOK | WARN_SLEEPOK, c->c_lock, "calling %s", __func__); @@ -1170,7 +1170,7 @@ _callout_stop_safe(struct callout *c, int safe, void (*drain)(void *)) * Some old subsystems don't hold Giant while running a callout_stop(), * so just discard this check for the moment. 
 	 */
-	if (!safe && c->c_lock != NULL) {
+	if ((safe & CS_DRAIN) == 0 && c->c_lock != NULL) {
 		if (c->c_lock == &Giant.lock_object)
 			use_lock = mtx_owned(&Giant);
 		else {
@@ -1253,7 +1253,7 @@ again:
 			return (-1);
 		}
 
-		if (safe) {
+		if ((safe & CS_DRAIN) != 0) {
 			/*
 			 * The current callout is running (or just
 			 * about to run) and blocking is allowed, so
@@ -1370,7 +1370,7 @@ again:
 				cc_exec_drain(cc, direct) = drain;
 			}
 			CC_UNLOCK(cc);
-			return (0);
+			return ((safe & CS_MIGRBLOCK) != 0);
 		}
 		CTR3(KTR_CALLOUT, "failed to stop %p func %p arg %p",
 		    c, c->c_func, c->c_arg);
diff --git a/sys/kern/subr_sleepqueue.c b/sys/kern/subr_sleepqueue.c
index bbbec92..12908f6 100644
--- a/sys/kern/subr_sleepqueue.c
+++ b/sys/kern/subr_sleepqueue.c
@@ -586,7 +586,8 @@ sleepq_check_timeout(void)
 	 * another CPU, so synchronize with it to avoid having it
 	 * accidentally wake up a subsequent sleep.
 	 */
-	else if (callout_stop(&td->td_slpcallout) == 0) {
+	else if (_callout_stop_safe(&td->td_slpcallout, CS_MIGRBLOCK, NULL)
+	    == 0) {
 		td->td_flags |= TDF_TIMEOUT;
 		TD_SET_SLEEPING(td);
 		mi_switch(SW_INVOL | SWT_SLEEPQTIMO, NULL);
diff --git a/sys/sys/callout.h b/sys/sys/callout.h
index 3e71c87..51dc137 100644
--- a/sys/sys/callout.h
+++ b/sys/sys/callout.h
@@ -62,6 +62,9 @@ struct callout_handle {
 	struct callout *callout;
 };
 
+#define	CS_DRAIN		0x0001
+#define	CS_MIGRBLOCK		0x0002
+
 #ifdef _KERNEL
 /*
  * Note the flags field is actually *two* fields.
The c_flags @@ -81,7 +84,7 @@ struct callout_handle { */ #define callout_active(c) ((c)->c_flags & CALLOUT_ACTIVE) #define callout_deactivate(c) ((c)->c_flags &= ~CALLOUT_ACTIVE) -#define callout_drain(c) _callout_stop_safe(c, 1, NULL) +#define callout_drain(c) _callout_stop_safe(c, CS_DRAIN, NULL) void callout_init(struct callout *, int); void _callout_init_lock(struct callout *, struct lock_object *, int); #define callout_init_mtx(c, mtx, flags) \ From owner-freebsd-hackers@freebsd.org Tue Feb 2 17:56:58 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 66AAEA98BA2 for ; Tue, 2 Feb 2016 17:56:58 +0000 (UTC) (envelope-from mjguzik@gmail.com) Received: from mail-wm0-x244.google.com (mail-wm0-x244.google.com [IPv6:2a00:1450:400c:c09::244]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id C9A2217AB; Tue, 2 Feb 2016 17:56:57 +0000 (UTC) (envelope-from mjguzik@gmail.com) Received: by mail-wm0-x244.google.com with SMTP id p63so3516731wmp.1; Tue, 02 Feb 2016 09:56:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=date:from:to:cc:subject:message-id:mail-followup-to:references :mime-version:content-type:content-disposition:in-reply-to :user-agent; bh=vKglTS5hfhOeTYlUWPeMidOsEahTBAZReAAsYVh52p0=; b=Rsq0VF+xjE6K3qPI65U7zpUPrOvbdLRTWVZE7+5Yr0YjsDuNagPoqj50BDXTSK1uD0 O9T0nbmBsPOPb9PFeiyuX4tTEkhTY/X1WDT5cc+nsyZV4qe4/ZbpMrttnUIrpapcT8lu +jFcEUkWd57q9zo5ShDjMVzoEilNfJX8YmeDN5ta3K+3M0HhSeE1YnhxmTcOOWU7Mkbu yrkQ49zViWauZGBPRZ82U180gt2UZUKadaPKeBLfAGxGIQ4fpHov3wcWZNd8geo7Zcwf JM0peJ33XB7mwAM45jEK/GL+AS2Z189cKv7KbkiaZRTvUPCv4Hl78CJ8GY3Z/oiOi0oE iMag== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; 
h=x-gm-message-state:date:from:to:cc:subject:message-id :mail-followup-to:references:mime-version:content-type :content-disposition:in-reply-to:user-agent; bh=vKglTS5hfhOeTYlUWPeMidOsEahTBAZReAAsYVh52p0=; b=Re2rLFzyLwJjH/xO/iKySMYZIeioDydP8E4iYwWoZf7R5z7Vu+2xhBQ+jUbo6lfnRr GJgbj5Dj4DSzlLwBBnZvHxa2HcrAV2pIFWsiRrnXgdWNmXiGGMrnssTDHCCbUMlsSB4r jCKNDCBpoeUYubTnXn+d5O/Wqr7R9Gf0kZpUMWF4/x+oV+3iypOG8E+9IExFx1UuRqlh tYfiPhoUp++oyh2ep4ZVdIq2u8VYMenpPyQsAOeoQhYkyx1QgIvti6Wbi4Uu4OL2b2Go 2Ew1USMTe+d6DlFhlB0QcRDWU0ixoGx5SC+MA30MpZb05Ppf+7/1QU6w+UOj6p1/2YIL DejQ== X-Gm-Message-State: AG10YOSjkJzg00KuQ8C65PnKw6beGdjMRVgVVDePP6MBP3SsUGrpa/Fow940S4iPeRyLUg== X-Received: by 10.28.170.139 with SMTP id t133mr20388362wme.50.1454435815194; Tue, 02 Feb 2016 09:56:55 -0800 (PST) Received: from dft-labs.eu (n1x0n-1-pt.tunnel.tserv5.lon1.ipv6.he.net. [2001:470:1f08:1f7::2]) by smtp.gmail.com with ESMTPSA id yz5sm2570225wjc.36.2016.02.02.09.56.54 (version=TLS1_2 cipher=AES128-SHA bits=128/128); Tue, 02 Feb 2016 09:56:54 -0800 (PST) Date: Tue, 2 Feb 2016 18:56:52 +0100 From: Mateusz Guzik To: Konstantin Belousov Cc: freebsd-hackers@freebsd.org Subject: Re: [PATCH 2/2] fork: plug a use after free of the returned process pointer Message-ID: <20160202175652.GA9812@dft-labs.eu> Mail-Followup-To: Mateusz Guzik , Konstantin Belousov , freebsd-hackers@freebsd.org References: <20160201103632.GL91220@kib.kiev.ua> <1454386069-29657-1-git-send-email-mjguzik@gmail.com> <1454386069-29657-3-git-send-email-mjguzik@gmail.com> <20160202132322.GU91220@kib.kiev.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20160202132322.GU91220@kib.kiev.ua> User-Agent: Mutt/1.5.21 (2010-09-15) X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Feb 2016 17:56:58 -0000 On Tue, Feb 02, 
2016 at 03:23:22PM +0200, Konstantin Belousov wrote:
> On Tue, Feb 02, 2016 at 05:07:49AM +0100, Mateusz Guzik wrote:
> > +	flags = fr->fr_flags;
> Why not use fr->fr_flags directly ? It is slightly more churn, but IMO
> it is worth it.
>

I'm indifferent on this one; I can change it, no problem.

> > +	/*
> > +	 * Hold the process so that it cannot exit after we make it runnable,
> > +	 * but before we wait for the debugger.
> Is this possible ? The forked child must execute through fork_return(),
> and there we do ptracestop() before the child has a chance to ever return
> to usermode.
>
> Do you mean a scenario where the debugger detaches before child executes
> fork_return() and TDP_STOPATFORK is cleared in advance ?
>

The comment is somewhat misworded and I forgot to update it. How about
just stating that we hold the process so that we can mark the thread
runnable and not have it disappear until we are done?

While I have not tested this particular bug, prior to the patch the
following should be possible: p2 is untraced and td2 is marked as
runnable, after which it exits and p2 is automatically reaped. If the
code reaches the TDB_STOPATFORK check after that, PROC_LOCK(p2) succeeds
because processes are type stable. Dereferencing td2 causes no faults
because threads are type stable as well. But the thread could have been
reused in a traced process, thus inducing cv_wait(&p2->p_dbgwait, ..)
even though td2 is not linked in p2 anymore and p2 is not even a valid
process, making curthread wait indefinitely since there is nobody to
wake it up.
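The lifetime problem described above can be illustrated with a toy,
single-threaded model. This is only a sketch under loud assumptions: the
toy_proc structure and the toy_* functions are invented for illustration
and are not the kernel's _PHOLD/_PRELE or exit path. In particular, the
toy defers the reap until the hold is dropped, whereas a real exiting
process would block instead. It shows only the ordering argument: without
a hold, the parent's p2 pointer refers to a reaped slot by the time it
performs the check.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; field names are hypothetical, not struct proc. */
struct toy_proc {
	int	holds;		/* stands in for the process hold count */
	bool	exit_pending;	/* child tried to exit while held */
	bool	reaped;		/* slot is free and may be reused */
};

void
toy_hold(struct toy_proc *p)
{
	p->holds++;
}

void
toy_rele(struct toy_proc *p)
{
	assert(p->holds > 0);
	/* Dropping the last hold lets a postponed exit complete. */
	if (--p->holds == 0 && p->exit_pending) {
		p->exit_pending = false;
		p->reaped = true;
	}
}

void
toy_exit(struct toy_proc *p)
{
	if (p->holds > 0)
		p->exit_pending = true;	/* exit is postponed */
	else
		p->reaped = true;	/* exits and is auto-reaped at once */
}

/*
 * Replays the parent's steps around making the child runnable.
 * Returns true if p2 is still a live process when the parent gets to
 * the point where it would test TDB_STOPATFORK.
 */
bool
toy_fork_check(bool use_hold)
{
	struct toy_proc p2 = { 0, false, false };
	bool valid;

	if (use_hold)
		toy_hold(&p2);
	/* "td2 is marked runnable": the child runs and exits right away. */
	toy_exit(&p2);
	/* The parent now reaches its check. */
	valid = !p2.reaped;
	if (use_hold)
		toy_rele(&p2);
	return (valid);
}
```

In this model toy_fork_check(false) reports a stale pointer and
toy_fork_check(true) reports a live one, which is the property the hold in
the patch is meant to provide.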
-- Mateusz Guzik From owner-freebsd-hackers@freebsd.org Tue Feb 2 18:16:44 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 437DBA994E4 for ; Tue, 2 Feb 2016 18:16:44 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id DFD93800 for ; Tue, 2 Feb 2016 18:16:43 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id u12IGZZN010619 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Tue, 2 Feb 2016 20:16:35 +0200 (EET) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua u12IGZZN010619 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id u12IGZt9010618; Tue, 2 Feb 2016 20:16:35 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Tue, 2 Feb 2016 20:16:35 +0200 From: Konstantin Belousov To: Mateusz Guzik Cc: freebsd-hackers@freebsd.org Subject: Re: [PATCH 2/2] fork: plug a use after free of the returned process pointer Message-ID: <20160202181635.GC91220@kib.kiev.ua> References: <20160201103632.GL91220@kib.kiev.ua> <1454386069-29657-1-git-send-email-mjguzik@gmail.com> <1454386069-29657-3-git-send-email-mjguzik@gmail.com> <20160202132322.GU91220@kib.kiev.ua> <20160202175652.GA9812@dft-labs.eu> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160202175652.GA9812@dft-labs.eu> User-Agent: Mutt/1.5.24 (2015-08-30) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, 
DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Feb 2016 18:16:44 -0000 On Tue, Feb 02, 2016 at 06:56:52PM +0100, Mateusz Guzik wrote: > On Tue, Feb 02, 2016 at 03:23:22PM +0200, Konstantin Belousov wrote: > > On Tue, Feb 02, 2016 at 05:07:49AM +0100, Mateusz Guzik wrote: > > > + flags = fr->fr_flags; > > Why not use fr->fr_flags directly ? It is slightly more churn, but IMO > > it is worth it. > > > > I'm indiffernet on this one, can change it no problem. > > > > + /* > > > + * Hold the process so that it cannot exit after we make it runnable, > > > + * but before we wait for the debugger. > > Is this possible ? The forked child must execute through fork_return(), > > and there we do ptracestop() before the child has a chance to ever return > > to usermode. > > > > Do you mean a scenario where the debugger detaches before child executes > > fork_return() and TDP_STOPATFORK is cleared in advance ? > > > > The comment is somewhat misworded and I forgot to update it, how about > just stating we hold the process so that we can mark the thread runnable > and not have it disappear under we are done. This means that the reader has to guess too much, IMHO. At least, add a note that despite fork_return() stops when the child is traced, it is not enough because ... . > > While I have not tested this particular bug, prior to the patch the > following should possible: p2 is untraced and td2 is marked as runnable, > after which it exits and p2 is automatically reaped. If the code reaches > the TDB_STOPATFORK check after that, PROC_LOCK(p2) succeeds due to I.e. 
td2 is reused and the TDB_STOPATFORK is set by unrelated activity ? You reference the do_fork() code checking TDB_STOPATFORK, and not fork_return(), I guess. > processes being type stable. td2 dereference can cause no issues due to > threads being type stable as well. But the thread could have been resued > in a traced process, thus inducing cv_wait(&p2->p_dbgwait, ..) even > though td2 is not linked in p2 anymore and p2 is not even a valid > process, making curthread wait indefinitely since there is nobody to > wake it up. > Well, if TDP_STOPATFORK bit is set, it has the same meaning due to type stability, and eventually the wake up would be performed. It just the unintended sleep waiting for condvar which is problematic and which I agree with. From owner-freebsd-hackers@freebsd.org Tue Feb 2 21:44:33 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 6E03DA9941D for ; Tue, 2 Feb 2016 21:44:33 +0000 (UTC) (envelope-from mjguzik@gmail.com) Received: from mail-wm0-x242.google.com (mail-wm0-x242.google.com [IPv6:2a00:1450:400c:c09::242]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id CF0FD19EE for ; Tue, 2 Feb 2016 21:44:32 +0000 (UTC) (envelope-from mjguzik@gmail.com) Received: by mail-wm0-x242.google.com with SMTP id 128so4343017wmz.3 for ; Tue, 02 Feb 2016 13:44:32 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=date:from:to:cc:subject:message-id:mail-followup-to:references :mime-version:content-type:content-disposition:in-reply-to :user-agent; bh=fKmfN90WeGA7TXj1YEht/Tle2dKxTD8B6oFJ+82aLr4=; b=L1xdZ7YfEH1dhdhegwcNqjfvknl3f3frEzLK1FMcC9CmjdcHpF89u0HnZLwIVZRrEN 
b4fp2EO/UOAvqkt1/bALLnGiRdCZ+3m3sdsSlx0xGsWrn44qyg7tFxLOKcmlZVJO69Zo 2MBptJONIu7M0LpJWoWsEW7xpYxrC1aJcZ6sDghLyq8HUPo1/jcyKf/YjwO5rui2ohpo QJN42SattsTkGNBSY7rkFZK5rUK8jpvAXf6KHJGFH90lxDfNoHLlFM2xxmwUrNAmoOe7 jGT8cFeylfLiecBHsMFOkiVnxuXhuKsN3co1Wzw1qpfIyZpn6yNz+VmImRdz6BKEI0yr glUQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:date:from:to:cc:subject:message-id :mail-followup-to:references:mime-version:content-type :content-disposition:in-reply-to:user-agent; bh=fKmfN90WeGA7TXj1YEht/Tle2dKxTD8B6oFJ+82aLr4=; b=mHFc089jRGbzWTqkolPg5hrgxLYMFX1AWSAel/r8hx/Xe9KKF0D42CcR/sFB99NVOn jbAv8FurGfWcWPjQHevTYsTVTEO0L28ZcCEMopMH9dJYgx9e6pgEzKpfFdK6SsPwB3F6 4SdbsC3Zz0Vr/5qtj2Z05hHXsWkVt+LXG+5exOY2Q9Tet63fzvHQ8JWlJFgWsMLylsxk HSG/eZaRhsp1szRapPwu5z+Q2/JQWLZ7CEfaEWV7j3C9Hdgy6HfQBqZuQ7kOIVi2ifKs ZfloFDoi0c8LB1ZLqr+ccBs8JYzVPwYPUVmNnxojzu69yr83k/c5AqXWCHBz9pG1ILqW fsDg== X-Gm-Message-State: AG10YOQAYIUvR+XLOKrH896DhxclWRu5Kuy3t7KJOED9KawLQZDYF64Qc3UHsnwr0O4akQ== X-Received: by 10.28.228.87 with SMTP id b84mr21545910wmh.36.1454449470396; Tue, 02 Feb 2016 13:44:30 -0800 (PST) Received: from dft-labs.eu (n1x0n-1-pt.tunnel.tserv5.lon1.ipv6.he.net. 
[2001:470:1f08:1f7::2]) by smtp.gmail.com with ESMTPSA id kb5sm3358927wjc.22.2016.02.02.13.44.29 (version=TLS1_2 cipher=AES128-SHA bits=128/128); Tue, 02 Feb 2016 13:44:29 -0800 (PST) Date: Tue, 2 Feb 2016 22:44:27 +0100 From: Mateusz Guzik To: Konstantin Belousov Cc: freebsd-hackers@freebsd.org Subject: Re: [PATCH 2/2] fork: plug a use after free of the returned process pointer Message-ID: <20160202214427.GB9812@dft-labs.eu> Mail-Followup-To: Mateusz Guzik , Konstantin Belousov , freebsd-hackers@freebsd.org References: <20160201103632.GL91220@kib.kiev.ua> <1454386069-29657-1-git-send-email-mjguzik@gmail.com> <1454386069-29657-3-git-send-email-mjguzik@gmail.com> <20160202132322.GU91220@kib.kiev.ua> <20160202175652.GA9812@dft-labs.eu> <20160202181635.GC91220@kib.kiev.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20160202181635.GC91220@kib.kiev.ua> User-Agent: Mutt/1.5.21 (2010-09-15) X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 02 Feb 2016 21:44:33 -0000 On Tue, Feb 02, 2016 at 08:16:35PM +0200, Konstantin Belousov wrote: > On Tue, Feb 02, 2016 at 06:56:52PM +0100, Mateusz Guzik wrote: > > On Tue, Feb 02, 2016 at 03:23:22PM +0200, Konstantin Belousov wrote: > > > On Tue, Feb 02, 2016 at 05:07:49AM +0100, Mateusz Guzik wrote: > > > > + flags = fr->fr_flags; > > > Why not use fr->fr_flags directly ? It is slightly more churn, but IMO > > > it is worth it. > > > > > > > I'm indiffernet on this one, can change it no problem. > > > > > > + /* > > > > + * Hold the process so that it cannot exit after we make it runnable, > > > > + * but before we wait for the debugger. > > > Is this possible ? 
The forked child must execute through fork_return(),
> > > and there we do ptracestop() before the child has a chance to ever return
> > > to usermode.
> > >
> > > Do you mean a scenario where the debugger detaches before child executes
> > > fork_return() and TDP_STOPATFORK is cleared in advance ?
> > >
> >
> > The comment is somewhat misworded and I forgot to update it, how about
> > just stating we hold the process so that we can mark the thread runnable
> > and not have it disappear under we are done.
> This means that the reader has to guess too much, IMHO.
>
> At least, add a note that despite fork_return() stops when the child
> is traced, it is not enough because ... .
> >
> > While I have not tested this particular bug, prior to the patch the
> > following should possible: p2 is untraced and td2 is marked as runnable,
> > after which it exits and p2 is automatically reaped. If the code reaches
> > the TDB_STOPATFORK check after that, PROC_LOCK(p2) succeeds due to
> I.e. td2 is reused and the TDB_STOPATFORK is set by unrelated activity ?
> You reference the do_fork() code checking TDB_STOPATFORK, and not
> fork_return(), I guess.
> >
> > processes being type stable. td2 dereference can cause no issues due to
> > threads being type stable as well. But the thread could have been resued
> > in a traced process, thus inducing cv_wait(&p2->p_dbgwait, ..) even
> > though td2 is not linked in p2 anymore and p2 is not even a valid
> > process, making curthread wait indefinitely since there is nobody to
> > wake it up.
> >
> Well, if TDP_STOPATFORK bit is set, it has the same meaning due to type
> stability, and eventually the wake up would be performed. It just the
> unintended sleep waiting for condvar which is problematic and which I
> agree with.

CPU0 is executing fork1. p2 is not traced.

CPU0                            CPU1
p2 and td2 created
td2 is marked runnable
                                td2 is scheduled here
                                td2 does not have TDB_STOPATFORK set
                                td2 exits
                                p2 is autoreaped
                                td2's space is reused
                                td2 gets linked into p3
                                td2 gets TDB_STOPATFORK
PROC_LOCK(p2);
TDB_STOPATFORK test on td2
cv_wait(&p2->p_dbgwait, ..);

So at this point p2 has no linked threads and is free. td2 belongs to
p3 and p2 is waiting for a wakeup which can't happen.

Now that I look at it, this may be broken in an additional way, which is
not fixed by the patch: what if td2 spawns a new thread and thr_exits?
In this case testing td2 is still invalid. Maybe I'm just getting
paranoid here; I don't have time to properly test this right now. In
the worst case it should be fixable well enough with
FIRST_THREAD_IN_PROC.

How about the following comment around _PHOLD:

	We are going to make the main thread runnable. It can quickly exit,
	causing the process to be reaped and possibly reused, thus
	invalidating our p2 pointer. Protect against this by holding the
	process, which postpones the exit.

And if the suspicion gets confirmed, the following would be added:

diff --git a/sys/kern/kern_fork.c b/sys/kern/kern_fork.c
index d0c3837..2a076ed 100644
--- a/sys/kern/kern_fork.c
+++ b/sys/kern/kern_fork.c
@@ -773,6 +773,12 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread *
 
 	PROC_LOCK(p2);
 	/*
+	 * By the time we got here the thread could have created a new thread
+	 * and exited. Reload the main thread to ensure we got the right
+	 * pointer.
+	 */
+	td2 = FIRST_THREAD_IN_PROC(p2);
+	/*
 	 * Wait until debugger is attached to child.
*/ while ((td2->td_dbgflags & TDB_STOPATFORK) != 0) -- Mateusz Guzik From owner-freebsd-hackers@freebsd.org Wed Feb 3 01:04:17 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 9E8CAA9AB22 for ; Wed, 3 Feb 2016 01:04:17 +0000 (UTC) (envelope-from mjguzik@gmail.com) Received: from mail-wm0-x242.google.com (mail-wm0-x242.google.com [IPv6:2a00:1450:400c:c09::242]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 42B931DCE; Wed, 3 Feb 2016 01:04:17 +0000 (UTC) (envelope-from mjguzik@gmail.com) Received: by mail-wm0-x242.google.com with SMTP id r129so4952616wmr.0; Tue, 02 Feb 2016 17:04:17 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=date:from:to:cc:subject:message-id:mail-followup-to:references :mime-version:content-type:content-disposition:in-reply-to :user-agent; bh=ZsDFtcVKAzFzibtwRScJ7D+u8pnTjOJjbIxhcwRx+jw=; b=v9f+wU6kP4MIGF6P+gbIeKvv3SNH0Dg482v/aOnzohvBi3MX+J5upDHZoWYn45Ekwa 5TS11oJ2O3bPugjAJjiJW2dWO1pnfrWJS6Zrv/+RGxnwW5z3ZggfGDPDC9YZ6BOVedhe BkfC7jmooXZHEMhPQQwrIVm6W/Mh/5eDLdhNHUrXpx45eT/4TV12vipHmArv2BvT2yyW qE24kvFE0gwy3VxauxEOF3U3rFW1Uc4xmYZeHeR5xybR7hr/RZG3tTMineDIEOEPyGUk rtj92FuKcPMHaOzNpAtWkYvB6i00Fjd+VvnWHmzoB0EQNPyP3xXKQ4xLNfQtB/7MeXMT 7oFg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:date:from:to:cc:subject:message-id :mail-followup-to:references:mime-version:content-type :content-disposition:in-reply-to:user-agent; bh=ZsDFtcVKAzFzibtwRScJ7D+u8pnTjOJjbIxhcwRx+jw=; b=FeVfB1O5VEhxa8ayQK+1V/Sxqrc3FG2xrNZ0UA4wb5hZmeOz8VdOYF+Lru/sb+0Dx1 yVcX9iqxK9JU6PJK/ls+ik+xapNRbESygJBBp3d1K8YfRXS2qyPSa0v2xFv6cgq02MeS 
FV0lV3xWgLVR29g7uoy5rRsMg/XXyIqd+AnizXX0dtThJ+85JqWuwmgFWBj+pMBSoUt6 tbq+EWsSuTTbqtS/tZk5TXcaFgSbeg/mYodLhLOMvwgFB7+Hn5VeeTq+/da7WsSSJfBj sl2moUpmVMcrsdpQQQONG33GoGos6u8ukXM8MDNXPTjPEWlTonp6hu1Suya8hw8RSad3 1Gtg== X-Gm-Message-State: AG10YOQbWGCvHpzW51tUbcJPFx7NXmrmfPqhUjIZL2GM3vGOb/PuHCo64nWJLkWnfS6+qg== X-Received: by 10.194.246.37 with SMTP id xt5mr31542101wjc.7.1454461455686; Tue, 02 Feb 2016 17:04:15 -0800 (PST) Received: from dft-labs.eu (n1x0n-1-pt.tunnel.tserv5.lon1.ipv6.he.net. [2001:470:1f08:1f7::2]) by smtp.gmail.com with ESMTPSA id e77sm18851736wma.18.2016.02.02.17.04.14 (version=TLS1_2 cipher=AES128-SHA bits=128/128); Tue, 02 Feb 2016 17:04:14 -0800 (PST) Date: Wed, 3 Feb 2016 02:04:13 +0100 From: Mateusz Guzik To: Konstantin Belousov , freebsd-hackers@freebsd.org Cc: jmg@freebsd.org Subject: Re: [PATCH 2/2] fork: plug a use after free of the returned process pointer Message-ID: <20160203010412.GC9812@dft-labs.eu> Mail-Followup-To: Mateusz Guzik , Konstantin Belousov , freebsd-hackers@freebsd.org, jmg@freebsd.org References: <20160201103632.GL91220@kib.kiev.ua> <1454386069-29657-1-git-send-email-mjguzik@gmail.com> <1454386069-29657-3-git-send-email-mjguzik@gmail.com> <20160202132322.GU91220@kib.kiev.ua> <20160202175652.GA9812@dft-labs.eu> <20160202181635.GC91220@kib.kiev.ua> <20160202214427.GB9812@dft-labs.eu> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20160202214427.GB9812@dft-labs.eu> User-Agent: Mutt/1.5.21 (2010-09-15) X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Feb 2016 01:04:17 -0000 On Tue, Feb 02, 2016 at 10:44:27PM +0100, Mateusz Guzik wrote: > On Tue, Feb 02, 2016 at 08:16:35PM +0200, Konstantin Belousov wrote: > > On Tue, Feb 02, 2016 at 06:56:52PM +0100, Mateusz Guzik wrote: > > > On Tue, Feb 
02, 2016 at 03:23:22PM +0200, Konstantin Belousov wrote: > > > > On Tue, Feb 02, 2016 at 05:07:49AM +0100, Mateusz Guzik wrote: > > > > > + flags = fr->fr_flags; > > > > Why not use fr->fr_flags directly ? It is slightly more churn, but IMO > > > > it is worth it. > > > > > > > > > > I'm indiffernet on this one, can change it no problem. > > > > > > > > + /* > > > > > + * Hold the process so that it cannot exit after we make it runnable, > > > > > + * but before we wait for the debugger. > > > > Is this possible ? The forked child must execute through fork_return(), > > > > and there we do ptracestop() before the child has a chance to ever return > > > > to usermode. > > > > > > > > Do you mean a scenario where the debugger detaches before child executes > > > > fork_return() and TDP_STOPATFORK is cleared in advance ? > > > > > > > > > > The comment is somewhat misworded and I forgot to update it, how about > > > just stating we hold the process so that we can mark the thread runnable > > > and not have it disappear under we are done. > > This means that the reader has to guess too much, IMHO. > > > > At least, add a note that despite fork_return() stops when the child > > is traced, it is not enough because ... . > > > > > > While I have not tested this particular bug, prior to the patch the > > > following should possible: p2 is untraced and td2 is marked as runnable, > > > after which it exits and p2 is automatically reaped. If the code reaches > > > the TDB_STOPATFORK check after that, PROC_LOCK(p2) succeeds due to > > I.e. td2 is reused and the TDB_STOPATFORK is set by unrelated activity ? > > You reference the do_fork() code checking TDB_STOPATFORK, and not > > fork_return(), I guess. > > > > > > > processes being type stable. td2 dereference can cause no issues due to > > > threads being type stable as well. But the thread could have been resued > > > in a traced process, thus inducing cv_wait(&p2->p_dbgwait, ..) 
even > > > though td2 is not linked in p2 anymore and p2 is not even a valid > > > process, making curthread wait indefinitely since there is nobody to > > > wake it up. > > > > > Well, if TDP_STOPATFORK bit is set, it has the same meaning due to type > > stability, and eventually the wake up would be performed. It just the > > unintended sleep waiting for condvar which is problematic and which I > > agree with. > > > CPU0 is executing fork1. p2 is not traced. > > CPU0 CPU1 > p2 and td2 created > td2 is marked runnable > td2 is scheduled here > td2 does not have TDB_STOPATFORK set > td2 exits > p2 is autoreaped > td2's space is reused > td2 gets linked into p3 > td2 gets TDB_STOPATFORK > PROC_LOCK(p2); > TDB_STOPATFORK test on td2 > cv_wait(&p2->p_dbgwait, ..); > > So at this point p2 has no linked threads and is free. td2 belongs to > p3 and p2 is waiting for a wakeup which can't happen. > > Now that I look at it this may be broken in an additonal way, which is > not fixed by the patch: what if td2 spawns a new thread and thr_exits? > In this case testing td2 is still invalid. Maybe I'm just getting > paranoid here, I don't have time to properly test this right now. In > worst case should be fixable well enough with FIRST_THREAD_IN_PROC. > > How about the following comment around _PHOLD: > We are going to make the main thread runnable. It can quickly exit, > causing the process to be reaped and possibly reused, thus invalidating > our p2 pointer. Protect against this by holding the process, which > postpones the exit. > > And if the suspicion gets confimed the following would be added: > diff --git a/sys/kern/kern_fork.c b/sys/kern/kern_fork.c > index d0c3837..2a076ed 100644 > --- a/sys/kern/kern_fork.c > +++ b/sys/kern/kern_fork.c > @@ -773,6 +773,12 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > > PROC_LOCK(p2); > /* > + * By the time we got here the thread could have created a new thread > + * and exited. 
Reload the main thread to ensure we got the right
> + * pointer.
> + */
> + td2 = FIRST_THREAD_IN_PROC(p2);
> + /*
> * Wait until debugger is attached to child.
> */
> while ((td2->td_dbgflags & TDB_STOPATFORK) != 0)
>

To end the speculation and hackery I decided to reorganize the function a
little bit instead. Namely, it can be trivially modified to not drop the
proc lock, which gets rid of all the aforementioned races.

I got 2 variants. For brevity both patches are applied on top of the
current patchset. The chosen variant would be folded into the second
patch of the patchset.

Both currently and with the first patch below, knote_fork can get a
now-freed or even recycled pid. I find that odd. This variant only saves
the pid and calls knote_fork. This is likely fine enough.

Just in case, the second variant adds a primitive,
knlist_empty_lockless, to perform a racy check to see if there are any
knotes. If so, the process is held and released after knote_fork. Note
that this is no more racy than the current code with respect to spotting
knotes. What does improve is that the process is guaranteed to be around
during knote_fork, although I don't know how important this is.

With both variants we also save one lock/unlock round, and the second
variant also saves a second lock/unlock, that of the knote list, in the
common case.
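The two ideas can be illustrated with a small userland sketch. This is not kernel code: struct fake_proc and every fake_proc_* helper are made up for the illustration, a pthread mutex stands in for the proc lock, and a plain counter stands in for the hold count. Recycling is simply refused while a hold is active, which is a simplification; in the kernel a hold postpones the exit instead.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct proc; not the kernel structure. */
struct fake_proc {
	pthread_mutex_t	mtx;	/* plays the role of the proc lock */
	int		pid;	/* changes when the slot is recycled */
	int		holds;	/* plays the role of the hold count */
};

static void
fake_proc_init(struct fake_proc *p, int pid)
{
	pthread_mutex_init(&p->mtx, NULL);
	p->pid = pid;
	p->holds = 0;
}

/*
 * Variant 1 idea: copy the pid while the lock is held, so the caller can
 * keep using the copy after the lock is dropped, whatever happens to p.
 */
static int
fake_proc_saved_pid(struct fake_proc *p)
{
	int pid;

	pthread_mutex_lock(&p->mtx);
	pid = p->pid;
	pthread_mutex_unlock(&p->mtx);
	return (pid);
}

/* Variant 2 idea: take a hold so that p stays valid until released. */
static void
fake_proc_hold(struct fake_proc *p)
{
	pthread_mutex_lock(&p->mtx);
	p->holds++;
	pthread_mutex_unlock(&p->mtx);
}

static void
fake_proc_rele(struct fake_proc *p)
{
	pthread_mutex_lock(&p->mtx);
	assert(p->holds > 0);
	p->holds--;
	pthread_mutex_unlock(&p->mtx);
}

/*
 * Model of exit plus reuse: the slot gets a new pid unless somebody holds
 * it. Returns true if the slot was recycled.
 */
static bool
fake_proc_try_recycle(struct fake_proc *p, int newpid)
{
	bool recycled;

	pthread_mutex_lock(&p->mtx);
	recycled = (p->holds == 0);
	if (recycled)
		p->pid = newpid;
	pthread_mutex_unlock(&p->mtx);
	return (recycled);
}
```

With the saved copy, the caller never reads p->pid after unlocking, so a later recycle cannot change the value it reports. With the hold, the caller may keep dereferencing p, at the cost of an extra hold and release.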
Variant 1: diff --git a/sys/kern/kern_fork.c b/sys/kern/kern_fork.c index d0c3837..fae4eaf 100644 --- a/sys/kern/kern_fork.c +++ b/sys/kern/kern_fork.c @@ -378,7 +378,7 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * struct vmspace *vm2, struct file *fp_procdesc) { struct proc *p1, *pptr; - int trypid; + int p2_pid, trypid; struct filedesc *fd; struct filedesc_to_leader *fdtol; struct sigacts *newsigacts; @@ -715,11 +715,6 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * if ((flags & RFMEM) == 0 && dtrace_fasttrap_fork) dtrace_fasttrap_fork(p1, p2); #endif - /* - * Hold the process so that it cannot exit after we make it runnable, - * but before we wait for the debugger. - */ - _PHOLD(p2); if ((p1->p_flag & (P_TRACED | P_FOLLOWFORK)) == (P_TRACED | P_FOLLOWFORK)) { /* @@ -737,7 +732,6 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * td->td_pflags |= TDP_RFPPWAIT; td->td_rfppwait_p = p2; } - PROC_UNLOCK(p2); /* * Now can be swapped. @@ -745,16 +739,10 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * _PRELE(p1); PROC_UNLOCK(p1); - /* - * Tell any interested parties about the new process. - */ - knote_fork(&p1->p_klist, p2->p_pid); SDT_PROBE3(proc, , , create, p2, p1, flags); - if (flags & RFPROCDESC) { + if (flags & RFPROCDESC) procdesc_finit(p2->p_procdesc, fp_procdesc); - fdrop(fp_procdesc, td); - } if ((flags & RFSTOPPED) == 0) { /* @@ -771,15 +759,26 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * *fr->fr_procp = p2; } - PROC_LOCK(p2); /* * Wait until debugger is attached to child. */ while ((td2->td_dbgflags & TDB_STOPATFORK) != 0) cv_wait(&p2->p_dbgwait, &p2->p_mtx); - _PRELE(p2); racct_proc_fork_done(p2); + /* + * The process can exit and be waited on after we drop the lock. Save + * the pid so that it can be used for knotes. 
+ */ + p2_pid = p2->p_pid; PROC_UNLOCK(p2); + + /* + * Tell any interested parties about the new process. + */ + knote_fork(&p1->p_klist, p2_pid); + + if (flags & RFPROCDESC) + fdrop(fp_procdesc, td); } int ======================== Variant 2: diff --git a/sys/kern/kern_event.c b/sys/kern/kern_event.c index d41ac96..3610d8a 100644 --- a/sys/kern/kern_event.c +++ b/sys/kern/kern_event.c @@ -2038,12 +2038,22 @@ knlist_remove_inevent(struct knlist *knl, struct knote *kn) (kn->kn_status & KN_HASKQLOCK) == KN_HASKQLOCK); } +/* + * For when the caller accepts that the check is inherently racy. + */ +int +knlist_empty_lockless(struct knlist *knl) +{ + + return SLIST_EMPTY(&knl->kl_list); +} + int knlist_empty(struct knlist *knl) { KNL_ASSERT_LOCKED(knl); - return SLIST_EMPTY(&knl->kl_list); + return knlist_empty_lockless(knl); } static struct mtx knlist_lock; diff --git a/sys/kern/kern_fork.c b/sys/kern/kern_fork.c index d0c3837..70490ef 100644 --- a/sys/kern/kern_fork.c +++ b/sys/kern/kern_fork.c @@ -378,7 +378,7 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * struct vmspace *vm2, struct file *fp_procdesc) { struct proc *p1, *pptr; - int trypid; + int p2_held, trypid; struct filedesc *fd; struct filedesc_to_leader *fdtol; struct sigacts *newsigacts; @@ -387,6 +387,7 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * sx_assert(&proctree_lock, SX_SLOCKED); sx_assert(&allproc_lock, SX_XLOCKED); + p2_held = 0; p1 = td->td_proc; flags = fr->fr_flags; @@ -715,11 +716,6 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * if ((flags & RFMEM) == 0 && dtrace_fasttrap_fork) dtrace_fasttrap_fork(p1, p2); #endif - /* - * Hold the process so that it cannot exit after we make it runnable, - * but before we wait for the debugger. 
- */ - _PHOLD(p2); if ((p1->p_flag & (P_TRACED | P_FOLLOWFORK)) == (P_TRACED | P_FOLLOWFORK)) { /* @@ -737,7 +733,6 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * td->td_pflags |= TDP_RFPPWAIT; td->td_rfppwait_p = p2; } - PROC_UNLOCK(p2); /* * Now can be swapped. @@ -745,16 +740,10 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * _PRELE(p1); PROC_UNLOCK(p1); - /* - * Tell any interested parties about the new process. - */ - knote_fork(&p1->p_klist, p2->p_pid); SDT_PROBE3(proc, , , create, p2, p1, flags); - if (flags & RFPROCDESC) { + if (flags & RFPROCDESC) procdesc_finit(p2->p_procdesc, fp_procdesc); - fdrop(fp_procdesc, td); - } if ((flags & RFSTOPPED) == 0) { /* @@ -771,15 +760,32 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * *fr->fr_procp = p2; } - PROC_LOCK(p2); /* * Wait until debugger is attached to child. */ while ((td2->td_dbgflags & TDB_STOPATFORK) != 0) cv_wait(&p2->p_dbgwait, &p2->p_mtx); - _PRELE(p2); racct_proc_fork_done(p2); + if (!knlist_empty_lockless(&p1->p_klist)) { + /* + * Hold the process so that it does not exit until we call + * knote_fork. + */ + _PHOLD(p2); + p2_held = 1; + } PROC_UNLOCK(p2); + + if (p2_held) { + /* + * Tell any interested parties about the new process. 
+ */ + knote_fork(&p1->p_klist, p2->p_pid); + PRELE(p2); + } + + if (flags & RFPROCDESC) + fdrop(fp_procdesc, td); } int diff --git a/sys/sys/event.h b/sys/sys/event.h index 0f13231..771b3bb 100644 --- a/sys/sys/event.h +++ b/sys/sys/event.h @@ -254,6 +254,7 @@ extern void knote_fork(struct knlist *list, int pid); extern void knlist_add(struct knlist *knl, struct knote *kn, int islocked); extern void knlist_remove(struct knlist *knl, struct knote *kn, int islocked); extern void knlist_remove_inevent(struct knlist *knl, struct knote *kn); +extern int knlist_empty_lockless(struct knlist *knl); extern int knlist_empty(struct knlist *knl); extern void knlist_init(struct knlist *knl, void *lock, void (*kl_lock)(void *), void (*kl_unlock)(void *), -- Mateusz Guzik From owner-freebsd-hackers@freebsd.org Wed Feb 3 08:05:19 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8D1F8A9A1E9 for ; Wed, 3 Feb 2016 08:05:19 +0000 (UTC) (envelope-from mjguzik@gmail.com) Received: from mail-wm0-x242.google.com (mail-wm0-x242.google.com [IPv6:2a00:1450:400c:c09::242]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 14CF11D9C; Wed, 3 Feb 2016 08:05:19 +0000 (UTC) (envelope-from mjguzik@gmail.com) Received: by mail-wm0-x242.google.com with SMTP id l66so6137486wml.2; Wed, 03 Feb 2016 00:05:19 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=date:from:to:subject:message-id:mail-followup-to:references :mime-version:content-type:content-disposition:in-reply-to :user-agent; bh=jg2eW4B7R2FAWeIO+4dg2kPa3NaS8ju2VS5tRK1lk80=; b=0G6ap+9YrRpjFI4Y8x4hq0LhiU0y8mJBHIt3FJuND/c7fiMg5TUUqznoJlyiSS9npD 
/seDQ7ExehWVMXNgxxD/LLYGgNEKALyYpGShMwMp+gc60hO9eIFUvXI7hUI5NVDJpSmj OLk2SE2/sFdmIaG94QTzhz2McVDYt/DBmrT8XiBD3EKHhf196KpcqUM1wjPwjqIpIWT4 tzkf3W806TXJF9YqPn/OlTwdaCfFmfWkvceiF4gYgFVf/qnK4/rDCA+l/XlR/XvbEelK VrVuRXpN0+rNKN7hTT74Jq6cJTGqisBg+ykmM6W3mIf1sd6Uig49fZ7Dkfm5deNsLL/b Blug== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:date:from:to:subject:message-id:mail-followup-to :references:mime-version:content-type:content-disposition :in-reply-to:user-agent; bh=jg2eW4B7R2FAWeIO+4dg2kPa3NaS8ju2VS5tRK1lk80=; b=elVmuovszDqPoGA+ZY+o5/hI3cY4PYpP3fpEEr43ZfN75XcApJGQnbifKqSRvR1VMq BdSFHjDlGmYEIY7x7MDwjJ/mQjUCv36DWBWe2GsjTAe4AB+NSysCTI0uGrXaQd2EZQ6q 53UyCKOtalMwtt0JLTs8Kxa1k9ex5PjDORa5QaDfX+zyCH0lEk2rE5Nf1UdTlVEDxFXI wkk8q7jIu61NgjiTABjnE+7usLF3Q7keUqW1u35b2CR4R+D8EHdmFCzmeP+wKLZWAo62 TO/TSCwzxsIWiqMljM7k/7NGxIibRhrpJU8sWi/R85LrMhl03DjfqI0tQZLu+nTUZZv6 51pg== X-Gm-Message-State: AG10YOThvO7HWXgqKVkR7eGiXQukbMJEpKwL4mGuwWzGt8B3BWFExbaDYGxRNxxEEPNGZQ== X-Received: by 10.195.11.35 with SMTP id ef3mr231647wjd.35.1454486717551; Wed, 03 Feb 2016 00:05:17 -0800 (PST) Received: from dft-labs.eu (n1x0n-1-pt.tunnel.tserv5.lon1.ipv6.he.net. 
[2001:470:1f08:1f7::2]) by smtp.gmail.com with ESMTPSA id cb2sm5158346wjc.16.2016.02.03.00.05.16 (version=TLS1_2 cipher=AES128-SHA bits=128/128); Wed, 03 Feb 2016 00:05:16 -0800 (PST) Date: Wed, 3 Feb 2016 09:05:15 +0100 From: Mateusz Guzik To: Konstantin Belousov , freebsd-hackers@freebsd.org, jmg@freebsd.org Subject: Re: [PATCH 2/2] fork: plug a use after free of the returned process pointer Message-ID: <20160203080514.GA8753@dft-labs.eu> Mail-Followup-To: Mateusz Guzik , Konstantin Belousov , freebsd-hackers@freebsd.org, jmg@freebsd.org References: <20160201103632.GL91220@kib.kiev.ua> <1454386069-29657-1-git-send-email-mjguzik@gmail.com> <1454386069-29657-3-git-send-email-mjguzik@gmail.com> <20160202132322.GU91220@kib.kiev.ua> <20160202175652.GA9812@dft-labs.eu> <20160202181635.GC91220@kib.kiev.ua> <20160202214427.GB9812@dft-labs.eu> <20160203010412.GC9812@dft-labs.eu> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20160203010412.GC9812@dft-labs.eu> User-Agent: Mutt/1.5.21 (2010-09-15) X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Feb 2016 08:05:19 -0000 On Wed, Feb 03, 2016 at 02:04:13AM +0100, Mateusz Guzik wrote: > On Tue, Feb 02, 2016 at 10:44:27PM +0100, Mateusz Guzik wrote: > > On Tue, Feb 02, 2016 at 08:16:35PM +0200, Konstantin Belousov wrote: > > > On Tue, Feb 02, 2016 at 06:56:52PM +0100, Mateusz Guzik wrote: > > > > On Tue, Feb 02, 2016 at 03:23:22PM +0200, Konstantin Belousov wrote: > > > > > On Tue, Feb 02, 2016 at 05:07:49AM +0100, Mateusz Guzik wrote: > > > > > > + flags = fr->fr_flags; > > > > > Why not use fr->fr_flags directly ? It is slightly more churn, but IMO > > > > > it is worth it. > > > > > > > > > > > > > I'm indiffernet on this one, can change it no problem. 
> > > > > > > > > > + /* > > > > > > + * Hold the process so that it cannot exit after we make it runnable, > > > > > > + * but before we wait for the debugger. > > > > > Is this possible ? The forked child must execute through fork_return(), > > > > > and there we do ptracestop() before the child has a chance to ever return > > > > > to usermode. > > > > > > > > > > Do you mean a scenario where the debugger detaches before child executes > > > > > fork_return() and TDP_STOPATFORK is cleared in advance ? > > > > > > > > > > > > > The comment is somewhat misworded and I forgot to update it, how about > > > > just stating we hold the process so that we can mark the thread runnable > > > > and not have it disappear under we are done. > > > This means that the reader has to guess too much, IMHO. > > > > > > At least, add a note that despite fork_return() stops when the child > > > is traced, it is not enough because ... . > > > > > > > > While I have not tested this particular bug, prior to the patch the > > > > following should possible: p2 is untraced and td2 is marked as runnable, > > > > after which it exits and p2 is automatically reaped. If the code reaches > > > > the TDB_STOPATFORK check after that, PROC_LOCK(p2) succeeds due to > > > I.e. td2 is reused and the TDB_STOPATFORK is set by unrelated activity ? > > > You reference the do_fork() code checking TDB_STOPATFORK, and not > > > fork_return(), I guess. > > > > > > > > > > processes being type stable. td2 dereference can cause no issues due to > > > > threads being type stable as well. But the thread could have been resued > > > > in a traced process, thus inducing cv_wait(&p2->p_dbgwait, ..) even > > > > though td2 is not linked in p2 anymore and p2 is not even a valid > > > > process, making curthread wait indefinitely since there is nobody to > > > > wake it up. 
> > > > > > > Well, if TDP_STOPATFORK bit is set, it has the same meaning due to type > > > stability, and eventually the wake up would be performed. It just the > > > unintended sleep waiting for condvar which is problematic and which I > > > agree with. > > > > > > CPU0 is executing fork1. p2 is not traced. > > > > CPU0 CPU1 > > p2 and td2 created > > td2 is marked runnable > > td2 is scheduled here > > td2 does not have TDB_STOPATFORK set > > td2 exits > > p2 is autoreaped > > td2's space is reused > > td2 gets linked into p3 > > td2 gets TDB_STOPATFORK > > PROC_LOCK(p2); > > TDB_STOPATFORK test on td2 > > cv_wait(&p2->p_dbgwait, ..); > > > > So at this point p2 has no linked threads and is free. td2 belongs to > > p3 and p2 is waiting for a wakeup which can't happen. > > > > Now that I look at it this may be broken in an additonal way, which is > > not fixed by the patch: what if td2 spawns a new thread and thr_exits? > > In this case testing td2 is still invalid. Maybe I'm just getting > > paranoid here, I don't have time to properly test this right now. In > > worst case should be fixable well enough with FIRST_THREAD_IN_PROC. > > > > How about the following comment around _PHOLD: > > We are going to make the main thread runnable. It can quickly exit, > > causing the process to be reaped and possibly reused, thus invalidating > > our p2 pointer. Protect against this by holding the process, which > > postpones the exit. > > > > And if the suspicion gets confimed the following would be added: > > diff --git a/sys/kern/kern_fork.c b/sys/kern/kern_fork.c > > index d0c3837..2a076ed 100644 > > --- a/sys/kern/kern_fork.c > > +++ b/sys/kern/kern_fork.c > > @@ -773,6 +773,12 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > > > > PROC_LOCK(p2); > > /* > > + * By the time we got here the thread could have created a new thread > > + * and exited. Reload the main thread to ensure we got the right > > + * pointer. 
> > + */
> > + td2 = FIRST_THREAD_IN_PROC(p2);
> > + /*
> > * Wait until debugger is attached to child.
> > */
> > while ((td2->td_dbgflags & TDB_STOPATFORK) != 0)
> >
> >
>
> To end the speculation and hackery I decided to reorganize the func a
> little bit instead. Namely it can be trivially modified to not drop the
> lock proc lock, which gets rid of all aforementioned races.

Here is a trivial change which retains calling knote_fork before waiting
for the debugger. The problem is worked around by making td2 runnable
only after knote_fork is performed, without relocking p2 in between.

diff --git a/sys/kern/kern_fork.c b/sys/kern/kern_fork.c
index d0c3837..366262f 100644
--- a/sys/kern/kern_fork.c
+++ b/sys/kern/kern_fork.c
@@ -715,11 +715,6 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread *
 if ((flags & RFMEM) == 0 && dtrace_fasttrap_fork)
 dtrace_fasttrap_fork(p1, p2);
 #endif
- /*
- * Hold the process so that it cannot exit after we make it runnable,
- * but before we wait for the debugger.
- */
- _PHOLD(p2);
 if ((p1->p_flag & (P_TRACED | P_FOLLOWFORK)) == (P_TRACED |
 P_FOLLOWFORK)) {
 /*
@@ -756,6 +751,7 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread *
 fdrop(fp_procdesc, td);
 }
+ PROC_LOCK(p2);
 if ((flags & RFSTOPPED) == 0) {
 /*
 * If RFSTOPPED not requested, make child runnable and
@@ -771,13 +767,11 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread *
 *fr->fr_procp = p2;
 }
- PROC_LOCK(p2);
 /*
 * Wait until debugger is attached to child.
 */
 while ((td2->td_dbgflags & TDB_STOPATFORK) != 0)
 cv_wait(&p2->p_dbgwait, &p2->p_mtx);
- _PRELE(p2);
 racct_proc_fork_done(p2);
 PROC_UNLOCK(p2);
 }

>
> I got 2 variants. For brevity both patches are applied on top of the
> current patchset. When combined, the patch would be combined with second
> patch in the patchset.
>
> Both currently and with the first patch below knote_fork can get a
> now-freed or even recycled pid.
I find that odd. This variant only saves > the pid and calls knote_fork. This is likely fine enough. > > Just in case, the second variant adds a primitive - > knlist_empty_lockless to perform a racy check to see if there are any > knotes. If so, the process is held and released after knote_fork. Note > that this is no more racy than current code with respect to spotting > knotes. What does improve is the fact that the process is guaranteed to > be around for during knote_fork, although I don't know how important > this is. > > With both variants we also save one lock/unlock round, and a second > lock/unlock with of knotes for the second variant in the common case. > > Variant 1: > diff --git a/sys/kern/kern_fork.c b/sys/kern/kern_fork.c > index d0c3837..fae4eaf 100644 > --- a/sys/kern/kern_fork.c > +++ b/sys/kern/kern_fork.c > @@ -378,7 +378,7 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > struct vmspace *vm2, struct file *fp_procdesc) > { > struct proc *p1, *pptr; > - int trypid; > + int p2_pid, trypid; > struct filedesc *fd; > struct filedesc_to_leader *fdtol; > struct sigacts *newsigacts; > @@ -715,11 +715,6 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > if ((flags & RFMEM) == 0 && dtrace_fasttrap_fork) > dtrace_fasttrap_fork(p1, p2); > #endif > - /* > - * Hold the process so that it cannot exit after we make it runnable, > - * but before we wait for the debugger. > - */ > - _PHOLD(p2); > if ((p1->p_flag & (P_TRACED | P_FOLLOWFORK)) == (P_TRACED | > P_FOLLOWFORK)) { > /* > @@ -737,7 +732,6 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > td->td_pflags |= TDP_RFPPWAIT; > td->td_rfppwait_p = p2; > } > - PROC_UNLOCK(p2); > > /* > * Now can be swapped. > @@ -745,16 +739,10 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > _PRELE(p1); > PROC_UNLOCK(p1); > > - /* > - * Tell any interested parties about the new process. 
> - */ > - knote_fork(&p1->p_klist, p2->p_pid); > SDT_PROBE3(proc, , , create, p2, p1, flags); > > - if (flags & RFPROCDESC) { > + if (flags & RFPROCDESC) > procdesc_finit(p2->p_procdesc, fp_procdesc); > - fdrop(fp_procdesc, td); > - } > > if ((flags & RFSTOPPED) == 0) { > /* > @@ -771,15 +759,26 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > *fr->fr_procp = p2; > } > > - PROC_LOCK(p2); > /* > * Wait until debugger is attached to child. > */ > while ((td2->td_dbgflags & TDB_STOPATFORK) != 0) > cv_wait(&p2->p_dbgwait, &p2->p_mtx); > - _PRELE(p2); > racct_proc_fork_done(p2); > + /* > + * The process can exit and be waited on after we drop the lock. Save > + * the pid so that it can be used for knotes. > + */ > + p2_pid = p2->p_pid; > PROC_UNLOCK(p2); > + > + /* > + * Tell any interested parties about the new process. > + */ > + knote_fork(&p1->p_klist, p2_pid); > + > + if (flags & RFPROCDESC) > + fdrop(fp_procdesc, td); > } > > int > ======================== > > Variant 2: > diff --git a/sys/kern/kern_event.c b/sys/kern/kern_event.c > index d41ac96..3610d8a 100644 > --- a/sys/kern/kern_event.c > +++ b/sys/kern/kern_event.c > @@ -2038,12 +2038,22 @@ knlist_remove_inevent(struct knlist *knl, struct knote *kn) > (kn->kn_status & KN_HASKQLOCK) == KN_HASKQLOCK); > } > > +/* > + * For when the caller accepts that the check is inherently racy. 
> + */ > +int > +knlist_empty_lockless(struct knlist *knl) > +{ > + > + return SLIST_EMPTY(&knl->kl_list); > +} > + > int > knlist_empty(struct knlist *knl) > { > > KNL_ASSERT_LOCKED(knl); > - return SLIST_EMPTY(&knl->kl_list); > + return knlist_empty_lockless(knl); > } > > static struct mtx knlist_lock; > diff --git a/sys/kern/kern_fork.c b/sys/kern/kern_fork.c > index d0c3837..70490ef 100644 > --- a/sys/kern/kern_fork.c > +++ b/sys/kern/kern_fork.c > @@ -378,7 +378,7 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > struct vmspace *vm2, struct file *fp_procdesc) > { > struct proc *p1, *pptr; > - int trypid; > + int p2_held, trypid; > struct filedesc *fd; > struct filedesc_to_leader *fdtol; > struct sigacts *newsigacts; > @@ -387,6 +387,7 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > sx_assert(&proctree_lock, SX_SLOCKED); > sx_assert(&allproc_lock, SX_XLOCKED); > > + p2_held = 0; > p1 = td->td_proc; > flags = fr->fr_flags; > > @@ -715,11 +716,6 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > if ((flags & RFMEM) == 0 && dtrace_fasttrap_fork) > dtrace_fasttrap_fork(p1, p2); > #endif > - /* > - * Hold the process so that it cannot exit after we make it runnable, > - * but before we wait for the debugger. > - */ > - _PHOLD(p2); > if ((p1->p_flag & (P_TRACED | P_FOLLOWFORK)) == (P_TRACED | > P_FOLLOWFORK)) { > /* > @@ -737,7 +733,6 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > td->td_pflags |= TDP_RFPPWAIT; > td->td_rfppwait_p = p2; > } > - PROC_UNLOCK(p2); > > /* > * Now can be swapped. > @@ -745,16 +740,10 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > _PRELE(p1); > PROC_UNLOCK(p1); > > - /* > - * Tell any interested parties about the new process. 
> - */ > - knote_fork(&p1->p_klist, p2->p_pid); > SDT_PROBE3(proc, , , create, p2, p1, flags); > > - if (flags & RFPROCDESC) { > + if (flags & RFPROCDESC) > procdesc_finit(p2->p_procdesc, fp_procdesc); > - fdrop(fp_procdesc, td); > - } > > if ((flags & RFSTOPPED) == 0) { > /* > @@ -771,15 +760,32 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > *fr->fr_procp = p2; > } > > - PROC_LOCK(p2); > /* > * Wait until debugger is attached to child. > */ > while ((td2->td_dbgflags & TDB_STOPATFORK) != 0) > cv_wait(&p2->p_dbgwait, &p2->p_mtx); > - _PRELE(p2); > racct_proc_fork_done(p2); > + if (!knlist_empty_lockless(&p1->p_klist)) { > + /* > + * Hold the process so that it does not exit until we call > + * knote_fork. > + */ > + _PHOLD(p2); > + p2_held = 1; > + } > PROC_UNLOCK(p2); > + > + if (p2_held) { > + /* > + * Tell any interested parties about the new process. > + */ > + knote_fork(&p1->p_klist, p2->p_pid); > + PRELE(p2); > + } > + > + if (flags & RFPROCDESC) > + fdrop(fp_procdesc, td); > } > > int > diff --git a/sys/sys/event.h b/sys/sys/event.h > index 0f13231..771b3bb 100644 > --- a/sys/sys/event.h > +++ b/sys/sys/event.h > @@ -254,6 +254,7 @@ extern void knote_fork(struct knlist *list, int pid); > extern void knlist_add(struct knlist *knl, struct knote *kn, int islocked); > extern void knlist_remove(struct knlist *knl, struct knote *kn, int islocked); > extern void knlist_remove_inevent(struct knlist *knl, struct knote *kn); > +extern int knlist_empty_lockless(struct knlist *knl); > extern int knlist_empty(struct knlist *knl); > extern void knlist_init(struct knlist *knl, void *lock, > void (*kl_lock)(void *), void (*kl_unlock)(void *), > > -- > Mateusz Guzik -- Mateusz Guzik From owner-freebsd-hackers@freebsd.org Wed Feb 3 10:53:53 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with 
ESMTP id A8877A991EE for ; Wed, 3 Feb 2016 10:53:53 +0000 (UTC) (envelope-from araujobsdport@gmail.com) Received: from mail-ob0-x22a.google.com (mail-ob0-x22a.google.com [IPv6:2607:f8b0:4003:c01::22a]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 6BBC362C; Wed, 3 Feb 2016 10:53:53 +0000 (UTC) (envelope-from araujobsdport@gmail.com) Received: by mail-ob0-x22a.google.com with SMTP id xk3so24005092obc.2; Wed, 03 Feb 2016 02:53:53 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:reply-to:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=oB7mfoN6WNU1dePo1hp75SHGT0FxfM2Tzb1JTDLouuA=; b=Q1rext8blhVoyncZc4js6tOLjHNIUl1D3YEGydiglX3zLgxEdYNl0yg5JEEQKBjaJr CAp9pnA6WW0mONzvgBqa+tLbt1nR+HS+FBbtDyzuLFsH/eFUVFj7krjZ8iwkKUWz0E34 kfeWTBzpFv/N8QZAq3hjIE03+0zv1nWgg/Ymswhgu61tQSCc3kAC+sIamesAkkrvGDhe 0hJYBj7abB3nCz8vAJtidjnkOswxDs3er5IldA0t7bM/MZzicmFlkFf7QGPXkF2CrqSv rcTKr3tVfLaI/zA35Z/SxAEswhrpwUzDjExIrg+7aNxFLYxkSILDgqvWTR/3VXglr/Ft HPuQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:reply-to:in-reply-to:references :date:message-id:subject:from:to:cc:content-type; bh=oB7mfoN6WNU1dePo1hp75SHGT0FxfM2Tzb1JTDLouuA=; b=RuFN37ZessYLi1s4slnkdoOfiOV4uQSZSVLzaywAzbLwpc5hwQ1qrjiWWMcVJloqyg vU2V26v8JjgkgbGtKdAZSGoSASKgeXYVVW6Utw6CFBwtIfY6/wpOPMXYAqEvOhn++/7Q ponTI/Kx55QnKosRFaTBqZlEu0hkYQnXl2WWjNLmHlLHUmJHSK0reoPNnNT+FSHFj42e bnwM7LLWNpwuJzOvZiprvLF3dfocBUB26IWJzD3tS5ocbjGvfxL3KWFRxo939sfrLKLp YsoDc0xRRU5LdF4tEl2zLmdfI8cb1PsD09XeQ/RTw8C3GGexGgnf1AtQk3evMxVflh9O ax8g== X-Gm-Message-State: AG10YOSq4Tke+PvaXYHh0VJgFmm3iiOXVK+pTdVTrX3coZhq86lPEwJGy44Jn5bbcvtZtWROZhctp6vLfsHcfg== MIME-Version: 1.0 X-Received: by 10.60.246.74 with SMTP id xu10mr971551oec.31.1454496832615; Wed, 03 Feb 2016 
02:53:52 -0800 (PST) Received: by 10.182.5.138 with HTTP; Wed, 3 Feb 2016 02:53:52 -0800 (PST) Received: by 10.182.5.138 with HTTP; Wed, 3 Feb 2016 02:53:52 -0800 (PST) Reply-To: araujo@FreeBSD.org In-Reply-To: <1453903686.42081.32.camel@freebsd.org> References: <56A86D91.3040709@freebsd.org> <1453903686.42081.32.camel@freebsd.org> Date: Wed, 3 Feb 2016 18:53:52 +0800 Message-ID: Subject: Re: syslogd(8) with OOM Killer protection From: Marcelo Araujo To: Ian Lepore Cc: freebsd-hackers@freebsd.org, Allan Jude Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.20 X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Feb 2016 10:53:53 -0000

Hi,

Thanks everybody for the feedback!

Here is a potential patch that covers what we have discussed here so far:
https://reviews.freebsd.org/D5176

Note that I'm still running more tests and this patch is not ready to be
used in production.

Basically, OOM protection can be enabled with:

syslogd_oomprotect="Yes"

which will protect only the main processes, or with:

syslogd_oomprotect="All"

which will protect all future children of the specified processes.

Best.

On Jan 27, 2016 10:08 PM, "Ian Lepore" wrote:

> On Wed, 2016-01-27 at 02:11 -0500, Allan Jude wrote:
> > On 2016-01-27 01:21, Marcelo Araujo wrote:
> > > Hi guys,
> > >
> > > I would like to know your opinion about this REVIEW[1].
> > > The basic idea is protect by default the syslogd(8) against been
> > > killed by
> > > OOM with an option to disable the protection.
> > >
> > > Some people like the idea, other people would prefer something more
> > > global
> > > where we can protect any daemon by the discretion of our choice.
> > >
> > > Thoughts?
> > > > > > > > > [1] https://reviews.freebsd.org/D4973 > > > > > > > > > Best, > > > > > > > I do like the idea of generalizing it, say via rc.subr > > > > So you can just do: > > > > someapp_protect=YES (and maybe syslogd has this enabled by default in > > /etc/defaults/rc.conf) and it prefixes the start command with protect > > -i. > > > > Maybe the setting could be named *_oomprotect to make it clear what > kind of protection is being configured? > > -- Ian > > _______________________________________________ > freebsd-hackers@freebsd.org mailing list > https://lists.freebsd.org/mailman/listinfo/freebsd-hackers > To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org" > From owner-freebsd-hackers@freebsd.org Wed Feb 3 14:13:44 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 83054A9A071 for ; Wed, 3 Feb 2016 14:13:44 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 0155CA6E; Wed, 3 Feb 2016 14:13:43 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id u13EDTVQ010003 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Wed, 3 Feb 2016 16:13:29 +0200 (EET) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua u13EDTVQ010003 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id u13EDTYs010002; Wed, 3 Feb 2016 16:13:29 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Wed, 3 Feb 2016 16:13:29 +0200 From: Konstantin Belousov To: Mateusz 
Guzik Cc: freebsd-hackers@freebsd.org, jmg@freebsd.org Subject: Re: [PATCH 2/2] fork: plug a use after free of the returned process pointer Message-ID: <20160203141329.GF91220@kib.kiev.ua> References: <20160201103632.GL91220@kib.kiev.ua> <1454386069-29657-1-git-send-email-mjguzik@gmail.com> <1454386069-29657-3-git-send-email-mjguzik@gmail.com> <20160202132322.GU91220@kib.kiev.ua> <20160202175652.GA9812@dft-labs.eu> <20160202181635.GC91220@kib.kiev.ua> <20160202214427.GB9812@dft-labs.eu> <20160203010412.GC9812@dft-labs.eu> <20160203080514.GA8753@dft-labs.eu> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160203080514.GA8753@dft-labs.eu> User-Agent: Mutt/1.5.24 (2015-08-30) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Feb 2016 14:13:44 -0000 On Wed, Feb 03, 2016 at 09:05:15AM +0100, Mateusz Guzik wrote: > On Wed, Feb 03, 2016 at 02:04:13AM +0100, Mateusz Guzik wrote: > > On Tue, Feb 02, 2016 at 10:44:27PM +0100, Mateusz Guzik wrote: > > > On Tue, Feb 02, 2016 at 08:16:35PM +0200, Konstantin Belousov wrote: > > > > On Tue, Feb 02, 2016 at 06:56:52PM +0100, Mateusz Guzik wrote: > > > > > On Tue, Feb 02, 2016 at 03:23:22PM +0200, Konstantin Belousov wrote: > > > > > > On Tue, Feb 02, 2016 at 05:07:49AM +0100, Mateusz Guzik wrote: > > > > > > > + flags = fr->fr_flags; > > > > > > Why not use fr->fr_flags directly ? It is slightly more churn, but IMO > > > > > > it is worth it. > > > > > > > > > > > > > > > > I'm indiffernet on this one, can change it no problem. 
> > > > > > > > > > > > + /* > > > > > > > + * Hold the process so that it cannot exit after we make it runnable, > > > > > > > + * but before we wait for the debugger. > > > > > > Is this possible ? The forked child must execute through fork_return(), > > > > > > and there we do ptracestop() before the child has a chance to ever return > > > > > > to usermode. > > > > > > > > > > > > Do you mean a scenario where the debugger detaches before child executes > > > > > > fork_return() and TDB_STOPATFORK is cleared in advance ? > > > > > > > > > > > > > > > > The comment is somewhat misworded and I forgot to update it, how about > > > > > just stating we hold the process so that we can mark the thread runnable > > > > > and not have it disappear before we are done. > > > > This means that the reader has to guess too much, IMHO. > > > > > > > > At least, add a note that although fork_return() stops when the child > > > > is traced, it is not enough because ... . > > > > > > > > > > While I have not tested this particular bug, prior to the patch the > > > > > following should be possible: p2 is untraced and td2 is marked as runnable, > > > > > after which it exits and p2 is automatically reaped. If the code reaches > > > > > the TDB_STOPATFORK check after that, PROC_LOCK(p2) succeeds due to > > > > I.e. td2 is reused and the TDB_STOPATFORK is set by unrelated activity ? > > > > You reference the do_fork() code checking TDB_STOPATFORK, and not > > > > fork_return(), I guess. > > > > > > > > > > > > > processes being type stable. td2 dereference can cause no issues due to > > > > > threads being type stable as well. But the thread could have been reused > > > > > in a traced process, thus inducing cv_wait(&p2->p_dbgwait, ..) even > > > > > though td2 is not linked in p2 anymore and p2 is not even a valid > > > > > process, making curthread wait indefinitely since there is nobody to > > > > > wake it up.
> > > > > > > > > Well, if TDB_STOPATFORK bit is set, it has the same meaning due to type > > > > stability, and eventually the wake up would be performed. It is just the > > > > unintended sleep waiting for condvar which is problematic and which I > > > > agree with. > > > > > > > > > CPU0 is executing fork1. p2 is not traced.
> > >
> > > CPU0                            CPU1
> > > p2 and td2 created
> > > td2 is marked runnable
> > >                                 td2 is scheduled here
> > >                                 td2 does not have TDB_STOPATFORK set
> > >                                 td2 exits
> > >                                 p2 is autoreaped
> > >                                 td2's space is reused
> > >                                 td2 gets linked into p3
> > >                                 td2 gets TDB_STOPATFORK
> > > PROC_LOCK(p2);
> > > TDB_STOPATFORK test on td2
> > > cv_wait(&p2->p_dbgwait, ..);
> > >
> > > So at this point p2 has no linked threads and is free. td2 belongs to > > > p3 and p2 is waiting for a wakeup which can't happen. I am convinced about this. I thought that the fork_return() guarantee that the child cannot exit if TDB_STOPATFORK is set is enough, but the issue is the other way around. > > > > > > Now that I look at it this may be broken in an additional way, which is > > > not fixed by the patch: what if td2 spawns a new thread and thr_exits? > > > In this case testing td2 is still invalid. Maybe I'm just getting > > > paranoid here, I don't have time to properly test this right now. In > > > worst case should be fixable well enough with FIRST_THREAD_IN_PROC. > > > > > > How about the following comment around _PHOLD: > > > We are going to make the main thread runnable. It can quickly exit, > > > causing the process to be reaped and possibly reused, thus invalidating > > > our p2 pointer. Protect against this by holding the process, which > > > postpones the exit.
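The hold-then-release idea in the proposed _PHOLD comment can be modelled outside the kernel. What follows is only a minimal single-threaded sketch under stated assumptions, not the kernel code: struct obj, obj_hold(), obj_exit(), obj_rele() and publish_then_inspect() are hypothetical names, and the waiting that a real exit path would do is modelled as a deferred teardown.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a process; not the kernel's struct proc. */
struct obj {
	int	holds;		/* outstanding holds */
	bool	exit_pending;	/* teardown was requested while held */
	bool	torn_down;	/* teardown has completed */
};

/* Take a hold: teardown is postponed until the hold is dropped. */
void
obj_hold(struct obj *o)
{
	o->holds++;
}

/* Request teardown; it completes at once only if nobody holds the object. */
void
obj_exit(struct obj *o)
{
	if (o->holds > 0)
		o->exit_pending = true;
	else
		o->torn_down = true;
}

/* Drop a hold and finish a teardown that was postponed by it. */
void
obj_rele(struct obj *o)
{
	assert(o->holds > 0);
	o->holds--;
	if (o->holds == 0 && o->exit_pending) {
		o->exit_pending = false;
		o->torn_down = true;
	}
}

/*
 * Parent side of the pattern: hold, let the child run (here it exits
 * immediately), inspect, release. Returns true if the object was still
 * intact at inspection time.
 */
bool
publish_then_inspect(struct obj *o)
{
	bool intact;

	obj_hold(o);
	obj_exit(o);		/* models the child exiting right away */
	intact = !o->torn_down;	/* still valid: our hold postponed the exit */
	obj_rele(o);
	return (intact);
}
```

Without the obj_hold() call, obj_exit() would complete immediately and the inspection would see a torn-down object, which is the window the comment describes.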
> > > > > > And if the suspicion gets confimed the following would be added: > > > diff --git a/sys/kern/kern_fork.c b/sys/kern/kern_fork.c > > > index d0c3837..2a076ed 100644 > > > --- a/sys/kern/kern_fork.c > > > +++ b/sys/kern/kern_fork.c > > > @@ -773,6 +773,12 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > > > > > > PROC_LOCK(p2); > > > /* > > > + * By the time we got here the thread could have created a new thread > > > + * and exited. Reload the main thread to ensure we got the right > > > + * pointer. > > > + */ > > > + td2 = FIRST_THREAD_IN_PROC(p2); > > > + /* > > > * Wait until debugger is attached to child. > > > */ > > > while ((td2->td_dbgflags & TDB_STOPATFORK) != 0) > > > > > > > > > > To end the speculation and hackery I decided to reorganize the func a > > little bit instead. Namely it can be trivially modified to not drop the > > lock proc lock, which gets rid of all aforementioned races. > > Here is a trivial change retaining doing knote_fork before waiting for > the debugger. Here the problem is worked around by making td2 runnable > after knote_fork is performed and without relocking p2 in-between. > > diff --git a/sys/kern/kern_fork.c b/sys/kern/kern_fork.c > index d0c3837..366262f 100644 > --- a/sys/kern/kern_fork.c > +++ b/sys/kern/kern_fork.c > @@ -715,11 +715,6 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > if ((flags & RFMEM) == 0 && dtrace_fasttrap_fork) > dtrace_fasttrap_fork(p1, p2); > #endif > - /* > - * Hold the process so that it cannot exit after we make it runnable, > - * but before we wait for the debugger. > - */ > - _PHOLD(p2); > if ((p1->p_flag & (P_TRACED | P_FOLLOWFORK)) == (P_TRACED | > P_FOLLOWFORK)) { > /* > @@ -756,6 +751,7 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > fdrop(fp_procdesc, td); > } > > + PROC_LOCK(p2); It is hard to read the patch over patch. Is there proc lock for p1 owned ? 
Note that the order for the proc locks is child->parent. It is not caught by witness. > if ((flags & RFSTOPPED) == 0) { > /* > * If RFSTOPPED not requested, make child runnable and > @@ -771,13 +767,11 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > *fr->fr_procp = p2; > } > > - PROC_LOCK(p2); > /* > * Wait until debugger is attached to child. > */ > while ((td2->td_dbgflags & TDB_STOPATFORK) != 0) > cv_wait(&p2->p_dbgwait, &p2->p_mtx); > - _PRELE(p2); > racct_proc_fork_done(p2); > PROC_UNLOCK(p2); > } > > > > > I got 2 variants. For brevity both patches are applied on top of the > > current patchset. When combined, the patch would be combined with the second > > patch in the patchset. Can you commit two current patches to ease the conversation ? > > > > Both currently and with the first patch below knote_fork can get a > > now-freed or even recycled pid. I find that odd. This variant only saves > > the pid and calls knote_fork. This is likely fine enough. > > > > Just in case, the second variant adds a primitive - > > knlist_empty_lockless to perform a racy check to see if there are any > > knotes. If so, the process is held and released after knote_fork. Note > > that this is no more racy than current code with respect to spotting > > knotes. What does improve is the fact that the process is guaranteed to > > be around during knote_fork, although I don't know how important > > this is. > > > > With both variants we also save one lock/unlock round, and a second > > lock/unlock of knotes for the second variant in the common case.
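The first variant's approach, saving the pid while the lock is still held and using only the saved copy afterwards, can be illustrated in isolation. This is a sketch under stated assumptions rather than the patch itself: struct fake_proc, the toy lock flag, fake_recycle(), fake_notify() and finish_fork() are all hypothetical, and the "lock" exists only so that assertions can check the discipline.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a process; not the kernel's struct proc. */
struct fake_proc {
	bool	locked;	/* toy lock flag, only used for assertions */
	int	pid;
};

void
fake_lock(struct fake_proc *p)
{
	assert(!p->locked);
	p->locked = true;
}

void
fake_unlock(struct fake_proc *p)
{
	assert(p->locked);
	p->locked = false;
}

/* Models the object being recycled once nobody protects it. */
void
fake_recycle(struct fake_proc *p, int new_pid)
{
	assert(!p->locked);
	p->pid = new_pid;
}

/* Records which pid a notification was sent for. */
int last_notified_pid = -1;

void
fake_notify(int pid)
{
	last_notified_pid = pid;
}

/*
 * Snapshot the pid while the lock is held, drop the lock, then notify
 * using only the saved copy. The recycle call stands for whatever may
 * happen to the object once it is unlocked.
 */
void
finish_fork(struct fake_proc *p2)
{
	int saved_pid;

	fake_lock(p2);
	saved_pid = p2->pid;
	fake_unlock(p2);

	fake_recycle(p2, 9999);		/* p2->pid is no longer ours to read */
	fake_notify(saved_pid);
}
```

The notification still carries the pid that was current at fork time, even though the object has changed underneath. It does not address the separate point raised above, that the pid value itself may already belong to a different process by then.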
> > > > Variant 1: > > diff --git a/sys/kern/kern_fork.c b/sys/kern/kern_fork.c > > index d0c3837..fae4eaf 100644 > > --- a/sys/kern/kern_fork.c > > +++ b/sys/kern/kern_fork.c > > @@ -378,7 +378,7 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > > struct vmspace *vm2, struct file *fp_procdesc) > > { > > struct proc *p1, *pptr; > > - int trypid; > > + int p2_pid, trypid; > > struct filedesc *fd; > > struct filedesc_to_leader *fdtol; > > struct sigacts *newsigacts; > > @@ -715,11 +715,6 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > > if ((flags & RFMEM) == 0 && dtrace_fasttrap_fork) > > dtrace_fasttrap_fork(p1, p2); > > #endif > > - /* > > - * Hold the process so that it cannot exit after we make it runnable, > > - * but before we wait for the debugger. > > - */ > > - _PHOLD(p2); > > if ((p1->p_flag & (P_TRACED | P_FOLLOWFORK)) == (P_TRACED | > > P_FOLLOWFORK)) { > > /* > > @@ -737,7 +732,6 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > > td->td_pflags |= TDP_RFPPWAIT; > > td->td_rfppwait_p = p2; > > } > > - PROC_UNLOCK(p2); > > > > /* > > * Now can be swapped. > > @@ -745,16 +739,10 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > > _PRELE(p1); > > PROC_UNLOCK(p1); > > > > - /* > > - * Tell any interested parties about the new process. > > - */ > > - knote_fork(&p1->p_klist, p2->p_pid); > > SDT_PROBE3(proc, , , create, p2, p1, flags); > > > > - if (flags & RFPROCDESC) { > > + if (flags & RFPROCDESC) > > procdesc_finit(p2->p_procdesc, fp_procdesc); > > - fdrop(fp_procdesc, td); > > - } > > > > if ((flags & RFSTOPPED) == 0) { > > /* > > @@ -771,15 +759,26 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > > *fr->fr_procp = p2; > > } > > > > - PROC_LOCK(p2); > > /* > > * Wait until debugger is attached to child. 
> > */ > > while ((td2->td_dbgflags & TDB_STOPATFORK) != 0) > > cv_wait(&p2->p_dbgwait, &p2->p_mtx); > > - _PRELE(p2); > > racct_proc_fork_done(p2); > > + /* > > + * The process can exit and be waited on after we drop the lock. Save > > + * the pid so that it can be used for knotes. > > + */ > > + p2_pid = p2->p_pid; > > PROC_UNLOCK(p2); > > + > > + /* > > + * Tell any interested parties about the new process. > > + */ > > + knote_fork(&p1->p_klist, p2_pid); > > + > > + if (flags & RFPROCDESC) > > + fdrop(fp_procdesc, td); > > } > > > > int > > ======================== > > > > Variant 2: > > diff --git a/sys/kern/kern_event.c b/sys/kern/kern_event.c > > index d41ac96..3610d8a 100644 > > --- a/sys/kern/kern_event.c > > +++ b/sys/kern/kern_event.c > > @@ -2038,12 +2038,22 @@ knlist_remove_inevent(struct knlist *knl, struct knote *kn) > > (kn->kn_status & KN_HASKQLOCK) == KN_HASKQLOCK); > > } > > > > +/* > > + * For when the caller accepts that the check is inherently racy. > > + */ > > +int > > +knlist_empty_lockless(struct knlist *knl) > > +{ > > + > > + return SLIST_EMPTY(&knl->kl_list); > > +} > > + > > int > > knlist_empty(struct knlist *knl) > > { > > > > KNL_ASSERT_LOCKED(knl); > > - return SLIST_EMPTY(&knl->kl_list); > > + return knlist_empty_lockless(knl); > > } > > > > static struct mtx knlist_lock; > > diff --git a/sys/kern/kern_fork.c b/sys/kern/kern_fork.c > > index d0c3837..70490ef 100644 > > --- a/sys/kern/kern_fork.c > > +++ b/sys/kern/kern_fork.c > > @@ -378,7 +378,7 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > > struct vmspace *vm2, struct file *fp_procdesc) > > { > > struct proc *p1, *pptr; > > - int trypid; > > + int p2_held, trypid; > > struct filedesc *fd; > > struct filedesc_to_leader *fdtol; > > struct sigacts *newsigacts; > > @@ -387,6 +387,7 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > > sx_assert(&proctree_lock, SX_SLOCKED); > > sx_assert(&allproc_lock, 
SX_XLOCKED); > > > > + p2_held = 0; > > p1 = td->td_proc; > > flags = fr->fr_flags; > > > > @@ -715,11 +716,6 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > > if ((flags & RFMEM) == 0 && dtrace_fasttrap_fork) > > dtrace_fasttrap_fork(p1, p2); > > #endif > > - /* > > - * Hold the process so that it cannot exit after we make it runnable, > > - * but before we wait for the debugger. > > - */ > > - _PHOLD(p2); > > if ((p1->p_flag & (P_TRACED | P_FOLLOWFORK)) == (P_TRACED | > > P_FOLLOWFORK)) { > > /* > > @@ -737,7 +733,6 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > > td->td_pflags |= TDP_RFPPWAIT; > > td->td_rfppwait_p = p2; > > } > > - PROC_UNLOCK(p2); > > > > /* > > * Now can be swapped. > > @@ -745,16 +740,10 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > > _PRELE(p1); > > PROC_UNLOCK(p1); > > > > - /* > > - * Tell any interested parties about the new process. > > - */ > > - knote_fork(&p1->p_klist, p2->p_pid); > > SDT_PROBE3(proc, , , create, p2, p1, flags); > > > > - if (flags & RFPROCDESC) { > > + if (flags & RFPROCDESC) > > procdesc_finit(p2->p_procdesc, fp_procdesc); > > - fdrop(fp_procdesc, td); > > - } > > > > if ((flags & RFSTOPPED) == 0) { > > /* > > @@ -771,15 +760,32 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * > > *fr->fr_procp = p2; > > } > > > > - PROC_LOCK(p2); > > /* > > * Wait until debugger is attached to child. > > */ > > while ((td2->td_dbgflags & TDB_STOPATFORK) != 0) > > cv_wait(&p2->p_dbgwait, &p2->p_mtx); > > - _PRELE(p2); > > racct_proc_fork_done(p2); > > + if (!knlist_empty_lockless(&p1->p_klist)) { > > + /* > > + * Hold the process so that it does not exit until we call > > + * knote_fork. > > + */ > > + _PHOLD(p2); > > + p2_held = 1; > > + } > > PROC_UNLOCK(p2); > > + > > + if (p2_held) { > > + /* > > + * Tell any interested parties about the new process. 
> > + */ > > + knote_fork(&p1->p_klist, p2->p_pid); > > + PRELE(p2); > > + } > > + > > + if (flags & RFPROCDESC) > > + fdrop(fp_procdesc, td); > > } > > > > int > > diff --git a/sys/sys/event.h b/sys/sys/event.h > > index 0f13231..771b3bb 100644 > > --- a/sys/sys/event.h > > +++ b/sys/sys/event.h > > @@ -254,6 +254,7 @@ extern void knote_fork(struct knlist *list, int pid); > > extern void knlist_add(struct knlist *knl, struct knote *kn, int islocked); > > extern void knlist_remove(struct knlist *knl, struct knote *kn, int islocked); > > extern void knlist_remove_inevent(struct knlist *knl, struct knote *kn); > > +extern int knlist_empty_lockless(struct knlist *knl); > > extern int knlist_empty(struct knlist *knl); > > extern void knlist_init(struct knlist *knl, void *lock, > > void (*kl_lock)(void *), void (*kl_unlock)(void *), > > > > -- > > Mateusz Guzik > > -- > Mateusz Guzik From owner-freebsd-hackers@freebsd.org Thu Feb 4 04:35:44 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 65FA0A9B88B; Thu, 4 Feb 2016 04:35:44 +0000 (UTC) (envelope-from kaduk@mit.edu) Received: from dmz-mailsec-scanner-7.mit.edu (dmz-mailsec-scanner-7.mit.edu [18.7.68.36]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 61844B68; Thu, 4 Feb 2016 04:35:42 +0000 (UTC) (envelope-from kaduk@mit.edu) X-AuditID: 12074424-bfbff70000000a40-28-56b2d51659a8 Received: from mailhub-auth-3.mit.edu ( [18.9.21.43]) (using TLS with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by (Symantec Messaging Gateway) with SMTP id 4A.63.02624.615D2B65; Wed, 3 Feb 2016 23:35:34 -0500 (EST) Received: from outgoing.mit.edu (outgoing-auth-1.mit.edu [18.9.28.11]) by mailhub-auth-3.mit.edu (8.13.8/8.9.2) with ESMTP id 
u144ZXbu010626; Wed, 3 Feb 2016 23:35:34 -0500 Received: from multics.mit.edu (system-low-sipb.mit.edu [18.187.2.37]) (authenticated bits=56) (User authenticated as kaduk@ATHENA.MIT.EDU) by outgoing.mit.edu (8.13.8/8.12.4) with ESMTP id u144ZTMJ000685 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT); Wed, 3 Feb 2016 23:35:33 -0500 Received: (from kaduk@localhost) by multics.mit.edu (8.12.9.20060308) id u144ZSTS007434; Wed, 3 Feb 2016 23:35:28 -0500 (EST) Date: Wed, 3 Feb 2016 23:35:28 -0500 (EST) From: Benjamin Kaduk X-X-Sender: kaduk@multics.mit.edu To: freebsd-hackers@freebsd.org cc: freebsd-current@freebsd.org, freebsd-stable@freebsd.org Subject: FreeBSD Quarterly Status Report - Fourth Quarter 2015 Message-ID: User-Agent: Alpine 1.10 (GSO 962 2008-03-14) MIME-Version: 1.0 X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFvrEIsWRmVeSWpSXmKPExsUixCmqrSt2dVOYwcZ5Oha7rp1mt5jz5gOT xfbN/xgtDjcLObB4zPg0nyWAMYrLJiU1J7MstUjfLoErY/arhWwF645wVPRteMvawPjuElsX IyeHhICJRPuDo0xdjFwcQgJtTBKfbx2BcjYwSlw7sZ4NwjnIJDFh2R4mkBYhgXqJz5eegdks AloSvxu72UFsNgE1icd7m1khxipKbD41ibmLkYNDREBeYsF5e5Aws4C1xNx168FKhAXsJN51 /wW7glfAUWLzqv8sILaogI7E6v1TWCDighInZz5hgegNkNh84BLrBEb+WUhSs5CkIGwdiSkT VzBC2NoS92+2sS1gZFnFKJuSW6Wbm5iZU5yarFucnJiXl1qka66Xm1mil5pSuokRFLTsLio7 GJsPKR1iFOBgVOLhbfDcFCbEmlhWXJl7iFGSg0lJlDf1GFCILyk/pTIjsTgjvqg0J7X4EKME B7OSCG/IbqAcb0piZVVqUT5MSpqDRUmc14gfKCWQnliSmp2aWpBaBJOV4eBQkuCVvwKUFSxK TU+tSMvMKUFIM3FwggznARo++TLI8OKCxNzizHSI/ClGRSlx3qsgCQGQREZpHlwvOKnsZlJ9 xSgO9IowrzvICh5gQoLrfgU0mAlo8Gy+9SCDSxIRUlINjP35BsoXXCO82MJeFX3VFxV+ctzY UWgr29p9gjN/7hDbJirV+UllxoSr20tXpEmVdb4yfvT2Xnri3FWxqcsPrtqh9Oh6M4+zoW37 Biul5m2SogtU34mIrubvmPdHPI/X+e22mlnfMtbfOzld7+qOWfEnJiw5Inf3gORZ+zb2eesY X01cwbTptb0SS3FGoqEWc1FxIgAEFFpPBQMAAA== Content-Type: TEXT/PLAIN; charset=ISO-8859-2 Content-Transfer-Encoding: QUOTED-PRINTABLE X-Content-Filtered-By: Mailman/MimeDel 2.1.20 X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Technical Discussions 
relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 04 Feb 2016 04:35:44 -0000 -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512 FreeBSD Project Quarterly Status Report: October - December 2015 The fourth quarter of 2015 saw a great deal of activity for FreeBSD. This is now the third quarter running for which I can say that this is the largest report yet published! Many thanks to everyone who proactively submitted topics and entries -- it is great to have more complete coverage of ongoing development for the community to learn about in these reports. An experimental new Triage Team was formed this quarter to create a new way for community members to participate, and to improve issue management and productivity in general. Making more effective use of automation and tooling can help to increase developer productivity and the quality of FreeBSD, just as the adoption of Jenkins and continual integration tooling catches regressions quickly and maintains the high standards for the system. Efforts to bring our BSD high standards to new architectures continue, with impressive work on arm64 leading to its promotion to Tier-2 status and a flurry of work bringing up the new RISC-V hardware architecture. Software architecture is also under active development, including system startup and service management. A handful of potential init system replacements are mentioned in this report: launchd, relaunchd, and nosh. Architectural changes originating both from academic research (multipath TCP) and from the realities of industry (sendfile(2) improvements) are also under way. It is heartening to see how FreeBSD provides a welcoming platform for contributions from both research and industry. To all the readers, whether from academia or industry, hobbyist or professional: I hope you are as excited as I am to read about all of the progress and projects covered in this report, and the future of FreeBSD! 
--Ben Kaduk __________________________________________________________________ The deadline for submissions covering the period from January to March 2016 is April 7, 2016. __________________________________________________________________ FreeBSD Team Reports * FreeBSD Release Engineering Team * Issue Tracking (Bugzilla) * The FreeBSD Core Team * The FreeBSD Issue Triage Team Projects * CAM I/O Scheduler * Encrypted Kernel Crash Dumps * Jenkins Continuous Integration for FreeBSD * Mellanox iSCSI Extensions for RDMA (iSER) Support * MIPS: Ralink/Mediatek Support * Multipath TCP for FreeBSD * OpenBSM * Raspberry Pi: VideoCore Userland Application Packaging * RCTL Disk IO Limits * Root Remount * Routing Stack Update * The Graphics Stack on FreeBSD * The nosh Project * UEFI Boot and Framebuffer Support Kernel * Chelsio iSCSI Offload Driver (Initiator and Target) * FreeBSD Integration Services (BIS) * FreeBSD Xen * Improvements to the QLogic HBA Driver * iMX.6 Video Output Support * ioat(4) Driver Enhancements * Kernel Vnode Cache Tuning * Mellanox Drivers * Minimal Kernel with PNP-Based Autoloading * MMC Stack Under CAM Framework * ntb_hw(4)/if_ntb(4) Driver Synced up to Linux * Out of Memory Handler Rewrite * sendfile(2) Improvements * sysctl Enhancements * Touchscreen Support for Raspberry Pi and Beaglebone Black Architectures * armv6 Hard Float Default ABI * FreeBSD on Marvell Armada38x * FreeBSD on Newer ARM Boards * FreeBSD on SoftIron Overdrive 3000 * FreeBSD/arm64 * FreeBSD/RISC-V * Improvements for ARMv6/v7 Support Userland Programs * Base System Build Improvements * ELF Tool Chain Tools * The LLDB Debugger * Updates to GDB Ports * Bringing GitLab into the Ports Collection * GNOME on FreeBSD * IPv6 Promotion Campaign * KDE on FreeBSD * Linux Kernel as a Library Added to the Ports Collection * LXQt on FreeBSD * New Tools to Enhance the Porting Experience * Node.js Modules * Ports Collection * Supporting Variants in the Ports Framework * Xfce on FreeBSD 
Documentation * "FreeBSD Mastery: Specialty Filesystems" Early Access Version Now Available * style(9) Enhanced to Allow C99 bool Miscellaneous * HardenedBSD * NanoBSD Modernization * relaunchd * System Initialization and Service Management * The FreeBSD Foundation __________________________________________________________________ FreeBSD Release Engineering Team Links FreeBSD 10.3-RELEASE schedule URL: https://www.freebsd.org/releases/10.3R/schedule.html FreeBSD Development Snapshots URL: http://ftp.freebsd.org/pub/FreeBSD/snapshots/ISO-IMAGES/ Contact: FreeBSD Release Engineering Team The FreeBSD Release Engineering Team is responsible for setting and publishing release schedules for official project releases of FreeBSD, announcing code freezes, and maintaining the respective branches, among other things. During the last quarter of 2015, the Release Engineering team added support for three additional FreeBSD/arm systems: BANANAPI, CUBIEBOARD, and CUBIEBOARD2. In addition to regular development snapshot builds for FreeBSD 11.0-CURRENT and FreeBSD 10.2-STABLE, several changes and enhancements were made to the release build code. Of note, the release build code no longer produces MD5 checksums, in favor of SHA512. Toward the end of the year, focus was primarily centered on the upcoming FreeBSD 10.3 release cycle, which will begin in January 2016. As always, help testing development snapshot builds is crucial to producing quality releases, and we encourage testing development snapshots whenever possible. This project is sponsored by The FreeBSD Foundation. __________________________________________________________________ Issue Tracking (Bugzilla) Links Bugzilla Home Page URL: https://bugs.freebsd.org/bugzilla/ Contact: Bugmeisters Contact: Kubilay Kocak Contact: Mahdi Mokhtari The bugmeister team has gained a new member, Mahdi Mokhtari (mokhi64@gmail.com). Mahdi has been contributing to the FreeBSD Project for just over one month. 
After getting started by creating ports for Chef-Server and MySQL 5.7 (with Bernard Spil's help), an introduction to Kubilay Kocak led to guidance on appropriate projects, such as Bugzilla development to help Bugmeister, the Bugzilla Triage team, Developers, and the community by making issue tracking better. This is how things are going so far:

Issue tracking is sometimes described as "defect tracking for systems" or "bug tracking for systems". Its purpose is to allow individual developers or groups of developers to keep track of outstanding issues in their product effectively. We use Bugzilla to manage issues for the FreeBSD project. We are pleased to announce some developments on our issue management systems:

* We have made improvements to the AutoAssigner module (not yet deployed) that was previously developed by Marcus von Appen to assign port bugs to their maintainers by default, such as:
  + Improvements and bugfixes to port detection in the Summary: field of issues, so that issues are assigned to their maintainers more reliably.
  + Refactoring the code into a more modular form to make future development easier and faster.
* We have developed a new module (FBSDAttachment), which automates setting maintainer-approval flag values on attachments under most conditions. This will improve time to resolution and consistency of triage, and reduce manual effort by triagers and maintainers.
* We reported and upstreamed a number of bugs in Bugzilla, working with the upstream Bugzilla developers.

Open tasks:

1. Major improvements to templates for usability and simplicity.
2. Further improvements to automation (for example, additional processing of commit logs).
__________________________________________________________________

The FreeBSD Core Team

Contact: FreeBSD Core Team

Two major issues have occupied much of core's attention during the last quarter: the reorganisation of the Security Team and the question of whether to import GPLv3 licensed code into the source repository.

1.
The idea of reorganizing the Security team was first proposed to Core during a meeting at BSDCan this year by Gleb Smirnoff -- core member and newly-appointed deputy Security Officer (SO). The "Security Team", which previously could contain several people (a varying number over time, but more than two) has been refashioned into just two roles: Security Officer and Deputy Security Officer. Accordingly, the role of the SO team has been redefined to be the controller of the distribution of security sensitive information into and within the project: they are responsible for interfacing with external bodies and individuals reporting security problems to the project, and connecting those reports to the appropriate individuals within the project with the technical expertise to address the identified concerns. These changes will improve the project's responsiveness to security alerts, help maintain security on privileged information received in confidence before general publication and, not least, reduce the work load on the security officer. The SO team will continue to benefit from liaisons with the Core, Cluster Administration, and Release Engineering teams, and will be assisted by a secretary; they will also be able to obtain input and assistance in drafting security advisories from former and potential future (Deputy) Security Officers. Core would particularly like to thank the former members of the Security Team group for their past contributions, now that the Security Team role has been merged into the Security Officer's responsibilities.

2. The other large question concerning Core is how to provide a modern toolchain for all supported architectures. Tier 1 architectures are required to ship with a toolchain unencumbered by onerous license terms. This is currently provided for i386 and amd64 by the LLVM suite, including the Clang compiler, LLD and LLDB.
However, LLVM support for other (Tier 2 or below) architectures is not yet of sufficient quality to be viable, and the older but pre-existing GPLv2 toolchain cannot support some of the interesting new architectures such as arm64 and RISC-V. Pragmatically, in order for the project to support these architectures, until LLVM support arrives we must turn to the GNU project's GPLv3 licensed toolchain. The argument here is whether to import GPLv3 licensed code into the FreeBSD src repository with all of the obligations on patent terms and source code redistribution that would entail, not only for the FreeBSD project itself but for numerous downstream consumers of FreeBSD code. Not having a toolchain readily available is a big impediment to working on a new architecture. One potential solution is to create a range of "GPLv3 toolchain" base-system packages out of a completely separate source code repository, for instance within the FreeBSD area on GitHub. These would be distributed equivalently to the other base system binary packages when that mechanism is introduced. Core recognises that this is a decision with wide-ranging consequences and will be producing a position paper for circulation amongst all interested parties in order to judge community opinion on the matter. Core welcomes feedback from all interested parties on the subject.

Beyond these two big questions, Core has handled a number of other items:

* Core approved the formation of a wiki-admin team to take over managing the Wiki, to curate the Wiki content and work on navigation and organization of existing technical content and to evaluate new Wiki software with the aim of opening up the Wiki to contributions from the public.
* An external review board has been assembled to look at the Code of Conduct, including a mixture of project members and experts from external groups. The review process is getting under way and Core is awaiting their report.
* The standard documentation license was found to be unfit for its purpose, and the doceng group had temporarily reverted to the previous license while a new replacement was drafted. This new license is now the default for new documentation submissions. However, one factor emerging from this review was the difficulty of maintaining correct authorial attributions for sections of documentation, some of which may only be a few words long. Unlike source code, blocks of documentation are frequently moved around within individual files, or even between files. Consequently, Core would like to introduce a Voluntary Contribution Agreement along the lines of the one operated by the Apache Foundation. With this, copyrights are signed over to the FreeBSD Foundation, with individual contributions being recognised by recording names in a general "Authors" file. This will be another alternative alongside the existing copyright mechanisms used in the project. Core is interested to hear any opinions on the subject.
* Core approved the formation of a new "dev-announce" mailing list, which all FreeBSD committers should be members of. This will be a low-traffic moderated list to contain important announcements, heads-ups, warnings of code freezes, changes in policy and notifications of events that affect the project as a whole.
* Around eight years ago, an attempt was made to import the OpenBSD sensors framework. This was rejected at the time as potentially blocking the development of a better designed framework. However, no such development has occurred in the intervening time, whilst the sensors framework has been in use successfully by both OpenBSD and FreeNAS. Despite some concerns about the efficiency of the framework and potential impacts on power consumption and hence battery lifetime, Core is minded to approve the import, but wants to consult with interested developers first.
* Core is exploring the legal ramifications for the project of the "Right to Be Forgotten" established by the European Court of Justice.
* Core is also seeking an alternative means for holding their regular monthly conference calls. The current, paid-for, service has less than satisfactory sound quality and reliability, and Core would like to switch to a free video conferencing solution.

This quarter also saw a particularly large influx of new commit bit requests with, on occasion, four votes running simultaneously. Please welcome Kurt Lidl, Svatopluk Kraus, Michal Meloun, Jonathan Looney (Juniper), Daisuke Aoyama, Phil Shafer (Juniper), Ravi Pokala (Panasas), Anish Gupta and Mark Bloch (Mellanox) to the ranks of src committers. In addition, Core was delighted to restore commit privileges for Eric Melville after a hiatus of many years.

No commit bits were taken in during the quarter.

A non-committer account was approved for Kevin Bowling of Limelight Networks. Kevin will be doing systems administration work with clusteradm, with particular interest in the parts of the cluster that are now hosted in LLNW's facilities.

Deb Goodkin of the FreeBSD Foundation was added to the developers mailing list: she was one of the few members of the Foundation Board not already on the list, and having awareness of what is going on in the developer community will help her to support the project more effectively.
__________________________________________________________________

The FreeBSD Issue Triage Team

Contact: Bugmeister
Contact: Kubilay Kocak
Contact: Vladimir Krstulja
Contact: Rodrigo N. Hernandez

By the end of the Q4 2015 period, Kubilay Kocak (koobs@) started an initiative to form an experimental Bugzilla Triage Team. The main goals of the team are to increase community involvement (addition/training of new triagers) and enhance current procedures and tools, among others.
This experiment was started with the participation of Vladimir (blackflow on irc/freenode) and Rodrigo (DanDare on irc/freenode), who approached koobs@ with a desire to contribute and get more involved with the FreeBSD Project. This experimental pilot project has the task of setting up procedures for enhanced Issue (Problem Report) management that include better classification and prioritization, eventually leading to faster resolution of issues.

We are now happy to report on the progress of this experimental team:

* The #FreeBSD-bugs IRC channel has been set up on Freenode and we are successfully using it to exchange information about triage processes, ask for help, propose changes and discuss related topics.
* We have identified the primary role of an Issue Triage Team to be that of classification of problem reports of all kinds (currently limited mostly to ports and obvious src issues) and facilitation of issue assignment, which means making sure that the reported issues are explained well, contain all the appropriate information (or as much of it as possible), and are brought to the attention of the people who can act upon them.
* Vladimir and Rodrigo are successfully training in bug triage as well as porting processes (Vladimir is also taking maintainership of some ports).
* This experiment is benefiting from the introduction of newcomers to issue tracking. It naturally resulted in an entire review of the tracking process from its very elementary aspects. This "fresh eyes" participation spotted minor details during the process, giving the opportunity to scrutinize actual procedures on a number of smaller points, followed by proposals on how to improve the overall Issue Tracking and Management. The new ideas include both organizational and technical ideas and solutions, such as new or modified keywords or flags for better classification, the triage workflow, and Bugzilla technical improvements, among others.
* An important goal is producing documentation about best practices for using Bugzilla and issue management workflow. This documentation should be aimed not only at people directly engaged in issue triage tasks, but also at general users. Another relevant point is that feedback from the triage team can be used to improve Bugzilla in terms of adjusting existing features to best fit FreeBSD's needs, and the development of new features (please see Mahdi "Magic" Mokhtari's report on "Bugzilla improvements").
* We are still collating ideas in preparation for setting up a Wiki namespace for the overall topic of issue management, containing information for all the parties involved in issue tracking: from users (reporters) to maintainers and committers. The unorganized brainstorming document is linked in this report.

Since the Issue Triage Team is very young, we expect more information to be available and more actions to be reported in the next status report.

Open tasks:
1. Set up the Wiki namespace and organize the brainstorming document into a meaningful set of documents.
2. We are actively recruiting to grow our FreeBSD Triage Team. If you are interested in participating and contributing to one of the most important community-facing areas of the FreeBSD project, join #freebsd-bugs on Freenode IRC and let us know! Experience with issue tracking is desirable, but not required. No prior internal project knowledge or technical skills are required, just bring your communication skills and awesome attitude. Training is provided.
__________________________________________________________________

CAM I/O Scheduler

Links
BSDCan Paper URL: https://people.FreeBSD.org/~imp/bsdcan2015/iosched-v3.pdf
Phabricator Review URL: https://reviews.FreeBSD.org/D4609

Contact: Warner Losh

Reviews have begun on the CAM I/O scheduler that I wrote for Netflix. It is anticipated that this process will be done in time for the FreeBSD 11 branch.
Details about this work can be found in the linked BSDCan paper from last year. Briefly, the scheduler allows one to differentiate I/O types and limit I/O based on the type and characteristics of the I/Os (including the latency of recent requests relative to historical averages). This is most useful when tuning system loads to SSD performance. Both a simple default scheduler (the same one used in FreeBSD today) and a scheduler that can be tuned for system loads related to video streaming will be included.

This project is sponsored by Netflix, Inc.
__________________________________________________________________

Encrypted Kernel Crash Dumps

Links
Technical Details URL: https://lists.FreeBSD.org/pipermail/freebsd-security/2015-December/008780.html
Patch Review URL: https://reviews.FreeBSD.org/D4712

Contact: Konrad Witaszczyk

Kernel crash dumps contain information about currently running processes. This can include sensitive data, for example passwords kept in memory by a browser when a kernel panic occurred. An entity that can read data from a dump device or a crash directory can also extract this information from a core dump. To prevent this situation, the core dump should be encrypted before it is stored on the dump device.

This project allows a kernel to encrypt a core dump during a panic. A user can configure the kernel for encrypted dumps and save the core dump after reboot using the existing tools, dumpon(8) and savecore(8). A new tool, decryptcore(8), was added to decrypt the core files.

A patch has been uploaded to Phabricator for review. The patch is currently being updated to address the review comments, and should be committed as soon as it is accepted. For more technical details, please visit the FreeBSD-security mailing list archive or see the Phabricator review.
__________________________________________________________________

Jenkins Continuous Integration for FreeBSD

Links
The Jenkins CI Server in the FreeBSD Cluster URL: https://jenkins.FreeBSD.org
Portest Script URL: https://github.com/Ultima1252/portest
Jenkins Workflow Plugin URL: https://github.com/jenkinsci/workflow-plugin
Cloudbees URL: https://cloudbees.com
Jenkins Phabricator Plugin URL: https://github.com/uber/phabricator-jenkins-plugin
Phabricator Plugin Fixes URL: https://github.com/uber/phabricator-jenkins-plugin/pull/110
Durable Task Plugin Fixes URL: https://github.com/jenkinsci/durable-task-plugin/pull/14
Clang Scanbuild Plugin Fixes URL: https://github.com/jenkinsci/clang-scanbuild-plugin/commits/master
Multiple SCMs Plugin Fixes URL: https://github.com/jenkinsci/multiple-scms-plugin/commits/master
SCM Sync Configuration Plugin Fixes URL: https://github.com/jenkinsci/scm-sync-configuration-plugin/commits/master
Porting Jobs to the Workflow Plugin URL: https://lists.FreeBSD.org/pipermail/freebsd-testing/2016-January/001285.html
Akuma Fixes for FreeBSD URL: https://github.com/kohsuke/akuma/pull/9
Kyua Fix for Invalid Characters URL: https://github.com/jmmv/kyua/pull/148

Contact: Craig Rodrigues
Contact: Jenkins Administrators
Contact: FreeBSD Testing

The Jenkins Continuous Integration and Testing project has been helping to improve the quality of FreeBSD. Since the last status report, we have quickly found commits that caused build breakage or test failures. FreeBSD developers saw these problems and quickly fixed them. Some of the highlights include:

* Ricky Gallagher wrote a script named portest, which can take a patch to the FreeBSD ports tree as input, and can generate a sequence of commands to check out the ports tree from Subversion, apply the patch, and then invoke poudriere to build the affected part of the ports tree. Ricky consulted with Torsten Zühlsdorff during its development.
This script will be used later to test changes to the ports tree.
* Craig Rodrigues converted some Jenkins builds to use the Workflow plugin. Workflow is a plugin written by Jesse Glick and other developers at Cloudbees, the main company providing commercial support for Jenkins. With this plugin, a Jenkins job can be written in a Domain Specific Language (DSL) which is written in the Groovy scripting language. Workflow scripts are meant to provide sophisticated access to Jenkins functionality, in a simple scripting language. As Jenkins jobs get more complicated and have more interdependencies, jobs written in a DSL are easier to maintain than jobs created via menus. Craig Rodrigues worked with Jesse Glick to identify and fix a problem with the Durable Task plugin used by the Workflow plugin. This problem seemed to show up mostly on non-Linux platforms such as OS X and FreeBSD.
* Eitan Adler worked with Craig Rodrigues to test a Jenkins plugin written by Aiden Scandella at Uber which integrates Phabricator and Jenkins. With this plugin, if someone submits a code review with Phabricator's Differential tool, a Jenkins build with this code change will be triggered. The Phabricator code review would then be updated with the result of the build. Eitan Adler and Craig Rodrigues had some initial success testing this plugin using the FreeBSD docs repository, but this plugin still has a lot of hardcoded dependencies specific to Uber's environment which make it difficult to use out-of-the-box for FreeBSD. Alexander Yerenkow submitted some patches upstream to fix some of these problems, but this plugin still needs more work. Craig Rodrigues thinks that it might be better to write a workflow script to call Phabricator commands directly.
* Craig Rodrigues pushed fixes upstream to several plugins including:
  + SCM Sync configuration plugin
  + NodeLabel parameter plugin
  + Subversion plugin
  + Multiple SCMs plugin
  + Clang Scanbuild plugin
  Craig Rodrigues was granted commit access to the SCM Sync configuration plugin, Multiple SCMs plugin, and Clang Scanbuild plugin.
* Li-Wen Hsu set up multiple builds using jails on machines located at NYI and administered by the FreeBSD Cluster Administrators. One of these builds targets 64-bit ARM.
* Michael Zhilin fixed the Akuma library for FreeBSD. The Akuma library is used by Jenkins to determine what command-line arguments were passed to a running process. To fix it, Michael invoked a FreeBSD-specific sysctl() with KERN_PROC_ARGS to determine the arguments for a running pid. This fix allows a running Jenkins instance to restart itself after new plugins are installed.
* Julio Merino accepted a fix for Kyua from Craig Rodrigues to fix writing out XML characters to test report files.

Open tasks:
1. Work more on using the workflow plugin for various builds.
2. Set up a build to test bmake's meta-mode.
3. Finish off integration with Phabricator.
4. People interested in helping out should join the freebsd-testing@FreeBSD.org list.
__________________________________________________________________

Mellanox iSCSI Extensions for RDMA (iSER) Support

Links
GitHub repository URL: https://github.com/sagigrimberg/iser-FreeBSD

Contact: Max Gurtovoy
Contact: Sagi Grimberg

Building on the new in-kernel iSCSI initiator stack released in FreeBSD 10.0 and the recently added iSCSI offload interface, Mellanox Technologies has developed iSCSI extensions for RDMA (iSER) initiator support to enable efficient data movement using the hardware offload capabilities of Mellanox's 10, 40, 56, and 100 Gigabit InfiniBand (IB)/Ethernet adapters.

Remote Direct Memory Access (RDMA) has been shown to have great value for storage applications.
RDMA infrastructure provides benefits such as zero-copy, CPU offload, reliable transport, fabric consolidation, and many more. The iSER protocol eliminates some of the bottlenecks in the traditional iSCSI/TCP stack, provides low latency and high throughput, and is well suited for latency-aware workloads.

This work includes a new ICL module that implements the iSER initiator. The iSCSI stack is slightly modified to support some extra features such as asynchronous IO completions, unmapped data buffers, and data-transfer offloads. The user will be able to choose iSER as the iSCSI transport with iscsictl.

The project is in the process of being merged to FreeBSD 11-CURRENT and is expected to ship with FreeBSD 11.0.

This project is sponsored by Mellanox Technologies.
__________________________________________________________________

MIPS: Ralink/Mediatek Support

Links
GitHub Branch With Work in Progress URL: https://github.com/sgalabov/FreeBSD/tree/local/sgalabov_mtk

Contact: Stanislav Galabov

This project is aimed at adding FreeBSD support for Ralink/Mediatek's family of WiFi router system-on-chip (SoC) devices based on MIPS processors. These SoCs are commonly found in embedded network devices such as WiFi routers. Having support for these SoCs would allow FreeBSD to run on a number of additional low-cost devices, which could help spread FreeBSD's popularity in the embedded systems world.

The project currently aims to support the following Ralink/Mediatek chipsets: RT3050, RT3052, RT3350, RT3352, RT3662, RT3883, RT5350, RT6855, RT6856, MT7620, MT7621, MT7628 and MT7688.

The following functionality (where applicable) is currently planned to be supported: Interrupt controller, UART, GPIO, USB, PCI/PCIe, Ethernet, and SPI.

This project is sponsored by Smartcom - Bulgaria AD.

Open tasks:
1. Help with adding WiFi driver support (possibly to ral(4)) for the above SoCs would be greatly appreciated.
2.
Help with refactoring if_rt(4) to be usable on all of the above SoCs would be appreciated.
3. Help with testing target boards (e.g., WiFi routers) would be appreciated.
__________________________________________________________________

Multipath TCP for FreeBSD

Links
MPTCP for FreeBSD Repository URL: https://bitbucket.org/nw-swin/caia-mptcp-freebsd/
MPTCP for FreeBSD Project Website URL: http://caia.swin.edu.au/urp/newtcp/mptcp/

Contact: Nigel Williams

Multipath TCP (MPTCP) is an extension to TCP that allows for the use of multiple network interfaces on a standard TCP session. The addition of new addresses and scheduling of data across these occurs transparently from the perspective of the TCP application.

The goal of this project is to deliver an MPTCP kernel patch that interoperates with the reference MPTCP implementation, along with additional enhancements to aid network research.

A v0.51 release has been tagged in our repository, with some minor improvements over v0.5.

We have now removed much of the MPTCP code that was inside the functions tcp_do_segment, tcp_output, and other code used for standard TCP connections. The goal of this is to restrict the added MPTCP code to just MPTCP connections, leaving regular TCP connections using the existing code.

We are currently in the process of implementing a subflow socket buffer upcall and event processing. These will handle changes in subflow socket state, MP-signalling, and incoming data segments. This also requires some re-working of the MP option processing, particularly how incoming DSN maps are parsed and stored for use during MP-layer reassembly.

We are also looking at how our changes might take advantage of the new TCP stack modularisation enhancements to create subflow-specific TCP functions.

This project is sponsored by The Cisco University Research Program Fund at Community Foundation Silicon Valley, and The FreeBSD Foundation.

Open tasks:
1.
Complete the implementations of subflow event processing and new option parsing.
2. Update documentation and task lists.
__________________________________________________________________

OpenBSM

Links
OpenBSM: Open Source Basic Security Module (BSM) Audit Implementation URL: http://www.openbsm.org
OpenBSM on GitHub URL: https://github.com/openbsm/openbsm
FreeBSD Audit Handbook Chapter URL: https://www.FreeBSD.org/doc/en_US.ISO8859-1/books/handbook/audit.html

Contact: Christian Brueffer
Contact: Robert Watson
Contact: TrustedBSD audit mailing list

OpenBSM is a BSD-licensed implementation of Sun's Basic Security Module (BSM) API and file format. It is the user-space side of the CAPP Audit implementations in FreeBSD and Mac OS X. Additionally, the audit trail processing tools are expected to work on Linux.

Progress has been slow but steady this quarter, culminating in OpenBSM 1.2 alpha 4, the first release in three years. It features various bug fixes and documentation improvements; the complete list of changes is documented in the NEWS file on GitHub. The release was imported into FreeBSD head and merged to FreeBSD 10-STABLE. As such, it will be part of FreeBSD 10.3-RELEASE.

Open tasks:
1. Test the new release on different versions of FreeBSD, Mac OS X, and Linux. In particular, testing on Mac OS X 10.9 (Mavericks) and newer would be greatly appreciated.
2. Fix problems that have been reported via GitHub and the FreeBSD bug tracker.
3. Implement features mentioned in the TODO list on GitHub.
__________________________________________________________________

Raspberry Pi: VideoCore Userland Application Packaging

Contact: Mikaël Urankar
Contact: Oleksandr Tymoshenko

The Raspberry Pi SoC consists of two parts: ARM and GPU (VideoCore). Many interesting features like OpenGL, video playback, and HDMI controls are implemented on the VideoCore side and can be accessed from the OS through libraries provided by Broadcom (userland repo).
These libraries were ported to FreeBSD some time ago, so Mikaël created the port misc/raspberrypi-userland for them. He also created a port for omxplayer (a low-level video player that utilizes VideoCore APIs) and is working on a port for Kodi (formerly XBMC), a more user-friendly media player with Raspberry Pi support.
__________________________________________________________________

RCTL Disk IO Limits

Contact: Edward Tomasz Napierala

An important missing piece of the RCTL resource limits mechanism was the ability to limit disk throughput. This project aims to fill that hole by making it possible to add RCTL rules for read bytes per second (BPS), write BPS, read I/O operations per second (IOPS), and write IOPS. It also adds a new throttling mechanism to delay process execution when a limit is reached.

The project is at the late implementation stage. The major piece of work left apart from testing is to integrate it with ZFS. The project is expected to ship with FreeBSD 11.0.

This project is sponsored by The FreeBSD Foundation.
__________________________________________________________________

Root Remount

Links
Commit to Head URL: https://svnweb.freebsd.org/base?view=revision&revision=290548
reboot(8) Manual Page Changes URL: https://svnweb.freebsd.org/base/head/sbin/reboot/reboot.8?r1=290548&r2=290547&pathrev=290548

Contact: Edward Tomasz Napierala

One of the long-missing features of FreeBSD was the ability to boot up with a temporary rootfs, configure the kernel to be able to access the real rootfs, and then replace the temporary root with the real one. In Linux, this functionality is known as pivot_root. The reroot project provides similar functionality in a different, slightly more user-friendly way: rerooting.
Simply put, from the user point of view it looks like the system performs a partial shutdown, killing all processes and unmounting the rootfs, and then partial bringup, mounting the new rootfs, running init, and running the startup scripts as usual.

The project is finished. All the relevant code has been committed to FreeBSD 11-CURRENT and is expected to ship with FreeBSD 11.0.

This project is sponsored by The FreeBSD Foundation.
__________________________________________________________________

Routing Stack Update

Links
Initial Proposal URL: http://wiki.freebsd.org/ProjectsRoutingProposal

Contact: Alexander Chernikov

The projects/routing Subversion branch is a FreeBSD routing system rework aimed at providing performance, scalability and the ability to add advanced features to the routing stack.

The current packet output path suffers from excessive locking. Acquiring and releasing four distinct contested locks is required to convert a packet to a frame suitable to put on the wire. The first project goal is to reduce the number of locks needed to just two rmlock(9)s for the output path, which permits close-to-linear scaling.

Since September, one of the locks (used to protect link-level entries) has been completely eliminated from the packet data path. A new routing API was introduced, featuring better scalability and hiding routing internals. Most of the consumers of the old routing API were converted to use the new API.
__________________________________________________________________

The Graphics Stack on FreeBSD

Links
Graphics Stack Roadmap and Supported Hardware Matrix URL: https://wiki.FreeBSD.org/Graphics
Ports Development Tree on GitHub URL: https://github.com/FreeBSD/freebsd-ports-graphics

Contact: FreeBSD Graphics team

Several important ports were updated: Mesa to 11.0.8, the X.Org server to 1.17.4, libdrm to 2.4.65, as well as many applications and libraries. The latest release of the X.Org server, 1.18, is being tested in our Ports development tree.
On the kernel side, the i915 update is almost ready to land. There are a couple of known regressions for currently supported GPUs that we want to fix before committing.

We started a discussion on the FreeBSD-x11@ mailing list to organize future contributions to the kernel drivers. We have already received some valuable comments. We are confident that future updates will happen at a faster pace, thanks to several motivated people!

FOSDEM is held in Brussels on the 30th and 31st of January. We will attend this conference. It will be a perfect time to see people again from FreeBSD and from the XDC. On Sunday, we will give a talk about how to contribute to the Graphics Stack.

Our blog is currently down because the service was discontinued. We hope to get a dump of our data to put it back online elsewhere. Unfortunately, there is no ETA for this item.

Open tasks:
1. See the "Graphics" wiki page for up-to-date information.
__________________________________________________________________

The nosh Project

Links
Introduction URL: http://homepage.ntlworld.com./jonathan.deboynepollard/Softwares/nosh.html
FreeBSD binary packages URL: http://homepage.ntlworld.com./jonathan.deboynepollard/Softwares/nosh/freebsd-binary-packages.html
Installation How-To URL: http://homepage.ntlworld.com./jonathan.deboynepollard/Softwares/nosh/timorous-admin-installation-how-to.html
Roadmap URL: http://homepage.ntlworld.com./jonathan.deboynepollard/Softwares/nosh/roadmap.html
Commands URL: http://homepage.ntlworld.com./jonathan.deboynepollard/Softwares/nosh/commands.html
A Slightly Outdated User Guide URL: http://homepage.ntlworld.com./jonathan.deboynepollard/Softwares/nosh/guide/index.html
The Supervision Mailing List URL: https://www.mail-archive.com/supervision@list.skarnet.org/

Contact: Jonathan de Boyne Pollard

The nosh project is a suite of system-level utilities for initializing, running, and shutting down BSD systems, and for managing daemons, terminals, and logging.
It supersedes BSD init and the NetBSD rc.d system, drawing inspiration from Solaris SMF for named milestones, daemontools-encore for service control/status mechanisms, UCSPI, and IBM AIX for separated service and system management. It comprises a range of compatibility mechanisms, including shims for familiar commands from other systems, and an automatic import mechanism that takes existing configuration data from /etc/fstab, /etc/rc.conf{,.local}, /etc/ttys, and elsewhere, applying them to its native service definitions and creating additional native services. It is portable (including to Linux) and composable, it provides a migration path from the world of systemd Linux, and it does not require new kernel APIs. It provides clean service environments, orderings and dependencies between services, parallelized startup and shutdown (including fsck), strictly size-capped and autorotated logging, the service manager as a "subreaper", and uses kevent(2) for event-driven parallelism.

Since the last status report, in October 2015, the project has seen: the complete replacement of its event-handling subsystem on Linux; the introduction of tools for exporting cyclog/multilog logs via RFC 5426 to remote log handlers (such as logstash); and the switching of the user-mode virtual terminal subsystem on BSD to using USB devices directly, a more powerful device interface than sysmouse et al. because it permits directly positioning touch devices for mice and other things (thus permitting "mouse integration" under VirtualBox for those who run PC-BSD/FreeBSD on VirtualBox virtual machines), but sysmouse et al. can still be used if desired.

In version 1.24, released shortly before publication of this report, there are extensive additions for supporting a purely-ZFS system with an empty /etc/fstab (as the PC-BSD 10.2 system installer creates), and the ability to convert systemd unit files' process priority settings to BSD's rtprio/idprio.
Version 1.24 also sees a large chunk taken out of the remainder of the on-going project to create enough native service bundles and ancillary utilities to entirely supplant the rc.d system. The progress of this project has been open from the start, and can be followed on the nosh roadmap web page. As of version 1.24, there are a mere 27 items remaining out of the original target list of 157, with a 28th and a 29th (from PC-BSD 10.2) added. Items crossed off by version 1.24 include (amongst others) mfs support for /tmp, static ARP and networking, persistent entropy for the randomness subsystem, pefs, and hald.

The remaining items in the task list are mostly aimed at making the overall system integration cleaner and friendlier to modern systems. We are also interested in receiving suggestions, bug reports, and other feedback from users. Try following the how-to guide and see how things go!

Open tasks:
1. Add kernel support for passing a -b option to PID 1, and support for a boot_bare variable in the loader, to allow "emergency" (where no shell dotfiles are loaded) and "rescue" mode bootstraps, akin to Linux. (History: the -b mechanism and idea date back to version 2.57d of Miquel van Smoorenburg's System 5 init clone, dated 1995-12-03, and was already known as "emergency boot" by 1997.)
2. Add support to FreeBSD's fsck(8) for outputting machine-readable progress reports to a designated file descriptor, so that nosh can provide progress bars for multiple fscks running in parallel. nosh already provides this functionality on Linux, where fsck(8) does provide machine-readable output.
3. Identify when the configuration import system needs to be triggered, such as when bsdconfig alters configuration files, and create the necessary hooks to import external configuration changes into nosh.
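
As a rough illustration of the second open task, the sketch below shows how a supervisor could turn machine-readable progress reports from several parallel fscks into per-device progress bars. The report format (one line of "pass current maximum device" per update) and every name in the code are assumptions made for this example; the format is not something FreeBSD's fsck(8) emits today, and the code is not taken from nosh.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Progress:
    """Latest progress seen for one device (hypothetical fields)."""
    fsck_pass: int
    current: int
    maximum: int

    @property
    def fraction(self) -> float:
        # Guard against a zero maximum before the first real report.
        return self.current / self.maximum if self.maximum else 0.0


def parse_line(line: str) -> tuple[str, Progress]:
    """Parse one assumed '<pass> <current> <maximum> <device>' report."""
    fsck_pass, current, maximum, device = line.split(maxsplit=3)
    return device.strip(), Progress(int(fsck_pass), int(current), int(maximum))


def aggregate(lines) -> dict[str, Progress]:
    """Keep only the most recent report for each device."""
    latest: dict[str, Progress] = {}
    for line in lines:
        if line.strip():
            device, progress = parse_line(line)
            latest[device] = progress
    return latest


def render_bar(device: str, progress: Progress, width: int = 20) -> str:
    """Render a fixed-width text progress bar for one device."""
    filled = int(progress.fraction * width)
    bar = "#" * filled + "-" * (width - filled)
    return f"{device} pass {progress.fsck_pass} [{bar}] {progress.fraction:.0%}"


if __name__ == "__main__":
    # Invented reports, interleaved as they might arrive from two fscks.
    reports = [
        "1 10 100 /dev/ada0p2",
        "1 50 200 /dev/ada1p1",
        "1 75 100 /dev/ada0p2",
    ]
    for device, progress in sorted(aggregate(reports).items()):
        print(render_bar(device, progress))
```

In a real supervisor the lines would be read from the designated file descriptors as they become readable, rather than from a list.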
__________________________________________________________________

UEFI Boot and Framebuffer Support

Contact: Ed Maste

A number of UEFI bug fixes were committed over the last quarter, further improving compatibility with different UEFI implementations. Specifically: on some implementations, FreeBSD failed to boot with an "ExitBootServices() returned 0x8000000000000002" error. This has been fixed with a retry loop (as required by UEFI) in r292515 and r292338.

UEFI improvements from other developers have recently been committed or are in progress. These include support for environment variables set on the EFI loader command line, improved text console mode setting, support for nvram variables, and root-on-ZFS support.

This project is sponsored by The FreeBSD Foundation.

Open tasks:
1. Test FreeBSD-CURRENT snapshots on a variety of UEFI implementations.
2. Merge UEFI changes to stable/10 for FreeBSD 10.3-RELEASE.
__________________________________________________________________

Chelsio iSCSI Offload Driver (Initiator and Target)

Links
Commit Adding Hardware Acceleration Support URL: https://svnweb.freebsd.org/changeset/base/292740

Contact: Navdeep Parhar

A new driver, cxgbei, enabling hardware accelerated iSCSI with Chelsio's T5- and T4-based offload-capable cards, has been committed to head. Both Initiator and Target are supported. The wire traffic is standard iSCSI (SCSI over TCP as per RFC 3720, etc.) so an Initiator/Target using this driver will interoperate with all other standards-compliant implementations.

Hardware assistance provided by the T5 and T4 ASICs includes:

* Complete TCP processing.
* iSCSI PDU identification and extraction from the byte oriented TCP stream.
* Header and/or data digest generation and verification.
* Zero copy support for both transmit and receive.

This project is sponsored by Chelsio Communications.

Open tasks:
1. The cxgbei(4) man page is missing but will be committed shortly.
2.
The driver is in advanced stage QA and will see some bugfixes and performance enhancements in the very near future. MFC is possible as soon as the QA cycle completes. __________________________________________________________________ FreeBSD Integration Services (BIS) Links FreeBSD Virtual Machines on Microsoft Hyper-V URL: https://wiki.FreeBSD.org/HyperV Linux and FreeBSD Virtual Machines on Hyper-V URL: https://technet.microsoft.com/en-us/library/dn531030.aspx Contact: Dexuan Cui Contact: Hongjiang Zhang When FreeBSD virtual machines (VMs) run on Hyper-V, using Hyper-V synthetic devices is recommended to get the best network and storage performance and make full use of all the benefits that Hyper-V provides. The collection of drivers that are required to run Hyper-V synthetic devices in FreeBSD are known as FreeBSD Integration Services (BIS). Some of the BIS drivers (like network and storage drivers) have existed in FreeBSD 9.x and 10.x for years, but there are still some performance and stability issues and bugs. Compared with Windows and Linux VMs, the current BIS lacks some important features, such as virtual Receive Side Scaling (vRSS) support in the Hyper-V network driver and support for UEFI VM (boot from UEFI), among others. We are now working more on the issues and performance tuning to make FreeBSD VMs run better on Hyper-V and the Hyper-V based cloud platform Azure. Our work during 2015Q4 is documented below: * Optimizing the VMBus driver and Hyper-V network driver for performance: + Sent out patches to enable INTR_MPSAFE for the interrupt handling thread, speed up relid-to-channel lookup in the thread by map table, and optimize the VMBus ringbuffer writable notification to the host. + Developing a patch to enable the virtual Receive Side Scaling (vRSS) for Hyper-V network device driver. This will greatly improve the network performance for SMP virtual machine (VM). 
+ Sent out a patch to enable the Hyper-V timer, which will improve the accuracy of timekeeping when FreeBSD VMs run on Hyper-V. * Fixing bugs and cleaning up the code: + Fixed a bug in checksum offloading (PR 203630 -- [Hyper-V] [nat] [tcp] 10.2 NAT bug in TCP stack or hyperv netsvc driver) in the Hyper-V network driver, making FreeBSD VM based NAT gateways work more reliably. + Fixed a serialization issue in the initialization of VMBus devices, fixing PR 205156 ([Hyper-V] NICs' (hn0, hn1) MAC addresses can appear in an uncertain way across reboot). + Fixed a KVP (Key-Value Pair) issue (retrieving a key's value can hang for an uncertain period of time). + Added ioctl support for SIOCGIFMEDIA for the Hyper-V network driver, fixing PR 187006 ([Hyper-V] dynamic address (DHCP) obtaining does not work on HYPER-V OS 2012 R2). + Sent out patches to add an interrupt counter for Hyper-V VMBus interrupts (so the user can easily get statistical information about VMBus interrupts), and fix the KVP daemon's poll timeout (so the daemon will avoid unnecessary polling every 100 milliseconds). + Identified a TSC calibration issue: the i8254 PIT timer emulation of Hyper-V is not fully reliable, so the Hyper-V time counter should be used to calibrate the TSC. A patch was drafted. With the patch, it appears that the warning kernel messages (e.g., "calcru: runtime went backwards from 46204978 usec to 23362331 usec for pid 0 (kernel)") will go away, and the time-based tracing of DTrace will be more accurate. * We plan to add support for UEFI VMs (Hyper-V Generation-2 VMs). Some issues and to-do items have been identified. For example, we cannot use the i8254 PIT to calibrate the TSC because the i8254 PIT does not exist in a UEFI VM, and we need to add support for the Hyper-V synthetic keyboard/mouse/framebuffer device.
* We are working on a disk detection issue: when a FreeBSD VM runs on a Windows Server 2016 Technical Preview host, the VM will detect 16 disks when only one disk is configured for the VM. VMs running on these hosts can fail to boot. A workaround patch was created and we are trying to make a formal fix. * We are tidying up some internal BIS test cases and plan to publish them on github. This project is sponsored by Microsoft. __________________________________________________________________ FreeBSD Xen Links FreeBSD PVH DomU Wiki Page URL: http://wiki.xen.org/wiki/FreeBSD_PVH FreeBSD PVH Dom0 Wiki Page URL: http://wiki.xen.org/wiki/FreeBSD_Dom0 FreeBSD/Xen HVMlite Implementation URL: http://xenbits.xen.org/gitweb/?p=people/royger/freebsd.git;a=shortlog;h=refs/heads/new_entry_point_v5 Contact: Roger Pau Monné Contact: Wei Liu Xen is a hypervisor using a microkernel design, providing services that allow multiple computer operating systems to execute on the same computer hardware concurrently. Xen support for FreeBSD on x86 as a guest was introduced in version 8, and ARM support is currently being worked on. Support for running FreeBSD as an amd64 Xen host (Dom0) is available in head. The x86 work done during this quarter has been focused on rewriting the PVH implementation inside of Xen, into what is now being called HVMlite to differentiate it from the previous PVH implementation. The Xen-side patches have already been committed to the Xen source tree, and will be available in Xen 4.7, the next version. Work has also begun on implementing HVMlite Dom0 support, although no patches have yet been published. HVMlite support for FreeBSD has not yet been committed, although an initial implementation is available in a personal git repository. The plan is to completely replace PVH with HVMlite on FreeBSD as soon as HVMlite supports Dom0 mode. Apart from this, Wei Liu is working on improving netfront performance on FreeBSD.
Initial patches have been posted to the FreeBSD review system. The x86 unmapped bounce buffer code has also been improved, and unmapped IO support has been added to the blkfront driver. This project is sponsored by Citrix Systems R&D. Open tasks: 1. Finish HVMlite Dom0 support inside of Xen. 2. Deprecate and remove PVH support from Xen. 3. Remove PVH support from FreeBSD and switch to HVMlite. 4. Generalize the event channel code so it can be used on ARM. 5. Improve the performance of the various backends (netback, blkback). __________________________________________________________________ Improvements to the QLogic HBA Driver Contact: Alexander Motin The QLogic HBA driver, isp(4), received a substantial set of changes. The primary goal was to make the Fibre Channel target role work well with CTL, but many other things were also fixed/improved: * Added support for modern 16Gbps 26xx FC cards. * The firmware images in ispfw(4) were updated to the latest versions. * Target role support was fixed and tested for all FC cards from ancient 1Gbps 22xx to modern 16Gbps 26xx. * Port database handling was unified for target and initiator roles, allowing an HBA port to play both roles at the same time. * The maximal number of ports was increased from 256 to 1024. * Multi-ID (NPIV) functionality was fixed/implemented, allowing 24xx and above cards to provide up to 255 virtual FC ports per physical port. * Added support for 8-byte LUNs for 24xx and above cards. The code is committed to FreeBSD head and stable/10 branches. This project is sponsored by iXsystems, Inc. Open tasks: 1. NVRAM data reading is hackish and requires rework. 2. FCoE support for 26xx cards has not been tested yet.
__________________________________________________________________ iMX.6 Video Output Support Links Commit Adding Basic Video Support URL: https://svnweb.FreeBSD.org/changeset/base/292574 Contact: Oleksandr Tymoshenko iMX.6 is a family of SoCs used in multiple hobbyist ARM boards such as the Hummingboard, RIoTboard, and Cubox. Most of these products have HDMI output, but until recently, FreeBSD did not benefit from it. As of r292574, there is basic video output support so you can use the console on iMX6-based boards and probably run Xorg (not yet tested). Due to the lack of some kernel functionality (see open tasks), the only supported mode is 1024x768. Open tasks: 1. Proper pixel clock initialization (relies on a clock framework). 2. More flexible video output path (support multiple IPUs and DIs). __________________________________________________________________ ioat(4) Driver Enhancements Links Wikipedia on I/OAT URL: https://en.wikipedia.org/wiki/I/O_Acceleration_Technology Last quarter's ioat(4) report URL: https://www.FreeBSD.org/news/status/report-2015-07-2015-09.html#ioat%284%29-Driver-Import Contact: Conrad Meyer I/OAT DMA engines are bulk memory operation offload engines built into some Intel Server/Storage platform CPUs. Several enhancements were made to the driver. It now avoids memory allocation in locked paths, which should avoid deadlocking in memory pressure scenarios. Support for Broadwell-EP devices has been added. The "blockfill" operation and a non-contiguous 8 KB copy operation have been added to the API. The driver can recover from various programming errors by resetting the hardware. This project is sponsored by EMC / Isilon Storage Division. Open tasks: 1. XOR and other advanced ("RAID") operation support.
__________________________________________________________________ Kernel Vnode Cache Tuning Links MFC to stable/10 URL: https://reviews.FreeBSD.org/rS292895 Contact: Kirk McKusick Contact: Bruce Evans Contact: Konstantin Belousov Contact: Peter Holm Contact: Mateusz Guzik This completed project includes changes to better manage the vnode freelist and to streamline the allocation and freeing of vnodes. Vnode cache recycling was reworked to meet free and unused vnode targets. Free vnodes are rarely completely free; rather, they are just ones that are cheap to recycle. Usually they are for files which have been stat'd but not read; these usually have inode and namecache data attached to them. The free vnode target is the preferred minimum size of a sub-cache consisting mostly of such files. The system balances the size of this sub-cache with its complement to try to prevent either from thrashing while the other is relatively inactive. The targets express a preference for the best balance. "Above" this target there are 2 further targets (watermarks) related to the recycling of free vnodes. In the best-operating case, the cache is exactly full, the free list has size between vlowat and vhiwat above the free target, and recycling from the free list and normal use maintains this state. Sometimes the free list is below vlowat or even empty, but this state is even better for immediate use, provided the cache is not full. Otherwise, vnlru_proc() runs to reclaim enough vnodes (usually non-free ones) to reach one of these states. The watermarks are currently hard-coded as 4% and 9% of the available space. These, and the default of 25% for wantfreevnodes, are too large if the memory size is large. For example, 9% of 75% of MAXVNODES is more than 566000 vnodes to reclaim whenever vnlru_proc() becomes active. The vfs.vlru_alloc_cache_src sysctl is removed.
The new code frees namecache sources as the last chance to satisfy the highest watermark, instead of selecting source vnodes randomly. This provides good enough behavior to keep vn_fullpath() working in most situations. Filesystem layouts with deep trees, where the removed knob was required, are thus handled automatically. As the kernel allocates and frees vnodes, it fully initializes them on every allocation and fully releases them on every free. These are not trivial costs: it starts by zeroing a large structure, then initializes a mutex, a lock manager lock, an rw lock, four lists, and six pointers. Looking at vfs.vnodes_created, these operations are being done millions of times an hour on a busy machine. As a performance optimization, this code update uses the uma_init and uma_fini routines to do these initializations and cleanups only as the vnodes enter and leave the vnode zone. With this change, the initializations are done kern.maxvnodes times at system startup, and then only rarely again. The frees are done only if the vnode zone shrinks, which never happens in practice. For those curious about the avoided work, look at the vnode_init() and vnode_fini() functions in sys/kern/vfs_subr.c to see the code that has been removed from the main vnode allocation/free path. __________________________________________________________________ Mellanox Drivers Links Hardware Information URL: http://www.mellanox.com/page/ethernet_cards_overview Commit Adding the Driver URL: https://svnweb.FreeBSD.org/changeset/base/290650 Contact: Hans Petter Selasky The Mellanox FreeBSD team is proud to announce support for the ConnectX-4 series of network cards in FreeBSD 11-current and FreeBSD 10-stable. These devices deliver top performance, with up to 100GBit/s of raw transfer capacity, and support both Ethernet and Infiniband. Currently, the Ethernet driver is ready for use and the Infiniband support for ConnectX-4 is making good progress. 
We hope that it will be complete before FreeBSD 11.0 is released. For more technical information, refer to the mlx5en(4) manual page in 11-current. The new driver for ConnectX-4 cards is called mlx5 and is put under /sys/dev and not under /sys/ofed as was done for the previous mlx4 driver. The mlx5en(4) kernel module is compiled by default in GENERIC kernels. This project is sponsored by Mellanox Technologies. __________________________________________________________________ Minimal Kernel with PNP-Based Autoloading Links Blog Post URL: http://bsdimp.blogspot.com/2016/01/details-on-coming-automatic-module.html Contact: Warner Losh Work on automatically loading modules based on the plug-and-play data from devices that are scanned and found to not already have a driver attached is in progress. Digging this information out from kernel modules, as well as tagging relevant bits of driver tables, has been committed. PC Card, USB, and some PCI devices now have these markings. This data is stored in a file that the kernel, boot loader, and userland processes all can access. When complete, a user will be able to run a minimal kernel (currently checked in as the MINIMAL config). Devices necessary for booting will be loaded by loader(8). Other devices may be loaded there, or early in the boot (depending on which gives better performance). Users will still be able to run more monolithic configurations, as well as limit which kernel modules are available as can be done today, though without the convenience that automatic loading will provide. This work remains ongoing. Open tasks: 1. Go through all the simplebus drivers and add plug-and-play information there. Some additional minor simplebus functionality is needed. There is some work in progress for this. 2. Go through all the PCI drivers and add plug-and-play information to them.
Unlike PC Card or USB, the PCI bus does not have a stylized table of PCI IDs, so each driver invents its own method, meaning that the semi-mechanical conversion that was done with PC Card and USB will not be possible. Instead, customized code for each driver will be needed. Since a large number of drivers have their own device tables, the work will be primarily writing a description of the current table style. 3. Run-time parsing and loading is still needed. __________________________________________________________________ MMC Stack Under CAM Framework Links Project Information URL: https://bakulin.de/freebsd/mmccam.html Source Code URL: https://github.com/kibab/FreeBSD/tree/mmccam Patch for Review URL: https://reviews.FreeBSD.org/D4761 Contact: Ilya Bakulin The goal of this project is to reimplement the existing MMC/SD stack using the CAM framework. This will permit utilizing the well-tested CAM locking model and debug features. It will also be possible to process interrupts generated by the inserted card, which is a prerequisite for implementing the SDIO interface. The first version of the code was uploaded to Phabricator for review. The new stack is able to attach to the SD card and bring it to an operational state so it is possible to read and write to the card. The only supported SD controller driver is ti_sdhci, which is used on the BeagleBone Black. Modifying other SDHCI-compliant drivers should not be difficult. Open tasks: 1. Rework bus/target/LUN enumeration and the locking model. I do not really understand the CAM locking and am likely to do it incorrectly. 2. Modify the SDHCI driver on at least one x86 platform. This will make development and collaboration easier. 3. Begin implementing SDIO-specific bits. 
__________________________________________________________________ ntb_hw(4)/if_ntb(4) Driver Synced up to Linux Links Jon Mason's NTB wiki URL: https://github.com/jonmason/ntb/wiki Intel NTB whitepaper URL: https://www-ssl.intel.com/content/dam/www/public/us/en/documents/white-papers/xeon-c5500-c3500-non-transparent-bridge-paper.pdf Contact: Conrad Meyer ntb_hw(4) is now up-to-date with the Linux NTB driver as of the work-in-progress 4.4 kernel (and in fact contains some fixes that haven't landed in the mainline Linux tree yet but will land in 4.5). Only Back-to-back ("B2B") configurations are supported at this time. Going forward, newer hardware may only support the B2B configuration. if_ntb(4) is mostly up-to-date with the Linux NTB netdevice driver. Notably absent is support for changing the MTU at runtime. This project is sponsored by EMC / Isilon Storage Division. Open tasks: 1. Improving if_ntb(4) to avoid using the entire Base Address Register (BAR) when very large BAR sizes are configured (e.g., 512 GB). 2. Improving pmap_mapdev(9) to somehow allocate only superpage mappings for large BARs, on platforms that support superpages. (NTB BARs can be as large as 512 GB.) __________________________________________________________________ Out of Memory Handler Rewrite Contact: Konstantin Belousov The Out of Memory (OOM) code is intended to handle the situation where the system needs free memory to make progress, but no memory can be reused. Most often, the situation is that to free memory, the system needs more free memory. Consider a case where the system needs to page-out dirty pages, but needs to allocate structures to track the writes. OOM "solves" the problem by killing some selection of user processes. In other words, it avoids a system deadlock at the cost of a partial loss of user data. The assumption is that it is better to kill a process and recover data in other processes than to lose everything.
Free memory in the FreeBSD Virtual Memory (VM) system appears from two sources. One is the voluntary reclamation of pages used by a process, for example unmapping private anonymous regions, or the last unlink of an otherwise unreferenced file with cached pages. Another source is the pagedaemon, which forcefully frees pages which carry data, of course after the data is moved to some other storage, like swap or file blocks. OOM is triggered when the pagedaemon definitely cannot free memory to satisfy the requests. The old criterion to trigger the OOM action was a combination of low free swap space and a low count of free pages (the latter is expressed precisely with the paging targets constants, but this is not relevant to the discussion). That test is mostly incorrect. For example, a low free page state might be caused by a greedy consumer allocating all pages freed by the page daemon in the current pass, but this does not preclude the page daemon from producing more pages. Also, since page-outs are asynchronous, the previous page daemon pass might not immediately produce any free pages, but they would appear some short time later. More seriously, low swap space does not necessarily indicate that we are in trouble: lots of pages might not require swap allocations to be freed, like clean pages or pages backed by files. The last point is serious, since swap-less systems were considered as having full swap. Instead of trying to deduce the deadlock from looking at the current VM state, the new OOM handler tracks the history of page daemon passes. Only when several consecutive passes failed to meet the paging target is an OOM kill considered necessary. The count of consecutive failed passes was selected empirically, by testing on small (32M) and large (512G) machines. Auto-tuning of the counter is possible, but requires some more architectural changes to the I/O subsystem. Another issue was identified with the algorithm which selects a victim process for OOM kill.
It compared the counts of page table entries (PTEs) installed into the machine paging structures. For different reasons, the machine-dependent VM code (pmap) may remove the PTE for a memory-resident page. Under some circumstances related to other measures to prevent low memory deadlock, very large processes which consume all system memory could have few or no PTEs. The old OOM selector ignored the process which caused the deadlock, killing unrelated processes. A new function, vm_pageout_oom_pagecount(), was written which applies a reasonable heuristic to estimate the number of pages freed by killing the given process. This eliminates the effect of selecting small unrelated processes for OOM kill. The rewrite was committed to head in r290917 and r290920. This project is sponsored by The FreeBSD Foundation. __________________________________________________________________ sendfile(2) Improvements Links Commit to Head URL: https://svnweb.FreeBSD.org/base?view=revision&revision=293439 Slides URL: http://www.slideshare.net/facepalmtarbz2/new-sendfile-in-english Presentation (in Russian) URL: https://events.yandex.ru/lib/talks/2682/ Contact: Gleb Smirnoff The sendfile(2) system call was introduced in 1998 as an alternative to a traditional read(2)/write(2) loop, speeding up server performance by a factor of ten at the time. Since it was adopted by all major operating systems, it is now used by any serious web server software. Wherever there is high traffic, there is sendfile(2) under the hood. Now, with FreeBSD 11, we are making the next revolutionary step in serving traffic. sendfile(2) no longer blocks waiting on disk I/O. Instead, it immediately returns control to the application, performing the necessary I/O in the background. The original sendfile(2) waited for the disk read operation to complete and then put the data that was read into the socket, then returned to userspace.
If a web server served thousands of clients with thousands of requests, it was forced to spawn extra contexts from which to run sendfile(2) to avoid stalls. Alternatively, it could use special tricks like the SF_NODISKIO flag that forces sendfile(2) to serve only content that is cached in memory. Now, these tricks are in the past, and a web server can simply use sendfile(2) as it would use write(2), without any extra care. The new sendfile cuts out the overhead of extra contexts, short writes, and extra syscalls to prepopulate the cache, bringing performance to a new level. The new syscall is built on top of two newly-introduced kernel features. The first is an asynchronous VM pager interface and the corresponding VOP_GETPAGES_ASYNC() file system method for UFS. The second is the concept of "not ready" data in sockets. When sendfile(2) is called, first VOP_GETPAGES_ASYNC() is called, which dispatches I/O requests for completion. Buffers with pages to be populated are put into the socket buffer, but flagged as not-yet-ready. Control immediately returns to the application. When the I/O is finished, the buffers are marked as ready, and the socket is activated to continue transmission. Additional features of the new sendfile are new flags that provide the application with extra control over the transmitted content. Now it is possible to prevent caching of content in memory, which is useful when it is known that the content is unlikely to be reused any time soon. In such cases, it is better to let the associated storage be freed, rather than putting the data in cache. It is also possible to specify a readahead with every syscall, if the application can predict client behavior. The new sendfile(2) is a drop-in replacement, API and ABI compatible with the old one. Applications do not even need to recompile to benefit from the new implementation. This work is a joint effort between two companies: NGINX, Inc., and Netflix. There were many people involved in the project. 
At its initial stage, before code was written, the idea of such an asynchronous drop-in replacement was discussed amongst Gleb Smirnoff, Scott Long, Konstantin Belousov, Adrian Chadd, and Igor Sysoev. The initial prototype was coded by Gleb under the supervision of Kostik on the VM parts of the patch, and under constant pressure from Igor, who demanded that nginx be capable of running with the new sendfile(2) with no modifications. The prototype demonstrated good performance and stability and quickly went into Netflix production in late 2014. During 2015, the code matured and continued serving production traffic at Netflix. Scott Long, Randall R. Stewart, Maksim Yevmenkin, and Andrew Gallatin added their contributions to the code. Now we are releasing the code behind our success to the FreeBSD community, making it available to all FreeBSD users worldwide! This project is sponsored by Netflix and NGINX, Inc. Open tasks: 1. SSL_sendfile() -- an extension to the new sendfile(2) that allows uploading session keys to the kernel, and then using sendfile(2) on an SSL-enabled socket. __________________________________________________________________ sysctl Enhancements Links Wikipedia Entry on C99 Fixed-Width Integer Types URL: https://en.wikipedia.org/wiki/C_data_types#Fixed-width_integer_types sysctl(8) -t Submission PR URL: https://bugs.FreeBSD.org/bugzilla/show_bug.cgi?id=203918 Contact: Conrad Meyer Contact: Ravi Pokala Contact: Marcelo Araujo Support was added for fixed-width sysctls (signed and unsigned 8-bit, 16-bit, 32-bit, and 64-bit integers). The new KPIs are documented in the sysctl(9) manual page. The sysctl(8) command line tool supports all of the new types. sysctl(8) gained the -t flag, which prints sysctl type information (the original patch was submitted by Yoshihiro Ota). This support includes the newly added fixed-width types. This project is sponsored by EMC / Isilon Storage Division.
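To make the eight fixed-width sysctl types above concrete, here is a minimal sketch of how a userland consumer might decode the raw bytes of such a value and compute each type's range. It uses the C99 type names as dictionary keys and assumes the value arrives in native byte order; the helper names are invented for this example and are not part of sysctl(8), sysctl(9), or any FreeBSD library.

```python
import struct

# C99 fixed-width type name -> struct format code. The "=" prefix used below
# selects native byte order with standard sizes (1, 2, 4 and 8 bytes).
# The mapping and helper names are illustrative assumptions, not a FreeBSD API.
FIXED_WIDTH_FORMATS = {
    "int8_t": "b", "uint8_t": "B",
    "int16_t": "h", "uint16_t": "H",
    "int32_t": "i", "uint32_t": "I",
    "int64_t": "q", "uint64_t": "Q",
}


def decode_fixed_width(type_name, raw):
    """Decode the raw bytes of one fixed-width integer value."""
    fmt = "=" + FIXED_WIDTH_FORMATS[type_name]
    size = struct.calcsize(fmt)
    if len(raw) != size:
        raise ValueError("%s needs %d bytes, got %d" % (type_name, size, len(raw)))
    return struct.unpack(fmt, raw)[0]


def value_range(type_name):
    """Return the (minimum, maximum) value representable by the type."""
    bits = 8 * struct.calcsize("=" + FIXED_WIDTH_FORMATS[type_name])
    if type_name.startswith("u"):
        return 0, 2 ** bits - 1
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


if __name__ == "__main__":
    for name in sorted(FIXED_WIDTH_FORMATS):
        print(name, value_range(name))
```

The same byte pattern decodes differently depending on signedness (0xff is 255 as uint8_t and -1 as int8_t), which is why type information such as that printed by sysctl(8) -t matters to a consumer of raw values.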
__________________________________________________________________ Touchscreen Support for Raspberry Pi and Beaglebone Black Links Beaglebone Black with 4DCAPE-43T Demo URL: http://kernelnomicon.org/?p=534 Input Stack Plans URL: https://wiki.FreeBSD.org/201510DevSummit/GraphicsStack evdev Port URL: https://wiki.FreeBSD.org/SummerOfCode2014/evdev_Touchscreens Contact: Oleksandr Tymoshenko There are two working proof-of-concept drivers for the AM335x touchscreen and for the official Raspberry Pi touchscreen LCD. Proper touchscreen support would consist of a userland event reading API, a kernel event reporting API, and kernel hardware drivers for specific devices. There is an ongoing effort to port the Linux evdev API to FreeBSD so applications that use libraries like libinput or tslib could be used without any major changes. Since it is not yet complete, I created a naive evdev-like API for both kernel and tslib and was able to run a demo on a Beaglebone Black with 4DCAPE-43T. Once evdev makes it into the tree, both hardware drivers can be modified to include "report events" portions and committed. __________________________________________________________________ armv6 Hard Float Default ABI Links Blog Entry URL: http://bsdimp.blogspot.com/2015/12/hard-float-api-coming-soon-by-default.html Contact: Warner Losh Work on moving armv6 from a "soft float" ABI (but still using hardware floating point) to a fully "hardware float" ABI moves forward. The ability to have both soft and hard ABI libraries on the same system is now functional. All armv6 and armv7 systems we support have hardware floating point capabilities. We currently use the floating-point hardware, but with a slightly suboptimal ABI, for compatibility with older versions of FreeBSD. The ABI differences are only at the userspace level -- the kernel does not care what floating-point ABI is used, and both types of binaries can run at the same time.
The run-time linker now knows if a binary uses the hardware float ABI or the software float ABI by examining some fields in the ELF header. The linker uses different paths and config files for hard versus soft binaries. The rc system has been enhanced to load the software float paths. ldconfig now understands soft libraries in much the same way that it understands 32-bit libraries on 64-bit systems. No additional kernel support was necessary for this, apart from a minor patch to pass the ELF header information to the binary, which has been in the tree since last summer. The experimental armv6hf MACHINE_ARCH will be retired after a transition period. It will cease to mean anything different from armv6 after the build system changes go in. Support for building soft-float ABI libraries will remain in the tree, to support the WITH_LIBSOFT build option. Open tasks: 1. Complete documentation needs to be written. 2. Hooks into the FreeBSD build system to generate soft float and transition to hard float after a flag day need to be polished up and committed. 3. A number of different upgrade/coexistence scenarios need to be tested, and a full package run needs to be done to assess the latest state of the ports tree. This work should be completed by the end of January. __________________________________________________________________ FreeBSD on Marvell Armada38x Contact: Marcin Wojtas Contact: Michal Stanek Contact: Bartosz Szczepanek Contact: Jan Dabros FreeBSD has been ported to run on the Marvell Armada38x platform. This SoC family boasts single/dual high-performance ARM Cortex-A9 CPUs. The multi-user SMP system is fully working and has been tested on Marvell DB-88F6288-GP and SolidRun ClearFog development boards. The root filesystem can be hosted on a USB 3.0/2.0 drive or via NFS using a PCIe network card. Experimental support is available for on-chip Gigabit Ethernet (NETA). 
Additional features: * GIC+MPIC cascaded interrupts courtesy of INTRNG * CESA dual-channel cryptographic engine * USB 3.0 and 2.0 * PCIe 2.0 * I2C * GPIO * Watchdog * RTC The port is under community review and will be integrated into head soon. This project is sponsored by Stormshield, and Semihalf. Open tasks: 1. Optimize performance of NETA and prepare for submission. __________________________________________________________________ FreeBSD on Newer ARM Boards Links FreeBSD on Odroid-C1 URL: https://wiki.FreeBSD.org/FreeBSD/arm/Odroid-C1 Commit Adding Glue Driver URL: https://svnweb.FreeBSD.org/changeset/base/291683 Contact: John Wehle Contact: Ganbold Tsagaankhuu We made the changes required to support the Amlogic Meson Ethernet controller on the Hardkernel ODROID-C1 board, which has an Amlogic aml8726-m8b SoC. The main effort needed was to write a glue driver for the Ethernet controller -- the Amlogic Meson Ethernet controller is compatible with Synopsys DesignWare 10/100/1000 Ethernet MAC (if_dwc). __________________________________________________________________ FreeBSD on SoftIron Overdrive 3000 Links SoftIron Website URL: http://softiron.co.uk/products/ Contact: Andrew Turner The SoftIron Overdrive 3000 is an ARMv8 based server with an 8-core AMD Opteron A1100 processor. The Overdrive 3000 has two 10Gbase-T Ethernet ports, two PCI Express ports, and eight SATA ports. FreeBSD has been updated to be able to boot on this hardware. Support for the SATA device was added to the ahci(4) driver. Unlike on x86, this is a Memory Mapped (mmio) device, and not on the PCI bus. To support this, a new ahci mmio driver attachment has been added. The generic PCIe driver has been updated to improve interrupt handling. This includes supporting the interrupt-map devicetree property, and supporting MSI and MSI-X interrupts on arm64. Support for MSI and MSI-X interrupts has been added to the ARM Generic Interrupt Controller v2 (gicv2) driver. 
This allows devices to use these interrupts. This has been tested with a collection of PCIe NIC hardware.

This project is sponsored by SoftIron Inc.

Open tasks:

1. Write a driver for the 10Gbase-T NIC.

__________________________________________________________________

FreeBSD/arm64

Links
FreeBSD arm64 Wiki Entry
URL: https://wiki.FreeBSD.org/arm64

Contact: Andrew Turner
Contact: Konstantin Belousov
Contact: Ed Maste
Contact: Ed Schouten

Support was added for kernel modules. This included adding the needed relocation types to the in-kernel relocator, and updating the build logic to build modules for arm64. CTF data is currently not generated for modules due to a linker bug.

Shared page support was added. This allows gettimeofday(2) to be implemented in userland by directly accessing the timer register. This reduces the overhead of these calls as we no longer need to call into the kernel. This also moves the signal trampoline code away from the stack, allowing for the stack to become non-executable.

CloudABI support for arm64 was added. This included moving the machine-independent code into a separate file to be shared among all architectures. An issue in the arm64 kernel was found and fixed thanks to the CloudABI test suite.

Self-hosted poudriere package builds have been tested. These complement the previous build strategy of using qemu usermode emulation. With this combination of self-hosted and qemu usermode building, many ports that used to be broken on arm64 have been fixed, resulting in over 17,000 ports building for the architecture.

The machine-dependent portion of kernel support for single-stepping userland binaries has been started. This will allow debuggers like lldb to step through an application while debugging.

Many small fixes have been made to FreeBSD/arm64.
These include fixing stack tracing through exceptions, printing more information about "data abort" kernel panics, cleaning up the atomic functions, supporting multi-pass driver attachment, fixing userland stack alignment, cleaning up early page table creation, fixing asynchronous software trap handling, and enabling interrupts in exception handlers.

This project is sponsored by The FreeBSD Foundation, and ABT Systems Ltd.

__________________________________________________________________

FreeBSD/RISC-V

Links
Project Wiki
URL: https://wiki.FreeBSD.org/riscv

Contact: Ruslan Bukin
Contact: Ed Maste
Contact: Arun Thomas

We have begun work on support for the RISC-V architecture. RISC-V is a new ISA designed to support computer architecture research and education that is now set to become a standard open architecture for industry implementations.

A minimal set of changes needed to compile the kernel toolchain has been committed, along with machine headers, run-time linker (rtld-elf) support, and libc/libstand.

All development has been happening in a separate branch, with a goal of moving development to head in a few weeks. At present, FreeBSD/RISC-V boots to multiuser in the Spike simulator.

This project is sponsored by DARPA, AFRL, and HEIF5.

Open tasks:

1. We plan to commit the rest of userspace (i.e., libc), kernel support, etc., in a few weeks.

__________________________________________________________________

Improvements for ARMv6/v7 Support

Contact: Dominik Ermel
Contact: Wojciech Macek
Contact: Zbigniew Bodek

Numerous improvements for the ARMv6/v7 kernel and tools have been developed by the Semihalf team. Those include:

* Fixes for KGDB support.
* Support for branch instructions in ptrace single stepping.
* Fixes for kernel minidumps.
* Improvements for LIBUSBBOOT.
* Support for Exynos EHCI in the loader.
* A fix for instruction single stepping in DDB.
* Support for hardware watchpoints, including watchpoints on SMP systems.
* Single stepping using the ARM Debug Architecture.
* Support for gzip-compressed kernel modules in kldload.
* Backport of the new pmap VM code to FreeBSD 10-STABLE (not yet sent to upstream).

Most of the introduced changes have been committed to head and more are on the way.

This project is sponsored by Juniper Networks Inc., and Semihalf.

Open tasks:

1. Finish upstreaming the hardware watchpoints support.

__________________________________________________________________

Base System Build Improvements

Links
FreeBSD-Arch Post Describing Plans
URL: https://lists.FreeBSD.org/pipermail/freebsd-arch/2015-December/017571.html
BSDCan 2014 META_MODE Presentation
URL: http://www.bsdcan.org/2014/schedule/events/460.en.html
WITH_FAST_DEPEND Details
URL: https://svnweb.FreeBSD.org/base?view=revision&revision=290433
WITH_CCACHE_BUILD Details
URL: https://svnweb.FreeBSD.org/base?view=revision&revision=290526

Contact: Bryan Drewery

Bryan Drewery (bdrewery@) has been working to improve the build framework as well as buildworld build times. The build system has been largely untouched by large-scale changes for many years. Most of the effort has been on improving the recent META_MODE merge that was presented at BSDCan 2014. This is a new build system that is not currently enabled by default but brings many benefits. Beyond that, some highlights of the work changing buildworld are:

* WITH_FAST_DEPEND, which avoids calling "mkdep" during the make depend phase and instead generates dependency files during compilation. The old scheme was pre-processing all source files twice. The new version saves 16-35% in build times.
* WITH_CCACHE_BUILD adds built-in ccache support, avoiding many of the historical pitfalls of changing CC in make.conf to use ccache.
* Many improvements for parallelization of the build.
* LIBADD improvements to ensure proper usage of this tool to replace duplicate LDADD and DPADD statements. Further work is under way to reduce overlinking.
* A lot of cleanup of improper framework usage.
* Ensuring that installing files from the build tree fails if the destination directory is missing, rather than installing a file as the directory name.

This project is sponsored by EMC / Isilon Storage Division.

Open tasks:

1. See the FreeBSD-arch mail for more information on planned work.

__________________________________________________________________

ELF Tool Chain Tools

Links
ELF Tool Chain Website
URL: http://elftoolchain.sourceforge.net

Contact: Ed Maste

The ELF Tool Chain project provides BSD-licensed implementations of compilation tools and libraries for building and analyzing ELF objects. The project began as part of FreeBSD but later became an independent project in order to encourage wider participation from others in the open-source developer community.

In the last quarter of 2015 the ELF Tool Chain tools were updated to a snapshot of upstream Subversion revision 3272. Improvements include better input file validation, RISC-V support, support for Xen ELF notes, additional MIPS and ARM relocations, better performance, and bug fixes.

The ELF Tool Chain project is planning a new release in the first quarter of 2016, which will facilitate wider testing and use by projects in addition to FreeBSD.

This project is sponsored by The FreeBSD Foundation.

Open tasks:

1. Add missing functionality (PE/COFF support) to elfcopy and migrate the base system build.
2. Fix issues found by fuzzing inputs to the tools.
3. Add automatic support for separate debug files.

__________________________________________________________________

The LLDB Debugger

Links
FreeBSD LLDB Wiki Page
URL: https://wiki.FreeBSD.org/lldb

Contact: Ed Maste

LLDB is the debugger from the LLVM family of projects. Originally developed for Mac OS X, it now also supports FreeBSD, NetBSD, Linux, Android, and Windows. It builds on existing components in the larger LLVM project, for example using Clang's expression parser and LLVM's disassembler.
LLDB in the FreeBSD base system was upgraded to version 3.7.0 as part of the Clang and LLVM upgrade, and it will similarly be upgraded again to 3.8.0 for FreeBSD 11.0-RELEASE.

LLDB is now enabled by default on the amd64 and arm64 platforms. It is now a functional basic debugger on arm64, after a number of fixes were made in the last quarter to both LLDB and the FreeBSD kernel.

This project is sponsored by The FreeBSD Foundation.

Open tasks:

1. Rework the LLDB build to use LLVM and Clang shared libraries.
2. Port a remote debugging stub to FreeBSD.
3. Add support for local and core file kernel debugging.
4. Improve support on architectures other than amd64 and arm64.

__________________________________________________________________

Updates to GDB

Links
New 1:1-Only Thread Target for FreeBSD
URL: https://github.com/bsdjhb/gdb/tree/freebsd-threads

Contact: John Baldwin

The KGDB option is now on by default in the devel/gdb port.

Changes to support cross-debugging of crashdumps in libkvm were committed to head in r291406.

A new thread target for FreeBSD that is suitable for merging upstream has been written and lightly tested. However, it is not yet available as an option in the port. This thread target uses ptrace(2) directly rather than libthread_db and as such supports threads on all ABIs (such as FreeBSD/i386 binaries on FreeBSD/amd64 and possibly Linux binaries, though that is not yet tested). It also requires less-invasive changes in the MD targets in GDB compared to the libthread_db-based target.

Open tasks:

1. Add a port option for the new 1:1-only thread target.
2. Test the new 1:1-only thread target.
3. Figure out why the powerpc kgdb targets are not able to unwind the stack past the initial frame.
4. Add support for more platforms (arm, mips, aarch64) to upstream gdb for both userland and kgdb.
5. Add support for debugging powerpc vector registers.
__________________________________________________________________

Bringing GitLab into the Ports Collection

Links
PR for the New Port
URL: https://bugs.FreeBSD.org/bugzilla/show_bug.cgi?id=202468
Installation Guide
URL: https://github.com/t-zuehlsdorff/gitlabhq/blob/8-3-docu/doc/install/installation-freebsd.md
Upstream GitLab website
URL: https://github.com/gitlabhq/gitlabhq/

Contact: Torsten Zühlsdorff

GitLab is a web-based Git repository manager with many features that is used by more than 100,000 organizations including NASA and Alibaba. It is also a very long-standing entry on the "Wanted Ports" list of the FreeBSD Wiki.

In the last quarter, there was steady progress in the project itself and the porting. The current release of GitLab 8.3 is now based on Rails 4.2, which removes the need for around 50 new ports. Now there are only 5 dependencies left to be committed!

While the new version of GitLab 8.3 eases the porting, there are big changes since the last working port of GitLab 7.14. Nonetheless, the next working port can be expected in the first quarter of 2016.

This project is sponsored by anyMOTION GRAPHICS GmbH, Düsseldorf, Germany.

Open tasks:

1. Update the patches from GitLab 7.14 to 8.3.
2. Update the documentation.
3. Provide an updated patch.

__________________________________________________________________

GNOME on FreeBSD

Links
FreeBSD Gnome Website
URL: http://www.FreeBSD.org/gnome
Devel Repository
URL: https://github.com/FreeBSD/freebsd-ports-gnome
Upstream Build Bot
URL: https://wiki.gnome.org/Projects/Jhbuild/FreeBSD
USE_GNOME Porter's Handbook Chapter
URL: https://www.FreeBSD.org/doc/en_US.ISO8859-1/books/porters-handbook/using-gnome.html

Contact: FreeBSD GNOME Team

The FreeBSD GNOME Team maintains the GNOME, MATE, and CINNAMON desktop environments and graphical user interfaces for FreeBSD. GNOME 3 is part of the GNU Project. MATE is a fork of the GNOME 2 desktop.
CINNAMON is a desktop environment using GNOME 3 technologies but with a GNOME 2 look and feel.

This quarter, due to limited available time there was not much progress. This began to change in December, when work started on porting MATE 1.12 and CINNAMON 2.8 to FreeBSD.

Open tasks:

1. The FreeBSD GNOME website is stale. Work is under way to improve it.
2. Continue working on investigating the issues blocking GNOME 3.18.

__________________________________________________________________

IPv6 Promotion Campaign

Links
Wiki Page
URL: https://wiki.FreeBSD.org/IPv6PortsTODO

Contact: Torsten Zühlsdorff

There are more and more machines on the internet that only support IPv6. I manage some of them, and was regularly hit by missing IPv6 support when fetching the distfiles needed for building ports.

I did some research into the impact of missing IPv6 support on the ports tree. The results are that 10,308 of 25,522 ports are not fetchable when using IPv6. This renders, through dependencies, a total of 17,715 ports unbuildable from IPv6-only systems.

All you can do then is wait and hope that distcache.FreeBSD.org caches the distfile. But this will take some time, which might not be a luxury available when a piece of software in use is hit by a security issue.

Based on the research, a promotion campaign for IPv6 was started. Some volunteers will contact the relevant system administrators and try to convince them to support IPv6. This will start in January 2016 and will hopefully create some progress soon.
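The step from "not fetchable" to "unbuildable through dependencies" can be illustrated with a small sketch. This is not the script used for the research above; the port names, the dependency map, and the function name `unbuildable_ports` are hypothetical and only show how one unfetchable distfile blocks every port that depends on it, directly or indirectly.

```python
from collections import defaultdict, deque


def unbuildable_ports(depends_on, unfetchable):
    """Return the unfetchable ports plus every port that depends on one of
    them, directly or indirectly."""
    # Invert the graph: for each port, record which ports depend on it.
    dependents = defaultdict(set)
    for port, deps in depends_on.items():
        for dep in deps:
            dependents[dep].add(port)

    # Breadth-first walk upward from each unfetchable port.
    blocked = set(unfetchable)
    queue = deque(unfetchable)
    while queue:
        current = queue.popleft()
        for parent in dependents[current]:
            if parent not in blocked:
                blocked.add(parent)
                queue.append(parent)
    return blocked


# Hypothetical toy tree: only devel/libbar has a distfile host without IPv6.
DEPENDS_ON = {
    "www/app": ["devel/libfoo", "lang/python"],
    "devel/libfoo": ["devel/libbar"],
    "devel/libbar": [],
    "lang/python": [],
    "editors/tool": ["lang/python"],
}
UNFETCHABLE = ["devel/libbar"]

if __name__ == "__main__":
    for port in sorted(unbuildable_ports(DEPENDS_ON, UNFETCHABLE)):
        print(port)
```

In this toy tree a single unfetchable port blocks three ports in total, which is the same effect that turns 10,308 unfetchable ports into 17,715 unbuildable ones in the figures above.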
__________________________________________________________________

KDE on FreeBSD

Links
KDE on FreeBSD Website
URL: https://FreeBSD.kde.org/
Experimental KDE Ports Staging Area
URL: https://FreeBSD.kde.org/area51.php
KDE on FreeBSD Wiki
URL: https://wiki.FreeBSD.org/KDE
KDE/FreeBSD Mailing List
URL: https://mail.kde.org/mailman/listinfo/kde-FreeBSD
Development Repository for Integrating KDE Frameworks 5 and Plasma 5
URL: http://src.mouf.net/area51/log/branches/plasma5

Contact: KDE on FreeBSD team

The KDE on FreeBSD team focuses on packaging and making sure that the experience of KDE and Qt on FreeBSD is as good as possible.

The team kept busy during the last quarter of 2015. Quite a few big updates were committed to the ports tree, and a few more are being worked on in our experimental repository.

As in previous quarters, we would like to thank several people who have contributed with machines, patches, and general help. Tobias Berner, Guido Falsi (madpilot@), Adriaan de Groot, Ralf Nolden, Steve Wills (swills@), and Josh Paetzel (jpaetzel@) have been essential to our work.

The following big updates landed in the ports tree this quarter. In many cases, we have also contributed patches to the upstream projects.

* CMake 3.4.0 and 3.4.1
* Calligra 2.9.1, the latest release of the integrated work applications suite. Calligra had last been updated in the ports tree at the end of 2013!
* PyQt4 4.11.4, QScintilla2 2.9.1 and SIP 4.17.
* PyQt5 5.5.1. Thanks to the work spearheaded by Guido Falsi and Tobias Berner in the previous quarter, the PyQt5 ports have finally been committed to the ports tree. Not only was this long-awaited on its own, it allows other ports to be updated to their latest versions.
* QtCreator 3.5.1 and 3.6.0.
* A couple of Qt5 packaging bugs were fixed: it should now be more straightforward to use the Qt5 ports to build software outside the ports tree, and it is now possible to build ports that require a C++11 compiler and Qt5 on FreeBSD 9.x.
Work on updating the Qt5 ports to their latest version, as well as porting KDE Frameworks 5 and Plasma 5 to FreeBSD, is well under way in our experimental area51 repository. At the moment, it contains Qt5 5.5.1, KDE Frameworks 5.17.0, Plasma 5.5.1 and KDE Applications 15.12.0.

Users interested in testing those ports are encouraged to follow the instructions on our website and report their results to our mailing list. Qt5 5.5.1 is in our "qt-5.5" branch, and Plasma 5 and the rest are in the "plasma5" branch (which also contains Qt 5.5.1).

Open tasks:

1. Commit the Qt5 5.5.1 update.
2. Land the KDE Frameworks 5 and Plasma 5 ports in the tree.
3. Investigate what needs to be done to make QtWebEngine, the Chromium-based replacement for QtWebKit, work on FreeBSD.

__________________________________________________________________

Linux Kernel as a Library Added to the Ports Collection

Links
Upstream LKL Github repository
URL: https://github.com/lkl/linux

Contact: Conrad Meyer

LKL ("Linux Kernel as a Library") is a special "architecture" of the full Linux kernel that builds as a userspace library on various platforms, including FreeBSD. One application of such a library is using Linux filesystem drivers to implement a FUSE backend.

fusefs-lkl's lklfuse binary is such a FUSE filesystem. It can mount ext4/3/2, XFS, and BTRFS read-write, using the native drivers from Linux.

sysutils/fusefs-lkl can now be installed either from packages or ports, providing access to these filesystems on FreeBSD via FUSE.

__________________________________________________________________

LXQt on FreeBSD

Links
FreeBSD LXQt Project
URL: https://wiki.FreeBSD.org/LXQt
LXQt Devel Repository
URL: https://www.assembla.com/spaces/lxqt/subversion/source

Contact: Olivier Duchateau

LXQt is the Qt port and the upcoming version of LXDE, the Lightweight Desktop Environment. It is the product of the merge between the LXDE-Qt and the Razor-qt projects.
The porting effort remains very much a work in progress: it needs some components of Plasma 5, the new major KDE workspace.

Currently, only the 0.10 branch is functional. See our wiki page for a complete list of applications.

We also sent updates for some components of LXDE, required for the LXQt desktop:

* x11/menu-cache 1.0.1
* x11/lxmenu-data 0.1.4

Binary packages are available (only for test purposes) which are regularly tested with the KDE development repository.

Open tasks:

1. Port libsysstat to BSD systems.
2. Fix some issues that need to be resolved, especially the shutdown and reboot commands.

__________________________________________________________________

New Tools to Enhance the Porting Experience

Links
pytoport: Generate FreeBSD Ports from Python modules on PyPI
URL: https://github.com/FreeBSD/pytoport
bandar: Create Development Overlays for the Ports Tree
URL: https://github.com/bbqsrc/bandar
skog: Generate Visual Dependency Trees for FreeBSD Ports
URL: https://github.com/bbqsrc/skog-python
spdx-lookup: SPDX License List Query Tool
URL: https://github.com/bbqsrc/spdx-lookup-python

Contact: Brendan Molloy

When I started working on ports for FreeBSD in the last couple of weeks, I found that my workflow was not as efficient as it could be using just the available tools, so I made a few that could be useful to the development community at large. All of these have been or will soon be added to the Ports tree, so you can play with them today!

pytoport is a command-line application that generates a skeleton port for a given PyPI package name. It attempts to generate the correct dependencies, makes a good attempt at guessing the license using spdx-lookup, and generates a pkg-descr. This made generating the fifteen or so ports I was working on a complete breeze.

While doing this, however, I noticed that some ports were bringing in dependencies that I did not expect, and I needed some way to visualise this.
skog builds a dependency tree from the depends lists output by the Ports framework, and displays it on the command line (with extra shiny output if you are using UTF-8). No more pesky example and documentation dependencies being dragged in when you clearly toggled that OPTION as far off as it would go.

While doing all of this, I found it cumbersome to be copying ports back and forth between my small development tree living in git and the larger upstream SVN tree I was using in poudriere. I built a tool called bandar that takes advantage of the FUSE version of unionfs to easily overlay my dev tree on the upstream tree, run lint checks, poudriere, and generate archives with ease.

I am very impressed with how easy it was to build more tooling for FreeBSD. I hope some of these tools will be of some use to you, and as always, I'd love to hear your feedback!

Open tasks:

1. Improve skog to support searching a tree for a certain port.
2. Get the bandar port completed.
3. Continue to improve pytoport, adding trove support and better dependency handling.
4. Patches welcome for all of the above!

__________________________________________________________________

Node.js Modules

Links
Node.js Modules Repository
URL: https://www.assembla.com/spaces/cozycloud/subversion/source

Contact: Olivier Duchateau

Node.js is a platform built on Chrome's JavaScript runtime for easily building fast, scalable network applications. It uses an event-driven, non-blocking I/O model that makes it lightweight and efficient -- perfect for data-intensive real-time applications that run across distributed devices.

The goal of this project is to make it easy to install the modules available in the npm package registry.
Currently, the repository contains slightly fewer than 300 new ports, in particular:

* Socket.IO, a library for realtime web applications
* Jison, a JavaScript parser generator

We have improved the USES framework:

* Users can define which version of Node.js will be installed through /etc/make.conf.
* node-gyp is now well-integrated into the USES framework, via the build argument.
* The pkg-plist is now automatically generated to make portlint happy.

Each port is up-to-date.

Open tasks:

1. Update the pre-draft documentation.
2. Bring in grunt.js (and modules), the JavaScript task runner.

__________________________________________________________________

Ports Collection

Links
Ports Collection Landing Page
URL: http://www.FreeBSD.org/ports/
Contributor's Guide
URL: https://www.freebsd.org/doc/en/articles/contributing/ports-contributing.html
Ports Monitoring Service
URL: http://portsmon.FreeBSD.org/index.html
Ports Management Team Website
URL: http://www.FreeBSD.org/portmgr/index.html
Portmgr on Facebook
URL: http://www.facebook.com/portmgr

Contact: Frederic Culot
Contact: FreeBSD Ports Management Team

As of the end of the fourth quarter, the ports tree holds a bit more than 25,000 ports, and the PR count is around 2,000. The activity on the ports tree remains steady, with about 7,000 commits performed by almost 120 active committers.

On the problem reports front, figures show an encouraging trend, with a significant increase in the number of PRs fixed during Q4. Indeed, almost 1,800 reports were fixed, an increase of about 20% compared to Q3.

In Q4, eight commit bits were taken in for safekeeping, following an inactivity period of more than 18 months (lioux, lippe, simon, jhay, max, sumikawa, alexey, sperber). Three new developers were granted a ports commit bit (Kenji Takefu, Carlos Puga Medina, and Ian Lepore), and one returning committer (miwi) had his commit bit reinstated.
Also related to the management of ports commit bits, nox's grants were revoked, since the FreeBSD developers learned that Juergen Lock had passed away.

On the management side, no changes were made to the portmgr team during Q4.

On the QA side, 33 exp-runs were performed to validate sensitive updates or cleanups. Notable among those changes are the updates of GCC to 4.9, CMake to 3.4.1, PostgreSQL to 9.4, and ruby-gems to 2.5.0. Some infrastructure changes included the usage of a WRKSRC different from WRKDIR when NO_WRKSUBDIR is set, the removal of bsd.cpu.mk from sys.mk, and the move of QT_NONSTANDARD to bsd.qt.mk.

Open tasks:

1. We would like to remind everyone that the ports tree is built and run by volunteers, and any help is greatly appreciated. While Q4 saw a significant increase in the number of problem reports fixed, we encourage all ports committers to have a look at the issues reported by our users and try to fix as many as possible. Many thanks to all who made a contribution during Q4, and keep up the good work in 2016!

__________________________________________________________________

Supporting Variants in the Ports Framework

Links
Poudriere PoC with Variants
URL: https://github.com/bbqsrc/poudriere/compare/master...feature/variants
Ports Makefile PoC with Examples
URL: https://gist.github.com/bbqsrc/e7e3a54d84706485aa3a

Contact: Brendan Molloy

I recently became involved with FreeBSD (as in, the last 2-3 weeks), and found myself quickly involved with Ports development. What struck me immediately was the difficulty in providing a Python package that was depended upon by multiple versions of Python. As it turns out, poudriere can currently only generate one package per port, meaning that a Python version-neutral (compatible with 2.x and 3.x) port cannot be packaged for each variant at the same time.
I discussed the issue with Kubilay Kocak, who suggested that I look into implementing a "variants protocol" within the Ports framework and the necessary changes to poudriere to allow a port to generate more than one package.

Support for variants is strongly needed in Ports and provides significant benefits.

* It would allow Python and other languages to provide packages for dependencies for multiple language versions from the same port.
* It alleviates the need for so-called "slave ports", as a single port could now generate multiple packages.
* It would have a very small impact on the greater Ports ecosystem: adding only two new variables, VARIANT and VARIANTS.
* It would provide a more consistent approach between different packaging teams for handling variations.

For a simple example, editors/vim-lite could be folded into the editors/vim port, while still generating a vim and vim-lite package. For Python, VARIANTS can be derived from the already used USES flags and generate compatible packages. py27-foobar and py34-foobar could now be consistently generated by poudriere without issue.

Fortunately, this is not a wishful thinking piece. I dug in my heels and have implemented a proof of concept of variants in the Ports framework, including the necessary modifications to poudriere in order to support it. It was mildly upsetting to find that poudriere is mostly written in Bourne shell scripts, but I pressed on nonetheless.

I started with the prototype made by Baptiste Daroussin as a base, and built from there. The poudriere PoC aims to limit changes as much as possible to merely adding support for the new variants flags, while also at the request of Kubilay Kocak making the logging output more package-centric (as opposed to port-centric) as a result of these changes.

This is a work in progress, and I would love to hear your feedback.
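As a rough illustration of the idea, and not the actual proof-of-concept code, the sketch below expands one port into one build job per variant. The function name `plan_builds`, the port origin `devel/py-foobar`, and the "variant-name" package naming rule are assumptions made for this example; only the VARIANT and VARIANTS variable names and the py27-foobar/py34-foobar package names come from the text above.

```python
def plan_builds(origin, base_name, variants=""):
    """Return one (origin, VARIANT, package name) job per variant.

    `variants` mimics a whitespace-separated VARIANTS value. The
    "<variant>-<name>" naming rule is an assumption for illustration.
    """
    names = variants.split()
    if not names:
        # A port without variants still yields exactly one package.
        return [(origin, None, base_name)]
    return [(origin, variant, f"{variant}-{base_name}") for variant in names]


if __name__ == "__main__":
    # One port, two generated packages.
    for origin, variant, package in plan_builds(
        "devel/py-foobar", "foobar", "py27 py34"
    ):
        print(f"{origin}: VARIANT={variant} -> {package}")
```

The point of the sketch is that the port stays the unit of maintenance while the package becomes the unit of building, which is why the logging output in the PoC moves from port-centric to package-centric.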
I have enjoyed my first few weeks working on FreeBSD, and I hope to stay here for quite some time.

Open tasks:

1. Any constructive feedback on the implementation would be very welcome!
2. Hopefully the code will be of sufficient quality to be considered for formal review in the coming months.

__________________________________________________________________

Xfce on FreeBSD

Links
FreeBSD Xfce Project
URL: https://wiki.FreeBSD.org/Xfce
FreeBSD Xfce Repository
URL: https://www.assembla.com/spaces/xfce4/subversion/source

Contact: FreeBSD Xfce Team

Xfce is a free software desktop environment for Unix and Unix-like platforms, such as FreeBSD. It aims to be fast and lightweight, while still being visually appealing and easy to use.

During this quarter, the team has kept these applications up-to-date:

* audio/xfce4-pulseaudio-plugin 0.2.4
* multimedia/xfce4-parole 0.8.1
* x11/xfce4-whiskermenu-plugin 1.5.2

We also follow the unstable releases (available in our experimental repository) of:

* x11/xfce4-dashboard 0.5.4

Open tasks:

1. Propose a patch to upstream to fix Xfdashboard with our version of OpenGL (it currently coredumps).

__________________________________________________________________

"FreeBSD Mastery: Specialty Filesystems" Early Access Version Now Available

Links
Book site
URL: https://www.michaelwlucas.com/nonfiction/fmsf
Early access version
URL: https://www.tiltedwindmillpress.com/?product=fmspf

Contact: Michael Lucas

FreeBSD Mastery: Specialty Filesystems is now in copyediting. The ebook should be available by the end of January at all major vendors, and the print in February.

The book covers everything from removable media, to FUSE, NFSv4 ACLs, iSCSI, CIFS, and more.

If you act really quickly, you can get the electronic early access version at a 10% discount. You will get the final ebook when it comes out as well. (This offer evaporates when the final version comes out.)
__________________________________________________________________

style(9) Enhanced to Allow C99 bool

Links
Bruce's Email Requesting bool be Added to style(9)
URL: https://lists.FreeBSD.org/pipermail/svn-src-head/2015-December/079671.html
Differential Revision for the Change
URL: https://reviews.FreeBSD.org/D4384

Contact: Bruce Evans
Contact: Conrad Meyer

Use of bool is now allowed. It was allowed previously, as well, but now it is really allowed. Party like it's 1999!

This project is sponsored by EMC / Isilon Storage Division.

Open tasks:

1. Specify style(9)'s opinion on iso646.h.
2. Fix intmax_t to be 128-bit on platforms where __int128_t is used.

__________________________________________________________________

HardenedBSD

Links
HardenedBSD Website
URL: https://hardenedbsd.org/
Introducing HardenedBSD's New Binary Updater
URL: https://hardenedbsd.org/article/shawn-webb/2015-12-31/introducing-hardenedbsds-new-binary-updater
secadm Beta Published
URL: https://hardenedbsd.org/article/shawn-webb/2015-11-22/introducing-secadm-030-beta-01
New Package Building Server
URL: https://hardenedbsd.org/article/admin/2015-11-22/new-package-building-server
secadm
URL: https://github.com/HardenedBSD/secadm
HardenedBSD Haswell Support
URL: https://github.com/HardenedBSD/hardenedBSD-playground/tree/hardened/experimental/master-i915
Nightly Builds for HardenedBSD Haswell Support
URL: http://jenkins.hardenedbsd.org/builds/HardenedBSD-CURRENT-i915kms-amd64-LATEST/

Contact: Shawn Webb
Contact: Oliver Pinter

HardenedBSD has been hard at work improving the performance and stability of our security enhancements. Security flags are now per-thread instead of per-process, removing some locking overhead. ASLR for mmap(MAP_32BIT) requests has been refactored, but lib32 is now disabled by default.

We have developed a new binary update utility, hbsd-update, akin to freebsd-update. In addition to normal OS installs, it can also update jails and ZFS Boot Environments (ZFS BEs).
Updates are signed using X.509 certificates.

secadm 0.3-beta has landed. It has been rewritten from scratch to be more efficient. As part of the rewrite, the rule syntax has changed and users must update their rulesets as described in the README.

Thanks to generous donations of a server from G2, Inc and hosting from Automated Tendencies, we can now do full package builds in just 35 hours, down from 75 hours. This machine will also provide weekly binary updates for the kernel and base system.

Owing partly to the needs of the developers, we have an experimental branch that includes the work Jean-Sébastien Pédron has under way for Haswell graphics support, on top of FreeBSD 11-current. Binary updates are also provided for this branch.

Unfortunately, in order to focus our efforts on improving HardenedBSD, we have had to pull back from submitting our ASLR patches to FreeBSD. The past two years' efforts to address comments on the submission have taken their toll, and the effort is no longer sustainable. We are proud to be based on FreeBSD and believe that the whole community could benefit from the security technologies we are developing. We hope that someone else will be able to step forward and finish off the task of integrating ASLR into FreeBSD.

This project is sponsored by Automated Tendencies, G2, Inc, and SoldierX.

__________________________________________________________________

NanoBSD Modernization

Contact: Warner Losh

This quarter's NanoBSD updates target three main areas. First, building a NanoBSD image required root privileges. Second, building for embedded platforms required detailed knowledge of the format required to boot. Third, the exact image sizes needed to be known to produce an image.

When NanoBSD was written, FreeBSD's build system required root privileges for the install step and onward. NanoBSD added to this by creating a md(4) device in which to construct the image.
Some configurations of NanoBSD added further to this by creating a chroot in which to cleanly build packages. NanoBSD solves the first problem using the new NO_ROOT build option to create a meta file. NanoBSD also augments this record as files are created and removed. The meta file is then fed into makefs(8) to create a UFS image with the proper permissions. The UFS image, and sometimes a DOS FAT partition, are then passed to mkimg(1) to create the final SD image. The mtree manipulation has been written as a separate script to allow it to move into the base system where it could assist with other build orchestration tools (though the move has not happened yet). The detailed knowledge of how to build each embedded image (as well as some of the base images for qemu) has always been hard to enshrine. Crochet puts this knowledge into its builds. The FreeBSD release system puts it into its system. NanoBSD, prior to the current work, provided no way to access its knowledge of how to build images. The current state of this project allows the user to set a simple image type and have NanoBSD deal with all of the details needed to create that image type. This includes using the u-boot ports and installing the right files into a FAT partition so that FreeBSD can boot with ubldr(8), creating the right boot1.elf file for powerpc64 qemu booting, or the more familiar (though needlessly complicated) x86 setup. Previous versions of NanoBSD required too much specialized knowledge from the user. This work aims to concentrate the knowledge into a set of simple scripts for any build orchestration system to use. Finally, NanoBSD images in the past have needed very specific knowledge of the target device. Part of this is a legacy of the BIOS state-of-the-art a decade ago, which required very careful matching of the image to the actual device in the deployed system. Although relevant at the time, such systems are now vanishingly rare. 
Support for them will be phased out (though given the flexibility of NanoBSD, it can be moved to the few remaining examples in the tree and also partially covered by the generic image scripts). Today, the typical use case is to create an SD or microSD card image, and have the image resize itself on boot. NanoBSD now supports that workflow.

In addition to these items, a number of minor improvements have been made:
 * Support for CPUTYPE-specialized builds. This includes both NanoBSD support and important bug fixes in the base system.
 * Support for marking MBR partitions as active.
 * Support for more partition types.

Open tasks:
 1. mkimg(1) needs to be augmented to create images for the i.MX6, Allwinner, and other SoCs. These SoCs require a boot image to be written after the MBR, but before the first partition starts.
 2. The chroot functionality of some NanoBSD configurations has not yet been migrated for non-privileged builds.
 3. The functionality to manipulate mtree(8) files should be moved into the base system for use by other build orchestration tools.
 4. The script to create a bootable image from one or more trees of files, as well as some creation of those trees, should be moved into the base system for use with other build orchestration tools.
 5. The growfs functionality works great for single images growing to the whole disk. However, NanoBSD would prefer that the boot FS/partition grow to approximately 1/2 the size of the media and another identical (or close) partition be created for the ping-ponging upgrades that NanoBSD is set up for. This needs to be implemented in the growfs rc.d(8) script.
__________________________________________________________________

relaunchd

Links
Development tree on GitHub
 URL: https://github.com/mheily/relaunchd

Contact: Mark Heily

The relaunchd project provides a service management daemon that is similar to the original launchd introduced in Apple OS X.
It is not limited to the original features of launchd, however: interesting work is being done to add support for launching programs in jails, passing socket descriptors from the host to a jail, and launching programs within a preconfigured capsicum(4) sandbox. Additionally, relaunchd uses UCL for its configuration files, so jobs can be defined in JSON or other formats supported by UCL.

While there is still work to be done, most of the important features of the original launchd have been implemented, and relaunchd has been made available in the FreeBSD Ports Collection. It should still be considered experimental and not ready for production use, but everyone is welcome to try it, report issues, and contribute code or ideas for improvement.

Open tasks:
 1. Add support for restarting jobs if they crash.
 2. Implement the cron(8) emulation feature.
 3. Add support for monitoring files and directories for changes and launching jobs when changes are detected.
 4. Finish things that are incomplete, such as support for jails and passing open socket descriptors to child processes.
 5. Improve the documentation and provide more examples of usage.
__________________________________________________________________

System Initialization and Service Management

Links
A Comparison of init(8) and rc(8) Replacements
 URL: http://www.daemonspawn.org/2016/01/a-comparison-of-alternatives-to-init8.html

Contact: Mark Heily
Contact: Jonathan de Boyne Pollard
Contact: Jordan Hubbard

There are three active projects to provide an alternative to the traditional init(8) and rc(8) subsystems that manage the boot process and system services.
There are a number of reasons driving the desire for change, including:
 * Faster boot times, made possible by launching services in parallel
 * Greater reliability, by ensuring that services are automatically restarted if they terminate unexpectedly
 * Simplified dependency management, using socket activation and similar techniques
 * The ability to launch services "on demand", and have them self-terminate when idle
 * Improved security, by removing the need to start common daemons as the root user

Two of the projects, launchd and relaunchd, are based on the launchd(8) API introduced by Apple in Mac OS X. The NextBSD project has ported the original Apple source code by writing a Mach compatibility layer that allows launchd to run on FreeBSD. The relaunchd project started from scratch with the goal of creating a more modular, lightweight, and portable implementation of the launchd API. The third project, nosh, is a unique creation that borrows concepts from launchd, systemd, and several other Unix operating systems.

While the FreeBSD Project has not made a decision to replace the current init(8) and rc(8) subsystems, the existence and active development of alternatives will continue to drive innovation in this space.

Jordan Hubbard is the contact point for the NextBSD launchd, Jonathan de Boyne Pollard is the contact point for nosh, and Mark Heily is the contact point for relaunchd.
__________________________________________________________________

The FreeBSD Foundation

Links
Foundation Website
 URL: http://www.FreeBSDFoundation.org/
FreeBSD Journal
 URL: http://FreeBSDJournal.com/

Contact: Deb Goodkin

The FreeBSD Foundation is a 501(c)(3) non-profit organization dedicated to supporting and promoting the FreeBSD Project and community worldwide. Funding comes from individual and corporate donations and is used to fund and manage development projects, conferences and developer summits, and provide travel grants to FreeBSD developers.
The Foundation purchases hardware to improve and maintain FreeBSD infrastructure and publishes FreeBSD white papers and marketing material to promote, educate, and advocate for the FreeBSD Project. The Foundation also represents the FreeBSD Project in executing contracts, license agreements, and other legal arrangements that require a recognized legal entity.

Here are some highlights of what we did to help FreeBSD last quarter:

On the advocacy front, the Foundation attended and sponsored EuroBSDcon, which took place Oct 1-4 (https://2015.eurobsdcon.org/) in Stockholm, Sweden. Two days prior, during the developer summit, Deb Goodkin ran a session on Recruiting to FreeBSD. The Foundation was also very active during the event itself; in addition to Deb, we had Dru Lavigne, Kirk McKusick, Erwin Lansing, Ed Maste, Hiroki Sato, Benedict Reuschling, and Edward Tomasz Napierała attend the conference. Deb and Ed gave a presentation on how the Foundation supports a BSD project. Kirk gave a presentation on "a Brief History of the BSD Fast File System," and he taught the two-day tutorial "Introduction to the FreeBSD Open-Source Operating System."

Deb then attended the 2015 Grace Hopper Conference that was held in Houston, TX, October 14-16. The conference is for women in computing and most of the attendees were female computer science majors, female software developers, and college professors. The Foundation was proud to be a Silver Sponsor. The conference was very successful for us. Our presence allowed us to raise awareness of the Project, help recruit more women, and get more professors to include FreeBSD in their curriculum.

George V. Neville-Neil traveled to Bangkok, Thailand to present talks on DTrace, FreeBSD, and teaching with DTrace. The talks were presented at Chulalongkorn University, which is the largest University in Thailand with the largest engineering school.
The first talk was the practitioner's introduction to DTrace in which the technology, history and usage are explained without diving into all the kernel subsystems. The second was the sales pitch for teaching with DTrace and with FreeBSD. The pitch was well received and there were some very good points made by the audience. The facts that the course materials are both open source and hosted on GitHub were also well received.

Kirk McKusick completed a 10-hour tutorial about FreeBSD for Pearson Education in their "Live Lesson" program. In particular, there is a great free snippet from that course comparing FreeBSD against Linux here: http://youtu.be/dTpqALCwQ1Y?a. Find out more about the whole session at: http://click.linksynergy.com/fs-bin/click?id=NZS3W7D*uS0&subid=&offerid=163217.1&type=10&tmpid=3559&RD_PARM1=http%253A%252F%252Fwww.informit.com%252Fstore%252Fintroduction-to-the-freebsd-open-source-operating-system-9780134305868.

Anne Dickison resumed the Faces of FreeBSD series with interviews featuring Michael Dexter and Erin Clark. She also continued to produce and distribute FreeBSD materials for conferences, as well as advocating for FreeBSD over our social channels.

George V. Neville-Neil headed up the latest Silicon Valley Vendor and Developer Summit, November 2-3, at the NetApp campus in Sunnyvale, California. Topics of discussion ranged over new developments in persistent memory, the use of FreeBSD by a company that builds rackscale systems, developments in our compiler and tool suite, as well as others. Additional Foundation Board and Staff attending the summit included: Deb Goodkin, Glen Barber, Justin T. Gibbs, Kirk McKusick, Ed Maste, and Hiroki Sato. The complete schedule, and some of the slides, are available on the FreeBSD Wiki https://wiki.freebsd.org/201511VendorDevSummit . Notes from the always lively "Have/Need/Want session" are available at https://wiki.freebsd.org/201511VendorDevSummit/HaveNeedWant .
While in the Bay Area, some Foundation members visited commercial users of FreeBSD to help understand their needs, update them on the work the Foundation is doing, and facilitate collaboration between them and the Project.

We were a sponsor of the 2015 OpenZFS Developer Summit, which took place October 19-20, in San Francisco, CA. Justin T. Gibbs and Kirk McKusick attended the conference.

Justin T. Gibbs continued his semester-long class teaching Intro to Computer Science using FreeBSD at a middle school.

Ed Maste, Edward Tomasz Napierała, and Konstantin Belousov continue to make progress on Foundation-funded development projects. More specifically:
 * Ed worked on a number of items relating to the tool chain: LLD linker, ELF Tool Chain components, and LLDB debugger, and tested, integrated, and merged outstanding UEFI work.
 * Edward finished work on the reroot project as well as spending some time on a certificate-transparency port. He also implemented a prototype to support disk I/O limits in RCTL.
 * Konstantin rewrote the out-of-memory killer logic, which, in particular, fixed FreeBSD operation on systems without swap, especially systems with very little memory. The latter are becoming more and more common with the popularity of embedded ARM platforms where FreeBSD runs, but it also affects large systems which are usually configured without swap. He also finalized and committed the shared page support for the ARMv7 and ARMv8 systems. This allows for a non-executable stack on ARMv7, and a much faster userspace gettimeofday(2) for both, similar to x86.

Ed Maste presented a FreeBSD/arm64 talk and a hands-on demo at ARM Techcon, which took place November 10-12, 2015, in Santa Clara, CA.

We continued publishing our monthly newsletters and acquiring new company testimonials about using FreeBSD, including from Verisign and Nginx.

Anne Dickison, Dru Lavigne, and Glen Barber represented the Foundation at USENIX LISA '15, which took place November 3-8, in Washington D.C.
The Foundation had a booth in the Expo Hall and participated in a BoF. Besides connecting with current community members, we spoke with attendees who were interested in getting involved with the Project and helped set them on the correct path. We also took the opportunity to remind those who had not used FreeBSD in a while what they were missing. Glen also attended the USENIX Release Engineering Summit, which was co-located with LISA '15.

We published the Sept/Oct and Nov/Dec issues of the FreeBSD Journal.

George V. Neville-Neil and Robert Watson announced the release of their TeachBSD initiative: http://teachbsd.org/. TeachBSD offers a set of open source reusable course materials designed to allow others to teach both university students and software practitioners FreeBSD operating system fundamentals. The Foundation is proud to have partly sponsored their efforts to teach the initial graduate-level course on operating systems with tracing at the University of Cambridge.

Deb Goodkin invited a representative from the Outreachy program to talk at the Ottawa FreeBSD Developer Summit about the program and how we can get involved. Deb also started discussions with CS professors from the University of Colorado, Boulder to offer some Intro to FreeBSD workshops.

Glen Barber continued wearing many hats to support the Project.

For Release Engineering:
 * Added support for building BANANAPI, CUBIEBOARD, and CUBIEBOARD2 arm images.
 * Deprecated the use of MD5 checksums for verifying installation media downloaded from the FreeBSD Project mirrors.
 * Various miscellaneous updates and fixes to release build code.
 * Continued providing regular development snapshot builds.

Under Systems Administration:
 * Assisted the Admins team with migrating various services to two new colocation facilities near Sunnyvale, generously provided by RootBSD and LimeLight Networks.
 * Moved email services for the Foundation to a new server.
Ed Maste attended the Reproducible Builds World Summit, which took place in Athens, Greece, December 1-3, 2015. We wrapped up our 2015 fundraising efforts with our End-of-Year fundraising campaign by participating in #GivingTuesday, and continuing with weekly email and social media requests for support of the Foundation. Final fundraising numbers will be available in Q1 2016. __________________________________________________________________ -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQGgBAEBCgAGBQJWstP3AAoJECjZpvNk63USG3wMHArdbptYhXljtbfGKDUWPJjs ip1eIqDxywocb8P0VZ/NqSdASyuWXyHZGPxjFhE9BIAvUKeTCJPsJLWPqZyiv2Qj 856dYOTyOnzkMbGaN9dPTdH2ymrsfRqQKVr1DP+7la6IdFpvlUZ2RwxBX7qPs73/ yMy640tfC+8VVuKt3EjZdbN6hBbeSoiZ0b8pJ+GP42CKhwSYW86wrtMSudjNRZzZ lBYSJJ9RPh76Z6Qjcv8mBioEEl+XZyadzErM7rDNpiIRV4Ucr3ez24sUiGl30+4G SpniAKmXBL+6wWm9jhWfm6obOQJecTc7ydc5zT0Qb8rXIBPsYGxOYxujM1zqWbqt 4TjAqnzkanTqOfZDMzvpMj8dBk9gUxjHloQ66pfNx6UeV0cCbSh6k/vIQdTbvGAn NknGKyYQ8n0AQCHht3sXdOcGCca0gHq0uDEWR71woqTReXAjCdAEHNSE+XnnHWYa lCgsuNe5PWtSfYuyMu95X863us7mShf8277hQzNlft8ILTY=3D =3D7aRE -----END PGP SIGNATURE----- From owner-freebsd-hackers@freebsd.org Thu Feb 4 09:35:20 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 67806A9A89D for ; Thu, 4 Feb 2016 09:35:20 +0000 (UTC) (envelope-from mjguzik@gmail.com) Received: from mail-wm0-x244.google.com (mail-wm0-x244.google.com [IPv6:2a00:1450:400c:c09::244]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id EBBDD1881; Thu, 4 Feb 2016 09:35:19 +0000 (UTC) (envelope-from mjguzik@gmail.com) Received: by mail-wm0-x244.google.com with SMTP id r129so11289766wmr.0; Thu, 04 Feb 2016 01:35:19 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=gmail.com; s=20120113; h=date:from:to:cc:subject:message-id:mail-followup-to:references :mime-version:content-type:content-disposition:in-reply-to :user-agent; bh=AWTC6BiE93bAgGcoONAW78/wiBxnqExe5rm5No9gUto=; b=tQKq2Sy4inZUE02p0lYcM+eF8z+UXR94NA3x+/GK8lKcXLyL229vGgSwfpPP5itEgl ryVxnb28ItCD6Z3Wv2lR/KsrKo0v5X3jkNIeHzxWQ78XHlGqjc2YczFECR8IQIZOoqMZ MOgv2cW3bj6jKA5tvsKrEwntBIRZUXN42GgSsMPCmgEpEw3LZZgz4Yl0y4TUpgaJb66W 9XOJlq0tLYu900uq2ohhkePG4J4hOrNJckZI7VhQlBHW8Df5NLIhFzmaSs+ssMiLfmqX 70lsumnOEwNCsmLevSzisVKUEUDKd3a76AahOLsx16vd7c6PjR/m9INW6Pu5xhQA4wtH uXKQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:date:from:to:cc:subject:message-id :mail-followup-to:references:mime-version:content-type :content-disposition:in-reply-to:user-agent; bh=AWTC6BiE93bAgGcoONAW78/wiBxnqExe5rm5No9gUto=; b=LVjJ1CRIampJTPrOoWsDxXMES1D1R2DSYYgvzfBWsSZ+hVTHxD/OadGPtU3Di4UQ4n WQCnff+UF1BzjO5+UG+6PRcDeXHErAGZ98t45S2JIYdmiAe0S8RN030dj8ZdQEh5RNMu bdeNEyDJcgvtSbRhEx7qi3vBZtJbf09QyPJxSjRrp1HmQiI7UkEQZr0dizjzODNE2dNK 0dTRqSC5evZHja3fk8sUftCC+zKKy36NmpU97Wo7enINqjUyruD+m62QUgo0TJFuYwD9 D3ZYHUWO1zxGUfLJbKPMlvC39yT9b9SzJoi8WQ01VXGVVDaJ8FSfr0BAKzaGzrWnML8d A+kQ== X-Gm-Message-State: AG10YOREjzUtP3CxYOEYFqubzwP2XoYTOmufIyBRt3QDIa3WjZx2In5qWe5cvEoX24uhdw== X-Received: by 10.28.46.82 with SMTP id u79mr31123903wmu.67.1454578518484; Thu, 04 Feb 2016 01:35:18 -0800 (PST) Received: from dft-labs.eu (n1x0n-1-pt.tunnel.tserv5.lon1.ipv6.he.net. 
[2001:470:1f08:1f7::2]) by smtp.gmail.com with ESMTPSA id l7sm10450766wjx.14.2016.02.04.01.35.17 (version=TLS1_2 cipher=AES128-SHA bits=128/128); Thu, 04 Feb 2016 01:35:17 -0800 (PST) Date: Thu, 4 Feb 2016 10:35:15 +0100 From: Mateusz Guzik To: Konstantin Belousov Cc: freebsd-hackers@freebsd.org, jmg@freebsd.org Subject: Re: [PATCH 2/2] fork: plug a use after free of the returned process pointer Message-ID: <20160204093515.GA21877@dft-labs.eu> Mail-Followup-To: Mateusz Guzik , Konstantin Belousov , freebsd-hackers@freebsd.org, jmg@freebsd.org References: <20160201103632.GL91220@kib.kiev.ua> <1454386069-29657-1-git-send-email-mjguzik@gmail.com> <1454386069-29657-3-git-send-email-mjguzik@gmail.com> <20160202132322.GU91220@kib.kiev.ua> <20160202175652.GA9812@dft-labs.eu> <20160202181635.GC91220@kib.kiev.ua> <20160202214427.GB9812@dft-labs.eu> <20160203010412.GC9812@dft-labs.eu> <20160203080514.GA8753@dft-labs.eu> <20160203141329.GF91220@kib.kiev.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20160203141329.GF91220@kib.kiev.ua> User-Agent: Mutt/1.5.21 (2010-09-15) X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 04 Feb 2016 09:35:20 -0000 On Wed, Feb 03, 2016 at 04:13:29PM +0200, Konstantin Belousov wrote: > On Wed, Feb 03, 2016 at 09:05:15AM +0100, Mateusz Guzik wrote: > > > > CPU0 is executing fork1. p2 is not traced. 
> > > >
> > > > CPU0                                    CPU1
> > > > p2 and td2 created
> > > > td2 is marked runnable
> > > >                                         td2 is scheduled here
> > > >                                         td2 does not have TDB_STOPATFORK set
> > > >                                         td2 exits
> > > >                                         p2 is autoreaped
> > > >                                         td2's space is reused
> > > >                                         td2 gets linked into p3
> > > >                                         td2 gets TDB_STOPATFORK
> > > > PROC_LOCK(p2);
> > > > TDB_STOPATFORK test on td2
> > > > cv_wait(&p2->p_dbgwait, ..);
> > > >
> > > > So at this point p2 has no linked threads and is free. td2 belongs to
> > > > p3 and p2 is waiting for a wakeup which can't happen.
> I am convinced about this. I thought that the fork_return() guarantee
> that the child cannot exit if TDB_STOPATFORK is set is enough, but the
> issue is the other way around.
>
> > > >
> > > > Now that I look at it this may be broken in an additional way, which is
> > > > not fixed by the patch: what if td2 spawns a new thread and thr_exits?
> > > > In this case testing td2 is still invalid. Maybe I'm just getting
> > > > paranoid here, I don't have time to properly test this right now. In
> > > > worst case should be fixable well enough with FIRST_THREAD_IN_PROC.
> > > >

I committed previous 2 patches.

Stuff below is just speculation.

So the remaining problem, after we know the process has to survive, is survival of the thread and its relationship with the process.

The problem stems from not having the proc lock over the entire time from the moment the thread is marked as runnable to the moment where the code is done with it.

Race 1:

CPU0                                    CPU1
p1: p2 and td2 created
td2: marked runnable
                                        td2: scheduled here
                                        td2: does not have TDB_STOPATFORK set
                                        td2: calls thr_new
                                        td2: calls thr_exit
                                        td2: reused and linked into p3
                                        td2: gets TDB_STOPATFORK
p1: PROC_LOCK(p2);
p1: TDB_STOPATFORK test on td2
p1: cv_wait(&p2->p_dbgwait, ..);

p2 is the process we want, but td2 now belongs to a different process.

Race 2:

However, the code seems to be even more buggy.
To quote:

	while ((td2->td_dbgflags & TDB_STOPATFORK) != 0)
		cv_wait(&p2->p_dbgwait, &p2->p_mtx);

The check is done in a loop which drops the proc lock. This makes me wonder about the following additional race:

p2 is traced, TDB_STOPATFORK is set on td2.

CPU0                                    CPU1
p1: PROC_LOCK(p2);
p1: TDB_STOPATFORK test on td2
p1: cv_wait(&p2->p_dbgwait, ..);
                                        td2: is scheduled here
                                        td2: clears TDB_STOPATFORK
                                        td2: cv_broadcast(&p2->p_dbgwait)
p1: not scheduled yet
                                        td2: calls thr_new
                                        td2: calls thr_exit
                                        td2: is reused and linked into p3
                                        td2: gets TDB_STOPATFORK
p1: scheduled here
p1: internal PROC_LOCK(p2);
p1: TDB_STOPATFORK test on td2

But td2 now belongs to p3.

I think the patch below deals with race 1 just fine.

For race 2, it is unclear to me if the while loop is justified. If a single 'if' statement was sufficient, there would be no problem since unlock + lock would be avoided guaranteeing the consistency.

I was pondering borrowing fork_return's logic to check if tracing is enabled before testing TDB_STOPATFORK. However, tracing state could have changed several times invalidating the result. Maybe refreshing the pointer to the first thread would do the trick, but imho the lock dropping business is extremely fishy and will have to be dealt with at some point.

I have other stuff I want to do before 11.0, so I may drop this for the time being.

The patch for race 1:

diff --git a/sys/kern/kern_fork.c b/sys/kern/kern_fork.c
index baee954..1fea651 100644
--- a/sys/kern/kern_fork.c
+++ b/sys/kern/kern_fork.c
@@ -717,11 +717,6 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread *
 	if ((fr->fr_flags & RFMEM) == 0 && dtrace_fasttrap_fork)
 		dtrace_fasttrap_fork(p1, p2);
 #endif
-	/*
-	 * Hold the process so that it cannot exit after we make it runnable,
-	 * but before we wait for the debugger.
-	 */
-	_PHOLD(p2);
 	if ((p1->p_flag & (P_TRACED | P_FOLLOWFORK)) == (P_TRACED |
 	    P_FOLLOWFORK)) {
 		/*
@@ -758,6 +753,7 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread *
 		fdrop(fp_procdesc, td);
 	}
 
+	PROC_LOCK(p2);
 	if ((fr->fr_flags & RFSTOPPED) == 0) {
 		/*
 		 * If RFSTOPPED not requested, make child runnable and
@@ -773,13 +769,11 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread *
 		*fr->fr_procp = p2;
 	}
 
-	PROC_LOCK(p2);
 	/*
 	 * Wait until debugger is attached to child.
 	 */
 	while ((td2->td_dbgflags & TDB_STOPATFORK) != 0)
 		cv_wait(&p2->p_dbgwait, &p2->p_mtx);
-	_PRELE(p2);
 	racct_proc_fork_done(p2);
 	PROC_UNLOCK(p2);
 }

--
Mateusz Guzik

From owner-freebsd-hackers@freebsd.org Thu Feb 4 09:53:49 2016
Return-Path:
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1])
        by mailman.ysv.freebsd.org (Postfix) with ESMTP id C106DA9B0ED
        for ; Thu, 4 Feb 2016 09:53:49 +0000 (UTC)
        (envelope-from kib@freebsd.org)
Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1])
        (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits))
        (Client did not present a certificate)
        by mx1.freebsd.org (Postfix) with ESMTPS id 4B6852A2;
        Thu, 4 Feb 2016 09:53:49 +0000 (UTC)
        (envelope-from kib@freebsd.org)
Received: from tom.home (kostik@localhost [127.0.0.1])
        by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id u149rgk9065308
        (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO);
        Thu, 4 Feb 2016 11:53:42 +0200 (EET)
        (envelope-from kib@freebsd.org)
DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua u149rgk9065308
Received: (from kostik@localhost)
        by tom.home (8.15.2/8.15.2/Submit) id u149rfkH065307;
        Thu, 4 Feb 2016 11:53:41 +0200 (EET)
        (envelope-from kib@freebsd.org)
X-Authentication-Warning: tom.home: kostik set sender to kib@freebsd.org using -f
Date: Thu, 4 Feb 2016 11:53:41 +0200
From: Konstantin Belousov
To: Mateusz Guzik
Cc:
freebsd-hackers@freebsd.org, jmg@freebsd.org Subject: Re: [PATCH 2/2] fork: plug a use after free of the returned process pointer Message-ID: <20160204095341.GO91220@kib.kiev.ua> References: <1454386069-29657-1-git-send-email-mjguzik@gmail.com> <1454386069-29657-3-git-send-email-mjguzik@gmail.com> <20160202132322.GU91220@kib.kiev.ua> <20160202175652.GA9812@dft-labs.eu> <20160202181635.GC91220@kib.kiev.ua> <20160202214427.GB9812@dft-labs.eu> <20160203010412.GC9812@dft-labs.eu> <20160203080514.GA8753@dft-labs.eu> <20160203141329.GF91220@kib.kiev.ua> <20160204093515.GA21877@dft-labs.eu> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160204093515.GA21877@dft-labs.eu> User-Agent: Mutt/1.5.24 (2015-08-30) X-Spam-Status: No, score=-2.9 required=5.0 tests=ALL_TRUSTED,BAYES_00 autolearn=ham autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 04 Feb 2016 09:53:49 -0000 On Thu, Feb 04, 2016 at 10:35:15AM +0100, Mateusz Guzik wrote: > Stuff below is just speculation. > > So the remaining problem, after we know the process has to survive, is > survival of the thread and its relationship with the process. > > The problem stems from not having the proc lock over the entire time > from the moment the thread is marked as runnable to the moment where the > code is done with it. 
>
> Race 1:
>
> CPU0                                    CPU1
> p1: p2 and td2 created
> td2: marked runnable
>                                         td2: scheduled here
>                                         td2: does not have TDB_STOPATFORK set
>                                         td2: calls thr_new
>                                         td2: calls thr_exit
>                                         td2: reused and linked into p3
>                                         td2: gets TDB_STOPATFORK
> p1: PROC_LOCK(p2);
> p1: TDB_STOPATFORK test on td2
> p1: cv_wait(&p2->p_dbgwait, ..);
>
> p2 is the process we want, but td2 now belongs to a different process.
>
> Race 2:
>
> However, the code seems to be even more buggy. To quote:
>
> 	while ((td2->td_dbgflags & TDB_STOPATFORK) != 0)
> 		cv_wait(&p2->p_dbgwait, &p2->p_mtx);
>
> The check is done in a loop which drops the proc lock. This makes me
> wonder about the following additional race:
>
> p2 is traced, TDB_STOPATFORK is set on td2.
>
> CPU0                                    CPU1
> p1: PROC_LOCK(p2);
> p1: TDB_STOPATFORK test on td2
> p1: cv_wait(&p2->p_dbgwait, ..);
>                                         td2: is scheduled here
>                                         td2: clears TDB_STOPATFORK
>                                         td2: cv_broadcast(&p2->p_dbgwait)
> p1: not scheduled yet
>                                         td2: calls thr_new
>                                         td2: calls thr_exit
>                                         td2: is reused and linked into p3
>                                         td2: gets TDB_STOPATFORK
> p1: scheduled here
> p1: internal PROC_LOCK(p2);
> p1: TDB_STOPATFORK test on td2
>
> But td2 now belongs to p3.
>
> I think the patch below deals with race 1 just fine.
>
> For race 2, it is unclear to me if the while loop is justified. If a
> single 'if' statement was sufficient, there would be no problem since
> unlock + lock would be avoided guaranteeing the consistency.
>
> I was pondering borrowing fork_return's logic to check if tracing is
> enabled before testing TDB_STOPATFORK. However, tracing state could have
> changed several times invalidating the result. Maybe refreshing the
> pointer to the first thread would do the trick, but imho the lock
> dropping business is extremely fishy and will have to be dealt with at
> some point.
>
So if the issue is only reassignment of td2 to p3, why not do the following?
I think that the possible ABA problem, where td2 gets TDB_STOPATFORK set after being reused for p2 (and not p3) after yet another fork, is actually fine.

diff --git a/sys/kern/kern_fork.c b/sys/kern/kern_fork.c
index baee954..5bb14e8 100644
--- a/sys/kern/kern_fork.c
+++ b/sys/kern/kern_fork.c
@@ -777,7 +777,7 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread *
 	/*
 	 * Wait until debugger is attached to child.
 	 */
-	while ((td2->td_dbgflags & TDB_STOPATFORK) != 0)
+	while (td2->td_proc == p2 && (td2->td_dbgflags & TDB_STOPATFORK) != 0)
 		cv_wait(&p2->p_dbgwait, &p2->p_mtx);
 	_PRELE(p2);
 	racct_proc_fork_done(p2);

From owner-freebsd-hackers@freebsd.org Thu Feb 4 10:16:01 2016
Return-Path:
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1])
        by mailman.ysv.freebsd.org (Postfix) with ESMTP id 3A98DA9B895
        for ; Thu, 4 Feb 2016 10:16:01 +0000 (UTC)
        (envelope-from mjguzik@gmail.com)
Received: from mail-wm0-x244.google.com (mail-wm0-x244.google.com [IPv6:2a00:1450:400c:c09::244])
        (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
        (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK))
        by mx1.freebsd.org (Postfix) with ESMTPS id E34EEE60;
        Thu, 4 Feb 2016 10:16:00 +0000 (UTC)
        (envelope-from mjguzik@gmail.com)
Received: by mail-wm0-x244.google.com with SMTP id 128so11438639wmz.3;
        Thu, 04 Feb 2016 02:16:00 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
        h=date:from:to:cc:subject:message-id:mail-followup-to:references
        :mime-version:content-type:content-disposition:in-reply-to
        :user-agent;
        bh=sWqVrnC1WgXlqmNAAU7DZmqme94wzXnxFHdffBGXHxM=;
        b=JJ9WYuqBquP0tzzR/526N1eXgIvf0WR0OTEyDXMXI9X0akmMzLmaoO9qKpzwPp2rM7
        JpXpZKYK0ILlTTM7j/j/8qpEanZ3lhq5cUz7HERqCDGc/rjDeAk6ZpjzN1oPwwpojxup
        q5BXGADCnRBTp/hAeKngDR/gZb/XpEhQ4zmVdChHWKOGtKr8MwCsLa6uXnA1WaBwSe1Z
sPkPXsEU1cghqW0EAwbWwEVnWIhbn53ojfN2RSVf9rsF5Qv54HbOIWOfYBIVasCcgW+0 RZBhYZDn52okkgnBH3/XWRcqa5WnVL4Lr9qXx2GMWf2Fti7dzBZPi1qWDvGwd5uWjJxd DWYQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:date:from:to:cc:subject:message-id :mail-followup-to:references:mime-version:content-type :content-disposition:in-reply-to:user-agent; bh=sWqVrnC1WgXlqmNAAU7DZmqme94wzXnxFHdffBGXHxM=; b=BiI9hNLMpTm+S5CDXqgBlOPXEH30xT8lVKp59a9G1ZczkvrS20BebG8zc6Bf6Y/Oce Lv8f7kFkFzbmk3rJp/3ClXDou2sOipqlTjmDhHv47t4dnj7QfCXAdoc7rVGLt6cl2jFp hUQGrUjPYDU2YZj+pdJdnZqSn8ytyTuWwXWzu0cw8Zfk5bgQIG91pu1WSTRxMdCZ9wF8 1HmlUmxRL7Gme9lXAkx0TAyc81sVRII1xOFneDZ6YMspqWPS8fpvFWg2w01xBpaFfTlB DZYxhXSpR9EcumSQBzxtOC/q5yMCPjdmbwyhmbj0cytYL1fAt5VpswRyGrqsb+1I590n Oobg== X-Gm-Message-State: AG10YOSOYG63ZRpJzZ3xSdnWHk3tVjJBQYH3g0lWH3SqDuGvNk2R9g1+e/9CfceLlNtx2Q== X-Received: by 10.28.4.134 with SMTP id 128mr9006178wme.96.1454580959047; Thu, 04 Feb 2016 02:15:59 -0800 (PST) Received: from dft-labs.eu (n1x0n-1-pt.tunnel.tserv5.lon1.ipv6.he.net. 
[2001:470:1f08:1f7::2]) by smtp.gmail.com with ESMTPSA id p9sm10602046wjy.41.2016.02.04.02.15.58 (version=TLS1_2 cipher=AES128-SHA bits=128/128); Thu, 04 Feb 2016 02:15:58 -0800 (PST) Date: Thu, 4 Feb 2016 11:15:56 +0100 From: Mateusz Guzik To: Konstantin Belousov Cc: freebsd-hackers@freebsd.org, jmg@freebsd.org Subject: Re: [PATCH 2/2] fork: plug a use after free of the returned process pointer Message-ID: <20160204101556.GB21877@dft-labs.eu> Mail-Followup-To: Mateusz Guzik , Konstantin Belousov , freebsd-hackers@freebsd.org, jmg@freebsd.org References: <1454386069-29657-3-git-send-email-mjguzik@gmail.com> <20160202132322.GU91220@kib.kiev.ua> <20160202175652.GA9812@dft-labs.eu> <20160202181635.GC91220@kib.kiev.ua> <20160202214427.GB9812@dft-labs.eu> <20160203010412.GC9812@dft-labs.eu> <20160203080514.GA8753@dft-labs.eu> <20160203141329.GF91220@kib.kiev.ua> <20160204093515.GA21877@dft-labs.eu> <20160204095341.GO91220@kib.kiev.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20160204095341.GO91220@kib.kiev.ua> User-Agent: Mutt/1.5.21 (2010-09-15) X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 04 Feb 2016 10:16:01 -0000 On Thu, Feb 04, 2016 at 11:53:41AM +0200, Konstantin Belousov wrote: > On Thu, Feb 04, 2016 at 10:35:15AM +0100, Mateusz Guzik wrote: > > Stuff below is just speculation. > > > > So the remaining problem, after we know the process has to survive, is > > survival of the thread and its relationship with the process. > > > > The problem stems from not having the proc lock over the entire time > > from the moment the thread is marked as runnable to the moment where the > > code is done with it. 
> >
> > Race 1:
> >
> > CPU0 CPU1
> > p1: p2 and td2 created
> > td2: marked runnable
> > td2: scheduled here
> > td2: does not have TDB_STOPATFORK set
> > td2: calls thr_new
> > td2: calls thr_exit
> > td2: reused and linked into p3
> > td2: gets TDB_STOPATFORK
> > p1: PROC_LOCK(p2);
> > p1: TDB_STOPATFORK test on td2
> > p1: cv_wait(&p2->p_dbgwait, ..);
> >
> > p2 is the process we want, but td2 now belongs to a different process.
> >
> > Race 2:
> >
> > However, the code seems to be even more buggy. To quote:
> >
> > while ((td2->td_dbgflags & TDB_STOPATFORK) != 0)
> >         cv_wait(&p2->p_dbgwait, &p2->p_mtx);
> >
> > The check is done in a loop which drops the proc lock. This makes me
> > wonder about the following additional race:
> >
> > p2 is traced, TDB_STOPATFORK is set on td2.
> >
> > CPU0 CPU1
> > p1: PROC_LOCK(p2);
> > p1: TDB_STOPATFORK test on td2
> > p1: cv_wait(&p2->p_dbgwait, ..);
> > td2: is scheduled here
> > td2: clears TDB_STOPATFORK
> > td2: cv_broadcast(&p2->p_dbgwait)
> > p1: not scheduled yet
> > td2: calls thr_new
> > td2: calls thr_exit
> > td2: is reused and linked into p3
> > td2: gets TDB_STOPATFORK
> > p1: scheduled here
> > p1: internal PROC_LOCK(p2);
> > p1: TDB_STOPATFORK test on td2
> >
> > But td2 now belongs to p3.
> >
> > I think the patch below deals with race 1 just fine.
> >
> > For race 2, it is unclear to me if the while loop is justified. If a
> > single 'if' statement were sufficient, there would be no problem, since
> > unlock + lock would be avoided, guaranteeing consistency.
> >
> > I was pondering borrowing fork_return's logic to check if tracing is
> > enabled before testing TDB_STOPATFORK. However, tracing state could have
> > changed several times, invalidating the result. Maybe refreshing the
> > pointer to the first thread would do the trick, but imho the lock
> > dropping business is extremely fishy and will have to be dealt with at
> > some point.
> >
> So if the issue is only reassignment of td2 to p3, why not do the following ?
> I think that possible ABA problem where td2 gets TDB_STOPATFORK set after
> being reused for p2 (and not p3) after yet another fork, is actually fine.
>
> diff --git a/sys/kern/kern_fork.c b/sys/kern/kern_fork.c
> index baee954..5bb14e8 100644
> --- a/sys/kern/kern_fork.c
> +++ b/sys/kern/kern_fork.c
> @@ -777,7 +777,7 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread *
>  	/*
>  	 * Wait until debugger is attached to child.
>  	 */
> -	while ((td2->td_dbgflags & TDB_STOPATFORK) != 0)
> +	while (td2->td_proc == p2 && (td2->td_dbgflags & TDB_STOPATFORK) != 0)
>  		cv_wait(&p2->p_dbgwait, &p2->p_mtx);
>  	_PRELE(p2);
>  	racct_proc_fork_done(p2);

This is definitely fine for the time being. It's just that the unlock+lock
pair seems extremely error prone, and someone(tm) should investigate it
further at some point(tm).

--
Mateusz Guzik

From owner-freebsd-hackers@freebsd.org Thu Feb 4 11:07:05 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 4B524A9CD22 for ; Thu, 4 Feb 2016 11:07:05 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from zxy.spb.ru (zxy.spb.ru [195.70.199.98]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 0CBC21434; Thu, 4 Feb 2016 11:07:05 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from slw by zxy.spb.ru with local (Exim 4.86 (FreeBSD)) (envelope-from ) id 1aRHkb-0006Hy-I7; Thu, 04 Feb 2016 14:07:01 +0300 Date: Thu, 4 Feb 2016 14:07:01 +0300 From: Slawa Olhovchenkov To: Ian Lepore Cc: Konstantin Belousov , Mateusz Guzik , freebsd-hackers@freebsd.org, Mateusz Guzik Subject: Re: [PATCH 1/2] fork: pass arguments to fork1 in a dedicated structure Message-ID:
<20160204110701.GB88527@zxy.spb.ru> References: <20160201103632.GL91220@kib.kiev.ua> <1454386069-29657-1-git-send-email-mjguzik@gmail.com> <1454386069-29657-2-git-send-email-mjguzik@gmail.com> <20160202131145.GT91220@kib.kiev.ua> <1454423299.11162.27.camel@freebsd.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1454423299.11162.27.camel@freebsd.org> User-Agent: Mutt/1.5.24 (2015-08-30) X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: slw@zxy.spb.ru X-SA-Exim-Scanned: No (on zxy.spb.ru); SAEximRunCond expanded to false X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 04 Feb 2016 11:07:05 -0000

On Tue, Feb 02, 2016 at 07:28:19AM -0700, Ian Lepore wrote:

> > Would be great to not use initializer in declaration, i.e. use
> > bzero()
> > instead of c99 designated initializers.
>
> This would be a better suggestion if the compiler recognized and
> optimized bzero() with inline code like it does for those forbidden
> assignments.

This is impossible as long as bzero() is non-standard and implemented in
a separate file.
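
For reference, the two styles being compared can be sketched in plain
userland C. This is only an illustration, not code from the patch:
struct example_req and its fields are hypothetical stand-ins for a request
structure such as fork_req, and memset() is used here as the standard-C
counterpart of bzero(). The sketch shows only that both styles leave the
named members in the same state; whether a given compiler inlines either
form depends on the compiler and its flags and is not demonstrated.

```c
#include <assert.h>
#include <string.h>	/* memset */

/* Hypothetical request structure; the field names are illustrative only. */
struct example_req {
	int	 flags;
	int	 pages;
	int	*pidp;
};

/* C99 designated initializer: the named member is set, the rest are zeroed. */
static struct example_req
make_req_designated(int flags)
{
	struct example_req r = { .flags = flags };

	return (r);
}

/* Explicit clearing followed by assignment. */
static struct example_req
make_req_cleared(int flags)
{
	struct example_req r;

	memset(&r, 0, sizeof(r));
	r.flags = flags;
	return (r);
}

/*
 * Returns 1 when both styles agree on every member.  Members are compared
 * one by one because padding bytes are not guaranteed to match.
 */
static int
reqs_match(int flags)
{
	struct example_req a = make_req_designated(flags);
	struct example_req b = make_req_cleared(flags);

	return (a.flags == b.flags && a.pages == b.pages && a.pidp == b.pidp);
}

/* Exercise both styles; aborts if they ever disagree. */
static void
demo(void)
{
	assert(reqs_match(0));
	assert(reqs_match(42));
}
```

Calling demo() from a test program is enough to exercise both
constructors. The member-wise comparison is deliberate: the designated
initializer zeroes the unnamed members, but the contents of any padding
are not something to rely on.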
From owner-freebsd-hackers@freebsd.org Sat Feb 6 18:50:52 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 2AE07A9FE88 for ; Sat, 6 Feb 2016 18:50:52 +0000 (UTC) (envelope-from alfred@freebsd.org) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.freebsd.org (Postfix) with ESMTP id 16A191D5C for ; Sat, 6 Feb 2016 18:50:51 +0000 (UTC) (envelope-from alfred@freebsd.org) Received: from Alfreds-MacBook-Pro-2.local (unknown [IPv6:2601:645:8001:cee1:40b9:c742:d506:c6ee]) by elvis.mu.org (Postfix) with ESMTPSA id 813B4345A947 for ; Sat, 6 Feb 2016 10:50:44 -0800 (PST) To: FreeBSD Hackers From: Alfred Perlstein Subject: Macbook donation (wireless needs fixing) Organization: FreeBSD Message-ID: <56B64086.4090806@freebsd.org> Date: Sat, 6 Feb 2016 10:50:46 -0800 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:38.0) Gecko/20100101 Thunderbird/38.5.1 MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 06 Feb 2016 18:50:52 -0000 Hey folks, So now I have a few macbook laptops, they would be really great to run FreeBSD except that the wifi isn't supported. The macbooks have a broadcom BCM43xx chip inside of it for wifi. Is anyone interested in a *free* (to keep) laptop(s) in exchange for finishing this driver? Getting FreeBSD running on older mac hardware would open up a pretty large niche of laptops that are starting to get on the second use market. Please let me know, I can drop the laptop off anywhere in the SF bay area, or possibly toss it into a USPS/UPS box and send it to you. 
My only ask is that after 3 months, if the driver isn't somewhat working you return it so that it can be donated to a charity supporting the homeless. thanks! -Alfred From owner-freebsd-hackers@freebsd.org Sat Feb 6 22:37:39 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 0ACADAA0090 for ; Sat, 6 Feb 2016 22:37:39 +0000 (UTC) (envelope-from marktinguely@gmail.com) Received: from mail-yw0-x22b.google.com (mail-yw0-x22b.google.com [IPv6:2607:f8b0:4002:c05::22b]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id C10671B00; Sat, 6 Feb 2016 22:37:38 +0000 (UTC) (envelope-from marktinguely@gmail.com) Received: by mail-yw0-x22b.google.com with SMTP id u200so97315ywf.0; Sat, 06 Feb 2016 14:37:38 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=QMshVbVsvMTAih7OGGNtSMNyq2M2A+K5GVwsruaEvYM=; b=LvJ0SiEH6OsN+3ZI+cx3Znwm5XBCvCav+cL+MytMUn8I5noU0k2me7h8FJc6k38AqW 4KZAi4HqpQfg5GWUdueYbuyfItgjN8E265AxFWjNMelba68vt1w3b87doHUWUgNVhSse S8bR/GELVmZyHJN3a6/Y3XKv3F22jGb2U+rNjXmsf4UFmRZTr4HOIG4A/+3zsjgB0AxV 2re//o4Kn1LVOSiJ267NTnJ2TyWhUcBEjV/pbTUepafX1nRS8XFT94lm7750ikl5I7ah 5P4bhh9FxPBPhEku30CqnJWVF+U0WQ5N/Wbf60Me0Qrc3kblSLzKSPbnrpeLKH+L3chp yRJQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=QMshVbVsvMTAih7OGGNtSMNyq2M2A+K5GVwsruaEvYM=; b=TbcDbIStozB7rTcy7H2Qe0NUTng/Jv14ChwhVxnPdS3MAUqToabkB63S6gaovS636Z GVKzAdizhNMrNjU5nQXyOOJmwuH2Hwe/8wZxOLhcZgDf1riXgtE1ByUWchMnSV2Zrx4z 
iILjaoPgk80Osw+UooOG4d2waXx5fcCVQSonVkeixc888V3SgmL1ToK+dPjf06rog+RV KSp67Kc//ZCpx6yZzLsFouef2w6a3ioya/GPbJY4G7dP7ZmcY8OIiP9DdB5I9eHS2ekb hhMjANH0i+u2+Pfr9Lc5AKTl4o4s9qbfBvy17OVNMGRm12flaFeNeDa8Q8XLQ/+0sXEz fUSA== X-Gm-Message-State: AG10YOSRHAzprQ88RDV7WQGaf64vOQTVJ23XFopvoBStjxRak6Q0Ol5dH1MCFimJhSTzqPakMjdXqsU3/HPlBA== MIME-Version: 1.0 X-Received: by 10.129.88.136 with SMTP id m130mr10567922ywb.81.1454798258065; Sat, 06 Feb 2016 14:37:38 -0800 (PST) Received: by 10.129.133.3 with HTTP; Sat, 6 Feb 2016 14:37:38 -0800 (PST) Received: by 10.129.133.3 with HTTP; Sat, 6 Feb 2016 14:37:38 -0800 (PST) In-Reply-To: <56B64086.4090806@freebsd.org> References: <56B64086.4090806@freebsd.org> Date: Sat, 6 Feb 2016 16:37:38 -0600 Message-ID: Subject: Re: Macbook donation (wireless needs fixing) From: Mark Tinguely To: Alfred Perlstein Cc: FreeBSD Hackers Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.20 X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 06 Feb 2016 22:37:39 -0000 Hey Alfred, it is a nice offer. Let's not set some kind person up for failure. What should a potential volunteer (1-2-3, not me!) want to know about the chip on the mac laptops? A simple web search shows the existing Broadcom BCM43XX driverS don't work with Mac laptops. What do we know? Mark On Feb 6, 2016 12:51 PM, "Alfred Perlstein" wrote: > Hey folks, > > So now I have a few macbook laptops, they would be really great to run > FreeBSD except that the wifi isn't supported. > > The macbooks have a broadcom BCM43xx chip inside of it for wifi. > > Is anyone interested in a *free* (to keep) laptop(s) in exchange for > finishing this driver? 
> > Getting FreeBSD running on older mac hardware would open up a pretty large > niche of laptops that are starting to get on the second use market. > > Please let me know, I can drop the laptop off anywhere in the SF bay area, > or possibly toss it into a USPS/UPS box and send it to you. > > My only ask is that after 3 months, if the driver isn't somewhat working > you return it so that it can be donated to a charity supporting the > homeless. > > thanks! > -Alfred > _______________________________________________ > freebsd-hackers@freebsd.org mailing list > https://lists.freebsd.org/mailman/listinfo/freebsd-hackers > To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org" > From owner-freebsd-hackers@freebsd.org Sat Feb 6 22:40:10 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id D59CDAA0185 for ; Sat, 6 Feb 2016 22:40:10 +0000 (UTC) (envelope-from larry.maloney@hackerdojo.com) Received: from mail-pf0-x22a.google.com (mail-pf0-x22a.google.com [IPv6:2607:f8b0:400e:c00::22a]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id A50101C63 for ; Sat, 6 Feb 2016 22:40:10 +0000 (UTC) (envelope-from larry.maloney@hackerdojo.com) Received: by mail-pf0-x22a.google.com with SMTP id w123so88854089pfb.0 for ; Sat, 06 Feb 2016 14:40:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=hackerdojo-com.20150623.gappssmtp.com; s=20150623; h=content-type:mime-version:subject:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to; bh=e8WHS2kfySgP8hkA74zshDihI5vteBkTklJp0ShJ294=; b=PPW7uW8QUnIqXOTp9kM12EdzVJR3I28bAR+RZaMvgwjiSfnawu4jtuueCJOhQ6ErrP NEaXRDQbWO6vrpU+YcgviKFntY4G0pp9uZU2DrDeC3VF32WuOJKIVrsjKanZ3y2jLblI 
9yvvcp+0fbyg996jy1JApV2u68I2wvsnXxxGd/8Gh8HK4dSmCJWVPEJY03e8rp8ZeJtM aCJclt60njKRQmx3SGNiLm3tKbIvKjlZC8kKiEUEhHef8ERTW/QblGzPNKFQk+9euIYR wTlbEQqesaCw9m6JmBnM40eZrn14UojPk3ElpDeOMN62WIA8f79FvdFS9pSUhOg0v7DA 6cRg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:content-type:mime-version:subject:from :in-reply-to:date:cc:content-transfer-encoding:message-id:references :to; bh=e8WHS2kfySgP8hkA74zshDihI5vteBkTklJp0ShJ294=; b=O+wUn+5t0kPNEgbEBaFqeqEZjqy7jD5H9NUgTm0Q40Hb5foj8GdkUazV92HLgW7NeP WJl1zPmxJfn0Rk4UpYIdo2J0Y6F8oz/GAKKPVIW78/YdEBW8YdHalzLuOM32M3iLLReM svLt5sm/l8ZwTP5rzzyZ1hqJuhQA5dKcsOqwhPC28Gktf7vgQafJbU0MtKknrV8f4a9q mfxkBAVGYUl3uixRZDr4eVim9fBFnDBCuc5P0tBpRRvE31JQ32wIslPdK6bhrmlQFZNX 8T8pyuFnC9ym4y0+LaHsmYW8i7RmDB7/5lCBq9XbYqoTmU38Y7XPQ0/kKOMwoe5z48/y RkAw== X-Gm-Message-State: AG10YOQNkGEz960Kd+agPyWLscaAPXxhEPZ1+bIsHTuyBioA7v1xCZtrsl9ydNrhnyRh2Q== X-Received: by 10.98.9.219 with SMTP id 88mr25823020pfj.0.1454798409997; Sat, 06 Feb 2016 14:40:09 -0800 (PST) Received: from [192.168.1.101] (c-73-202-177-47.hsd1.ca.comcast.net. 
[73.202.177.47]) by smtp.gmail.com with ESMTPSA id s23sm15025122pfi.12.2016.02.06.14.40.07 (version=TLS1 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Sat, 06 Feb 2016 14:40:07 -0800 (PST) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 9.2 \(3112\)) Subject: Re: Macbook donation (wireless needs fixing) From: Larry Maloney In-Reply-To: Date: Sat, 6 Feb 2016 14:40:08 -0800 Cc: Alfred Perlstein , FreeBSD Hackers Content-Transfer-Encoding: quoted-printable Message-Id: <0B20C45E-A5AF-4FAB-81F8-4BE49BF4C61D@hackerdojo.com> References: <56B64086.4090806@freebsd.org> To: Mark Tinguely X-Mailer: Apple Mail (2.3112) X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 06 Feb 2016 22:40:11 -0000

I know I installed FreeBSD on a MacBook, circa 2010-2011. (KiputerOS)

I could swear I had the Wifi working with NDIS. (Might have been Broadcom, but not sure)

That would have been 8.x something.

/Larry
> On Feb 6, 2016, at 2:37 PM, Mark Tinguely wrote:
>
> Hey Alfred, it is a nice offer.
>
> Let's not set some kind person up for failure. What should a potential
> volunteer (1-2-3, not me!) want to know about the chip on the mac laptops?
> A simple web search shows the existing Broadcom BCM43XX driverS don't work
> with Mac laptops. What do we know?
>
> Mark
> On Feb 6, 2016 12:51 PM, "Alfred Perlstein" wrote:
>
>> Hey folks,
>>
>> So now I have a few macbook laptops, they would be really great to run
>> FreeBSD except that the wifi isn't supported.
>>
>> The macbooks have a broadcom BCM43xx chip inside of it for wifi.
>>
>> Is anyone interested in a *free* (to keep) laptop(s) in exchange for
>> finishing this driver?
>>
>> Getting FreeBSD running on older mac hardware would open up a pretty large
>> niche of laptops that are starting to get on the second use market.
>>
>> Please let me know, I can drop the laptop off anywhere in the SF bay area,
>> or possibly toss it into a USPS/UPS box and send it to you.
>>
>> My only ask is that after 3 months, if the driver isn't somewhat working
>> you return it so that it can be donated to a charity supporting the
>> homeless.
>>
>> thanks!
>> -Alfred
>> _______________________________________________
>> freebsd-hackers@freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
>> To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"
>>
> _______________________________________________
> freebsd-hackers@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"

From owner-freebsd-hackers@freebsd.org Sat Feb 6 23:54:13 2016 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 9F37DAA0322 for ; Sat, 6 Feb 2016 23:54:13 +0000 (UTC) (envelope-from alfred@freebsd.org) Received: from elvis.mu.org (elvis.mu.org [IPv6:2001:470:1f05:b76::196]) by mx1.freebsd.org (Postfix) with ESMTP id 93393849 for ; Sat, 6 Feb 2016 23:54:13 +0000 (UTC) (envelope-from alfred@freebsd.org) Received: from [IPv6:2600:1010:b062:e98d:89d5:bbdb:5149:880] (unknown [IPv6:2600:1010:b062:e98d:89d5:bbdb:5149:880]) by elvis.mu.org (Postfix) with ESMTPSA id 3470E345A947; Sat, 6 Feb 2016 15:54:01 -0800 (PST) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (1.0) Subject: Re: Macbook donation (wireless needs fixing) From: Alfred Perlstein X-Mailer: iPhone Mail (13D15) In-Reply-To: <0B20C45E-A5AF-4FAB-81F8-4BE49BF4C61D@hackerdojo.com> Date: Sat, 6 Feb 2016 15:53:57 -0800 Cc: Mark Tinguely , FreeBSD Hackers Content-Transfer-Encoding: quoted-printable Message-Id: References: <56B64086.4090806@freebsd.org> <0B20C45E-A5AF-4FAB-81F8-4BE49BF4C61D@hackerdojo.com> To: Larry Maloney X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 06 Feb 2016 23:54:13 -0000

Thanks Larry. Want a native driver to make it plug and play.

Sent from my iPhone

> On Feb 6, 2016, at 2:40 PM, Larry Maloney wrote:
>
> I know I installed FreeBSD on a MacBook, circa 2010-2011. (KiputerOS)
>
> I could swear I had the Wifi working with NDIS. (Might have been Broadcom, but not sure)
>
> That would have been 8.x something.
>
> /Larry
>> On Feb 6, 2016, at 2:37 PM, Mark Tinguely wrote:
>>
>> Hey Alfred, it is a nice offer.
>>
>> Let's not set some kind person up for failure. What should a potential
>> volunteer (1-2-3, not me!) want to know about the chip on the mac laptops?
>> A simple web search shows the existing Broadcom BCM43XX driverS don't work
>> with Mac laptops. What do we know?
>>
>> Mark
>>> On Feb 6, 2016 12:51 PM, "Alfred Perlstein" wrote:
>>>
>>> Hey folks,
>>>
>>> So now I have a few macbook laptops, they would be really great to run
>>> FreeBSD except that the wifi isn't supported.
>>>
>>> The macbooks have a broadcom BCM43xx chip inside of it for wifi.
>>>
>>> Is anyone interested in a *free* (to keep) laptop(s) in exchange for
>>> finishing this driver?
>>>
>>> Getting FreeBSD running on older mac hardware would open up a pretty large
>>> niche of laptops that are starting to get on the second use market.
>>>
>>> Please let me know, I can drop the laptop off anywhere in the SF bay area,
>>> or possibly toss it into a USPS/UPS box and send it to you.
>>>
>>> My only ask is that after 3 months, if the driver isn't somewhat working
>>> you return it so that it can be donated to a charity supporting the
>>> homeless.
>>>
>>> thanks!
>>> -Alfred
>>> _______________________________________________
>>> freebsd-hackers@freebsd.org mailing list
>>> https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
>>> To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"
>> _______________________________________________
>> freebsd-hackers@freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
>> To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"
>
> _______________________________________________
> freebsd-hackers@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"
>