From: Tycho Nightingale
Date: Tue, 13 Dec 2016 19:05:14 -0500
Subject: Re: Query about bhyve's blockif_cancel and the signalling mechanisms
In-Reply-To: <631f775d-8d61-55ba-1e7b-8ce4fcadcbf3@freebsd.org>
Cc: Ian Campbell, freebsd-virtualization@freebsd.org, anil@recoil.org
Message-Id: <397B138D-3701-4FB4-A9B3-618CE2624C3C@freebsd.org>
To: Peter Grehan

Hi,

On Dec 13, 2016, at 1:32 AM, Peter Grehan wrote:

> Hi Ian,
>
>> To recap my understanding of the mechanisms at work (glossing over the
>> queue handling and condvars involved etc), the bhyve block_if
>> infrastructure registers a callback for SIGCONT with the mevent
>> subsystem, which is a kevent/kqueue thing which delivers events to the
>> main thread (mevent_dispatch is the last thing in main()) it also sets
>> SIGCONT to SIG_IGN.
>
> That's correct. The intent was to have the signal delivered via the
> kevent callback rather than standard signal delivery.
>
>> When a disk controller device model wants to
>> cancel a block request (e.g. in ahci_port_stop) it calls
>> blockif_cancel which sends a SIGCONT to the blkio thread which has
>> claimed the request, notionally to kick it out of whatever blocking
>> system call it is in and cause it to return an error to the device
>> model.
>
> Yep, that's correct.
>
>> The main thing I do not follow is whether or not the blkio thread is
>> actually interrupted at all when the signal has been configured to be
>> delivered via the kevent/kqueue mechanisms to a 3rd unrelated thread.
>
> It is interrupted on FreeBSD.
>
>> I've dug around in the FreeBSD kevent and signal man pages but I
>> cannot find any part which describes anything of the semantics which
>> bhyve seems to be relying on (which seems to be that the system call
>> in the target thread will return EINTR at some point before the thread
>> which is "handling" the signal via kevent/kqueue sees that event).
>>
>> Have I missed something here or is bhyve relying on some subtle
>> underlying semantics?
>
> I didn't think it too FreeBSD-specific - if a thread is blocked in a
> system call, sending a signal should force it to exit on most Unices.
>
>> I have a secondary concern which is what happens if the IO thread is
>> on its way to making a blocking system call in blockif_proc but has
>> not actually done so when the signal is delivered. It seems like it
>> would simply carry on and make the blocking call with perhaps
>> unexpected consequences (i/o getting wedged, perhaps only until a
>> second reset attempt). I've not actually seen this happening though
>> and there's a chance I'm simply over thinking things after staring at
>> them for so long!
>
> I believe this case is handled - I discussed this at length with Tycho
> when the code was committed a while back.
>
> Tycho - any thoughts ?

ahci_port_stop() is called under the protection of the port soft-state
lock, so that will stem any further requests from landing in the blockif
queue. That's the easy case.

As for blockif requests which are queued, those are simply completed.
The ones that are in-flight all have their status set to BST_BUSY when
they are moved from the pending queue to the busy queue just prior to
being sent to blockif_proc(). It's therefore possible that an
in-flight request (one on the busy list) has yet to call blockif_proc(),
is already inside blockif_proc(), or has just completed
blockif_proc(). In all cases, however, BST_BUSY is cleared in
blockif_complete().
The key is therefore that regardless of where the thread is,
blockif_cancel() will continue to issue pthread_kill() until the
request reaches blockif_complete(), breaking it out of system calls
as necessary.

Does that make sense?

Tycho
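
For anyone who wants to see the "keep signalling until the request completes" idea in isolation, here is a minimal sketch in portable C. It is an illustration under stated assumptions, not the bhyve code: the names (struct request, ST_BUSY, worker, cancel_request, run_cancel_demo) are hypothetical, it uses SIGUSR1 with an empty handler instead of SIGCONT delivered through kevent, it stands in nanosleep() for the blocking I/O call, and it simply polls between signals rather than using whatever synchronisation blockif_cancel() really does. The worker deliberately spends a while "on its way" to the blocking call, so the first few signals arrive too early and only a later one produces EINTR.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <time.h>

/* Hypothetical stand-ins for the request states discussed above. */
enum { ST_BUSY = 1, ST_DONE = 2 };

struct request {
    atomic_int status;  /* ST_BUSY while the request is in flight */
    int        error;   /* errno reported by the simulated I/O call */
};

/* Empty handler: its only job is to make blocking calls return EINTR. */
static void wakeup_handler(int sig) { (void)sig; }

static void *worker(void *arg)
{
    struct request *req = arg;
    struct timespec prep = { 0, 50 * 1000 * 1000 };  /* 50 ms */
    struct timespec io   = { 30, 0 };                /* "blocking I/O" */

    /* The request is busy but not yet blocked: early signals are absorbed
     * here because the sleep is restarted with the remaining time. */
    while (nanosleep(&prep, &prep) == -1 && errno == EINTR)
        ;

    /* The blocking call; a signal arriving now makes it fail with EINTR. */
    req->error = (nanosleep(&io, NULL) == -1) ? errno : 0;

    /* Completion path: this is where the busy state is cleared. */
    atomic_store(&req->status, ST_DONE);
    return NULL;
}

/* Signal the worker repeatedly until the request is no longer busy.
 * Returns the number of signals sent. */
static int cancel_request(pthread_t tid, struct request *req)
{
    struct timespec pause = { 0, 10 * 1000 * 1000 };  /* 10 ms */
    int kicks = 0;

    while (atomic_load(&req->status) == ST_BUSY) {
        pthread_kill(tid, SIGUSR1);
        kicks++;
        nanosleep(&pause, NULL);
    }
    return kicks;
}

/* Runs one cancel cycle; stores the worker's errno in *io_errno and
 * returns the number of signals sent, or -1 on setup failure. */
int run_cancel_demo(int *io_errno)
{
    struct sigaction sa;
    struct request req;
    pthread_t tid;
    int kicks;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = wakeup_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;  /* no SA_RESTART */
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    atomic_init(&req.status, ST_BUSY);
    req.error = 0;
    if (pthread_create(&tid, NULL, worker, &req) != 0)
        return -1;

    kicks = cancel_request(tid, &req);
    pthread_join(tid, NULL);
    *io_errno = req.error;
    return kicks;
}
```

Calling run_cancel_demo() from a main() and printing the result should show more than one signal being sent before the simulated I/O call returns EINTR, which is the point: a single signal can be lost in the window before the blocking call, but a loop keyed on the busy state cannot miss.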