Date: Sun, 11 Jun 2017 19:53:47 +1000 (EST)
From: Bruce Evans <brde@optusnet.com.au>
To: Konstantin Belousov
Cc: Jonathan Looney, John Baldwin, "Jonathan T. Looney",
    src-committers@freebsd.org, svn-src-all@freebsd.org,
    svn-src-head@freebsd.org
Subject: Re: svn commit: r319720 - head/sys/dev/vt
Message-ID: <20170611150813.B891@besplex.bde.org>
In-Reply-To: <20170610182039.GL2088@kib.kiev.ua>
References: <201706082047.v58KlI51079003@repo.freebsd.org>
 <7306919.ixyIA96xWQ@ralph.baldwin.cx>
 <20170610091203.GH2088@kib.kiev.ua>
 <20170610182039.GL2088@kib.kiev.ua>

On Sat, 10 Jun 2017, Konstantin Belousov wrote:

> On Sat, Jun 10, 2017 at 09:33:48AM -0700, Jonathan Looney wrote:
>> Hi Konstantin,
>>
>> On Sat, Jun 10, 2017 at 2:12 AM, Konstantin Belousov wrote:
>>> No, acquire is only specified for loads, and release for stores.
>>> In other words, on some hypothetical ll/sc architecture, the
>>> atomic_add_acq() could be implemented as follows, in asm-pseudocode:
>>>
>>> atomic_add_acq(int x):
>>> 	ll	x, r1
>>> 	acq	x
>>> 	add	1, r1
>>> 	sc	r1, x
>>>
>>> Your use of the atomic does not prevent stores reordering.
>>
>> If this is true, it sounds like the atomic(9) man page needs some
>> revision.  It says:
>>
>>     When an atomic operation has acquire semantics, the effects of
>>     the operation must have completed before any subsequent load or
>>     store (by program order) is performed.  Conversely, acquire
>>     semantics do not require that prior loads or stores have
>>     completed before the atomic operation is performed.
>>
>> Up until now, I have relied on this description of the way acquire
>> barriers work.  If this description is incorrect, I think it is
>> important to fix it quickly.

I think it is correct, but a bit confusing.

Its first sentence says that the acquire operation acts as if it
completes before any subsequent load or store operation (even
non-atomic ones; even to different locations; is it really that
strong?) starts.

Its second sentence is redundant except for "Conversely".  The
completion of earlier operations before the acquire operation is not
the converse of the starting of later operations after the acquire
operation.
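For concreteness, the useful discipline pairs a release store in one
thread with an acquire load in another.  A minimal message-passing
sketch (the producer/consumer functions and the variables here are
illustrations only, not vt code; atomic_store_rel_int() and
atomic_load_acq_int() are the stock atomic(9) operations):

	#include <sys/param.h>
	#include <machine/atomic.h>
	#include <machine/cpu.h>

	static int data;		/* plain payload, no atomics */
	static volatile u_int ready;	/* handshake flag */

	static void
	producer(void)
	{

		data = 42;
		/*
		 * Release: the store to "data" becomes visible no
		 * later than the store to "ready".
		 */
		atomic_store_rel_int(&ready, 1);
	}

	static void
	consumer(void)
	{

		/*
		 * Acquire: no later load or store is performed before
		 * the load that observes "ready" != 0, so "data" is
		 * guaranteed to read as 42 after the loop exits.
		 */
		while (atomic_load_acq_int(&ready) == 0)
			cpu_spinwait();
	}

Drop either half of the pairing (the release on the store side, or the
acquire on the load side) and nothing orders the store to "data"
relative to the store to "ready".  That is the trap described next.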
By itself, the acquire operation gives a sort of read-write barrier
that is obviously not very useful.  Using only acquire operations, I
don't see how to order anything except other acquire operations:

	acquire1
	load/store1	# can be any time after acquire1
	acquire2	# must be after acquire1
	load/store2	# can be any time after acquire2

Since load/store1 may be delayed until after acquire2, it may be mixed
with load/store2 in any order in general.

Release operations give an ordering that pairs correctly with acquire
operations.  That pairing is essentially what is misdescribed as the
"converse" above.

> There are many issues with the atomic(9) man page, and they are not
> easy to fix.  In essence, a useful, implementable and rigorous
> specification of the behaviour of the atomics, or of the memory model
> as a whole (which cannot be avoided if atomics are specified), seems
> to still be a somewhat unsolved scientific problem.
>
> As is, I strongly suggest relying on the standard C and C++
> specifications and augmenting them with some code reading of
> arch/include/atomic.h.  Although it is unfeasible to use
> compiler-provided atomics in the kernel's C runtime, the intended
> behaviour as specified in the standards gives us a strong base for
> getting something that does not have inherent design mistakes.
> Unfortunately, this is the only way to get it correct now.

But the standards are even harder to read (since they are more formal
and more general).

>>>> I'm not claiming that this fixes all bugs in this area.  (In fact,
>>>> I specifically disclaim this.)  But it does stop the crash from
>>>> occurring.
>>>>
>>>> If you still feel there are better mechanisms to achieve the
>>>> desired ordering, please let me know and I'll be happy to fix
>>>> and/or improve this.
>>>
>>> See the pseudocode I posted in my original reply, which uses an
>>> acq/rel pair.

Try to do it without using any atomic ops directly.  Atomic ops are
too hard to use.

Well, I tried it (just by reading the code).  The bug seems to come
from simple foot-shooting.  Normal mutex locking should work, but is
not used for calls to vt_resume_flush_timer().  I think this is not
just to avoid LORs, since initialization of the callouts is done while
holding the lock.

>> The code you posted for vt_resume_flush_timer() would switch
>> vd_timer_armed from 0 to 1 even if VDF_ASYNC is not set; however,
>> that is not what we want.  vd_timer_armed should be left untouched
>> if VDF_ASYNC is not set.
>>
>> It sounds like what I want is some sort of fence in vt_upgrade()
>> like jhb@ specified in his email.  For example:
>>
>>	vd->vd_timer_armed = 1;
>>	atomic_thread_fence_rel();
>>	vd->vd_flags |= VDF_ASYNC;
>>
>> I recognize a fence would be cleaner since I really only needed the
>> barrier and not the atomic operation.  Do you agree the above would
>> be sufficient to ensure store ordering?
>
> No, it is not sufficient.  The release barrier on the write side
> needs a dual acquire barrier on the read side to have an effect.  So
> if you prefer to use fence_rel(), the reader should issue fence_acq()
> between reading vd_flags and doing the cmpset on vd_timer_armed.

The complications seem to be mostly unnecessary, since everything is
under the vt lock in vt_upgrade():

XX void
XX vt_upgrade(struct vt_device *vd)
XX {
XX 	...
XX 	VT_LOCK(vd);

The lock is actually the mutex vd->vd_lock.
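(As an aside: since the protocol is that the callout state is only
supposed to be touched with this mutex held, a lock assertion would
both document that and catch the unlocked callers listed at the end of
this message.  A minimal sketch, assuming only the stock mtx_assert(9)
interface; the exact placement is a suggestion:

	/*
	 * At the top of vt_resume_flush_timer() and any other
	 * function that touches vd_timer or vd_timer_armed:
	 */
	mtx_assert(&vd->vd_lock, MA_OWNED);

With INVARIANTS enabled, this panics at the first unlocked call.)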
XX 	if (vd->vd_curwindow == NULL)
XX 		vd->vd_curwindow = vd->vd_windows[VT_CONSWINDOW];
XX
XX 	register_handlers = 0;
XX 	if (!(vd->vd_flags & VDF_ASYNC)) {

This flag is initially clear, to tell other functions not to do any
callout work because our callout handler and infrastructure are not
initialized yet.  We also have foot-shooting, with the testing of this
flag and the testing and setting of vd->vd_timer_armed not under the
mutex.  This function doesn't have the foot-shooting, but it is
affected by it, since it must be careful to set the flags in an atomic
and/or ordered way to support the foot-shooting.

XX 		/* Attach keyboard. */
XX 		vt_allocate_keyboard(vd);
XX
XX 		/* Init 25 Hz timer. */
XX 		callout_init_mtx(&vd->vd_timer, &vd->vd_lock, 0);

Any callout will be locked by the same mutex that VT_LOCK() just
acquired, and thus cannot race with us.  However, with
EARLY_AP_STARTUP, ordinary output can easily be running on another CPU
(thread).  It uses the foot-shooting to race with us (and complicated
atomic operations to avoid racing with other instances of ordinary
output).  Non-ordinary output like screen switches might do the same
(not very likely early).

XX
XX 		/*
XX 		 * Start timer when everything ready.
XX 		 * Note that the operations here are purposefully ordered.
XX 		 * We need to ensure vd_timer_armed is non-zero before we set
XX 		 * the VDF_ASYNC flag.  That prevents this function from
XX 		 * racing with vt_resume_flush_timer() to update the
XX 		 * callout structure.
XX 		 */
XX 		atomic_add_acq_int(&vd->vd_timer_armed, 1);

The previous version set this to 1 non-atomically, after calling
callout_reset().

XX 		vd->vd_flags |= VDF_ASYNC;
XX 		callout_reset(&vd->vd_timer, hz / VT_TIMERFREQ, vt_timer, vd);
XX 		register_handlers = 1;
XX 	}
XX
XX 	VT_UNLOCK(vd);

It would be more natural to complete our initialization before setting
VDF_ASYNC:

	(void)atomic_cmpset_int(&vd->vd_timer_armed, 0, 1);
	callout_reset(... vt_timer ...);
	vd->vd_flags |= VDF_ASYNC;

The cmpset is not quite right, but it is the same as what other
instances will do.  It is sure to find 0 and set the value to 1; we
just have to set it before VDF_ASYNC.  Then I trust callout_reset() to
complete its setting for all threads.  Then it is safe to set
VDF_ASYNC.  This setting becomes visible to other CPUs later, and it
doesn't matter if this is much later: the other instances can't touch
the callouts until they see it change, callout handlers can't run
until the lock is released, and releasing the lock makes the setting
of VDF_ASYNC visible.

Grep shows at least the following places with the foot-shooting:

- vt_window_switch() drops the lock, then soon calls
  vt_resume_flush_timer().
- vt_mouse_event() calls vt_resume_flush_timer().  I doubt that it
  holds the lock.
- similarly in vt_mouse_state().
- vt_replace_backend() drops the lock, then later calls
  vt_resume_flush_timer().  It calls vt_upgrade() in between, so it
  can clearly regain full locking.

Use lock assertions, like the mtx_assert() sketch above, to find all
the places.

Bruce