From owner-freebsd-hackers@freebsd.org  Sun Apr  9 01:02:08 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id D8B0BD337B5
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Sun,  9 Apr 2017 01:02:08 +0000 (UTC)
 (envelope-from markmi@dsl-only.net)
Received: from asp.reflexion.net (outbound-mail-210-8.reflexion.net
 [208.70.210.8])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 9F43DC29
 for <freebsd-hackers@freebsd.org>; Sun,  9 Apr 2017 01:02:07 +0000 (UTC)
 (envelope-from markmi@dsl-only.net)
Received: (qmail 13912 invoked from network); 9 Apr 2017 01:03:01 -0000
Received: from unknown (HELO mail-cs-02.app.dca.reflexion.local) (10.81.19.2)
 by 0 (rfx-qmail) with SMTP; 9 Apr 2017 01:03:01 -0000
Received: by mail-cs-02.app.dca.reflexion.local
 (Reflexion email security v8.40.0) with SMTP;
 Sat, 08 Apr 2017 21:02:01 -0400 (EDT)
Received: (qmail 10694 invoked from network); 9 Apr 2017 01:02:01 -0000
Received: from unknown (HELO iron2.pdx.net) (69.64.224.71)
 by 0 (rfx-qmail) with (AES256-SHA encrypted) SMTP; 9 Apr 2017 01:02:01 -0000
Received: from [192.168.1.106] (c-76-115-7-162.hsd1.or.comcast.net
 [76.115.7.162])
 by iron2.pdx.net (Postfix) with ESMTPSA id BEB76EC8172;
 Sat,  8 Apr 2017 18:02:00 -0700 (PDT)
Content-Type: text/plain; charset=us-ascii
Mime-Version: 1.0 (Mac OS X Mail 10.3 \(3273\))
Subject: Re: The arm64 fork-then-swap-out-then-swap-in failures: a program
 source for exploring them
From: Mark Millard <markmi@dsl-only.net>
In-Reply-To: <163B37B0-55D6-498E-8F52-9A95C036CDFA@dsl-only.net>
Date: Sat, 8 Apr 2017 18:02:00 -0700
Cc: andrew@freebsd.org,
 Konstantin Belousov <kostikbel@gmail.com>
Content-Transfer-Encoding: quoted-printable
Message-Id: <08E7A5B0-8707-4479-9D7A-272C427FF643@dsl-only.net>
References: <4DEA2D76-9F27-426D-A8D2-F07B16575FB9@dsl-only.net>
 <163B37B0-55D6-498E-8F52-9A95C036CDFA@dsl-only.net>
To: freebsd-arm <freebsd-arm@freebsd.org>,
 freebsd-hackers@freebsd.org
X-Mailer: Apple Mail (2.3273)
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Apr 2017 01:02:08 -0000

[I've identified the code path behind the arm64 small allocations
turning into zeros after a later fork-then-swapout-then-back-in:
specifically, the path responsible for the ongoing RES(ident memory)
size decrease that "top -PCwaopid" shows before the fork/swap
sequence. Hopefully I've also exposed enough related information
for someone who knows what they are doing to get started on a
specific investigation, looking for a fix. I'd like a pine64+
2GB to be able to complete buildworld despite the forking and
swapping involved (yep: for a time, zero RES(ident memory) for
some processes involved in the build).]

On 2017-Apr-7, at 1:16 AM, Mark Millard <markmi at dsl-only.net> wrote:

> [I now can: (A) crudely control the number of allocated
> pages that get zeros (that should not). (B) Watch a
> "top -PCwaopid" display and predict if the
> test-architecture will fail or not before the fork()
> or swap-out happens.]
>
> On 2017-Apr-4, at 8:00 PM, Mark Millard <markmi@dsl-only.net> wrote:
>
>> Uncommenting/commenting parts of the below program allows
>> exploring the problems with fork-then-swap-out-then-in on
>> arm64.
>>
>> Note: By swap-out I mean that zero RES(ident memory) results,
>>     for the process(es) of interest, as shown by
>>     "top -PCwaopid" .
>>
>> I discovered recently that swapping-out just before the
>> fork() prevents the failure from the swapping after the
>> fork().
>>
>> Note:
>> Without the fork() no problem happens. Without the later
>> swap-out no problem happens. Both are required. But some
>> activities before the fork() or between fork() and the
>> swap-out prevent the failures.
>>
>> Some of the comments are based on a pine64+ 2GB context.
>> I use stress to force swap-outs during some sleeps in
>> the program. See also Bugzilla 217239 and 217138. (I now
>> expect that they have the same cause.)
>>
>> In my environment I've seen the fork-then-swap-out/swap-in
>> failures on a pine64+ 2GB and a rpi3. They are repeatable
>> on both. I do not have access to server-class machines, or
>> any other arm64 machines.
>>
>>
>> // swap_testing5.c
>>
>> // Built via (cc was clang 4.0 in my case):
>> //
>> // cc -g -std=c11 -Wpedantic -o swaptesting5 swap_testing5.c
>> // -O0 and -O2 also get the problem.
>>
>> // Note: jemalloc's tcache needs to be enabled to get the failure.
>> //       But FreeBSD can get into a state where /etc/malloc.conf
>> //       -> 'tcache:false' is ineffective. Also: the allocation
>> //       size needs to be sufficiently small (<= SMALL_MAXCLASS)
>> //       to see the problem. Other comments are based on a specific
>> //       context (pine64+ 2GB).
>>
>> #include <signal.h>     // for raise(.), SIGABRT (induce core dump)
>> #include <unistd.h>     // for fork(), sleep(.)
>> #include <sys/types.h>  // for pid_t
>> #include <sys/wait.h>   // for wait(.)
>>
>> extern void test_setup(void);         // Sets up the memory byte patterns.
>> extern void test_check(void);         // Tests the memory byte patterns.
>> extern void memory_willneed(void); // For seeing if
>>                                  // posix_madvise(.,.,POSIX_MADV_WILLNEED)
>>                                  // makes a difference.
>>
>> int main(void) {
>>   sleep(30); // Potentially force swap-out here.
>>              // [Swap-out here does not avoid later failures.]
>>
>>   test_setup();
>>   test_check(); // Before potential sleep(.)/swap-out or fork(.) [passes]
>>
>>   sleep(30); // Potentially force swap-out here.
>>              // [Everything below passes if swapped-out here,
>>              //  no matter if there are later swap-outs
>>              //  or not.]
>>
>>   pid_t pid = fork(); // To test no-fork use: = 0; no-fork does not fail.
>>   int wait_status = 0;
>>
>>   // HERE: After fork; before sleep/swap-out/wait.
>>
>>   // if (0 <  pid) memory_willneed(); // Does not prevent either parent or
>>                                    // child failure if enabled.
>>
>>   // if (0 == pid) memory_willneed(); // Prevents both the parent and the
>>                                    // child failure. Disable to see
>>                                    // failure of both parent and child.
>>                                    // [Presuming no prior swap-out: that
>>                                    // would make everything pass.]
>>
>>   // During sleep/wait: manually force this process to
>>   // swap out. I use something like:
>>   //     stress -m 1 --vm-bytes 1800M
>>   // in another shell and ^C'ing it after top shows the
>>   // swapped status desired. 1800M just happened to work
>>   // on the Pine64+ 2GB that I was using. I watch with
>>   // top -PCwaopid [checking for zero RES(ident memory)].
>>
>>   if (0 < pid) {
>>       sleep(30);    // Intend to swap-out during sleep.
>>       // test_check(); // Test in parent before child runs (longer sleep).
>>                     // This test fails if run for a failing region_size
>>                     // unless earlier preventing-activity happened.
>>       wait(&wait_status); // Only if test_check above passes or is
>>                           // disabled above.
>>   }
>>   if (-1 != wait_status && 0 <= pid) {
>>       if (0 == pid) { sleep(90); } // Intend to swap-out during sleep.
>>       test_check(); // Fails for small-enough region_size, both
>>                     // parent and child processes, unless earlier
>>                     // preventing-activity happened.
>>   }
>> }
>>
>> // The memory and test code follows.
>>
>> #include <stddef.h>     // for size_t, NULL
>> #include <stdlib.h>     // for malloc(.), free(.)
>> #include <sys/mman.h>   // for POSIX_MADV_WILLNEED, posix_madvise(.,.,.)
>>
>> #define region_size (14u*1024u)
>>       // Bad dyn_region pattern, parent and child processes examples:
>>       // 256u, 2u*1024u, 4u*1024u, 8u*1024u, 9u*1024u, 12u*1024u, 14u*1024u
>>       // No failure examples:
>>       // 14u*1024u+1u, 15u*1024u, 16u*1024u, 32u*1024u, 256u*1024u*1024u
>> #define num_regions (256u*1024u*1024u/region_size)
>>
>> typedef volatile unsigned char value_type;
>> struct region_struct { value_type array[region_size]; };
>> typedef struct region_struct region;
>> static region * volatile dyn_regions[num_regions] = {NULL,};
>>
>> static value_type value(size_t v) { return (value_type)((v&0xFEu)|0x1u); }
>>                 // value avoids zero values: the bad values are zeros.
>>
>> void test_setup(void) {
>>   for(size_t i=0u; i<num_regions; i++) {
>>       dyn_regions[i] = malloc(sizeof(region));
>>       if (!dyn_regions[i]) raise(SIGABRT);
>>
>>       for(size_t j=0u; j<region_size; j++) {
>>           (*dyn_regions[i]).array[j] = value(j);
>>       }
>>   }
>> }
>>
>> void memory_willneed(void) {
>>   for(size_t i=0u; i<num_regions; i++) {
>>       (void) posix_madvise(dyn_regions[i], region_size, POSIX_MADV_WILLNEED);
>>   }
>> }
>>
>> static volatile size_t first_failure_idx = 0u; // dyn_regions index
>> static volatile size_t first_failure_pos = 0u; //   sub-array index
>> static volatile size_t after_bad_idx     = 0u; // dyn_regions index
>> static volatile size_t after_bad_pos     = 0u; //   sub-array index
>> static volatile size_t after_good_idx    = 0u; // dyn_regions index
>> static volatile size_t after_good_pos    = 0u; //   sub-array index
>>
>> // Note: Some failing cases get (conjunctive notation):
>> //
>> //    0 == first_failure_idx < after_bad_idx < after_good_idx == num_regions
>> // && 0 == first_failure_pos && 0<=after_bad_pos<=region_size && after_good_pos==0
>> // && (after_bad_pos is a multiple of the page size in Bytes, here:
>> //     after_bad_pos==N*4096 for some non-negative integral value N)
>> //
>> // other failing cases instead fail with:
>> //
>> //    0 == first_failure_idx && num_regions == after_bad_idx == after_good_idx
>> // && 0 == first_failure_pos == after_bad_pos == after_good_pos
>> //
>> // after_bad_idx strongly tends to vary from failing run to failing run
>> // as does after_bad_pos.
>>
>> // Note: The working cases get:
>> //
>> //    num_regions == first_failure_idx == after_bad_idx == after_good_idx
>> // && 0 == first_failure_pos == after_bad_pos == after_good_pos
>>
>> void test_check(void) {
>>   first_failure_idx = first_failure_pos = 0u;
>>
>>   while (first_failure_idx < num_regions) {
>>       while (  first_failure_pos < region_size
>>             && (  value(first_failure_pos)
>>                == (*dyn_regions[first_failure_idx]).array[first_failure_pos]
>>                )
>>             ) {
>>           first_failure_pos++;
>>       }
>>
>>       if (region_size != first_failure_pos) break;
>>
>>       first_failure_idx++;
>>       first_failure_pos = 0u;
>>   }
>>
>>   after_bad_idx = first_failure_idx;
>>   after_bad_pos = first_failure_pos;
>>
>>   while (after_bad_idx < num_regions) {
>>       while (  after_bad_pos < region_size
>>             && (  value(after_bad_pos)
>>                != (*dyn_regions[after_bad_idx]).array[after_bad_pos]
>>                )
>>             ) {
>>           after_bad_pos++;
>>       }
>>
>>       if(region_size != after_bad_pos) break;
>>
>>       after_bad_idx++;
>>       after_bad_pos = 0u;
>>   }
>>
>>   after_good_idx = after_bad_idx;
>>   after_good_pos = after_bad_pos;
>>
>>   while (after_good_idx < num_regions) {
>>       while (  after_good_pos < region_size
>>             && (  value(after_good_pos)
>>                == (*dyn_regions[after_good_idx]).array[after_good_pos]
>>                )
>>             ) {
>>           after_good_pos++;
>>       }
>>
>>       if(region_size != after_good_pos) break;
>>
>>       after_good_idx++;
>>       after_good_pos = 0u;
>>   }
>>
>>   if (num_regions != first_failure_idx) raise(SIGABRT);
>> }
>
>
> I've found that for the above swap_testing5.c
> I can make variations that change how much of the
> allocated region prefix ends up zero vs. stays good.
>
> I vary the sleep time between testing the initialized
> allocations and doing the fork. The longer the sleep
> the more zero pages show up (be sure to read the
> comments):
>
> # diff swap_testing[56].c
> 1c1
> < // swap_testing5.c
> ---
>> // swap_testing6.c
> 5c5
> < // cc -g -std=c11 -Wpedantic -o swaptesting5 swap_testing5.c
> ---
>> // cc -g -std=c11 -Wpedantic -o swaptesting5 swap_testing6.c
> 33c33
> <     sleep(30); // Potentially force swap-out here.
> ---
>>    sleep(150); // Potentially force swap-out here.
> 37a38,48
>>               // For no-swap-out here cases:
>>               //
>>               // The longer the sleep here the more allocations
>>               // that end up as zero.
>>               //
>>               // top's Mem Active, Inact, Wired, Buf, Free and
>>               // Swap Total, Used, and Free stay unchanged.
>>               // What does change is the process RES decreases
>>               // while the process SIZE and SWAP stay unchanged
>>               // during this sleep.
>>
>
> NOTE: On other architectures that I've tried (such as armv6/v7)
>      RES does not decrease during the sleep --and the problem
>      does not happen even for the longest sleeps I've tried.
>
>      (I use "stress -m 2 --vm-bytes 900M" on armv6/v7 instead
>      of -m 1 --vm-bytes 1800M because that large in one
>      process is not allowed.)
>
> So watching top's RES during the sleep (longer than a few
> seconds) just before the fork() predicts the later
> fails-vs.-not status: If RES decreases (while other things
> associated with the process status stay the same) then
> there will be a failure.
>
> At this point I've no clue why the sleeping process has
> a decreasing RES(ident memory) size.
>
> I infer that without the sleep there still is a small
> amount of loss of RES but on too short of a timescale
> to observe in a "top -PCwaopid" or other such: in other
> words that the same behavior is causing the failure then
> as well, possibly for a loss of only one page of RES.
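
The RES observation quoted above could also be made from inside a
test program instead of by watching top. What follows is only a
minimal sketch, assuming mincore(2) reports per-page residency as
its manual page describes. resident_pages() and map_and_touch() are
hypothetical helper names that are not part of swap_testing5.c, and
the sketch uses page-aligned mmap'd memory rather than the small
jemalloc allocations that trigger the failure.

```c
#define _DEFAULT_SOURCE   /* glibc: expose mincore() and MAP_ANONYMOUS */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: count the pages of [base, base+len) that the
 * kernel reports as resident. base must be page-aligned.
 * Returns -1 on error. */
long resident_pages(void *base, size_t len) {
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    size_t npages = (len + (size_t)page - 1) / (size_t)page;
    unsigned char *vec = malloc(npages ? npages : 1);
    if (vec == NULL) return -1;
    /* The vec parameter is char * on FreeBSD and unsigned char * on
     * Linux; converting through void * satisfies both prototypes. */
    if (mincore(base, len, (void *)vec) != 0) { free(vec); return -1; }
    long count = 0;
    for (size_t i = 0; i < npages; i++)
        count += vec[i] & 1;          /* low bit: page is in core */
    free(vec);
    return count;
}

/* Hypothetical helper: map npages anonymous pages and write one
 * nonzero byte into each so that every page gets faulted in.
 * Returns NULL on failure. */
void *map_and_touch(size_t npages) {
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    unsigned char *p = mmap(NULL, npages * page, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return NULL;
    for (size_t i = 0; i < npages; i++)
        p[i * page] = 1;
    return p;
}
```

Calling resident_pages() once right after the setup and again at the
end of the long sleep would give a number to compare with what top
shows for RES. Whether it tracks the same decrease on arm64 is an
assumption that would need checking.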


I've been able to identify what code sequence
is gradually removing the "small_mappings" via
some breakpointing in the kernel after reaching
the "should be just sleeping" status. Specifically
I started with breakpointing when
pmap_resident_count_dec was on the call stack
in order to see the call chain(s) that lead to
it being called while RES(ident memory) is
gradually decreasing during the sleep that
is just before forking.

(tid 100067 is [pagedaemon{pagedaemon}], which
is in vm_pageout_worker. bt does not show inlined
layers.)

[ thread pid 17 tid 100067 ]
Breakpoint at   $x.1:   undefined       d65f03c0
db> bt
Tracing pid 17 tid 100067 td 0xfffffd0001c4aa00
. . .
handle_el1h_sync() at pmap_remove_l3+0xdc
        pc = 0xffff000000604870  lr = 0xffff000000611158
        sp = 0xffff000083a49980  fp = 0xffff000083a49a40

pmap_remove_l3() at pmap_ts_referenced+0x580
        pc = 0xffff000000611158  lr = 0xffff000000615c50
        sp = 0xffff000083a49a50  fp = 0xffff000083a49ac0

pmap_ts_referenced() at vm_pageout+0xe60
        pc = 0xffff000000615c50  lr = 0xffff0000005d1f74
        sp = 0xffff000083a49ad0  fp = 0xffff000083a49b50

vm_pageout() at fork_exit+0x94
        pc = 0xffff0000005d1f74  lr = 0xffff0000002e01c0
        sp = 0xffff000083a49b60  fp = 0xffff000083a49b90

fork_exit() at fork_trampoline+0x10
        pc = 0xffff0000002e01c0  lr = 0xffff0000006177b4
        sp = 0xffff000083a49ba0  fp = 0x0000000000000000

It turns out that pmap_ts_referenced is on its:

small_mappings:
. . .

path for the above so the pmap_remove_l3 call is
the one from that execution path. (Found by more
breakpointing after enabling such on the paths.)

So this is the path with:
(breakpoint hook not shown)

                                /*
                                 * Wired pages cannot be paged out so
                                 * doing accessed bit emulation for
                                 * them is wasted effort. We do the
                                 * hard work for unwired pages only.
                                 */
                                pmap_remove_l3(pmap, pte, pv->pv_va, tpde,
                                    &free, &lock);
                                pmap_invalidate_page(pmap, pv->pv_va);
                                cleared++;
                                if (pvf == pv)
                                        pvf = NULL;
                                pv = NULL;
                                . . .

pmap_remove_l3 decrements the resident_count in
this sequence.

As far as I can tell, in the failing tests this code is
eliminating the content of pages that have no backing
store yet (not swapped-out yet, by test design). The
observed behavior is that the pages that have the above
happen end up as zero pages after the later
fork-then-swapout-then-back-in .

I do not see anything putting the pages that this
happens to into any other lists to keep track of
the page content. The swap-out and swap-in seem to
have ignored these pages and to have been based on
automatically zeroed pages instead.

Note that the (or a) question might be if these
pages should have ever gotten to this code at
all. (I'm no expert overall.) But that might
get into why POSIX_MADV_WILLNEED spanning each
page is sufficient to avoid the zeros issue for
fork-then-swapout-and-back-in. I'll only write
here about what the backtrace code seems to be
doing if I'm interpreting correctly.

One oddity here is that pmap_remove_l3 does its own
pmap_invalidate_page to invalidate the same tlb entry as
the above pmap_invalidate_page, so a double-invalidate.
(I've no clue if such is just suboptimal vs. a form of
error.)

pmap_remove_l3 here does things that the analogous
sys/arm/arm/pmap-v6.c's pmap_ts_referenced does not
do and pmap-v6 does something this code does not.

arm64's pmap_remove_l3 does (in summary):

  pmap_invalidate_page
  decrements the resident_count
  pmap_unwire_l3
(then pmap_ts_referenced's small_mappings code
 does another pmap_invalidate_page for the
 same argument values)

arm pmap-v6's pmap_ts_referenced's small_mappings
code does:

  conditional vm_page_dirty
  pte2_clear_bit for PTE2_A
  pmap_tlb_flush

There is, for example, no decrement of the
resident_count involved (that I found anyway).

But I've no clue just what should be analogous
vs. what should not between pmap-v6 and arm64's
pmap code in this area.
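
To make the contrast between the two small_mappings summaries above
concrete, here is a toy user-space model. It is only a sketch under
stated assumptions: struct toy_pte, struct toy_pmap and the toy_*
functions are hypothetical names invented for illustration and are
not kernel code. The model keeps nothing but a valid flag, an
accessed flag and a resident count, and leaves out TLB invalidation,
pv lists and locking entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NPTE 8u

/* Hypothetical toy types; not the kernel's pmap or pt_entry_t. */
struct toy_pte  { bool valid; bool accessed; };
struct toy_pmap { struct toy_pte pte[TOY_NPTE]; long resident_count; };

/* Map page i and mark it accessed, as a fresh mapping would be. */
void toy_enter(struct toy_pmap *pm, size_t i) {
    assert(i < TOY_NPTE && !pm->pte[i].valid);
    pm->pte[i].valid = true;
    pm->pte[i].accessed = true;
    pm->resident_count++;
}

/* arm64-like step, per the summary: the reference is "cleared" by
 * removing the mapping, which also decrements resident_count.
 * Returns 1 if a reference was cleared, else 0. */
int toy_ts_referenced_remove(struct toy_pmap *pm, size_t i) {
    assert(i < TOY_NPTE);
    if (!pm->pte[i].valid || !pm->pte[i].accessed) return 0;
    pm->pte[i].valid = false;
    pm->pte[i].accessed = false;
    pm->resident_count--;
    return 1;
}

/* pmap-v6-like step, per the summary: only the accessed flag is
 * cleared; the mapping and resident_count are left alone.
 * Returns 1 if a reference was cleared, else 0. */
int toy_ts_referenced_clear(struct toy_pmap *pm, size_t i) {
    assert(i < TOY_NPTE);
    if (!pm->pte[i].valid || !pm->pte[i].accessed) return 0;
    pm->pte[i].accessed = false;
    return 1;
}
```

Both variants report the same number of cleared references, but only
the remove-style one lowers the resident count, which matches the
shrinking RES seen on arm64 and not on armv6/v7. The model says
nothing about which behavior is the correct one.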

I'll also note that the code before the
arm64 small_mappings code also uses
pmap_remove_l3 but does not do the
decrement nor the extra pmap_invalidate_page
(for example). But again I do not know
how analogous the two paths should be.

Only the small_mappings path seems to have the
end-up-with-zeros problem for the later
fork-then-swap-out and then swap-back-in
context.

===
Mark Millard
markmi at dsl-only.net


From owner-freebsd-hackers@freebsd.org  Sun Apr  9 10:13:49 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id BE0B9D347F9
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Sun,  9 Apr 2017 10:13:49 +0000 (UTC)
 (envelope-from ablacktshirt@gmail.com)
Received: from mail-pf0-x243.google.com (mail-pf0-x243.google.com
 [IPv6:2607:f8b0:400e:c00::243])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 8D9CC1A5
 for <freebsd-hackers@freebsd.org>; Sun,  9 Apr 2017 10:13:49 +0000 (UTC)
 (envelope-from ablacktshirt@gmail.com)
Received: by mail-pf0-x243.google.com with SMTP id i5so3126316pfc.3
 for <freebsd-hackers@freebsd.org>; Sun, 09 Apr 2017 03:13:49 -0700 (PDT)
X-Received: by 10.98.103.1 with SMTP id b1mr1919350pfc.184.1491732828452;
 Sun, 09 Apr 2017 03:13:48 -0700 (PDT)
Received: from [192.168.0.100] ([110.64.91.54])
 by smtp.gmail.com with ESMTPSA id x30sm18654332pgc.2.2017.04.09.03.13.45
 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Sun, 09 Apr 2017 03:13:47 -0700 (PDT)
Subject: Re: Understanding the FreeBSD locking mechanism
To: Ed Schouten <ed@nuxi.nl>
References: <e99b6366-7d30-a889-b7db-4a3b3133ff5e@gmail.com>
 <CABh_MKkbVVi+gTkaBVDvVfRggS6pbHKJE_VbYBZpAaTCZ81b7Q@mail.gmail.com>
Cc: freebsd-hackers@freebsd.org
From: Yubin Ruan <ablacktshirt@gmail.com>
Message-ID: <c72c0ee3-328d-3efc-e8a0-4d6c0d5c8cee@gmail.com>
Date: Sun, 9 Apr 2017 18:13:40 +0800
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:45.0) Gecko/20100101
 Thunderbird/45.7.0
MIME-Version: 1.0
In-Reply-To: <CABh_MKkbVVi+gTkaBVDvVfRggS6pbHKJE_VbYBZpAaTCZ81b7Q@mail.gmail.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
X-List-Received-Date: Sun, 09 Apr 2017 10:13:49 -0000

On 2017/4/6 17:31, Ed Schouten wrote:
> Hi Yubin,
>
> 2017-04-06 11:16 GMT+02:00 Yubin Ruan <ablacktshirt@gmail.com>:
>> Does this function provides the ordinary "spinlock" functionality? There
>> is no special "test-and-set" instruction, and neither any extra locking
>> to protect internal data structure manipulation. Isn't this subjected to
>> race condition?
>
> Locking a spinlock is done through macro mtx_lock_spin(), which
> expands to __mtx_lock_spin() in sys/sys/mutex.h. That macro first
> calls into the function you looked at, spinlock_enter(), to disable
> interrupts. It then calls into the _mtx_obtain_lock_fetch() to do the
> test-and-set operation you were looking for.

Thanks for replying. I have read some of that code.

Just a few more questions, if you don't mind:

(1) Why are spinlocks forced to disable interrupts in FreeBSD?

 From the book "The Design and Implementation of the FreeBSD Operating
System", the authors say "spinning can result in deadlock if a thread
interrupted the thread that held a mutex and then tried to acquire the
mutex"... (section 4.3, Mutex Synchronization, paragraph 4)

I don't see why a spinlock (or *spin mutex* in the FreeBSD
world) has to disable interrupts. Being interrupted does not necessarily
mean a deadlock. Assume that thread A, holding a lock T, gets interrupted
by another thread B (a context switch here) and thread B tries to acquire
lock T. After finding out that lock T has already been acquired,
thread B will just spin until it gets preempted, after which thread A
gets woken up, runs, and releases lock T. So there is not
necessarily any deadlock even if thread A gets interrupted.

I can think of only two conditions where using a spinlock without
disabling interrupts will cause deadlock:

#######1, spinlock used in an interrupt handler
If a thread A holding a spinlock T gets interrupted and the interrupt
handler responsible for this interrupt tries to acquire T, then we have
a deadlock, because A never gets a chance to run before the
interrupt handler returns, and the interrupt handler, unfortunately,
will continue to spin ... so in this situation, one has to disable
interrupts before spinning.

As far as I know, Linux provides two kinds of spinlocks:

   spin_lock(..);   /* spinlock that does not disable interrupts */
   spin_lock_irqsave(...); /* spinlock that disables local interrupts */
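
Condition 1 can be imitated in user space, with a signal handler
standing in for the interrupt handler and signal blocking standing in
for disabling interrupts. This is only an analogy and a minimal
sketch, not kernel code: toy_lock, toy_handler() and run_case() are
hypothetical names, the program is assumed to be single-threaded, and
the handler gives up after a bounded number of tries so that the demo
terminates where a real handler would spin forever.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

static atomic_flag toy_lock = ATOMIC_FLAG_INIT;       /* spin lock "T" */
static volatile sig_atomic_t handler_got_lock = -1;   /* -1: not run yet */

/* Stand-in for an interrupt handler that wants lock T. */
static void toy_handler(int sig) {
    (void)sig;
    handler_got_lock = 0;
    for (int tries = 0; tries < 1000; tries++) {
        if (!atomic_flag_test_and_set(&toy_lock)) {
            handler_got_lock = 1;           /* acquired T */
            atomic_flag_clear(&toy_lock);
            return;
        }
    }
    /* Still 0 here: the holder cannot run until we return. */
}

/* Take T, get "interrupted" while holding it, then release T.
 * With mask_signals the "interrupt" is held off until after the
 * release. Returns 1 if the handler obtained T, 0 if it could not,
 * -1 on setup failure. */
int run_case(bool mask_signals) {
    struct sigaction sa;
    sigset_t usr1, saved;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = toy_handler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0) return -1;

    sigemptyset(&usr1);
    sigaddset(&usr1, SIGUSR1);
    handler_got_lock = -1;

    /* "Interrupts" off or on for the critical section. */
    sigprocmask(mask_signals ? SIG_BLOCK : SIG_UNBLOCK, &usr1, &saved);

    while (atomic_flag_test_and_set(&toy_lock)) { }   /* acquire T */
    raise(SIGUSR1);                                   /* the "interrupt" */
    atomic_flag_clear(&toy_lock);                     /* release T */

    sigprocmask(SIG_UNBLOCK, &usr1, NULL);  /* a pending signal runs here */
    sigprocmask(SIG_SETMASK, &saved, NULL);
    assert(handler_got_lock != -1);         /* the handler must have run */
    return handler_got_lock;
}
```

In this model run_case(false) has the handler fail to get T because
it runs on top of the lock holder, while run_case(true) lets the
holder finish first, so the handler succeeds. That corresponds to the
difference between the two Linux calls listed above, within the
limits of the analogy.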


#######2, priority inversion problem
If thread B with a higher priority comes in and tries to acquire the lock
that thread A currently holds, then thread B will spin, while at the
same time thread A has no chance to run because it has lower priority,
and so cannot release the lock.
(I haven't investigated the source code enough, so I don't know
how FreeBSD and Linux handle this priority inversion problem. Maybe
they use priority inheritance or random boosting?)

thanks,
Yubin Ruan

From owner-freebsd-hackers@freebsd.org  Sun Apr  9 12:27:25 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 6CA0DD2AC9F;
 Sun,  9 Apr 2017 12:27:25 +0000 (UTC)
 (envelope-from kostikbel@gmail.com)
Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id D253DF49;
 Sun,  9 Apr 2017 12:27:24 +0000 (UTC)
 (envelope-from kostikbel@gmail.com)
Received: from tom.home (kib@localhost [127.0.0.1])
 by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id v39CRF34047280
 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO);
 Sun, 9 Apr 2017 15:27:15 +0300 (EEST)
 (envelope-from kostikbel@gmail.com)
DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua v39CRF34047280
Received: (from kostik@localhost)
 by tom.home (8.15.2/8.15.2/Submit) id v39CRFNJ047279;
 Sun, 9 Apr 2017 15:27:15 +0300 (EEST)
 (envelope-from kostikbel@gmail.com)
X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com
 using -f
Date: Sun, 9 Apr 2017 15:27:15 +0300
From: Konstantin Belousov <kostikbel@gmail.com>
To: Mark Millard <markmi@dsl-only.net>
Cc: freebsd-arm <freebsd-arm@freebsd.org>, freebsd-hackers@freebsd.org,
 andrew@freebsd.org
Subject: Re: The arm64 fork-then-swap-out-then-swap-in failures: a program
 source for exploring them
Message-ID: <20170409122715.GF1788@kib.kiev.ua>
References: <4DEA2D76-9F27-426D-A8D2-F07B16575FB9@dsl-only.net>
 <163B37B0-55D6-498E-8F52-9A95C036CDFA@dsl-only.net>
 <08E7A5B0-8707-4479-9D7A-272C427FF643@dsl-only.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <08E7A5B0-8707-4479-9D7A-272C427FF643@dsl-only.net>
User-Agent: Mutt/1.8.0 (2017-02-23)
X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00,
 DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no
 autolearn_force=no version=3.4.1
X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home
X-List-Received-Date: Sun, 09 Apr 2017 12:27:25 -0000

On Sat, Apr 08, 2017 at 06:02:00PM -0700, Mark Millard wrote:
> [I've identified the code path behind the arm64 small allocations
> turning into zeros after a later fork-then-swapout-then-back-in:
> specifically, the path responsible for the ongoing RES(ident memory)
> size decrease that "top -PCwaopid" shows before the fork/swap
> sequence. Hopefully I've also exposed enough related information
> for someone who knows what they are doing to get started on a
> specific investigation, looking for a fix. I'd like a pine64+
> 2GB to be able to complete buildworld despite the forking and
> swapping involved (yep: for a time, zero RES(ident memory) for
> some processes involved in the build).]

I was not able to follow the walls of text, but I do not think that
pmap_ts_referenced() is the real culprit there.

Is my impression right that the issue occurs on fork, and looks like
memory corruption, where some page suddenly becomes zero-filled?
And swapping seems to be involved?  It would be interesting to see
whether the problem is reproducible on non-arm64 machines, e.g. armv7
or amd64.

If the answers to my two questions are yes, there is probably some bug in
the arm64 pmap handling of the dirty bit emulation.  ARMv8.0 does not
provide a hardware dirty bit, and pmap interprets an accessed writeable
page as unconditionally dirty.  Moreover, the accessed bit is also not
maintained by hardware; instead it should be set by pmap.  And the arm64
pmap sets the AF bit unconditionally when creating a valid pte.

Hmm, could you try the following patch? I did not even compile it.
diff --git a/sys/arm64/arm64/pmap.c b/sys/arm64/arm64/pmap.c
index 3d5756ba891..55aa402eb1c 100644
--- a/sys/arm64/arm64/pmap.c
+++ b/sys/arm64/arm64/pmap.c
@@ -2481,6 +2481,11 @@ pmap_protect(pmap_t pmap, vm_offset_t sva, vm_offset_t eva, vm_prot_t prot)
 		    sva += L3_SIZE) {
 			l3 = pmap_load(l3p);
 			if (pmap_l3_valid(l3)) {
+				if ((l3 & ATTR_SW_MANAGED) &&
+				    pmap_page_dirty(l3)) {
+					vm_page_dirty(PHYS_TO_VM_PAGE(l3 &
+					    ~ATTR_MASK));
+				}
 				pmap_set(l3p, ATTR_AP(ATTR_AP_RO));
 				PTE_SYNC(l3p);
 				/* XXX: Use pmap_invalidate_range */

From owner-freebsd-hackers@freebsd.org  Sun Apr  9 13:28:53 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id BBEE7D3507E
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Sun,  9 Apr 2017 13:28:53 +0000 (UTC)
 (envelope-from vasanth.raonaik@gmail.com)
Received: from mail-oi0-x232.google.com (mail-oi0-x232.google.com
 [IPv6:2607:f8b0:4003:c06::232])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 8053EE6B
 for <freebsd-hackers@freebsd.org>; Sun,  9 Apr 2017 13:28:53 +0000 (UTC)
 (envelope-from vasanth.raonaik@gmail.com)
Received: by mail-oi0-x232.google.com with SMTP id g204so51509505oib.1
 for <freebsd-hackers@freebsd.org>; Sun, 09 Apr 2017 06:28:53 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=mime-version:references:in-reply-to:from:date:message-id:subject:to
 :cc; bh=MvooiGLvDttcl//L/B/qEVo9aBPnzYDm2YR5CZZI1xA=;
 b=JME6LdWKTYMqwGo62ty6Yi9tziia5GvNv5D470u29z38p9QBKmNm8YgH2DhJs2TE9q
 FnTilbvcqipZ6bhQ8KAnsRPrrSunmlEN9o1uLeyp3VIVEE3wcsKD2tHJQLfioWVKD7C5
 dIVLDaXmb2epdrjtB2kTUHCHCGa7WUDbWsP/Tyyr9O1kSOaQ/zW2x7ccBK6yw5l4gUTx
 5YtAG0TfhEi3xFmFIoGSrLm2Vy5yiChxCvFJyCH7Nw9aIp5Ge9xx+oyBWrI6tY0+rIjm
 cSa1X9QCYVlvq4nnSoonzms+V+wCVCHMJSUPsY+in4SjrwuFoLA1/DhzB2dUz0twcKdR
 RvuQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:mime-version:references:in-reply-to:from:date
 :message-id:subject:to:cc;
 bh=MvooiGLvDttcl//L/B/qEVo9aBPnzYDm2YR5CZZI1xA=;
 b=AEv6nYpTfaV9FqmnjfEGy5l1nix0bpNwG5X9fnld13knaGiUcM3G6Jj0ql+CC43Q/i
 xXeaHGxDt2ga8Nurt1FvoYQYyFo71t0jeDwQPYOBjCuE/aZtfywb25DBJDReIUS2A2Cw
 c5LGXlPxe35l+5lRCCarsbKmw16QuiaDHJE9ClD3rYmWtPkwR1DKvjl+kkScw8xCxEL3
 Pj7nT7uU7KnDem3rM2RQ7WtCaB2sDTcigZSTdq73h+8qgVkjzc4e0bHyBhhSfS49MfQ3
 jRhdNZseW5cKwI6buoTFu/iYRssyYSMd+EUTT4ruI0F7d8jqZ4049ipo4twBJ/E2r6pz
 Pvfw==
X-Gm-Message-State: AN3rC/4k9xG6DtH7W9WHi/IBz066/NHndLhRDVfOBgQg/lWalSc5lSOPwri0wRloGFI1Hot3UjnPAk1alv47EQ==
X-Received: by 10.202.245.137 with SMTP id t131mr798786oih.149.1491744532658; 
 Sun, 09 Apr 2017 06:28:52 -0700 (PDT)
MIME-Version: 1.0
References: <e99b6366-7d30-a889-b7db-4a3b3133ff5e@gmail.com>
 <CABh_MKkbVVi+gTkaBVDvVfRggS6pbHKJE_VbYBZpAaTCZ81b7Q@mail.gmail.com>
 <c72c0ee3-328d-3efc-e8a0-4d6c0d5c8cee@gmail.com>
In-Reply-To: <c72c0ee3-328d-3efc-e8a0-4d6c0d5c8cee@gmail.com>
From: vasanth sabavat <vasanth.raonaik@gmail.com>
Date: Sun, 09 Apr 2017 13:28:41 +0000
Message-ID: <CAAuizBjVU4o9ofi8sAyg_kva-+ognyxomQU46aeb5Q23Htn-SA@mail.gmail.com>
Subject: Re: Understanding the FreeBSD locking mechanism
To: Ed Schouten <ed@nuxi.nl>, Yubin Ruan <ablacktshirt@gmail.com>
Cc: freebsd-hackers@freebsd.org
Content-Type: text/plain; charset=UTF-8
X-Content-Filtered-By: Mailman/MimeDel 2.1.23
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Apr 2017 13:28:53 -0000

On Sun, Apr 9, 2017 at 3:14 AM Yubin Ruan <ablacktshirt@gmail.com> wrote:

> On 2017/4/6 17:31, Ed Schouten wrote:
> > Hi Yubin,
> >
> > 2017-04-06 11:16 GMT+02:00 Yubin Ruan <ablacktshirt@gmail.com>:
> >> Does this function provides the ordinary "spinlock" functionality? There
> >> is no special "test-and-set" instruction, and neither any extra locking
> >> to protect internal data structure manipulation. Isn't this subjected to
> >> race condition?
> >
> > Locking a spinlock is done through macro mtx_lock_spin(), which
> > expands to __mtx_lock_spin() in sys/sys/mutex.h. That macro first
> > calls into the function you looked at, spinlock_enter(), to disable
> > interrupts. It then calls into the _mtx_obtain_lock_fetch() to do the
> > test-and-set operation you were looking for.
>
> Thanks for replying. I have read some of those codes.
>
> Just a few more questions, if you don't mind:
>
> (1) why are spinlocks forced to disable interrupt in FreeBSD?
>
>  From the book "The design and implementation of the FreeBSD Operating
> System", the authors say "spinning can result in deadlock if a thread
> interrupted the thread that held a mutex and then tried to acquire the
> mutex"...(section 4.3, Mutex Synchronization, paragraph 4)
>
> I don't get the point why a spinlock(or *spin mutex* in the FreeBSD
> world) has to disable interrupt. Being interrupted does not necessarily
> mean a deadlock. Assume that thread A holding a lock T gets interrupted
> by another thread B(context switch here) and thread B try to acquire
> the lock T. After finding out that lock T has already been acquired,
> thread B will just spin until it gets preempted, after which thread A
> gets waken up and run and release the lock T.


Assume a single CPU. If thread B spins, where will thread A get to run to
finish its critical section and release the lock? The one CPU you have
is held by thread B for spinning.

For spin locks on a single core, it does not make sense to spin. We just
disable interrupts: as we are currently the only ones running, we just need
to make sure no others will get to preempt us. That's why spin locks should
be held for a short duration.

When you have multiple cores, thread A can spin on cpu1 while thread B,
holding the lock on cpu2, can finish up and release it. We disable
interrupts only on cpu1 because we don't want thread A to be preempted.
The cost of preemption is very high compared to a short spin. Note: short
spin.

Look at adaptive spin locks.
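
A rough userspace model of the sequence Ed described (disable
interrupts first, then test-and-set) might look like the sketch below.
It is not the kernel code: the struct, the function names and the
per-thread counter standing in for "interrupts disabled" are all
invented for illustration, and C11 atomic_flag replaces the real atomic
primitives.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical spin mutex; the real struct mtx looks nothing like this. */
struct spin_mtx {
	atomic_flag locked;
};

/* Stand-in for the CPU's interrupt state: >0 means "disabled". */
static _Thread_local int intr_disable_depth;

static void
model_intr_disable(void)
{
	intr_disable_depth++;
}

static void
model_intr_restore(void)
{
	assert(intr_disable_depth > 0);
	intr_disable_depth--;
}

static void
spin_mtx_init(struct spin_mtx *m)
{
	atomic_flag_clear(&m->locked);
}

/* Block interrupts on this CPU first, then spin on the test-and-set. */
static void
spin_mtx_lock(struct spin_mtx *m)
{
	model_intr_disable();
	while (atomic_flag_test_and_set_explicit(&m->locked,
	    memory_order_acquire))
		;	/* another CPU holds the lock; keep spinning */
}

/* One attempt only; interrupts are restored again on failure. */
static bool
spin_mtx_trylock(struct spin_mtx *m)
{
	model_intr_disable();
	if (!atomic_flag_test_and_set_explicit(&m->locked,
	    memory_order_acquire))
		return (true);
	model_intr_restore();
	return (false);
}

/* Release the lock before allowing interrupts again. */
static void
spin_mtx_unlock(struct spin_mtx *m)
{
	atomic_flag_clear_explicit(&m->locked, memory_order_release);
	model_intr_restore();
}
```

The ordering is the point: the holder can never be interrupted on its
own CPU while the flag is set, so nothing running on that CPU can end
up spinning on a lock that the interrupted code still holds.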



> So, you see there is not
> necessarily any deadlock even if thread A get interrupted.
>
> I can only remember two conditions where using spinlock without
> disabling interrupts will cause deadlock:
>
> #######1, spinlock used in an interrupt handler
> If a thread A holding a spinlock T get interrupted and the interrupt
> handler responsible for this interrupt try to acquire T, then we have
> deadlock, because A would never have a chance to run before the
> interrupt handler return, and the interrupt handler, unfortunately,
> will continue to spin ... so in this situation, one has to disable
> interrupt before spinning.
>
> As far as I know, in Linux, they provide two kinds of spinlocks:
>
>    spin_lock(..);   /* spinlock that does not disable interrupts */
>    spin_lock_irqsave(...); /* spinlock that disable local interrupt */
>
>
> #######2, priority inversion problem
> If thread B with a higher priority get in and try to acquire the lock
> that thread A currently holds, then thread B would spin, while at the
> same time thread A has no chance to run because it has lower priority,
> thus not being able to release the lock.
> (I haven't investigate enough into the source code, so I don't know
> how FreeBSD and Linux handle this priority inversion problem. Maybe
> they use priority inheritance or random boosting?)
>
> thanks,
> Yubin Ruan
> _______________________________________________
> freebsd-hackers@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"
>
-- 
Thanks,
Vasanth

From owner-freebsd-hackers@freebsd.org  Sun Apr  9 14:40:12 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 7A0EDD3661E;
 Sun,  9 Apr 2017 14:40:12 +0000 (UTC)
 (envelope-from eric@metricspace.net)
Received: from mail.metricspace.net (mail.metricspace.net
 [IPv6:2001:470:1f11:617::107])
 by mx1.freebsd.org (Postfix) with ESMTP id 4DF8417D;
 Sun,  9 Apr 2017 14:40:12 +0000 (UTC)
 (envelope-from eric@metricspace.net)
Received: from [172.16.0.205] (unknown [172.16.0.205])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client did not present a certificate) (Authenticated sender: eric)
 by mail.metricspace.net (Postfix) with ESMTPSA id B1EF8186E;
 Sun,  9 Apr 2017 14:40:11 +0000 (UTC)
Subject: Re: Proposal for a design for signed kernel/modules/etc
To: "freebsd-hackers@freebsd.org" <freebsd-hackers@FreeBSD.org>,
 freebsd-security@freebsd.org
References: <6f6b47ed-84e0-e4c0-9df5-350620cff45b@metricspace.net>
 <20170408111144.GC14604@brick>
 <181f7b78-64c3-53a6-a143-721ef0cb5186@metricspace.net>
 <20170408115222.GA64207@brick>
From: Eric McCorkle <eric@metricspace.net>
Message-ID: <7611f7a3-3e50-65f2-4347-e37018ae1abc@metricspace.net>
Date: Sun, 9 Apr 2017 10:40:07 -0400
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:45.0) Gecko/20100101
 Thunderbird/45.8.0
MIME-Version: 1.0
In-Reply-To: <20170408115222.GA64207@brick>
Content-Type: multipart/signed; micalg=pgp-sha256;
 protocol="application/pgp-signature";
 boundary="mtArbQXOnqfKwxkx45JFK13QqcR6Jn47r"
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Apr 2017 14:40:12 -0000

This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
--mtArbQXOnqfKwxkx45JFK13QqcR6Jn47r
Content-Type: multipart/mixed; boundary="LRivelBAQdLMTNuKBGmlcXn2bavWRRppf";
 protected-headers="v1"
From: Eric McCorkle <eric@metricspace.net>
To: "freebsd-hackers@freebsd.org" <freebsd-hackers@FreeBSD.org>,
 freebsd-security@freebsd.org
Message-ID: <7611f7a3-3e50-65f2-4347-e37018ae1abc@metricspace.net>
Subject: Re: Proposal for a design for signed kernel/modules/etc
References: <6f6b47ed-84e0-e4c0-9df5-350620cff45b@metricspace.net>
 <20170408111144.GC14604@brick>
 <181f7b78-64c3-53a6-a143-721ef0cb5186@metricspace.net>
 <20170408115222.GA64207@brick>
In-Reply-To: <20170408115222.GA64207@brick>

--LRivelBAQdLMTNuKBGmlcXn2bavWRRppf
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable

On 04/08/2017 07:52, Edward Tomasz Napierała wrote:
> On 0408T0803, Eric McCorkle wrote:
>> On 04/08/2017 07:11, Edward Tomasz Napierała wrote:
>>> On 0327T1354, Eric McCorkle wrote:
>>>> Hello everyone,
>>>>
>>>> The following is a design proposal for signed kernel and kernel module
>>>> loading, both at boot- and runtime (with the possibility open for signed
>>>> executables and libraries if someone wanted to go that route).  I'm
>>>> interested in feedback on the idea before I start actually writing code
>>>> for it.
>>>
>>> I see two potential problems with this.
>>>
>>> First, our current loader(8) depends heavily on Forth code.  By making
>>> it load modified 4th files, you can do absolutely anything you want;
>>> AFAIK they have unrestricted access to hardware.  So you should preferably
>>> be able to sign them as well.  You _might_ (not sure on this one) also
>>> want to be able to restrict access to some of the loader configuration
>>> variables.
>>
>> Loader is handled by the UEFI secure boot framework, though the concerns
>> about the 4th code are still valid.  In a secure system, you'd want to
>> do something about that, but the concerns are different enough (and it's
>> isolated enough) that it could be done separately.
> 
> Unless the way to address those ends up being a signature mechanism
> that doesn't depend on the format of the files being signed.

I explored the idea of wrapped or detached signatures in the previous
discussion.  Envelopes or detached signatures could make sense for the
4th files.  It's a small, obscure set of code that probably isn't
changed very often.

Envelopes or detached signatures for kernel modules and especially
signed executables and libraries both have extensive, far-reaching
consequences for system administration, packaging, tooling, the ports
collection, and so on, whereas signing the executable with an additional
section has no such consequences.

Config files (and the 4th files really are more like config files) have
a different set of constraints, and detached signatures are probably the
way to go there.  So loader should probably support detached PKCS#7
signature checks.


--LRivelBAQdLMTNuKBGmlcXn2bavWRRppf--

--mtArbQXOnqfKwxkx45JFK13QqcR6Jn47r
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----

iHUEARYIAB0WIQRELMWN3SgpoYkrmidWwohAqoAEjQUCWOpHyAAKCRBWwohAqoAE
jT0zAQCjaQTkFbS5xkr4eixhwOysahTZRg1iKojdfj/NpbIwyQEAj8MuUJvPSi12
xIqgCFSa47WyfCEAoAMOcjMqwdSEpgs=
=i63w
-----END PGP SIGNATURE-----

--mtArbQXOnqfKwxkx45JFK13QqcR6Jn47r--

From owner-freebsd-hackers@freebsd.org  Sun Apr  9 15:52:45 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 6378AD3642C;
 Sun,  9 Apr 2017 15:52:45 +0000 (UTC)
 (envelope-from etnapierala@gmail.com)
Received: from mail-wm0-x22a.google.com (mail-wm0-x22a.google.com
 [IPv6:2a00:1450:400c:c09::22a])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id E00DC370;
 Sun,  9 Apr 2017 15:52:44 +0000 (UTC)
 (envelope-from etnapierala@gmail.com)
Received: by mail-wm0-x22a.google.com with SMTP id t189so21549268wmt.1;
 Sun, 09 Apr 2017 08:52:44 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=sender:date:from:to:cc:subject:message-id:mail-followup-to
 :references:mime-version:content-disposition
 :content-transfer-encoding:in-reply-to:user-agent;
 bh=cSWNPRDYzpXGgut0TcmsMKgHX9IWDb8BS4PWSe4Bwrw=;
 b=d+XCugcs4eR/vvrYVnu2s8AMj4oCtvs8SlCR/PjpqarV7L9pGpe8ISWF3NxMME7Xar
 E/xg/MGw9d1P9O8Qm7DRZ4oR5uV6rXmYwbF8+oMrIofSBaIB/s9skvcSKZiQwVVMs4/7
 bTx/EXcc2akVr64wmdJIj3vDnDmk6Dm/iPCYqsrQxjbGIbK0fpsRw2uDWZOrcyUCYdCi
 caefe7TtRMXx8bjDg2d05/PO04X/beUpMhi+Cf6gSHB/It/m0mc/y7P3s0RM6RNxb2q9
 SebnWXt2osiNi9Coa83Q3NZXt9fH5YSc77MrTc0NZL1ejYFVQSlmmwtkw1ozJQ/iIp2y
 DrBA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:sender:date:from:to:cc:subject:message-id
 :mail-followup-to:references:mime-version:content-disposition
 :content-transfer-encoding:in-reply-to:user-agent;
 bh=cSWNPRDYzpXGgut0TcmsMKgHX9IWDb8BS4PWSe4Bwrw=;
 b=f96PP7XQuGTIM6Sb+omP1tmo/ytWysmcJ+pEiElt7HiJDcNvkPNf2+CDVotrPl/V35
 rmtytIqBHxn+OxUNxNGDcL1Q/C2S87Hlv/5rUolLZh1LDWmolcKzX37qgftIj7yK1GAd
 iEJXe2GOnuW79DbaadBqpWyBNmAFTANYecjc1FiKRrivf8q8Xcj/WL5keDSVHXptwmjV
 ttTtZbyIk2tSJjFzdTcdyEpn8gKkoEW+iGRPjjcBhONQE7RMYr+9hK2fMe/ZIsV1AtJQ
 3HDEAtVS7KP4a9ypFzplB+mu0qt6agiA7H7aGkFLsAxJ/A1soPtH+qlsB2WkVktpdkRX
 JdVg==
X-Gm-Message-State: AN3rC/49buf3i1xQMGn6u86ZX/cq55QDm863joY2x9oMXvZgbM7ikqgK
 SS95vJU8tHE7p/gk
X-Received: by 10.28.90.2 with SMTP id o2mr6544309wmb.53.1491753162403;
 Sun, 09 Apr 2017 08:52:42 -0700 (PDT)
Received: from brick (cpc92310-cmbg19-2-0-cust934.5-4.cable.virginm.net.
 [82.9.227.167])
 by smtp.gmail.com with ESMTPSA id v186sm6809403wmv.2.2017.04.09.08.52.41
 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Sun, 09 Apr 2017 08:52:41 -0700 (PDT)
Sender: =?UTF-8?Q?Edward_Tomasz_Napiera=C5=82a?= <etnapierala@gmail.com>
Date: Sun, 9 Apr 2017 16:52:40 +0100
From: Edward Tomasz =?utf-8?Q?Napiera=C5=82a?= <trasz@FreeBSD.org>
To: Eric McCorkle <eric@metricspace.net>
Cc: "freebsd-hackers@freebsd.org" <freebsd-hackers@FreeBSD.org>,
 freebsd-security@freebsd.org
Subject: Re: Proposal for a design for signed kernel/modules/etc
Message-ID: <20170409155240.GA18363@brick>
Mail-Followup-To: Eric McCorkle <eric@metricspace.net>,
 "freebsd-hackers@freebsd.org" <freebsd-hackers@FreeBSD.org>,
 freebsd-security@freebsd.org
References: <6f6b47ed-84e0-e4c0-9df5-350620cff45b@metricspace.net>
 <20170408111144.GC14604@brick>
 <181f7b78-64c3-53a6-a143-721ef0cb5186@metricspace.net>
 <20170408115222.GA64207@brick>
 <7611f7a3-3e50-65f2-4347-e37018ae1abc@metricspace.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <7611f7a3-3e50-65f2-4347-e37018ae1abc@metricspace.net>
User-Agent: Mutt/1.8.0 (2017-02-23)
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Apr 2017 15:52:45 -0000

On 0409T1040, Eric McCorkle wrote:
> On 04/08/2017 07:52, Edward Tomasz Napierała wrote:
> > On 0408T0803, Eric McCorkle wrote:
> >> On 04/08/2017 07:11, Edward Tomasz Napierała wrote:
> >>> On 0327T1354, Eric McCorkle wrote:
> >>>> Hello everyone,
> >>>>
> >>>> The following is a design proposal for signed kernel and kernel module
> >>>> loading, both at boot- and runtime (with the possibility open for signed
> >>>> executables and libraries if someone wanted to go that route).  I'm
> >>>> interested in feedback on the idea before I start actually writing code
> >>>> for it.
> >>>
> >>> I see two potential problems with this.
> >>>
> >>> First, our current loader(8) depends heavily on Forth code.  By making
> >>> it load modified 4th files, you can do absolutely anything you want;
> >>> AFAIK they have unrestricted access to hardware.  So you should preferably
> >>> be able to sign them as well.  You _might_ (not sure on this one) also
> >>> want to be able to restrict access to some of the loader configuration
> >>> variables.
> >>
> >> Loader is handled by the UEFI secure boot framework, though the concerns
> >> about the 4th code are still valid.  In a secure system, you'd want to
> >> do something about that, but the concerns are different enough (and it's
> >> isolated enough) that it could be done separately.
> > 
> > Unless the way to address those ends up being a signature mechanism
> > that doesn't depend on the format of the files being signed.
> 
> I explored the idea of wrapped or detached signatures in the previous
> discussion.  Envelopes or detached signatures could make sense for the
> 4th files.  It's a small, obscure set of code that probably isn't
> changed very often.
> 
> Envelopes or detached signatures for kernel modules and especially
> signed executables and libraries both have extensive, far-reaching
> consequences for system administration, packaging, tooling, the ports
> collection, and so on, whereas signing the executable with an additional
> section has no such consequences.
> 
> Config files (and the 4th files really are more like config files) have
> a different set of constraints, and detached signatures are probably the
> way to go there.  So loader should probably support detached PKCS#7
> signature checks.

The third way that might be worth considering would be to just append
the signature.  This would work for both 4th (if you prepend it with
whatever is the 4th comment character) and ELF, without the need for
changing or extending either format.
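
As a sketch of how a reader of such a file could separate the payload
from an appended signature: the trailer layout below (a 4-byte
big-endian signature length followed by an 8-byte marker) and every
name in it are invented for illustration, nothing in this thread
defines them.  The text variant for 4th files, with a comment prefix,
is not covered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical trailer: [signature][4-byte BE length]["~SIGEND~"]. */
#define SIG_MAGIC	"~SIGEND~"
#define SIG_MAGIC_LEN	8
#define SIG_TRAILER_LEN	(4 + SIG_MAGIC_LEN)

/*
 * Split a file image into payload and appended signature.
 * Returns the payload length.  If no well-formed trailer is present the
 * whole buffer is payload and *sig is set to NULL.
 */
static size_t
split_appended_sig(const unsigned char *buf, size_t len,
    const unsigned char **sig, size_t *siglen)
{
	const unsigned char *t;
	size_t n;

	assert(buf != NULL && sig != NULL && siglen != NULL);
	*sig = NULL;
	*siglen = 0;
	if (len < SIG_TRAILER_LEN)
		return (len);
	t = buf + len - SIG_TRAILER_LEN;
	if (memcmp(t + 4, SIG_MAGIC, SIG_MAGIC_LEN) != 0)
		return (len);
	n = ((size_t)t[0] << 24) | ((size_t)t[1] << 16) |
	    ((size_t)t[2] << 8) | (size_t)t[3];
	if (n > len - SIG_TRAILER_LEN)
		return (len);	/* claimed length does not fit */
	*sig = t - n;
	*siglen = n;
	return (len - SIG_TRAILER_LEN - n);
}
```

A consumer that does not know about the trailer simply sees extra bytes
past the end of what it parses, which is why neither format would need
to change.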



From owner-freebsd-hackers@freebsd.org  Sun Apr  9 15:56:36 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id D99EBD365C9
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Sun,  9 Apr 2017 15:56:36 +0000 (UTC)
 (envelope-from ablacktshirt@gmail.com)
Received: from mail-pg0-x241.google.com (mail-pg0-x241.google.com
 [IPv6:2607:f8b0:400e:c05::241])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id A7AF27A1
 for <freebsd-hackers@freebsd.org>; Sun,  9 Apr 2017 15:56:36 +0000 (UTC)
 (envelope-from ablacktshirt@gmail.com)
Received: by mail-pg0-x241.google.com with SMTP id 81so22803373pgh.3
 for <freebsd-hackers@freebsd.org>; Sun, 09 Apr 2017 08:56:36 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=subject:to:references:cc:from:message-id:date:user-agent
 :mime-version:in-reply-to:content-transfer-encoding;
 bh=8yKtTJGqBRM5lN7JcFOLcgahMmg9BBFrov2P3XgjQiM=;
 b=TtrSR+cnwoyzQyyR0+WPMMc/0Szy+JWVz0nm3nrV9xEvswRQ5ugJtK/zKNLFwhQRSz
 3XHJPJB5JL8L15JmNYl7Q109N1wWSygtY0ZPom5/jW1iujSpQPUCsyLp+sqOCv2rjO7G
 6ZTOdzCaok6j32t0JCVWVkUD7A/8XBsAcLg/XbXm3hmeEmOvHZwiPZ6icmkrvLjuwQDq
 2CI96FOfmmR08G7L0VZm08GF3zCrHoH3QXXxHC9u3O8l+oARco7je0GgsRwOJ7Gh5Hcf
 t0/jOhACsKghzMYWcszXx/lwkF132cj2ffWXST+w0CKJeqlQEFsfUfzmIJzWjqKevByn
 aXgA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:subject:to:references:cc:from:message-id:date
 :user-agent:mime-version:in-reply-to:content-transfer-encoding;
 bh=8yKtTJGqBRM5lN7JcFOLcgahMmg9BBFrov2P3XgjQiM=;
 b=VFbpv93sHljt7zpwBZQuHeauKFKKbQBBeRYa2iYNU6nGPLNEXlRlc9lIwyVZRX9K4v
 XbRymTceWVo82lBv9Id/pvEfnM2G6QYxI1xxIJDFYN0aERO5BH44+fN56h5wbeLqxNqR
 vJoisICvGGGFq2QhQfbbq/bYJHdTJ5D2HvYzgsNefOBbUEhZWkH2jWk96fbG9F2jw/X4
 70u6oTEuGKf+O9dCzcBg0ax70YP/uMQDsbs7q0YX1qw1K4vNjkirYw8IznkcZOYnx3Ke
 I5Ye9m0ldZhG9wOfy90AMoYPJIbegPZVShP66oDQQilcu/T1vjOc+fjcHPIhcQIgQ05h
 giUw==
X-Gm-Message-State: AFeK/H3nwWI5SDsFFlKoObg+DhST2Sfnl5Qim3NAiK0O3eIwMEEm1IQdYDNpirJVf7PiVg==
X-Received: by 10.84.231.193 with SMTP id g1mr48577104pln.84.1491753396150;
 Sun, 09 Apr 2017 08:56:36 -0700 (PDT)
Received: from [192.168.0.100] ([110.64.91.54])
 by smtp.gmail.com with ESMTPSA id f1sm19719389pfc.105.2017.04.09.08.56.32
 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Sun, 09 Apr 2017 08:56:34 -0700 (PDT)
Subject: Re: Understanding the FreeBSD locking mechanism
To: vasanth sabavat <vasanth.raonaik@gmail.com>
References: <e99b6366-7d30-a889-b7db-4a3b3133ff5e@gmail.com>
 <CABh_MKkbVVi+gTkaBVDvVfRggS6pbHKJE_VbYBZpAaTCZ81b7Q@mail.gmail.com>
 <c72c0ee3-328d-3efc-e8a0-4d6c0d5c8cee@gmail.com>
 <CAAuizBjVU4o9ofi8sAyg_kva-+ognyxomQU46aeb5Q23Htn-SA@mail.gmail.com>
Cc: Ed Schouten <ed@nuxi.nl>, freebsd-hackers@freebsd.org
From: Yubin Ruan <ablacktshirt@gmail.com>
Message-ID: <d26c0a68-579c-e39c-779c-b11689e745a6@gmail.com>
Date: Sun, 9 Apr 2017 23:56:29 +0800
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:45.0) Gecko/20100101
 Thunderbird/45.7.0
MIME-Version: 1.0
In-Reply-To: <CAAuizBjVU4o9ofi8sAyg_kva-+ognyxomQU46aeb5Q23Htn-SA@mail.gmail.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Apr 2017 15:56:36 -0000

On 2017/4/9 21:28, vasanth sabavat wrote:
>
> On Sun, Apr 9, 2017 at 3:14 AM Yubin Ruan <ablacktshirt@gmail.com
> <mailto:ablacktshirt@gmail.com>> wrote:
>
>     On 2017/4/6 17:31, Ed Schouten wrote:
>     > Hi Yubin,
>     >
>     > 2017-04-06 11:16 GMT+02:00 Yubin Ruan <ablacktshirt@gmail.com
>     <mailto:ablacktshirt@gmail.com>>:
>     >> Does this function provides the ordinary "spinlock"
>     functionality? There
>     >> is no special "test-and-set" instruction, and neither any extra
>     locking
>     >> to protect internal data structure manipulation. Isn't this
>     subjected to
>     >> race condition?
>     >
>     > Locking a spinlock is done through macro mtx_lock_spin(), which
>     > expands to __mtx_lock_spin() in sys/sys/mutex.h. That macro first
>     > calls into the function you looked at, spinlock_enter(), to disable
>     > interrupts. It then calls into the _mtx_obtain_lock_fetch() to do the
>     > test-and-set operation you were looking for.
>
>     Thanks for replying. I have read some of those codes.
>
>     Just a few more questions, if you don't mind:
>
>     (1) why are spinlocks forced to disable interrupt in FreeBSD?
>
>      From the book "The design and implementation of the FreeBSD Operating
>     System", the authors say "spinning can result in deadlock if a thread
>     interrupted the thread that held a mutex and then tried to acquire the
>     mutex"...(section 4.3, Mutex Synchronization, paragraph 4)
>
>     I don't get the point why a spinlock(or *spin mutex* in the FreeBSD
>     world) has to disable interrupt. Being interrupted does not necessarily
>     mean a deadlock. Assume that thread A holding a lock T gets interrupted
>     by another thread B(context switch here) and thread B try to acquire
>     the lock T. After finding out that lock T has already been acquired,
>     thread B will just spin until it gets preempted, after which thread A
>     gets waken up and run and release the lock T.
>
>
> Assume single CPU, If thread B spins where will thread A get to run and
> finish up its critical section and release the lock? The one CPU you
> have is held by thread b for spinning.
>
> For spin locks on single core, it does not make sense to spin. We just
> disable interrupts as we are currently the only ones running we just
> need to make sure no others will get to preempt us. That's why spin
> locks should be held for short duration.
>
> When you have multiple cores,  ThreadA can spin on cpu1, while thread B
> holding the lock on cpu2 can finish up and release it. We disable
> interrupts only on cpu1 so we don't want to preempt threadA. The cost of
> preemption is very high compared to short spin. Note: short spin.
>
> Look at adaptive spin locks.

Can't the scheduler preempt thread B and let thread A run? After all,
we did not disable interrupts.

regards,
Yubin Ruan

>
>     So, you see there is not
>     necessarily any deadlock even if thread A get interrupted.
>
>     I can only remember two conditions where using spinlock without
>     disabling interrupts will cause deadlock:
>
>     #######1, spinlock used in an interrupt handler
>     If a thread A holding a spinlock T get interrupted and the interrupt
>     handler responsible for this interrupt try to acquire T, then we have
>     deadlock, because A would never have a chance to run before the
>     interrupt handler return, and the interrupt handler, unfortunately,
>     will continue to spin ... so in this situation, one has to disable
>     interrupt before spinning.
>
>     As far as I know, in Linux, they provide two kinds of spinlocks:
>
>        spin_lock(..);   /* spinlock that does not disable interrupts */
>        spin_lock_irqsave(...); /* spinlock that disable local interrupt */
>
>
>     #######2, priority inversion problem
>     If thread B with a higher priority get in and try to acquire the lock
>     that thread A currently holds, then thread B would spin, while at the
>     same time thread A has no chance to run because it has lower priority,
>     thus not being able to release the lock.
>     (I haven't investigate enough into the source code, so I don't know
>     how FreeBSD and Linux handle this priority inversion problem. Maybe
>     they use priority inheritance or random boosting?)
>
>     thanks,
>     Yubin Ruan
>     _______________________________________________
>     freebsd-hackers@freebsd.org <mailto:freebsd-hackers@freebsd.org>
>     mailing list
>     https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
>     To unsubscribe, send any mail to
>     "freebsd-hackers-unsubscribe@freebsd.org
>     <mailto:freebsd-hackers-unsubscribe@freebsd.org>"
>
> --
> Thanks,
> Vasanth


From owner-freebsd-hackers@freebsd.org  Sun Apr  9 16:01:46 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id D02ACD3688F;
 Sun,  9 Apr 2017 16:01:46 +0000 (UTC)
 (envelope-from eric@metricspace.net)
Received: from mail.metricspace.net (mail.metricspace.net
 [IPv6:2001:470:1f11:617::107])
 by mx1.freebsd.org (Postfix) with ESMTP id 9A58CCD2;
 Sun,  9 Apr 2017 16:01:46 +0000 (UTC)
 (envelope-from eric@metricspace.net)
Received: from [172.16.0.205] (unknown [172.16.0.205])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client did not present a certificate) (Authenticated sender: eric)
 by mail.metricspace.net (Postfix) with ESMTPSA id 1E01A189B;
 Sun,  9 Apr 2017 16:01:46 +0000 (UTC)
Subject: Re: Proposal for a design for signed kernel/modules/etc
To: "freebsd-hackers@freebsd.org" <freebsd-hackers@FreeBSD.org>,
 freebsd-security@freebsd.org
References: <6f6b47ed-84e0-e4c0-9df5-350620cff45b@metricspace.net>
 <20170408111144.GC14604@brick>
 <181f7b78-64c3-53a6-a143-721ef0cb5186@metricspace.net>
 <20170408115222.GA64207@brick>
 <7611f7a3-3e50-65f2-4347-e37018ae1abc@metricspace.net>
 <20170409155240.GA18363@brick>
From: Eric McCorkle <eric@metricspace.net>
Message-ID: <8a60d967-eb7f-b529-df03-c0bfccbe9747@metricspace.net>
Date: Sun, 9 Apr 2017 12:01:42 -0400
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:45.0) Gecko/20100101
 Thunderbird/45.8.0
MIME-Version: 1.0
In-Reply-To: <20170409155240.GA18363@brick>
Content-Type: multipart/signed; micalg=pgp-sha256;
 protocol="application/pgp-signature";
 boundary="O4gwvhcQpw7ukKHrkL2c3DUgq5CLSkPG4"
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Apr 2017 16:01:46 -0000

This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
--O4gwvhcQpw7ukKHrkL2c3DUgq5CLSkPG4
Content-Type: multipart/mixed; boundary="P2CjpF9BIF78trnltNW12bP2AkfCfbv8U";
 protected-headers="v1"
From: Eric McCorkle <eric@metricspace.net>
To: "freebsd-hackers@freebsd.org" <freebsd-hackers@FreeBSD.org>,
 freebsd-security@freebsd.org
Message-ID: <8a60d967-eb7f-b529-df03-c0bfccbe9747@metricspace.net>
Subject: Re: Proposal for a design for signed kernel/modules/etc
References: <6f6b47ed-84e0-e4c0-9df5-350620cff45b@metricspace.net>
 <20170408111144.GC14604@brick>
 <181f7b78-64c3-53a6-a143-721ef0cb5186@metricspace.net>
 <20170408115222.GA64207@brick>
 <7611f7a3-3e50-65f2-4347-e37018ae1abc@metricspace.net>
 <20170409155240.GA18363@brick>
In-Reply-To: <20170409155240.GA18363@brick>

--P2CjpF9BIF78trnltNW12bP2AkfCfbv8U
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable

On 04/09/2017 11:52, Edward Tomasz Napiera=C5=82a wrote:
> On 0409T1040, Eric McCorkle wrote:
>> On 04/08/2017 07:52, Edward Tomasz Napiera=C5=82a wrote:
>>> On 0408T0803, Eric McCorkle wrote:
>>>> On 04/08/2017 07:11, Edward Tomasz Napiera=C5=82a wrote:
>>>>> On 0327T1354, Eric McCorkle wrote:
>>>>>> Hello everyone,
>>>>>>
>>>>>> The following is a design proposal for signed kernel and kernel mo=
dule
>>>>>> loading, both at boot- and runtime (with the possibility open for =
signed
>>>>>> executables and libraries if someone wanted to go that route).  I'=
m
>>>>>> interested in feedback on the idea before I start actually writing=
 code
>>>>>> for it.
>>>>>
>>>>> I see two potential problems with this.
>>>>>
>>>>> First, our current loader(8) depends heavily on Forth code.  By mak=
ing
>>>>> it load modified 4th files, you can do absolutely anything you want=
;
>>>>> AFAIK they have unrestricted access to hardware.  So you should pre=
ferably
>>>>> be able to sign them as well.  You _might_ (not sure on this one) a=
lso
>>>>> want to be able to restrict access to some of the loader configurat=
ion
>>>>> variables.
>>>>
>>>> Loader is handled by the UEFI secure boot framework, though the conc=
erns
>>>> about the 4th code are still valid.  In a secure system, you'd want =
to
>>>> do something about that, but the concerns are different enough (and =
it's
>>>> isolated enough) that it could be done separately.
>>>
>>> Unless the way to address those ends up being a signature mechanism
>>> that doesn't depend on the format of the files being signed.
>>
>> I explored the idea of wrapped or detached signatures in the previous
>> discussion.  Envelopes or detached signatures could make sense for the=

>> 4th files.  It's a small, obscure set of code that probably isn't
>> changed very often.
>>
>> Envelopes or detached signatures for kernel modules and especially
>> signed executables and libraries both have extensive, far-reaching
>> consequences for system administration, packaging, tooling, the ports
>> collection, and so on, whereas signing the executable with an addition=
al
>> section has no such consequences.
>>
>> Config files (and the 4th files really are more like config files) hav=
e
>> a different set of constraints, and detached signatures are probably t=
he
>> way to go there.  So loader should probably support detached PKCS#7
>> signature checks.
>=20
> The third way that might be worth considering would be to just append
> the signature.  This would work for both 4th (if you prepend it with
> whatever is the 4th comment character) and ELF, without the need for
> changing or extending either format.

No, that won't work at all.  That's going to break the tooling for ELF
files as well as applications that use them, and it won't work for any
configuration file aside from loader.4th.  It wouldn't even work for
boot.conf, for example.

More generally, that's basing an entire standard off a dead language
that's used in only one place, and in a way that precludes compatibility
with any file format that uses a different comment character.  It also
mandates some kind of ASCII encoding scheme to avoid newlines.

If I was going to adopt a solution that broke existing tooling, I'd at
least go with a proper envelope scheme.


--P2CjpF9BIF78trnltNW12bP2AkfCfbv8U--

--O4gwvhcQpw7ukKHrkL2c3DUgq5CLSkPG4
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----

iHUEARYIAB0WIQRELMWN3SgpoYkrmidWwohAqoAEjQUCWOpa5gAKCRBWwohAqoAE
jWZbAP4iE8lRz6j0hwlRq4UEs8FRld4Okk4KzkmhwOJ4Wm8Z7QD+KTupXfPRXknm
6S8BLi6wyH1kgDDmwp8CGw/iQTv66Q8=
=EaXS
-----END PGP SIGNATURE-----

--O4gwvhcQpw7ukKHrkL2c3DUgq5CLSkPG4--

From owner-freebsd-hackers@freebsd.org  Sun Apr  9 16:24:36 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id BA036D36EE5
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Sun,  9 Apr 2017 16:24:36 +0000 (UTC)
 (envelope-from rysto32@gmail.com)
Received: from mail-io0-x232.google.com (mail-io0-x232.google.com
 [IPv6:2607:f8b0:4001:c06::232])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 8278CA12
 for <freebsd-hackers@freebsd.org>; Sun,  9 Apr 2017 16:24:36 +0000 (UTC)
 (envelope-from rysto32@gmail.com)
Received: by mail-io0-x232.google.com with SMTP id a103so6831390ioj.1
 for <freebsd-hackers@freebsd.org>; Sun, 09 Apr 2017 09:24:36 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=mime-version:in-reply-to:references:from:date:message-id:subject:to
 :cc; bh=PGcYTtZ5TIDJ77DbPiJQ2wzZLs4147SEwUmNQEtEpzM=;
 b=ib1mlHiAngJi9ECbeWcxseHQhhgETA+HW0k7ppwTmoJ5scvlrgi7MxHwOpZWpoTtKV
 xoKSKQ9zBipwB3tibVxJVqW1AMiOtCU10kM5yys6gUTLFcQMbfyGUL1eIeDR5pKUSEOx
 FAc8ctSahadGIkwH7n3cKemmlc/WwR5TRNd15M4Q6VZtxqH5Tgi0JpWv+knmXEM5aLii
 caL8bbRSCB2bxxnEtnfwVRn8DJuxhpQk8MtOKuAbThrNrNbZgKvfznPqgT/MFx6NHoWJ
 uOzFdRKqB5srWA38wBxdUpVv+WNoQhOw/GV78MMT0uED8ASxQR7ra0uHRq/6rmujlmQO
 biFQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:mime-version:in-reply-to:references:from:date
 :message-id:subject:to:cc;
 bh=PGcYTtZ5TIDJ77DbPiJQ2wzZLs4147SEwUmNQEtEpzM=;
 b=APPWrkHtkNfTsAekwckFbr+VrB/3w8Azv+ynqTAl6MtlPhqZe8k5e1e4CewifLgHlj
 eQP+xKIPfhefdgzH9JwTM0CY/rr4Yi++ELS2leeQydbf3YD+KSe763W+kE7744KmNfb6
 Z1ofzRMcKSU9sedZaLwhvkNtpwaH0qaQAzZp8SH5ZD1G+OwzkK1vhaCFMtAMvGm74ZVR
 Hw/k7ii0sbHkPdELpIX4YJgt9xHPO0C8cIksYZ6Lsdy3K5vTOq2CDa+0tA+dFHo5IW85
 5TnZVMOQfj+8L7VpUyC3iArPqwfWXPO5q0QCb6b2jyY5/gQWZXg+La95YyWPO/T2Lo8A
 C11Q==
X-Gm-Message-State: AN3rC/7dwtgGfntc+Tq39E3vxQ21UF1KC6cHtx4ZXvEFDQt2Me7SaxxzZZ+S/CH7+jr2mK3OsGm1JDjwuozeuQ==
X-Received: by 10.107.11.159 with SMTP id 31mr2397501iol.41.1491755075934;
 Sun, 09 Apr 2017 09:24:35 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.107.19.33 with HTTP; Sun, 9 Apr 2017 09:24:35 -0700 (PDT)
In-Reply-To: <c72c0ee3-328d-3efc-e8a0-4d6c0d5c8cee@gmail.com>
References: <e99b6366-7d30-a889-b7db-4a3b3133ff5e@gmail.com>
 <CABh_MKkbVVi+gTkaBVDvVfRggS6pbHKJE_VbYBZpAaTCZ81b7Q@mail.gmail.com>
 <c72c0ee3-328d-3efc-e8a0-4d6c0d5c8cee@gmail.com>
From: Ryan Stone <rysto32@gmail.com>
Date: Sun, 9 Apr 2017 12:24:35 -0400
Message-ID: <CAFMmRNwWnaq-4vEDCByqdUzWfoiZeN0nM_M5rt8ST0P8xnUTsA@mail.gmail.com>
Subject: Re: Understanding the FreeBSD locking mechanism
To: Yubin Ruan <ablacktshirt@gmail.com>
Cc: Ed Schouten <ed@nuxi.nl>,
 "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>
Content-Type: text/plain; charset=UTF-8
X-Content-Filtered-By: Mailman/MimeDel 2.1.23
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Apr 2017 16:24:36 -0000

On Sun, Apr 9, 2017 at 6:13 AM, Yubin Ruan <ablacktshirt@gmail.com> wrote:

>
> #######1, spinlock used in an interrupt handler
> If a thread A holding a spinlock T get interrupted and the interrupt
> handler responsible for this interrupt try to acquire T, then we have
> deadlock, because A would never have a chance to run before the
> interrupt handler return, and the interrupt handler, unfortunately,
> will continue to spin ... so in this situation, one has to disable
> interrupt before spinning.
>
> As far as I know, in Linux, they provide two kinds of spinlocks:
>
>   spin_lock(..);   /* spinlock that does not disable interrupts */
>   spin_lock_irqsave(...); /* spinlock that disable local interrupt */


In the FreeBSD locking style, a spinlock is only used in the case where
one needs to synchronize with an interrupt handler.  This is why
spinlocks always disable local interrupts in FreeBSD.

FreeBSD's lock for the first case is the MTX_DEF mutex, which is an
adaptively spinning, blocking mutex implementation.  In short, the
MTX_DEF mutex will spin waiting for the lock if the owner is running,
but will block if the owner is descheduled.  This prevents expensive
trips through the scheduler for the common case where the mutex is only
held for short periods, without wasting CPU cycles spinning in cases
where the owner thread is descheduled and therefore will not be
completing soon.
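
To make the spin-or-block decision concrete, here is a toy userspace
model of it.  This is only a sketch of the policy described above, not
kernel code: struct toy_mtx, its fields, and toy_mtx_next_action() are
made-up names for illustration, and the real mutex code tracks the owner
and its run state quite differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not FreeBSD kernel API. */
enum wait_action { WAIT_ACQUIRE, WAIT_SPIN, WAIT_BLOCK };

struct toy_mtx {
	bool locked;		/* is the mutex currently held? */
	bool owner_running;	/* is the holder currently on a CPU? */
};

/* Decide what a thread that wants the mutex should do next. */
static enum wait_action
toy_mtx_next_action(const struct toy_mtx *m)
{
	assert(m != NULL);
	if (!m->locked)
		return (WAIT_ACQUIRE);	/* free: just take it */
	if (m->owner_running)
		return (WAIT_SPIN);	/* holder is running, likely done soon */
	return (WAIT_BLOCK);		/* holder is descheduled: go to sleep */
}
```

A waiter would re-evaluate this each time around its loop, so it can
switch from spinning to blocking if the holder gets descheduled while it
waits.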

> #######2, priority inversion problem
> If thread B with a higher priority get in and try to acquire the lock
> that thread A currently holds, then thread B would spin, while at the
> same time thread A has no chance to run because it has lower priority,
> thus not being able to release the lock.
> (I haven't investigate enough into the source code, so I don't know
> how FreeBSD and Linux handle this priority inversion problem. Maybe
> they use priority inheritance or random boosting?)
>

FreeBSD's spin locks prevent priority inversion by preventing the holder
thread from being descheduled.

MTX_DEF locks implement priority inheritance.
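
As a rough illustration of what priority inheritance means, here is a
toy userspace model.  It is a sketch under simplifying assumptions, not
the kernel's implementation: all names are hypothetical, a lower number
means a more urgent priority in this model, and it only handles a single
lock (a real implementation has to cope with a thread holding several
contended locks and with chains of blocked threads).

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not the kernel's data structures. */
struct toy_thread {
	int base_prio;	/* priority the thread normally runs at */
	int prio;	/* effective priority; lower number = more urgent */
};

struct toy_pi_mtx {
	struct toy_thread *owner;	/* NULL when unlocked */
};

/* Returns 1 if the lock was taken, 0 if the caller would have to wait. */
static int
toy_pi_lock(struct toy_pi_mtx *m, struct toy_thread *td)
{
	assert(m != NULL && td != NULL);
	if (m->owner == NULL) {
		m->owner = td;
		return (1);
	}
	/* Contended: lend our priority to the holder if ours is more urgent. */
	if (td->prio < m->owner->prio)
		m->owner->prio = td->prio;
	return (0);
}

/* Release the lock and drop any priority borrowed while holding it. */
static void
toy_pi_unlock(struct toy_pi_mtx *m)
{
	assert(m != NULL && m->owner != NULL);
	m->owner->prio = m->owner->base_prio;
	m->owner = NULL;
}
```

With thread A at priority 200 holding the lock and thread B at priority
100 contending for it, A runs at 100 until it unlocks, so a thread at,
say, 150 can no longer keep A off the CPU while B waits.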

From owner-freebsd-hackers@freebsd.org  Sun Apr  9 16:48:51 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id AA113D364CF
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Sun,  9 Apr 2017 16:48:51 +0000 (UTC)
 (envelope-from vasanth.raonaik@gmail.com)
Received: from mail-oi0-x22e.google.com (mail-oi0-x22e.google.com
 [IPv6:2607:f8b0:4003:c06::22e])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 6507E3B0
 for <freebsd-hackers@freebsd.org>; Sun,  9 Apr 2017 16:48:51 +0000 (UTC)
 (envelope-from vasanth.raonaik@gmail.com)
Received: by mail-oi0-x22e.google.com with SMTP id b187so127163346oif.0
 for <freebsd-hackers@freebsd.org>; Sun, 09 Apr 2017 09:48:51 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=mime-version:references:in-reply-to:from:date:message-id:subject:to
 :cc; bh=6Y0XDOknjJXK/q5/gAGI4h8hsW3WVLx4f39J0Vi01jg=;
 b=ZQCNiWGfb8lq2JU2qeQfiYvTcJnqiFmKbYQOIPNtyRVYFvvVR7xyn5+ezk1mdsxKpK
 A1X0wCzg1f/suT986BCqP0Z2qNI2dOj9GcDIYYjHucMCHKlOG3lNYfxj+sQwnGOsKlIu
 vB9AITCzxypu3lwAyNbCmsfNz3LNGrgoVt6BBD8eSiiaYKbf9FHM38XjhStb8CNYqXeJ
 mIANAcoCMAnHzenxobVW+ksk3nCXgEGazdbT2PU2L/ol7HNo/1qxXeCZwuaSHbURUTkM
 WwwKIxnAYBHTLWwM/dY/fXWHiq3RmDGLK9JEKZ7QO9vZLgp5ehWy+Wf+Q2Z9OSLHFz+6
 mX+g==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:mime-version:references:in-reply-to:from:date
 :message-id:subject:to:cc;
 bh=6Y0XDOknjJXK/q5/gAGI4h8hsW3WVLx4f39J0Vi01jg=;
 b=FjEzEcynYdbnLxe7pN3OFRED8/By1vzZhN7FYpVcwOrncS9P3WVS/5z5tp0qk3QIfn
 ahgK5B1MzS/jVyJvG++ep7iiNn3GHrhoZMaUZsC8smlw0WYCGfzmEH1jP8Pe6ol+iE6h
 XRGSF7UtJWyqEltI9p0yzxBakoLrBs020F6h7/HJbCshSdUWLpMxZ/252lluWt+IO17w
 DBt8caQKoEII+PWyZmpUV9H1HVaSIZIIpBGTDm+LgLHfjjqOM1gVOK4NfjxKn/j5C9jV
 AjMA8hdHD0i/TaMCR5oslYGM7baJDi/EvnnKS1sty0SxD9ZOYXOHy1PNq+BGmNLbcLdl
 QHLw==
X-Gm-Message-State: AN3rC/45zT5DtAjTfWZ8h3nroQl8uwzrdlUmyAzMzFeOwR/azlG149cEhuoLFSEERXgMGO0JinRc+NBfBe8gVA==
X-Received: by 10.157.11.123 with SMTP id p56mr7577784otd.149.1491756530356;
 Sun, 09 Apr 2017 09:48:50 -0700 (PDT)
MIME-Version: 1.0
References: <e99b6366-7d30-a889-b7db-4a3b3133ff5e@gmail.com>
 <CABh_MKkbVVi+gTkaBVDvVfRggS6pbHKJE_VbYBZpAaTCZ81b7Q@mail.gmail.com>
 <c72c0ee3-328d-3efc-e8a0-4d6c0d5c8cee@gmail.com>
 <CAFMmRNwWnaq-4vEDCByqdUzWfoiZeN0nM_M5rt8ST0P8xnUTsA@mail.gmail.com>
In-Reply-To: <CAFMmRNwWnaq-4vEDCByqdUzWfoiZeN0nM_M5rt8ST0P8xnUTsA@mail.gmail.com>
From: vasanth sabavat <vasanth.raonaik@gmail.com>
Date: Sun, 09 Apr 2017 16:48:39 +0000
Message-ID: <CAAuizBiJFkqaEcaHkjP7ZVTgALzVagOopaf9gt3JjbQA3UE02A@mail.gmail.com>
Subject: Re: Understanding the FreeBSD locking mechanism
To: Ryan Stone <rysto32@gmail.com>, Yubin Ruan <ablacktshirt@gmail.com>
Cc: Ed Schouten <ed@nuxi.nl>,
 "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>
Content-Type: text/plain; charset=UTF-8
X-Content-Filtered-By: Mailman/MimeDel 2.1.23
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Apr 2017 16:48:51 -0000

On Sun, Apr 9, 2017 at 9:24 AM Ryan Stone <rysto32@gmail.com> wrote:

> On Sun, Apr 9, 2017 at 6:13 AM, Yubin Ruan <ablacktshirt@gmail.com> wrote:
>
> >
> > #######1, spinlock used in an interrupt handler
> > If a thread A holding a spinlock T get interrupted and the interrupt
> > handler responsible for this interrupt try to acquire T, then we have
> > deadlock, because A would never have a chance to run before the
> > interrupt handler return, and the interrupt handler, unfortunately,
> > will continue to spin ... so in this situation, one has to disable
> > interrupt before spinning.
> >
> > As far as I know, in Linux, they provide two kinds of spinlocks:
> >
> >   spin_lock(..);   /* spinlock that does not disable interrupts */
> >   spin_lock_irqsave(...); /* spinlock that disable local interrupt */
>
>
> In the FreeBSD locking style, a spinlock is only used in the case where one
> needs to
> synchronize with an interrupt handler.  This is why spinlocks always
> disable local
> interrupts in FreeBSD.


Isn't it true that interrupt handlers now run in their own threads,
instead of running on the current thread's stack?


>
> FreeBSD's lock for the first case is the MTX_DEF mutex, which is
> adaptively-spinning
> blocking mutex implementation.  In short, the MTX_DEF mutex will spin
> waiting for the
> lock if the owner is running, but will block if the owner is deschedules.
> This prevents
> expensive trips through the scheduler for the common case where the mutex
> is only held
> for short periods, without wasting CPU cycles spinning in cases where the
> owner thread is
> descheduled and therefore will not be completing soon.
>
> #######2, priority inversion problem
> > If thread B with a higher priority get in and try to acquire the lock
> > that thread A currently holds, then thread B would spin, while at the
> > same time thread A has no chance to run because it has lower priority,
> > thus not being able to release the lock.
> > (I haven't investigate enough into the source code, so I don't know
> > how FreeBSD and Linux handle this priority inversion problem. Maybe
> > they use priority inheritance or random boosting?)
> >
>
> FreeBSD's spin locks prevent priority inversion by preventing the holder
> thread from
> being descheduled.
>
> MTX_DEF locks implement priority inheritance.
>
-- 
Thanks,
Vasanth

From owner-freebsd-hackers@freebsd.org  Sun Apr  9 16:50:37 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id AE10CD365D0
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Sun,  9 Apr 2017 16:50:37 +0000 (UTC)
 (envelope-from wlosh@bsdimp.com)
Received: from mail-io0-x22c.google.com (mail-io0-x22c.google.com
 [IPv6:2607:f8b0:4001:c06::22c])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 7A811789
 for <freebsd-hackers@freebsd.org>; Sun,  9 Apr 2017 16:50:37 +0000 (UTC)
 (envelope-from wlosh@bsdimp.com)
Received: by mail-io0-x22c.google.com with SMTP id a103so7153059ioj.1
 for <freebsd-hackers@freebsd.org>; Sun, 09 Apr 2017 09:50:37 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=bsdimp-com.20150623.gappssmtp.com; s=20150623;
 h=mime-version:sender:in-reply-to:references:from:date:message-id
 :subject:to:cc:content-transfer-encoding;
 bh=uq4N62VcObOpr6hm1+yt24jW5Lh4EG5i+bJ9+JSrzn8=;
 b=Yy8+Q7CiyFdr9jOo5b0lcI6gWUdlae3qFXQgXDEpw7wne6mSQPVMBpmATDg579XYDu
 5S6axvlm+TyNFVZhLr/N0pmR0UYF3DQHgYy33nW9Wh6fQ5Htwhw8oBm3FClQPMAs74R3
 BQeK2ADJtQr2PHMkndIFAg3xO1zP/H5l7fGcWjRdDRbsSl6HVeQIOI3f8JygKMCRTwGV
 3M5PjZ8itgU0ufL7UvkEVFWaXuM4yxl71UZR5AzGElQlNbhfZcqglKTapItY31miP7+L
 DbFugGr1IfUcVj8JBBNeRqR586HSx90LQvbHWbktfoY0IUmya4u2x1uekENOhY9BF2Nh
 Wnaw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:mime-version:sender:in-reply-to:references:from
 :date:message-id:subject:to:cc:content-transfer-encoding;
 bh=uq4N62VcObOpr6hm1+yt24jW5Lh4EG5i+bJ9+JSrzn8=;
 b=OTYeGrhtkvV+TlQnLR3Ra0bPcXLgCgBU7vBCvOds7jsCErDahY/ELpVrvziueaKQ4v
 t9jE2n1PsjtAWpGZxkCjTR+10LAn1ZwLMuVATkZVwgUfO1Vnyj7rkAq9fyiVjLDSVPYx
 EKR5qGnxODX6/p6x6eVIPdVV8F5cI7dfn5NZwWQ8PVK0dKs1gfDfCWqi8zYIeNtPV6a7
 w19oZhq9Yse9FDEDevkj1inoSGZLaL9h6kBX3cG77k5RkteCoqqYAXV1gIqND6WXrvWY
 VRDGb/Qded5Sx+M6o9pCIorw7P1DvFETZY3Pum7LinwqWfqRxe71z7XItauN5YAVXRCl
 6xfg==
X-Gm-Message-State: AN3rC/53UZlGsKPu34znOT5JiWojidwCmIiDaOmqzxx1fQPpXERBlMnkIJU1Jsk0YuX7lk8ZfgehIWhk1JgNkg==
X-Received: by 10.107.198.137 with SMTP id w131mr2892855iof.19.1491756636689; 
 Sun, 09 Apr 2017 09:50:36 -0700 (PDT)
MIME-Version: 1.0
Sender: wlosh@bsdimp.com
Received: by 10.79.146.24 with HTTP; Sun, 9 Apr 2017 09:50:36 -0700 (PDT)
X-Originating-IP: [2607:fb10:7021:1::b517]
In-Reply-To: <8a60d967-eb7f-b529-df03-c0bfccbe9747@metricspace.net>
References: <6f6b47ed-84e0-e4c0-9df5-350620cff45b@metricspace.net>
 <20170408111144.GC14604@brick>
 <181f7b78-64c3-53a6-a143-721ef0cb5186@metricspace.net>
 <20170408115222.GA64207@brick>
 <7611f7a3-3e50-65f2-4347-e37018ae1abc@metricspace.net>
 <20170409155240.GA18363@brick>
 <8a60d967-eb7f-b529-df03-c0bfccbe9747@metricspace.net>
From: Warner Losh <imp@bsdimp.com>
Date: Sun, 9 Apr 2017 10:50:36 -0600
X-Google-Sender-Auth: zkX6U72fUJc7oD084joHMA5XRig
Message-ID: <CANCZdfpibssSCbqcRqDEZeoqLEyLwJMo-dU4ZKhMnH7ceYps_A@mail.gmail.com>
Subject: Re: Proposal for a design for signed kernel/modules/etc
To: Eric McCorkle <eric@metricspace.net>
Cc: "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>,
 freebsd-security@freebsd.org
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Apr 2017 16:50:37 -0000

On Sun, Apr 9, 2017 at 10:01 AM, Eric McCorkle <eric@metricspace.net> wrote=
:
> On 04/09/2017 11:52, Edward Tomasz Napiera=C5=82a wrote:
>> On 0409T1040, Eric McCorkle wrote:
>>> On 04/08/2017 07:52, Edward Tomasz Napiera=C5=82a wrote:
>>>> On 0408T0803, Eric McCorkle wrote:
>>>>> On 04/08/2017 07:11, Edward Tomasz Napiera=C5=82a wrote:
>>>>>> On 0327T1354, Eric McCorkle wrote:
>>>>>>> Hello everyone,
>>>>>>>
>>>>>>> The following is a design proposal for signed kernel and kernel mod=
ule
>>>>>>> loading, both at boot- and runtime (with the possibility open for s=
igned
>>>>>>> executables and libraries if someone wanted to go that route).  I'm
>>>>>>> interested in feedback on the idea before I start actually writing =
code
>>>>>>> for it.
>>>>>>
>>>>>> I see two potential problems with this.
>>>>>>
>>>>>> First, our current loader(8) depends heavily on Forth code.  By maki=
ng
>>>>>> it load modified 4th files, you can do absolutely anything you want;
>>>>>> AFAIK they have unrestricted access to hardware.  So you should pref=
erably
>>>>>> be able to sign them as well.  You _might_ (not sure on this one) al=
so
>>>>>> want to be able to restrict access to some of the loader configurati=
on
>>>>>> variables.
>>>>>
>>>>> Loader is handled by the UEFI secure boot framework, though the conce=
rns
>>>>> about the 4th code are still valid.  In a secure system, you'd want t=
o
>>>>> do something about that, but the concerns are different enough (and i=
t's
>>>>> isolated enough) that it could be done separately.
>>>>
>>>> Unless the way to address those ends up being a signature mechanism
>>>> that doesn't depend on the format of the files being signed.
>>>
>>> I explored the idea of wrapped or detached signatures in the previous
>>> discussion.  Envelopes or detached signatures could make sense for the
>>> 4th files.  It's a small, obscure set of code that probably isn't
>>> changed very often.
>>>
>>> Envelopes or detached signatures for kernel modules and especially
>>> signed executables and libraries both have extensive, far-reaching
>>> consequences for system administration, packaging, tooling, the ports
>>> collection, and so on, whereas signing the executable with an additiona=
l
>>> section has no such consequences.
>>>
>>> Config files (and the 4th files really are more like config files) have
>>> a different set of constraints, and detached signatures are probably th=
e
>>> way to go there.  So loader should probably support detached PKCS#7
>>> signature checks.
>>
>> The third way that might be worth considering would be to just append
>> the signature.  This would work for both 4th (if you prepend it with
>> whatever is the 4th comment character) and ELF, without the need for
>> changing or extending either format.
>
> No, that won't work at all.  That's going to break the tooling for ELF
> files as well as applications that use them, and it won't work for any
> configuration file aside from loader.4th  It wouldn't even work for
> boot.conf, for example.
>
> More generally, that's basing an entire standard off a dead language
> that's used in only one place, and in a way the precludes compatibility
> with any file format that uses a different comment character.  It also
> mandates some kind of ASCII encoding scheme to avoid newlines.

You don't need to avoid newlines with 4th. It doesn't even need to be
an ASCII encoding scheme, unless you are doing something crazy like
trying to push the signature through the 4th parser, which is nuts.
Forth can read binary files just fine. But I think arguing over the
4th stuff is a distraction; see below.

> If I was going to adopt a solution that broke existing tooling, I'd at
> least go with a proper envelope scheme.

That would be preferable.

But why the either-or dichotomy? Seems like you're looking at the
problem wrong if you are arguing about 4th code. You should be
thinking more in terms of, at most, a couple of 4th words that can
implement this stuff (so the loader could show whether the kernel is
signed and valid, not signed, or signed with a bogus signature).
99% of the functionality should be in C, and should be shareable
between the loader, the kernel and whatever else may wish to verify
signatures before loading. It would also allow the same functionality
to be pushed into the on-again-off-again Lua boot project (which
seems to have momentum this time).

Warner

From owner-freebsd-hackers@freebsd.org  Sun Apr  9 17:17:37 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 5BB91D3601E;
 Sun,  9 Apr 2017 17:17:37 +0000 (UTC)
 (envelope-from eric@metricspace.net)
Received: from mail.metricspace.net (mail.metricspace.net
 [IPv6:2001:470:1f11:617::107])
 by mx1.freebsd.org (Postfix) with ESMTP id 2C2459CD;
 Sun,  9 Apr 2017 17:17:37 +0000 (UTC)
 (envelope-from eric@metricspace.net)
Received: from [172.16.0.205] (unknown [172.16.0.205])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client did not present a certificate) (Authenticated sender: eric)
 by mail.metricspace.net (Postfix) with ESMTPSA id A712318CF;
 Sun,  9 Apr 2017 17:17:29 +0000 (UTC)
Subject: Re: Proposal for a design for signed kernel/modules/etc
To: Warner Losh <imp@bsdimp.com>
References: <6f6b47ed-84e0-e4c0-9df5-350620cff45b@metricspace.net>
 <20170408111144.GC14604@brick>
 <181f7b78-64c3-53a6-a143-721ef0cb5186@metricspace.net>
 <20170408115222.GA64207@brick>
 <7611f7a3-3e50-65f2-4347-e37018ae1abc@metricspace.net>
 <20170409155240.GA18363@brick>
 <8a60d967-eb7f-b529-df03-c0bfccbe9747@metricspace.net>
 <CANCZdfpibssSCbqcRqDEZeoqLEyLwJMo-dU4ZKhMnH7ceYps_A@mail.gmail.com>
Cc: "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>,
 freebsd-security@freebsd.org
From: Eric McCorkle <eric@metricspace.net>
Message-ID: <082223a0-1768-f5f0-9f4a-2e9fd45716c7@metricspace.net>
Date: Sun, 9 Apr 2017 13:17:26 -0400
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:45.0) Gecko/20100101
 Thunderbird/45.8.0
MIME-Version: 1.0
In-Reply-To: <CANCZdfpibssSCbqcRqDEZeoqLEyLwJMo-dU4ZKhMnH7ceYps_A@mail.gmail.com>
Content-Type: multipart/signed; micalg=pgp-sha256;
 protocol="application/pgp-signature";
 boundary="MKJ6xGh0nJg9jVG0lQanCjkQ11rg9qNAe"
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Apr 2017 17:17:37 -0000

This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
--MKJ6xGh0nJg9jVG0lQanCjkQ11rg9qNAe
Content-Type: multipart/mixed; boundary="CL7dlJ1GfWJTi96v7o8i4fBVIpSo8E8P3";
 protected-headers="v1"
From: Eric McCorkle <eric@metricspace.net>
To: Warner Losh <imp@bsdimp.com>
Cc: "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>,
 freebsd-security@freebsd.org
Message-ID: <082223a0-1768-f5f0-9f4a-2e9fd45716c7@metricspace.net>
Subject: Re: Proposal for a design for signed kernel/modules/etc
References: <6f6b47ed-84e0-e4c0-9df5-350620cff45b@metricspace.net>
 <20170408111144.GC14604@brick>
 <181f7b78-64c3-53a6-a143-721ef0cb5186@metricspace.net>
 <20170408115222.GA64207@brick>
 <7611f7a3-3e50-65f2-4347-e37018ae1abc@metricspace.net>
 <20170409155240.GA18363@brick>
 <8a60d967-eb7f-b529-df03-c0bfccbe9747@metricspace.net>
 <CANCZdfpibssSCbqcRqDEZeoqLEyLwJMo-dU4ZKhMnH7ceYps_A@mail.gmail.com>
In-Reply-To: <CANCZdfpibssSCbqcRqDEZeoqLEyLwJMo-dU4ZKhMnH7ceYps_A@mail.gmail.com>

--CL7dlJ1GfWJTi96v7o8i4fBVIpSo8E8P3
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable

On 04/09/2017 12:50, Warner Losh wrote:
> On Sun, Apr 9, 2017 at 10:01 AM, Eric McCorkle <eric@metricspace.net> w=
rote:
>> On 04/09/2017 11:52, Edward Tomasz Napiera=C5=82a wrote:
>>> On 0409T1040, Eric McCorkle wrote:
>>>> On 04/08/2017 07:52, Edward Tomasz Napiera=C5=82a wrote:
>>>>> On 0408T0803, Eric McCorkle wrote:
>>>>>> On 04/08/2017 07:11, Edward Tomasz Napiera=C5=82a wrote:
>>>>>>> On 0327T1354, Eric McCorkle wrote:
>>>>>>>> Hello everyone,
>>>>>>>>
>>>>>>>> The following is a design proposal for signed kernel and kernel =
module
>>>>>>>> loading, both at boot- and runtime (with the possibility open fo=
r signed
>>>>>>>> executables and libraries if someone wanted to go that route).  =
I'm
>>>>>>>> interested in feedback on the idea before I start actually writi=
ng code
>>>>>>>> for it.
>>>>>>>
>>>>>>> I see two potential problems with this.
>>>>>>>
>>>>>>> First, our current loader(8) depends heavily on Forth code.  By m=
aking
>>>>>>> it load modified 4th files, you can do absolutely anything you wa=
nt;
>>>>>>> AFAIK they have unrestricted access to hardware.  So you should p=
referably
>>>>>>> be able to sign them as well.  You _might_ (not sure on this one)=
 also
>>>>>>> want to be able to restrict access to some of the loader configur=
ation
>>>>>>> variables.
>>>>>>
>>>>>> Loader is handled by the UEFI secure boot framework, though the concerns
>>>>>> about the 4th code are still valid.  In a secure system, you'd want to
>>>>>> do something about that, but the concerns are different enough (and it's
>>>>>> isolated enough) that it could be done separately.
>>>>>
>>>>> Unless the way to address those ends up being a signature mechanism
>>>>> that doesn't depend on the format of the files being signed.
>>>>
>>>> I explored the idea of wrapped or detached signatures in the previous
>>>> discussion.  Envelopes or detached signatures could make sense for the
>>>> 4th files.  It's a small, obscure set of code that probably isn't
>>>> changed very often.
>>>>
>>>> Envelopes or detached signatures for kernel modules and especially
>>>> signed executables and libraries both have extensive, far-reaching
>>>> consequences for system administration, packaging, tooling, the ports
>>>> collection, and so on, whereas signing the executable with an additional
>>>> section has no such consequences.
>>>>
>>>> Config files (and the 4th files really are more like config files) have
>>>> a different set of constraints, and detached signatures are probably the
>>>> way to go there.  So loader should probably support detached PKCS#7
>>>> signature checks.
>>>
>>> The third way that might be worth considering would be to just append
>>> the signature.  This would work for both 4th (if you prepend it with
>>> whatever is the 4th comment character) and ELF, without the need for
>>> changing or extending either format.
>>
>> No, that won't work at all.  That's going to break the tooling for ELF
>> files as well as applications that use them, and it won't work for any
>> configuration file aside from loader.4th.  It wouldn't even work for
>> boot.conf, for example.
>>
>> More generally, that's basing an entire standard off a dead language
>> that's used in only one place, and in a way that precludes compatibility
>> with any file format that uses a different comment character.  It also
>> mandates some kind of ASCII encoding scheme to avoid newlines.
>
> You don't need to avoid new lines with 4th. It doesn't even need to be
> an ASCII encoding scheme, unless you are doing something crazy like
> trying to push the signature through the 4th parser, which is nuts.
> Forth can read binary files just fine. But I think arguing over the
> 4th stuff is a distraction, see below.
>
>> If I was going to adopt a solution that broke existing tooling, I'd at
>> least go with a proper envelope scheme.
>
> That would be preferable.
>
> But why the either-or dichotomy? Seems like you're looking at the
> problem wrong if you are arguing about 4th code. You should be
> thinking more in terms of, at most, a couple of 4th words that can
> implement this stuff (so the loader could show that the kernel is
> signed and valid vs is not signed vs is signed, but the signature is
> bogus). 99% of the functionality should be in C, and should be
> sharable between the loader, the kernel and whatever else may wish to
> verify signatures before loading. It would also allow the same
> functionality to be pushed into the on-again-off-again Lua boot
> project (which seems to have momentum this time).
>

I'm not following what you're saying.  I don't think anyone was
suggesting doing signature *verification* in 4th (at least I hope not!).
The issue is about the format of the signatures.

Basically, the crux of my proposal is about using an ELF section to
store signatures, which has immediate use for kernel module loading as
well as in the boot loader for the same purpose.

Now, the boot programs, loader, and perhaps the kernel too all load
various additional config files (boot.conf, loader.4th, loader.conf,
etc).  These do also need to be signed, so there needs to be a solution
for this as well.

There are significant advantages to the ELF .sign section, and all the
alternatives have serious disadvantages.  For these reasons, I'm pretty
set on the .sign section.  With the config files (which includes the 4th
code), you don't have a file format that transparently supports
additional metadata (like ELF does).  So you have a choice between
storing detached signatures in an external file (the way GRUB does) or
using an envelope format.  Of the two, the envelope is preferable, I
think, though it should probably have a different name (ex:
loader.4th.pk7, loader.conf.pk7) and be understood to contain an
envelope, not a raw config file.


The implementation of all this would be in C, of course.  The
verification stuff would be compiled into loader and kernel.  The
ELF signing would be done by a command-line utility (which I've
half-written at this point).  Ideally, the signing of config files would
be doable with the standard openssl command-line.


--CL7dlJ1GfWJTi96v7o8i4fBVIpSo8E8P3--

--MKJ6xGh0nJg9jVG0lQanCjkQ11rg9qNAe
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----

iHUEARYIAB0WIQRELMWN3SgpoYkrmidWwohAqoAEjQUCWOpspgAKCRBWwohAqoAE
jQ38AQCBS/XagV7XTbcddwhcVSvvwPw1iQKYnMYAUUumSSJ9ZQD/ahJsW5QVbf7R
d8z+nk1a4SUI98zbv4crR0O+pXjHSgE=
=BuYb
-----END PGP SIGNATURE-----

--MKJ6xGh0nJg9jVG0lQanCjkQ11rg9qNAe--

From owner-freebsd-hackers@freebsd.org  Sun Apr  9 17:24:38 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 64FBAD36426
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Sun,  9 Apr 2017 17:24:38 +0000 (UTC)
 (envelope-from markmi@dsl-only.net)
Received: from asp.reflexion.net (outbound-mail-210-7.reflexion.net
 [208.70.210.7])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 29E6BC8
 for <freebsd-hackers@freebsd.org>; Sun,  9 Apr 2017 17:24:38 +0000 (UTC)
 (envelope-from markmi@dsl-only.net)
Received: (qmail 31129 invoked from network); 9 Apr 2017 17:24:31 -0000
Received: from unknown (HELO mail-cs-01.app.dca.reflexion.local) (10.81.19.1)
 by 0 (rfx-qmail) with SMTP; 9 Apr 2017 17:24:31 -0000
Received: by mail-cs-01.app.dca.reflexion.local
 (Reflexion email security v8.40.0) with SMTP;
 Sun, 09 Apr 2017 13:24:31 -0400 (EDT)
Received: (qmail 20454 invoked from network); 9 Apr 2017 17:24:31 -0000
Received: from unknown (HELO iron2.pdx.net) (69.64.224.71)
 by 0 (rfx-qmail) with (AES256-SHA encrypted) SMTP; 9 Apr 2017 17:24:31 -0000
Received: from [192.168.1.106] (c-76-115-7-162.hsd1.or.comcast.net
 [76.115.7.162])
 by iron2.pdx.net (Postfix) with ESMTPSA id 5CF31EC8630;
 Sun,  9 Apr 2017 10:24:30 -0700 (PDT)
Content-Type: text/plain; charset=us-ascii
Mime-Version: 1.0 (Mac OS X Mail 10.3 \(3273\))
Subject: Re: The arm64 fork-then-swap-out-then-swap-in failures: a program
 source for exploring them
From: Mark Millard <markmi@dsl-only.net>
In-Reply-To: <20170409122715.GF1788@kib.kiev.ua>
Date: Sun, 9 Apr 2017 10:24:29 -0700
Cc: andrew@freebsd.org, freebsd-hackers@freebsd.org,
 freebsd-arm <freebsd-arm@freebsd.org>
Content-Transfer-Encoding: quoted-printable
Message-Id: <9D152170-5F19-47A2-A06A-66F83CA88A09@dsl-only.net>
References: <4DEA2D76-9F27-426D-A8D2-F07B16575FB9@dsl-only.net>
 <163B37B0-55D6-498E-8F52-9A95C036CDFA@dsl-only.net>
 <08E7A5B0-8707-4479-9D7A-272C427FF643@dsl-only.net>
 <20170409122715.GF1788@kib.kiev.ua>
To: Konstantin Belousov <kostikbel@gmail.com>
X-Mailer: Apple Mail (2.3273)
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Apr 2017 17:24:38 -0000

On 2017-Apr-9, at 5:27 AM, Konstantin Belousov <kostikbel@gmail.com> wrote:

> On Sat, Apr 08, 2017 at 06:02:00PM -0700, Mark Millard wrote:
>> [I've identified the code path involved is the arm64 small allocations
>> turning into zeros for later fork-then-swapout-then-back-in,
>> specifically the ongoing RES(ident memory) size decrease that
>> "top -PCwaopid" shows before the fork/swap sequence. Hopefully
>> I've also exposed enough related information for someone that
>> knows what they are doing to get started with a specific
>> investigation, looking for a fix. I'd like for a pine64+
>> 2GB to have buildworld complete despite the forking and
>> swapping involved (yep: for a time zero RES(ident memory) for
>> some processes involved in the build).]
>
> I was not able to follow the walls of text, but do not think that
> pmap_ts_reference() is the real culprit there.
>
> Is my impression right that the issue occurs on fork, and looks as
> a memory corruption, where some page suddenly becomes zero-filled ?
> And swapping seems to be involved ?  It is somewhat interesting to see
> if the problem is reproducible on non-arm64 machines, e.g. armv7 or amd64.

Yes, yes, non-arm64 that I've tried works.

But I think that the following extra detail may be of use: what top
shows for RES over time is also odd on arm64 (only) and the number
of pages that are zeroed is proportional to the decrease in RES.

In the test sequence:

A) Allocate lots of 14 KiByte allocations and initialize the content of each
to non-zero. The example ends up with RES of about 265M.

B) sleep some amount of time, I've been using well over 30 seconds here.

C) fork

D) sleep again (parent and child), also forcing swapping during the sleep
   (I used stress, manually run.)

E) Test the memory pattern in the parent and child process, passing over
   all the bytes, failed and good.

Both the parent and the child in (E) see the first pages allocated as zero,
with the number of pages being zero increasing as the sleep time in (B)
increases (as long as the sleep is over 30 sec or so). The parent and child
match for which pages are zero vs. not.

It fails with (B) being a no-op as well. But the proportionality with
the time for the sleep is interesting.

During (B) "top -PCwaopid" shows RES decreasing, starting after 30 sec
or so. The fork in (C) produces a child that does not have the same RES
as the parent but instead a tiny RES (80K as I remember). During (E)
the child's RES increases to full size.

My powerpc64, armv7, and amd64 tests of such do not fail, nor does RES
decrease during (B). The child process gets the same RES as the parent
as well, unlike for arm64.

In the failing context (arm64) RES in the parent decreases during (D)
before the swap-out as well.
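
Steps (A) through (E) can be sketched as the following C fragment. This
is a minimal illustration, not the actual test program from this thread:
the names run_fork_test, count_bad_bytes, free_regions and the 0xA5 fill
byte are assumptions made for the example, and the swapping in step (D)
still has to be forced externally (for example with stress) while the
processes sleep.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define REGION_SIZE (14u * 1024u) /* 14 KiByte, as in step (A) */
#define FILL_BYTE   0xA5          /* arbitrary non-zero pattern (assumption) */

/* Step (E): count bytes that no longer hold FILL_BYTE. */
static size_t
count_bad_bytes(unsigned char **regions, size_t nregions)
{
	size_t bad = 0;

	for (size_t i = 0; i < nregions; i++)
		for (size_t j = 0; j < REGION_SIZE; j++)
			if (regions[i][j] != FILL_BYTE)
				bad++;
	return (bad);
}

static void
free_regions(unsigned char **regions, size_t nregions)
{
	for (size_t i = 0; i < nregions; i++)
		free(regions[i]); /* free(NULL) is a no-op */
	free(regions);
}

/*
 * Run steps (A)-(E).  Returns 0 if parent and child both see intact
 * memory, 1 if either saw corrupted bytes, -1 on a setup failure.
 */
static int
run_fork_test(size_t nregions, unsigned pre_fork_sleep,
    unsigned post_fork_sleep)
{
	unsigned char **regions;
	size_t bad;
	pid_t pid;
	int status = 0, child_bad;

	assert(nregions > 0);
	regions = calloc(nregions, sizeof(*regions));
	if (regions == NULL)
		return (-1);
	for (size_t i = 0; i < nregions; i++) {		/* (A) */
		regions[i] = malloc(REGION_SIZE);
		if (regions[i] == NULL) {
			free_regions(regions, nregions);
			return (-1);
		}
		memset(regions[i], FILL_BYTE, REGION_SIZE);
	}
	sleep(pre_fork_sleep);				/* (B) */
	pid = fork();					/* (C) */
	if (pid < 0) {
		free_regions(regions, nregions);
		return (-1);
	}
	sleep(post_fork_sleep);				/* (D) */
	bad = count_bad_bytes(regions, nregions);	/* (E) */
	if (pid == 0)
		_exit(bad != 0 ? 1 : 0);	/* child reports via exit status */
	if (waitpid(pid, &status, 0) < 0) {
		free_regions(regions, nregions);
		return (-1);
	}
	child_bad = !WIFEXITED(status) || WEXITSTATUS(status) != 0;
	free_regions(regions, nregions);
	return ((bad != 0 || child_bad) ? 1 : 0);
}
```

Roughly 19000 regions of 14 KiByte come to about 265 MiByte, so a call
such as run_fork_test(19000, 60, 120) from a small main() would
approximate the sizes and sleep times described above; the exact values
are guesses, not the ones used in the reported runs.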

> If answers to my two questions are yes, there is probably some bug with
> arm64 pmap handling of the dirty bit emulation.  ARMv8.0 does not provide
> hardware dirty bit, and pmap interprets an accessed writeable page as
> unconditionally dirty.  More, accessed bit is also not maintained by
> hardware, instead it should be set by pmap.  And arm64 pmap sets the
> AF bit unconditionally when creating a valid pte.

fork-then-swap-out/in is required to see the problem. Neither fork
by itself nor swapping (zero RES as shown in top) by itself have
shown the problem so far.
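
The dirty-bit emulation described in the quoted text can be modeled in a
few lines of userspace C. This is only a sketch of one reading of the
patch quoted in this message: the MODEL_* bits, struct model_page and
both functions are hypothetical stand-ins, not the real arm64 PTE layout
or pmap API. It shows why write-protecting an "accessed and writable"
entry loses the inferred dirty state unless that state is first
recorded on the page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout for illustration; not the real arm64 values. */
#define MODEL_AF      (UINT64_C(1) << 0) /* access flag, set by software */
#define MODEL_AP_RO   (UINT64_C(1) << 1) /* set: read-only, clear: writable */
#define MODEL_MANAGED (UINT64_C(1) << 2) /* page is tracked by the VM layer */

/* Stand-in for the VM layer's per-page state. */
struct model_page {
	bool dirty;
};

/* With no hardware dirty bit, "accessed and writable" is taken as dirty. */
static bool
model_pte_dirty(uint64_t pte)
{
	return ((pte & (MODEL_AF | MODEL_AP_RO)) == MODEL_AF);
}

/*
 * Downgrade an entry to read-only.  When transfer_dirty is true the
 * inferred dirty state is copied to the page first, which is the step
 * the quoted patch adds; when false, that state is silently dropped.
 */
static uint64_t
model_protect_ro(uint64_t pte, struct model_page *pg, bool transfer_dirty)
{
	if (transfer_dirty && (pte & MODEL_MANAGED) != 0 &&
	    model_pte_dirty(pte))
		pg->dirty = true;
	return (pte | MODEL_AP_RO);
}
```

In this model, an entry that was dirty before the downgrade reports as
clean afterwards, so a pager consulting only the entry would have no
reason to write the page out. Whether that is the actual cause of the
zeroed pages is not established here.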

> Hmm, could you try the following patch, I did not even compile it.

I'll try it later today.

> diff --git a/sys/arm64/arm64/pmap.c b/sys/arm64/arm64/pmap.c
> index 3d5756ba891..55aa402eb1c 100644
> --- a/sys/arm64/arm64/pmap.c
> +++ b/sys/arm64/arm64/pmap.c
> @@ -2481,6 +2481,11 @@ pmap_protect(pmap_t pmap, vm_offset_t sva, vm_offset_t eva, vm_prot_t prot)
> 		    sva += L3_SIZE) {
> 			l3 = pmap_load(l3p);
> 			if (pmap_l3_valid(l3)) {
> +				if ((l3 & ATTR_SW_MANAGED) &&
> +				    pmap_page_dirty(l3)) {
> +					vm_page_dirty(PHYS_TO_VM_PAGE(l3 &
> +					    ~ATTR_MASK));
> +				}
> 				pmap_set(l3p, ATTR_AP(ATTR_AP_RO));
> 				PTE_SYNC(l3p);
> 				/* XXX: Use pmap_invalidate_range */


===
Mark Millard
markmi at dsl-only.net


From owner-freebsd-hackers@freebsd.org  Sun Apr  9 18:02:08 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id C43DCD3607A
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Sun,  9 Apr 2017 18:02:08 +0000 (UTC)
 (envelope-from rysto32@gmail.com)
Received: from mail-io0-x230.google.com (mail-io0-x230.google.com
 [IPv6:2607:f8b0:4001:c06::230])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 8ACAF895
 for <freebsd-hackers@freebsd.org>; Sun,  9 Apr 2017 18:02:08 +0000 (UTC)
 (envelope-from rysto32@gmail.com)
Received: by mail-io0-x230.google.com with SMTP id r16so19129880ioi.2
 for <freebsd-hackers@freebsd.org>; Sun, 09 Apr 2017 11:02:08 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=mime-version:in-reply-to:references:from:date:message-id:subject:to
 :cc; bh=uuoVb0oAtQXZ/NyuchdxgsKc/vAdEk+3WWY3C6y6/HA=;
 b=TFzidzG5LLGHLFZBiHNcCZOSGvHzlkQPZ1GVbvaUnFH34Eddap/PZn9IedLvAhz32j
 0/StQdzqEq8vTTtdeFFytHoFdqraWZ8Tb/bgyLAeYavygNlz9Ts2tbecgf7OnW4y79nY
 LwsbfSADRxJjaRhB8Qy8i7r5gaOburE0gR+CIxXLJggUQAMjL0NnqSq4ekQySJalSw7J
 qBrsch3TvqAgKx0H8GaT72mRQlPWnNetHVDRbxCFHChqpJUkqRlOD/lqja6h1Gras/+H
 skkttrOv9mnVdZgxfmHm0TK/voQ48q/HbQwCwJygZ/G5zokii4MPiom/jb9ztY9FRS9Q
 PjJQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:mime-version:in-reply-to:references:from:date
 :message-id:subject:to:cc;
 bh=uuoVb0oAtQXZ/NyuchdxgsKc/vAdEk+3WWY3C6y6/HA=;
 b=AZE9dH5+dtwJW8xLuxlh+9fSDh9DM32keOkF70MWLBwJuZBCR0FJMOkdarMpwSXTNz
 SVJhz4oUNWxTFGByxNnbeK8eVR3ZRmkE9zeXxPeXcFAXO7rHqFe1tC65scnS+rjdBd+V
 1UlYCt8YH/GAtPFguepBI9IJkFcIySycvclWnAW28JDNgMC95vDQAeO1YEKrpwNQHUB2
 TURhB74bwk1g75H3MSRs/0fPnlDw640MonFd13o7U/GPVH2cbpOs3telA9QrozRk5v7/
 2SeKU7MQ4Gtdv+cjYVLxuLsoRg/5BbpFFWG07WwjmsZnA4S1RqqBDdZPJtdUjhgFwCkp
 axXw==
X-Gm-Message-State: AFeK/H12LNHWavsepZmXn3VRggpOU38TrpP75uVwqgMl+3AjJV1iyT2RArpq5TIbNs9ARJX06Ro9mjGmSVzSQQ==
X-Received: by 10.107.164.36 with SMTP id n36mr46312095ioe.103.1491760927967; 
 Sun, 09 Apr 2017 11:02:07 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.107.19.33 with HTTP; Sun, 9 Apr 2017 11:02:07 -0700 (PDT)
In-Reply-To: <CAAuizBiJFkqaEcaHkjP7ZVTgALzVagOopaf9gt3JjbQA3UE02A@mail.gmail.com>
References: <e99b6366-7d30-a889-b7db-4a3b3133ff5e@gmail.com>
 <CABh_MKkbVVi+gTkaBVDvVfRggS6pbHKJE_VbYBZpAaTCZ81b7Q@mail.gmail.com>
 <c72c0ee3-328d-3efc-e8a0-4d6c0d5c8cee@gmail.com>
 <CAFMmRNwWnaq-4vEDCByqdUzWfoiZeN0nM_M5rt8ST0P8xnUTsA@mail.gmail.com>
 <CAAuizBiJFkqaEcaHkjP7ZVTgALzVagOopaf9gt3JjbQA3UE02A@mail.gmail.com>
From: Ryan Stone <rysto32@gmail.com>
Date: Sun, 9 Apr 2017 14:02:07 -0400
Message-ID: <CAFMmRNzOypqsBam2BfaFm+pX7hSYoEvB2oFtec8OtH6D=s9yTw@mail.gmail.com>
Subject: Re: Understanding the FreeBSD locking mechanism
To: vasanth sabavat <vasanth.raonaik@gmail.com>
Cc: Yubin Ruan <ablacktshirt@gmail.com>, Ed Schouten <ed@nuxi.nl>, 
 "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>
Content-Type: text/plain; charset=UTF-8
X-Content-Filtered-By: Mailman/MimeDel 2.1.23
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Apr 2017 18:02:08 -0000

On Sun, Apr 9, 2017 at 12:48 PM, vasanth sabavat <vasanth.raonaik@gmail.com>
wrote:

> Isn't it true that interrupt handlers instead of running on the current
> thread stack now have their own thread?
>

It depends on what you mean by "interrupt handler" in this context, as that's
ambiguous in FreeBSD.  Most driver interrupt handling is done through an
ithread, which does have its own thread context, and MTX_DEF mutexes are
the appropriate locking primitive to use with them.

However, it is possible to handle an interrupt through what FreeBSD calls
an "interrupt filter", which runs on the kernel stack of whatever thread
happened to be running on the CPU, and therefore you must use a spinlock to
synchronize with an interrupt.  FreeBSD prefers the use of ithreads and
MTX_DEF mutexes over filters and spinlocks.

Sorry for the use of confusing terminology.  I considered referring to
interrupt filters in my last message, but I figured the term would be
unfamiliar to someone not intimately familiar with FreeBSD internals, so I
decided to avoid it.

From owner-freebsd-hackers@freebsd.org  Sun Apr  9 18:25:03 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id F3C19D36B09
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Sun,  9 Apr 2017 18:25:02 +0000 (UTC)
 (envelope-from markmi@dsl-only.net)
Received: from asp.reflexion.net (outbound-mail-210-7.reflexion.net
 [208.70.210.7])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id B9D54AD0
 for <freebsd-hackers@freebsd.org>; Sun,  9 Apr 2017 18:25:02 +0000 (UTC)
 (envelope-from markmi@dsl-only.net)
Received: (qmail 25550 invoked from network); 9 Apr 2017 18:25:01 -0000
Received: from unknown (HELO rtc-sm-01.app.dca.reflexion.local) (10.81.150.1)
 by 0 (rfx-qmail) with SMTP; 9 Apr 2017 18:25:01 -0000
Received: by rtc-sm-01.app.dca.reflexion.local
 (Reflexion email security v8.40.0) with SMTP;
 Sun, 09 Apr 2017 14:25:01 -0400 (EDT)
Received: (qmail 6531 invoked from network); 9 Apr 2017 18:25:00 -0000
Received: from unknown (HELO iron2.pdx.net) (69.64.224.71)
 by 0 (rfx-qmail) with (AES256-SHA encrypted) SMTP; 9 Apr 2017 18:25:00 -0000
Received: from [192.168.1.106] (c-76-115-7-162.hsd1.or.comcast.net
 [76.115.7.162])
 by iron2.pdx.net (Postfix) with ESMTPSA id 2747CEC8630;
 Sun,  9 Apr 2017 11:25:00 -0700 (PDT)
Content-Type: text/plain; charset=us-ascii
Mime-Version: 1.0 (Mac OS X Mail 10.3 \(3273\))
Subject: Re: The arm64 fork-then-swap-out-then-swap-in failures: a program
 source for exploring them
From: Mark Millard <markmi@dsl-only.net>
In-Reply-To: <9D152170-5F19-47A2-A06A-66F83CA88A09@dsl-only.net>
Date: Sun, 9 Apr 2017 11:24:59 -0700
Cc: andrew@freebsd.org, freebsd-hackers@freebsd.org,
 freebsd-arm <freebsd-arm@freebsd.org>
Content-Transfer-Encoding: quoted-printable
Message-Id: <9DCAF95B-39A5-4346-88FC-6AFDEE8CF9BB@dsl-only.net>
References: <4DEA2D76-9F27-426D-A8D2-F07B16575FB9@dsl-only.net>
 <163B37B0-55D6-498E-8F52-9A95C036CDFA@dsl-only.net>
 <08E7A5B0-8707-4479-9D7A-272C427FF643@dsl-only.net>
 <20170409122715.GF1788@kib.kiev.ua>
 <9D152170-5F19-47A2-A06A-66F83CA88A09@dsl-only.net>
To: Konstantin Belousov <kostikbel@gmail.com>
X-Mailer: Apple Mail (2.3273)
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Apr 2017 18:25:03 -0000


On 2017-Apr-9, at 10:24 AM, Mark Millard <markmi@dsl-only.net> wrote:

> On 2017-Apr-9, at 5:27 AM, Konstantin Belousov <kostikbel@gmail.com> wrote:
>
>> On Sat, Apr 08, 2017 at 06:02:00PM -0700, Mark Millard wrote:
>>> [I've identified the code path involved is the arm64 small allocations
>>> turning into zeros for later fork-then-swapout-then-back-in,
>>> specifically the ongoing RES(ident memory) size decrease that
>>> "top -PCwaopid" shows before the fork/swap sequence. Hopefully
>>> I've also exposed enough related information for someone that
>>> knows what they are doing to get started with a specific
>>> investigation, looking for a fix. I'd like for a pine64+
>>> 2GB to have buildworld complete despite the forking and
>>> swapping involved (yep: for a time zero RES(ident memory) for
>>> some processes involved in the build).]
>>
>> I was not able to follow the walls of text, but do not think that
>> pmap_ts_reference() is the real culprit there.
>>
>> Is my impression right that the issue occurs on fork, and looks as
>> a memory corruption, where some page suddenly becomes zero-filled ?
>> And swapping seems to be involved ?  It is somewhat interesting to see
>> if the problem is reproducible on non-arm64 machines, e.g. armv7 or amd64.
>
> Yes, yes, non-arm64 that I've tried works.
>
> But I think that the following extra detail may be of use: what top
> shows for RES over time is also odd on arm64 (only) and the number
> of pages that are zeroed is proportional to the decrease in RES.
>
> In the test sequence:
>
> A) Allocate lots of 14 KiByte allocations and initialize the content of each
> to non-zero. The example ends up with RES of about 265M.

I did forget to list one important property: why I picked 14 KiBytes.

A) Any allocation size <= 14 KiBytes that I've tried
   gets the zeros problem in my arm64 contexts (bpim3 and rip3).

B) Any allocation size >= 14 KiBytes + 1 Byte that I've
   tried works in those contexts.

For the arm64 contexts that I use this happens to match with
the jemalloc SMALL_MAXCLASS size boundary. When I looked it
appeared that 14 Ki was the smallest SMALL_MAXCLASS value
in jemalloc so it would always fit the category.

> B) sleep some amount of time, I've been using well over 30 seconds here.
>
> C) fork
>
> D) sleep again (parent and child), also forcing swapping during the sleep
>   (I used stress, manually run.)
>
> E) Test the memory pattern in the parent and child process, passing over
>   all the bytes, failed and good.
>
> Both the parent and the child in (E) see the first pages allocated as zero,
> with the number of pages being zero increasing as the sleep time in (B)
> increases (as long as the sleep is over 30 sec or so). The parent and child
> match for which pages are zero vs. not.
>
> It fails with (B) being a no-op as well. But the proportionality with
> the time for the sleep is interesting.
>
> During (B) "top -PCwaopid" shows RES decreasing, starting after 30 sec
> or so. The fork in (C) produces a child that does not have the same RES
> as the parent but instead a tiny RES (80K as I remember). During (E)
> the child's RES increases to full size.
>
> My powerpc64, armv7, and amd64 tests of such do not fail, nor does RES
> decrease during (B). The child process gets the same RES as the parent
> as well, unlike for arm64.
>
> In the failing context (arm64) RES in the parent decreases during (D)
> before the swap-out as well.
>
>> If answers to my two questions are yes, there is probably some bug with
>> arm64 pmap handling of the dirty bit emulation.  ARMv8.0 does not provide
>> hardware dirty bit, and pmap interprets an accessed writeable page as
>> unconditionally dirty.  More, accessed bit is also not maintained by
>> hardware, instead it should be set by pmap.  And arm64 pmap sets the
>> AF bit unconditionally when creating a valid pte.
>
> fork-then-swap-out/in is required to see the problem. Neither fork
> by itself nor swapping (zero RES as shown in top) by itself have
> shown the problem so far.
>
>> Hmm, could you try the following patch, I did not even compile it.
>
> I'll try it later today.
>=20
>> diff --git a/sys/arm64/arm64/pmap.c b/sys/arm64/arm64/pmap.c
>> index 3d5756ba891..55aa402eb1c 100644
>> --- a/sys/arm64/arm64/pmap.c
>> +++ b/sys/arm64/arm64/pmap.c
>> @@ -2481,6 +2481,11 @@ pmap_protect(pmap_t pmap, vm_offset_t sva, vm_offset_t eva, vm_prot_t prot)
>> 		    sva += L3_SIZE) {
>> 			l3 = pmap_load(l3p);
>> 			if (pmap_l3_valid(l3)) {
>> +				if ((l3 & ATTR_SW_MANAGED) &&
>> +				    pmap_page_dirty(l3)) {
>> +					vm_page_dirty(PHYS_TO_VM_PAGE(l3 &
>> +					    ~ATTR_MASK));
>> +				}
>> 				pmap_set(l3p, ATTR_AP(ATTR_AP_RO));
>> 				PTE_SYNC(l3p);
>> 				/* XXX: Use pmap_invalidate_range */

===
Mark Millard
markmi at dsl-only.net


From owner-freebsd-hackers@freebsd.org  Sun Apr  9 20:13:06 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 223DBD360A9
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Sun,  9 Apr 2017 20:13:06 +0000 (UTC)
 (envelope-from j.deboynepollard-newsgroups@ntlworld.com)
Received: from smtpq3.tb.ukmail.iss.as9143.net
 (smtpq3.tb.ukmail.iss.as9143.net [212.54.57.98])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id DAB7A107
 for <freebsd-hackers@freebsd.org>; Sun,  9 Apr 2017 20:13:04 +0000 (UTC)
 (envelope-from j.deboynepollard-newsgroups@ntlworld.com)
Received: from [212.54.57.81] (helo=smtp2.tb.ukmail.iss.as9143.net)
 by smtpq3.tb.ukmail.iss.as9143.net with esmtp (Exim 4.86_2)
 (envelope-from <j.deboynepollard-newsgroups@ntlworld.com>)
 id 1cxIsd-0002cD-Bb
 for freebsd-hackers@freebsd.org; Sun, 09 Apr 2017 21:52:11 +0200
Received: from oxbe4.tb.ukmail.iss.as9143.net ([172.25.160.135])
 by smtp2.tb.ukmail.iss.as9143.net with bizsmtp
 id 6Ks71v0052vaL8C01Ks7TV; Sun, 09 Apr 2017 21:52:07 +0200
X-SourceIP: 172.25.160.135
X-Authenticated-User: j.deboynepollard-newsgroups@ntlworld.com
Date: Sun, 9 Apr 2017 20:52:07 +0100 (BST)
From: Jonathan de Boyne Pollard <j.deboynepollard-newsgroups@ntlworld.com>
Reply-To: Jonathan de Boyne Pollard <j.deboynepollard-newsgroups@ntlworld.com>
To: Debian users <debian-user@lists.debian.org>, 
 FreeBSD Hackers <freebsd-hackers@freebsd.org>, 
 Supervision <supervision@list.skarnet.org>
Message-ID: <731531599.156033.1491767527334.JavaMail.open-xchange@oxbe4.tb.ukmail.iss.as9143.net>
In-Reply-To: <da1dc089-7850-82d8-6d87-5bb999e9e89a@NTLWorld.com>
References: <54430B41.3010301@NTLWorld.com>
 <76c00c13-4cc9-ed9c-f48f-81a3f050b80b@NTLWorld.com>
 <0d6afc48-3465-3509-ff46-494da45022bc@NTLWorld.com>
 <da1dc089-7850-82d8-6d87-5bb999e9e89a@NTLWorld.com>
Subject: nosh version 1.33
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
X-Priority: 3
Importance: Medium
X-Mailer: Open-Xchange Mailer v7.6.2-Rev60
X-Originating-IP: 86.10.211.13
X-Originating-Client: com.openexchange.ox.gui.dhtml
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Apr 2017 20:13:06 -0000

The nosh package is now up to version 1.33.

* http://jdebp.eu./Softwares/nosh/
* https://www.freebsd.org/news/status/report-2015-07-2015-09.html#The-nosh-Project
* http://jdebp.info./Softwares/nosh/

This has been held back because of work being done by someone else.  I don't
want to steal xyr thunder, so I'll leave the announcement of that work to xem.
Suffice it to say that it will interest a new group of people.

There are several major improvements in 1.33.

Packaging
---------

In the version 1.29 announcement I said that the Debian packaging system was
going to be brought into line with the system used for FreeBSD/TrueOS and
OpenBSD.  This is now done.  Debian and the BSDs all now use a similar system
for generating each package manager's package maintenance instructions from an
abstract package description.

==============================================================
=========== IMPORTANT UPGRADE NOTE FOR Debian: ===============
==============================================================

An important consequence of the aforementioned is that the semantics of the
nosh-bundles package have changed. In earlier versions, the various nosh-run-*
packages were how one set services running, except for a small rump set of
services that were set up by the nosh-bundles package.

This is now no longer the case. The nosh-bundles package now presets and starts
no services at all. *All* running of services must be achieved with the
nosh-run-* packages or some other sets of scripts and presets.

To this end, there are now two new packages, nosh-run-debian-desktop-base and
nosh-run-debian-server-base. These parallel the
nosh-run-{freebsd,trueos}-{desktop,server}-base packages already available since
1.29 for FreeBSD/TrueOS. You must install, for a working fully-nosh-managed
system, exactly one of the nosh-run-debian-{desktop,server}-base packages.

If you are running nosh service management under systemd, you can of course run
as many or as few services under the nosh service manager as you care to switch
over from systemd. But if you are running a fully-nosh-managed system these
packages will arrange to run the various fundamentals that one pretty much
cannot do without, such as mounting/unmounting volumes, running
udev/eudev/vdev/mdev, binfmt loading, and initializing the PRNG.

Log service account names
-------------------------

The naming scheme used for the user accounts for dedicated log service users has
changed.  Installing the new nosh-bundles package should automatically rename
all existing log service accounts to use the new scheme.

The new naming scheme is slightly more compact, and copes better with services
that have things like underscores and plus characters (e.g. powerd++) in their
names.

As an ancillary to this, system-control now has an "escape" subcommand which can
be (and indeed is) used in scripts to perform the escaping transformations.

More packages
-------------

There are now four more -shims packages, for commands whose names conflict with
commands from other packages: nosh-kbd-shims, nosh-bsd-shims, nosh-core-shims,
and nosh-execline-shims.

nosh-kbd-shims, for example, contains a chvt shim that is an alias for the (also
new) console-multiplexor-control command; with it, and suitable privileges to
access the virtual terminal's input queue, one can switch between multiplexed
user-space virtual terminals in much the same way as the old chvt command does
with kernel virtual terminals.

The Z Shell command-line completion for the various commands in the toolset
(system-control, svcadm, shutdown, svstat, and so forth), which has been
available to the people building from source for a while, is now also available
as a binary package.

Configuration import
--------------------

ldconfig on TrueOS is now properly handled.  In particular, the external
configuration import subsystem now correctly pulls in and converts all of the
ldconfig directories.  (TrueOS has a lot more things that require ldconfig
support than stock FreeBSD does.)

The configuration import subsystem also now handles instances of Percona server,
alongside MySQL and MariaDB.  Moreover, these are now handled by the same set of
service bundles, which always produce service bundles named mysql@*.  MySQL
version 5.7 or later is now assumed.

The configuration import subsystem now automatically generates OpenVPN service
bundles based upon the current OpenVPN configuration.

=======================
==== CAVE: OpenVPN ====
=======================

The upgrade process attempts to remove the old hardwired openvpn@server and
openvpn@client service bundles.  However, you might find remnants of these
service bundles lying around in /var/sv that you need to clean up by hand.

GOPHER
------

To accompany the new gopherd server in djbwares 5, there is a gopher6d service
bundle that runs it, serving up the same static files area as http6d, https6d,
and ftp4d do.

The FreeBSD, OpenBSD, and Debian package repositories can now be browsed with
GOPHER.  This is gopherd in action.  On the server side, generating the
index.gopher files is a fairly humdrum exercise in the use of redo (to
regenerate the indexes only when the directory contents change) and printf (to
construct the GOPHER format menus).

UCSPI-UNIX
----------

Two new UCSPI tools have been added to enable UCSPI-UNIX servers to listen on
and accept connections on AF_UNIX sequential packet sockets.  udevd is one such
server, and it is now handed its listening socket at startup rather than
expected to open its own.

From owner-freebsd-hackers@freebsd.org  Sun Apr  9 20:25:20 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 9409DD3634B
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Sun,  9 Apr 2017 20:25:20 +0000 (UTC)
 (envelope-from markmi@dsl-only.net)
Received: from asp.reflexion.net (outbound-mail-210-8.reflexion.net
 [208.70.210.8])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 4B5E383A
 for <freebsd-hackers@freebsd.org>; Sun,  9 Apr 2017 20:25:19 +0000 (UTC)
 (envelope-from markmi@dsl-only.net)
Received: (qmail 10386 invoked from network); 9 Apr 2017 20:26:17 -0000
Received: from unknown (HELO rtc-sm-01.app.dca.reflexion.local) (10.81.150.1)
 by 0 (rfx-qmail) with SMTP; 9 Apr 2017 20:26:17 -0000
Received: by rtc-sm-01.app.dca.reflexion.local
 (Reflexion email security v8.40.0) with SMTP;
 Sun, 09 Apr 2017 16:25:18 -0400 (EDT)
Received: (qmail 21557 invoked from network); 9 Apr 2017 20:25:17 -0000
Received: from unknown (HELO iron2.pdx.net) (69.64.224.71)
 by 0 (rfx-qmail) with (AES256-SHA encrypted) SMTP; 9 Apr 2017 20:25:17 -0000
Received: from [192.168.1.106] (c-76-115-7-162.hsd1.or.comcast.net
 [76.115.7.162])
 by iron2.pdx.net (Postfix) with ESMTPSA id 28AD4EC8630;
 Sun,  9 Apr 2017 13:25:17 -0700 (PDT)
Content-Type: text/plain; charset=us-ascii
Mime-Version: 1.0 (Mac OS X Mail 10.3 \(3273\))
Subject: Re: The arm64 fork-then-swap-out-then-swap-in failures: a program
 source for exploring them
From: Mark Millard <markmi@dsl-only.net>
In-Reply-To: <9DCAF95B-39A5-4346-88FC-6AFDEE8CF9BB@dsl-only.net>
Date: Sun, 9 Apr 2017 13:25:16 -0700
Cc: andrew@freebsd.org, freebsd-hackers@freebsd.org,
 freebsd-arm <freebsd-arm@freebsd.org>
Content-Transfer-Encoding: quoted-printable
Message-Id: <8FFE95AA-DB40-4D1E-A103-4BA9FCC6EDEE@dsl-only.net>
References: <4DEA2D76-9F27-426D-A8D2-F07B16575FB9@dsl-only.net>
 <163B37B0-55D6-498E-8F52-9A95C036CDFA@dsl-only.net>
 <08E7A5B0-8707-4479-9D7A-272C427FF643@dsl-only.net>
 <20170409122715.GF1788@kib.kiev.ua>
 <9D152170-5F19-47A2-A06A-66F83CA88A09@dsl-only.net>
 <9DCAF95B-39A5-4346-88FC-6AFDEE8CF9BB@dsl-only.net>
To: Konstantin Belousov <kostikbel@gmail.com>
X-Mailer: Apple Mail (2.3273)
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Apr 2017 20:25:20 -0000

[I've not tried building the kernel with
your patch yet.]

Top post of new, independent information.

Jordan Gordeev made a testing suggestion that got me to look
at kdumps of runs with jemalloc allocation sizes that fail
(14*1024) vs. work (14*1024+1).

Example comparison:

 2258 swaptesting6 0.000169 CALL  mmap(0,0x200000,0x3<PROT_READ|PROT_WRITE>,0x1002<MAP_PRIVATE|MAP_ANON>,0xffffffff,0)
 2258 swaptesting6 0.000047 RET   mmap 1080033280/0x40600000
vs.
 2325 swaptesting7 0.000091 CALL  mmap(0,0x200000,0x3<PROT_READ|PROT_WRITE>,0x1002<MAP_PRIVATE|MAP_ANON>,0xffffffff,0)
 2325 swaptesting7 0.000024 RET   mmap 1080033280/0x40600000

No difference. And so it goes.

What varies is the number of mmap's: the larger jemalloc allocation size
gets more mmap's for the same number of jemalloc allocations. (All the
mmap's from my program's explicit allocations are together, back-to-back,
with no other traced activity between.)

But varying the number of jemalloc allocations in the program varies the
number of mmap calls, yet the size of the individual jemalloc allocations
still makes the difference between failure (zeroed pages after
fork-then-swap) and success.

This problem is a complicated one to classify/isolate.

After the allocations there is not much activity visible in
kdump output. I traced with "-t +" and so avoided page fault
tracing but got most everything else.

I may have to ktrace the page faults for the two jemalloc
allocation sizes and see if anything stands out.

On 2017-Apr-9, at 11:24 AM, Mark Millard <markmi at dsl-only.net> wrote:

> On 2017-Apr-9, at 10:24 AM, Mark Millard <markmi at dsl-only.net> =
wrote:
>=20
>> On 2017-Apr-9, at 5:27 AM, Konstantin Belousov <kostikbel@gmail.com> =
wrote:
>>=20
>>> On Sat, Apr 08, 2017 at 06:02:00PM -0700, Mark Millard wrote:
>>>> [I've identified the code path involved is the arm64 small =
allocations
>>>> turning into zeros for later fork-then-swapout-then-back-in,
>>>> specifically the ongoing RES(ident memory) size decrease that
>>>> "top -PCwaopid" shows before the fork/swap sequence. Hopefully
>>>> I've also exposed enough related information for someone that
>>>> knows what they are doing to get started with a specific
>>>> investigation, looking for a fix. I'd like for a pine64+
>>>> 2GB to have buildworld complete despite the forking and
>>>> swapping involved (yep: for a time zero RES(ident memory) for
>>>> some processes involved in the build).]
>>>=20
>>> I was not able to follow the walls of text, but do not think that
>>> I pmap_ts_reference() is the real culprit there.
>>>=20
>>> Is my impression right that the issue occurs on fork, and looks as
>>> a memory corruption, where some page suddently becomes zero-filled ?
>>> And swapping seems to be involved ?  It is somewhat interesting to =
see
>>> if the problem is reproducable on non-arm64 machines, e.g. armv7 or =
amd64.
>>=20
>> Yes, yes, non-arm64 that I've tried works.
>>=20
>> But I think that the following extra detail my be of use: what top
>> shows for RES over time is also odd on arm64 (only) and the amount
>> of pages that are zeroed is proportional to the decrease in RES.
>>=20
>> In the test sequence:
>>=20
>> A) Allocate lots of 14 KiByte allocations and initialize the content =
of each
>> to non-zero. The example ends up with RES of about 265M.
>=20
> I did forget to list one important property: why I picked 14 KiBytes.
>=20
> A) Any allocation sizes <=3D 14 KiBytes that I've tried
>   gets the zero's problem in my arm64 contexts (bpim3 and rip3).
>=20
> B) Any allocation size >=3D 14 KiBYtes + 1 Byte that I've
>   tried works in those contexts.
>=20
> For the arm64 contexts that I use this happens to match with
> the jemalloc SMALL_MAXCLASS size boundary. When I looked it
> appeared that 14 Ki was the smallest SMALL_MAXCLASS value
> in jemalloc so it would always fit the category.
>=20
>> B) sleep some amount of time, I've been using well over 30 seconds =
here.
>>=20
>> C) fork
>>=20
>> D) sleep again (parent and child), also forcing swapping during the =
sleep
>>  (I used stress, manually run.)
>>=20
>> E) Test the memory pattern in the parent and child process, passing =
over
>>  all the bytes, failed and good.
>>=20
>> Both the parent and the child in (E) see the first pages allocated as =
zero,
>> with the number of pages being zero increasing as the sleep time in =
(B)
>> increases (as long as the sleep is over 30 sec or so). The parent and =
child
>> match for which pages are zero vs. not.
>>=20
>> It fails with (B) being a no-op as well. But the proportionality with
>> the time for the sleep is interesting.
>>=20
>> During (B) "top -PCwaopid" shows RES decreasing, starting after 30 =
sec
>> or so. The fork in (C) produces a child that does not have the same =
RES
>> as the parent but instead a tiny RES (80K as I remember). During (E)
>> the child's RES increases to full size.
>>=20
>> My powerpc64, armv7, and amd64 tests of such do not fail, nor does =
RES
>> decrease during (B). The child process gets the same RES as the =
parent
>> as well, unlike for arm64.
>>=20
>> In the failing context (arm64) RES in the parent decreases during (D)
>> before the swap-out as well.
>>=20
>>> If answers to my two questions are yes, there is probably some bug =
with
>>> arm64 pmap handling of the dirty bit emulation.  ARMv8.0 does not =
provide
>>> hardware dirty bit, and pmap interprets an accessed writeable page =
as
>>> unconditionally dirty.  More, accessed bit is also not maintained by
>>> hardware, instead if should be set by pmap.  And arm64 pmap sets the
>>> AF bit unconditionally when creating valid pte.
>>=20
>> fork-then-swap-out/in is required to see the problem. Neither fork
>> by itself nor swapping (zero RES as shown in top) by itself have
>> shown the problem so far.
>>=20
>>> Hmm, could you try the following patch, I did not even compiled it.
>>=20
>> I'll try it later today.
>>=20
>>> diff --git a/sys/arm64/arm64/pmap.c b/sys/arm64/arm64/pmap.c
>>> index 3d5756ba891..55aa402eb1c 100644
>>> --- a/sys/arm64/arm64/pmap.c
>>> +++ b/sys/arm64/arm64/pmap.c
>>> @@ -2481,6 +2481,11 @@ pmap_protect(pmap_t pmap, vm_offset_t sva, =
vm_offset_t eva, vm_prot_t prot)
>>> 		    sva +=3D L3_SIZE) {
>>> 			l3 =3D pmap_load(l3p);
>>> 			if (pmap_l3_valid(l3)) {
>>> +				if ((l3 & ATTR_SW_MANAGED) &&
>>> +				    pmap_page_dirty(l3)) {
>>> +					vm_page_dirty(PHYS_TO_VM_PAGE(l3 =
&
>>> +					    ~ATTR_MASK));
>>> +				}
>>> 				pmap_set(l3p, ATTR_AP(ATTR_AP_RO));
>>> 				PTE_SYNC(l3p);
>>> 				/* XXX: Use pmap_invalidate_range */

=3D=3D=3D
Mark Millard
markmi at dsl-only.net


From owner-freebsd-hackers@freebsd.org  Sun Apr  9 22:10:01 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 24872D3683F
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Sun,  9 Apr 2017 22:10:01 +0000 (UTC)
 (envelope-from alfred@freebsd.org)
Received: from elvis.mu.org (elvis.mu.org [IPv6:2001:470:1f05:b76::196])
 by mx1.freebsd.org (Postfix) with ESMTP id 16E07A4C
 for <freebsd-hackers@freebsd.org>; Sun,  9 Apr 2017 22:10:01 +0000 (UTC)
 (envelope-from alfred@freebsd.org)
Received: from Alfreds-MacBook-Pro-2.local (unknown
 [IPv6:2601:645:8003:a4d6:80a8:3cdd:4e29:76fd])
 by elvis.mu.org (Postfix) with ESMTPSA id B19E9346DDF5
 for <freebsd-hackers@freebsd.org>; Sun,  9 Apr 2017 15:10:00 -0700 (PDT)
Subject: Re: One Priority Per Run Queue
To: freebsd-hackers@freebsd.org
References: <1aafd6a2-828c-06f5-bdac-e4c953a403b5@FreeBSD.org>
 <CANCZdfogvSXzHT33JxzvZ+0h8BjRetM=-vkPjzhyNJgznPhAnQ@mail.gmail.com>
From: Alfred Perlstein <alfred@freebsd.org>
Organization: FreeBSD
Message-ID: <836da108-7e25-fc94-2c84-bc2f85bb6398@freebsd.org>
Date: Sun, 9 Apr 2017 15:10:33 -0700
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:45.0)
 Gecko/20100101 Thunderbird/45.8.0
MIME-Version: 1.0
In-Reply-To: <CANCZdfogvSXzHT33JxzvZ+0h8BjRetM=-vkPjzhyNJgznPhAnQ@mail.gmail.com>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 09 Apr 2017 22:10:01 -0000



On 3/29/17 2:18 PM, Warner Losh wrote:
> On Wed, Mar 29, 2017 at 2:00 PM, Eric van Gyzen <vangyzen@freebsd.org> wrote:
>> The FreeBSD schedulers assign four priorities to each run queue, making
>> those priorities effectively equal.  This breaks POSIX real-time priorities.
>>
>> Applications that use real-time scheduling use sched_get_priority_min()
>> and sched_get_priority_max() [0] to determine the available range of
>> priorities, and then use simple arithmetic to assign relatively higher
>> or lower priorities.  If an application configures two threads with
>> priorities MAX and MAX-1 (for example), POSIX says the thread at
>> priority MAX must be chosen if it is runnable.  Since our implementation
>> puts these two priorities in the same run queue, it may choose either
>> thread, so it does not conform.
>>
>> The above functions currently return 0 and 31, respectively.  One
>> solution would change max() to return 7 and change other code to
>> translate the 8 POSIX values into the 32 FreeBSD values.  However, this
>> would also not conform, because "conforming implementations shall
>> provide a priority range of at least 32 priorities for this policy." [1]
>>
>> I propose that we assign one priority per run queue:
>>
>>          https://reviews.freebsd.org/D10188
>>
>> This would conform to POSIX.  On a certain commercial block storage
>> product, this change made no difference in performance.  Benchmarks of
>> buildworld on two different machines actually showed a tiny improvement
>> in performance. [2]
>>
>> Please test the above change, especially if you have an interesting
>> workload that might be sensitive to scheduler behavior.  If you already
>> know this change would cause problems, please point me toward the details.
>>
>> Assigning 4 priorities per run queue also caused a recent portability
>> issue in ZFS, although that was fixed by r314058.
> How does this scheme prevent starvation of low priority processes? Or
> rather, how will this change after this change.
>
It would seem that for userland this should allow for starvation, as 
that's the point.  However, once inside the kernel with any locks 
taken, you must at minimum do priority lending or bump the priority 
higher; otherwise you can cause deadlock.  I thought we already do this...?

-Alfred



From owner-freebsd-hackers@freebsd.org  Mon Apr 10 00:10:13 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id A32E6D36D83
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Mon, 10 Apr 2017 00:10:13 +0000 (UTC)
 (envelope-from markmi@dsl-only.net)
Received: from asp.reflexion.net (outbound-mail-210-4.reflexion.net
 [208.70.210.4])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 491DAAA5
 for <freebsd-hackers@freebsd.org>; Mon, 10 Apr 2017 00:10:12 +0000 (UTC)
 (envelope-from markmi@dsl-only.net)
Received: (qmail 25736 invoked from network); 10 Apr 2017 00:10:11 -0000
Received: from unknown (HELO mail-cs-01.app.dca.reflexion.local) (10.81.19.1)
 by 0 (rfx-qmail) with SMTP; 10 Apr 2017 00:10:11 -0000
Received: by mail-cs-01.app.dca.reflexion.local
 (Reflexion email security v8.40.0) with SMTP;
 Sun, 09 Apr 2017 20:10:11 -0400 (EDT)
Received: (qmail 24322 invoked from network); 10 Apr 2017 00:10:10 -0000
Received: from unknown (HELO iron2.pdx.net) (69.64.224.71)
 by 0 (rfx-qmail) with (AES256-SHA encrypted) SMTP; 10 Apr 2017 00:10:10 -0000
Received: from [192.168.1.106] (c-76-115-7-162.hsd1.or.comcast.net
 [76.115.7.162])
 by iron2.pdx.net (Postfix) with ESMTPSA id 46C38EC7901;
 Sun,  9 Apr 2017 17:10:10 -0700 (PDT)
Content-Type: text/plain; charset=us-ascii
Mime-Version: 1.0 (Mac OS X Mail 10.3 \(3273\))
Subject: Re: The arm64 fork-then-swap-out-then-swap-in failures: a program
 source for exploring them
From: Mark Millard <markmi@dsl-only.net>
In-Reply-To: <8FFE95AA-DB40-4D1E-A103-4BA9FCC6EDEE@dsl-only.net>
Date: Sun, 9 Apr 2017 17:10:09 -0700
Cc: andrew@freebsd.org, freebsd-hackers@freebsd.org,
 freebsd-arm <freebsd-arm@freebsd.org>
Content-Transfer-Encoding: quoted-printable
Message-Id: <89D6D677-3BE2-45E2-A902-CC6A0305F3F9@dsl-only.net>
References: <4DEA2D76-9F27-426D-A8D2-F07B16575FB9@dsl-only.net>
 <163B37B0-55D6-498E-8F52-9A95C036CDFA@dsl-only.net>
 <08E7A5B0-8707-4479-9D7A-272C427FF643@dsl-only.net>
 <20170409122715.GF1788@kib.kiev.ua>
 <9D152170-5F19-47A2-A06A-66F83CA88A09@dsl-only.net>
 <9DCAF95B-39A5-4346-88FC-6AFDEE8CF9BB@dsl-only.net>
 <8FFE95AA-DB40-4D1E-A103-4BA9FCC6EDEE@dsl-only.net>
To: Konstantin Belousov <kostikbel@gmail.com>
X-Mailer: Apple Mail (2.3273)
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 10 Apr 2017 00:10:13 -0000

On 2017-Apr-9, at 10:24 AM, Mark Millard <markmi at dsl-only.net> wrote:

> On 2017-Apr-9, at 5:27 AM, Konstantin Belousov <kostikbel@gmail.com> =
wrote:

>=20
>> Hmm, could you try the following patch, I did not even compiled it.
>=20
> I'll try it later today.
>=20
>> diff --git a/sys/arm64/arm64/pmap.c b/sys/arm64/arm64/pmap.c
>> index 3d5756ba891..55aa402eb1c 100644
>> --- a/sys/arm64/arm64/pmap.c
>> +++ b/sys/arm64/arm64/pmap.c
>> @@ -2481,6 +2481,11 @@ pmap_protect(pmap_t pmap, vm_offset_t sva, =
vm_offset_t eva, vm_prot_t prot)
>> 		    sva +=3D L3_SIZE) {
>> 			l3 =3D pmap_load(l3p);
>> 			if (pmap_l3_valid(l3)) {
>> +				if ((l3 & ATTR_SW_MANAGED) &&
>> +				    pmap_page_dirty(l3)) {
>> +					vm_page_dirty(PHYS_TO_VM_PAGE(l3 =
&
>> +					    ~ATTR_MASK));
>> +				}
>> 				pmap_set(l3p, ATTR_AP(ATTR_AP_RO));
>> 				PTE_SYNC(l3p);
>> 				/* XXX: Use pmap_invalidate_range */


Preliminary testing indicates that this fixes the
some-pages-become-zero problem for fork-then-swapout/in.

Thanks!

I'll see if a buildworld can go through without being stopped
by this type of issue. But that will take a while. (It is how
I originally ran into the problem(s) that others had been
reporting on the lists.)


Side notes:

The decreasing-RES(ident memory) behavior was unchanged.

The "child gets only 80K RES initially" behavior was also
unchanged.

(These are as shown by "top -PCwaopid" . These are just
differences with what I see for other TARGET_ARCH's.)

=3D=3D=3D
Mark Millard
markmi at dsl-only.net



From owner-freebsd-hackers@freebsd.org  Mon Apr 10 01:28:38 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 244F6D3571E
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Mon, 10 Apr 2017 01:28:38 +0000 (UTC)
 (envelope-from ablacktshirt@gmail.com)
Received: from mail-oi0-x242.google.com (mail-oi0-x242.google.com
 [IPv6:2607:f8b0:4003:c06::242])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id CC868350
 for <freebsd-hackers@freebsd.org>; Mon, 10 Apr 2017 01:28:37 +0000 (UTC)
 (envelope-from ablacktshirt@gmail.com)
Received: by mail-oi0-x242.google.com with SMTP id w197so9934847oiw.1
 for <freebsd-hackers@freebsd.org>; Sun, 09 Apr 2017 18:28:37 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=subject:to:references:cc:from:message-id:date:user-agent
 :mime-version:in-reply-to:content-transfer-encoding;
 bh=m7+gMY3lTT1hn2K6qNpp4NVm7bDJCwBzlsnyQ7RhTrE=;
 b=u3LNmAxNLhVRNJoBfwUIyPRTLFOqfLiFvBIYYzpGoJ7NM99PnPlprQ4nGbTD/TjV8q
 W6m81iYn3bAsVD/wvnMexgHx1tMruvi0M60ir4lNfMDpXbg7Go3jeIvX2yMc/WmRjTYx
 8SScBBCcM6EjApMObiho0a+Q6MW5t+I2/QxoMuCvwlcY7qJkiiuidxChlR++yXj5cez7
 7iP8tivsF4Ha9X3DMWIGmUEJojype/I2q7TiGqOmkdUaIiSrI76NgaQy4njcoI1PeAyt
 8dQ4vvEJ++4KvHr+3EipmhlDXF9+K4OiwFgNNix/tma8GadA18BpaNrmTvcla1tIYe8D
 RbzQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:subject:to:references:cc:from:message-id:date
 :user-agent:mime-version:in-reply-to:content-transfer-encoding;
 bh=m7+gMY3lTT1hn2K6qNpp4NVm7bDJCwBzlsnyQ7RhTrE=;
 b=tmsJlzFYpfhsF9Oix7p6IoGT4Nu+7pecdsyIbe2kbfVNrQpt9AvX8liYrJ6ubfdtH5
 RxD6B8kaf/agZiDqba9unarU/qJDGoLd5Eoauq/VqU6Ot+0bLLuXK+PJCwvUyO0e0PN6
 6fD5kEqEOY88Zgk0WQnP8j38gZhTWg7YC/SI/vYUWvR/5rsJnFKpMelqo5e0ffo/kF2M
 18aZRa8Zz7PmauDZR4bfwIcVl+fJlm4nMLPojUInKtBMvjnDQ8eMs/hIO5LUkgyn5rIQ
 jyb2edBkbp4barJOTIPS07HMhSINRV48uIm4MOUo6YcU05q3SfGXP9P9ZFYlSW94WIYA
 mEWA==
X-Gm-Message-State: AFeK/H0DOo7y1rm2KTLmrueWFt1FkdXtgwsuRZ9JSx2Iinymc38Iyb1U8lNfOXTEt7xm0A==
X-Received: by 10.202.228.17 with SMTP id b17mr26697188oih.212.1491787716997; 
 Sun, 09 Apr 2017 18:28:36 -0700 (PDT)
Received: from [192.168.0.100] ([110.64.91.54])
 by smtp.gmail.com with ESMTPSA id v49sm5638242otb.13.2017.04.09.18.28.34
 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Sun, 09 Apr 2017 18:28:35 -0700 (PDT)
Subject: Re: Understanding the FreeBSD locking mechanism
To: Ryan Stone <rysto32@gmail.com>
References: <e99b6366-7d30-a889-b7db-4a3b3133ff5e@gmail.com>
 <CABh_MKkbVVi+gTkaBVDvVfRggS6pbHKJE_VbYBZpAaTCZ81b7Q@mail.gmail.com>
 <c72c0ee3-328d-3efc-e8a0-4d6c0d5c8cee@gmail.com>
 <CAFMmRNwWnaq-4vEDCByqdUzWfoiZeN0nM_M5rt8ST0P8xnUTsA@mail.gmail.com>
Cc: Ed Schouten <ed@nuxi.nl>,
 "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>
From: Yubin Ruan <ablacktshirt@gmail.com>
Message-ID: <3f93930c-7f10-4d0b-35f2-2b07d64081f0@gmail.com>
Date: Mon, 10 Apr 2017 09:28:25 +0800
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:45.0) Gecko/20100101
 Thunderbird/45.7.0
MIME-Version: 1.0
In-Reply-To: <CAFMmRNwWnaq-4vEDCByqdUzWfoiZeN0nM_M5rt8ST0P8xnUTsA@mail.gmail.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 10 Apr 2017 01:28:38 -0000

On 2017/4/10 0:24, Ryan Stone wrote:
>
>
> On Sun, Apr 9, 2017 at 6:13 AM, Yubin Ruan <ablacktshirt@gmail.com
> <mailto:ablacktshirt@gmail.com>> wrote:
>
>
>     #######1, spinlock used in an interrupt handler
>     If a thread A holding a spinlock T get interrupted and the interrupt
>     handler responsible for this interrupt try to acquire T, then we have
>     deadlock, because A would never have a chance to run before the
>     interrupt handler return, and the interrupt handler, unfortunately,
>     will continue to spin ... so in this situation, one has to disable
>     interrupt before spinning.
>
>     As far as I know, in Linux, they provide two kinds of spinlocks:
>
>       spin_lock(..);   /* spinlock that does not disable interrupts */
>       spin_lock_irqsave(...); /* spinlock that disable local interrupt *
>
>
> In the FreeBSD locking style, a spinlock is only used in the case where
> one needs to synchronize with an interrupt handler.  This is why spinlocks
> always disable local interrupts in FreeBSD.
>
> FreeBSD's lock for the first case is the MTX_DEF mutex, which is an
> adaptively-spinning blocking mutex implementation.  In short, the MTX_DEF
> mutex will spin waiting for the lock if the owner is running, but will
> block if the owner is descheduled.  This prevents expensive trips through
> the scheduler for the common case where the mutex is only held for short
> periods, without wasting CPU cycles spinning in cases where the owner thread
> is descheduled and therefore will not be completing soon.

Great explanation! I read the man page at:

 > 
https://www.freebsd.org/cgi/man.cgi?query=mutex&sektion=9&apropos=0&manpath=FreeBSD+11.0-RELEASE+and+Ports

and am now clear about MTX_DEF and MTX_SPIN mutexes. But I still have a
few more questions, if you don't mind:

Is it true that a thread holding a MTX_DEF mutex can be descheduled?
(Shouldn't it disable interrupts like a MTX_SPIN mutex?) It is said on
the man page that the MTX_DEF mutex is used by default in FreeBSD, so its
use case must be very common. If a thread holding a MTX_DEF mutex can be
descheduled, which means that it did not disable interrupts, then we may
have lots of deadlocks here, right?

>
>     #######2, priority inversion problem
>     If thread B with a higher priority get in and try to acquire the lock
>     that thread A currently holds, then thread B would spin, while at the
>     same time thread A has no chance to run because it has lower priority,
>     thus not being able to release the lock.
>     (I haven't investigate enough into the source code, so I don't know
>     how FreeBSD and Linux handle this priority inversion problem. Maybe
>     they use priority inheritance or random boosting?)
>
>
> FreeBSD's spin locks prevent priority inversion by preventing the holder
> thread from being descheduled.
>
> MTX_DEF locks implement priority inheritance.

Nice hints. Thanks!

regards,
Yubin Ruan


From owner-freebsd-hackers@freebsd.org  Mon Apr 10 01:51:47 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 3C7B3D35DA0
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Mon, 10 Apr 2017 01:51:47 +0000 (UTC)
 (envelope-from wlosh@bsdimp.com)
Received: from mail-io0-x242.google.com (mail-io0-x242.google.com
 [IPv6:2607:f8b0:4001:c06::242])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 06683FED
 for <freebsd-hackers@freebsd.org>; Mon, 10 Apr 2017 01:51:46 +0000 (UTC)
 (envelope-from wlosh@bsdimp.com)
Received: by mail-io0-x242.google.com with SMTP id 68so13145607ioh.3
 for <freebsd-hackers@freebsd.org>; Sun, 09 Apr 2017 18:51:46 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=bsdimp-com.20150623.gappssmtp.com; s=20150623;
 h=mime-version:sender:in-reply-to:references:from:date:message-id
 :subject:to:cc;
 bh=MFIO27u4jXiPmjPOhFGd/MFX/n4NDaWQVAaq0tNGpmw=;
 b=eX/lkFirGcuoOEUaDvNFUCN3nP38/78LypHmxs66SbLaoIkhr1pZTlE7ATDFKsG7E6
 xujQH4uFK3cVxwkPx7SswQDFxde86dkDNwEgLeoYcyiPSpcH8quJr2I6tzU/kluyfNPj
 lp8UOXS+8ULo6HkgQWfZggXh97eDuWTzvHkDlSoXqs8NAJQ3y+4aVf9yIflNfvcAG813
 ig+zcU1t93J2P4YMnE+jGLuHPvXxk970k8SbPTJXbtdTHEWZOSGr3dYfklT8w7aYnm9W
 F48XXhk9+k2FfjtxCWOVw2jy5fVmDC5ANiZsXcF3qJAAALQmtlwP3mPQceZwGGGh6x5r
 xQtQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:mime-version:sender:in-reply-to:references:from
 :date:message-id:subject:to:cc;
 bh=MFIO27u4jXiPmjPOhFGd/MFX/n4NDaWQVAaq0tNGpmw=;
 b=XpjKy2TdhZ4uhv4+bQ6n9M0nUgLRlBkerE75tQhyWZdTutzYM4WkYIb6XeUJobQKqW
 fXZIZfnlrXm010Ms3aAu7qOGNZKW8j8poB9CAAvhgtuLa/+w9nS7u2JuvktBHyaUNqmV
 o72x2mzPYmPaVYMzievUnsGxbib8v6X2bU6tQLDMMJ0/wcqFFw9DA0Sl64NPLFV1rjv4
 RFhwK+taekhfq7Czxb8c1GvMpBv53ANOpmry/Y5ymLLSTyibh3ThJ3HxqbPRgrAmRo2T
 RT92i+79h/xPvGcIpccoJyZ/R8BVZbrUVCfG+80UvMJV0o9DYe/XzJhVDCny9f3minDz
 wt9w==
X-Gm-Message-State: AN3rC/7sJAYQACOCsE+rer4KUETvOQOWDHTB7tHjSbHXGYBJFXeMyVAWCqhq+o8mitMWpWTksvOPdUChRrN6aA==
X-Received: by 10.107.134.76 with SMTP id i73mr5787258iod.0.1491789105993;
 Sun, 09 Apr 2017 18:51:45 -0700 (PDT)
MIME-Version: 1.0
Sender: wlosh@bsdimp.com
Received: by 10.79.146.24 with HTTP; Sun, 9 Apr 2017 18:51:45 -0700 (PDT)
X-Originating-IP: [2607:fb10:7021:1::b517]
In-Reply-To: <3f93930c-7f10-4d0b-35f2-2b07d64081f0@gmail.com>
References: <e99b6366-7d30-a889-b7db-4a3b3133ff5e@gmail.com>
 <CABh_MKkbVVi+gTkaBVDvVfRggS6pbHKJE_VbYBZpAaTCZ81b7Q@mail.gmail.com>
 <c72c0ee3-328d-3efc-e8a0-4d6c0d5c8cee@gmail.com>
 <CAFMmRNwWnaq-4vEDCByqdUzWfoiZeN0nM_M5rt8ST0P8xnUTsA@mail.gmail.com>
 <3f93930c-7f10-4d0b-35f2-2b07d64081f0@gmail.com>
From: Warner Losh <imp@bsdimp.com>
Date: Sun, 9 Apr 2017 19:51:45 -0600
X-Google-Sender-Auth: UCHBvUdt3vKI0KRKbrTdZaRoxUE
Message-ID: <CANCZdfoyjcSU+NHEVJF=bd8xz-Q-H1EupMPX+Jk45r3DKZ9F9Q@mail.gmail.com>
Subject: Re: Understanding the FreeBSD locking mechanism
To: Yubin Ruan <ablacktshirt@gmail.com>
Cc: Ryan Stone <rysto32@gmail.com>, 
 "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>,
 Ed Schouten <ed@nuxi.nl>
Content-Type: text/plain; charset=UTF-8
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 10 Apr 2017 01:51:47 -0000

On Sun, Apr 9, 2017 at 7:28 PM, Yubin Ruan <ablacktshirt@gmail.com> wrote:
> On 2017/4/10 0:24, Ryan Stone wrote:
>>
>>
>>
>> On Sun, Apr 9, 2017 at 6:13 AM, Yubin Ruan <ablacktshirt@gmail.com
>> <mailto:ablacktshirt@gmail.com>> wrote:
>>
>>
>>     #######1, spinlock used in an interrupt handler
>>     If a thread A holding a spinlock T gets interrupted and the interrupt
>>     handler responsible for this interrupt tries to acquire T, then we have
>>     a deadlock, because A would never have a chance to run before the
>>     interrupt handler returns, and the interrupt handler, unfortunately,
>>     will continue to spin ... so in this situation, one has to disable
>>     interrupts before spinning.
>>
>>     As far as I know, in Linux, they provide two kinds of spinlocks:
>>
>>       spin_lock(..);   /* spinlock that does not disable interrupts */
>>       spin_lock_irqsave(...); /* spinlock that disables local interrupts */
>>
>>
>> In the FreeBSD locking style, a spinlock is only used in the case where
>> one needs to synchronize with an interrupt handler.  This is why spinlocks
>> always disable local interrupts in FreeBSD.
>>
>> FreeBSD's lock for the first case is the MTX_DEF mutex, which is an
>> adaptively-spinning blocking mutex implementation.  In short, the MTX_DEF
>> mutex will spin waiting for the lock if the owner is running, but will
>> block if the owner is descheduled.  This prevents expensive trips through
>> the scheduler for the common case where the mutex is only held for short
>> periods, without wasting CPU cycles spinning in cases where the owner
>> thread is descheduled and therefore will not be completing soon.
>
>
> Great explanation! I read the man page at:
>
>>
>> https://www.freebsd.org/cgi/man.cgi?query=mutex&sektion=9&apropos=0&manpath=FreeBSD+11.0-RELEASE+and+Ports
>
> and am now clear about MTX_DEF and MTX_SPIN mutexes. But I still have a
> few more questions, if you don't mind:
>
> Is it true that a thread holding an MTX_DEF mutex can be descheduled?
> (Shouldn't it disable interrupts like an MTX_SPIN mutex?) It is said on
> the man page that the MTX_DEF mutex is used by default in FreeBSD, so its
> use case must be very common. If a thread holding an MTX_DEF mutex can be
> descheduled, which means that it did not disable interrupts, then we may
> have lots of deadlocks here, right?

Yes, they can be descheduled. But that's not a problem. No other
thread can acquire the MTX_DEF lock. If another thread tries, it will
sleep and wait for the thread that holds the MTX_DEF lock to release
it. Eventually, the thread will get time to run again, and then
release the lock. Threads that just hold an MTX_DEF lock may also
migrate from CPU to CPU.
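
As a rough illustration of the spin-or-block behavior described in this
thread, here is a minimal userspace sketch in C. It is not the kernel's
mutex(9) code: the sim_* names, the on_cpu flag, and the three-way result
are all hypothetical, made up only to show the decision a contending
thread makes when it finds the lock held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are kernel types. */
struct sim_thread {
    int  id;
    bool on_cpu;        /* true while the thread is running on some CPU */
};

struct sim_mutex {
    struct sim_thread *owner;   /* NULL when the mutex is free */
};

enum sim_action { SIM_ACQUIRED, SIM_WAIT_SPIN, SIM_WAIT_BLOCK };

/* One attempt to take the mutex; reports what the caller would do next. */
enum sim_action
sim_mtx_lock_step(struct sim_mutex *m, struct sim_thread *self)
{
    if (m->owner == NULL) {
        m->owner = self;        /* uncontested: just take it */
        return SIM_ACQUIRED;
    }
    if (m->owner->on_cpu)
        return SIM_WAIT_SPIN;   /* owner is running: likely done soon */
    return SIM_WAIT_BLOCK;      /* owner is descheduled: stop burning CPU */
}

void
sim_mtx_unlock(struct sim_mutex *m, struct sim_thread *self)
{
    assert(m->owner == self);   /* only the owner may release */
    m->owner = NULL;
}
```

Either way the waiter makes no progress until the owner runs again and
releases the lock, which is why a descheduled owner delays waiters but
does not deadlock them.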

Warner

>>     #######2, priority inversion problem
>>     If thread B with a higher priority gets in and tries to acquire the
>>     lock that thread A currently holds, then thread B would spin, while at
>>     the same time thread A has no chance to run because it has lower
>>     priority, thus not being able to release the lock.
>>     (I haven't investigated the source code enough, so I don't know
>>     how FreeBSD and Linux handle this priority inversion problem. Maybe
>>     they use priority inheritance or random boosting?)
>>
>>
>> FreeBSD's spin locks prevent priority inversion by preventing the holder
>> thread from being descheduled.
>>
>> MTX_DEF locks implement priority inheritance.
>
>
> Nice hints. Thanks!
>
> regards,
> Yubin Ruan
>
>
> _______________________________________________
> freebsd-hackers@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"

From owner-freebsd-hackers@freebsd.org  Mon Apr 10 02:01:53 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 50B2CD32010;
 Mon, 10 Apr 2017 02:01:53 +0000 (UTC)
 (envelope-from alan.l.cox@gmail.com)
Received: from mail-io0-x22e.google.com (mail-io0-x22e.google.com
 [IPv6:2607:f8b0:4001:c06::22e])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 157446E7;
 Mon, 10 Apr 2017 02:01:53 +0000 (UTC)
 (envelope-from alan.l.cox@gmail.com)
Received: by mail-io0-x22e.google.com with SMTP id l7so79076932ioe.3;
 Sun, 09 Apr 2017 19:01:53 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=mime-version:reply-to:in-reply-to:references:from:date:message-id
 :subject:to:cc;
 bh=z6n7AHYe/CeqWlfBZJM8VcXFSXQNpkIfC+ppK2DMQPo=;
 b=nXg6GCeXhiclxm7DtE3Y6Vbp6HgKF2WMa6ylfbPHzpi3uz+7fsP0VGz7iZwW/+2M4c
 B6OlsIym84KwZbSOtXVKYdcWfF+zquLbTzrStecnhDeVh4UG0Y2tUt3YFsMu7aGF61fH
 ygCS/E1EUxLgWsQFXmRBfYhCR5x8AOjvTGBTWFSDcaJEYNmG1gqTOT12gx/C6bMTrXKF
 ZUMtrLRMMEPxZ+ZGcrWRibX8LfsaFGP+uGReTHzRhTJJifRgG+HR8jB/oSTLChOqsdfW
 OwsVLx2Gy0e5DeES1N+43LzSIHTsO5lBZviBAAXxCLOQKo38EPc5Ej1JayOfHOavRSNk
 yO5w==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:mime-version:reply-to:in-reply-to:references
 :from:date:message-id:subject:to:cc;
 bh=z6n7AHYe/CeqWlfBZJM8VcXFSXQNpkIfC+ppK2DMQPo=;
 b=GMt/FUcJb3P0Vv7RGSMLLXK47LZkHqgculVM3w1h6XuhsIUZIElE+iFHv0yY6P3UiS
 yA8/sxc+iOcSoCYP1TTqqMz79UIBvVpeINhjQWmuERDdQTFhiGLPOiljWT2WWTptubOp
 r3qaXxK/Hi6lf9z/Twv84CFt/ZfRMYYszi1rKiFbPBA5niG2+ASvA85lFynrOUzn8GjY
 tOn6c2Mn7GQ51uv0tbsLek9FCY6n3OUekQkS0NbEaE6b3q5t3OMiegSg+zDaPU4U7dZB
 uQMIQf7bHuh1sWWLZ7NmPRnjtL99MS2jY0cP8ebOo421Uh69uW9vuBfX/7ioKz+zqAqx
 vdig==
X-Gm-Message-State: AN3rC/5Xiet9lwe3Yexe2t4qbzEqeKjWGf4w4ZjDplCRVYnZ0TfWk40K
 /62Qx34QWrLvvbQEfBp1PuAgZHeQrg==
X-Received: by 10.36.36.131 with SMTP id f125mr10145622ita.45.1491789712474;
 Sun, 09 Apr 2017 19:01:52 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.79.15.130 with HTTP; Sun, 9 Apr 2017 19:01:51 -0700 (PDT)
Reply-To: alc@freebsd.org
In-Reply-To: <89D6D677-3BE2-45E2-A902-CC6A0305F3F9@dsl-only.net>
References: <4DEA2D76-9F27-426D-A8D2-F07B16575FB9@dsl-only.net>
 <163B37B0-55D6-498E-8F52-9A95C036CDFA@dsl-only.net>
 <08E7A5B0-8707-4479-9D7A-272C427FF643@dsl-only.net>
 <20170409122715.GF1788@kib.kiev.ua>
 <9D152170-5F19-47A2-A06A-66F83CA88A09@dsl-only.net>
 <9DCAF95B-39A5-4346-88FC-6AFDEE8CF9BB@dsl-only.net>
 <8FFE95AA-DB40-4D1E-A103-4BA9FCC6EDEE@dsl-only.net>
 <89D6D677-3BE2-45E2-A902-CC6A0305F3F9@dsl-only.net>
From: Alan Cox <alan.l.cox@gmail.com>
Date: Sun, 9 Apr 2017 21:01:51 -0500
Message-ID: <CAJUyCcO1j6KxHRP_Azzb73JZb-Cqg=83zAYemjUQEnTp1t=dFA@mail.gmail.com>
Subject: Re: The arm64 fork-then-swap-out-then-swap-in failures: a program
 source for exploring them
To: Mark Millard <markmi@dsl-only.net>
Cc: Konstantin Belousov <kostikbel@gmail.com>, andrew@freebsd.org, 
 freebsd-hackers <freebsd-hackers@freebsd.org>,
 freebsd-arm <freebsd-arm@freebsd.org>
Content-Type: text/plain; charset=UTF-8
X-Content-Filtered-By: Mailman/MimeDel 2.1.23
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 10 Apr 2017 02:01:53 -0000

On Sun, Apr 9, 2017 at 7:10 PM, Mark Millard <markmi@dsl-only.net> wrote:

> On 2017-Apr-9, at 10:24 AM, Mark Millard <markmi at dsl-only.net> wrote:
>
> > On 2017-Apr-9, at 5:27 AM, Konstantin Belousov <kostikbel@gmail.com> wrote:
>
> >
> >> Hmm, could you try the following patch? I have not even compiled it.
> >
> > I'll try it later today.
> >
> >> diff --git a/sys/arm64/arm64/pmap.c b/sys/arm64/arm64/pmap.c
> >> index 3d5756ba891..55aa402eb1c 100644
> >> --- a/sys/arm64/arm64/pmap.c
> >> +++ b/sys/arm64/arm64/pmap.c
> >> @@ -2481,6 +2481,11 @@ pmap_protect(pmap_t pmap, vm_offset_t sva, vm_offset_t eva, vm_prot_t prot)
> >>                  sva += L3_SIZE) {
> >>                      l3 = pmap_load(l3p);
> >>                      if (pmap_l3_valid(l3)) {
> >> +                            if ((l3 & ATTR_SW_MANAGED) &&
> >> +                                pmap_page_dirty(l3)) {
> >> +                                    vm_page_dirty(PHYS_TO_VM_PAGE(l3 &
> >> +                                        ~ATTR_MASK));
> >> +                            }
> >>                              pmap_set(l3p, ATTR_AP(ATTR_AP_RO));
> >>                              PTE_SYNC(l3p);
> >>                              /* XXX: Use pmap_invalidate_range */
>
>
> Preliminary testing indicates that this fixes the
> some-pages-become-zero problem for fork-then-swapout/in.
>
> Thanks!
>
> I'll see if a buildworld can go through without being stopped
> by the type of issue. But that will take a while. (It is how
> I originally ran into the problem(s) that others had been
> reporting on the lists.)
>
>
> Side notes:
>
> The decreasing-RES(ident memory) behavior was unchanged.
>
> The "child gets only 80K RES initially" behavior was also
> unchanged.
>
>
That is because the arm64 pmap doesn't implement pmap_copy().


> (These are as shown by "top -PCwaopid" . These are just
> differences with what I see for other TARGET_ARCH's.)
>
> ===
> Mark Millard
> markmi at dsl-only.net
>
>

From owner-freebsd-hackers@freebsd.org  Mon Apr 10 02:20:25 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 2FEB9D324CA
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Mon, 10 Apr 2017 02:20:25 +0000 (UTC)
 (envelope-from torek@elf.torek.net)
Received: from elf.torek.net (mail.torek.net [96.90.199.121])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client CN "elf.torek.net", Issuer "elf.torek.net" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id 04753E5C
 for <freebsd-hackers@freebsd.org>; Mon, 10 Apr 2017 02:20:24 +0000 (UTC)
 (envelope-from torek@elf.torek.net)
Received: from elf.torek.net (localhost [127.0.0.1])
 by elf.torek.net (8.15.2/8.15.2) with ESMTPS id v3A2GQ9G032228
 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO);
 Sun, 9 Apr 2017 19:16:26 -0700 (PDT)
 (envelope-from torek@elf.torek.net)
Received: (from torek@localhost)
 by elf.torek.net (8.15.2/8.15.2/Submit) id v3A2GQ2s032227;
 Sun, 9 Apr 2017 19:16:26 -0700 (PDT) (envelope-from torek)
Date: Sun, 9 Apr 2017 19:16:26 -0700 (PDT)
From: Chris Torek <torek@elf.torek.net>
Message-Id: <201704100216.v3A2GQ2s032227@elf.torek.net>
To: rysto32@gmail.com, vasanth.raonaik@gmail.com
Subject: Re: Understanding the FreeBSD locking mechanism
Cc: ablacktshirt@gmail.com, ed@nuxi.nl, freebsd-hackers@freebsd.org
In-Reply-To: <CAFMmRNzOypqsBam2BfaFm+pX7hSYoEvB2oFtec8OtH6D=s9yTw@mail.gmail.com>
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.6.2
 (elf.torek.net [127.0.0.1]); Sun, 09 Apr 2017 19:16:26 -0700 (PDT)
X-Mailman-Approved-At: Mon, 10 Apr 2017 02:50:47 +0000
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 10 Apr 2017 02:20:25 -0000

Ryan Stone is of course correct here.  I have not kept up with the
latest terminology shifts, but I can describe a bit of the history
of all of this.  (I was somewhat involved with writing the
original MTX_DEF and MTX_SPIN mutex code way back when).

(None of the rest of this should be new to experienced kernel
hackers.)

In the old non-SMP days, BSD, like traditional V6 Unix, divided
the kernel into "top half" and "bottom half" sections.  The top
half was anything driven from something other than an interrupt,
such as initial bootstrap or any user-sourced system call.  Each
of these had just one (per-process) kernel stack, in the "u.
area", which was UPAGES * NBPG (number of bytes per page) bytes
long, but also had to contain "struct user".

(In other words, the stack space available was actually smaller
than that.  The "user" struct was *above* the kernel stack, so
that ksp would not grow down into the structure; there was also
signal trampoline code wedged in there, at least on the VAX and
some of the early other ports.  I desperately wanted to move the
trampoline code to libc for the sparc port.  It was *in theory*
easy to do this :-) ... practice was another matter.)

When an interrupt arrived, as long as it was not interrupting
another interrupt, the system would get on a separate "interrupt
stack" -- some hardware supports this directly, with a separate
interrupt stack register -- which meant we did not have to provide
enough interrupt-handling space in the per-process kernel stack,
nor take interrupts on some possibly dodgy user stack.
(Interrupts can occur at any time, so the system may be running
user code, not kernel code.)

It also meant that a simple:

    s = splfoo();

call in the top half would block any interrupts at priority foo or
lower, so that "top half" code could know that "bottom half" code
for foo would not run at this point.

With prioritized interrupts, *taking* an interrupt at level foo
automatically raised the CPU's priority to foo, so any "bottom
half" code for foo would know that no important "top half" code
was running at the time -- if that had been the case, the top half
would have done an splfoo() to block it -- and of course no other
"bottom half" code for foo could run now.

Meanwhile, no bottom-half code was *ever* allowed to block.  When
you took an interrupt, you were committed to finishing all of the
work for that interrupt before issuing a "return from interrupt"
instruction (which would, if the interrupt was not interrupting
another interrupt, switch back to the appropriate stack).

A good way to describe this strategy:

    s = splfoo();  /* block all bottom half code at this priority */
    ...
    splx(s);       /* resume ability of blocked bottom half code to run */

is that spl (set priority level) provides mutual exclusion to
*code paths*.  The top half blocks out the bottom half with an
spl, and the bottom half blocks out the top half by simply *being*
bottom-half code, handling interrupts.
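
To make the spl idea concrete, here is a small single-threaded
simulation in C. It is only a sketch under stated assumptions: the sim_*
names, the eight priority levels, and the "pending" array are
hypothetical, and "splfoo()" in the text corresponds to sim_splraise()
with some level foo. An interrupt posted at or below the current level
is held off until the level drops again.

```c
#include <assert.h>
#include <stdbool.h>

#define SIM_NLEVELS 8

int  sim_ipl;                       /* current interrupt priority level */
bool sim_pending[SIM_NLEVELS];      /* interrupts held off by sim_ipl */
int  sim_handled[SIM_NLEVELS];      /* times each level's handler ran */

/* Run the handler for every pending level above the current one. */
void
sim_run_pending(void)
{
    for (int lvl = SIM_NLEVELS - 1; lvl > sim_ipl; lvl--) {
        if (sim_pending[lvl]) {
            int saved = sim_ipl;

            sim_pending[lvl] = false;
            sim_ipl = lvl;          /* taking it raises the level */
            sim_handled[lvl]++;     /* "bottom half" work goes here */
            sim_ipl = saved;        /* return from interrupt */
        }
    }
}

/* Like s = splfoo(): raise the level and return the previous one. */
int
sim_splraise(int lvl)
{
    int old = sim_ipl;

    if (lvl > sim_ipl)
        sim_ipl = lvl;
    return old;
}

/* Like splx(s): restore the level, then let held-off interrupts run. */
void
sim_splx(int s)
{
    sim_ipl = s;
    sim_run_pending();
}

/* The "hardware" raises an interrupt at the given level. */
void
sim_post_interrupt(int lvl)
{
    assert(lvl > 0 && lvl < SIM_NLEVELS);
    sim_pending[lvl] = true;
    sim_run_pending();
}
```

With the level raised to 4, an interrupt posted at level 3 stays pending
while one posted at level 5 still runs at once; restoring the old level
with sim_splx() then delivers the level-3 interrupt.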

     -----

With SMP, this whole strategy is a non-starter.  We don't have
just one CPU running code; we cannot block *code* paths at all.
Instead, we switch to mutually exclusive access to *data*.

We then make several observations:

 * Most data structures are mostly uncontested.  (If not, we need to
   redesign them!)  "Get lock" should be fast in this usual case.

 * If we provide what used to be "bottom half" drivers with *their
   own* stacks / interrupt threads ("ithreads"), they can block if
   they need to: when the data structure *is* actually contested.

This means we mainly need just one kind of mutex.  For read-mostly
data structures, we would like to have shared-read locks and so
on, but we can build them on this base mutex.  (As it turns out,
this view is a little simplistic; we want to build them on a base
compare-and-swap, typically, rather than a base full-blown-mutex.
It would also be nice to have CAS-based multi-producer single-
consumer and single-producer multi-consumer queues.  These are
particularly useful on systems with hundreds or thousands of
cores.)
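
As an illustration of building on a base compare-and-swap, here is a
minimal C11 sketch of a try-lock whose uncontested path is a single CAS
on an owner word. This is an assumption-laden toy, not FreeBSD's mutex
code: struct cas_lock, the function names, and the use of a plain
integer as the owner id are all hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define CAS_LOCK_UNOWNED ((uintptr_t)0)

struct cas_lock {
    _Atomic uintptr_t owner;    /* CAS_LOCK_UNOWNED or the owner's id */
};

/* Try once to take the lock; true on success.  Never waits. */
bool
cas_lock_try(struct cas_lock *l, uintptr_t self)
{
    uintptr_t expected = CAS_LOCK_UNOWNED;

    /* Succeeds only if nobody owns the lock at this instant. */
    return atomic_compare_exchange_strong_explicit(&l->owner, &expected,
        self, memory_order_acquire, memory_order_relaxed);
}

/* Release a lock held by self. */
void
cas_lock_release(struct cas_lock *l, uintptr_t self)
{
    uintptr_t expected = self;
    bool ok;

    ok = atomic_compare_exchange_strong_explicit(&l->owner, &expected,
        CAS_LOCK_UNOWNED, memory_order_release, memory_order_relaxed);
    assert(ok);                 /* caller must have been the owner */
    (void)ok;
}
```

What to do when cas_lock_try() fails (spin, block, or give up) is a
policy layered on top, which is where the different lock types diverge.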

Of course, we also have to start dealing with issues like priority
inversion and lock order / possible deadlock, any time we lock
data instead of code paths.  But that's mostly a separate issue.

     -----

This is all fine for most code, including most device drivers, but
code that manipulates the hardware itself, the lowest-level interrupt
dispatching code, and the system scheduler must still block
interrupts from time to time.  We also have some special oddball
cases, such as profiling interrupts, where we want to know: "What
was running when the interrupt fired?"  For these cases we *don't*
want to switch to a separate interrupt thread:

 * In the hardware interrupt dispatcher, we may not *know which
   thread to switch to* yet.  We must find the right ithread,
   then schedule it to run.  (Then we have to manipulate the
   hardware based on whether the interrupt is edge or level
   triggered, and so on, but that's an in-the-weeds detail also
   mostly unrelated to this scheduling.)

   For the profiling "what was running" case, we'd like to sample
   what was running, which we *can't* do from a separate thread:
   we need access to the current stack.  (Strictly speaking, we
   merely need said access ... but we also need that thread to
   remain paused while we sample it.)

   And, for some low-cost paths such as gathering entropy, we may
   not want or need to *pay* the up-front cost of a separate ithread.

 * In the scheduler, we're either in the process of choosing
   threads and changing stacks, or setting up data structures to
   tell the chooser which threads to choose.  We need to block all
   scheduling events, including all interrupts, for some of these
   super-critical sections.

These use the MTX_SPIN type lock, which is similar to MTX_DEF, but:

 * does block interrupts, and 

 * never *invokes* the scheduler: never tries to put the current
   thread out of the running until the locked data are available.

Since then, we have added another special case:

 * In a "critical section", we wish to make sure that the current
   thread does not migrate from one CPU to another.  This does
   not, strictly speaking, require blocking interrupts entirely,
   but because the scheduler does its thing by blocking interrupts,
   we block interrupts for short durations here as well (actually
   when *leaving* the critical section, where we check to see if
   the scheduler would *like* us to migrate).

   This is not really a mutex at all, but it does interact with
   them, so it's worth mentioning.  Essentially, if you are in a
   critical section, you may not switch threads, so if you need
   a mutex, you must use a spin mutex.

   (This *is* well-documented in "man 9 critical_enter".)
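
   A toy model of the nesting and the check-on-exit behavior, as a
   userspace C sketch. Everything here is hypothetical (struct
   crit_thread, its fields, and the crit_* functions are invented for
   illustration); it only models the bookkeeping, not interrupts or a
   real scheduler.

```c
#include <assert.h>
#include <stdbool.h>

struct crit_thread {
    int  critnest;      /* critical-section nesting depth */
    bool owe_preempt;   /* scheduler asked for a switch while nested */
    int  switches;      /* context switches actually performed */
};

void
crit_enter(struct crit_thread *td)
{
    td->critnest++;
}

/* The scheduler would like this thread to switch or migrate. */
void
crit_request_preempt(struct crit_thread *td)
{
    if (td->critnest > 0)
        td->owe_preempt = true;     /* not now: remember it for later */
    else
        td->switches++;             /* not in a critical section */
}

void
crit_exit(struct crit_thread *td)
{
    assert(td->critnest > 0);
    td->critnest--;
    /* Only the outermost exit pays off a deferred request. */
    if (td->critnest == 0 && td->owe_preempt) {
        td->owe_preempt = false;
        td->switches++;
    }
}
```

   A request that arrives inside nested sections is honored exactly
   once, when the outermost section is left.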

One might argue that being in a critical section should turn an
MTX_DEF mutex into an MTX_SPIN mutex, but it's not that easy to
do, and if you're taking a possibly slow MTX_DEF lock *while* in a
critical section, "you're doing it wrong" (as with the heavily
contested datum problem, we should rewrite the code so that the
critical section happens *while* holding the MTX_DEF object, not
the other way around).

Anyway, that's how we got here, and why things are the way they
are.

Chris

From owner-freebsd-hackers@freebsd.org  Mon Apr 10 04:02:23 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 6D596D365CF
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Mon, 10 Apr 2017 04:02:23 +0000 (UTC)
 (envelope-from ablacktshirt@gmail.com)
Received: from mail-oi0-x242.google.com (mail-oi0-x242.google.com
 [IPv6:2607:f8b0:4003:c06::242])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 32F837A4
 for <freebsd-hackers@freebsd.org>; Mon, 10 Apr 2017 04:02:23 +0000 (UTC)
 (envelope-from ablacktshirt@gmail.com)
Received: by mail-oi0-x242.google.com with SMTP id w197so10298429oiw.1
 for <freebsd-hackers@freebsd.org>; Sun, 09 Apr 2017 21:02:23 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=subject:to:references:cc:from:message-id:date:user-agent
 :mime-version:in-reply-to:content-transfer-encoding;
 bh=wqmoaea+s9YKC28jD3SwkdZ2MtrYqVk7eDe0InvWfv8=;
 b=sj4jc96EFNt5t/LDwRW71YGB1nuYyO+hhQoccrj+4/Nqfbz2N5i6Xwhj1GfUy7SDY+
 YCP5tambaaCVc/DOlluipV/WNliezaWiKhEnBx/zj10g1XDQGgw3P8N0KLpK0+5MVX53
 M2FZJe+K70vxn6uHeCfzlWauf7nHCACqRJUj/+ck1zEWFmLDbo0iIP2teqE9TWU/PpMz
 umy2lh91avEJw23QqSm6mOFaAOLdFByUMXL0hyscqJIR+UI8D/rMg6VFT3EQAVGmZbbJ
 NL9KkcGTjWwscfw3icMUnn099SvUSuZwreiKiol+v0huBlMU3auSud+uzGCKKQIZB8Mn
 Xlwg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:subject:to:references:cc:from:message-id:date
 :user-agent:mime-version:in-reply-to:content-transfer-encoding;
 bh=wqmoaea+s9YKC28jD3SwkdZ2MtrYqVk7eDe0InvWfv8=;
 b=pPxREbvl3dL+6zrgeQXw6iXsJdPIGuFsc3uY3lYcGzC1u1M5CDOHmM19/V/lXgaTCU
 +Z/WFJBAWIF0iODGYgtJQelzTNrILYDLf2B9kJyA2w71kjOz8xn9bHVLpzb7wrMhDNSH
 k6H+4SefCGLfO342xYyGNjEog8E8kmGJjQmXiYHeFb+D0RxWROOO38jdeFuL3H247ETY
 f4pCy1RECpCqihuu3pxKBaD4fmxvmGC6ugJVUBfkA+pA35QdSGHceCtiWr3/Z7XN/B25
 W4Y6Ny9xtGYjrIi/YuTspLUj8TKllS2zradeCyKOADLwRV80j3gx/hU+IuXSdSvVUhBR
 RQIg==
X-Gm-Message-State: AN3rC/6RbfN812Kpr/CF2Ym5BRM9KVPxEM0xx7B6nUFkbtAYfWTwqoGufLCELsgF0ukiVA==
X-Received: by 10.157.82.9 with SMTP id e9mr7888758oth.50.1491796942394;
 Sun, 09 Apr 2017 21:02:22 -0700 (PDT)
Received: from [192.168.0.100] ([110.64.91.54])
 by smtp.gmail.com with ESMTPSA id p3sm4067200ota.54.2017.04.09.21.02.19
 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Sun, 09 Apr 2017 21:02:21 -0700 (PDT)
Subject: Re: Understanding the FreeBSD locking mechanism
To: Warner Losh <imp@bsdimp.com>
References: <e99b6366-7d30-a889-b7db-4a3b3133ff5e@gmail.com>
 <CABh_MKkbVVi+gTkaBVDvVfRggS6pbHKJE_VbYBZpAaTCZ81b7Q@mail.gmail.com>
 <c72c0ee3-328d-3efc-e8a0-4d6c0d5c8cee@gmail.com>
 <CAFMmRNwWnaq-4vEDCByqdUzWfoiZeN0nM_M5rt8ST0P8xnUTsA@mail.gmail.com>
 <3f93930c-7f10-4d0b-35f2-2b07d64081f0@gmail.com>
 <CANCZdfoyjcSU+NHEVJF=bd8xz-Q-H1EupMPX+Jk45r3DKZ9F9Q@mail.gmail.com>
Cc: Ryan Stone <rysto32@gmail.com>,
 "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>,
 Ed Schouten <ed@nuxi.nl>
From: Yubin Ruan <ablacktshirt@gmail.com>
Message-ID: <b69597cd-fab6-7ef8-7dfe-d097283c4064@gmail.com>
Date: Mon, 10 Apr 2017 12:01:51 +0800
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:45.0) Gecko/20100101
 Thunderbird/45.7.0
MIME-Version: 1.0
In-Reply-To: <CANCZdfoyjcSU+NHEVJF=bd8xz-Q-H1EupMPX+Jk45r3DKZ9F9Q@mail.gmail.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 10 Apr 2017 04:02:23 -0000

On 2017/4/10 9:51, Warner Losh wrote:
> On Sun, Apr 9, 2017 at 7:28 PM, Yubin Ruan <ablacktshirt@gmail.com> wrote:
>> On 2017/4/10 0:24, Ryan Stone wrote:
>>>
>>>
>>>
>>> On Sun, Apr 9, 2017 at 6:13 AM, Yubin Ruan <ablacktshirt@gmail.com
>>> <mailto:ablacktshirt@gmail.com>> wrote:
>>>
>>>
>>>     #######1, spinlock used in an interrupt handler
>>>     If a thread A holding a spinlock T gets interrupted and the interrupt
>>>     handler responsible for this interrupt tries to acquire T, then we have
>>>     a deadlock, because A would never have a chance to run before the
>>>     interrupt handler returns, and the interrupt handler, unfortunately,
>>>     will continue to spin ... so in this situation, one has to disable
>>>     interrupts before spinning.
>>>
>>>     As far as I know, in Linux, they provide two kinds of spinlocks:
>>>
>>>       spin_lock(..);   /* spinlock that does not disable interrupts */
>>>       spin_lock_irqsave(...); /* spinlock that disables local interrupts */
>>>
>>>
>>> In the FreeBSD locking style, a spinlock is only used in the case where
>>> one needs to synchronize with an interrupt handler.  This is why spinlocks
>>> always disable local interrupts in FreeBSD.
>>>
>>> FreeBSD's lock for the first case is the MTX_DEF mutex, which is an
>>> adaptively-spinning blocking mutex implementation.  In short, the MTX_DEF
>>> mutex will spin waiting for the lock if the owner is running, but will
>>> block if the owner is descheduled.  This prevents expensive trips through
>>> the scheduler for the common case where the mutex is only held for short
>>> periods, without wasting CPU cycles spinning in cases where the owner
>>> thread is descheduled and therefore will not be completing soon.
>>
>>
>> Great explanation! I read the man page at:
>>
>>>
>>> https://www.freebsd.org/cgi/man.cgi?query=mutex&sektion=9&apropos=0&manpath=FreeBSD+11.0-RELEASE+and+Ports
>>
>> and am now clear about MTX_DEF and MTX_SPIN mutexes. But I still have a
>> few more questions, if you don't mind:
>>
>> Is it true that a thread holding an MTX_DEF mutex can be descheduled?
>> (Shouldn't it disable interrupts like an MTX_SPIN mutex?) It is said on
>> the man page that the MTX_DEF mutex is used by default in FreeBSD, so its
>> use case must be very common. If a thread holding an MTX_DEF mutex can be
>> descheduled, which means that it did not disable interrupts, then we may
>> have lots of deadlocks here, right?
>
> Yes, they can be descheduled. But that's not a problem. No other
> thread can acquire the MTX_DEF lock. If another thread tries, it will
> sleep and wait for the thread that holds the MTX_DEF lock to release
> it. Eventually, the thread will get time to run again, and then
> release the lock. Threads that just hold an MTX_DEF lock may also
> migrate from CPU to CPU.
>
> Warner
>

Does that imply that MTX_DEF should not be used in something like
an interrupt handler? Putting an interrupt handler to sleep doesn't
make much sense.

Yubin

>>>     #######2, priority inversion problem
>>>     If thread B with a higher priority gets in and tries to acquire the
>>>     lock that thread A currently holds, then thread B would spin, while at
>>>     the same time thread A has no chance to run because it has lower
>>>     priority, thus not being able to release the lock.
>>>     (I haven't investigated the source code enough, so I don't know
>>>     how FreeBSD and Linux handle this priority inversion problem. Maybe
>>>     they use priority inheritance or random boosting?)
>>>
>>>
>>> FreeBSD's spin locks prevent priority inversion by preventing the holder
>>> thread from being descheduled.
>>>
>>> MTX_DEF locks implement priority inheritance.
>>
>>
>> Nice hints. Thanks!
>>
>> regards,
>> Yubin Ruan
>>
>>


From owner-freebsd-hackers@freebsd.org  Mon Apr 10 04:26:30 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 206B8D36AE0
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Mon, 10 Apr 2017 04:26:30 +0000 (UTC)
 (envelope-from torek@elf.torek.net)
Received: from elf.torek.net (mail.torek.net [96.90.199.121])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client CN "elf.torek.net", Issuer "elf.torek.net" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id F2186FBC
 for <freebsd-hackers@freebsd.org>; Mon, 10 Apr 2017 04:26:29 +0000 (UTC)
 (envelope-from torek@elf.torek.net)
Received: from elf.torek.net (localhost [127.0.0.1])
 by elf.torek.net (8.15.2/8.15.2) with ESMTPS id v3A4QRua042762
 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO);
 Sun, 9 Apr 2017 21:26:27 -0700 (PDT)
 (envelope-from torek@elf.torek.net)
Received: (from torek@localhost)
 by elf.torek.net (8.15.2/8.15.2/Submit) id v3A4QR9Q042761;
 Sun, 9 Apr 2017 21:26:27 -0700 (PDT) (envelope-from torek)
Date: Sun, 9 Apr 2017 21:26:27 -0700 (PDT)
From: Chris Torek <torek@elf.torek.net>
Message-Id: <201704100426.v3A4QR9Q042761@elf.torek.net>
To: ablacktshirt@gmail.com, imp@bsdimp.com
Subject: Re: Understanding the FreeBSD locking mechanism
Cc: ed@nuxi.nl, freebsd-hackers@freebsd.org, rysto32@gmail.com
In-Reply-To: <b69597cd-fab6-7ef8-7dfe-d097283c4064@gmail.com>
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.6.2
 (elf.torek.net [127.0.0.1]); Sun, 09 Apr 2017 21:26:27 -0700 (PDT)
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 10 Apr 2017 04:26:30 -0000

>>> Is it true that a thread holding an MTX_DEF mutex can be descheduled?

>> Yes, they can be descheduled. But that's not a problem. No other
>> thread can acquire the MTX_DEF lock. ...

>Does that imply that MTX_DEF should not be used in something like
>an interrupt handler? Putting an interrupt handler to sleep doesn't
>make much sense.

Go back to the old top-half / bottom-half model, and consider that
now that there are interrupt *threads*, your ithread is also in the
"top half".  It's therefore OK to suspend.  ("Sleep" is not quite
correct here: a mutex wait is not a "sleep" state but instead is
just a waiting, not-scheduled-to-run state.  The precise difference
is irrelevant at this level though.)

It's not *great* to suspend here, but all your alternatives are
*also* bad:

 * You may grab incoming data and stuff it into a ring buffer, and
   schedule some other thread to handle it later.  But if the ring
   buffer is full you have a problem, and all you have done is push
   the actual processing off to another thread, adding more overhead.

 * You may put the device itself on hold so that no more data can
   come in (if it's that kind of device).

On the other hand, if you are handling an interrupt but not in an
interrupt thread, you are running in the "bottom half".  It is
therefore *not OK* to suspend.  You must now use one of those
alternatives.
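
For the ring-buffer alternative, a minimal sketch in C of the "grab the
data now, process it later" shape, including the full-buffer case. The
names (struct ring, ring_put, ring_get) and the tiny size are
hypothetical, and the code is deliberately single-threaded: a real
handler/consumer pair would also need atomics or memory barriers on the
indices, which are left out here.

```c
#include <assert.h>
#include <stdbool.h>

#define RING_SIZE 4u    /* must be a power of two for the index math */

struct ring {
    unsigned head;              /* next slot to write (producer) */
    unsigned tail;              /* next slot to read (consumer) */
    int      slot[RING_SIZE];
};

void
ring_init(struct ring *r)
{
    assert((RING_SIZE & (RING_SIZE - 1u)) == 0);
    r->head = r->tail = 0;
}

/* Called from the handler: never waits; false means the ring is full. */
bool
ring_put(struct ring *r, int v)
{
    if (r->head - r->tail == RING_SIZE)
        return false;           /* full: drop the data or throttle */
    r->slot[r->head % RING_SIZE] = v;
    r->head++;
    return true;
}

/* Called later from the processing thread; false means empty. */
bool
ring_get(struct ring *r, int *v)
{
    if (r->head == r->tail)
        return false;
    *v = r->slot[r->tail % RING_SIZE];
    r->tail++;
    return true;
}
```

The false return from ring_put() is the "you have a problem" case: the
handler cannot wait for space, so it has to drop the data or hold the
device off.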

Note that if you suspend on an MTX_DEF mutex, and your priority is
*higher* than the priority of whatever thread actually holds that
mutex now, that other thread gets a priority boost to your level
(priority propagation, to prevent priority inversion).  So letting
your ithread suspend, assuming you have an ithread, is probably your
best bet.

Chris

From owner-freebsd-hackers@freebsd.org  Mon Apr 10 07:11:44 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 92935D373BB
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Mon, 10 Apr 2017 07:11:44 +0000 (UTC)
 (envelope-from kostikbel@gmail.com)
Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 371DDBC3
 for <freebsd-hackers@freebsd.org>; Mon, 10 Apr 2017 07:11:44 +0000 (UTC)
 (envelope-from kostikbel@gmail.com)
Received: from tom.home (kib@localhost [127.0.0.1])
 by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id v3A7BcUx095781
 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO);
 Mon, 10 Apr 2017 10:11:38 +0300 (EEST)
 (envelope-from kostikbel@gmail.com)
DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua v3A7BcUx095781
Received: (from kostik@localhost)
 by tom.home (8.15.2/8.15.2/Submit) id v3A7Bb1h095780;
 Mon, 10 Apr 2017 10:11:37 +0300 (EEST)
 (envelope-from kostikbel@gmail.com)
X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com
 using -f
Date: Mon, 10 Apr 2017 10:11:37 +0300
From: Konstantin Belousov <kostikbel@gmail.com>
To: Chris Torek <torek@mail.torek.net>
Cc: rysto32@gmail.com, vasanth.raonaik@gmail.com, freebsd-hackers@freebsd.org, 
 ed@nuxi.nl, ablacktshirt@gmail.com
Subject: Re: Understanding the FreeBSD locking mechanism
Message-ID: <20170410071137.GH1788@kib.kiev.ua>
References: <CAFMmRNzOypqsBam2BfaFm+pX7hSYoEvB2oFtec8OtH6D=s9yTw@mail.gmail.com>
 <201704100216.v3A2GQ2s032227@elf.torek.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <201704100216.v3A2GQ2s032227@elf.torek.net>
User-Agent: Mutt/1.8.0 (2017-02-23)
X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00,
 DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no
 autolearn_force=no version=3.4.1
X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 10 Apr 2017 07:11:44 -0000

On Sun, Apr 09, 2017 at 07:16:26PM -0700, Chris Torek wrote:
> In the old non-SMP days, BSD, like traditional V6 Unix, divided
> the kernel into "top half" and "bottom half" sections.  The top
> half was anything driven from something other than an interrupt,
> such as initial bootstrap or any user-sourced system call.  Each
> of these had just one (per-process) kernel stack, in the "u.
> area", which was UPAGES * NBPG (number of bytes per page) bytes
> long, but also had to contain "struct user".
> 
> (In other words, the stack space available was actually smaller
> than that.  The "user" struct was *above* the kernel stack, so
> that ksp would not grow down into the structure; there was also
> signal trampoline code wedged in there, at least on the VAX and
> some of the early other ports.  I desperately wanted to move the
> trampoline code to libc for the sparc port.  It was *in theory*
> easy to do this :-) ... practice was another matter.)
Signal trampolines never were put on the kernel stack, simply because
uarea/kstack is not accessible from user space. They lived on top of
the user-mode stack of the main thread. Currently on x86/powerpc/arm,
signal trampolines are mapped from the 'shared page', which was done to
allow marking the user stack as non-executable.

Kstack still contains the remnants of the uarea, renamed to (per-thread)
pcb. There is not much sense in the split of struct thread vs. struct
pcb, but it has historically survived up to this moment, and clearing
things up requires too much MD work.
My opinion is that pcb on kstack indeed only eats space and would better
be put into td_md.

Yet another thing which shares the kstack is the usermode FPU save
area for x86 and arm64. At least on x86, the save area is dynamically
sized at boot to support extensions like AVX/AVX256/AVX512 etc., and
chomping part of the kstack saves one more contiguous KVA allocation and
allows reuse of the kstack cache. Again historically, pre-AVX kernels put
the XMM save area into pcb->kstack.

> 
> When an interrupt arrived, as long as it was not interrupting
> another interrupt, the system would get on a separate "interrupt
> stack" -- some hardware supports this directly, with a separate
> interrupt stack register -- which meant we did not have to provide
> enough interrupt-handling space in the per-process kernel stack,
> nor take interrupts on some possibly dodgy user stack.
> (Interrupts can occur at any time, so the system may be running
> user code, not kernel code.)
No, this is not the case, at least on x86. There, 'normal' interrupts
and exceptions reuse the current thread kstack, thus participating in
the common stack overflow business. On i386, only NMI and double fault
exceptions are routed through task gates in the IDT, and are provided with
a separate stack [a double fault almost always indicates a stack
overflow]. On amd64, TSS switching is impossible, but IDT descriptors
may be marked with a non-zero IST, which basically references some static
stack besides kstack. Only NMI uses IST.

> Since then, we have added another special case:
> 
>  * In a "critical section", we wish to make sure that the current
>    thread does not migrate from one CPU to another.  This does
>    not, strictly speaking, require blocking interrupts entirely,
>    but because the scheduler does its thing by blocking interrupts,
>    we block interrupts for short durations here as well (actually
>    when *leaving* the critical section, where we check to see if
>    the scheduler would *like* us to migrate).
This is not true, both in explanation of intent, and in the implementation
details.

A critical section prevents de-scheduling of the current thread, disabling
any context switches on the current CPU.  It works by incrementing the
current thread's td_critnest counter.  Note that interrupts are still
enabled while the critical section is entered, so the flow of control can
still be 'preempted' by an interrupt, but after return from the interrupt,
the current thread continues to execute.  If any higher-priority thread
needs to be scheduled due to the interrupt, the scheduling and context
switch are done after td_critnest returns to zero.
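
The counter-plus-deferral behaviour can be modelled in a few lines of
userspace C.  This is only a sketch of the idea, not the kernel code:
td_critnest and td_owepreempt are real struct thread field names, but
struct model_thread, the model_*() functions and the switches counter
are made up for the illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; nothing here is the kernel implementation. */
struct model_thread {
	int  td_critnest;	/* nesting depth of critical sections */
	bool td_owepreempt;	/* a preemption was requested while nested */
	int  switches;		/* context switches the model has performed */
};

static void
model_critical_enter(struct model_thread *td)
{
	td->td_critnest++;
}

/*
 * Modelled interrupt path: a higher-priority thread became runnable.
 * Inside a critical section the switch is only recorded, not performed.
 */
static void
model_request_preempt(struct model_thread *td)
{
	if (td->td_critnest > 0)
		td->td_owepreempt = true;
	else
		td->switches++;
}

/* Leaving the outermost section performs any switch that was deferred. */
static void
model_critical_exit(struct model_thread *td)
{
	assert(td->td_critnest > 0);
	td->td_critnest--;
	if (td->td_critnest == 0 && td->td_owepreempt) {
		td->td_owepreempt = false;
		td->switches++;
	}
}
```

The point of the model is that the "interrupt" still runs while
td_critnest is non-zero; only the resulting context switch is held
back until the count drops to zero.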

> 
>    This is not really a mutex at all, but it does interact with
>    them, so it's worth mentioning.  Essentially, if you are in a
>    critical section, you may not switch threads, so if you need
>    a mutex, you must use a spin mutex.
You probably mixed up critical_enter() and spinlock_enter() there.
The latter indeed disables interrupts and is intended to be used as part
of the spinlock (spin mutex) implementation.


> 
>    (This *is* well-documented in "man 9 critical_enter".)
The explanation in critical_enter(9) is somewhat misleading.
The consequences of a spinlock_enter() call include most side-effects of
critical_enter(), because spinlock_enter() disables interrupts and thus
context-switching cannot occur at all.  Spinlocks do not enter the
critical section technically, i.e. the td_critnest is not incremented.

From owner-freebsd-hackers@freebsd.org  Mon Apr 10 08:11:27 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 09B56D36128
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Mon, 10 Apr 2017 08:11:27 +0000 (UTC)
 (envelope-from torek@elf.torek.net)
Received: from elf.torek.net (mail.torek.net [96.90.199.121])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client CN "elf.torek.net", Issuer "elf.torek.net" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id CF1E0B19
 for <freebsd-hackers@freebsd.org>; Mon, 10 Apr 2017 08:11:26 +0000 (UTC)
 (envelope-from torek@elf.torek.net)
Received: from elf.torek.net (localhost [127.0.0.1])
 by elf.torek.net (8.15.2/8.15.2) with ESMTPS id v3A8BP3c049596
 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO);
 Mon, 10 Apr 2017 01:11:25 -0700 (PDT)
 (envelope-from torek@elf.torek.net)
Received: (from torek@localhost)
 by elf.torek.net (8.15.2/8.15.2/Submit) id v3A8BP8B049595;
 Mon, 10 Apr 2017 01:11:25 -0700 (PDT) (envelope-from torek)
Date: Mon, 10 Apr 2017 01:11:25 -0700 (PDT)
From: Chris Torek <torek@elf.torek.net>
Message-Id: <201704100811.v3A8BP8B049595@elf.torek.net>
To: kostikbel@gmail.com
Subject: Re: Understanding the FreeBSD locking mechanism
Cc: ablacktshirt@gmail.com, ed@nuxi.nl, freebsd-hackers@freebsd.org,
 rysto32@gmail.com, vasanth.raonaik@gmail.com
In-Reply-To: <20170410071137.GH1788@kib.kiev.ua>
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.6.2
 (elf.torek.net [127.0.0.1]); Mon, 10 Apr 2017 01:11:25 -0700 (PDT)
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 10 Apr 2017 08:11:27 -0000

>Signal trampolines never were put on the kernel stack ...

Oops, right, not sure why I was thinking that.  However I would still
prefer to have libc supply the trampoline address (the underlying
signal system calls can do this, since until you are catching a
signal in the first place, there is no need for a known-in-advance
trampoline address).

>Kstack still contains the remnants of the uarea, renamed to (per-thread)
>pcb. There is not much sense in the split of struct thread vs. struct
>pcb, but it has historically survived up to this moment, and clearing
>things up requires too much MD work.
>My opinion is that pcb on kstack indeed only eats space and would better
>be put into td_md.

That would be good.

>> When an interrupt arrived, as long as it was not interrupting
>> another interrupt, the system would get on a separate "interrupt
>> stack" ...

>No, this is not the case, at least on x86.

On VAX, and (emulated without hardware support) in my SPARC port,
it was. :-)

>There, 'normal' interrupts and exceptions reuse the current thread
>kstack ...

I never liked this very much, but if it's faster on x86, it's not
unreasonable.  And without hardware support (or if the TSS switch
is too slow) it's OK.

>>  * In a "critical section", we wish to make sure that the current
>>    thread does not migrate from one CPU to another.
>>    ...  we block interrupts for short durations here as well (actually
>>    when *leaving* the critical section, where we check to see if
>>    the scheduler would *like* us to migrate).

>This is not true, both in explanation of intent, and in the implementation
>details.

Ah, and I see you added a compiler_membar and some comments here
recently.  I did indeed misread the micro-optimization.

>You probably mixed up critical_enter() and spinlock_enter() there.
>The latter indeed disables interrupts and is intended to be used as part
>of the spinlock (spin mutex) implementation.

What I meant was that it's a dreadful error to do, e.g.:

	critical_enter();
	mtx_lock(mtx);
	...
	mtx_unlock(mtx);
	critical_exit();

but the other order (lock first, then enter/exit) is OK.  This
is similar to the prohibition against obtaining a default mutex
while holding a spin mutex.

Chris

From owner-freebsd-hackers@freebsd.org  Mon Apr 10 08:48:03 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 7F984D36E00
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Mon, 10 Apr 2017 08:48:03 +0000 (UTC)
 (envelope-from kostikbel@gmail.com)
Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id E63B723E
 for <freebsd-hackers@freebsd.org>; Mon, 10 Apr 2017 08:48:02 +0000 (UTC)
 (envelope-from kostikbel@gmail.com)
Received: from tom.home (kib@localhost [127.0.0.1])
 by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id v3A8lvou016982
 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO);
 Mon, 10 Apr 2017 11:47:57 +0300 (EEST)
 (envelope-from kostikbel@gmail.com)
DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua v3A8lvou016982
Received: (from kostik@localhost)
 by tom.home (8.15.2/8.15.2/Submit) id v3A8lvrt016981;
 Mon, 10 Apr 2017 11:47:57 +0300 (EEST)
 (envelope-from kostikbel@gmail.com)
X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com
 using -f
Date: Mon, 10 Apr 2017 11:47:56 +0300
From: Konstantin Belousov <kostikbel@gmail.com>
To: Chris Torek <torek@mail.torek.net>
Cc: ablacktshirt@gmail.com, ed@nuxi.nl, freebsd-hackers@freebsd.org,
 rysto32@gmail.com, vasanth.raonaik@gmail.com
Subject: Re: Understanding the FreeBSD locking mechanism
Message-ID: <20170410084756.GJ1788@kib.kiev.ua>
References: <20170410071137.GH1788@kib.kiev.ua>
 <201704100811.v3A8BP8B049595@elf.torek.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <201704100811.v3A8BP8B049595@elf.torek.net>
User-Agent: Mutt/1.8.0 (2017-02-23)
X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00,
 DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no
 autolearn_force=no version=3.4.1
X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 10 Apr 2017 08:48:03 -0000

On Mon, Apr 10, 2017 at 01:11:25AM -0700, Chris Torek wrote:
> >Signal trampolines never were put on the kernel stack ...
> 
> Oops, right, not sure why I was thinking that.  However I would still
> prefer to have libc supply the trampoline address (the underlying
> signal system calls can do this, since until you are catching a
> signal in the first place, there is no need for a known-in-advance
> trampoline address).
I considered some variation of this scheme when I worked on the
non-executable stack support. AFAIR the reason why I decided not to do
this was that the kernel-injected signal trampoline is still needed
for backward ABI-compat. In other words, the shared page would
still be needed, and we would end up with both a libc trampoline and a
kernel trampoline, which felt somewhat excessive.

Selecting one scheme or another based e.g. on the binary osrel was too
fragile: a new binary might have loaded an old library, and the kernel
trampoline must still be present in this situation.

> What I meant was that it's a dreadful error to do, e.g.:
> 
> 	critical_enter();
> 	mtx_lock(mtx);
> 	...
> 	mtx_unlock(mtx);
> 	critical_exit();
> 
> but the other order (lock first, then enter/exit) is OK.  This
> is similar to the prohibition against obtaining a default mutex
> while holding a spin mutex.

Sure, this is a bug.  A debugging kernel would catch this; at least
mi_switch() asserts that td_critnest == 0 (technically it checks that
td_critnest == 1, but the thread lock is owned there).  So if such code
tries to lock a contested mutex, the bug causes a panic.

I am sorry, my previous mail contained an error: spinlock_enter() also
increments td_critnest.  Still, since interrupts are disabled, this is
mostly cosmetic.  The more important consequence is that critical_exit()
on spinlock unlock re-checks td_owepreempt and executes any postponed
scheduling actions.

From owner-freebsd-hackers@freebsd.org  Mon Apr 10 08:57:42 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id A269CD371E1
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Mon, 10 Apr 2017 08:57:42 +0000 (UTC)
 (envelope-from torek@elf.torek.net)
Received: from elf.torek.net (mail.torek.net [96.90.199.121])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client CN "elf.torek.net", Issuer "elf.torek.net" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id 8CFE2B17
 for <freebsd-hackers@freebsd.org>; Mon, 10 Apr 2017 08:57:42 +0000 (UTC)
 (envelope-from torek@elf.torek.net)
Received: from elf.torek.net (localhost [127.0.0.1])
 by elf.torek.net (8.15.2/8.15.2) with ESMTPS id v3A8vf3B049846
 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO);
 Mon, 10 Apr 2017 01:57:41 -0700 (PDT)
 (envelope-from torek@elf.torek.net)
Received: (from torek@localhost)
 by elf.torek.net (8.15.2/8.15.2/Submit) id v3A8vffM049845;
 Mon, 10 Apr 2017 01:57:41 -0700 (PDT) (envelope-from torek)
Date: Mon, 10 Apr 2017 01:57:41 -0700 (PDT)
From: Chris Torek <torek@elf.torek.net>
Message-Id: <201704100857.v3A8vffM049845@elf.torek.net>
To: kostikbel@gmail.com
Subject: Re: Understanding the FreeBSD locking mechanism
Cc: ablacktshirt@gmail.com, ed@nuxi.nl, freebsd-hackers@freebsd.org,
 rysto32@gmail.com, vasanth.raonaik@gmail.com
In-Reply-To: <20170410084756.GJ1788@kib.kiev.ua>
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.6.2
 (elf.torek.net [127.0.0.1]); Mon, 10 Apr 2017 01:57:41 -0700 (PDT)
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 10 Apr 2017 08:57:42 -0000

>I considered some variation of this scheme when I worked on the
>non-executable stack support. AFAIR the reason why I decided not to do
>this was that the kernel-injected signal trampoline is still needed
>for backward ABI-compat. In other words, the shared page would
>still be needed, and we would end up with both a libc trampoline and a
>kernel trampoline, which felt somewhat excessive.

Those are pretty much the same reasons I never did it as well.

>Selecting one scheme or another based e.g. on the binary osrel was too
>fragile: a new binary might have loaded an old library, and the kernel
>trampoline must still be present in this situation.

The method by which to select the scheme, though, is straightforward:
old vs new signal system call numbers and/or flags.  ("Flags" presents
issues if users of existing mechanism are not good about clearing
unknown flag bits.)

Besides non-executable stack / shared-page, this would also be
particularly good for cases where a runtime library (not
necessarily libc itself, perhaps for other languages) wants a
different signal handling method in user space.  For instance,
instead of signals being delivered to some existing thread as
interrupts, they might spin off new threads entirely.

I think it's still worth pursuing, but it's one of those "forever in
the future, low priority" ideas.  I can't even seem to get back to
my medium-priority ideas these days...

Chris

From owner-freebsd-hackers@freebsd.org  Mon Apr 10 09:51:49 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 0B7A5D34100
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Mon, 10 Apr 2017 09:51:49 +0000 (UTC)
 (envelope-from markmi@dsl-only.net)
Received: from asp.reflexion.net (outbound-mail-210-5.reflexion.net
 [208.70.210.5])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id ABADBC99
 for <freebsd-hackers@freebsd.org>; Mon, 10 Apr 2017 09:51:48 +0000 (UTC)
 (envelope-from markmi@dsl-only.net)
Received: (qmail 25882 invoked from network); 10 Apr 2017 09:52:41 -0000
Received: from unknown (HELO rtc-sm-01.app.dca.reflexion.local) (10.81.150.1)
 by 0 (rfx-qmail) with SMTP; 10 Apr 2017 09:52:41 -0000
Received: by rtc-sm-01.app.dca.reflexion.local
 (Reflexion email security v8.40.0) with SMTP;
 Mon, 10 Apr 2017 05:51:41 -0400 (EDT)
Received: (qmail 29193 invoked from network); 10 Apr 2017 09:51:41 -0000
Received: from unknown (HELO iron2.pdx.net) (69.64.224.71)
 by 0 (rfx-qmail) with (AES256-SHA encrypted) SMTP; 10 Apr 2017 09:51:41 -0000
Received: from [192.168.1.106] (c-76-115-7-162.hsd1.or.comcast.net
 [76.115.7.162])
 by iron2.pdx.net (Postfix) with ESMTPSA id 85F66EC8630;
 Mon, 10 Apr 2017 02:51:40 -0700 (PDT)
Content-Type: text/plain; charset=us-ascii
Mime-Version: 1.0 (Mac OS X Mail 10.3 \(3273\))
Subject: Re: The arm64 fork-then-swap-out-then-swap-in failures: a program
 source for exploring them
From: Mark Millard <markmi@dsl-only.net>
In-Reply-To: <89D6D677-3BE2-45E2-A902-CC6A0305F3F9@dsl-only.net>
Date: Mon, 10 Apr 2017 02:51:39 -0700
Cc: andrew@freebsd.org, freebsd-hackers@freebsd.org,
 freebsd-arm <freebsd-arm@freebsd.org>
Content-Transfer-Encoding: quoted-printable
Message-Id: <585B43F7-D4C8-431A-BFFE-68B48C3214AE@dsl-only.net>
References: <4DEA2D76-9F27-426D-A8D2-F07B16575FB9@dsl-only.net>
 <163B37B0-55D6-498E-8F52-9A95C036CDFA@dsl-only.net>
 <08E7A5B0-8707-4479-9D7A-272C427FF643@dsl-only.net>
 <20170409122715.GF1788@kib.kiev.ua>
 <9D152170-5F19-47A2-A06A-66F83CA88A09@dsl-only.net>
 <9DCAF95B-39A5-4346-88FC-6AFDEE8CF9BB@dsl-only.net>
 <8FFE95AA-DB40-4D1E-A103-4BA9FCC6EDEE@dsl-only.net>
 <89D6D677-3BE2-45E2-A902-CC6A0305F3F9@dsl-only.net>
To: Konstantin Belousov <kostikbel@gmail.com>
X-Mailer: Apple Mail (2.3273)
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 10 Apr 2017 09:51:49 -0000

On 2017-Apr-9, at 5:10 PM, Mark Millard <markmi at dsl-only.net> wrote:

> On 2017-Apr-9, at 10:24 AM, Mark Millard <markmi at dsl-only.net> =
wrote:
>=20
>> On 2017-Apr-9, at 5:27 AM, Konstantin Belousov <kostikbel@gmail.com> =
wrote:
>=20
>>=20
>>> Hmm, could you try the following patch, I did not even compile it.
>>=20
>> I'll try it later today.
>>=20
>>> diff --git a/sys/arm64/arm64/pmap.c b/sys/arm64/arm64/pmap.c
>>> index 3d5756ba891..55aa402eb1c 100644
>>> --- a/sys/arm64/arm64/pmap.c
>>> +++ b/sys/arm64/arm64/pmap.c
>>> @@ -2481,6 +2481,11 @@ pmap_protect(pmap_t pmap, vm_offset_t sva, =
vm_offset_t eva, vm_prot_t prot)
>>> 		    sva +=3D L3_SIZE) {
>>> 			l3 =3D pmap_load(l3p);
>>> 			if (pmap_l3_valid(l3)) {
>>> +				if ((l3 & ATTR_SW_MANAGED) &&
>>> +				    pmap_page_dirty(l3)) {
>>> +					vm_page_dirty(PHYS_TO_VM_PAGE(l3 =
&
>>> +					    ~ATTR_MASK));
>>> +				}
>>> 				pmap_set(l3p, ATTR_AP(ATTR_AP_RO));
>>> 				PTE_SYNC(l3p);
>>> 				/* XXX: Use pmap_invalidate_range */
>=20
>=20
> Preliminary testing indicates that this fixes the
> some-pages-become-zero problem for fork-then-swapout/in.
>=20
> Thanks!
>=20
> I'll see if a buildworld can go through without being stopped
> by this type of issue. But that will take a while. (It is how
> I originally ran into the problem(s) that others had been
> reporting on the lists.)

buildworld buildkernel completed non-stop for the first time
on a BPI-M3 board.

Looks good for a check-in to svn to me (head and stable/11).

This combined with 2017-Feb-15's -r313772's fix to the fork
trampoline code's updating of sp_el0 makes arm64 far more stable
for my purposes.

-r313772 was never MFC'd to stable/11. In my view it should be.

=3D=3D=3D
Mark Millard
markmi at dsl-only.net


From owner-freebsd-hackers@freebsd.org  Mon Apr 10 14:44:36 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 4D249D37745
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Mon, 10 Apr 2017 14:44:36 +0000 (UTC)
 (envelope-from julian@freebsd.org)
Received: from vps1.elischer.org (vps1.elischer.org [204.109.63.16])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client CN "vps1.elischer.org",
 Issuer "CA Cert Signing Authority" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id 34579B6
 for <freebsd-hackers@freebsd.org>; Mon, 10 Apr 2017 14:44:35 +0000 (UTC)
 (envelope-from julian@freebsd.org)
Received: from Julian-MBP3.local (106-68-100-234.dyn.iinet.net.au
 [106.68.100.234]) (authenticated bits=0)
 by vps1.elischer.org (8.15.2/8.15.2) with ESMTPSA id v3AEiTSw050386
 (version=TLSv1.2 cipher=DHE-RSA-AES128-SHA bits=128 verify=NO);
 Mon, 10 Apr 2017 07:44:32 -0700 (PDT)
 (envelope-from julian@freebsd.org)
Subject: Re: Understanding the FreeBSD locking mechanism
To: Yubin Ruan <ablacktshirt@gmail.com>, Ed Schouten <ed@nuxi.nl>
References: <e99b6366-7d30-a889-b7db-4a3b3133ff5e@gmail.com>
 <CABh_MKkbVVi+gTkaBVDvVfRggS6pbHKJE_VbYBZpAaTCZ81b7Q@mail.gmail.com>
 <c72c0ee3-328d-3efc-e8a0-4d6c0d5c8cee@gmail.com>
Cc: freebsd-hackers@freebsd.org
From: Julian Elischer <julian@freebsd.org>
Message-ID: <56c36e41-e1cb-6e87-dc6e-922dd5abbccc@freebsd.org>
Date: Mon, 10 Apr 2017 22:44:23 +0800
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:45.0)
 Gecko/20100101 Thunderbird/45.8.0
MIME-Version: 1.0
In-Reply-To: <c72c0ee3-328d-3efc-e8a0-4d6c0d5c8cee@gmail.com>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 10 Apr 2017 14:44:36 -0000

On 9/4/17 6:13 pm, Yubin Ruan wrote:
> On 2017/4/6 17:31, Ed Schouten wrote:
>> Hi Yubin,
>>
>> 2017-04-06 11:16 GMT+02:00 Yubin Ruan <ablacktshirt@gmail.com>:
>>> Does this function provide the ordinary "spinlock" functionality?
>>> There is no special "test-and-set" instruction, nor any extra
>>> locking to protect internal data structure manipulation. Isn't this
>>> subject to a race condition?
>>
>> Locking a spinlock is done through macro mtx_lock_spin(), which
>> expands to __mtx_lock_spin() in sys/sys/mutex.h. That macro first
>> calls into the function you looked at, spinlock_enter(), to disable
>> interrupts. It then calls into the _mtx_obtain_lock_fetch() to do the
>> test-and-set operation you were looking for.
>
> Thanks for replying. I have read some of those codes.
Just in case it somehow slipped your attention or has not yet been
brought up, there is the following overview:

https://www.freebsd.org/cgi/man.cgi?locking(9)
>
> Just a few more questions, if you don't mind:
>
> (1) why are spinlocks forced to disable interrupt in FreeBSD?
>
> From the book "The design and implementation of the FreeBSD Operating
> System", the authors say "spinning can result in deadlock if a thread
> interrupted the thread that held a mutex and then tried to acquire the
> mutex"...(section 4.3, Mutex Synchronization, paragraph 4)
>
> I don't see why a spinlock (or *spin mutex* in the FreeBSD
> world) has to disable interrupts. Being interrupted does not necessarily
> mean a deadlock. Assume that thread A holding a lock T gets interrupted
> by another thread B (context switch here) and thread B tries to acquire
> the lock T. After finding out that lock T has already been acquired,
> thread B will just spin until it gets preempted, after which thread A
> gets woken up, runs, and releases the lock T. So there is not
> necessarily any deadlock even if thread A gets interrupted.
>
> I can only remember two conditions where using spinlock without
> disabling interrupts will cause deadlock:
>
> #######1, spinlock used in an interrupt handler
> If a thread A holding a spinlock T gets interrupted and the interrupt
> handler responsible for this interrupt tries to acquire T, then we have
> a deadlock, because A would never have a chance to run before the
> interrupt handler returns, and the interrupt handler, unfortunately,
> will continue to spin ... so in this situation, one has to disable
> interrupts before spinning.
>
> As far as I know, in Linux, they provide two kinds of spinlocks:
>
>   spin_lock(..);   /* spinlock that does not disable interrupts */
>   spin_lock_irqsave(...); /* spinlock that disable local interrupt */
>
>
> #######2, priority inversion problem
> If thread B with a higher priority gets in and tries to acquire the lock
> that thread A currently holds, then thread B would spin, while at the
> same time thread A has no chance to run because it has lower priority,
> and so cannot release the lock.
> (I haven't investigated the source code enough, so I don't know
> how FreeBSD and Linux handle this priority inversion problem. Maybe
> they use priority inheritance or random boosting?)
>
> thanks,
> Yubin Ruan
> _______________________________________________
> freebsd-hackers@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to 
> "freebsd-hackers-unsubscribe@freebsd.org"
>
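
Case #1 in the quoted text can be modelled in user space. This is only
a sketch under stated assumptions: interrupts cannot be disabled from a
normal program, so the "interrupt handler" is an ordinary function call
made while the lock is still held, and the spin is bounded so that the
model reports the would-be deadlock instead of hanging. All names
(toy_spin_bounded, toy_handler, and so on) are made up for the
illustration and are not FreeBSD or Linux APIs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy test-and-set lock.  A real spin mutex would spin without a bound. */
static bool
toy_spin_bounded(atomic_flag *l, unsigned max_spins)
{
	for (unsigned i = 0; i < max_spins; i++) {
		if (!atomic_flag_test_and_set_explicit(l, memory_order_acquire))
			return (true);
	}
	return (false);		/* gave up: stands in for "spins forever" */
}

static void
toy_unlock(atomic_flag *l)
{
	atomic_flag_clear_explicit(l, memory_order_release);
}

/*
 * Stand-in for an interrupt handler that wants the same lock.  While it
 * runs, the interrupted lock holder cannot make progress.
 */
static bool
toy_handler(atomic_flag *l)
{
	bool got = toy_spin_bounded(l, 100000);

	if (got)
		toy_unlock(l);
	return (got);
}

/* The "interrupt" arrives while thread A still holds T. */
bool
handler_runs_while_lock_held(void)
{
	atomic_flag t = ATOMIC_FLAG_INIT;
	bool held, got;

	held = toy_spin_bounded(&t, 1);	/* thread A acquires T */
	assert(held);
	(void)held;
	got = toy_handler(&t);		/* handler can never get T here */
	toy_unlock(&t);			/* A resumes only after the handler gives up */
	return (got);
}

/* The "interrupt" is held off until thread A has released T. */
bool
handler_runs_after_release(void)
{
	atomic_flag t = ATOMIC_FLAG_INIT;
	bool held;

	held = toy_spin_bounded(&t, 1);
	assert(held);
	(void)held;
	toy_unlock(&t);
	return (toy_handler(&t));
}
```

In this model handler_runs_while_lock_held() returns false (the handler
never gets the lock), while handler_runs_after_release() returns true,
which is the effect that disabling interrupts across the critical
section is meant to guarantee.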


From owner-freebsd-hackers@freebsd.org  Mon Apr 10 20:16:00 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 3EB7BD3871E
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Mon, 10 Apr 2017 20:16:00 +0000 (UTC)
 (envelope-from markmi@dsl-only.net)
Received: from asp.reflexion.net (outbound-mail-210-11.reflexion.net
 [208.70.210.11])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id E2A212E6
 for <freebsd-hackers@freebsd.org>; Mon, 10 Apr 2017 20:15:59 +0000 (UTC)
 (envelope-from markmi@dsl-only.net)
Received: (qmail 23069 invoked from network); 10 Apr 2017 20:15:58 -0000
Received: from unknown (HELO mail-cs-02.app.dca.reflexion.local) (10.81.19.2)
 by 0 (rfx-qmail) with SMTP; 10 Apr 2017 20:15:58 -0000
Received: by mail-cs-02.app.dca.reflexion.local
 (Reflexion email security v8.40.0) with SMTP;
 Mon, 10 Apr 2017 16:15:58 -0400 (EDT)
Received: (qmail 31433 invoked from network); 10 Apr 2017 20:15:58 -0000
Received: from unknown (HELO iron2.pdx.net) (69.64.224.71)
 by 0 (rfx-qmail) with (AES256-SHA encrypted) SMTP; 10 Apr 2017 20:15:58 -0000
Received: from [192.168.1.106] (c-76-115-7-162.hsd1.or.comcast.net
 [76.115.7.162])
 by iron2.pdx.net (Postfix) with ESMTPSA id 96DBBEC7C08;
 Mon, 10 Apr 2017 13:15:57 -0700 (PDT)
Content-Type: text/plain; charset=us-ascii
Mime-Version: 1.0 (Mac OS X Mail 10.3 \(3273\))
Subject: Re: The arm64 fork-then-swap-out-then-swap-in failures: a program
 source for exploring them
From: Mark Millard <markmi@dsl-only.net>
In-Reply-To: <585B43F7-D4C8-431A-BFFE-68B48C3214AE@dsl-only.net>
Date: Mon, 10 Apr 2017 13:15:57 -0700
Cc: andrew@freebsd.org, freebsd-hackers@freebsd.org,
 freebsd-arm <freebsd-arm@freebsd.org>
Content-Transfer-Encoding: quoted-printable
Message-Id: <876EA1E4-E5A9-411C-AFFD-989713037C19@dsl-only.net>
References: <4DEA2D76-9F27-426D-A8D2-F07B16575FB9@dsl-only.net>
 <163B37B0-55D6-498E-8F52-9A95C036CDFA@dsl-only.net>
 <08E7A5B0-8707-4479-9D7A-272C427FF643@dsl-only.net>
 <20170409122715.GF1788@kib.kiev.ua>
 <9D152170-5F19-47A2-A06A-66F83CA88A09@dsl-only.net>
 <9DCAF95B-39A5-4346-88FC-6AFDEE8CF9BB@dsl-only.net>
 <8FFE95AA-DB40-4D1E-A103-4BA9FCC6EDEE@dsl-only.net>
 <89D6D677-3BE2-45E2-A902-CC6A0305F3F9@dsl-only.net>
 <585B43F7-D4C8-431A-BFFE-68B48C3214AE@dsl-only.net>
To: Konstantin Belousov <kostikbel@gmail.com>
X-Mailer: Apple Mail (2.3273)
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 10 Apr 2017 20:16:00 -0000

On 2017-Apr-10, at 2:51 AM, Mark Millard <markmi at dsl-only.net> wrote:

> On 2017-Apr-9, at 5:10 PM, Mark Millard <markmi at dsl-only.net> wrote:
>
>> On 2017-Apr-9, at 10:24 AM, Mark Millard <markmi at dsl-only.net> wrote:
>>
>>> On 2017-Apr-9, at 5:27 AM, Konstantin Belousov <kostikbel@gmail.com> wrote:
>>
>>>
>>>> Hmm, could you try the following patch, I did not even compile it.
>>>
>>> I'll try it later today.
>>>
>>>> diff --git a/sys/arm64/arm64/pmap.c b/sys/arm64/arm64/pmap.c
>>>> index 3d5756ba891..55aa402eb1c 100644
>>>> --- a/sys/arm64/arm64/pmap.c
>>>> +++ b/sys/arm64/arm64/pmap.c
>>>> @@ -2481,6 +2481,11 @@ pmap_protect(pmap_t pmap, vm_offset_t sva, vm_offset_t eva, vm_prot_t prot)
>>>> 		    sva += L3_SIZE) {
>>>> 			l3 = pmap_load(l3p);
>>>> 			if (pmap_l3_valid(l3)) {
>>>> +				if ((l3 & ATTR_SW_MANAGED) &&
>>>> +				    pmap_page_dirty(l3)) {
>>>> +					vm_page_dirty(PHYS_TO_VM_PAGE(l3 &
>>>> +					    ~ATTR_MASK));
>>>> +				}
>>>> 				pmap_set(l3p, ATTR_AP(ATTR_AP_RO));
>>>> 				PTE_SYNC(l3p);
>>>> 				/* XXX: Use pmap_invalidate_range */
>>
>>
>> Preliminary testing indicates that this fixes the
>> some-pages-become-zero problem for fork-then-swapout/in.
>>
>> Thanks!
>>
>> I'll see if a buildworld can go through without being stopped
>> by this type of issue. But that will take a while. (It is how
>> I originally ran into the problem(s) that others had been
>> reporting on the lists.)
>
> buildworld buildkernel completed non-stop for the first time
> on a BPI-M3 board.

I had been thinking of the BPI-M3 for other reasons
and typed that instead of the correct: Pine64+ 2GB.
(True elsewhere as well.) I do really mean arm64
here, not armv7.

> Looks good for a check-in to svn to me (head and stable/11).
>
> This combined with 2017-Feb-15's -r313772's fix to the fork
> trampoline code's updating of sp_el0 makes arm64 far more stable
> for my purposes.
>
> -r313772 was never MFC'd to stable/11. In my view it should be.
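
For readers following the quoted pmap_protect() change: a toy model of
why the dirty state needs to be pushed to the page before the entry is
made read-only. This is a sketch, not the kernel code. The bit values,
struct toy_page, and the toy_* functions are all hypothetical, and
treating "writable" as the only record of "dirty" is an assumption of
the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bits for the model; not the real ATTR_* values. */
#define	TOY_PTE_WRITABLE	0x1u
#define	TOY_PTE_MANAGED		0x2u

/* Stand-in for the per-page state that the pageout code consults. */
struct toy_page {
	bool	dirty;
};

/* Model assumption: the entry's only record of "dirty" is "writable". */
static bool
toy_pte_dirty(uint32_t pte)
{
	return ((pte & TOY_PTE_WRITABLE) != 0);
}

/* Downgrade to read-only and forget: the dirty information is lost. */
static uint32_t
toy_protect_lossy(uint32_t pte, struct toy_page *pg)
{
	assert(pg != NULL);
	return (pte & ~TOY_PTE_WRITABLE);
}

/* Same downgrade, but record the dirty state on the page first. */
static uint32_t
toy_protect_fixed(uint32_t pte, struct toy_page *pg)
{
	assert(pg != NULL);
	if ((pte & TOY_PTE_MANAGED) != 0 && toy_pte_dirty(pte))
		pg->dirty = true;
	return (pte & ~TOY_PTE_WRITABLE);
}
```

In the lossy variant a page that was written before fork() looks clean
afterwards, so nothing in the model would cause it to be written to
swap. That matches the some-pages-become-zero symptom reported in this
thread, though the model does not claim to reproduce the real failure.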

===
Mark Millard
markmi at dsl-only.net


From owner-freebsd-hackers@freebsd.org  Tue Apr 11 07:17:02 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id B3118D397BD
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Tue, 11 Apr 2017 07:17:02 +0000 (UTC)
 (envelope-from crb@chrisbowman.com)
Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org
 [IPv6:2001:1900:2254:206a::50:5])
 by mx1.freebsd.org (Postfix) with ESMTP id 91F10FCC
 for <freebsd-hackers@freebsd.org>; Tue, 11 Apr 2017 07:17:02 +0000 (UTC)
 (envelope-from crb@chrisbowman.com)
Received: by mailman.ysv.freebsd.org (Postfix)
 id 8E5DED397BC; Tue, 11 Apr 2017 07:17:02 +0000 (UTC)
Delivered-To: hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8E02DD397BB
 for <hackers@mailman.ysv.freebsd.org>; Tue, 11 Apr 2017 07:17:02 +0000 (UTC)
 (envelope-from crb@chrisbowman.com)
Received: from mail-pg0-x229.google.com (mail-pg0-x229.google.com
 [IPv6:2607:f8b0:400e:c05::229])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 69CA7FCB
 for <hackers@freebsd.org>; Tue, 11 Apr 2017 07:17:02 +0000 (UTC)
 (envelope-from crb@chrisbowman.com)
Received: by mail-pg0-x229.google.com with SMTP id g2so116615572pge.3
 for <hackers@freebsd.org>; Tue, 11 Apr 2017 00:17:02 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=chrisbowman-com.20150623.gappssmtp.com; s=20150623;
 h=from:content-transfer-encoding:mime-version:subject:message-id:date
 :to; bh=fyoDCcmdvyHZkM4J65ZwAeZDiG/V1Uq5Qsi0MYqj7Os=;
 b=Rxl7x//Nxw1VOsxCIbXjHaUC/HOLQM3dXio7rQ/cbqX835iVvT+OWrMReDyXPHM6LN
 3X+tPb1xkPUClInh1vKL1I2BPkuECh4J3OMbTDw+WA9P4CYUctStJGDSsk5XUu5Wf300
 sFnid3NEdlvglKgOSVe3rcZZmpsfN4tSYQhd4L48uWKW973OcieXgQNtupCE0/Ws2ZO9
 ax9yFBF6zvFB9nMtxLxTOJ3V3h89Mj94RagWfYEfF1AbQXkxQuHHBQGbQEh3A5qwe07x
 TdRs5G49Xzq/2XrSJsrPlBpdu/sBjvM0FXYMX9D7fO8Gtx6uoOb+VbIto7tsAU3+HhiW
 mODw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:from:content-transfer-encoding:mime-version
 :subject:message-id:date:to;
 bh=fyoDCcmdvyHZkM4J65ZwAeZDiG/V1Uq5Qsi0MYqj7Os=;
 b=rfiy66sU9kH8CoA/ApN4ypm188xF/2Z8xk//PAv56zWHNGZUz6Ehev7veJM2/JpqQc
 Re+R/FjXYSCsZr0UWrcfRcJ+IXfl6gF7GnypJblXoQ8bghj1h8ZeVYyf0WoH52ZLFlVH
 W/UUn9nalV7HTFTkiS64iCQxu/jpXPAe7vtFUNmv1X/ma2vWBtfA2+L1ua3RKbLWRaIx
 B0MRNN0aWwTp3w1arla6NiACuZSWUNOhCQYBLQV6J0K7EnnlmxASTuZ1gYXU0Ed5QxxQ
 o3iqyFe6y60AYB+yfbeWSacsAOi5ToTgeT9gSmW2kMSC19N2vf5mLoWz1csjMqJTKdX4
 G57w==
X-Gm-Message-State: AFeK/H3aC5UxLv9bpMFaSEmiC/R2jKQwXzdtMzktQTSwUHmYn5sN80f3Up4l9xxiR9Bjww==
X-Received: by 10.84.140.235 with SMTP id 98mr73654035plt.161.1491895021410;
 Tue, 11 Apr 2017 00:17:01 -0700 (PDT)
Received: from ?IPv6:2601:647:4e00:bbb5:8918:714a:df41:33f0?
 ([2601:647:4e00:bbb5:8918:714a:df41:33f0])
 by smtp.gmail.com with ESMTPSA id n65sm28467853pga.8.2017.04.11.00.17.00
 for <hackers@freebsd.org>
 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Tue, 11 Apr 2017 00:17:00 -0700 (PDT)
From: Christopher Bowman <crb@chrisbowman.com>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
Mime-Version: 1.0 (Mac OS X Mail 10.3 \(3273\))
Subject: Dtrace oddity
Message-Id: <CD5E9B03-6147-4E4D-BED6-6C45022051E3@chrisbowman.com>
Date: Tue, 11 Apr 2017 00:16:59 -0700
To: hackers@freebsd.org
X-Mailer: Apple Mail (2.3273)
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 11 Apr 2017 07:17:02 -0000

Apologies if I’m sending to the wrong list.  I have a small test
program shown at the bottom.  It tries to mmap a device for which I’ve
written a (possibly incorrect) driver.  When I run the program I get
the following output:

crb@retread:63> ./test /dev/sp6050
argc = 2
argv[0] = ./test
argv[1] = /dev/sp6050
opening device /dev/sp6050
open returned non-zero value
mmap failed: EINVAL

The man page lists a bunch of reasons for EINVAL so I want to
investigate this and I don’t quite know good strategies to debug
the kernel (yet) so I thought I’d experiment with Dtrace a bit.
Here is the oddity: when I run Dtrace and then run my test program I get
the following output from Dtrace:

crb@retread:60> dtrace -n 'syscall:freebsd:mmap:entry /execname == "test"/ {}'
dtrace: description 'syscall:freebsd:mmap:entry ' matched 1 probe
CPU     ID                    FUNCTION:NAME
  0  63401                       mmap:entry
  0  63401                       mmap:entry
  0  63401                       mmap:entry
  0  63401                       mmap:entry
  0  63401                       mmap:entry
  0  63401                       mmap:entry
  0  63401                       mmap:entry
  0  63401                       mmap:entry
  0  63401                       mmap:entry
  0  63401                       mmap:entry
  0  63401                       mmap:entry
  0  63401                       mmap:entry

I think Dtrace is indicating that the mmap syscall was called 12 times
by my test program, yet I can't see how the program below would have done
that.

Here is my program:

/*
	Copyright (c) 2011 by Christopher R. Bowman. All rights reserved.
*/

#include <errno.h>
#include <stdio.h>
#include <fcntl.h>
#include <sys/mman.h>


int main (int argc, char ** argv) {
	int i;

	printf("argc = %d\n", argc);
	for (i=0; i < argc; i++)
		printf ("argv[%i] = %s\n", i, argv[i]);
	if (argc < 2) {
		printf("usage: test device\n");
		return 0;
	}
	printf("opening device %s\n", argv[1]);
	int device = open (argv[1], O_RDWR);
	if (device == 0) {
		printf ("open of device %s failed\n", argv[1]);
		return 0;
	}
	printf("open returned non-zero value\n");
	void *pa = mmap (0, 4095, PROT_READ | PROT_WRITE, 0, device, 0);
	if (pa == MAP_FAILED) {
		printf ("mmap failed: ");
		switch (errno) {
			case EACCES: printf("EACCESS\n"); break;
			case EBADF:  printf("EBADF\n"); break;
			case EINVAL: printf("EINVAL\n"); break;
			case ENODEV: printf("ENODEV\n"); break;
			case ENOMEM: printf("ENOMEM\n"); break;
		}
		return 0;
	}
	printf("mmap returned non-zero value: %lx\n", (unsigned long)pa);
	unsigned int *p = (unsigned int *) pa;
	unsigned char *c = (unsigned char *) pa;
#define NUM_ITERATIONS 16
	for (i=0; i < NUM_ITERATIONS; i++){	//BARs are 2Kbytes
		//*p++ = (0xa5a5 + i);
		*p++ = (0x5aa5a5a5);
	}
	p = (unsigned int *) pa;
	for (i=0; i < NUM_ITERATIONS; i++){
		printf("i = %d, read_val = %x\n", i, *p++);
	}
}


Thanks in advance for comments on Dtrace or perhaps program corrections
or ideas why the mmap failed or places to read on kernel debugging or
pointers to a better list to which to send this.
Thanks
Christopher



From owner-freebsd-hackers@freebsd.org  Tue Apr 11 12:37:29 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 63AAFD39C68
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Tue, 11 Apr 2017 12:37:29 +0000 (UTC)
 (envelope-from f.v.anton@gmail.com)
Received: from mail-wm0-x230.google.com (mail-wm0-x230.google.com
 [IPv6:2a00:1450:400c:c09::230])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id F10FAB0D
 for <freebsd-hackers@freebsd.org>; Tue, 11 Apr 2017 12:37:28 +0000 (UTC)
 (envelope-from f.v.anton@gmail.com)
Received: by mail-wm0-x230.google.com with SMTP id u2so59946975wmu.0
 for <freebsd-hackers@freebsd.org>; Tue, 11 Apr 2017 05:37:28 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=mime-version:from:date:message-id:subject:to
 :content-transfer-encoding;
 bh=WvxxcZgDNDwMwhMhgwSXfFh/eyysDmXlZLA+Jvla+WA=;
 b=dMG5l+LKqlY/9Ko17ZNKSCdJrd8Ok6RJhFNdML6xnBrfnaY0J61BRSqzozDnO2QYm+
 SurW/z1jg9rPoMlt4EKXzQAE7CukQXVreiYkATswU4fD1wJz7bwoj4UrH9WfyLVGJst4
 xP4VbN676hKqZgAzwzFqgl9/4bX1fx55a6Ds3s0xeLcYmJ3+pnibiRobp8tiMJwf96lc
 LEYvi8qGcmznOmXJKbNfWem4GXQC4RzriJaYeHwtFZdR1auoWlnUjhWVsO5RppOIJRcQ
 Ae4LepPggA3YLURLo1XubefbzakNqtSlXYBu9JyY8ZT9n+G3KkGp7hW5LHuzlad2K++3
 HyHw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:mime-version:from:date:message-id:subject:to
 :content-transfer-encoding;
 bh=WvxxcZgDNDwMwhMhgwSXfFh/eyysDmXlZLA+Jvla+WA=;
 b=ZeBW43nXvZLvxI93OCD9+BsohTC9WU1oUKUdSaBgEi/xHk+xmW/iAUvgUHbMh7dh5O
 5OPnmLBonRrWA8ISD6O+x2VNjW1yeNgVGh7s99CT1Ts6GeGKZd5NmEqbEhy2IqPQkDJl
 Gy++UKS3HDcrBIDUNxrFIDXBkQSclrufWU+qJtUttCYYVZGxtkdaOALz1vxJR74EhKuV
 Lq10q7Y/nkKYmPHIkfNArU7uGdrVWP2WpkPNdHHM3yagDwvB8tTULJ25R2qRCOaZhbo1
 0xayrK+GV+oDiUdzV5Q1SQtqnecHe3ADnjUESQP4T5/XaX7sfi9/vG+uYCR6Fw9kpoZT
 Xn9w==
X-Gm-Message-State: AN3rC/5OCneQ2ZNdIi5Za2WBDaqLHsmkqoBk1kE3SGt6HFAvXVEdJq7TUHgez+/M+yV1rLGlcYx+9wpB05w5xA==
X-Received: by 10.28.72.67 with SMTP id v64mr14580261wma.98.1491914246611;
 Tue, 11 Apr 2017 05:37:26 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.223.178.10 with HTTP; Tue, 11 Apr 2017 05:37:26 -0700 (PDT)
From: Flavius Anton <f.v.anton@gmail.com>
Date: Tue, 11 Apr 2017 15:37:26 +0300
Message-ID: <CANXdjjYajtvWK+q3OK4j5uPFR4sVUrhrQD8zZSpoJ1hwZhVS5Q@mail.gmail.com>
Subject: On COW memory mapping in d_mmap_single
To: freebsd-hackers@freebsd.org
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 11 Apr 2017 12:37:29 -0000

Hi everyone,

I'll start by giving some context, so you can better understand the
problem I'm trying to solve. I’ve been working for a while on
bhyve trying to implement save/restore [1]. We've currently managed to
get it working for VMs using a ramdisk and no devices, so just vCPU
and memory states are saved and restored so far.

Last week I started looking into network devices, specifically
virtio-net devices. The problem was that when I issue a checkpoint
operation, the guest virtio driver stops working. After digging for a
while, I figured out the problem is marking VM memory as COW. If I
don't do this, the driver continues with no problem after
checkpointing.

Each VM has an associated vmspace and a /dev/vmm/VM_NAME device. When
user space does an mmap on the /dev device, we would like to mark
VM memory as COW, so that the VM can continue touching pages while
user space is writing the 'frozen', COW-marked memory to persistent
storage. We do this by iterating through all vm_entries in the VM's
vmspace, finding the entry that maps the object holding VM memory,
and then roughly just setting MAP_ENTRY_COW and MAP_ENTRY_NEEDS_COPY on
that entry. You can see the code here [2].

I'm not sure if the above is sufficient for our purpose. In other
words, how would you do this? You have a vm_object that is referenced
via a vm_entry by process A (the user space). Somebody else, process B
let's say, does an mmap() on your device and you'd like to freeze that
object, such that process B can see a consistent snapshot of it, while
you want process A to be able to continue reading and writing from/to
it.
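
To make the desired semantics concrete, here is a small user-space
sketch of a "frozen snapshot while the original keeps running". It uses
a different technique from the one described above: plain fork(2) on
ordinary anonymous memory, where the child gets a copy-on-write view and
the parent carries on writing. It does not touch vm_map entries and
says nothing about whether the same thing can work for the objects
behind /dev/vmm. The function name and the pipe handshake are made up
for the example.

```c
#include <sys/types.h>
#include <sys/wait.h>

#include <assert.h>
#include <unistd.h>

/* Stands in for "guest memory" in this toy example. */
static volatile int cell;

/*
 * Returns 1 if the child's snapshot still holds `before` after the
 * parent has written `after`, 0 if it does not, and -1 on error.
 */
int
child_snapshot_is_frozen(int before, int after)
{
	int pipefd[2], status;
	pid_t pid;
	char go = 'x';

	assert(before != after);
	cell = before;
	if (pipe(pipefd) != 0)
		return (-1);
	pid = fork();			/* child gets a COW view of `cell` */
	if (pid < 0)
		return (-1);
	if (pid == 0) {
		close(pipefd[1]);
		/* Wait until the parent has modified its own copy. */
		if (read(pipefd[0], &go, 1) != 1)
			_exit(2);
		_exit(cell == before ? 0 : 1);
	}
	close(pipefd[0]);
	cell = after;			/* the "VM" keeps writing */
	if (write(pipefd[1], &go, 1) != 1) {
		close(pipefd[1]);
		(void)waitpid(pid, &status, 0);
		return (-1);
	}
	close(pipefd[1]);
	if (waitpid(pid, &status, 0) < 0)
		return (-1);
	if (!WIFEXITED(status) || WEXITSTATUS(status) > 1)
		return (-1);
	return (WEXITSTATUS(status) == 0 ? 1 : 0);
}
```

Calling child_snapshot_is_frozen(1, 2) is expected to return 1: the
parent's write triggers a private copy, so the child keeps seeing the
old value.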

I've also read through Design Elements of the FreeBSD VM system [3],
but I am still afraid (I am sure) that I have some misunderstandings.

Thank you very much for bearing with me and going through this wall of text.

--
Flavius

[1] https://github.com/flaviusanton/freebsd/tree/bhyve-save-restore
[2] https://github.com/flaviusanton/freebsd/blob/bhyve-save-restore/sys/amd64/vmm/vmm_dev.c#L862
[3] https://www.freebsd.org/doc/en/articles/vm-design/index.html

From owner-freebsd-hackers@freebsd.org  Tue Apr 11 13:00:21 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8E65FD394EB
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Tue, 11 Apr 2017 13:00:21 +0000 (UTC)
 (envelope-from kostikbel@gmail.com)
Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 1B749A83
 for <freebsd-hackers@freebsd.org>; Tue, 11 Apr 2017 13:00:20 +0000 (UTC)
 (envelope-from kostikbel@gmail.com)
Received: from tom.home (kib@localhost [127.0.0.1])
 by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id v3BD0CKo090298
 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO);
 Tue, 11 Apr 2017 16:00:12 +0300 (EEST)
 (envelope-from kostikbel@gmail.com)
DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua v3BD0CKo090298
Received: (from kostik@localhost)
 by tom.home (8.15.2/8.15.2/Submit) id v3BD0C6a090297;
 Tue, 11 Apr 2017 16:00:12 +0300 (EEST)
 (envelope-from kostikbel@gmail.com)
X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com
 using -f
Date: Tue, 11 Apr 2017 16:00:12 +0300
From: Konstantin Belousov <kostikbel@gmail.com>
To: Flavius Anton <f.v.anton@gmail.com>
Cc: freebsd-hackers@freebsd.org
Subject: Re: On COW memory mapping in d_mmap_single
Message-ID: <20170411130012.GQ1788@kib.kiev.ua>
References: <CANXdjjYajtvWK+q3OK4j5uPFR4sVUrhrQD8zZSpoJ1hwZhVS5Q@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <CANXdjjYajtvWK+q3OK4j5uPFR4sVUrhrQD8zZSpoJ1hwZhVS5Q@mail.gmail.com>
User-Agent: Mutt/1.8.0 (2017-02-23)
X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00,
 DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no
 autolearn_force=no version=3.4.1
X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 11 Apr 2017 13:00:21 -0000

On Tue, Apr 11, 2017 at 03:37:26PM +0300, Flavius Anton wrote:
> Hi everyone,
> 
> I'll start by giving some context, so you can better understand the
> problem I'm trying to solve. I've been working for a while on
> bhyve trying to implement save/restore [1]. We've currently managed to
> get it working for VMs using a ramdisk and no devices, so just vCPU
> and memory states are saved and restored so far.
> 
> Last week I started looking into network devices, specifically
> virtio-net devices. The problem was that when I issue a checkpoint
> operation, the guest virtio driver stops working. After digging for a
> while, I figured out the problem is marking VM memory as COW. If I
> don't do this, the driver continues with no problem after
> checkpointing.
> 
> Each VM has an associated vmspace and a /dev/vmm/VM_NAME device. When
> user space does an mmap on the /dev device, we would like to mark
> VM memory as COW, so that the VM can continue touching pages while
> user space is writing the 'frozen', COW-marked memory to persistent
> storage. We do this by iterating through all vm_entries in the VM's
> vmspace, finding the entry that maps the object holding VM memory,
> and then roughly just setting MAP_ENTRY_COW and MAP_ENTRY_NEEDS_COPY on
> that entry. You can see the code here [2].
This is a very strange operation, to put it mildly.  First, do other vCPUs
operate while you do your 'COW'?  If yes, you are guaranteed to get
an inconsistent snapshot.  If not, then you do not need 'COW'.

Moreover, what kinds of VM objects are mapped into the vmspace? The FreeBSD
VM does not support shadowing of device objects (which means that inserting
shadow objects into the device object chain breaks VM invariants). One
of the main reasons why this does not need to be supported is that a shadow
copy cannot see changes which are performed on the shadowed pages,
supposedly by the device. If vmm mmaps some devices into the guest vmspace,
the devices would kind of 'freeze' from the guest's PoV.

Next, how do you undo the damage done by your 'COW' ?

> I'm not sure if the above is sufficient for our purpose. In other
> words, how would you do this? You have a vm_object that is referenced
> via a vm_entry by process A (the user space). Somebody else, process B
> let's say, does an mmap() on your device and you'd like to freeze that
> object, such that process B can see a consistent snapshot of it, while
> you want process A to be able to continue reading and writing from/to
> it.
This is not supported. I have no idea why a copy of a page which
reflects the device state would even be considered a good idea. But you
cannot make a consistent copy without device cooperation anyway, since the
device might modify its state while the CPU reads.

> 
> I've also read through Design Elements of the FreeBSD VM system [3],
> but I am still afraid (I am sure) that I have some misunderstandings.
> 
> Thank you very much for bearing with me and going through this wall of text.
> 
> --
> Flavius
> 
> [1] https://github.com/flaviusanton/freebsd/tree/bhyve-save-restore
> [2] https://github.com/flaviusanton/freebsd/blob/bhyve-save-restore/sys/amd64/vmm/vmm_dev.c#L862
> [3] https://www.freebsd.org/doc/en/articles/vm-design/index.html

From owner-freebsd-hackers@freebsd.org  Tue Apr 11 13:42:46 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 207B6D39747
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Tue, 11 Apr 2017 13:42:46 +0000 (UTC)
 (envelope-from freebsd-listen@fabiankeil.de)
Received: from smtprelay01.ispgateway.de (smtprelay01.ispgateway.de
 [80.67.31.28])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id DE130C65
 for <freebsd-hackers@freebsd.org>; Tue, 11 Apr 2017 13:42:45 +0000 (UTC)
 (envelope-from freebsd-listen@fabiankeil.de)
Received: from [78.35.167.42] (helo=fabiankeil.de)
 by smtprelay01.ispgateway.de with esmtpsa (TLSv1.2:AES256-GCM-SHA384:256)
 (Exim 4.84) (envelope-from <freebsd-listen@fabiankeil.de>)
 id 1cxvfK-0001wT-NY; Tue, 11 Apr 2017 15:17:02 +0200
Date: Tue, 11 Apr 2017 15:14:26 +0200
From: Fabian Keil <freebsd-listen@fabiankeil.de>
To: Christopher Bowman <crb@chrisbowman.com>
Cc: freebsd-hackers@freebsd.org
Subject: Re: Dtrace oddity
Message-ID: <20170411151426.3b760182@fabiankeil.de>
In-Reply-To: <CD5E9B03-6147-4E4D-BED6-6C45022051E3@chrisbowman.com>
References: <CD5E9B03-6147-4E4D-BED6-6C45022051E3@chrisbowman.com>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
 boundary="Sig_/Wjcw_RzPb68pzBvBRYkVM4L"; protocol="application/pgp-signature"
X-Df-Sender: Nzc1MDY3
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 11 Apr 2017 13:42:46 -0000

--Sig_/Wjcw_RzPb68pzBvBRYkVM4L
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

Christopher Bowman <crb@chrisbowman.com> wrote:

> The man page lists a bunch of reasons for EINVAL so I want to
> investigate this and I don’t quite know good strategies to debug the
> kernel (yet) so I thought I’d experiment with Dtrace a bit.  Here is the
> oddity: when I run Dtrace and then run my test program I get the
> following output from Dtrace:
>
> crb@retread:60> dtrace -n 'syscall:freebsd:mmap:entry /execname == "test"/ {}'
> dtrace: description 'syscall:freebsd:mmap:entry ' matched 1 probe
> CPU     ID                    FUNCTION:NAME
>   0  63401                       mmap:entry
>   0  63401                       mmap:entry
>   0  63401                       mmap:entry
>   0  63401                       mmap:entry
>   0  63401                       mmap:entry
>   0  63401                       mmap:entry
>   0  63401                       mmap:entry
>   0  63401                       mmap:entry
>   0  63401                       mmap:entry
>   0  63401                       mmap:entry
>   0  63401                       mmap:entry
>   0  63401                       mmap:entry
>
> I think Dtrace is indicating that the mmap syscall was called 12 times
> by my test program, yet I can't see how the program below would have done
> that.

A bunch of mmap syscalls occur before main is even entered.
Try running your program with truss to see what's going on.

> Here is my program:
[...]
> 	printf("opening device %s\n", argv[1]);
> 	int device = open (argv[1], O_RDWR);
> 	if (device == 0) {

You should check for -1 here.

> 	void *pa = mmap (0, 4095, PROT_READ | PROT_WRITE, 0, device, 0);

No flags? From the mmap man page:

|     [EINVAL]           None of MAP_ANON, MAP_PRIVATE, MAP_SHARED, or
|                        MAP_STACK was specified.  At least one of these flags
|                        must be included.
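
Putting the two points above together, a minimal sketch of the open and
mmap steps might look like the following. The helper name map_device is
made up, and MAP_SHARED is an assumption: it is the usual choice for
device memory, but whether this particular driver accepts it has not
been checked here.

```c
#include <sys/mman.h>

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Open `path` read/write and map `len` bytes of it.
 * Returns the mapping, or NULL after printing the reason for failure.
 */
void *
map_device(const char *path, size_t len)
{
	void *pa;
	int fd;

	assert(path != NULL);
	fd = open(path, O_RDWR);
	if (fd == -1) {			/* open() signals failure with -1, not 0 */
		fprintf(stderr, "open of %s failed: %s\n", path,
		    strerror(errno));
		return (NULL);
	}
	/* One of MAP_SHARED, MAP_PRIVATE, ... has to be given. */
	pa = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (pa == MAP_FAILED) {
		fprintf(stderr, "mmap of %s failed: %s\n", path,
		    strerror(errno));
		pa = NULL;
	}
	close(fd);			/* an established mapping survives close() */
	return (pa);
}
```

Using strerror(errno) instead of a hand-written switch also covers
error codes the original switch did not list.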

Fabian

--Sig_/Wjcw_RzPb68pzBvBRYkVM4L
Content-Type: application/pgp-signature
Content-Description: OpenPGP digital signature

-----BEGIN PGP SIGNATURE-----

iF0EARECAB0WIQTKUNd6H/m3+ByGULIFiohV/3dUnQUCWOzWswAKCRAFiohV/3dU
naR9AKC88uaGiPliml1AEINPpCMkoYMAWQCfSPsCr/Gj/fo9J+0zFGmy+EYYvXU=
=JFvI
-----END PGP SIGNATURE-----

--Sig_/Wjcw_RzPb68pzBvBRYkVM4L--

From owner-freebsd-hackers@freebsd.org  Tue Apr 11 13:55:04 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 50646D39B13
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Tue, 11 Apr 2017 13:55:04 +0000 (UTC)
 (envelope-from f.v.anton@gmail.com)
Received: from mail-wm0-x235.google.com (mail-wm0-x235.google.com
 [IPv6:2a00:1450:400c:c09::235])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id CEA683E3
 for <freebsd-hackers@freebsd.org>; Tue, 11 Apr 2017 13:55:03 +0000 (UTC)
 (envelope-from f.v.anton@gmail.com)
Received: by mail-wm0-x235.google.com with SMTP id y18so13906783wmh.0
 for <freebsd-hackers@freebsd.org>; Tue, 11 Apr 2017 06:55:03 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=mime-version:in-reply-to:references:from:date:message-id:subject:to;
 bh=u7sLiW9JSsAD/x7usLCGROHZF40bWzgGIItUmh5O1HY=;
 b=o3paD3Nd7QOOJmGpfSMRy65xrn+TxyX3RPgDLb454ISBnJ0rPf7LxsRPCmkrZFX0t8
 j1lRfShzjuRxcsgf85hBixS434YMO/7SboyWBmVLE9b5TEmB5eJ+uo1fr9I5Ee5nznR+
 1au54a4AJx/pkBe1glxv9mxBt/diCgseLfnLqOxNv4TkAiqcmOP70Csv7oQqdusPqKHj
 ZMHGyhKigxbQeBo/RApnZDLIdJMYgxR/MY6g25mk0vCfn2tMVYromaYtK590GatX6OxW
 o49yhVD/+6/R+6LQ0PHK3oFBj//wFmhC+KZPYh3nNhmxQXQMnqGxi06HAZ3km6wTEnKQ
 hs7Q==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:mime-version:in-reply-to:references:from:date
 :message-id:subject:to;
 bh=u7sLiW9JSsAD/x7usLCGROHZF40bWzgGIItUmh5O1HY=;
 b=gr9OH4517osGZ85W/sGn0J7GXSh3fzopR/fHAzG56XnlLthtPS1Btlv1bPmIaZ6PQC
 3jBaSuev32s8X/Lb5nifouMh1MPa6a5lnlU1SpH4BbRAeqToPn6OkA7TngbMBvlk81aF
 pEsBG11wkZ6D/Xd2RIl4Fk2c241r8SECHC4cW2hc8lfdiTfZ/jPw0CgQTFGMNnZBln9J
 kW3t0GKeWEs6gCi+uTQdo+oXnfksyefEpV/b5YBGJ0+ZfIKYTO/k1xETyxBxVoi0zNdg
 HpX0WLXEkqDjoVgSuIieUEc4YbcVuEHAKcji5bRpJkW0IY+u6OpHnQkHerZNjqgOPAMy
 7oDw==
X-Gm-Message-State: AN3rC/4prRnTdApqPJn/CuCUbLK35xyK2FvM9mGodqCzPvK13s+0bnJzwKvwuCe7FLfBQwNUB6lxerSDdgi2ag==
X-Received: by 10.28.7.144 with SMTP id 138mr15014079wmh.125.1491918900958;
 Tue, 11 Apr 2017 06:55:00 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.223.178.10 with HTTP; Tue, 11 Apr 2017 06:55:00 -0700 (PDT)
In-Reply-To: <CANXdjjYajtvWK+q3OK4j5uPFR4sVUrhrQD8zZSpoJ1hwZhVS5Q@mail.gmail.com>
References: <CANXdjjYajtvWK+q3OK4j5uPFR4sVUrhrQD8zZSpoJ1hwZhVS5Q@mail.gmail.com>
From: Flavius Anton <f.v.anton@gmail.com>
Date: Tue, 11 Apr 2017 16:55:00 +0300
Message-ID: <CANXdjjZrjxhbqhZ13sAuZP7cqpvYU8CJusQ2NEpGuRCVMgr0=g@mail.gmail.com>
Subject: Re: On COW memory mapping in d_mmap_single
To: freebsd-hackers@freebsd.org
Content-Type: text/plain; charset=UTF-8
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 11 Apr 2017 13:55:04 -0000

>On Tue, Apr 11, 2017 at 04:00:21PM +0300, Konstantin Belousov wrote:
>>On Tue, Apr 11, 2017 at 03:37:26PM +0300, Flavius Anton wrote:
>> Hi everyone,
>>
>> I'll start by giving some context, so you can better understand what
>> is the problem I'm trying to solve. I've been working for a while on
>> bhyve trying to implement save/restore [1]. We've currently managed to
>> get it working for VMs using a ramdisk and no devices, so just vCPU
>> and memory states are saved and restored so far.
>>
>> Last week I started looking into network devices, specifically
>> virtio-net devices. The problem was that when I issue a checkpoint
>> operation, the guest virtio driver stops working. After digging for a
>> while, I figured out the problem is marking VM memory as COW. If I
>> don't do this, the driver continues with no problem after
>> checkpointing.
>>
>> Each VM has an associated vmspace and a /dev/vmm/VM_NAME device. When
>> the user space does a mmap on the /dev device, we would like to mark
>> VM memory as COW, thus the VM can continue touching pages while the
>> user space is writing the 'freezed', COW marked memory to a persistent
>> storage. We do this by iterating through all vm_entries from VM's
>> vmspace, we find which entry is mapping the object that has VM memory
>> and then we roughly just set MAP_ENTRY_COW and MAP_ENTRY_NEEDS_COPY on
>> that entry. You can see the code here [2].
>
>This is very strange operation, to put it mildly.  First, are other vCPUs
>operate while you do your 'COW' ?  If yes, you are guaranteed to get
>inconsistent snapshot.  If not, then you do not need 'COW'.

Yes, all vCPUs are locked before calling mmap(). I agree that we don't
need 'COW', as long as we keep all vCPUs locked while we copy the
entire VM memory. But this might take a while: imagine a VM with 32GB
or more of RAM. Writing that much memory to disk could take minutes, and
we don't want the VM to be frozen for that long. That's the
reason we'd like to map the memory COW and then unlock the vCPUs.
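
The intended behaviour (a frozen view for the reader while the writer
keeps running) can be illustrated with a userland analogy. This is only a
sketch: it is not the bhyve code and says nothing about how vmm should do
this in the kernel. It relies on fork(), which gives the child a private
view of the parent's private memory (typically implemented with
copy-on-write). The function name cow_snapshot_demo() and the 0xAA/0x55 fill patterns are
hypothetical, chosen for illustration. The parent plays the running guest
and the child plays the process that writes out the checkpoint.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demo, userland only.
 * Returns 1 if the child still saw the pre-fork contents after the
 * parent overwrote the buffer, 0 if not, and -1 on setup errors.
 */
int
cow_snapshot_demo(size_t len)
{
	unsigned char *mem;
	int go[2], status, parent_ok;
	ssize_t wrote;
	pid_t pid;
	size_t i;
	char c = 'x';

	if (len == 0 || (mem = malloc(len)) == NULL)
		return (-1);
	memset(mem, 0xAA, len);		/* state at "checkpoint" time */

	if (pipe(go) == -1) {
		free(mem);
		return (-1);
	}
	pid = fork();			/* child gets a COW view of mem */
	if (pid == -1) {
		close(go[0]);
		close(go[1]);
		free(mem);
		return (-1);
	}
	if (pid == 0) {
		/* Child: wait until the parent has dirtied the buffer. */
		close(go[1]);
		if (read(go[0], &c, 1) != 1)
			_exit(2);
		/* The snapshot must still hold the pre-fork pattern. */
		for (i = 0; i < len; i++)
			if (mem[i] != 0xAA)
				_exit(1);
		_exit(0);
	}

	/* Parent: the "guest" keeps running and writing. */
	close(go[0]);
	memset(mem, 0x55, len);
	wrote = write(go[1], &c, 1);	/* tell the child to verify */
	close(go[1]);

	if (waitpid(pid, &status, 0) == -1) {
		free(mem);
		return (-1);
	}
	parent_ok = (mem[0] == 0x55 && mem[len - 1] == 0x55);
	free(mem);
	return (wrote == 1 && parent_ok &&
	    WIFEXITED(status) && WEXITSTATUS(status) == 0);
}
```

Calling cow_snapshot_demo(1 << 20) from a small main() should return 1 on
a system where fork() behaves as POSIX describes. Note that the analogy
breaks down for memory that is shared rather than private, which is part
of what makes the in-kernel case harder.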

>More, what kinds of VM objects are mapped into the vmspace ? FreeBSD VM
>does not support shadowing of device objects (which means, inserting
>shadow objects into the device object chain breaks VM invariants). One
>of the main reasons why it not needed to be supported is because shadow
>copy cannot see changes which are performed on the shadowed pages,
>supposedly done by device. If vmm mmaps some devices into guest vmspace,
>the devices would kind of 'freeze' from the guest PoV.

It's an OBJT_DEFAULT object. It's not a device object; it's the memory
object given to the guest to use as physical memory.

>Next, how do you undo the damage done by your 'COW' ?

This is one thing that we've thought about, but we don't have a
solution for now. I agree it is very important, though. I figured that
it might be possible to 'unmark' the memory object as COW with some
additional tricks.

>> I'm not sure if the above is sufficient for our purpose. In other
>> words, how would you do this? You have a vm_object that is referenced
>> via a vm_entry by process A (the user space). Somebody else, process B
>> let's say, does an mmap() on your device and you'd like to freeze that
>> object, such that process B can see a consistent snapshot of it, while
>> you want process A to be able to continue reading and writing from/to
>> it.
>This is not supported. I have no idea why would a copy of a page which
>reflects the device state even considered as a good idea. But you cannot
>make the consistent copy without device cooperation anyway, since device
>might modify its state while CPU reads.

I'm sorry if I wasn't clear enough. The object that I'm trying to
map as COW is not a device object. It's just the object that contains
VM memory. That object shouldn't change if all VM vCPUs are locked, and
I make sure they are when calling mmap().

Thanks for your input on this.

--
Flavius

>> I've also read through Design Elements of the FreeBSD VM system [3],
>> but I am still afraid (I am sure) that I have some misunderstandings.
>>
>> Thank you very much for bearing with me and going through this wall of text.
>>
>> [1] https://github.com/flaviusanton/freebsd/tree/bhyve-save-restore
>> [2] https://github.com/flaviusanton/freebsd/blob/bhyve-save-restore/sys/amd64/vmm/vmm_dev.c#L862
>> [3] https://www.freebsd.org/doc/en/articles/vm-design/index.html

From owner-freebsd-hackers@freebsd.org  Tue Apr 11 14:30:13 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 1ED72D3986F
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Tue, 11 Apr 2017 14:30:13 +0000 (UTC)
 (envelope-from kostikbel@gmail.com)
Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id BAB4E13B
 for <freebsd-hackers@freebsd.org>; Tue, 11 Apr 2017 14:30:12 +0000 (UTC)
 (envelope-from kostikbel@gmail.com)
Received: from tom.home (kib@localhost [127.0.0.1])
 by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id v3BEU3V4010382
 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO);
 Tue, 11 Apr 2017 17:30:03 +0300 (EEST)
 (envelope-from kostikbel@gmail.com)
DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua v3BEU3V4010382
Received: (from kostik@localhost)
 by tom.home (8.15.2/8.15.2/Submit) id v3BEU31S010380;
 Tue, 11 Apr 2017 17:30:03 +0300 (EEST)
 (envelope-from kostikbel@gmail.com)
X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com
 using -f
Date: Tue, 11 Apr 2017 17:30:03 +0300
From: Konstantin Belousov <kostikbel@gmail.com>
To: Flavius Anton <f.v.anton@gmail.com>
Cc: freebsd-hackers@freebsd.org
Subject: Re: On COW memory mapping in d_mmap_single
Message-ID: <20170411143003.GT1788@kib.kiev.ua>
References: <CANXdjjYajtvWK+q3OK4j5uPFR4sVUrhrQD8zZSpoJ1hwZhVS5Q@mail.gmail.com>
 <CANXdjjZrjxhbqhZ13sAuZP7cqpvYU8CJusQ2NEpGuRCVMgr0=g@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <CANXdjjZrjxhbqhZ13sAuZP7cqpvYU8CJusQ2NEpGuRCVMgr0=g@mail.gmail.com>
User-Agent: Mutt/1.8.0 (2017-02-23)
X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00,
 DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no
 autolearn_force=no version=3.4.1
X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 11 Apr 2017 14:30:13 -0000

On Tue, Apr 11, 2017 at 04:55:00PM +0300, Flavius Anton wrote:
> >On Tue, Apr 11, 2017 at 04:00:21PM +0300, Konstantin Belousov wrote:
> >>On Tue, Apr 11, 2017 at 03:37:26PM +0300, Flavius Anton wrote:
> >> Hi everyone,
> >>
> >> I'll start by giving some context, so you can better understand what
> >> is the problem I'm trying to solve. I've been working for a while on
> >> bhyve trying to implement save/restore [1]. We've currently managed to
> >> get it working for VMs using a ramdisk and no devices, so just vCPU
> >> and memory states are saved and restored so far.
> >>
> >> Last week I started looking into network devices, specifically
> >> virtio-net devices. The problem was that when I issue a checkpoint
> >> operation, the guest virtio driver stops working. After digging for a
> >> while, I figured out the problem is marking VM memory as COW. If I
> >> don't do this, the driver continues with no problem after
> >> checkpointing.
> >>
> >> Each VM has an associated vmspace and a /dev/vmm/VM_NAME device. When
> >> the user space does a mmap on the /dev device, we would like to mark
> >> VM memory as COW, thus the VM can continue touching pages while the
> >> user space is writing the 'freezed', COW marked memory to a persistent
> >> storage. We do this by iterating through all vm_entries from VM's
> >> vmspace, we find which entry is mapping the object that has VM memory
> >> and then we roughly just set MAP_ENTRY_COW and MAP_ENTRY_NEEDS_COPY on
> >> that entry. You can see the code here [2].
> >
> >This is very strange operation, to put it mildly.  First, are other vCPUs
> >operate while you do your 'COW' ?  If yes, you are guaranteed to get
> >inconsistent snapshot.  If not, then you do not need 'COW'.
> 
> Yes, all vCPUs are locked before calling mmap(). I agree that we don't
> need 'COW', as long as we keep all vCPUs locked while we copy the
> entire VM memory. But this might take a while, imagine a VM with 32GB
> or more of RAM. This will take maybe minutes to write to disk, so we
> don't actually want the VM to be freezed for so long. That's the
> reason we'd like to map the memory COW and then unlock vCPUs.
> 
> >More, what kinds of VM objects are mapped into the vmspace ? FreeBSD VM
> >does not support shadowing of device objects (which means, inserting
> >shadow objects into the device object chain breaks VM invariants). One
> >of the main reasons why it not needed to be supported is because shadow
> >copy cannot see changes which are performed on the shadowed pages,
> >supposedly done by device. If vmm mmaps some devices into guest vmspace,
> >the devices would kind of 'freeze' from the guest PoV.
> 
> It's a OBJT_DEFAULT. It's not a device object, it's the memory object
> given to guest to use as physical memory.
Perhaps add asserts that you only shadow default/swap/vnode objects.
Then you will see if the issue is what I noted above, or not.

> 
> >Next, how do you undo the damage done by your 'COW' ?
> 
> This is one thing that we've thought about, but we don't have a
> solution for now. I agree it is very important, though. I figured that
> it might be possible to 'unmark' the memory object as COW with some
> additional tricks.
You might consider using vm_object_collapse().

From owner-freebsd-hackers@freebsd.org  Tue Apr 11 17:15:44 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 20CD4D3AD63
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Tue, 11 Apr 2017 17:15:44 +0000 (UTC)
 (envelope-from ablacktshirt@gmail.com)
Received: from mail-pf0-x242.google.com (mail-pf0-x242.google.com
 [IPv6:2607:f8b0:400e:c00::242])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id E4578F01
 for <freebsd-hackers@freebsd.org>; Tue, 11 Apr 2017 17:15:43 +0000 (UTC)
 (envelope-from ablacktshirt@gmail.com)
Received: by mail-pf0-x242.google.com with SMTP id o126so553844pfb.1
 for <freebsd-hackers@freebsd.org>; Tue, 11 Apr 2017 10:15:43 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=subject:to:references:cc:from:message-id:date:user-agent
 :mime-version:in-reply-to:content-transfer-encoding;
 bh=ovrJSqlRWmu3DimgzZ0CfywoGCZcEcUX70IYYI41pg0=;
 b=nKoeaKSTcgqgsIODVW/jfh0qH+8dPKqF6+fqPkGDr5LiSC2GPelJW55Az7aQsMfpHs
 8YKkUtwA+n8USAeq9hCvYwg1VFZWECvW8Hrh932XKay2mrYRnNEGXZgDqQsdTvn68XOF
 39Pd80+6YWl+idLWvF9O4Y/uuf5lFt9J9POKGZg0wTdwDY/UXqwSuXzJrSiNzhgkvtFi
 xiTxdbhVxyfago2oMgJNOp3738zmSzTbVWAhSiezWPwKU10kVqPTs/vZh8zzxP5nx0la
 omb2r/vbHICWqWXf1KPnUJy8dySMRVrpuytw1uo9GCL2foitYMmUEpSjm6owyFXsVnPU
 WCsA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:subject:to:references:cc:from:message-id:date
 :user-agent:mime-version:in-reply-to:content-transfer-encoding;
 bh=ovrJSqlRWmu3DimgzZ0CfywoGCZcEcUX70IYYI41pg0=;
 b=oQwTWrQ6Jz7o//joCdz4IOjcemni1ZYtlMgoHEPCqY6ysxs1sYo0c6nrAP7RKmxMqJ
 SXjkjorDme55rdUtC3vuiP9sJ77o9ktP3a2n10VqzTVtPAXt1owJ/p/kLElewxZMT3uC
 xgw4w5ehbibj1D86wFBNZg60sKiYZbIYXNgZ1mFy/JMSbM29uk2b0UBZd2a7swfZHA0C
 uw3uMdMlqClaQ6lkKQ3VVQRJBOMcufGj0kQzSfWFRv09MEiizzD/vYMssRI1P9DEqaQl
 Q8JGhditGBfDMt3GMvhoixmp1J3NBCXh/apRTmRjIthqC9Yh9BfXt5CFZGXiwj4qIQmW
 N7Jw==
X-Gm-Message-State: AFeK/H32wSbxEjrcvIwOAg/Zy92SUe2bGVCY9dfnXiWs4+QXarFQD9rd1E+XZM/a1/340Q==
X-Received: by 10.84.218.68 with SMTP id f4mr76690785plm.146.1491930943443;
 Tue, 11 Apr 2017 10:15:43 -0700 (PDT)
Received: from [192.168.0.100] ([110.64.91.54])
 by smtp.gmail.com with ESMTPSA id 133sm25559138pfy.106.2017.04.11.10.15.39
 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Tue, 11 Apr 2017 10:15:42 -0700 (PDT)
Subject: Re: Understanding the FreeBSD locking mechanism
To: Chris Torek <torek@elf.torek.net>, imp@bsdimp.com
References: <201704100426.v3A4QR9Q042761@elf.torek.net>
Cc: ed@nuxi.nl, freebsd-hackers@freebsd.org, rysto32@gmail.com
From: Yubin Ruan <ablacktshirt@gmail.com>
Message-ID: <4768e26a-cdec-6f40-1463-ece9847ca34d@gmail.com>
Date: Wed, 12 Apr 2017 01:15:34 +0800
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:45.0) Gecko/20100101
 Thunderbird/45.8.0
MIME-Version: 1.0
In-Reply-To: <201704100426.v3A4QR9Q042761@elf.torek.net>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 11 Apr 2017 17:15:44 -0000


Thanks for your reply. I have read your mails and your discussion with
Konstantin Belousov.

On 2017/4/10 12:26, Chris Torek wrote:
>>>> Is it true that a thread holding a MTX_DEF mutex can be descheduled?
>
>>> Yes, they can be descheduled. But that's not a problem. No other
>>> thread can acquire the MTX_DEF lock. ...
>
>> Does that imply that MTX_DEF should not be used in something like
>> interrupt handler? Putting an interrupt handler into sleep doesn't
>> make so much sense.
>
> Go back to the old top-half / bottom-half model, and consider that
> now that there are interrupt *threads*, your ithread is also in the
> "top half".  It's therefore OK to suspend.  ("Sleep" is not quite
> correct here: a mutex wait is not a "sleep" state but instead is
> just a waiting, not-scheduled-to-run state.  The precise difference
> is irrelevant at this level though.)

I don't truly understand the "top-half/bottom-half" model you proposed,
but I think I get the idea of how things work now. Basically, we can
assume that if a thread is in the "bottom-half", then it should never
suspend (or, in other words, be preempted). This is the case of the
"interrupt filter" in FreeBSD. On the other hand, if a thread is in the
"top-half", then it is safe to suspend/block. This is the case of the
"ithread".

The difference between the "ithread" and "interrupt filter" things is
that ithread has its own thread context, while interrupt handling 
through interrupt filter shares the same kernel stack.

So, for an ithread, we should use MTX_DEF, which doesn't disable
interrupts, and for an "interrupt filter", we should use MTX_SPIN, which
disables interrupts.

What really confuses me is that I don't see how owning an
"independent" thread context (i.e. an ithread) makes a thread run in the
"top-half" and how sharing the same kernel stack makes a thread run in
the "bottom-half".

I did read your long explanation in the previous mail. For the non-SMP
case, the "top-half/bottom-half" model works well and I understand how
the *code* path/*data* path things go. But I still cannot fully
understand the model for the SMP case. Maybe you can draw something like

     -----                     -----
     |   |<-- top-half         |   | <-- top-half
     |   |                     |   |
     |   |                     |   |
     |   |                     |   |
     |   |<-- bottom-half      |   | <-- bottom-half
     -----                     -----
      CPU1                     CPU2

to make things less abstract.

Thanks,
Yubin Ruan

> It's not *great* to suspend here, but all your alternatives are
> *also* bad:
>
>  * You may grab incoming data and stuff it into a ring buffer, and
>    schedule some other thread to handle it later.  But if the ring
>    buffer is full you have a problem, and all you have done is push
>    the actual processing off to another thread, adding more overhead.
>
>  * You may put the device itself on hold so that no more data can
>    come in (if it's that kind of device).
>
> On the other hand, if you are handling an interrupt but not in an
> interrupt thread, you are running in the "bottom half".  It is
> therefore *not OK* to suspend.  You must now use one of those
> alternatives.
>
> Note that if you suspend on an MTX_DEF mutex, and your priority is
> *higher* than the priority of whatever thread actually holds that
> mutex now, that other thread gets a priority boost to your level
> (priority propagation, to prevent priority inversion).  So letting
> your ithread suspend, assuming you have an ithread, is probably your
> best bet.
>
> Chris
>
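
The ring-buffer alternative in the first bullet quoted above can be
sketched in plain C. This is a minimal, single-threaded illustration and
not FreeBSD kernel code: struct ring, ring_put(), ring_get() and
RING_SLOTS are hypothetical names, and a real driver would also need
atomics or memory barriers between the interrupt side and the thread
side. It shows the point made in that bullet: the producer never blocks,
so when the buffer is full the data has to be dropped.

```c
#include <stdint.h>

#define RING_SLOTS 8	/* hypothetical size; must be a power of two */

struct ring {
	uint8_t  data[RING_SLOTS];
	unsigned head;		/* total bytes written (producer side) */
	unsigned tail;		/* total bytes read (consumer side) */
	unsigned dropped;	/* bytes lost because the ring was full */
};

/*
 * "Bottom half" side: must not block.
 * Returns 1 if the byte was stored, 0 if the ring was full.
 */
int
ring_put(struct ring *r, uint8_t byte)
{
	if (r->head - r->tail == RING_SLOTS) {
		r->dropped++;	/* nowhere to put it and we cannot wait */
		return (0);
	}
	r->data[r->head % RING_SLOTS] = byte;
	r->head++;
	return (1);
}

/*
 * "Top half" side: runs later in a thread that is allowed to block.
 * Returns 1 and stores the oldest byte in *out, or 0 if the ring is empty.
 */
int
ring_get(struct ring *r, uint8_t *out)
{
	if (r->head == r->tail)
		return (0);
	*out = r->data[r->tail % RING_SLOTS];
	r->tail++;
	return (1);
}
```

With a zero-initialized struct ring, RING_SLOTS calls to ring_put()
succeed and the next one returns 0 and increments dropped, until
ring_get() frees a slot.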


From owner-freebsd-hackers@freebsd.org  Tue Apr 11 17:17:11 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id C9B67D3AEE9
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Tue, 11 Apr 2017 17:17:11 +0000 (UTC)
 (envelope-from ablacktshirt@gmail.com)
Received: from mail-pg0-x241.google.com (mail-pg0-x241.google.com
 [IPv6:2607:f8b0:400e:c05::241])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 98D0D1320
 for <freebsd-hackers@freebsd.org>; Tue, 11 Apr 2017 17:17:11 +0000 (UTC)
 (envelope-from ablacktshirt@gmail.com)
Received: by mail-pg0-x241.google.com with SMTP id o123so579717pga.1
 for <freebsd-hackers@freebsd.org>; Tue, 11 Apr 2017 10:17:11 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=subject:to:references:cc:from:message-id:date:user-agent
 :mime-version:in-reply-to:content-transfer-encoding;
 bh=S0ugCMLYngrYkykv/9b4twhqQF1hRsfGxiB1t9VdqkU=;
 b=VqL5kkpFzIb3aEh8SXgrf3YwpwUGlk2XIhNzgIVsJf/Fiz+DCLvro/kMOBxMbpi5I1
 X6GKEmFZupM2ZBeFbsGD1/QX6tpqF16MuKAIFrEoC0+j3otgPCR92auUIYHdA0mjRLsC
 jnM+VaaEVcxXoBmIRCz7zRl3Crbp7qrzqUaTf8e44V+2zRaL2LWqntqBu34aTGcKhZSI
 xvWOgRqbZfZmGnKF1M+ZnTsx0F+V16NDLvQCBgY3HvMMYCtJpjlMEQraUMIoWHykNfLB
 5lfH/4KKRJB6sV7CnSFrpAy+D1kAf7afOvIrF5/RiAxsE2GLj6g2nnfMGP5nWLEFQpoJ
 2rew==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:subject:to:references:cc:from:message-id:date
 :user-agent:mime-version:in-reply-to:content-transfer-encoding;
 bh=S0ugCMLYngrYkykv/9b4twhqQF1hRsfGxiB1t9VdqkU=;
 b=oFtWPWoJ2KvN3AZC/YUNDLi4LsPk0wC22PezqDIQcXlMtBLXzFIaW4Xfu0taw+He78
 s8SW3KwA+mRtqr4wiM/3RGI9FWjItyatPQzMk9+RrfERaLWrjnbUbtVfkqkLMgYmnyJs
 LnL3cAYpDH+oCXOJWnN893J6TEMoU/3ONOU5JJRnbwb+eBX94WLSrbTBXlG8SYwVh/DX
 +e01GWICsNMk57fquIrdF/qwxKMNbuTZC0ryRJAEhj7N64Ji4MS+ha2O2NObEcrAUY9h
 OODlRdWkS4IYGi+jjaQwZgNdsi1lIwyJ6PcYFL3TmkgkDqyqJpDRci7HIIfKXfZ0UGnV
 pWYA==
X-Gm-Message-State: AFeK/H2X85yYknyQkyGG8MiypSu5QDQM5rXH5DVrjAPfY4ZnImPqqmFz7DHH0pXMMcB9Nw==
X-Received: by 10.99.212.69 with SMTP id i5mr62137301pgj.36.1491931031107;
 Tue, 11 Apr 2017 10:17:11 -0700 (PDT)
Received: from [192.168.0.100] ([110.64.91.54])
 by smtp.gmail.com with ESMTPSA id v17sm31868381pgc.20.2017.04.11.10.17.07
 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Tue, 11 Apr 2017 10:17:09 -0700 (PDT)
Subject: Re: Understanding the FreeBSD locking mechanism
To: Chris Torek <torek@elf.torek.net>, imp@bsdimp.com
References: <201704100426.v3A4QR9Q042761@elf.torek.net>
 <4768e26a-cdec-6f40-1463-ece9847ca34d@gmail.com>
Cc: ed@nuxi.nl, freebsd-hackers@freebsd.org, rysto32@gmail.com
From: Yubin Ruan <ablacktshirt@gmail.com>
Message-ID: <04b3328f-7bfb-bb70-c665-b43038cdd768@gmail.com>
Date: Wed, 12 Apr 2017 01:17:02 +0800
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:45.0) Gecko/20100101
 Thunderbird/45.8.0
MIME-Version: 1.0
In-Reply-To: <4768e26a-cdec-6f40-1463-ece9847ca34d@gmail.com>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 11 Apr 2017 17:17:11 -0000

On 2017/4/12 1:15, Yubin Ruan wrote:
>
> Thanks for your reply. I have read your mails and your discussion with
> Konstantin Belousov
>
> On 2017/4/10 12:26, Chris Torek wrote:
>>>>> Is it true that a thread holding a MTX_DEF mutex can be descheduled?
>>
>>>> Yes, they can be descheduled. But that's not a problem. No other
>>>> thread can acquire the MTX_DEF lock. ...
>>
>>> Does that imply that MTX_DEF should not be used in something like
>>> interrupt handler? Putting an interrupt handler into sleep doesn't
>>> make so much sense.
>>
>> Go back to the old top-half / bottom-half model, and consider that
>> now that there are interrupt *threads*, your ithread is also in the
>> "top half".  It's therefore OK to suspend.  ("Sleep" is not quite
>> correct here: a mutex wait is not a "sleep" state but instead is
>> just a waiting, not-scheduled-to-run state.  The precise difference
>> is irrelevant at this level though.)
>
> I don't truely understand the "top-half/bottom-half" model you proposed,
> but I think I get the idea of how things work now. Basically, we can
> assume that if a thread is in the "bottom-half", then it should never
> suspend(or, in the other words, be preempted). This is the case of the
> "interrupt filter" in FreeBSD. On the other hand, if a thread is in the
> "top-half", then it is safe to suspend/block. This is the case of the
> "ithread".
>
> The difference between the "ithread" and "interrupt filter" things is
> that ithread has its own thread context, while interrupt handling
> through interrupt filter shares the same kernel stack.
>
> So, for ithread, we should use the MTX_DEF, which don't disable
> interrupt, and for "interrupt filter", we should use the MTX_SPIN, which
> disable interrupt.
>
> What really confuses me is that I don't really see how owning an
> "independent" thread context(i.e ithread) makes a thread run in the
> "top-half" and how sharing the same kernel stack makes a thread run in
> the "bottom-half".
>
> I did read your long explanation in the previous mail. For the non-SMP
> case, the "top-half/bottom-half" model goes well and I understand how
> the *code* path/*data* path things go. But I cannot still fully
> understand the model for the SMP case. Maybe you can draw something like
>
>     -----                     -----
>     |   |<-- top-half         |   | <-- top-half
>     |   |                     |   |
>     |   |                     |   |
>     |   |                     |   |
>     |   |<-- bottom-half      |   | <-- bottom-half
>     -----                     -----
>      CPU1                     CPU2
>
> to make things less abstract.
>
> Thanks,
> Yubin Ruan
>
>> It's not *great* to suspend here, but all your alternatives are
>> *also* bad:
>>
>>  * You may grab incoming data and stuff it into a ring buffer, and
>>    schedule some other thread to handle it later.  But if the ring
>>    buffer is full you have a problem, and all you have done is push
>>    the actual processing off to another thread, adding more overhead.
>>
>>  * You may put the device itself on hold so that no more data can
>>    come in (if it's that kind of device).
>>
>> On the other hand, if you are handling an interrupt but not in an
>> interrupt thread, you are running in the "bottom half".  It is
>> therefore *not OK* to suspend.  You must now use one of those
>> alternatives.
>>
>> Note that if you suspend on an MTX_DEF mutex, and your priority is
>> *higher* than the priority of whatever thread actually holds that
>> mutex now, that other thread gets a priority boost to your level
>> (priority propagation, to prevent priority inversion).  So letting
>> your ithread suspend, assuming you have an ithread, is probably your
>> best bet.
>>
>> Chris
>>
>

Sorry for the ugly format. The mail client sucks.

Yubin Ruan

From owner-freebsd-hackers@freebsd.org  Tue Apr 11 20:21:28 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 32A5FD3ACB9;
 Tue, 11 Apr 2017 20:21:28 +0000 (UTC)
 (envelope-from kevans91@ksu.edu)
Received: from NAM02-BL2-obe.outbound.protection.outlook.com
 (mail-bl2nam02on0078.outbound.protection.outlook.com [104.47.38.78])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-SHA384 (256/256 bits))
 (Client CN "mail.protection.outlook.com",
 Issuer "Microsoft IT SSL SHA2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 9A4DF1AF;
 Tue, 11 Apr 2017 20:21:27 +0000 (UTC)
 (envelope-from kevans91@ksu.edu)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ksu.edu; s=selector2; 
 h=From:Date:Subject:Message-ID:Content-Type:MIME-Version;
 bh=LERGuygVnhSHc2ek/QihWWNIveDS2IXN6/Tq6XE3U9U=;
 b=SbrIfIBSbFGgDGuRwT/XQvAmCBsUU/VulI3p2SLOuPM9eYojQipNxzoCjlKvYOEzAjOzUMxpmgQFQd6yskWRlvNmW84lk1j8dPRCY1omCGPBkIvzCeXLbk+a86wB0EsjfpP3l3Ri4GCZxONd0SkX6UwSX4lhyiU/vrNvHo8Su4I=
Received: from DM2PR0501CA0040.namprd05.prod.outlook.com (10.162.29.178) by
 BY1PR0501MB1109.namprd05.prod.outlook.com (10.160.103.143) with Microsoft
 SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.1034.5; Tue, 11
 Apr 2017 20:21:25 +0000
Received: from SN1NAM02FT055.eop-nam02.prod.protection.outlook.com
 (2a01:111:f400:7e44::208) by DM2PR0501CA0040.outlook.office365.com
 (2a01:111:e400:5148::50) with Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.1034.5 via
 Frontend Transport; Tue, 11 Apr 2017 20:21:25 +0000
Authentication-Results: spf=pass (sender IP is 129.130.18.151)
 smtp.mailfrom=ksu.edu; freebsd.org; dkim=none (message not signed)
 header.d=none;freebsd.org; dmarc=bestguesspass action=none
 header.from=ksu.edu;
Received-SPF: Pass (protection.outlook.com: domain of ksu.edu designates
 129.130.18.151 as permitted sender) receiver=protection.outlook.com;
 client-ip=129.130.18.151; helo=ome-vm-smtp1.campus.ksu.edu;
Received: from ome-vm-smtp1.campus.ksu.edu (129.130.18.151) by
 SN1NAM02FT055.mail.protection.outlook.com (10.152.72.174) with Microsoft SMTP
 Server id 15.1.1019.14 via Frontend Transport; Tue, 11 Apr 2017 20:21:24
 +0000
Received: from calypso.engg.ksu.edu (calypso.engg.ksu.edu [129.130.43.181])
 by ome-vm-smtp1.campus.ksu.edu (8.14.4/8.14.4/Debian-2ubuntu2.1) with ESMTP id
 v3BKLM0s004639; Tue, 11 Apr 2017 15:21:22 -0500
Received: by calypso.engg.ksu.edu (Postfix, from userid 110)
 id 68001248005; Tue, 11 Apr 2017 15:21:22 -0500 (CDT)
Received: from mail-wr0-f182.google.com (mail-wr0-f182.google.com
 [209.85.128.182])
 by calypso.engg.ksu.edu (Postfix) with ESMTPA id 15271248004;
 Tue, 11 Apr 2017 15:21:20 -0500 (CDT)
Received: by mail-wr0-f182.google.com with SMTP id o21so5073596wrb.2;
 Tue, 11 Apr 2017 13:21:20 -0700 (PDT)
X-Gm-Message-State: AFeK/H1vWhcGvsElaH3P+Nd2zlfXYxgh/F/HLMFKTydqZ0wcnkgYM6lXZISW0xZkKNLlaZZUdTNsNVpCXJBKQA==
X-Received: by 10.223.154.54 with SMTP id z51mr32463232wrb.76.1491942079266;
 Tue, 11 Apr 2017 13:21:19 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.28.39.134 with HTTP; Tue, 11 Apr 2017 13:20:58 -0700 (PDT)
From: Kyle Evans <kevans91@ksu.edu>
Date: Tue, 11 Apr 2017 15:20:58 -0500
X-Gmail-Original-Message-ID: <CACNAnaEmBjWudEJwvRTSqyciOp7-oRbCEQ_e6qtGsap0oHQ4yw@mail.gmail.com>
Message-ID: <CACNAnaEmBjWudEJwvRTSqyciOp7-oRbCEQ_e6qtGsap0oHQ4yw@mail.gmail.com>
Subject: Replacing libgnuregex
To: <freebsd-hackers@freebsd.org>, <freebsd-standards@freebsd.org>
Content-Type: text/plain; charset="UTF-8"
X-Content-Filtered-By: Mailman/MimeDel 2.1.23
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 11 Apr 2017 20:21:28 -0000

Hello!

To start, I'm cross-posting to freebsd-hackers@ and freebsd-standards@,
since this pertains to both: it is a question of how strictly we follow
the standards as well as of which approach to take. The rest of this e-mail
outlines my questions, then my personal opinion.

== Almost objective, obviously biased stuff ==

The first question we must answer: is it strictly necessary that
we maintain a separate library for gnuregex, or would it be
feasible/desirable to extend libc/regex to include GNU extensions?

There are obvious benefits to both, but the former (a drop-in replacement for
libgnuregex) seems like it's going to be more difficult to find. We only
have two base-consumers of libgnuregex (at the moment), but one must
consider the potential other consumers since this doesn't seem to be a
private library.

On the other hand, I think I could fairly easily implement most of these
into libc/regex. Here's a summary of what this option entails adding to
libc/regex, from what I've found:

* Empty subexpressions(*)
* Add missing quantifiers to BREs: \?, \+
* Add branching to BREs: \|
* Add backreferences (\1 through \9) to EREs
* Add \w, \W, \s, and \S corresponding to [[:alnum:]_], [^[:alnum:]_],
[[:space:]], and [^[:space:]] respectively
* Add word boundaries and anchors:
** \b: word boundary
** \B: not word boundary
** \<: Start of word
** \>: End of word
** \`: Start of subject string
** \': End of subject string

(*) I didn't actually find anything explicitly stating this as a GNU
extension, but it's certainly not conformant to POSIX specifications to
use, it gets used a tiny bit in some ports, and we implement a workaround
in bsdgrep(1) for the simplest case of empty expressions ("") to match
everything and produce zero length matches.
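
For comparison, here is what a consumer has to write today if it wants
the \w and \s shorthands in strict POSIX syntax. This is only a minimal
sketch using regcomp(3)/regexec(3); re_match() is a hypothetical helper
of mine, not something in libc/regex. As far as I know GNU's \w also
treats underscore as a word character, so the sketch spells it
[[:alnum:]_]. Backreferences are already standard in BREs, which the
last call in the usage note relies on.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns 1 if `text` matches `pattern`, 0 if it
 * does not, and -1 if the pattern fails to compile.  `cflags` is passed
 * to regcomp(); use REG_EXTENDED for EREs and 0 for BREs.
 */
static int
re_match(const char *pattern, const char *text, int cflags)
{
    regex_t re;
    int rc;

    assert(pattern != NULL && text != NULL);

    /* REG_NOSUB: we only care whether there is a match at all. */
    if (regcomp(&re, pattern, cflags | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

Usage would look like re_match("^[[:alnum:]_]+$", "foo_bar9",
REG_EXTENDED) for a \w+ equivalent, re_match("a[[:space:]]b", "a\tb",
REG_EXTENDED) for \s, and re_match("\\(ab\\)\\1", "abab", 0) for a BRE
backreference.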

The main benefit of this is not having to maintain a completely separate
regex parser and the potential for inconsistencies that come along with it.
The downside is that this would seem to promote expressions that are not
strictly POSIX conformant. Is this a problem? Is this a problem worth
worrying about?

== Opinion ==

My personal opinion is that we should go the latter route and implement
these features into libc/regex as a default behavior. Perhaps with a flag
or something so that an application *could* opt out of GNU extensions
("strict POSIX" type of flag) if it so chooses or finds them undesirable,
but that may not be deemed necessary.

Ultimately, the GNU extensions are just that: extensions. There's no direct
harm that I can think of in accepting them in our libc, and they do indeed
provide some sensible features with little cost added to our current
implementation. I'd personally like to have one parser that does it all so
that when a regex-parsing bug does come in, there's no initial triage *at
all* of whether it's a gnuregex bug or a libc/regex bug.

Thoughts? What all have I missed?

Thanks,

Kyle Evans

From owner-freebsd-hackers@freebsd.org  Tue Apr 11 22:10:38 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 0553FD3ADB0
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Tue, 11 Apr 2017 22:10:38 +0000 (UTC)
 (envelope-from torek@elf.torek.net)
Received: from elf.torek.net (mail.torek.net [96.90.199.121])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client CN "elf.torek.net", Issuer "elf.torek.net" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id D59A3957
 for <freebsd-hackers@freebsd.org>; Tue, 11 Apr 2017 22:10:37 +0000 (UTC)
 (envelope-from torek@elf.torek.net)
Received: from elf.torek.net (localhost [127.0.0.1])
 by elf.torek.net (8.15.2/8.15.2) with ESMTPS id v3BMAVhu093703
 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO);
 Tue, 11 Apr 2017 15:10:31 -0700 (PDT)
 (envelope-from torek@elf.torek.net)
Received: (from torek@localhost)
 by elf.torek.net (8.15.2/8.15.2/Submit) id v3BMAVSe093702;
 Tue, 11 Apr 2017 15:10:31 -0700 (PDT) (envelope-from torek)
Date: Tue, 11 Apr 2017 15:10:31 -0700 (PDT)
From: Chris Torek <torek@elf.torek.net>
Message-Id: <201704112210.v3BMAVSe093702@elf.torek.net>
To: f.v.anton@gmail.com, freebsd-hackers@freebsd.org
Subject: Re: On COW memory mapping in d_mmap_single
In-Reply-To: <CANXdjjZrjxhbqhZ13sAuZP7cqpvYU8CJusQ2NEpGuRCVMgr0=g@mail.gmail.com>
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.6.2
 (elf.torek.net [127.0.0.1]); Tue, 11 Apr 2017 15:10:31 -0700 (PDT)
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 11 Apr 2017 22:10:38 -0000

>Yes, all vCPUs are locked before calling mmap(). I agree that we don't
>need 'COW', as long as we keep all vCPUs locked while we copy the
>entire VM memory. But this might take a while, imagine a VM with 32GB
>or more of RAM. This will take maybe minutes to write to disk, so we
>don't actually want the VM to be freezed for so long. That's the
>reason we'd like to map the memory COW and then unlock vCPUs.

You'll need to save the device state while holding the CPUs locked,
too, so that the virtio queues can be in sync when you restore.


>It's a OBJT_DEFAULT. It's not a device object, it's the memory object
>given to guest to use as physical memory.

Your copy code path is basically a simplified vm_map_copy_entry()
as called from vmspace_fork() for the MAP_INHERIT case.  But if
these are OBJT_DEFAULT, shouldn't you be calling vm_object_collapse()?
See https://github.com/flaviusanton/freebsd/blob/bhyve-save-restore/sys/vm/vm_map.c#L3170
(Maybe src_object->handle is never NULL?  There are several things
in the VM object code that I do not understand fully here, so this
might be the case.)

>>Next, how do you undo the damage done by your 'COW' ?

>This is one thing that we've thought about, but we don't have a
>solution for now. I agree it is very important, though. I figured that
>it might be possible to 'unmark' the memory object as COW with some
>additional tricks.

I think you may be better off doing actual vm_map_copy_entry()
calls.

I am assuming, here, that snapshot-saving is implemented by
sending a request to the running bhyve, which spins off a thread
or process that does the snapshot-save.  If you spin it off as
a real process, i.e., do a fork(), you will get the existing
VM system to do all the work for you.  The overall strategy
then looks something like this:

    handle_external_suspend_or_snapshot_request() {
        set global suspending flag /* if needed */
        stop all vcpus
        signal virtio and emulated devices to quiesce, if needed
        if (snapshot) {
            open snapshot file
            pid = fork()
            if (pid == 0) { /* child */
                COW is now in effect on memory: save more-volatile
                    vcpu and dev state
                pthread_cond_signal parent that it's safe to resume
                save RAM state
                close snapshot file
                _exit(0)
            }
            if (pid < 0) ... handle error ...
            /* parent */
            close snapshot file
            wait for child to signal OK to resume
        } else {
            wait for external resume signal
        }
        clear suspending flag
        resume devices and vcpus
    }

To resume a snapshot from a file, we load its state and then run
the last two steps (clear suspending flag and resume devices and
vcpus).

This way all the COW action happens through fork(), so there is no
new kernel-side code required.
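
The fork() part of that strategy can be tried out in plain userland.
This is only a sketch under some assumptions: the "RAM" is ordinary
private memory of the process (so fork() really does give the child a
COW view of it), the child tells the parent to resume by writing one
byte to a pipe (my substitute for the signalling step), and
snapshot_start() is a made-up name.  Saving vcpu and device state is
left as a comment.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that writes ram[0..len) to `path`.
 * Returns the child's pid once the child has said it is safe for the
 * parent to resume, or -1 on error.  The caller reaps the child later.
 */
static pid_t
snapshot_start(const char *ram, size_t len, const char *path)
{
    int ready[2], fd;
    pid_t pid;
    ssize_t n;
    char c;

    assert(ram != NULL && path != NULL);
    if (pipe(ready) < 0)
        return -1;
    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) {
        close(ready[0]);
        close(ready[1]);
        return -1;
    }

    pid = fork();
    if (pid == 0) {
        /* Child: memory is now a COW copy of the parent's. */
        close(ready[0]);
        /* (more-volatile vcpu and device state would be saved here) */
        if (write(ready[1], "k", 1) != 1)   /* OK to resume */
            _exit(1);
        close(ready[1]);
        for (size_t off = 0; off < len; ) { /* save RAM state */
            ssize_t w = write(fd, ram + off, len - off);
            if (w <= 0)
                _exit(1);
            off += (size_t)w;
        }
        close(fd);
        _exit(0);
    }

    /* Parent (or fork failure): drop our copies of the child's fds. */
    close(fd);
    close(ready[1]);
    if (pid < 0) {
        close(ready[0]);
        return -1;
    }
    n = read(ready[0], &c, 1);              /* wait for child's OK */
    close(ready[0]);
    if (n != 1) {
        waitpid(pid, NULL, 0);
        return -1;
    }
    return pid;
}
```

The parent can scribble on the buffer as soon as snapshot_start()
returns; the file still ends up with the contents as of the fork().
The caller is expected to waitpid() on the returned pid eventually.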

(Frankly, I think the hard part here is saving device and virtual
APIC state.  If you have the vlapic state saving working, you have
made pretty good progress.)

Chris

From owner-freebsd-hackers@freebsd.org  Tue Apr 11 23:11:06 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id C627CD3A0D1
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Tue, 11 Apr 2017 23:11:06 +0000 (UTC)
 (envelope-from torek@elf.torek.net)
Received: from elf.torek.net (mail.torek.net [96.90.199.121])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client CN "elf.torek.net", Issuer "elf.torek.net" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id B27191B24
 for <freebsd-hackers@freebsd.org>; Tue, 11 Apr 2017 23:11:06 +0000 (UTC)
 (envelope-from torek@elf.torek.net)
Received: from elf.torek.net (localhost [127.0.0.1])
 by elf.torek.net (8.15.2/8.15.2) with ESMTPS id v3BNB45w094086
 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO);
 Tue, 11 Apr 2017 16:11:04 -0700 (PDT)
 (envelope-from torek@elf.torek.net)
Received: (from torek@localhost)
 by elf.torek.net (8.15.2/8.15.2/Submit) id v3BNB4fc094085;
 Tue, 11 Apr 2017 16:11:04 -0700 (PDT) (envelope-from torek)
Date: Tue, 11 Apr 2017 16:11:04 -0700 (PDT)
From: Chris Torek <torek@elf.torek.net>
Message-Id: <201704112311.v3BNB4fc094085@elf.torek.net>
To: ablacktshirt@gmail.com, imp@bsdimp.com
Subject: Re: Understanding the FreeBSD locking mechanism
Cc: ed@nuxi.nl, freebsd-hackers@freebsd.org, rysto32@gmail.com
In-Reply-To: <4768e26a-cdec-6f40-1463-ece9847ca34d@gmail.com>
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.6.2
 (elf.torek.net [127.0.0.1]); Tue, 11 Apr 2017 16:11:04 -0700 (PDT)
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 11 Apr 2017 23:11:06 -0000

>The difference between the "ithread" and "interrupt filter" things
>is that ithread has its own thread context, while interrupt handling 
>through interrupt filter shares the same kernel stack.

Right -- though rather than "the same" I would just say "shares
a stack", i.e., we're not concerned with *whose* stack and/or
thread we're borrowing, just that we have one borrowed.

>So, for ithread, we should use the MTX_DEF, which don't disable
>interrupt, and for "interrupt filter", we should use the MTX_SPIN, which
>disable interrupt.

Right.

>What really confuses me is that I don't really see how owning an
>"independent" thread context(i.e ithread) makes a thread run in the 
>"top-half" and how sharing the same kernel stack makes a thread run in
>the "bottom-half".

It's not that it *makes* it run that way, it's that it *allows* it
to run that way -- and then the scheduler *does* run it that way.

>I did read your long explanation in the previous mail. For the non-SMP
>case, the "top-half/bottom-half" model goes well and I understand how 
>the *code* path/*data* path things go. But I cannot still fully
>understand the model for the SMP case.

It's fundamentally fairly tricky, but we start with that same first
notion:

 * If you have your own state (i.e., stack), you can be suspended
   (stopped in the scheduler, giving the CPU to other threads):
   *your* (private) state is preserved on *your* (private) stack.

 * If you have borrowed someone else's state, anything that suspends
   you, suspends them too.  Since this may deadlock, you are not
   allowed to do it at all.

Once we block interrupts locally (as for MTX_SPIN, or
automatically inside a filter style or "bottom half" interrupt),
we are in a special state: we may not take *any* MTX_DEF locks at
all (the kernel should panic if we do).

This in turn means that data structures are protected *either* by
a spin mutex *or* by a default (non-spin) mutex, never both.  So
if you need to touch a spin-mutex data structure from thread-y
("top half") code, you obtain the spin mutex, and now no interrupts
will occur *on this CPU*, and as a key side effect, you won't move
*off* this CPU either.  If an interrupt occurs on another CPU and
it goes to take the spin lock that protects that data, it loops
at that point, not switching tasks, waiting for the MTX_SPIN mutex
to be released:

       CPU 1                          CPU 2
    ----------------------------|-----------------------------
    func() {                    | ... code not involving mtx
        mtx_lock_spin(&mtx);    | ...
        do some work            |    mtx_lock_spin(&mtx); /* loops */
             .                  |        [stuck]
             .                  |        [stuck]
             .                  |        [stuck]
        mtx_unlock_spin(&mtx);  |        [unstuck]
             ...                |        do some work

If an interrupt occurs on CPU 2, and that interrupt-handling code
wants to touch the data protected by the spin lock, that code
obtains the spin lock as usual.  Meanwhile the interrupt *cannot*
occur on CPU 1, as holding the spin lock has blocked interrupts.
So the code path on CPU 2 blocks -- looping in mtx_lock_spin(),
not giving CPU 2 over to the scheduler -- for as long as CPU 1
holds the spin lock.  The corresponding code path is already
blocked on CPU 1, the same way it was back in the non-SMP, single-
CPU days.

This means it is unwise to hold spin locks for long periods.  In
fact, if CPU 2 waits too long in that [stuck] section, it will
panic, on the assumption that CPU 1 has done something terrible
and the system is now hung.

This is also what gives rise to the constraint that you must take
MTX_SPIN locks "inside" any outer MTX_DEF locks.

Chris

From owner-freebsd-hackers@freebsd.org  Wed Apr 12 00:10:58 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id BB73CD3A509
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Wed, 12 Apr 2017 00:10:58 +0000 (UTC)
 (envelope-from cse.cem@gmail.com)
Received: from mailman.ysv.freebsd.org (unknown [127.0.1.3])
 by mx1.freebsd.org (Postfix) with ESMTP id 9BB3CBD0
 for <freebsd-hackers@freebsd.org>; Wed, 12 Apr 2017 00:10:58 +0000 (UTC)
 (envelope-from cse.cem@gmail.com)
Received: by mailman.ysv.freebsd.org (Postfix)
 id 98005D3A508; Wed, 12 Apr 2017 00:10:58 +0000 (UTC)
Delivered-To: hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 979D4D3A507
 for <hackers@mailman.ysv.freebsd.org>; Wed, 12 Apr 2017 00:10:58 +0000 (UTC)
 (envelope-from cse.cem@gmail.com)
Received: from mail-ua0-f176.google.com (mail-ua0-f176.google.com
 [209.85.217.176])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 5B4E2BCF
 for <hackers@freebsd.org>; Wed, 12 Apr 2017 00:10:57 +0000 (UTC)
 (envelope-from cse.cem@gmail.com)
Received: by mail-ua0-f176.google.com with SMTP id q26so7748926uaa.0
 for <hackers@freebsd.org>; Tue, 11 Apr 2017 17:10:57 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:mime-version:reply-to:in-reply-to:references
 :from:date:message-id:subject:to:cc;
 bh=13A4uRpsv4+eJqty5OBONh8XJhB+haky4HDmP7qTYn4=;
 b=d5PkPFbrnmO6naA0+o6+D3RMyViH1UCdwk7sZM3YxRDBGVzbrlpac/OYByhMUqBjQz
 RoJcgGC6yx4rNKhi194NwpdNoFYgbnnRGX5FjUD9/vW0JL32lvVbkFV1ygrlETC7QKgE
 JPPR0W5i91RHEeumPVnZ4NK6Qgts1Owe5h7nc0aVpFLGrp0oyxc2zzeuF9gA34ZY858p
 FH9fdV9n0gtz+ZJY7Ct5lMCXbYDqY2NuvYcGmC7buS9JsY2L2My7u6wpOsxe17KI+BIs
 I6J2Pi9bdRE7FhlEIH63Z9CwiklL9vlQfx+e3p1p0WquhZ4o0pSljeyjcehIV8H0AvbX
 Gygw==
X-Gm-Message-State: AN3rC/4120Bn8fwi/JDZCCCPYRXjLyU/BKt8FVtGi6FKFpqihPoL47PJ61OEU9ueFt3t9Q==
X-Received: by 10.176.80.65 with SMTP id z1mr122700uaz.99.1491954434998;
 Tue, 11 Apr 2017 16:47:14 -0700 (PDT)
Received: from mail-ua0-f169.google.com (mail-ua0-f169.google.com.
 [209.85.217.169])
 by smtp.gmail.com with ESMTPSA id 21sm4824243vkg.38.2017.04.11.16.47.14
 for <hackers@freebsd.org>
 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Tue, 11 Apr 2017 16:47:14 -0700 (PDT)
Received: by mail-ua0-f169.google.com with SMTP id u103so7508393uau.1
 for <hackers@freebsd.org>; Tue, 11 Apr 2017 16:47:14 -0700 (PDT)
X-Received: by 10.159.32.163 with SMTP id 32mr139999uaa.160.1491954434017;
 Tue, 11 Apr 2017 16:47:14 -0700 (PDT)
MIME-Version: 1.0
Reply-To: cem@freebsd.org
Received: by 10.103.13.3 with HTTP; Tue, 11 Apr 2017 16:47:13 -0700 (PDT)
In-Reply-To: <CD5E9B03-6147-4E4D-BED6-6C45022051E3@chrisbowman.com>
References: <CD5E9B03-6147-4E4D-BED6-6C45022051E3@chrisbowman.com>
From: Conrad Meyer <cem@freebsd.org>
Date: Tue, 11 Apr 2017 16:47:13 -0700
X-Gmail-Original-Message-ID: <CAG6CVpV7WwqBXZs+78Q3xak6UjjBG1X+WiAPJOnTx1V17DpWEw@mail.gmail.com>
Message-ID: <CAG6CVpV7WwqBXZs+78Q3xak6UjjBG1X+WiAPJOnTx1V17DpWEw@mail.gmail.com>
Subject: Re: Dtrace oddity
To: Christopher Bowman <crb@chrisbowman.com>
Cc: "freebsd-hackers@freebsd.org" <hackers@freebsd.org>
Content-Type: text/plain; charset=UTF-8
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 12 Apr 2017 00:10:58 -0000

On Tue, Apr 11, 2017 at 12:16 AM, Christopher Bowman
<crb@chrisbowman.com> wrote:
> Here is the oddity: when I run Dtrace and then run my test program I get the following output from Dtrace:
>
> crb@retread:60> dtrace -n 'syscall:freebsd:mmap:entry /execname == "test"/ {}'
> dtrace: description 'syscall:freebsd:mmap:entry ' matched 1 probe
> CPU     ID                    FUNCTION:NAME
>   0  63401                       mmap:entry
>   0  63401                       mmap:entry
>   0  63401                       mmap:entry
>   0  63401                       mmap:entry
>   0  63401                       mmap:entry
>   0  63401                       mmap:entry
>   0  63401                       mmap:entry
>   0  63401                       mmap:entry
>   0  63401                       mmap:entry
>   0  63401                       mmap:entry
>   0  63401                       mmap:entry
>   0  63401                       mmap:entry
>
> I think Dtrace is indicating that the mmap syscall was called 12 times by my test program yet I can't see how the program below would have done that.

A configuration file for dynamic linking is mapped; libc needs to be
mapped (several different regions); jemalloc sets up some memory for
allocations with anonymous mmap.  So this is not unreasonable as part
of crt0 / program startup.

Best,
Conrad

From owner-freebsd-hackers@freebsd.org  Wed Apr 12 02:32:29 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 3E495D3A59C
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Wed, 12 Apr 2017 02:32:29 +0000 (UTC)
 (envelope-from ablacktshirt@gmail.com)
Received: from mail-pf0-x242.google.com (mail-pf0-x242.google.com
 [IPv6:2607:f8b0:400e:c00::242])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 0E24BBC9
 for <freebsd-hackers@freebsd.org>; Wed, 12 Apr 2017 02:32:29 +0000 (UTC)
 (envelope-from ablacktshirt@gmail.com)
Received: by mail-pf0-x242.google.com with SMTP id c198so2482863pfc.0
 for <freebsd-hackers@freebsd.org>; Tue, 11 Apr 2017 19:32:29 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=subject:to:references:cc:from:message-id:date:user-agent
 :mime-version:in-reply-to:content-transfer-encoding;
 bh=JpJKDgntHWxojUVtXwn4X9vTiZKDsMvcHbvnhPGm9Lk=;
 b=TARhOEexxZhw3TGDnXFx/XwLmyLp1+sSKwmIkN9E3a7BJdepmNCD/xS3QALEf0gXcX
 4KSVbpHZ1tzCkqsneJpe4OmxTlIdIDWqc7EqftY860WYQ3k0E+2O4on0IciGTDl3Knm8
 TwU9VOQcntr47eHq8GvFyqHUkLtQN1GgujLVdZA1DiZAme5mcbM09I5o3WqlP15xItLB
 ytMTNde4w782ALEashNHOomhgBqBqf/FnZeKoCZLsO3Ok9IvNxSLlcbs6lu39ip2kVaY
 HchglRzjDo7UxQko5BUdpHC7Eun1Q6QbklRmdbqiCLYjXCmr/3pTlMLarw3K7+ILtQ4V
 JtFQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:subject:to:references:cc:from:message-id:date
 :user-agent:mime-version:in-reply-to:content-transfer-encoding;
 bh=JpJKDgntHWxojUVtXwn4X9vTiZKDsMvcHbvnhPGm9Lk=;
 b=ZNvVAnBGn9GnlUVQyQGFBB/1hmNiUjG3jsiXvRBGOXIawKrLwITKQfv6b+N9ZjuO/N
 L1VFMNighpBW1A5aiA2Hlw5TB306o0KHSR7FyRVSRDV60eZfOb9Ki5IqS2RPv6UyBiX+
 QVcJRoULkhJUArQu9UhKAElmcazImwNBJv+ZI10IlvkrwkrK+wiXyEjwVL2njiCCWvH3
 a9B8zpF8LcCLopFwmeqlynBrM1cc4thO3RSbdkM/sU4nvdL0PN6if+gWhXxEmjoegKlw
 1q+ijWYi+oGWMs00qKggRkOaa8VMNYf0uI5WUdLIrwP/zv0i7n/QyXtf0KSAV3svt078
 Tr+A==
X-Gm-Message-State: AFeK/H189taxi12rLIYis3AzSfSELyd6HjAWtWeQFDGLalQjpQ3QqYFZxjxPSrs3EE8z2A==
X-Received: by 10.99.94.66 with SMTP id s63mr62735071pgb.34.1491964348499;
 Tue, 11 Apr 2017 19:32:28 -0700 (PDT)
Received: from [192.168.2.211] ([116.56.129.146])
 by smtp.gmail.com with ESMTPSA id r17sm32995928pgg.19.2017.04.11.19.32.25
 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Tue, 11 Apr 2017 19:32:27 -0700 (PDT)
Subject: Re: Understanding the FreeBSD locking mechanism
To: Chris Torek <torek@elf.torek.net>, imp@bsdimp.com
References: <201704112311.v3BNB4fc094085@elf.torek.net>
Cc: ed@nuxi.nl, freebsd-hackers@freebsd.org, rysto32@gmail.com
From: Yubin Ruan <ablacktshirt@gmail.com>
Message-ID: <99e3673e-d490-faef-359d-c6ec8a36ee0c@gmail.com>
Date: Wed, 12 Apr 2017 10:32:18 +0800
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
 Thunderbird/45.8.0
MIME-Version: 1.0
In-Reply-To: <201704112311.v3BNB4fc094085@elf.torek.net>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 12 Apr 2017 02:32:29 -0000

On 2017年04月12日 07:11, Chris Torek wrote:
>> The difference between the "ithread" and "interrupt filter" things
>> is that ithread has its own thread context, while interrupt handling
>> through interrupt filter shares the same kernel stack.
>
> Right -- though rather than "the same" I would just say "shares
> a stack", i.e., we're not concerned with *whose* stack and/or
> thread we're borrowing, just that we have one borrowed.
>
>> So, for an ithread, we should use MTX_DEF, which doesn't disable
>> interrupts, and for an "interrupt filter", we should use MTX_SPIN, which
>> disables interrupts.
>
> Right.
>
>> What really confuses me is that I don't really see how owning an
>> "independent" thread context(i.e ithread) makes a thread run in the
>> "top-half" and how sharing the same kernel stack makes a thread run in
>> the "bottom-half".
>
> It's not that it *makes* it run that way, it's that it *allows* it
> to run that way -- and then the scheduler *does* run it that way.
>
>> I did read your long explanation in the previous mail. For the non-SMP
>> case, the "top-half/bottom-half" model goes well and I understand how
>> the *code* path/*data* path things go. But I cannot still fully
>> understand the model for the SMP case.
>
> It's fundamentally fairly tricky, but we start with that same first
> notion:
>
>  * If you have your own state (i.e., stack), you can be suspended
>    (stopped in the scheduler, giving the CPU to other threads):
>    *your* (private) state is preserved on *your* (private) stack.
>
>  * If you have borrowed someone else's state, anything that suspends
>    you, suspends them too.  Since this may deadlock, you are not
>    allowed to do it at all.

clear. How can I distinguish these two conditions? I mean, whether I
am using my own state/stack or borrowing others' state.

> Once we block interrupts locally (as for MTX_SPIN, or
> automatically inside a filter style or "bottom half" interrupt),
> we are in a special state: we may not take *any* MTX_DEF locks at
> all (the kernel should panic if we do).
>
> This in turn means that data structures are protected *either* by
> a spin mutex *or* by a default (non-spin) mutex, never both.  So
> if you need to touch a spin-mutex data structure from thread-y
> ("top half") code, you obtain the spin mutex, and now no interrupts
> will occur *on this CPU*, and as a key side effect, you won't move
> *off* this CPU either.  If an interrupt occurs on another CPU and
> it goes to take the spin lock that protects that data, it loops
> at that point, not switching tasks, waiting for the MTX_SPIN mutex
> to be released:
>
>        CPU 1                          CPU 2
>     ----------------------------|-----------------------------
>     func() {                    | ... code not involving mtx
>         mtx_lock_spin(&mtx);    | ...
>         do some work            |    mtx_lock_spin(&mtx); /* loops */
>              .                  |        [stuck]
>              .                  |        [stuck]
>              .                  |        [stuck]
>        mtx_unlock_spin(&mtx);   |        [unstuck]
>              ...                |        do some work
>
> If an interrupt occurs on CPU 2, and that interrupt-handling code
> wants to touch the data protected by the spin lock, that code
> obtains the spin lock as usual.  Meanwhile the interrupt *cannot*
> occur on CPU 1, as holding the spin lock has blocked interrupts.
> So the code path on CPU 2 blocks -- looping in mtx_lock_spin(),
> not giving CPU 2 over to the scheduler -- for as long as CPU 1
> holds the spin lock.  The corresponding code path is already
> blocked on CPU 1, the same way it was back in the non-SMP, single-
> CPU days.

Things become clearer now. Thanks for your reply.
If I understand correctly, which kind of lock to use depends on which
interrupt-handling model (i.e. "interrupt filter" or "ithread") we are
in. If I want to take a lock, I must know in advance which model I am
in; otherwise the interrupt handling code might deadlock or panic the
kernel. The problem is, how can I tell which model I am in? I am not
yet clear about the threading model and the scheduling code of
FreeBSD...
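
As a rough user-space illustration of the two-CPU diagram quoted above,
the sketch below has two threads contend for one spin-style lock.
Everything in it (spin_mtx, spin_worker, run_two_cpus, pthreads standing
in for CPUs) is hypothetical and is not the kernel's mtx_lock_spin(); in
particular it does not block interrupts, it only shows a waiter looping
instead of going to sleep.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for an MTX_SPIN mutex. */
struct spin_mtx {
	atomic_flag held;
};

struct spin_mtx demo_mtx = { ATOMIC_FLAG_INIT };
long demo_counter;		/* protected by demo_mtx */

void
spin_mtx_lock(struct spin_mtx *m)
{
	/* Loop (the "[stuck]" rows) until the holder clears the flag. */
	while (atomic_flag_test_and_set_explicit(&m->held,
	    memory_order_acquire))
		;
}

void
spin_mtx_unlock(struct spin_mtx *m)
{
	atomic_flag_clear_explicit(&m->held, memory_order_release);
}

void *
spin_worker(void *arg)
{
	long iters = *(long *)arg;

	for (long i = 0; i < iters; i++) {
		spin_mtx_lock(&demo_mtx);
		demo_counter++;		/* keep the critical section short */
		spin_mtx_unlock(&demo_mtx);
	}
	return (NULL);
}

/* Run two "CPUs" against the same lock; returns the final counter. */
long
run_two_cpus(long iters)
{
	pthread_t t1, t2;

	assert(iters >= 0);
	demo_counter = 0;
	if (pthread_create(&t1, NULL, spin_worker, &iters) != 0)
		return (-1);
	if (pthread_create(&t2, NULL, spin_worker, &iters) != 0) {
		pthread_join(t1, NULL);
		return (-1);
	}
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	return (demo_counter);
}
```

Calling run_two_cpus(n) should return 2*n if the lock serializes the
increments; the longer the critical section, the longer the other
thread burns CPU in spin_mtx_lock().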

> This means it is unwise to hold spin locks for long periods.  In
> fact, if CPU 2 waits too long in that [stuck] section, it will
> panic, on the assumption that CPU 1 has done something terrible
> and the system is now hung.
>
> This is also what gives rise to the constraint that you must take
> MTX_SPIN locks "inside" any outer MTX_DEF locks.

What do you mean by "must take MTX_SPIN locks 'inside' any outer
MTX_DEF locks"?

Regards,
Yubin Ruan



From owner-freebsd-hackers@freebsd.org  Wed Apr 12 03:57:19 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 16F0DD37B11
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Wed, 12 Apr 2017 03:57:19 +0000 (UTC)
 (envelope-from crb@chrisbowman.com)
Received: from mail-pg0-x231.google.com (mail-pg0-x231.google.com
 [IPv6:2607:f8b0:400e:c05::231])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id E20D9A3E
 for <freebsd-hackers@freebsd.org>; Wed, 12 Apr 2017 03:57:18 +0000 (UTC)
 (envelope-from crb@chrisbowman.com)
Received: by mail-pg0-x231.google.com with SMTP id 81so8220328pgh.2
 for <freebsd-hackers@freebsd.org>; Tue, 11 Apr 2017 20:57:18 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=chrisbowman-com.20150623.gappssmtp.com; s=20150623;
 h=from:message-id:mime-version:subject:date:in-reply-to:cc:to
 :references; bh=qq6KklBUCgjaidDLjYU4IdnW+9X6goX8WPcZ5iJ03lg=;
 b=VEiOU9624Ii7TLINxtJ7dTrnY/ZX9IkM1EzTPbidmVRO0xv0yLmQbpjKWEhnAfqpW3
 60dv1yAVnFHlueq+8PGhVPHCLxVBAeMW2IJOy5OUJpH37MHLuE7o6xyrb84msYQTkXCy
 aS0jPYVtZybGmDzi8tAe7Z8hroJtGF9CF+4JW8wIv9AES88x3Ge8Xu0UrYPbwBfsg47y
 q0+QVzeTX5pUAwmlGgLdrytSe2VpTsZG6gVRYMrmFLBvyoMHJN3jWr1vdSG45fXAPlRk
 hC09LK3KtZjpyV33jT/696PmR9se/2GqhpZWBVUFAJIdQuHTkxBCsgwTA06Dx6dJ3h/2
 0+XQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:from:message-id:mime-version:subject:date
 :in-reply-to:cc:to:references;
 bh=qq6KklBUCgjaidDLjYU4IdnW+9X6goX8WPcZ5iJ03lg=;
 b=hdcG9FZ+kjXr7x6Ir6mzwBwh7xAn1GF4OPNqvtwxn774YzB+RBVkB4igGQjHVuUcp4
 cZVY+JA+rv7iMp6lN1xx3FUliUnLbCQntMpLJLwdQY5/dM3fkid/a2FiugrlHK4nipL3
 pPqbkz930YNariILzPAHcCYCMGMnYI+JO4QMb5elxodNYPJB8UfXTuLVT2P1/66O6Nba
 hbI7tpjxkpfn8m5zcQYZg9NjcyjqWz2r+vDyo6DavGIXN6LDYzqcrhYkAdpUpdcJmkvC
 byz10BPyfH5KUXnUBkC31iHFqskqh4ics882DeugDIsyRZZ0gYqdzRr10fnvpkykZZGK
 g9aw==
X-Gm-Message-State: AFeK/H2dxjjkMDg7GSTNRty2SJ+G99irTbqrioaAiXu991JK70V/BFHLTI9R8NSy7c0+Zg==
X-Received: by 10.99.97.12 with SMTP id v12mr65838575pgb.124.1491969438431;
 Tue, 11 Apr 2017 20:57:18 -0700 (PDT)
Received: from ?IPv6:2601:647:4e00:bbb5:a1e9:e0d1:714c:d747?
 ([2601:647:4e00:bbb5:a1e9:e0d1:714c:d747])
 by smtp.gmail.com with ESMTPSA id v86sm33094945pfa.86.2017.04.11.20.57.17
 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Tue, 11 Apr 2017 20:57:17 -0700 (PDT)
From: "Christopher R. Bowman" <crb@chrisbowman.com>
X-Google-Original-From: "Christopher R. Bowman" <crb@ChrisBowman.com>
Message-Id: <15DF9D2C-40A4-4341-AE7E-E8A776ED3F09@ChrisBowman.com>
Mime-Version: 1.0 (Mac OS X Mail 10.3 \(3273\))
Subject: Re: Dtrace oddity
Date: Tue, 11 Apr 2017 20:57:16 -0700
In-Reply-To: <20170411151426.3b760182@fabiankeil.de>
Cc: freebsd-hackers@freebsd.org
To: Fabian Keil <freebsd-listen@fabiankeil.de>
References: <CD5E9B03-6147-4E4D-BED6-6C45022051E3@chrisbowman.com>
 <20170411151426.3b760182@fabiankeil.de>
X-Mailer: Apple Mail (2.3273)
Content-Type: text/plain;
	charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-Content-Filtered-By: Mailman/MimeDel 2.1.23
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 12 Apr 2017 03:57:19 -0000

Fabian,
	That was hugely helpful.  I should have known about the extra =
mmap sys calls, but sometimes your mind only sees what it expects to =
see.  Checking for negative values on open is also the right thing to do =
(I had mis-read the man page to imply that zero indicated a failure to =
open).  But the real help was adding one of the required flags to mmap.  I =
don=E2=80=99t think FreeBSD used to check for that as I have a vague =
recollection that this code used to work on a previous version.
Thanks SO SO much for the help!
Christopher

--------
Christopher R. Bowman
email: crb@ChrisBowman.com
World Wide GSM cell: +1 (408) 476-2299

> On Apr 11, 2017, at 6:14 AM, Fabian Keil =
<freebsd-listen@fabiankeil.de> wrote:
>=20
> Christopher Bowman <crb@chrisbowman.com> wrote:
>=20
>> The man page lists a bunch of reasons for EINVAL so I want to
>> investigate this and I don=E2=80=99t quite know good strategies to =
debug the
>> kernel (yet) so I thought I=E2=80=99d experiment with Dtrace a bit.  =
Here is the
>> oddity: when I run Dtrace and then run my test program I get the
>> following output from Dtrace:
>>=20
>> crb@retread:60> dtrace -n 'syscall:freebsd:mmap:entry /execname =3D=3D =
"test"/ {}'
>> dtrace: description 'syscall:freebsd:mmap:entry ' matched 1
>> probe CPU     ID                    FUNCTION:NAME
>>  0  63401                       mmap:entry=20
>>  0  63401                       mmap:entry=20
>>  0  63401                       mmap:entry=20
>>  0  63401                       mmap:entry=20
>>  0  63401                       mmap:entry=20
>>  0  63401                       mmap:entry=20
>>  0  63401                       mmap:entry=20
>>  0  63401                       mmap:entry=20
>>  0  63401                       mmap:entry=20
>>  0  63401                       mmap:entry=20
>>  0  63401                       mmap:entry=20
>>  0  63401                       mmap:entry=20
>>=20
>> I think Dtrace is indicating that the mmap syscall was called 12 =
times
>> by my test program yet I cannot see how the program below would have =
done
>> that.
>=20
> A bunch of mmap syscalls occur before main is even entered.
> Try running your program with truss to see what's going on.
>=20
>> Here is my program:
> [...]
>> 	printf("opening device %s\n", argv[1]);
>> 	int device =3D open (argv[1], O_RDWR);
>> 	if (device =3D=3D 0) {
>=20
> You should check for -1 here.
>=20
>> 	void *pa =3D mmap (0, 4095, PROT_READ | PROT_WRITE, 0, device, =
0);
>=20
> No flags? =46rom the mmap man page:
>=20
> |     [EINVAL]           None of MAP_ANON, MAP_PRIVATE, MAP_SHARED, or
> |                        MAP_STACK was specified.  At least one of =
these flags
> |                        must be included.
>=20
> Fabian
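
For reference, a minimal sketch of the two fixes discussed above:
testing open(2) for -1 rather than 0, and passing one of the required
mapping flags to mmap(2). The helper name map_device and the choice of
MAP_SHARED are assumptions made for the sketch; the original test
program is only partly quoted, so this is not a reconstruction of it.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map the first `len` bytes of `path` read/write.
 * Returns the mapping, or NULL with errno set by open(2) or mmap(2).
 */
void *
map_device(const char *path, size_t len)
{
	int fd = open(path, O_RDWR);

	/* open(2) reports failure as -1; 0 is a valid descriptor. */
	if (fd == -1)
		return (NULL);

	/* A mapping type such as MAP_SHARED is required; 0 gives EINVAL. */
	void *pa = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	int saved = errno;

	close(fd);		/* the mapping outlives the descriptor */
	if (pa == MAP_FAILED) {
		errno = saved;
		return (NULL);
	}
	return (pa);
}
```

A caller would check for NULL, use the memory, and release it with
munmap(pa, len) when done.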


From owner-freebsd-hackers@freebsd.org  Wed Apr 12 07:55:37 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 73D27D3A1C8
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Wed, 12 Apr 2017 07:55:37 +0000 (UTC)
 (envelope-from torek@elf.torek.net)
Received: from elf.torek.net (mail.torek.net [96.90.199.121])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client CN "elf.torek.net", Issuer "elf.torek.net" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id 51020F45
 for <freebsd-hackers@freebsd.org>; Wed, 12 Apr 2017 07:55:36 +0000 (UTC)
 (envelope-from torek@elf.torek.net)
Received: from elf.torek.net (localhost [127.0.0.1])
 by elf.torek.net (8.15.2/8.15.2) with ESMTPS id v3C7tYdL016700
 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO);
 Wed, 12 Apr 2017 00:55:34 -0700 (PDT)
 (envelope-from torek@elf.torek.net)
Received: (from torek@localhost)
 by elf.torek.net (8.15.2/8.15.2/Submit) id v3C7tYUH016699;
 Wed, 12 Apr 2017 00:55:34 -0700 (PDT) (envelope-from torek)
Date: Wed, 12 Apr 2017 00:55:34 -0700 (PDT)
From: Chris Torek <torek@elf.torek.net>
Message-Id: <201704120755.v3C7tYUH016699@elf.torek.net>
To: ablacktshirt@gmail.com, imp@bsdimp.com
Subject: Re: Understanding the FreeBSD locking mechanism
Cc: ed@nuxi.nl, freebsd-hackers@freebsd.org, rysto32@gmail.com
In-Reply-To: <99e3673e-d490-faef-359d-c6ec8a36ee0c@gmail.com>
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.6.2
 (elf.torek.net [127.0.0.1]); Wed, 12 Apr 2017 00:55:34 -0700 (PDT)
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 12 Apr 2017 07:55:37 -0000

>clear. How can I distinguish these two conditions? I mean, whether I
>am using my own state/stack or borrowing others' state.

You choose it when you establish your interrupt handler.  If you
say you are a filter interrupt, then you *are* one, and the rest
of your code must be written as one.  Unless you know what you
are doing, don't do this, and then you *aren't* one and the rest
of your code can be written using the much more relaxed model.

>What do you mean by "must take MTX_SPIN locks 'inside' any outer
>MTX_DEF locks"?

This means that any code path that is going to hold a spin-type
lock must obtain it while already holding any applicable non-spin
locks.  For instance, if we look at <sys/proc.h> we find these:

	#define	PROC_STATLOCK(p)	mtx_lock_spin(&(p)->p_statmtx)
	#define	PROC_ITIMLOCK(p)	mtx_lock_spin(&(p)->p_itimmtx)
	#define	PROC_PROFLOCK(p)	mtx_lock_spin(&(p)->p_profmtx)

Let's find a bit of code that uses one, such as in kern_time.c:

https://github.com/freebsd/freebsd/blob/master/sys/kern/kern_time.c#L338

(kern_clock_gettime()).  This code reads:

	case CLOCK_PROF:
		PROC_LOCK(p);
		PROC_STATLOCK(p);
		calcru(p, &user, &sys);
		PROC_STATUNLOCK(p);
		PROC_UNLOCK(p);
		timevaladd(&user, &sys);
		TIMEVAL_TO_TIMESPEC(&user, ats);
		break;

Note that the call to PROC_LOCK comes first, then the call to
PROC_STATLOCK.  This is because PROC_LOCK

https://github.com/freebsd/freebsd/blob/master/sys/sys/proc.h#L825

is defined as:

	#define	PROC_LOCK(p)	mtx_lock(&(p)->p_mtx)

If you obtain the locks in the other order -- i.e., if you grab
the PROC_STATLOCK first, then try to lock PROC_LOCK -- you are
trying to take a spin-type mutex while holding a default mutex,
and this is not allowed (can cause deadlock).  But taking the
PROC_LOCK first (which may block), then taking the PROC_STATLOCK
(a spin lock) "inside" the outer PROC_LOCK default mutex, is OK.
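
The nesting rule (blocking mutex outside, spin mutex inside) can be
mimicked with a toy, single-threaded model. All names below (lock_class,
model_lock, spin_depth, ...) are invented for the sketch and none of it
is the kernel's mutex(9) code; where the real kernel would panic, the
model just returns false.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: LC_SLEEP plays MTX_DEF, LC_SPIN plays MTX_SPIN. */
enum lock_class { LC_SLEEP, LC_SPIN };

struct model_lock {
	enum lock_class cls;
	bool held;
};

int spin_depth;		/* number of spin locks currently held */

/* Returns true on success, false where the real kernel would panic. */
bool
model_lock(struct model_lock *l)
{
	/* A blocking lock may not be taken while a spin lock is held. */
	if (l->cls == LC_SLEEP && spin_depth > 0)
		return (false);
	assert(!l->held);	/* the model has no recursion */
	l->held = true;
	if (l->cls == LC_SPIN)
		spin_depth++;
	return (true);
}

void
model_unlock(struct model_lock *l)
{
	assert(l->held);
	l->held = false;
	if (l->cls == LC_SPIN)
		spin_depth--;
}
```

With a LC_SLEEP lock standing in for PROC_LOCK and a LC_SPIN lock for
PROC_STATLOCK, locking in the kern_clock_gettime() order succeeds, while
taking the spin lock first makes the second model_lock() call fail.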

(This is one of my mild objections to macros like PROC_LOCK and
PROC_STATLOCK: they hide whether the mutex in question is a spin
lock.)

Incidentally, any time you take *any* lock while holding any
other lock (e.g., lock A, then lock B while holding A), you have
created a "lock order" in which A precedes B.  If some other
code path locks B first, then while holding B, attempts to lock
A, you get a deadlock if both code paths are running at the same
time.  The WITNESS code dynamically discovers these various orders
and warns you at run time if you have a "lock order reversal"
(a case where one code path does A-then-B while another does
B-then-A).
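
To make the A-then-B versus B-then-A idea concrete, here is a very small
order recorder. It is only a sketch in the spirit of WITNESS, not how
WITNESS is implemented: the names (lo_seen, lo_acquire, ...) are made
up, locks are plain integer ids, there is a single thread, and only
direct pairwise reversals are noticed.

```c
#include <assert.h>
#include <stdbool.h>

#define LO_NLOCKS 4	/* lock ids 0..LO_NLOCKS-1; arbitrary for the sketch */

bool lo_seen[LO_NLOCKS][LO_NLOCKS]; /* lo_seen[a][b]: b taken while a held */
bool lo_held[LO_NLOCKS];

/*
 * Record that lock `id` is being acquired.  Returns true if some lock h
 * is held now although an earlier path took `id` first and then h.
 */
bool
lo_acquire(int id)
{
	bool reversal = false;

	assert(id >= 0 && id < LO_NLOCKS && !lo_held[id]);
	for (int h = 0; h < LO_NLOCKS; h++) {
		if (!lo_held[h])
			continue;
		if (lo_seen[id][h])
			reversal = true;	/* earlier order: id, then h */
		lo_seen[h][id] = true;		/* this order: h, then id */
	}
	lo_held[id] = true;
	return (reversal);
}

void
lo_release(int id)
{
	assert(id >= 0 && id < LO_NLOCKS && lo_held[id]);
	lo_held[id] = false;
}
```

Acquiring 0 then 1 records the order; a later path that acquires 1 and
then 0 makes lo_acquire(0) return true.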

(This is, in a sense, the same problem as discovering whether
there is a loop in a directed graph, or whether this directed
graph is acyclic.  If you can force the graph to take the shape of
a tree, rather than the more general graph, there will never be
any loops in it, and you will never have lock order reversals.
And of course if you have only *one* lock for some data, there is
nothing to be reversed.  Not all lock order reversals are
guaranteed to lead to deadlock, but sorting out which ones are
really OK, and which are not, is ... challenging.)
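
The graph view can be sketched with a depth-first search that looks for
a back edge. The structure and function names here (lock_graph,
lg_has_cycle, ...) are hypothetical, and the fixed-size adjacency matrix
is just the simplest representation for a small example.

```c
#include <assert.h>
#include <stdbool.h>

#define LG_MAXN 8

struct lock_graph {
	int n;				/* number of locks (nodes) */
	bool edge[LG_MAXN][LG_MAXN];	/* edge[a][b]: a is taken before b */
};

enum lg_color { LG_WHITE, LG_GREY, LG_BLACK };

bool
lg_visit(const struct lock_graph *g, int u, enum lg_color *c)
{
	c[u] = LG_GREY;			/* u is on the current DFS path */
	for (int v = 0; v < g->n; v++) {
		if (!g->edge[u][v])
			continue;
		if (c[v] == LG_GREY)
			return (true);	/* back edge, so there is a cycle */
		if (c[v] == LG_WHITE && lg_visit(g, v, c))
			return (true);
	}
	c[u] = LG_BLACK;		/* done; no cycle reachable from u */
	return (false);
}

/* True if the lock-order graph contains a cycle. */
bool
lg_has_cycle(const struct lock_graph *g)
{
	enum lg_color c[LG_MAXN];

	assert(g->n >= 0 && g->n <= LG_MAXN);
	for (int i = 0; i < g->n; i++)
		c[i] = LG_WHITE;
	for (int i = 0; i < g->n; i++)
		if (c[i] == LG_WHITE && lg_visit(g, i, c))
			return (true);
	return (false);
}
```

A chain 0 -> 1 -> 2 or any tree-shaped order is acyclic; adding the edge
2 -> 0 closes a loop, which corresponds to a potential lock order
reversal.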

Chris

From owner-freebsd-hackers@freebsd.org  Wed Apr 12 11:11:34 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id A8899D3BE55
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Wed, 12 Apr 2017 11:11:34 +0000 (UTC)
 (envelope-from f.v.anton@gmail.com)
Received: from mail-wm0-x230.google.com (mail-wm0-x230.google.com
 [IPv6:2a00:1450:400c:c09::230])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 61F3DF53;
 Wed, 12 Apr 2017 11:11:34 +0000 (UTC)
 (envelope-from f.v.anton@gmail.com)
Received: by mail-wm0-x230.google.com with SMTP id w204so17963605wmd.1;
 Wed, 12 Apr 2017 04:11:34 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=mime-version:in-reply-to:references:from:date:message-id:subject:to
 :cc; bh=01JWzR2h3/k+MbRsSu7klNwFdmVaCbHD6sYRuRRfEPw=;
 b=KK597+ncNU3vNRkDpAZotL6Nlg1MvPfNeweqQG3OCChnWc+bZRWcQZV+xCvvEcbNzE
 t2ku03xGEPfstF5DXPu+pVZ1w4fv9gtbbXm9y9datLaHVQPmRK5dsSH0vqF9FI/APayo
 bm4zTTopn4gEm3Pbrl3kYLhy3s0Cc9UDW7rVCj4leU+e382ctSjL+bL8u1SArU6CdwJs
 USsHb4wYeZmxEo+u4KRI+nnygEyA2uJNBRNEWLOD68oTaEu77u6XwQlO794/88AUuX7h
 n6D9UXa05p3vc6yNlvMLanhLW+bl+wIG/pHlj80d9pY1YdE8d2eJzpcnb//T76/pMjEQ
 FnoQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:mime-version:in-reply-to:references:from:date
 :message-id:subject:to:cc;
 bh=01JWzR2h3/k+MbRsSu7klNwFdmVaCbHD6sYRuRRfEPw=;
 b=nnbZdCG4sd1Vt2AK/kQeC91dw1TYkl9F+ivbrkUpUwKhBKww6m6dAndXPyJs/RjSFx
 LhXfEexY4CfQx/fyaS5/ueNLsBvrvD33x6t1Bq4r3q/1I3+8Iov9UmNemmXyuAIZqqRm
 RmrLG53opiQVy0rz8BspOW+5hvNwgEge1YM7I894gxxwtG4YW5Z2c+3r9VbC8+vhcNBz
 gLu6SIY/RyFwurVBPYTFY6KYqz1k+ykyZ5UEedqmim/Ezq0CCzYCvlynoXb7womqrKNO
 QFtmZai3Nq+iYptEyYxZ7mo2EsxF7nE8j6wtK1tjKeAgv1LC1txqpIS7aS3AtVsUS8h2
 agew==
X-Gm-Message-State: AN3rC/7nIYM0lYVjHBp4+TMIcNcTMJFzBfU+G1U0dXKLaw83JfpHd3jl
 hlOc6ubllGIfmkBmL58aLrPHLKCB/OAf
X-Received: by 10.28.6.203 with SMTP id 194mr20107190wmg.125.1491995491399;
 Wed, 12 Apr 2017 04:11:31 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.223.178.10 with HTTP; Wed, 12 Apr 2017 04:11:30 -0700 (PDT)
In-Reply-To: <201704112210.v3BMAVSe093702@elf.torek.net>
References: <CANXdjjZrjxhbqhZ13sAuZP7cqpvYU8CJusQ2NEpGuRCVMgr0=g@mail.gmail.com>
 <201704112210.v3BMAVSe093702@elf.torek.net>
From: Flavius Anton <f.v.anton@gmail.com>
Date: Wed, 12 Apr 2017 14:11:30 +0300
Message-ID: <CANXdjjbuRvh77zxmOyEkYKAeMsj7CEKYUCD1a4o72nPGz17-xA@mail.gmail.com>
Subject: Re: On COW memory mapping in d_mmap_single
To: Chris Torek <torek@elf.torek.net>, freebsd-hackers@freebsd.org
Cc: Peter Grehan <grehan@freebsd.org>
Content-Type: text/plain; charset=UTF-8
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 12 Apr 2017 11:11:34 -0000

Hi Chris,

Thanks a lot for your answer. I've added Peter to CC, as he knows
about this ongoing project and some of the design decisions, like the
COW mapping, were already taken to some extent when I joined. Please
see my in-lined answers below.

On Wed, Apr 12, 2017 at 1:10 AM, Chris Torek <torek@elf.torek.net> wrote:
>>Yes, all vCPUs are locked before calling mmap(). I agree that we don't
>>need 'COW', as long as we keep all vCPUs locked while we copy the
>>entire VM memory. But this might take a while, imagine a VM with 32GB
>>or more of RAM. This will take maybe minutes to write to disk, so we
>>don't actually want the VM to be frozen for so long. That's the
>>reason we'd like to map the memory COW and then unlock vCPUs.
>
> You'll need to save the device state while holding the CPUs locked,
> too, so that the virtio queues can be in sync when you restore.

Yes, saving vCPU state, vlapic, ioapic etc is done with all vCPUs
locked. Memory, on the other hand, may be too large and take too much
time to copy. I am working right now on saving virtio queues and
device state.

>>It's an OBJT_DEFAULT. It's not a device object, it's the memory object
>>given to guest to use as physical memory.
>
> Your copy code path is basically a simplified vm_map_copy_entry()
> as called from vmspace_fork() for the MAP_INHERIT case.  But if
> these are OBJT_DEFAULT, shouldn't you be calling vm_object_collapse()?
> See https://github.com/flaviusanton/freebsd/blob/bhyve-save-restore/sys/vm/vm_map.c#L3170
> (Maybe src_object->handle is never NULL?  There are several things
> in the VM object code that I do not understand fully here, so this
> might be the case.)

I saw those functions: vm_map_copy_entry() and vm_object_collapse(),
but I didn't have enough understanding of the whole system to be able
to tell if they might do some other things that we don't want them to.
I'll read them again after this e-mail.

>>>Next, how do you undo the damage done by your 'COW' ?
>
>>This is one thing that we've thought about, but we don't have a
>>solution for now. I agree it is very important, though. I figured that
>>it might be possible to 'unmark' the memory object as COW with some
>>additional tricks.
>
> I think you may be better off doing actual vm_map_copy_entry()
> calls.
>
> I am assuming, here, that snapshot-saving is implemented by
> sending a request to the running bhyve, which spins off a thread
> or process that does the snapshot-save.  If you spin it off as
> a real process, i.e., do a fork(), you will get the existing
> VM system to do all the work for you.  The overall strategy
> then looks something like this:
>
>     handle_external_suspend_or_snapshot_request() {
>         set global suspending flag /* if needed */
>         stop all vcpus
>         signal virtio and emulated devices to quiesce, if needed
>         if (snapshot) {
>             open snapshot file
>             pid = fork()
>             if (pid == 0) { /* child */
>                 COW is now in effect on memory: save more-volatile
>                     vcpu and dev state
>                 pthread_cond_signal parent that it's safe to resume
>                 save RAM state
>                 close snapshot file
>                 _exit(0)
>             }
>             if (pid < 0) ... handle error ...
>             /* parent */
>             close snapshot file
>             wait for child to signal OK to resume
>         } else {
>             wait for external resume signal
>         }
>         clear suspending flag
>         resume devices and vcpus
>     }
>
> To resume a snapshot from a file, we load its state and then run
> the last two steps (clear suspending flag and resume devices and
> vcpus).
>
> This way all the COW action happens through fork(), so there is no
> new kernel side code required

This looks perfect to me; it was one of my first questions when I
joined. However, I am not sure whether it's OK to fork the entire bhyve
memory space. I remember seeing some discussion about this, which is
why I CCed Peter. Right now we have a checkpoint thread that listens
for the checkpoint signal (via a UNIX socket). It then locks the vCPUs,
saves some state, requests the COW mapping (via ioctl), unlocks the
vCPUs and copies the COW memory to a checkpoint file. I haven't done
anything about unmapping the COW entry yet.
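
For comparison, the core of the fork()-based approach quoted above can
be tried in plain user space. This is only a sketch under simplifying
assumptions: snapshot_async is a made-up name, the "guest RAM" is an
ordinary buffer rather than bhyve's guest memory object, and the vCPU
and device state saving and the parent/child handshake from the
pseudocode are left out.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Write `len` bytes of `ram` to `path` from a forked child while the
 * parent keeps running.  Returns the child's pid, or -1 on error.
 */
pid_t
snapshot_async(const char *path, const unsigned char *ram, size_t len)
{
	int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

	if (fd == -1)
		return (-1);

	pid_t pid = fork();
	if (pid == 0) {
		/* Child: sees memory as it was at fork(), copy-on-write. */
		size_t off = 0;

		while (off < len) {
			ssize_t n = write(fd, ram + off, len - off);
			if (n <= 0)
				_exit(1);
			off += (size_t)n;
		}
		close(fd);
		_exit(0);
	}
	close(fd);		/* parent does not need the descriptor */
	return (pid);		/* -1 here means fork() itself failed */
}
```

The parent can modify the buffer right after snapshot_async() returns;
the file should still hold the contents from the moment of the fork().
The parent later reaps the child with waitpid() and checks its exit
status.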

> (Frankly, I think the hard part here is saving device and virtual
> APIC state.  If you have the vlapic state saving working, you have
> made pretty good progress.)

Thanks. I am almost sure it is not complete yet, but I have vlapic
state saved. Actually, I am able to restore VMs using a ramdisk and no
devices except the console. I'd like to open a pull request for review
as soon as possible, but in the meantime I have started looking at
virtio devices and at save/restore for virtio-net too.

--
Flavius

From owner-freebsd-hackers@freebsd.org  Wed Apr 12 11:56:55 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id BD77AD39CE6
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Wed, 12 Apr 2017 11:56:55 +0000 (UTC)
 (envelope-from ablacktshirt@gmail.com)
Received: from mail-pg0-x241.google.com (mail-pg0-x241.google.com
 [IPv6:2607:f8b0:400e:c05::241])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 8D80DB18
 for <freebsd-hackers@freebsd.org>; Wed, 12 Apr 2017 11:56:55 +0000 (UTC)
 (envelope-from ablacktshirt@gmail.com)
Received: by mail-pg0-x241.google.com with SMTP id 79so4917595pgf.0
 for <freebsd-hackers@freebsd.org>; Wed, 12 Apr 2017 04:56:55 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=subject:to:references:cc:from:message-id:date:user-agent
 :mime-version:in-reply-to:content-transfer-encoding;
 bh=nxeN4e5qP1PDoFMH8fkHZssfH3T8E3KVVH241a9rSBg=;
 b=LNLPD/WmNWQtmgoKvmxgt86srQQ0XLlc9iY7hCfuj5CLSE6g6/J7EZBB9UgA1s9/M+
 LBY8PYw4l1npYhGw1kkAEB2RLT3MVSJpVVzwY1wTEPoB7UXJgmCPDdeYL9u803ztaai/
 r9GmAKh45Za6NL19X5qRJ33hf6TmCMFDpbq6azIYxoEcN+hZ18w8hfm7wLrNqpFVmrsN
 YTmZExkNP6HaLGwghpkyHtBS89342QdtV0GducSBJSi8/s12tr1FkPMz3mZvuXizTt8V
 AVOLfo6waeXkbfAmM4JzeA2gYxmmpwnGe59S0sFdF9KJtfhK+TbmB+60WjQT90NFtQfM
 COEA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:subject:to:references:cc:from:message-id:date
 :user-agent:mime-version:in-reply-to:content-transfer-encoding;
 bh=nxeN4e5qP1PDoFMH8fkHZssfH3T8E3KVVH241a9rSBg=;
 b=CGEsVTesrtebwm6gNBdRbTIylCgrX6jx6ytPmuOYFXgExwIhIKhVfglyI8nZojoHTa
 YrfAH8Xz1dmgNUQdcp2LyvQC0aeSUh21rpksusrEBKyrzS71WNdtGog+GUq0A7vLMXEF
 BPaq8KY2ps7B0y8NjbXTKlZWN3F1KRD2eJUOjPoYRMnbyr6OICinzVT/v0KxuVf5X7Pl
 BlGsyWm3/tQC4r7zz2+VSWNVirg6xJO6bLIRRQfacSe2cGB6pbzr/JT4PV8vbO2UzuJQ
 z6R+KnV3O/IuMzYFbzua9AdzFzGpzqrn2gLkp0CJFGCqU3utHe2fQM/8wDUHcoJk0lbj
 u8/w==
X-Gm-Message-State: AFeK/H3QTuy1NCKMfizJSWdEuH64lNB3H5hKxSsFUa1tIOP1ikdsFrYwXBgBIwTCKR1wAA==
X-Received: by 10.84.212.8 with SMTP id d8mr82135131pli.152.1491998215161;
 Wed, 12 Apr 2017 04:56:55 -0700 (PDT)
Received: from [192.168.2.211] ([116.56.129.146])
 by smtp.gmail.com with ESMTPSA id b10sm5238515pfc.27.2017.04.12.04.56.52
 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Wed, 12 Apr 2017 04:56:54 -0700 (PDT)
Subject: Re: Understanding the FreeBSD locking mechanism
To: Chris Torek <torek@elf.torek.net>, imp@bsdimp.com
References: <201704120755.v3C7tYUH016699@elf.torek.net>
Cc: ed@nuxi.nl, freebsd-hackers@freebsd.org, rysto32@gmail.com
From: Yubin Ruan <ablacktshirt@gmail.com>
Message-ID: <aa5f22f4-0bec-f2a4-554b-f0055398eb7d@gmail.com>
Date: Wed, 12 Apr 2017 19:56:50 +0800
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
 Thunderbird/45.8.0
MIME-Version: 1.0
In-Reply-To: <201704120755.v3C7tYUH016699@elf.torek.net>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 12 Apr 2017 11:56:55 -0000

On 2017年04月12日 15:55, Chris Torek wrote:
>> clear. How can I distinguish these two conditions? I mean, whether I
>> am using my own state/stack or borrowing others' state.
>
> You choose it when you establish your interrupt handler.  If you
> say you are a filter interrupt, then you *are* one, and the rest
> of your code must be written as one.  Unless you know what you
> are doing, don't do this, and then you *aren't* one and the rest
> of your code can be written using the much more relaxed model.
>
>> What do you mean by "must take MTX_SPIN locks 'inside' any outer
>> MTX_DEF locks"?
>
> This means that any code path that is going to hold a spin-type
> lock must obtain it while already holding any applicable non-spin
> locks.  For instance, if we look at <sys/proc.h> we find these:
>
> 	#define	PROC_STATLOCK(p)	mtx_lock_spin(&(p)->p_statmtx)
> 	#define	PROC_ITIMLOCK(p)	mtx_lock_spin(&(p)->p_itimmtx)
> 	#define	PROC_PROFLOCK(p)	mtx_lock_spin(&(p)->p_profmtx)
>
> Let's find a bit of code that uses one, such as in kern_time.c:
>
> https://github.com/freebsd/freebsd/blob/master/sys/kern/kern_time.c#L338
>
> (kern_clock_gettime()).  This code reads:
>
> 	case CLOCK_PROF:
> 		PROC_LOCK(p);
> 		PROC_STATLOCK(p);
> 		calcru(p, &user, &sys);
> 		PROC_STATUNLOCK(p);
> 		PROC_UNLOCK(p);
> 		timevaladd(&user, &sys);
> 		TIMEVAL_TO_TIMESPEC(&user, ats);
> 		break;
>
> Note that the call to PROC_LOCK comes first, then the call to
> PROC_STATLOCK.  This is because PROC_LOCK
>
> https://github.com/freebsd/freebsd/blob/master/sys/sys/proc.h#L825
>
> is defined as:
>
> 	#define	PROC_LOCK(p)	mtx_lock(&(p)->p_mtx)
>
> If you obtain the locks in the other order -- i.e., if you grab
> the PROC_STATLOCK first, then try to lock PROC_LOCK -- you are
> trying to take a spin-type mutex while holding a default mutex,

Is this a typo? I guess you mean something like "you are trying
to take a blocking mutex while holding spin-type mutex".

> and this is not allowed (can cause deadlock).  But taking the
> PROC_LOCK first (which may block), then taking the PROC_STATLOCK
> (a spin lock) "inside" the outer PROC_LOCK default mutex, is OK.

I think I get your point: if you take a spin-type mutex, you have
already disabled interrupts, which in effect means that no other
code can preempt you. Under this circumstance, if you go on to
take a blocking mutex, you may get blocked. Since you have already
disabled interrupts and nobody can interrupt/preempt you, you are blocked
on that CPU, not being able to do anything, which is pretty much a
"deadlock" (actually this is not a deadlock, but it is similar).

Regards,
Yubin Ruan

> (This is one of my mild objections to macros like PROC_LOCK and
> PROC_STATLOCK: they hide whether the mutex in question is a spin
> lock.)
>
> Incidentally, any time you take *any* lock while holding any
> other lock (e.g., lock A, then lock B while holding A), you have
> created a "lock order" in which A precedes B.  If some other
> code path locks B first, then while holding B, attempts to lock
> A, you get a deadlock if both code paths are running at the same
> time.  The WITNESS code dynamically discovers these various orders
> and warns you at run time if you have a "lock order reversal"
> (a case where one code path does A-then-B while another does
> B-then-A).
>
> (This is, in a sense, the same problem as discovering whether
> there is a loop in a directed graph, or whether this directed
> graph is acyclic.  If you can force the graph to take the shape of
> a tree, rather than the more general graph, there will never be
> any loops in it, and you will never have lock order reversals.
> And of course if you have only *one* lock for some data, there is
> nothing to be reversed.  Not all lock order reversals are
> guaranteed to lead to deadlock, but sorting out which ones are
> really OK, and which are not, is ... challenging.)
>
> Chris
>
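
The directed-graph view of lock order quoted above can be sketched in a
few lines of userland C. This is only an illustration and not how
WITNESS is implemented: the names lock_order_add() and
lock_order_reaches(), the integer lock IDs, and the fixed-size
adjacency matrix are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_LOCKS 8	/* hypothetical limit, for illustration only */

/* edge[a][b] is true once some code path took lock b while holding lock a. */
static bool edge[MAX_LOCKS][MAX_LOCKS];

/* Depth-first search: is 'to' reachable from 'from' along recorded edges? */
static bool
lock_order_reaches(int from, int to, bool *seen)
{
	if (from == to)
		return (true);
	seen[from] = true;
	for (int next = 0; next < MAX_LOCKS; next++) {
		if (edge[from][next] && !seen[next] &&
		    lock_order_reaches(next, to, seen))
			return (true);
	}
	return (false);
}

/*
 * Record "held, then taken".  Returns false (and records nothing) if the
 * new edge would close a cycle, i.e. would be a lock order reversal.
 */
static bool
lock_order_add(int held, int taken)
{
	bool seen[MAX_LOCKS];

	assert(held >= 0 && held < MAX_LOCKS);
	assert(taken >= 0 && taken < MAX_LOCKS);
	memset(seen, 0, sizeof(seen));
	if (lock_order_reaches(taken, held, seen))
		return (false);
	edge[held][taken] = true;
	return (true);
}
```

Recording A-then-B and B-then-C succeeds; a later attempt to record
C-then-A is rejected because A already reaches C through B.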


From owner-freebsd-hackers@freebsd.org  Wed Apr 12 18:53:45 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 9ED29D3BB1A
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Wed, 12 Apr 2017 18:53:45 +0000 (UTC)
 (envelope-from torek@elf.torek.net)
Received: from elf.torek.net (mail.torek.net [96.90.199.121])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client CN "elf.torek.net", Issuer "elf.torek.net" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id 8A0E1B29
 for <freebsd-hackers@freebsd.org>; Wed, 12 Apr 2017 18:53:44 +0000 (UTC)
 (envelope-from torek@elf.torek.net)
Received: from elf.torek.net (localhost [127.0.0.1])
 by elf.torek.net (8.15.2/8.15.2) with ESMTPS id v3CIrgrQ055169
 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO);
 Wed, 12 Apr 2017 11:53:43 -0700 (PDT)
 (envelope-from torek@elf.torek.net)
Received: (from torek@localhost)
 by elf.torek.net (8.15.2/8.15.2/Submit) id v3CIrg5d055158;
 Wed, 12 Apr 2017 11:53:42 -0700 (PDT) (envelope-from torek)
Date: Wed, 12 Apr 2017 11:53:42 -0700 (PDT)
From: Chris Torek <torek@elf.torek.net>
Message-Id: <201704121853.v3CIrg5d055158@elf.torek.net>
To: ablacktshirt@gmail.com, imp@bsdimp.com
Subject: Re: Understanding the FreeBSD locking mechanism
Cc: ed@nuxi.nl, freebsd-hackers@freebsd.org, rysto32@gmail.com
In-Reply-To: <aa5f22f4-0bec-f2a4-554b-f0055398eb7d@gmail.com>
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.6.2
 (elf.torek.net [127.0.0.1]); Wed, 12 Apr 2017 11:53:43 -0700 (PDT)
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 12 Apr 2017 18:53:45 -0000

>> If you obtain the locks in the other order -- i.e., if you grab
>> the PROC_STATLOCK first, then try to lock PROC_LOCK -- you are
>> trying to take a spin-type mutex while holding a default mutex,

>Is this a typo? I guess you mean something like "you are trying
>to take a blocking mutex while holding spin-type mutex".

Yes, or rather brain-o (swapping words) -- these most often happen
if I am interrupted while composing a message :-)

>I think I get your point: if you take a spin-type mutex, you have
>already disabled interrupts, which in effect means that no other
>code can preempt you. Under this circumstance, if you go on to
>take a blocking mutex, you may get blocked. Since you have already
>disabled interrupts and nobody can interrupt/preempt you, you are blocked
>on that CPU, not being able to do anything, which is pretty much a
>"deadlock" (actually this is not a deadlock, but it is similar).

Right.  It *may* deadlock, and it is definitely not good -- and
the INVARIANTS kernel will check and panic.

Chris
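
The ordering rule discussed in this thread can be modeled with a toy
userland checker. It is only a sketch under stated assumptions:
struct toy_thread and the toy_lock_*() / toy_unlock_*() helpers are
hypothetical names, and the checks in a real kernel are considerably
more involved than a pair of counters.

```c
#include <assert.h>
#include <stdbool.h>

/* Per-thread bookkeeping: how many spin and default mutexes are held. */
struct toy_thread {
	int	spin_held;
	int	default_held;
};

/* A default (blocking) mutex may be taken only with no spin mutex held. */
static bool
toy_lock_default(struct toy_thread *td)
{
	if (td->spin_held > 0)
		return (false);	/* a checking kernel would complain here */
	td->default_held++;
	return (true);
}

/* A spin mutex may be taken "inside" any number of default mutexes. */
static bool
toy_lock_spin(struct toy_thread *td)
{
	td->spin_held++;
	return (true);
}

static void
toy_unlock_spin(struct toy_thread *td)
{
	assert(td->spin_held > 0);
	td->spin_held--;
}

static void
toy_unlock_default(struct toy_thread *td)
{
	assert(td->default_held > 0);
	td->default_held--;
}
```

Taking the default mutex and then the spin mutex passes; taking the
spin mutex first makes toy_lock_default() return false.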

From owner-freebsd-hackers@freebsd.org  Thu Apr 13 01:26:24 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id D3795D38DA8
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Thu, 13 Apr 2017 01:26:24 +0000 (UTC)
 (envelope-from otacilio.neto@bsd.com.br)
Received: from mail-qk0-x22a.google.com (mail-qk0-x22a.google.com
 [IPv6:2607:f8b0:400d:c09::22a])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 8CF29DCB
 for <freebsd-hackers@freebsd.org>; Thu, 13 Apr 2017 01:26:23 +0000 (UTC)
 (envelope-from otacilio.neto@bsd.com.br)
Received: by mail-qk0-x22a.google.com with SMTP id f133so37619793qke.2
 for <freebsd-hackers@freebsd.org>; Wed, 12 Apr 2017 18:26:23 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bsd.com.br; s=capeta;
 h=subject:to:references:from:message-id:date:user-agent:mime-version
 :in-reply-to:content-transfer-encoding;
 bh=N383ORst11g2PiAMLlRNgwH1Bxa1LHUEDkL69gyEK+g=;
 b=NGJrwtu2HcImIkgWg/AupeZmZPg22XGB2Hnf4fVc8hiTSmyVfy3Nbx479DvChINQEv
 etgNJbhVbrkQugs2EWIb9t/kk4B185f3DIiUxO2O3tMw6cUXia7XtWAmdftaXCvZSuKO
 L88+9L3Cuix3mQJrhdghn2d1YQ0bYtMVAyx9E=
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:subject:to:references:from:message-id:date
 :user-agent:mime-version:in-reply-to:content-transfer-encoding;
 bh=N383ORst11g2PiAMLlRNgwH1Bxa1LHUEDkL69gyEK+g=;
 b=nnfJninENWqWQGgX2JR1UEVjK42PCPYns6vcj2lUxpOH7ZRGoRHH2J+71N0eCaTNpb
 OC8PAlVgh/RwIAX6FAi8qznaC5Ak0LOwY//9BmbK/eSS+YWu6fAFaU/k5A11AZxbPlrE
 ikzlfbB6dI8Bs5g4opFNN1EhaC3lGLluW5Ljbj2ZNqFq04ytznhZsRa2irTvHON7boQ+
 68nA5OcOEASQg0xFGAlaf502V3//9qhgQ46NOIVJLgNoBilB0+25wFKH5Drw8LbcIIiL
 uJHDQq2j+RyY2qftGBbtFX2pXUB6J0LMaMkhnyzsAIW75GZYXVcUH/sM8Nlno9ra77/7
 26CQ==
X-Gm-Message-State: AN3rC/7zGUzpA8zheX4hm6O6At+QyxC/hEA4osAHp8plF876ZNc3cl+O
 HWrVcvS0N4iiuxTL
X-Received: by 10.55.102.193 with SMTP id a184mr411486qkc.309.1492046782712;
 Wed, 12 Apr 2017 18:26:22 -0700 (PDT)
Received: from ?IPv6:2804:54:19ef:cc00:c47d:8860:c52b:6c79?
 ([2804:54:19ef:cc00:c47d:8860:c52b:6c79])
 by smtp.googlemail.com with ESMTPSA id q80sm14735457qkq.16.2017.04.12.18.26.21
 for <freebsd-hackers@freebsd.org>
 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Wed, 12 Apr 2017 18:26:22 -0700 (PDT)
Subject: Re: The arm64 fork-then-swap-out-then-swap-in failures: a program
 source for exploring them
To: freebsd-hackers@freebsd.org
References: <4DEA2D76-9F27-426D-A8D2-F07B16575FB9@dsl-only.net>
 <163B37B0-55D6-498E-8F52-9A95C036CDFA@dsl-only.net>
 <08E7A5B0-8707-4479-9D7A-272C427FF643@dsl-only.net>
 <20170409122715.GF1788@kib.kiev.ua>
 <9D152170-5F19-47A2-A06A-66F83CA88A09@dsl-only.net>
 <9DCAF95B-39A5-4346-88FC-6AFDEE8CF9BB@dsl-only.net>
 <8FFE95AA-DB40-4D1E-A103-4BA9FCC6EDEE@dsl-only.net>
 <89D6D677-3BE2-45E2-A902-CC6A0305F3F9@dsl-only.net>
 <585B43F7-D4C8-431A-BFFE-68B48C3214AE@dsl-only.net>
 <876EA1E4-E5A9-411C-AFFD-989713037C19@dsl-only.net>
From: =?UTF-8?B?T3RhY8OtbGlv?= <otacilio.neto@bsd.com.br>
Message-ID: <7adada71-e089-e105-eec8-6136d4b8c083@bsd.com.br>
Date: Wed, 12 Apr 2017 22:25:43 -0300
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:45.0) Gecko/20100101
 Thunderbird/45.8.0
MIME-Version: 1.0
In-Reply-To: <876EA1E4-E5A9-411C-AFFD-989713037C19@dsl-only.net>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 13 Apr 2017 01:26:24 -0000

On 10/04/2017 17:15, Mark Millard wrote:
> On 2017-Apr-10, at 2:51 AM, Mark Millard <markmi at dsl-only.net> wrote:
>
>> On 2017-Apr-9, at 5:10 PM, Mark Millard <markmi at dsl-only.net> wrote:
>>
>>> On 2017-Apr-9, at 10:24 AM, Mark Millard <markmi at dsl-only.net> wrote:
>>>
>>>> On 2017-Apr-9, at 5:27 AM, Konstantin Belousov <kostikbel@gmail.com> wrote:
>>>>> Hmm, could you try the following patch, I did not even compile it.
>>>> I'll try it later today.
>>>>
>>>>> diff --git a/sys/arm64/arm64/pmap.c b/sys/arm64/arm64/pmap.c
>>>>> index 3d5756ba891..55aa402eb1c 100644
>>>>> --- a/sys/arm64/arm64/pmap.c
>>>>> +++ b/sys/arm64/arm64/pmap.c
>>>>> @@ -2481,6 +2481,11 @@ pmap_protect(pmap_t pmap, vm_offset_t sva, vm_offset_t eva, vm_prot_t prot)
>>>>> 		    sva += L3_SIZE) {
>>>>> 			l3 = pmap_load(l3p);
>>>>> 			if (pmap_l3_valid(l3)) {
>>>>> +				if ((l3 & ATTR_SW_MANAGED) &&
>>>>> +				    pmap_page_dirty(l3)) {
>>>>> +					vm_page_dirty(PHYS_TO_VM_PAGE(l3 &
>>>>> +					    ~ATTR_MASK));
>>>>> +				}
>>>>> 				pmap_set(l3p, ATTR_AP(ATTR_AP_RO));
>>>>> 				PTE_SYNC(l3p);
>>>>> 				/* XXX: Use pmap_invalidate_range */
>>>
>>> Preliminary testing indicates that this fixes the
>>> some-pages-become-zero problem for fork-then-swapout/in.
>>>
>>> Thanks!
>>>
>>> I'll see if a buildworld can go through without being stopped
>>> by this type of issue. But that will take a while. (It is how
>>> I originally ran into the problem(s) that others had been
>>> reporting on the lists.)
>> buildworld buildkernel completed non-stop for the first time
>> on a BPI-M3 board.
> I had been thinking of the BPI-M3 for other reasons
> and typed that instead of the correct: Pine64+ 2GB.
> (True elsewhere as well.) I do really mean arm64
> here, not armv7.
>
>> Looks good for a check-in to svn to me (head and stable/11).
>>
>> This combined with 2017-Feb-15's -r313772's fix to the fork
>> trampoline code's updating of sp_el0 makes arm64 far more stable
>> for my purposes.
>>
>> -r313772 was never MFC'd to stable/11. In my view it should be.
> ===
> Mark Millard
> markmi at dsl-only.net
>
Dear all,

Will this patch be committed to HEAD?

Regards,
-Otacilio

From owner-freebsd-hackers@freebsd.org  Thu Apr 13 04:59:37 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 402ECD3BBA7
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Thu, 13 Apr 2017 04:59:37 +0000 (UTC)
 (envelope-from markmi@dsl-only.net)
Received: from asp.reflexion.net (outbound-mail-210-43.reflexion.net
 [208.70.210.43])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 0296680C
 for <freebsd-hackers@freebsd.org>; Thu, 13 Apr 2017 04:59:36 +0000 (UTC)
 (envelope-from markmi@dsl-only.net)
Received: (qmail 18806 invoked from network); 13 Apr 2017 04:59:34 -0000
Received: from unknown (HELO rtc-sm-01.app.dca.reflexion.local) (10.81.150.1)
 by 0 (rfx-qmail) with SMTP; 13 Apr 2017 04:59:34 -0000
Received: by rtc-sm-01.app.dca.reflexion.local
 (Reflexion email security v8.40.0) with SMTP;
 Thu, 13 Apr 2017 00:59:34 -0400 (EDT)
Received: (qmail 15295 invoked from network); 13 Apr 2017 04:59:34 -0000
Received: from unknown (HELO iron2.pdx.net) (69.64.224.71)
 by 0 (rfx-qmail) with (AES256-SHA encrypted) SMTP; 13 Apr 2017 04:59:34 -0000
Received: from [192.168.1.106] (c-76-115-7-162.hsd1.or.comcast.net
 [76.115.7.162])
 by iron2.pdx.net (Postfix) with ESMTPSA id 65DFFEC8B66;
 Wed, 12 Apr 2017 21:59:33 -0700 (PDT)
Content-Type: text/plain; charset=utf-8
Mime-Version: 1.0 (Mac OS X Mail 10.3 \(3273\))
Subject: Re: The arm64 fork-then-swap-out-then-swap-in failures: a program
 source for exploring them
From: Mark Millard <markmi@dsl-only.net>
In-Reply-To: <7adada71-e089-e105-eec8-6136d4b8c083@bsd.com.br>
Date: Wed, 12 Apr 2017 21:59:32 -0700
Cc: freebsd-hackers@freebsd.org
Content-Transfer-Encoding: quoted-printable
Message-Id: <C648CF35-E122-49B9-A198-7722143EF2F5@dsl-only.net>
References: <4DEA2D76-9F27-426D-A8D2-F07B16575FB9@dsl-only.net>
 <163B37B0-55D6-498E-8F52-9A95C036CDFA@dsl-only.net>
 <08E7A5B0-8707-4479-9D7A-272C427FF643@dsl-only.net>
 <20170409122715.GF1788@kib.kiev.ua>
 <9D152170-5F19-47A2-A06A-66F83CA88A09@dsl-only.net>
 <9DCAF95B-39A5-4346-88FC-6AFDEE8CF9BB@dsl-only.net>
 <8FFE95AA-DB40-4D1E-A103-4BA9FCC6EDEE@dsl-only.net>
 <89D6D677-3BE2-45E2-A902-CC6A0305F3F9@dsl-only.net>
 <585B43F7-D4C8-431A-BFFE-68B48C3214AE@dsl-only.net>
 <876EA1E4-E5A9-411C-AFFD-989713037C19@dsl-only.net>
 <7adada71-e089-e105-eec8-6136d4b8c083@bsd.com.br>
To: =?utf-8?B?T3RhY8OtbGlv?= <otacilio.neto@bsd.com.br>
X-Mailer: Apple Mail (2.3273)
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 13 Apr 2017 04:59:37 -0000

On 2017-Apr-12, at 6:25 PM, Otacílio <otacilio.neto at bsd.com.br> wrote:

> On 10/04/2017 17:15, Mark Millard wrote:
>> On 2017-Apr-10, at 2:51 AM, Mark Millard <markmi at dsl-only.net> wrote:
>>
>>> On 2017-Apr-9, at 5:10 PM, Mark Millard <markmi at dsl-only.net> wrote:
>>>
>>>> On 2017-Apr-9, at 10:24 AM, Mark Millard <markmi at dsl-only.net> wrote:
>>>>
>>>>> On 2017-Apr-9, at 5:27 AM, Konstantin Belousov <kostikbel@gmail.com> wrote:
>>>>>> Hmm, could you try the following patch, I did not even compile it.
>>>>> I'll try it later today.
>>>>>
>>>>>> diff --git a/sys/arm64/arm64/pmap.c b/sys/arm64/arm64/pmap.c
>>>>>> index 3d5756ba891..55aa402eb1c 100644
>>>>>> --- a/sys/arm64/arm64/pmap.c
>>>>>> +++ b/sys/arm64/arm64/pmap.c
>>>>>> @@ -2481,6 +2481,11 @@ pmap_protect(pmap_t pmap, vm_offset_t sva, vm_offset_t eva, vm_prot_t prot)
>>>>>> 		    sva += L3_SIZE) {
>>>>>> 			l3 = pmap_load(l3p);
>>>>>> 			if (pmap_l3_valid(l3)) {
>>>>>> +				if ((l3 & ATTR_SW_MANAGED) &&
>>>>>> +				    pmap_page_dirty(l3)) {
>>>>>> +					vm_page_dirty(PHYS_TO_VM_PAGE(l3 &
>>>>>> +					    ~ATTR_MASK));
>>>>>> +				}
>>>>>> 				pmap_set(l3p, ATTR_AP(ATTR_AP_RO));
>>>>>> 				PTE_SYNC(l3p);
>>>>>> 				/* XXX: Use pmap_invalidate_range */
>>>>
>>>> Preliminary testing indicates that this fixes the
>>>> some-pages-become-zero problem for fork-then-swapout/in.
>>>>
>>>> Thanks!
>>>>
>>>> I'll see if a buildworld can go through without being stopped
>>>> by this type of issue. But that will take a while. (It is how
>>>> I originally ran into the problem(s) that others had been
>>>> reporting on the lists.)
>>> buildworld buildkernel completed non-stop for the first time
>>> on a BPI-M3 board.
>> I had been thinking of the BPI-M3 for other reasons
>> and typed that instead of the correct: Pine64+ 2GB.
>> (True elsewhere as well.) I do really mean arm64
>> here, not armv7.
>>
>>> Looks good for a check-in to svn to me (head and stable/11).
>>>
>>> This combined with 2017-Feb-15's -r313772's fix to the fork
>>> trampoline code's updating of sp_el0 makes arm64 far more stable
>>> for my purposes.
>>>
>>> -r313772 was never MFC'd to stable/11. In my view it should be.
>> ===
>> Mark Millard
>> markmi at dsl-only.net
>>
> Dear all,
>
> Will this patch be committed to HEAD?

It was:

Author: kib
Date: Mon Apr 10 15:32:26 2017
New Revision: 316679
URL: https://svnweb.freebsd.org/changeset/base/316679


Log:
  Do not lose dirty bits for removing PROT_WRITE on arm64.

  Arm64 pmap interprets accessed writable ptes as modified, since
  ARMv8.0 does not track Dirty Bit Modifier in hardware. If writable bit
  is removed, page must be marked as dirty for MI VM.

  This change is most important for COW, where fork caused losing
  content of the dirty pages which were not yet scanned by pagedaemon.

  Reviewed by:	alc, andrew
  Reported and tested by:	Mark Millard <markmi at dsl-only.net>
  PR:	217138, 217239
  Sponsored by:	The FreeBSD Foundation
  MFC after:	2 weeks

Modified:
  head/sys/arm64/arm64/pmap.c

Modified: head/sys/arm64/arm64/pmap.c
==============================================================================
--- head/sys/arm64/arm64/pmap.c	Mon Apr 10 12:35:58 2017	(r316678)
+++ head/sys/arm64/arm64/pmap.c	Mon Apr 10 15:32:26 2017	(r316679)
@@ -2481,6 +2481,11 @@ pmap_protect(pmap_t pmap, vm_offset_t sv
 		    sva += L3_SIZE) {
 			l3 = pmap_load(l3p);
 			if (pmap_l3_valid(l3)) {
+				if ((l3 & ATTR_SW_MANAGED) &&
+				    pmap_page_dirty(l3)) {
+					vm_page_dirty(PHYS_TO_VM_PAGE(l3 &
+					    ~ATTR_MASK));
+				}
 				pmap_set(l3p, ATTR_AP(ATTR_AP_RO));
 				PTE_SYNC(l3p);
 				/* XXX: Use pmap_invalidate_range */


There was a patch ( -r313772 ) committed to head back in Feb. for
interrupts sometimes trashing a special register during fork. It
takes both of these patches to get fork working reliably.

[stable/11 should eventually get both of these patches
so that fork becomes reliable there for aarch64 (armv8.0).]
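
To illustrate why the commit above matters, here is a toy userland
model of the problem it describes. It is a sketch only: the TOY_* flag
values, struct toy_page, and the toy_protect_*() helpers are
hypothetical stand-ins rather than the real pmap interfaces, and the
ATTR_SW_MANAGED test from the actual patch is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_ACCESSED	0x1u	/* hypothetical "accessed" flag */
#define TOY_READONLY	0x2u	/* hypothetical "read-only" flag */

struct toy_page {
	bool	dirty;		/* machine-independent dirty state */
};

/* With no hardware dirty bit, "accessed and writable" is read as modified. */
static bool
toy_pte_dirty(uint32_t pte)
{
	return ((pte & TOY_ACCESSED) != 0 && (pte & TOY_READONLY) == 0);
}

/* Pre-fix behavior: write access is removed and the dirty hint vanishes. */
static void
toy_protect_old(uint32_t *pte, struct toy_page *pg)
{
	assert(pte != NULL && pg != NULL);
	*pte |= TOY_READONLY;
}

/* Post-fix behavior: save the dirty state in the page before going RO. */
static void
toy_protect_new(uint32_t *pte, struct toy_page *pg)
{
	assert(pte != NULL && pg != NULL);
	if (toy_pte_dirty(*pte))
		pg->dirty = true;
	*pte |= TOY_READONLY;
}
```

After toy_protect_old() neither the entry nor the page records the
modification, so a later page-out could treat the page as clean; after
toy_protect_new() the page itself carries the dirty state.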

===
Mark Millard
markmi at dsl-only.net


From owner-freebsd-hackers@freebsd.org  Thu Apr 13 09:28:17 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 1B93ED3BAA3
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Thu, 13 Apr 2017 09:28:17 +0000 (UTC)
 (envelope-from ablacktshirt@gmail.com)
Received: from mail-pg0-x242.google.com (mail-pg0-x242.google.com
 [IPv6:2607:f8b0:400e:c05::242])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id DFB02C5F
 for <freebsd-hackers@freebsd.org>; Thu, 13 Apr 2017 09:28:16 +0000 (UTC)
 (envelope-from ablacktshirt@gmail.com)
Received: by mail-pg0-x242.google.com with SMTP id g2so10283164pge.2
 for <freebsd-hackers@freebsd.org>; Thu, 13 Apr 2017 02:28:16 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=subject:to:references:cc:from:message-id:date:user-agent
 :mime-version:in-reply-to:content-transfer-encoding;
 bh=a2j1IuhOD6HMFFaoSk4s7VuV09u2nX42VOKZHelFk04=;
 b=QOHWhM+dLG6bmRWRId5jeVy5IGIVufHvKfJf7+5e/zLci7cd7S0XL9mS2IcOLl8sms
 nzHrjXoyepdwZ5QL4k4q1ia9PDd7VOc/BmBUmIxDdhOVL32jWP+nM+I4ZQWWg430VojY
 TWg47ObeVpqpnIw4U/t9N654BRrOXSkCNmRDBBTbxLp6MaT5O5Kqbq/o8+8r2l0VUHQ5
 mtt8uMXqW+SauLqro/ctLvlfnl2WKlLiTMqs8FcrSPcVjRAwI4g1bKSjBKzKeqU7Ulx8
 ohN0TgKWsENRqN6O5LSrsyTfUQO22KfOzugPa8RD9tZsZp+rjKpozFMGs9ydNrUNF1xv
 00LQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:subject:to:references:cc:from:message-id:date
 :user-agent:mime-version:in-reply-to:content-transfer-encoding;
 bh=a2j1IuhOD6HMFFaoSk4s7VuV09u2nX42VOKZHelFk04=;
 b=fgogqMPPKh+I9oSeQ2CcH8QP1yoLZIXdczIYv4+0aUPLVjDY0HaUAYjktqItqCOQJJ
 MkD7USaYf7NW10wPsxbhKoCSBtBe9vsIEmFslP1QVyxU1yTZDmaTt9v8UKWwnxBKpjgy
 g1DOeD8JHCwHav7FP0LKVcqAOUvsIInzJPQ4of87p/upB0tPEoaxo/ZHJ+QZ43U9Qw6o
 DktxTvTgxOdobxga3t3eUTxabt/TAkL7hqq7m6lO5b+PfQLPKBz6P4NYKGjpPxP3wxxF
 2FVKc/gApvjNQpxuyFz6iN0s1sbOWBt4HW86wlVYdIpGcq/k/DHWl12f9anM+qi3SGZL
 mSsw==
X-Gm-Message-State: AN3rC/5WRQd6ot8lARfXkM2Ek1hW8IJLQm/YlrugUk8ynChTYzoEmVdl
 iQbq4eR1eMJuzSUjBpw=
X-Received: by 10.98.14.28 with SMTP id w28mr2343869pfi.59.1492075696305;
 Thu, 13 Apr 2017 02:28:16 -0700 (PDT)
Received: from [192.168.2.211] ([116.56.129.146])
 by smtp.gmail.com with ESMTPSA id l127sm41436091pga.7.2017.04.13.02.28.07
 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Thu, 13 Apr 2017 02:28:15 -0700 (PDT)
Subject: Re: Understanding the FreeBSD locking mechanism
To: Chris Torek <torek@elf.torek.net>, imp@bsdimp.com
References: <201704121853.v3CIrg5d055158@elf.torek.net>
Cc: ed@nuxi.nl, freebsd-hackers@freebsd.org, rysto32@gmail.com,
 kostikbel@gmail.com
From: Yubin Ruan <ablacktshirt@gmail.com>
Message-ID: <06a30d21-acff-efb2-ff58-9aa66793e929@gmail.com>
Date: Thu, 13 Apr 2017 17:28:04 +0800
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
 Thunderbird/45.8.0
MIME-Version: 1.0
In-Reply-To: <201704121853.v3CIrg5d055158@elf.torek.net>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 13 Apr 2017 09:28:17 -0000

On 2017-04-13 02:53, Chris Torek wrote:
>>> If you obtain the locks in the other order -- i.e., if you grab
>>> the PROC_STATLOCK first, then try to lock PROC_LOCK -- you are
>>> trying to take a spin-type mutex while holding a default mutex,
>
>> Is this a typo? I guess you mean something like "you are trying
>> to take a blocking mutex while holding spin-type mutex".
>
> Yes, or rather brain-o (swapping words) -- these most often happen
> if I am interrupted while composing a message :-)
>
>> I think I get your point: if you take a spin-type mutex, you have
>> already disabled interrupts, which in effect means that no other
>> code can preempt you. Under this circumstance, if you go on to
>> take a blocking mutex, you may get blocked. Since you have already
>> disabled interrupts and nobody can interrupt/preempt you, you are blocked
>> on that CPU, not being able to do anything, which is pretty much a
>> "deadlock" (actually this is not a deadlock, but it is similar).
>
> Right.  It *may* deadlock, and it is definitely not good -- and
> the INVARIANTS kernel will check and panic.

I discovered that, in the current FreeBSD implementation, a spin lock
does not disable interrupts entirely:


	for (;;) {
		if (m->mtx_lock == MTX_UNOWNED && _mtx_obtain_lock(m, tid))
			break;
		/* Give interrupts a chance while we spin. */
		spinlock_exit();
		while (m->mtx_lock != MTX_UNOWNED) {
			if (i++ < 10000000) {
				cpu_spinwait();
				continue;
			}
			if (i < 60000000 || kdb_active || panicstr != NULL)
				DELAY(1);
			else
				_mtx_lock_spin_failed(m);
			cpu_spinwait();
		}
		spinlock_enter();
	}

This is `_mtx_lock_spin_cookie(...)` in kern/kern_mutex.c, which
implements the core logic of spinning. However, as you can see, while
spinning, it re-enables interrupts "occasionally" and then disables
them again... What is the rationale for that?

Regards,
Yubin Ruan

From owner-freebsd-hackers@freebsd.org  Thu Apr 13 12:18:20 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 70AF4D380E3
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Thu, 13 Apr 2017 12:18:20 +0000 (UTC)
 (envelope-from torek@elf.torek.net)
Received: from elf.torek.net (mail.torek.net [96.90.199.121])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client CN "elf.torek.net", Issuer "elf.torek.net" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id DA38CF1C
 for <freebsd-hackers@freebsd.org>; Thu, 13 Apr 2017 12:18:19 +0000 (UTC)
 (envelope-from torek@elf.torek.net)
Received: from elf.torek.net (localhost [127.0.0.1])
 by elf.torek.net (8.15.2/8.15.2) with ESMTPS id v3DCIBg4093208
 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO);
 Thu, 13 Apr 2017 05:18:11 -0700 (PDT)
 (envelope-from torek@elf.torek.net)
Received: (from torek@localhost)
 by elf.torek.net (8.15.2/8.15.2/Submit) id v3DCIBJg093207;
 Thu, 13 Apr 2017 05:18:11 -0700 (PDT) (envelope-from torek)
Date: Thu, 13 Apr 2017 05:18:11 -0700 (PDT)
From: Chris Torek <torek@elf.torek.net>
Message-Id: <201704131218.v3DCIBJg093207@elf.torek.net>
To: ablacktshirt@gmail.com, imp@bsdimp.com
Subject: Re: Understanding the FreeBSD locking mechanism
Cc: ed@nuxi.nl, freebsd-hackers@freebsd.org, kostikbel@gmail.com,
 rysto32@gmail.com
In-Reply-To: <06a30d21-acff-efb2-ff58-9aa66793e929@gmail.com>
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.6.2
 (elf.torek.net [127.0.0.1]); Thu, 13 Apr 2017 05:18:11 -0700 (PDT)
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 13 Apr 2017 12:18:20 -0000

>I discovered that, in the current FreeBSD implementation, a spin lock
>does not disable interrupts entirely:
[extra-snipped here]
>		/* Give interrupts a chance while we spin. */
>		spinlock_exit();
>		while (m->mtx_lock != MTX_UNOWNED) {
[more snip]

>This is `_mtx_lock_spin_cookie(...)` in kern/kern_mutex.c, which
>implements the core logic of spinning. However, as you can see, while
>spinning, it re-enables interrupts "occasionally" and then disables
>them again... What is the rationale for that?

This code snippet is slightly misleading.  The full code path runs
from mtx_lock_spin() through __mtx_lock_spin(), which first
invokes spinlock_enter() and then, in the *contested* case (only),
calls _mtx_lock_spin_cookie().

spinlock_enter() is:

	td = curthread;
	if (td->td_md.md_spinlock_count == 0) {
		flags = intr_disable();
		td->td_md.md_spinlock_count = 1;
		td->td_md.md_saved_flags = flags;
	} else
		td->td_md.md_spinlock_count++;
	critical_enter();

so it actually disables interrupts *only* on the transition from
td->td_md.md_spinlock_count = 0 to td->td_md.md_spinlock_count = 1,
i.e., the first time we take a spin lock in this thread, whether
this is a borrowed thread or not.  It's possible that interrupts
are actually disabled at this point.  If so, td->td_md.md_saved_flags
has interrupts disabled as well.  This is all just an optimization
to use a thread-local variable so as to avoid touching hardware.
The details vary widely, but typically, touching the actual hardware
controls requires flushing the CPU's instruction pipeline.

If the compare-and-swap fails, we enter _mtx_lock_spin_cookie()
and loop waiting to see if we can obtain the spin lock in time.
In that case, we don't actually *hold* this particular spin lock
itself yet, so we can call spinlock_exit() to undo the effect
of the outermost spinlock_enter() (in __mtx_lock_spin).  That
decrements the counter.  *If* it goes to zero, that also calls
intr_restore(td->td_md.md_saved_flags).
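
The nesting behavior described above can be modeled in userland as
follows. This is a minimal sketch, not the kernel code: the global
toy_intr_enabled stands in for the CPU's interrupt-enable state,
struct toy_td and the toy_spinlock_*() functions are hypothetical
names, and the critical_enter() side of the real function is omitted.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the CPU's interrupt-enable state (true = enabled). */
static bool toy_intr_enabled = true;

struct toy_td {
	int	spinlock_count;	/* nesting depth */
	bool	saved_flags;	/* interrupt state before the first enter */
};

static void
toy_spinlock_enter(struct toy_td *td)
{
	if (td->spinlock_count == 0) {
		/* Only the outermost enter touches the "hardware". */
		td->saved_flags = toy_intr_enabled;
		toy_intr_enabled = false;
		td->spinlock_count = 1;
	} else
		td->spinlock_count++;
}

static void
toy_spinlock_exit(struct toy_td *td)
{
	assert(td->spinlock_count > 0);
	td->spinlock_count--;
	/* Only the outermost exit restores whatever was saved. */
	if (td->spinlock_count == 0)
		toy_intr_enabled = td->saved_flags;
}
```

Nested enter/exit pairs leave the interrupt state alone; only the last
exit restores the saved state, and if interrupts were already off
before the first enter they stay off afterwards.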

Hence, if we have failed to obtain our first spin lock, we restore
the interrupt setting to whatever we saved.  If interrupts were
already locked out (as in a filter type interrupt handler) this is
a potentially-somewhat-expensive no-op.  If interrupts were
enabled previously, this is a somewhat expensive re-enable of
interrupts -- but that's OK, and maybe good, because we have no
spin locks of our own yet.  That means we can take hardware
interrupts now, and let them borrow our current thread if they are
that kind of interrupt, or schedule another thread to run if
appropriate.  That might even preempt us, since we do not yet hold
any spin locks.  (But it won't preempt us if we have done a
critical_enter() before this point.)

(In fact, the spinlock exit/enter calls that you see inside
_mtx_lock_spin_cookie() wrap a loop that does not use compare-and-
swap operations at all, but rather ordinary memory reads.  These
are cheaper than CAS operations on a lot of CPUs, but they may
produce wrong answers when two CPUs are racing to write the same
location; only a CAS produces a guaranteed answer, which might
still be "you lost the race".  The inner loop you are looking at
occurs after losing a CAS race.  Once we think we might *win* a
future CAS race, _mtx_lock_spin_cookie() calls spinlock_enter()
again and tries the actual CAS operation, _mtx_obtain_lock_fetch(),
with interrupts disabled.  Note also the calls to cpu_spinwait()
-- the Linux equivalent macro is cpu_relax() -- which translates
to a "pause" instruction on amd64.)

Chris

From owner-freebsd-hackers@freebsd.org  Thu Apr 13 13:46:28 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id B8E57D3BC97
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Thu, 13 Apr 2017 13:46:28 +0000 (UTC)
 (envelope-from ablacktshirt@gmail.com)
Received: from mail-pf0-x242.google.com (mail-pf0-x242.google.com
 [IPv6:2607:f8b0:400e:c00::242])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (Client CN "smtp.gmail.com",
 Issuer "Google Internet Authority G2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 898F9151
 for <freebsd-hackers@freebsd.org>; Thu, 13 Apr 2017 13:46:28 +0000 (UTC)
 (envelope-from ablacktshirt@gmail.com)
Received: by mail-pf0-x242.google.com with SMTP id o126so10966280pfb.1
 for <freebsd-hackers@freebsd.org>; Thu, 13 Apr 2017 06:46:28 -0700 (PDT)
X-Received: by 10.99.65.4 with SMTP id o4mr3478761pga.90.1492091187944;
 Thu, 13 Apr 2017 06:46:27 -0700 (PDT)
Received: from [192.168.2.211] ([116.56.129.146])
 by smtp.gmail.com with ESMTPSA id 74sm42776530pfn.102.2017.04.13.06.46.24
 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Thu, 13 Apr 2017 06:46:26 -0700 (PDT)
Subject: Re: Understanding the FreeBSD locking mechanism
To: Chris Torek <torek@elf.torek.net>, imp@bsdimp.com
References: <201704131218.v3DCIBJg093207@elf.torek.net>
Cc: ed@nuxi.nl, freebsd-hackers@freebsd.org, kostikbel@gmail.com,
 rysto32@gmail.com
From: Yubin Ruan <ablacktshirt@gmail.com>
Message-ID: <a51d29c2-4cab-0dfc-6fdc-81d7b2188d61@gmail.com>
Date: Thu, 13 Apr 2017 21:46:26 +0800
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
 Thunderbird/45.8.0
MIME-Version: 1.0
In-Reply-To: <201704131218.v3DCIBJg093207@elf.torek.net>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 13 Apr 2017 13:46:28 -0000

On 2017-04-13 20:18, Chris Torek wrote:
>> I discover that in the current implementation in FreeBSD, spinlock
>> does not disable interrupt entirely:
> [extra-snipped here]
>>   610                 /* Give interrupts a chance while we spin. */
>>   611                 spinlock_exit();
>>   612                 while (m->mtx_lock != MTX_UNOWNED) {
> [more snip]
>
>> This is `_mtx_lock_spin_cookie(...)` in kern/kern_mutex.c, which
>> implements the core logic of spinning. However, as you can see, while
>> spinning, it would enable interrupt "occasionally" and disable it
>> again... What is the rationale for that?
>
> This code snippet is slightly misleading.  The full code path runs
> from mtx_lock_spin() through __mtx_lock_spin(), which first
> invokes spinlock_enter() and then, in the *contested* case (only),
> calls _mtx_lock_spin_cookie().
>
> spinlock_enter() is:
>
> 	td = curthread;
> 	if (td->td_md.md_spinlock_count == 0) {
> 		flags = intr_disable();
> 		td->td_md.md_spinlock_count = 1;
> 		td->td_md.md_saved_flags = flags;
> 	} else
> 		td->td_md.md_spinlock_count++;
> 	critical_enter();
>
> so it actually disables interrupts *only* on the transition from
> td->td_md.md_spinlock_count = 0 to td->td_md.md_spinlock_count = 1,
> i.e., the first time we take a spin lock in this thread, whether
> this is a borrowed thread or not.  It's possible that interrupts
> are actually disabled at this point.  If so, td->td_md.md_saved_flags
> has interrupts disabled as well.  This is all just an optimization
> to use a thread-local variable so as to avoid touching hardware.
> The details vary widely, but typically, touching the actual hardware
> controls requires flushing the CPU's instruction pipeline.
>
> If the compare-and-swap fails, we enter _mtx_lock_spin_cookie()
> and loop waiting to see if we can obtain the spin lock in time.
> In that case, we don't actually *hold* this particular spin lock
> itself yet, so we can call spinlock_exit() to undo the effect
> of the outermost spinlock_enter() (in __mtx_lock_spin).  That
> decrements the counter.  *If* it goes to zero, that also calls
> intr_restore(td->td_md.md_saved_flags).
>
> Hence, if we have failed to obtain our first spin lock, we restore
> the interrupt setting to whatever we saved.  If interrupts were
> already locked out (as in a filter type interrupt handler) this is
> a potentially-somewhat-expensive no-op.  If interrupts were
> enabled previously, this is a somewhat expensive re-enable of
> interrupts -- but that's OK, and maybe good, because we have no
> spin locks of our own yet.  That means we can take hardware
> interrupts now, and let them borrow our current thread if they are
> that kind of interrupt, or schedule another thread to run if
> appropriate.  That might even preempt us, since we do not yet hold
> any spin locks.  (But it won't preempt us if we have done a
> critical_enter() before this point.)

Good explanation. I just missed that "local" interrupt point.

> (In fact, the spinlock exit/enter calls that you see inside
> _mtx_lock_spin_cookie() wrap a loop that does not use compare-and-
> swap operations at all, but rather ordinary memory reads.  These
> are cheaper than CAS operations on a lot of CPUs, but they may
> produce wrong answers when two CPUs are racing to write the same

Why would that produce a wrong result? I think what the inner loop wants
to do is to perform some no-op for a while before it tries again to
acquire the spinlock. So there is no race here.

> location; only a CAS produces a guaranteed answer, which might
> still be "you lost the race".  The inner loop you are looking at
> occurs after losing a CAS race.  Once we think we might *win* a
> future CAS race, _mtx_lock_spin_cookie() calls spinlock_enter()
> again and tries the actual CAS operation, _mtx_obtain_lock_fetch(),
> with interrupts disabled.  Note also the calls to cpu_spinwait()
> -- the Linux equivalent macro is cpu_relax() -- which translates
> to a "pause" instruction on amd64.)

Regards,
Yubin Ruan

From owner-freebsd-hackers@freebsd.org  Thu Apr 13 22:46:48 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 58531D3CF29
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Thu, 13 Apr 2017 22:46:48 +0000 (UTC)
 (envelope-from torek@elf.torek.net)
Received: from elf.torek.net (mail.torek.net [96.90.199.121])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client CN "elf.torek.net", Issuer "elf.torek.net" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id 42DA09DC
 for <freebsd-hackers@freebsd.org>; Thu, 13 Apr 2017 22:46:47 +0000 (UTC)
 (envelope-from torek@elf.torek.net)
Received: from elf.torek.net (localhost [127.0.0.1])
 by elf.torek.net (8.15.2/8.15.2) with ESMTPS id v3DMkjGK027792
 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO);
 Thu, 13 Apr 2017 15:46:45 -0700 (PDT)
 (envelope-from torek@elf.torek.net)
Received: (from torek@localhost)
 by elf.torek.net (8.15.2/8.15.2/Submit) id v3DMkj27027791;
 Thu, 13 Apr 2017 15:46:45 -0700 (PDT) (envelope-from torek)
Date: Thu, 13 Apr 2017 15:46:45 -0700 (PDT)
From: Chris Torek <torek@elf.torek.net>
Message-Id: <201704132246.v3DMkj27027791@elf.torek.net>
To: ablacktshirt@gmail.com, imp@bsdimp.com
Subject: Re: Understanding the FreeBSD locking mechanism
Cc: ed@nuxi.nl, freebsd-hackers@freebsd.org, kostikbel@gmail.com,
 rysto32@gmail.com
In-Reply-To: <a51d29c2-4cab-0dfc-6fdc-81d7b2188d61@gmail.com>
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.6.2
 (elf.torek.net [127.0.0.1]); Thu, 13 Apr 2017 15:46:45 -0700 (PDT)
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 13 Apr 2017 22:46:48 -0000

(This is getting a bit far afield; let me know if we should
take this off-list.)

>why would [regular read, vs CAS] produce wrong result?

There are both hardware architecture (and sometimes individual
CPU architecture) and compiler reasons for this.

First, compilers may try to optimize load and store operations,
especially on register-rich architectures.  What's coded as:

    ... some code section A ...
    x = p->foo;
    y = p->bar;
    ... some code section B ...

might actually move the loads of p->foo and/or p->bar into either
the A or B sections.  The same goes for stores.  The compiler
makes the assumption (reasonable for most programming) that only
the instructions the compiler itself emits actually access the
data -- not instructions running on some other CPU.

For any lock, this assumption is automatically wrong.

We can defeat part of this with the "volatile" keyword, but we
need to insert compiler level memory barriers to make sure that
the operations proceed in a temporally-defined manner, i.e.,
so that time appears to be linear.
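
As a small illustration of the compiler half of this, here is a
sketch using C11 <stdatomic.h>.  The struct sketch_mailbox and
sketch_publish() names are made up for the example.
atomic_signal_fence() constrains only the compiler; it emits no CPU
fence instruction, so on weakly ordered hardware a real
implementation would need something stronger (for instance
atomic_thread_fence() or a release store).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical shared record polled by another CPU. */
struct sketch_mailbox {
    int          payload;  /* ordinary data */
    volatile int ready;    /* flag: volatile forces a real store */
};

static void sketch_publish(struct sketch_mailbox *m, int value)
{
    assert(m != NULL);
    m->payload = value;
    /*
     * Compiler-level barrier: the compiler may not move the store
     * to payload below this point or the store to ready above it.
     */
    atomic_signal_fence(memory_order_seq_cst);
    m->ready = 1;
}
```

Without the fence, "volatile" on ready alone would not stop the
compiler from sinking the non-volatile store to payload past it.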

Second, the CPU itself may also have both temporal and non-
temporal loads and stores (with arbitrarily complicated rules
about using them).  In this case there may be special instructions
("sfence", "mfence", etc; "membar" on SPARC) for forcing order.
For more about non-temporal operations, see, e.g.:

 http://stackoverflow.com/q/37070/1256452
 http://infocenter.arm.com/help/topic/com.arm.doc.den0024a/CJACGJJF.html

There are some lock algorithms that work without most of this, but
they tend to be a bit hard to set up.  Even then we usually depend
on an atomic compare-and-swap: see
http://wiki.c2.com/?LockFreeSynchronization for instance.

>I think what the inner loop wants to do is to perform some no-op
>for a while before it tries again to acquire the spinlock.

Yes - but the point is that it tries to "gently" read the actual
mutex lock value, and inspect the result to see whether to try the
more-savage (at the hardware level) CAS.

Some of this gets deep into the weeds of hardware implementations.
I had this in my earlier reply (but ripped it out as too much
detail).  On Intel dual-socket systems, for instance, there is a
special bus that connects the two sockets called the QPI, and then
there are caches around each core within any one given socket.
These caches come in multiple levels (L1 and higher, details vary
and one should always expect the *next* CPU to do something
different) with some caches physically local to one core and
others shared between multiple cores in one socket.

These caches tend to coordinate using protocols called MESI or
MOESI.  The letters stand for cache line states: Modified, Owned,
Exclusive, Shared, or Invalid.  A Modified cache line has data not
yet written to the next level out (whether that's a higher level,
larger cache, or main memory).  An Exclusive line is in this cache
only and can therefore be written-to.  (I'm ignoring "owned", it
is kind of a tweak between M and E.)  A Shared line is in this
cache *and some other cache* and therefore can *not* be written
to, but *can* be read from; and finally, an Invalid line has no
valid data at all.

As a rule, the closer a cache line is to the CPU, the faster its
access is.  (This rule is pretty reliable since otherwise there is
no point in having that cache.)  *Writing* to a cache line
requires exclusive access, though, so we must know if the line is
shared.  If it *is* shared, we must signal higher level caches
that we intend to write, and wait for them to give up their cached
copies of data.  In other words we fire a bullet at them: "I want
exclusive access, kill off your shared copy."  Then we must wait
for a reply, or a time delay (whichever is architecturally
appropriate), so that we know that this succeeded, or get a
failure indication ("you may not have that exclusively, not just
yet anyway").  This reply or delay can take as long as the *worst
case*, slowest, access time.  For dual-socket Intel that means
doing a QPI bus transaction to the other socket.

(This is true for any write operation, not just compare-and-swap.
For this reason, we often like our mutexes to fill out a complete
cache line, so that any data *protected by* the mutex is not shot
out of the cache every time we poke at the mutex itself.)
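
A sketch of what "fill out a complete cache line" can look like in
C11.  The 64-byte line size is an assumption (common, but it varies
by CPU), and the SKETCH_CACHE_LINE macro and both struct names are
made up for the example.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CACHE_LINE 64  /* assumed line size; check your CPU */

/* The lock word, padded so nothing else shares its cache line. */
struct sketch_padded_lock {
    alignas(SKETCH_CACHE_LINE) atomic_uintptr_t lock_word;
    char pad[SKETCH_CACHE_LINE - sizeof(atomic_uintptr_t)];
};

/* The protected data starts on a different cache line. */
struct sketch_guarded_counter {
    struct sketch_padded_lock lock;
    alignas(SKETCH_CACHE_LINE) uint64_t count;
};

static_assert(sizeof(struct sketch_padded_lock) == SKETCH_CACHE_LINE,
    "lock must occupy exactly one cache line");
static_assert(offsetof(struct sketch_guarded_counter, count) %
    SKETCH_CACHE_LINE == 0, "data must not share the lock's line");
```

The cost is memory: every such lock burns a whole line even though
the lock word itself is only pointer-sized.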

Note that when we *read* the object, however, we're doing a read,
not a write.  This does not need exclusive access to the cache line:
shared access suffices.  If we do not have the data in cache, we
send out a request: "I would like to read this."  Any CPU that has
the item cached, e.g., whoever actually locked the lock, must drop
it back to whatever level accomplishes sharing -- if it's dirty,
writing it out -- and take his core-side cache line status back
from M or E to S.  Any other CPU also spinning, waiting for the
lock, must go to this shared state.  Now all CPUs interested in
the lock -- the holder, and all waiters -- have it shared.  They
can all *read* the line and see whether it's still locked.  There
is no traffic over inter-cache or inter-socket busses at this
point.  These are the "gentle" spins, that only *read* the lock.

Eventually, whoever owns the lock, unlocks it.  This requires a
write (or CAS) operation, which yanks the cache line away from all
the spinners (that part is unavoidable, and slow, and causes all
this bus traffic we were avoiding) and releases the lock.  The
spinners then bring the invalidated cache line back to shared state and
see that they *may* be able to get the lock.  At this point they
attempt the expensive operation, and *that* produces a reliable
answer -- which may be "someone else beat us to the lock so we
go back to the gentle spin code".
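
Here is a minimal sketch of that read-then-CAS pattern (often called
test-and-test-and-set) using C11 atomics.  It is an illustration
only, not the code in kern_mutex.c: the sketch_* names are made up,
there is no interrupt handling, and the empty inner loop is where a
kernel would call cpu_spinwait().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SKETCH_UNOWNED 0u

struct sketch_spin {
    atomic_uint owner;  /* SKETCH_UNOWNED, or a nonzero owner id */
};

/* The expensive step: one CAS, which needs the line exclusively. */
static bool sketch_try_lock(struct sketch_spin *l, unsigned self)
{
    unsigned expected = SKETCH_UNOWNED;

    return atomic_compare_exchange_strong_explicit(&l->owner,
        &expected, self, memory_order_acquire, memory_order_relaxed);
}

static void sketch_lock(struct sketch_spin *l, unsigned self)
{
    while (!sketch_try_lock(l, self)) {
        /* Gentle spin: plain reads, the line can stay shared. */
        while (atomic_load_explicit(&l->owner,
            memory_order_relaxed) != SKETCH_UNOWNED)
            ;  /* a kernel would call cpu_spinwait() here */
        /* Looks free now; go around and retry the CAS. */
    }
}

static void sketch_unlock(struct sketch_spin *l)
{
    /* Unlocking a lock nobody holds is a caller bug. */
    assert(atomic_load_explicit(&l->owner,
        memory_order_relaxed) != SKETCH_UNOWNED);
    atomic_store_explicit(&l->owner, SKETCH_UNOWNED,
        memory_order_release);
}
```

The read in the inner loop is only a hint; the CAS in
sketch_try_lock() is what actually decides who gets the lock.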

Note that some architectures do not have an actual compare-and-
swap instruction.  For instance, PowerPC and MIPS use a different
technique: there is a load instruction that takes the cache line
to exclusive state *and* sets an internal CPU register to remember
this.  If the cache line drops back out of exclusive state, a
subsequent "store conditional" instruction fails the condition,
does *not* store, and lets you branch to a loop that repeats the
load if needed.  If it is still exclusive, the write succeeds (and
the cache line goes to M state, if the cache is write-back).  This
lets you build a compare-and-swap from the low level cache-line
synchronization operations that are what the hardware uses.
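
In portable C11 the same retry shape shows up with
atomic_compare_exchange_weak(), which is allowed to fail spuriously
precisely so that compilers can map it onto a load-linked /
store-conditional pair.  The sketch below builds a fetch-and-add from
it; sketch_fetch_add() is a made-up name, and real code would simply
call atomic_fetch_add().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Atomically add delta to *p and return the previous value. */
static unsigned sketch_fetch_add(atomic_uint *p, unsigned delta)
{
    unsigned old;

    assert(p != NULL);
    old = atomic_load_explicit(p, memory_order_relaxed);
    /*
     * On failure (a lost race, or a spurious store-conditional
     * failure) the call reloads the current value into old, so the
     * loop recomputes old + delta and tries again.
     */
    while (!atomic_compare_exchange_weak_explicit(p, &old,
        old + delta, memory_order_acq_rel, memory_order_relaxed))
        ;
    return old;
}
```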

There is more on this at:

http://stackoverflow.com/q/151783/1256452

and specifically for x86 (64 bit Intel and AMD) at:

http://stackoverflow.com/q/151783/1256452

(These are not the only ways to implement synchronization.
Some more exotic architectures have special regions of physical
memory that can act transactionally, or that auto-increment
upon load so that each CPU can "take a ticket" and use a
version of Lamport's Bakery algorithm: see
https://en.wikipedia.org/wiki/Lamport%27s_bakery_algorithm
for details.  However, the BSD mtx_lock() is designed around
compare-and-swap plus optimizations for MESI cache implementations.)
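
For comparison, the "take a ticket" idea can be sketched in software
as a ticket lock built on an atomic fetch-and-add.  Note that this is
a ticket lock, not Lamport's Bakery algorithm itself (the Bakery
algorithm needs no atomic read-modify-write), and it is not how
mtx_lock() works; the sketch_ticket_* names are made up.

```c
#include <assert.h>
#include <stdatomic.h>

struct sketch_ticket_lock {
    atomic_uint next_ticket;  /* next number to hand out */
    atomic_uint now_serving;  /* ticket currently allowed in */
};

/* Take a ticket and wait for our turn; returns the ticket taken. */
static unsigned sketch_ticket_acquire(struct sketch_ticket_lock *l)
{
    unsigned mine = atomic_fetch_add_explicit(&l->next_ticket, 1u,
        memory_order_relaxed);

    while (atomic_load_explicit(&l->now_serving,
        memory_order_acquire) != mine)
        ;  /* read-only spin until our number comes up */
    return mine;
}

static void sketch_ticket_release(struct sketch_ticket_lock *l)
{
    /* Releasing when no ticket is outstanding is a caller bug. */
    assert(atomic_load_explicit(&l->now_serving, memory_order_relaxed)
        != atomic_load_explicit(&l->next_ticket, memory_order_relaxed));
    atomic_fetch_add_explicit(&l->now_serving, 1u,
        memory_order_release);
}
```

Waiters are served strictly in ticket order, which a plain
compare-and-swap spin lock does not guarantee.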

Chris

From owner-freebsd-hackers@freebsd.org  Fri Apr 14 18:56:13 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 0B964D3E24D
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Fri, 14 Apr 2017 18:56:13 +0000 (UTC)
 (envelope-from kevans91@ksu.edu)
Received: from NAM01-BN3-obe.outbound.protection.outlook.com
 (mail-bn3nam01on0042.outbound.protection.outlook.com [104.47.33.42])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-SHA384 (256/256 bits))
 (Client CN "mail.protection.outlook.com",
 Issuer "Microsoft IT SSL SHA2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 8FD1B9C6;
 Fri, 14 Apr 2017 18:56:11 +0000 (UTC)
 (envelope-from kevans91@ksu.edu)
Received: from DM5PR05CA0023.namprd05.prod.outlook.com (10.173.226.33) by
 BN6PR05MB3570.namprd05.prod.outlook.com (10.174.234.159) with Microsoft SMTP
 Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id
 15.1.1047.6; Fri, 14 Apr 2017 18:56:09 +0000
Received: from CY1NAM02FT050.eop-nam02.prod.protection.outlook.com
 (2a01:111:f400:7e45::209) by DM5PR05CA0023.outlook.office365.com
 (2603:10b6:3:d4::33) with Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.1047.6 via
 Frontend Transport; Fri, 14 Apr 2017 18:56:09 +0000
Authentication-Results: spf=pass (sender IP is 129.130.18.151)
 smtp.mailfrom=ksu.edu; freebsd.org; dkim=none (message not signed)
 header.d=none;freebsd.org; dmarc=bestguesspass action=none
 header.from=ksu.edu;
Received-SPF: Pass (protection.outlook.com: domain of ksu.edu designates
 129.130.18.151 as permitted sender) receiver=protection.outlook.com;
 client-ip=129.130.18.151; helo=ome-vm-smtp2.campus.ksu.edu;
Received: from ome-vm-smtp2.campus.ksu.edu (129.130.18.151) by
 CY1NAM02FT050.mail.protection.outlook.com (10.152.75.65) with Microsoft SMTP
 Server id 15.1.1019.14 via Frontend Transport; Fri, 14 Apr 2017 18:56:08
 +0000
Received: from calypso.engg.ksu.edu (calypso.engg.ksu.edu [129.130.43.181])
 by ome-vm-smtp2.campus.ksu.edu (8.14.4/8.14.4/Debian-2ubuntu2.1) with ESMTP id
 v3EIu8k8006637; Fri, 14 Apr 2017 13:56:08 -0500
Received: by calypso.engg.ksu.edu (Postfix, from userid 110)
 id 1F9F3248318; Fri, 14 Apr 2017 13:56:08 -0500 (CDT)
Received: from mail-wm0-f51.google.com (mail-wm0-f51.google.com [74.125.82.51])
 by calypso.engg.ksu.edu (Postfix) with ESMTPA id C28402482FB;
 Fri, 14 Apr 2017 13:56:05 -0500 (CDT)
Received: by mail-wm0-f51.google.com with SMTP id t189so69721399wmt.1;
 Fri, 14 Apr 2017 11:56:05 -0700 (PDT)
X-Gm-Message-State: AN3rC/6FioYP8KBcIDQZ2s5zBGQnFyEHzz4PuPcdqpmPCbdfFULfCY+7
 2+HR6PWbO28NSOdb1rcEk30YDj2OzQ==
X-Received: by 10.28.98.66 with SMTP id w63mr36546wmb.33.1492196164629; Fri,
 14 Apr 2017 11:56:04 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.28.167.206 with HTTP; Fri, 14 Apr 2017 11:55:44 -0700 (PDT)
In-Reply-To: <CACNAnaEmBjWudEJwvRTSqyciOp7-oRbCEQ_e6qtGsap0oHQ4yw@mail.gmail.com>
References: <CACNAnaEmBjWudEJwvRTSqyciOp7-oRbCEQ_e6qtGsap0oHQ4yw@mail.gmail.com>
From: Kyle Evans <kevans91@ksu.edu>
Date: Fri, 14 Apr 2017 13:55:44 -0500
X-Gmail-Original-Message-ID: <CACNAnaGOLVKR7Y4uzhuS7EB5-UMb3tS9yKL4Srn8knThk0o1kg@mail.gmail.com>
Message-ID: <CACNAnaGOLVKR7Y4uzhuS7EB5-UMb3tS9yKL4Srn8knThk0o1kg@mail.gmail.com>
Subject: Re: Replacing libgnuregex
To: <freebsd-hackers@freebsd.org>
CC: Pedro Giffuni <pfg@freebsd.org>, Ed Maste <emaste@freebsd.org>
Content-Type: text/plain; charset="UTF-8"
X-Content-Filtered-By: Mailman/MimeDel 2.1.23
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 14 Apr 2017 18:56:13 -0000

On Tue, Apr 11, 2017 at 3:20 PM, Kyle Evans <kevans91@ksu.edu> wrote:

>
> On the other hand, I think I could fairly easily implement most of these
> into libc/regex. Here's a summary of what this option entails adding to
> libc/regex, from what I've found:
>
> * Empty subexpressions(*)
> * Add missing quantifiers to BREs: \?, \+
> * Add branching to BREs: \|
> * Add backreferences (\1 through \9) to EREs
> * Add \w, \W, \s, and \S corresponding to [[:alnum:]_], [^[:alnum:]_],
> [[:space:]], and [^[:space:]] respectively
> * Add word boundaries and anchors:
> ** \b: word boundary
> ** \B: not word boundary
> ** \<: Start of word
> ** \>: End of word
> ** \`: Start of subject string
> ** \': End of subject string
>
> (*) I didn't actually find anything explicitly stating this as a GNU
> extension, but using it certainly doesn't conform to POSIX. It gets used
> a tiny bit in some ports, and we implement a workaround in bsdgrep(1) for
> the simplest case, the empty expression (""), so that it matches
> everything and produces zero-length matches.
>
> The main benefit of this is not having to maintain a completely separate
> regex parser, along with the potential for inconsistencies that comes with
> it. The downside is that this would seem to promote expressions that are
> not strictly POSIX conformant. Is this a problem? Is it a problem worth
> worrying about?
>
>
FYI: [1] is a patch showing what implementing all of the above in
libc/regex looks like. Some cleanup is still in order and the test set is
not exhaustive, but it should implement all of the GNU extensions listed
above, and it's at least functional.

It will break some things (one of the existing tests, for instance) that
relied on being able to escape an ordinary character (e.g. \b) and get that
ordinary character back. POSIX specifies this as producing undefined
results [2], though, so I don't feel terrible about breaking it.
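
To make the "escaped ordinary character" case concrete, here is a minimal
sketch of a checker that flags such escapes in an ERE. It is not taken from
the patch in [1]; the function names are hypothetical, and the list of
characters allowed after a backslash is my reading of the ERE special
characters in [2], so treat it as an assumption. Bracket expressions are
skipped because a backslash is an ordinary character inside them.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: given a pointer to '[', return a pointer just past
 * the matching ']' of the bracket expression, or to the terminating NUL if
 * the bracket expression is unterminated.
 */
static const char *
skip_bracket(const char *p)
{
	p++;				/* skip the opening '[' */
	if (*p == '^')
		p++;
	if (*p == ']')
		p++;			/* a leading ']' is literal */
	while (*p != '\0' && *p != ']') {
		if (p[0] == '[' &&
		    (p[1] == ':' || p[1] == '.' || p[1] == '=')) {
			char kind = p[1];

			/* Skip [:class:], [.coll.] or [=equiv=]. */
			p += 2;
			while (*p != '\0' && !(p[0] == kind && p[1] == ']'))
				p++;
			if (*p != '\0')
				p += 2;
		} else {
			p++;
		}
	}
	return (*p == ']' ? p + 1 : p);
}

/*
 * Hypothetical helper (not part of libc): return true if the ERE contains
 * a backslash that is followed by a character not assumed to be an ERE
 * special character, or a trailing backslash.
 */
bool
ere_has_undefined_escape(const char *pat)
{
	/* Assumed set of characters that may follow '\' in a POSIX ERE. */
	static const char special[] = "^.[$()|*+?{\\";
	const char *p = pat;

	assert(pat != NULL);
	while (*p != '\0') {
		if (*p == '[') {
			p = skip_bracket(p);
		} else if (*p == '\\') {
			if (p[1] == '\0' || strchr(special, p[1]) == NULL)
				return (true);
			p += 2;		/* a well-defined escape */
		} else {
			p++;
		}
	}
	return (false);
}
```

With this, "\bfoo\b" is flagged while "a\.b" and "[\b]+" are not, which is
the class of patterns whose meaning would change under the patch.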

If this seems desirable, I can work on cleaning it up and splitting it into
more consumable bites for FreeBSD's libc.

Thanks,

Kyle Evans

[1] http://files.kyle-evans.net/freebsd/libc-gnuext.diff
[2]
http://pubs.opengroup.org/onlinepubs/009696899/basedefs/xbd_chap09.html#tag_09_03_03

From owner-freebsd-hackers@freebsd.org  Fri Apr 14 20:41:31 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id C367FD3AAC9
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Fri, 14 Apr 2017 20:41:31 +0000 (UTC)
 (envelope-from kevans91@ksu.edu)
Received: from NAM02-BL2-obe.outbound.protection.outlook.com
 (mail-bl2nam02on0060.outbound.protection.outlook.com [104.47.38.60])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-SHA384 (256/256 bits))
 (Client CN "mail.protection.outlook.com",
 Issuer "Microsoft IT SSL SHA2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 49BC837A;
 Fri, 14 Apr 2017 20:41:30 +0000 (UTC)
 (envelope-from kevans91@ksu.edu)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ksu.edu; s=selector2; 
 h=From:Date:Subject:Message-ID:Content-Type:MIME-Version;
 bh=L4Zye8I6kY+zyo7jVNmXGvFaqfvTSvMoarOMrpWVSJ0=;
 b=eo+G77EXn3Ek8qN15bZCx0ZUSrz7iXxCvGJ8lGXmHHgwpG1TuK3YF2EtVT4DF+heHZsJNyrOYoynUwvEa5dgH96zqvu22+WR7IDQwo590hnqdtDIoBSKn3KQbmuN2ElSlRVLNxUy+j1IqcujFXA8WDgT8bvBvixDC4qmjg4JqjA=
Received: from BLUPR05CA0061.namprd05.prod.outlook.com (10.141.20.31) by
 BN3PR0501MB1107.namprd05.prod.outlook.com (10.160.113.141) with Microsoft
 SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.1034.5; Fri, 14
 Apr 2017 20:41:28 +0000
Received: from CY1NAM02FT011.eop-nam02.prod.protection.outlook.com
 (2a01:111:f400:7e45::209) by BLUPR05CA0061.outlook.office365.com
 (2a01:111:e400:855::31) with Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.1047.6 via
 Frontend Transport; Fri, 14 Apr 2017 20:41:28 +0000
Authentication-Results: spf=pass (sender IP is 129.130.18.151)
 smtp.mailfrom=ksu.edu; freebsd.org; dkim=none (message not signed)
 header.d=none;freebsd.org; dmarc=bestguesspass action=none
 header.from=ksu.edu;
Received-SPF: Pass (protection.outlook.com: domain of ksu.edu designates
 129.130.18.151 as permitted sender) receiver=protection.outlook.com;
 client-ip=129.130.18.151; helo=ome-vm-smtp2.campus.ksu.edu;
Received: from ome-vm-smtp2.campus.ksu.edu (129.130.18.151) by
 CY1NAM02FT011.mail.protection.outlook.com (10.152.75.156) with Microsoft SMTP
 Server id 15.1.1019.14 via Frontend Transport; Fri, 14 Apr 2017 20:41:26
 +0000
Received: from calypso.engg.ksu.edu (calypso.engg.ksu.edu [129.130.43.181])
 by ome-vm-smtp2.campus.ksu.edu (8.14.4/8.14.4/Debian-2ubuntu2.1) with ESMTP id
 v3EKfQlL030735; Fri, 14 Apr 2017 15:41:26 -0500
Received: by calypso.engg.ksu.edu (Postfix, from userid 110)
 id 68A11248319; Fri, 14 Apr 2017 15:41:26 -0500 (CDT)
Received: from mail-wm0-f53.google.com (mail-wm0-f53.google.com [74.125.82.53])
 by calypso.engg.ksu.edu (Postfix) with ESMTPA id 102DB248318;
 Fri, 14 Apr 2017 15:41:24 -0500 (CDT)
Received: by mail-wm0-f53.google.com with SMTP id t189so1116178wmt.1;
 Fri, 14 Apr 2017 13:41:23 -0700 (PDT)
X-Gm-Message-State: AN3rC/4d0vkFlwi5dcxA4jLMZFa401CIA5W/ahb9Uxq4KVN7gc3A+LM9
 eyNjU+Y7RJD8b8cojnQSUQQbUFXlwA==
X-Received: by 10.28.88.2 with SMTP id m2mr316897wmb.12.1492202482966; Fri, 14
 Apr 2017 13:41:22 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.28.39.134 with HTTP; Fri, 14 Apr 2017 13:41:02 -0700 (PDT)
In-Reply-To: <10004f0d-acb7-f81a-f3d5-b368e606a105@FreeBSD.org>
References: <CACNAnaEmBjWudEJwvRTSqyciOp7-oRbCEQ_e6qtGsap0oHQ4yw@mail.gmail.com>
 <CACNAnaGOLVKR7Y4uzhuS7EB5-UMb3tS9yKL4Srn8knThk0o1kg@mail.gmail.com>
 <10004f0d-acb7-f81a-f3d5-b368e606a105@FreeBSD.org>
From: Kyle Evans <kevans91@ksu.edu>
Date: Fri, 14 Apr 2017 15:41:02 -0500
X-Gmail-Original-Message-ID: <CACNAnaHTAS7a+vuTS+3yAT3p_4yUY_MOmd=mfBc922-0pjNW9g@mail.gmail.com>
Message-ID: <CACNAnaHTAS7a+vuTS+3yAT3p_4yUY_MOmd=mfBc922-0pjNW9g@mail.gmail.com>
Subject: Re: Replacing libgnuregex
To: Pedro Giffuni <pfg@freebsd.org>
CC: <freebsd-hackers@freebsd.org>, Ed Maste <emaste@freebsd.org>
X-EOPAttributedMessage: 0
X-Forefront-Antispam-Report: CIP:129.130.18.151; IPV:NLI; CTRY:US; EFV:NLI;
 SFV:NSPM;
 SFS:(10009020)(39860400002)(39400400002)(39840400002)(39450400003)(39410400002)(2980300002)(438002)(24454002)(199003)(189002)(377454003)(50986999)(61266001)(4326008)(93516999)(90966002)(7906003)(305945005)(54356999)(38730400002)(2906002)(110136004)(88552002)(55446002)(53546009)(63696999)(76176999)(189998001)(450100002)(2950100002)(6916009)(221733001)(229853002)(6246003)(106466001)(42186005)(512874002)(8936002)(61726006)(54906002)(9686003)(7116003)(84326002)(356003)(5660300001)(606005)(236005)(498394004)(86362001)(8676002)(9896002)(45336002)(98316002)(75432002)(8576002)(46386002)(6306002)(3480700004)(55456009);
 DIR:OUT; SFP:1101; SCL:1; SRVR:BN3PR0501MB1107; H:ome-vm-smtp2.campus.ksu.edu;
 FPR:; SPF:Pass; MLV:sfv; A:3; MX:1; LANG:en; 
X-Microsoft-Exchange-Diagnostics: 1; CY1NAM02FT011;
 1:bXFPhA2g0KkbMpybI2rcQoJ43x6gw5a3+HR5ubKD+yjoNAopnHcxPggB6rjta4OAaKJz41R9cGRy39FkU/REvNckjunLpbSOIF50zwZm6hRE5Ajq2BY8ImYbv9oEELbIqQWytw7hNYTg8jmQ++zakqIzXFH0AKzLD3vXGi+OMjfLO/BjRRY+w6fCMpGmqXC+WBu9a1+wlypzLe2kdm+vDwqsdQgRZ9/0BOARhp4dkGsL7rcWiF+FR6LZgo4mTIqDP26b4/F5juhp/NM9ByjkVIMD2LhRqU950ZneORxV3WGHp0fOsUR+7rlV+ou5ZFa/1w9mHl4fiXV5MHnAxfFb7TT+lK/vbK9aivqJw0EzAQqx3Y7EL6fU7CoZDp/HQDoIDpcxbJ1k3LrPydAyegYl5goLPVHU/dWuH2Ajwp/IZoIfGrpAIe99aMrUdtBiUeDgkDkWl1KemPoqTUrfXOyHqA4OwT++XXRZwCMhz5NAhqXtAJPIoo7FqZx5ZxHb0Tjs+4fnug3s4pF0F5QpqeTRS4UCeVglqxj6EsYjTsuNlek=
X-MS-Office365-Filtering-Correlation-Id: 281fd37b-0217-47a5-808e-08d483769f87
X-Microsoft-Antispam: UriScan:; BCL:0; PCL:0;
 RULEID:(22001)(8251501002)(2017030254075)(201703131423075)(201703031133081)(201702281549075);
 SRVR:BN3PR0501MB1107; 
X-Microsoft-Exchange-Diagnostics: 1; BN3PR0501MB1107;
 3:k5Lq8D9ro1XxMqeKm46OFdJpqrsA5PrROFDy6sNUNpPz+iFNbY0fVLL4DNCOqggGg+9fT27hezVv78RS5vZLMX3RhPDrQ9b6pr5h6ypboW/n4AiXq0VRGW3sm462F938rqr/sxXkfWMVz00kjk6dHMf95Ap2Bv+UHKo9DFlM9a8n6VODvNOKa4mIMYUhC/prYiG4bMd1yZBmSgZkNwgt/VYQ6maW2rHt+YyDsYOlCzMcYPlTnPY1J1VPBZWdkJLSGIpOLNJid6zXI7Fy0nueRZT7WCl6Etdu5a7EMKzLi3+JzijsthAZMJpABV2jcdAOdtR8TuCqliRDgaY6B/nfRwNSoh7xHE4ZXeBbrPbC5ZWB8xmUP57CghCxeG84jXfQ/UM6eJrPZpqWBhJSeyrBF/1nthveybNDZ8NFeiHdGpTaewyHN3s04ZbShcxxK5ns+0GB7U9H0ahekeuPBqbyPdKyRT5KHFp9XvERyD7vLuLAs7ZwzlXfFkXT+uSiAyqL02iCw2cQhq3pI4wnewBq+g==
X-Microsoft-Exchange-Diagnostics: 1; BN3PR0501MB1107;
 25:RCjD6Bky+vwMU/n9AUcuQZeDkgogd5Jtl2BUhO1f1CySFLMJIgbywVPV+/C+F3LsYDhzeQw20F8xcya9cynckAR5HkJwnH//hl1jLlfWspPDqL9gNrsy/dBsvIGCjC/hgvU1H7H0DsfoXWH2+T6MIrHKT0aMyVQOOCXJW8hLP7MIl+v6ikLwYnszU14HxANbkRl2SFU/azA3jN6nPv/MM+CUEppE3xNaenYE0YG5SYBeM9AAlZcYwXjXANiL/d3TSGDKNUNQJVT6L/9dQyJEPVlltUJ5CkGCPFputgEu8Ue/2J8JrZpwyjd8tl0u0H9xn0+DGM1Q20B0YHww/vGiOv0rkciAYRMSGirD5N2YbYpyK5YosM4WVNaO744TqmfnsauroLvP8vd7fE8RetVgXNy9tJdeGpY26se99Irb+4glXS+fJGGTCPPVZvggsoZZksjRxEBYq/RH9chOIEC4Kg==;
 31:yjYQWlTORSeQveRHyCtAYcJP9kJz+Ie+l3zFzRmtdP+7Aql0Qm1LupSjbFZgPNIvIlPmhsQUSfRqjpuxLK/i4rIa9cab4Z2dn4GXUE4+bILwTAFidHVwzVWfrUYQ44dpqwRRPo8qxs+lRpQjeekANgAXtoSMqac8miXjbFL6eJvBLo/BrwlZOMGVv9+3NXD8OV4nkl7KfSLU+g7JkrESM/9QEYtLvi9Csr6yu/oUS1Ne9kWfnnKxTBqx5atSbD/Ke7h7LKbT25SnCUkExXohi6GVIkqmf3Jwvns3N+QoN7s=
X-Microsoft-Exchange-Diagnostics: 1; BN3PR0501MB1107;
 20:vRihyUKn9fccvU51/F227tGYRaOw+kOBbVanX4QYo7R8BT5VNgsMzgK28m5CTOAGyeHzMjZ8zQkRh4WsScQKzfhjJOUQgWNyIRPHXZfN1VhbGm9RXnZqEOruYSdoyoZyfr/VFtcrNzJ3CYDB8+wek7sGAHX12heeLdfd2dm8I0Jrs/PPGcfU4dz3JxVLqwnMAysN4/ee6pMoQkKiceKxW5MXr+nRPDLm3aQCyGq4UXChXsG9ug0uZPTg2nlW0tzGZdL40wx9fEXRf841YckjncOfpW5zjXscz040u+VJNfjhAhLqhJ9vvjP1G5bNn8qbVD+zKts9Etfpy/oqRat1j/mopXlPMowt2CYV+uce4JZ6KTjW9CEQiNlHqcQHPDelU0JI8dftMLLJ7w0uKdaecui3YFyI0garSGwj1CYGbgE7g9jOyVpcnDwO+HP7RiABs3ZcF+PApez6Em6Jp/Q/mtwkUG+RgYMUXhHkHcqQ0wc+FyvAuL2VTOvZsqrr+Xm0
X-Microsoft-Antispam-PRVS: <BN3PR0501MB1107F3BEFF53AD428D134C29C1050@BN3PR0501MB1107.namprd05.prod.outlook.com>
X-Exchange-Antispam-Report-Test: UriScan:;
X-Exchange-Antispam-Report-CFA-Test: BCL:0; PCL:0;
 RULEID:(6040450)(601004)(2401047)(13023025)(13024025)(13017025)(13018025)(8121501046)(13015025)(5005006)(3002001)(93006095)(93004095)(10201501046)(6041248)(20161123555025)(201703131423075)(201702281529075)(201702281528075)(201703061421075)(20161123562025)(20161123564025)(20161123560025)(6072148);
 SRVR:BN3PR0501MB1107; BCL:0; PCL:0; RULEID:; SRVR:BN3PR0501MB1107; 
X-Microsoft-Exchange-Diagnostics: 1; BN3PR0501MB1107;
 4:UhTCm4JycDc2hfF1Ur8YH2dgSSjDnQs8KgNgFnftFGu/0hQLNQig7vCDB+P79m+8VD1ODeSGMbMGdNDy3We81ancuunSCcXUdI0APwibrEUAtcUZqiI1+W/P5wd4FsCMCh70fQQVR2dAwKlSHaVa5rqgSRISX7cZWLC78mWl0TVg4T/JtEeVauaZGRtwkt3z+REb4mWqdNSKLxSLuDzHUb17J6UxcBvSC4i1oLcIhl3xbe5YU7d714pKECRsN9u1Tp2ZZ5afzp9kjbuLbrK9GtIcr0jyBiCYWW1jRed5agOfLVU2Z8eT8ZetDWNLUoSAym/EFKjsnaezldzs8v2OM82h8Z+GqpSNIGOejeDKob8X93EBvjK3ACMmVJT3yNmlKV+CD8/1q3VTgMU368KBcrqsRH2oiIKOn6shW9nYoom57nysWQYlwvsrJQeiDFx0KVe4pxBSO1GGjwTZ5IVuhgdOacDXReiEb6P//l/3Ti/UuwbVxZ75awSnybNr0dVfpownaraQri6EZMtk2lJ/NTmTJZlmS0pDYb9Th+6wOlSA1tG2we0QuJ9rFjRKfIOgOg7yAY2Igok479zRpkXazaY66mDlNS1Q7tUlhbbHFAQ+yYYI4qKnWtY1W1DiCtIu56EGJ04q7qeiw+ql0L8jEd47xsOwOl7iwvsJ6MisDFEvIvrEBXt5HOaB755F5Czq/5pzmGRHtdoYx+jv9ac59nnUZ+YoMFcZKAXWjVgs3WAwnn5fozJtLeuPou8T7dUD6CwMqBQTwhZgMU7aTp/W9OqAXhA9lAb5X/folo3oLWwyF5e1tY93A2RI0OYf9gf0piCH3iEO1D+0YWj2QVajI2c5Nga1I6Ne2/2QkqBiMzc=
X-Forefront-PRVS: 02778BF158
X-Microsoft-Exchange-Diagnostics: =?us-ascii?Q?1; BN3PR0501MB1107;
 23:JiUBbFdMMK8FCDvXSr4RSuVeNB8XXzLgMKpt0vC?=
 =?us-ascii?Q?DVAffuhu7Geux50GhqbdNkj5l8r2v+fKlPybw1oSDPgkttnEh9Nx4s9nDpRO?=
 =?us-ascii?Q?TRK0bZfhAsTOmqwUT4Z6fFwRiAXiX4aYVDG3zdSKbVm9Bd+OmBto+DX2PO/U?=
 =?us-ascii?Q?W251YE5tHvTluXqzCaRCuwgVSXYxRJ6iwxr4v2JOXvBCbC8djZZqGoRnC+sl?=
 =?us-ascii?Q?JiK5d3K/zduVfbtYytSjJADjUGasg2Or3sNwD/bcA4OloQXjuH8UE9w8VmWP?=
 =?us-ascii?Q?WyioEJVHWD2FIIvE5qK1UH1GvDvT20+FWhSZXtYF/I/JB3JsxrigbLXrui+h?=
 =?us-ascii?Q?TO3Z3grbCWtmZvvqZFPNglLSsx+2oMOMdS8UesILj7FLDpxe+uV1Lqmd4sDM?=
 =?us-ascii?Q?71tj8iu1QmGtnhxzozzuVpFrtpuo13b9QKkftJwRv1Ad144Hy9E3vM1IAzQs?=
 =?us-ascii?Q?rKcKq2hYWi9JseUnNhBdVi6pa+DNf1ycizq/VYPOW6wIVOCnWpL2rCydCTtV?=
 =?us-ascii?Q?Ko3HE7z5/8jdfaDpxrWUj4Um6bd+YDmQn/sGchXDLEZtYwcaOkTgkCvgDbwL?=
 =?us-ascii?Q?XfF38LVwoDzCSjxoKUGqbcmH1zDXFI4SA0jpPJSy7XO6E/mjZ0rbWZB4gJlB?=
 =?us-ascii?Q?uYGLJNKSli4W+FwZXW+zQd3PEXu8vefJy21Xx0d4q5WaAytr6k4dCuhWweIZ?=
 =?us-ascii?Q?TNzKguy7OsSerzXaDQ7e1MFdsUdIQiGXKH/Yypqxmpdm1nH7KMcwO2vbCSmR?=
 =?us-ascii?Q?YbPt/04KPy8t0LffkXmgoo1EKlzCeogv2gOdCBRjWB9jCR7m0yc3QhBmEJ7c?=
 =?us-ascii?Q?17CrxjtvocNUDt2IzcN5mmn/1kO1oEPsX5WGJ4UWxYPwjfQaWzdvF8Dypbwf?=
 =?us-ascii?Q?UjwwgRPib6wu31KKHaJu5rsGEUvmlYSTxohrD60OFTbd5e+FHGQBA4Ouc1OL?=
 =?us-ascii?Q?iPEpFKkdTnWH0UuP+QHjEwLlo1o2GBqEhtAkM1fguELnWakAdiDfpyv0GVD8?=
 =?us-ascii?Q?klws7GTucRjGIE0QfiD7xwiEApNw9OBwJzWMW6klf6IsDJChMPujEa1YxZ3k?=
 =?us-ascii?Q?cPK3Fmid6wq+yh8vrE9pAK8uVMM0EmHdg9mG4vHC5MOC3G28HIIK8I1IXxPK?=
 =?us-ascii?Q?0ja0/TZiTn+P7pBuwgEwcp6hqIA3jAQuayF6IUpkFw1zt+ffmABmbNdGSNq+?=
 =?us-ascii?Q?Xs0kk467mZgKIKdt2qxndnzZRS9ue32byqAofK6xGAWZ+40GeLDyNh3jOHTT?=
 =?us-ascii?Q?U48cF6MiKDsVjv2brzHkbdCNtgvTTdynDNkQnZOhdld7KuL13ZcZddlH/RI2?=
 =?us-ascii?Q?tqvMwH+pol3OroAdH3UPrD9SIRce+dcEY02pIG+BNdx84YUWhHZGHApeLqcX?=
 =?us-ascii?Q?CegdV78U5absOSZ1mbnVapYk9mLEZvXcn88/VWLwroUPBMfy8?=
X-Microsoft-Exchange-Diagnostics: 1; BN3PR0501MB1107;
 6:xoc2Y9q6FXQUkDYjY1kk5DPrYRJdNLDWou1GdM8NSzjCFNiCGLgmi4iczDTOLqUhmbmF7t+7+UD34pPsXdjioy819hgBKT1vn3A737S6eJxM+KwtxS9r50JvWhGVobkKj5GjzqKa75HzPIEXwAS6xWJiUB3axVz2gd+PwCQSY1LRPh4ppFy4Jvgr+XNXvPsIrSlPKgXFVHP8RaUrsdyb7WmgzrqlPc8PLDzjDaZYGWvkJsDzMG4m7GlO+Cy57ycidW1nipycfiooGG7D2Ln9QOZmvxiKiweSI5VcHKCLQe/RUiWpz9imSZ4mUtF4aO7gK7Swk2uAATFQJ7jo5fYZ1tzmj4UVbLMYQdmRtJYzuMmxNelCgj1bhm3KaPlnqLurpJESXDPM169tFpziDjg1iFDk9HWjIAljw4Zt13eOoKOVgrSWTFAE+dhildZd8BLfANJ/pRv519wPT00Bmc5b7A==;
 5:jrqIPwRrqiujeMGONCZInpkXh7eGjpzrauMDuXBeXUPslsONlUnV6NVfIYOxUOieQGWWHqfrPG9TXsSvRTKQI8Y8qU2ec+amjZAeeVYorK86y7B/8D9PqrWa1Qtj50qbJdn33n5gJaA5VY3Q2UVrp8WpX6gLe6P+jWoiqlx3k2s=;
 24:eKtTwljrdN4s0tKHV2mNrjzL6jr2lpfpmtzKHPzY26SazJhl+fNQC4BOd+jAfxfcOhC2o3GAJD8xlfL0Vq2nY/5/yHjIBDPZTjTS67WhOx4=
SpamDiagnosticOutput: 1:99
SpamDiagnosticMetadata: NSPM
X-Microsoft-Exchange-Diagnostics: 1; BN3PR0501MB1107;
 7:khH6ojj6xFb7mYVUy/QMuPkOlC5DdhzDJQR2KXZKy+tbb1Tl84dRllCiMX8mXWF8/Kon0lR5KhOahAhNjcQWzcWFdxmXJROHVZw0G+qTibFs+xAvgvXA6QtubGUcS/KDddhmwczJYiNFMW/ilnV/xigbB4izePdL+E01RadfxoLXEOJriEQNjUT3r41DdDCbWCdQX0c4WhHFSeVzV8vFE6E6bPtbDAyjvm0CbubqQZBenqsfgHiuQX9qoJXlzZSvXn0XDwUp+xKWbWJCJeWDIkRhogMRqpSq+ByJs4XHAiXs0REIYqC+ZMX05UFO5//uyAlCNnZR0DpfwiSpIrli0g==;
 20:pzVrc8QNL3bEvJ90u/goTwnDSNRu3hvDi0/OdCHHDCdyWrA59NiLS1lNzOmBZq+dfHu12flKCOltA5VxU9LpWwqucx0tRLJodGp21508/rp47YPZTzCeazfC3KF0xwDUFQqkDUPZCSds0qP0IJLtRhre0p66upGKZSi5UHHs/2g=
X-OriginatorOrg: ksu.edu
X-MS-Exchange-CrossTenant-OriginalArrivalTime: 14 Apr 2017 20:41:26.9263 (UTC)
X-MS-Exchange-CrossTenant-Id: d9a2fa71-d67d-4cb6-b541-06ccaa8013fb
X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=d9a2fa71-d67d-4cb6-b541-06ccaa8013fb; Ip=[129.130.18.151];
 Helo=[ome-vm-smtp2.campus.ksu.edu]
X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem
X-MS-Exchange-Transport-CrossTenantHeadersStamped: BN3PR0501MB1107
Content-Type: text/plain; charset="UTF-8"
X-Content-Filtered-By: Mailman/MimeDel 2.1.23
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 14 Apr 2017 20:41:31 -0000

On Fri, Apr 14, 2017 at 3:24 PM, Pedro Giffuni <pfg@freebsd.org> wrote:
>
> That doesn't seem good: anything that breaks tests is very likely to have
> other side-effects.
> Keep in mind that any regex change will likely have to go through a ports
> exp-run and
> ports will still have to work fine in three versions of FreeBSD.
>

Yeah, I anticipate other side-effects from this. Fortunately, there aren't
many ports relying on GNU extensions, and as a part of [1] I'm trying to
get them to start using textproc/gnugrep, since that is more up to date and
better tested.

As far as sed goes, the only potential breakage should come from scripts
that expect \<, \>, \b, \B, \w, \W, \s, and \S to be ordinary characters.
This is easy to fix in a way that is actually POSIX compliant (unlike
expecting them to be ordinary), so no worries there.
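
As a rough illustration of POSIX-compliant spellings (a sketch only, not
code from any port or from the patch): the character-class shorthands can
be written as bracket expressions that any conforming regcomp(3) accepts.
The macro and function names below are made up for the example, and the
underscore in the \w equivalents follows GNU's documented definition of a
word constituent. The word-boundary escapes have no equally direct
spelling and are not covered here.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Portable spellings of the GNU shorthands (hypothetical macro names). */
#define	POSIX_W		"[[:alnum:]_]"		/* GNU \w */
#define	POSIX_NOT_W	"[^[:alnum:]_]"		/* GNU \W */
#define	POSIX_S		"[[:space:]]"		/* GNU \s */
#define	POSIX_NOT_S	"[^[:space:]]"		/* GNU \S */

/*
 * Return true if the POSIX ERE `pattern` matches anywhere in `subject`.
 * A pattern that fails to compile is reported as "no match" to keep the
 * sketch short; real code would surface the error via regerror(3).
 */
bool
ere_matches(const char *pattern, const char *subject)
{
	regex_t re;
	bool matched;

	assert(pattern != NULL && subject != NULL);
	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return (false);
	matched = (regexec(&re, subject, 0, NULL, 0) == 0);
	regfree(&re);
	return (matched);
}
```

For example, ere_matches("foo" POSIX_S "+bar", line) behaves the same with
or without GNU extensions in the regex library, because the pattern never
relies on an escaped ordinary character.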

It's worth noting that I have absolutely no intention of changing anything
to actually expect GNU extensions, but I tend to use them in my own daily
grep(1) usage; some of them are nice to have.


>
> It is difficult to know exactly how far we want to keep the GNU grep
> behavior. It is perfectly fine for BSD grep to keep a slightly incompatible
> behavior as long as we keep within standards.
>
> Just my $0.02,
>

Much appreciated. =)

Kyle Evans

[1]  https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=218385

From owner-freebsd-hackers@freebsd.org  Fri Apr 14 22:28:37 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 585BAD3D252
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Fri, 14 Apr 2017 22:28:37 +0000 (UTC)
 (envelope-from kevans91@ksu.edu)
Received: from NAM02-SN1-obe.outbound.protection.outlook.com
 (mail-sn1nam02on0069.outbound.protection.outlook.com [104.47.36.69])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-SHA384 (256/256 bits))
 (Client CN "mail.protection.outlook.com",
 Issuer "Microsoft IT SSL SHA2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id D8387E82;
 Fri, 14 Apr 2017 22:28:36 +0000 (UTC)
 (envelope-from kevans91@ksu.edu)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ksu.edu; s=selector2; 
 h=From:Date:Subject:Message-ID:Content-Type:MIME-Version;
 bh=gGR2MTCgceHizSh9kfwfvaYGN36EWBe4svNjhFw0PpI=;
 b=HqVNA0LXZC/974eZ/EgHXgDEM6KVWqM0mbZKSiv0gnJ6bU5hbM7boYJIMBfY+jvhcaJNx/HI8aLso9n+uneBhidqkwlnH3d6dTho/J/6JJD9cFde5lNAlebX9VUeypO487v80WIxOikbNWbq6x7k9wIDgI/UR+M3M4VfePxEJq8=
Received: from BN6PR05CA0007.namprd05.prod.outlook.com (10.174.92.148) by
 DM2PR0501MB1049.namprd05.prod.outlook.com (10.160.25.20) with Microsoft SMTP
 Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id
 15.1.1034.5; Fri, 14 Apr 2017 22:28:34 +0000
Received: from SN1NAM02FT006.eop-nam02.prod.protection.outlook.com
 (2a01:111:f400:7e44::203) by BN6PR05CA0007.outlook.office365.com
 (2603:10b6:405:39::20) with Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.1047.6 via
 Frontend Transport; Fri, 14 Apr 2017 22:28:34 +0000
Authentication-Results: spf=pass (sender IP is 129.130.18.151)
 smtp.mailfrom=ksu.edu; freebsd.org; dkim=none (message not signed)
 header.d=none;freebsd.org; dmarc=bestguesspass action=none
 header.from=ksu.edu;
Received-SPF: Pass (protection.outlook.com: domain of ksu.edu designates
 129.130.18.151 as permitted sender) receiver=protection.outlook.com;
 client-ip=129.130.18.151; helo=ome-vm-smtp2.campus.ksu.edu;
Received: from ome-vm-smtp2.campus.ksu.edu (129.130.18.151) by
 SN1NAM02FT006.mail.protection.outlook.com (10.152.72.68) with Microsoft SMTP
 Server id 15.1.1019.14 via Frontend Transport; Fri, 14 Apr 2017 22:28:33
 +0000
Received: from calypso.engg.ksu.edu (calypso.engg.ksu.edu [129.130.43.181])
 by ome-vm-smtp2.campus.ksu.edu (8.14.4/8.14.4/Debian-2ubuntu2.1) with ESMTP id
 v3EMSXWk028436; Fri, 14 Apr 2017 17:28:33 -0500
Received: by calypso.engg.ksu.edu (Postfix, from userid 110)
 id 099B0248319; Fri, 14 Apr 2017 17:28:33 -0500 (CDT)
Received: from mail-wm0-f44.google.com (mail-wm0-f44.google.com [74.125.82.44])
 by calypso.engg.ksu.edu (Postfix) with ESMTPA id AB02B248318;
 Fri, 14 Apr 2017 17:28:30 -0500 (CDT)
Received: by mail-wm0-f44.google.com with SMTP id w64so2144884wma.0;
 Fri, 14 Apr 2017 15:28:30 -0700 (PDT)
X-Gm-Message-State: AN3rC/7EfT19nhx70pJnorlb9WBlP7AunJQXgQaOXxj3CBWdQnA/3bJ5
 uDzXe9yRZ2VEFP4/HhfA11y3Z/tEmA==
X-Received: by 10.28.98.66 with SMTP id w63mr488539wmb.33.1492208909847; Fri,
 14 Apr 2017 15:28:29 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.28.39.134 with HTTP; Fri, 14 Apr 2017 15:28:29 -0700 (PDT)
Received: by 10.28.39.134 with HTTP; Fri, 14 Apr 2017 15:28:29 -0700 (PDT)
In-Reply-To: <CACNAnaHTAS7a+vuTS+3yAT3p_4yUY_MOmd=mfBc922-0pjNW9g@mail.gmail.com>
References: <CACNAnaEmBjWudEJwvRTSqyciOp7-oRbCEQ_e6qtGsap0oHQ4yw@mail.gmail.com>
 <CACNAnaGOLVKR7Y4uzhuS7EB5-UMb3tS9yKL4Srn8knThk0o1kg@mail.gmail.com>
 <10004f0d-acb7-f81a-f3d5-b368e606a105@FreeBSD.org>
 <CACNAnaHTAS7a+vuTS+3yAT3p_4yUY_MOmd=mfBc922-0pjNW9g@mail.gmail.com>
From: Kyle Evans <kevans91@ksu.edu>
Date: Fri, 14 Apr 2017 17:28:29 -0500
X-Gmail-Original-Message-ID: <CACNAnaHhoUNvWkbB095eAn2HDF78+KmgBQbYnZ1tMHCF-aa5kQ@mail.gmail.com>
Message-ID: <CACNAnaHhoUNvWkbB095eAn2HDF78+KmgBQbYnZ1tMHCF-aa5kQ@mail.gmail.com>
Subject: Re: Replacing libgnuregex
To: Pedro Giffuni <pfg@freebsd.org>
CC: Ed Maste <emaste@freebsd.org>, <freebsd-hackers@freebsd.org>
X-EOPAttributedMessage: 0
X-Forefront-Antispam-Report: CIP:129.130.18.151; IPV:NLI; CTRY:US; EFV:NLI;
 SFV:NSPM;
 SFS:(10009020)(39400400002)(39410400002)(39860400002)(39450400003)(39850400002)(39840400002)(2980300002)(438002)(24454002)(377454003)(189002)(199003)(55446002)(9686003)(90966002)(88552002)(54906002)(356003)(236005)(221733001)(512874002)(229853002)(8576002)(110136004)(4326008)(450100002)(38730400002)(8676002)(3480700004)(6246003)(53546009)(8936002)(86362001)(2906002)(54356999)(50986999)(2950100002)(63696999)(76176999)(42186005)(498394004)(61266001)(6916009)(106466001)(93516999)(45336002)(7116003)(61726006)(46386002)(305945005)(93886004)(5660300001)(75432002)(189998001)(84326002)(55456009);
 DIR:OUT; SFP:1101; SCL:1; SRVR:DM2PR0501MB1049; H:ome-vm-smtp2.campus.ksu.edu;
 FPR:; SPF:Pass; MLV:sfv; MX:1; A:1; LANG:en; 
X-Microsoft-Exchange-Diagnostics: 1; SN1NAM02FT006;
 1:Wcz5L6yu15tcei3G9vTbY5s/CxXUYIc6LWToLbGq9zwJa0e9r12UWjcYGq8nAJuqTZMfp1ubydINW9Cdev4KrnoL9r2oZyskCbQYGQKI6BkOj7CW1LjCFCuIs3wyHHioAI5AWerAdJ0pHpCwVPNgl57UHEGAvt69fDCggjPeTtru91CEB2t2aKNxb0A+SG4BQtSoKZU0uzsTTbakpPu/jgWo7omZeH3Zf4T3u653RmQpwkqMVbpFdhm1eMZ3/7qCme8+a+AEdqcnpAWZMqP5JMiBydJzWZyZpVaVqrGBWM+7G9JiHdPJNe8uFvqNAUEap9eLBSNEMQUkuu9iiPm0539C8YRNjLmoedkAPku1wD3vdObTniiJlhVnnnohFB8d7ywabtsEzWXNRyn+wl3vHYDDCHUj5De77TrBcsW/2B8JPuypqU4sSXv64gT0Zl/Lugdycd+5KXIM4+3P/V65lY2WShRnkgjogEBLOccgyN5nniawVPynDYYSyObp3/6v9KPjrhvQyxZM1yONDBdSqQfPqYfK5e+wBtaMX1K6gbE=
X-MS-Office365-Filtering-Correlation-Id: c0737644-330e-49bb-c1f8-08d48385961b
X-Microsoft-Antispam: UriScan:; BCL:0; PCL:0;
 RULEID:(22001)(8251501002)(2017030254075)(201703131423075)(201703031133081);
 SRVR:DM2PR0501MB1049; 
X-Microsoft-Exchange-Diagnostics: 1; DM2PR0501MB1049;
 3:s70eIdh3mCeqwvYJNy19i/s/77jq4xEUuWQGPIJNexsjKciMLy+FGDV1mcwgnZQUE4XtGnb3ihG0d93UZHpJibxMJJ1NWowPgvOTJPJuwaPjYhJGBzH0VFk80CzFhpNr8H8/zokuNCUqzeKnqw5t0x7q4+ch3l7Re7btW8jD1m+Ig74oMes3yHTeUQVimvz+2R1F23/WWoa9vjc5udjrmHt9HBszPC8u37Jn0oRlQpw0UqoksMdjOu/dRogCJ3E6YEAJ9jm6qzsuGERP+S7Ym9mkmoLqGwEM/V1X5PAKZ9hgtutVGbFsyzygZ3FB73HjB8yCn3ztlt51fzVKbLr78dUqc/QVuJ+qX7l8MtBu90JrKL6ZMZEcTIG7ym5sylls7NdzdsT4MtC+cQ8JPRrMWqhbi6IpE9pdBUMRhwGenvXdpxwtme2FRzmX6ruQtQPTbmSsyYnpB/4tIGSOUw4ta24MezuNmVh4Mo33KVTKzhKbkBzveOfjAkK3l4RPif0F
X-Microsoft-Exchange-Diagnostics: 1; DM2PR0501MB1049;
 25:MCMHs4LUXllaCgELsrTIQuE6RsphtnHHPbGodcQA9eOjLTE6eZmU09Mh0B4tgb4Rtu1V1oagE3CQyZA1myz1rha6gIMMOAO4QXDndDUaiHivt9AA7bCRYnfquuoaJ/gA4qVsQmSuXE2xr6oQjvbxERqqlQEZhsUfNj5gSr8yJWziMo97zsOV/1/ZKQdwU/hk6YxhIiXWDeObOh/CURqOisPTbFdfbnmTWlevHal7lH4lM/irACS1tE9sCiWHwuFZRXo1Noij7z+8gT5P7gv1qXrLdSEdGCZvblGMS0Fda2GweoF7dypxFHHGdvhFmx277d7K9x+Ilf+WnbhmdJVsabK7aMIDOxIEEPHEdXzGxGJYhV1S2y65lHqMbNWL5hRTJ4UhnJ2+hv/BWvykwS3I79RzM0d1FyzbZZzLGrSfvFLTI0ERN+i4Alv8wmHP1FzZJbtYaRxc+h4sngh0h7PY9w==;
 31:eC+4dqpEMDw3ezWeA1J1Rb63Pp1Be95joL+122GzZhFcIuFdIrQiK2wBqYf8w9+Vzr/icxNzMOdR6uOwBTI5M3s8qeTxdJmmWjdHGMyfKt1VFe6f/zJJy8QFofkivl6XppbdmaZvE21UKRubdfXiCPPsUOydoYVfwg1hrAFaD+9n8dO4jJYWDRwcZ8iZ0YYWz9RcBMpYJQG91wldQvL6IUWQMq8xIRcoSDknn4w/xO2ci7SvXt7YmPbD9ssoUKEbRovAnrC/snj/1NdiNDlqMg==
X-Microsoft-Exchange-Diagnostics: 1; DM2PR0501MB1049;
 20:hHDFdskJLCwDItI89FLRdAnoK6eQmF8dIlFPE1LtB/WybmB2aXCjOAt2i7MG58sY9mDSq8NMHc7tB6VS+B8Ue4LvD1O7goJbcJMcA5BtQrQ9xS0lSqJTjB+S6Bi2gaQrUJ2PVhcww52ggQ3w9jLFTlQSKObJNxUmKVwi6+J3tcssMfj1Xb4uuxapKl+KweiPKd+B/049Y4I6wLnfKMLZnu8SaXGQ7NXyAMAIOMKBopVkF8ROggd/WLOX3mE6vYbibByEso2WLqVAWmmZ2qIJySyrnyV7Qd0VoeCf8kzso8NwPTRuN1YA0IUEVrJ4zQYOGlej5PaZolMw9EPUJv3ng6P5vO9u1SzJMVaBVkHjOQoumX0+f/9ZBTUPI242nCvzhaBPb9KH9iy0zwggPRXJaX04C0PWbk2yIqJMlU9ZmiX2zVxcHwgG0ll1PfqvmTQfj9jj6B3z2MCdK3eVg9qiL+m4qX+hZjLswlAjjBcSnQgfgEYbeKbuxoval7tjrE4Q
X-Microsoft-Antispam-PRVS: <DM2PR0501MB1049911CD5DAD02966815FC5C1050@DM2PR0501MB1049.namprd05.prod.outlook.com>
X-Exchange-Antispam-Report-Test: UriScan:(112903893386949);
X-Exchange-Antispam-Report-CFA-Test: BCL:0; PCL:0;
 RULEID:(6040450)(601004)(2401047)(13017025)(13018025)(13023025)(13024025)(8121501046)(13015025)(5005006)(93006095)(93004095)(10201501046)(3002001)(6041248)(20161123564025)(20161123560025)(20161123562025)(201703131423075)(201702281529075)(201702281528075)(201703061421075)(20161123555025)(6072148);
 SRVR:DM2PR0501MB1049; BCL:0; PCL:0; RULEID:; SRVR:DM2PR0501MB1049; 
X-Microsoft-Exchange-Diagnostics: 1; DM2PR0501MB1049;
 4:0j7QluHyQ9IYe6zAMl8Teo8Hn2DMITfJHikNS5PitRa0gm9cIFEd8escNliH2KEGDdrljWv2dmib4MEE39MQxHaFolU4mKK/4vVOx4p5OAaEOp79LFy8F4QycD78HeNXRNjMaje5Hca5/b3n3bqg+Jywu6JK25lXV9E/FMuv3+ssvvt089AuDDXXiIRG8RcxKQyOfI+bqqG8bo2mGtLuDLbQPODGWKJC3ZEpFRwliwiWwrnrNempbAS9VjRMGDOHzOwSP+a2ZidnrX7BOOsaQVOOo5KknRVPGbIcINmEdAEl+ldhDa8uu/qdkpoONdnTgOmUywj/ZbHNJYpkjuv74xvpEDs3uRY0YBRXgoUda6QftLsL1Dy8GzrMEOIZ8l+ND6Jypp1dLfbLj6fLC53qOoYf8GbLQNjJEwyXfUPrTqkhw4rQwL9ZiFIjQkZLmbiki7173P76v+oFqkr8MH6SC5lUVkeuJhFak5a7vUJORiamvnEOWs7xBTVwS2VSGDNufKdtzdTHTB+D/v//GwaA9mIKd4NJzuJ80wDar6104i+Mqp4HwINInsH5sQLHhzuezAUFlKlEV6teRkOZrK9h/MyIwFT25sSeGUTHEJ2nj8yXvOBzdxTWdUDa73Npgs1E6xA+n5Z/UQax91nLqmqC2cDdgzLlzsdJzEHAdkdb2kXl9yMpI1ClcFu/Gt0avyESoG8lkVobN+7k09/5YBVJqYtN4qiFdbU9edolWyxg/vQgSCGO+g9tCwXTZq8/cuhpLj6+JO27ZWb7qR+9BhKdO5UgV4FQgI0q76H4LYE/sA8olXw/nPgQk2l2a06jFObUC5Thi6KqFrgHUqTn7t+ewnMSC3woAQcI5r4xmJGoNvDNPEC/LPo83ZyvANHDmn3LjGZPvJMtM1M7bbceT2q3fQ==
X-Forefront-PRVS: 02778BF158
X-Microsoft-Exchange-Diagnostics: =?us-ascii?Q?1; DM2PR0501MB1049;
 23:nTO1UYMgUqU51TkD37jFHgL+Pr7HF0cJfAOE53f?=
 =?us-ascii?Q?J2r0n5bQghQAua6yDiEGgw9v2uwmZqCj0aR7Rhz3eYxRQfBLfa0/B/IRe+iy?=
 =?us-ascii?Q?IsaDSHyRyHr5aAbvPE4yk3Vv2mE086mLue70c9tP5AZ/9n7hmqp7Bq4lQ+Bk?=
 =?us-ascii?Q?IMRpU4m8TL9Bw20a1qUeIFeC0GUJ73whh9tdz46Bh/e2IauO3JKFbaO8hlFe?=
 =?us-ascii?Q?0ADD48mAPDnhXE4skCcY++3Am5Z6TlCSiIaQfBTUl89XteTKLoMYhDw0OOOL?=
 =?us-ascii?Q?ImCyRspMI9eDkozE95GKGe39SrSZd+CD14vzHShdAokGhkffKzYELzcqL5/H?=
 =?us-ascii?Q?ViZUxoVnifLWZulXGPJfcIJYnGWmZp7t7v7PjQbCldQ73HzreEv5V4nXHFsM?=
 =?us-ascii?Q?S/ngO/hrRU9Wx2Pu6/VUs97Qsur3BfApKeMgg7Fk5s1VV2pbqpbsqJsyYJtZ?=
 =?us-ascii?Q?Ntw/SHifJr3Z6ToQvXEkgV6OOmjl+BpBdzSoxVezJcDBKdOajM8HdtmgXwfG?=
 =?us-ascii?Q?W8TFN/gU2Anz/Qpq8VFUI2AMJeB306ws9KsGaUNOK8VqWk7BH1I4Cqetkx0J?=
 =?us-ascii?Q?xs/LhkANpuuzDY2dNVjdaoiX5tmyTRRB/L7KkXYGPLanOjPzxhGssT2hI6QK?=
 =?us-ascii?Q?+1sa4WkwfslPde5Z79jcBuRVxQDAQkZ+6DmJj+4q3R3RVm10dlXhncTdVEWr?=
 =?us-ascii?Q?RIDlxnYUQ65JggijCbUsXnqet2VXCfW+kucIf/YHKX2K3Ti5LEuhspLK5CST?=
 =?us-ascii?Q?ivtKbCiavRIM+8qGoJgTDCwbwId8Tblx0qnWelBph5avLd16hsmJOkQ8GHd/?=
 =?us-ascii?Q?UT/KRiLSLBfQwALhcWIo4BWpY/aWOULwtAr7s9EMyHp4IfayKbmJ8DEmc+LQ?=
 =?us-ascii?Q?ZHRO9L837GnhFOZGfKTeZvwMMiUAbg+wLPQyIrhVYErTqCHG4Ccw9UMGCD1V?=
 =?us-ascii?Q?pa5lDMmj0bGomCTwVqHy1fARwD2oom4qpMDMq+fuvGOKatI1mKdD+IHJFvXG?=
 =?us-ascii?Q?0u9rjPQKXGItvBITfgbs1h77lfOv/WTQnrz6WLBKD4Zn9aSiCR900EnuQo5g?=
 =?us-ascii?Q?7SFuXjEZrxsMjow7x0O7xUQRHtLVxLBaQrZYAtWRVV5rZYxgrslbBkGUEJzn?=
 =?us-ascii?Q?0t1Xbs00ZCZAmUE1oqxyfV1skSSgsJ6g8cg7GejjFWj+I7qpcyFSlGPP2oC3?=
 =?us-ascii?Q?zJu0elMVJlpLTOzTxzUWzBiepBmKxPgpW/wrIxlOjQZgern10p09llYd7sLv?=
 =?us-ascii?Q?d9dgDu99rsdmHfrZqiSYiaET5pSNkliN8XnqrYbnM4bA26lsm7Fa83lhT5sT?=
 =?us-ascii?Q?KmUiw1lnN03y1gFRK8G4FmxGPyuI90751uDEqFcyrAaBwEHNzNF6AfnCv3qL?=
 =?us-ascii?Q?C0XzZIA=3D=3D?=
X-Microsoft-Exchange-Diagnostics: 1; DM2PR0501MB1049;
 6:1jZ2OQuHIRWE9ftya8jHFbSZh/ay42B4eCVK3Q7yjk+Nfod6gBm8Ftz7/Sw+iSdff2IANy4YOpA6gBfq4dGflPfUhTwxMPpwwlLkb8jcdCiNxMFbcbGXd3g8CRdovLsHip5j1UZ8L5RBoAqmwdFm8++BKrunYjc11WiXi1Qx7NCAtgvxYfoaWUAd6ebQ1zYl+FiIgu4JGQKJDitbhr5mx3U5MTozlKsp2hwANsCsXbleQNPz9i2nQ2P3hVcY+UnSuFC0JsVAjyhiVI2W0Fjg2Fw0NVpCqLat67eA+JcOg19p9axpFw/DjywW233e9G3HKf4R5BcpqmJ2EVW5ar83AfG8reYPaxFCaIuoJ1pVkTmwXZINnZLzRsBPwtCHUpXkCg1xvziRWZHS2ZBs+y7x/CGM+l7qdMi4T2qriw4Y21vn7eUcpLTHFHSnkAy2BoBdQDI7AYjQPkgty8VDpu3Eww==;
 5:o82h+/W84Nm5mSsKwQHGgoYSQ6etC1kqp9Xo6PZkA4jItw429WZJ2B7SI02VFyNrI/LV8xBdy4VO+16PZZ2EWE8wYwJEEWF1b4DBIpdrqAYzroq0CYeCTLTvv1MEDTfoiY81/AIhzipMaYpb8bNRHA==;
 24:6Ishjuj0zvNzzd7Xhm88iGZIzZzfV5XMRIS3rtzgqTKRpxWRh0P+e3lrTLss61x17grlnUP8ZoQwon+5hnrphCIr+MO1RxjKUlDOolYIDAc=
SpamDiagnosticOutput: 1:99
SpamDiagnosticMetadata: NSPM
X-Microsoft-Exchange-Diagnostics: 1; DM2PR0501MB1049;
 7:JZ059vuBAfMaM6eVMilsJ46QB97iHxrc1wrBrjbyiaNS9IaKfsE2WpDOPMhL5w7J027WvUG11IHl7eEJ2Nr+WQAiwjCygtbN5q/OpfBuus6J2lUoeq1Ux31EjS80bYgJm/mdwQWkPCZBSKwAd1e/6DRDgbfqArnx2/wRvN9VLXQt8w3KQRsMSItYpAFVgfz0ZW1oNRum6YBGNBxbRNmGDk+oPDFqbn470hFn78guNQtB/2XpSKq/maYx0EUCjegbRkT/nhYl4inb/WBdj9WVPCsflDACdgEPl1UK6TdYgA2kvVUDCjHyxYQnh8wbHDGHzhfLIGmzuHxuuyXGkxKUBQ==;
 20:gKPZ4JjpJNCAlnMlpf/2oGXsxW+j6A1D6ejDIkPuGrivoaFikww+N8jVlj2fhwz1U1VcbItofcokE6i34v1+2WaSWWIiQnTGO6phbTuPO/FCWOV8WBjc0ZYRB2dJu47uGILob+cy5ZJBoG2OiDvo/y8dGuDpEoX9aYVW5+dw0xc=
X-OriginatorOrg: ksu.edu
X-MS-Exchange-CrossTenant-OriginalArrivalTime: 14 Apr 2017 22:28:33.5560 (UTC)
X-MS-Exchange-CrossTenant-Id: d9a2fa71-d67d-4cb6-b541-06ccaa8013fb
X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=d9a2fa71-d67d-4cb6-b541-06ccaa8013fb; Ip=[129.130.18.151];
 Helo=[ome-vm-smtp2.campus.ksu.edu]
X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem
X-MS-Exchange-Transport-CrossTenantHeadersStamped: DM2PR0501MB1049
Content-Type: text/plain; charset="UTF-8"
X-Content-Filtered-By: Mailman/MimeDel 2.1.23
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 14 Apr 2017 22:28:37 -0000

(apologies, wrong email, resending for list)


On Apr 14, 2017 3:41 PM, "Kyle Evans" <kevans91@ksu.edu> wrote:

On Fri, Apr 14, 2017 at 3:24 PM, Pedro Giffuni <pfg@freebsd.org> wrote:
>
> That doesn't seem good: anything that breaks tests is very likely to have
> other side-effects.
> Keep in mind that any regex change will likely have to go through a ports
> exp-run and
> ports will still have to work fine in three versions of FreeBSD.
>

Yeah, I anticipate other side-effects from this. Fortunately, there aren't
many ports relying on GNU extensions, and as a part of [1] I'm trying to
get them to start using textproc/gnugrep, since that is more up to date and
better tested.

As far as sed goes, the only potential breakage should come from scripts
that expect \<, \>, \b, \B, \w, \W, \s, and \S to be ordinary characters.
This is easy to fix in a way that is actually POSIX compliant (unlike
expecting them to be ordinary), so no worries there.

It's worth noting that I have absolutely no intention of changing anything
to actually expect GNU extensions, but I tend to use them in my own daily
grep(1) usage; some of them are nice to have.


On second thought, I should add a REG_POSIX flag so that we can make sure
to maintain POSIX compatibility instead of removing the tests with
expectations that cannot hold. I think it should be opt-in though for the
sake of, say, gdb, which expects GNU extensions.

I do still intend to fix the regressions that occur because of undefined
behavior, though.

From owner-freebsd-hackers@freebsd.org  Sat Apr 15 06:03:13 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 35BC8D3ECE3
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Sat, 15 Apr 2017 06:03:13 +0000 (UTC)
 (envelope-from kevans91@ksu.edu)
Received: from NAM01-SN1-obe.outbound.protection.outlook.com
 (mail-sn1nam01on0069.outbound.protection.outlook.com [104.47.32.69])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-SHA384 (256/256 bits))
 (Client CN "mail.protection.outlook.com",
 Issuer "Microsoft IT SSL SHA2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id CC9A938A;
 Sat, 15 Apr 2017 06:03:12 +0000 (UTC)
 (envelope-from kevans91@ksu.edu)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ksu.edu; s=selector2; 
 h=From:Date:Subject:Message-ID:Content-Type:MIME-Version;
 bh=VJFfAlO+msErk2NxE7A4+/32btE1s8PRk6PbvaBiLAQ=;
 b=ER5GG7hNJuYmtitJHdnkqtoQpUY4w8HrG//ter3WGuylaw2/L0tADqlN2YNJy7O2xbsb1JtEyIJIZZcMIYonLJzASk8fp0lMOlUTkrzQjztE7pSJwQq41BCXRI2h/QMkCwHwRF3VnxOuHZyeTBOFkLLA9r5VxfYTuvU3ggYYXMg=
Received: from SN1PR05CA0024.namprd05.prod.outlook.com (10.163.68.162) by
 SN1PR0501MB2047.namprd05.prod.outlook.com (10.163.227.20) with Microsoft SMTP
 Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id
 15.1.1034.5; Sat, 15 Apr 2017 06:03:09 +0000
Received: from BL2NAM02FT044.eop-nam02.prod.protection.outlook.com
 (2a01:111:f400:7e46::202) by SN1PR05CA0024.outlook.office365.com
 (2a01:111:e400:5197::34) with Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.1047.6 via
 Frontend Transport; Sat, 15 Apr 2017 06:03:09 +0000
Authentication-Results: spf=pass (sender IP is 129.130.18.151)
 smtp.mailfrom=ksu.edu; freebsd.org; dkim=none (message not signed)
 header.d=none;freebsd.org; dmarc=bestguesspass action=none
 header.from=ksu.edu;
Received-SPF: Pass (protection.outlook.com: domain of ksu.edu designates
 129.130.18.151 as permitted sender) receiver=protection.outlook.com;
 client-ip=129.130.18.151; helo=ome-vm-smtp2.campus.ksu.edu;
Received: from ome-vm-smtp2.campus.ksu.edu (129.130.18.151) by
 BL2NAM02FT044.mail.protection.outlook.com (10.152.77.35) with Microsoft SMTP
 Server id 15.1.1019.14 via Frontend Transport; Sat, 15 Apr 2017 06:03:08
 +0000
Received: from calypso.engg.ksu.edu (calypso.engg.ksu.edu [129.130.43.181])
 by ome-vm-smtp2.campus.ksu.edu (8.14.4/8.14.4/Debian-2ubuntu2.1) with ESMTP id
 v3F638fa015535; Sat, 15 Apr 2017 01:03:08 -0500
Received: by calypso.engg.ksu.edu (Postfix, from userid 110)
 id 57FA3248319; Sat, 15 Apr 2017 01:03:08 -0500 (CDT)
Received: from mail-wr0-f179.google.com (mail-wr0-f179.google.com
 [209.85.128.179])
 by calypso.engg.ksu.edu (Postfix) with ESMTPA id D7499248318;
 Sat, 15 Apr 2017 01:03:05 -0500 (CDT)
Received: by mail-wr0-f179.google.com with SMTP id o21so59175369wrb.2;
 Fri, 14 Apr 2017 23:03:05 -0700 (PDT)
X-Gm-Message-State: AN3rC/76pMIK9NG8BnohSPllYBvmW1WO1W44i9cAwoZYsR2xHZa3tU3O
 Rqifv3LaiskvMXRoQtYEsin1WDU/zw==
X-Received: by 10.223.154.54 with SMTP id z51mr9936411wrb.76.1492236182773;
 Fri, 14 Apr 2017 23:03:02 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.28.39.134 with HTTP; Fri, 14 Apr 2017 23:02:42 -0700 (PDT)
In-Reply-To: <CACNAnaGOLVKR7Y4uzhuS7EB5-UMb3tS9yKL4Srn8knThk0o1kg@mail.gmail.com>
References: <CACNAnaEmBjWudEJwvRTSqyciOp7-oRbCEQ_e6qtGsap0oHQ4yw@mail.gmail.com>
 <CACNAnaGOLVKR7Y4uzhuS7EB5-UMb3tS9yKL4Srn8knThk0o1kg@mail.gmail.com>
From: Kyle Evans <kevans91@ksu.edu>
Date: Sat, 15 Apr 2017 01:02:42 -0500
X-Gmail-Original-Message-ID: <CACNAnaHRi4RH4Staf6ZT5+1_ZqSBAR6shOd2=nYt3K9_A5kKZQ@mail.gmail.com>
Message-ID: <CACNAnaHRi4RH4Staf6ZT5+1_ZqSBAR6shOd2=nYt3K9_A5kKZQ@mail.gmail.com>
Subject: Re: Replacing libgnuregex
To: <freebsd-hackers@freebsd.org>
CC: Pedro Giffuni <pfg@freebsd.org>, Ed Maste <emaste@freebsd.org>
X-EOPAttributedMessage: 0
X-Forefront-Antispam-Report: CIP:129.130.18.151; IPV:NLI; CTRY:US; EFV:NLI;
 SFV:NSPM;
 SFS:(10009020)(39850400002)(39410400002)(39400400002)(39450400003)(39840400002)(39860400002)(2980300002)(438002)(24454002)(69234005)(377454003)(189002)(199003)(9896002)(61266001)(55446002)(512874002)(93516999)(59536001)(189998001)(6306002)(9686003)(966004)(84326002)(6246003)(6916009)(54906002)(236005)(2950100002)(229853002)(606005)(63696999)(53386004)(54356999)(86362001)(110136004)(2906002)(42186005)(90966002)(50986999)(98316002)(45336002)(88552002)(2351001)(46386002)(4326008)(76176999)(356003)(38730400002)(3480700004)(450100002)(7116003)(8576002)(221733001)(5660300001)(61726006)(8936002)(106466001)(305945005)(8676002)(7906003)(75432002)(55456009);
 DIR:OUT; SFP:1101; SCL:1; SRVR:SN1PR0501MB2047; H:ome-vm-smtp2.campus.ksu.edu;
 FPR:; SPF:Pass; MLV:sfv; MX:1; A:1; LANG:en; 
X-Microsoft-Exchange-Diagnostics: 1; BL2NAM02FT044;
 1:NH6RnNj+T4wkRW0TOTpG0r124MwsOh87KShcjrdSTufLAfUgAa+IwVurvj6bgjGTNZAPi6f2NXiXujzoIdrGX8AG9INQwcMvuSs9MLdoINTEkRmr9aAegScpqxbcS93vlWYNjQ3f3ELC2XKT15GHqMbniRiiInvpXUdvawJrk0x7P/HQccoR0nN3WeFrbGye7zME4Qaue1KuOjtYYaOMUV8jARAI2E1QZEq86/kAQ2UsL3hhsHx+/UpkAD4+hvv/Hbux5cNXd/9Bs2Mx50tIq+yNoTFUXnyjJYKifKaAvCGp5wVpOH3O6rbdbQQd8/fRebMOrCCav9IKvP4WhTIc+89VW35FSnoFD6v9G9YHVzfEeUWlRJv35a/VjuodVGK9Mnr7x0/lqqS7XrGPZYgm/RsfWLeCTT8HYHp6bwFAQ8A2dSZ/VyfnYcIsfVi093PMGlKDbVYRoI6r+nHpqTwp3LaoB9JfdtIJA0VbYm9w05i+VNhl3n2EJJzABKeExeNMtwAyBA2XyTaDxMG7vH/6/JO685HkpGTmt/xvXeXmfMQ=
X-MS-Office365-Filtering-Correlation-Id: 420d314f-af14-47ab-f3c9-08d483c51772
X-Microsoft-Antispam: UriScan:; BCL:0; PCL:0;
 RULEID:(22001)(8251501002)(2017030254075)(201703131423075)(201703031133081);
 SRVR:SN1PR0501MB2047; 
X-Microsoft-Exchange-Diagnostics: 1; SN1PR0501MB2047;
 3:mqqXYen4PwoqUFaO1T5x2QaeaUmW08HGR2EtehvxZRVauRZNmVi/Rz/IX+f29/2KEohppoihD2o9qWfFDyAdCpcv7cO3nNr/mIeHTN/tg/OnIahXacWMw78R30hnszEjMd+rlN/CSnE5L2U0uvhUF2LM42+YMPo4B/oecCttT6YUwswfYWnS4zPYBRpQH3ZNi0B2Upl0qwhK6dV6pCj43j7imoOxqiqkiLSSB/fahVzvVv1NdE/IhdJT/YfovUNHZMBCLW1M0oGygLnsbUAf+3q+p7F8wvyBQTbnzBhx15otHk+Fs2R4dew9LkuHr1ThB+GZj/5YunQdJkqKI4i2IwXgBNo+nMlVeIzHL/GAjt2wM9n4COxkNW5Y36H68gLmLxycMp0Oe3yZigkgedOJjGBn6fxFCXI/ZbvDXfSHOBU0qVf3Pmf5Vcln5PP05Xgj7AIeuGClpxES/QBgGQ43zFfItd5hIomG4WJL2mXJo+sjueCSrsiCqblaYU6toW01
X-Microsoft-Exchange-Diagnostics: 1; SN1PR0501MB2047;
 25:yLAPJsfnM+FVvA9nMZedDqwZlYARdDp0JaICeV+T0Jh8KyqfKWukrp8TkPH2OEYaWAjc9l3viapk6g0c4BlVJMinMpyJHbctXntXANV3hxI8zLGWi7NZpSm+HNljkVOzcBq5IVri24/Rh7+NxxnNDYeYdAfERFIWHRWqVcifMpXjC0gJQtzynxWuoBwbAoFhISBiD6C3YE6jjLllPkqye8NL7XshdbXSFArxP0MTSd6DZHN08lx+KqbXaFVW1/gSnX8ZFawJ102eVqtzYrMVMU/x29qeXGogGk5MH5xddU0DPNDYTF02OdBrYVHknxSGW73yDWT1jkMCNq3OmxPrCmEpRVRj4zmN6RvJyK8pc6beRuQxH5iqdfM1wdgSjLeAlYUUVJdpSo4DTBMnFG6zIwnNRiJ4jGvpZUYPK2cX20svSPLJHOe4SfXXGhXWEUWl6NJWix2acYLkUzV5v2ArwQ==;
 31:HeCTC/jg5OsZQEfSRJOiWXImusR5tDOpPRvZPhQPPSd61Qw+x2iCyP/yGvwgoY4IaR5z0sDij7ZCEdRW2skbpPD5muNXi9xMpVnkkRLNCPQ9vPKIhGqIFltQJorKWHXjnmDTZ+Vx6PA4HGRjraVpeisZoM2bx6tfAIoBq6i760rEXCGNe0agbuR0nedtfSZXb/lHXmxBMInGveZGBPXdwPeSGsxMBz0WVMhdhg2B275AOpoKExGpXwWbPKUkj/04wZaXbL59u+uFXL977i4fNA==
X-Microsoft-Exchange-Diagnostics: 1; SN1PR0501MB2047;
 20:Vmh7jyWOCwZ2juiuSddPTSBEukAgfd9XiszhCHR1nhkCf+ai38rEcWw2EgQZ7DEHuUaxRENDuQBSeTy5+rqPVU2tRO70H67obpggqJy0k+aNleZZ8GLIfNGj6zUopIzs35pwua8tRf6pMbo6AwZv9SAvnJSDC6nNGRynhaigehn4x1zVybEjJn7Cq5HlJaYSDO2zVzfWSH7U3bNZiQh97SzB7nz47DeBkTrwT+dcr1E7Tyqt5UXXk+6G7XgNhn8L2ePFpmA5vx3jFQq0HlJP5eyT1QyF6HL2wbsgmASzhlQLP7tQ58QgOWR1Nrk6aGz6cFlVWYirQ5lI3pjDGIe3jxCHexwXoc8NKaMHEP/tI3jtqKcO/xeqX/QC4Rr4s52/Q35hQUhgLHzxnELj5ZJ82IyI/bfN7vlSbgCZNiJDvHeBNvS98eg4xEas38L5K3aWReb7PoIThie1kkwq3RkSLpLYr7HsgYbUi6KeBo2I4mjX5oWzW09pmlTr78GgI5/E
X-Microsoft-Antispam-PRVS: <SN1PR0501MB204793F0C74312B66F29BA4BC1040@SN1PR0501MB2047.namprd05.prod.outlook.com>
X-Exchange-Antispam-Report-Test: UriScan:(112903893386949);
X-Exchange-Antispam-Report-CFA-Test: BCL:0; PCL:0;
 RULEID:(6040450)(601004)(2401047)(5005006)(8121501046)(13015025)(13018025)(13017025)(13024025)(13023025)(93006095)(93004095)(10201501046)(3002001)(6041248)(20161123555025)(20161123562025)(20161123564025)(201703131423075)(201702281529075)(201702281528075)(201703061421075)(20161123560025)(6072148);
 SRVR:SN1PR0501MB2047; BCL:0; PCL:0; RULEID:; SRVR:SN1PR0501MB2047; 
X-Microsoft-Exchange-Diagnostics: 1; SN1PR0501MB2047;
 4:DdInQOExtNcSzi5XEWJj4IA7xrLNRBbStHVn+bmeR/xnDgoKhyKR9vGJd9atkWx6wPEUxv/+hrnZquWZZdpCq2QqeC2Ys1w8i5jcvzIdSPntRcE1WqJY5NP2sdFo6LXidSXwLSlMLunWC1BbBCiiFmx6S8IOPR4nfbRXe9VKGxu32GK86IJIg9FysE4bRdE84tdeTbpZYzBCUhxNLAX8rTqamyPCV3OVOuOFasJB+QvRTqR/Wh3ZIPV2mkKNMnZ+KsYYMq8G0n1lW+gfHOMQAM6GZl0gZlyihb1U9nq0txB6HcFvfllMpJVDZIMD0QxhaEkRhENXWdh+nqgtQo2QyXL2SCw3OWrMg3jfTofGEIs6/BFSueZ0YXHinKO3woRGWljrTXDWBAGAp33rjGYGJIFFRu9SR0Xfa/XCBhcoMA6ZniHLqkkMG7J7g+SQq0/aK5p6dIu2GqP1ybk1eLLciCIFo0pvq37rcGz/RPuMhQj89ajRsm2wqyY8392d3vDvUbFz74mOb99Z4PEWcE4ZYPgGAWa1wbunRv8Da0BksL/T/0c/qfwK2+eOc0CeXuAWYKhRj4GcvR3GMm7NPfPhBX0hJnR4DL3ChuQIptDHnhhea7E/TRwXFqO56bjjafKeKxByPGlw09x4XvXvrYfzcAgbaXTgMIkmpe2Plc+GTMOhySEfV6X6I6Tz5cEOA3+RWfCRBO/0ba93wOLR+X3jva81saL4IYJRlJtz81qah5cqC1WS162s9CA4Fq5h0Bo8o4/XfXk9y3gpkSemPuZHOgFok1lS9Ey713NfbkqoCt2yRhZNYytBKTv2hCALNtPW2Ul7kUwJeKWEzqaS25KWz6TDwdhI+SaM5dm+gRxvBY/aowrtF3UCFgYQE+InEyjupVj07L4Ct5iCAkWtEYwMMw==
X-Forefront-PRVS: 02788FF38E
X-Microsoft-Exchange-Diagnostics: =?us-ascii?Q?1; SN1PR0501MB2047;
 23:UC9ovcyIJXvjr6I5nI7JsV3hGKCpoYbB0Rvs1K4?=
 =?us-ascii?Q?vpgn6NdKg5Vug2Gk5V3sUJWUTbmpT7A2EMOpAuyx3eq/zBbosftFPpO/p12l?=
 =?us-ascii?Q?M5+zoHmkLG2ScAYXgOlqOpbHiPx32ahcheaOoQhbhIXjGkBevcx1xLnlcDVK?=
 =?us-ascii?Q?HBpfdPSi9t4HhCNjAfIcaBObzJHUjVbCm91SzOXbgn/bfVbKjKJN0fFEyp5P?=
 =?us-ascii?Q?PK9GZM76Qp/KqFhDuDjcre17zRqbcoTl9bZmLpx/tt5qC6H0fgv+ezOhHLrt?=
 =?us-ascii?Q?FfaPC4HUH0M0iaW+4cL6bi/enWfxZzV6xhGhihAlne8eaW4o4DtPy0VDPoVz?=
 =?us-ascii?Q?9Q/J0DeSHAb46PoabRDrt/mUsFcWwTpYVnxWgOF+tqc/ieLwiqfy7UCfr40H?=
 =?us-ascii?Q?ac9Viw72rJ/gFwE+0glIGBDoXjaPwUQ/T21BbZ8NrpgVBhsC6hJDqePMqPpQ?=
 =?us-ascii?Q?hd43sCXddwRidp+sW2/JJAlUYCkraFxSevYg2GJvg4RLg+Ju/6j8LYq0LB43?=
 =?us-ascii?Q?/XSZs6gFkSe3/rT9Mm7QWM3wtwsXSQpozYGWYLDbtzYpVck2HBv2vnLTWQuH?=
 =?us-ascii?Q?tB+oBbHI8e+AGQq/fQ3VZaluZ7FbclAqj+tyPyAsN10HPOsM1P4lEIr7UF9C?=
 =?us-ascii?Q?+HqBmlxCYqDCVfGoxcnqsqWWtp6a9I6Hc91EJus+zyd9XEpxYmpgE/oHkIUt?=
 =?us-ascii?Q?Hk0K2mi0mFcS4PVHlLiwcPRMwi5H4u1HcDWtilTrvllzguRHPRMlY5/d+xxH?=
 =?us-ascii?Q?rZf5pROC6y03FnbEZgpYOmTxiT2vq+R5vMlP3rhLqaA5ZHnn+KF99cnKLOVb?=
 =?us-ascii?Q?Jm4uTirGjTIsjA5o8kA+EZkI/hIB4/pyjRe1rLohwAWeMINn2eAGX3/TMgK3?=
 =?us-ascii?Q?xgd9X21mOOBOb6au8IiVw26743r/D/4ZkmiwKQ21jGhiCyLvVLGg7jFtGYn0?=
 =?us-ascii?Q?KwqlF+nAPhmk0tbiZqupuq1XYn9ztNONS1T/tNkq5xlrUCZ2H6Q0ZDasstGo?=
 =?us-ascii?Q?1JNwVpxJZdNCsDs3k6e8Ozu7pNAyimg2OMnAVYwLiwgqv0Vv/XYh1IpxUGE9?=
 =?us-ascii?Q?8jozkPEE6Gm0dr/hQpbQ+ESIpena3h1T3WlQ9mwirLhcm/rjw3r7rpbV56/K?=
 =?us-ascii?Q?VS7mDR1AvYMvQD5TYPFJuWOE8wJHVagmqvIBY+Yx/hioqdjlaIChC83Kbido?=
 =?us-ascii?Q?pEkxiS+LUzhr/kTYTjZBgLg61wtLKKyVt80Y4ZDx991wOfRJZh9ccRTi0oUB?=
 =?us-ascii?Q?YsQbZ03VDS5z5WS1iRZNsXrvoCfNzSaR6cuQpadsMLkNvdsl9gLZhE0qfgFG?=
 =?us-ascii?Q?2wpNuWRP/Vhy37WLa6EcRIl2RCkKEJlha39x2BJi6wrgJiGehLLOVVx3RxKT?=
 =?us-ascii?Q?7I0oPpknBpEsq/AQfX+K4NGJXooRoMjDbvC2hWr47V7P/aVoFxP05ClnZ+7i?=
 =?us-ascii?Q?YJ0k7Vp6egjF+R0tmLCR/3FlkZ8EWjVB9spm7ag/tK1DpaqkorKH6y2mnVJE?=
 =?us-ascii?Q?E4sWtcK25ZmD2VQ=3D=3D?=
X-Microsoft-Exchange-Diagnostics: 1; SN1PR0501MB2047;
 6:lzOkE2WXF8DKMImREkfDD/q+NlPPDwPRW6CG0Nyk3O2KDbKcwVDnUHG3LHHdODGi7inzbw/HAffg7+ZtrPp/bzU8cFGzrXZO5h9hEOk0htx3h1FCBBwtGqwi+EFNPqFt+nkXGefz/GFMGAkr9/w2fgJJKZhAuBhZcqTW2/OtxwHM8t3NPEP6bhdL5wEkuFJpxe3G8Iag4pLJRGxzpuePWjjdsC/W2iAUOta7w4/BAIm1QkzHafp1xGKpn0D1mIVw0ulNF7mA05VImnfcxwCdEppcjRJohoJio/+hX+yr6zQ/17kbyjd7tSOVn3O3ftJSrwcszoDXQaR6BZcXKCk2i90IUbon3ISPkI3XiFW7YFkPZ0o6huSNujfFt98Z/wzMW8ruz0dnCfMEEHjixvIHWvrxemPVqwReDcqUvZ8awqVrFhDxjGx3XX0AhkanlIsICQMA+M10Gm4tLVop1dBkWw==;
 5:d15GdEgKnIYNwfRovDpDwJU4Ybc6gpUWdbZUPiVdYz4uM2Nxhg6kttSUAvYlptfFvEegMsgQj+NlHktyl01jl5h4moy9JefDoprsH0wt+ko0DuqXIHDYKgK1xhjTCuibT9+oVAQqs3Z3Coog/NdgqA==;
 24:jFsOI2BKBwQZjoBexuQzVS7A+8szfokucioydoK4cmDM2Z4SajPKfgO4HpiHjpe8MRaxCCPf0Q6P7QGzIR2Ui9P21kIA0sioGnxwYdDa1xg=
SpamDiagnosticOutput: 1:99
SpamDiagnosticMetadata: NSPM
X-Microsoft-Exchange-Diagnostics: 1; SN1PR0501MB2047;
 7:VPdwnbhDwdQ69yHrTEv3nsp17lR3YQvollZmdobN5fAOoJFargCGFLCEYyKcY1Pml66RwoZjDp2dWOEZWoVuUkp/VvK9Loqn8UXHEiRJRaxFQw5i0amjLIkpfrni4A+fn8/MVSTVIMqtAXwW0Htk2RkE8I/3cOXyJmQRk6BUti8t46ynLenwJuWbl85dqsaa6dDTTUhk+qsFmW9oREpT++g+p5jGegvRTs+Aqz9nkGheG6F4WLH8RoupmfTAGNDjWXeZN75QzPO0wPsEhldPDvTLFJ2DQL2UJoLrxlKGiAwbFDW1KdDCbDZJzP38gAuga+ItTiZu9NCnzfSLB5JNOw==;
 20:gFPByWaBZNDnkURmzzCymQsz6U7YQRa5i1PgOV7GtEbsB4TIiTzbP5iR5pgVOwfHh+ZCHLxfI5blTcmNU0mL+Mgz0YFZ2d9zoBJIUvx3o6k2mtPCN6tRts9nhZPeT7eML7SAaA57hH3Hha0obesdIXK7ANEyOIDkWYCrk1/HF/0=
X-OriginatorOrg: ksu.edu
X-MS-Exchange-CrossTenant-OriginalArrivalTime: 15 Apr 2017 06:03:08.9412 (UTC)
X-MS-Exchange-CrossTenant-Id: d9a2fa71-d67d-4cb6-b541-06ccaa8013fb
X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=d9a2fa71-d67d-4cb6-b541-06ccaa8013fb; Ip=[129.130.18.151];
 Helo=[ome-vm-smtp2.campus.ksu.edu]
X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem
X-MS-Exchange-Transport-CrossTenantHeadersStamped: SN1PR0501MB2047
Content-Type: text/plain; charset="UTF-8"
X-Content-Filtered-By: Mailman/MimeDel 2.1.23
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 15 Apr 2017 06:03:13 -0000

On Fri, Apr 14, 2017 at 1:55 PM, Kyle Evans <kevans91@ksu.edu> wrote:

> On Tue, Apr 11, 2017 at 3:20 PM, Kyle Evans <kevans91@ksu.edu> wrote:
>
>>
>> On the other hand, I think I could fairly easily implement most of these
>> into libc/regex. Here's a summary of what this option entails adding to
>> libc/regex, from what I've found:
>>
>> * Empty subexpressions(*)
>> * Add missing quantifiers to BREs: \?, \+
>> * Add branching to BREs: \|
>> * Add backreferences (\1 through \9) to EREs
>> * Add \w, \W, \s, and \S corresponding to [[:alnum:]], [^[:alnum:]],
>> [[:space:]], and [^[:space:]] respectively
>> * Add word boundaries and anchors:
>> ** \b: word boundary
>> ** \B: not word boundary
>> ** \<: Start of word
>> ** \>: End of word
>> ** \`: Start of subject string
>> ** \': End of subject string
>>
>> (*) I didn't actually find anything explicitly stating this as a GNU
>> extension, but it's certainly not conformant to POSIX specifications to
>> use, it gets used a tiny bit in some ports, and we implement a workaround
>> in bsdgrep(1) for the simplest case of empty expressions ("") to match
>> everything and produce zero length matches.
>>
>> The main benefit of this is not having to maintain a completely separate
>> regex parser and the potential for inconsistencies that come along with it.
>> The downside is that that would seem to promote expressions that are not
>> strictly POSIX conformant. Is this a problem? Is this a problem worth
>> worrying about?
>>
>>
> FYI- A patch showing what the implementation for all of the above into
> libc/regex looks like [1]. Some cleanup is still in order and the test set
> is not exhaustive, but this should implement all of the GNU extensions and
> it's at least functional.
>
> It will break some things (like one of the tests, for instance) that
> relied on being able to escape an ordinary character (e.g. \b) and get an
> ordinary character. This is specified as producing undefined behavior [2],
> though, so I don't feel terrible about breaking it.
>
> If this seems desirable, I can work on cleaning it up and splitting it
> into more consumable bites for FreeBSD's libc.
>
> Thanks,
>
> Kyle Evans
>
> [1] http://files.kyle-evans.net/freebsd/libc-gnuext.diff
> [2] http://pubs.opengroup.org/onlinepubs/009696899/basedefs/
> xbd_chap09.html#tag_09_03_03
>

An amended version of this patch can be found here:
https://files.kyle-evans.net/freebsd/libc-gnuext-2.diff

This one introduces a REG_POSIX flag for regcomp(3) that disables the GNU
extensions for a more POSIX-conformant implementation, along with an
amendment to regex.3 documenting the flag.

Instead of removing the tests that don't fail like they should under GNU
extensions, I've restored them, added a 'P' flag to specify REG_POSIX, and
marked the failing tests with it to clearly denote that they require a
stricter implementation.
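
A rough sketch of how a caller might opt in to the stricter behavior is
below. REG_POSIX is the flag proposed in this patch and is not part of
POSIX, so the sketch guards it with #ifdef and falls back to no extra
flag where <regex.h> does not define it. The names STRICT_FLAG and
strict_bre_matches are hypothetical:

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * REG_POSIX comes from the proposed patch and may be absent from
 * <regex.h>; where it is missing, compile with no extra flag.
 */
#ifdef REG_POSIX
#define STRICT_FLAG REG_POSIX
#else
#define STRICT_FLAG 0
#endif

/*
 * Compile `pattern` as a basic regular expression, requesting strict
 * POSIX behavior when the flag is available.  Returns 1 on a match,
 * 0 on no match, and -1 if the pattern does not compile.
 */
static int
strict_bre_matches(const char *pattern, const char *text)
{
	regex_t re;
	int rc;

	assert(pattern != NULL && text != NULL);
	if (regcomp(&re, pattern, REG_NOSUB | STRICT_FLAG) != 0)
		return (-1);
	rc = regexec(&re, text, 0, NULL, 0);
	regfree(&re);
	return (rc == 0 ? 1 : 0);
}
```

Patterns that stay within POSIX BRE syntax, such as "a\(b*\)c" or
"^x\{2\}$", should behave the same with or without the flag; only
patterns that depend on the extensions would be expected to differ.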

Thanks,

Kyle Evans

From owner-freebsd-hackers@freebsd.org  Sat Apr 15 16:18:10 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 47CE4D3FF79
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Sat, 15 Apr 2017 16:18:10 +0000 (UTC)
 (envelope-from bapt@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
 [IPv6:2610:1c1:1:6074::16:84])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client CN "freefall.freebsd.org",
 Issuer "Let's Encrypt Authority X3" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 26FF8E2F;
 Sat, 15 Apr 2017 16:18:10 +0000 (UTC)
 (envelope-from bapt@FreeBSD.org)
Received: by freefall.freebsd.org (Postfix, from userid 1235)
 id 4EE3E732D; Sat, 15 Apr 2017 16:18:09 +0000 (UTC)
Date: Sat, 15 Apr 2017 18:18:08 +0200
From: Baptiste Daroussin <bapt@FreeBSD.org>
To: Kyle Evans <kevans91@ksu.edu>
Cc: freebsd-hackers@freebsd.org, Pedro Giffuni <pfg@freebsd.org>,
 Ed Maste <emaste@freebsd.org>
Subject: Re: Replacing libgnuregex
Message-ID: <20170415161808.rqcq44qcfyrrrrdg@ivaldir.net>
References: <CACNAnaEmBjWudEJwvRTSqyciOp7-oRbCEQ_e6qtGsap0oHQ4yw@mail.gmail.com>
 <CACNAnaGOLVKR7Y4uzhuS7EB5-UMb3tS9yKL4Srn8knThk0o1kg@mail.gmail.com>
 <CACNAnaHRi4RH4Staf6ZT5+1_ZqSBAR6shOd2=nYt3K9_A5kKZQ@mail.gmail.com>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha256;
 protocol="application/pgp-signature"; boundary="zty5vwucofg7xgsw"
Content-Disposition: inline
In-Reply-To: <CACNAnaHRi4RH4Staf6ZT5+1_ZqSBAR6shOd2=nYt3K9_A5kKZQ@mail.gmail.com>
User-Agent: NeoMutt/20170306 (1.8.0)
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 15 Apr 2017 16:18:10 -0000


--zty5vwucofg7xgsw
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Sat, Apr 15, 2017 at 01:02:42AM -0500, Kyle Evans wrote:
> On Fri, Apr 14, 2017 at 1:55 PM, Kyle Evans <kevans91@ksu.edu> wrote:
>
> > On Tue, Apr 11, 2017 at 3:20 PM, Kyle Evans <kevans91@ksu.edu> wrote:
> >
> >>
> >> On the other hand, I think I could fairly easily implement most of these
> >> into libc/regex. Here's a summary of what this option entails adding to
> >> libc/regex, from what I've found:
> >>
> >> * Empty subexpressions(*)
> >> * Add missing quantifiers to BREs: \?, \+
> >> * Add branching to BREs: \|
> >> * Add backreferences (\1 through \9) to EREs
> >> * Add \w, \W, \s, and \S corresponding to [[:alnum:]], [^[:alnum:]],
> >> [[:space:]], and [^[:space:]] respectively
> >> * Add word boundaries and anchors:
> >> ** \b: word boundary
> >> ** \B: not word boundary
> >> ** \<: Start of word
> >> ** \>: End of word
> >> ** \`: Start of subject string
> >> ** \': End of subject string
> >>
> >> (*) I didn't actually find anything explicitly stating this as a GNU
> >> extension, but it's certainly not conformant to POSIX specifications to
> >> use, it gets used a tiny bit in some ports, and we implement a workaround
> >> in bsdgrep(1) for the simplest case of empty expressions ("") to match
> >> everything and produce zero length matches.
> >>
> >> The main benefit of this is not having to maintain a completely separate
> >> regex parser and the potential for inconsistencies that come along with it.
> >> The downside is that that would seem to promote expressions that are not
> >> strictly POSIX conformant. Is this a problem? Is this a problem worth
> >> worrying about?
> >>
> >>
> > FYI- A patch showing what the implementation for all of the above into
> > libc/regex looks like [1]. Some cleanup is still in order and the test set
> > is not exhaustive, but this should implement all of the GNU extensions and
> > it's at least functional.
> >
> > It will break some things (like one of the tests, for instance) that
> > relied on being able to escape an ordinary character (e.g. \b) and get an
> > ordinary character. This is specified as producing undefined behavior [2],
> > though, so I don't feel terrible about breaking it.
> >
> > If this seems desirable, I can work on cleaning it up and splitting it
> > into more consumable bites for FreeBSD's libc.
> >
> > Thanks,
> >
> > Kyle Evans
> >
> > [1] http://files.kyle-evans.net/freebsd/libc-gnuext.diff
> > [2] http://pubs.opengroup.org/onlinepubs/009696899/basedefs/
> > xbd_chap09.html#tag_09_03_03
> >
>
> An amended version of this patch can be found here:
> https://files.kyle-evans.net/freebsd/libc-gnuext-2.diff
>
> This one introduces a REG_POSIX flag for regcomp(3) that removes the GNU
> extension for a more POSIX conformant implementation along with an
> amendment to regex.3 to document said flag.
>
> Instead of removing the tests that don't fail like they should under GNU
> extensions, I've restored them and added a 'P' flag to specify REG_POSIX
> and marked the failing tests as such to clearly denote that they require a
> more strict implementation.
>
> Thanks,
>

Thanks for working on this

Just to follow up on this:

Have you tested the results with the AT&T testsuite for regex?

You can find it at least in the dragonfly source tree:
https://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/abce74f49c2c19b069958a0b48de0a9987d14e35

It is also online somewhere, though I don't remember where :)

Another approach would be to import libtre + extensions into our libc
(like it was done on DragonFly; it was actually a FreeBSD project that
stalled).

Best regards,
Bapt

--zty5vwucofg7xgsw
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----

iQIzBAABCAAdFiEEgOTj3suS2urGXVU3Y4mL3PG3PloFAljyR74ACgkQY4mL3PG3
Plp6JA/9HEeUfT4DYLJ9OcHaPwi/5tf54S9iOZD8waD7MtDdydtK9Hghn93rDN6q
4Cxkm1ab0qXnYfFCJqwg2o5jHvmP5RG1a1EkW4OGe0/QUluvVM2bitr7v5BC1IhI
Ngrd3xZebLA6ce5KloSnuFxUWrT46CYlcKPWCwCOsXoP+tCRmEYdy5+fnVHACwlO
PJtR9xGysEJmow+ZWWL6FByHfui/5Wz5hlztD5T72f8/Y4xYpHQ+HisRrTmRm8TA
sxNMHkmffXmuq9wJZY+Pz10ucGkQzS2LjWYfKzN7UcHhqfpLS3GA0II1wqF9rowa
RxdDTOl1SsGh5DxEkqP/hepuX5TItLL95G6N7zBmB2m+6qcWVGTINKw1CMT8wVng
GeGQElR/lM3qlE8C+jj0uq0RLm33d+7weQle4oiPUScKPf6/CGwDuntHkiU8oe2+
yn8LdBNHjuXQcPkmVz34IWEnAo45ZCTuyK8ebJifjPjZEn3cSVS1TG3HARdF3QKJ
e/2pWrwXaA7KXXeW5wA3HamJlcBCIbQ6DKwrKEyJUfavsjp4qmJ/sbE3ok7cM9qY
oGLTJsI7YI1KdDneFiL32zzDmPv0uMj8pLTLwzvmVvzKiWw13yBweA96YEbx1+pf
TPLUOLeYZhaDG9kkyZCVW9ZtSRzfupKfpC49yhsS9TQg65vdAWk=
=oRT7
-----END PGP SIGNATURE-----

--zty5vwucofg7xgsw--

From owner-freebsd-hackers@freebsd.org  Sat Apr 15 16:31:31 2017
Return-Path: <owner-freebsd-hackers@freebsd.org>
Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 07468D3F517
 for <freebsd-hackers@mailman.ysv.freebsd.org>;
 Sat, 15 Apr 2017 16:31:31 +0000 (UTC)
 (envelope-from kevans91@ksu.edu)
Received: from NAM02-BL2-obe.outbound.protection.outlook.com
 (mail-bl2nam02on0089.outbound.protection.outlook.com [104.47.38.89])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-SHA384 (256/256 bits))
 (Client CN "mail.protection.outlook.com",
 Issuer "Microsoft IT SSL SHA2" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 66F0BCED;
 Sat, 15 Apr 2017 16:31:29 +0000 (UTC)
 (envelope-from kevans91@ksu.edu)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ksu.edu; s=selector2; 
 h=From:Date:Subject:Message-ID:Content-Type:MIME-Version;
 bh=2m1amc3RjQSvfwWW8cxJ5VTZsOVAJtVPBWklaQ+hxA4=;
 b=RRfRvxkskL1Ns+v2YDKj5W5ZQ3Sk/BF8YFNexmFn7Mi1HteSNO8iQ6w6RtSUG9Mlz3gY5dqXYIPgUJwUf/aJ+tcKWZdxY/gmW94Hi6qSxZJM9F1V4IW0cIlJpQyWGTKRUz3nriUw18LbMDU7C87oTdN94A4dBhyQhQUc1OI+J7M=
Received: from DM2PR0501CA0031.namprd05.prod.outlook.com (10.162.29.169) by
 BY2PR0501MB2038.namprd05.prod.outlook.com (10.163.197.25) with Microsoft SMTP
 Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id
 15.1.1047.6; Sat, 15 Apr 2017 16:31:27 +0000
Received: from BL2NAM02FT062.eop-nam02.prod.protection.outlook.com
 (2a01:111:f400:7e46::205) by DM2PR0501CA0031.outlook.office365.com
 (2a01:111:e400:5148::41) with Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.1047.6 via
 Frontend Transport; Sat, 15 Apr 2017 16:31:27 +0000
Authentication-Results: spf=pass (sender IP is 129.130.18.151)
 smtp.mailfrom=ksu.edu; freebsd.org; dkim=none (message not signed)
 header.d=none;freebsd.org; dmarc=bestguesspass action=none
 header.from=ksu.edu;
Received-SPF: Pass (protection.outlook.com: domain of ksu.edu designates
 129.130.18.151 as permitted sender) receiver=protection.outlook.com;
 client-ip=129.130.18.151; helo=ome-vm-smtp1.campus.ksu.edu;
Received: from ome-vm-smtp1.campus.ksu.edu (129.130.18.151) by
 BL2NAM02FT062.mail.protection.outlook.com (10.152.77.57) with Microsoft SMTP
 Server id 15.1.1019.14 via Frontend Transport; Sat, 15 Apr 2017 16:31:26
 +0000
Received: from calypso.engg.ksu.edu (calypso.engg.ksu.edu [129.130.43.181])
 by ome-vm-smtp1.campus.ksu.edu (8.14.4/8.14.4/Debian-2ubuntu2.1) with ESMTP id
 v3FGVQTs018767; Sat, 15 Apr 2017 11:31:26 -0500
Received: by calypso.engg.ksu.edu (Postfix, from userid 110)
 id 28ADC248304; Sat, 15 Apr 2017 11:31:26 -0500 (CDT)
Received: from mail-wm0-f45.google.com (mail-wm0-f45.google.com [74.125.82.45])
 by calypso.engg.ksu.edu (Postfix) with ESMTPA id CBE6F248302;
 Sat, 15 Apr 2017 11:31:23 -0500 (CDT)
Received: by mail-wm0-f45.google.com with SMTP id y18so4221895wmh.0;
 Sat, 15 Apr 2017 09:31:23 -0700 (PDT)
X-Gm-Message-State: AN3rC/6FUVbkyNl+Z4OJmc3q8Z3survvtGnYDU+5ouUdSoQewhWi4xyR
 0H1HXbZgfKRmVPn3H6dNG1y3R0eTLQ==
X-Received: by 10.28.181.69 with SMTP id e66mr2909977wmf.33.1492273882105;
 Sat, 15 Apr 2017 09:31:22 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.28.39.134 with HTTP; Sat, 15 Apr 2017 09:31:01 -0700 (PDT)
In-Reply-To: <20170415161808.rqcq44qcfyrrrrdg@ivaldir.net>
References: <CACNAnaEmBjWudEJwvRTSqyciOp7-oRbCEQ_e6qtGsap0oHQ4yw@mail.gmail.com>
 <CACNAnaGOLVKR7Y4uzhuS7EB5-UMb3tS9yKL4Srn8knThk0o1kg@mail.gmail.com>
 <CACNAnaHRi4RH4Staf6ZT5+1_ZqSBAR6shOd2=nYt3K9_A5kKZQ@mail.gmail.com>
 <20170415161808.rqcq44qcfyrrrrdg@ivaldir.net>
From: Kyle Evans <kevans91@ksu.edu>
Date: Sat, 15 Apr 2017 11:31:01 -0500
X-Gmail-Original-Message-ID: <CACNAnaFsAdesoF5ftRAuMBg9P1mdnXiNWnmQDTBG54W7q14tew@mail.gmail.com>
Message-ID: <CACNAnaFsAdesoF5ftRAuMBg9P1mdnXiNWnmQDTBG54W7q14tew@mail.gmail.com>
Subject: Re: Replacing libgnuregex
To: Baptiste Daroussin <bapt@freebsd.org>
CC: Ed Maste <emaste@freebsd.org>, Pedro Giffuni <pfg@freebsd.org>,
 <freebsd-hackers@freebsd.org>
X-EOPAttributedMessage: 0
X-Forefront-Antispam-Report: CIP:129.130.18.151; IPV:NLI; CTRY:US; EFV:NLI;
 SFV:NSPM;
 SFS:(10009020)(979002)(39410400002)(39400400002)(39850400002)(39860400002)(39840400002)(39450400003)(2980300002)(438002)(24454002)(199003)(377454003)(189002)(46386002)(38730400002)(93886004)(305945005)(110136004)(189998001)(50986999)(6306002)(54356999)(76176999)(93516999)(2906002)(86362001)(8676002)(63696999)(6246003)(221733001)(575784001)(8936002)(606005)(356003)(9686003)(3480700004)(7906003)(45336002)(8576002)(498394004)(84326002)(229853002)(90966002)(54906002)(75432002)(55446002)(6916009)(61726006)(7116003)(5660300001)(2950100002)(512874002)(54206008)(4326008)(9896002)(42186005)(61266001)(450100002)(236005)(106466001)(88552002)(55456009)(969003)(989001)(999001)(1009001)(1019001);
 DIR:OUT; SFP:1101; SCL:1; SRVR:BY2PR0501MB2038; H:ome-vm-smtp1.campus.ksu.edu;
 FPR:; SPF:Pass; MLV:ovrnspm; A:1; MX:1; PTR:ip-18-151.net.ksu.edu; LANG:en; 
X-Microsoft-Exchange-Diagnostics: 1; BL2NAM02FT062;
 1:DtqVLGnY0/Yqx8xtE8hrAmChvf+KYobtd7xGSAhFlenP4FKbJuzWowm/voXH/hjo8lNqRDZ2KQcZbsM45jRkwii9Y/fZyWCdt6GEbH3JZGVmgQbUzYC+tiGHT5mEI53sveEMBv1agMqs0qWriQhmVnKrCSMVrKzXuUATpb6j6KGCcr5mOMVgboJCyavbh8oaIlF3WWugKeRr+MkR09ipR/iVjBfPyZ+ZnW4xg6UPpT8D+ggehoHoZKUqvg4StzWtlIVJpo2UsPE5DD/tfNRAtfbztCiURfVXyf9o5T8O4jkTi6C2VtqzaIwG6tkC1yujCpux++IIKB8ZBdUmoguxwo/BGAzti9G+Ndh50ib+SXD90kyzrrEc0YRU8jRjhat12o3YF4eWYDajWKsG4iXDQXe56O431vkzVjxByk8j5HsB8Aub4zNPvmib3tuyhqNQmA74/DrLM/Y3XMxGCeKYM3qWzSJZjITpOiA51MQjS1r6EyWg0Rr1gFA6mpkJp+sfpPT3/zq4JsEp3AbJRVv2ucCkkuJFAOAOslW46SDdTMk=
X-MS-Office365-Filtering-Correlation-Id: 3e703db6-f974-4d0f-bf6a-08d4841cdcfc
X-Microsoft-Antispam: UriScan:; BCL:0; PCL:0;
 RULEID:(22001)(8251501002)(2017030254075)(201703131423075)(201703031133081);
 SRVR:BY2PR0501MB2038; 
X-Microsoft-Exchange-Diagnostics: 1; BY2PR0501MB2038;
 3:dOnI9JIWirZn/Bn/1YAgMPkQo90j2sSXlHix64CrNxyjQVP83hjH6dBjCSq+W8cpWmUKEKt04eKEPkdIXWQyHJFCaenJjvTdfmEtQ1QRZ4HiO93zkfifSyF11Zer/xg6sNAcsZEGnPzj1K/93tGH88aoZEZbKB+1rvsNtrYNII4fffy9kvj/RuwuRDO2D6bSUM6I+8HP7qG2dqYx2BZ4wnaH7cBAxa0BjQmgavbF1FDLNdANqiggzWR/5HIxwUb+Z00gaHZEcTIOV8u+6urEjlCGtdh9mo+2w2wn00q3ONa/rGluBzWgyBRBqsfZNCcLNBCaq+RCsuwiBKhMlT8nKL3JmPz5rJakg8mrSdapCQQt5HnN+WDmgJHIdHJmPJYQHHjdMYkse9zpkMGL47lk6o9u+7UBYRPDNfocMou1xcbaUUz/T4m6iT5qIu6sZAwCxAEMfjeVQchiF2ca1GunwK4zUnbhTzL3sAUVAB4bKfv1GCM6GWRrtRLml/4DjSpG
X-Microsoft-Exchange-Diagnostics: 1; BY2PR0501MB2038;
 25:hhYzV1BVaYTOmUa0uNd5fJXVF3eUJhoMeEykj11sm8WO8OktWuhee/+1dVOX145wQkGhgq4ImNqWbsXfi2Lc0TbraVGQ3VV1YsgLc8JFfa2q1ZWGmWG2R90ey0WbBG9CCTrEwBOk3V293pnrjY6o9DtdItfFFyvXDo0AQVEWkQiPbcMZ2Mtt74rwt1KFfuZ6LVCYCn2GsXTqMUTYmmNVxmJJLNsr+McKTESFXQMwilxDEv+5oHiHdUego1e/JttsjFKoSUSCAtKv90N9pZRFijcpmA+IxsHCBmx0mZDKg3ZF9oJXCIfz28X1qKuLHj4FX+y+6JFnwfWie6OtDGcVwGEWIfoSaCInJaq/lkPk76GeytjM1Ii5s+qjJk1xCtQbgLf8G4WdF16Csnl2rh4c8KUaqezZ0BrZ3E/fENbh7kpOhRTETDMg3THjv4KVVzyXBUwMjb4J+26VRTatTbkT8cjUYs/MfcAUf3BLGFj5ZDM=;
 31:vMTF6qtWCdKOBaCa2CcF4NTQx7r5TVHcQoGFH1Nw5M3QkNJKiaNPIX7bg0jCBU4KE3FbIEpeoqd3oz9R73RZQAufTO8F/2RwWp5JWVXMlNyb2xG5MZOCUJn600rfm48X4lkmyMVtU3g1LqUvWmLuBaZIRVSBJpDGwN/aSCUkkTX2MOsNSB0jHWbXBRyyGpBZw9n1IWUezR3h6x/ry+m/EGI1r4nG+C+drsE9IrrQ7I7IB38khggHSf4C6XKnfrXHzU6ntrm+agWOvzjpl84tBDq0yRxZSiG99H51Oz82T4U=
X-Microsoft-Exchange-Diagnostics: 1; BY2PR0501MB2038;
 20:rNiGXxop/L/x3DiMe0RS/xz+MgktyT2U2wTvwgbpLHdCj1Hrzo6SD20rif0E4tVowqGLtVylU2/LSEitr+gKe+sG0Tamqo3PtnQaVby5X4v5ULGNwjeZgI8Lh/niL5nQNfsxp9mQeFQSMK5TpzUPhurzDef58WO0bZGA+ZYCYL7dwI1WMq/vvbfcKATN3nppxx5mXEg6MZ89vp+vYByTyGkcfTyktIr3J4x0Zlc7I4y3+gtS8ghj7z7tCiFNamNEoUn2lysYE0J1bU85PFdqDbykNSWvpNOjkvYsqAv7brDdL5Vlah9SeuDPzARNteN1Qtlhw3OUKMkpr11ImLkt/RP3aNmVH01rulCbRwj37xx4CIkAyThHK8KLjpfIIQlTvn585LoN1DL6GsI9Wr1mFFyvuUWWk9Ax5n01ziskV2AnTqqAUkohKcE/Zdf/847XmbxFb6xJekDKA2MMEPisV4N24Tlt32hEv4XTHXrWUSStGH+4OxHNdqMuW+ONoJ5Q
X-Microsoft-Antispam-PRVS: <BY2PR0501MB20382F51DC5109521974A5EFC1040@BY2PR0501MB2038.namprd05.prod.outlook.com>
X-Exchange-Antispam-Report-Test: UriScan:(209352067349851);
X-Exchange-Antispam-Report-CFA-Test: BCL:0; PCL:0;
 RULEID:(6040450)(601004)(2401047)(13017025)(5005006)(8121501046)(13015025)(13023025)(13024025)(13018025)(3002001)(10201501046)(93006095)(93004095)(6041248)(201703131423075)(201702281529075)(201702281528075)(201703061421075)(20161123562025)(20161123560025)(20161123555025)(20161123564025)(6072148);
 SRVR:BY2PR0501MB2038; BCL:0; PCL:0; RULEID:; SRVR:BY2PR0501MB2038; 
X-Forefront-PRVS: 02788FF38E
SpamDiagnosticOutput: 1:99
SpamDiagnosticMetadata: NSPM
X-OriginatorOrg: ksu.edu
X-MS-Exchange-CrossTenant-OriginalArrivalTime: 15 Apr 2017 16:31:26.6020 (UTC)
X-MS-Exchange-CrossTenant-Id: d9a2fa71-d67d-4cb6-b541-06ccaa8013fb
X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=d9a2fa71-d67d-4cb6-b541-06ccaa8013fb; Ip=[129.130.18.151];
 Helo=[ome-vm-smtp1.campus.ksu.edu]
X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem
X-MS-Exchange-Transport-CrossTenantHeadersStamped: BY2PR0501MB2038
Content-Type: text/plain; charset="UTF-8"
X-Content-Filtered-By: Mailman/MimeDel 2.1.23
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 15 Apr 2017 16:31:31 -0000

On Apr 15, 2017 11:18 AM, "Baptiste Daroussin" <bapt@freebsd.org> wrote:

On Sat, Apr 15, 2017 at 01:02:42AM -0500, Kyle Evans wrote:

> An amended version of this patch can be found here:
> https://files.kyle-evans.net/freebsd/libc-gnuext-2.diff
>
> This one introduces a REG_POSIX flag for regcomp(3) that removes the GNU
> extension for a more POSIX conformant implementation along with an
> amendment to regex.3 to document said flag.
>
> Instead of removing the tests that don't fail like they should under GNU
> extensions, I've restored them and added a 'P' flag to specify REG_POSIX
> and marked the failing tests as such to clearly denote that they require a
> more strict implementation.
>
> Thanks,
>

Thanks for working on this

Just to follow up on this:

Have you tested the results with the AT&T testsuite for regex?


You can find it at least in the dragonfly source tree:
https://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/abce74f49c2c19b069958a0b48de0a9987d14e35

Or online I don't remember where :)

Another approach would be to import libtre + extensions into our libc (like it
was done on DragonFly - it was actually a FreeBSD project that stalled)

Best regards,
Bapt


Yup, we also have a copy of the AT&T test suite in tree
(contrib/netbsd-tests/lib/libc/regex/data/att). It passed that, the other
NetBSD tests, and I also ran the NetBSD sed and the gsed test suites using
a script provided by pfg@ to ensure no trivial breakage.

Has TRE improved over the years? It seems like we had a version around 2011
or so for bsdgrep that was quite rough. I'm not sure whether that copy was
heavily modified or was just TRE in its early infancy.

I think in either case, we might consider throwing errors for the bogus
escape sequences (anything that's not \<, \>, and backrefs for BREs) as an
intermediate step to stop *that* behavior, because that's going to be
problematic for many approaches.
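
To make the "throw errors for bogus escapes" idea concrete, here is a
minimal sketch of the kind of check I have in mind. It is not code from the
patch or from libc's regcomp(3); the function names (bre_find_bogus_escape,
skip_bracket) are hypothetical, and the set of accepted escapes is my
assumption: the POSIX BRE escapes, back-references \1 through \9, and the
\< and \> word-boundary extensions. If other extensions such as \+, \? or
\| are kept, they would need to be added to the accepted set.

```c
#include <assert.h>
#include <string.h>

/*
 * Given p pointing at the '[' that opens a bracket expression, return a
 * pointer to its closing ']' or to the terminating NUL if it never closes.
 * A backslash has no special meaning inside a bracket expression.
 */
static const char *
skip_bracket(const char *p)
{
	p++;				/* skip the opening '[' */
	if (*p == '^')
		p++;
	if (*p == ']')			/* a leading ']' is a literal */
		p++;
	while (*p != '\0' && *p != ']') {
		/* Skip [:class:], [.coll.] and [=equiv=] as one unit. */
		if (p[0] == '[' &&
		    (p[1] == ':' || p[1] == '.' || p[1] == '=')) {
			char delim = p[1];
			const char *q = p + 2;

			while (*q != '\0' && !(q[0] == delim && q[1] == ']'))
				q++;
			if (*q == '\0')
				return (q);
			p = q + 2;
		} else
			p++;
	}
	return (p);
}

/*
 * Return the offset of the first backslash in a BRE that starts an escape
 * outside the accepted set (assumed for this sketch):
 *   \. \[ \\ \* \^ \$	escaped special characters
 *   \( \) \{ \}	grouping and interval delimiters
 *   \1 .. \9		back-references
 *   \< \>		word-boundary extensions
 * A trailing backslash is reported as well. Returns -1 if none is found.
 */
static long
bre_find_bogus_escape(const char *pattern)
{
	const char *p;

	assert(pattern != NULL);
	for (p = pattern; *p != '\0'; p++) {
		if (*p == '[') {
			p = skip_bracket(p);
			if (*p == '\0')
				break;	/* unterminated bracket: not checked here */
			continue;
		}
		if (*p != '\\')
			continue;
		if (p[1] == '\0' ||
		    strchr(".[\\*^$(){}<>123456789", p[1]) == NULL)
			return ((long)(p - pattern));
		p++;			/* skip the escaped character */
	}
	return (-1);
}
```

With this sketch, bre_find_bogus_escape("foo\\ybar") returns 3 (the offset
of the backslash before 'y'), while "\\<word\\>" and "a\\(b\\)\\1" both
return -1. A real implementation would presumably do this inside the
parser and return an error such as REG_EESCAPE instead of an offset.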