Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 25 Aug 2017 14:55:37 +0000
From:      bugzilla-noreply@freebsd.org
To:        freebsd-bugs@FreeBSD.org
Subject:   [Bug 221804] page fault during shutdown on a server with gjournaled partitions
Message-ID:  <bug-221804-8@https.bugs.freebsd.org/bugzilla/>

next in thread | raw e-mail | index | archive | help
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=3D221804

            Bug ID: 221804
           Summary: page fault during shutdown on a server with gjournaled
                    partitions
           Product: Base System
           Version: 10.3-STABLE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: freebsd-bugs@FreeBSD.org
          Reporter: longwitz@incore.de

I am running FreeBSD 10.3-STABLE #1 r317936 using gjournal/gmirror and got
a page fault of the process g_journal switcher during shutdown:

...
GEOM_JOURNAL: Shutting down geom gjournal 1247376231.
GEOM_JOURNAL: Shutting down geom gjournal 1068288894.

Fatal trap 12: page fault while in kernel mode
cpuid =3D 0; apic id =3D 00
fault virtual address   =3D 0x378
fault code              =3D supervisor read data, page not present
instruction pointer     =3D 0x20:0xffffffff806aa1ea
stack pointer           =3D 0x28:0xfffffe104445fa10
frame pointer           =3D 0x28:0xfffffe104445fab0
code segment            =3D base 0x0, limit 0xfffff, type 0x1b
                        =3D DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        =3D interrupt enabled, resume, IOPL =3D 0
current process         =3D 6 (g_journal switcher)
[ thread pid 6 tid 100132 ]
Stopped at      __mtx_lock_sleep+0x1da: movl    0x378(%rbx),%eax

(kgdb) where
...
#10 0xffffffff80954abc in calltrap () at
/usr/src/sys/amd64/amd64/exception.S:238
#11 0xffffffff806aa1ea in __mtx_lock_sleep (c=3D0xfffff8003823d030,
tid=3D18446735278556155904, opts=3D0, file=3D0x0, line=3D1145437376)
    at /usr/src/sys/kern/kern_mutex.c:506
#12 0xffffffff806a9ff7 in __mtx_lock_flags (c=3D<value optimized out>,
opts=3D<value optimized out>, file=3D0x0, line=3D0)
    at /usr/src/sys/kern/kern_mutex.c:256
#13 0xffffffff811686bf in g_journal_switcher (arg=3D0xffffffff81170e48)
    at
/usr/src/sys/modules/geom/geom_journal/../../../geom/journal/g_journal.c:29=
45
#14 0xffffffff8068daba in fork_exit (callout=3D0xffffffff81167a50
<g_journal_switcher>, arg=3D0xffffffff81170e48,
    frame=3D0xfffffe104445fc00) at /usr/src/sys/kern/kern_fork.c:1030
#15 0xffffffff80954ffe in fork_trampoline () at
/usr/src/sys/amd64/amd64/exception.S:613

(kgdb) f 13
#13 0xffffffff811686bf in g_journal_switcher (arg=3D0xffffffff81170e48)
    at
/usr/src/sys/modules/geom/geom_journal/../../../geom/journal/g_journal.c:29=
45
2945                            mtx_lock(&sc->sc_mtx);

(kgdb) list
2940                    g_topology_lock();
2941                    LIST_FOREACH(gp, &g_journal_class.geom, geom) {
2942                            sc =3D gp->softc;
2943                            if (sc =3D=3D NULL)
2944                                    continue;
2945                            mtx_lock(&sc->sc_mtx);
2946                            if ((sc->sc_type & GJ_TYPE_COMPLETE) =3D=3D
GJ_TYPE_COMPLETE &&
2947                                !(sc->sc_flags & GJF_DEVICE_DESTROY) &&
2948                                (sc->sc_flags & GJF_DEVICE_BEFORE_SWITC=
H))
{
2949                                    break;

(kgdb) p *g_journal_class.geom->lh_first
$1 =3D {name =3D 0xfffff80038239560 "gjournal 1068288894", class =3D
0xffffffff81170e48, geom =3D {le_next =3D 0x0,
    le_prev =3D 0xffffffff81170ef0}, consumer =3D {lh_first =3D 0xfffff8003=
82c5900},
provider =3D {lh_first =3D 0x0}, geoms =3D {
    tqe_next =3D 0xfffff8003822a300, tqe_prev =3D 0xfffff8003826c330}, rank=
 =3D 4,
start =3D 0xffffffff8116bcd0 <g_journal_start>,
  spoiled =3D 0, attrchanged =3D 0, dumpconf =3D 0xffffffff811674e0
<g_journal_dumpconf>, access =3D 0xffffffff8116bfe0 <g_journal_access>,
  orphan =3D 0xffffffff8116be30 <g_journal_orphan>, ioctl =3D 0, providergo=
ne =3D 0,
resize =3D 0, spare0 =3D 0x0, spare1 =3D 0x0,
  softc =3D 0xfffff8003823d000, flags =3D 2}
(kgdb) p *(struct g_journal_softc *)((g_journal_class.geom->lh_first))->sof=
tc
$2 =3D {sc_id =3D 1068288894, sc_type =3D 3 '\003', sc_orig_type =3D 1 '\00=
1', sc_geom
=3D 0xfffff800380bce00, sc_flags =3D 144, sc_mtx =3D {
    lock_object =3D {lo_name =3D 0xffffffff8116f916 "gjournal", lo_flags =3D
16908288, lo_data =3D 0, lo_witness =3D 0x0}, mtx_lock =3D 6},
  sc_mediasize =3D 1073741312, sc_sectorsize =3D 512, sc_bio_flush =3D 3,
sc_journal_id =3D 1627568564, sc_journal_next_id =3D 163691452,
  sc_journal_copying =3D 0, sc_journal_offset =3D 59226624, sc_journal_prev=
ious_id
=3D 3706521218, sc_back_queue =3D {queue =3D {
      tqh_first =3D 0x0, tqh_last =3D 0xfffff8003823d068}, last_offset =3D =
805502976,
insert_point =3D 0x0}, sc_regular_queue =3D {queue =3D {
      tqh_first =3D 0x0, tqh_last =3D 0xfffff8003823d088}, last_offset =3D =
65536,
insert_point =3D 0x0}, sc_delayed_queue =3D {queue =3D {
      tqh_first =3D 0x0, tqh_last =3D 0xfffff8003823d0a8}, last_offset =3D =
0,
insert_point =3D 0x0}, sc_delayed_count =3D 0,
  sc_current_queue =3D 0x0, sc_current_count =3D 0, sc_flush_queue =3D 0x0,
sc_flush_count =3D 0, sc_flush_in_progress =3D 0,
  sc_copy_queue =3D 0x0, sc_copy_in_progress =3D 0, sc_dconsumer =3D
0xfffff800382c5900, sc_jconsumer =3D 0xfffff8003822f900,
  sc_inactive =3D {jj_queue =3D 0x0, jj_offset =3D 59142656}, sc_active =3D=
 {jj_queue =3D
0x0, jj_offset =3D 59226112}, sc_jstart =3D 0,
  sc_jend =3D 8589934080, sc_callout =3D {c_links =3D {le =3D {le_next =3D =
0x0, le_prev =3D
0xfffffe0000803f00}, sle =3D {sle_next =3D 0x0}, tqe =3D {
        tqe_next =3D 0x0, tqe_prev =3D 0xfffffe0000803f00}}, c_time =3D 338=
25373681,
c_precision =3D 1342177187,
    c_arg =3D 0xfffff8003823d000, c_func =3D 0xffffffff8116c190
<g_journal_timeout>, c_lock =3D 0x0, c_flags =3D 0, c_iflags =3D 16,
    c_cpu =3D 0}, sc_worker =3D 0x0, sc_rootmount =3D 0x0}

(kgdb) f 11
#11 0xffffffff806aa1ea in __mtx_lock_sleep (c=3D0xfffff8003823d030,
tid=3D18446735278556155904, opts=3D0, file=3D0x0, line=3D1145437376)
    at /usr/src/sys/kern/kern_mutex.c:506
506                             owner =3D (struct thread *)(v & ~MTX_FLAGMA=
SK);
(kgdb) list
501                      * If the owner is running on another CPU, spin unt=
il
the
502                      * owner stops running or the state of the lock
changes.
503                      */
504                     v =3D m->mtx_lock;
505                     if (v !=3D MTX_UNOWNED) {
506                             owner =3D (struct thread *)(v & ~MTX_FLAGMA=
SK);
507                             if (TD_IS_RUNNING(owner)) {
508                                     if (LOCK_LOG_TEST(&m->lock_object, =
0))
509                                             CTR3(KTR_LOCK,
510                                                 "%s: spinning on %p hel=
d by
%p",

(kgdb) p *c
$2 =3D 6
(kgdb) p lda
$3 =3D {config =3D 0xffffffff80d70e48, delay =3D 0, spin_cnt =3D 1}

The first parameter on call of __mtx_lock_sleep is the address of
sc->sc_mtx->mtx_lock with value MTX_DESTROYED (=3D6), and so we have v =3D
m->mtx_lock =3D 6, v & ~MTX_FLAGMASK) =3D 0 with page fault in line 506. Th=
e leak
here ist that the case v =3D MTX_DESTROYED is not handled in this function
__mtx_lock_sleep().

I think we have a race condition between g_journal_switcher and the process
"PID 1" who calls g_journal_shutdown() for every gjournaled partition on
shutdown.

(kgdb) proc 1
[Switching to thread 301 (Thread 100002)]#0  sched_switch
(td=3D0xfffff8000d2ff500, newtd=3D<value optimized out>,
    flags=3D<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1956
1956                    cpuid =3D PCPU_GET(cpuid);
(kgdb) where
#0  sched_switch (td=3D0xfffff8000d2ff500, newtd=3D<value optimized out>,
flags=3D<value optimized out>)
    at /usr/src/sys/kern/sched_ule.c:1956
#1  0xffffffff806cb316 in mi_switch (flags=3D0, newtd=3D0x0) at
/usr/src/sys/kern/kern_synch.c:473
#2  0xffffffff80710c1a in sleepq_wait (wchan=3D0x0, pri=3D0) at
/usr/src/sys/kern/subr_sleepqueue.c:632
#3  0xffffffff806cacea in _sleep (ident=3D<value optimized out>, lock=3D<va=
lue
optimized out>, priority=3D<value optimized out>,
    wmesg=3D<value optimized out>, sbt=3D<value optimized out>, pr=3D<value=
 optimized
out>, flags=3D<value optimized out>)
    at /usr/src/sys/kern/kern_synch.c:242
#4  0xffffffff8075e950 in biowait (bp=3D<value optimized out>, wchan=3D<val=
ue
optimized out>) at /usr/src/sys/kern/vfs_bio.c:3674
#5  0xffffffff8064a0b5 in g_write_data (cp=3D<value optimized out>,
offset=3D146815737344, ptr=3D0xfffff80609429c00, length=3D512)
    at /usr/src/sys/geom/geom_io.c:937
#6  0xffffffff8118c8df in g_mirror_write_metadata (disk=3D0xfffff800380b410=
0,
md=3D<value optimized out>)
    at
/usr/src/sys/modules/geom/geom_mirror/../../../geom/mirror/g_mirror.c:675
#7  0xffffffff8118c495 in g_mirror_update_metadata (disk=3D0xfffff800380b41=
00)
    at
/usr/src/sys/modules/geom/geom_mirror/../../../geom/mirror/g_mirror.c:774
#8  0xffffffff8118ea6e in g_mirror_idle (sc=3D0xfffff80038059c00, acw=3D<va=
lue
optimized out>)
    at
/usr/src/sys/modules/geom/geom_mirror/../../../geom/mirror/g_mirror.c:860
#9  0xffffffff8118f058 in g_mirror_access (pp=3D<value optimized out>, acr=
=3D<value
optimized out>, acw=3D<value optimized out>,
    ace=3D<value optimized out>) at
/usr/src/sys/modules/geom/geom_mirror/../../../geom/mirror/g_mirror.c:2914
#10 0xffffffff8064c16e in g_access (cp=3D0xfffff80038261700, dcr=3D-1, dcw=
=3D-1,
dce=3D-2) at /usr/src/sys/geom/geom_subr.c:919
#11 0xffffffff8064c16e in g_access (cp=3D0xfffff800382c5900, dcr=3D-1, dcw=
=3D-1,
dce=3D-1) at /usr/src/sys/geom/geom_subr.c:919
#12 0xffffffff81169295 in g_journal_destroy (sc=3D0xfffff8003823d000)
    at
/usr/src/sys/modules/geom/geom_journal/../../../geom/journal/g_journal.c:24=
31
#13 0xffffffff81167741 in g_journal_shutdown (arg=3D<value optimized out>,
howto=3D0)
    at
/usr/src/sys/modules/geom/geom_journal/../../../geom/journal/g_journal.c:26=
58
#14 0xffffffff806c11bc in kern_reboot (howto=3D16384) at
/usr/src/sys/kern/kern_shutdown.c:483
#15 0xffffffff806c0acf in sys_reboot (td=3D<value optimized out>, uap=3D<va=
lue
optimized out>) at /usr/src/sys/kern/kern_shutdown.c:229
#16 0xffffffff80970cd2 in amd64_syscall (td=3D0xfffff8000d2ff500, traced=3D=
0) at
subr_syscall.c:141
#17 0xffffffff80954dab in Xfast_syscall () at
/usr/src/sys/amd64/amd64/exception.S:398

(kgdb) f 12
#12 0xffffffff81169295 in g_journal_destroy (sc=3D0xfffff8003823d000)
    at
/usr/src/sys/modules/geom/geom_journal/../../../geom/journal/g_journal.c:24=
31
2431                            g_access(cp, -1, -1, -1);

(kgdb) list
2426                        sc->sc_current_count);
2427            }
2428
2429            LIST_FOREACH(cp, &gp->consumer, consumer) {
2430                    if (cp->acr + cp->acw + cp->ace > 0)
2431                            g_access(cp, -1, -1, -1);
2432                    /*
2433                     * We keep all consumers open for writting, so if I=
'll
detach
2434                     * and destroy consumer here, I'll get providers for
taste, so
2435                     * journal will be started again.

(kgdb) p *sc
$4 =3D {sc_id =3D 1068288894, sc_type =3D 3 '\003', sc_orig_type =3D 1 '\00=
1', sc_geom
=3D 0xfffff800380bce00, sc_flags =3D 144, sc_mtx =3D {
    lock_object =3D {lo_name =3D 0xffffffff8116f916 "gjournal", lo_flags =3D
16908288, lo_data =3D 0, lo_witness =3D 0x0}, mtx_lock =3D 6},
  sc_mediasize =3D 1073741312, sc_sectorsize =3D 512, sc_bio_flush =3D 3,
sc_journal_id =3D 1627568564, sc_journal_next_id =3D 163691452,
  sc_journal_copying =3D 0, sc_journal_offset =3D 59226624, sc_journal_prev=
ious_id
=3D 3706521218, sc_back_queue =3D {queue =3D {
      tqh_first =3D 0x0, tqh_last =3D 0xfffff8003823d068}, last_offset =3D =
805502976,
insert_point =3D 0x0}, sc_regular_queue =3D {queue =3D {
      tqh_first =3D 0x0, tqh_last =3D 0xfffff8003823d088}, last_offset =3D =
65536,
insert_point =3D 0x0}, sc_delayed_queue =3D {queue =3D {
      tqh_first =3D 0x0, tqh_last =3D 0xfffff8003823d0a8}, last_offset =3D =
0,
insert_point =3D 0x0}, sc_delayed_count =3D 0,
  sc_current_queue =3D 0x0, sc_current_count =3D 0, sc_flush_queue =3D 0x0,
sc_flush_count =3D 0, sc_flush_in_progress =3D 0,
  sc_copy_queue =3D 0x0, sc_copy_in_progress =3D 0, sc_dconsumer =3D
0xfffff800382c5900, sc_jconsumer =3D 0xfffff8003822f900,
  sc_inactive =3D {jj_queue =3D 0x0, jj_offset =3D 59142656}, sc_active =3D=
 {jj_queue =3D
0x0, jj_offset =3D 59226112}, sc_jstart =3D 0,
  sc_jend =3D 8589934080, sc_callout =3D {c_links =3D {le =3D {le_next =3D =
0x0, le_prev =3D
0xfffffe0000803f00}, sle =3D {sle_next =3D 0x0}, tqe =3D {
        tqe_next =3D 0x0, tqe_prev =3D 0xfffffe0000803f00}}, c_time =3D 338=
25373681,
c_precision =3D 1342177187,
    c_arg =3D 0xfffff8003823d000, c_func =3D 0xffffffff8116c190
<g_journal_timeout>, c_lock =3D 0x0, c_flags =3D 0, c_iflags =3D 16,
    c_cpu =3D 0}, sc_worker =3D 0x0, sc_rootmount =3D 0x0}

Thats the same "struct g_journal_softc" as seen above for the
g_journal_switcher.

Now the story is complete: "PID 1" calls g_journal_shutdown() and loops thr=
ough
the gjournaled partitions to destroy them. Last partition has sc_id =3D
1068288894 which is printed to console. Then in g_journal_destroy() we see
       mtx_unlock(&sc->sc_mtx);
       g_topology_lock();
       mtx_destroy(&sc->sc_mtx);      (That is mtx_lock =3D MTX_DESTROYED)
       ....
       g_topology_unlock();           (in g_mirror_access)
At this moment g_journal_switcher gets the topology_lock and starts running=
 to
the page fault described above.

--=20
You are receiving this mail because:
You are the assignee for the bug.=



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?bug-221804-8>