Date:      Sat, 29 Aug 2020 05:02:45 -0600
From:      Warner Losh <imp@bsdimp.com>
To:        meloun.michal@gmail.com
Cc:        Mateusz Guzik <mjguzik@gmail.com>, Warner Losh <imp@freebsd.org>,  src-committers <src-committers@freebsd.org>, svn-src-all <svn-src-all@freebsd.org>,  svn-src-head <svn-src-head@freebsd.org>
Subject:   Re: svn commit: r364946 - head/sys/kern
Message-ID:  <CANCZdfp9m7knXfguYh79fdALBiL3ktEH6e=NU4S2qdOv6ory%2Bg@mail.gmail.com>
In-Reply-To: <f1a67850-e9e5-d785-6562-972aeb9f1206@gmail.com>
References:  <202008290430.07T4UCM4007928@repo.freebsd.org> <CAGudoHFAkrAykin6ngH=04254J4AmhHk2NmDyGfrUE=wJcxH2A@mail.gmail.com> <CANCZdfqXtKhKhh33ovFQ4_a3tiesRi8-6ZuMTp0yW%2BMzkxWLzA@mail.gmail.com> <f1a67850-e9e5-d785-6562-972aeb9f1206@gmail.com>

On Sat, Aug 29, 2020 at 4:38 AM Michal Meloun <meloun.michal@gmail.com>
wrote:

>
>
> On 29.08.2020 12:04, Warner Losh wrote:
> > On Sat, Aug 29, 2020 at 1:09 AM Mateusz Guzik <mjguzik@gmail.com> wrote:
> >
> >> This crashes on boot for me:
> >>
> >
> > I wasn't able to get it to crash on boot for me, but I was able to recreate
> > it.
> It crashed on OFW-based systems where some enumerated devices do not have a
> suitable driver, see:
> ---------------------------------------
> sysctl_devices: nameunit: root0, descs: System root bus, driver: root
> sysctl_devices: nameunit: nexus0, descs: (null), driver: nexus
> sysctl_devices: nameunit: ofwbus0, descs: Open Firmware Device Tree, driver: ofwbus
> sysctl_devices: nameunit: pcib0, descs: Nvidia Integrated PCI/PCI-E Controller, driver: pcib
> sysctl_devices: nameunit: simplebus0, descs: Flattened device tree simple bus, driver: simplebus
> sysctl_devices: nameunit: gic0, descs: ARM Generic Interrupt Controller, driver: gic
> sysctl_devices: nameunit: (null), descs: (null), driver:
> sysctl_devices: nameunit: lic0, descs: (null), driver: lic
> sysctl_devices: nameunit: (null), descs: (null), driver:
> sysctl_devices: nameunit: car0, descs: Tegra Clock Driver, driver: car
> ....
> ----------------------------------------------------------------------
> > Fixed in r364949.
> Confirmed.
> > I think it didn't crash on boot for me because
> > kldxref failed due to the segment thing so devmatch didn't run which would
> > have triggered this bug. devinfo did trigger a very similar crash, and
> > r364949 fixes that crash. Even a new kldxref failed due to the too many
> > segments thing, so I can't confirm that's what you hit, but I'm pretty sure
> > it is...
> >
> But there is another issue in device_sysctl_handler() (not analyzed yet):
> root@tegra210:~ # sysctl dev.cpu.
> dev.cpu.3.temperature: 50.5C
> dev.cpu.3panic: sbuf_clear makes no sense on sbuf 0xffff00006f21a528
> with drain
> cpuid = 2
> time = 1598696937
> KDB: stack backtrace:
> db_trace_self() at db_fetch_ksymtab+0x164
>          pc = 0xffff0000006787f4  lr = 0xffff000000153400
>          sp = 0xffff00006f21a1b0  fp = 0xffff00006f21a3b0
>
> db_fetch_ksymtab() at vpanic+0x198
>          pc = 0xffff000000153400  lr = 0xffff00000036b274
>          sp = 0xffff00006f21a3c0  fp = 0xffff00006f21a420
>
> vpanic() at panic+0x44
>          pc = 0xffff00000036b274  lr = 0xffff00000036b018
>          sp = 0xffff00006f21a430  fp = 0xffff00006f21a4e0
>
> panic() at sbuf_clear+0xa0
>          pc = 0xffff00000036b018  lr = 0xffff0000003c17c8
>          sp = 0xffff00006f21a4f0  fp = 0xffff00006f21a4f0
>
> sbuf_clear() at sbuf_cpy+0x58
>          pc = 0xffff0000003c17c8  lr = 0xffff0000003c1ff0
>          sp = 0xffff00006f21a500  fp = 0xffff00006f21a500
>
> sbuf_cpy() at _gone_in_dev+0x560
>          pc = 0xffff0000003c1ff0  lr = 0xffff0000003a9078
>          sp = 0xffff00006f21a510  fp = 0xffff00006f21a570
>
> _gone_in_dev() at sbuf_new_for_sysctl+0x170
>          pc = 0xffff0000003a9078  lr = 0xffff00000037c1a8
>          sp = 0xffff00006f21a580  fp = 0xffff00006f21a5a0
>
> sbuf_new_for_sysctl() at kernel_sysctl+0x36c
>          pc = 0xffff00000037c1a8  lr = 0xffff00000037b63c
>          sp = 0xffff00006f21a5b0  fp = 0xffff00006f21a630
>

This traceback is all kinds of crazy. sbuf_new_for_sysctl() doesn't call
_gone_in_dev(), and _gone_in_dev() doesn't touch sbufs at all, let alone
call sbuf_cpy(). I do get a crash, though, that looks like:
Tracing pid 66442 tid 101464 td 0xfffffe02f47b7c00
kdb_enter() at kdb_enter+0x37/frame 0xfffffe02f4ae3740
vpanic() at vpanic+0x19e/frame 0xfffffe02f4ae3790
panic() at panic+0x43/frame 0xfffffe02f4ae37f0
sbuf_clear() at sbuf_clear+0xac/frame 0xfffffe02f4ae3800
sbuf_cpy() at sbuf_cpy+0x5a/frame 0xfffffe02f4ae3820
device_sysctl_handler() at device_sysctl_handler+0x133/frame
0xfffffe02f4ae38a0
sysctl_root_handler_locked() at sysctl_root_handler_locked+0x9c/frame
0xfffffe02f4ae38f0
sysctl_root() at sysctl_root+0x20a/frame 0xfffffe02f4ae3970
userland_sysctl() at userland_sysctl+0x17d/frame 0xfffffe02f4ae3a20
sys___sysctl() at sys___sysctl+0x5f/frame 0xfffffe02f4ae3ad0
amd64_syscall() at amd64_syscall+0x140/frame 0xfffffe02f4ae3bf0
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe02f4ae3bf0
--- syscall (202, FreeBSD ELF64, sys___sysctl), rip = 0x80042d50a, rsp =
0x7fffffffd458, rbp = 0x7fffffffd490 ---

on a sysctl -a, which I think makes more sense. I'll see if I can track
it down. I think it's because sbuf_cpy() does an unconditional clear, which
triggers this assert; the assert is likely bogus for this case. sbuf_cat()
doesn't seem to have this issue. I'll confirm and commit.
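
To make the suspected mechanism concrete, here is a minimal userland sketch.
It is a toy model only, not the real sbuf(9) code: struct toy_sbuf and the
toy_*() functions are hypothetical names made up for illustration, and where
the kernel would panic on the assert, the model simply returns -1. It assumes
(as guessed above, not yet confirmed) that the copy path always clears first
and that the clear refuses to run on a buffer with a drain attached.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for an sbuf; NOT the real struct sbuf. */
struct toy_sbuf {
	char	buf[64];
	size_t	len;
	bool	has_drain;	/* models an sbuf set up with a drain function */
};

/* Models the clear step: refuse (the kernel would panic) if a drain is set. */
static int
toy_clear(struct toy_sbuf *s)
{
	if (s->has_drain)
		return (-1);
	s->len = 0;
	s->buf[0] = '\0';
	return (0);
}

/* Models the append path: no clear, so the drain check is never reached. */
static int
toy_cat(struct toy_sbuf *s, const char *str)
{
	size_t n = strlen(str);

	if (s->len + n >= sizeof(s->buf))
		return (-1);
	memcpy(s->buf + s->len, str, n + 1);	/* copies the trailing NUL too */
	s->len += n;
	return (0);
}

/* Models the copy path: an unconditional clear, then an append. */
static int
toy_cpy(struct toy_sbuf *s, const char *str)
{
	if (toy_clear(s) != 0)
		return (-1);
	return (toy_cat(s, str));
}
```

In this model, toy_cpy() on a buffer with has_drain set fails before writing
anything, while toy_cat() on the same buffer succeeds. That matches the
backtrace shape above (copy, then clear, then panic), but whether the real
assert is actually bogus here is still to be confirmed.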

Warner


> kernel_sysctl() at userland_sysctl+0xf4
>          pc = 0xffff00000037b63c  lr = 0xffff00000037bc5c
>          sp = 0xffff00006f21a640  fp = 0xffff00006f21a6d0
>
> userland_sysctl() at sys___sysctl+0x68
>          pc = 0xffff00000037bc5c  lr = 0xffff00000037bb28
>          sp = 0xffff00006f21a6e0  fp = 0xffff00006f21a790
>
> sys___sysctl() at do_el0_sync+0x4e0
>          pc = 0xffff00000037bb28  lr = 0xffff000000697918
>          sp = 0xffff00006f21a7a0  fp = 0xffff00006f21a830
>
> do_el0_sync() at handle_el0_sync+0x90
>          pc = 0xffff000000697918  lr = 0xffff00000067aa24
>          sp = 0xffff00006f21a840  fp = 0xffff00006f21a980
>
> handle_el0_sync() at 0x4047764c
>          pc = 0xffff00000067aa24  lr = 0x000000004047764c
>          sp = 0xffff00006f21a990  fp = 0x0000ffffffffc250
>
> KDB: enter: panic
> [ thread pid 1263 tid 100092 ]
> Stopped at      0x40477fb4:     undefined       54000042
>
> > Warner
> >
>
> >
> >> Fatal trap 12: page fault while in kernel mode
> >> cpuid = 0; apic id = 00
> >> fault virtual address   = 0x0
> >> fault code              = supervisor read data, page not present
> >> instruction pointer     = 0x20:0xffffffff805b0a7f
> >> stack pointer           = 0x28:0xfffffe002366a7f0
> >> frame pointer           = 0x28:0xfffffe002366a7f0
> >> code segment            = base 0x0, limit 0xfffff, type 0x1b
> >>                         = DPL 0, pres 1, long 1, def32 0, gran 1
> >> processor eflags        = interrupt enabled, resume, IOPL = 0
> >> current process         = 89 (devmatch)
> >> trap number             = 12
> >> panic: page fault
> >> cpuid = 0
> >> time = 1598692135
> >> KDB: stack backtrace:
> >> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
> >> 0xfffffe002366a4a0
> >> vpanic() at vpanic+0x182/frame 0xfffffe002366a4f0
> >> panic() at panic+0x43/frame 0xfffffe002366a550
> >> trap_fatal() at trap_fatal+0x387/frame 0xfffffe002366a5b0
> >> trap_pfault() at trap_pfault+0x4f/frame 0xfffffe002366a610
> >> trap() at trap+0x27d/frame 0xfffffe002366a720
> >> calltrap() at calltrap+0x8/frame 0xfffffe002366a720
> >> --- trap 0xc, rip = 0xffffffff805b0a7f, rsp = 0xfffffe002366a7f0, rbp
> >> = 0xfffffe002366a7f0 ---
> >> strlen() at strlen+0x1f/frame 0xfffffe002366a7f0
> >> sbuf_cat() at sbuf_cat+0x15/frame 0xfffffe002366a810
> >> sysctl_devices() at sysctl_devices+0x104/frame 0xfffffe002366a8a0
> >> sysctl_root_handler_locked() at sysctl_root_handler_locked+0x91/frame
> >> 0xfffffe002366a8f0
> >> sysctl_root() at sysctl_root+0x249/frame 0xfffffe002366a970
> >> userland_sysctl() at userland_sysctl+0x170/frame 0xfffffe002366aa20
> >> sys___sysctl() at sys___sysctl+0x5f/frame 0xfffffe002366aad0
> >> amd64_syscall() at amd64_syscall+0x10c/frame 0xfffffe002366abf0
> >> fast_syscall_common() at fast_syscall_common+0xf8/frame
> >> 0xfffffe002366abf0
> >> --- syscall (202, FreeBSD ELF64, sys___sysctl), rip = 0x80041c0ea, rsp
> >> = 0x7fffffffda78, rbp = 0x7fffffffdab0 ---
> >> KDB: enter: panic
> >> [ thread pid 89 tid 100067 ]
> >> Stopped at      kdb_enter+0x37: movq    $0,0x7e2616(%rip)
> >>
> >>
> >> On 8/29/20, Warner Losh <imp@freebsd.org> wrote:
> >>> Author: imp
> >>> Date: Sat Aug 29 04:30:12 2020
> >>> New Revision: 364946
> >>> URL: https://svnweb.freebsd.org/changeset/base/364946
> >>>
> >>> Log:
> >>>   Move to using sbuf for some sysctl in newbus
> >>>
> >>>   Convert two different sysctl to using sbuf. First, for all the default
> >>>   sysctls we implement for each device driver that's attached. This is a
> >>>   pure sbuf conversion.
> >>>
> >>>   Second, convert sysctl_devices to fill its buffer with sbuf rather
> >>>   than a hand-rolled crappy thing I wrote years ago.
> >>>
> >>>   Reviewed by: cem, markj
> >>>   Differential Revision: https://reviews.freebsd.org/D26206
> >>>
> >>> Modified:
> >>>   head/sys/kern/subr_bus.c
> >>>
> >>> Modified: head/sys/kern/subr_bus.c
> >>> ==============================================================================
> >>> --- head/sys/kern/subr_bus.c  Sat Aug 29 04:30:06 2020        (r364945)
> >>> +++ head/sys/kern/subr_bus.c  Sat Aug 29 04:30:12 2020        (r364946)
> >>> @@ -260,36 +260,33 @@ enum {
> >>>  static int
> >>>  device_sysctl_handler(SYSCTL_HANDLER_ARGS)
> >>>  {
> >>> +     struct sbuf sb;
> >>>       device_t dev = (device_t)arg1;
> >>> -     const char *value;
> >>> -     char *buf;
> >>>       int error;
> >>>
> >>> -     buf = NULL;
> >>> +     sbuf_new_for_sysctl(&sb, NULL, 1024, req);
> >>>       switch (arg2) {
> >>>       case DEVICE_SYSCTL_DESC:
> >>> -             value = dev->desc ? dev->desc : "";
> >>> +             sbuf_cpy(&sb, dev->desc ? dev->desc : "");
> >>>               break;
> >>>       case DEVICE_SYSCTL_DRIVER:
> >>> -             value = dev->driver ? dev->driver->name : "";
> >>> +             sbuf_cpy(&sb, dev->driver ? dev->driver->name : "");
> >>>               break;
> >>>       case DEVICE_SYSCTL_LOCATION:
> >>> -             value = buf = malloc(1024, M_BUS, M_WAITOK | M_ZERO);
> >>> -             bus_child_location_str(dev, buf, 1024);
> >>> +             bus_child_location_sb(dev, &sb);
> >>>               break;
> >>>       case DEVICE_SYSCTL_PNPINFO:
> >>> -             value = buf = malloc(1024, M_BUS, M_WAITOK | M_ZERO);
> >>> -             bus_child_pnpinfo_str(dev, buf, 1024);
> >>> +             bus_child_pnpinfo_sb(dev, &sb);
> >>>               break;
> >>>       case DEVICE_SYSCTL_PARENT:
> >>> -             value = dev->parent ? dev->parent->nameunit : "";
> >>> +             sbuf_cpy(&sb, dev->parent ? dev->parent->nameunit : "");
> >>>               break;
> >>>       default:
> >>> +             sbuf_delete(&sb);
> >>>               return (EINVAL);
> >>>       }
> >>> -     error = SYSCTL_OUT_STR(req, value);
> >>> -     if (buf != NULL)
> >>> -             free(buf, M_BUS);
> >>> +     error = sbuf_finish(&sb);
> >>> +     sbuf_delete(&sb);
> >>>       return (error);
> >>>  }
> >>>
> >>> @@ -5464,13 +5461,13 @@ SYSCTL_PROC(_hw_bus, OID_AUTO, info, CTLTYPE_STRUCT |
> >>>  static int
> >>>  sysctl_devices(SYSCTL_HANDLER_ARGS)
> >>>  {
> >>> +     struct sbuf             sb;
> >>>       int                     *name = (int *)arg1;
> >>>       u_int                   namelen = arg2;
> >>>       int                     index;
> >>>       device_t                dev;
> >>>       struct u_device         *udev;
> >>>       int                     error;
> >>> -     char                    *walker, *ep;
> >>>
> >>>       if (namelen != 2)
> >>>               return (EINVAL);
> >>> @@ -5501,34 +5498,21 @@ sysctl_devices(SYSCTL_HANDLER_ARGS)
> >>>       udev->dv_devflags = dev->devflags;
> >>>       udev->dv_flags = dev->flags;
> >>>       udev->dv_state = dev->state;
> >>> -     walker = udev->dv_fields;
> >>> -     ep = walker + sizeof(udev->dv_fields);
> >>> -#define CP(src)                                              \
> >>> -     if ((src) == NULL)                              \
> >>> -             *walker++ = '\0';                       \
> >>> -     else {                                          \
> >>> -             strlcpy(walker, (src), ep - walker);    \
> >>> -             walker += strlen(walker) + 1;           \
> >>> -     }                                               \
> >>> -     if (walker >= ep)                               \
> >>> -             break;
> >>> -
> >>> -     do {
> >>> -             CP(dev->nameunit);
> >>> -             CP(dev->desc);
> >>> -             CP(dev->driver != NULL ? dev->driver->name : NULL);
> >>> -             bus_child_pnpinfo_str(dev, walker, ep - walker);
> >>> -             walker += strlen(walker) + 1;
> >>> -             if (walker >= ep)
> >>> -                     break;
> >>> -             bus_child_location_str(dev, walker, ep - walker);
> >>> -             walker += strlen(walker) + 1;
> >>> -             if (walker >= ep)
> >>> -                     break;
> >>> -             *walker++ = '\0';
> >>> -     } while (0);
> >>> -#undef CP
> >>> -     error = SYSCTL_OUT(req, udev, sizeof(*udev));
> >>> +     sbuf_new(&sb, udev->dv_fields, sizeof(udev->dv_fields), SBUF_FIXEDLEN);
> >>> +     sbuf_cat(&sb, dev->nameunit);
> >>> +     sbuf_putc(&sb, '\0');
> >>> +     sbuf_cat(&sb, dev->desc);
> >>> +     sbuf_putc(&sb, '\0');
> >>> +     sbuf_cat(&sb, dev->driver != NULL ? dev->driver->name : '\0');
> >>> +     sbuf_putc(&sb, '\0');
> >>> +     bus_child_pnpinfo_sb(dev, &sb);
> >>> +     sbuf_putc(&sb, '\0');
> >>> +     bus_child_location_sb(dev, &sb);
> >>> +     sbuf_putc(&sb, '\0');
> >>> +     error = sbuf_finish(&sb);
> >>> +     if (error == 0)
> >>> +             error = SYSCTL_OUT(req, udev, sizeof(*udev));
> >>> +     sbuf_delete(&sb);
> >>>       free(udev, M_BUS);
> >>>       return (error);
> >>>  }
> >>>
> >>
> >>
> >> --
> >> Mateusz Guzik <mjguzik gmail.com>
> >>
> >
>


