Date: Sat, 29 Aug 2020 05:34:31 -0600
From: Warner Losh <imp@bsdimp.com>
To: meloun.michal@gmail.com
Cc: Mateusz Guzik <mjguzik@gmail.com>, Warner Losh <imp@freebsd.org>, src-committers <src-committers@freebsd.org>, svn-src-all <svn-src-all@freebsd.org>, svn-src-head <svn-src-head@freebsd.org>
Subject: Re: svn commit: r364946 - head/sys/kern
Message-ID: <CANCZdfq6U-L%2BpJyMgYT_6fB=KWSudp3tdt%2B7K67gPvRRQ-6u1Q@mail.gmail.com>
In-Reply-To: <213fcb81-ceab-677f-98dc-e8cb33fef7d1@gmail.com>
References: <202008290430.07T4UCM4007928@repo.freebsd.org> <CAGudoHFAkrAykin6ngH=04254J4AmhHk2NmDyGfrUE=wJcxH2A@mail.gmail.com> <CANCZdfqXtKhKhh33ovFQ4_a3tiesRi8-6ZuMTp0yW%2BMzkxWLzA@mail.gmail.com> <f1a67850-e9e5-d785-6562-972aeb9f1206@gmail.com> <CANCZdfp9m7knXfguYh79fdALBiL3ktEH6e=NU4S2qdOv6ory%2Bg@mail.gmail.com> <213fcb81-ceab-677f-98dc-e8cb33fef7d1@gmail.com>

On Sat, Aug 29, 2020 at 5:25 AM Michal Meloun <meloun.michal@gmail.com> wrote:

>
>
> On 29.08.2020 13:02, Warner Losh wrote:
> > On Sat, Aug 29, 2020 at 4:38 AM Michal Meloun <meloun.michal@gmail.com> wrote:
> >
> >>
> >>
> >> On 29.08.2020 12:04, Warner Losh wrote:
> >>> On Sat, Aug 29, 2020 at 1:09 AM Mateusz Guzik <mjguzik@gmail.com> wrote:
> >>>
> >>>> This crashes on boot for me:
> >>>>
> >>>
> >>> I wasn't able to get it to crash on boot for me, but I was able to recreate
> >>> it.
> >> It crashed on ofw based systems where some enumerated devices do not have a
> >> suitable driver, see:
> >> ---------------------------------------
> >> sysctl_devices: nameunit: root0, descs: System root bus, driver: root
> >> sysctl_devices: nameunit: nexus0, descs: (null), driver: nexus
> >> sysctl_devices: nameunit: ofwbus0, descs: Open Firmware Device Tree, driver: ofwbus
> >> sysctl_devices: nameunit: pcib0, descs: Nvidia Integrated PCI/PCI-E Controller, driver: pcib
> >> sysctl_devices: nameunit: simplebus0, descs: Flattened device tree simple bus, driver: simplebus
> >> sysctl_devices: nameunit: gic0, descs: ARM Generic Interrupt Controller, driver: gic
> >> sysctl_devices: nameunit: (null), descs: (null), driver:
> >> sysctl_devices: nameunit: lic0, descs: (null), driver: lic
> >> sysctl_devices: nameunit: (null), descs: (null), driver:
> >> sysctl_devices: nameunit: car0, descs: Tegra Clock Driver, driver: car
> >> ....
> >> ----------------------------------------------------------------------
> >>> Fixed in r364949.
> >> Confirmed.
> >>> I think it didn't crash on boot for me because
> >>> kldxref failed due to the segment thing so devmatch didn't run which would
> >>> have triggered this bug. devinfo did trigger a very similar crash, and
> >>> r364949 fixes that crash. Even a new kldxref failed due to the too many
> >>> segments thing, so I can't confirm that's what you hit, but I'm pretty sure
> >>> it is...
> >>>
> >> But there is another issue in device_sysctl_handler() (not analyzed yet):
> >> root@tegra210:~ # sysctl dev.cpu.
> >> dev.cpu.3.temperature: 50.5C
> >> dev.cpu.3panic: sbuf_clear makes no sense on sbuf 0xffff00006f21a528 with drain
> >> cpuid = 2
> >> time = 1598696937
> >> KDB: stack backtrace:
> >> db_trace_self() at db_fetch_ksymtab+0x164
> >>     pc = 0xffff0000006787f4 lr = 0xffff000000153400
> >>     sp = 0xffff00006f21a1b0 fp = 0xffff00006f21a3b0
> >>
> >> db_fetch_ksymtab() at vpanic+0x198
> >>     pc = 0xffff000000153400 lr = 0xffff00000036b274
> >>     sp = 0xffff00006f21a3c0 fp = 0xffff00006f21a420
> >>
> >> vpanic() at panic+0x44
> >>     pc = 0xffff00000036b274 lr = 0xffff00000036b018
> >>     sp = 0xffff00006f21a430 fp = 0xffff00006f21a4e0
> >>
> >> panic() at sbuf_clear+0xa0
> >>     pc = 0xffff00000036b018 lr = 0xffff0000003c17c8
> >>     sp = 0xffff00006f21a4f0 fp = 0xffff00006f21a4f0
> >>
> >> sbuf_clear() at sbuf_cpy+0x58
> >>     pc = 0xffff0000003c17c8 lr = 0xffff0000003c1ff0
> >>     sp = 0xffff00006f21a500 fp = 0xffff00006f21a500
> >>
> >> sbuf_cpy() at _gone_in_dev+0x560
> >>     pc = 0xffff0000003c1ff0 lr = 0xffff0000003a9078
> >>     sp = 0xffff00006f21a510 fp = 0xffff00006f21a570
> >>
> >> _gone_in_dev() at sbuf_new_for_sysctl+0x170
> >>     pc = 0xffff0000003a9078 lr = 0xffff00000037c1a8
> >>     sp = 0xffff00006f21a580 fp = 0xffff00006f21a5a0
> >>
> >> sbuf_new_for_sysctl() at kernel_sysctl+0x36c
> >>     pc = 0xffff00000037c1a8 lr = 0xffff00000037b63c
> >>     sp = 0xffff00006f21a5b0 fp = 0xffff00006f21a630
> >>
> >
> > This traceback is all kinds of crazy. sbuf_new_for_sysctl doesn't call
> > _gone_in_dev(), which doesn't do sbuf stuff at all. And neither does it
> > call sbuf_cpy(). Though I get a crash that looks like:
> > Tracing pid 66442 tid 101464 td 0xfffffe02f47b7c00
> > kdb_enter() at kdb_enter+0x37/frame 0xfffffe02f4ae3740
> > vpanic() at vpanic+0x19e/frame 0xfffffe02f4ae3790
> > panic() at panic+0x43/frame 0xfffffe02f4ae37f0
> > sbuf_clear() at sbuf_clear+0xac/frame 0xfffffe02f4ae3800
> > sbuf_cpy() at sbuf_cpy+0x5a/frame 0xfffffe02f4ae3820
> > device_sysctl_handler() at device_sysctl_handler+0x133/frame 0xfffffe02f4ae38a0
> > sysctl_root_handler_locked() at sysctl_root_handler_locked+0x9c/frame 0xfffffe02f4ae38f0
> > sysctl_root() at sysctl_root+0x20a/frame 0xfffffe02f4ae3970
> > userland_sysctl() at userland_sysctl+0x17d/frame 0xfffffe02f4ae3a20
> > sys___sysctl() at sys___sysctl+0x5f/frame 0xfffffe02f4ae3ad0
> > amd64_syscall() at amd64_syscall+0x140/frame 0xfffffe02f4ae3bf0
> > fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe02f4ae3bf0
> > --- syscall (202, FreeBSD ELF64, sys___sysctl), rip = 0x80042d50a, rsp = 0x7fffffffd458, rbp = 0x7fffffffd490 ---
> >
> > on a sysctl -a which I think makes more sense... I'll see if I can track
> > it down... I think it's because sbuf_cpy does an unconditional clear, which
> > triggers this assert, which is likely bogus for this case. sbuf_cat doesn't
> > seem to have this issue... I'll confirm and commit.
> >
> > Warner
>
> Yeah, sorry. Local symbols are not available for netbooted kernel :(.
> And I can confirm that the problem is caused by using sbuf_cpy() on an sbuf
> allocated by sbuf_new_for_sysctl() (thus with a drain handler) in
> device_sysctl_handler(). But purely replacing sbuf_cpy() with sbuf_cat()
> gives me another panic:
> panic: Assertion (sb->s_flags & SBUF_INCLUDENUL) == 0 failed at
> /usr2/Meloun/git/pmap/sys/kern/subr_bus.c:4936
> (still as a response to sysctl dev.cpu)
>

OK. My bouncer system here has something wrong with /, but I changed the
sbuf_cpy to sbuf_cat. Can you confirm that it works for you?
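
For readers following the sbuf_cpy versus sbuf_cat point, the difference can be modeled in userland. The block below is a toy sketch under assumptions, not the kernel's sbuf code: toy_sbuf, toy_init, toy_flush, toy_cat, toy_cpy, sink and sink_drain are all made-up names, and where the kernel asserts in sbuf_clear this model simply returns -1. It shows the idea only: once a buffer has a drain, earlier bytes may already have been handed off, so a "replace the contents" operation has nothing it can safely clear, while an append only ever adds.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical drain callback: consumes len bytes, returns 0 on success. */
typedef int toy_drain_fn(void *arg, const char *data, size_t len);

/* Toy stand-in for struct sbuf; NOT the real kernel layout. */
struct toy_sbuf {
    char          buf[8];    /* deliberately tiny staging buffer */
    size_t        len;       /* bytes staged and not yet drained */
    toy_drain_fn *drain;     /* NULL for a plain in-memory buffer */
    void         *drain_arg;
};

void
toy_init(struct toy_sbuf *s, toy_drain_fn *drain, void *arg)
{
    s->len = 0;
    s->drain = drain;
    s->drain_arg = arg;
}

/* Hand staged bytes to the drain; after this they cannot be taken back. */
int
toy_flush(struct toy_sbuf *s)
{
    int error;

    if (s->drain == NULL || s->len == 0)
        return (0);
    error = s->drain(s->drain_arg, s->buf, s->len);
    s->len = 0;
    return (error);
}

/* Append: fill the staging buffer, flushing to the drain whenever it is full. */
int
toy_cat(struct toy_sbuf *s, const char *str)
{
    for (; *str != '\0'; str++) {
        if (s->len == sizeof(s->buf)) {
            if (s->drain == NULL)
                return (-1);    /* fixed buffer is full */
            if (toy_flush(s) != 0)
                return (-1);
        }
        s->buf[s->len++] = *str;
    }
    return (0);
}

/* Replace: needs a clear first, which a drained buffer cannot honor. */
int
toy_cpy(struct toy_sbuf *s, const char *str)
{
    if (s->drain != NULL)
        return (-1);    /* the kernel asserts here instead of returning */
    s->len = 0;
    return (toy_cat(s, str));
}

/* Hypothetical sink that collects whatever the drain is given. */
struct sink {
    char   out[64];
    size_t len;
};

int
sink_drain(void *arg, const char *data, size_t len)
{
    struct sink *k = arg;

    if (k->len + len >= sizeof(k->out))
        return (-1);
    memcpy(k->out + k->len, data, len);
    k->len += len;
    k->out[k->len] = '\0';
    return (0);
}
```

With the 8-byte staging buffer, toy_cat(&s, "Tegra Clock Driver") followed by toy_flush(&s) delivers the whole string to the sink in three pieces, while toy_cpy() on the same drained buffer is refused.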

Warner

> >
> >
> >> kernel_sysctl() at userland_sysctl+0xf4
> >>     pc = 0xffff00000037b63c lr = 0xffff00000037bc5c
> >>     sp = 0xffff00006f21a640 fp = 0xffff00006f21a6d0
> >>
> >> userland_sysctl() at sys___sysctl+0x68
> >>     pc = 0xffff00000037bc5c lr = 0xffff00000037bb28
> >>     sp = 0xffff00006f21a6e0 fp = 0xffff00006f21a790
> >>
> >> sys___sysctl() at do_el0_sync+0x4e0
> >>     pc = 0xffff00000037bb28 lr = 0xffff000000697918
> >>     sp = 0xffff00006f21a7a0 fp = 0xffff00006f21a830
> >>
> >> do_el0_sync() at handle_el0_sync+0x90
> >>     pc = 0xffff000000697918 lr = 0xffff00000067aa24
> >>     sp = 0xffff00006f21a840 fp = 0xffff00006f21a980
> >>
> >> handle_el0_sync() at 0x4047764c
> >>     pc = 0xffff00000067aa24 lr = 0x000000004047764c
> >>     sp = 0xffff00006f21a990 fp = 0x0000ffffffffc250
> >>
> >> KDB: enter: panic
> >> [ thread pid 1263 tid 100092 ]
> >> Stopped at 0x40477fb4: undefined 54000042
> >>
> >>> Warner
> >>>
> >>
> >>>
> >>>
> >>>> Fatal trap 12: page fault while in kernel mode
> >>>> cpuid = 0; apic id = 00
> >>>> fault virtual address = 0x0
> >>>> fault code = supervisor read data, page not present
> >>>> instruction pointer = 0x20:0xffffffff805b0a7f
> >>>> stack pointer = 0x28:0xfffffe002366a7f0
> >>>> frame pointer = 0x28:0xfffffe002366a7f0
> >>>> code segment = base 0x0, limit 0xfffff, type 0x1b
> >>>>              = DPL 0, pres 1, long 1, def32 0, gran 1
> >>>> processor eflags = interrupt enabled, resume, IOPL = 0
> >>>> current process = 89 (devmatch)
> >>>> trap number = 12
> >>>> panic: page fault
> >>>> cpuid = 0
> >>>> time = 1598692135
> >>>> KDB: stack backtrace:
> >>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe002366a4a0
> >>>> vpanic() at vpanic+0x182/frame 0xfffffe002366a4f0
> >>>> panic() at panic+0x43/frame 0xfffffe002366a550
> >>>> trap_fatal() at trap_fatal+0x387/frame 0xfffffe002366a5b0
> >>>> trap_pfault() at trap_pfault+0x4f/frame 0xfffffe002366a610
> >>>> trap() at trap+0x27d/frame 0xfffffe002366a720
> >>>> calltrap() at calltrap+0x8/frame 0xfffffe002366a720
> >>>> --- trap 0xc, rip = 0xffffffff805b0a7f, rsp = 0xfffffe002366a7f0, rbp = 0xfffffe002366a7f0 ---
> >>>> strlen() at strlen+0x1f/frame 0xfffffe002366a7f0
> >>>> sbuf_cat() at sbuf_cat+0x15/frame 0xfffffe002366a810
> >>>> sysctl_devices() at sysctl_devices+0x104/frame 0xfffffe002366a8a0
> >>>> sysctl_root_handler_locked() at sysctl_root_handler_locked+0x91/frame 0xfffffe002366a8f0
> >>>> sysctl_root() at sysctl_root+0x249/frame 0xfffffe002366a970
> >>>> userland_sysctl() at userland_sysctl+0x170/frame 0xfffffe002366aa20
> >>>> sys___sysctl() at sys___sysctl+0x5f/frame 0xfffffe002366aad0
> >>>> amd64_syscall() at amd64_syscall+0x10c/frame 0xfffffe002366abf0
> >>>> fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe002366abf0
> >>>> --- syscall (202, FreeBSD ELF64, sys___sysctl), rip = 0x80041c0ea, rsp = 0x7fffffffda78, rbp = 0x7fffffffdab0 ---
> >>>> KDB: enter: panic
> >>>> [ thread pid 89 tid 100067 ]
> >>>> Stopped at kdb_enter+0x37: movq $0,0x7e2616(%rip)
> >>>>
> >>>>
> >>>> On 8/29/20, Warner Losh <imp@freebsd.org> wrote:
> >>>>> Author: imp
> >>>>> Date: Sat Aug 29 04:30:12 2020
> >>>>> New Revision: 364946
> >>>>> URL: https://svnweb.freebsd.org/changeset/base/364946
> >>>>>
> >>>>> Log:
> >>>>>   Move to using sbuf for some sysctl in newbus
> >>>>>
> >>>>>   Convert two different sysctl to using sbuf. First, for all the default
> >>>>>   sysctls we implement for each device driver that's attached. This is a
> >>>>>   pure sbuf conversion.
> >>>>>
> >>>>>   Second, convert sysctl_devices to fill its buffer with sbuf rather
> >>>>>   than a hand-rolled crappy thing I wrote years ago.
> >>>>>
> >>>>>   Reviewed by: cem, markj
> >>>>>   Differential Revision: https://reviews.freebsd.org/D26206
> >>>>>
> >>>>> Modified:
> >>>>>   head/sys/kern/subr_bus.c
> >>>>>
> >>>>> Modified: head/sys/kern/subr_bus.c
> >>>>> ==============================================================================
> >>>>> --- head/sys/kern/subr_bus.c Sat Aug 29 04:30:06 2020 (r364945)
> >>>>> +++ head/sys/kern/subr_bus.c Sat Aug 29 04:30:12 2020 (r364946)
> >>>>> @@ -260,36 +260,33 @@ enum {
> >>>>>  static int
> >>>>>  device_sysctl_handler(SYSCTL_HANDLER_ARGS)
> >>>>>  {
> >>>>> +    struct sbuf sb;
> >>>>>      device_t dev = (device_t)arg1;
> >>>>> -    const char *value;
> >>>>> -    char *buf;
> >>>>>      int error;
> >>>>>
> >>>>> -    buf = NULL;
> >>>>> +    sbuf_new_for_sysctl(&sb, NULL, 1024, req);
> >>>>>      switch (arg2) {
> >>>>>      case DEVICE_SYSCTL_DESC:
> >>>>> -        value = dev->desc ? dev->desc : "";
> >>>>> +        sbuf_cpy(&sb, dev->desc ? dev->desc : "");
> >>>>>          break;
> >>>>>      case DEVICE_SYSCTL_DRIVER:
> >>>>> -        value = dev->driver ? dev->driver->name : "";
> >>>>> +        sbuf_cpy(&sb, dev->driver ? dev->driver->name : "");
> >>>>>          break;
> >>>>>      case DEVICE_SYSCTL_LOCATION:
> >>>>> -        value = buf = malloc(1024, M_BUS, M_WAITOK | M_ZERO);
> >>>>> -        bus_child_location_str(dev, buf, 1024);
> >>>>> +        bus_child_location_sb(dev, &sb);
> >>>>>          break;
> >>>>>      case DEVICE_SYSCTL_PNPINFO:
> >>>>> -        value = buf = malloc(1024, M_BUS, M_WAITOK | M_ZERO);
> >>>>> -        bus_child_pnpinfo_str(dev, buf, 1024);
> >>>>> +        bus_child_pnpinfo_sb(dev, &sb);
> >>>>>          break;
> >>>>>      case DEVICE_SYSCTL_PARENT:
> >>>>> -        value = dev->parent ? dev->parent->nameunit : "";
> >>>>> +        sbuf_cpy(&sb, dev->parent ? dev->parent->nameunit : "");
> >>>>>          break;
> >>>>>      default:
> >>>>> +        sbuf_delete(&sb);
> >>>>>          return (EINVAL);
> >>>>>      }
> >>>>> -    error = SYSCTL_OUT_STR(req, value);
> >>>>> -    if (buf != NULL)
> >>>>> -        free(buf, M_BUS);
> >>>>> +    error = sbuf_finish(&sb);
> >>>>> +    sbuf_delete(&sb);
> >>>>>      return (error);
> >>>>>  }
> >>>>>
> >>>>> @@ -5464,13 +5461,13 @@ SYSCTL_PROC(_hw_bus, OID_AUTO, info, CTLTYPE_STRUCT |
> >>>>>  static int
> >>>>>  sysctl_devices(SYSCTL_HANDLER_ARGS)
> >>>>>  {
> >>>>> +    struct sbuf sb;
> >>>>>      int *name = (int *)arg1;
> >>>>>      u_int namelen = arg2;
> >>>>>      int index;
> >>>>>      device_t dev;
> >>>>>      struct u_device *udev;
> >>>>>      int error;
> >>>>> -    char *walker, *ep;
> >>>>>
> >>>>>      if (namelen != 2)
> >>>>>          return (EINVAL);
> >>>>> @@ -5501,34 +5498,21 @@ sysctl_devices(SYSCTL_HANDLER_ARGS)
> >>>>>      udev->dv_devflags = dev->devflags;
> >>>>>      udev->dv_flags = dev->flags;
> >>>>>      udev->dv_state = dev->state;
> >>>>> -    walker = udev->dv_fields;
> >>>>> -    ep = walker + sizeof(udev->dv_fields);
> >>>>> -#define CP(src) \
> >>>>> -    if ((src) == NULL) \
> >>>>> -        *walker++ = '\0'; \
> >>>>> -    else { \
> >>>>> -        strlcpy(walker, (src), ep - walker); \
> >>>>> -        walker += strlen(walker) + 1; \
> >>>>> -    } \
> >>>>> -    if (walker >= ep) \
> >>>>> -        break;
> >>>>> -
> >>>>> -    do {
> >>>>> -        CP(dev->nameunit);
> >>>>> -        CP(dev->desc);
> >>>>> -        CP(dev->driver != NULL ? dev->driver->name : NULL);
> >>>>> -        bus_child_pnpinfo_str(dev, walker, ep - walker);
> >>>>> -        walker += strlen(walker) + 1;
> >>>>> -        if (walker >= ep)
> >>>>> -            break;
> >>>>> -        bus_child_location_str(dev, walker, ep - walker);
> >>>>> -        walker += strlen(walker) + 1;
> >>>>> -        if (walker >= ep)
> >>>>> -            break;
> >>>>> -        *walker++ = '\0';
> >>>>> -    } while (0);
> >>>>> -#undef CP
> >>>>> -    error = SYSCTL_OUT(req, udev, sizeof(*udev));
> >>>>> +    sbuf_new(&sb, udev->dv_fields, sizeof(udev->dv_fields), SBUF_FIXEDLEN);
> >>>>> +    sbuf_cat(&sb, dev->nameunit);
> >>>>> +    sbuf_putc(&sb, '\0');
> >>>>> +    sbuf_cat(&sb, dev->desc);
> >>>>> +    sbuf_putc(&sb, '\0');
> >>>>> +    sbuf_cat(&sb, dev->driver != NULL ? dev->driver->name : '\0');
> >>>>> +    sbuf_putc(&sb, '\0');
> >>>>> +    bus_child_pnpinfo_sb(dev, &sb);
> >>>>> +    sbuf_putc(&sb, '\0');
> >>>>> +    bus_child_location_sb(dev, &sb);
> >>>>> +    sbuf_putc(&sb, '\0');
> >>>>> +    error = sbuf_finish(&sb);
> >>>>> +    if (error == 0)
> >>>>> +        error = SYSCTL_OUT(req, udev, sizeof(*udev));
> >>>>> +    sbuf_delete(&sb);
> >>>>>      free(udev, M_BUS);
> >>>>>      return (error);
> >>>>>  }
> >>>>>
> >>>>
> >>>>
> >>>> --
> >>>> Mateusz Guzik <mjguzik gmail.com>
> >>>>
> >>>
> >>
> >
>
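
The strlen() fault at address 0x0 in the first backtrace is consistent with sbuf_cat() being handed a NULL string: the device listing shows entries whose description is (null), and in the quoted diff the ternary ending in '\0' yields a null pointer constant rather than an empty string. Below is a minimal userland sketch of one way to guard against that while packing NUL-separated fields into a fixed buffer. The names str_or_empty, pack_field and PACK_FAIL are hypothetical, and this is not necessarily how r364949 addressed it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical failure marker; it also makes every later call fail. */
#define PACK_FAIL ((size_t)-1)

/* Map a NULL string to "" so strlen() is never handed NULL. */
const char *
str_or_empty(const char *s)
{
    return (s != NULL ? s : "");
}

/*
 * Append one field plus its terminating NUL at offset off, loosely
 * mimicking a NUL-separated field area of fixed size.
 * Returns the new offset, or PACK_FAIL if the field does not fit.
 */
size_t
pack_field(char *buf, size_t size, size_t off, const char *field)
{
    size_t n;

    field = str_or_empty(field);
    n = strlen(field) + 1;    /* include the separator NUL */
    if (off >= size || n > size - off)
        return (PACK_FAIL);
    memcpy(buf + off, field, n);
    return (off + n);
}
```

A device with no description would be packed as pack_field(buf, sizeof(buf), off, NULL), which stores a single NUL byte instead of dereferencing the NULL pointer.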