Date:      Thu, 05 Jun 2008 15:57:08 +0300
From:      Oleksandr Tymoshenko <gonzo@freebsd.org>
To:        Alexander Motin <mav@freebsd.org>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: Crashes in devfs. Possibly on interface creation/destruction.
Message-ID:  <4847E2A4.2020509@freebsd.org>
In-Reply-To: <48470853.6080807@FreeBSD.org>
References:  <48470853.6080807@FreeBSD.org>

Alexander Motin wrote:
> Hi.
> 
> After a recent upgrade from 6.3-RC1/mpd-5.0rc1 to 6.3-STABLE/mpd-5.1, 
> some of my PPPoE servers started to crash about once a week. 
> Usually they just hang without rebooting or dumping core. Consoles 
> are inaccessible. All I have got from them was:
> 
> kernel: Fatal trap 12: page fault while in kernel mode
> kernel: cpuid = 1; apic id = 01
> kernel: fault virtual address = 0x58
> kernel: fault code            = supervisor read, page not present
> kernel: instruction pointer   = 0x20:0xc04800be
> kernel: stack pointer         = 0x28:0xd690883c
> kernel: frame pointer         = 0x28:0xd6908854
> kernel: code segment          = base 0x0, limit 0xfffff, type 0x1b
> kernel:                       = DPL 0, pres 1, def32 1, gran 1
> kernel: processor eflags      = interrupt enabled, resume, IOPL = 0
> kernel: current process       = 1835 (mpd5)
> kernel: trap number           = 12
> 
> "fault virtual address" and "instruction pointer" are always the same.
> 
> Address 0xc04800be looks like part of devfs code:
>  > addr2line -f -e kernel.debug 0xc04800be
> devfs_populate_loop
> /usr/src/sys/fs/devfs/devfs_devs.c:443
> 
> devfs_devs.c:
>                 de = devfs_newdirent(s, q - s);
>                 if (cdp->cdp_c.si_flags & SI_ALIAS) {
>                         de->de_uid = 0;
>                         de->de_gid = 0;
>                         de->de_mode = 0755;
>                         de->de_dirent->d_type = DT_LNK;
>                         pdev = cdp->cdp_c.si_parent;
> ->> line 443 ->>        j = strlen(pdev->si_name) + 1;
>                         de->de_symlink = malloc(j, M_DEVFS, M_WAITOK);
>                         bcopy(pdev->si_name, de->de_symlink, j);
> 
> 0x58 is precisely the offset of the si_name field inside struct cdev. 
> So it looks like pdev = cdp->cdp_c.si_parent is NULL here for some reason.
> 
> Since network interfaces have corresponding devfs entries, and given the 
> higher interface creation/destruction rate that the newest mpd5.1 can 
> reach due to optimizations, I think it may be some kind of race 
> somewhere in interface creation.
> 
> Can somebody give me a hint about where to look?
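
The 0x58 reasoning quoted above can be illustrated with a small sketch: reading a member through a NULL struct pointer touches the address equal to that member's offset. The struct below is hypothetical and is NOT the real struct cdev layout; the padding is chosen only so that the name field lands at 0x58, and all demo_* names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout, not the real struct cdev: the padding is sized
 * so that demo_name sits at offset 0x58, matching the fault address.
 */
struct demo_cdev {
	unsigned char demo_pad[0x58];
	char demo_name[64];
};

/*
 * Address that a load of a member at member_offset would touch when
 * the struct pointer holds the value base. With base == 0 (NULL) the
 * result is simply the member offset.
 */
static uintptr_t
demo_fault_address(uintptr_t base, size_t member_offset)
{
	return (base + member_offset);
}
```

With base set to 0, demo_fault_address() returns offsetof(struct demo_cdev, demo_name), that is 0x58, which is why a constant, small fault address usually points at a NULL struct pointer rather than at random memory corruption.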
     At a quick glance, the most likely place is the make_dev_alias call in net/if.c
line 457. And the most likely suspect is a race on the if_index variable. There are
even a couple of "XXX: should be locked" notes there :)
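
To make the suspected kind of race concrete, here is a minimal userland sketch, not the code from net/if.c. It assumes an index counter that is bumped with an unlocked read-then-write, and it replays one bad interleaving by hand so the result is deterministic. All demo_* names are hypothetical.

```c
#include <assert.h>

/* Hypothetical stand-in for a shared index counter such as if_index. */
static int demo_index;

/* The unlocked "++index" split into its two steps: load, then store. */
static int
demo_load(void)
{
	return (demo_index);
}

static int
demo_store(int seen)
{
	demo_index = seen + 1;
	return (demo_index);
}

/*
 * Two allocators interleaved badly: both load the counter before
 * either stores. Returns 1 if both ended up with the same index.
 */
static int
demo_racy_pair(int *a, int *b)
{
	int seen_a = demo_load();
	int seen_b = demo_load();

	*a = demo_store(seen_a);
	*b = demo_store(seen_b);
	return (*a == *b);
}

/*
 * The same two allocations serialized, as they would be if a lock
 * were held around each load/store pair. Returns 1 on a collision.
 */
static int
demo_serial_pair(int *a, int *b)
{
	*a = demo_store(demo_load());
	*b = demo_store(demo_load());
	return (*a == *b);
}
```

Starting from demo_index = 0, demo_racy_pair() hands index 1 to both callers, while demo_serial_pair() hands out 1 and 2. If two interface creations shared an index like this, per-index state set up by one could be overwritten or torn down by the other, which is one way a parent pointer might end up NULL; whether that is what actually happens here would need to be confirmed against the real code.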


-- 
gonzo


