Date: Thu, 5 Jun 2008 14:04:47 +0300
From: Kostik Belousov <kostikbel@gmail.com>
To: Alexander Motin <mav@freebsd.org>
Cc: freebsd-stable@freebsd.org
Subject: Re: Crashes in devfs. Possibly on interface creation/destruction.
Message-ID: <20080605110447.GB94309@deviant.kiev.zoral.com.ua>
In-Reply-To: <48470853.6080807@FreeBSD.org>
References: <48470853.6080807@FreeBSD.org>

On Thu, Jun 05, 2008 at 12:25:39AM +0300, Alexander Motin wrote:
> Hi.
>
> After recently upgrading from 6.3-RC1/mpd-5.0rc1 to 6.3-STABLE/mpd-5.1,
> some of my PPPoE servers started to crash about once a week.
> Usually they just hang without rebooting or dumping core. Consoles
> are inaccessible. All I have got from them was:
>
> kernel: Fatal trap 12: page fau
> kernel: lt while in k
> kernel: ernel
> kernel: mode
> kernel:
> kernel: cpuid = 1; apic id = 01
> kernel: faut virtual address = 0x58
> kernel:
> kernel: fault code = supervisor read, page not present
> kernel:
> kernel: instruction pointer = 0x20:0xc04800be
> kernel:
> kernel: stack pointer = 0x28:0xd690883c
> kernel: frame pointer = 0x28:0
> kernel: xd6908854
> kernel: code segment =
> kernel: base 0x0, limit 0xfffff, type 0x1b
> kernel:
> kernel: = DPL 0, pres 1, def32 1, gra
> kernel: n 1
> kernel: processor eflags = interrupt
> kernel: enab
> kernel: led, r
> kernel: esume
> kernel: , IOPL
> kernel: = 0
> kernel:
> kernel: current process = 1835 (mpd5)
> kernel:
> kernel: trap number = 12
>
> "fault virtual address" and "instruction pointer" are always the same.
>
> Address 0xc04800be looks like part of the devfs code:
> > addr2line -f -e kernel.debug 0xc04800be
> devfs_populate_loop
> /usr/src/sys/fs/devfs/devfs_devs.c:443
>
> devfs_devs.c:
>         de = devfs_newdirent(s, q - s);
>         if (cdp->cdp_c.si_flags & SI_ALIAS) {
>                 de->de_uid = 0;
>                 de->de_gid = 0;
>                 de->de_mode = 0755;
>                 de->de_dirent->d_type = DT_LNK;
>                 pdev = cdp->cdp_c.si_parent;
> ->> line 443 ->> j = strlen(pdev->si_name) + 1;
>                 de->de_symlink = malloc(j, M_DEVFS, M_WAITOK);
>                 bcopy(pdev->si_name, de->de_symlink, j);
>
> 0x58 is precisely the offset of the si_name field inside struct cdev.
> So it looks like pdev = cdp->cdp_c.si_parent is NULL here for some reason.
>
> Since network interfaces have corresponding devfs entries, and given the
> higher interface creation/destruction rate that the newest mpd5.1 is able
> to reach due to optimizations, I think it may be some kind of race
> somewhere in interface creation.
>
> Can somebody give me a hint about where to look?

Try the following patch. It is against CURRENT; there might be further
races at device destruction, but maybe not.

Also, please note that devfs in RELENG_6 and in RELENG_7/CURRENT have
diverged enough to make an MFC of most bugfixes to RELENG_6 nearly
impossible.

diff --git a/sys/kern/kern_conf.c b/sys/kern/kern_conf.c
index e9d0f7b..af9a47d 100644
--- a/sys/kern/kern_conf.c
+++ b/sys/kern/kern_conf.c
@@ -825,9 +825,9 @@ make_dev_alias(struct cdev *pdev, const char *fmt, ...)
 	va_end(ap);
 
 	devfs_create(dev);
+	dev_dependsl(pdev, dev);
 	clean_unrhdrl(devfs_inos);
 	dev_unlock();
-	dev_depends(pdev, dev);
 
 	notify_create(dev);
 
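
The ordering change in the patch can be illustrated with a small userland sketch. This is not kernel code: struct node, devmtx, depends_locked(), make_alias() and populate() are hypothetical names that only mimic the roles of struct cdev, the devfs lock, dev_dependsl(), make_dev_alias() and devfs_populate_loop(). The assumption is that the crash occurs because the alias becomes visible to the populate loop before its parent pointer is set, so the sketch links the parent inside the same critical section that publishes the alias.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct cdev; field names are illustrative. */
struct node {
	const char *name;     /* plays the role of si_name */
	struct node *parent;  /* plays the role of si_parent */
	int is_alias;         /* plays the role of the SI_ALIAS flag */
	int published;        /* set once readers are allowed to see the node */
};

/* Single lock protecting all nodes, standing in for the devfs lock. */
static pthread_mutex_t devmtx = PTHREAD_MUTEX_INITIALIZER;

/* Locked variant: the caller must already hold devmtx. */
static void
depends_locked(struct node *parent, struct node *child)
{
	assert(parent != NULL);
	child->parent = parent;
}

/*
 * Publish the alias and link its parent in ONE critical section,
 * so no reader can observe is_alias set while parent is still NULL.
 */
static void
make_alias(struct node *parent, struct node *alias, const char *name)
{
	pthread_mutex_lock(&devmtx);
	alias->name = name;
	alias->is_alias = 1;
	alias->published = 1;
	depends_locked(parent, alias);
	pthread_mutex_unlock(&devmtx);
}

/*
 * Reader: returns the buffer size needed for the symlink target
 * (parent name plus terminating NUL), or -1 if the alias is not ready.
 * The NULL check is an extra guard added for this sketch only.
 */
static long
populate(struct node *alias)
{
	long len = -1;

	pthread_mutex_lock(&devmtx);
	if (alias->published && alias->is_alias && alias->parent != NULL)
		len = (long)strlen(alias->parent->name) + 1;
	pthread_mutex_unlock(&devmtx);
	return (len);
}
```

With a parent node named "parent0", populate() returns -1 before make_alias() has run and 8 afterwards. Had the parent been linked only after the unlock, as in the pre-patch ordering, a reader running in that window would see a published alias with a NULL parent, which is the state the trap above points to.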