Date:      Sun, 25 Nov 2007 09:56:20 +0200
From:      Kostik Belousov <kostikbel@gmail.com>
To:        Robert Watson <rwatson@freebsd.org>
Cc:        freebsd-current@freebsd.org, Rako <rako29@gmail.com>
Subject:   Re: panic with tcpdrop
Message-ID:  <20071125075620.GA78396@deviant.kiev.zoral.com.ua>
In-Reply-To: <20071124211859.S14018@fledge.watson.org>
References:  <47473E30.6070608@gmail.com> <20071124003453.O14018@fledge.watson.org> <47477F9F.2080900@gmail.com> <20071124142149.Y14018@fledge.watson.org> <47486C9B.4020407@gmail.com> <20071124211859.S14018@fledge.watson.org>

On Sat, Nov 24, 2007 at 09:19:42PM +0000, Robert Watson wrote:
> On Sat, 24 Nov 2007, Rako wrote:
>
> >the patch solves the problem with tcpdrop, Thanks!!
> >
> >Another panic occurred, but in a different area: the snp.ko module (watch -W
> >/dev/ttyv0), but I can't get a backtrace. This panic is similar to
> >
> >http://lists.freebsd.org/pipermail/freebsd-current/2007-March/069990.html
> >
> >the problem may be at line 164 of /usr/src/sys/dev/snp/snp.c:
> >snp = ttytosnp(tp);
> >
> >where snp gets NULL
> >
> >but I am not familiar with this ... Any idea what I can do to solve the error?
>
> I'm having trouble reproducing this -- could you give me a detailed set of
> instructions regarding the specific steps I should take to try and get this
> panic, if it's reproducible for you?
>
> Thanks,
>
> Robert N M Watson
> Computer Laboratory
> University of Cambridge
>
> >
> >Regards,
> >Javier
> >
> >
> >Fatal trap 12: page fault while in kernel mode
> >fault virtual address   = 0x24
> >fault code              = supervisor read, page not present
> >instruction pointer     = 0x20:0xc3e4f230
> >stack pointer           = 0x28:0xd66c3b34
> >frame pointer           = 0x28:0xd66c3b88
> >code segment            = base 0x0, limit 0xfffff, type 0x1b
> >                       = DPL 0, pres 1, def32 1, gran 1
> >processor eflags        = interrupt enabled, resume, IOPL = 0
> >current process         = 2216 (make)
> >trap number             = 12
> >panic: page fault
> >KDB: stack backtrace:
> >db_trace_self_wrapper(c0a5f1ea,d66c39d4,c078878a,c0a5d5f4,c0b5bcc0,...) at db_trace_self_wrapper+0x26
> >kdb_backtrace(c0a5d5f4,c0b5bcc0,c0a1fb8c,d66c39e0,d66c39e0,...) at kdb_backtrace+0x29
> >panic(c0a1fb8c,c0a7c54d,c3e44770,1,1,...) at panic+0xaa
> >trap_fatal(c3e942b8,0,1,0,c39f5630,...) at trap_fatal+0x303
> >trap_pfault(0,c39f5630,c39f5630,0,c,...) at trap_pfault+0x250
> >trap(d66c3af4) at trap+0x382
> >calltrap() at calltrap+0x6
> >--- trap 0xc, eip = 0xc3e4f230, esp = 0xd66c3b34, ebp = 0xd66c3b88 ---
> >snplwrite(c33bf800,d66c3c60,0,d66c3bbc,c0754bec,...) at snplwrite+0x80
> >ttywrite(c3389600,d66c3c60,0,c39cf5e8,c39f5630,...) at ttywrite+0x39
> >giant_write(c3389600,d66c3c60,0,0,c0abb080,...) at giant_write+0x6c
> >devfs_write_f(c39cf5e8,d66c3c60,c3de4800,0,c39f5630,...) at devfs_write_f+0x75
> >dofilewrite(d66c3c60,ffffffff,ffffffff,0,c39cf5e8,...) at dofilewrite+0x97
> >kern_writev(c39f5630,1,d66c3c60,2813c076,0,...) at kern_writev+0x58
> >write(c39f5630,d66c3cfc,c,110,c337e630,...) at write+0x4f
> >syscall(d66c3d38) at syscall+0x335
> >Xint0x80_syscall() at Xint0x80_syscall+0x20
> >--- syscall (4, FreeBSD ELF32, write), eip = 0x8083603, esp = 0xbfbfd4ec, ebp = 0xbfbfd528 ---
> >Uptime: 19m14s
> >Physical memory: 495 MB
> >Dumping 86 MB: 71 55 39 23 7

I believe I have a plausible explanation for the panic. Please look
at snpioctl(), the SNPSTTY command. Assume that s >= 0 and that the
snoop device already has a tty attached. Then snp_tty is overwritten
without detaching the old tty from the snooper. From then on, ttytosnp()
cannot find the snp for the old tty and returns NULL, and snplwrite()
faults on it, as in the trace above. This is an old kernel bug.
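
To make the sequence concrete, here is a small user-space model of it. It
is only a sketch, not the kernel code: the model_* names are hypothetical,
the lookup is assumed to match a snooper by its snp_tty pointer, and the
old tty is assumed to keep routing writes through the snoop line
discipline, as the backtrace suggests.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct tty and struct snoop. */
struct model_tty {
	int	snoop_disc;	/* nonzero: writes go through the snoop hook */
};

struct model_snoop {
	struct model_tty *snp_tty;	/* tty this snooper is attached to */
};

/* Assumed lookup: find the snooper whose snp_tty points at tp. */
static struct model_snoop *
model_ttytosnp(struct model_snoop *snps, size_t n, struct model_tty *tp)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (snps[i].snp_tty == tp)
			return (&snps[i]);
	return (NULL);
}

/* Unpatched attach: overwrites snp_tty and leaves the old tty hooked. */
static void
model_attach_unpatched(struct model_snoop *snp, struct model_tty *tp)
{
	assert(snp != NULL && tp != NULL);
	tp->snoop_disc = 1;
	snp->snp_tty = tp;
}

/* Returns 1 if a write to tp reaches the snoop hook with no snooper. */
static int
model_write_would_fault(struct model_snoop *snps, size_t n,
    struct model_tty *tp)
{
	return (tp->snoop_disc && model_ttytosnp(snps, n, tp) == NULL);
}
```

Attaching one model snooper to tty A and then to tty B leaves A hooked but
unreachable through the lookup, so model_write_would_fault() reports the
problem for A and not for B.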

Note that watch(8) never attaches to a new tty without first detaching
from the previous one. However, since snp(4) was converted to
destroy_dev_sched(), the actual detach is asynchronous. Because watch(8)
opens a numbered snpX clone device instead of the master /dev/snp, it
can reopen the same device before the detach has completed. The condition
is racy, and thus not easily reproducible.

The patch below might help with the kernel panic.

diff --git a/sys/dev/snp/snp.c b/sys/dev/snp/snp.c
index a84e90c..b8f3d63 100644
--- a/sys/dev/snp/snp.c
+++ b/sys/dev/snp/snp.c
@@ -491,7 +491,7 @@ snpioctl(struct cdev *dev, u_long cmd, caddr_t data, int flags,
     struct thread *td)
 {
 	struct snoop *snp;
-	struct tty *tp, *tpo;
+	struct tty *tp;
 	struct cdev *tdev;
 	struct file *fp;
 	int s;
@@ -502,6 +502,9 @@ snpioctl(struct cdev *dev, u_long cmd, caddr_t data, int flags,
 		s = *(int *)data;
 		if (s < 0)
 			return (snp_down(snp));
+		if (snp->snp_tty != NULL)
+			return (EBUSY);
+
 		if (fget(td, s, &fp) != 0)
 			return (EINVAL);
 		if (fp->f_type != DTYPE_VNODE ||
@@ -520,13 +523,6 @@ snpioctl(struct cdev *dev, u_long cmd, caddr_t data, int flags,
 			return (EBUSY);
 
 		s = spltty();
-
-		if (snp->snp_target == NULL) {
-			tpo = snp->snp_tty;
-			if (tpo)
-				tpo->t_state &= ~TS_SNOOP;
-		}
-
 		tp->t_state |= TS_SNOOP;
 		snp->snp_olddisc = tp->t_line;
 		tp->t_line = snooplinedisc;
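
As a rough illustration of what the added check changes for a caller, the
following user-space model mirrors the order of the tests in the patched
SNPSTTY path. It is a sketch under assumptions, not the driver: the
model_* names are hypothetical, passing tp == NULL stands in for s < 0,
and model_down() is assumed to unhook the tty the way snp_down() is
expected to.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct tty and struct snoop. */
struct model_tty {
	int	snoop_disc;	/* nonzero: writes go through the snoop hook */
};

struct model_snoop {
	struct model_tty *snp_tty;	/* tty this snooper is attached to */
};

/* Stand-in for snp_down(): detach from the current tty, if any. */
static int
model_down(struct model_snoop *snp)
{
	if (snp->snp_tty != NULL) {
		snp->snp_tty->snoop_disc = 0;
		snp->snp_tty = NULL;
	}
	return (0);
}

/* Stand-in for the patched SNPSTTY path; tp == NULL models s < 0. */
static int
model_snpstty(struct model_snoop *snp, struct model_tty *tp)
{
	assert(snp != NULL);
	if (tp == NULL)
		return (model_down(snp));
	if (snp->snp_tty != NULL)
		return (EBUSY);		/* refuse instead of overwriting */
	tp->snoop_disc = 1;
	snp->snp_tty = tp;
	return (0);
}
```

In this model a second attach without a detach in between returns EBUSY
and leaves both ttys untouched; the caller has to detach first and then
retry.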



