From: "David A. Panariti"
To: freebsd-net@freebsd.org
Subject: kernel panic in in_delayed_cksum()
Date: Thu, 25 May 2000 23:49:09 -0400
Message-Id: <200005260349.XAA01157@h0000f806dfda.ne.mediaone.net>

I sent this to freebsd-stable but have gotten no solutions, so I'll
post it here, too.

I believe I have found a bug in netinet. After a cvsup and
make world/kernel + mergemaster, I started getting the following
panics:

delayed m_pullup, m->len: 40  off: 23040  p: 6

Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x8
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc01b10a8
stack pointer           = 0x10:0xcd069ae4
frame pointer           = 0x10:0xcd069b10
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 5740 (itnd)
interrupt mask          =
trap number             = 12
panic: page fault

syncing disks... done
Uptime: 10h22m23s

I cannot remember exactly when I cvsup'd. It was either May 17 or
May 20. (Any easy way to tell?)

UPDATE: Thanks to Trond Endrest, I know it was cvsup'd on May 20,
16:05 EDT.
Unfortunately, I can only reproduce the panics using an old version of
the AltaVista tunnel. The tunnel worked perfectly up to and including
4-RELEASE. I only have the binary of the tunnel code, and it was
compiled for FreeBSD 2.2. The fact that it ran perfectly up to 4R is a
testament to backward compatibility!

Anyway, after some investigation, it looks like the m_pullup() is
failing inside in_delayed_cksum(). The mbuf is then NULL and we panic
when we set the csum. It looks like m_pullup() is failing because the
offset is very big. Some prints I added yield this:

(IP_VHL_HL(ip->ip_vhl) << 2): 0, csum_data: 23040
off too big, skipping csum

(I added code to return without setting the csum if I see a bogus
offset. With that change I no longer panic here, and the ftp which was
failing now works better, but the kernel can now panic elsewhere.)

Further investigation shows csum_data being mangled here in
ip_output():

        ip = mtod(m, struct ip *);
        /*
         * Fill in IP header.
         */
        if ((flags & (IP_FORWARDING|IP_RAWOUTPUT)) == 0) {
                ip->ip_vhl = IP_MAKE_VHL(IPVERSION, hlen >> 2);
                ip->ip_off &= IP_DF;
>>>>>>>>>>>     ip->ip_id = htons(ip_id++);
                ipstat.ips_localout++;
        } else {
                hlen = IP_VHL_HL(ip->ip_vhl) << 2;
                dp_ck_csum_data(m, "a-7.1");    /* davep */
        }

More prints show:

off too big @ a-7.3, off: 0x14, csum_data: 0x5a00
ip: 0xc0a91920, m: 0xc0a91900, &csum_data: 0xc0a91924

Where:
  ip          is the IP header inside the mbuf
  m           is the mbuf pointer
  &csum_data  is &m->m_pkthdr.csum_data

csum_data is inside the IP header! And, coincidentally (NOT), ip_id is
4 bytes inside the struct ip, thus overlaying csum_data.

So it looks like m_data is pointing at M_databuf, which should imply
(as the comment states)

        /* !M_PKTHDR, !M_EXT */

And yet the code is using fields from

        struct pkthdr MH_pkthdr;        /* M_PKTHDR set */

This is where I leave it for those more familiar with the code to
pursue. Hopefully someone who knows the code can use this info to find
and fix the bug quickly. It's taken me ~6 hrs just to find out this
much.
thanks,
davep

--
David Panariti                     /  I can't complain,
davep@who.net                         but sometimes I still do.
(see also http://www.four11.com)   /          -- Joe Walsh