From owner-freebsd-pf@FreeBSD.ORG Sun May 27 18:29:28 2012 Return-Path: Delivered-To: freebsd-pf@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 1FBD1106566C; Sun, 27 May 2012 18:29:28 +0000 (UTC) (envelope-from Joerg.Pulz@frm2.tum.de) Received: from mailhost.frm2.tum.de (mailhost.frm2.tum.de [129.187.179.12]) by mx1.freebsd.org (Postfix) with ESMTP id B59BE8FC16; Sun, 27 May 2012 18:29:27 +0000 (UTC) Received: from mailhost.frm2.tum.de (localhost [127.0.0.1]) by mailhost.frm2.tum.de (8.14.4/8.14.4) with ESMTP id q4RIS6PW011493; Sun, 27 May 2012 20:28:06 +0200 (CEST) (envelope-from Joerg.Pulz@frm2.tum.de) X-Virus-Scanned: at mailhost.frm2.tum.de Received: from [31.245.72.216] (tmo-102-253.customers.d1-online.com [80.187.102.253]) (authenticated bits=0) by mailhost.frm2.tum.de (8.14.4/8.14.4) with ESMTP id q4RIRw6T011489 (version=TLSv1/SSLv3 cipher=RC4-MD5 bits=128 verify=NO); Sun, 27 May 2012 20:28:02 +0200 (CEST) (envelope-from Joerg.Pulz@frm2.tum.de) X-Authentication-Warning: mailhost.frm2.tum.de: Host tmo-102-253.customers.d1-online.com [80.187.102.253] claimed to be [31.245.72.216] References: <201205250730.q4P7UGu0006036@freefall.freebsd.org> <20120525091627.GA27514@insomnia.benzedrine.cx> User-Agent: K-9 Mail for Android In-Reply-To: <20120525091627.GA27514@insomnia.benzedrine.cx> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit From: Joerg Pulz Date: Sun, 27 May 2012 20:27:58 +0200 To: Daniel Hartmeier Message-ID: <344070cc-e9c8-4eed-872c-14e8bcd343ff@email.android.com> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.6 (mailhost.frm2.tum.de [129.187.179.12]); Sun, 27 May 2012 20:28:04 +0200 (CEST) Cc: bug-followup@freebsd.org, freebsd-pf@freebsd.org Subject: Re: kern/168190: [pf] panic when using pf and route-to (maybe: bad fragment handling?) X-BeenThere: freebsd-pf@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Technical discussion and general questions about packet filter \(pf\)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 27 May 2012 18:29:28 -0000 Daniel Hartmeier wrote: >On Fri, May 25, 2012 at 07:30:16AM +0000, Joerg Pulz wrote: > >> the system is still running without panic, but i found the following >log >> entries from last night: >> >> May 24 23:28:57 charon kernel: pf_route: m0->m_len < sizeof(struct >ip) >> May 24 23:28:57 charon kernel: pf_route: m0->m_len < sizeof(struct >ip) >> >> Do you think that this may be related to the panics? >> I've found this error message two times in contrib/pf/net/pf.c. >> I can't say which of them or both have printed the message. > >Yes, this could possibly explain it. > >All pfil consumers assume that the IP header is one continuous memory >region in the mbuf, without verifying this or correcting it (with >m_pullup() or such) if wrong. > >pf: pf_check_in() in sys/contrib/pf/net/pf_ioctl.c > h = mtod(*m, struct ip *); > access h->ip_len > >ipfw: ipfw_check_hook() in sys/netinet/ipfw/ip_fw_pfil.c > SET_NET_IPLEN(mtod(*m0, struct ip *)); > >ipfilter: fr_check_wrapper() in >sys/contrib/ipfilter/netinet/ip_fil_freebsd.c > struct ip *ip = mtod(*mp, struct ip *); > access ip->ip_hl > >Hence, the caller of pfil_run_hooks() must ensure this before the call. >ip_input() does, but there are operations that might violate the >condition again. > >If the IP header is not continuous in the first mbuf, doing > > struct ip *ip = mtod(m, struct ip *); > ip->ip_len = ntohs(ip->ip_len); > >will read (and write) unrelated memory. > >If, later, something does call m_pullup(), ip_len will get 'replaced' >again. This could lead to some byte order swaps getting undone. > >I'm not sure what the proper action is here, i.e. should we be >surprised >that an mbuf with such a small m_len is found, and track down how it >was produced, or should the pfil code simply expect this? > >I'd probably add a sanity check to pfil_run_hooks(), like > >--- sys/net/pfil.c 23 Sep 2011 00:51:37 -0000 1.19.2.1 >+++ sys/net/pfil.c 25 May 2012 09:10:27 -0000 >@@ -46,6 +46,8 @@ > > #include > #include >+#include >+#include > > static struct mtx pfil_global_lock; > >@@ -74,15 +76,21 @@ > struct mbuf *m = *mp; > int rv = 0; > >+ if (m->m_pkthdr.len < sizeof(struct ip) || >+ m->m_len < sizeof(struct ip)) >+ panic("pfil_run_hooks: m->m_pkthdr.len %d, m->m_len >%d", >+ (int)m->m_pkthdr.len, (int)m->m_len); > PFIL_RLOCK(ph, &rmpt); > KASSERT(ph->ph_nhooks >= 0, ("Pfil hook count dropped < 0")); > for (pfh = pfil_hook_get(dir, ph); pfh != NULL; > pfh = TAILQ_NEXT(pfh, pfil_link)) { > if (pfh->pfil_func != NULL) { >+ ASSERT_HOST_BYTE_ORDER(m); > rv = (*pfh->pfil_func)(pfh->pfil_arg, &m, ifp, dir, > inp); > if (rv != 0 || m == NULL) > break; >+ ASSERT_HOST_BYTE_ORDER(m); > } > } > PFIL_RUNLOCK(ph, &rmpt); > >Then when it will panic (instead of just the pf_route() message), the >stack trace could help. > >This might require several iterations, adding such checks all around >the >place. The cause might be some mbuf operation done in ipsec, for some >edge case, explaining why it occurs so rarely... > >If I could easily reproduce it locally, I'd probably do it, but it's >your machine that crashes all the time, so it's your call :) > >Daniel Daniel, i've seen 12 more "pf_route: m0->m_len < sizeof(struct ip)" messages since the system is running after adding your patch, but no panic. Is there another place in the code where i can add an additional check? Kind regards Jörg