From owner-freebsd-pf@FreeBSD.ORG Sun May 27 18:30:09 2012 Return-Path: Delivered-To: freebsd-pf@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9790B10656D1 for ; Sun, 27 May 2012 18:30:09 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 7970A8FC15 for ; Sun, 27 May 2012 18:30:09 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q4RIU9OD039896 for ; Sun, 27 May 2012 18:30:09 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q4RIU9fA039893; Sun, 27 May 2012 18:30:09 GMT (envelope-from gnats) Date: Sun, 27 May 2012 18:30:09 GMT Message-Id: <201205271830.q4RIU9fA039893@freefall.freebsd.org> To: freebsd-pf@FreeBSD.org From: Joerg Pulz Cc: Subject: Re: kern/168190: [pf] panic when using pf and route-to (maybe: bad fragment handling?) X-BeenThere: freebsd-pf@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Joerg Pulz List-Id: "Technical discussion and general questions about packet filter \(pf\)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 27 May 2012 18:30:09 -0000 The following reply was made to PR kern/168190; it has been noted by GNATS. From: Joerg Pulz To: Daniel Hartmeier Cc: freebsd-pf@freebsd.org, bug-followup@freebsd.org Subject: Re: kern/168190: [pf] panic when using pf and route-to (maybe: bad fragment handling?) Date: Sun, 27 May 2012 20:27:58 +0200 Daniel Hartmeier wrote: >On Fri, May 25, 2012 at 07:30:16AM +0000, Joerg Pulz wrote: > >> the system is still running without panic, but i found the following >log >> entries from last night: >> >> May 24 23:28:57 charon kernel: pf_route: m0->m_len < sizeof(struct >ip) >> May 24 23:28:57 charon kernel: pf_route: m0->m_len < sizeof(struct >ip) >> >> Do you think that this may be related to the panics? >> I've found this error message two times in contrib/pf/net/pf.c. >> I can't say which of them or both have printed the message. > >Yes, this could possibly explain it. > >All pfil consumers assume that the IP header is one continuous memory >region in the mbuf, without verifying this or correcting it (with >m_pullup() or such) if wrong. > >pf: pf_check_in() in sys/contrib/pf/net/pf_ioctl.c > h = mtod(*m, struct ip *); > access h->ip_len > >ipfw: ipfw_check_hook() in sys/netinet/ipfw/ip_fw_pfil.c > SET_NET_IPLEN(mtod(*m0, struct ip *)); > >ipfilter: fr_check_wrapper() in >sys/contrib/ipfilter/netinet/ip_fil_freebsd.c > struct ip *ip = mtod(*mp, struct ip *); > access ip->ip_hl > >Hence, the caller of pfil_run_hooks() must ensure this before the call. >ip_input() does, but there are operations that might violate the >condition again. > >If the IP header is not continuous in the first mbuf, doing > > struct ip *ip = mtod(m, struct ip *); > ip->ip_len = ntohs(ip->ip_len); > >will read (and write) unrelated memory. > >If, later, something does call m_pullup(), ip_len will get 'replaced' >again. This could lead to some byte order swaps getting undone. > >I'm not sure what the proper action is here, i.e. should we be >surprised >that an mbuf with such a small m_len is found, and track down how it >was produced, or should the pfil code simply expect this? > >I'd probably add a sanity check to pfil_run_hooks(), like > >--- sys/net/pfil.c 23 Sep 2011 00:51:37 -0000 1.19.2.1 >+++ sys/net/pfil.c 25 May 2012 09:10:27 -0000 >@@ -46,6 +46,8 @@ > > #include > #include >+#include >+#include > > static struct mtx pfil_global_lock; > >@@ -74,15 +76,21 @@ > struct mbuf *m = *mp; > int rv = 0; > >+ if (m->m_pkthdr.len < sizeof(struct ip) || >+ m->m_len < sizeof(struct ip)) >+ panic("pfil_run_hooks: m->m_pkthdr.len %d, m->m_len >%d", >+ (int)m->m_pkthdr.len, (int)m->m_len); > PFIL_RLOCK(ph, &rmpt); > KASSERT(ph->ph_nhooks >= 0, ("Pfil hook count dropped < 0")); > for (pfh = pfil_hook_get(dir, ph); pfh != NULL; > pfh = TAILQ_NEXT(pfh, pfil_link)) { > if (pfh->pfil_func != NULL) { >+ ASSERT_HOST_BYTE_ORDER(m); > rv = (*pfh->pfil_func)(pfh->pfil_arg, &m, ifp, dir, > inp); > if (rv != 0 || m == NULL) > break; >+ ASSERT_HOST_BYTE_ORDER(m); > } > } > PFIL_RUNLOCK(ph, &rmpt); > >Then when it will panic (instead of just the pf_route() message), the >stack trace could help. > >This might require several iterations, adding such checks all around >the >place. The cause might be some mbuf operation done in ipsec, for some >edge case, explaining why it occurs so rarely... > >If I could easily reproduce it locally, I'd probably do it, but it's >your machine that crashes all the time, so it's your call :) > >Daniel Daniel, i've seen 12 more "pf_route: m0->m_len < sizeof(struct ip)" messages since the system is running after adding your patch, but no panic. Is there another place in the code where i can add an additional check? Kind regards Jörg