From owner-freebsd-virtualization@FreeBSD.ORG Wed Jun 5 08:52:27 2013 Return-Path: Delivered-To: freebsd-virtualization@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 2F7C3C15; Wed, 5 Jun 2013 08:52:27 +0000 (UTC) (envelope-from to.my.trociny@gmail.com) Received: from mail-bk0-x236.google.com (mail-bk0-x236.google.com [IPv6:2a00:1450:4008:c01::236]) by mx1.freebsd.org (Postfix) with ESMTP id 3774D1C64; Wed, 5 Jun 2013 08:52:26 +0000 (UTC) Received: by mail-bk0-f54.google.com with SMTP id it19so693182bkc.13 for ; Wed, 05 Jun 2013 01:52:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=sender:date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; bh=RhwcelCQpqAxDJ72aASwgKzRrHLb5Sq5Q1IaIK54REo=; b=muEOfzhTxJbfoJ3AGV4M0fafYBKSTDhyhUvVnL/Hn8wkkL2SpUTZRfBv3+5rvjjXSQ 4SLZK6QAUWZuXQMI+vU7nodCU+cQogf0NSwyIL/6tqth8acRNI0mklaoFE3n7O67CC/e j/E+qzH6ET/9GJbFrURL2dQJNWrn7kEFdXZPwQfQhzv/YBZRbGDp4iBiTrL+Nsyy5VNz 7bvj+Y3f91NJ5HZeO3u7AtOeG0Cpta9xW+SCLEVpACJC3ZA3kYsg7GmUgbz3EJgRV9N7 iIxd2iiFDBJloS7HtD46Cz4XUCbFvqFFmt+RbT16twSIUYZ2bFpkaMAC2ChAhHVEfd0p 0aSA== X-Received: by 10.204.62.137 with SMTP id x9mr9224354bkh.90.1370422344859; Wed, 05 Jun 2013 01:52:24 -0700 (PDT) Received: from localhost ([178.150.115.244]) by mx.google.com with ESMTPSA id jm15sm25045160bkb.13.2013.06.05.01.52.22 for (version=TLSv1.2 cipher=RC4-SHA bits=128/128); Wed, 05 Jun 2013 01:52:23 -0700 (PDT) Sender: Mikolaj Golub Date: Wed, 5 Jun 2013 11:52:20 +0300 From: Mikolaj Golub To: Nikos Vassiliadis Subject: Re: pf + vimage patch Message-ID: <20130605085219.GA53217@gmail.com> References: <51AC84EE.6020009@gmx.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <51AC84EE.6020009@gmx.com> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-jail@freebsd.org, Gleb Smirnoff , freebsd-virtualization@freebsd.org, freebsd-pf@freebsd.org X-BeenThere: freebsd-virtualization@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "Discussion of various virtualization techniques FreeBSD supports." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 05 Jun 2013 08:52:27 -0000 Hi, On Mon, Jun 03, 2013 at 01:58:38PM +0200, Nikos Vassiliadis wrote: > Hi, > > Please review this patch. It fixes some problems with pf and vimage. > For the time being only pf works. ALTQ, pflog, pfsync are not changed > nor tested but as time permits, I'll work on them. Basic packet > filtering functionality per VNET should be ok. > Thank you for working on this. I'd really like to see pf and vimage integrated, as it looks like one of the main stoppers to have vimage in GENERIC. I hoped more knowledgeable people would comment on this though. Anyway, not being familiar with pf, here is a couple of things I would recommend to make your patch more attractive for potential reviewers: 1) It looks like the patch can be split on several parts. A log message to every change describing why it is needed and what problem solves would be very helpful. As a tool to maintain such changes I personally prefer git. 2) You use spaces for indentation, where tabs should be. This adds unnecessary noise and makes the patch less readable. Also someone will need to fix this when eventually committing to the tree. 3) When generating diff from svn, adding -x-pu options will make the diff more readable. Also see some questions/comments below. > Index: sys/net/pfvar.h > =================================================================== > --- sys/net/pfvar.h (revision 251294) > +++ sys/net/pfvar.h (working copy) > @@ -901,7 +901,6 @@ > struct pf_ruleset *, struct pf_pdesc *, int); > extern pflog_packet_t *pflog_packet_ptr; > > -#define V_pf_end_threads VNET(pf_end_threads) > #endif /* _KERNEL */ > > #define PFSYNC_FLAG_SRCNODE 0x04 > Index: sys/netpfil/pf/pf.c > =================================================================== > --- sys/netpfil/pf/pf.c (revision 251294) > +++ sys/netpfil/pf/pf.c (working copy) > @@ -300,8 +300,6 @@ > > int in4_cksum(struct mbuf *m, u_int8_t nxt, int off, int len); > > -VNET_DECLARE(int, pf_end_threads); > - > VNET_DEFINE(struct pf_limit, pf_limits[PF_LIMIT_MAX]); > > #define PACKET_LOOPED(pd) ((pd)->pf_mtag && \ > @@ -359,15 +357,13 @@ > > SYSCTL_NODE(_net, OID_AUTO, pf, CTLFLAG_RW, 0, "pf(4)"); > > -VNET_DEFINE(u_long, pf_hashsize); > -#define V_pf_hashsize VNET(pf_hashsize) > -SYSCTL_VNET_UINT(_net_pf, OID_AUTO, states_hashsize, CTLFLAG_RDTUN, > - &VNET_NAME(pf_hashsize), 0, "Size of pf(4) states hashtable"); > +u_long pf_hashsize; > +SYSCTL_UINT(_net_pf, OID_AUTO, states_hashsize, CTLFLAG_RDTUN, > + &pf_hashsize, 0, "Size of pf(4) states hashtable"); > > -VNET_DEFINE(u_long, pf_srchashsize); > -#define V_pf_srchashsize VNET(pf_srchashsize) > -SYSCTL_VNET_UINT(_net_pf, OID_AUTO, source_nodes_hashsize, CTLFLAG_RDTUN, > - &VNET_NAME(pf_srchashsize), 0, "Size of pf(4) source nodes hashtable"); > +u_long pf_srchashsize; > +SYSCTL_UINT(_net_pf, OID_AUTO, source_nodes_hashsize, CTLFLAG_RDTUN, > + &pf_srchashsize, 0, "Size of pf(4) source nodes hashtable"); > Why do you have to devirtualize these variables? Are per vnet hashtables sizes not useful or do they cause issues? > VNET_DEFINE(void *, pf_swi_cookie); > > @@ -698,12 +694,12 @@ > struct pf_srchash *sh; > u_int i; > > - TUNABLE_ULONG_FETCH("net.pf.states_hashsize", &V_pf_hashsize); > - if (V_pf_hashsize == 0 || !powerof2(V_pf_hashsize)) > - V_pf_hashsize = PF_HASHSIZ; > - TUNABLE_ULONG_FETCH("net.pf.source_nodes_hashsize", &V_pf_srchashsize); > - if (V_pf_srchashsize == 0 || !powerof2(V_pf_srchashsize)) > - V_pf_srchashsize = PF_HASHSIZ / 4; > + TUNABLE_ULONG_FETCH("net.pf.states_hashsize", &pf_hashsize); > + if (pf_hashsize == 0 || !powerof2(pf_hashsize)) > + pf_hashsize = PF_HASHSIZ; > + TUNABLE_ULONG_FETCH("net.pf.source_nodes_hashsize", &pf_srchashsize); > + if (pf_srchashsize == 0 || !powerof2(pf_srchashsize)) > + pf_srchashsize = PF_HASHSIZ / 4; > > V_pf_hashseed = arc4random(); > > @@ -717,11 +713,11 @@ > V_pf_state_key_z = uma_zcreate("pf state keys", > sizeof(struct pf_state_key), pf_state_key_ctor, NULL, NULL, NULL, > UMA_ALIGN_PTR, 0); > - V_pf_keyhash = malloc(V_pf_hashsize * sizeof(struct pf_keyhash), > + V_pf_keyhash = malloc(pf_hashsize * sizeof(struct pf_keyhash), > M_PFHASH, M_WAITOK | M_ZERO); > - V_pf_idhash = malloc(V_pf_hashsize * sizeof(struct pf_idhash), > + V_pf_idhash = malloc(pf_hashsize * sizeof(struct pf_idhash), > M_PFHASH, M_WAITOK | M_ZERO); > - V_pf_hashmask = V_pf_hashsize - 1; > + V_pf_hashmask = pf_hashsize - 1; > for (i = 0, kh = V_pf_keyhash, ih = V_pf_idhash; i <= V_pf_hashmask; > i++, kh++, ih++) { > mtx_init(&kh->lock, "pf_keyhash", NULL, MTX_DEF); > @@ -735,9 +731,9 @@ > V_pf_limits[PF_LIMIT_SRC_NODES].zone = V_pf_sources_z; > uma_zone_set_max(V_pf_sources_z, PFSNODE_HIWAT); > uma_zone_set_warning(V_pf_sources_z, "PF source nodes limit reached"); > - V_pf_srchash = malloc(V_pf_srchashsize * sizeof(struct pf_srchash), > + V_pf_srchash = malloc(pf_srchashsize * sizeof(struct pf_srchash), > M_PFHASH, M_WAITOK|M_ZERO); > - V_pf_srchashmask = V_pf_srchashsize - 1; > + V_pf_srchashmask = pf_srchashsize - 1; > for (i = 0, sh = V_pf_srchash; i <= V_pf_srchashmask; i++, sh++) > mtx_init(&sh->lock, "pf_srchash", NULL, MTX_DEF); > > @@ -757,13 +753,17 @@ > STAILQ_INIT(&V_pf_sendqueue); > SLIST_INIT(&V_pf_overloadqueue); > TASK_INIT(&V_pf_overloadtask, 0, pf_overload_task, &V_pf_overloadqueue); > - mtx_init(&pf_sendqueue_mtx, "pf send queue", NULL, MTX_DEF); > - mtx_init(&pf_overloadqueue_mtx, "pf overload/flush queue", NULL, > - MTX_DEF); > + if (IS_DEFAULT_VNET(curvnet)) { > + mtx_init(&pf_sendqueue_mtx, "pf send queue", NULL, MTX_DEF); > + mtx_init(&pf_overloadqueue_mtx, "pf overload/flush queue", NULL, > + MTX_DEF); > + } > > /* Unlinked, but may be referenced rules. */ > TAILQ_INIT(&V_pf_unlinked_rules); > - mtx_init(&pf_unlnkdrules_mtx, "pf unlinked rules", NULL, MTX_DEF); > + if (IS_DEFAULT_VNET(curvnet)) > + mtx_init(&pf_unlnkdrules_mtx, "pf unlinked rules", NULL, MTX_DEF); > + > } "if (IS_DEFAULT_VNET(curvnet))" constructions look a little ugly for me. Isn't possible to split these initialization functions on two parts: one (not "virtualized") to run by pf_load() and the other by vnet_pf_init()? > > void > @@ -1309,68 +1309,35 @@ > pf_purge_thread(void *v) > { > u_int idx = 0; > + VNET_ITERATOR_DECL(vnet_iter); > > - CURVNET_SET((struct vnet *)v); > - > for (;;) { > - PF_RULES_RLOCK(); > - rw_sleep(pf_purge_thread, &pf_rules_lock, 0, "pftm", hz / 10); > + tsleep(pf_purge_thread, PWAIT, "pftm", hz / 10); > + VNET_LIST_RLOCK(); > + VNET_FOREACH(vnet_iter) { > + CURVNET_SET(vnet_iter); > > - if (V_pf_end_threads) { > - /* > - * To cleanse up all kifs and rules we need > - * two runs: first one clears reference flags, > - * then pf_purge_expired_states() doesn't > - * raise them, and then second run frees. > - */ > - PF_RULES_RUNLOCK(); > - pf_purge_unlinked_rules(); > - pfi_kif_purge(); > - > - /* > - * Now purge everything. > - */ > - pf_purge_expired_states(0, V_pf_hashmask); > - pf_purge_expired_fragments(); > - pf_purge_expired_src_nodes(); > - > - /* > - * Now all kifs & rules should be unreferenced, > - * thus should be successfully freed. > - */ > - pf_purge_unlinked_rules(); > - pfi_kif_purge(); > - > - /* > - * Announce success and exit. > - */ > - PF_RULES_RLOCK(); > - V_pf_end_threads++; > - PF_RULES_RUNLOCK(); > - wakeup(pf_purge_thread); > - kproc_exit(0); > - } Running only one instance of pf_purge_thread, which iterates over all vnets looks like a good idea to me, but do we really not need this clean up stuff for our only thread? Don't you have issues e.g. on pf module unload? > - PF_RULES_RUNLOCK(); > - > /* Process 1/interval fraction of the state table every run. */ > idx = pf_purge_expired_states(idx, V_pf_hashmask / > - (V_pf_default_rule.timeout[PFTM_INTERVAL] * 10)); > + (V_pf_default_rule.timeout[PFTM_INTERVAL] * 10)); > > /* Purge other expired types every PFTM_INTERVAL seconds. */ > if (idx == 0) { > - /* > - * Order is important: > - * - states and src nodes reference rules > - * - states and rules reference kifs > - */ > - pf_purge_expired_fragments(); > - pf_purge_expired_src_nodes(); > - pf_purge_unlinked_rules(); > - pfi_kif_purge(); > + /* > + * Order is important: > + * - states and src nodes reference rules > + * - states and rules reference kifs > + */ > + pf_purge_expired_fragments(); > + pf_purge_expired_src_nodes(); > + pf_purge_unlinked_rules(); > + pfi_kif_purge(); This is a good example of unnecessary whitespace noise. > } > + CURVNET_RESTORE(); > + } > + VNET_LIST_RUNLOCK(); > } > /* not reached */ > - CURVNET_RESTORE(); > } > -- Mikolaj Golub