From owner-freebsd-pf@FreeBSD.ORG Sun Mar 30 08:27:21 2014 Return-Path: Delivered-To: freebsd-pf@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 737206F6; Sun, 30 Mar 2014 08:27:21 +0000 (UTC) Received: from mail.vx.sk (mail.vx.sk [IPv6:2a01:4f8:150:6101::4]) by mx1.freebsd.org (Postfix) with ESMTP id 1A68215F; Sun, 30 Mar 2014 08:27:18 +0000 (UTC) Received: from mail.vx.sk (localhost [127.0.0.1]) by mail.vx.sk (Postfix) with ESMTP id 935321B98C; Sun, 30 Mar 2014 10:27:09 +0200 (CEST) X-Virus-Scanned: amavisd-new at mail.vx.sk Received: from mail.vx.sk by mail.vx.sk (amavisd-new, unix socket) with LMTP id y2uVPNJv8tZv; Sun, 30 Mar 2014 10:27:08 +0200 (CEST) Received: from [192.168.2.103] (dslb-092-078-029-103.pools.arcor-ip.net [92.78.29.103]) by mail.vx.sk (Postfix) with ESMTPSA id A82A21B981; Sun, 30 Mar 2014 10:27:07 +0200 (CEST) Message-ID: <5337D55A.6030607@FreeBSD.org> Date: Sun, 30 Mar 2014 10:27:06 +0200 From: Martin Matuska User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.4.0 MIME-Version: 1.0 To: Nikos Vassiliadis , Mikolaj Golub Subject: CFR projects/pf: vnet awareness for pf_overloadqueue X-Enigmail-Version: 1.5.2 Content-Type: multipart/mixed; boundary="------------070708030109060004010007" Cc: freebsd-pf@FreeBSD.org X-BeenThere: freebsd-pf@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: "Technical discussion and general questions about packet filter \(pf\)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 30 Mar 2014 08:27:21 -0000 This is a multi-part message in MIME format. --------------070708030109060004010007 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Hi, with the pf_mtag_z patch applied, the second patch that fixes panics I experience is the overload queue patch. I have looked into solving this via context (adding a "struct vnet" member to pf_overload_entry). This leaves unsolved problems - first, we have now vnet information on per-entry instead of per-queue. There are two places in pf_overload_task() where we are not processing an entry but need vnet information: 1. V_pf_idhash[i] in pf_overload_task(): for (int i = 0; i <= pf_hashmask; i++) { struct pf_idhash *ih = &V_pf_idhash[i]; struct pf_state_key *sk; struct pf_state *s; PF_HASHROW_LOCK(ih); LIST_FOREACH(s, &ih->states, entry) { 2. end of pf_overload_task() but that is only the debug tunable V_pf_status_debug: if (V_pf_status.debug >= PF_DEBUG_MISC) printf("%s: %u states killed", __func__, killed); On the other hand, if we want to keep per-vnet overloadqueues than it makes sense to store vnet information on queue level. If we pack vnet information into each entry and the overloadqueue has global locks anyway, why not keeping a single global queue with entries from different vnets? At the current state the code causes panics if pf_overload_task() is fired because vnet context is missing. It needs to be fixed in any of the ways. A patch for adding per-queue vnet information is attached. Thank you. mm --------------070708030109060004010007 Content-Type: text/x-patch; name="pf_overloadqueue.patch" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="pf_overloadqueue.patch" Index: projects/pf/head/sys/netpfil/pf/pf.c =================================================================== --- projects/pf/head/sys/netpfil/pf/pf.c (revision 263908) +++ projects/pf/head/sys/netpfil/pf/pf.c (working copy) @@ -173,7 +173,11 @@ struct pf_overload_entry { struct pf_rule *rule; }; -SLIST_HEAD(pf_overload_head, pf_overload_entry); +struct pf_overload_head { + SLIST_HEAD(, pf_overload_entry) head; + struct vnet *vnet; +}; + static VNET_DEFINE(struct pf_overload_head, pf_overloadqueue); #define V_pf_overloadqueue VNET(pf_overloadqueue) static VNET_DEFINE(struct task, pf_overloadtask); @@ -512,7 +516,7 @@ pf_src_connlimit(struct pf_state **state) pfoe->rule = (*state)->rule.ptr; pfoe->dir = (*state)->direction; PF_OVERLOADQ_LOCK(); - SLIST_INSERT_HEAD(&V_pf_overloadqueue, pfoe, next); + SLIST_INSERT_HEAD(&V_pf_overloadqueue.head, pfoe, next); PF_OVERLOADQ_UNLOCK(); taskqueue_enqueue(taskqueue_swi, &V_pf_overloadtask); @@ -529,11 +533,13 @@ pf_overload_task(void *c, int pending) PF_OVERLOADQ_LOCK(); queue = *(struct pf_overload_head *)c; - SLIST_INIT((struct pf_overload_head *)c); + SLIST_INIT(&((struct pf_overload_head *)c)->head); PF_OVERLOADQ_UNLOCK(); + CURVNET_SET(queue.vnet); + bzero(&p, sizeof(p)); - SLIST_FOREACH(pfoe, &queue, next) { + SLIST_FOREACH(pfoe, &queue.head, next) { V_pf_status.lcounters[LCNT_OVERLOAD_TABLE]++; if (V_pf_status.debug >= PF_DEBUG_MISC) { printf("%s: blocking address ", __func__); @@ -565,16 +571,19 @@ pf_overload_task(void *c, int pending) /* * Remove those entries, that don't need flushing. */ - SLIST_FOREACH_SAFE(pfoe, &queue, next, pfoe1) + SLIST_FOREACH_SAFE(pfoe, &queue.head, next, pfoe1) if (pfoe->rule->flush == 0) { - SLIST_REMOVE(&queue, pfoe, pf_overload_entry, next); + SLIST_REMOVE(&queue.head, pfoe, pf_overload_entry, + next); free(pfoe, M_PFTEMP); } else V_pf_status.lcounters[LCNT_OVERLOAD_FLUSH]++; /* If nothing to flush, return. */ - if (SLIST_EMPTY(&queue)) + if (SLIST_EMPTY(&queue.head)) { + CURVNET_RESTORE(); return; + } for (int i = 0; i <= pf_hashmask; i++) { struct pf_idhash *ih = &V_pf_idhash[i]; @@ -584,7 +593,7 @@ pf_overload_task(void *c, int pending) PF_HASHROW_LOCK(ih); LIST_FOREACH(s, &ih->states, entry) { sk = s->key[PF_SK_WIRE]; - SLIST_FOREACH(pfoe, &queue, next) + SLIST_FOREACH(pfoe, &queue.head, next) if (sk->af == pfoe->af && ((pfoe->rule->flush & PF_FLUSH_GLOBAL) || pfoe->rule == s->rule.ptr) && @@ -599,10 +608,12 @@ pf_overload_task(void *c, int pending) } PF_HASHROW_UNLOCK(ih); } - SLIST_FOREACH_SAFE(pfoe, &queue, next, pfoe1) + SLIST_FOREACH_SAFE(pfoe, &queue.head, next, pfoe1) free(pfoe, M_PFTEMP); if (V_pf_status.debug >= PF_DEBUG_MISC) printf("%s: %u states killed", __func__, killed); + + CURVNET_RESTORE(); } /* @@ -803,8 +814,9 @@ pf_vnet_initialize() /* Send & overload+flush queues. */ STAILQ_INIT(&V_pf_sendqueue); - SLIST_INIT(&V_pf_overloadqueue); + SLIST_INIT(&V_pf_overloadqueue.head); TASK_INIT(&V_pf_overloadtask, 0, pf_overload_task, &V_pf_overloadqueue); + V_pf_overloadqueue.vnet = curvnet; /* Unlinked, but may be referenced rules. */ TAILQ_INIT(&V_pf_unlinked_rules); @@ -1680,7 +1692,7 @@ pf_purge_unlinked_rules() * an already unlinked rule. */ PF_OVERLOADQ_LOCK(); - if (!SLIST_EMPTY(&V_pf_overloadqueue)) { + if (!SLIST_EMPTY(&V_pf_overloadqueue.head)) { PF_OVERLOADQ_UNLOCK(); return; } --------------070708030109060004010007--