From owner-freebsd-pf@freebsd.org Wed Aug 26 10:09:19 2015 Return-Path: Delivered-To: freebsd-pf@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 30A009C3132 for ; Wed, 26 Aug 2015 10:09:19 +0000 (UTC) (envelope-from ml@my.gd) Received: from mail-lb0-f172.google.com (mail-lb0-f172.google.com [209.85.217.172]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 89B101F32 for ; Wed, 26 Aug 2015 10:09:17 +0000 (UTC) (envelope-from ml@my.gd) Received: by lbbpu9 with SMTP id pu9so116864196lbb.3 for ; Wed, 26 Aug 2015 03:09:10 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=hnz4o3GPPp3Rff/vOOmv2+PnCEz5memcTs4DDS84cX8=; b=btSFolqyhLBSY0FgcPjHNsG+cw1gRRtgiRAg0dNQXDfQenXm+GPV+TaJUOlJ4ZLO1I nz3u5KSeMxx1OzaEX8vPPBROb/RTfCR/d5QZ83KKUCQ8oFsn7E94MRF7XGJ921J47CRQ JZc5UuXxo0mrIWVl5v8JtpnO6xzcEeDjsiwF9lFTI5wyVxMtk4XeRi45InyRQ5P2R+Tg 0h1JdymmORiXfgDsd6msgozQOSsnFzsT7ObTlxmaGJwi86HY/8E52AImoiad9GPBNoIY 2X0iMPDSwwiPTrgcBs8GjUShHxybVuPUFeQToaX+n9fv4eJ2I6TM2bDgnSNdEcibNG2Q KB9g== X-Gm-Message-State: ALoCoQlOGecPIoCa3mBJZzzjOEhwVeNnhBCJbsT++CVCk1DlIizwe1+RPDzT1kQj/BcSIWNug4NG MIME-Version: 1.0 X-Received: by 10.152.22.4 with SMTP id z4mr30359566lae.81.1440583750166; Wed, 26 Aug 2015 03:09:10 -0700 (PDT) Received: by 10.112.60.34 with HTTP; Wed, 26 Aug 2015 03:09:10 -0700 (PDT) In-Reply-To: References: Date: Wed, 26 Aug 2015 12:09:10 +0200 Message-ID: Subject: Re: Machine freezes when loading pf ruleset From: Damien Fleuriot To: Kolontai Andrej Cc: "freebsd-pf@freebsd.org" Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.20 X-BeenThere: freebsd-pf@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: "Technical discussion and general questions about packet filter \(pf\)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Aug 2015 10:09:19 -0000 On 25 August 2015 at 17:55, Kolontai Andrej < Andrej.Kolontai@verwaltung.uni-muenchen.de> wrote: > Hello, > > I'm new to this list and I hope it's the right place to ask. > > We have highly utilized installation of two FreeBSD-machines running > 10.1-RELEASE, pf and carp. There are about 50 networks (some via vlan, so= me > ipsec) connected to them, usually about 50000 pf states, about 1500 rules > and traffic sometimes hitting 1 gbit/s or more. > > When we load the ruleset on the active machine it sometimes freezes. It > occurs more often as our installation grows. Sometimes it freezes > completely, which is actually good as the backup machine takes over. But = at > other times it's still some kind of alive but unbearably slow (ping is > about 40 s instead of the usual 0.2 ms), the backup machine does not take > over and people start to get angry. > The crashed machine usually recovers after a couple of minutes. > > We've been trying to resolve this for some months now and are running out > of ideas. We do believe that something is wrong with or setup but have no > clue what it could be. > > When it happens, the syslogs show tons of messages like these: > > Aug 24 09:08:39 applej kernel: pf_map_addr: src tracking maps > 172.23.101.87 to 141.84.149.18 > Aug 24 09:08:39 applej kernel: pf_map_addr: src tracking maps > 172.23.101.87 to 141.84.149.18 > Aug 24 09:08:39 applej kernel: pf_map_addr: src tracking maps > 172.23.107.206 to 141.84.149.18 > Aug 24 09:08:39 applej kernel: pf_map_addr: src tracking maps > 172.23.101.87 to 141.84.149.18 > Aug 24 09:08:39 applej kernel: pf_map_addr: src tracking maps > 172.23.107.206 to 141.84.149.18 > Aug 24 09:08:39 applej kernel: pf_map_addr: src tracking maps > 172.23.101.87 to 141.84.149.18 > Aug 24 09:08:39 applej kernel: pf_map_addr: src tracking maps > 172.23.106.171 to 141.84.149.18 > Aug 24 09:08:39 applej kernel: pf_map_addr: src tracking maps > 172.23.106.171 to 141.84.149.18 > Aug 24 09:08:39 applej kernel: pf_map_addr: src tracking maps > 172.23.106.72 to 141.84.149.18 > Aug 24 09:08:39 applej kernel: pf_map_addr: src tracking maps > 172.23.106.58 to 141.84.149.19 > Aug 24 09:08:39 applej last message repeated 2 times > Aug 24 09:08:39 applej kernel: pf: BAD state: TCP in wire: > 10.200.108.141:49202 172.23.2.92:8192 stack: - [lo=3D3482585550 > high=3D3482585552 win=3D8192 modulator=3D0pf_map_addr: src tracking maps > ]172.23.106.72 [lo=3D0 hi > gh=3D1 win=3D1 modulator=3D0 to ]141.84.149.18 2:0 > Aug 24 09:08:39 applej kernel: S seq=3D3503222902 (3503222902) ack=3D0 le= n=3D0 > ackskew=3D0 pkts=3D1:1 dir=3Din,fwd > Aug 24 09:08:39 applej kernel: pf: State failure on: 1 | 5 > Aug 24 09:08:39 applej kernel: pf_map_addr: src tracking maps > 172.23.106.171 to 141.84.149.18 > Aug 24 09:08:39 applej kernel: pf_map_addr: src tracking maps > 172.23.107.20 to 141.84.149.18 > Aug 24 09:08:39 applej kernel: pf_map_addr: src tracking maps > 172.23.101.87 to 141.84.149.18 > Aug 24 09:08:39 applej kernel: pf_map_addr: src tracking maps pf_map_addr= : > src tracking maps pf_map_addr: src tracking maps > 172.23.106.171172.23.106.171 to 172.23.106.171141.84.149.18 to to > Aug 24 09:08:39 applej kernel: 141.84.149.18141.84.149.18 > Aug 24 09:08:39 applej kernel: > Aug 24 09:08:40 applej kernel: pf_map_addr: src tracking maps pf_map_addr= : > src tracking maps 172.23.106.171172.23.101.87 to > Aug 24 09:08:40 applej kernel: to 141.84.149.18141.84.149.18 > Aug 24 09:08:40 applej kernel: > Aug 24 09:08:40 applej kernel: pf: OK ICMP 3:10 141.84.149.249 -> > 10.153.101.239 state: TCP in wire: 141.84.149.249:37677 10.153.101.239 > Aug 24 09:08:40 applej kernel: :80 stack: - [lo=3D4171141930 high=3D41711= 59337 > win=3D106 modulator=3D0 wscale=3D7] [lo=3D2653133771 high=3D2653147339 wi= n=3D136 > modulator=3D0 wscale=3D7] 10:10 seq=3D2653133771 > Aug 24 09:08:40 applej kernel: pf: OK ICMP 3:10 141.84.149.249 -> > 10.153.101.239 state: TCP out wire: 10.153.101.239:80 141.84.149.249 > Aug 24 09:08:40 applej kernel: :37677 stack: - [lo=3D4171141930 > high=3D4171159337 win=3D106 modulator=3D0 wscale=3D7] [lo=3D2653133771 > high=3D2653147339 win=3D136 modulator=3D0 wscale=3D7] 10:10 seq=3D2653133= 771 > Aug 24 09:08:40 applej kernel: pf_map_addr: src tracking maps pf_map_addr= : > src tracking maps 172.23.106.184pf: wire key attach failed on vlan42: > 172.23.106.165pf_map_addr: src tracking maps > Aug 24 09:08:40 applej kernel: 112 to in141.84.149.18 wire: 172.23.106.5= 8 > Aug 24 09:08:40 applej kernel: 141.84.149.27 > Aug 24 09:08:40 applej kernel: to 224.0.0.18 to 1:0141.84.149.18, > existing: > Aug 24 09:08:40 applej kernel: 112141.84.149.19 in > Aug 24 09:08:40 applej kernel: wire: > Aug 24 09:08:40 applej kernel: 141.84.149.27 > Aug 24 09:08:40 applej kernel: 224.0.0.18 stack: 141.84.149.27 224.0.0.18 > 1:0 > Aug 24 09:08:40 applej kernel: pf: wire key attach failed on vlan42: 112 > in wire: 141.84.149.1 224.0.0.18 1:0, existing: > Aug 24 09:08:40 applej kernel: 112 in wire: 141.84.149.1 224.0.0.18 stack= : > 141.84.149.1 224.0.0.18 1:0 > Aug 24 09:08:40 applej kernel: > Aug 24 09:08:40 applej kernel: pf_map_addr: src tracking maps pf_map_addr= : > src tracking maps pf: wire key attach failed on vlan250: 172.23.100.108pf= : > OK ICMP 3:10 112 > Aug 24 09:08:40 applej kernel: 172.23.106.184 in to wire: to > 10.153.101.73141.84.149.18 224.0.0.18141.84.149.18 1:0141.84.149.198, > existing: > Aug 24 09:08:40 applej kernel: 112 in > Aug 24 09:08:40 applej kernel: pf_map_addr: src tracking maps > Aug 24 09:08:40 applej kernel: > Aug 24 09:08:40 applej kernel: wire: -> 10.153.101.73172.23.106.58 to > 224.0.0.18141.84.149.19 stack: > Aug 24 09:08:40 applej kernel: 10.153.101.73 > Aug 24 09:08:40 applej kernel: 77.7.121.109 state: 224.0.0.18TCP 1:0 > Aug 24 09:08:40 applej kernel: out wire: 141.84.149.198:80 > 77.7.121.109:52521 stack: > Aug 24 09:08:40 applej kernel: - [lo=3D2750816253 high=3D2750829820 win= =3D256 > modulator=3D0 wscale=3D8] [lo=3D2902946358 high=3D2903011894 win=3D106 mo= dulator=3D0 > Aug 24 09:08:40 applej kernel: wscale=3D7] 7:9 seq=3D2750816252 > Aug 24 09:08:40 applej kernel: pf: OK ICMP 3:10 141.84.149.198 -> > 77.7.121.109 state: TCP in wire: 77.7.121.109:52521 > Aug 24 09:08:40 applej kernel: 141.84.149.198:80 stack: - [lo=3D275081625= 3 > high=3D2750829820 win=3D256 modulator=3D0 > Aug 24 09:08:40 applej kernel: wscale=3D8] [lo=3D2902946358 high=3D290301= 1894 > win=3D106 modulator=3D0 wscale=3D7] 7:9 seq=3D2750816252 > .... > Aug 24 09:09:11 applej kernel: 4:7:80:1433 seq=3D1025128833 > Aug 24 09:09:11 applej kernel: 141.84.149.19:47806 stack: > 10.110.32.129-:50143 [lo=3D3790813207 high=3D3790831127 win=3D14600 modul= ator=3D0 > stack: wscale=3D8-] [lo=3D3961092366 high=3D3962143087 win=3D4106 modula= tor=3D0 > [lo=3D155 > 1251521 high=3D1554989121 win=3D17898 modulator=3D0 wscale=3D8 wscale=3D8= ]] > [lo=3D1005669588 high=3D1006720647 win=3D4106 modulator=3D0 4:2 wscale=3D= 8 > seq=3D1551251520 > Aug 24 09:09:11 applej kernel: ] 4:4 seq=3D1005669587 > Aug 24 09:09:11 applej kernel: pf: OK ICMP 3:1 172.23.254.250pf: OK ICMP > 3:1 -> 10.110.32.220172.23.106.87 -> state: 10.110.32.193TCP state: > inTCP wire: out172.23.106.87 wire: :5017310.110.32.193 :1433141.84 > .149.21 :844310.110.32.130 stack: :54454172.23.106.87 stack: :50173- > [lo=3D2704530546 high=3D2705581267 win=3D4106 modulator=3D0141.84.149.18 > wscale=3D8:8443] [lo=3D521335749 high=3D521353924 win=3D256 modulator=3D0= [lo=3D19884644 > 01 high=3D1989515460 win=3D4106 modulator=3D0 wscale=3D8 wscale=3D8]] [lo= =3D3203135938 > high=3D3203200962 win=3D71 modulator=3D0 4:4 wscale=3D8 seq=3D1988464400 > Aug 24 09:09:11 applej kernel: ] 7:4 seq=3D521335748 > Aug 24 09:09:11 applej kernel: pf: OK ICMP 3:1 pf: OK ICMP 3:1 pf: OK ICM= P > 3:1 pf: OK ICMP 3:1 pf: OK ICMP 3:1 pf: OK ICMP 3:1 > 172.23.254.250172.23.254.25010.110.32.220 -> -> 172.23.106.8710.110.32.1= 93 > state: > state: TCPTCP in out wire: wire: 172.23.106.8710.110.32.193:50168:1433 > 141.84.149.2110.110.32.129:8443:50175 stack: stack: 172.23.106.87-:50168 > [lo=3D1504649300 high=3D1505700021 win=3D4106 modulator=3D0 wscale=3D814= 1 > .84.149.18]:8443 [lo=3D723475428 high=3D724526487 win=3D4106 modulator=3D= 0 > [lo=3D975354116 high=3D975374851 win=3D256 modulator=3D0 wscale=3D8 wscal= e=3D8]] 4:4 > [lo=3D4251253613 high=3D4251317869 win=3D81 modulator=3D0 seq=3D723475427 > Aug 24 09:09:11 applej kernel: wscale=3D8] 7:4 seq=3D975354115 > Aug 24 09:09:11 applej kernel: 141.84.44.211 -> > 172.23.254.250217.86.168.236 -> state: 172.23.3.29TCP state: inTCP wire= : > 141.84.44.211217.86.168.236 -> :5098310.180.10.209 state: > 141.84.149.198TCP:80 in stack > : wire: -10.180.10.209 [lo=3D3709359805 high=3D3709359806 win=3D8192 > modulator=3D0 in:49223] [lo=3D0 high=3D8192 win=3D1 modulator=3D0141.84.= 149.244]:443 > 2:0 stack: seq=3D3709359804 > Aug 24 09:09:11 applej kernel: - [lo=3D827930303 high=3D827944894 win=3D6= 7 > modulator=3D0 wire: wscale=3D8172.23.3.29]:35196 [lo=3D1796383793 > high=3D1796400177 win=3D114 modulator=3D0 wscale=3D710.153.101.108]:6556= 10:10 > stack: s > eq=3D827930303 > Aug 24 09:09:11 applej kernel: - [lo=3D1991477221 high=3D1991477222 win= =3D14600 > modulator=3D0 -> ]172.23.101.172 [lo=3D0 high=3D14600 win=3D1 modulator= =3D0 state: > ]TCP 2:0 in seq=3D1991477220 > Aug 24 09:09:11 applej kernel: wire: 172.23.101.172:49613 > 141.84.149.21:8080 stack: 172.23.101.172:49613 141.84.149.19:8080 > [lo=3D3559829126 high=3D3559829127 win=3D8192 modulator=3D0] [lo=3D0 high= =3D8192 win=3D1 > modulator=3D0] > 2:0 seq=3D3559829125 > Aug 24 09:09:11 applej kernel: pf: OK ICMP 3:1 pf: OK ICMP 3:1 > 10.110.32.220141.84.44.211 -> pf_map_addr: src tracking maps -> > 172.23.101.8710.110.32.193 to 115.248.50.20141.84.149.19 state: > Aug 24 09:09:11 applej kernel: state: TCPTCP out in wire: wire: > 10.110.32.193115.248.50.20:1433:50426 10.110.32.130141.84.149.229:54455:= 80 > stack: stack: -- [lo=3D201188016 high=3D202238737 win=3D513 modulator=3D= 0 [lo > =3D2382456191 high=3D2382456193 win=3D2048 modulator=3D0 wscale=3D8]] [lo= =3D0 high=3D1 > win=3D1 modulator=3D0 [lo=3D3935828538 high=3D3935959789 win=3D4106 modul= ator=3D0] > wscale=3D8 2:0] seq=3D2382456191 > Aug 24 09:09:11 applej kernel: 4:4 seq=3D3935828537 > > Sorry if this is too much but I really can' t tell which of these > messages could help. Perhaps there isn't any useful information at all. T= o > me it seems it's just a side effect of the actual problem which I am unab= le > to see. > > Maybe somebody has an idea about what's going wrong and what to > investigate. Or theories right now are: > * issues with pfsync (messages complaining about states, but unlikely, > it's normally working just fine) > * some lack of ressources (yet, which?) > * race condition inside the kernel (there are 24 CPU cores) > > We have not yet been able to reproduce it in a lab environment. > > Right now, we work around this by putting the master into the backup stat= e > (using demotion) before loading the ruleset and changing back to master > state after the reload. This is working fine. But it kills, of course, th= e > IPSec-SAs. But that's something we can live with right now. > > > Viele Gr=C3=BC=C3=9Fe > Andrej Kolontai > > Ludwig-Maximilians-Universitaet Muenchen > Ref. VI.4 (IT-Sicherheit & Verzeichnisdienste) > Martiusstrasse 4 / 207 > 80802 Muenchen > > phone +49 (0)89 2180-3815 > email mailto:andrej.kolontai@verwaltung.uni-muenchen.de > web http://www.uni-muenchen.de/zuv/it/ > > 1.5k rules seems like a lot for PF to handle. Is that 1.5k rules you've written in the conf, or 1.5k rules from `pfctl -sr | wc -l' ? If that's merely your number of configuration lines, that may very well expand to many more rules in reality. For example "pass in quick on { $vlan2 $vlan3 } inet proto { $tcp $udp } from to port 53" will expand into 4 actual rules. I would suggest you find a way to drastically lower that. You may also wish to ensure : - you're using, if at all possible, a *dedicated* pfsync link (like a cable on a dedicated interface between the boxes) - you've got enough RAM to store your PF states As another, *unrelated* performance enhancement, make sure to use the "quick" keyword whenever applicable.