Date: Thu, 27 Jul 2006 21:07:45 -0400
To: freebsd-stable@freebsd.org
From: Garance A Drosihn
Subject: Weird problems with 'pf' (on both 5.x and 6.x)

It happens that I noticed two odd networking problems recently.
One of them is easily reproducible, and I have it tracked down to
one innocuous-looking line in my /etc/pf.conf. The other is a
problem in a chat server that I run, with a few hundred people on
it, and is much more of a hassle to reproduce. But turning off
'pf' to solve the first problem seems to have also solved the
second problem, so I assume both problems come from the same
culprit.

Now that I have figured out how to reproduce the problem, it seems
so easy to reproduce that I find it odd that no one else has run
into it.
But I also do not see any PRs that seem to describe the problem.
I'd appreciate it if people would try to duplicate the problem on
some other machines.

This problem has been seen on:
    5.x-stable as built on Mon Jul 24
    6.x-stable as built on Mon Jul 17
(as well as several earlier snapshots of both 5.x and 6.x).

I have a FreeBSD box which is the server for a print queue named
'bill', and is running pf. I have other machines which reference
that queue. It seems that machines on the same subnet as the
server box do not exhibit the problem. But for other machines,
if I do 'lpq -Pbill' twice in rapid succession, then the second
one will hang.

After some futzing around, I determined that if my pf.conf has
only the lines:

    # Filtering: the implicit first two rules are
    #pass in all
    #pass out all

then I can do many, many lpq's in a row without any trouble. But
if I restart pf after adding these lines to pf.conf:

    # Allow all outgoing tcp and udp connections and keep state
    pass out quick proto { tcp, udp } all keep state

then I have the problem where the second 'lpq' from a remote host
will hang, if it is done right after the first one.

That's right. I add a rule which just does "quick passing" for
*outbound* connections, and somehow that screws up (blocks?)
*incoming* connections. I have no rules which should block any
packets at all, so my guess is that some packets are getting lost,
delayed, or corrupted somewhere.

-- 
Garance Alistair Drosehn            =   gad@gilead.netel.rpi.edu
Senior Systems Programmer           or  gad@freebsd.org
Rensselaer Polytechnic Institute    or  drosih@rpi.edu
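
P.S. For anyone who wants to try duplicating this, the "two lpq's
in rapid succession" test can be scripted. What follows is only a
rough sketch, not something from my actual setup: the helper name
run_in_succession, the 10-second timeout, and the idea of treating
a timeout as "hung" are all assumptions made for illustration.

```python
import subprocess
import time


def run_in_succession(cmd, count=2, timeout=10.0):
    """Run cmd `count` times back to back.

    Returns a list of (attempt, status, elapsed_seconds) tuples, where
    status is "ok", "exit N", or "hung" (the command exceeded `timeout`
    and was killed). The timeout value is an arbitrary assumption.
    """
    results = []
    for attempt in range(1, count + 1):
        start = time.monotonic()
        try:
            proc = subprocess.run(
                cmd,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout,
            )
            status = "ok" if proc.returncode == 0 else "exit %d" % proc.returncode
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child before raising this.
            status = "hung"
        results.append((attempt, status, time.monotonic() - start))
    return results
```

Run from a host on a different subnet than the print server, a call
such as run_in_succession(["lpq", "-Pbill"]) should report the
second attempt as "hung" if the problem is present, and both as "ok"
when only the implicit pass rules are loaded.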