From: Daniel Pocock
Date: Mon, 19 Sep 2005 15:02:39 +0100
To:
Subject: FreeBSD, quagga (BGP) and 2950 VLANs

Hi,

I've been told that FreeBSD performs routing computations in linear
time, even with large routing tables (such as from BGP), and that
it is therefore superior to Linux for use as a border router. Is
this so, and are there any specific documents I should review about
the performance of FreeBSD routing?

As I haven't used FreeBSD before (I've been using Debian for about 10
years), I am wanting to make sure my expectations are not unreasonable.

I've discovered that there is the 4.11 release and the 5.4
release. Are there any compelling reasons why I should choose one
of these over the other, for my intended application? The only
application I will be running is quagga.

I'm planning to connect the FreeBSD server to a trunk port on a
Cisco 2950 and put each interconnected IP provider into a separate
VLAN. The documentation I've read so far suggests that FreeBSD is
happy with VLANs - will this arrangement work and will it have any
significant effect on performance?

Director
London Voice and Data Exchange Limited

From: Charles Swiger
Date: Mon, 19 Sep 2005 13:17:22 -0400
To: Daniel Pocock
Subject: Re: FreeBSD, quagga (BGP) and 2950 VLANs

On Sep 19, 2005, at 10:02 AM, Daniel Pocock wrote:
> I've been told that FreeBSD performs routing computations in linear
> time, even with large routing tables (such as from BGP), and that
> it is therefore superior to Linux for use as a border router. Is
> this so, and are there any specific documents I should review about
> the performance of FreeBSD routing?

I believe FreeBSD uses a radix lookup for the routing table which is
O(1); I don't know enough about the implementation in Linux to make
claims about one platform being superior.

> I've discovered that there is the 4.11 release and the 5.4
> release. Are there any compelling reasons why I should choose one
> of these over the other, for my intended application? The only
> application I will be running is quagga.

If you are setting up a new system, you should go with 5.4. 4.11 is
older and thus extremely well-tested by now, and might arguably be a bit
more reliable, but 5.4 has better support for ACPI and recent hardware,
as well as a significantly better SMP implementation.

> I'm planning to connect the FreeBSD server to a trunk port on a
> Cisco 2950 and put each interconnected IP provider into a separate
> VLAN. The documentation I've read so far suggests that FreeBSD is
> happy with VLANs - will this arrangement work and will it have any
> significant effect on performance?

This ought to work fine, but you might want to make sure your NICs
support the VLAN_MTU and VLAN_HWTAGGING options to help offload some of
the work:

bge0: flags=8802 mtu 1500 options=1a

"man 4 vlan" has a more complete discussion, including a list of NICs
which have this kind of hardware support.

--
-Chuck

From: Kurt Jaeger

We use this setup (with 4.x and 5.x as core routers).

We have been doing this since approx. 1996 (in those days with gated and
fbsd 2.x and without the VLAN stuff), so it's very solid from our point
of view.
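
For readers unfamiliar with the radix lookup mentioned in Chuck's reply above, the sketch below shows the general idea with a plain binary trie in Python. It is not FreeBSD's implementation (the kernel's routing table is written in C and compresses paths); the class and method names are made up for illustration. What it demonstrates is that a lookup walks at most 32 bits of an IPv4 address, so the cost is bounded by the address length rather than by the number of routes in the table.

```python
import ipaddress


class RouteTrie:
    """Toy binary trie for IPv4 longest-prefix match (illustrative only)."""

    def __init__(self):
        # Each node is [child_for_bit_0, child_for_bit_1, next_hop_or_None].
        self.root = [None, None, None]

    def insert(self, prefix, next_hop):
        net = ipaddress.ip_network(prefix)
        addr = int(net.network_address)
        node = self.root
        # Walk (and create) one node per prefix bit, most significant first.
        for i in range(net.prefixlen):
            bit = (addr >> (31 - i)) & 1
            if node[bit] is None:
                node[bit] = [None, None, None]
            node = node[bit]
        node[2] = next_hop

    def lookup(self, address):
        addr = int(ipaddress.ip_address(address))
        node = self.root
        best = node[2]  # default route, if one was inserted as 0.0.0.0/0
        # At most 32 steps, however many routes the table holds.
        for i in range(32):
            node = node[(addr >> (31 - i)) & 1]
            if node is None:
                break
            if node[2] is not None:
                best = node[2]  # remember the most specific match so far
        return best


table = RouteTrie()
table.insert("0.0.0.0/0", "upstream-a")
table.insert("10.0.0.0/8", "upstream-b")
table.insert("10.1.0.0/16", "upstream-c")
print(table.lookup("10.1.2.3"))   # most specific match is 10.1.0.0/16
print(table.lookup("192.0.2.1"))  # falls back to the default route
```

The provider names are placeholders. A full BGP table would add many more nodes, but each lookup would still take no more than 32 steps.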

--
MfG/Best regards, Kurt Jaeger

From: Roman Volf
Date: Mon, 19 Sep 2005 14:39:31 -0700
Subject: Re: FreeBSD, quagga (BGP) and 2950 VLANs

Kurt Jaeger wrote:
>Hallo,
>
>>I'm planning to connect the FreeBSD server to a trunk port on a Cisco
>>2950 and put each interconnected IP provider into a separate VLAN. The
>>documentation I've read so far suggests that FreeBSD is happy with VLANs
>>- will this arrangement work and will it have any significant effect on
>>performance?
>
>We use this setup (with 4.x and 5.x as core routers).
>
>We have been doing this since approx. 1996 (in those days with gated and
>fbsd 2.x and without the VLAN stuff), so it's very solid from our point
>of view.

What kind of throughput do you get using FreeBSD and what kind of
hardware? Doing a straight FTP transfer from one server to another
through a Cisco 3640 seems to cap at about 40 Mbits/second so I was
wondering how that compares to an x86 system running FreeBSD.

--
Roman Volf
Keystreams Internet Solutions

From: Kurt Jaeger
Date: Mon, 19 Sep 2005 23:46:18 +0200
To: Roman Volf
Subject: Re: FreeBSD, quagga (BGP) and 2950 VLANs

Hello,

> >>I'm planning to connect the FreeBSD server to a trunk port on a Cisco
> >>2950 and put each interconnected IP provider into a separate VLAN.

> What kind of throughput do you get using FreeBSD and what kind of
> hardware?

Hardware:
CPU: Intel(R) Pentium(R) 4 CPU 3.00GHz (2998.57-MHz 686-class CPU)
real memory = 2146631680 (2096320K bytes)
12 fxp interfaces, 8 vlans on some of those fxp interfaces working as

Throughput: a 100 Mbit/s peak was possible. All the hardware is 100 Mbit/s.
We are currently preparing a gigE testbed to spread the load.

> Doing a straight FTP transfer from one server to another
> through a Cisco 3640 seems to cap at about 40 Mbits/second so I was
> wondering how that compares to an x86 system running FreeBSD.

The tests I made using some barebone hardware etc.
seem to max out around 500 Mbit/s. It wasn't a full-blown BGP setup,
far from it. More seems to be easily possible, but we still need
to test.

--
MfG/Best regards, Kurt Jaeger
GmbH                fon +49 711 90074-23
Ruppmannstr. 27          fax +49 711 90074-33
D-70565 Stuttgart        mob +49 171 3101372

From: Kurt Jaeger

> It wasn't a full-blown BGP setup,
> far from it. More seems easily be possible, but we still need
> to test.

If you use ftp, the limit might be the IO from/to disk, not the cisco
throughput.

Use ttcp to test the transfer limits, not ftp.

--
MfG/Best regards, Kurt Jaeger                                  15 years to go !
GmbH                fon +49 711 90074-23
Ruppmannstr. 27          fax +49 711 90074-33
D-70565 Stuttgart        mob +49 171 3101372

From: Sten Daniel Sørsdal

> It wasn't a full-blown BGP setup,
> far from it. More seems easily be possible, but we still need
> to test.
>

500mbit/sec throughput? that would be 1 gbit/sec input + output?
can the PCI bus do any better than that?

--
Sten Daniel Sørsdal

From: Daniel Pocock
Date: Tue, 20 Sep 2005 00:08:55 +0100
Subject: Re: FreeBSD, quagga (BGP) and 2950 VLANs

Kurt Jaeger wrote:
>Hi!
>
>>>Doing a straight FTP transfer from one server to another
>>>through a Cisco 3640 seems to cap at about 40 Mbits/second so I was
>>>wondering how that compares to an x86 system running FreeBSD.
>>
>>The tests I made using some barebone hardware etc
>>seems to max out around 500 mbit/sec. It wasn't a full-blown BGP setup,
>>far from it. More seems easily be possible, but we still need
>>to test.
>
>If you use ftp, the limit might be the IO from/to disk, not the cisco
>throughput.
>
>Use ttcp to test the transfer limits, not ftp.

Thanks for all the great answers.

I think that packets per second is just as important as megabits per
second when evaluating routing. We are a wholesale VoIP operator, so we
switch many small packets (less than 100 bytes each) - it takes almost
as much CPU power to make routing decisions for a 100 byte packet as for
a 1500 byte FTP packet. You will find much of the Cisco specs talk about
throughput in packets per second (pps, or kpps).

The NIC I am using is the built in Intel i82555 in a DL360 1U server.
According to vlan(4), the fxp driver doesn't do native VLAN, but can do
large MTU for VLAN tagging. This should give enough performance for our
initial requirements (up to 10 Mbit/s) and I'll review it later.

I'm also curious about whether FreeBSD supports polled rather than
interrupt driven behaviour in the NIC driver - that means that the
system won't keep on re-entering an interrupt handler concurrently while
under load (when a DoS attack is in progress).

The server will be set up tomorrow - I'll do some tests and publish my
experiences on my web site, as I know several of my customers want to
duplicate this for their own redundancy plans.

Regards,

Daniel
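
The packets-per-second point in the message above can be made concrete with a little arithmetic. The sketch below is only a back-of-the-envelope calculation in Python: it treats the quoted sizes (100 and 1500 bytes) as the whole packet and ignores Ethernet framing overhead such as the preamble, inter-frame gap and VLAN tag, so real figures will be somewhat lower. The function name is made up for illustration.

```python
def packets_per_second(link_mbit, packet_bytes):
    """Packets per second needed to fill link_mbit Mbit/s with packets of packet_bytes."""
    return link_mbit * 1_000_000 / (packet_bytes * 8)


for mbit in (10, 100):
    small = packets_per_second(mbit, 100)    # VoIP-sized packets
    large = packets_per_second(mbit, 1500)   # full-size FTP packets
    print(f"{mbit} Mbit/s: {small:,.0f} pps at 100 bytes, "
          f"{large:,.0f} pps at 1500 bytes ({small / large:.0f}x)")
```

At the same bit rate the router makes fifteen times as many forwarding decisions for the small packets, which is why a Mbit/s figure measured with FTP says little about a VoIP load.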
--------------------------------------
Director
London Voice and Data Exchange Limited

From: Chuck Swiger
Date: Mon, 19 Sep 2005 19:30:26 -0400
To: Daniel Pocock
Subject: Re: FreeBSD, quagga (BGP) and 2950 VLANs

Daniel Pocock wrote:
[ ... ]
> I'm also curious about whether FreeBSD supports polled rather than
> interrupt driven behaviour in the NIC driver - that means that the
> system won't keep on re-entering an interrupt handler concurrently while
> under load (when a DoS attack is in progress).

Indeed it does, see "man polling". Make sure you increase HZ to at least 1000...
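
As a rough illustration of why HZ matters for polling: the NIC is serviced from the clock tick instead of from its own interrupts, so the tick rate bounds how often packets are picked up. The Python sketch below simply divides an assumed packet rate by HZ. It ignores the burst limits and idle-loop polling described in polling(4), the function names are hypothetical, and the 100,000 pps load is an arbitrary example figure.

```python
def packets_per_tick(pps, hz):
    """Average number of packets that accumulate between two clock ticks."""
    return pps / hz


def tick_interval_ms(hz):
    """Time between clock ticks in milliseconds (tick-driven polling only)."""
    return 1000 / hz


for hz in (100, 1000):
    print(f"HZ={hz}: poll every {tick_interval_ms(hz):g} ms, "
          f"about {packets_per_tick(100_000, hz):g} packets per poll at 100 kpps")
```

A higher HZ means fewer packets waiting per poll and less added latency, which matters for small, delay-sensitive packets such as VoIP.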

--
-Chuck

From: "Fretz Marco"
Date: Wed, 21 Sep 2005 13:10:59 +0200
Subject: AW: HP DL140 with SATA

Hello

Thanks for your help. I had a dl140 g2 here some days ago. I booted from a
5.4-release cd and got no support for this controller. Could you send me
some kernel output? Dmesg?

>> Is this controller
>> supported in FreeBSD 5.4?
>
> It is a normal SATA non-raid controller, I am running a dl140 g2 with a
> geom mirror. Working great with 5.4 and 6.0 and 6.0/amd64
>
> --
> Rasmus Fauske

From: "Fretz Marco"
Date: Wed, 21 Sep 2005 17:06:47 +0200
To: "Rasmus Fauske"
Subject: AW: AW: AW: HP DL140 with SATA

Hello

News: no chance to get the SATA ICH5 to run on 6.0 or 5.4. I only get
ATA_IDENTIFY timeouts.

I tried FreeBSD 4.11 > no problem

Why is this? Does anyone have any idea?

thanks

-----Original Message-----
From: Rasmus Fauske
Sent: Wednesday, 21 September 2005 13:24
To: Fretz Marco
Subject: Re: AW: AW: HP DL140 with SATA

It was working when I tried it, but I only used it for some minutes.

Fretz Marco wrote:
>Hi
>
>Thanks. Ok but 6.0 isn't stable yet. So I need to run 5.4. Are you sure
>you have a DL140 G2 and the SATA controller was working with FreeBSD 5.4?
>
>-----Original Message-----
>From: Rasmus Fauske
>Sent: Wednesday, 21 September 2005 13:12
>To: Fretz Marco
>Subject: Re: AW: HP DL140 with SATA
>Importance: High
>
>>Hello
>>
>>Thanks for your help. I had a dl140 g2 here some days ago. I booted from a
>>5.4-release cd and got no support for this controller. Could you send me
>>some kernel output? Dmesg?
>
>This is the output for 6.0, I don't have it for 5.4 any more as this box is
>in production now
>
>Copyright (c) 1992-2005 The FreeBSD Project.
>Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
>	The Regents of the University of California.

Copyright (c) 1992-2005 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
	The Regents of the University of California. All rights reserved.

>>September 2005 14:14
>>To:
>>Subject: Re: HP DL140 with SATA
>>
>>Fretz Marco wrote:
>>
>>>Hi there
>>>
>>>We are an ISP and looking for some pizza box servers from HP. Do you have
>>>any experience with the HP DL140 with SATA RAID? Is this controller
>>>supported in FreeBSD 5.4?
>>>
>>It is a normal SATA non-raid controller, I am running a dl140 g2 with a
>>geom mirror. Working great with 5.4 and 6.0 and 6.0/amd64
>>
>>--
>>Rasmus Fauske

From: Brian Candler

I can think of a number of options, but before changing what I'm doing at
the moment I'd like to see if anyone has good experiences with any of the
others.

The application: a clustered webserver. The users' CGIs run in a chroot
environment, and these clearly need to be identical (otherwise a CGI
running on one box would behave differently when running on a different
box). Ultimately I'd like to synchronise the host OS on each server too.

Note that this is a single-master, multiple-slave type of filesystem
synchronisation I'm interested in.


1. Keep a master image on an admin box, and rsync it out to the frontends
-------------------------------------------------------------------------

This is what I'm doing at the moment. Install a master image in
/webroot/cgi, add packages there (chroot /webroot/cgi pkg_add ...), and
rsync it. [Actually I'm exporting it using NFS, and the frontends run rsync
locally when required to update their local copies against the NFS master]

Disadvantages:

- rsyncing a couple of gigs of data is not particularly fast, even when only
  a few files have changed

- if a sysadmin (wrongly) changes a file on a front-end instead of on the
  master copy in the admin box, then the change will be lost when the next
  rsync occurs. They might think they've fixed a problem, and then (say) 24
  hours later their change is wiped. However if this is a config file, the
  fact that the old file has been reinstated might not be noticed until the
  daemon is restarted or the box rebooted - maybe months later. This I think
  is the biggest fundamental problem.

- files can be added locally and they will remain indefinitely (unless we
  use rsync --delete which is a bit scary). If this is done then adding a
  new machine into the cluster by rsyncing from the master will not pick up
  these extra files.

So, here are the alternatives I'm considering, and I'd welcome any
additional suggestions too.


2. Run the images directly off NFS
----------------------------------

I've had this running before, even the entire O/S, and it works just fine.
However the NFS server itself then becomes a critical
single-point-of-failure: if it has to be rebooted and is out of service for
2 minutes, then the whole cluster is out of service for that time.

I think this is only feasible if I can build a highly-available NFS server,
which really means a pair of boxes serving the same data. Since the system
image is read-only from the point of view of the frontends, this should be
easy enough:

      frontends            frontends
        | | |                | | |
         NFS   ----------->   NFS
       server 1    sync     server 2

As far as I know, NFS clients don't support the idea of failing over from
one server to another, so I'd have to make a server pair which transparently
fails over.

I could make one NFS server take over the other server's IP address using
carp or vrrp. However, I suspect that the clients might notice. I know that
NFS is 'stateless' in the sense that a server can be rebooted, but for a
client to be redirected from one server to the other, I expect that these
filesystems would have to be *identical*, down to the level of the inode
numbers being the same.

If that's true, then rsync between the two NFS servers won't cut it. I was
thinking of perhaps using geom_mirror plus ggated/ggatec to make a
block-identical read-only mirror image on NFS server 2 - this also has the
advantage that any updates are close to instantaneous.

What worries me here is how NFS server 2, which has the mirrored filesystem
mounted read-only, will take to having the data changed under its nose. Does
it for example keep caches of inodes in memory, and what would happen if
those inodes on disk were to change? I guess I can always just unmount and
remount the filesystem on NFS server 2 after each change.

My other concern is about susceptibility to DoS-type attacks: if one
frontend were to go haywire and start hammering the NFS servers really hard,
it could impact on all the other machines in the cluster.

However, the problems of data synchronisation are solved: any change made on
the NFS server is visible identically to all front-ends, and sysadmins can't
make changes on the front-ends because the NFS export is read-only.


3. Use a network distributed filesystem - CODA? AFS?
----------------------------------------------------

If each frontend were to access the filesystem as a read-only network mount,
but have a local copy to work with in the case of disconnected operation,
then the SPOF of an NFS server would be eliminated.

However, I have no experience with CODA, and although it's been in the tree
since 2002, the READMEs don't inspire confidence:

   "It is mostly working, but hasn't been run long enough to be sure all the
   bugs are sorted out. ... This code is not SMP ready"

Also, a local cache is no good if the data you want during disconnected
operation is not in the cache at that time, which I think means this idea is
not actually a very good one.


4. Mount filesystems read-only
------------------------------

On each front-end I could store /webroot/cgi on a filesystem mounted
read-only to prevent tampering (as long as the sysadmin doesn't remount it
read-write of course). That would work reasonably well, except that being
mounted read-only I couldn't use rsync to update it!

It might also work with geom_mirror and ggated/ggatec, except for the issue
I raised before about changing blocks on a filesystem under the nose of a
client who is actively reading from it.


5. Using a filesystem which really is read-only
-----------------------------------------------

Better tamper-protection could be had by keeping data in a filesystem
structure which doesn't support any updates at all - such as cd9660 or
geom_uzip.

The issue here is how to roll out a new version of the data. I could push
out a new filesystem image into a second partition, but it would then be
necessary to unmount the old filesystem and remount the new on the same
place, and you can't really unmount a filesystem which is in use. So this
would require a reboot.

I was thinking that some symlink trickery might help:

    /webroot/cgi -> /webroot/cgi1
    /webroot/cgi1     # filesystem A mounted here
    /webroot/cgi2     # filesystem B mounted here

It should be possible to unmount /webroot/cgi2, dd in a new image, remount
it, and change the symlink to point to /webroot/cgi2. After a little while,
hopefully all the applications will stop using files in /webroot/cgi1, so
this one can be unmounted and a new one put in its place on the next update.
However this is not guaranteed, especially if there are long-lived processes
using binary images in this partition. You'd still have to stop and restart
all those processes.

If reboots were acceptable, then the filesystem image could also be stored
in ramdisk pulled in via pxeboot. This makes sense especially for geom_uzip
where the data is pre-compressed. However I would still prefer to avoid
frequent reboots if at all possible. Also, whilst a ramdisk might be OK for
the root filesystem, a typical CGI environment (with perl, php, ruby,
python, and loads of libraries) would probably be too large anyway.


6. Journaling filesystem replication
------------------------------------

If the data were stored on a journaling filesystem on the master box, and
the journal logs were distributed out to the slaves, then they would all
have identical filesystem copies and only a minimal amount of data would
need to be pushed out to each machine on each change. (This would be rather
like NetApps and their snap-mirroring system). However I'm not aware of any
journaling filesystem for FreeBSD, let alone whether it would support
filesystem replication in this way.


Well, that's what I've come up with so far. I'd be very interested to hear
if people have any other strategies or suggestions, particularly with
practical experience in a clustered/ISP environment.

Regards,

Brian Candler.
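
The symlink switch described under option 5 can be sketched as follows. This is a minimal illustration in Python, not something taken from the setup described above: the function name is made up, the demo runs in a temporary directory instead of /webroot, and plain directories stand in for the two mounted filesystem images. It relies on rename(2) being atomic on POSIX systems, so a process resolving the link sees either the old target or the new one, never a missing link.

```python
import os
import tempfile


def switch_symlink(link_path, new_target):
    """Atomically repoint link_path at new_target (POSIX rename semantics)."""
    tmp_link = link_path + ".new"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)           # clear leftovers from an interrupted run
    os.symlink(new_target, tmp_link)  # build the new link under a temporary name
    os.replace(tmp_link, link_path)   # rename over the old link in one step


with tempfile.TemporaryDirectory() as root:
    cgi1 = os.path.join(root, "cgi1")  # stands in for filesystem A
    cgi2 = os.path.join(root, "cgi2")  # stands in for filesystem B
    os.mkdir(cgi1)
    os.mkdir(cgi2)
    link = os.path.join(root, "cgi")
    os.symlink(cgi1, link)
    switch_symlink(link, cgi2)
    print(os.readlink(link))  # now points at the cgi2 directory
```

This only covers the switch itself. Processes that already hold files open under the old mount keep using it, which is the limitation noted in option 5.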