From: "Zoltan Frombach"
To: Søren Schmidt, "Poul-Henning Kamp", "Garance A Drosihn"
Cc: freebsd-current@freebsd.org, Robert Watson
Date: Fri, 19 Nov 2004 01:52:27 -0800
Subject: Re: 5.3-RELEASE: WARNING - WRITE_DMA interrupt timout
References: <26249.1100342074@critter.freebsd.dk> <4195E5DB.2070302@DeepCore.dk>

My problem is not related to a SATA controller. I use the onboard UDMA133 controller (pretty rare) with a Maxtor UDMA133 drive. It is a new ABIT motherboard that uses a SiS chipset.
The hard drive is not new, but previously I used it in UDMA100 mode only, with another motherboard. See:

atapci0: port 0x4000-0x400f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 2.5 on pci0
ata0: channel #0 on atapci0
ata1: channel #1 on atapci0
ad0: 78167MB [158816/16/63] at ata0-master UDMA133

Everything works pretty well on this server, except that these WRITE_DMA warning messages worry me. However, I have not been getting many of them lately, and none since I installed Soren's patch a few hours ago.

I also figured out why my system became so unresponsive at times. I host about 150 domains on this server, with email and everything. I use qmail as the MTA, and by default it accepts all email for all hosted domains, even when the mail is addressed to a nonexistent user. It tries to bounce those messages, but only later in the process. IMO, this is a very poor design choice in qmail, an otherwise pretty powerful mail program. I also use qmail-scanner with clamav and spamassassin. The qmail-scanner program and spamassassin are written in Perl. So every single message that qmail accepts goes through qmail-scanner (and therefore through clamav and spamassassin as well), even the ones addressed to nonexistent users...

Some of the hosted domains at times get hit really hard with spam, and around those times the server becomes very unresponsive. That is not surprising, because according to my maillog, from time to time some spammer sends literally hundreds of junk messages to nonexistent users, all within a few seconds. Right then the server slows to a crawl. Last time, I couldn't access any hosted web sites via HTTP or FTP for minutes. It took me about 3 minutes to get in via SSH because of the slowness. Finally I was able to see the reason: all those Perl processes scanning the junk mail... The server became the victim of a DoS attack caused by excessive spam. So I believe that this was the reason for the unresponsiveness.
And it could be the reason why I received those WRITE_DMA warnings at those times! I'm not 100% sure about it, but I think it is possible.

I'm going to apply a patch to qmail in a few days. It makes qmail reject messages sent to nonexistent users immediately, so they won't need to be scanned. This way, I believe, I will greatly reduce the load caused by this flood of junk mail. Then hopefully these WRITE_DMA warning messages will be gone, too... We'll see.

Zoltan

> At 7:33 PM -0800 11/18/04, Zoltan Frombach wrote:
>>For your information, I applied this patch just now to my kernel.
>>Sorry about the delay! I will send an update in a few days once I
>>see if those DMA_WRITE warnings are still happening or not.
>
> For those who may have missed my other message, it looks like all
> of my problems were related to a PCI-based SATA controller which
> was added by the store that built my machine. This card was added
> even though I had selected a motherboard with on-board SATA.
>
> The problem controller was a:
> and it has been causing me enough problems that I couldn't get
> through a buildworld to even try the suggested patch.
>
> I have now switched to the on-board:
> and so far I have not seen any more of these WRITE_DMA messages.
> None. And I have been pounding the disk pretty hard with a
> variety of work for a few hours now. So, now there is no point
> in me adding the patch, because I no longer see the message!
>
> It would still be nice if FreeBSD would react better to whatever
> problems this card causes. I still have this stupid card, and I
> would be happy to mail it off to anyone who might want to debug the
> problems with it. And if we *can't* fix it, then maybe we should
> just remove support for it. I have had to rebuild my freebsd
> partitions several times now due to these problems, and certainly
> that wasn't much fun.
> Although I guess my problems might also be
> partially due to the Western Digital drive I was using, when it is
> used in combination with this card.
>
> --
> Garance Alistair Drosehn = gad@gilead.netel.rpi.edu
> Senior Systems Programmer or gad@freebsd.org
> Rensselaer Polytechnic Institute or drosih@rpi.edu
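
As a rough illustration of the qmail change discussed in the reply at the top of this message (rejecting mail for nonexistent users before it ever reaches the scanner), the following Python sketch compares the two policies by counting how many messages would be scanned during a spam burst. Everything in it is hypothetical: the `process` function, the `Stats` fields, the in-memory `valid_users` set, and the example.com addresses are stand-ins for illustration only and do not reflect the actual code of qmail, qmail-scanner, or the patch mentioned above.

```python
from dataclasses import dataclass


@dataclass
class Stats:
    rejected: int = 0  # refused during the SMTP dialogue, never queued
    scanned: int = 0   # handed to the (expensive) virus/spam scanner
    bounced: int = 0   # accepted first, then bounced as undeliverable


def process(recipients, valid_users, reject_early):
    """Count the work done for a batch of incoming messages.

    recipients   -- one recipient address per incoming message
    valid_users  -- set of lowercase addresses that really exist (assumed lookup)
    reject_early -- True models "reject unknown users immediately",
                    False models "accept everything, bounce later"
    """
    stats = Stats()
    for rcpt in recipients:
        known = rcpt.lower() in valid_users
        if reject_early and not known:
            # Unknown recipient refused up front: no scanner run at all.
            stats.rejected += 1
            continue
        # Every accepted message goes through the scanner.
        stats.scanned += 1
        if not known:
            # Accepted anyway, so it is scanned and only then bounced.
            stats.bounced += 1
    return stats


if __name__ == "__main__":
    valid = {"alice@example.com", "bob@example.com"}
    # One legitimate message plus a burst of mail to made-up addresses.
    burst = ["alice@example.com"] + [f"user{i}@example.com" for i in range(300)]
    print("accept then bounce: ", process(burst, valid, reject_early=False))
    print("reject at SMTP time:", process(burst, valid, reject_early=True))
```

In this toy model the number of scanner runs under early rejection depends only on the mail addressed to real users, which is the effect the patch is hoped to have on the Perl scanning load. Whether that also makes the WRITE_DMA warnings disappear is a separate question that the sketch does not address.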