From: Attila Nagy
Date: Fri, 28 Oct 2005 15:50:09 +0200
To: current@FreeBSD.org
Subject: Freezes with 6.0 and 7-CURRENT when working with many symlinks/dirs
Message-ID: <43622C91.5080300@fsn.hu>

Hello,

I've been struggling with this bug for a while now. I have a fully reproducible freeze with both RELENG_6 and HEAD in amd64 mode (I could not try i386).
It strikes when I synchronise a large pool of symlinks and directories from another machine to this FreeBSD one. The total number of files is about 6-10 million. The freeze occurs at random points, either when rsync deletes a massive number of symlinks or directories on the local machine, or when it starts to create them. But it always freezes, no matter what I do.

The machine is an HP DL380 G4 (two Xeons, HTT on) with an additional SmartArray 6402 controller (ciss0 is the SmartArray 6i on the motherboard, ciss1 is the 6402). I am syncing onto ciss1, so that is where the activity happens.

By "freeze" I mean that the machine stops working: it does not answer pings, ssh sessions disconnect and the console hangs. There are two things I can do at this stage: with MP_WATCHDOG turned on, the watchdog catches the hang and enters the debugger, and issuing an NMI has the same effect (of course :).

I've tried the following to work around or locate the source of the problem:

- turn HTT off
- turn softupdates off
- turn ACPI off (with the beastie menu)
- turn preemption off
- debug.mpsafevfs=0 and debug.mpsafenet=0
- turn dirhash off

all without success.

I have nfsd and quota enabled, but the former is currently not in use. The synchronised directories and files are owned by many nonexistent uids (not in /etc/master.passwd), and I have quotas for most of those uids.

I could collect three traces; some of them are a little mangled by the iLO (ssh access to the console).

http://people.fsn.hu/~bra/freebsd/crash-20051028/

crash1 and crash2 are from the in-kernel debugger; crash3 was taken after MP_WATCHDOG fired, using "call doadump" and then kgdb kernel /var/crash/vmcore...

Any ideas what else I should try, or what I should do in the debugger to make it easier to find where the problem is?

Thanks,
-- 
Attila Nagy                                   e-mail: Attila.Nagy@fsn.hu
Adopt a directory on our free software        phone: +3630 306 6758
server! http://www.fsn.hu/?f=brick
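
For anyone who wants to exercise a similar workload without the original data set, here is a minimal sketch that only approximates the create/delete churn described above. It does not use rsync, and whether it triggers the freeze is untested. The function names (build_tree, churn), the directory layout, the link targets and the counts are all assumptions made for illustration. Point it at a scratch filesystem only.

```python
import os
import shutil
import tempfile


def build_tree(root, n_dirs, links_per_dir):
    """Create n_dirs directories under root, each holding links_per_dir symlinks.

    Returns the number of filesystem objects created (directories + symlinks).
    """
    created = 0
    for d in range(n_dirs):
        dpath = os.path.join(root, "d%06d" % d)
        os.mkdir(dpath)
        created += 1
        for n in range(links_per_dir):
            # Dangling targets are fine: only the directory entries matter here.
            os.symlink("/nonexistent/target-%d" % n,
                       os.path.join(dpath, "l%06d" % n))
            created += 1
    return created


def churn(root, rounds, n_dirs, links_per_dir):
    """Repeatedly create the tree and delete it again; return total objects created."""
    total = 0
    for _ in range(rounds):
        total += build_tree(root, n_dirs, links_per_dir)
        # Mass deletion, roughly what rsync does when it removes stale entries.
        for name in os.listdir(root):
            shutil.rmtree(os.path.join(root, name))
    return total


if __name__ == "__main__":
    # Hypothetical small run in a temporary directory; scale the counts up
    # (and use a directory on the filesystem under test) to get near millions.
    scratch = tempfile.mkdtemp(prefix="symlink-churn-")
    try:
        print(churn(scratch, rounds=2, n_dirs=100, links_per_dir=50))
    finally:
        shutil.rmtree(scratch)
```

The files in the report are owned by many uids with quotas; this sketch creates everything as the calling user, so it does not exercise the quota path.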