From: Frode Nordahl
To: freebsd-stable@freebsd.org
Date: Sat, 6 Jan 2007 01:34:02 +0100
Subject: Livelock in 6.2-RC1
Message-Id: <0491A255-404B-4802-851C-43F4691C19E2@nordahl.net>

Hello colleagues,

I am experiencing a rare livelock on four of my backend mail servers running 6.1-STABLE, 6.2-BETA2 and 6.2-RC1. They are running OpenLDAP slapd, Postfix and UW-IMAPD.

The servers can run for months without any problem, but I have nevertheless experienced this problem about 5 times since September/October 2006, on multiple versions and different hardware configurations. The server responds to pings, but all other activity halts. On one occasion, one of the servers recovered from the situation by itself after being unresponsive for 20-30 minutes.

Typical hardware configuration:

CPU   2x Xeon 3.06GHz or 1x Core2Duo 2.00GHz (SMP)
RAM   4 GB
DISK  Intel SRCU42X (amr) or Dell PERC 5/i (mfi)

Kernel config:

include GENERIC

options KDB                 # Enable kernel debugger support.
options BREAK_TO_DEBUGGER
options DDB                 # Support DDB.
options GDB                 # Support remote GDB.
options QUOTA
options SMP

On the last crash I collected the following info from DDB:

db> tr
Tracing pid 11 tid 100005 td 0xc8f90780
kdb_enter(c092f08b) at kdb_enter+0x2b
siointr1(c9120800) at siointr1+0xce
siointr(c9120800) at siointr+0x5e
intr_execute_handlers(c8f864c8,e7b14c94,4,e7b14cd8,c0889503,...) at intr_execute_handlers+0xe1
lapic_handle_intr(3d) at lapic_handle_intr+0x2e
Xapic_isr1() at Xapic_isr1+0x33
--- interrupt, eip = 0xc0b5b0e5, esp = 0xe7b14cd8, ebp = 0xe7b14cd8 ---
acpi_cpu_c1(0,0,e7b14cf8,c8f90780,1,...) at acpi_cpu_c1+0x5
acpi_cpu_idle(e7b14d10,c066a779,c8f8fa78,c066a6e4,e7b14d24,...) at acpi_cpu_idle+0x152
cpu_idle(c8f8fa78,c066a6e4,e7b14d24,c066a465,0,...) at cpu_idle+0x28
idle_proc(0,e7b14d38) at idle_proc+0x95
fork_exit(c066a6e4,0,e7b14d38) at fork_exit+0x71
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xe7b14d6c, ebp = 0 ---

db> show lockedbufs
buf at 0xdd08cbd0
b_flags = 0x20000000
b_error = 0, b_bufsize = 16384, b_bcount = 16384, b_resid = 0
b_bufobj = (0xc937ed80), b_data = 0xdea14000, b_blkno = 14386688
b_npages = 4, pages(OBJ, IDX, PA): (0xc1045210, 0x1b70c0, 0xdbe35000), (0xc1045210, 0x1b70c1, 0xc17d6000), (0xc1045210, 0x1b70c2, 0x582d7000), (0xc1045210, 0x1b70c3, 0x84498000)

I have a crashdump or two available for further investigation.

-- 
Frode Nordahl