From: "Yong Rao" <yrao@force10networks.com>
To: questions@FreeBSD.org
Date: Mon, 2 Jul 2007 17:59:18 -0700
Subject: SMP options and core dump failure
Message-ID: <9E2742C54E161041A53F36F9A8DC31BE070236@EXCH-CLUSTER-04.force10networks.com>

Hello,

We have a problem with the SMP kernel: it cannot write a core dump when a crash happens.

I was able to isolate the problem to kernel configurations that have SMP enabled and run on 2 CPUs. With one CPU the core dump works fine.

I built a GENERIC kernel and deliberately crashed it (for testing purposes). The core dump worked fine.
I then added only "options SMP" and crashed the kernel again. This time it hung before any pages were dumped.

Has anyone successfully taken a core dump on a system running an SMP kernel with multiple CPUs?

I tried two different boxes (different motherboards, CPUs and hard disks). Both failed.

I tried enabling DDB, but I don't know what to look for once it drops into ddb. I would appreciate any pointers.

a) The CPU information for the first box is:

CPU: Dual Core AMD Opteron(tm) Processor 280 (2405.47-MHz 686-class CPU)
  Origin = "AuthenticAMD"  Id = 0x20f12  Stepping = 2
  Features=0x178bfbff
  Features2=0x1
  AMD Features=0xe2500800
  AMD Features2=0x3
  Cores per package: 2

b) We also tried another motherboard, which has 2 CPUs. Its CPU information is below.

CPU: Intel(R) Xeon(TM) CPU 2.80GHz (2800.11-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0xf29  Stepping = 9
  Features=0xbfebfbff
  Features2=0x4400
real memory  = 2147418112 (2047 MB)
avail memory = 2096300032 (1999 MB)
ACPI APIC Table:
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  6

c) The following was printed when the dump hung:

mem dump: start address = 0x4352, len=0x30

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address   = 0x4352
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc9e9fc92
stack pointer           = 0x28:0xebdbdbdc
frame pointer           = 0x28:0xebdbdbf8
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 74231 (pnicdbg)
trap number             = 12
panic: page fault
cpuid = 1
Uptime: 1d18h27m42s
Dumping 4030 MB (2 chunks)
  chunk 0: 1MB (154 pages) ... ok
  chunk 1: 4031MB (1031776 pages)   (stopped and hung here)

Thanks,

Yong Rao
Force10 Networks Inc.
350 Holger Way
San Jose, CA 95132
408 571 6317