From owner-freebsd-smp@FreeBSD.ORG Thu Jun 14 02:38:12 2007 Return-Path: X-Original-To: freebsd-smp@freebsd.org Delivered-To: freebsd-smp@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id B726516A41F for ; Thu, 14 Jun 2007 02:38:12 +0000 (UTC) (envelope-from yrao@force10networks.com) Received: from mx.force10networks.com (nat-eqx.force10networks.com [69.25.56.27]) by mx1.freebsd.org (Postfix) with ESMTP id 9A84513C455 for ; Thu, 14 Jun 2007 02:38:12 +0000 (UTC) (envelope-from yrao@force10networks.com) Received: from mx.force10networks.com ([10.11.0.221]) by mx.force10networks.com with Microsoft SMTPSVC(6.0.3790.0); Wed, 13 Jun 2007 19:26:52 -0700 X-MimeOLE: Produced By Microsoft Exchange V6.5 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="----_=_NextPart_001_01C7AE2B.776F9873" Date: Wed, 13 Jun 2007 19:26:51 -0700 Message-ID: <1818EFE74C4A8A4292E05835D378EC66130055@EXCH-CLUSTER-07.force10networks.com> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: FreeBSD-6.2, SMP, coredump -- fatal trap 12 : page fault while in kernel mode, current process: (swi1:net) Thread-Index: AceuK3dIAgXNwS2jRQeRPjW3XMxRAA== From: "Yong Rao" To: X-OriginalArrivalTime: 14 Jun 2007 02:26:52.0043 (UTC) FILETIME=[77D20DB0:01C7AE2B] X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: Jagjit Choudhary Subject: FreeBSD-6.2, SMP, coredump -- fatal trap 12 : page fault while in kernel mode, current process: (swi1:net) X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 14 Jun 2007 02:38:12 -0000 This is a multi-part message in MIME format. 
------_=_NextPart_001_01C7AE2B.776F9873
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

Hello,

Recently, in developing our driver, we found the core dump did not work
on the FBSD OS 6.2. It hung at the beginning of core dump when crash
happened.

We used the same configuration file to build the kernel on FBSD 6.0
(with SMP options). It worked. It dumped out the whole core file
successfully.

What is the problem? How do we solve this issue? Or what option should
we add to get the core dump to work if we want to keep the SMP option in
the kernel configuration?

The following are the prints when the dump hung.

fatal trap 12 : page fault while in kernel mode
cpuid = 1 ; apic id = 00
fault virtual address 0x8
fault code = supervisor read , page not present
instruction pointer = 0x20:0xc059ac77
stack pointer = 0x28:0xe7fc2b30
frame pointer = 0x28:0xe7fc2b4c
code segment = base 0x0 limit 0xfffff , type 0x1b
= DPL0 , pres 1 , def32 1 , gran 1
processor eflags = interrupt enabled , resume , IOPL = 0
current process = 12 (swi1:net)
trap number 12
panic : page fault
cpuid = 0; uptime 3hours 51 minutes 18 seconds
dumping 4030MB in 2 chunks
chunk 0 1MB
chunk 1 4031 MB (stopped and hung here)

Our system:
CPU: Dual Core AMD Opteron(tm) Processor 280 (2405.46-MHz 686-class CPU), 4x 1Gb RAM

FreeBSD localhost.localdomain 6.2-RELEASE FreeBSD 6.2-RELEASE #1: Wed Jun 13 18:28:22 PDT 2007 root@localhost.localdomain:/usr/src/sys/i386/compile/YONGSMPPOLL60 i386

Attached is our kernel build configuration file.
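
A side note on reading the trap output above: a fault virtual address of 0x8 on a supervisor read is what one would typically expect when code dereferences a NULL structure pointer and reads a field located 8 bytes into the structure. The sketch below only illustrates that arithmetic. The structure `PktHdr` and its field names are hypothetical (nothing here is taken from the driver in question), and the pointer field is modeled as a 32-bit integer on the assumption of an i386 kernel.

```python
import ctypes


# Hypothetical structure for illustration only; NOT from the driver
# discussed in this thread. Fields are 4 bytes each, as on i386.
class PktHdr(ctypes.Structure):
    _fields_ = [
        ("next", ctypes.c_uint32),   # offset 0 (32-bit pointer modeled as uint32)
        ("flags", ctypes.c_uint32),  # offset 4
        ("len", ctypes.c_uint32),    # offset 8
    ]


def fault_address(base, field):
    """Address that would be touched when reading `field` through pointer `base`."""
    return base + getattr(PktHdr, field).offset


NULL = 0
# Reading the third 4-byte field through a NULL pointer touches address 0x8.
print(hex(fault_address(NULL, "len")))
```

If this reading applies, looking up the instruction pointer (0xc059ac77) in the debug kernel's symbols would show which structure access was involved; that step is not shown here.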
Thanks, Yong =20 ------_=_NextPart_001_01C7AE2B.776F9873 Content-Type: application/octet-stream; name="YONGSMPPOLL60" Content-Transfer-Encoding: base64 Content-Description: YONGSMPPOLL60 Content-Disposition: attachment; filename="YONGSMPPOLL60" IwojIEdFTkVSSUMgLS0gR2VuZXJpYyBrZXJuZWwgY29uZmlndXJhdGlvbiBmaWxlIGZvciBGcmVl QlNEL2kzODYKIwojIEZvciBtb3JlIGluZm9ybWF0aW9uIG9uIHRoaXMgZmlsZSwgcGxlYXNlIHJl YWQgdGhlIGhhbmRib29rIHNlY3Rpb24gb24KIyBLZXJuZWwgQ29uZmlndXJhdGlvbiBGaWxlczoK IwojICAgIGh0dHA6Ly93d3cuRnJlZUJTRC5vcmcvZG9jL2VuX1VTLklTTzg4NTktMS9ib29rcy9o YW5kYm9vay9rZXJuZWxjb25maWctY29uZmlnLmh0bWwKIwojIFRoZSBoYW5kYm9vayBpcyBhbHNv IGF2YWlsYWJsZSBsb2NhbGx5IGluIC91c3Ivc2hhcmUvZG9jL2hhbmRib29rCiMgaWYgeW91J3Zl IGluc3RhbGxlZCB0aGUgZG9jIGRpc3RyaWJ1dGlvbiwgb3RoZXJ3aXNlIGFsd2F5cyBzZWUgdGhl CiMgRnJlZUJTRCBXb3JsZCBXaWRlIFdlYiBzZXJ2ZXIgKGh0dHA6Ly93d3cuRnJlZUJTRC5vcmcv KSBmb3IgdGhlCiMgbGF0ZXN0IGluZm9ybWF0aW9uLgojCiMgQW4gZXhoYXVzdGl2ZSBsaXN0IG9m IG9wdGlvbnMgYW5kIG1vcmUgZGV0YWlsZWQgZXhwbGFuYXRpb25zIG9mIHRoZQojIGRldmljZSBs aW5lcyBpcyBhbHNvIHByZXNlbnQgaW4gdGhlIC4uLy4uL2NvbmYvTk9URVMgYW5kIE5PVEVTIGZp bGVzLgojIElmIHlvdSBhcmUgaW4gZG91YnQgYXMgdG8gdGhlIHB1cnBvc2Ugb3IgbmVjZXNzaXR5 IG9mIGEgbGluZSwgY2hlY2sgZmlyc3QKIyBpbiBOT1RFUy4KIwojICRGcmVlQlNEOiBzcmMvc3lz L2kzODYvY29uZi9HRU5FUklDLHYgMS40MjkuMi4zLjIuMSAyMDA1LzEwLzI4IDE5OjIyOjQxIGpo YiBFeHAgJAoKbWFjaGluZQkJaTM4NgojY3B1CQlJNDg2X0NQVQojY3B1CQlJNTg2X0NQVQpjcHUJ CUk2ODZfQ1BVCmlkZW50CQlNZXRhbmV0d29ya3MKCiMgVG8gc3RhdGljYWxseSBjb21waWxlIGlu IGRldmljZSB3aXJpbmcgaW5zdGVhZCBvZiAvYm9vdC9kZXZpY2UuaGludHMKI2hpbnRzCQkiR0VO RVJJQy5oaW50cyIJCSMgRGVmYXVsdCBwbGFjZXMgdG8gbG9vayBmb3IgZGV2aWNlcy4KCm1ha2Vv cHRpb25zCURFQlVHPS1nCQkjIEJ1aWxkIGtlcm5lbCB3aXRoIGdkYigxKSBkZWJ1ZyBzeW1ib2xz Cgojb3B0aW9ucyAJU0NIRURfVUxFCQkjIFVMRSBzY2hlZHVsZXIKb3B0aW9ucyAJU01QCQkJIyBT eW1tZXRyaWMgTXVsdGlQcm9jZXNzb3IgS2VybmVsCm9wdGlvbnMJCURFVklDRV9QT0xMSU5HCQkj IERldmljZSBQb2xsaW5nCm9wdGlvbnMJCUhaPTEwMDAJCQkjIFBvbGxpbmcgaW50ZXJ2YWwgCQpv 
cHRpb25zIAlTQ0hFRF80QlNECQkjIDRCU0Qgc2NoZWR1bGVyCm9wdGlvbnMgCVBSRUVNUFRJT04J CSMgRW5hYmxlIGtlcm5lbCB0aHJlYWQgcHJlZW1wdGlvbgpvcHRpb25zIAlJTkVUCQkJIyBJbnRl ck5FVHdvcmtpbmcKb3B0aW9ucyAJSU5FVDYJCQkjIElQdjYgY29tbXVuaWNhdGlvbnMgcHJvdG9j b2xzCm9wdGlvbnMgCUZGUwkJCSMgQmVya2VsZXkgRmFzdCBGaWxlc3lzdGVtCm9wdGlvbnMgCVNP RlRVUERBVEVTCQkjIEVuYWJsZSBGRlMgc29mdCB1cGRhdGVzIHN1cHBvcnQKb3B0aW9ucyAJVUZT X0FDTAkJCSMgU3VwcG9ydCBmb3IgYWNjZXNzIGNvbnRyb2wgbGlzdHMKb3B0aW9ucyAJVUZTX0RJ UkhBU0gJCSMgSW1wcm92ZSBwZXJmb3JtYW5jZSBvbiBiaWcgZGlyZWN0b3JpZXMKb3B0aW9ucyAJ TURfUk9PVAkJCSMgTUQgaXMgYSBwb3RlbnRpYWwgcm9vdCBkZXZpY2UKb3B0aW9ucyAJTkZTQ0xJ RU5UCQkjIE5ldHdvcmsgRmlsZXN5c3RlbSBDbGllbnQKb3B0aW9ucyAJTkZTU0VSVkVSCQkjIE5l dHdvcmsgRmlsZXN5c3RlbSBTZXJ2ZXIKb3B0aW9ucyAJTkZTX1JPT1QJCSMgTkZTIHVzYWJsZSBh cyAvLCByZXF1aXJlcyBORlNDTElFTlQKb3B0aW9ucyAJTVNET1NGUwkJCSMgTVNET1MgRmlsZXN5 c3RlbQpvcHRpb25zIAlDRDk2NjAJCQkjIElTTyA5NjYwIEZpbGVzeXN0ZW0Kb3B0aW9ucyAJUFJP Q0ZTCQkJIyBQcm9jZXNzIGZpbGVzeXN0ZW0gKHJlcXVpcmVzIFBTRVVET0ZTKQpvcHRpb25zIAlQ U0VVRE9GUwkJIyBQc2V1ZG8tZmlsZXN5c3RlbSBmcmFtZXdvcmsKb3B0aW9ucyAJR0VPTV9HUFQJ CSMgR1VJRCBQYXJ0aXRpb24gVGFibGVzLgpvcHRpb25zIAlDT01QQVRfNDMJCSMgQ29tcGF0aWJs ZSB3aXRoIEJTRCA0LjMgW0tFRVAgVEhJUyFdCm9wdGlvbnMgCUNPTVBBVF9GUkVFQlNENAkJIyBD b21wYXRpYmxlIHdpdGggRnJlZUJTRDQKb3B0aW9ucyAJQ09NUEFUX0ZSRUVCU0Q1CQkjIENvbXBh dGlibGUgd2l0aCBGcmVlQlNENQpvcHRpb25zIAlTQ1NJX0RFTEFZPTUwMDAJCSMgRGVsYXkgKGlu IG1zKSBiZWZvcmUgcHJvYmluZyBTQ1NJCm9wdGlvbnMgCUtUUkFDRQkJCSMga3RyYWNlKDEpIHN1 cHBvcnQKb3B0aW9ucyAJU1lTVlNITQkJCSMgU1lTVi1zdHlsZSBzaGFyZWQgbWVtb3J5Cm9wdGlv bnMgCVNZU1ZNU0cJCQkjIFNZU1Ytc3R5bGUgbWVzc2FnZSBxdWV1ZXMKb3B0aW9ucyAJU1lTVlNF TQkJCSMgU1lTVi1zdHlsZSBzZW1hcGhvcmVzCm9wdGlvbnMgCV9LUE9TSVhfUFJJT1JJVFlfU0NI RURVTElORyAjIFBPU0lYIFAxMDAzXzFCIHJlYWwtdGltZSBleHRlbnNpb25zCm9wdGlvbnMgCUtC RF9JTlNUQUxMX0NERVYJIyBpbnN0YWxsIGEgQ0RFViBlbnRyeSBpbiAvZGV2Cm9wdGlvbnMgCUFI Q19SRUdfUFJFVFRZX1BSSU5UCSMgUHJpbnQgcmVnaXN0ZXIgYml0ZmllbGRzIGluIGRlYnVnCgkJ 
CQkJIyBvdXRwdXQuICBBZGRzIH4xMjhrIHRvIGRyaXZlci4Kb3B0aW9ucyAJQUhEX1JFR19QUkVU VFlfUFJJTlQJIyBQcmludCByZWdpc3RlciBiaXRmaWVsZHMgaW4gZGVidWcKCQkJCQkjIG91dHB1 dC4gIEFkZHMgfjIxNWsgdG8gZHJpdmVyLgpvcHRpb25zIAlBREFQVElWRV9HSUFOVAkJIyBHaWFu dCBtdXRleCBpcyBhZGFwdGl2ZS4KCmRldmljZQkJYXBpYwkJCSMgSS9PIEFQSUMKCiMgQnVzIHN1 cHBvcnQuCmRldmljZQkJZWlzYQpkZXZpY2UJCXBjaQoKIyBGbG9wcHkgZHJpdmVzCiNkZXZpY2UJ CWZkYwoKIyBBVEEgYW5kIEFUQVBJIGRldmljZXMKZGV2aWNlCQlhdGEKZGV2aWNlCQlhdGFkaXNr CQkjIEFUQSBkaXNrIGRyaXZlcwojZGV2aWNlCQlhdGFyYWlkCQkjIEFUQSBSQUlEIGRyaXZlcwpk ZXZpY2UJCWF0YXBpY2QJCSMgQVRBUEkgQ0RST00gZHJpdmVzCiNkZXZpY2UJCWF0YXBpZmQJCSMg QVRBUEkgZmxvcHB5IGRyaXZlcwojZGV2aWNlCQlhdGFwaXN0CQkjIEFUQVBJIHRhcGUgZHJpdmVz Cm9wdGlvbnMgCUFUQV9TVEFUSUNfSUQJIyBTdGF0aWMgZGV2aWNlIG51bWJlcmluZwoKIyBTQ1NJ IENvbnRyb2xsZXJzCiNkZXZpY2UJCWFoYgkJIyBFSVNBIEFIQTE3NDIgZmFtaWx5CiNkZXZpY2UJ CWFoYwkJIyBBSEEyOTQwIGFuZCBvbmJvYXJkIEFJQzd4eHggZGV2aWNlcwojZGV2aWNlCQlhaGQJ CSMgQUhBMzkzMjAvMjkzMjAgYW5kIG9uYm9hcmQgQUlDNzl4eCBkZXZpY2VzCiNkZXZpY2UJCWFt ZAkJIyBBTUQgNTNDOTc0IChUZWtyYW0gREMtMzkwKFQpKQojZGV2aWNlCQlpc3AJCSMgUWxvZ2lj IGZhbWlseQojI2RldmljZSAJaXNwZncJCSMgRmlybXdhcmUgZm9yIFFMb2dpYyBIQkFzLSBub3Jt YWxseSBhIG1vZHVsZQojZGV2aWNlCQltcHQJCSMgTFNJLUxvZ2ljIE1QVC1GdXNpb24KIyNkZXZp Y2UJCW5jcgkJIyBOQ1IvU3ltYmlvcyBMb2dpYwojZGV2aWNlCQlzeW0JCSMgTkNSL1N5bWJpb3Mg TG9naWMgKG5ld2VyIGNoaXBzZXRzICsgdGhvc2Ugb2YgYG5jcicpCiNkZXZpY2UJCXRybQkJIyBU ZWtyYW0gREMzOTVVL1VXL0YgREMzMTVVIGFkYXB0ZXJzCgojZGV2aWNlCQlhZHYJCSMgQWR2YW5z eXMgU0NTSSBhZGFwdGVycwojZGV2aWNlCQlhZHcJCSMgQWR2YW5zeXMgd2lkZSBTQ1NJIGFkYXB0 ZXJzCiNkZXZpY2UJCWFoYQkJIyBBZGFwdGVjIDE1NHggU0NTSSBhZGFwdGVycwojZGV2aWNlCQlh aWMJCSMgQWRhcHRlYyAxNVswMTJdeCBTQ1NJIGFkYXB0ZXJzLCBBSUMtNlsyM102MC4KI2Rldmlj ZQkJYnQJCSMgQnVzbG9naWMvTXlsZXggTXVsdGlNYXN0ZXIgU0NTSSBhZGFwdGVycwoKI2Rldmlj ZQkJbmN2CQkjIE5DUiA1M0M1MDAKI2RldmljZQkJbnNwCQkjIFdvcmtiaXQgTmluamEgU0NTSS0z CiNkZXZpY2UJCXN0ZwkJIyBUTUMgMThDMzAvMThDNTAKCiMgU0NTSSBwZXJpcGhlcmFscwojZGV2 
aWNlCQlzY2J1cwkJIyBTQ1NJIGJ1cyAocmVxdWlyZWQgZm9yIFNDU0kpCiNkZXZpY2UJCWNoCQkj IFNDU0kgbWVkaWEgY2hhbmdlcnMKI2RldmljZQkJZGEJCSMgRGlyZWN0IEFjY2VzcyAoZGlza3Mp CiNkZXZpY2UJCXNhCQkjIFNlcXVlbnRpYWwgQWNjZXNzICh0YXBlIGV0YykKI2RldmljZQkJY2QJ CSMgQ0QKI2RldmljZQkJcGFzcwkJIyBQYXNzdGhyb3VnaCBkZXZpY2UgKGRpcmVjdCBTQ1NJIGFj Y2VzcykKI2RldmljZQkJc2VzCQkjIFNDU0kgRW52aXJvbm1lbnRhbCBTZXJ2aWNlcyAoYW5kIFNB Ri1URSkKCiMgUkFJRCBjb250cm9sbGVycyBpbnRlcmZhY2VkIHRvIHRoZSBTQ1NJIHN1YnN5c3Rl bQojZGV2aWNlCQlhbXIJCSMgQU1JIE1lZ2FSQUlECiNkZXZpY2UJCWFyY21zcgkJIyBBcmVjYSBT QVRBIElJIFJBSUQKI2RldmljZQkJYXNyCQkjIERQVCBTbWFydFJBSUQgViwgVkkgYW5kIEFkYXB0 ZWMgU0NTSSBSQUlECiNkZXZpY2UJCWNpc3MJCSMgQ29tcGFxIFNtYXJ0IFJBSUQgNSoKI2Rldmlj ZQkJZHB0CQkjIERQVCBTbWFydGNhY2hlIElJSSwgSVYgLSBTZWUgTk9URVMgZm9yIG9wdGlvbnMK I2RldmljZQkJaHB0bXYJCSMgSGlnaHBvaW50IFJvY2tldFJBSUQgMTgyeAojZGV2aWNlCQlpaXIJ CSMgSW50ZWwgSW50ZWdyYXRlZCBSQUlECiNkZXZpY2UJCWlwcwkJIyBJQk0gKEFkYXB0ZWMpIFNl cnZlUkFJRAojZGV2aWNlCQltbHkJCSMgTXlsZXggQWNjZWxlUkFJRC9lWHRyZW1lUkFJRAojZGV2 aWNlCQl0d2EJCSMgM3dhcmUgOTAwMCBzZXJpZXMgUEFUQS9TQVRBIFJBSUQKCiMgUkFJRCBjb250 cm9sbGVycwojZGV2aWNlCQlhYWMJCSMgQWRhcHRlYyBGU0EgUkFJRAojZGV2aWNlCQlhYWNwCQkj IFNDU0kgcGFzc3Rocm91Z2ggZm9yIGFhYyAocmVxdWlyZXMgQ0FNKQojZGV2aWNlCQlpZGEJCSMg Q29tcGFxIFNtYXJ0IFJBSUQKI2RldmljZQkJbWx4CQkjIE15bGV4IERBQzk2MCBmYW1pbHkKI2Rl dmljZQkJcHN0CQkjIFByb21pc2UgU3VwZXJ0cmFrIFNYNjAwMAojZGV2aWNlCQl0d2UJCSMgM3dh cmUgQVRBIFJBSUQKCiMgYXRrYmRjMCBjb250cm9scyBib3RoIHRoZSBrZXlib2FyZCBhbmQgdGhl IFBTLzIgbW91c2UKZGV2aWNlCQlhdGtiZGMJCSMgQVQga2V5Ym9hcmQgY29udHJvbGxlcgpkZXZp Y2UJCWF0a2JkCQkjIEFUIGtleWJvYXJkCmRldmljZQkJcHNtCQkjIFBTLzIgbW91c2UKCmRldmlj ZQkJdmdhCQkjIFZHQSB2aWRlbyBjYXJkIGRyaXZlcgoKZGV2aWNlCQlzcGxhc2gJCSMgU3BsYXNo IHNjcmVlbiBhbmQgc2NyZWVuIHNhdmVyIHN1cHBvcnQKCiMgc3lzY29ucyBpcyB0aGUgZGVmYXVs dCBjb25zb2xlIGRyaXZlciwgcmVzZW1ibGluZyBhbiBTQ08gY29uc29sZQpkZXZpY2UJCXNjCgoj IEVuYWJsZSB0aGlzIGZvciB0aGUgcGN2dCAoVlQyMjAgY29tcGF0aWJsZSkgY29uc29sZSBkcml2 
ZXIKI2RldmljZQkJdnQKI29wdGlvbnMgCVhTRVJWRVIJCSMgc3VwcG9ydCBmb3IgWCBzZXJ2ZXIg b24gYSB2dCBjb25zb2xlCiNvcHRpb25zIAlGQVRfQ1VSU09SCSMgc3RhcnQgd2l0aCBibG9jayBj dXJzb3IKCmRldmljZQkJYWdwCQkjIHN1cHBvcnQgc2V2ZXJhbCBBR1AgY2hpcHNldHMKCiMgUG93 ZXIgbWFuYWdlbWVudCBzdXBwb3J0IChzZWUgTk9URVMgZm9yIG1vcmUgb3B0aW9ucykKI2Rldmlj ZQkJYXBtCiMgQWRkIHN1c3BlbmQvcmVzdW1lIHN1cHBvcnQgZm9yIHRoZSBpODI1NC4KZGV2aWNl CQlwbXRpbWVyCgojIFBDQ0FSRCAoUENNQ0lBKSBzdXBwb3J0CiMgUENNQ0lBIGFuZCBjYXJkYnVz IGJyaWRnZSBzdXBwb3J0CiNkZXZpY2UJCWNiYgkJIyBjYXJkYnVzICh5ZW50YSkgYnJpZGdlCiNk ZXZpY2UJCXBjY2FyZAkJIyBQQyBDYXJkICgxNi1iaXQpIGJ1cwojZGV2aWNlCQljYXJkYnVzCQkj IENhcmRCdXMgKDMyLWJpdCkgYnVzCgojIFNlcmlhbCAoQ09NKSBwb3J0cwpkZXZpY2UJCXNpbwkJ IyA4MjUwLCAxNls0NV01MCBiYXNlZCBzZXJpYWwgcG9ydHMKCiMgUGFyYWxsZWwgcG9ydAojZGV2 aWNlCQlwcGMKI2RldmljZQkJcHBidXMJCSMgUGFyYWxsZWwgcG9ydCBidXMgKHJlcXVpcmVkKQoj ZGV2aWNlCQlscHQJCSMgUHJpbnRlcgojZGV2aWNlCQlwbGlwCQkjIFRDUC9JUCBvdmVyIHBhcmFs bGVsCiNkZXZpY2UJCXBwaQkJIyBQYXJhbGxlbCBwb3J0IGludGVyZmFjZSBkZXZpY2UKIyNkZXZp Y2UJCXZwbwkJIyBSZXF1aXJlcyBzY2J1cyBhbmQgZGEKCiMgSWYgeW91J3ZlIGdvdCBhICJkdW1i IiBzZXJpYWwgb3IgcGFyYWxsZWwgUENJIGNhcmQgdGhhdCBpcwojIHN1cHBvcnRlZCBieSB0aGUg cHVjKDQpIGdsdWUgZHJpdmVyLCB1bmNvbW1lbnQgdGhlIGZvbGxvd2luZwojIGxpbmUgdG8gZW5h YmxlIGl0IChjb25uZWN0cyB0byB0aGUgc2lvIGFuZC9vciBwcGMgZHJpdmVycyk6CiNkZXZpY2UJ CXB1YwoKIyBQQ0kgRXRoZXJuZXQgTklDcy4KI2RldmljZQkJZGUJCSMgREVDL0ludGVsIERDMjF4 NHggKGBgVHVsaXAnJykKZGV2aWNlCQllbQkJIyBJbnRlbCBQUk8vMTAwMCBhZGFwdGVyIEdpZ2Fi aXQgRXRoZXJuZXQgQ2FyZAojZGV2aWNlCQlpeGdiCQkjIEludGVsIFBSTy8xMEdiRSBFdGhlcm5l dCBDYXJkCiNkZXZpY2UJCXR4cAkJIyAzQ29tIDNjUjk5MCAoYGBUeXBob29uJycpCiNkZXZpY2UJ CXZ4CQkjIDNDb20gM2M1OTAsIDNjNTk1IChgYFZvcnRleCcnKQoKIyBQQ0kgRXRoZXJuZXQgTklD cyB0aGF0IHVzZSB0aGUgY29tbW9uIE1JSSBidXMgY29udHJvbGxlciBjb2RlLgojIE5PVEU6IEJl IHN1cmUgdG8ga2VlcCB0aGUgJ2RldmljZSBtaWlidXMnIGxpbmUgaW4gb3JkZXIgdG8gdXNlIHRo ZXNlIE5JQ3MhCmRldmljZQkJbWlpYnVzCQkjIE1JSSBidXMgc3VwcG9ydAojZGV2aWNlCQliZmUJ 
CSMgQnJvYWRjb20gQkNNNDQweCAxMC8xMDAgRXRoZXJuZXQKZGV2aWNlCQliZ2UJCSMgQnJvYWRj b20gQkNNNTcweHggR2lnYWJpdCBFdGhlcm5ldAojZGV2aWNlCQlkYwkJIyBERUMvSW50ZWwgMjEx NDMgYW5kIHZhcmlvdXMgd29ya2FsaWtlcwpkZXZpY2UJCWZ4cAkJIyBJbnRlbCBFdGhlckV4cHJl c3MgUFJPLzEwMEIgKDgyNTU3LCA4MjU1OCkKI2RldmljZQkJbGdlCQkjIExldmVsIDEgTFhUMTAw MSBnaWdhYml0IEV0aGVybmV0CiNkZXZpY2UJCW5nZQkJIyBOYXRTZW1pIERQODM4MjAgZ2lnYWJp dCBFdGhlcm5ldAojZGV2aWNlCQludmUJCSMgblZpZGlhIG5Gb3JjZSBNQ1Agb24tYm9hcmQgRXRo ZXJuZXQgTmV0d29ya2luZwojZGV2aWNlCQlwY24JCSMgQU1EIEFtNzlDOTd4IFBDSSAxMC8xMDAo cHJlY2VkZW5jZSBvdmVyICdsbmMnKQojZGV2aWNlCQlyZQkJIyBSZWFsVGVrIDgxMzlDKy84MTY5 LzgxNjlTLzgxMTBTCiNkZXZpY2UJCXJsCQkjIFJlYWxUZWsgODEyOS84MTM5CiNkZXZpY2UJCXNm CQkjIEFkYXB0ZWMgQUlDLTY5MTUgKGBgU3RhcmZpcmUnJykKI2RldmljZQkJc2lzCQkjIFNpbGlj b24gSW50ZWdyYXRlZCBTeXN0ZW1zIFNpUyA5MDAvU2lTIDcwMTYKI2RldmljZQkJc2sJCSMgU3lz S29ubmVjdCBTSy05ODR4ICYgU0stOTgyeCBnaWdhYml0IEV0aGVybmV0CiNkZXZpY2UJCXN0ZQkJ IyBTdW5kYW5jZSBTVDIwMSAoRC1MaW5rIERGRS01NTBUWCkKI2RldmljZQkJdGkJCSMgQWx0ZW9u IE5ldHdvcmtzIFRpZ29uIEkvSUkgZ2lnYWJpdCBFdGhlcm5ldAojZGV2aWNlCQl0bAkJIyBUZXhh cyBJbnN0cnVtZW50cyBUaHVuZGVyTEFOCiNkZXZpY2UJCXR4CQkjIFNNQyBFdGhlclBvd2VyIElJ ICg4M2MxNzAgYGBFUElDJycpCiNkZXZpY2UJCXZnZQkJIyBWSUEgVlQ2MTJ4IGdpZ2FiaXQgRXRo ZXJuZXQKI2RldmljZQkJdnIJCSMgVklBIFJoaW5lLCBSaGluZSBJSQojZGV2aWNlCQl3YgkJIyBX aW5ib25kIFc4OUM4NDBGCiNkZXZpY2UJCXhsCQkjIDNDb20gM2M5MHggKGBgQm9vbWVyYW5nJycs IGBgQ3ljbG9uZScnKQoKIyBJU0EgRXRoZXJuZXQgTklDcy4gIHBjY2FyZCBOSUNzIGluY2x1ZGVk LgojZGV2aWNlCQljcwkJIyBDcnlzdGFsIFNlbWljb25kdWN0b3IgQ1M4OXgwIE5JQwojICdkZXZp Y2UgZWQnIHJlcXVpcmVzICdkZXZpY2UgbWlpYnVzJwojZGV2aWNlCQllZAkJIyBORVsxMl0wMDAs IFNNQyBVbHRyYSwgM2M1MDMsIERTODM5MCBjYXJkcwojZGV2aWNlCQlleAkJIyBJbnRlbCBFdGhl ckV4cHJlc3MgUHJvLzEwIGFuZCBQcm8vMTArCiNkZXZpY2UJCWVwCQkjIEV0aGVybGluayBJSUkg YmFzZWQgY2FyZHMKI2RldmljZQkJZmUJCSMgRnVqaXRzdSBNQjg2OTZ4IGJhc2VkIGNhcmRzCiNk ZXZpY2UJCWllCQkjIEV0aGVyRXhwcmVzcyA4LzE2LCAzQzUwNywgU3RhckxBTiAxMCBldGMuCiNk 
ZXZpY2UJCWxuYwkJIyBORTIxMDAsIE5FMzItVkwgTGFuY2UgRXRoZXJuZXQgY2FyZHMKI2Rldmlj ZQkJc24JCSMgU01DJ3MgOTAwMCBzZXJpZXMgb2YgRXRoZXJuZXQgY2hpcHMKI2RldmljZQkJeGUJ CSMgWGlyY29tIHBjY2FyZCBFdGhlcm5ldAoKIyBJU0EgZGV2aWNlcyB0aGF0IHVzZSB0aGUgb2xk IElTQSBzaGltcwojZGV2aWNlCQlsZQoKIyBXaXJlbGVzcyBOSUMgY2FyZHMKI2RldmljZQkJd2xh bgkJIyA4MDIuMTEgc3VwcG9ydAojZGV2aWNlCQlhbgkJIyBBaXJvbmV0IDQ1MDAvNDgwMCA4MDIu MTEgd2lyZWxlc3MgTklDcy4KI2RldmljZQkJYXdpCQkjIEJheVN0YWNrIDY2MCBhbmQgb3RoZXJz CiNkZXZpY2UJCXJhbAkJIyBSYWxpbmsgVGVjaG5vbG9neSBSVDI1MDAgd2lyZWxlc3MgTklDcy4K I2RldmljZQkJd2kJCSMgV2F2ZUxBTi9JbnRlcnNpbC9TeW1ib2wgODAyLjExIHdpcmVsZXNzIE5J Q3MuCiMjZGV2aWNlCQl3bAkJIyBPbGRlciBub24gODAyLjExIFdhdmVsYW4gd2lyZWxlc3MgTklD LgoKIyBQc2V1ZG8gZGV2aWNlcy4KZGV2aWNlCQlsb29wCQkjIE5ldHdvcmsgbG9vcGJhY2sKZGV2 aWNlCQlyYW5kb20JCSMgRW50cm9weSBkZXZpY2UKZGV2aWNlCQlldGhlcgkJIyBFdGhlcm5ldCBz dXBwb3J0CmRldmljZQkJc2wJCSMgS2VybmVsIFNMSVAKZGV2aWNlCQlwcHAJCSMgS2VybmVsIFBQ UApkZXZpY2UJCXR1bgkJIyBQYWNrZXQgdHVubmVsLgpkZXZpY2UJCXB0eQkJIyBQc2V1ZG8tdHR5 cyAodGVsbmV0IGV0YykKZGV2aWNlCQltZAkJIyBNZW1vcnkgImRpc2tzIgpkZXZpY2UJCWdpZgkJ IyBJUHY2IGFuZCBJUHY0IHR1bm5lbGluZwpkZXZpY2UJCWZhaXRoCQkjIElQdjYtdG8tSVB2NCBy ZWxheWluZyAodHJhbnNsYXRpb24pCgojIFRoZSBgYnBmJyBkZXZpY2UgZW5hYmxlcyB0aGUgQmVy a2VsZXkgUGFja2V0IEZpbHRlci4KIyBCZSBhd2FyZSBvZiB0aGUgYWRtaW5pc3RyYXRpdmUgY29u c2VxdWVuY2VzIG9mIGVuYWJsaW5nIHRoaXMhCiMgTm90ZSB0aGF0ICdicGYnIGlzIHJlcXVpcmVk IGZvciBESENQLgpkZXZpY2UJCWJwZgkJIyBCZXJrZWxleSBwYWNrZXQgZmlsdGVyCgojIFVTQiBz dXBwb3J0CmRldmljZQkJdWhjaQkJIyBVSENJIFBDSS0+VVNCIGludGVyZmFjZQpkZXZpY2UJCW9o Y2kJCSMgT0hDSSBQQ0ktPlVTQiBpbnRlcmZhY2UKZGV2aWNlCQllaGNpCQkjIEVIQ0kgUENJLT5V U0IgaW50ZXJmYWNlIChVU0IgMi4wKQpkZXZpY2UJCXVzYgkJIyBVU0IgQnVzIChyZXF1aXJlZCkK I2RldmljZQkJdWRicAkJIyBVU0IgRG91YmxlIEJ1bGsgUGlwZSBkZXZpY2VzCmRldmljZQkJdWdl bgkJIyBHZW5lcmljCmRldmljZQkJdWhpZAkJIyAiSHVtYW4gSW50ZXJmYWNlIERldmljZXMiCmRl dmljZQkJdWtiZAkJIyBLZXlib2FyZApkZXZpY2UJCXVscHQJCSMgUHJpbnRlcgojZGV2aWNlCQl1 
bWFzcwkJIyBEaXNrcy9NYXNzIHN0b3JhZ2UgLSBSZXF1aXJlcyBzY2J1cyBhbmQgZGEKZGV2aWNl CQl1bXMJCSMgTW91c2UKI2RldmljZQkJdXJhbAkJIyBSYWxpbmsgVGVjaG5vbG9neSBSVDI1MDBV U0Igd2lyZWxlc3MgTklDcwojZGV2aWNlCQl1cmlvCQkjIERpYW1vbmQgUmlvIDUwMCBNUDMgcGxh eWVyCiNkZXZpY2UJCXVzY2FubmVyCSMgU2Nhbm5lcnMKIyBVU0IgRXRoZXJuZXQsIHJlcXVpcmVz IG1paWJ1cwojZGV2aWNlCQlhdWUJCSMgQURNdGVrIFVTQiBFdGhlcm5ldAojZGV2aWNlCQlheGUJ CSMgQVNJWCBFbGVjdHJvbmljcyBVU0IgRXRoZXJuZXQKI2RldmljZQkJY2RjZQkJIyBHZW5lcmlj IFVTQiBvdmVyIEV0aGVybmV0CiNkZXZpY2UJCWN1ZQkJIyBDQVRDIFVTQiBFdGhlcm5ldAojZGV2 aWNlCQlrdWUJCSMgS2F3YXNha2kgTFNJIFVTQiBFdGhlcm5ldAojZGV2aWNlCQlydWUJCSMgUmVh bFRlayBSVEw4MTUwIFVTQiBFdGhlcm5ldAoKIyBGaXJlV2lyZSBzdXBwb3J0CiNkZXZpY2UJCWZp cmV3aXJlCSMgRmlyZVdpcmUgYnVzIGNvZGUKI2RldmljZQkJc2JwCQkjIFNDU0kgb3ZlciBGaXJl V2lyZSAoUmVxdWlyZXMgc2NidXMgYW5kIGRhKQojZGV2aWNlCQlmd2UJCSMgRXRoZXJuZXQgb3Zl ciBGaXJlV2lyZSAobm9uLXN0YW5kYXJkISkK ------_=_NextPart_001_01C7AE2B.776F9873-- From owner-freebsd-smp@FreeBSD.ORG Thu Jun 14 02:43:34 2007 Return-Path: X-Original-To: freebsd-smp@freebsd.org Delivered-To: freebsd-smp@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id DCF8D16A400 for ; Thu, 14 Jun 2007 02:43:34 +0000 (UTC) (envelope-from kris@obsecurity.org) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.freebsd.org (Postfix) with ESMTP id C7D1B13C44B for ; Thu, 14 Jun 2007 02:43:34 +0000 (UTC) (envelope-from kris@obsecurity.org) Received: from obsecurity.dyndns.org (elvis.mu.org [192.203.228.196]) by elvis.mu.org (Postfix) with ESMTP id EBD231A3C19; Wed, 13 Jun 2007 19:43:08 -0700 (PDT) Received: from rot13.obsecurity.org (rot13.obsecurity.org [192.168.1.5]) by obsecurity.dyndns.org (Postfix) with ESMTP id 16C87512A6; Wed, 13 Jun 2007 22:43:34 -0400 (EDT) Received: by rot13.obsecurity.org (Postfix, from userid 1001) id D8933C0FF; Wed, 13 Jun 2007 22:43:33 -0400 (EDT) Date: Wed, 13 Jun 2007 22:43:33 -0400 From: Kris Kennaway To: Yong Rao 
Message-ID: <20070614024333.GA70019@rot13.obsecurity.org> References: <1818EFE74C4A8A4292E05835D378EC66130055@EXCH-CLUSTER-07.force10networks.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1818EFE74C4A8A4292E05835D378EC66130055@EXCH-CLUSTER-07.force10networks.com> User-Agent: Mutt/1.4.2.3i Cc: freebsd-smp@freebsd.org, Jagjit Choudhary Subject: Re: FreeBSD-6.2, SMP, coredump -- fatal trap 12 : page fault while in kernel mode, current process: (swi1:net) X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 14 Jun 2007 02:43:34 -0000

On Wed, Jun 13, 2007 at 07:26:51PM -0700, Yong Rao wrote:
> Hello,
>
> Recently, in developing our driver, we found the core dump did not work
> on the FBSD OS 6.2. It hung at the beginning of core dump when crash
> happened.

This can happen when the panic involves the storage subsystem, or if
something else is performing I/O (i.e. holding a relevant lock) at the
time of panic. Online debugging using ddb helps a lot.

Kris From owner-freebsd-smp@FreeBSD.ORG Thu Jun 14 06:21:15 2007 Return-Path: X-Original-To: freebsd-smp@freebsd.org Delivered-To: freebsd-smp@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 4086E16A400 for ; Thu, 14 Jun 2007 06:21:15 +0000 (UTC) (envelope-from yrao@force10networks.com) Received: from mx.force10networks.com (nat-eqx.force10networks.com [69.25.56.27]) by mx1.freebsd.org (Postfix) with ESMTP id 28EFD13C487 for ; Thu, 14 Jun 2007 06:21:14 +0000 (UTC) (envelope-from yrao@force10networks.com) Received: from mx.force10networks.com ([10.11.0.221]) by mx.force10networks.com with Microsoft SMTPSVC(6.0.3790.0); Wed, 13 Jun 2007 23:21:54 -0700 X-MimeOLE: Produced By Microsoft Exchange V6.5 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Date: Wed, 13 Jun 2007 23:21:53 -0700 Message-ID: <1818EFE74C4A8A4292E05835D378EC66130074@EXCH-CLUSTER-07.force10networks.com> In-Reply-To: <20070614024333.GA70019@rot13.obsecurity.org> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: FreeBSD-6.2, SMP, coredump -- fatal trap 12 : page fault while in kernel mode, current process: (swi1:net) Thread-Index: AceuLeYXLnmfiS93Rj6qn0XSIzFnVQAHYnsQ References: <1818EFE74C4A8A4292E05835D378EC66130055@EXCH-CLUSTER-07.force10networks.com> <20070614024333.GA70019@rot13.obsecurity.org> From: "Yong Rao" To: "Kris Kennaway" X-OriginalArrivalTime: 14 Jun 2007 06:21:54.0450 (UTC) FILETIME=[4D829720:01C7AE4C] Cc: freebsd-smp@freebsd.org, Jagjit Choudhary Subject: RE: FreeBSD-6.2, SMP, coredump -- fatal trap 12 : page fault while in kernel mode, current process: (swi1:net) X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 14 Jun 2007 06:21:15 -0000 Hi 
Kris,

Thanks for your reply.

I have 2 exactly the same equipments. The ONLY difference is the OS. One
is running 6.0 FBSD. The other one is 6.2 FBSD.

I built the kernels using the same configuration file (with SMP option).
My problem is that the OS 6.2 does not dump core when crash happens,
while the OS 6.0 does core dump perfectly. Why it is like this?
Something wrong with FBSD 6.2 OS when SMP options is enabled?

Thanks,
Yong

-----Original Message-----
From: Kris Kennaway [mailto:kris@obsecurity.org]
Sent: Wednesday, June 13, 2007 7:44 PM
To: Yong Rao
Cc: freebsd-smp@freebsd.org; Jagjit Choudhary
Subject: Re: FreeBSD-6.2, SMP, coredump -- fatal trap 12 : page fault while in kernel mode, current process: (swi1:net)

On Wed, Jun 13, 2007 at 07:26:51PM -0700, Yong Rao wrote:
> Hello,
>
> Recently, in developing our driver, we found the core dump did not work
> on the FBSD OS 6.2. It hung at the beginning of core dump when crash
> happened.

This can happen when the panic involves the storage subsystem, or if
something else is performing I/O (i.e. holding a relevant lock) at the
time of panic. Online debugging using ddb helps a lot.

Kris From owner-freebsd-smp@FreeBSD.ORG Thu Jun 14 08:48:19 2007 Return-Path: X-Original-To: smp@FreeBSD.org Delivered-To: freebsd-smp@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 8561716A41F; Thu, 14 Jun 2007 08:48:19 +0000 (UTC) (envelope-from kris@obsecurity.org) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.freebsd.org (Postfix) with ESMTP id 6991B13C448; Thu, 14 Jun 2007 08:48:19 +0000 (UTC) (envelope-from kris@obsecurity.org) Received: from obsecurity.dyndns.org (elvis.mu.org [192.203.228.196]) by elvis.mu.org (Postfix) with ESMTP id 5C02A1A4D80; Thu, 14 Jun 2007 01:47:52 -0700 (PDT) Received: from rot13.obsecurity.org (rot13.obsecurity.org [192.168.1.5]) by obsecurity.dyndns.org (Postfix) with ESMTP id 6FD68512A6; Thu, 14 Jun 2007 04:48:18 -0400 (EDT) Received: by rot13.obsecurity.org (Postfix, from userid 1001) id A88D6BE96; Thu, 14 Jun 2007 04:48:17 -0400 (EDT) Date: Thu, 14 Jun 2007 04:48:17 -0400 From: Kris Kennaway To: performance@FreeBSD.org Message-ID: <20070614084817.GA81087@rot13.obsecurity.org> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="ibTvN161/egqYuK8" Content-Disposition: inline User-Agent: Mutt/1.4.2.3i Cc: smp@FreeBSD.org, current@FreeBSD.org Subject: BIND 9.4.1 performance on FreeBSD 6.2 vs. 7.0 X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 14 Jun 2007 08:48:19 -0000 --ibTvN161/egqYuK8 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline I have been benchmarking BIND 9.4.1 recursive query performance on an 8-core opteron, using the resperf utility (dns/dnsperf in ports). 
The query data set was taken from www.freebsd.org's httpd-access.log
with some of the highly aggressive robot IP addresses pruned out (to
avoid huge numbers of repeated queries against a small subset of
addresses, which would skew the results).

Testing was done over a broadcom gigabit ethernet cable connected
back-to-back between two identical machines. named was restarted in
between tests to flush the cache.

resperf is designed to slowly increase the query rate over a period of
60 seconds, up to a maximum query rate, to determine the point at which
the server starts to fall behind on answering queries. To more
accurately measure this point, in each case I tuned the maximum query
rate so that the server fell behind after around 50 seconds of load.

7.0 was used with up-to-date CVS sources and the SCHED_SMP (enhanced
SMP) scheduler, which is not yet committed but for which patches have
been posted by Jeff Roberson. Actually this did not make much
difference compared to ULE on this workload, although I didn't graph
ULE. BIND 9.4.1 from the base system was used for the threaded version,
and the bind94 port with threads disabled for comparison. All debugging
was disabled.

6.2 was used from CVS with libthr and the 4BSD scheduler (ULE 1.0 is
broken in 6.x).

In addition I also tested a previously posted patch from rwatson that
may be found here:

www.watson.org/~robert/freebsd/netperf/20070311-sosend_dgram.diff

The results show several interesting things:

http://obsecurity.dyndns.org/bind-resperf.png

Firstly, 7.0 beats 6.2 across the board, and has about 60% higher peak
performance. BIND does not scale beyond 4 worker threads, but this
appears to be due to high contention on pthread mutexes in userland,
i.e. a BIND design problem rather than a FreeBSD kernel problem. There
is moderate UDP contention that, if it can be optimized, might increase
peak performance but is not likely to improve scaling. For now it
appears that BIND 9.4 does not scale to >4 CPUs.

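
To make the resperf ramp described earlier in this message concrete, here is a minimal sketch of how the point where the server falls behind maps to a query rate. It assumes the offered rate rises linearly from zero to the configured maximum over the 60-second window; the message only says the rate increases slowly, so the linear shape is an assumption. The function name and the 60000 queries/s figure are illustrative and are not measurements from these tests.

```python
def rate_at(t_seconds, max_qps, ramp_seconds=60.0):
    """Offered query rate at time t, assuming a linear ramp from 0 to max_qps."""
    if t_seconds <= 0:
        return 0.0
    if t_seconds >= ramp_seconds:
        return float(max_qps)
    return max_qps * t_seconds / ramp_seconds


# Illustrative numbers only: with the maximum set to 60000 queries/s and
# the server falling behind at about 50 s, the estimated sustainable rate
# is the offered rate at that moment.
print(rate_at(50, 60000))
```

Under that assumption, tuning the maximum so the server falls behind near 50 seconds places the measured point at about five sixths of the configured maximum, late in the ramp.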
FreeBSD 6.2 seems to have at least two major performance bottlenecks,
due to file descriptor locking and poor scaling of the old sx lock
implementation (both have been fixed in 7.0). I actually don't know
what is using the sx locks so heavily in 6.2; there does not appear to
be an analogue on the 7.0 lock profile. There are other optimizations
in 7.0 that are probably responsible for a smaller part of the
difference.

Robert's patch gives a modest boost to 6.2 at light concurrency but is
swamped by the other scaling problems at high load. The graph should
not be interpreted as showing that this patch performs worse at high
load; the variance is so enormous that it is easily consistent with the
CVS data.

It would be interesting to test BIND performance when acting as an
authoritative server, which probably has very different performance
characteristics; the difficulty there is getting access to a suitably
interesting and representative zone file and query data.

Kris

--ibTvN161/egqYuK8
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.3 (FreeBSD)

iD8DBQFGcQDRWry0BWjoQKURAnalAJ98Xy6gN98dAgfaE2/wEcGP1h6aJACfaJWC
G2I23fG2Bt5St7iCAUxl6Kw=
=Jqh4
-----END PGP SIGNATURE-----

--ibTvN161/egqYuK8--

From owner-freebsd-smp@FreeBSD.ORG Thu Jun 14 13:55:48 2007 Return-Path: X-Original-To: smp@freebsd.org Delivered-To: freebsd-smp@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 8F8F816A46B; Thu, 14 Jun 2007 13:55:48 +0000 (UTC) (envelope-from tec@mega.net.br) Received: from msrv.matik.com.br (msrv.matik.com.br [200.152.83.14]) by mx1.freebsd.org (Postfix) with ESMTP id 1710513C480; Thu, 14 Jun 2007 13:55:47 +0000 (UTC) (envelope-from tec@mega.net.br) Received: from ap-h.matik.com.br (ap-h.matik.com.br [200.152.83.36]) by msrv.matik.com.br (8.14.1/8.13.1) with ESMTP id l5ECcRZo088562; Thu, 14 Jun 2007 09:38:28 -0300 (BRT) (envelope-from
tec@mega.net.br) From: NOC Meganet Organization: Prowip Telecom Ltda To: freebsd-performance@freebsd.org Date: Thu, 14 Jun 2007 09:36:55 -0300 User-Agent: KMail/1.9.6 References: <20070614084817.GA81087@rot13.obsecurity.org> In-Reply-To: <20070614084817.GA81087@rot13.obsecurity.org> MIME-Version: 1.0 Content-Disposition: inline Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <200706140936.55916.tec@mega.net.br> X-Virus-Scanned: ClamAV version 0.90.3, clamav-milter version 0.90.3 on msrv.matik.com.br X-Virus-Status: Clean Cc: smp@freebsd.org, performance@freebsd.org, Kris Kennaway Subject: Re: BIND 9.4.1 performance on FreeBSD 6.2 vs. 7.0 X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 14 Jun 2007 13:55:48 -0000

On Thursday 14 June 2007 05:48:17 Kris Kennaway wrote:
> 6.2 was used from CVS with libthr and the 4BSD scheduler (ULE 1.0 is
> broken in 6.x).

just curious what is broken because I use ULE on several servers perfectly. it
seems to me that ULE is even faster on SMP when not having heavy load.
Also "calcru went backwards" issues I do not get with ULE but sporadically on
4BSD scheduler kernels, specially on dualcore cpus.

HM

--
Prowip Telecom Ltda
AS 22706

This message was scanned by the e-mail system and can be considered safe.
Service provided by Datacenter Matik https://datacenter.matik.com.br

From owner-freebsd-smp@FreeBSD.ORG Thu Jun 14 18:09:42 2007 Return-Path: X-Original-To: smp@freebsd.org Delivered-To: freebsd-smp@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id A6E7C16A46B; Thu, 14 Jun 2007 18:09:42 +0000 (UTC) (envelope-from kris@obsecurity.org) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.freebsd.org (Postfix) with ESMTP id 90F1513C465; Thu, 14 Jun 2007 18:09:42 +0000 (UTC) (envelope-from kris@obsecurity.org) Received: from obsecurity.dyndns.org (elvis.mu.org [192.203.228.196]) by elvis.mu.org (Postfix) with ESMTP id EB8EF1A4D84; Thu, 14 Jun 2007 11:09:13 -0700 (PDT) Received: from rot13.obsecurity.org (rot13.obsecurity.org [192.168.1.5]) by obsecurity.dyndns.org (Postfix) with ESMTP id CF4D7513AE; Thu, 14 Jun 2007 14:09:41 -0400 (EDT) Received: by rot13.obsecurity.org (Postfix, from userid 1001) id 7353CBE89; Thu, 14 Jun 2007 14:09:41 -0400 (EDT) Date: Thu, 14 Jun 2007 14:09:41 -0400 From: Kris Kennaway To: NOC Meganet Message-ID: <20070614180941.GA88451@rot13.obsecurity.org> References: <20070614084817.GA81087@rot13.obsecurity.org> <200706140936.55916.tec@mega.net.br> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <200706140936.55916.tec@mega.net.br> User-Agent: Mutt/1.4.2.3i Cc: freebsd-performance@freebsd.org, performance@freebsd.org, smp@freebsd.org, Kris Kennaway Subject: Re: BIND 9.4.1 performance on FreeBSD 6.2 vs.
7.0 X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 14 Jun 2007 18:09:42 -0000 On Thu, Jun 14, 2007 at 09:36:55AM -0300, NOC Meganet wrote: > On Thursday 14 June 2007 05:48:17 Kris Kennaway wrote: > > 6.2 was used from CVS with libthr and the 4BSD scheduler (ULE 1.0 is > > broken in 6.x). > > just curious what is broken because I use ULE on several servers perfectly. it > seems to me that ULE is even faster on SMP when not having heavy load. > Also "calcru went backwards" issues I do not get with ULE but sporadically on > 4BSD scheduler kernels, specially on dualcore cpus. ULE on 6.x is known to have severe performance problems in some workloads, as well as bugs that cause it to crash. Use it at your own peril :) Kris From owner-freebsd-smp@FreeBSD.ORG Thu Jun 14 19:06:04 2007 Return-Path: X-Original-To: freebsd-smp@freebsd.org Delivered-To: freebsd-smp@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id CC95D16A46B for ; Thu, 14 Jun 2007 19:06:04 +0000 (UTC) (envelope-from kris@obsecurity.org) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.freebsd.org (Postfix) with ESMTP id B152E13C46A for ; Thu, 14 Jun 2007 19:06:04 +0000 (UTC) (envelope-from kris@obsecurity.org) Received: from obsecurity.dyndns.org (elvis.mu.org [192.203.228.196]) by elvis.mu.org (Postfix) with ESMTP id E2A881A3C1A; Thu, 14 Jun 2007 12:05:35 -0700 (PDT) Received: from rot13.obsecurity.org (rot13.obsecurity.org [192.168.1.5]) by obsecurity.dyndns.org (Postfix) with ESMTP id 03BAA5129D; Thu, 14 Jun 2007 15:06:03 -0400 (EDT) Received: by rot13.obsecurity.org (Postfix, from userid 1001) id 94980BEC4; Thu, 14 Jun 2007 15:06:03 -0400 (EDT) Date: Thu, 14 Jun 2007 15:06:03 -0400 From: Kris Kennaway To: Yong Rao Message-ID: 
<20070614190603.GA89528@rot13.obsecurity.org> References: <1818EFE74C4A8A4292E05835D378EC66130055@EXCH-CLUSTER-07.force10networks.com> <20070614024333.GA70019@rot13.obsecurity.org> <1818EFE74C4A8A4292E05835D378EC66130074@EXCH-CLUSTER-07.force10networks.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="NzB8fVQJ5HfG6fxh" Content-Disposition: inline In-Reply-To: <1818EFE74C4A8A4292E05835D378EC66130074@EXCH-CLUSTER-07.force10networks.com> User-Agent: Mutt/1.4.2.3i Cc: Jagjit Choudhary , freebsd-smp@freebsd.org, Kris Kennaway Subject: Re: FreeBSD-6.2, SMP, coredump -- fatal trap 12 : page fault while in kernel mode, current process: (swi1:net) X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 14 Jun 2007 19:06:04 -0000 --NzB8fVQJ5HfG6fxh Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Wed, Jun 13, 2007 at 11:21:53PM -0700, Yong Rao wrote: > Hi Kris, > > Thanks for your reply. > > I have 2 exactly the same equipments. The ONLY difference is the OS. One > is running 6.0 FBSD. The other one is 6.2 FBSD. > > I built the kernels using the same configuration file (with SMP option). > My problem is that the OS 6.2 does not dump core when crash happens, > while the OS 6.0 does core dump perfectly. Why it is like this? > Something wrong with FBSD 6.2 OS when SMP options is enabled? Impossible to rule it out of course, but this sounds unlikely. Many of us developers perform dumps on SMP 6.2 systems regularly. There were *many* changes between 6.0 and 6.2, and perhaps it is as simple as your workload is exercising a slightly different code path or timing. 
Using DDB you will be able to determine what the other threads are doing at the time of crash and this should help to figure out why dumping is failing. Kris --NzB8fVQJ5HfG6fxh Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.3 (FreeBSD) iD8DBQFGcZGbWry0BWjoQKURAgNpAJ9ZBSoKSjyjLGS2LkRSgRWzHYeB6gCfddSH omoX5/J1c149Q/O7EJRzMho= =qioB -----END PGP SIGNATURE----- --NzB8fVQJ5HfG6fxh-- From owner-freebsd-smp@FreeBSD.ORG Thu Jun 14 23:37:04 2007 Return-Path: X-Original-To: freebsd-smp@freebsd.org Delivered-To: freebsd-smp@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id AAC8216A400 for ; Thu, 14 Jun 2007 23:37:04 +0000 (UTC) (envelope-from yrao@force10networks.com) Received: from mx.force10networks.com (nat-eqx.force10networks.com [69.25.56.27]) by mx1.freebsd.org (Postfix) with ESMTP id 92B7C13C45D for ; Thu, 14 Jun 2007 23:37:04 +0000 (UTC) (envelope-from yrao@force10networks.com) Received: from mx.force10networks.com ([10.11.0.221]) by mx.force10networks.com with Microsoft SMTPSVC(6.0.3790.0); Thu, 14 Jun 2007 16:37:44 -0700 X-MimeOLE: Produced By Microsoft Exchange V6.5 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Date: Thu, 14 Jun 2007 16:37:43 -0700 Message-ID: <1818EFE74C4A8A4292E05835D378EC66130205@EXCH-CLUSTER-07.force10networks.com> In-Reply-To: <20070614190603.GA89528@rot13.obsecurity.org> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: FreeBSD-6.2, SMP, coredump -- fatal trap 12 : page fault while in kernel mode, current process: (swi1:net) Thread-Index: AceutydeAlpDdK1bRxGRUR+iQqSZ0gAJOzbw References: <1818EFE74C4A8A4292E05835D378EC66130055@EXCH-CLUSTER-07.force10networks.com> <20070614024333.GA70019@rot13.obsecurity.org> <1818EFE74C4A8A4292E05835D378EC66130074@EXCH-CLUSTER-07.force10networks.com> 
<20070614190603.GA89528@rot13.obsecurity.org> From: "Yong Rao" To: "Kris Kennaway" X-OriginalArrivalTime: 14 Jun 2007 23:37:44.0810 (UTC) FILETIME=[0202ECA0:01C7AEDD] Cc: freebsd-smp@freebsd.org, Jagjit Choudhary Subject: RE: FreeBSD-6.2, SMP, coredump -- fatal trap 12 : page fault while in kernel mode, current process: (swi1:net) X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 14 Jun 2007 23:37:04 -0000 Hi Kris, We are wondering if there is any connection between the core dump failure and the options DEVICE_POLLING? In our kernel, we have options DEVICE_POLLING, and in our driver (Ethernet) we have the ether_poll_register() function (not registered yet). Do you know if there is any problem with this? (Using polling mode and SMP simultaneously) Thanks, Yong -----Original Message----- From: Kris Kennaway [mailto:kris@obsecurity.org] Sent: Thursday, June 14, 2007 12:06 PM To: Yong Rao Cc: Kris Kennaway; freebsd-smp@freebsd.org; Jagjit Choudhary Subject: Re: FreeBSD-6.2, SMP, coredump -- fatal trap 12 : page fault while in kernel mode, current process: (swi1:net) On Wed, Jun 13, 2007 at 11:21:53PM -0700, Yong Rao wrote: > Hi Kris, > > Thanks for your reply. > > I have 2 exactly the same equipments. The ONLY difference is the OS. One > is running 6.0 FBSD. The other one is 6.2 FBSD. > > I built the kernels using the same configuration file (with SMP option). > My problem is that the OS 6.2 does not dump core when crash happens, > while the OS 6.0 does core dump perfectly. Why it is like this? > Something wrong with FBSD 6.2 OS when SMP options is enabled? Impossible to rule it out of course, but this sounds unlikely. Many of us developers perform dumps on SMP 6.2 systems regularly. 
There were *many* changes between 6.0 and 6.2, and perhaps it is as simple as your workload is exercising a slightly different code path or timing. Using DDB you will be able to determine what the other threads are doing at the time of crash and this should help to figure out why dumping is failing. Kris From owner-freebsd-smp@FreeBSD.ORG Fri Jun 15 00:03:21 2007 Return-Path: X-Original-To: smp@FreeBSD.org Delivered-To: freebsd-smp@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id A66BD16A479; Fri, 15 Jun 2007 00:03:21 +0000 (UTC) (envelope-from kris@obsecurity.org) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.freebsd.org (Postfix) with ESMTP id 8EF3A13C4AE; Fri, 15 Jun 2007 00:03:21 +0000 (UTC) (envelope-from kris@obsecurity.org) Received: from obsecurity.dyndns.org (elvis.mu.org [192.203.228.196]) by elvis.mu.org (Postfix) with ESMTP id BFDD11A3C1A; Thu, 14 Jun 2007 17:02:51 -0700 (PDT) Received: from rot13.obsecurity.org (rot13.obsecurity.org [192.168.1.5]) by obsecurity.dyndns.org (Postfix) with ESMTP id 92327513AE; Thu, 14 Jun 2007 20:03:20 -0400 (EDT) Received: by rot13.obsecurity.org (Postfix, from userid 1001) id 36814BEC4; Thu, 14 Jun 2007 20:03:20 -0400 (EDT) Date: Thu, 14 Jun 2007 20:03:20 -0400 From: Kris Kennaway To: Chuck Swiger Message-ID: <20070615000320.GA94458@rot13.obsecurity.org> References: <20070614084817.GA81087@rot13.obsecurity.org> <449EAA15-A4BC-4AAE-B3ED-B65E7A079877@mac.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="bp/iNruPH9dso1Pn" Content-Disposition: inline In-Reply-To: <449EAA15-A4BC-4AAE-B3ED-B65E7A079877@mac.com> User-Agent: Mutt/1.4.2.3i Cc: smp@FreeBSD.org, performance@FreeBSD.org, current@FreeBSD.org, Kris Kennaway Subject: Re: BIND 9.4.1 performance on FreeBSD 6.2 vs. 
7.0 X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 15 Jun 2007 00:03:21 -0000 --bp/iNruPH9dso1Pn Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Jun 14, 2007 at 04:53:01PM -0700, Chuck Swiger wrote: > Hi, Kris-- > > This was interesting, thanks for putting together the testing and > graphs. > > On Jun 14, 2007, at 1:48 AM, Kris Kennaway wrote: > >I have been benchmarking BIND 9.4.1 recursive query performance on an > >8-core opteron, using the resperf utility (dns/dnsperf in ports). The > >query data set was taken from www.freebsd.org's httpd-access.log with > >some of the highly aggressive robot IP addresses pruned out (to avoid > >huge numbers of repeated queries against a small subset of addresses, > >which would skew the results). > > It's at least arguable that doing queries against a data set > including a bunch of repeats is "skewed" in a more realistic > fashion. :-) A quick look at some of the data sources I have handy > such as http access logs or Squid proxy logs suggests that (for > example) out of a database of 17+ million requests, there were only > 46000 unique IPs involved. There were still lots of repeats, just some of them were repeated hundreds of thousands of times - I stripped about a dozen of those (googlebots, I'm looking at you ;-), leaving a distribution that was less biased to the top end. > You might find it interesting to compare doing queries against your > raw and filtered datasets, just to see what kind of difference you > get, if any. Cached queries perform much better, as you might expect. As an estimate I was getting query rates exceeding 120000 qps when serving entirely out of cache, and I don't think I reached the upper bound yet. 
> >Testing was done over a broadcom gigabit ethernet cable connected > >back-to-back between two identical machines. named was restarted in > >between tests to flush the cache. > > What was the external network connectivity in terms of speed? The > docs suggest you need something like a 16MBs up/8 Mbs down > connectivity in order to get up to 50K requests/sec.... I wasn't seeing anything close to this, so I guess it depends how much data is being returned by the queries (I was doing PTR lookups). I forget the exact numbers but it wasn't exceeding about 10Mbit in both directions, which should have been well within link capacity. Also the lock profiling data bears out the interpretation that it was BIND that was becoming saturated and not the hardware. > [ ... ] > >It would be interesting to test BIND performance when acting as an > >authoritative server, which probably has very different performance > >characteristics; the difficulty there is getting access to a suitably > >interesting and representative zone file and query data. > > I suppose you could also set up a test nameserver which claims to be > authoritative for all of in-addr.arpa, and set up a bunch (65K?) /16 > reverse zone files, and then test against real unmodified IPs, but it > would be easier to do something like this: > > Set up a nameserver which is authoritative for 1.10.in-addr.arpa (ie, > the reverse zone for 10.1/16), and use a zonefile with the $GENERATE > directive to populate your PTR records: > > $TTL 86400 > $origin 1.10.in-addr.arpa. > > @ IN SOA localhost. hostmaster.localhost. ( > 1 ; serial (YYYYMMDD##) > 3h ; Refresh 3 hours > 1h ; Retry 1 hour > 30d ; Expire 30 days > 1d ) ; Minimum 24 hours > > @ NS localhost. > > $GENERATE 0-255 $.0 PTR ip-10-1-0-$.example.com. > $GENERATE 0-255 $.1 PTR ip-10-1-1-$.example.org. > $GENERATE 0-255 $.2 PTR ip-10-1-2-$.example.net. > ; ...etc... 
> > ...and then feed it a query database consisting of PTR lookups. If > you wanted to, you could take your existing IP database, and glue the > last two octets of the real IPs onto 10.1 to produce a reasonable > assortment of IPs to perform a reverse lookup upon. I could construct something like this but I'd prefer a more "realistic" workload (i.e. an uneven distribution of queries against different subsets of the data). I don't have a good idea what "realistic" means here, which makes it hard to construct one from scratch. Fortunately I have an offer from someone for access to a real large zone file and a large sample of queries. Kris --bp/iNruPH9dso1Pn Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.3 (FreeBSD) iD8DBQFGcddHWry0BWjoQKURAm5SAJ0WNKEKSmWAeDvbLVZDsYGGtyT9QQCgt/Rl imFuDyK59RuNiN+tPJ4C8/Q= =c16C -----END PGP SIGNATURE----- --bp/iNruPH9dso1Pn-- From owner-freebsd-smp@FreeBSD.ORG Fri Jun 15 00:08:06 2007 Return-Path: X-Original-To: smp@FreeBSD.org Delivered-To: freebsd-smp@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id A11DD16A473; Fri, 15 Jun 2007 00:08:06 +0000 (UTC) (envelope-from cswiger@mac.com) Received: from mail-out3.apple.com (mail-out3.apple.com [17.254.13.22]) by mx1.freebsd.org (Postfix) with ESMTP id 87A1213C468; Fri, 15 Jun 2007 00:08:06 +0000 (UTC) (envelope-from cswiger@mac.com) Received: from relay8.apple.com (relay8.apple.com [17.128.113.38]) by mail-out3.apple.com (Postfix) with ESMTP id 8CBCB8DDBEC; Thu, 14 Jun 2007 16:51:52 -0700 (PDT) Received: from relay8.apple.com (unknown [127.0.0.1]) by relay8.apple.com (Symantec Mail Security) with ESMTP id 3E8DD4008B; Thu, 14 Jun 2007 16:53:02 -0700 (PDT) X-AuditID: 11807126-a1339bb000002ff2-ff-4671d4deddef Received: from [17.214.13.96] (cswiger1.apple.com [17.214.13.96]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client 
certificate requested) by relay8.apple.com (Apple SCV relay) with ESMTP id 1571E4005A; Thu, 14 Jun 2007 16:53:02 -0700 (PDT) In-Reply-To: <20070614084817.GA81087@rot13.obsecurity.org> References: <20070614084817.GA81087@rot13.obsecurity.org> Mime-Version: 1.0 (Apple Message framework v752.2) Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed Message-Id: <449EAA15-A4BC-4AAE-B3ED-B65E7A079877@mac.com> Content-Transfer-Encoding: 7bit From: Chuck Swiger Date: Thu, 14 Jun 2007 16:53:01 -0700 To: Kris Kennaway X-Mailer: Apple Mail (2.752.2) X-Brightmail-Tracker: AAAAAA== Cc: smp@FreeBSD.org, performance@FreeBSD.org, current@FreeBSD.org Subject: Re: BIND 9.4.1 performance on FreeBSD 6.2 vs. 7.0 X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 15 Jun 2007 00:08:06 -0000 Hi, Kris-- This was interesting, thanks for putting together the testing and graphs. On Jun 14, 2007, at 1:48 AM, Kris Kennaway wrote: > I have been benchmarking BIND 9.4.1 recursive query performance on an > 8-core opteron, using the resperf utility (dns/dnsperf in ports). The > query data set was taken from www.freebsd.org's httpd-access.log with > some of the highly aggressive robot IP addresses pruned out (to avoid > huge numbers of repeated queries against a small subset of addresses, > which would skew the results). It's at least arguable that doing queries against a data set including a bunch of repeats is "skewed" in a more realistic fashion. :-) A quick look at some of the data sources I have handy such as http access logs or Squid proxy logs suggests that (for example) out of a database of 17+ million requests, there were only 46000 unique IPs involved. You might find it interesting to compare doing queries against your raw and filtered datasets, just to see what kind of difference you get, if any. 
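One way to prepare the raw and filtered datasets for such a comparison is sketched below. It is only an illustration under stated assumptions: the input is taken to be a list of client IP strings already extracted from a log, and the function name and the top_n cutoff are hypothetical, not something specified in the thread.

```python
from collections import Counter


def prune_heavy_hitters(ips, top_n=12):
    """Return (counts, filtered) for a list of client IP strings.

    counts   -- Counter of how often each address appears in the raw data
    filtered -- the raw list with the top_n most frequent addresses removed
    top_n is a hypothetical cutoff chosen for illustration.
    """
    counts = Counter(ips)
    # Addresses that dominate the log (for example, aggressive crawlers).
    heavy = {ip for ip, _ in counts.most_common(top_n)}
    filtered = [ip for ip in ips if ip not in heavy]
    return counts, filtered


if __name__ == "__main__":
    raw = ["192.0.2.1"] * 5 + ["192.0.2.2"] * 2 + ["192.0.2.3"]
    counts, filtered = prune_heavy_hitters(raw, top_n=1)
    print(len(raw), "requests,", len(counts), "unique IPs,", len(filtered), "after pruning")
```

Feeding both the raw and the filtered list through the same benchmark run would show how much the repeated addresses change the measured rate.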
> Testing was done over a broadcom gigabit ethernet cable connected > back-to-back between two identical machines. named was restarted in > between tests to flush the cache. What was the external network connectivity in terms of speed? The docs suggest you need something like a 16MBs up/8 Mbs down connectivity in order to get up to 50K requests/sec.... [ ... ] > It would be interesting to test BIND performance when acting as an > authoritative server, which probably has very different performance > characteristics; the difficulty there is getting access to a suitably > interesting and representative zone file and query data. I suppose you could also set up a test nameserver which claims to be authoritative for all of in-addr.arpa, and set up a bunch (65K?) /16 reverse zone files, and then test against real unmodified IPs, but it would be easier to do something like this: Set up a nameserver which is authoritative for 1.10.in-addr.arpa (ie, the reverse zone for 10.1/16), and use a zonefile with the $GENERATE directive to populate your PTR records: $TTL 86400 $origin 1.10.in-addr.arpa. @ IN SOA localhost. hostmaster.localhost. ( 1 ; serial (YYYYMMDD##) 3h ; Refresh 3 hours 1h ; Retry 1 hour 30d ; Expire 30 days 1d ) ; Minimum 24 hours @ NS localhost. $GENERATE 0-255 $.0 PTR ip-10-1-0-$.example.com. $GENERATE 0-255 $.1 PTR ip-10-1-1-$.example.org. $GENERATE 0-255 $.2 PTR ip-10-1-2-$.example.net. ; ...etc... ...and then feed it a query database consisting of PTR lookups. If you wanted to, you could take your existing IP database, and glue the last two octets of the real IPs onto 10.1 to produce a reasonable assortment of IPs to perform a reverse lookup upon. 
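The remapping described above can be sketched as a short script. This is a minimal illustration, not part of the original suggestion: the function names are hypothetical, and the one-query-per-line "name type" output format is an assumption about what the benchmarking tool expects as input, so check the tool's documentation before relying on it.

```python
import ipaddress


def to_test_ptr_name(real_ip):
    """Glue the last two octets of a real IPv4 address onto 10.1 and
    return the matching reverse-lookup name under 1.10.in-addr.arpa."""
    octets = ipaddress.IPv4Address(real_ip).packed  # raises ValueError on bad input
    third, fourth = octets[2], octets[3]
    # 10.1.<third>.<fourth> reverses to <fourth>.<third>.1.10.in-addr.arpa
    return f"{fourth}.{third}.1.10.in-addr.arpa"


def to_query_lines(real_ips):
    """Yield one '<name> PTR' line per input address (assumed query file format)."""
    for ip in real_ips:
        yield f"{to_test_ptr_name(ip)} PTR"


if __name__ == "__main__":
    for line in to_query_lines(["192.0.2.17", "198.51.100.200"]):
        print(line)
```

Because only the last two octets are kept, the relative frequency of addresses in the original log carries over to the generated query list, which stays inside the 10.1/16 reverse zone populated by the $GENERATE lines.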
-- -Chuck From owner-freebsd-smp@FreeBSD.ORG Fri Jun 15 00:26:06 2007 Return-Path: X-Original-To: smp@FreeBSD.org Delivered-To: freebsd-smp@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 8A2FC16A400; Fri, 15 Jun 2007 00:26:06 +0000 (UTC) (envelope-from cswiger@mac.com) Received: from mail-out3.apple.com (mail-out3.apple.com [17.254.13.22]) by mx1.freebsd.org (Postfix) with ESMTP id 6FA5813C45A; Fri, 15 Jun 2007 00:26:06 +0000 (UTC) (envelope-from cswiger@mac.com) Received: from relay7.apple.com (relay7.apple.com [17.128.113.37]) by mail-out3.apple.com (Postfix) with ESMTP id 864088DE567; Thu, 14 Jun 2007 17:24:56 -0700 (PDT) Received: from relay7.apple.com (unknown [127.0.0.1]) by relay7.apple.com (Symantec Mail Security) with ESMTP id 4323630076; Thu, 14 Jun 2007 17:26:06 -0700 (PDT) X-AuditID: 11807125-9ff64bb000000801-ce-4671dc9e5429 Received: from [17.214.13.96] (cswiger1.apple.com [17.214.13.96]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by relay7.apple.com (Apple SCV relay) with ESMTP id 1818530041; Thu, 14 Jun 2007 17:26:06 -0700 (PDT) In-Reply-To: <20070615000320.GA94458@rot13.obsecurity.org> References: <20070614084817.GA81087@rot13.obsecurity.org> <449EAA15-A4BC-4AAE-B3ED-B65E7A079877@mac.com> <20070615000320.GA94458@rot13.obsecurity.org> Mime-Version: 1.0 (Apple Message framework v752.2) Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed Message-Id: <7A845D91-435E-4F1C-A05A-270A04DAC20E@mac.com> Content-Transfer-Encoding: 7bit From: Chuck Swiger Date: Thu, 14 Jun 2007 17:26:05 -0700 To: Kris Kennaway X-Mailer: Apple Mail (2.752.2) X-Brightmail-Tracker: AAAAAA== Cc: smp@FreeBSD.org, performance@FreeBSD.org, current@FreeBSD.org Subject: Re: BIND 9.4.1 performance on FreeBSD 6.2 vs. 
7.0 X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 15 Jun 2007 00:26:06 -0000 On Jun 14, 2007, at 5:03 PM, Kris Kennaway wrote: >> It's at least arguable that doing queries against a data set >> including a bunch of repeats is "skewed" in a more realistic >> fashion. :-) A quick look at some of the data sources I have handy >> such as http access logs or Squid proxy logs suggests that (for >> example) out of a database of 17+ million requests, there were only >> 46000 unique IPs involved. > > There were still lots of repeats, just some of them were repeated > hundreds of thousands of times - I stripped about a dozen of those > (googlebots, I'm looking at you ;-), leaving a distribution that was > less biased to the top end. Heh, yes, it's surprising how happy a webspider is to crawl around a heavily-interlinked site. :-) Perhaps someone ought to add a: Crawl-delay: 600 ...statement to http://www.freebsd.org/robots.txt...? >> You might find it interesting to compare doing queries against your >> raw and filtered datasets, just to see what kind of difference you >> get, if any. > > Cached queries perform much better, as you might expect. As an > estimate I was getting query rates exceeding 120000 qps when serving > entirely out of cache, and I dont think I reached the upper bound yet. Sure, anything cached or anything the nameserver is authoritative for is going to be directly answerable without having to do an external recursive query. >> What was the external network connectivity in terms of speed? The >> docs suggest you need something like a 16MBs up/8 Mbs down >> connectivity in order to get up to 50K requests/sec.... > > I wasn't seeing anything close to this, so I guess it depends how much > data is being returned by the queries (I was doing PTR lookups). 
I > forget the exact numbers but it wasn't exceeding about 10Mbit in both > directions, which should have been well within link capacity. Also > the lock profiling data bears out the interpretation that it was BIND > that was becoming saturated and not the hardware. OK, thanks for the info. Maybe I'll get a chance to run some numbers of my own testing, if I can free up some time from WWDC.... >> [ ... ] >>> It would be interesting to test BIND performance when acting as an >>> authoritative server, which probably has very different performance >>> characteristics; the difficulty there is getting access to a >>> suitably >>> interesting and representative zone file and query data. >> >> I suppose you could also set up a test nameserver which claims to be >> authoritative for all of in-addr.arpa, and set up a bunch (65K?) /16 >> reverse zone files, and then test against real unmodified IPs, but it >> would be easier to do something like this: >> >> Set up a nameserver which is authoritative for 1.10.in-addr.arpa (ie, >> the reverse zone for 10.1/16), and use a zonefile with the $GENERATE >> directive to populate your PTR records: >> >> [ ...zonefile snipped for brevity... ] >> >> ...and then feed it a query database consisting of PTR lookups. If >> you wanted to, you could take your existing IP database, and glue the >> last two octets of the real IPs onto 10.1 to produce a reasonable >> assortment of IPs to perform a reverse lookup upon. > > I could construct something like this but I'd prefer a more > "realistic" workload (i.e. an uneven distribution of queries against > different subsets of the data). I don't have a good idea what > "realistic" means here, which makes it hard to construct one from > scratch. Fortunately I have an offer from someone for access to a > real large zone file and a large sample of queries. Ah, very good, then. While I expect there to be quite a difference between recursive queries vs. 
authoritative/locally answerable queries (after all, that seems to be why both dnsperf and resperf were created as distinct programs), I'm not convinced that there is too much difference between doing reverse lookups for one set of IPs versus another if those IPs are all in zones the server is authoritative for. -- -Chuck From owner-freebsd-smp@FreeBSD.ORG Fri Jun 15 00:41:21 2007 Return-Path: X-Original-To: freebsd-smp@freebsd.org Delivered-To: freebsd-smp@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 4A6A116A400 for ; Fri, 15 Jun 2007 00:41:21 +0000 (UTC) (envelope-from kris@obsecurity.org) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.freebsd.org (Postfix) with ESMTP id 345D813C484 for ; Fri, 15 Jun 2007 00:41:21 +0000 (UTC) (envelope-from kris@obsecurity.org) Received: from obsecurity.dyndns.org (elvis.mu.org [192.203.228.196]) by elvis.mu.org (Postfix) with ESMTP id 685031A4D84; Thu, 14 Jun 2007 17:40:51 -0700 (PDT) Received: from rot13.obsecurity.org (rot13.obsecurity.org [192.168.1.5]) by obsecurity.dyndns.org (Postfix) with ESMTP id 80597513BC; Thu, 14 Jun 2007 20:41:20 -0400 (EDT) Received: by rot13.obsecurity.org (Postfix, from userid 1001) id 1E663BEC4; Thu, 14 Jun 2007 20:41:20 -0400 (EDT) Date: Thu, 14 Jun 2007 20:41:20 -0400 From: Kris Kennaway To: Yong Rao Message-ID: <20070615004120.GA95169@rot13.obsecurity.org> References: <1818EFE74C4A8A4292E05835D378EC66130055@EXCH-CLUSTER-07.force10networks.com> <20070614024333.GA70019@rot13.obsecurity.org> <1818EFE74C4A8A4292E05835D378EC66130074@EXCH-CLUSTER-07.force10networks.com> <20070614190603.GA89528@rot13.obsecurity.org> <1818EFE74C4A8A4292E05835D378EC66130205@EXCH-CLUSTER-07.force10networks.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1818EFE74C4A8A4292E05835D378EC66130205@EXCH-CLUSTER-07.force10networks.com> User-Agent: Mutt/1.4.2.3i Cc: Jagjit 
Choudhary , freebsd-smp@freebsd.org, Kris Kennaway Subject: Re: FreeBSD-6.2, SMP, coredump -- fatal trap 12 : page fault while in kernel mode, current process: (swi1:net) X-BeenThere: freebsd-smp@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: FreeBSD SMP implementation group List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 15 Jun 2007 00:41:21 -0000 On Thu, Jun 14, 2007 at 04:37:43PM -0700, Yong Rao wrote: > Hi Kris, > > We are wondering if there is any connection between the core dump > failure and the options DEVICE_POLLING? In our kernel, we have options > DEVICE_POLLING, and in our driver (Ethernet) we have the > ether_poll_register() function (not registered yet). Do you know if > there is any problem with this? (Using polling mode and SMP > simultaneously) I am not sure, sorry. I don't use this option myself. Kris