Date:      Mon, 27 Dec 2004 14:52:45 -0700
From:      Troy Bowman <tbowman@aros.net>
To:        freebsd-stable@freebsd.org
Subject:   am64/FreeBSD-5.3-STABLE (or RELEASE) crashes often
Message-ID:  <1104184365.16903.29.camel@gargamel>


And it doesn't dump its core to its dump swap space either, so I can't
run savecore after reboot to get debugging info.  I have that swap space
commented out in fstab so it isn't brought up at boot, which should let
me harvest the core manually, but savecore only reports "savecore: no
dumps found."  (It doesn't happen automatically, either.)

We recently thought we'd give 5.3 a go in production, and it has been
too unstable.  When it crashes, it doesn't reboot; it just hangs there
until someone drives in and pushes the button.  Who knows, maybe Linux
would be more stable at this point.  Sigh.

The hardware it is running on is a Tyan s2875 with dual amd64/246
processors and 2 GB of Registered DDR RAM (Corsair).  We're also running
vinum for all of the filesystems, mirroring them all, including the root
filesystem.  The vinum volumes are on two SATA WD Raptors.  I have one
older IDE drive plugged in to capture the kernel dumps.

We've tried many different memory configurations to see if we can tune
it so that FreeBSD can handle it (DRAM ECC vs master ECC, bank & node
interleaving turned off/on, slowing the memory down, DRAM Scrub Redirect
off/on, etc.), to no avail.

It's usually pagedaemon that croaks, but for some reason it also crashes
in the keyboard IRQ process and the serial I/O IRQ process.  I guess
that since it's usually the pager that dies, that's why I can't get
kernel dumps.  Here are some (manually copied) panics from the console.

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x88
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xffffffff80389aea
stack pointer           = 0x10:0xffffffffb2051a60
frame pointer           = 0x10:0xffffff006b12d000
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 53 (pagedaemon)
trap number             = 12
panic: page fault
cpuid = 0
boot() called on cpu#0
Uptime: 10h18m49s

...

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x88
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xffffffff8038a10a
frame pointer           = 0x10:0xffffffffb2051ab0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 53 (pagedaemon)
trap number             = 12
panic: page fault
cpuid = 0
boot() called on cpu#0
Uptime: 15h59m55s

...
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 36 (swi5: clock sio)
trap number             = 12
panic: page fault
cpuid = 1
kernel trap 12 with interrupts disabled

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address   = 0x48
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xffffffff803a40d3
stack pointer           = 0x10:0xffffffffb1d63650
frame pointer           = 0x10:0xffffff007b7f3a40
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 30
trap number             = 12
panic: page fault
cpuid = 1
spin lock sched lock held by 0xffffff007b8177b0 for > 5 seconds

...
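
Since the panics have to be copied by hand, it may help to compare them
field by field.  The following is only a rough sketch in Python, not
anything from the FreeBSD tools: the names parse_trap and
instruction_address are made up for illustration, and the sample text is
an abbreviated copy of the first panic above.  It collects the
"name = value" lines of one trap message and pulls out the instruction
pointer offset, which could then be matched by hand against the symbol
table of the kernel that crashed.

```python
import re

# Matches lines such as "fault virtual address   = 0x88".  The optional
# "3D" tolerates the quoted-printable form "=3D" that mail archives
# sometimes leave in place of a plain "=".
FIELD_RE = re.compile(r"^\s*([a-z][a-z ]*?)\s*=(?:3D)?\s*(.+?)\s*$")


def parse_trap(text):
    """Return a dict of the 'name = value' fields in one trap message."""
    fields = {}
    for line in text.splitlines():
        match = FIELD_RE.match(line)
        if match:
            fields[match.group(1)] = match.group(2)
    return fields


def instruction_address(fields):
    """Return the instruction pointer offset as an int, or None if absent."""
    value = fields.get("instruction pointer")
    if value is None:
        return None
    # The value looks like "0x8:0xffffffff80389aea" (selector:offset).
    _, _, offset = value.partition(":")
    offset = offset.strip()
    if not offset:
        return None
    return int(offset, 16)


# Abbreviated copy of the first panic in this message.
SAMPLE = """\
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x88
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xffffffff80389aea
current process         = 53 (pagedaemon)
trap number             = 12
panic: page fault
"""

if __name__ == "__main__":
    trap = parse_trap(SAMPLE)
    print(trap["current process"], hex(instruction_address(trap)))
```

Lines without a leading lowercase field name, such as the "Fatal trap"
header, the "panic:" line, and the continuation of the code segment
entry, are simply skipped.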


What can I do to debug this more if I can't harvest the kernel dumps to
report a bug?  Is there anything the FreeBSD team can do?   Do I need to
resort to Linux for dual amd64 support for now? <cringe>

Thanks,

../troy



