Date: Tue, 16 Oct 2007 00:49:17 +0100
From: Deomid Ryabkov <myself@rojer.pp.ru>
To: freebsd-hackers@freebsd.org
Subject: Re: 6.2: reproducible hang on amd64, traced to 24h of commits
Message-ID: <4713FC7D.6070201@rojer.pp.ru>
In-Reply-To: <460D13B0.5070500@rojer.pp.ru>
References: <460D13B0.5070500@rojer.pp.ru>

fwiw, i have not traced it down to a commit (got fed up with hangs), but
conclusively singled out smartmontools as the trigger.
after adding 2 more disks, machine wouldn't even boot up past starting
smartmontools, locking up hard with the same symptoms.
with smartmontools disabled, it booted up and has been up for > 2 months now.

Deomid Ryabkov wrote:
> ok, now that the machine has been up for 10 days, i am reasonably sure
> i've got close enough to this one.
>
> back in january i cvsupped to -STABLE and the box (dual head opteron
> box) started hanging.
> and i mean it dies completely.
> i have all debug options and a working serial console, but still it
> just dies and both serial and system console are unresponsive.
> no panic message on either, nothing. pretty sad.
>
> the kernel config is vanilla SMP GENERIC, with all debug options i
> could think of enabled (after it started hanging).
>
> so the first thing i did after rebooting the box a couple of times is
> fall back to kernel.old (6.1-STABLE circa august '06).
> no hangs. i then started incrementally updating, gradually getting
> closer to jan 22.
> long story short, i seem to have isolated the problem to commits made
> between
> date=2006.12.28.00.00.00 and date=2006.12.29.00.00.00.
> last hang i had was when running the 12/29 kernel, now it's 12/28 and
> the box has been up for 2 weeks already.
> based on previous experience i'm pretty certain that this is it. with
> bad kernel the box would never stay up more than a few days, never
> more than 5.
> between 12/28 and 12/29 i see some changes to /sys/amd64/ and
> /sys/pci/, which might've been the cause.
> i will probably start looking into individual changes, but if anyone
> more experienced than me could take a look, it'd be appreciated.
> i am willing to try patches.
> i confirmed that recent (as of 3 weeks or so) -STABLE still has this
> problem.
>
> thanks in advance.
>
> ====
> files under /sys that were changed between 12/28 and 12/29:
>
> Edit src/sys/amd64/amd64/mptable_pci.c
> Edit src/sys/amd64/pci/pci_bus.c
> Edit src/sys/contrib/dev/ath/public/wackelf.c
> Edit src/sys/dev/acpica/acpi_pci.c
> Edit src/sys/dev/acpica/acpi_pcib_acpi.c
> Edit src/sys/dev/acpica/acpi_pcib_pci.c
> Checkout src/sys/dev/ath/if_ath.c
> Edit src/sys/dev/cardbus/cardbus.c
> Edit src/sys/dev/drm/drm_agpsupport.c
> Edit src/sys/dev/pci/pci.c
> Edit src/sys/dev/pci/pci_if.m
> Edit src/sys/dev/pci/pci_pci.c
> Edit src/sys/dev/pci/pci_private.h
> Edit src/sys/dev/pci/pcib_private.h
> Edit src/sys/dev/pci/pcivar.h
> Edit src/sys/i386/i386/mptable_pci.c
> Edit src/sys/i386/pci/pci_bus.c
> Edit src/sys/kern/subr_bus.c
> Checkout src/sys/netgraph/ng_deflate.h
> Edit src/sys/pci/agp.c
> Edit src/sys/pci/agpreg.h
> Edit src/sys/powerpc/ofw/ofw_pcib_pci.c
> Edit src/sys/sparc64/pci/apb.c
> Edit src/sys/sparc64/pci/ofw_pcib.c
> Edit src/sys/sparc64/pci/ofw_pcibus.c
> Edit src/sys/sys/param.h
>
>
> ====
> kernel configuration used:
>
> include GENERIC
>
> options SMP
>
> options KDB
> options DDB
>
> makeoptions DEBUG=-g
> options INVARIANTS
> options INVARIANT_SUPPORT
> options WITNESS
> options DEBUG_LOCKS
> options DEBUG_VFS_LOCKS
> options DIAGNOSTIC
> ====
>

--
Deomid Ryabkov aka Rojer
myself@rojer.pp.ru
rojer@sysadmins.ru
ICQ: 8025844
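
The date narrowing described in the quoted message can be sketched in code. The message describes stepping forward incrementally from a known-good kernel; the sketch below swaps in a binary search over dates instead, which is not what the author says he did. The names supfile_date, bisect_dates and hangs are hypothetical, and so are the 2006-08-01 start date (standing in for "circa august '06") and the year 2007 for "jan 22". In practice the hangs check would mean updating sources to that date, rebuilding the kernel, and running the box for longer than a bad kernel ever stayed up (more than 5 days, per the message). Here it is simulated.

```python
from datetime import date, timedelta
from typing import Callable, Tuple


def supfile_date(day: date) -> str:
    """Format a day as a date= value at midnight, in the form used in the message."""
    return day.strftime("date=%Y.%m.%d.00.00.00")


def bisect_dates(good: date, bad: date,
                 hangs: Callable[[date], bool]) -> Tuple[date, date]:
    """Narrow (good, bad) down to two adjacent days.

    Assumes sources as of `good` do not hang, sources as of `bad` do,
    and the problem does not go away again once introduced.
    """
    if good >= bad:
        raise ValueError("good must be earlier than bad")
    while (bad - good).days > 1:
        # Test the midpoint and keep whichever half still brackets the change.
        mid = good + timedelta(days=(bad - good).days // 2)
        if hangs(mid):
            bad = mid
        else:
            good = mid
    return good, bad


if __name__ == "__main__":
    # Simulated stand-in for "build, boot and wait": pretend the hang
    # arrives with sources dated 2006-12-29 or later, as the message reports.
    def simulated(day: date) -> bool:
        return day >= date(2006, 12, 29)

    last_good, first_bad = bisect_dates(date(2006, 8, 1), date(2007, 1, 22), simulated)
    print(supfile_date(last_good), supfile_date(first_bad))
```

Each probe costs days of uptime, so halving the window matters: a range of about 170 days needs roughly 8 kernels rather than one per day. Note that the follow-up at the top of this message names smartmontools as the trigger, so a date window found this way only shows when the hang became reproducible, not necessarily which commit is at fault.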