From owner-freebsd-stable Mon Apr 28 11:30:33 1997
Date: Mon, 28 Apr 1997 14:29:23 -0400 (EDT)
From: Brian Tao
Reply-To: Brian Tao
To: freebsd-stable@freebsd.org
Subject: System freezes with 2.2-970420
In-Reply-To: <3364A21F.167EB0E7@nf.jinr.ru>
MIME-Version: 1.0
Content-Type: MULTIPART/MIXED; BOUNDARY="-559023410-1483920592-862251942=:12135"
Sender: owner-stable@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

---559023410-1483920592-862251942=:12135
Content-Type: TEXT/PLAIN; CHARSET=US-ASCII

Changing from an old 3.0 snapshot to a recent 2.2-STABLE seems to
have fixed the ahc-related problems (it survived a 36-hour burn-in
with two 2940UW's and seven 4.3GB drives). However, I've come across
two ways to freeze up a 2.2-970420-RELENG system, one of which I was
able to reproduce consistently.

The first is newfs'ing a disk via the block device rather than the
raw device. Doing so freezes up the machine at the same point in the
newfs each time.
Using the raw device works fine:

# newfs /dev/sd2s1a
newfs: /dev/sd2s1a: not a character-special device
Warning: 560 sector(s) in last cylinder unallocated
/dev/sd2s1a: 8482256 sectors in 2071 cylinders of 1 tracks, 4096 sectors
    4141.7MB in 130 cyl groups (16 c/g, 32.00MB/g, 7680 i/g)
super-block backups (for fsck -b #) at:
 32, 65568, 131104, 196640, 262176, 327712, 393248, 458784, 524320,
 589856, 655392, 720928, 786464, 852000, 917536, 983072, 1048608,
 1114144, 1179680, 1245216, 1310752, 1376288, 1441824, 1507360,
 1572896, 1638432, 1703968, 1769504, 1835040, 1900576, 1966112,
 2031648, 2097184, 2162720, 2228256,
[...machine locks up at this point...]

# newfs /dev/rsd2s1a
Warning: 560 sector(s) in last cylinder unallocated
/dev/rsd2s1a: 8482256 sectors in 2071 cylinders of 1 tracks, 4096 sectors
    4141.7MB in 130 cyl groups (16 c/g, 32.00MB/g, 7680 i/g)
super-block backups (for fsck -b #) at:
 32, 65568, 131104, 196640, 262176, 327712, 393248, 458784, 524320,
 589856, 655392, 720928, 786464, 852000, 917536, 983072, 1048608,
 1114144, 1179680, 1245216, 1310752, 1376288, 1441824, 1507360,
 1572896, 1638432, 1703968, 1769504, 1835040, 1900576, 1966112,
 2031648, 2097184, 2162720, 2228256, 2293792, 2359328, 2424864,
 2490400, 2555936, 2621472, 2687008, 2752544, 2818080, 2883616,
 2949152, 3014688, 3080224, 3145760, 3211296, 3276832, 3342368,
 3407904, 3473440, 3538976, 3604512, 3670048, 3735584, 3801120,
 3866656, 3932192, 3997728, 4063264, 4128800, 4194336, 4259872,
 4325408, 4390944, 4456480, 4522016, 4587552, 4653088, 4718624,
 4784160, 4849696, 4915232, 4980768, 5046304, 5111840, 5177376,
 5242912, 5308448, 5373984, 5439520, 5505056, 5570592, 5636128,
 5701664, 5767200, 5832736, 5898272, 5963808, 6029344, 6094880,
 6160416, 6225952, 6291488, 6357024, 6422560, 6488096, 6553632,
 6619168, 6684704, 6750240, 6815776, 6881312, 6946848, 7012384,
 7077920, 7143456, 7208992, 7274528, 7340064, 7405600, 7471136,
 7536672, 7602208, 7667744, 7733280, 7798816, 7864352, 7929888,
 7995424, 8060960, 8126496, 8192032, 8257568, 8323104, 8388640,
 8454176,
#

The second happened while doing a "quotaon -a" with six quota
filesystems, all of which were exported and mounted on a BSD/OS 2.1
client at the time. quota.user files were present at the root of each
filesystem, although limits for all ~30000 users were set to zero.
The server boots with quotas on, but I had shut off all quotas
earlier. Turning them back on caused the system to freeze. I haven't
tried reproducing this one.

In both cases, there were no console or syslog messages indicating a
problem, and a hard reboot was necessary. This is a production
machine now, so I can't really do a lot of fiddling with it. I'll try
the "blessed" 2.2-970422 kernel to see if that makes any difference.

I can send boot messages and the kernel config if that will help.
AHC_TAGENABLE, AHC_SCBPAGING_ENABLE and the SysV IPC options are
enabled. The bt and ccd drivers are compiled in, but not currently in
active use.
--
Brian Tao (BT300, taob@netcom.ca)
"Though this be madness, yet there is method in't"

---559023410-1483920592-862251942=:12135--
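
For anyone trying to pin down where on the disk the block-device newfs stops, the super-block backup offsets in the output above can be reproduced from the reported geometry. The following Python sketch is only a simplified model, not how newfs itself computes the locations: it assumes the backups sit at a fixed offset of 32 sectors into each cylinder group and that a group is 16 cylinders of 4096 sectors, as the newfs summary line suggests. The function name and parameters are made up for illustration.

```python
def superblock_backups(total_sectors, sectors_per_cyl, cyl_per_group,
                       first_backup=32):
    """Return assumed backup super-block sector offsets, one per group.

    Simplified linear model: every cylinder group is the same size and
    its backup super-block sits `first_backup` sectors into the group.
    """
    sectors_per_group = sectors_per_cyl * cyl_per_group
    # Ceiling division: a partial last group still counts as a group.
    n_groups = -(-total_sectors // sectors_per_group)
    return [first_backup + g * sectors_per_group for g in range(n_groups)]


# Geometry taken from the newfs output in the message.
backups = superblock_backups(total_sectors=8482256,
                             sectors_per_cyl=4096,
                             cyl_per_group=16)

print(len(backups))                 # 130, matching "130 cyl groups"
print(backups[:3], backups[-1])     # [32, 65568, 131104] 8454176

# Last offset printed before the lockup on the block device.
print(backups.index(2228256))       # 34, i.e. the 35th cylinder group
```

Under this model the last offset printed before the freeze belongs to cylinder group 34 (counting from zero), so the hang would fall somewhere around the 35th or 36th group. That is only as reliable as the assumptions above.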