Date: Mon, 11 Feb 2002 17:00:22 -0600
From: "David Winkler" <dwinkler@ala.net>
To: <freebsd-questions@freebsd.org>
Subject: random reboot problem
Message-ID: <ALEMLPMDJPEKFLNGKIHFKEJGCGAB.dwinkler@ala.net>
I am having a recurring problem with a FreeBSD 4.2 Release install where my server randomly reboots from time to time. Normally the machine runs for a week or two, sometimes three, without an occurrence, but now and then I get messages in my logs about problems with one of the drives. When the machine comes back up, it requires a manual fsck to fix the drive before the server will boot fully again. When a reboot occurs, I cannot find any log entries about the reboot itself. Below is information from my logs about the machine.

I am experiencing an almost identical problem on another machine running FreeBSD 4.3 Release with an older version of the IBM drives and an older Adaptec card. I am also running FreeBSD 3.4 on a machine with a setup identical to the 4.3 box, and it is not experiencing this problem. Each of the three systems runs two drives on a three-position cable (one position for the SCSI card and one for each drive), all configured in the same manner.

Any information on a fix for this problem, or on where to check beyond replacing hardware (I have replaced the drives in the 4.3 machine with no effect), would be greatly appreciated.

Here is the information from the machine that might be helpful in tracking down the problem. I'll gladly give any additional information needed.
DMESG snippets >>>>

FreeBSD 4.2 Release
CPU: AMD Athlon(tm) Processor (1008.99-MHz 686-class CPU)
  Origin = "AuthenticAMD"  Id = 0x642  Stepping = 2
  Features=0x183f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR>
  AMD Features=0xc0440000<<b18>,AMIE,DSP,3DNow!>
real memory = 536788992 (524208K bytes)
ahc0: <Adaptec 29160 Ultra160 SCSI adapter> port 0xa000-0xa0ff mem 0xe4000000-0xe4000fff irq 11 at device 11.0 on pci0
aic7892: Wide Channel A, SCSI Id=7, 32/255 SCBs
da0 at ahc0 bus 0 target 1 lun 0
da0: <IBM DDYS-T18350N S96H> Fixed Direct Access SCSI-3 device
da0: 160.000MB/s transfers (80.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da0: 17501MB (35843670 512 byte sectors: 255H 63S/T 2231C)
da1 at ahc0 bus 0 target 6 lun 0
da1: <IBM DDYS-T18350N S96H> Fixed Direct Access SCSI-3 device
da1: 160.000MB/s transfers (80.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da1: 17501MB (35843670 512 byte sectors: 255H 63S/T 2231C)

<<<< End DMESG snippets

SYSLOG Entries >>>>

Feb 9 02:25:08 mail /kernel: (da1:ahc0:0:6:0): SCB 0xd6 - timed out while idle, SEQADDR == 0x5
Feb 9 02:25:25 mail /kernel: STACK == 0x1, 0x106, 0x15e, 0x174
Feb 9 02:25:25 mail /kernel: SXFRCTL0 == 0x80
Feb 9 02:25:25 mail /kernel: SCB count = 255
Feb 9 02:25:25 mail /kernel: QINFIFO entries:
Feb 9 02:25:25 mail /kernel: Waiting Queue entries:
Feb 9 02:25:25 mail /kernel: Disconnected Queue entries: 22:243 12:156 10:169 25:209 28:131 18:0 27:230 8:91 31:153 19: 45 20:89 17:62 5:107 15:42 29:94 4:233 30:202 16:124 26:109 14:175 3:55 21:115 23:118 2:23 11:69 24:220 0:110 1:99 6:96 7:208 9:214
Feb 9 02:25:25 mail /kernel: QOUTFIFO entries:
Feb 9 02:25:25 mail /kernel: Sequencer Free SCB List: 13
Feb 9 02:25:25 mail /kernel: Pending list: 243 53 239 130 63 198 148 10 41 215 163 180 187 13 212 73 7 82 154 200 184 1 73 201 30 170 18 229 176 117 217 44 24 92 5 199 56 249 52 179 70 22 237 231 172 227 43 97 37 108 1 83 12 149 72 102 216 11 8 194 33 171 254 250 15 67 128 245 77 123 114 50 221 242 191 140 240 57 234 38 182 186 85 64 132 39 238 14 76 168 25 74 150 34 79 95 252 146 166 156 169 209 131 0 230 91 153 45 89 62 107 42 94 233 202 124 109 175 55 115 118 23 69 220 110 99 96 208 214
Feb 9 02:25:25 mail /kernel: Kernel Free SCB list: 127 122 167 51 205 165 100 203 16 138 61 142 160 19 164 197 58 224 2 32 6 204 87 241 137 223 66 86 139 145 159 181 93 244 101 121 112 177 80 119 68 49 81 59 54 235 103 213 2 78 188 174 111 158 185 48 152 113 151 120 144 183 3 98 133 196 161 178 106 90 35 226 75 189 88 206 65 253 207 228 225 40 157 135 26 192 136 17 60 222 190 20 9 36 143 27 31 47 247 147 125 129 126 32 236 248 116 211 46 29 210 28 219 134 71 104 141 21 105 84 251 195 4 162 246 155 193
Feb 9 02:25:25 mail /kernel: sg[0] - Addr 0x1028f000 : Length 2048
Feb 9 02:25:25 mail /kernel: (da1:ahc0:0:6:0): Queuing a BDR SCB
Feb 9 02:25:25 mail /kernel: (da1:ahc0:0:6:0): SCB 0xd6 - timed out while idle, SEQADDR == 0x166
Feb 9 02:25:25 mail /kernel: STACK == 0x174, 0x15e, 0x174, 0x2e
Feb 9 02:25:25 mail /kernel: SXFRCTL0 == 0x88
Feb 9 02:25:25 mail /kernel: SCB count = 255
Feb 9 02:25:25 mail /kernel: QINFIFO entries:
Feb 9 02:25:25 mail /kernel: Waiting Queue entries:
Feb 9 02:25:25 mail /kernel: Disconnected Queue entries: 22:243 12:156 10:169 25:209 28:131 18:0 27:230 8:91 31:153 19: 45 20:89 17:62 5:107 15:42 29:94 4:233 30:202 16:124 26:109 14:175 3:55 21:115 23:118 2:23 11:69 24:220 0:110 1:99 6:96 7:208
Feb 9 02:25:25 mail /kernel: QOUTFIFO entries:
Feb 9 02:25:25 mail /kernel: Sequencer Free SCB List: 13
Feb 9 02:25:25 mail /kernel: Pending list: 243 53 239 130 63 198 148 10 41 215 163 180 187 13 212 73 7 82 154 200 184 1 73 201 30 170 18 229 176 117 217 44 24 92 5 199 56 249 52 179 70 22 237 231 172 227 43 97 37 108 1 83 12 149 72 102 216 11 8 194 33 171 254 250 15 67 128 245 77 123 114 50 221 242 191 140 240 57 234 38 182 186 85 64 132 39 238 14 76 168 25 74 150 34 79 95 252 146 166 156 169 209 131 0 230 91 153 45 89 62 107 42 94 233 202 124 109 175 55 115 118 23 69 220 110 99 96 208 214
Feb 9 02:25:25 mail /kernel: Kernel Free SCB list: 127 122 167 51 205 165 100 203 16 138 61 142 160 19 164 197 58 224 2 32 6 204 87 241 137 223 66 86 139 145 159 181 93 244 101 121 112 177 80 119 68 49 81 59 54 235 103 213 2 78 188 174 111 158 185 48 152 113 151 120 144 183 3 98 133 196 161 178 106 90 35 226 75 189 88 206 65 253 207 228 225 40 157 135 26 192 136 17 60 222 190 20 9 36 143 27 31 47 247 147 125 129 126 32 236 248 116 211 46 29 210 28 219 134 71 104 141 21 105 84 251 195 4 162 246 155 193
Feb 9 02:25:25 mail /kernel: sg[0] - Addr 0x1028f000 : Length 2048
Feb 9 02:25:25 mail /kernel: (da1:ahc0:0:6:0): no longer in timeout, status = 34b
Feb 9 02:25:25 mail /kernel: ahc0: Issued Channel A Bus Reset. 128 SCBs aborted

<<<< End Syslog Entries

--------------------------------------------
David Winkler
Network Administrator
Alanet Internet Services
3246 Montgomery Highway
Sun Plaza, Suite 114
Dothan, Alabama 36303
Phone: (334) 702-2949
Fax: (334) 702-2887
email: dwinkler@ala.net

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-questions" in the body of the message
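For anyone comparing several boxes like the three described above, a small script can tally the "timed out" lines per device so the timestamps can be lined up against the reboots. What follows is only a rough sketch: the regular expression is fitted to the two timeout lines quoted in this message, and the names TIMEOUT_RE, summarize_timeouts and SAMPLE are made up for illustration. Other driver versions may word these messages differently.

```python
import re
from collections import Counter

# Pattern fitted to the syslog lines quoted in this message, e.g.
#   Feb 9 02:25:08 mail /kernel: (da1:ahc0:0:6:0): SCB 0xd6 - timed out while idle, SEQADDR == 0x5
# Assumption: other kernels or drivers may format these lines differently.
TIMEOUT_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+/kernel:\s+"
    r"\((?P<dev>\w+):(?P<ctrl>\w+):\d+:(?P<target>\d+):\d+\):\s+"
    r"SCB (?P<scb>0x[0-9a-fA-F]+) - timed out"
)


def summarize_timeouts(lines):
    """Count SCB timeout lines per (device, controller, target).

    Returns (counts, first_seen): counts is a Counter keyed by the tuple,
    first_seen maps the same key to the timestamp of its first timeout.
    """
    counts = Counter()
    first_seen = {}
    for line in lines:
        match = TIMEOUT_RE.match(line)
        if match is None:
            continue  # not a timeout line (queue dumps, bus reset, etc.)
        key = (match.group("dev"), match.group("ctrl"), int(match.group("target")))
        counts[key] += 1
        first_seen.setdefault(key, match.group("stamp"))
    return counts, first_seen


# Three lines copied from the syslog excerpt above; only two are timeouts.
SAMPLE = [
    "Feb 9 02:25:08 mail /kernel: (da1:ahc0:0:6:0): SCB 0xd6 - timed out while idle, SEQADDR == 0x5",
    "Feb 9 02:25:25 mail /kernel: (da1:ahc0:0:6:0): Queuing a BDR SCB",
    "Feb 9 02:25:25 mail /kernel: (da1:ahc0:0:6:0): SCB 0xd6 - timed out while idle, SEQADDR == 0x166",
]

if __name__ == "__main__":
    counts, first_seen = summarize_timeouts(SAMPLE)
    for key, count in counts.items():
        dev, ctrl, target = key
        print(f"{dev} on {ctrl} target {target}: {count} timeout(s), first at {first_seen[key]}")
```

To run it against a real log, pass an open file in place of SAMPLE, for example summarize_timeouts(open("/var/log/messages")). The path is an assumption, so adjust it to wherever syslog writes kernel messages on the machine in question.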