Date:      Mon, 11 Feb 2002 17:00:22 -0600
From:      "David Winkler" <dwinkler@ala.net>
To:        <freebsd-questions@freebsd.org>
Subject:   random reboot problem
Message-ID:  <ALEMLPMDJPEKFLNGKIHFKEJGCGAB.dwinkler@ala.net>

I am having a recurring problem with a FreeBSD 4.2 Release install where
my server randomly reboots from time to time. The machine normally runs
for a week or two, sometimes three, without an occurrence, but now and
then I get messages in my logs about problems with one of the drives.
When the machine comes back up, it requires a manual fsck to fix the drive
before the server is usable again. When a reboot occurs, I cannot find any
log entries about it. Below is information from my logs about the machine.

I am experiencing an almost identical problem on another machine running
FreeBSD 4.3 Release with an older version of the IBM drives and an older
Adaptec card. I am also running FreeBSD 3.4 on a machine with a setup
identical to the 4.3 box, and it does not have this problem.

Each of the 3 systems mentioned runs 2 drives on a 3-position cable
(one position for the SCSI card and one for each drive), and all are
configured in the same manner.

Any information on a fix for this problem, or on where to look beyond
replacing hardware (I have replaced the drives in the 4.3 machine with no
effect), would be greatly appreciated.

Here is the information from the machine that might be helpful in tracking
down the problem. I'll gladly give any additional information needed.

DMESG snippets >>>>

FreeBSD 4.2 Release

CPU: AMD Athlon(tm) Processor (1008.99-MHz 686-class CPU)
  Origin = "AuthenticAMD"  Id = 0x642  Stepping = 2

Features=0x183f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR>
  AMD Features=0xc0440000<<b18>,AMIE,DSP,3DNow!>
real memory  = 536788992 (524208K bytes)

ahc0: <Adaptec 29160 Ultra160 SCSI adapter> port 0xa000-0xa0ff mem 0xe4000000-0xe4000fff irq 11 at device 11.0 on pci0
aic7892: Wide Channel A, SCSI Id=7, 32/255 SCBs

da0 at ahc0 bus 0 target 1 lun 0
da0: <IBM DDYS-T18350N S96H> Fixed Direct Access SCSI-3 device
da0: 160.000MB/s transfers (80.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da0: 17501MB (35843670 512 byte sectors: 255H 63S/T 2231C)
da1 at ahc0 bus 0 target 6 lun 0
da1: <IBM DDYS-T18350N S96H> Fixed Direct Access SCSI-3 device
da1: 160.000MB/s transfers (80.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da1: 17501MB (35843670 512 byte sectors: 255H 63S/T 2231C)

<<<< End DMESG snippets
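
As a quick sanity check on the dmesg figures above, the reported capacity
and transfer rate can be recomputed from the sector count and the negotiated
bus settings. This is only a sketch: the function names are made up for
illustration, and it assumes the driver reports MB as 1024 * 1024 bytes,
rounded down.

```python
SECTOR_BYTES = 512  # sector size reported in the da0/da1 lines


def capacity_mb(sectors, sector_bytes=SECTOR_BYTES):
    """Capacity in whole MB, assuming 1 MB = 1024 * 1024 bytes, rounded down."""
    return sectors * sector_bytes // (1024 * 1024)


def peak_transfer_mb_per_s(rate_mhz, bus_width_bits):
    """Peak rate: million transfers per second times bytes per transfer."""
    return rate_mhz * (bus_width_bits // 8)


if __name__ == "__main__":
    # Values copied from the da0/da1 dmesg lines.
    print(capacity_mb(35843670), "MB")
    print(peak_transfer_mb_per_s(80.0, 16), "MB/s")
```

Both results agree with what the kernel printed (17501MB and 160.000MB/s),
so the drives appear to have negotiated the full Ultra160 wide rate.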

SYSLOG Entries >>>>

Feb  9 02:25:08 mail /kernel: (da1:ahc0:0:6:0): SCB 0xd6 - timed out while idle, SEQADDR == 0x5
Feb  9 02:25:25 mail /kernel: STACK == 0x1, 0x106, 0x15e, 0x174
Feb  9 02:25:25 mail /kernel: SXFRCTL0 == 0x80
Feb  9 02:25:25 mail /kernel: SCB count = 255
Feb  9 02:25:25 mail /kernel: QINFIFO entries:
Feb  9 02:25:25 mail /kernel: Waiting Queue entries:
Feb  9 02:25:25 mail /kernel: Disconnected Queue entries: 22:243 12:156 10:169 25:209 28:131 18:0 27:230 8:91 31:153 19:45 20:89 17:62 5:107 15:42 29:94 4:233 30:202 16:124 26:109 14:175 3:55 21:115 23:118 2:23 11:69 24:220 0:110 1:99 6:96 7:208 9:214
Feb  9 02:25:25 mail /kernel: QOUTFIFO entries:
Feb  9 02:25:25 mail /kernel: Sequencer Free SCB List: 13
Feb  9 02:25:25 mail /kernel: Pending list: 243 53 239 130 63 198 148 10 41
215 163 180 187 13 212 73 7 82 154 200 184 1
73 201 30 170 18 229 176 117 217 44 24 92 5 199 56 249 52 179 70 22 237 231
172 227 43 97 37 108 1 83 12 149 72 102 216
11 8 194 33 171 254 250 15 67 128 245 77 123 114 50 221 242 191 140 240 57
234 38 182 186 85 64 132 39 238 14 76 168 25
74 150 34 79 95 252 146 166 156 169 209 131 0 230 91 153 45 89 62 107 42 94
233 202 124 109 175 55 115 118 23 69 220 110
 99 96 208 214
Feb  9 02:25:25 mail /kernel: Kernel Free SCB list: 127 122 167 51 205 165
100 203 16 138 61 142 160 19 164 197 58 224 2
32 6 204 87 241 137 223 66 86 139 145 159 181 93 244 101 121 112 177 80 119
68 49 81 59 54 235 103 213 2 78 188 174 111
158 185 48 152 113 151 120 144 183 3 98 133 196 161 178 106 90 35 226 75 189
88 206 65 253 207 228 225 40 157 135 26 192
 136 17 60 222 190 20 9 36 143 27 31 47 247 147 125 129 126 32 236 248 116
211 46 29 210 28 219 134 71 104 141 21 105 84
 251 195 4 162 246 155 193
Feb  9 02:25:25 mail /kernel: sg[0] - Addr 0x1028f000 : Length 2048
Feb  9 02:25:25 mail /kernel: (da1:ahc0:0:6:0): Queuing a BDR SCB
Feb  9 02:25:25 mail /kernel: (da1:ahc0:0:6:0): SCB 0xd6 - timed out while idle, SEQADDR == 0x166
Feb  9 02:25:25 mail /kernel: STACK == 0x174, 0x15e, 0x174, 0x2e
Feb  9 02:25:25 mail /kernel: SXFRCTL0 == 0x88
Feb  9 02:25:25 mail /kernel: SCB count = 255
Feb  9 02:25:25 mail /kernel: QINFIFO entries:
Feb  9 02:25:25 mail /kernel: Waiting Queue entries:
Feb  9 02:25:25 mail /kernel: Disconnected Queue entries: 22:243 12:156 10:169 25:209 28:131 18:0 27:230 8:91 31:153 19:45 20:89 17:62 5:107 15:42 29:94 4:233 30:202 16:124 26:109 14:175 3:55 21:115 23:118 2:23 11:69 24:220 0:110 1:99 6:96 7:208
Feb  9 02:25:25 mail /kernel: QOUTFIFO entries:
Feb  9 02:25:25 mail /kernel: Sequencer Free SCB List: 13
Feb  9 02:25:25 mail /kernel: Pending list: 243 53 239 130 63 198 148 10 41
215 163 180 187 13 212 73 7 82 154 200 184 1
73 201 30 170 18 229 176 117 217 44 24 92 5 199 56 249 52 179 70 22 237 231
172 227 43 97 37 108 1 83 12 149 72 102 216
11 8 194 33 171 254 250 15 67 128 245 77 123 114 50 221 242 191 140 240 57
234 38 182 186 85 64 132 39 238 14 76 168 25
74 150 34 79 95 252 146 166 156 169 209 131 0 230 91 153 45 89 62 107 42 94
233 202 124 109 175 55 115 118 23 69 220 110
 99 96 208 214
Feb  9 02:25:25 mail /kernel: Kernel Free SCB list: 127 122 167 51 205 165
100 203 16 138 61 142 160 19 164 197 58 224 2
32 6 204 87 241 137 223 66 86 139 145 159 181 93 244 101 121 112 177 80 119
68 49 81 59 54 235 103 213 2 78 188 174 111
158 185 48 152 113 151 120 144 183 3 98 133 196 161 178 106 90 35 226 75 189
88 206 65 253 207 228 225 40 157 135 26 192
 136 17 60 222 190 20 9 36 143 27 31 47 247 147 125 129 126 32 236 248 116
211 46 29 210 28 219 134 71 104 141 21 105 84
 251 195 4 162 246 155 193
Feb  9 02:25:25 mail /kernel: sg[0] - Addr 0x1028f000 : Length 2048
Feb  9 02:25:25 mail /kernel: (da1:ahc0:0:6:0): no longer in timeout, status = 34b
Feb  9 02:25:25 mail /kernel: ahc0: Issued Channel A Bus Reset. 128 SCBs aborted

<<<< End Syslog Entries
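
To see whether the timeouts always involve the same drive, the syslog can be
scanned for the "SCB ... timed out" lines and tallied per device. The script
below is a minimal sketch, not a tested tool: the function name and the
regular expression are illustrative, and it assumes the message layout is
exactly as in the entries above.

```python
import re
from collections import Counter

# Matches the start of an ahc timeout report such as:
#   (da1:ahc0:0:6:0): SCB 0xd6 - timed out
# The layout is assumed from the syslog entries quoted in this message.
TIMEOUT_RE = re.compile(
    r"\((?P<dev>\w+):(?P<ctrl>\w+):(?P<bus>\d+):(?P<target>\d+):(?P<lun>\d+)\): "
    r"SCB (?P<scb>0x[0-9a-fA-F]+) - timed out"
)


def count_scb_timeouts(lines):
    """Return a Counter mapping (device, target) to its number of timeouts."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[(match.group("dev"), int(match.group("target")))] += 1
    return counts


# A few lines copied from the log above, including the mail-wrapped tails.
SAMPLE = [
    "Feb  9 02:25:08 mail /kernel: (da1:ahc0:0:6:0): SCB 0xd6 - timed out while",
    "idle, SEQADDR == 0x5",
    "Feb  9 02:25:25 mail /kernel: (da1:ahc0:0:6:0): Queuing a BDR SCB",
    "Feb  9 02:25:25 mail /kernel: (da1:ahc0:0:6:0): SCB 0xd6 - timed out while",
    "idle, SEQADDR == 0x166",
]

if __name__ == "__main__":
    for (dev, target), n in count_scb_timeouts(SAMPLE).items():
        print(f"{dev} (target {target}): {n} timeout(s)")
```

To run it against a real log, pass an open file instead of SAMPLE, for
example count_scb_timeouts(open("/var/log/messages")).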

--------------------------------------------
David Winkler
Network Administrator
Alanet Internet Services
3246 Montgomery Highway
Sun Plaza, Suite 114
Dothan, Alabama 36303
Phone: (334) 702-2949
Fax:   (334) 702-2887
email: dwinkler@ala.net


