Skip site navigation (1)Skip section navigation (2)
Date:      Sat, 4 Jan 1997 08:45:17 -0800 (PST)
From:      markd@grizzly.com
To:        FreeBSD-gnats-submit@freebsd.org
Subject:   kern/2371: SCSI disk corruption
Message-ID:  <199701041645.IAA00351@osprey.grizzly.com>
Resent-Message-ID: <199701041650.IAA01768@freefall.freebsd.org>

next in thread | raw e-mail | index | archive | help

>Number:         2371
>Category:       kern
>Synopsis:       SCSI disk corruption
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sat Jan  4 08:50:01 PST 1997
>Last-Modified:
>Originator:     Mark Diekhans
>Organization:
>Release:        FreeBSD 2.2-BETA_A i386
>Environment:

    FreeBSD 2.2-BETA_A #0: Sun Dec 29 20:13:01 PST 1996
        markd@osprey.grizzly.com:/usr/src/sys/compile/OSPREY
    Calibrating clock(s) relative to mc146818A clock ... i586 clock: 166188714 Hz, i8254 clock: 1193152 Hz
    CPU: Pentium (166.19-MHz 586-class CPU)
      Origin = "GenuineIntel"  Id = 0x52c  Stepping=12
      Features=0x1bf<FPU,VME,DE,PSE,TSC,MSR,MCE,CX8>
    real memory  = 67108864 (65536K bytes)
    avail memory = 64073728 (62572K bytes)
    Probing for devices on PCI bus 0:
    chip0 <Intel 82439> rev 1 on pci0:0
    chip1 <Intel 82371SB PCI-ISA bridge> rev 0 on pci0:7:0
    chip2 <Intel 82371SB IDE interface> rev 0 on pci0:7:1
    vga0 <VGA-compatible display device> rev 1 int a irq 10 on pci0:9
    ahc0 <Adaptec 2940 SCSI host adapter> rev 0 int a irq 11 on pci0:12
    ahc0: aic7870 Single Channel, SCSI Id=7, 16/255 SCBs
    ahc0 waiting for scsi devices to settle
    ahc0: target 0 Tagged Queuing Device
    (ahc0:0:0): "HP C3725S 6039" type 0 fixed SCSI 2
    sd0(ahc0:0:0): Direct-Access 2047MB (4194058 512 byte sectors)
    ahc0: target 1 Tagged Queuing Device
    (ahc0:1:0): "SEAGATE ST32550N 0015" type 0 fixed SCSI 2
    sd1(ahc0:1:0): Direct-Access 2047MB (4194058 512 byte sectors)
    ahc0:A:5: refuses syncronous negotiation. Using asyncronous transfers
    (ahc0:5:0): "IOMEGA ZIP 100 D.09" type 0 removable SCSI 2
    sd2(ahc0:5:0): Direct-Access 
    sd2(ahc0:5:0): ILLEGAL REQUEST asc:24,0 Invalid field in CDB
    sd2 could not mode sense (4). Using ficticious geometry

    sd2(ahc0:5:0): NOT READY asc:3a,0 Medium not present
    sd2: could not get size
    0MB (0 512 byte sectors)
    Probing for devices on the ISA bus:
    sc0 at 0x60-0x6f irq 1 on motherboard
    sc0: VGA color <4 virtual consoles, flags=0x0>
    sio0 at 0x3f8-0x3ff irq 4 on isa
    sio0: type 16550A
    sio1 at 0x2f8-0x2ff irq 3 on isa
    sio1: type 16550A
    sio2 not found at 0x3e8
    lpt0 at 0x378-0x37f irq 7 on isa
    lpt0: Interrupt-driven port
    lp0: TCP/IP capable interface
    psm0 at 0x60-0x64 irq 12 on motherboard
    psm0: device ID 0, 2 buttons?
    fdc0 at 0x3f0-0x3f7 irq 6 drq 2 on isa
    fdc0: NEC 72065B
    fd0: 1.44MB 3.5in
    npx0 on motherboard
    npx0: INT 16 interface

AHC options:
    controller	ahc0
    options	AHC_TAGENABLE
    options	AHC_SCBPAGING_ENABLE
    options 	AHC_ALLOW_MEMIO

>Description:
    As with many reports of disk corruption, this might not be very useful
due to lack of solid data...

    System was discovered crashed, on reboot, the second scsi disk was corrupt.
Fsck reported a large number of errors and it was eventually run with -y.
When remounted, all files where in lost+found.

    The last major activity before the crash was formatting of sd2, an IOMega
Zip drive.  SCSI format, partition and newfs had generated the following
message:

    Jan  3 22:47:13 osprey /kernel: sd2(ahc0:5:0): UNIT ATTENTION asc:28,0
    Jan  3 22:47:13 osprey /kernel: sd2(ahc0:5:0):  Not ready to ready transition, medium may have changed
    Jan  3 22:47:13 osprey /kernel: Debugger("sdopen: no slices") called.
    Jan  3 22:47:13 osprey /kernel: sd2(ahc0:5:0): ILLEGAL REQUEST asc:24,0 Invalid field in CDB
    Jan  3 22:47:13 osprey /kernel: sd2 could not mode sense (4). Using ficticious geometry
    Jan  3 22:47:13 osprey /kernel: sd2: invalid primary partition table: no magic

The system was alive for two more hours, but only doing routine tasks.

The day befofe the following errors where reported in the log:

    Jan  2 02:34:51 osprey /kernel: st0(ahc0:2:0): timed out in command phase, SCSISIGI == 0x84
    Jan  2 02:34:51 osprey /kernel: SEQADDR == 0x5a
    Jan  2 02:34:51 osprey /kernel: st0(ahc0:2:0): abort message in message buffer
    Jan  2 02:34:53 osprey /kernel: st0(ahc0:2:0): timed out in dataout phase, SCSISIGI == 0x0
    Jan  2 02:34:53 osprey /kernel: SEQADDR == 0xe
    Jan  2 02:34:53 osprey /kernel: ahc0: Issued Channel A Bus Reset. 1 SCBs aborted
    Jan  2 02:34:53 osprey /kernel: st0(ahc0:2:0): Target Busy
    Jan  2 02:34:53 osprey /kernel: st0(ahc0:2:0): Target Busy
    Jan  2 02:34:53 osprey /kernel: sd0(ahc0:0:0): UNIT ATTENTION info?:4020040 asc:29,0
    Jan  2 02:34:53 osprey /kernel: sd0(ahc0:0:0):  Power on, reset, or bus device reset occurred
    Jan  2 02:34:53 osprey /kernel: , retries:4
    Jan  2 02:34:53 osprey /kernel: sd0(ahc0:0:0): NOT READY info?:4020040 asc:4,1
    Jan  2 02:34:53 osprey /kernel: sd0(ahc0:0:0):  Logical unit is in process of becoming ready
    Jan  2 02:34:53 osprey /kernel: , retries:3
    Jan  2 02:34:53 osprey /kernel: sd0(ahc0:0:0): ABORTED COMMAND info?:4020040 asc:4e,0 Overlapped commands attempted
    Jan  2 02:34:53 osprey /kernel: , retries:2
    Jan  2 02:34:53 osprey /kernel: sd0(ahc0:0:0): NOT READY info?:4020040 asc:4,1
    Jan  2 02:34:53 osprey /kernel: sd0(ahc0:0:0):  Logical unit is in process of becoming ready
    Jan  2 02:34:53 osprey /kernel: , retries:1
    Jan  2 02:35:03 osprey /kernel: sd0(ahc0:0:0): timed out in dataout phase, SCSISIGI == 0x0
    Jan  2 02:35:03 osprey /kernel: SEQADDR == 0xb
    Jan  2 02:35:17 osprey /kernel: SEQADDR == 0xf
    Jan  2 02:35:17 osprey /kernel: ahc0: Issued Channel A Bus Reset. 1 SCBs aborted
    Jan  2 02:35:17 osprey /kernel: spec_getpages: I/O read error
    Jan  2 02:35:17 osprey /kernel: vm_fault: pager input (probably hardware) error, PID 4026 failure
    Jan  2 02:35:17 osprey /kernel: sd0(ahc0:0:0): NOT READY info?:4020040 asc:4,1
    Jan  2 02:35:17 osprey /kernel: sd0(ahc0:0:0):  Logical unit is in process of becoming ready
    Jan  2 02:35:17 osprey /kernel: , retries:4
    Jan  2 02:35:17 osprey /kernel: sd0(ahc0:0:0): NOT READY info?:4020040 asc:4,1
    Jan  2 02:35:17 osprey /kernel: sd0(ahc0:0:0):  Logical unit is in process of becoming ready
    Jan  2 02:35:17 osprey /kernel: , retries:3
    Jan  2 02:35:17 osprey /kernel: sd0(ahc0:0:0): NOT READY info?:4020040 asc:4,1
    Jan  2 02:35:17 osprey /kernel: sd0(ahc0:0:0):  Logical unit is in process of becoming ready
    Jan  2 02:35:17 osprey /kernel: , retries:2
    Jan  2 02:35:17 osprey /kernel: pid 4026 (bash), uid 0: exited on signal 11 (core dumped)
    Jan  2 02:35:17 osprey /kernel: sd0(ahc0:0:0): timed out in dataout phase, SCSISIGI == 0x0
    Jan  2 02:35:17 osprey /kernel: SEQADDR == 0xf
    Jan  2 02:35:17 osprey /kernel: sd0(ahc0:0:0): timed out in dataout phase, SCSISIGI == 0x0
    Jan  2 02:35:17 osprey /kernel: SEQADDR == 0xe
    Jan  2 02:35:17 osprey /kernel: ahc0: Issued Channel A Bus Reset. 1 SCBs aborted
    Jan  2 02:35:17 osprey /kernel: sd0(ahc0:0:0): UNIT ATTENTION info?:4020040 asc:29,0
    Jan  2 02:35:17 osprey /kernel: sd0(ahc0:0:0):  Power on, reset, or bus device reset occurred
    Jan  2 02:35:17 osprey /kernel: , retries:3
    Jan  2 02:35:18 osprey /kernel: sd0(ahc0:0:0): NOT READY info?:4020040 asc:4,1
    Jan  2 02:35:18 osprey /kernel: sd0(ahc0:0:0):  Logical unit is in process of becoming ready
    Jan  2 02:35:18 osprey /kernel: , retries:2
    Jan  2 02:35:18 osprey /kernel: sd0(ahc0:0:0): NOT READY info?:4020040 asc:4,1
    Jan  2 02:35:18 osprey /kernel: sd0(ahc0:0:0):  Logical unit is in process of becoming ready
    Jan  2 02:35:18 osprey /kernel: , retries:1
    Jan  2 02:35:18 osprey /kernel: sd0(ahc0:0:0): NOT READY info?:4020040 asc:4,1
    Jan  2 02:35:18 osprey /kernel: sd0(ahc0:0:0):  Logical unit is in process of becoming ready
    Jan  2 02:35:18 osprey /kernel: , FAILURE
    Jan  2 02:36:42 osprey /kernel: sd1(ahc0:1:0): UNIT ATTENTION asc:29,0
    Jan  2 02:36:42 osprey /kernel: sd1(ahc0:1:0):  Power on, reset, or bus device reset occurred field replaceable unit: 1
    Jan  2 02:36:42 osprey /kernel: , retries:4

Note: AHC_TAGENABLE and AHC_SCBPAGING_ENABLE and AHC_ALLOW_MEMIO where enabled.

Since I gave up driver writing before SCSI was invented, I can't very useful
in the analysis.

>How-To-Repeat:

>Fix:
>Audit-Trail:
>Unformatted:



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?199701041645.IAA00351>