Date:      Sat,  9 Sep 2000 07:00:39 -0700 (PDT)
From:      dl@leo.org
To:        freebsd-gnats-submit@FreeBSD.org
Subject:   kern/21148: multiple crashes while using vinum
Message-ID:  <20000909140039.91D4337B424@hub.freebsd.org>


>Number:         21148
>Category:       kern
>Synopsis:       multiple crashes while using vinum
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sat Sep 09 07:10:00 PDT 2000
>Closed-Date:
>Last-Modified:
>Originator:     Daniel Lang
>Release:        4.1-STABLE
>Organization:
TU Muenchen
>Environment:
FreeBSD atleo4.leo.org 4.1-STABLE FreeBSD 4.1-STABLE #0: Fri Sep  8 10:24:40 CEST 2000     root@atleo2.leo.org:/usr/obj/usr/src/sys/ATLEO4  i386

>Description:
The machine crashed repeatedly after a vinum raid5 was set up
and used heavily.

Hardware: Dell Poweredge 6100/200 4xPPro SMP machine, with 3
Adaptec SCSI controllers and one Promise Fasttrack ATA100 IDE
controller... see dmesg:

dmesg output:
Copyright (c) 1992-2000 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 4.1-STABLE #0: Fri Sep  8 10:24:40 CEST 2000
    root@atleo2.leo.org:/usr/obj/usr/src/sys/ATLEO4
Timecounter "i8254"  frequency 1193182 Hz
CPU: Pentium Pro (198.95-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x619  Stepping = 9
  Features=0xfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV>
real memory  = 536870912 (524288K bytes)
avail memory = 518316032 (506168K bytes)
Programming 16 pins in IOAPIC #0
IOAPIC #0 intpin 2 -> irq 0
FreeBSD/SMP: Multiprocessor motherboard
 cpu0 (BSP): apic id:  0, version: 0x00040011, at 0xfec08000
 cpu1 (AP):  apic id:  4, version: 0x00040011, at 0xfec08000
 cpu2 (AP):  apic id:  1, version: 0x00040011, at 0xfec08000
 cpu3 (AP):  apic id:  2, version: 0x00040011, at 0xfec08000
 io0 (APIC): apic id: 14, version: 0x000f0011, at 0xfec00000
Preloaded elf kernel "kernel" at 0xc0401000.
Pentium Pro MTRR support enabled
md0: Malloc disk
npx0: <math processor> on motherboard
npx0: INT 16 interface
pcib0: <Intel 82454KX/GX (Orion) host to PCI bridge> on motherboard
pci0: <PCI bus> on pcib0
fxp0: <Intel Pro 10/100B/100+ Ethernet> port 0xff80-0xff9f mem 0xfe900000-0xfe9fffff,0xfe2ff000-0xfe2fffff irq 10 at device 11.0 on pci0
fxp0: Ethernet address 00:a0:c9:99:47:2c
ahc0: <Adaptec 2940 Ultra SCSI adapter> port 0xfc00-0xfcff mem 0xfeaff000-0xfeafffff irq 11 at device 12.0 on pci0
ahc0: aic7880 Wide Channel A, SCSI Id=7, 16/255 SCBs
isab0: <Intel 82375EB PCI-EISA bridge> at device 14.0 on pci0
eisa0: <EISA bus> on isab0
mainboard0: <INT31c0 (System Board)> on eisa0 slot 0
isa0: <ISA bus> on isab0
chip0: <> mem 0xfffffc00-0xffffffff,0xfffffc00-0xffffffff,0xfffffc00-0xffffffff,0xfffffc00-0xffffffff,0xfffffc00-0xffffffff,0xfec01000-0xfec013ff at device 15.0 on pci0
chip1: <Intel 82453KX/GX (Orion) PCI memory controller> at device 20.0 on pci0
pcib1: <Intel 82454KX/GX (Orion) host to PCI bridge> on motherboard
pci1: <PCI bus> on pcib1
ahc1: <Adaptec aic7880 Ultra SCSI adapter> port 0xec00-0xecff mem 0xfe1ff000-0xfe1fffff irq 5 at device 11.0 on pci1
ahc1: Using left over BIOS settings
ahc1: aic7880 Wide Channel A, SCSI Id=7, 16/255 SCBs
ahc2: <Adaptec aic7880 Ultra SCSI adapter> port 0xe800-0xe8ff mem 0xfe1fe000-0xfe1fefff irq 5 at device 12.0 on pci1
ahc2: aic7880 Wide Channel A, SCSI Id=7, 16/255 SCBs
ahc2: Host Adapter Bios disabled.  Using default SCSI device parameters
atapci0: <Promise ATA100 controller> port 0xe480-0xe4bf,0xe4f0-0xe4f3,0xe4e8-0xe4ef,0xe4f4-0xe4f7,0xe4f8-0xe4ff mem 0xfe1a0000-0xfe1bffff irq 9 at device 13.0 on pci1
ata2: at 0xe4f8 on atapci0
ata3: at 0xe4e8 on atapci0
fdc0: <NEC 72065B or clone> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> flags 0x1 irq 1 on atkbdc0
kbd0 at atkbd0
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: model Generic PS/2 mouse, device ID 0
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <12 virtual consoles, flags=0x100>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A, console
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
ppc0: parallel port not found.
APIC_IO: routing 8254 via IOAPIC #0 intpin 2
IP packet filtering initialized, divert enabled, rule-based forwarding enabled, default to accept, logging limited to 100 packets/entry by default
IPv6 packet filtering initialized, default to accept, logging limited to 100 packets/entry
IPsec: Initialized Security Association Processing.
IP Filter: v3.4.8 initialized.  Default = pass all, Logging = enabled
SMP: AP CPU #1 Launched!
SMP: AP CPU #2 Launched!
SMP: AP CPU #3 Launched!
ad0: 73308MB <IBM-DTLA-307075> [148945/16/63] at ata2-master using UDMA100
ad1: 73308MB <IBM-DTLA-307075> [148945/16/63] at ata2-slave using UDMA100
ad2: 73308MB <IBM-DTLA-307075> [148945/16/63] at ata3-master using UDMA100
ad3: 73308MB <IBM-DTLA-307075> [148945/16/63] at ata3-slave using UDMA100
Waiting 3 seconds for SCSI devices to settle
pt0 at ahc1 bus 0 target 6 lun 0
pt0: <DELL 6UW BACKPLANE 7> Fixed Processor SCSI-2 device 
pt0: 3.300MB/s transfers
sa0 at ahc2 bus 0 target 6 lun 0
sa0: <ARCHIVE Python 29987-XXX 5.AM> Removable Sequential Access SCSI-2 device 
sa0: 4.545MB/s transfers (4.545MHz, offset 15)
ses0 at ahc1 bus 0 target 6 lun 0
ses0: <DELL 6UW BACKPLANE 7> Fixed Processor SCSI-2 device 
ses0: 3.300MB/s transfers
ses0: SAF-TE Compliant Device
da2 at ahc1 bus 0 target 2 lun 0
da2: <SEAGATE ST19171W 2224> Fixed Direct Access SCSI-2 device 
da2: 40.000MB/s transfers (20.000MHz, offset 8, 16bit), Tagged Queueing Enabled
da2: 8683MB (17783112 512 byte sectors: 64H 32S/T 8683C)
da3 at ahc1 bus 0 target 3 lun 0
da3: <SEAGATE ST19171W 2224> Fixed Direct Access SCSI-2 device 
da3: 40.000MB/s transfers (20.000MHz, offset 8, 16bit), Tagged Queueing Enabled
da3: 8683MB (17783112 512 byte sectors: 64H 32S/T 8683C)
da0 at ahc1 bus 0 target 0 lun 0
da0: <SEAGATE ST34572WC 0784> Fixed Direct Access SCSI-2 device 
da0: 40.000MB/s transfers (20.000MHz, offset 8, 16bit), Tagged Queueing Enabled
da0: 4095MB (8388315 512 byte sectors: 64H 32S/T 4095C)
da1 at ahc1 bus 0 target 1 lun 0
da1: <SEAGATE ST34572WC 0784> Fixed Direct Access SCSI-2 device 
da1: 40.000MB/s transfers (20.000MHz, offset 8, 16bit), Tagged Queueing Enabled
da1: 4095MB (8388315 512 byte sectors: 64H 32S/T 4095C)
ch0 at ahc2 bus 0 target 6 lun 1
ch0: <ARCHIVE Python 29987-XXX 5.AM> Removable Changer SCSI-2 device 
ch0: 4.545MB/s transfers (4.545MHz, offset 15)
ch0: 0 slots, 1 drive, 1 picker, 0 portals
Mounting root from ufs:/dev/da0s1a
WARNING: / was not properly dismounted
vinum: loaded
vinum: reading configuration from /dev/ad3s1e
vinum: updating configuration from /dev/ad2s1e
vinum: updating configuration from /dev/ad1s1e
vinum: updating configuration from /dev/ad0s1e
cd0 at ahc2 bus 0 target 5 lun 0
cd0: <NEC CD-ROM DRIVE:464 1.05> Removable CD-ROM SCSI-2 device 
cd0: 20.000MB/s transfers (20.000MHz, offset 15)
cd0: Attempt to query device size failed: NOT READY, Medium not present

Kernel Config file:

machine         i386
#cpu            I386_CPU
#cpu            I486_CPU
#cpu            I586_CPU
cpu             I686_CPU
ident           ATLEO4
maxusers        256

makeoptions     DEBUG=-g                #Build kernel with gdb(1) debug symbols

options         INET                    #InterNETworking
options         INET6                   #IPv6 communications protocols
options         IPSEC                   #IP security
options         IPSEC_ESP               #IP security (crypto; define w/ IPSEC)
options         IPSEC_DEBUG             #debug for IP security
options         MROUTING
options         IPFIREWALL              #firewall
options         IPFIREWALL_VERBOSE      #print information about
                                        # dropped packets
options         IPFIREWALL_FORWARD      #enable transparent proxy support
options         IPFIREWALL_VERBOSE_LIMIT=100    #limit verbosity
options         IPFIREWALL_DEFAULT_TO_ACCEPT    #allow everything by default
options         IPV6FIREWALL            #firewall for IPv6
options         IPV6FIREWALL_VERBOSE
options         IPV6FIREWALL_VERBOSE_LIMIT=100
options         IPV6FIREWALL_DEFAULT_TO_ACCEPT
options         IPDIVERT                #divert sockets
options         IPFILTER                #ipfilter support
options         IPFILTER_LOG            #ipfilter logging
options         IPSTEALTH               #support for stealth forwarding
options         TCPDEBUG
#options         TCP_DROP_SYNFIN         #drop TCP packets with SYN+FIN
options         TCP_RESTRICT_RST        #restrict emission of TCP RST
options         NETATALK                #Appletalk protocol


options         FFS                     #Berkeley Fast Filesystem
options         FFS_ROOT                #FFS usable as root device [keep this!]
options         SOFTUPDATES             #Enable FFS soft updates support
options         MFS                     #Memory Filesystem
options         MD_ROOT                 #MD is a potential root device
options         NFS                     #Network Filesystem
options         NFS_ROOT                #NFS usable as root device, NFS required

options         COMPAT_43               #Compatible with BSD 4.3 [KEEP THIS!]
options         SCSI_DELAY=3000 #Delay (in ms) before probing SCSI
options         UCONSOLE                #Allow users to grab the console
options         USERCONFIG              #boot -c editor
options         VISUAL_USERCONFIG       #visual boot -c editor
options         KTRACE                  #ktrace(1) support
options         SYSVSHM                 #SYSV-style shared memory
options         SYSVMSG                 #SYSV-style message queues
options         SYSVSEM                 #SYSV-style semaphores
options         P1003_1B                #Posix P1003_1B real-time extensions
options         _KPOSIX_PRIORITY_SCHEDULING
options         ICMP_BANDLIM            #Rate limit bad replies
options         KBD_INSTALL_CDEV        # install a CDEV entry in /dev
options         NETGRAPH

# To make an SMP kernel, the next two are needed
options         SMP                     # Symmetric MultiProcessor Kernel
options         APIC_IO                 # Symmetric (APIC) I/O
# Optionally these may need tweaked, (defaults shown):
options         NCPU=4                  # number of CPUs
options         NBUS=3                  # number of busses
options         NAPIC=1                 # number of IO APICs
options         NINTR=24                # number of INTs

device          isa
device          eisa
device          pci

# Floppy drives
device          fdc0    at isa? port IO_FD1 irq 6 drq 2
device          fd0     at fdc0 drive 0
device          fd1     at fdc0 drive 1

# ATA and ATAPI devices
#device         ata0    at isa? port IO_WD1 irq 14
#device         ata1    at isa? port IO_WD2 irq 15
device          ata
device          atadisk                 # ATA disk drives
device          atapicd                 # ATAPI CDROM drives
device          atapifd                 # ATAPI floppy drives
device          atapist                 # ATAPI tape drives
#options        ATA_STATIC_ID           #Static device numbering
options         ATA_ENABLE_ATAPI_DMA    #Enable DMA on ATAPI devices


# SCSI Controllers
#device         ahb             # EISA AHA1742 family
device          ahc0            # AHA2940 and onboard AIC7xxx devices
device          ahc1            # AHA2940 and onboard AIC7xxx devices
device          ahc2            # AHA2940 and onboard AIC7xxx devices

# SCSI peripherals
device          scbus           # SCSI bus (required)
device          da              # Direct Access (disks)
device          sa              # Sequential Access (tape etc)
device          ch              # SCSI media changers
device          cd              # CD
device          pass            # Passthrough device (direct SCSI access)
device          pt              # SCSI processor type
device          ses             # SCSI SES/SAF-TE driver

# disks
# the first ahc0 ist the external controller, which we use as last bus
# the first internal ahc1 is the first we use with the SCA disks
# the second internal ahc2 has the CD-ROM and the Archive Python
device          scbus0 at ahc1
device          scbus1 at ahc2
device          scbus2 at ahc0
device          da0 at scbus0 target 0
device          da1 at scbus0 target 1
device          da2 at scbus0 target 2
device          da3 at scbus0 target 3

# atkbdc0 controls both the keyboard and the PS/2 mouse
device          atkbdc0 at isa? port IO_KBD
device          atkbd0  at atkbdc? irq 1 flags 0x1
device          psm0    at atkbdc? irq 12

device          vga0    at isa?

# splash screen/screen saver
pseudo-device   splash

# syscons is the default console driver, resembling an SCO console
device          sc0     at isa? flags 0x100
options         MAXCONS=12              # number of virtual consoles
options         SC_NORM_ATTR="(FG_LIGHTGREY|BG_BLACK)"
options         SC_NORM_REV_ATTR="(FG_YELLOW|BG_GREEN)"
options         SC_KERNEL_CONS_ATTR="(FG_WHITE|BG_BLUE)"
options         SC_KERNEL_CONS_REV_ATTR="(FG_BLACK|BG_RED)"

# Floating point support - do not disable.
device          npx0    at nexus? port IO_NPX irq 13

# Power management support (see LINT for more options)
device          apm0    at nexus? disable flags 0x20 # Advanced Power Management

# PCCARD (PCMCIA) support

# Serial (COM) ports
device          sio0    at isa? port IO_COM1 flags 0x10 irq 4
device          sio1    at isa? port IO_COM2 irq 3
device          sio2    at isa? disable port IO_COM3 irq 5
device          sio3    at isa? disable port IO_COM4 irq 9

# Parallel port
device          ppc0    at isa? irq 7
device          ppbus           # Parallel port bus (required)
device          lpt             # Printer
device          plip            # TCP/IP over parallel
device          ppi             # Parallel port interface device
#device         vpo             # Requires scbus and da

# PCI Ethernet NICs.
device          de              # DEC/Intel DC21x4x (``Tulip'')
device          fxp             # Intel EtherExpress PRO/100B (82557, 82558)
device          tx              # SMC 9432TX (83c170 ``EPIC'')
device          vx              # 3Com 3c590, 3c595 (``Vortex'')
device          wx              # Intel Gigabit Ethernet Card (``Wiseman'')

# PCI Ethernet NICs that use the common MII bus controller code.
device          miibus          # MII bus support
device          dc              # DEC/Intel 21143 and various workalikes
device          rl              # RealTek 8129/8139
device          sf              # Adaptec AIC-6915 (``Starfire'')
device          sis             # Silicon Integrated Systems SiS 900/SiS 7016
device          ste             # Sundance ST201 (D-Link DFE-550TX)
device          tl              # Texas Instruments ThunderLAN
device          vr              # VIA Rhine, Rhine II
device          wb              # Winbond W89C840F
device          xl              # 3Com 3c90x (``Boomerang'', ``Cyclone'')

# ISA Ethernet NICs.

# Pseudo devices - the number indicates how many units to allocated.
pseudo-device   loop            # Network loopback
pseudo-device   ether           # Ethernet support
pseudo-device   sl      1       # Kernel SLIP
pseudo-device   ppp     1       # Kernel PPP
pseudo-device   tun             # Packet tunnel.
pseudo-device   pty     256     # Pseudo-ttys (telnet etc)
pseudo-device   md              # Memory "disks"
pseudo-device   gif     4       # IPv6 and IPv4 tunneling
pseudo-device   faith   1       # IPv6-to-IPv4 relaying (translation)
pseudo-device   vn
pseudo-device   snp     4

# The `bpf' pseudo-device enables the Berkeley Packet Filter.
# Be aware of the administrative consequences of enabling this!
pseudo-device   bpf             #Berkeley packet filter

# USB support
device          uhci            # UHCI PCI->USB interface
device          ohci            # OHCI PCI->USB interface
device          usb             # USB Bus (required)
device          ugen            # Generic
device          uhid            # "Human Interface Devices"
device          ukbd            # Keyboard
device          ulpt            # Printer
device          umass           # Disks/Mass storage - Requires scbus and da
device          ums             # Mouse
# USB Ethernet, requires mii
device          aue             # ADMtek USB ethernet
device          cue             # CATC USB ethernet
device          kue             # Kawasaki LSI USB ethernet

VINUM statements according to instructions on www.vinumvm.org:

Problem: Subsequent crashes (kernel panics) during heavy disk-access on a vinum
         device.
FreeBSD: 4.1-STABLE, no changes to the sources

Vinum list: one raid5 volume from 4 ATA drives

atleo4:/usr/src#vinum list
4 drives:
D d1                    State: up       Device /dev/ad0s1e      Avail: 0/73304 MB (0%)
D d2                    State: up       Device /dev/ad1s1e      Avail: 0/73304 MB (0%)
D d3                    State: up       Device /dev/ad2s1e      Avail: 0/73304 MB (0%)
D d4                    State: up       Device /dev/ad3s1e      Avail: 0/73304 MB (0%)
 
1 volumes:
V leoata                State: up       Plexes:       1 Size:        214 GB

1 plexes:
P leoata.p0          R5 State: up       Subdisks:     4 Size:        214 GB

4 subdisks:
S leoata.p0.s0          State: up       PO:        0  B Size:         71 GB
S leoata.p0.s1          State: up       PO:      512 kB Size:         71 GB
S leoata.p0.s2          State: up       PO:     1024 kB Size:         71 GB
S leoata.p0.s3          State: up       PO:     1536 kB Size:         71 GB 


The history file reflects the creation of the volume
which didn't cause any problems:

History file in: /var/log/vinum_history (not /var/tmp !):
[..]
 6 Sep 2000 17:41:13.473942 *** vinum started ***
 6 Sep 2000 17:41:13.475950 create -v vinum.init.leoata
drive d1 device /dev/ad0e
drive d2 device /dev/ad1e
drive d3 device /dev/ad2e
drive d4 device /dev/ad3e
volume leoata
  plex org raid5 512k
    sd length 150127097s drive d1
    sd length 150127097s drive d2
    sd length 150127097s drive d3
    sd length 150127097s drive d4
 6 Sep 2000 17:41:13.491734 *** Created devices ***
[..]
 6 Sep 2000 17:50:55.914542 *** vinum started ***
 6 Sep 2000 17:50:55.916405 init -w leoata.p0
[..]

/var/log/messages from the same period:
[..]
Sep  6 17:41:13 atleo4 /kernel: vinum: drive d1 is up
Sep  6 17:41:13 atleo4 /kernel: vinum: drive d2 is up
Sep  6 17:41:13 atleo4 /kernel: vinum: drive d3 is up
Sep  6 17:41:13 atleo4 /kernel: vinum: drive d4 is up
Sep  6 17:41:13 atleo4 /kernel: vinum: removing 1515 blocks of partial stripe at the end of leoata.p0
Sep  6 17:50:55 atleo4 /kernel: vinum: leoata.p0.s2 is initializing by force
Sep  6 17:50:55 atleo4 /kernel: vinum: leoata.p0 is initializing
Sep  6 17:50:55 atleo4 /kernel: vinum: leoata.p0.s0 is initializing by force
Sep  6 17:50:56 atleo4 /kernel: vinum: leoata.p0.s1 is initializing by force
Sep  6 17:50:56 atleo4 /kernel: vinum: leoata.p0.s3 is initializing by force
[..]
Sep  6 21:08:09 atleo4 /kernel: vinum: leoata.p0.s0 is initialized by force
Sep  6 21:08:10 atleo4 /kernel: vinum: leoata.p0.s0 is initialized
Sep  6 21:08:10 atleo4 /kernel: vinum: leoata.p0.s1 is initialized by force
Sep  6 21:08:10 atleo4 /kernel: vinum: leoata.p0.s1 is initialized
Sep  6 21:08:32 atleo4 /kernel: vinum: leoata.p0.s2 is initialized by force
Sep  6 21:08:32 atleo4 /kernel: vinum: leoata.p0.s2 is initialized
Sep  6 21:08:32 atleo4 /kernel: vinum: leoata.p0.s3 is initialized by force
Sep  6 21:08:32 atleo4 /kernel: vinum: leoata.p0.s0 is up
Sep  6 21:08:32 atleo4 /kernel: vinum: leoata.p0.s1 is up
Sep  6 21:08:32 atleo4 /kernel: vinum: leoata.p0.s2 is up
Sep  6 21:08:32 atleo4 /kernel: vinum: leoata.p0.s3 is up
Sep  6 21:08:32 atleo4 /kernel: vinum: leoata.p0 is up
Sep  6 21:08:32 atleo4 /kernel: vinum: leoata is up
Sep  6 21:08:32 atleo4 /kernel: vinum: leoata.p0.s3 is up
[..]

newfs, mount, etc worked.
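
As a cross-check of the figures above, here is a minimal sketch of the
stripe arithmetic. It uses only numbers quoted in this report (subdisk
length 150127097 sectors, 512k stripe, 4 subdisks). Two things are
assumptions on my part and not taken from the vinum sources: that the
capacity of one subdisk goes to parity, and that vinum list reports
binary gigabytes rounded down. Under those assumptions the result is
consistent with the "removing 1515 blocks" message and the 71 GB / 214 GB
sizes shown by vinum list.

```python
# Sketch only: reproduces the sizes in this report under stated assumptions.
SECTOR_BYTES = 512
SUBDISK_SECTORS = 150127097                    # "sd length 150127097s"
STRIPE_SECTORS = 512 * 1024 // SECTOR_BYTES    # "plex org raid5 512k" = 1024 sectors
SUBDISKS = 4
DATA_SUBDISKS = SUBDISKS - 1                   # assumption: one subdisk's worth is parity

# Whole stripes per subdisk, and the sectors left over at the end.
full_stripes, leftover = divmod(SUBDISK_SECTORS, STRIPE_SECTORS)

# Leftover sectors counted across the data-bearing share of the plex.
trimmed_blocks = leftover * DATA_SUBDISKS

# Usable plex size once the partial stripe is removed.
usable_sectors = full_stripes * STRIPE_SECTORS * DATA_SUBDISKS

GIB = 1024 ** 3                                # assumption: sizes are binary GB, rounded down
subdisk_gib = SUBDISK_SECTORS * SECTOR_BYTES // GIB
volume_gib = usable_sectors * SECTOR_BYTES // GIB

# Plex offsets (PO) of the subdisks, in kB: one stripe apart.
plex_offsets_kb = [i * 512 for i in range(SUBDISKS)]

print(leftover, trimmed_blocks, subdisk_gib, volume_gib, plex_offsets_kb)
```

So the configuration itself looks self-consistent; the numbers give no
hint of a mis-sized plex.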

Crash analysis: 4 crashes in total within two days!!
The machine did not crash before vinum was used on it.

I'm pretty sure that the modules and kernel are compiled
with debugging symbols, that is, configured with -g (CONFIGARGS= -g) and
with makeoptions     DEBUG=-g in the kernel config.

atleo4:/var/crash#file /modules/vinum.ko
/modules/vinum.ko: ELF 32-bit LSB shared object, Intel 80386, version 1 (FreeBSD), not stripped

atleo4:/var/crash#file kernel.1
kernel.1: ELF 32-bit LSB executable, Intel 80386, version 1 (FreeBSD), dynamically linked, not stripped
atleo4:/var/crash#file kernel.2
kernel.2: ELF 32-bit LSB executable, Intel 80386, version 1 (FreeBSD), dynamically linked, not stripped
atleo4:/var/crash#file kernel.3
kernel.3: ELF 32-bit LSB executable, Intel 80386, version 1 (FreeBSD), dynamically linked, not stripped
atleo4:/var/crash#file kernel.4
kernel.4: ELF 32-bit LSB executable, Intel 80386, version 1 (FreeBSD), dynamically linked, not stripped

But I don't seem to get a proper analysis with your .gdbinit.* files,
and gdb says: no debugging symbols found ???

Maybe there is something I missed, but what ???

However...

Crash 1:
atleo4:/var/crash#gdb -k kernel.1 vmcore.1
GNU gdb 4.18
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-unknown-freebsd"...
(no debugging symbols found)...
SMP 4 cpus
IdlePTD 4284416
initial pcb at 3608e0
panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
mp_lock = 00000002; cpuid = 0; lapic.id = 00000000
fault virtual address   = 0x0
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc23266ca
stack pointer           = 0x10:0xff806f00
frame pointer           = 0x10:0xff806f1c
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = Idle
interrupt mask          = bio  <- SMP: XXX
trap number             = 12
panic: page fault
mp_lock = 00000002; cpuid = 0; lapic.id = 00000000
boot() called on cpu#0

syncing disks...

Fatal trap 12: page fault while in kernel mode
mp_lock = 00000003; cpuid = 0; lapic.id = 00000000
fault virtual address   = 0x30
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc0273971
stack pointer           = 0x10:0xff806d20
frame pointer           = 0x10:0xff806d24
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = Idle
interrupt mask          = bio  <- SMP: XXX
trap number             = 12
panic: page fault
mp_lock = 00000003; cpuid = 0; lapic.id = 00000000
boot() called on cpu#0
Uptime: 1h18m17s

dumping to dev #da/0x20001, offset 1048576
dump 512 ...
---
#0  0xc016b6b8 in boot ()
.gdbinit:4: Error in sourced command file:
Attempt to extract a component of a value that is not a structure.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This may be because of missing debugging symbols ??
Stacktrace:
(kgdb) bt
#0  0xc016b6b8 in boot ()
#1  0xc016ba70 in poweroff_wait ()
#2  0xc02d9baf in trap_fatal ()
#3  0xc02d9845 in trap_pfault ()
#4  0xc02d93df in trap ()
#5  0xc0273971 in acquire_lock ()
#6  0xc0277660 in softdep_update_inodeblock ()
#7  0xc0272c5d in ffs_update ()
#8  0xc027a931 in ffs_sync ()
#9  0xc01993f3 in sync ()
#10 0xc016b48b in boot ()
#11 0xc016ba70 in poweroff_wait ()
#12 0xc02d9baf in trap_fatal ()
#13 0xc02d9845 in trap_pfault ()
#14 0xc02d93df in trap ()
#15 0xc23266ca in ?? ()
#16 0xc019136b in biodone ()
#17 0xc02af030 in ad_interrupt ()
#18 0xc02ab3e6 in ata_intr ()
#19 0xc02e202d in intr_mux ()

Crash 2:
[..]
SMP 4 cpus
IdlePTD 4284416
initial pcb at 3608e0
panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
mp_lock = 00000002; cpuid = 0; lapic.id = 00000000
fault virtual address   = 0xc3608010
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc232a112
stack pointer           = 0x10:0xff806ee8
frame pointer           = 0x10:0xff806ef0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = Idle
interrupt mask          = bio  <- SMP: XXX
trap number             = 12
panic: page fault
mp_lock = 00000002; cpuid = 0; lapic.id = 00000000
boot() called on cpu#0

syncing disks...

Fatal trap 12: page fault while in kernel mode
mp_lock = 00000003; cpuid = 0; lapic.id = 00000000
fault virtual address   = 0x30
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc0273971
stack pointer           = 0x10:0xff806d08
frame pointer           = 0x10:0xff806d0c
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = Idle
interrupt mask          = bio  <- SMP: XXX
trap number             = 12
panic: page fault
mp_lock = 00000003; cpuid = 0; lapic.id = 00000000
boot() called on cpu#0
Uptime: 14h5m17s
[..]

#0  0xc016b6b8 in boot ()
#1  0xc016ba70 in poweroff_wait ()
#2  0xc02d9baf in trap_fatal ()
#3  0xc02d9845 in trap_pfault ()
#4  0xc02d93df in trap ()
#5  0xc0273971 in acquire_lock ()
#6  0xc0277660 in softdep_update_inodeblock ()
#7  0xc0272c5d in ffs_update ()
#8  0xc027a931 in ffs_sync ()
#9  0xc01993f3 in sync ()
#10 0xc016b48b in boot ()
#11 0xc016ba70 in poweroff_wait ()
#12 0xc02d9baf in trap_fatal ()
#13 0xc02d9845 in trap_pfault ()
#14 0xc02d93df in trap ()
#15 0xc232a112 in ?? ()
#16 0xc2326bfc in ?? ()
#17 0xc019136b in biodone ()
#18 0xc02af030 in ad_interrupt ()
#19 0xc02ab3e6 in ata_intr ()
#20 0xc02e202d in intr_mux ()

Crash 3: This one is different ...

SMP 4 cpus
IdlePTD 4272128
initial pcb at 360920
panicstr: ffs_valloc: dup alloc
panic messages:
---
panic: ffs_valloc: dup alloc
mp_lock = 00000001; cpuid = 0; lapic.id = 00000000
boot() called on cpu#0

syncing disks... 166 38 19 5 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
giving up on 2 buffers
Uptime: 11h42m59s
[..]
#0  0xc016b6bc in boot ()
#1  0xc016ba74 in poweroff_wait ()
#2  0xc0270030 in ffs_valloc ()
#3  0xc02817ca in ufs_mkdir ()
#4  0xc02827d5 in ufs_vnoperate ()
#5  0xc019c28a in mkdir ()
#6  0xc02d9f09 in syscall2 ()
#7  0xc02c845b in Xint0x80_syscall ()
#8  0x804efc7 in ?? ()
#9  0x80494fd in ?? ()
[..]

Crash 4:

SMP 4 cpus
IdlePTD 4272128
initial pcb at 360920
panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
mp_lock = 03000002; cpuid = 3; lapic.id = 02000000
fault virtual address   = 0xc32c9010
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc232a112
stack pointer           = 0x10:0xff81bee8
frame pointer           = 0x10:0xff81bef0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = Idle
interrupt mask          = bio  <- SMP: XXX
trap number             = 12
panic: page fault
mp_lock = 03000002; cpuid = 3; lapic.id = 02000000
boot() called on cpu#3

syncing disks...

Fatal trap 12: page fault while in kernel mode
mp_lock = 03000003; cpuid = 3; lapic.id = 02000000
fault virtual address   = 0x30
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc027397d
stack pointer           = 0x10:0xff81bd00
frame pointer           = 0x10:0xff81bd04
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = Idle
interrupt mask          = bio  <- SMP: XXX
trap number             = 12
panic: page fault
mp_lock = 03000003; cpuid = 3; lapic.id = 02000000
boot() called on cpu#3
Uptime: 3h23m29s
[..]
(kgdb) bt
#0  0xc016b6bc in boot ()
#1  0xc016ba74 in poweroff_wait ()
#2  0xc02d9bdf in trap_fatal ()
#3  0xc02d9875 in trap_pfault ()
#4  0xc02d940f in trap ()
#5  0xc027397d in acquire_lock ()
#6  0xc0277b52 in softdep_fsync_mountdev ()
#7  0xc027bc9a in ffs_fsync ()
#8  0xc027a9c6 in ffs_sync ()
#9  0xc01993e7 in sync ()
#10 0xc016b48f in boot ()
#11 0xc016ba74 in poweroff_wait ()
#12 0xc02d9bdf in trap_fatal ()
#13 0xc02d9875 in trap_pfault ()
#14 0xc02d940f in trap ()
#15 0xc232a112 in ?? ()
#16 0xc2326bfc in ?? ()
#17 0xc019135f in biodone ()
#18 0xc02af068 in ad_interrupt ()
#19 0xc02ab41e in ata_intr ()
#20 0xc02e205d in intr_mux ()
[..]
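
To compare the page-fault crashes, here is a small sketch that pulls the
fault address and instruction pointer out of the first "Fatal trap 12"
of crashes 1, 2 and 4 and groups the crashes by faulting instruction.
The excerpts are copied from the panic messages above; the helper names
(parse_trap, group_by_ip) are made up for this illustration. Crash 3 is
left out because it is a plain panic (ffs_valloc: dup alloc) without a
trap.

```python
import re
from collections import defaultdict

# First trap of each page-fault crash, copied from the panic messages above.
PANIC_EXCERPTS = {
    1: "fault virtual address   = 0x0\n"
       "instruction pointer     = 0x8:0xc23266ca",
    2: "fault virtual address   = 0xc3608010\n"
       "instruction pointer     = 0x8:0xc232a112",
    4: "fault virtual address   = 0xc32c9010\n"
       "instruction pointer     = 0x8:0xc232a112",
}

FIELD_RE = re.compile(
    r"^(fault virtual address|instruction pointer)\s*=\s*(\S+)$", re.M
)

def parse_trap(text):
    """Return (fault_address, instruction_pointer) as integers."""
    fields = dict(FIELD_RE.findall(text))
    fault = int(fields["fault virtual address"], 16)
    # The instruction pointer is printed as selector:offset; keep the offset.
    ip = int(fields["instruction pointer"].split(":")[1], 16)
    return fault, ip

def group_by_ip(excerpts):
    """Map each faulting instruction pointer to the crashes that hit it."""
    groups = defaultdict(list)
    for crash, text in sorted(excerpts.items()):
        _fault, ip = parse_trap(text)
        groups[ip].append(crash)
    return dict(groups)

groups = group_by_ip(PANIC_EXCERPTS)
for ip, crashes in groups.items():
    print(f"{ip:#x}: crashes {crashes}")
```

Crashes 2 and 4 fault at the same instruction (0xc232a112, called from
0xc2326bfc), and crash 1 faults nearby at 0xc23266ca. gdb shows all of
these frames as ?? and they lie well above the kernel functions it does
resolve, so my guess (unverified) is that they are inside a loaded
module such as vinum.ko whose symbols gdb was not given.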

Of course this could be an ATA problem, but I already had
two crashes in a previous configuration while trying to
set up a stripe with two SCSI disks.
A detailed description of these previous problems was
sent to Greg Lehey <grog@lemis.com> on August 16 2000.






>How-To-Repeat:

Tricky, since this is a rather unique hardware configuration.
On this configuration it seems to be sufficient to
transfer huge amounts of data to the vinum device
(around 100GB have been transferred in total, interrupted
by the crashes; the largest portion within a single uptime
was probably around 50GB). The data was transferred via NFS.

The filesystem uses SOFTUPDATES. The first crash
corrupted it so severely that fsck had to
be run manually (producing lots of 'unexpected softupdates
inconsistency' errors). But I guess that's just a side-effect.
>Fix:
Nope.

>Release-Note:
>Audit-Trail:
>Unformatted:





