From owner-freebsd-bugs Tue Aug 8 7:30:15 2000
Delivered-To: freebsd-bugs@freebsd.org
Received: from freefall.freebsd.org (freefall.FreeBSD.ORG [204.216.27.21])
    by hub.freebsd.org (Postfix) with ESMTP id DCBFA37B8F1
    for ; Tue, 8 Aug 2000 07:30:01 -0700 (PDT)
    (envelope-from gnats@FreeBSD.org)
Received: (from gnats@localhost)
    by freefall.freebsd.org (8.9.3/8.9.2) id HAA25996;
    Tue, 8 Aug 2000 07:30:01 -0700 (PDT)
    (envelope-from gnats@FreeBSD.org)
Received: from tuminfo2.informatik.tu-muenchen.de (tuminfo2.informatik.tu-muenchen.de [131.159.0.81])
    by hub.freebsd.org (Postfix) with ESMTP id 1E75E37B7FE
    for ; Tue, 8 Aug 2000 07:22:33 -0700 (PDT)
    (envelope-from langd@leo.org)
Received: from atleo3.leo.org ([131.159.72.12] HELO atleo3.leo.org ident: NO-IDENT-SERVICE [port 2403])
    by tuminfo2.informatik.tu-muenchen.de with SMTP id <111677-226>;
    Tue, 8 Aug 2000 16:22:27 +0000
Received: by atleo3.leo.org (Postfix, from userid 20455)
    id 2B98F17408; Tue, 8 Aug 2000 16:22:25 +0200 (CEST)
Message-Id: <20000808142225.2B98F17408@atleo3.leo.org>
Date: Tue, 8 Aug 2000 16:22:26 +0000
From: dl@leo.org
Reply-To: dl@leo.org
To: FreeBSD-gnats-submit@freebsd.org
X-Send-Pr-Version: 3.2
Subject: kern/20484: FreeBSD 4.0 crashes repeatedly: trap 12: page fault while in kernel mode
Sender: owner-freebsd-bugs@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

>Number: 20484
>Category: kern
>Synopsis: FreeBSD 4.0 crashes repeatedly: trap 12: page fault while in kernel mode
>Confidential: no
>Severity: critical
>Priority: medium
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Tue Aug 08 07:30:01 PDT 2000
>Closed-Date:
>Last-Modified:
>Originator: Daniel Lang
>Release: FreeBSD 4.0-STABLE i386
>Organization: TU Muenchen
>Environment:

Hardware configuration:

Athlon 700 on ASUS K7V board, BIOS Revision 1005
256 MB ECC RAM (due to BIOS problems ECC mode disabled),
Adaptec 29160, 3com 905BX NIC, remaining configuration
see following dmesg output:

Relevant dmesg output:

Copyright (c) 1992-2000 The FreeBSD Project.
Copyright (c) 1982, 1986, 1989, 1991, 1993
    The Regents of the University of California. All rights reserved.
FreeBSD 4.0-STABLE #0: Fri Jul 7 12:36:02 CEST 2000
    root@atleo3.leo.org:/usr/obj/usr/src/sys/ATLEO3
Timecounter "i8254" frequency 1193182 Hz
CPU: AMD Athlon(tm) Processor (700.03-MHz 686-class CPU)
  Origin = "AuthenticAMD" Id = 0x621 Stepping = 1
  Features=0x183f9ff
  AMD Features=0xc0400000
real memory = 268419072 (262128K bytes)
avail memory = 256929792 (250908K bytes)
Preloaded elf kernel "kernel" at 0xc03b6000.
Pentium Pro MTRR support enabled
md0: Malloc disk
npx0: on motherboard
npx0: INT 16 interface
pcib0: on motherboard
pci0: on pcib0
pcib2: at device 1.0 on pci0
pci1: on pcib2
pci1: at 0.0 irq 11
isab0: at device 4.0 on pci0
isa0: on isab0
pci0: at 4.1
uhci0: port 0xb400-0xb41f irq 15 at device 4.2 on pci0
usb0: on uhci0
usb0: USB revision 1.0
uhub0: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhub0: port 1 power on failed, IOERROR
uhub0: port 2 power on failed, IOERROR
uhci1: port 0xb000-0xb01f irq 15 at device 4.3 on pci0
usb1: on uhci1
usb1: USB revision 1.0
uhub1: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhub1: port 1 power on failed, IOERROR
uhub1: port 2 power on failed, IOERROR
xl0: <3Com 3c905B-TX Fast Etherlink XL> port 0x9400-0x947f mem 0xe1000000-0xe100007f irq 12 at device 10.0 on pci0
xl0: Ethernet address: 00:a0:24:a6:86:04
miibus0: on xl0
xlphy0: <3Com internal media interface> on miibus0
xlphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
xl0: supplying EUI64: 00:a0:24:ff:fe:a6:86:04
ahc0: port 0x9000-0x90ff mem 0xe0800000-0xe0800fff irq 10 at device 11.0 on pci0
ahc0: aic7892 Wide Channel A, SCSI Id=7, 16/255 SCBs
pcib1: on motherboard
pci2: on pcib1
fdc0: at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO
enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
atkbdc0: at port 0x60,0x64 on isa0
atkbd0: flags 0x1 irq 1 on atkbdc0
kbd0 at atkbd0
vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
sc0: at flags 0x100 on isa0
sc0: VGA <12 virtual consoles, flags=0x100>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A, console
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
ppc0: at port 0x378-0x37f irq 7 on isa0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/8 bytes threshold
ppi0: on ppbus0
lpt0: on ppbus0
lpt0: Interrupt-driven port
plip0: on ppbus0
IP packet filtering initialized, divert enabled, rule-based forwarding enabled, default to accept, logging limited to 100 packets/entry by default
IPsec: Initialized Security Association Processing.
IPv6 packet filtering initialized, default to accept, logging limited to 100 packets/entry
IP Filter: initialized. Default = pass all, Logging = enabled
IP Filter: v3.3.8
Waiting 3 seconds for SCSI devices to settle
Mounting root from ufs:/dev/da0s1a
da0 at ahc0 bus 0 target 1 lun 0
da0: Fixed Direct Access SCSI-3 device
da0: 80.000MB/s transfers (40.000MHz, offset 31, 16bit), Tagged Queueing Enabled
da0: 8748MB (17916240 512 byte sectors: 255H 63S/T 1115C)
WARNING: / was not properly dismounted

Kernel config file:

machine i386
#cpu I386_CPU
#cpu I486_CPU
#cpu I586_CPU
cpu I686_CPU
ident ATLEO3
maxusers 256

options INET #InterNETworking
options INET6 #IPv6 communications protocols
options IPSEC #IP security
options IPSEC_ESP #IP security (crypto; define w/ IPSEC)
options IPSEC_DEBUG #debug for IP security
options MROUTING
options IPFIREWALL #firewall
options IPFIREWALL_VERBOSE #print information about
 # dropped packets
options IPFIREWALL_FORWARD #enable transparent proxy support
options IPFIREWALL_VERBOSE_LIMIT=100 #limit verbosity
#options IPFIREWALL_DEFAULT_TO_ACCEPT #allow everything by default
options IPV6FIREWALL #firewall for IPv6
options
IPV6FIREWALL_VERBOSE
options IPV6FIREWALL_VERBOSE_LIMIT=100
#options IPV6FIREWALL_DEFAULT_TO_ACCEPT
options IPDIVERT #divert sockets
options IPFILTER #ipfilter support
options IPFILTER_LOG #ipfilter logging
options IPSTEALTH #support for stealth forwarding
options TCPDEBUG
#options TCP_DROP_SYNFIN #drop TCP packets with SYN+FIN
options TCP_RESTRICT_RST #restrict emission of TCP RST
options FFS #Berkeley Fast Filesystem
options FFS_ROOT #FFS usable as root device [keep this!]
options SOFTUPDATES #Enable FFS soft updates support
options MFS #Memory Filesystem
options MD_ROOT #MD is a potential root device
options NFS #Network Filesystem
options NFS_ROOT #NFS usable as root device, NFS required
#options MSDOSFS #MSDOS Filesystem
#options CD9660 #ISO 9660 Filesystem
#options CD9660_ROOT #CD-ROM usable as root, CD9660 required
#options PROCFS #Process filesystem
options COMPAT_43 #Compatible with BSD 4.3 [KEEP THIS!]
options SCSI_DELAY=3000 #Delay (in ms) before probing SCSI
options UCONSOLE #Allow users to grab the console
options USERCONFIG #boot -c editor
options VISUAL_USERCONFIG #visual boot -c editor
options KTRACE #ktrace(1) support
options SYSVSHM #SYSV-style shared memory
options SYSVMSG #SYSV-style message queues
options SYSVSEM #SYSV-style semaphores
options P1003_1B #Posix P1003_1B real-time extensions
options _KPOSIX_PRIORITY_SCHEDULING
options ICMP_BANDLIM #Rate limit bad replies
options KBD_INSTALL_CDEV # install a CDEV entry in /dev
options NETGRAPH

device isa
device eisa
device pci

# Floppy drives
device fdc0 at isa?
port IO_FD1 irq 6 drq 2
device fd0 at fdc0 drive 0
device fd1 at fdc0 drive 1

# SCSI Controllers
device ahc0 # AHA2940 and onboard AIC7xxx devices

# SCSI peripherals
device scbus # SCSI bus (required)
device da # Direct Access (disks)
device sa # Sequential Access (tape etc)
device cd # CD
device pass # Passthrough device (direct SCSI access)

# disks
device scbus0 at ahc0
device da0 at scbus0 target 0

# atkbdc0 controls both the keyboard and the PS/2 mouse
device atkbdc0 at isa? port IO_KBD
device atkbd0 at atkbdc? irq 1 flags 0x1
device psm0 at atkbdc? irq 12

device vga0 at isa?

# splash screen/screen saver
pseudo-device splash

# syscons is the default console driver, resembling an SCO console
device sc0 at isa? flags 0x100
options MAXCONS=12 # number of virtual consoles
options SC_NORM_ATTR="(FG_LIGHTGREY|BG_BLACK)"
options SC_NORM_REV_ATTR="(FG_YELLOW|BG_GREEN)"
options SC_KERNEL_CONS_ATTR="(FG_WHITE|BG_BLUE)"
options SC_KERNEL_CONS_REV_ATTR="(FG_BLACK|BG_RED)"

# Floating point support - do not disable.
device npx0 at nexus? port IO_NPX irq 13

# Power management support (see LINT for more options)
device apm0 at nexus? disable flags 0x20 # Advanced Power Management

# Serial (COM) ports
device sio0 at isa? port IO_COM1 flags 0x10 irq 4
device sio1 at isa? port IO_COM2 irq 3
device sio2 at isa? disable port IO_COM3 irq 5
device sio3 at isa? disable port IO_COM4 irq 9

# Parallel port
device ppc0 at isa? irq 7
device ppbus # Parallel port bus (required)
device lpt # Printer
device plip # TCP/IP over parallel
device ppi # Parallel port interface device
#device vpo # Requires scbus and da

# PCI Ethernet NICs.
device de # DEC/Intel DC21x4x (``Tulip'')
device fxp # Intel EtherExpress PRO/100B (82557, 82558)
device tx # SMC 9432TX (83c170 ``EPIC'')
device vx # 3Com 3c590, 3c595 (``Vortex'')
device wx # Intel Gigabit Ethernet Card (``Wiseman'')

# PCI Ethernet NICs that use the common MII bus controller code.
device miibus # MII bus support
device dc # DEC/Intel 21143 and various workalikes
device rl # RealTek 8129/8139
device sf # Adaptec AIC-6915 (``Starfire'')
device sis # Silicon Integrated Systems SiS 900/SiS 7016
device ste # Sundance ST201 (D-Link DFE-550TX)
device tl # Texas Instruments ThunderLAN
device vr # VIA Rhine, Rhine II
device wb # Winbond W89C840F
device xl # 3Com 3c90x (``Boomerang'', ``Cyclone'')

# Pseudo devices - the number indicates how many units to allocated.
pseudo-device loop # Network loopback
pseudo-device ether # Ethernet support
pseudo-device sl 1 # Kernel SLIP
pseudo-device ppp 1 # Kernel PPP
pseudo-device tun # Packet tunnel.
pseudo-device pty 256 # Pseudo-ttys (telnet etc)
pseudo-device md # Memory "disks"
pseudo-device gif 4 # IPv6 and IPv4 tunneling
pseudo-device faith 1 # IPv6-to-IPv4 relaying (translation)
pseudo-device vn
pseudo-device snp 4

# The `bpf' pseudo-device enables the Berkeley Packet Filter.
# Be aware of the administrative consequences of enabling this!
pseudo-device bpf #Berkeley packet filter

# USB support
device uhci # UHCI PCI->USB interface
device ohci # OHCI PCI->USB interface
device usb # USB Bus (required)
device ugen # Generic
device uhid # "Human Interface Devices"
device ukbd # Keyboard
device ulpt # Printer
device umass # Disks/Mass storage - Requires scbus and da
device ums # Mouse

# USB Ethernet, requires mii
device aue # ADMtek USB ethernet
device cue # CATC USB ethernet
device kue # Kawasaki LSI USB ethernet

Note on the ECC RAM:
Although the DIMMs are ECC capable (9 chips), the system/BIOS locked up
completely when the memory mode was set to ECC, and could only be recovered
by clearing the CMOS via the solder pins, so in this configuration
ECC was not enabled in the BIOS.
I checked the DIMMs with the boot-disk-based memtest-x86 2.3, which runs
a wide variety of extensive pattern tests on the memory (with and without
cache). It showed no errors, so I suspect this is not a hardware problem.
>Description:

The machine has crashed four times with this configuration, always for the
same reason (see below), but never after any reproducible action.
It was up for more than 20 days in between, but then crashed twice
within a few days.
The machine runs as a production IRCnet server and is therefore a prime
target for DoS attacks, but no particular attack was registered
(on the routers, etc.) when it crashed.

Here are the last two kgdb traces:

1.

(no debugging symbols found)...
IdlePTD 3964928
initial pcb at 3266a0
panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
fault virtual address = 0x20
fault code = supervisor read, page not present
instruction pointer = 0x8:0xc01bba4d
stack pointer = 0x10:0xc02fb320
frame pointer = 0x10:0xc02fb32c
code segment = base 0x0, limit 0xfffff, type 0x1b
 = DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = Idle
interrupt mask =
trap number = 12
panic: page fault

syncing disks...
Fatal trap 12: page fault while in kernel mode
fault virtual address = 0x30
fault code = supervisor read, page not present
instruction pointer = 0x8:0xc0253680
stack pointer = 0x10:0xc02fb158
frame pointer = 0x10:0xc02fb15c
code segment = base 0x0, limit 0xfffff, type 0x1b
 = DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = Idle
interrupt mask = bio
trap number = 12
panic: page fault
Uptime: 23d2h32m25s

dumping to dev #da/0x20001, offset 524320
dump 255 254 253 252 251 250 249 248 247 246 245 244 243 242 241 240 239 238 237 236 235 234 233 232 231 230 229 228 227 226 225 224 223 222 221 220 219 218 217 216 215 214 213 212 211 210 209 208 207 206 205 204 203 202 201 200 199 198 197 196 195 194 193 192 191 190 189 188 187 186 185 184 183 182 181 180 179 178 177 176 175 174 173 172 171 170 169 168 167 166 165 164 163 162 161 160 159 158 157 156 155 154 153 152 151 150 149 148 147 146 145 144 143 142 141 140 139 138 137 136 135 134 133 132 131 130 129 128 127 126 125 124 123 122 121 120 119 118 117 116 115 114 113 112 111 110 109 108 107 106 105 104 103 102 101 100 99 98 97 96 95 94 93 92 91 90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
---
#0 0xc0161034 in boot ()
(kgdb) bt
#0 0xc0161034 in boot ()
#1 0xc01613b8 in poweroff_wait ()
#2 0xc02a77ad in trap_fatal ()
#3 0xc02a7485 in trap_pfault ()
#4 0xc02a7083 in trap ()
#5 0xc0253680 in acquire_lock ()
#6 0xc025735c in softdep_update_inodeblock ()
#7 0xc025296d in ffs_update ()
#8 0xc025a534 in ffs_sync ()
#9 0xc018d517 in sync ()
#10 0xc0160e07 in boot ()
#11 0xc01613b8 in poweroff_wait ()
#12 0xc02a77ad in trap_fatal ()
#13 0xc02a7485 in trap_pfault ()
#14 0xc02a7083 in trap ()
#15 0xc01bba4d in tcp_timer_persist ()
#16 0xc01667b1 in softclock ()

AND
2.

(no debugging symbols found)...
IdlePTD 3964928
initial pcb at 3266a0
panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
fault virtual address = 0x20
fault code = supervisor read, page not present
instruction pointer = 0x8:0xc01bba4d
stack pointer = 0x10:0xc02fb320
frame pointer = 0x10:0xc02fb32c
code segment = base 0x0, limit 0xfffff, type 0x1b
 = DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = Idle
interrupt mask =
trap number = 12
panic: page fault

syncing disks...

Fatal trap 12: page fault while in kernel mode
fault virtual address = 0x30
fault code = supervisor read, page not present
instruction pointer = 0x8:0xc0253680
stack pointer = 0x10:0xc02fb158
frame pointer = 0x10:0xc02fb15c
code segment = base 0x0, limit 0xfffff, type 0x1b
 = DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = Idle
interrupt mask = bio
trap number = 12
panic: page fault
Uptime: 3d17h36m34s

dumping to dev #da/0x20001, offset 524320
dump 255 254 253 252 251 250 249 248 247 246 245 244 243 242 241 240 239 238 237 236 235 234 233 232 231 230 229 228 227 226 225 224 223 222 221 220 219 218 217 216 215 214 213 212 211 210 209 208 207 206 205 204 203 202 201 200 199 198 197 196 195 194 193 192 191 190 189 188 187 186 185 184 183 182 181 180 179 178 177 176 175 174 173 172 171 170 169 168 167 166 165 164 163 162 161 160 159 158 157 156 155 154 153 152 151 150 149 148 147 146 145 144 143 142 141 140 139 138 137 136 135 134 133 132 131 130 129 128 127 126 125 124 123 122 121 120 119 118 117 116 115 114 113 112 111 110 109 108 107 106 105 104 103 102 101 100 99 98 97 96 95 94 93 92 91 90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
---
#0
0xc0161034 in boot ()
(kgdb) bt
#0 0xc0161034 in boot ()
#1 0xc01613b8 in poweroff_wait ()
#2 0xc02a77ad in trap_fatal ()
#3 0xc02a7485 in trap_pfault ()
#4 0xc02a7083 in trap ()
#5 0xc0253680 in acquire_lock ()
#6 0xc025735c in softdep_update_inodeblock ()
#7 0xc025296d in ffs_update ()
#8 0xc025a534 in ffs_sync ()
#9 0xc018d517 in sync ()
#10 0xc0160e07 in boot ()
#11 0xc01613b8 in poweroff_wait ()
#12 0xc02a77ad in trap_fatal ()
#13 0xc02a7485 in trap_pfault ()
#14 0xc02a7083 in trap ()
#15 0xc01bba4d in tcp_timer_persist ()
#16 0xc01667b1 in softclock ()
(kgdb) quit

It is not a mistake that the two reports are identical: all pointers and
addresses appear to be the same. For that reason I guess a hardware failure
can be ruled out, but I have no experience in analysing crash dumps,
so I can't really tell.

I'm also wondering whether, and why, _two_ panics seem to occur.
Or is this just normal behaviour during a crash dump?

As this is a production machine that should have long uptimes,
I have not built a debugging kernel yet, but I will do so to get more
information from the next crash (if it happens).

>How-To-Repeat:

Well, I can't tell. It just happened, without any particular action
that could be identified as the trigger.

>Fix:

No idea, BUT:
I have installed a BIOS upgrade to revision 1007, which seems to have solved
the problem with ECC RAM, and updated to 4.1-STABLE. So the box is currently
running 4.1-STABLE with ECC enabled. I cannot tell yet whether it will be
stable from now on, but I will keep you up to date.

>Release-Note:
>Audit-Trail:
>Unformatted:

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-bugs" in the body of the message