Date: Sun, 27 Jan 2008 01:00:57 GMT
From: kevin brintnall <kbrint@rufus.net>
To: freebsd-gnats-submit@FreeBSD.org
Subject: kern/120026: kernel panic in "sis" driver (page fault while in kernel mode)
Message-ID: <200801270100.m0R10vDM049062@www.freebsd.org>
Resent-Message-ID: <200801270110.m0R1A1sH042126@freefall.freebsd.org>
>Number: 120026
>Category: kern
>Synopsis: kernel panic in "sis" driver (page fault while in kernel mode)
>Confidential: no
>Severity: serious
>Priority: low
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Sun Jan 27 01:10:01 UTC 2008
>Closed-Date:
>Last-Modified:
>Originator: kevin brintnall
>Release: 6.3-RELEASE
>Organization:
>Environment:
FreeBSD maguro.rufus.net 6.3-RELEASE FreeBSD 6.3-RELEASE #0: Sun Jan 20 13:03:27 CST 2008 root@maguro.rufus.net:/usr/obj/usr/src/sys/RUFUS
>Description:
I've been experiencing regular kernel panics on a 6.3-RELEASE machine. This
machine was previously stable under 6.2-RELEASE. The crashes are always in
the sis0 driver. The first crash I was able to capture looked like this:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0xbfe7a7e8
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc07f8a16
stack pointer           = 0x28:0xe6704c20
frame pointer           = 0x28:0xe6704c68
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 31 (irq19: sis0)
trap number             = 12
panic: page fault
cpuid = 0
Uptime: 59m55s
Dumping 3007 MB (2 chunks)
  chunk 0: 1MB (159 pages) ... ok
  chunk 1: 3007MB (769776 pages) 2991 2975 2959 2943 2927 2911 2895 2879 2863 2847 2831 2815 2799 2783 2767 2751 2735 2719 2703 2687 2671 2655 2639 2623 2607 2591 2575 2559 2543 2527 2511 2495 2479 2463 2447 2431 2415 2399 2383 2367 2351 2335 2319 2303 2287 2271 2255 2239 2223 2207 2191 2175 2159 2143 2127 2111 2095 2079 2063 2047 2031 2015 1999 1983 1967 1951 1935 1919 1903 1887 1871 1855 1839 1823 1807 1791 1775 1759 1743 1727 1711 1695 1679 1663 1647 1631 1615 1599 1583 1567 1551 1535 1519 1503 1487 1471 1455 1439 1423 1407 1391 1375 1359 1343 1327 13crash-1

The subsequent two captures looked like this:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x40
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc07f9bec
stack pointer           = 0x28:0xe6704c9c
frame pointer           = 0x28:0xe6704ca4
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 31 (irq19: sis0)
trap number             = 12
panic: page fault
cpuid = 0
GEOM_MIRROR: Device gm1: rebuilding provider ad0s1 stopped.
Uptime: 20m51s
Dumping 3007 MB (2 chunks)
  chunk 0: 1MB (159 pages) ... ok
  chunk 1: 3007MB (769776 pages) 2991 2975 2959 2943 2927 2911 2895 2879 2863 2847 2831 2815 2799 2783 2767 2751 2735 2719 2703 2687 2671 2655 2639 2623 2607 2591 2575 2559 2543 2527 2511 2495 2479 2463 2447 2431 2415 2399 2383 2367 2351 2335 2319 2303 2287 2271 2255 2239 2223 2207 2191 2175 2159 2143 2127 2111 2095 2079 2063 2047 2031 2015 1999 1983 1967 1951 1935 1919 1903 1887 1871 1855 1839 1823 1807 1791 1775 1759 1743 1727 1711 1695 1679 1663 1647 1631 1615 1599 1583 1567 1551 1535 1519 1503 1487 1471 1455 1439 1423 1407 1391 1375 1359 1343 1327 1311 1295 1279 1263 1247 1231 1215 1199 1183 1167 1151 1135 1119 1103 1087 1071 1055 1039 1023 1007 991 975 959 943 927 911 895 879 863 847 831 815 799 783 767 751 735 719 703 687 671 655 639 623 607 591 575 559 543 527 511 495 479 463 447 431 415 399 383 367 351 335 319 303 287 271 255 239 223 207 191 175 159 143 127 111 95 79 63 47 31 15 ... ok

The final crash was different: it hung the machine completely. I'm not sure
whether the hang was related to the crash or whether it was background_fsck
hanging the machine. I've had some problems with background_fsck in the past.

sis0: discard frame w/o packet header
panic: vm_fault: fault on nofault entry, addr: e3f7a000
cpuid = 0
GEOM_MIRROR: Device gm1: rebuilding provider ad0s1 stopped.
Uptime: 5m42s
Dumping 3007 MB (2 chunks)
  chunk 0: 1MB (159 pages) ... ok
  chunk 1: 3007MB (769776 pages)sis0: watchdog timeout
sis0: watchdog timeout
sis0: watchdog timeout
sis0: watchdog timeout
sis0: watchdog timeout
sis0: watchdog timeout

Although I have had several dumps to $dumpdev, savecore has never been able
to retrieve any of them, so I can't provide a backtrace at this time. I'm
attempting to deduce why savecore never produces anything:

savecore: error reading last dump header at offset 4296851456 in /dev/ad0s2b: Input/output error

This is the full boot:

/boot/kernel/acpi.ko text=0x46704 data=0x2440+0x1b8c syms=[0x4+0x8050+0x4+0xaec0]
Copyright (c) 1992-2008 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 6.3-RELEASE #0: Sun Jan 20 13:03:27 CST 2008
    root@maguro.rufus.net:/usr/obj/usr/src/sys/RUFUS
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Pentium(R) 4 CPU 2.80GHz (2796.35-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0xf29  Stepping = 9
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x4400<CNXT-ID,xTPR>
  Logical CPUs per core: 2
real memory  = 3154051072 (3007 MB)
avail memory = 3085287424 (2942 MB)
ACPI APIC Table: <AMIINT SiS740XX>
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
ioapic0 <Version 1.1> irqs 0-23 on motherboard
kbd1 at kbdmux0
acpi0: <AMIINT SiS740XX> on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
acpi_button0: <Power Button> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
agp0: <SiS 661 host to AGP bridge> mem 0xe0000000-0xe3ffffff at device 0.0 on pci0
pcib1: <PCI-PCI bridge> at device 1.0 on pci0
pci1: <PCI bus> on pcib1
pci1: <display, VGA> at device 0.0 (no driver attached)
isab0: <PCI-ISA bridge> at device 2.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <SiS 962/963 UDMA133 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xff00-0xff0f at device 2.5 on pci0
ata0: <ATA channel 0> on atapci0
ata1: <ATA channel 1> on atapci0
pci0: <multimedia, audio> at device 2.7 (no driver attached)
ohci0: <SiS 5571 USB controller> mem 0xcfff9000-0xcfff9fff irq 20 at device 3.0 on pci0
ohci0: [GIANT-LOCKED]
usb0: OHCI version 1.0, legacy support
usb0: <SiS 5571 USB controller> on ohci0
usb0: USB revision 1.0
uhub0: SiS OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 3 ports with 3 removable, self powered
ohci1: <SiS 5571 USB controller> mem 0xcfffa000-0xcfffafff irq 21 at device 3.1 on pci0
ohci1: [GIANT-LOCKED]
usb1: OHCI version 1.0, legacy support
usb1: <SiS 5571 USB controller> on ohci1
usb1: USB revision 1.0
uhub1: SiS OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 3 ports with 3 removable, self powered
ehci0: <EHCI (generic) USB 2.0 controller> mem 0xcfffb000-0xcfffbfff irq 23 at device 3.2 on pci0
ehci0: [GIANT-LOCKED]
usb2: EHCI version 1.0
usb2: companion controllers, 3 ports each: usb0 usb1
usb2: <EHCI (generic) USB 2.0 controller> on ehci0
usb2: USB revision 2.0
uhub2: SiS EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub2: 6 ports with 6 removable, self powered
sis0: <SiS 900 10/100BaseTX> port 0xd400-0xd4ff mem 0xcfff8000-0xcfff8fff irq 19 at device 4.0 on pci0
miibus0: <MII bus> on sis0
rlphy0: <RTL8201L 10/100 media interface> on miibus0
rlphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
sis0: Ethernet address: 00:0b:6a:59:ff:67
fdc0: <floppy drive controller> port 0x3f2-0x3f3,0x3f4-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: [FAST]
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A, console
pmtimer0 on isa0
orm0: <ISA Option ROM> at iomem 0xc0000-0xcbfff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x100>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
Timecounters tick every 1.000 msec
ad0: 114440MB <WDC WD1200JB-75CRA0 16.06V16> at ata0-master UDMA100
ad2: 114473MB <WDC WD1200JB-00REA0 20.00K20> at ata1-master UDMA100
SMP: AP CPU #1 Launched!
GEOM_MIRROR: Device gm1 created (id=1450529221).
GEOM_MIRROR: Device gm1: provider ad0s1 detected.
GEOM_MIRROR: Device gm1: provider ad2s1 detected.
GEOM_MIRROR: Device gm1: provider ad2s1 activated.
GEOM_MIRROR: Device gm1: provider mirror/gm1 launched.
GEOM_MIRROR: Device gm1: rebuilding provider ad0s1.
Trying to mount root from ufs:/dev/mirror/gm1a
WARNING: / was not properly dismounted
Loading configuration files.
kernel dumps on /dev/ad0s2b
Entropy harvesting: interrupts ethernet point_to_point kickstart.
swapon: adding /dev/ad0s2b as swap device
swapon: adding /dev/ad2s2b as swap device
Starting file system checks:
/dev/mirror/gm1a: 2010 files, 41571 used, 212244 free (2924 frags, 26165 blocks, 1.2% fragmentation)
/dev/mirror/gm1f: 22506 files, 3949371 used, 44943764 free (7292 frags, 5617059 blocks, 0.0% fragmentation)
.. lots of fsck
Mounting local file systems:.
Setting hostname: maguro.rufus.net.
kern.ipc.shm_use_phys: 0 -> 1
machdep.hyperthreading_allowed: 0 -> 1
kern.ipc.maxsockbuf: 262144 -> 16777216
net.inet.tcp.sendspace: 32768 -> 262144
net.inet.tcp.recvspace: 65536 -> 262144
net.inet.udp.recvspace: 41600 -> 2097152
net.inet.udp.maxdgram: 9216 -> 65535
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
        inet 127.0.0.1 netmask 0xff000000
sis0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        options=8<VLAN_MTU>
        inet 209.240.71.9 netmask 0xfffffff8 broadcast 209.240.71.15
        ether 00:0b:6a:59:ff:67
        media: Ethernet autoselect (100baseTX <full-duplex>)
        status: active
sis0: link state changed to UP
Starting pflog.
pflog0: promiscuous mode enabled
Enabling pf.
Jan 26 17:29:54 pflogd[311]: [priv]: msg PRIV_OPEN_LOG received
pf enabled
add net default: gateway 209.240.71.14
Additional routing options:.
Starting devd.
hw.acpi.cpu.cx_lowest: C1 sysctl: hw.acpi.cpu.cx_lowest: Invalid argument
Additional TCP options:.
Mounting NFS file systems:.
ELF ldconfig path: /lib /usr/lib /usr/lib/compat /usr/X11R6/lib /usr/local/lib /usr/local/lib/compat/pkg /usr/local/lib/apache2 /usr/local/lib/compat/pkg /usr/local/lib/graphviz /usr/local/lib/jabberd /usr/local/lib/pth /usr/local/lib/zsh /usr/local/lib/apache2 /usr/local/lib/compat/pkg /usr/local/lib/graphviz /usr/local/lib/jabberd /usr/local/lib/pth /usr/local/lib/zsh
a.out ldconfig path: /usr/lib/aout /usr/lib/compat/aout
Creating and/or trimming log files:.
Starting syslogd.
Checking for core dump on /dev/ad0s2b...
savecore: error reading last dump header at offset 4296851456 in /dev/ad0s2b: Input/output error
savecore: no dumps found
Jan 26 17:29:56 maguro savecore: error reading last dump header at offset 4296851456 in /dev/ad0s2b: Input/output error
Initial i386 initialization:.
Additional ABI support:.
Starting named.
Clearing /tmp (X related).
Starting socks5.
Starting local daemons:.
Updating motd.
Mounting late file systems:.
Starting ntpd.
Jan 26 17:30:00 maguro ntpd[615]: no IPv6 interfaces found
postfix/postfix-script: starting the Postfix mail system
Configuring syscons: blanktime.
Starting sshd.
Starting cron.
Local package initialization:.

Sat Jan 26 17:30:03 CST 2008

FreeBSD/i386 (maguro.rufus.net) (ttyd0)

login: Jan 26 17:32:10 maguro savecore: error reading last dump header at offset 4296851456 in /dev/ad0s2b: Input/output error
Jan 26 17:32:31 maguro savecore: error reading last dump header at offset 4296851456 in /dev/ad0s2b: Input/output error
>How-To-Repeat:
The panic occurs most often when the system is under high load.
"/usr/src# make cleandir" can cause the problem. Also, it is more likely to
happen during gmirror re-builds (which are common due to all the crashes).
However, it has also crashed a couple times under low load.
>Fix:
>Release-Note:
>Audit-Trail:
>Unformatted:
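
A note on the savecore failure reported above: the offset in the error message can be checked with a little arithmetic. The sketch below is only a worked calculation, not a diagnosis. It assumes that savecore looks for the trailing dump header one sector before the end of the dump device, and that /dev/ad0s2b uses 512-byte sectors; neither is stated in the report. The helper name describe_offset is made up for this example.

```python
# Worked arithmetic on the offset printed by savecore in this report.
# ASSUMPTIONS (not stated in the report): the trailing dump header sits
# one sector before the end of the dump device, and sectors are 512 bytes.

HEADER_OFFSET = 4296851456  # from "error reading last dump header at offset ..."
DUMP_SIZE_MB = 3007         # from "Dumping 3007 MB (2 chunks)"
SECTOR_SIZE = 512           # assumed sector size
FOUR_GIB = 1 << 32
MIB = 1 << 20


def describe_offset(offset, sector_size=SECTOR_SIZE):
    """Return a few derived facts about a byte offset on the dump device."""
    implied_size = offset + sector_size  # device size if header is the last sector
    return {
        "implied_size_mib": implied_size / MIB,
        "bytes_past_4gib": offset - FOUR_GIB,
        "sectors_past_4gib": (offset - FOUR_GIB) // sector_size,
        "fits_in_32_bits": offset < FOUR_GIB,
    }


info = describe_offset(HEADER_OFFSET)
for key, value in info.items():
    print(f"{key}: {value}")
print("device larger than dump:", info["implied_size_mib"] > DUMP_SIZE_MB)
```

Under those assumptions the swap partition is somewhat larger than the 3007 MB dump, and the failing read lands a little beyond the 4 GiB mark, so the offset does not fit in 32 bits. Whether that is related to the Input/output error is not something the report establishes.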