Date: Thu, 27 Sep 2007 15:11:51 -0700
From: Joao Pedras <jpedras@webvolution.net>
To: Kris Kennaway <kris@FreeBSD.org>
Cc: freebsd-current@freebsd.org
Subject: Re: lock up
Message-ID: <46FC2AA7.2040308@webvolution.net>
In-Reply-To: <46FADB35.2090708@FreeBSD.org>
References: <46FA0C14.10201@webvolution.net> <46FADB35.2090708@FreeBSD.org>
[-- Attachment #1 --]
Hi again,
I hope this info sheds some light on the problem. A similar system is being
set up to reproduce it there as well.
Thanks!
Joao
Kris Kennaway wrote:
> Joao Pedras wrote:
>> Greetings all!
>>
>> A system (Tyan S2932) I am testing CURRENT amd64 with is experiencing a
>> strange lock up. No panic, not much on the console, just a lock up,
>> freeze.
>>
>> I first noticed the issue while tailing a build in an ssh session over a
>> VPN connection. On the local network the issue doesn't seem to occur.
>> I can reproduce the lock up all the time.
>>
>> Last I tried, the system was running CURRENT from a couple hours ago.
>>
>> I have tried:
>>
>> - 4BSD and ULE
>> - switching network cards
>> - taking the IPMI card out (seems to work locally with freeipmi and
>> ipmitool remotely)
>> - enabled/disabled "redirection after post" (BIOS setting)
>> - without debug
>> - without IPv6 and friends (see rtfree below)
>>
>> dmesg and pciconf attached. The dmesg is after a lock up.
>> rtfree pops a few times before the lock up. I noticed from a recent post
>> some action was taken and the related patch is there (ie. today's
>> CURRENT).
>>
>> The system had a fresh install this past weekend and the LSI MegaRAID
>> array doesn't contain any data, it is just mounted. The system boots off
>> the onboard LSI SAS (couple disks in RAID 1).
>>
>> Thank you for your input.
>
> Break to DDB and obtain process traces, etc. See the developers handbook.
>
> Kris
>
[-- Attachment #2 --]
[root@daemons ~]# rtfree: 0xffffff0003bbd4b0 has 1 refs
KDB: enter: manual escape to debugger
[thread pid 20 tid 100021 ]
Stopped at kdb_enter+0x31: leave
db> ps
pid ppid pgrp uid state wmesg wchan cmd
26528 26527 18507 0 RV+ make
26527 26513 18507 0 S+ ppwait 0xffffff0003b44000 make
26513 26512 18507 0 S+ wait 0xffffff0003b448c0 sh
26512 24037 18507 0 R+ make
24037 24028 18507 0 S+ wait 0xffffff0003c62000 sh
24028 22952 18507 0 S+ select 0xffffffff80aa6d10 make
23793 23780 18507 0 R+ sh
23780 23774 18507 0 S+ select 0xffffffff80aa6d10 make
23774 22927 18507 0 S+ wait 0xffffff000e026000 sh
23604 23506 18507 0 R+ sh
23528 23506 18507 0 R+ sh
23506 23469 18507 0 S+ select 0xffffffff80aa6d10 make
23469 23456 18507 0 S+ wait 0xffffff0003b33460 sh
23456 23448 18507 0 S+ select 0xffffffff80aa6d10 make
23448 22927 18507 0 S+ wait 0xffffff00062808c0 sh
22952 22946 18507 0 S+ wait 0xffffff00065dc000 sh
22950 22941 18507 0 R+ sh
22946 22942 18507 0 S+ select 0xffffffff80aa6d10 make
22942 22927 18507 0 S+ wait 0xffffff00065dd8c0 sh
22941 22935 18507 0 S+ select 0xffffffff80aa6d10 make
22935 22927 18507 0 S+ wait 0xffffff0003ddc460 sh
22927 22926 18507 0 R+ make
22926 22475 18507 0 S+ wait 0xffffff00068778c0 sh
22475 22473 18507 0 S+ select 0xffffffff80aa6d10 make
22473 22388 18507 0 S+ wait 0xffffff0003afc8c0 sh
22388 18507 18507 0 S+ select 0xffffffff80aa6d10 make
18507 962 18507 0 S+ wait 0xffffff0003abc000 sh
962 961 962 0 S+ wait 0xffffff0003afc000 bash
961 959 961 1001 S+ wait 0xffffff0003a1c000 su
959 958 959 1001 Ss+ wait 0xffffff0003ba2000 bash
958 955 955 1001 R CPU 3 sshd
955 709 955 0 Ss sbwait 0xffffff0003e15c3c sshd
930 0 0 0 SL ipmireq 0xffffff0003a5e348 [ipmi0: kcs]
905 1 905 65 Ss select 0xffffffff80aa6d10 dhclient
885 1 884 0 S+ select 0xffffffff80aa6d10 dhclient
771 769 771 0 S+ ttyin 0xffffff0003945810 bash
770 1 770 0 Ss+ ttyin 0xffffff0003913810 getty
769 1 769 0 Ss+ wait 0xffffff0003cf8460 login
768 1 768 0 Ss+ ttyin 0xffffff000395b410 getty
767 1 767 0 Ss+ ttyin 0xffffff0003954c10 getty
725 1 725 0 Ss nanslp 0xffffffff80a1e128 cron
719 1 719 25 Ss pause 0xffffff0003bb40c0 sendmail
715 1 715 0 Ss select 0xffffffff80aa6d10 sendmail
709 1 709 0 Ss select 0xffffffff80aa6d10 sshd
594 1 594 0 Ss select 0xffffffff80aa6d10 syslogd
548 1 548 0 Ss select 0xffffffff80aa6d10 devd
167 1 167 0 Ss pause 0xffffff0003b35520 adjkerntz
54 0 0 0 SL - 0xffffffff80a1dda8 [schedcpu]
53 0 0 0 SL sdflush 0xffffffff80ab6d28 [softdepflush]
52 0 0 0 SL syncer 0xffffffff80a1dda0 [syncer]
51 0 0 0 SL vlruwt 0xffffff0003a21460 [vnlru]
50 0 0 0 SL psleep 0xffffffff80aa759c [bufdaemon]
49 0 0 0 SL pgzero 0xffffffff80ab87a4 [pagezero]
48 0 0 0 SL psleep 0xffffffff80ab7ae8 [vmdaemon]
47 0 0 0 SL psleep 0xffffffff80ab7aac [pagedaemon]
46 0 0 0 SL waiting_ 0xffffffff80aaaca8 [sctp_iterator]
45 0 0 0 RL CPU 0 [irq1: atkbd0]
44 0 0 0 SL - 0xffffff000394f448 [fdc0]
43 0 0 0 WL [swi0: sio]
42 0 0 0 WL [irq19: amr0]
41 0 0 0 SL idle 0xffffffff80e5a000 [mpt_recovery0]
40 0 0 0 WL [irq18: mpt0]
39 0 0 0 SL - 0xffffff0003880e00 [nfe1 taskq]
38 0 0 0 SL - 0xffffff0003696000 [nfe0 taskq]
37 0 0 0 SL idle 0xffffff0003430c00 [aic_recovery0]
36 0 0 0 WL [irq16: ahc0]
35 0 0 0 SL idle 0xffffff0003430c00 [aic_recovery0]
34 0 0 0 WL [irq20: atapci2]
33 0 0 0 WL [irq23: atapci1]
32 0 0 0 WL [irq15: ata1]
31 0 0 0 WL [irq14: ata0]
30 0 0 0 SL usbevt 0xffffff00012d8420 [usb1]
29 0 0 0 WL [irq22: ehci0]
28 0 0 0 SL usbtsk 0xffffffff80a19768 [usbtask-dr]
27 0 0 0 SL usbtsk 0xffffffff80a19740 [usbtask-hc]
26 0 0 0 SL usbevt 0xffffffff80e42420 [usb0]
25 0 0 0 WL [irq21: ohci0+]
24 0 0 0 WL [irq9: acpi0]
23 0 0 0 WL [swi2: cambio]
22 0 0 0 SL ccb_scan 0xffffffff809e84a0 [xpt_thrd]
9 0 0 0 SL - 0xffffff0001238080 [acpi_task_2]
8 0 0 0 SL - 0xffffff0001238080 [acpi_task_1]
7 0 0 0 SL - 0xffffff0001238080 [acpi_task_0]
6 0 0 0 SL - 0xffffff0001238100 [kqueue taskq]
21 0 0 0 WL [swi6: task queue]
20 0 0 0 RL CPU 1 [swi6: Giant taskq]
5 0 0 0 SL - 0xffffff00011f0480 [thread taskq]
19 0 0 0 RL CPU 2 [swi5: +]
18 0 0 0 SL - 0xffffffff80a1dda8 [yarrow]
4 0 0 0 SL - 0xffffffff80a1a718 [g_down]
3 0 0 0 SL - 0xffffffff80a1a710 [g_up]
2 0 0 0 SL - 0xffffffff80a1a700 [g_event]
17 0 0 0 WL [swi1: net]
16 0 0 0 WL [swi3: vm]
15 0 0 0 RL [swi4: clock sio]
14 0 0 0 RL [idle: cpu0]
13 0 0 0 RL [idle: cpu1]
12 0 0 0 RL [idle: cpu2]
11 0 0 0 RL [idle: cpu3]
1 0 1 0 SLs wait 0xffffff00010e48c0 [init]
10 0 0 0 SL audit_wo 0xffffffff80ab6200 [audit]
0 0 0 0 WLs [swapper]
24476 22927 18507 0 Z+ sh
24959 22927 18507 0 Z+ sh
25104 22927 18507 0 Z+ sh
db> trace
Tracing pid 20 tid 100021 td 0xffffff0001237680
kdb_enter() at kdb_enter+0x31
scgetc() at scgetc+0x461
sckbdevent() at sckbdevent+0xa4
kbdmux_intr() at kbdmux_intr+0x43
kbdmux_kbd_intr() at kbdmux_kbd_intr+0x20
taskqueue_run() at taskqueue_run+0x94
ithread_loop() at ithread_loop+0xe0
fork_exit() at fork_exit+0x12a
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffffffac2bcd30, rbp = 0 ---
db> panc ic
panic: from debugger
cpuid = 1
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
mi_switch() at mi_switch+0x39c
sched_bind() at sched_bind+0x83
boot() at boot+0x3f
panic() at panic+0x167
db_panic() at db_panic+0x17
db_command_loop() at db_command_loop+0x1dc
db_trap() at db_trap+0x6f
kdb_trap() at kdb_trap+0x95
trap() at trap+0x295
calltrap() at calltrap+0x8
--- trap 0x3, rip = 0xffffffff8049dff1, rsp = 0xffffffffac2bca90, rbp = 0xffffffffac2bcaa0 ---
kdb_enter() at kdb_enter+0x31
scgetc() at scgetc+0x461
sckbdevent() at sckbdevent+0xa4
kbdmux_intr() at kbdmux_intr+0x43
kbdmux_kbd_intr() at kbdmux_kbd_intr+0x20
taskqueue_run() at taskqueue_run+0x94
ithread_loop() at ithread_loop+0xe0
fork_exit() at fork_exit+0x12a
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffffffac2bcd30, rbp = 0 ---
db> continue
KDB: enter: manual escape to debugger
[thread pid 20 tid 100021 ]
Stopped at kdb_enter+0x31: leave
help
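
For anyone reading the listing above, here is a rough sketch of a script that summarises DDB "ps" output: which command is on each CPU, and how many entries are in each state. The column layout (pid, ppid, pgrp, uid, state, then an optional wmesg/wchan pair or "CPU n", then the command) is inferred from this transcript only and may differ on other versions. The names parse_ps_line and summarize are made up for illustration and are not part of any FreeBSD tool.

```python
import re
from collections import Counter

# "pid ppid pgrp uid state rest"; tolerates a leading "--More--" left by the DDB pager.
LINE_RE = re.compile(r"^(?:--More--\s*)?(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\S+)\s+(.*)$")
# Thread currently running: "CPU <n> <cmd>".
ON_CPU_RE = re.compile(r"^CPU\s+(\d+)\s+(.*)$")
# Sleeping thread: "<wmesg> <wchan address> <cmd>".
WAIT_RE = re.compile(r"^(\S+)\s+0x[0-9a-fA-F]+\s+(.*)$")


def parse_ps_line(line):
    """Return a dict for one ps row, or None for headers and other lines."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    pid, ppid, _pgrp, _uid, state, rest = m.groups()
    entry = {"pid": int(pid), "ppid": int(ppid), "state": state,
             "cpu": None, "wmesg": None, "cmd": rest}
    on_cpu = ON_CPU_RE.match(rest)
    waiting = WAIT_RE.match(rest)
    if on_cpu:
        entry["cpu"] = int(on_cpu.group(1))
        entry["cmd"] = on_cpu.group(2)
    elif waiting:
        entry["wmesg"] = waiting.group(1)
        entry["cmd"] = waiting.group(2)
    return entry


def summarize(lines):
    """Return ({cpu: cmd}, Counter of the first state letter) for the listing."""
    entries = [e for e in map(parse_ps_line, lines) if e is not None]
    on_cpu = {e["cpu"]: e["cmd"] for e in entries if e["cpu"] is not None}
    states = Counter(e["state"][0] for e in entries)
    return on_cpu, states


if __name__ == "__main__":
    # A few rows copied from the transcript above.
    sample = [
        "pid ppid pgrp uid state wmesg wchan cmd",
        "958 955 955 1001 R CPU 3 sshd",
        "45 0 0 0 RL CPU 0 [irq1: atkbd0]",
        "20 0 0 0 RL CPU 1 [swi6: Giant taskq]",
        "19 0 0 0 RL CPU 2 [swi5: +]",
        "24028 22952 18507 0 S+ select 0xffffffff80aa6d10 make",
    ]
    cpus, states = summarize(sample)
    for cpu in sorted(cpus):
        print("CPU", cpu, "->", cpus[cpu])
    print(dict(states))
```

Only the first letter of the state column is counted, so "S+", "Ss" and "SL" all count as sleeping. The script only reorganises what is already in the listing and does not by itself say why the machine locks up.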
