Date: Mon, 22 Oct 2012 16:19:13 -0700
From: Dennis Glatting <freebsd@pki2.com>
To: Attila Nagy <bra@fsn.hu>
Cc: freebsd-fs@FreeBSD.org
Subject: Re: ZFS HBAs + LSI chip sets (Was: ZFS hang (system #2))
Message-ID: <1350947953.86715.138.camel@btw.pki2.com>
In-Reply-To: <50856322.9070307@fsn.hu>
References: <1350698905.86715.33.camel@btw.pki2.com>
 <1350711509.86715.59.camel@btw.pki2.com>
 <50825598.3070505@FreeBSD.org>
 <1350744349.88577.10.camel@btw.pki2.com>
 <1350765093.86715.69.camel@btw.pki2.com>
 <508322EC.4080700@FreeBSD.org>
 <1350778257.86715.106.camel@btw.pki2.com>
 <50856322.9070307@fsn.hu>

On Mon, 2012-10-22 at 17:15 +0200, Attila Nagy wrote:
> Hi,
>
> On 10/21/2012 02:10 AM, Dennis Glatting wrote:
> > I chose the LSI2008 chip set because the code was donated by LSI, and
> > they therefore demonstrated interest in supporting their products under
> > FreeBSD, and that chip set is found in a lot of places, notably
> > Supermicro boards. Additionally, there were stories of success on the
> > lists for several boards. That said, I have received private email from
> > others expressing frustration with ZFS and the "hang" problems, which I
> > believe are also the LSI chips.
> >
> I have a Sun X4540, which shows similar symptoms. It has some (6)
> on-board LSI 1068E SAS controllers with 1.27.02.00-IT firmware (latest
> from Sun/Oracle) and 48 SATA disks.
> It runs stable/9@r240134.
>
> Currently the machine does a resilver on its 48 disk pool (heavy IO
> happens), which stops periodically.
> I've set up watchdogd with a command of "ls /data" (the pool is mounted
> there). It doesn't restart the machine when the IO freezes, because the
> command always succeeds (coming from cache I guess).
> But if something wants to touch the disks, it gets stuck in D state.
>
> zpool status shows:
> scan: resilver in progress since Sun Oct 21 15:40:50 2012
> 3.16T scanned out of 13.8T at 26.4M/s, 117h45m to go
> 133G resilvered, 22.82% done
> And the estimated time grows constantly.
> gstat shows no IO.
>

I've had this problem too.

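A rough cross-check of the zpool status figures quoted above can be sketched as follows. It assumes that the sizes use binary prefixes (T = 2^40 bytes, M = 2^20 bytes) and that the time left is simply the unscanned amount divided by the displayed rate. Both are assumptions made for illustration, not taken from the ZFS source, and the helper names are hypothetical.

```python
# Sketch only: the unit interpretation (binary prefixes) and the
# "remaining / rate" model are assumptions, not ZFS internals.
UNITS = {"K": 2**10, "M": 2**20, "G": 2**30, "T": 2**40}


def to_bytes(text):
    """Convert a zpool-style size such as '3.16T' or '26.4M' to bytes."""
    return float(text[:-1]) * UNITS[text[-1]]


def resilver_eta(scanned, total, rate):
    """Return (hours, minutes) left, assuming ETA = remaining / rate."""
    remaining = to_bytes(total) - to_bytes(scanned)
    seconds = remaining / to_bytes(rate)
    hours, rest = divmod(int(seconds), 3600)
    return hours, rest // 60


if __name__ == "__main__":
    # Figures taken from the quoted "zpool status" output.
    h, m = resilver_eta("3.16T", "13.8T", "26.4M")
    print(f"estimated time left: {h}h{m:02d}m")
```

Under these assumptions the result lands within about half an hour of the quoted 117h45m, which is plausible given that the displayed figures are rounded. With a fixed amount left to scan, any drop in the rate during a stall lengthens the estimate, which would be consistent with the growing time reported above.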
> If I issue an ls -R /data, it gets stuck:
> root 36217 0.0 0.0 14380 1800 3 D+ 4:45PM 0:00.00 ls -R /data/
> # procstat -k 36217
> PID TID COMM TDNAME KSTACK
> 36217 101469 ls - mi_switch sleepq_wait
> _cv_wait zio_wait dbuf_read dbuf_findbp dbuf_hold_impl dbuf_hold
> dmu_buf_hold zap_lockdir zap_cursor_retrieve zfs_freebsd_readdir
> kern_getdirentries sys_getdirentries amd64_syscall Xfast_syscall
>
> Also, a dd on any of the disks waits forever, without reading a single byte:
> root 36570 0.0 0.0 9876 1356 4 DL+ 4:46PM 0:00.00 dd
> if=/dev/da0 of=/dev/null
> # procstat -k 36570
> PID TID COMM TDNAME KSTACK
> 36570 101489 dd - mi_switch sleepq_wait
> _sleep bwait physio devfs_read_f dofileread kern_readv sys_read
> amd64_syscall Xfast_syscall
>
>
> Camcontrol works:
>
> # camcontrol devlist
> <ATA SEAGATE ST35002N SU0F> at scbus0 target 0 lun 0 (pass0,da0)
> <ATA SEAGATE ST35002N SU0F> at scbus0 target 1 lun 0 (pass1,da1)
> <ATA SEAGATE ST35002N SU0F> at scbus0 target 2 lun 0 (pass2,da2)
> <ATA HITACHI HDS7250S AJ0A> at scbus0 target 3 lun 0 (pass3,da3)
> <ATA SEAGATE ST35002N SU0F> at scbus0 target 4 lun 0 (pass4,da4)
> <ATA HITACHI HUA7250S AC5A> at scbus0 target 5 lun 0 (pass5,da5)
> <ATA SEAGATE ST35002N SU0F> at scbus0 target 6 lun 0 (pass6,da6)
> <ATA ST3500320NS SN04> at scbus0 target 7 lun 0 (pass7,da7)
> <ATA HITACHI HDS7250S AJ0A> at scbus1 target 0 lun 0 (pass8,da8)
> <ATA SEAGATE ST35002N SU0F> at scbus1 target 1 lun 0 (pass9,da9)
> <ATA SEAGATE ST35002N SU0F> at scbus1 target 2 lun 0 (pass10,da10)
> <ATA SEAGATE ST35002N SU0F> at scbus1 target 3 lun 0 (pass11,da11)
> <ATA SEAGATE ST35002N SU0F> at scbus1 target 4 lun 0 (pass12,da12)
> <ATA SEAGATE ST35002N SU0F> at scbus1 target 5 lun 0 (pass13,da13)
> <ATA SEAGATE ST35002N SU0F> at scbus1 target 6 lun 0 (pass14,da14)
> <ATA SEAGATE ST35002N SU0F> at scbus1 target 7 lun 0 (pass15,da15)
> <ATA SEAGATE ST35002N SU0F> at scbus2 target 0 lun 0 (pass16,da16)
> <ATA SEAGATE ST35002N SU0F> at scbus2 target 1 lun 0 (pass17,da17)
> <ATA HITACHI HUA7250S AC5A> at scbus2 target 2 lun 0 (pass18,da18)
> <ATA SEAGATE ST35002N SU0F> at scbus2 target 3 lun 0 (pass19,da19)
> <ATA SEAGATE ST35002N SU0F> at scbus2 target 4 lun 0 (pass20,da20)
> <ATA SEAGATE ST35002N SU0F> at scbus2 target 5 lun 0 (pass21,da21)
> <ATA SEAGATE ST35002N SU0F> at scbus2 target 6 lun 0 (pass22,da22)
> <ATA SEAGATE ST35002N SU0F> at scbus2 target 7 lun 0 (pass23,da23)
> <ATA SEAGATE ST35002N SU0F> at scbus3 target 0 lun 0 (pass24,da24)
> <ATA SEAGATE ST35002N SU0F> at scbus3 target 1 lun 0 (pass25,da25)
> <ATA SEAGATE ST35002N SU0F> at scbus3 target 2 lun 0 (pass26,da26)
> <ATA SEAGATE ST35002N SU0F> at scbus3 target 3 lun 0 (pass27,da27)
> <ATA SEAGATE ST35002N SU0F> at scbus3 target 4 lun 0 (pass28,da28)
> <ATA SEAGATE ST35002N SU0F> at scbus3 target 5 lun 0 (pass29,da29)
> <ATA SEAGATE ST35002N SU0F> at scbus3 target 6 lun 0 (pass30,da30)
> <ATA SEAGATE ST35002N SU0F> at scbus3 target 7 lun 0 (pass31,da31)
> <ATA SEAGATE ST35002N SU0F> at scbus4 target 0 lun 0 (pass32,da32)
> <ATA SEAGATE ST35002N SU0F> at scbus4 target 1 lun 0 (pass33,da33)
> <ATA SEAGATE ST35002N SU0F> at scbus4 target 2 lun 0 (pass34,da34)
> <ATA SEAGATE ST35002N SU0F> at scbus4 target 3 lun 0 (pass35,da35)
> <ATA SEAGATE ST35002N SU0F> at scbus4 target 4 lun 0 (pass36,da36)
> <ATA SEAGATE ST35002N SU0F> at scbus4 target 5 lun 0 (pass37,da37)
> <ATA SEAGATE ST35002N SU0F> at scbus4 target 6 lun 0 (pass38,da38)
> <ATA SEAGATE ST35002N SU0F> at scbus4 target 7 lun 0 (pass39,da39)
> <ATA SEAGATE ST35002N SU0F> at scbus5 target 0 lun 0 (pass40,da40)
> <ATA SEAGATE ST35002N SU0F> at scbus5 target 1 lun 0 (pass41,da41)
> <ATA SEAGATE ST35002N SU0F> at scbus5 target 2 lun 0 (pass42,da42)
> <ATA SEAGATE ST35002N SU0F> at scbus5 target 3 lun 0 (pass43,da43)
> <ATA SEAGATE ST35002N SU0F> at scbus5 target 4 lun 0 (pass44,da44)
> <ATA SEAGATE ST35002N SU0F> at scbus5 target 5 lun 0 (pass45,da45)
> <ATA SEAGATE ST35002N SU0F> at scbus5 target 6 lun 0 (pass46,da46)
> <ATA SEAGATE ST35002N SU0F> at scbus5 target 7 lun 0 (pass47,da47)
>
> # camcontrol tags da0
> (pass0:mpt0:0:0:0): device openings: 255
>
> Also works (I guess it doesn't touch the disks):
> # zfs list
> NAME USED AVAIL REFER MOUNTPOINT
> logpool 13.1T 7.17T 507K /data
> logpool/jail 7.08G 7.17T 7.08G /data/jail
> logpool/logs 13.1T 7.17T 3.40T /data/jail/logvm/logs
> logpool/logs/OTHER 9.24T 7.17T 2.36T /data/jail/logvm/logs/OTHER
>
> But this doesn't:
> root 36686 0.0 0.0 33384 2512 5 D+ 4:49PM 0:00.00 zfs list
> -t snapshot
> # procstat -k 36686
> PID TID COMM TDNAME KSTACK
> 36686 101593 zfs - mi_switch sleepq_wait
> _cv_wait zio_wait dbuf_read dmu_buf_hold zap_lockdir zap_cursor_retrieve
> dmu_snapshot_list_next zfs_ioc_snapshot_list_next zfsdev_ioctl
> devfs_ioctl_f kern_ioctl sys_ioctl amd64_syscall Xfast_syscall
>
> Entering into the debugger:
> KDB: enter: sysctl debug.kdb.enter
> [ thread pid 36959 tid 101484 ]
> Stopped at kdb_enter+0x3b: movq $0,0x95ab72(%rip)
> db> ps
> pid ppid pgrp uid state wmesg wchan cmd
> 36959 1769 36959 0 R+ CPU 0 sysctl
> 36691 919 919 0 S sbwait 0xfffffe009d752144 perl
> 36686 36677 36686 0 D+ zio->io_ 0xfffffe001ccb7d70 zfs
> 36677 36208 36677 0 Ss+ pause 0xfffffe009d0030a0 csh
> 36570 36567 36570 0 DL+ physrd 0xffffff87005a2980 dd
> 36567 36208 36567 0 Ss+ pause 0xfffffe00115c4540 csh
> 36217 36209 36217 0 D+ zio->io_ 0xfffffe001c2b2320 ls
> 36209 36208 36209 0 Ss+ pause 0xfffffe022c8aa0a0 csh
> 36208 36207 36208 0 Ss select 0xfffffe0665c92e40 screen
> 36207 1782 36207 0 S+ pause 0xfffffe009d0010a0 screen
> 32921 883 873 0 DL cbwait 0xfffffe000f7f7848 camcontrol
> 1782 1780 1782 0 Ss+ pause 0xfffffe009d4559e0 csh
> 1780 897 1780 0 Ss select 0xfffffe001d546740 sshd
> 1776 1774 1776 0 Ss+ ttyin 0xfffffe001c02a4a8 csh
> 1774 897 1774 0 Ss select 0xfffffe001cb4d0c0 sshd
> 1769 1767 1769 0 Ss+ pause 0xfffffe001191a540 csh
> 1767 897 1767 0 Ss select 0xfffffe000fd72bc0 sshd
> 1079 1 1079 0 Ss+ ttyin 0xfffffe000c82c4a8 getty
> 1078 1 1078 0 Ss+ ttyin 0xfffffe000c82c8a8 getty
> 1077 1 1077 0 Ss+ ttyin 0xfffffe000c82cca8 getty
> 1076 1 1076 0 Ss+ ttyin 0xfffffe000c82d0a8 getty
> 1075 1 1075 0 Ss+ ttyin 0xfffffe000c82d4a8 getty
> 1074 1 1074 0 Ss+ ttyin 0xfffffe000c82d8a8 getty
> 1073 1 1073 0 Ss+ ttyin 0xfffffe000c82dca8 getty
> 1072 1 1072 0 Ss+ ttyin 0xfffffe000c82f0a8 getty
> 919 1 919 0 Ss select 0xfffffe000f5ac940 perl
> 907 1 907 0 Ss nanslp 0xffffffff81244f08 cron
> 903 1 903 25 Ss pause 0xfffffe001125e0a0 sendmail
> 900 1 900 0 Ss select 0xfffffe001d549340 sendmail
> 897 1 897 0 Ss select 0xfffffe001d546cc0 sshd
> 892 884 873 0 S piperd 0xfffffe001e940888 fghack
> 884 878 873 0 S wait 0xfffffe000fdee000 sh
> 883 879 873 0 S piperd 0xfffffe022c08b000 perl
> 879 875 873 0 S select 0xfffffe001ca6a8c0 supervise
> 878 875 873 0 S select 0xfffffe000fd73d40 supervise
> 876 1 873 0 S piperd 0xfffffe001e9c5b60 readproctitle
> 875 1 873 0 S nanslp 0xffffffff81244f08 svscan
> 870 868 867 123 S select 0xfffffe000fd934c0 ntpd
> 868 867 867 123 S select 0xfffffe001ca68e40 ntpd
> 867 1 867 0 Ss select 0xfffffe000fddd740 ntpd
> 796 0 0 0 DL mdwait 0xfffffe000f52a000 [md2]
> 774 1 774 53 Ss (threaded) named
> 101524 S kqread 0xfffffe00115dd100 named
> 101523 S uwait 0xfffffe000fde5200 named
> 101522 S uwait 0xfffffe00110ce680 named
> 101521 S uwait 0xfffffe000fda0300 named
> 101520 S uwait 0xfffffe000fddd380 named
> 101519 S uwait 0xfffffe001198ca00 named
> 101518 S uwait 0xfffffe000fd58880 named
> 101517 S uwait 0xfffffe000fd7ab80 named
> 101516 S uwait 0xfffffe000f80e480 named
> 101515 S uwait 0xfffffe000f80f400 named
> 101501 S sigwait 0xfffffe00110dd000 named
> 751 750 751 0 Ss select 0xfffffe001d549440 syslog-ng
> 750 1 749 0 S wait 0xfffffe000c8144a0 syslog-ng
> 612 608 608 64 S bpf 0xfffffe001ca94800 pflogd
> 608 1 608 0 Ss sbwait 0xfffffe001eb4ae8c pflogd
> 605 0 0 0 DL pftm 0xffffffff817547a0 [pfpurge]
> 78 0 0 0 DL (threaded) [zfskern]
> 101459 D spa->spa 0xfffffe0011462680
> [txg_thread_enter]
> 101458 D tx->tx_q 0xfffffe001b199230
> [txg_thread_enter]
> 100122 D l2arc_fe 0xffffffff8173ebc0
> [l2arc_feed_thread]
> 100121 D arc_recl 0xffffffff8172ed20
> [arc_reclaim_thread]
> 59 0 0 0 DL mdwait 0xfffffe000f521000 [md1]
> 47 0 0 0 DL mdwait 0xfffffe000f523800 [md0]
> 24 0 0 0 DL sdflush 0xffffffff812a6158 [softdepflush]
> 23 0 0 0 DL syncer 0xffffffff812928c0 [syncer]
> 22 0 0 0 DL vlruwt 0xfffffe000c80d000 [vnlru]
> 21 0 0 0 DL psleep 0xffffffff81292348 [bufdaemon]
> 20 0 0 0 DL pgzero 0xffffffff812b019c [pagezero]
> 19 0 0 0 DL psleep 0xffffffff812af368 [vmdaemon]
> 18 0 0 0 DL psleep 0xffffffff812af32c [pagedaemon]
> 17 0 0 0 DL ccb_scan 0xffffffff811ff260 [xpt_thrd]
> 16 0 0 0 DL idle 0xffffff8001df3000 [mpt_recovery5]
> 9 0 0 0 DL idle 0xffffff8001dde000 [mpt_recovery4]
> 8 0 0 0 DL idle 0xffffff8001dc9000 [mpt_recovery3]
> 7 0 0 0 DL idle 0xffffff8001daa000 [mpt_recovery2]
> 6 0 0 0 DL idle 0xffffff8001d95000 [mpt_recovery1]
> 5 0 0 0 DL idle 0xffffff8001d80000 [mpt_recovery0]
> 15 0 0 0 DL (threaded) [usb]
> 100048 D - 0xffffff8001d73e18 [usbus1]
> 100047 D - 0xffffff8001d73dc0 [usbus1]
> 100046 D - 0xffffff8001d73d68 [usbus1]
> 100045 D - 0xffffff8001d73d10 [usbus1]
> 100043 D - 0xffffff8001d6b460 [usbus0]
> 100042 D - 0xffffff8001d6b408 [usbus0]
> 100041 D - 0xffffff8001d6b3b0 [usbus0]
> 100040 D - 0xffffff8001d6b358 [usbus0]
> 4 0 0 0 DL ctl_work 0xffffff8000a41000 [ctl_thrd]
> 14 0 0 0 DL - 0xffffffff81243ba4 [yarrow]
> 3 0 0 0 DL crypto_r 0xffffffff812a4ae0 [crypto
> returns]
> 2 0 0 0 DL crypto_w 0xffffffff812a4aa0 [crypto]
> 13 0 0 0 DL (threaded) [geom]
> 100023 D - 0xffffffff8123d030 [g_down]
> 100022 D - 0xffffffff8123d028 [g_up]
> 100021 D - 0xffffffff8123d018 [g_event]
> 12 0 0 0 RL (threaded) [intr]
> 100065 I [swi0: uart]
> 100063 I [irq293: mpt5]
> 100061 I [irq292: mpt4]
> 100059 I [irq291: mpt3]
> 100055 I [irq274: mpt2]
> 100053 I [irq273: mpt1]
> 100051 I [irq272: mpt0]
> 100044 I [irq22: ehci0]
> 100039 I [irq21: ohci0]
> 100034 I [swi2: cambio]
> 100031 I [swi6: task queue]
> 100030 I [swi6: Giant taskq]
> 100028 I [swi5: +]
> 100020 I [swi1: netisr 0]
> 100019 I [swi4: clock]
> 100018 I [swi4: clock]
> 100017 I [swi4: clock]
> 100016 I [swi4: clock]
> 100015 I [swi4: clock]
> 100014 I [swi4: clock]
> 100013 I [swi4: clock]
> 100012 RunQ [swi4: clock]
> 100011 I [swi3: vm]
> 11 0 0 0 RL (threaded) [idle]
> 100010 Run CPU 7 [idle: cpu7]
> 100009 Run CPU 6 [idle: cpu6]
> 100008 Run CPU 5 [idle: cpu5]
> 100007 Run CPU 4 [idle: cpu4]
> 100006 Run CPU 3 [idle: cpu3]
> 100005 Run CPU 2 [idle: cpu2]
> 100004 Run CPU 1 [idle: cpu1]
> 100003 CanRun [idle: cpu0]
> 1 0 1 0 SLs wait 0xfffffe000c068940 [init]
> 10 0 0 0 DL audit_wo 0xffffffff812a50d0 [audit]
> 0 0 0 0 DLs (threaded) [kernel]
> 101463 D - 0xfffffe000fddab00 [zil_clean]
> 101462 D - 0xfffffe000fd6a800 [zil_clean]
> 101461 D - 0xfffffe000fdf6180 [zil_clean]
> 101460 D - 0xfffffe001d546600 [zil_clean]
> 101457 D - 0xfffffe000f359e00 [zfs_vn_rele_taskq]
> 101456 D - 0xfffffe001198d080 [zio_ioctl_intr]
> 101455 D - 0xfffffe001cb4fa80 [zio_ioctl_issue]
> 101454 D - 0xfffffe000ffbf380 [zio_claim_intr]
> 101453 D - 0xfffffe00110cf580 [zio_claim_issue]
> 101452 D - 0xfffffe00110cf880 [zio_free_intr]
> 101451 D - 0xfffffe000ffc1b80 [zio_free_issue_99]
> 101450 D - 0xfffffe000ffc1b80 [zio_free_issue_98]
> 101449 D - 0xfffffe000ffc1b80 [zio_free_issue_97]
> 101448 D - 0xfffffe000ffc1b80 [zio_free_issue_96]
> 101447 D - 0xfffffe000ffc1b80 [zio_free_issue_95]
> 101446 D - 0xfffffe000ffc1b80 [zio_free_issue_94]
> 101445 D - 0xfffffe000ffc1b80 [zio_free_issue_93]
> 101444 D - 0xfffffe000ffc1b80 [zio_free_issue_92]
> 101443 D - 0xfffffe000ffc1b80 [zio_free_issue_91]
> 101442 D - 0xfffffe000ffc1b80 [zio_free_issue_90]
> 101441 D - 0xfffffe000ffc1b80 [zio_free_issue_89]
> 101440 D - 0xfffffe000ffc1b80 [zio_free_issue_88]
> 101439 D - 0xfffffe000ffc1b80 [zio_free_issue_87]
> 101438 D - 0xfffffe000ffc1b80 [zio_free_issue_86]
> 101437 D - 0xfffffe000ffc1b80 [zio_free_issue_85]
> 101436 D - 0xfffffe000ffc1b80 [zio_free_issue_84]
> 101435 D - 0xfffffe000ffc1b80 [zio_free_issue_83]
> 101434 D - 0xfffffe000ffc1b80 [zio_free_issue_82]
> 101433 D - 0xfffffe000ffc1b80 [zio_free_issue_81]
> 101432 D - 0xfffffe000ffc1b80 [zio_free_issue_80]
> 101431 D - 0xfffffe000ffc1b80 [zio_free_issue_79]
> 101430 D - 0xfffffe000ffc1b80 [zio_free_issue_78]
> 101429 D - 0xfffffe000ffc1b80 [zio_free_issue_77]
> 101428 D - 0xfffffe000ffc1b80 [zio_free_issue_76]
> 101427 D - 0xfffffe000ffc1b80 [zio_free_issue_75]
> 101426 D - 0xfffffe000ffc1b80 [zio_free_issue_74]
> 101425 D - 0xfffffe000ffc1b80 [zio_free_issue_73]
> 101424 D - 0xfffffe000ffc1b80 [zio_free_issue_72]
> 101423 D - 0xfffffe000ffc1b80 [zio_free_issue_71]
> 101422 D - 0xfffffe000ffc1b80 [zio_free_issue_70]
> 101421 D - 0xfffffe000ffc1b80 [zio_free_issue_69]
> 101420 D - 0xfffffe000ffc1b80 [zio_free_issue_68]
> 101419 D - 0xfffffe000ffc1b80 [zio_free_issue_67]
> 101418 D - 0xfffffe000ffc1b80 [zio_free_issue_66]
> 101417 D - 0xfffffe000ffc1b80 [zio_free_issue_65]
> 101416 D - 0xfffffe000ffc1b80 [zio_free_issue_64]
> 101415 D - 0xfffffe000ffc1b80 [zio_free_issue_63]
> 101414 D - 0xfffffe000ffc1b80 [zio_free_issue_62]
> 101413 D - 0xfffffe000ffc1b80 [zio_free_issue_61]
> 101412 D - 0xfffffe000ffc1b80 [zio_free_issue_60]
> 101411 D - 0xfffffe000ffc1b80 [zio_free_issue_59]
> 101410 D - 0xfffffe000ffc1b80 [zio_free_issue_58]
> 101409 D - 0xfffffe000ffc1b80 [zio_free_issue_57]
> 101408 D - 0xfffffe000ffc1b80 [zio_free_issue_56]
> 101407 D - 0xfffffe000ffc1b80 [zio_free_issue_55]
> 101406 D - 0xfffffe000ffc1b80 [zio_free_issue_54]
> 101405 D - 0xfffffe000ffc1b80 [zio_free_issue_53]
> 101404 D - 0xfffffe000ffc1b80 [zio_free_issue_52]
> 101403 D - 0xfffffe000ffc1b80 [zio_free_issue_51]
> 101402 D - 0xfffffe000ffc1b80 [zio_free_issue_50]
> 101401 D - 0xfffffe000ffc1b80 [zio_free_issue_49]
> 101400 D - 0xfffffe000ffc1b80 [zio_free_issue_48]
> 101399 D - 0xfffffe000ffc1b80 [zio_free_issue_47]
> 101398 D - 0xfffffe000ffc1b80 [zio_free_issue_46]
> 101397 D - 0xfffffe000ffc1b80 [zio_free_issue_45]
> 101396 D - 0xfffffe000ffc1b80 [zio_free_issue_44]
> 101395 D - 0xfffffe000ffc1b80 [zio_free_issue_43]
> 101394 D - 0xfffffe000ffc1b80 [zio_free_issue_42]
> 101393 D - 0xfffffe000ffc1b80 [zio_free_issue_41]
> 101392 D - 0xfffffe000ffc1b80 [zio_free_issue_40]
> 101391 D - 0xfffffe000ffc1b80 [zio_free_issue_39]
> 101390 D - 0xfffffe000ffc1b80 [zio_free_issue_38]
> 101389 D - 0xfffffe000ffc1b80 [zio_free_issue_37]
> 101388 D - 0xfffffe000ffc1b80 [zio_free_issue_36]
> 101387 D - 0xfffffe000ffc1b80 [zio_free_issue_35]
> 101386 D - 0xfffffe000ffc1b80 [zio_free_issue_34]
> 101385 D - 0xfffffe000ffc1b80 [zio_free_issue_33]
> 101384 D - 0xfffffe000ffc1b80 [zio_free_issue_32]
> 101383 D - 0xfffffe000ffc1b80 [zio_free_issue_31]
> 100569 D - 0xfffffe000ffc1b80 [zio_free_issue_30]
> 100567 D - 0xfffffe000ffc1b80 [zio_free_issue_29]
> 100565 D - 0xfffffe000ffc1b80 [zio_free_issue_28]
> 100560 D - 0xfffffe000ffc1b80 [zio_free_issue_27]
> 100554 D - 0xfffffe000ffc1b80 [zio_free_issue_26]
> 100553 D - 0xfffffe000ffc1b80 [zio_free_issue_25]
> 100547 D - 0xfffffe000ffc1b80 [zio_free_issue_24]
> 100545 D - 0xfffffe000ffc1b80 [zio_free_issue_23]
> 100542 D - 0xfffffe000ffc1b80 [zio_free_issue_22]
> 100539 D - 0xfffffe000ffc1b80 [zio_free_issue_21]
> 100536 D - 0xfffffe000ffc1b80 [zio_free_issue_20]
> 100530 D - 0xfffffe000ffc1b80 [zio_free_issue_19]
> 100487 D - 0xfffffe000ffc1b80 [zio_free_issue_18]
> 100415 D - 0xfffffe000ffc1b80 [zio_free_issue_17]
> 100413 D - 0xfffffe000ffc1b80 [zio_free_issue_16]
> 100407 D - 0xfffffe000ffc1b80 [zio_free_issue_15]
> 100403 D - 0xfffffe000ffc1b80 [zio_free_issue_14]
> 100400 D - 0xfffffe000ffc1b80 [zio_free_issue_13]
> 100393 D - 0xfffffe000ffc1b80 [zio_free_issue_12]
> 100391 D - 0xfffffe000ffc1b80 [zio_free_issue_11]
> 100387 D - 0xfffffe000ffc1b80 [zio_free_issue_10]
> 100386 D - 0xfffffe000ffc1b80 [zio_free_issue_9]
> 100385 D - 0xfffffe000ffc1b80 [zio_free_issue_8]
> 100384 D - 0xfffffe000ffc1b80 [zio_free_issue_7]
> 100383 D - 0xfffffe000ffc1b80 [zio_free_issue_6]
> 100379 D - 0xfffffe000ffc1b80 [zio_free_issue_5]
> 100372 D - 0xfffffe000ffc1b80 [zio_free_issue_4]
> 100367 D - 0xfffffe000ffc1b80 [zio_free_issue_3]
> 100366 D - 0xfffffe000ffc1b80 [zio_free_issue_2]
> 100361 D - 0xfffffe000ffc1b80 [zio_free_issue_1]
> 100360 D - 0xfffffe000ffc1b80 [zio_free_issue_0]
> 100359 D - 0xfffffe001ca67280 [zio_write_intr_high]
> 100358 D - 0xfffffe001ca67280 [zio_write_intr_high]
> 100357 D - 0xfffffe001ca67280 [zio_write_intr_high]
> 100354 D - 0xfffffe001ca67280 [zio_write_intr_high]
> 100353 D - 0xfffffe001ca67280 [zio_write_intr_high]
> 100349 D - 0xfffffe000fd72700 [zio_write_intr_7]
> 100348 D - 0xfffffe000fd72700 [zio_write_intr_6]
> 100345 D - 0xfffffe000fd72700 [zio_write_intr_5]
> 100343 D - 0xfffffe000fd72700 [zio_write_intr_4]
> 100342 D - 0xfffffe000fd72700 [zio_write_intr_3]
> 100341 D - 0xfffffe000fd72700 [zio_write_intr_2]
> 100340 D - 0xfffffe000fd72700 [zio_write_intr_1]
> 100339 D - 0xfffffe000fd72700 [zio_write_intr_0]
> 100337 D - 0xfffffe001196ce00 [zio_write_issue_hig]
> 100336 D - 0xfffffe001196ce00 [zio_write_issue_hig]
> 100334 D - 0xfffffe001196ce00 [zio_write_issue_hig]
> 100330 D - 0xfffffe001196ce00 [zio_write_issue_hig]
> 100327 D - 0xfffffe001196ce00 [zio_write_issue_hig]
> 100324 D - 0xfffffe00110cfb00 [zio_write_issue_7]
> 100322 D - 0xfffffe00110cfb00 [zio_write_issue_6]
> 100321 D - 0xfffffe00110cfb00 [zio_write_issue_5]
> 100316 D - 0xfffffe00110cfb00 [zio_write_issue_4]
> 100314 D - 0xfffffe00110cfb00 [zio_write_issue_3]
> 100312 D - 0xfffffe00110cfb00 [zio_write_issue_2]
> 100311 D - 0xfffffe00110cfb00 [zio_write_issue_1]
> 100307 D - 0xfffffe00110cfb00 [zio_write_issue_0]
> 100306 D - 0xfffffe000ffbfc80 [zio_read_intr_7]
> 100305 D - 0xfffffe000ffbfc80 [zio_read_intr_6]
> 100303 D - 0xfffffe000ffbfc80 [zio_read_intr_5]
> 100300 D - 0xfffffe000ffbfc80 [zio_read_intr_4]
> 100298 D - 0xfffffe000ffbfc80 [zio_read_intr_3]
> 100297 D - 0xfffffe000ffbfc80 [zio_read_intr_2]
> 100293 D - 0xfffffe000ffbfc80 [zio_read_intr_1]
> 100292 D - 0xfffffe000ffbfc80 [zio_read_intr_0]
> 100291 D - 0xfffffe00110cf000 [zio_read_issue_7]
> 100289 D - 0xfffffe00110cf000 [zio_read_issue_6]
> 100288 D - 0xfffffe00110cf000 [zio_read_issue_5]
> 100286 D - 0xfffffe00110cf000 [zio_read_issue_4]
> 100282 D - 0xfffffe00110cf000 [zio_read_issue_3]
> 100281 D - 0xfffffe00110cf000 [zio_read_issue_2]
> 100280 D - 0xfffffe00110cf000 [zio_read_issue_1]
> 100278 D - 0xfffffe00110cf000 [zio_read_issue_0]
> 100275 D - 0xfffffe001113b500 [zio_null_intr]
> 100273 D - 0xfffffe001196c800 [zio_null_issue]
> 100120 D - 0xfffffe0011370300 [system_taskq_7]
> 100119 D - 0xfffffe0011370300 [system_taskq_6]
> 100118 D - 0xfffffe0011370300 [system_taskq_5]
> 100117 D - 0xfffffe0011370300 [system_taskq_4]
> 100116 D - 0xfffffe0011370300 [system_taskq_3]
> 100115 D - 0xfffffe0011370300 [system_taskq_2]
> 100114 D - 0xfffffe0011370300 [system_taskq_1]
> 100113 D - 0xfffffe0011370300 [system_taskq_0]
> 100066 D - 0xfffffe000f239a80 [mca taskq]
> 100058 D - 0xfffffe000c69b900 [nfe3 taskq]
> 100057 D - 0xfffffe000c698480 [nfe2 taskq]
> 100050 D - 0xfffffe000c620400 [nfe1 taskq]
> 100049 D - 0xfffffe000c61b500 [nfe0 taskq]
> 100037 D - 0xfffffe000c24bb00 [acpi_task_2]
> 100036 D - 0xfffffe000c24bb00 [acpi_task_1]
> 100035 D - 0xfffffe000c24bb00 [acpi_task_0]
> 100033 D - 0xfffffe000c24be00 [kqueue taskq]
> 100032 D - 0xfffffe000c24c000 [ffs_trim taskq]
> 100029 D - 0xfffffe000c20c780 [thread taskq]
> 100024 D - 0xfffffe000c07fb80 [firmware taskq]
> 100000 D sched 0xffffffff8123d280 [swapper]
> 895 892 873 0 Z perl
>
> Setting this:
> # sysctl dev.mpt.0.debug=255
> and doing a dd again from a disk on that controller prints this onto the
> console:
> SCSI IO Request @ 0xffffff80003046f0
> Chain Offset 0x00
> MsgFlags 0x00
> MsgContext 0x000201c5
> Bus: 0
> TargetID 0
> SenseBufferLength 32
> LUN: 0x0
> Control 0x02000200 READ ORDEREDQ
> DataLength 0x00000200
> SenseBufAddr 0x0c678be0
> CDB[0:6] 08 00 00 00 01 00
> SE64 0xffffff87ffd33a30: Addr=0x000000070cc08400 FlagsLength=0xd3000200
> 64_BIT_ADDRESSING LAST_ELEMENT END_OF_BUFFER END_OF_LIST
> mpt0: Send Request 453 (c678a00):
> mpt0: 00000000 00002006 000201c5 00000000 00000000 02000200 00000008
> 00000001
> mpt0: 00000000 00000000 00000200 0c678be0 d3000200 0cc08400 00000007
> ffffffff
> mpt0: ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
> ffffffff
> mpt0: ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
> ffffffff
> mpt0: enter mpt_intr
> mpt0: Context Reply: 0x000201c5
> mpt0: exit mpt_intr
>
> And dd freezes.
>
> Alltrace from a couple of stuck processes:
> Tracing command dd pid 36971 tid 101570 td 0xfffffe001efce000
> sched_switch() at sched_switch+0x115
> mi_switch() at mi_switch+0x186
> sleepq_wait() at sleepq_wait+0x42
> _sleep() at _sleep+0x379
> bwait() at bwait+0x64
> physio() at physio+0x1c8
> devfs_read_f() at devfs_read_f+0x90
> dofileread() at dofileread+0xa1
> kern_readv() at kern_readv+0x6c
> sys_read() at sys_read+0x64
> amd64_syscall() at amd64_syscall+0x540
> Xfast_syscall() at Xfast_syscall+0xf7
> --- syscall (3, FreeBSD ELF64, sys_read), rip = 0x800916c8c, rsp =
> 0x7fffffffd658, rbp = 0x7fffffffd6b0 ---
>
> Tracing command zfs pid 36686 tid 101593 td 0xfffffe001ecb3900
> sched_switch() at sched_switch+0x115
> mi_switch() at mi_switch+0x186
> sleepq_wait() at sleepq_wait+0x42
> _cv_wait() at _cv_wait+0x112
> zio_wait() at zio_wait+0x61
> dbuf_read() at dbuf_read+0x5e5
> dmu_buf_hold() at dmu_buf_hold+0xe0
> zap_lockdir() at zap_lockdir+0x58
> zap_cursor_retrieve() at zap_cursor_retrieve+0x19b
> dmu_snapshot_list_next() at dmu_snapshot_list_next+0xaf
> zfs_ioc_snapshot_list_next() at zfs_ioc_snapshot_list_next+0x101
> zfsdev_ioctl() at zfsdev_ioctl+0xe6
> devfs_ioctl_f() at devfs_ioctl_f+0x7b
> kern_ioctl() at kern_ioctl+0x106
> sys_ioctl() at sys_ioctl+0xfd
> amd64_syscall() at amd64_syscall+0x540
> Xfast_syscall() at Xfast_syscall+0xf7
> --- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x801be2c2c, rsp =
> 0x7fffffff8938, rbp = 0x4000 ---
>
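
For readers who want to interpret the request that the mpt debug output shows before dd freezes, here is a minimal sketch that decodes the two raw fields from the dump. It assumes the CDB is a standard 6-byte SCSI READ(6) command and that FlagsLength packs the flags into the top 8 bits and the byte count into the low 24 bits. The second point is my reading of the dump, not something confirmed in the thread, and the function names are hypothetical.

```python
def decode_read6(cdb):
    """Decode a 6-byte SCSI READ(6) CDB into (lba, blocks)."""
    if len(cdb) != 6 or cdb[0] != 0x08:
        raise ValueError("not a READ(6) CDB")
    # The LBA is 21 bits: low 5 bits of byte 1, then bytes 2 and 3.
    lba = ((cdb[1] & 0x1F) << 16) | (cdb[2] << 8) | cdb[3]
    # In READ(6), a transfer length of 0 means 256 blocks.
    blocks = cdb[4] or 256
    return lba, blocks


def split_flags_length(value):
    """Split a 32-bit FlagsLength word (assumed layout: 8-bit flags, 24-bit length)."""
    return value >> 24, value & 0x00FFFFFF


if __name__ == "__main__":
    # Values copied from the console output quoted above.
    lba, blocks = decode_read6(bytes.fromhex("08 00 00 00 01 00"))
    flags, length = split_flags_length(0xD3000200)
    print(f"READ(6): lba={lba} blocks={blocks}")
    print(f"SGE: flags=0x{flags:02x} length={length} bytes")
```

Read this way, the request is a one-block read at LBA 0 with a 512-byte buffer, which agrees with the DataLength 0x00000200 field in the same dump and with dd reading the start of the disk.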