Date: Wed, 4 Jan 2012 15:12:26 GMT
From: Zaphod <zaphod@berentweb.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: amd64/163815: HDD timeout on ZFS + SB7x0 SATA Controller [AHCI]
Message-ID: <201201041512.q04FCQPH088867@red.freebsd.org>
Resent-Message-ID: <201201041520.q04FK5Zd063038@freefall.freebsd.org>

>Number:         163815
>Category:       amd64
>Synopsis:       HDD timeout on ZFS + SB7x0 SATA Controller [AHCI]
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    freebsd-amd64
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed Jan 04 15:20:05 UTC 2012
>Closed-Date:
>Last-Modified:
>Originator:     Zaphod
>Release:        9.0
>Organization:
NA
>Environment:
FreeBSD 9.0-PRERELEASE FreeBSD 9.0-PRERELEASE #0 r228984: Fri Dec 30 12:57:09 EET 2011 amd64
>Description:
The problem first showed itself during port builds (heavy HDD usage):
--------------------------------
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 32262, size: 4096
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 66056, size: 4096
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 82746, size: 8192
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 44091, size: 4096
ahcich0: Timeout on slot 29 port 0
ahcich0: is 00000000 cs 000000ff ss e00000ff rs e00000ff tfd c0 serr 00000000 cmd 0004e017
ahcich0: AHCI reset...
ahcich0: SATA connect time=100us status=00000123
ahcich0: AHCI reset: device found
(ada0:ahcich0:0:0:0): Command timed out
(ada0:ahcich0:0:0:0): Retrying command
-------------------------------
Now, after the latest update to /usr/src, buildworld breaks with a "seg.fault 11" message, but the failure is actually due to the swap_pager timeout. The break occurs near clang/lib/ARCMigrate/TransAutoreleasePool.cpp (though exactly where is not very relevant). Also, CPU usage is not very heavy before the system freezes.

Hardware & Setup Info:
- controller: ahci0@pci0:0:17:0: class=0x010601 card=0x43911002 chip=0x43911002 rev=0x00 hdr=0x00
  device = 'SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode]' on 'RS780 Host Bridge'. Board: Biostar A780L
- HDD is SAMSUNG HD322HJ, 320GB, ATA-8-ACS revision 3b, all FS on ZFS.
- CPU: K8 [Athlon64/Opteron]
- mem/swap: RAM 1 GB / swap 2 GB (not zfs).
  Usage during buildworld: max RAM 65% / max swap 58%

A previously built kernel with full debugging enabled shows errors such as:

kernel: lock order reversal:
kernel: 1st 0xfffffe0010598248 filedesc structure (filedesc structure) @ /asp/src/sys/kern/kern_descrip.c:1197
kernel: 2nd 0xfffffe001052ccf0 zfs (zfs) @ /asp/src/sys/kern/vfs_subr.c:4245
kernel: KDB: stack backtrace:
kernel: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
kernel: kdb_backtrace() at kdb_backtrace+0x37
kernel: _witness_debugger() at _witness_debugger+0x65
kernel: witness_checkorder() at witness_checkorder+0x833
kernel: __lockmgr_args() at __lockmgr_args+0xd9d
kernel: vop_stdlock() at vop_stdlock+0x39
kernel: VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b
kernel: _vn_lock() at _vn_lock+0x68
kernel: knlist_remove_kq() at knlist_remove_kq+0xfc
kernel: knote_fdclose() at knote_fdclose+0x177
kernel: kern_close() at kern_close+0xe8
kernel: amd64_syscall() at amd64_syscall+0x27b
kernel: Xfast_syscall() at Xfast_syscall+0xf7
kernel: --- syscall (6, FreeBSD ELF64, sys_close), rip = 0x8015abcdc, rsp = 0x7fffffffd868, rbp = 0x801807b20 ---

Another one:

kernel: lock order reversal:
kernel: 1st 0xfffffe0018e3a448 filedesc structure (filedesc structure) @ /asp/src/sys/kern/kern_descrip.c:1197
kernel: 2nd 0xfffffe0004533cf0 devfs (devfs) @ /asp/src/sys/kern/vfs_subr.c:4245
kernel: KDB: stack backtrace:
kernel: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
kernel: kdb_backtrace() at kdb_backtrace+0x37
kernel: _witness_debugger() at _witness_debugger+0x65
kernel: witness_checkorder() at witness_checkorder+0x833
kernel: __lockmgr_args() at __lockmgr_args+0xd9d
kernel: vop_stdlock() at vop_stdlock+0x39
kernel: VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b
kernel: _vn_lock() at _vn_lock+0x68
kernel: knlist_remove_kq() at knlist_remove_kq+0xfc
kernel: knote_fdclose() at knote_fdclose+0x177
kernel: kern_close() at kern_close+0xe8
kernel: amd64_syscall() at amd64_syscall+0x27b
kernel: Xfast_syscall() at Xfast_syscall+0xf7
kernel: --- syscall (6, FreeBSD ELF64, sys_close), rip = 0x8015abcdc, rsp = 0x7fffffffd498, rbp = 0x80180e230 ---

kernel: lock order reversal:
kernel: 1st 0xfffffe0018e3a448 filedesc structure (filedesc structure) @ /asp/src/sys/kern/kern_descrip.c:1197
kernel: 2nd 0xfffffe000b4f4a78 pseudofs (pseudofs) @ /asp/src/sys/kern/vfs_subr.c:4245
kernel: KDB: stack backtrace:
kernel: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
kernel: kdb_backtrace() at kdb_backtrace+0x37
kernel: _witness_debugger() at _witness_debugger+0x65
kernel: witness_checkorder() at witness_checkorder+0x833
kernel: __lockmgr_args() at __lockmgr_args+0xd9d
kernel: vop_stdlock() at vop_stdlock+0x39
kernel: VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b
kernel: _vn_lock() at _vn_lock+0x68
kernel: knlist_remove_kq() at knlist_remove_kq+0xfc
kernel: knote_fdclose() at knote_fdclose+0x177
kernel: kern_close() at kern_close+0xe8
kernel: amd64_syscall() at amd64_syscall+0x27b
kernel: Xfast_syscall() at Xfast_syscall+0xf7
kernel: --- syscall (6, FreeBSD ELF64, sys_close), rip = 0x8015abcdc, rsp = 0x7fffffffd498, rbp = 0x80180e230 ---

More details are posted in the forum:
http://forums.freebsd.org/showthread.php?t=27452
>How-To-Repeat:
>Fix:
NA
>Release-Note:
>Audit-Trail:
>Unformatted: