From owner-freebsd-stable@FreeBSD.ORG Tue Oct 29 07:17:44 2013
From: Alexey Tarasov
Date: Tue, 29 Oct 2013 11:17:34 +0400
Subject: Re: FreeBSD 9.2 UFS + GELI softdep_deallocate_dependencies: unrecovered I/O error
To: Konstantin Belousov
Cc: freebsd-stable@freebsd.org, Тарасов Алексей Викторович
Message-Id: <709C9B77-AD40-4662-96C9-A0F56369DBDE@lexasoft.ru>
In-Reply-To: <415FD2A4-E2D2-4784-A9AB-A7CCFBBAC27F@lexasoft.ru>
References: <2AA765E7-1F17-4C6F-98BD-004AEFF88D32@lexasoft.ru> <20131027184625.GI59496@kib.kiev.ua> <415FD2A4-E2D2-4784-A9AB-A7CCFBBAC27F@lexasoft.ru>
List-Id: Production branch of FreeBSD source code

Hello.

It seems that setting kern.bio_transient_maxcnt to 8k resolved the problem.

On 27 Oct 2013, at 23:00, Alexey Tarasov wrote:

> Hello!
>
> OK, I'll try this.
> So this is a software defect in FreeBSD 9.2?
>
> On 27 Oct 2013, at 22:46, Konstantin Belousov wrote:
>
>> On Sat, Oct 26, 2013 at 01:47:18PM +0400, Alexey Tarasov wrote:
>>> Hello.
>>>
>>> I've upgraded the server to 9.2 and now it hangs after every 2-3 hours of
>>> intensive I/O to the UFS SUJ + GELI disk.
>>> On 9.1 everything was fine for half a year.
>>>
>>> g_vfs_done():da1.eli[WRITE(offset=614630752256, length=32768)]error = 11
>>> g_vfs_done():da1.eli[WRITE(offset=614631211008, length=32768)]error = 11
>>> g_vfs_done():da1.eli[WRITE(offset=614634815488, length=32768)]error = 11
>>> g_vfs_done():da1.eli[WRITE(offset=614642319360, length=32768)]error = 11
>>> g_vfs_done():da1.eli[WRITE(offset=614642909184, length=32768)]error = 11
>>> g_vfs_done():da1.eli[WRITE(offset=614643007488, length=32768)]error = 11
>>> g_vfs_done():da1.eli[WRITE(offset=614644875264, length=32768)]error = 11
>>> g_vfs_done():da1.eli[WRITE(offset=550691995648, length=98304)]error = 11
>>> g_vfs_done():da1.eli[WRITE(offset=550692519936, length=32768)]error = 11
>>> g_vfs_done():da1.eli[WRITE(offset=550704152576, length=32768)]error = 11
>>> /data/pgsql/data/base: got error 11 while accessing filesystem
>>> panic: softdep_deallocate_dependencies: unrecovered I/O error
>>> cpuid = 10
>>> KDB: stack backtrace:
>>> #0 0xffffffff80947986 at kdb_backtrace+0x66
>>> #1 0xffffffff8090d9ae at panic+0x1ce
>>> #2 0xffffffff80b3ff90 at clear_remove+0
>>> #3 0xffffffff8098fb65 at brelse+0x75
>>> #4 0xffffffff80990978 at bufdone+0x68
>>> #5 0xffffffff8098c83e at biodone+0xae
>>> #6 0xffffffff80872f4c at g_io_schedule_up+0xac
>>> #7 0xffffffff808736ac at g_up_procbody+0x5c
>>> #8 0xffffffff808db67f at fork_exit+0x11f
>>> #9 0xffffffff80cdc23e at fork_trampoline+0xe
>>> Uptime: 6d15h5m7s
>>> Dumping 7664 out of 196573 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
>>>
>>> Full core.txt is here: http://lexasoft.ru/core.txt.1
>>>
>>> The server is an HP ProLiant DL180 G6 with a P410 RAID controller.
>>
>> Look for your current value of kern.bio_transient_maxcnt and increase
>> it by 4-8 times, using the same tunable.
>> If this helps, fine. If not,
>> disable unmapped I/O with the vfs.unmapped_buf_allowed tunable.
>>
>> The real solution is to convert GEOM classes like geli to use limited
>> transient mapping windows to access the data, thus adding support for
>> unmapped I/O to them.
>
> --
> Alexey Tarasov
>
> (\__/)
> (='.'=)
> E[: | | | | :]З
> (")_(")
>

--
Alexey Tarasov

(\__/)
(='.'=)
E[: | | | | :]З
(")_(")
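
For reference, the tuning step suggested in the quoted reply (raise kern.bio_transient_maxcnt by 4-8 times via the tunable, and fall back to disabling unmapped I/O) could be sketched roughly as follows. This is a minimal Python sketch and not something posted in the thread: the helper names are hypothetical, the idea of putting the resulting lines into /boot/loader.conf is an assumption about where the tunables would be set, and the assumption that a value of 0 disables vfs.unmapped_buf_allowed should be checked against your system before use.

```python
import subprocess

# Tunable names as they appear in the thread.
TUNABLE = "kern.bio_transient_maxcnt"
# Assumption: setting this tunable to 0 disables unmapped I/O.
FALLBACK_LINE = 'vfs.unmapped_buf_allowed="0"'


def scaled_tunable_line(current: int, factor: int = 8) -> str:
    """Return a loader.conf-style line raising the tunable by `factor`.

    The thread suggests a factor between 4 and 8; the helper itself is
    hypothetical and only does the arithmetic and formatting.
    """
    if current <= 0 or factor <= 0:
        raise ValueError("current value and factor must be positive")
    return f'{TUNABLE}="{current * factor}"'


def read_current() -> int:
    """Read the live value with `sysctl -n` (meaningful only on FreeBSD)."""
    result = subprocess.run(
        ["sysctl", "-n", TUNABLE],
        check=True,
        capture_output=True,
        text=True,
    )
    return int(result.stdout.strip())
```

On the affected machine, scaled_tunable_line(read_current()) would give the line to try first; FALLBACK_LINE is the second option from the quoted reply. As an illustration only, a starting value of 1024 with a factor of 8 gives 8192, which would correspond to the 8k setting reported above if 1024 was in fact the starting value.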