From owner-freebsd-geom@FreeBSD.ORG Sat Dec 4 19:01:36 2004
From: Paul Mather
To: freebsd-geom@freebsd.org
cc: le@freebsd.org
Date: Sat, 04 Dec 2004 14:01:26 -0500
Message-Id: <1102186886.20507.31.camel@zappa.Chelsea-Ct.Org>
Subject: Is there any way to throttle back plex reconstruction with geom_vinum?
List-Id: GEOM-specific discussions and implementations

In other words, is there anything analogous to the geom_mirror sysctls "kern.geom.mirror.syncs_per_sec" and "kern.geom.mirror.reqs_per_sync" for geom_vinum?

The reason I ask is that when I just tried rebuilding my "usr" plex (upon which /usr resides) my system became completely unresponsive over the network. The ssh session I was using to kick off the "gvinum start usr.p1" stalled dead. Attempts to slogin from another machine hung. Attempts to use other network daemons remotely failed. (The system is a nameserver, and DNS lookups against it failed---not good!) The system still responded to pings, indicating it was alive and not completely hung.

Eventually, it returned to the network. In my logs, I noticed the following:

Dec 4 11:43:17 handle kernel: GEOM_VINUM: plex sync usr.p0 -> usr.p1 started
Dec 4 11:43:17 handle kernel: GEOM_VINUM: sd usr.p1.s0 is initializing
Dec 4 11:43:17 handle kernel: GEOM_VINUM: plex usr.p1 is degraded
Dec 4 12:35:07 handle kernel: OM_VINUM: plex request failed for gvinum/plex/usr.p1[READ(offset=1162870784, length=16384)]
Dec 4 12:35:07 handle kernel: GEOM_VINUM: plex request failed for gvinum/plex/usr.p1[READ(offset=1162870784, length=16384)]
Dec 4 12:35:07 handle last message repeated 346 times
Dec 4 12:35:07 handle kernel: GEOM_VINUM: plex usr.p1 is up
Dec 4 12:35:08 handle kernel: GEOM_VINUM: plex sync usr.p0 -> usr.p1 finished

and also this:

Dec 4 12:35:08 handle sm-mta[428]: rejecting connections on daemon IPv4: load average: 20
Dec 4 12:35:08 handle sm-mta[428]: rejecting connections on daemon IPv6: load average: 20
Dec 4 12:35:08 handle sm-mta[428]: rejecting connections on daemon MSA: load average: 20
Dec 4 12:35:23 handle sm-mta[428]: rejecting connections on daemon IPv4: load average: 16
Dec 4 12:35:23 handle sm-mta[428]: rejecting connections on daemon IPv6: load average: 16
Dec 4 12:35:23 handle sm-mta[428]: rejecting connections on daemon MSA: load average: 16
Dec 4 12:35:38 handle sm-mta[428]: accepting connections again for daemon IPv4
Dec 4 12:35:38 handle sm-mta[428]: accepting connections again for daemon IPv6
Dec 4 12:35:38 handle sm-mta[428]: accepting connections again for daemon MSA

(Note the load average.)

I'm not sure what the "plex request failed" thing is all about (usr.p1 was the plex being reconstructed). What units are being used in the "READ(offset=1162870784, length=16384)" error? (And why read from the plex being reconstructed? Is this to verify the data are written correctly?) The "gvinum printconfig" output of the subdisk pertaining to that plex is as follows:

sd name usr.p1.s0 drive hardy len 47360177s driveoffset 2621424s plex usr.p1 plexoffset 0s

If the units are bytes, that would place the error somewhere near the start of the plex. So, it worries me a little that the plex is marked up at the same time the READ error is reported, and reconstruction is declared complete a second later. How can I be sure reconstruction was completed successfully?

Are the timestamps just a side-effect of how syslogd does its logging? I.e., are the timestamps generated when syslogd actually gets to write to the log file (possibly delayed access due to disk starvation caused by the flat-out geom_vinum reconstruction underway, hence lots of things logged with near-identical times when the drive is no longer monopolised), or when the logged message is received by syslogd?

Cheers,

Paul.

PS: smartctl reports the health of the drive as okay.
-- 
e-mail: paul@gromit.dlib.vt.edu

"Without music to decorate it, time is just a bunch of boring production deadlines or dates by which bills must be paid."
        --- Frank Vincent Zappa
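
The "near the start of the plex" estimate in the message can be checked with a little arithmetic. The following is only a sketch under two assumptions that the message itself leaves open: that the offset and length in the READ error are in bytes, and that the "s" suffix in the "gvinum printconfig" output means 512-byte sectors. The names SECTOR_BYTES and bytes_to_sectors are made up for this illustration.

```python
# Assumption: the "s" suffix in "gvinum printconfig" means 512-byte sectors.
SECTOR_BYTES = 512


def bytes_to_sectors(nbytes, sector_bytes=SECTOR_BYTES):
    """Convert a byte count to whole sectors (hypothetical helper)."""
    return nbytes // sector_bytes


# Values copied from the log line and the subdisk config in the message.
offset_bytes = 1162870784    # READ(offset=...), assumed to be bytes
length_bytes = 16384         # READ(length=...), assumed to be bytes
subdisk_sectors = 47360177   # "len 47360177s" for usr.p1.s0

offset_sectors = bytes_to_sectors(offset_bytes)
length_sectors = bytes_to_sectors(length_bytes)

# How far into the subdisk the failing read would fall, as a fraction.
fraction = offset_sectors / subdisk_sectors

print(f"offset: {offset_sectors} sectors, length: {length_sectors} sectors")
print(f"position within subdisk: {fraction:.1%}")
```

If both assumptions hold, the failing read lies about 5% of the way into the subdisk, which is consistent with "somewhere near the start of the plex". If the offset were in sectors instead, the position would change, so the result says nothing about which unit the driver actually uses.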