Date: Sat, 04 Dec 2004 14:01:26 -0500
From: Paul Mather <paul@gromit.dlib.vt.edu>
To: freebsd-geom@freebsd.org
Cc: le@freebsd.org
Subject: Is there any way to throttle back plex reconstruction with geom_vinum?
Message-ID: <1102186886.20507.31.camel@zappa.Chelsea-Ct.Org>

In other words, is there anything analogous to the geom_mirror sysctls "kern.geom.mirror.syncs_per_sec" and "kern.geom.mirror.reqs_per_sync" for geom_vinum?

The reason I ask is that when I just tried rebuilding my "usr" plex (upon which /usr resides), my system became completely unresponsive over the network. The ssh session I was using to kick off the "gvinum start usr.p1" stalled dead. Attempts to slogin from another machine hung. Attempts to use other network daemons remotely failed. (The system is a nameserver, and DNS lookups against it failed---not good!) The system still responded to pings, indicating it was alive and not completely hung.

Eventually, it returned to the network. In my logs, I noticed the following:

Dec 4 11:43:17 handle kernel: GEOM_VINUM: plex sync usr.p0 -> usr.p1 started
Dec 4 11:43:17 handle kernel: GEOM_VINUM: sd usr.p1.s0 is initializing
Dec 4 11:43:17 handle kernel: GEOM_VINUM: plex usr.p1 is degraded
Dec 4 12:35:07 handle kernel: OM_VINUM: plex request failed for gvinum/plex/usr.p1[READ(offset=1162870784, length=16384)]
Dec 4 12:35:07 handle kernel: GEOM_VINUM: plex request failed for gvinum/plex/usr.p1[READ(offset=1162870784, length=16384)]
Dec 4 12:35:07 handle last message repeated 346 times
Dec 4 12:35:07 handle kernel: GEOM_VINUM: plex usr.p1 is up
Dec 4 12:35:08 handle kernel: GEOM_VINUM: plex sync usr.p0 -> usr.p1 finished

and also this:

Dec 4 12:35:08 handle sm-mta[428]: rejecting connections on daemon IPv4: load average: 20
Dec 4 12:35:08 handle sm-mta[428]: rejecting connections on daemon IPv6: load average: 20
Dec 4 12:35:08 handle sm-mta[428]: rejecting connections on daemon MSA: load average: 20
Dec 4 12:35:23 handle sm-mta[428]: rejecting connections on daemon IPv4: load average: 16
Dec 4 12:35:23 handle sm-mta[428]: rejecting connections on daemon IPv6: load average: 16
Dec 4 12:35:23 handle sm-mta[428]: rejecting connections on daemon MSA: load average: 16
Dec 4 12:35:38 handle sm-mta[428]: accepting connections again for daemon IPv4
Dec 4 12:35:38 handle sm-mta[428]: accepting connections again for daemon IPv6
Dec 4 12:35:38 handle sm-mta[428]: accepting connections again for daemon MSA

(Note the load average.)

I'm not sure what the "plex request failed" thing is all about (usr.p1 was the plex being reconstructed). What units are being used in the "READ(offset=1162870784, length=16384)" error? (And why read from the plex being reconstructed? Is this to verify that the data were written correctly?) The "gvinum printconfig" output for the subdisk belonging to that plex is as follows:

sd name usr.p1.s0 drive hardy len 47360177s driveoffset 2621424s plex usr.p1 plexoffset 0s

If the units are bytes, that would place the error somewhere near the start of the plex. So it worries me a little that the plex is marked up at the same time the READ error is reported, and that reconstruction is declared complete a second later. How can I be sure reconstruction completed successfully?

Are the timestamps just a side-effect of how syslogd does its logging? I.e., are the timestamps generated when syslogd actually gets to write to the log file (possibly delayed by disk starvation caused by the flat-out geom_vinum reconstruction underway, hence lots of things logged with near-identical times once the drive is no longer monopolised), or when the logged message is received by syslogd?

Cheers,

Paul.

PS: smartctl reports the health of the drive as okay.

--
e-mail: paul@gromit.dlib.vt.edu

"Without music to decorate it, time is just a bunch of boring production deadlines or dates by which bills must be paid."
        --- Frank Vincent Zappa
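
The "near the start of the plex" estimate above can be checked with a little arithmetic. The following is only a sketch under two assumptions that the message itself leaves open: that the offset and length in the READ error are in bytes, and that the "s" suffix in the printconfig output means 512-byte sectors. The function name locate_request is made up for illustration and is not part of gvinum.

```python
# Sketch only: ASSUMES the READ offset/length are bytes and that the
# "s" suffix in "gvinum printconfig" output means 512-byte sectors.
SECTOR_SIZE = 512  # assumed sector size in bytes


def locate_request(offset_bytes, length_bytes, plex_len_sectors,
                   sector_size=SECTOR_SIZE):
    """Convert a byte offset/length into sectors and a position in the plex.

    locate_request is a hypothetical helper, not a gvinum interface.
    """
    start_sector, remainder = divmod(offset_bytes, sector_size)
    return {
        "start_sector": start_sector,
        "sector_aligned": remainder == 0,
        "length_sectors": length_bytes // sector_size,
        # Fraction of the way through the plex (0.0 = start, 1.0 = end).
        "fraction_of_plex": start_sector / plex_len_sectors,
    }


if __name__ == "__main__":
    # Values taken from the log line and the printconfig output above.
    info = locate_request(offset_bytes=1162870784,
                          length_bytes=16384,
                          plex_len_sectors=47360177)
    print("start sector:   ", info["start_sector"])
    print("sector aligned: ", info["sector_aligned"])
    print("length (sectors):", info["length_sectors"])
    print("position in plex: {:.2%}".format(info["fraction_of_plex"]))
```

Under those assumptions the failed read starts at sector 2271232, which is less than 5% of the way into the 47360177-sector subdisk. That is consistent with "near the start of the plex", but it does not settle what units the kernel message actually uses.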