Date: Sun, 24 Feb 2013 12:05:06 +0200
From: Mikolaj Golub <trociny@FreeBSD.org>
To: Pawel Jakub Dawidek <pjd@FreeBSD.org>
Cc: Chad M Stewart <cms@balius.com>, freebsd-questions@freebsd.org
Subject: Re: HAST - detect failure and restore avoiding an outage?
Message-ID: <20130224100503.GA19308@gmail.com>
In-Reply-To: <20130223205103.GN1377@garage.freebsd.pl>
References: <E3C8C9A2-712E-4925-995A-0471CCD3515B@balius.com> <20130221220042.GA2900@gmail.com> <20130223205103.GN1377@garage.freebsd.pl>

On Sat, Feb 23, 2013 at 09:51:03PM +0100, Pawel Jakub Dawidek wrote:

> I'm fine with the patch except for missing breaks in switch added to
> hastd/primary.c.

Oops. Fixed. Thanks!

> I'm also wondering... You count all those errors separately just to
> print them as one number. If we do that already let's print them
> separately, eg.
>
>     local i/o errors: read(0), write(3), delete(5), flush(9)

The idea was that hastd would provide all available counters, while
hastctl would show only an aggregated counter to save screen space.
Anyone who wanted to write their own utility to monitor hastd, talking
directly to hastd via the socket, would still be able to see all the
counters separately.

But your idea of writing the errors in one string looks better, as it
saves screen space and provides more detailed info. I would prefer a
slightly different output though:

  role: secondary
  provname: test
  localpath: /dev/md102
  extentsize: 2097152 (2.0MB)
  keepdirty: 0
  remoteaddr: kopusha:7771
  replication: memsync
  status: complete
  dirty: 0 (0B)
  statistics:
    reads: 13
    writes: 521
    deletes: 0
    flushes: 0
    activemap updates: 0
    local i/o errors: read: 13, write: 425, delete: 0, flush: 0

but I don't have a strong opinion and will be OK with yours if you
don't like my version.

> BTW. Why not to count activemap update errors as write and flush errors?

I need (internally) separate counters for activemap errors because they
are updated by a different thread, and I wouldn't want to introduce
locking for error counter update operations. As hastctl was supposed to
show an aggregated counter, I didn't bother much with how to make
activemap update errors count as write and flush errors. I improved this
too in the updated patch:

http://people.freebsd.org/~trociny/hast.stat_error.2.patch

-- 
Mikolaj Golub
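
The counter arrangement discussed above (one set of error counters per updating thread, summed only when the status line is printed) might look roughly like the sketch below. This is not code from the patch: the names `io_op`, `err_counters`, `count_error`, `merge_counters` and `format_errors` are hypothetical. It also skips the question of how the reporting thread gets a consistent snapshot of counters that another thread is still updating.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical operation kinds; illustrative names, not taken from hastd. */
enum io_op { IO_READ, IO_WRITE, IO_DELETE, IO_FLUSH, IO_OP_COUNT };

/*
 * One instance per updating thread (assumption: one for the regular I/O
 * path, one for the activemap update path), so an increment never needs
 * a lock shared with another writer.
 */
struct err_counters {
	uint64_t errs[IO_OP_COUNT];
};

/* Record one failed operation in the calling thread's own counter set. */
static void
count_error(struct err_counters *c, enum io_op op)
{
	assert(op < IO_OP_COUNT);
	c->errs[op]++;
}

/*
 * At report time, fold the activemap counters into the I/O counters so
 * that activemap failures show up as write and flush errors.
 */
static struct err_counters
merge_counters(const struct err_counters *io, const struct err_counters *amp)
{
	struct err_counters sum;
	int i;

	for (i = 0; i < IO_OP_COUNT; i++)
		sum.errs[i] = io->errs[i] + amp->errs[i];
	return (sum);
}

/* Render the merged counters as the single status line proposed above. */
static int
format_errors(const struct err_counters *c, char *buf, size_t len)
{
	return (snprintf(buf, len,
	    "local i/o errors: read: %llu, write: %llu, delete: %llu, flush: %llu",
	    (unsigned long long)c->errs[IO_READ],
	    (unsigned long long)c->errs[IO_WRITE],
	    (unsigned long long)c->errs[IO_DELETE],
	    (unsigned long long)c->errs[IO_FLUSH]));
}
```

Each thread would call `count_error()` only on its own structure. The reporting side would call `merge_counters()` followed by `format_errors()` when a status request arrives.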