Date: Fri, 23 Oct 2009 11:56:24 +0100
From: Pete French <petefrench@ticketswitch.com>
To: freebsd-stable@freebsd.org
Subject: problems with gmirror on ggate over slow link
Message-ID: <E1N1Hom-000HpE-6S@dilbert.ticketswitch.com>

[ originally sent to geom, but am throwing it open to a wider audience
as I didn't get any replies there ]

I am using 7.2-STABLE from October 7th on all machines, but this has
been going on for a while.

Very simply, I am mirroring together a pair of discs, one local, one
remote. The remote disc is accessed using ggate. If the remote disc is
on a very close machine, e.g. a server plugged into the same ethernet,
then all works fine. If I put the remote disc somewhere substantially
further away on the network, however, then when I attach the disc it
starts to rebuild the mirror but fails a fraction of a second later,
thus:

GEOM_MIRROR: Device mysql0: rebuilding provider ggate1a.
GEOM_MIRROR: Synchronization request failed (error=5). ggate1a[WRITE(offset=1310720, length=131072)]
GEOM_MIRROR: Device mysql0: provider ggate1a disconnected.
GEOM_MIRROR: Device mysql0: rebuilding provider ggate1a stopped.

The interesting thing is that the problem is only with gmirror, not with
the underlying ggate disc, which remains attached and accessible. I
tested this by adding a second partition (ggate1b in the example above)
and mounting a UFS filesystem on that.

I've looked at the kernel code briefly, but it is not clear to me what
is causing that write to fail. My conjecture would be that a buffer
somewhere is filling up, causing a write to fail, and that gmirror,
instead of waiting and retrying, just fails the synchronisation.

Any ideas? Is this actually a bug? I am wondering if it would also
happen when mirroring a very fast disc against a very slow one (i.e.
maybe it is independent of ggate).

-pete.