From owner-freebsd-fs@FreeBSD.ORG Mon Jun 14 15:07:58 2010
From: Mikolaj Golub
To: Pawel Jakub Dawidek
Cc: freebsd-fs@freebsd.org
Subject: Re: FreeBSD 8.1 and HAST
Date: Mon, 14 Jun 2010 18:07:51 +0300
Message-ID: <868w6hwt2w.fsf@kopusha.home.net>
In-Reply-To: <20100614095044.GH1721@garage.freebsd.pl> (Pawel Jakub Dawidek's
	message of "Mon, 14 Jun 2010 11:50:44 +0200")
References: <4C1372E0.1000903@soupacific.com>
	<20100612142311.GF2253@garage.freebsd.pl>
	<4C139F9C.2090305@soupacific.com> <86iq5oc82y.fsf@kopusha.home.net>
	<4C14215D.9090304@soupacific.com>
	<20100613003635.GA60012@icarus.home.lan>
	<20100613074921.GB1320@garage.freebsd.pl>
	<4C149A5C.3070401@soupacific.com>
	<20100613102401.GE1320@garage.freebsd.pl>
	<86eigavzsg.fsf@kopusha.home.net>
	<20100614095044.GH1721@garage.freebsd.pl>
X-Comment-To: Pawel Jakub Dawidek
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.2 (berkeley-unix)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
List-Id: Filesystems

On Mon, 14 Jun 2010 11:50:44 +0200 Pawel Jakub Dawidek wrote:

 PJD> On Mon, Jun 14, 2010 at 10:28:15AM +0300, Mikolaj Golub wrote:
 >>
 >> On Sun, 13 Jun 2010 12:24:01 +0200 Pawel Jakub Dawidek wrote:
 >>
 >> >> Jun 13 16:25:37 sv01A hastd: [zfshast] (primary) Header contains no 'seq' field.
 >>
 >> PJD> This is the most important bit from the primary node.
 >>
 >> PJD> The header either does not contain 'seq' field or this field is 0. It
 >> PJD> can only be 0 if you have old kernel. With recent kernel geom_gate.ko
 >> PJD> was modified to start seq at 1, so this should not happen.
 >>
 >> I am a bit confused how this seq is supposed to work. For sync thread. I have
 >> set up hast on 8-STABLE (before I used it on 9-CURRENT only) and have the same
 >> issue as hiroshi@ does. I have added
 >>
 >>     pjdlog_debug(2, "remote_send: seq is %llu.", (uint64_t)ggio->gctl_seq);
 >>
 >> after
 >>
 >>     nv_add_uint64(nv, (uint64_t)ggio->gctl_seq, "seq");
 >>
 >> in primary/remote_send thread and observe the following:

 PJD> [...]

 PJD> Could you find where exactly it looses proper value?
 PJD> I found that in ggate_recv_thread() after ioctl(2), gctl_seq has
 PJD> expected value, but I'm not setup to test it further quickly.

I suppose ggate_recv_thread() is ok, but as I wrote earlier I am concerned
about the sync thread. I have added additional prints and here is how it looks:

Jun 14 17:47:50 zhuzha hastd: [storage] (primary) Device hast/storage recovered.
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) init_environment: hio: Ox28441580; seq: 0.
...
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) init_environment: hio: Ox284f2600; seq: 0.
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) init_environment: hio: Ox284f2640; seq: 0.
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) ggate_recv: Taking free request.
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) ggate_recv: (0x284f2640) Got free request.
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) ggate_recv 1: hio: Ox284f2640; seq: 0.
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) ggate_recv: (0x284f2640) Waiting for request from the kernel.
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) ggate_recv 2: hio: Ox284f2640; seq: 0.
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) remote_send: Taking request.
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) local_send: Taking request.
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) ggate_send: Taking request.
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) remote_recv: No requests, waiting.
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) remote_guard: Checking connections.
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) Synchronization started. 10485760 bytes to go.
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) remote_guard: Connection to tcp4://192.168.120.6 is ok.
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) sync: Taking free request.
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) sync: (0x284f2600) Got free request.
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) sync 1: hio: Ox284f2600; seq: 0.
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) sync 2: hio: Ox284f2600; seq: 0.
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) sync 3: hio: Ox284f2600; seq: 0.
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) sync: (0x284f2600) Sending sync request: READ(0, 131072).
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) sync: (0x284f2600) Moving request to the send queue.
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) sync 4: hio: Ox284f2600; seq: 0.
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) sync 5: hio: Ox284f2600; seq: 0.
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) local_send: (0x284f2600) Got request.
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) local_send 1: hio: Ox284f2600; seq: 0.
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) local_send: Taking request.
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) sync 6: hio: Ox284f2600; seq: 0.
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) sync 7: hio: Ox284f2600; seq: 0.
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) sync: (0x284f2600) Sending sync request: WRITE(0, 131072).
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) sync: (0x284f2600) Moving request to the send queue.
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) sync 8: hio: Ox284f2600; seq: 0.
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) sync: (0x284f2600) Moving request to the send queues.
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) sync 9: hio: Ox284f2600; seq: 0.
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) remote_send: (0x284f2600) Got request.
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) remote_send 1: hio: Ox284f2600; seq: 0.
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) remote_send 3: hio: Ox284f2600; seq: 0.
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) remote_send: seq is 0.
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) remote_send: (0x284f2600) Moving request to the recv queue.
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) remote_send 7: hio: Ox284f2600; seq: 0.
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) remote_send: Taking request.
Jun 14 17:47:50 zhuzha hastd: [storage] (primary) Header contains no 'seq' field.
Jun 14 17:47:50 zhuzha kernel: Jun 14 17:47:50 zhuzha hastd: [storage] (primary) Header contains no 'seq' field.

So ggate_recv takes a free, just-initialized hio (0x284f2640) from the free
queue and waits for data from the kernel. At this time the sync thread starts
synchronization, takes another just-initialized hio (0x284f2600) from the
free queue and passes it to remote_send, so the request is sent with
seq == 0.

-- 
Mikolaj Golub
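
To make the sequence above easier to follow, here is a minimal model of it
in C. This is only a sketch and not the hastd code: struct hio is reduced to
the one field discussed here, and queue_put(), queue_take(),
init_free_queue(), ggate_recv_model(), sync_model() and check_seq_model()
are hypothetical names invented for the illustration. The assumptions, taken
from this thread, are that freshly initialized requests have seq 0, that
only the kernel path assigns a seq (starting at 1), and that a seq of 0 is
treated the same as a missing 'seq' field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a hastd request; only seq is modelled. */
struct hio {
	uint64_t	 seq;	/* plays the role of ggio->gctl_seq */
	struct hio	*next;
};

/* Very small LIFO standing in for the free queue (no locking). */
struct hio_queue {
	struct hio	*head;
};

void
queue_put(struct hio_queue *q, struct hio *h)
{
	h->next = q->head;
	q->head = h;
}

struct hio *
queue_take(struct hio_queue *q)
{
	struct hio *h = q->head;

	if (h != NULL)
		q->head = h->next;
	return (h);
}

/* Models init_environment: every request starts zeroed, so seq == 0. */
void
init_free_queue(struct hio_queue *freeq, struct hio *pool, size_t n)
{
	size_t i;

	freeq->head = NULL;
	memset(pool, 0, n * sizeof(*pool));
	for (i = 0; i < n; i++)
		queue_put(freeq, &pool[i]);
}

/*
 * Models the ggate_recv path: the request is filled in by the kernel,
 * which is assumed to hand out seq values starting at 1.
 */
struct hio *
ggate_recv_model(struct hio_queue *freeq, uint64_t *kernel_seq)
{
	struct hio *h = queue_take(freeq);

	if (h != NULL)
		h->seq = (*kernel_seq)++;
	return (h);
}

/*
 * Models the sync path: the request is built in userland and never
 * passes through the kernel, so nothing touches seq.
 */
struct hio *
sync_model(struct hio_queue *freeq)
{
	return (queue_take(freeq));
}

/* Models the header check: seq == 0 is treated as "no 'seq' field". */
int
check_seq_model(const struct hio *h)
{
	return (h->seq == 0 ? -1 : 0);
}
```

In this model a request obtained through ggate_recv_model() passes
check_seq_model(), while one obtained through sync_model() fails it, which
matches the "Header contains no 'seq' field" message in the log.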