From owner-freebsd-stable@FreeBSD.ORG Sun Mar 27 12:41:18 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B26FA1065672; Sun, 27 Mar 2011 12:41:18 +0000 (UTC) (envelope-from to.my.trociny@gmail.com) Received: from mail-fx0-f54.google.com (mail-fx0-f54.google.com [209.85.161.54]) by mx1.freebsd.org (Postfix) with ESMTP id 141D68FC14; Sun, 27 Mar 2011 12:41:17 +0000 (UTC) Received: by fxm11 with SMTP id 11so2646796fxm.13 for ; Sun, 27 Mar 2011 05:41:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:from:to:cc:subject:references:x-comment-to :sender:date:in-reply-to:message-id:user-agent:mime-version :content-type; bh=RNw8cxP7uUema0n/2iBx6BBZ0XqmIMrHnxtccd3kl7g=; b=ar8m+psd84djSlwsEtFOVdFg54hew2SUjYbQ4cry2kcRg3EnF7MPXoGVxuNs6uSCeo 3gSM3qbmxLg/89cW4Cd33uRIwJh3WABvo6Y2LxDBJ3x6tWCmZzz5N2/HUI0UVgHTy1X1 9CP/ekMqqE46kyO7zPRT4RfWt2/tnJMIdMNDA= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=from:to:cc:subject:references:x-comment-to:sender:date:in-reply-to :message-id:user-agent:mime-version:content-type; b=xFp/L9ahnj2oNKGHVsKWEr2oPl1LSBcQBt5URS+ea35ZuNzTW/2OR5Zkv6cKW5DuYJ axs7JoIAPJDsNzvH8qAEOTOQp7/lrlq8UexQJ8jngDAcpkjiMZpCwCJxx2Cv/m6AEwii w1R/wkpR5w7JZw4DafZqxd791d7zgpzmDT3YY= Received: by 10.223.59.81 with SMTP id k17mr3204143fah.94.1301228179896; Sun, 27 Mar 2011 05:16:19 -0700 (PDT) Received: from localhost ([95.69.172.154]) by mx.google.com with ESMTPS id n1sm1070163fam.16.2011.03.27.05.16.17 (version=TLSv1/SSLv3 cipher=OTHER); Sun, 27 Mar 2011 05:16:18 -0700 (PDT) From: Mikolaj Golub To: Freddie Cash References: <20110325075541.GA1742@garage.freebsd.pl> X-Comment-To: Freddie Cash Sender: Mikolaj Golub Date: Sun, 27 Mar 2011 15:16:15 +0300 In-Reply-To: (Freddie Cash's message of "Sat, 26 Mar 2011 10:52:08 -0700") Message-ID: <86zkogep2o.fsf@kopusha.home.net> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.2 (berkeley-unix) MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" Cc: FreeBSD Filesystems , FreeBSD Stable , FreeBSD-Current , Pawel Jakub Dawidek Subject: Re: Any success stories for HAST + ZFS? X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 27 Mar 2011 12:41:18 -0000 --=-=-= On Sat, 26 Mar 2011 10:52:08 -0700 Freddie Cash wrote: FC> hastd backtrace is here: FC> http://www.sd73.bc.ca/downloads/crash/hast-backtrace.png It is not a hastd crash, but a kernel crash triggered by hastd process. I am not sure I got the same crash as you but apparently the race is possible in g_gate on device creation. I got the following crash starting many hast providers simultaneously: fault virtual address = 0x0 #8 0xc0c11adc in calltrap () at /usr/src/sys/i386/i386/exception.s:168 #9 0xc086ac6b in g_gate_ioctl (dev=0xc6a24300, cmd=3374345472, addr=0xc9fec000 "\002", flags=3, td=0xc7ff0b80) at /usr/src/sys/geom/gate/g_gate.c:410 #10 0xc0853c5b in devfs_ioctl_f (fp=0xc9b9e310, com=3374345472, data=0xc9fec000, cred=0xc8c9c200, td=0xc7ff0b80) at /usr/src/sys/fs/devfs/devfs_vnops.c:678 #11 0xc09210cd in kern_ioctl (td=0xc7ff0b80, fd=3, com=3374345472, data=0xc9fec000 "\002") at file.h:262 #12 0xc0921254 in ioctl (td=0xc7ff0b80, uap=0xf5edbcec) at /usr/src/sys/kern/sys_generic.c:679 #13 0xc0916616 in syscallenter (td=0xc7ff0b80, sa=0xf5edbce4) at /usr/src/sys/kern/subr_trap.c:315 #14 0xc0c2b9ff in syscall (frame=0xf5edbd28) at /usr/src/sys/i386/i386/trap.c:1086 #15 0xc0c11b71 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:266 Or just creating many ggate devices simultaneously: for i in `jot 100`; do ./ggiocreate $i& done ggiocreate.c is attached. In my case the kernel crashes in g_gate_create() when checking for name collisions in strcmp(): /* Check for name collision. */ for (unit = 0; unit < g_gate_maxunits; unit++) { if (g_gate_units[unit] == NULL) continue; if (strcmp(name, g_gate_units[unit]->sc_provider->name) != 0) continue; mtx_unlock(&g_gate_units_lock); mtx_destroy(&sc->sc_queue_mtx); free(sc, M_GATE); return (EEXIST); } I think the issue is the following. When preparing sc we take g_gate_units_lock, check for name collision, fill sc fields except sc->sc_provider, and registers sc in g_gate_units[unit]. sc_provider is filled later, when g_gate_units_lock is released. So the scenario is possible: 1) Thread A registers sc in g_gate_units[unit] with g_gate_units[unit]->sc_provider still null and releases g_gate_units_lock. 2) Thread B traverses g_gate_units[] when checking for name collision and craches accessing g_gate_units[unit]->sc_provider->name. The attached patch fixes the issue in my case. -- Mikolaj Golub --=-=-= Content-Type: application/octet-stream Content-Disposition: attachment; filename=ggiocreate.c Content-Transfer-Encoding: base64 I2luY2x1ZGUgPHN5cy9jZGVmcy5oPgojaW5jbHVkZSA8c3lzL3R5cGVzLmg+CgojaW5jbHVkZSA8 Z2VvbS9nYXRlL2dfZ2F0ZS5oPgoKI2luY2x1ZGUgPGVyci5oPgojaW5jbHVkZSA8ZXJybm8uaD4K I2luY2x1ZGUgPGZjbnRsLmg+CiNpbmNsdWRlIDxzdGRpby5oPgojaW5jbHVkZSA8c3RyaW5ncy5o PgoKaW50Cm1haW4gKGludCBhcmdjLCBjaGFyICphcmd2W10pCnsKCXN0cnVjdCBnX2dhdGVfY3Rs X2NyZWF0ZSBnZ2lvY3JlYXRlOwoJc3RydWN0IGdfZ2F0ZV9jdGxfY2FuY2VsIGdnaW9jYW5jZWw7 CglzdHJ1Y3QgZ19nYXRlX2N0bF9kZXN0cm95IGdnaW9kOwoJaW50IGZkLCB1bml0OwoKCWlmIChh cmdjIDwgMikKCQllcnJ4KDEsICJ1c2FnZTogZ2dpb2NyZWF0ZSBuYW1lIik7CgkKCWZkID0gb3Bl bigiL2Rldi8iIEdfR0FURV9DVExfTkFNRSwgT19SRFdSKTsKCWlmIChmZCA8IDApCgkJZXJyKDEs ICJVbmFibGUgdG8gb3BlbiAvZGV2LyIgR19HQVRFX0NUTF9OQU1FKTsKCWJ6ZXJvKCZnZ2lvY3Jl YXRlLCBzaXplb2YoZ2dpb2NyZWF0ZSkpOwoJZ2dpb2NyZWF0ZS5nY3RsX3ZlcnNpb24gPSBHX0dB VEVfVkVSU0lPTjsKCWdnaW9jcmVhdGUuZ2N0bF9tZWRpYXNpemUgPSAxMDI0ICogMTAyNDA7Cgln Z2lvY3JlYXRlLmdjdGxfc2VjdG9yc2l6ZSA9IDUxMjsKCWdnaW9jcmVhdGUuZ2N0bF9mbGFncyA9 IDA7CglnZ2lvY3JlYXRlLmdjdGxfbWF4Y291bnQgPSBHX0dBVEVfTUFYX1FVRVVFX1NJWkU7Cgln Z2lvY3JlYXRlLmdjdGxfdGltZW91dCA9IDA7CglnZ2lvY3JlYXRlLmdjdGxfdW5pdCA9IEdfR0FU RV9OQU1FX0dJVkVOOwoJc25wcmludGYoZ2dpb2NyZWF0ZS5nY3RsX25hbWUsIHNpemVvZihnZ2lv Y3JlYXRlLmdjdGxfbmFtZSksCgkgICAgInRlc3RoYXN0LyVzIiwgYXJndlsxXSk7CglpZiAoaW9j dGwoZmQsIEdfR0FURV9DTURfQ1JFQVRFLCAmZ2dpb2NyZWF0ZSkgIT0gMCkKCQllcnIoMSwgIlVu YWJsZSB0byBjcmVhdGUgdGVzdGhhc3QvJXMgZGV2aWNlIiwgYXJndlsxXSk7CgllbHNlCgkJdW5p dCA9IGdnaW9jcmVhdGUuZ2N0bF91bml0OwoJCglzbGVlcCgxMCk7CgoJYnplcm8oJmdnaW9kLCBz aXplb2YoZ2dpb2QpKTsKCWdnaW9kLmdjdGxfdmVyc2lvbiA9IEdfR0FURV9WRVJTSU9OOwoJZ2dp b2QuZ2N0bF91bml0ID0gdW5pdDsKCWdnaW9kLmdjdGxfZm9yY2UgPSAxOwoJaWYgKGlvY3RsKGZk LCBHX0dBVEVfQ01EX0RFU1RST1ksICZnZ2lvZCkgPCAwKQoJCWVycigxLCAiVW5hYmxlIHRvIGRl c3Ryb3kgdGVzdGhhc3QvJXMgZGV2aWNlIiwgYXJndlsxXSk7CgoJcmV0dXJuIDA7Cn0K --=-=-= Content-Type: text/x-patch Content-Disposition: attachment; filename=g_gate.patch Index: sys/geom/gate/g_gate.c =================================================================== --- sys/geom/gate/g_gate.c (revision 220050) +++ sys/geom/gate/g_gate.c (working copy) @@ -407,13 +407,14 @@ g_gate_create(struct g_gate_ctl_create *ggio) for (unit = 0; unit < g_gate_maxunits; unit++) { if (g_gate_units[unit] == NULL) continue; - if (strcmp(name, g_gate_units[unit]->sc_provider->name) != 0) + if (strcmp(name, g_gate_units[unit]->sc_name) != 0) continue; mtx_unlock(&g_gate_units_lock); mtx_destroy(&sc->sc_queue_mtx); free(sc, M_GATE); return (EEXIST); } + sc->sc_name = name; g_gate_units[sc->sc_unit] = sc; g_gate_nunits++; mtx_unlock(&g_gate_units_lock); @@ -432,6 +433,9 @@ g_gate_create(struct g_gate_ctl_create *ggio) sc->sc_provider = pp; g_error_provider(pp, 0); g_topology_unlock(); + mtx_lock(&g_gate_units_lock); + sc->sc_name = sc->sc_provider->name; + mtx_unlock(&g_gate_units_lock); if (sc->sc_timeout > 0) { callout_reset(&sc->sc_callout, sc->sc_timeout * hz, Index: sys/geom/gate/g_gate.h =================================================================== --- sys/geom/gate/g_gate.h (revision 220050) +++ sys/geom/gate/g_gate.h (working copy) @@ -76,6 +76,7 @@ * 'P:' means 'Protected by'. */ struct g_gate_softc { + char *sc_name; /* P: (read-only) */ int sc_unit; /* P: (read-only) */ int sc_ref; /* P: g_gate_list_mtx */ struct g_provider *sc_provider; /* P: (read-only) */ @@ -96,7 +97,6 @@ struct g_gate_softc { LIST_ENTRY(g_gate_softc) sc_next; /* P: g_gate_list_mtx */ char sc_info[G_GATE_INFOSIZE]; /* P: (read-only) */ }; -#define sc_name sc_provider->geom->name #define G_GATE_DEBUG(lvl, ...) do { \ if (g_gate_debug >= (lvl)) { \ --=-=-=--