Date: Thu, 30 Jun 2011 10:38:13 -0400
From: Mathieu Desnoyers
To: freebsd-bugs@freebsd.org
Cc: "Paul E. McKenney"
Subject: Re: [BUG] FreeBSD 8.2: race condition in pthread mutex on UP system
Message-ID: <20110630143813.GA5431@Krystal>
References: <20110629210840.GA30887@Krystal>
In-Reply-To: <20110629210840.GA30887@Krystal>

* Mathieu Desnoyers (mathieu.desnoyers@efficios.com) wrote:
> Hi,
>
> I ran the Userspace RCU test suite on a freshly installed FreeBSD 8.2
> i386 system running in single-cpu mode in a KVM virtual machine, and
> notice what looks like a race condition in the handling of pthread
> mutexes, only occurring on single-cpu systems.

Please disregard this bug report. Further testing with a simpler test
indicated that the problem was in the test-case custom allocator: a
thread preempted for a long period of time could race, when scheduled
again, with another thread that would have been allocating/freeing
entries (thus wrapping-around the available buffer), which would trigger
this race only when overcommitting the number of threads compared to the
number of available CPUs.

Best regards,

Mathieu

>
> To reproduce the problem, start FreeBSD 8.2 on a single-processor
> machine, and do the following:
>
> git clone git://git.lttng.org/userspace-rcu.git
> cd userspace-rcu
> ./bootstrap
> ./configure
> make
> cd tests
> ./test_urcu_mb 0 2 100
>
> With this configuration (0 reader thread, 2 updater threads), this test
> is just updating data structures protected by pthread mutexes. It uses
> its own allocator which poisons the memory entries when freed. It checks
> for poison value upon allocation of new entries, which therefore detects
> racy updates.
>
> My test setup is the following:
>
> FreeBSD 8.2 i386, running on a x86_64 i7, under KVM. The physical
> machine has 2 physical CPUs with hyperthreading enabled.
>
> Physical CPU model:
> model name : Intel(R) Core(TM) i7 CPU L 640 @ 2.13GHz
>
> With a single-cpu virtual machine (UP):
>
> $ ./test_urcu_mb 0 2 100
> Assertion failed: (test_array[index].a == ARRAY_POISON ||
> test_array[index].a == 0), function test_array_alloc, file test_urcu.c, line 204.
> Abort trap (core dumped)
>
> The problem does not reproduce with multiple RCU readers, single writer
> thread, which does not rely on mutex synchronization, thus pointing me
> into the direction of a mutex problem.
>
> The problem did not reproduce when I increased the number of KVM virtual
> CPUs to 4 with just those two writer threads. Therefore, it starts to
> look like a kernel implementation problem of the waitqueue/wakeup
> scheme supporting mutex lock/unlock operations.
>
> I tried to start 6, and then 100 writer threads on the 4 virtual CPU
> setup, and I cannot trigger the problem so far. So I suspect a bad
> optimisation for the UP case in the FreeBSD 8.2 kernel.
>
> This test runs fine on a wide range of Linux systems.
>
> I'll be happy to provide more info if needed. See dmesg below.
>
> Thanks,
>
> Mathieu Desnoyers
>
>
> Copyright (c) 1992-2011 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> The Regents of the University of California. All rights reserved.
> FreeBSD is a registered trademark of The FreeBSD Foundation.
> FreeBSD 8.2-RELEASE #0: Fri Feb 18 02:24:46 UTC 2011
> root@almeida.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC i386
> Timecounter "i8254" frequency 1193182 Hz quality 0
> CPU: Pentium II/Pentium II Xeon/Celeron (2128.01-MHz 686-class CPU)
> Origin = "GenuineIntel" Id = 0x633 Family = 6 Model = 3 Stepping = 3
> Features=0x781abfd
> Features2=0x80800001>
> real memory = 805306368 (768 MB)
> avail memory = 773296128 (737 MB)
> ACPI APIC Table:
> ioapic0: Changing APIC ID to 1
> ioapic0 irqs 0-23 on motherboard
> kbd1 at kbdmux0
> acpi0: on motherboard
> acpi0: [ITHREAD]
> acpi0: Power Button (fixed)
> Timecounter "ACPI-safe" frequency 3579545 Hz quality 850
> acpi_timer0: <24-bit timer at 3.579545MHz> port 0xb008-0xb00b on acpi0
> cpu0: on acpi0
> pcib0: port 0xcf8-0xcff on acpi0
> pci0: on pcib0
> pci_link4: Unable to route IRQs: AE_NOT_FOUND
> isab0: at device 1.0 on pci0
> isa0: on isab0
> atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xc000-0xc00f at device 1.1 on pci0
> ata0: on atapci0
> ata0: [ITHREAD]
> ata1: on atapci0
> ata1: [ITHREAD]
> uhci0: port 0xc020-0xc03f irq 11 at device 1.2 on pci0
> uhci0: [ITHREAD]
> usbus0: controller did not stop
> usbus0: on uhci0
> pci0: at device 1.3 (no driver attached)
> vgapci0: mem 0xf0000000-0xf1ffffff,0xf2000000-0xf2000fff at device 2.0 on pci0
> em0: port 0xc040-0xc07f mem 0xf2020000-0xf203ffff irq 11 at device 3.0 on pci0
> em0: Memory Access and/or Bus Master bits were not set!
> em0: [FILTER]
> em0: Ethernet address: 52:54:00:3e:67:3b
> pci0: at device 4.0 (no driver attached)
> pci0: at device 5.0 (no driver attached)
> acpi_hpet0: iomem 0xfed00000-0xfed003ff on acpi0
> Timecounter "HPET" frequency 100000000 Hz quality 900
> atrtc0: port 0x70-0x71,0x72-0x77 irq 8 on acpi0
> atkbdc0: port 0x60,0x64 irq 1 on acpi0
> atkbd0: irq 1 on atkbdc0
> kbd0 at atkbd0
> atkbd0: [GIANT-LOCKED]
> atkbd0: [ITHREAD]
> psm0: irq 12 on atkbdc0
> psm0: [GIANT-LOCKED]
> psm0: [ITHREAD]
> psm0: model IntelliMouse Explorer, device ID 4
> fdc0: port 0x3f2-0x3f5,0x3f7 irq 6 drq 2 on acpi0
> fdc0: does not respond
> device_attach: fdc0 attach returned 6
> pmtimer0 on isa0
> orm0: at iomem 0xc9000-0xd0fff pnpid ORM0000 on isa0
> sc0: at flags 0x100 on isa0
> sc0: VGA <16 virtual consoles, flags=0x300>
> vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
> ppc0: parallel port not found.
> uart0: at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
> uart0: [FILTER]
> Timecounter "TSC" frequency 2128010988 Hz quality 800
> Timecounters tick every 10.000 msec
> usbus0: 12Mbps Full Speed USB v1.0
> ad0: 5120MB at ata0-master WDMA2
> acd0: CDROM at ata1-master WDMA2
> ugen0.1: at usbus0
> uhub0: on usbus0
> Root mount waiting for: usbus0
> uhub0: 2 ports with 2 removable, self powered
> Trying to mount root from ufs:/dev/ad0s1a
>
> --
> Mathieu Desnoyers
> Operating System Efficiency R&D Consultant
> EfficiOS Inc.
> http://www.efficios.com

--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
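
For illustration, the allocator race described in the reply above can be
replayed deterministically in a single thread. This is a minimal sketch
under stated assumptions, not the actual test_urcu.c code: the names
ring, ring_alloc, ring_free, cycles_until_collision, RING_SIZE and POISON
are all hypothetical, and the long preemption of thread A is simulated by
simply leaving its entry in use while thread B keeps allocating and
freeing. It shows one plausible reading of the failure: the ring hands
out slots round-robin, so once the index wraps it reaches a slot that a
descheduled thread still holds, and the poison check fires even though
every individual operation could be mutex-protected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 8        /* hypothetical ring capacity */
#define POISON    0x8765   /* hypothetical poison value written on free */

struct entry { int a; };

struct ring {
	struct entry slots[RING_SIZE];
	size_t next;           /* index of the next slot to hand out */
};

static void ring_init(struct ring *r)
{
	for (size_t i = 0; i < RING_SIZE; i++)
		r->slots[i].a = 0;
	r->next = 0;
}

/*
 * Hand out the next slot round-robin. *clean is set to false when the
 * slot is neither poisoned nor zero, i.e. someone still holds it.
 */
static struct entry *ring_alloc(struct ring *r, bool *clean)
{
	struct entry *e = &r->slots[r->next];

	*clean = (e->a == POISON || e->a == 0);
	r->next = (r->next + 1) % RING_SIZE;
	return e;
}

/* Mark a slot as free by poisoning it. */
static void ring_free(struct entry *e)
{
	assert(e != NULL);
	e->a = POISON;
}

/*
 * Replay the interleaving: thread A allocates one entry and is then
 * "preempted" while still holding it; thread B runs alloc/free cycles.
 * Returns the cycle at which B is handed A's live slot, or -1 if that
 * never happens within two laps of the ring.
 */
static int cycles_until_collision(void)
{
	struct ring r;
	bool clean;

	ring_init(&r);

	struct entry *held = ring_alloc(&r, &clean);  /* thread A */
	held->a = 1;                                  /* entry now in use */
	/* thread A is descheduled here and does not free its entry */

	for (int cycle = 1; cycle <= 2 * RING_SIZE; cycle++) {  /* thread B */
		struct entry *e = ring_alloc(&r, &clean);

		if (!clean)
			return cycle;  /* wrapped onto A's live entry */
		e->a = 1;
		ring_free(e);
	}
	return -1;
}
```

In this sketch the collision is reported as soon as thread B's index
wraps around the ring, which is consistent with the failure needing more
runnable threads than CPUs: on an uncontended multi-CPU setup thread A
would normally free its entry long before the other thread completes a
full lap.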