From: Mathieu Desnoyers
Date: Wed, 29 Jun 2011 17:08:40 -0400
To: freebsd-bugs@freebsd.org
Cc: "Paul E. McKenney"
Subject: [BUG] FreeBSD 8.2: race condition in pthread mutex on UP system
Message-ID: <20110629210840.GA30887@Krystal>

Hi,

I ran the Userspace RCU test suite on a freshly installed FreeBSD 8.2
i386 system running in single-cpu mode in a KVM virtual machine, and
noticed what looks like a race condition in the handling of pthread
mutexes, occurring only on single-cpu systems.

To reproduce the problem, start FreeBSD 8.2 on a single-processor
machine, and do the following:

    git clone git://git.lttng.org/userspace-rcu.git
    cd userspace-rcu
    ./bootstrap
    ./configure
    make
    cd tests
    ./test_urcu_mb 0 2 100

With this configuration (0 reader threads, 2 updater threads), this test
is just updating data structures protected by pthread mutexes. It uses
its own allocator, which poisons the memory entries when they are freed.
It checks for the poison value upon allocation of new entries, and
therefore detects racy updates.

My test setup is the following: FreeBSD 8.2 i386, running on an x86_64
i7, under KVM. The physical machine has 2 physical CPUs with
hyperthreading enabled. Physical CPU model:

    model name : Intel(R) Core(TM) i7 CPU L 640 @ 2.13GHz

With a single-cpu virtual machine (UP):

    $ ./test_urcu_mb 0 2 100
    Assertion failed: (test_array[index].a == ARRAY_POISON || test_array[index].a == 0), function test_array_alloc, file test_urcu.c, line 204.
    Abort trap (core dumped)

The problem does not reproduce with multiple RCU readers and a single
writer thread, a configuration that does not rely on mutex
synchronization, which points toward a mutex problem.

The problem did not reproduce when I increased the number of KVM virtual
CPUs to 4 with just those two writer threads. Therefore, it starts to
look like a kernel implementation problem in the waitqueue/wakeup scheme
supporting mutex lock/unlock operations. I also tried starting 6, and
then 100, writer threads on the 4 virtual CPU setup, and I have not been
able to trigger the problem so far. So I suspect a bad optimisation for
the UP case in the FreeBSD 8.2 kernel.

This test runs fine on a wide range of Linux systems.

I'll be happy to provide more info if needed. See dmesg below.

Thanks,

Mathieu Desnoyers

Copyright (c) 1992-2011 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
    The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.2-RELEASE #0: Fri Feb 18 02:24:46 UTC 2011
    root@almeida.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC i386
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Pentium II/Pentium II Xeon/Celeron (2128.01-MHz 686-class CPU)
  Origin = "GenuineIntel" Id = 0x633 Family = 6 Model = 3 Stepping = 3
  Features=0x781abfd Features2=0x80800001>
real memory = 805306368 (768 MB)
avail memory = 773296128 (737 MB)
ACPI APIC Table:
ioapic0: Changing APIC ID to 1
ioapic0 irqs 0-23 on motherboard
kbd1 at kbdmux0
acpi0: on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
Timecounter "ACPI-safe" frequency 3579545 Hz quality 850
acpi_timer0: <24-bit timer at 3.579545MHz> port 0xb008-0xb00b on acpi0
cpu0: on acpi0
pcib0: port 0xcf8-0xcff on acpi0
pci0: on pcib0
pci_link4: Unable to route IRQs: AE_NOT_FOUND
isab0: at device 1.0 on pci0
isa0: on isab0
atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xc000-0xc00f at device 1.1 on pci0
ata0: on atapci0
ata0: [ITHREAD]
ata1: on atapci0
ata1: [ITHREAD]
uhci0: port 0xc020-0xc03f irq 11 at device 1.2 on pci0
uhci0: [ITHREAD]
usbus0: controller did not stop
usbus0: on uhci0
pci0: at device 1.3 (no driver attached)
vgapci0: mem 0xf0000000-0xf1ffffff,0xf2000000-0xf2000fff at device 2.0 on pci0
em0: port 0xc040-0xc07f mem 0xf2020000-0xf203ffff irq 11 at device 3.0 on pci0
em0: Memory Access and/or Bus Master bits were not set!
em0: [FILTER]
em0: Ethernet address: 52:54:00:3e:67:3b
pci0: at device 4.0 (no driver attached)
pci0: at device 5.0 (no driver attached)
acpi_hpet0: iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 100000000 Hz quality 900
atrtc0: port 0x70-0x71,0x72-0x77 irq 8 on acpi0
atkbdc0: port 0x60,0x64 irq 1 on acpi0
atkbd0: irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
psm0: irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: [ITHREAD]
psm0: model IntelliMouse Explorer, device ID 4
fdc0: port 0x3f2-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: does not respond
device_attach: fdc0 attach returned 6
pmtimer0 on isa0
orm0: at iomem 0xc9000-0xd0fff pnpid ORM0000 on isa0
sc0: at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
ppc0: parallel port not found.
uart0: at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
uart0: [FILTER]
Timecounter "TSC" frequency 2128010988 Hz quality 800
Timecounters tick every 10.000 msec
usbus0: 12Mbps Full Speed USB v1.0
ad0: 5120MB at ata0-master WDMA2
acd0: CDROM at ata1-master WDMA2
ugen0.1: at usbus0
uhub0: on usbus0
Root mount waiting for: usbus0
uhub0: 2 ports with 2 removable, self powered
Trying to mount root from ufs:/dev/ad0s1a

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
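
Appended note: for readers without the userspace-rcu tree at hand, the
poison check described in the report can be illustrated with a minimal
sketch. This is not the code from test_urcu.c. The names test_array and
ARRAY_POISON are borrowed from the assertion message above, but the
poison value, the pool size, the in-use marker and the helpers
entry_alloc(), entry_free() and run_updaters() are all assumptions made
for illustration. The sketch counts violations instead of aborting.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define ARRAY_SIZE   8           /* hypothetical pool size */
#define ARRAY_POISON 0xDEADBEEF  /* hypothetical poison value */
#define IN_USE       8           /* hypothetical "entry is live" marker */
#define MAX_UPDATERS 16          /* hypothetical cap on updater threads */

struct test_entry { unsigned int a; };

static struct test_entry test_array[ARRAY_SIZE];
static unsigned int next_index;
static unsigned long violations;
static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;

/* Caller must hold pool_lock. A slot must be either never used (0) or
 * poisoned by a previous free; anything else means another thread was
 * inside the critical section at the same time. */
static struct test_entry *entry_alloc(void)
{
    struct test_entry *e = &test_array[next_index];

    next_index = (next_index + 1) % ARRAY_SIZE;
    if (e->a != ARRAY_POISON && e->a != 0)
        violations++;
    e->a = IN_USE;
    return e;
}

/* Caller must hold pool_lock. Poison the slot so reuse can be checked. */
static void entry_free(struct test_entry *e)
{
    e->a = ARRAY_POISON;
}

static void *updater(void *arg)
{
    unsigned long iters = *(unsigned long *)arg;
    unsigned long i;

    for (i = 0; i < iters; i++) {
        pthread_mutex_lock(&pool_lock);
        entry_free(entry_alloc());
        pthread_mutex_unlock(&pool_lock);
    }
    return NULL;
}

/* Run nthreads updaters; return the number of poison-check violations,
 * or -1 if a thread could not be created. */
long run_updaters(int nthreads, unsigned long iters)
{
    pthread_t tids[MAX_UPDATERS];
    int i, started = 0;

    assert(nthreads > 0 && nthreads <= MAX_UPDATERS);
    violations = 0;
    for (i = 0; i < nthreads; i++) {
        if (pthread_create(&tids[i], NULL, updater, &iters) != 0)
            break;
        started++;
    }
    for (i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    return started == nthreads ? (long)violations : -1;
}
```

Calling run_updaters(2, 100000) from a small main() roughly mirrors the
"0 readers, 2 updaters" configuration. With a correctly working mutex
the result should be 0; a nonzero count would indicate that two threads
were inside the critical section at once.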