From owner-freebsd-threads@FreeBSD.ORG Fri Feb 17 01:56:48 2012
Return-Path:
Delivered-To: threads@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 2835D106567C;
	Fri, 17 Feb 2012 01:56:48 +0000 (UTC)
	(envelope-from listlog2011@gmail.com)
Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id EDC238FC12;
	Fri, 17 Feb 2012 01:56:47 +0000 (UTC)
Received: from [127.0.0.1] (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q1H1ujN2056748;
	Fri, 17 Feb 2012 01:56:46 GMT
	(envelope-from listlog2011@gmail.com)
Message-ID: <4F3DB3DB.2060603@gmail.com>
Date: Fri, 17 Feb 2012 09:56:43 +0800
From: David Xu
User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:10.0.1) Gecko/20120208 Thunderbird/10.0.1
MIME-Version: 1.0
To: Julian Elischer
References: <4F3C2671.3090808__7697.00510795719$1329343207$gmane$org@freebsd.org>
	<4F3D3E2D.9090100@FreeBSD.org> <4F3D6FDD.9050808@freebsd.org>
	<4F3D89CD.9050309@freebsd.org> <4F3DA27A.3090903@freebsd.org>
In-Reply-To: <4F3DA27A.3090903@freebsd.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Alexander Kabaev, threads@freebsd.org, FreeBSD Stable, David Xu, Andriy Gapon
Subject: Re: pthread_cond_timedwait() broken in 9-stable?
	(from JAN 10)
X-BeenThere: freebsd-threads@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: davidxu@freebsd.org
List-Id: Threading on FreeBSD
X-List-Received-Date: Fri, 17 Feb 2012 01:56:48 -0000

On 2012/2/17 8:42, Julian Elischer wrote:
> Adding David Xu for his thoughts since he rewrote the code in question
> in revision 213098
>
> On 2/16/12 2:57 PM, Julian Elischer wrote:
>> On 2/16/12 1:06 PM, Julian Elischer wrote:
>>> On 2/16/12 9:34 AM, Andriy Gapon wrote:
>>>> on 15/02/2012 23:41 Julian Elischer said the following:
>>>>> The program fio (an IO test in ports) uses pthreads.
>>>>>
>>>>> The following code (from fio-2.0.3, but it's in earlier code too)
>>>>> has suddenly started misbehaving.
>>>>>
>>>>>     clock_gettime(CLOCK_REALTIME, &t);
>>>>>     t.tv_sec += seconds + 10;
>>>>>
>>>>>     pthread_mutex_lock(&mutex->lock);
>>>>>
>>>>>     while (!mutex->value && !ret) {
>>>>>         mutex->waiters++;
>>>>>         ret = pthread_cond_timedwait(&mutex->cond, &mutex->lock, &t);
>>>>>         mutex->waiters--;
>>>>>     }
>>>>>
>>>>>     if (!ret) {
>>>>>         mutex->value--;
>>>>>         pthread_mutex_unlock(&mutex->lock);
>>>>>     }
>>>>>
>>>>>
>>>>> It turns out that 'ret' sometimes comes back instantly (on my
>>>>> machine) with a value of 60 (ETIMEDOUT)
>>>>> despite the fact that we set the timeout 10 seconds into the future.
>>>>>
>>>>> Has anyone else seen anything like this?
>>>>> (And yes, the condition variable attributes have been set to use the
>>>>> REALTIME clock.)
>>>> But why?
>>>>
>>>> Just a hypothesis that maybe there is some issue with time keeping
>>>> on that system.
>>>> How would that code work out for you with MONOTONIC?
>>>
>>> Jens Axboe (CC'd) tried both CLOCK_REALTIME and CLOCK_MONOTONIC,
>>> and they both had the same problem,
>>> i.e. random early returns with ETIMEDOUT.
>>>
>>> I think we will try to move our machine forward to a newer -stable to
>>> see if it resolves.
>> Kan upgraded the machine today to today's 9.x branch tip and the
>> problem still occurs.
>> 8.x does not have this problem.
>>
>> I have not got a 9-RELEASE machine to test on, so I cannot tell if
>> this came in with the burst of stuff
>> that came in after the 9.x branch was unfrozen after the release of 9.0.
>>
>>
>
I am trying to reproduce the problem. Do you have complete sample code to
test?

Regards,
David Xu