Date: Mon, 29 Dec 2008 06:30:07 GMT
Message-Id: <200812290630.mBT6U7vj078851@freefall.freebsd.org>
To: freebsd-bugs@FreeBSD.org
From: Sean Bruno
Subject: Re: kern/118093: firewire bus reset hogs CPU, causing data to be lost

The following reply was made to PR kern/118093; it has been noted by GNATS.

From: Sean Bruno
To: Dieter
Cc: freebsd-firewire@freebsd.org, bug-followup@freebsd.org
Subject: Re: kern/118093: firewire bus reset hogs CPU, causing data to be lost
Date: Sun, 28 Dec 2008 22:22:26 -0800

Dieter wrote:
>> I believe that the spl() calls are just left there as a hint where
>> locking should be.
>>
>> As far as I understand, we need to pay attention to the mutex locks.
>>
>
> I'll rephrase my question.  In the old days, locking was done with spl.
> The new way is with mutex.  But with the spl calls being replaced with
> noops, and as far as I can tell the driver is not using mutex, there
> doesn't appear to be any locking.  So the driver can step on itself.
>

Well, there is locking around a couple of mutexes via FW_GLOCK().  It
appears that the locking is not robust, and that is one of the issues
that I am looking into right now.

>
>>>> is to real behavior, but /var/log/messages has a tendency to get garbled
>>>> like this:
>>>>
>>>> Dec 22 16:00:18 home-test kernel: fwohci1: Initiate bus reset
>>>> Dec 22 16:00:18 home-test kernel: fwohci1: BUS reset
>>>> Dec 22 16:00:18 home-test kernel: fwohci1: node_id=0xc800ffc0, gen=8,
>>>> CYCLEMASTER mode
>>>> Dec 22 16:00:18 home-test kernel: firewi
>>>> Dec 22 16:00:18 home-test kernel: re1:
>>>> Dec 22 16:00:18 home-test kernel: 1 n
>>>> Dec 22 16:00:18 home-test kernel: odes
>>>> Dec 22 16:00:18 home-test kernel: , ma
>>>> Dec 22 16:00:18 home-test kernel: xhop
>>>> Dec 22 16:00:18 home-test kernel: <=
>>>> Dec 22 16:00:18 home-test kernel: 0, c
>>>> Dec 22 16:00:18 home-test kernel: able
>>>>
>>> Do the lines get folded on the console, or only in /var/log/messages?
>>>
>> As far as I can see, the console messages are fine. It's only the
>> messages that get
>> garbled.
>>
>
> Perhaps an artifact of syslogd?
>

I doubt it.  I'm working on a patch that improves the locking a bit and
does some other "gross" things to try and keep things from flying apart.

I've SEEN this behaviour in my implementations with sbp_targ and
couldn't pin it down.  Scott Long gave me a couple of pointers this
evening, but I'm still working on locking down the taskqueues and some
of the callback_handlers.  There are some bad things going on
specifically during initialization that are pre-empting normal operation.

Sean
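
For anyone following the spl-versus-mutex point in this thread, here is a rough userspace sketch of the kind of locking being discussed. It is only an analogy: pthread mutexes stand in for kernel mutexes, and every name in it (demo_softc, DEMO_LOCK, DEMO_UNLOCK, demo_worker, run_demo) is made up for illustration and is not taken from the firewire driver or from FW_GLOCK(). The idea is simply that every path touching the shared state takes the same lock first, which is what the no-op spl() calls no longer guarantee.

```c
/* Userspace analogy only: pthread mutexes stand in for kernel mutexes. */
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical shared state; not the real firewire softc. */
struct demo_softc {
	pthread_mutex_t lock;
	long counter;
};

/* Hypothetical wrappers, loosely in the spirit of a driver-wide lock macro. */
#define DEMO_LOCK(sc)   pthread_mutex_lock(&(sc)->lock)
#define DEMO_UNLOCK(sc) pthread_mutex_unlock(&(sc)->lock)

struct demo_arg {
	struct demo_softc *sc;
	long iterations;
};

/* Each worker updates the shared counter only while holding the lock. */
static void *
demo_worker(void *p)
{
	struct demo_arg *arg = p;

	for (long i = 0; i < arg->iterations; i++) {
		DEMO_LOCK(arg->sc);
		arg->sc->counter++;
		DEMO_UNLOCK(arg->sc);
	}
	return (NULL);
}

/* Runs nthreads workers; returns the final counter, or -1 on setup failure. */
long
run_demo(int nthreads, long iterations)
{
	struct demo_softc sc = { .counter = 0 };
	struct demo_arg arg = { .sc = &sc, .iterations = iterations };
	pthread_t *threads;
	int started = 0;

	if (nthreads <= 0)
		return (-1);
	threads = calloc((size_t)nthreads, sizeof(*threads));
	if (threads == NULL)
		return (-1);
	pthread_mutex_init(&sc.lock, NULL);

	while (started < nthreads &&
	    pthread_create(&threads[started], NULL, demo_worker, &arg) == 0)
		started++;
	for (int i = 0; i < started; i++)
		pthread_join(threads[i], NULL);

	pthread_mutex_destroy(&sc.lock);
	free(threads);
	return (started == nthreads ? sc.counter : -1);
}
```

With the lock held around each update, run_demo(4, 100000) returns 400000. If the DEMO_LOCK/DEMO_UNLOCK pair is turned into no-ops, the way the spl() calls were, the threads can overwrite each other's updates and the total may come up short.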