Date: Sat, 26 Jan 2013 00:34:09 +0100
From: Marius Strobl <marius@alchemy.franken.de>
To: Paul Keusemann <pkeusem@visi.com>
Cc: freebsd-net@freebsd.org
Subject: Re: Cas driver fails to load first time after boot.
Message-ID: <20130125233409.GZ85306@alchemy.franken.de>
In-Reply-To: <5102D9AB.9010405@visi.com>
References: <50FEFAB8.1070006@visi.com> <20130124150904.GA27559@alchemy.franken.de> <51017FF0.5080001@visi.com> <20130124215017.GS85306@alchemy.franken.de> <5101F264.7030206@visi.com> <20130125161916.GV85306@alchemy.franken.de> <5102D9AB.9010405@visi.com>

On Fri, Jan 25, 2013 at 01:14:51PM -0600, Paul Keusemann wrote:
>
> On 01/25/13 10:19, Marius Strobl wrote:
> > On Thu, Jan 24, 2013 at 08:48:04PM -0600, Paul Keusemann wrote:
> >> On 01/24/13 15:50, Marius Strobl wrote:
> >>> On Thu, Jan 24, 2013 at 12:39:44PM -0600, Paul Keusemann wrote:
> >>>> On 01/24/13 09:09, Marius Strobl wrote:
> >>>>> On Tue, Jan 22, 2013 at 02:46:48PM -0600, Paul Keusemann wrote:
> >>>>>> Hi,
> >>>>>>
> >>>>>> I've got a Dell R200 which I'm trying to build into a gateway with a Sun
> >>>>>> QGE (501-6738-10). The cas driver fails to load the first time I try to
> >>>>>> load it but succeeds the second time. Is this a problem with the card,
> >>>>>> the driver, my karma?
> >>>>> Wrong phase of the moon, apparently :)
> >>>>> The MII setup of these chips is a bit tricky and I'm not sure whether
> >>>>> I've hit all code paths during development of the driver. I certainly
> >>>>> didn't test with a 501-6738; these have been reported as working before,
> >>>>> though. It also doesn't make much sense that attaching the devices
> >>>>> succeeds on the second attempt. Could you please use an if_cas.ko built
> >>>>> with the attached patch and report the debug output for one of the
> >>>>> interfaces in both the working and the non-working case?
> >>>> I would love to give you output from the working and non-working case
> >>>> but apparently the phase of the moon has changed, I can't get it to fail
> >>>> now. The messages output from the working case is attached.
> >>>>
> >>> Thanks, but unfortunately this doesn't make any sense either. In general,
> >>> printf()s cause delays which can be relevant. In the locations I've put
> >>> them they can hardly make such a difference, though.
> >>> If you haven't already done so, could you please power off the machine
> >>> before doing the test with the patched module? Is the problem still gone
> >>> if you revert to the original module?
> >> OK, power-cycling makes a difference. The driver fails to attach all of
> >> the devices after power-cycling most of the time if not all of the
> >> time. The number of devices attached varies; the attached message file
> >> fragment is from my last test. Three of the devices were attached on
> >> the first load attempt and all four of them on the second attempt.
> > Okay, so we now at least have a way to reproduce the problem.
> > Unfortunately, it's still unclear what the exact cause of it is. At
> > least the problem is not what I suspected and hoped it most likely was.
> > Could you please test how things behave after a power-cycle with the
> > attached patch (after reverting the previous one)?
>
> The patched driver fails to compile with the following error message:
>
<...>
>
> I found the following definition of nitems in the iwn and usb/wlan drivers:
>
> #define nitems(_a) (sizeof((_a)) / sizeof((_a)[0]))
>
> so I added it to if_cas.c and rebuilt without errors.
>

Sorry, I didn't think of 8.3 not having nitems(), yet. Actually,
this part of the patch is orthogonal to your problem and just a
change I had in that tree.

> This looks like it fixed the problem. I ran three tests from
> power-up to loading the driver and the driver loaded successfully all
> three times. I then added if_cas_load="YES" to /boot/loader.conf and
> did two more successful reboots from power-up.

Great! Thanks a lot for testing!

>
> Will this driver work on FreeBSD 9.1?
>

Yes, the patch should also solve the problem in 9.1. I suspect the
hang you are seeing there isn't specific to cas(4) but rather a
general regression that came in with the VIMAGE changes. Now, if a
network device driver fails to attach during boot and tries to clean
up by detaching and freeing the interface part at that stage again,
this causes problems. I already talked to bz@ about this and, from
what I remember of his reply, this is an ordering issue that is at
least very hard to fix.

Marius
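
For anyone else building the patch on a release that lacks nitems(), the
workaround quoted above can be sketched as a guarded definition, so that
it does not clash with a system header that already provides the macro.
This is only an illustration: the table and the helper function below are
hypothetical and are not taken from if_cas.c or from the patch.

```c
#include <stddef.h>

/*
 * Define nitems() only if the system headers have not already done so
 * (per the thread, FreeBSD 8.3 does not provide it).
 */
#ifndef nitems
#define nitems(_a) (sizeof((_a)) / sizeof((_a)[0]))
#endif

/* Hypothetical table, for illustration only; not from if_cas.c. */
static const int example_table[] = { 10, 20, 30, 40 };

/* Sum the table, using nitems() as the loop bound. */
static int
example_table_sum(void)
{
	int sum = 0;
	size_t i;

	for (i = 0; i < nitems(example_table); i++)
		sum += example_table[i];
	return (sum);
}
```

Note that nitems() only gives the element count for a true array; applied
to a pointer it silently computes the wrong value.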