From owner-freebsd-stable Wed Apr 4 19:30:18 2001
Delivered-To: freebsd-stable@freebsd.org
Received: from smtp.popsite.net (smtp.popsite.net [216.126.128.17])
    by hub.freebsd.org (Postfix) with ESMTP id 6786137B443
    for ; Wed, 4 Apr 2001 19:30:14 -0700 (PDT)
    (envelope-from bill@twwells.com)
Received: from twwells.com (02-067.051.popsite.net [64.24.21.67])
    by smtp.popsite.net (Postfix) with ESMTP id 1CD7850838
    for ; Wed, 4 Apr 2001 21:29:58 -0500 (CDT)
Received: from bill by twwells.com with local (Exim 3.22 #1)
    id 14kzWr-000M5Y-00; Wed, 04 Apr 2001 22:29:45 -0400
Subject: Re: possible problem with dc driver
To: barney@tp.databus.com (Barney Wolff)
Date: Wed, 4 Apr 2001 22:29:45 -0400 (EDT)
Cc: stable@freebsd.org
In-Reply-To: <20010404193113.A43291@tp.databus.com> from "Barney Wolff" at Apr 04, 2001 07:31:13 PM
X-Mailer: ELM [version 2.5 PL3]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-Id:
From: "T. William Wells"
Sender: owner-freebsd-stable@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Barney Wolff wrote:
> If the machine actually locks up, rather than the transfer
> just running slowly while other things on the machine run
> normally, then it's either h/w or s/w on the machine itself.

It had actually locked up. I've been daring my machine to lock up
again by running some tests while writing this e-mail. :) So far,
it's been behaving normally. That is, not locking up but the
transfers are slow. [I got a crash!]

> The full list of possibilities includes cables, the hub,
> the cards, the motherboards, the disk controllers, the disks.
> And of course, it might actually be a driver bug.

I've eliminated the cables and hub as possibilities, by swapping
things and using tests between various machines.

> To elminate the disks, either run ttcp (from ports) or make
> sure the file is cached and ftp it to /dev/null.

Well, if memory serves, it was transfers to the machine that
caused lockups.
[Memory did not serve -- I made it fail by transferring from the
machine.] My current tests have been with, and without, writing the
file to disk. Still no locks, though I've definitely been getting
losses.

> You might also try moving the card to a different slot, where
> it doesn't share an irq with something else (if dmesg.boot
> shows that it does that).

It doesn't.

> If you need to start swapping components, start with the cheapest
> and keep going until the problem is gone.

There's only one swappable "component" -- the NIC. :) Everything
else is on the motherboard.

> Try to enjoy this :)

Heh. I'm not a driver newbie (I've written a few) so I won't be
lost. However, I'm really not looking forward to debugging a driver
I didn't write. :) OTOH, the dc driver looks better than most....

========

As my editorial notes said, I got it to crash. What I did was, on P
(the machine with the failing NIC):

    while :; do scp /usr/tmp/bigfile G:/usr/tmp/bigfile; done

and waited until the system froze.

Occasionally, I noted that after a transfer, there was no output
from the next scp. The system wasn't frozen; I could switch to
another console and do a ps, to see that scp was waiting in
[connec] (I think) state. It was possible to ^C out of that and
re-enter the command to continue the test.

After a bit of work, I got a coredump. BTW, the handbook is in
error here; DDB panic followed by continue doesn't do the right
thing. What I did was 'call panic(0)' but I wouldn't be surprised
to discover there is a better way.

Anyway, here's the relevant portion of the backtrace. (I've removed
the nonsense relating to the control-alt-ESC that got me into the
debugger.)

#14 0xc019787b in dc_rxeof (sc=0xc095f000) at /usr/src/sys/pci/if_dc.c:2365
#15 0xc0197edf in dc_intr (arg=0xc095f000) at /usr/src/sys/pci/if_dc.c:2640
#16 0xc020aa92 in slow_copyin ()
#17 0xc015d248 in sosend (so=0xc3d03480, addr=0x0, uio=0xc420fed8, top=0x0,
    control=0x0, flags=0, p=0xc3fa73c0) at /usr/src/sys/kern/uipc_socket.c:585
#18 0xc015178c in soo_write (fp=0xc09dbd00, uio=0xc420fed8, cred=0xc0a69180,
    flags=0, p=0xc3fa73c0) at /usr/src/sys/kern/sys_socket.c:81
#19 0xc014e3b1 in dofilewrite (p=0xc3fa73c0, fp=0xc09dbd00, fd=3,
    buf=0x8145004, nbyte=135088, offset=-1, flags=0)
    at /usr/src/sys/sys/file.h:163
#20 0xc014e26a in write (p=0xc3fa73c0, uap=0xc420ff80)
    at /usr/src/sys/kern/sys_generic.c:329
#21 0xc020c27d in syscall2 (frame={tf_fs = 47, tf_es = 47, tf_ds = 47,
      tf_edi = 134684360, tf_esi = -1077939264, tf_ebp = -1077939324,
      tf_isp = -1004470316, tf_ebx = 135088, tf_edx = 134682404,
      tf_ecx = 3, tf_eax = 4, tf_trapno = 0, tf_err = 2,
      tf_eip = 672976284, tf_cs = 31, tf_eflags = 642,
      tf_esp = -1077939368, tf_ss = 47})
    at /usr/src/sys/i386/i386/trap.c:1150
#22 0xc01ffe65 in Xint0x80_syscall ()
#23 0x804ef42 in ?? ()
#24 0x804c7aa in ?? ()
#25 0x804c111 in ?? ()
#26 0x804af05 in ?? ()

Suggestions as to where to go from here?
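
For anyone repeating the scp test described above, here is a minimal
sketch (assuming a POSIX sh) of the same loop with one log line per
attempt, so a stalled or failed transfer shows up in a file instead of
only on the console. The function name transfer_loop, the iteration
cap, and the log path are hypothetical; the original test was simply
an endless while loop.

```sh
#!/bin/sh
# Hypothetical helper, not part of the original test.
# Usage: transfer_loop COUNT LOGFILE COMMAND [ARGS...]
# Runs COMMAND up to COUNT times and appends one line per attempt to
# LOGFILE: wall-clock time, iteration number, and exit status.
transfer_loop() {
    count=$1
    log=$2
    shift 2
    i=1
    while [ "$i" -le "$count" ]; do
        # Capture the exit status without aborting the loop on failure.
        if "$@"; then
            status=0
        else
            status=$?
        fi
        echo "$(date '+%H:%M:%S') iteration=$i status=$status" >> "$log"
        i=$((i + 1))
    done
}
```

With the file and host names from the test above, an invocation might
look like this (the log path is an assumption):

    transfer_loop 1000 /var/tmp/xfer.log scp /usr/tmp/bigfile G:/usr/tmp/bigfile

If the machine freezes hard, the last few log lines may never reach
the disk, so the log is only a rough guide to how far the test got.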