From owner-freebsd-current Mon Nov 17 18:08:56 1997
Return-Path:
Received: (from root@localhost) by hub.freebsd.org (8.8.7/8.8.7) id SAA05513 for current-outgoing; Mon, 17 Nov 1997 18:08:56 -0800 (PST) (envelope-from owner-freebsd-current)
Received: from lamb.sas.com (root@lamb.sas.com [192.35.83.8]) by hub.freebsd.org (8.8.7/8.8.7) with ESMTP id SAA05500 for ; Mon, 17 Nov 1997 18:08:38 -0800 (PST) (envelope-from jwd@unx.sas.com)
Received: from mozart (markham.southpeak.com [192.35.83.31]) by lamb.sas.com (8.8.7/8.8.7) with SMTP id VAA03348; Mon, 17 Nov 1997 21:08:21 -0500 (EST)
Received: from iluvatar.unx.sas.com by mozart (5.65c/SAS/Domains/5-6-90) id AA20735; Mon, 17 Nov 1997 21:08:20 -0500
From: "John W. DeBoskey"
Received: by iluvatar.unx.sas.com (5.65c/SAS/Generic 9.01/3-26-93) id AA05446; Mon, 17 Nov 1997 21:08:19 -0500
Message-Id: <199711180208.AA05446@iluvatar.unx.sas.com>
Subject: Re: fxp0 causes machine lockup
To: dg@root.com
Date: Mon, 17 Nov 1997 21:08:19 -0500 (EST)
Cc: freebsd-current@freebsd.org
In-Reply-To: <199711060128.RAA04958@implode.root.com> from "David Greenman" at Nov 5, 97 05:28:11 pm
X-Mailer: ELM [version 2.4 PL23]
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-current@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

Hi,

   Well, it's been a few days and I've gotten a little further with my
fxp0 problem, and also found a few other oddities.

My SNAP date:

FreeBSD 3.0-971102-SNAP (GENERIC) #0: Mon Nov 17 22:44:15 GMT 1997

In /sys/pci/if_fxp.c the following fragment from fxp_init() appears
to be the problem:

        /*
         * Start the config command/DMA.
         */
        fxp_scb_wait(sc);
        CSR_WRITE_4(sc, FXP_CSR_SCB_GENERAL, vtophys(cbp));
        CSR_WRITE_1(sc, FXP_CSR_SCB_COMMAND, FXP_SCB_COMMAND_CU_START);
        /* ...and wait for it to complete. */
        while (!(cbp->cb_status & FXP_CB_STATUS_C));

The fxp_scb_wait, CSR_WRITE_4 and CSR_WRITE_1 calls (appear to) work
correctly.
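
For illustration only, here is a minimal user-space sketch of the same
polling pattern with a bounded wait, so that a completion bit that never
appears produces a timeout instead of a hang. It assumes the status word
is updated by the device rather than by driver code. Every name in it
(demo_cb, demo_device, demo_device_tick, demo_wait_complete,
DEMO_CB_STATUS_C and its value) is a hypothetical stand-in and is not
taken from if_fxp.c or its headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the "command complete" bit (assumed value). */
#define DEMO_CB_STATUS_C 0x8000u

/* Hypothetical control block; volatile because the device writes it. */
struct demo_cb {
    volatile uint16_t cb_status;
};

/*
 * Simulated device. It sets the completion bit after polls_until_done
 * further polls; a negative count models a device that never completes.
 */
struct demo_device {
    struct demo_cb *cb;
    int polls_until_done;
};

/* One step of the simulated device (real hardware would act on its own). */
void demo_device_tick(struct demo_device *dev)
{
    if (dev->polls_until_done < 0)
        return;                         /* stuck device: never completes */
    if (dev->polls_until_done == 0)
        dev->cb->cb_status |= DEMO_CB_STATUS_C;
    else
        dev->polls_until_done--;
}

/*
 * Poll for the completion bit at most max_polls times.
 * Returns the zero-based index of the poll that saw the bit,
 * or -1 if the bit was never seen.
 */
int demo_wait_complete(struct demo_device *dev, int max_polls)
{
    assert(dev != NULL && dev->cb != NULL);
    for (int i = 0; i < max_polls; i++) {
        demo_device_tick(dev);
        if (dev->cb->cb_status & DEMO_CB_STATUS_C)
            return i;
    }
    return -1;                          /* timed out instead of spinning */
}
```

With polls_until_done set to a negative value, demo_wait_complete()
returns -1 after max_polls iterations, which at least gives a point at
which a message could be printed. A real driver would presumably want a
short delay between polls as well; that is left out of the sketch.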
What I cannot find is the location in the code where the FXP_CB_STATUS_C
bit is set in an (interrupt?) routine. Nor can I break into DDB at this
point.

grep FXP_CB_STATUS_C *.c
if_fxp.c:                   (txp->cb_status & FXP_CB_STATUS_C) != 0;
if_fxp.c:       while (!(cbp->cb_status & FXP_CB_STATUS_C));
if_fxp.c:       while (!(cb_ias->cb_status & FXP_CB_STATUS_C));
if_fxp.c:               txp[i].cb_status = FXP_CB_STATUS_C | FXP_CB_STATUS_OK;

The oddities:

   I rebuilt my kernel (using GENERIC) and added options DDB. I then
built and installed the new kernel. When rebooting, I specified the -d
option to bring up the kernel debugger. I then specified:

b fxp_init
c

and the system panicked in the bounce buffer code, saying it could not
malloc enough memory.

   OK, well, I don't need bounce buffers on my machine, so I removed
options BOUNCE_BUFFERS from GENERIC and once again rebuilt & installed.
Again I rebooted, specified -d, and issued the break & continue commands.
This time, a series of "Could not malloc" messages went by, but none
stopped the system. Finally, it got to the point where it wanted to mount
the root filesystem. It said it could not mount the root filesystem and
hung.

   Any comments, helpful hints, critiques, etc., are welcome.

Thanks,
John

My complete dmesg output:

Copyright (c) 1992-1997 FreeBSD Inc.
Copyright (c) 1982, 1986, 1989, 1991, 1993
        The Regents of the University of California. All rights reserved.
FreeBSD 3.0-971102-SNAP #0: Mon Nov 17 22:44:15 GMT 1997
    root@mrose.pc.sas.com:/usr/src/sys/compile/GENERIC
CPU: Pentium Pro (199.43-MHz 686-class CPU)
  Origin = "GenuineIntel" Id = 0x617 Stepping=7
  Features=0xfbff
real memory = 67108864 (65536K bytes)
avail memory = 62734336 (61264K bytes)
Probing for devices on PCI bus 0:
Correcting Natoma config for non-SMP
chip0: rev 0x02 on pci0.0.0
chip1: rev 0x00 on pci0.13.0
ide_pci0: rev 0x00 on pci0.13.1
chip2: rev 0x00 on pci0.14.0
vga0: rev 0x00 int a irq 9 on pci0.16.0
vx0: <3COM 3C905 Fast Etherlink XL PCI> rev 0x00 int a irq 15 on pci0.17.0
mii[*mii*] address 00:a0:24:bb:88:3e
Probing for devices on PCI bus 1:
fxp0: rev 0x04 int a irq 14 on pci1.9.0
fxp0: Ethernet address 00:a0:c9:8b:09:a5
ahc0: rev 0x00 int a irq 11 on pci1.10.0
ahc0: aic7880 Wide Channel, SCSI Id=7, 16 SCBs
ahc0: waiting for scsi devices to settle
scbus0 at ahc0 bus 0
sd0 at scbus0 target 2 lun 0
sd0: type 0 fixed SCSI 2
sd0: Direct-Access 4095MB (8388315 512 byte sectors)
Probing for devices on the ISA bus:
sc0 at 0x60-0x6f irq 1 on motherboard
sc0: VGA color <16 virtual consoles, flags=0x0>
ed0 not found at 0x280
fe0 not found at 0x300
sio0 at 0x3f8-0x3ff irq 4 flags 0x10 on isa
sio0: type 16550A
sio1 at 0x2f8-0x2ff irq 3 on isa
sio1: type 16550A
lpt0 at 0x378-0x37f irq 7 on isa
lpt0: Interrupt-driven port
lp0: TCP/IP capable interface
lpt1 not found
mse0 not found at 0x23c
psm0 at 0x60-0x64 irq 12 on motherboard
psm0: device ID 0
fdc0 at 0x3f0-0x3f7 irq 6 drq 2 on isa
fdc0: FIFO enabled, 8 bytes threshold
fd0: 1.44MB 3.5in
wdc0 not found at 0x1f0
wdc1 not found at 0x170
bt0 not found at 0x330
uha0 not found at 0x330
aha0 not found at 0x330
aic0 not found at 0x340
nca0 not found at 0x1f88
nca1 not found at 0x350
sea0 not found
wt0 not found at 0x300
mcd0 not found at 0x300
matcdc0 not found at 0x230
scd0 not found at 0x230
ie0: unknown board_id: f000
ie0 not found at 0x300
ep0 not found at 0x300
ex0 not found
le0 not found at 0x300
lnc0 not found at 0x280
ze0 not found at 0x300
zp0 not found at 0x300
npx0 on motherboard
npx0: INT 16 interface
changing root device to sd0a

>
> >   In looking through the archives I found this message which appears
> >to be similar, though with different hardware:
> >
> >>From: "Mike Durian"
> >>Date: Wed, 01 Oct 1997 12:45:27 -0600
> >>Subject: strange interaction with Pentium and fxp
> ...
>
>    That turned out to be caused by some local kernel changes that they had
> made - they had a SCSI card's EEPROM responding to physical addresses that
> were in the area of system RAM. This caused the DMA to hang; it wasn't bug
> in FreeBSD or the hardware and it went away when they fixed their code.
>    I don't have any idea why your machine is hanging. Very odd and your's
> is the only report I've gotten of a problem like that. The first thing to
> do would be to figure out if it is a DMA or interrupt problem by adding
> printf's all over the place inside the driver, and then see where it dies.
>
> -DG
>
> David Greenman
> Core-team/Principal Architect, The FreeBSD Project
>

--
jwd@unx.sas.com       (w) John W. De Boskey      (919) 677-8000 x6915