Date: Wed, 18 Oct 2006 13:49:46 +0400 From: Oleg Bulyzhin <oleg@FreeBSD.org> To: John-Mark Gurney <gurney_j@resnet.uoregon.edu> Cc: cvs-src@FreeBSD.org, src-committers@FreeBSD.org, cvs-all@FreeBSD.org Subject: Re: cvs commit: src/sys/dev/pci pci.c pci_if.m pci_private.h pcivar.h src/sys/dev/sk if_sk.c if_skreg.h Message-ID: <20061018094946.GA37107@lath.rinet.ru> In-Reply-To: <20061017175752.GI23971@funkthat.com> References: <200610091615.k99GFuPD054744@repoman.freebsd.org> <20061016081442.GA344@lath.rinet.ru> <20061016170517.GF23971@funkthat.com> <20061017105330.GC20789@lath.rinet.ru> <20061017175752.GI23971@funkthat.com>
next in thread | previous in thread | raw e-mail | index | archive | help
On Tue, Oct 17, 2006 at 10:57:52AM -0700, John-Mark Gurney wrote: > Oleg Bulyzhin wrote this message on Tue, Oct 17, 2006 at 14:53 +0400: > > On Mon, Oct 16, 2006 at 10:05:17AM -0700, John-Mark Gurney wrote: > > > Oleg Bulyzhin wrote this message on Mon, Oct 16, 2006 at 12:14 +0400: > > > > On Mon, Oct 09, 2006 at 04:15:56PM +0000, John-Mark Gurney wrote: > > > > > jmg 2006-10-09 16:15:56 UTC > > > > > > > > > > FreeBSD src repository > > > > > > > > > > Modified files: > > > > > sys/dev/pci pci.c pci_if.m pci_private.h pcivar.h > > > > > sys/dev/sk if_sk.c if_skreg.h > > > > > Log: > > > > > provide routines to access VPD data at the PCI layer... > > > > > > > > > > remove sk's own implementation, and use the new calls to get the data... > > > > > > > > > > Reviewed by: -arch > > > > > > > > > > Revision Changes Path > > > > > 1.315 +339 -3 src/sys/dev/pci/pci.c > > > > > 1.9 +13 -0 src/sys/dev/pci/pci_if.m > > > > > 1.18 +4 -0 src/sys/dev/pci/pci_private.h > > > > > 1.71 +34 -0 src/sys/dev/pci/pcivar.h > > > > > 1.131 +7 -148 src/sys/dev/sk/if_sk.c > > > > > 1.39 +0 -31 src/sys/dev/sk/if_skreg.h > > > > > > > > I have problem with my test machine since this commit: > > > > kernel is panicing on boot if i have my pci bge(4) NIC plugged in. > > > > > > > > Last kernel messages are: > > > > pci1: physical bus=1 > > > > pci1:2:0: bad VPD cksum, remain 244 > > > > > > > > Invoking ddb after panic gives this backtrace: > > > > [skipped] > > > > pci_read_vpd > > > > pci_read_extcap > > > > pci_read_device > > > > pci_add_children > > > > acpi_pci_attach > > > > device_attach > > > > [skipped] > > > > > > > > (i'm unable to get crashdump) > > > > > > > > If i unplug bge, kernel boots just fine. > > > > > > > > P.S. i can provide any additional info needed and can test patches. > > > > > > Can you get a line number from pci_read_vpd? Even if you can't get a > > > crash dump, you can use addr2line (or kgdb) w/ the ip of the panic... > > > That would help.. > > > > > > Looks like some manufacturers aren't following the PCI standard.. :( > > > > > > -- > > > John-Mark Gurney Voice: +1 415 225 5579 > > > > > > "All that I will do, has been done, All that I have, has not." > > > > Okay, there is some details: > > > > When kernel boots with bge plugged in: > > 1) message about bad VPD cksum > > 2) then kernel seems to hang for about 30 seconds (i'm not able to break into > > DDB) > > 3) then i get panic. > > > > Panic message is about trap inside 'swapper' process (backtrace shows 2 traps > > first inside pci_read_vpd, next one in swapper). > > > > 1st trap is at pci_read_vpd+0x2c6 it is /usr/src/sys/dev/pci/pci.c:665 > > I've added debug printf there: > > case 3: /* VPD-R Keyword Value */ > > printf("off: %d i: %d\n", off, i); > > cfg->vpd.vpd_ros[off].value[i++] = byte; > > > > It seems that 30 seconds delay is memory rewriting (off is constant, while i > > is incrementing). I get panic message after: > > off:5 i:15407 > > I just realized that if the card puts a length of zero there, we will end > up wrapping dflen.. This should not happened, definately not for RV > which is required to be at least 1 byte in length... This will allow > non-RV's to be zero bytes in length, but if an RV has zero bytes in > length, we'll treat it as a checksum failure and remove all read-only > VPD data, and not continue on processing read-write data... > > Here is an updated patch that should handle your card properly... > > -- > John-Mark Gurney Voice: +1 415 225 5579 > > "All that I will do, has been done, All that I have, has not." > Index: pci.c > =================================================================== > RCS file: /home/ncvs/src/sys/dev/pci/pci.c,v > retrieving revision 1.315 > diff -u -r1.315 pci.c > --- pci.c 9 Oct 2006 16:15:55 -0000 1.315 > +++ pci.c 17 Oct 2006 17:56:15 -0000 > @@ -653,12 +653,36 @@ > cfg->vpd.vpd_ros[off].keyword[0] = byte; > cfg->vpd.vpd_ros[off].keyword[1] = vpd_nextbyte(&vrs); > dflen = vpd_nextbyte(&vrs); > - cfg->vpd.vpd_ros[off].value = malloc((dflen + 1) * > - sizeof *cfg->vpd.vpd_ros[off].value, > - M_DEVBUF, M_WAITOK); > + if (dflen == 0 && > + strncmp(vpd.vpd_ros[off], "RV", 2) == 0) { > + /* > + * if this happens, we can't trust the rest > + * of the VPD. > + */ > + printf("pci%d:%d:%d: bad keyword length: %d\n", > + cfg->bus, cfg->slot, cfg->func, dflen); > + cksumvalid = 0; > + end = 1; > + break; > + } else if (dflen == 0) { > + cfg->vpd.vpd_ros[off].value = malloc(1 * > + sizeof *cfg->vpd.vpd_ros[off].value, > + M_DEVBUF, M_WAITOK) > + cfg->vpd.vpd_ros[off].value[0] = '\x00'; > + } else > + cfg->vpd.vpd_ros[off].value = malloc( > + (dflen + 1) * > + sizeof *cfg->vpd.vpd_ros[off].value, > + M_DEVBUF, M_WAITOK); > remain -= 3; > i = 0; > - state = 3; > + /* keep in sync w/ state 3's transistions */ > + if (dflen == 0 && remain == 0) > + state = 0; > + else if (dflen == 0) > + state = 2; > + else > + state = 3; > break; > > case 3: /* VPD-R Keyword Value */ > @@ -673,10 +697,13 @@ > cfg->bus, cfg->slot, cfg->func, > vrs.cksum); > cksumvalid = 0; > + end = 1; > + break; > } > } > dflen--; > remain--; > + /* keep in sync w/ state 2's transistions */ > if (dflen == 0) > cfg->vpd.vpd_ros[off++].value[i++] = '\0'; > if (dflen == 0 && remain == 0) { > @@ -736,6 +763,15 @@ > break; > } > } > + > + if (cksumvalid == 0) { > + /* read-only data bad, clean up */ > + for (; off; off--) > + free(cfg->vpd.vpd_ros[off].value, M_DEVBUF); > + > + free(cfg->vpd.vpd_ros, M_DEVBUF); > + cfg->vpd.vpd_ros = NULL; > + } > #undef REG > } > It works, thanks! (i had to correct syntax errors and change vpd.vpd_ros[off] --> cfg->vpd_ros[off].keyword inside strncmp() call). [skipped] pci1: <ACPI PCI bus> on pcib1 pci1: physical bus=1 pci1:2:0: bad VPD cksum, remain 244 found-> vendor=0x173b, dev=0x03ea, revid=0x15 bus=1, slot=2, func=0 class=02-00-00, hdrtype=0x00, mfdev=0 cmdreg=0x0116, statreg=0x02b0, cachelnsz=16 (dwords) lattimer=0x20 (960 ns), mingnt=0x40 (16000 ns), maxlat=0x00 (0 ns) intpin=a, irq=9 powerspec 2 supports D0 D3 current D0 VPD Ident: NETGEAR GA302T Gigabit Adapter MSI supports 8 messages, 64 bit map[10]: type 1, range 64, base ff8f0000, size 16, enabled [skipped] -- Oleg.
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20061018094946.GA37107>