Date: Mon, 7 Dec 2009 17:30:11 -0500
From: Alexander Sack <pisymbol@gmail.com>
To: freebsd-current@freebsd.org
Subject: Re: aac(4) resource FIB starvation on BUS scan revisited
Message-ID: <3c0b01820912071430o545e0ae4u45cb3b658f48c306@mail.gmail.com>
In-Reply-To: <3c0b01820912071342u1c722b2clf9c8413e40097279@mail.gmail.com>
References: <3c0b01820912071342u1c722b2clf9c8413e40097279@mail.gmail.com>

On Mon, Dec 7, 2009 at 4:42 PM, Alexander Sack <pisymbol@gmail.com> wrote:
>
> Folks:
>
> I posted a similar thread on freebsd-scsi only to realize that scottl had
> fixed my first issue during some MP CAM cleanup with respect to a race
> during resource allocation issues on a later version of the driver we are
> using (I believe we did the same thing to resolve a lock issue on bootup).
>
> However on my RELENG_8 box with (2) Adaptec 5085s connected to some JBODs
> (9TB each) I still have a FIB starvation issue during the LUN scan:
>
> The number of FIBs allocated to this card is 512 (older cards are 256).
> The max_target per bus is 287. On a six channel controller with a BUS
> scan done in parallel I see a lot of this:
>
> ...
> (probe501:aacp1:0:214:0): Request Requeued
> (probe501:aacp1:0:214:0): Retrying Command
> (probe520:aacp1:0:233:0): Request Requeued
> (probe520:aacp1:0:233:0): Retrying Command
> (probe528:aacp1:0:241:0): Request Requeued
> (probe528:aacp1:0:241:0): Retrying Command
> (probe540:aacp1:0:253:0): Request Requeued
> (probe540:aacp1:0:253:0): Retrying Command
> (probe541:aacp1:0:254:0): Request Requeued
> (probe541:aacp1:0:254:0): Retrying Command
> ....
>
> I think the driver is much happier with the following attached patch (with
> dmesg).
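
To make the arithmetic in the quoted report concrete, here is a rough sketch (not taken from the driver) that compares the worst-case number of simultaneous probes against the FIB pool. The helper names are hypothetical; the constants are the figures quoted above, and the sequential case assumes, as a simplification, that PIM_SEQSCAN leaves about one probe in flight per bus.

```c
#include <assert.h>

/* Figures quoted in the report above. */
#define MAX_FIBS         512 /* FIBs allocated to this card */
#define PREALLOCATE_FIBS 128 /* AAC_PREALLOCATE_FIBS before the patch */
#define TARGETS_PER_BUS  287 /* max_target per bus */
#define NUM_BUSES        6   /* "six channel controller" */

/*
 * Hypothetical helper: worst-case probes outstanding at once.
 * Assumption: a parallel scan may probe every target on every bus at
 * the same time, while a sequential scan keeps one probe per bus.
 */
static int
worst_case_probes(int buses, int targets_per_bus, int sequential)
{
	assert(buses >= 0 && targets_per_bus >= 0);
	return (sequential ? buses : buses * targets_per_bus);
}

/* Hypothetical helper: nonzero when demand exceeds the FIB pool. */
static int
fib_pool_exhausted(int probes, int fibs)
{
	return (probes > fibs);
}
```

With these figures a parallel scan can ask for 6 * 287 = 1722 probes at once, well beyond the 512 FIBs, which would be consistent with the "Request Requeued" messages. Under the sequential assumption the demand drops to 6, which fits even the 128 FIBs that were preallocated before the patch.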

Patch again but this time not base-64 encoded:

Index: aac.c
===================================================================
RCS file: /home/ncvs/src/sys/dev/aac/aac.c,v
retrieving revision 1.143.2.4
diff -u -r1.143.2.4 aac.c
--- aac.c	5 Nov 2009 18:34:01 -0000	1.143.2.4
+++ aac.c	7 Dec 2009 21:23:43 -0000
@@ -604,7 +604,7 @@
 	TAILQ_INIT(&sc->aac_fibmap_tqh);
 	sc->aac_commands = malloc(sc->aac_max_fibs * sizeof(struct aac_command),
 				  M_AACBUF, M_WAITOK|M_ZERO);
-	while (sc->total_fibs < AAC_PREALLOCATE_FIBS) {
+	while (sc->total_fibs < sc->aac_max_fibs) {
 		if (aac_alloc_commands(sc) != 0)
 			break;
 	}
Index: aac_cam.c
===================================================================
RCS file: /home/ncvs/src/sys/dev/aac/aac_cam.c,v
retrieving revision 1.31.2.2
diff -u -r1.31.2.2 aac_cam.c
--- aac_cam.c	5 Nov 2009 18:34:01 -0000	1.31.2.2
+++ aac_cam.c	7 Dec 2009 21:23:43 -0000
@@ -261,7 +261,7 @@
 		cpi->target_sprt = 0;
 
 		/* Resetting via the passthrough causes problems. */
-		cpi->hba_misc = PIM_NOBUSRESET;
+		cpi->hba_misc = PIM_NOBUSRESET | PIM_SEQSCAN;
 		cpi->hba_eng_cnt = 0;
 		cpi->max_target = camsc->inf->TargetsPerBus;
 		cpi->max_lun = 8;	/* Per the controller spec */
Index: aacvar.h
===================================================================
RCS file: /home/ncvs/src/sys/dev/aac/aacvar.h,v
retrieving revision 1.52.2.2
diff -u -r1.52.2.2 aacvar.h
--- aacvar.h	2 Nov 2009 16:54:23 -0000	1.52.2.2
+++ aacvar.h	7 Dec 2009 21:23:44 -0000
@@ -57,13 +57,6 @@
 #define AAC_ADAPTER_FIBS	8
 
 /*
- * FIBs are allocated in page-size chunks and can grow up to the 512
- * limit imposed by the hardware.
- */
-#define AAC_PREALLOCATE_FIBS	128
-#define AAC_NUM_MGT_FIB		8
-
-/*
  * The controller reports status events in AIFs. We hang on to a number of
  * these in order to pass them out to user-space management tools.
  */

And dmesg:

Copyright (c) 1992-2009 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
	The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.0-STABLE #2: Sun Dec 6 21:19:10 EST 2009
    root@watchmen.localdomain:/usr/home/asack/Development/freebsd/RELENG_8/src/sys/amd64/compile/GENERIC-DDB amd64
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(R) CPU E5410 @ 2.33GHz (2327.52-MHz K8-class CPU)
  Origin = "GenuineIntel" Id = 0x1067a Stepping = 10
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x40ce3bd<SSE3,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,DCA,SSE4.1,XSAVE>
  AMD Features=0x20100800<SYSCALL,NX,LM>
  AMD Features2=0x1<LAHF>
  TSC: P-state invariant
real memory = 17179869184 (16384 MB)
avail memory = 16526032896 (15760 MB)
ACPI APIC Table: <INTEL S5000PAL>
FreeBSD/SMP: Multiprocessor System Detected: 8 CPUs
FreeBSD/SMP: 1 package(s) x 8 core(s)
 cpu0 (BSP): APIC ID: 0
 cpu1 (AP): APIC ID: 1
 cpu2 (AP): APIC ID: 2
 cpu3 (AP): APIC ID: 3
 cpu4 (AP): APIC ID: 4
 cpu5 (AP): APIC ID: 5
 cpu6 (AP): APIC ID: 6
 cpu7 (AP): APIC ID: 7
ioapic0 <Version 2.0> irqs 0-23 on motherboard
ioapic1 <Version 2.0> irqs 24-47 on motherboard
lapic0: Forcing LINT1 to edge trigger
kbd1 at kbdmux0
acpi0: <INTEL S5000PAL> on motherboard
acpi0: [ITHREAD]
ACPI Error: Package List length (6) larger than NumElements count (2), truncated 20090521 dsobject-590
ACPI Error: Package List length (6) larger than NumElements count (2), truncated 20090521 dsobject-590
ACPI Error: Package List length (6) larger than NumElements count (2), truncated 20090521 dsobject-590
ACPI Error: Package List length (6) larger than NumElements count (2), truncated 20090521 dsobject-590
ACPI Error: Package List length (6) larger than NumElements count (2), truncated 20090521 dsobject-590
ACPI Error: Package List length (6) larger than NumElements count (2), truncated 20090521 dsobject-590
ACPI Error: Package List length (6) larger than NumElements count (2), truncated 20090521 dsobject-590
ACPI Error: Package List length (6) larger than NumElements count (2), truncated 20090521 dsobject-590
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
acpi_hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 900
acpi_button0: <Power Button> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xca2,0xca3,0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib1: <ACPI PCI-PCI bridge> at device 2.0 on pci0
pci1: <ACPI PCI bus> on pcib1
pcib2: <ACPI PCI-PCI bridge> irq 16 at device 0.0 on pci1
pci2: <ACPI PCI bus> on pcib2
pcib3: <ACPI PCI-PCI bridge> irq 16 at device 0.0 on pci2
pci3: <ACPI PCI bus> on pcib3
pcib4: <ACPI PCI-PCI bridge> at device 0.0 on pci3
pci4: <ACPI PCI bus> on pcib4
mfi0: <LSI MegaSAS 1064R> mem 0xb9000000-0xb900ffff,0xb8900000-0xb891ffff irq 18 at device 14.0 on pci4
mfi0: Megaraid SAS driver Ver 3.00
mfi0: 1804 (313511129s/0x0020/info) - Shutdown command received from host
mfi0: 1805 (boot + 0s/0x0020/info) - Firmware initialization started (PCI ID 0411/1000/3501/8086)
mfi0: 1806 (boot + 0s/0x0020/info) - Firmware version 1.12.230-0598
mfi0: 1807 (boot + 0s/0x0020/info) - Firmware initialization started (PCI ID 0411/1000/3501/8086)
mfi0: 1808 (boot + 0s/0x0020/info) - Firmware version 1.12.230-0598
mfi0: 1809 (boot + 71s/0x0008/info) - Battery temperature is normal
mfi0: 1810 (boot + 71s/0x0008/info) - Battery Present
mfi0: 1811 (boot + 71s/0x0020/info) - Board Revision
mfi0: 1812 (boot + 100s/0x0004/info) - Enclosure (SES) discovered on PD 0c(c None/p1)
mfi0: 1813 (boot + 100s/0x0002/info) - Inserted: Encl PD 0c
mfi0: 1814 (boot + 100s/0x0002/info) - Inserted: PD 0c(c None/p1) Info: enclPd=0c, scsiType=d, portMap=09, sasAddr=500150796b8c0000,0000000000000000
mfi0: 1815 (boot + 100s/0x0002/info) - Inserted: PD 0a(e0x0c/s0)
mfi0: 1816 (boot + 100s/0x0002/info) - Inserted: PD 0a(e0x0c/s0) Info: enclPd=0c, scsiType=0, portMap=00, sasAddr=71903a26a4948e89,0000000000000000
mfi0: 1817 (boot + 100s/0x0002/info) - Inserted: PD 0b(e0x0c/s1)
mfi0: 1818 (boot + 100s/0x0002/info) - Inserted: PD 0b(e0x0c/s1) Info: enclPd=0c, scsiType=0, portMap=01, sasAddr=71903a27a68d958a,0000000000000000
mfi0: [ITHREAD]
pcib5: <PCI-PCI bridge> at device 0.2 on pci3
pci5: <PCI bus> on pcib5
pcib6: <ACPI PCI-PCI bridge> irq 16 at device 1.0 on pci2
pci6: <ACPI PCI bus> on pcib6
pcib7: <ACPI PCI-PCI bridge> irq 16 at device 2.0 on pci2
pci7: <ACPI PCI bus> on pcib7
em0: <Intel(R) PRO/1000 Network Connection 6.9.14> port 0x2020-0x203f mem 0xb8820000-0xb883ffff,0xb8400000-0xb87fffff irq 18 at device 0.0 on pci7
em0: Using MSI interrupt
em0: [FILTER]
em0: Ethernet address: 00:15:17:96:b8:c0
em1: <Intel(R) PRO/1000 Network Connection 6.9.14> port 0x2000-0x201f mem 0xb8800000-0xb881ffff,0xb8000000-0xb83fffff irq 19 at device 0.1 on pci7
em1: Using MSI interrupt
em1: [FILTER]
em1: Ethernet address: 00:15:17:96:b8:c1
pcib8: <ACPI PCI-PCI bridge> at device 0.3 on pci1
pci8: <ACPI PCI bus> on pcib8
pcib9: <PCI-PCI bridge> at device 3.0 on pci0
pci9: <PCI bus> on pcib9
pcib10: <ACPI PCI-PCI bridge> at device 4.0 on pci0
pci10: <ACPI PCI bus> on pcib10
aac0: <Adaptec RAID 5085> mem 0xb8e00000-0xb8ffffff irq 16 at device 0.0 on pci10
aac0: Enabling 64-bit address support
aac0: Enable Raw I/O
aac0: Enable 64-bit array
aac0: New comm. interface enabled
aac0: [ITHREAD]
aac0: Adaptec 5085, aac driver 2.0.0-1
aacp0: <SCSI Passthrough Bus> on aac0
aacp1: <SCSI Passthrough Bus> on aac0
aacp2: <SCSI Passthrough Bus> on aac0
pcib11: <ACPI PCI-PCI bridge> at device 5.0 on pci0
pci11: <ACPI PCI bus> on pcib11
aac1: <Adaptec RAID 5085> mem 0xb8c00000-0xb8dfffff irq 18 at device 0.0 on pci11
aac1: Enabling 64-bit address support
aac1: Enable Raw I/O
aac1: Enable 64-bit array
aac1: New comm. interface enabled
aac1: [ITHREAD]
aac1: Adaptec 5085, aac driver 2.0.0-1
aacp3: <SCSI Passthrough Bus> on aac1
aacp4: <SCSI Passthrough Bus> on aac1
aacp5: <SCSI Passthrough Bus> on aac1
pcib12: <ACPI PCI-PCI bridge> at device 6.0 on pci0
pci12: <ACPI PCI bus> on pcib12
pci12: <network> at device 0.0 (no driver attached)
pcib13: <ACPI PCI-PCI bridge> at device 7.0 on pci0
pci13: <ACPI PCI bus> on pcib13
pci13: <network> at device 0.0 (no driver attached)
pci0: <base peripheral> at device 8.0 (no driver attached)
pcib14: <ACPI PCI-PCI bridge> at device 30.0 on pci0
pci14: <ACPI PCI bus> on pcib14
vgapci0: <VGA-compatible display> port 0x1000-0x10ff mem 0xb0000000-0xb7ffffff,0xb9100000-0xb910ffff irq 17 at device 12.0 on pci14
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel 63XXESB2 UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x3040-0x304f irq 20 at device 31.1 on pci0
ata0: <ATA channel 0> on atapci0
ata0: [ITHREAD]
atapci1: <Intel 63XXESB2 SATA300 controller> port 0x3058-0x305f,0x3074-0x3077,0x3050-0x3057,0x3070-0x3073,0x3020-0x303f mem 0xb9400000-0xb94003ff irq 20 at device 31.2 on pci0
atapci1: [ITHREAD]
atapci1: AHCI called from vendor specific driver
atapci1: AHCI v1.10 controller with 6 3Gbps ports, PM supported
ata2: <ATA channel 0> on atapci1
ata2: [ITHREAD]
ata3: <ATA channel 1> on atapci1
ata3: [ITHREAD]
ata4: <ATA channel 2> on atapci1
ata4: [ITHREAD]
ata5: <ATA channel 3> on atapci1
ata5: [ITHREAD]
ata6: <ATA channel 4> on atapci1
ata6: [ITHREAD]
ata7: <ATA channel 5> on atapci1
ata7: [ITHREAD]
pci0: <serial bus, SMBus> at device 31.3 (no driver attached)
atrtc0: <AT realtime clock> port 0x70-0x71,0x74-0x77 irq 8 on acpi0
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
uart0: [FILTER]
uart1: <16550 or compatible> port 0x2f8-0x2ff irq 3 on acpi0
uart1: [FILTER]
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: [ITHREAD]
psm0: model IntelliMouse, device ID 3
cpu0: <ACPI CPU> on acpi0
est0: <Enhanced SpeedStep Frequency Control> on cpu0
p4tcc0: <CPU Frequency Thermal Control> on cpu0
cpu1: <ACPI CPU> on acpi0
est1: <Enhanced SpeedStep Frequency Control> on cpu1
p4tcc1: <CPU Frequency Thermal Control> on cpu1
cpu2: <ACPI CPU> on acpi0
est2: <Enhanced SpeedStep Frequency Control> on cpu2
p4tcc2: <CPU Frequency Thermal Control> on cpu2
cpu3: <ACPI CPU> on acpi0
est3: <Enhanced SpeedStep Frequency Control> on cpu3
p4tcc3: <CPU Frequency Thermal Control> on cpu3
cpu4: <ACPI CPU> on acpi0
est4: <Enhanced SpeedStep Frequency Control> on cpu4
p4tcc4: <CPU Frequency Thermal Control> on cpu4
cpu5: <ACPI CPU> on acpi0
est5: <Enhanced SpeedStep Frequency Control> on cpu5
p4tcc5: <CPU Frequency Thermal Control> on cpu5
cpu6: <ACPI CPU> on acpi0
est6: <Enhanced SpeedStep Frequency Control> on cpu6
p4tcc6: <CPU Frequency Thermal Control> on cpu6
cpu7: <ACPI CPU> on acpi0
est7: <Enhanced SpeedStep Frequency Control> on cpu7
p4tcc7: <CPU Frequency Thermal Control> on cpu7
orm0: <ISA Option ROMs> at iomem 0xc0000-0xc8fff,0xc9000-0xcf7ff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
ppc0: cannot reserve I/O port range
Timecounters tick every 1.000 msec
acd0: CDROM <CD-224E-R/1.CA> at ata0-slave UDMA33
mfi0: 1819 (313511305s/0x0020/info) - Time established as 12/07/09 14:28:25; (102 seconds since power on)
mfid0: <MFI Logical Disk> on mfi0
mfid0: 238418MB (488280064 sectors) RAID volume '' is optimal
aacd0: <RAID 5> on aac1
aacd0: 9533430MB (19524464640 sectors)
aacd1: <RAID 5> on aac1
aacd1: 9533430MB (19524464640 sectors)
ses0 at aacp5 bus 0 scbus5 target 0 lun 0
ses0: <Newisys SA2120 T033> Fixed Enclosure Services SCSI-5 device
ses0: 3.300MB/s transfers
ses0: SCSI-3 SES Device
ses1 at aacp5 bus 0 scbus5 target 1 lun 0
ses1: <Newisys SA2120 T033> Fixed Enclosure Services SCSI-5 device
ses1: 3.300MB/s transfers
ses1: SCSI-3 SES Device
lapic3: Forcing LINT1 to edge trigger
SMP: AP CPU #3 Launched!
lapic1: Forcing LINT1 to edge trigger
SMP: AP CPU #1 Launched!
lapic2: Forcing LINT1 to edge trigger
SMP: AP CPU #2 Launched!
lapic4: Forcing LINT1 to edge trigger
SMP: AP CPU #4 Launched!
lapic7: Forcing LINT1 to edge trigger
SMP: AP CPU #7 Launched!
lapic5: Forcing LINT1 to edge trigger
SMP: AP CPU #5 Launched!
lapic6: Forcing LINT1 to edge trigger
SMP: AP CPU #6 Launched!
Trying to mount root from ufs:/dev/mfid0s1a
em0: link state changed to UP

etc.

Thanks!

-aps