Date: Tue, 8 Dec 2009 11:22:00 -0500
From: Jung-uk Kim <jkim@FreeBSD.org>
To: Scott Long <scottl@samsco.org>
Cc: Alexander Sack <pisymbol@gmail.com>, scottl@freebsd.org, freebsd-current@freebsd.org, emaste@freebsd.org
Subject: Re: aac(4) resource FIB starvation on BUS scan revisited
Message-ID: <200912081122.02870.jkim@FreeBSD.org>
In-Reply-To: <0FFC216C-E938-48E4-B0E4-351077C6088A@samsco.org>
References: <3c0b01820912071342u1c722b2clf9c8413e40097279@mail.gmail.com> <3c0b01820912072000l7ad1a67ek3514dfccb96417be@mail.gmail.com> <0FFC216C-E938-48E4-B0E4-351077C6088A@samsco.org>

On Monday 07 December 2009 11:04 pm, Scott Long wrote:
> On Dec 7, 2009, at 9:00 PM, Alexander Sack wrote:
> > On Mon, Dec 7, 2009 at 8:14 PM, Scott Long <scottl@samsco.org> wrote:
> >> On Dec 7, 2009, at 6:05 PM, Jung-uk Kim wrote:
> >>> On Monday 07 December 2009 07:47 pm, Scott Long wrote:
> >>>> On Dec 7, 2009, at 5:31 PM, Jung-uk Kim wrote:
> >>>>> On Monday 07 December 2009 05:30 pm, Alexander Sack wrote:
> >>>>>> On Mon, Dec 7, 2009 at 4:42 PM, Alexander Sack
> >>>>>> <pisymbol@gmail.com> wrote:
> >>>>>>> Folks:
> >>>>>>>
> >>>>>>> I posted a similar thread on freebsd-scsi only to realize
> >>>>>>> that scottl had fixed my first issue during some MP CAM
> >>>>>>> cleanup with respect to a race during resource allocation
> >>>>>>> issues on a later version of the driver we are using (I
> >>>>>>> believe we did the same thing to resolve a lock issue on
> >>>>>>> bootup).
> >>>>>>>
> >>>>>>> However on my RELENG_8 box with (2) Adaptec 5085s connected
> >>>>>>> to some JBODs (9TB each) I still have a FIB starvation
> >>>>>>> issue during the LUN scan:
> >>>>>>>
> >>>>>>> The number of FIBs allocated to this card is 512 (older
> >>>>>>> cards are 256). The max_target per bus is 287. On a six
> >>>>>>> channel controller with a BUS scan done in parallel I see a
> >>>>>>> lot of this:
> >>>>>>>
> >>>>>>> ...
> >>>>>>> (probe501:aacp1:0:214:0): Request Requeued
> >>>>>>> (probe501:aacp1:0:214:0): Retrying Command
> >>>>>>> (probe520:aacp1:0:233:0): Request Requeued
> >>>>>>> (probe520:aacp1:0:233:0): Retrying Command
> >>>>>>> (probe528:aacp1:0:241:0): Request Requeued
> >>>>>>> (probe528:aacp1:0:241:0): Retrying Command
> >>>>>>> (probe540:aacp1:0:253:0): Request Requeued
> >>>>>>> (probe540:aacp1:0:253:0): Retrying Command
> >>>>>>> (probe541:aacp1:0:254:0): Request Requeued
> >>>>>>> (probe541:aacp1:0:254:0): Retrying Command
> >>>>>>> ....
> >>>>>>>
> >>>>>>> I think the driver is much happier with the following
> >>>>>>> attached patch (with dmesg).
> >>>>>>
> >>>>>> Patch again but this time not base-64 encoded:
> >>>>>
> >>>>> [SNIP!]
> >>>>>
> >>>>> I want it to be a little conservative here, i.e.,
> >>>>> pre-allocating half of max_fibs. Will the attached patch
> >>>>> work for you?
> >>>>
> >>>> The FIB allocation scheme was written when it was common for
> >>>> machines to only have 64MB of RAM and proportionally less KVA,
> >>>> so 256KB or 512KB was a lot of RAM to wire down. Those days
> >>>> have probably passed.
> >>>
> >>> So, what would you do if you were hypothetically rewriting it
> >>> today? :-)
> >>
> >> Most hardware has mechanisms for probing its command queue
> >> depth. What I typically do these days is allocate a minimum
> >> number of commands so that this probing can be done, then do a
> >> single slab allocation based on the results. AAC doesn't have
> >> this capability, but the 256/512 size is pretty well understood.
> >> The page-by-page allocation of aac works, but adds extra
> >> bookkeeping and complication to the driver.
> >
> > Right Scott, that is what JK and I discussed this evening. I
> > figured the 128 macro was just historical cruft and your email
> > confirms it. So are we ALL okay with the original patch as it
> > stands for now? JK, I am fine with the divide-by-2 change but I
> > think raising it to 256 is really the way to go at this point!
> > :D
>
> If you're going to increase it, why not simply increase it to the
> max amount that is appropriate for each card?

My intention was to keep the impact as small as possible, i.e.,

    old card: max fibs == 256, max fibs / 2 == 128, no change
    new card: max fibs == 512, max fibs / 2 == 256, doubled

Old cards are most likely to be used on old systems with very little
RAM (if they are still in production). Hence, no change is necessary.

Anyway, I just committed OP's patch (with a minor comment tweak).
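
To make the arithmetic above concrete, here is a minimal C sketch comparing the two pre-allocation policies discussed in this thread: a fixed count of 128 versus half of the card's maximum. The identifiers (`OLD_CARD_MAX_FIBS`, `NEW_CARD_MAX_FIBS`, `FIXED_PREALLOC`, `prealloc_fixed`, `prealloc_half`, `report`) are hypothetical and are not taken from the aac(4) source; only the figures 128, 256 and 512 come from the messages above.

```c
#include <stdio.h>

/* Figures quoted in this thread; the identifiers are illustrative only. */
#define OLD_CARD_MAX_FIBS 256u
#define NEW_CARD_MAX_FIBS 512u
#define FIXED_PREALLOC    128u	/* the historical fixed count */

/* Policy 1: pre-allocate the same fixed number of FIBs for every card. */
static unsigned
prealloc_fixed(unsigned max_fibs)
{
	return (max_fibs < FIXED_PREALLOC ? max_fibs : FIXED_PREALLOC);
}

/* Policy 2: pre-allocate half of what the card supports. */
static unsigned
prealloc_half(unsigned max_fibs)
{
	return (max_fibs / 2);
}

/* Print both policies side by side for one card. */
static void
report(const char *card, unsigned max_fibs)
{
	printf("%s: max_fibs=%u fixed=%u half=%u\n", card, max_fibs,
	    prealloc_fixed(max_fibs), prealloc_half(max_fibs));
}
```

Calling `report("old card", OLD_CARD_MAX_FIBS)` and `report("new card", NEW_CARD_MAX_FIBS)` from a `main` function shows the point of the halving policy: the old card stays at 128 under both policies, while the new card goes from 128 to 256.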

> One other thing I forgot to mention was contiguous memory. The
> page-by-page allocation in aac has another benefit, and that's to
> not tax contigmalloc with finding 256KB of contiguous memory.
> That's not a big deal at boot, but is a problem if you load the
> driver after the system has been running for a while. It's
> immensely useful during development, but it's never been clear to
> me how useful it is in real life.

Thanks for your review and comments!

Jung-uk Kim