Date:      Sun, 12 Jan 2014 16:00:08 -0600
From:      Alan Cox <alc@rice.edu>
To:        Hans Petter Selasky <hans.petter.selasky@bitfrost.no>, Alfred Perlstein <alfred@freebsd.org>, Neel Natu <neel@FreeBSD.org>, FreeBSD Hackers <freebsd-hackers@freebsd.org>
Cc:        Tommy Stiansen <ts@norse-corp.com>
Subject:   Re: usb + other drivers stop working on 128GB+ memory machines
Message-ID:  <52D31068.6080007@rice.edu>
In-Reply-To: <zarafa.52d06f98.4be3.6f35d1467ec79bee@mail.lockless.no>
References:  <50BDB148.1060607@mu.org> <zarafa.52d06f98.4be3.6f35d1467ec79bee@mail.lockless.no>

On 01/10/2014 16:09, Hans Petter Selasky wrote:
> RE: usb + other drivers stop working on 128GB+ memory machines
> Hi,
>
> The newer XHCI chipset does support full 64-bit ranges, if the HW guys did their job. In the past we've seen 32-bit hardware limited to 2GB of RAM because some OSes don't support more :-)
>
> I think in general that keeping DMA buffers below 4GB is a good idea. Wasn't the allocator changed some years back to allocate from the top of memory instead of the bottom?
>

That was the very old O(n^2) contigmalloc() from the early part of last
decade.  We haven't used that allocator since ~2007.

Alan

>  
> -----Original message-----
> > From: Alfred Perlstein <alfred@freebsd.org>
> > Sent: Friday 10th January 2014 22:43
> > To: Alan Cox <alc@rice.edu>; Hans Petter Selasky <hans.petter.selasky@bitfrost.no>; Neel Natu <neel@FreeBSD.org>; FreeBSD Hackers <freebsd-hackers@freebsd.org>
> > Cc: Tommy Stiansen <ts@norse-corp.com>
> > Subject: usb + other drivers stop working on 128GB+ memory machines
> > 
> > Hey Alan, Neel and Hans,
> > 
> > We're testing FreeBSD 10 here and still having problems: once we go over
> > 128GB of memory, USB stops working.  When we artificially limit
> > memory to 128GB or lower, we are OK.
> > 
> > Is there any chance we can revisit this patch so that large memory 
> > systems don't use up the lower memory space which seems to be needed by 
> > some drivers?
> > 
> > I'm having a bit of trouble explaining to people that too much memory == 
> > no keyboard on FreeBSD.
> > 
> > I have the patch that seemed to work for us before.  Any chance this can 
> > go into FreeBSD soon?
> > 
> > 
> > 
> > -Alfred
> > 
> > 
> > -------- Original Message --------
> > Subject: 	Re: Questions about FreeBSD amd64 memory layout.
> > Date: 	Tue, 04 Dec 2012 00:16:08 -0800
> > From: 	Alfred Perlstein <bright@mu.org>
> > To: 	Alan Cox <alc@rice.edu>
> > CC: 	Alan Cox <alc@FreeBSD.org>, Xin LI <delphij@delphij.net>
> > 
> > 
> > 
> > On 12/3/12 11:23 PM, Alan Cox wrote:
> > > On 12/03/2012 18:15, Alfred Perlstein wrote:
> > >> Hello Alan,
> > >>
> > >> The other day I ran a copy of FreeBSD 9.1 with my maxusers patches
> > >> (from current).
> > >>
> > >> The machine had 256 gigs of RAM.
> > >>
> > >> Due to that much memory, maxusers was upwards of 24860.
> > >>
> > >> What then happened was that the mfi driver, and I think also the USB
> > >> driver, would not work.
> > >>
> > >> The mfi driver stopped working because it got the following error:
> > >> mfi0: Cannot allocate verbuf_h_dmamap memory
> > >>
> > >> This appears to be due to this in the mfi driver:
> > >>>          /* Start: LSIP200113393 */
> > >>>          if (bus_dma_tag_create( sc->mfi_parent_dmat,    /** parent **/
> > >>>                                  1, 0,                   /* algnmnt, boundary */
> > >>>                                  BUS_SPACE_MAXADDR_32BIT,/** lowaddr **/
> > >>>                                  BUS_SPACE_MAXADDR,      /** highaddr **/
> > >>>                                  NULL, NULL,             /* filter, filterarg */
> > >>>                                  MEGASAS_MAX_NAME*sizeof(bus_addr_t), /** maxsize **/
> > >>>                                  1,                      /** msegments **/
> > >>>                                  MEGASAS_MAX_NAME*sizeof(bus_addr_t), /** maxsegsize **/
> > >>>                                  0,                      /** flags **/
> > >>>                                  NULL, NULL,             /* lockfunc, lockarg */
> > >>>                                  &sc->verbuf_h_dmat)) {
> > >>>                  device_printf(sc->mfi_dev, "Cannot allocate verbuf_h_dmat DMA tag\n");
> > >>>                  return (ENOMEM);
> > >>>          }
> > >>>          if (bus_dmamem_alloc(sc->verbuf_h_dmat, (void **)&sc->verbuf,
> > >>>              BUS_DMA_NOWAIT, &sc->verbuf_h_dmamap)) {
> > >>>                  device_printf(sc->mfi_dev, "Cannot allocate verbuf_h_dmamap memory\n");
> > >> What I'm thinking is happening is that by the time we get to the mfi
> > >> driver, enough of the memory below 4GB is used up by callout wheels,
> > >> nbufs, various hash tables, etc. that we wind up unable to get memory
> > >> in this region.
> > >>
> > >> This could be (and probably is) a wrong assumption, but it's what makes
> > >> sense to me right now.
> > >>
> > >
> > > I can believe it, or more precisely I know of nothing that immediately
> > > disproves it.
> > >
> > >
> > >> I'm wondering how the kernel map gets populated, and whether it would
> > >> be possible, and advisable, to change the allocation strategy to come
> > >> from the tail end of physical memory instead of the front.
> > >>
> > >
> > > There is no intentional "allocation strategy" in the sense that you are
> > > using the phrase here.  Much of the VM system, including the physical
> > > memory allocator, is initialized early in the boot process, in fact,
> > > before callout wheels, nbufs, etc. are allocated.  So, the standard
> > > physical memory allocator is being used for callout wheels, nbufs, etc.,
> > > and this allocator takes pages from the cache/free page queues in
> > > whatever arbitrary order they happen to be in.  I can believe that we
> > > currently initialize the cache/free page queues in an order that results
> > > in the allocation of pages from low physical addresses first.
> > >
> > > The physical memory allocator does, however, have a way of dealing with
> > > low physical address ranges that you don't want to allocate from except
> > > explicitly, e.g., contigmalloc()/kmem_alloc_contig(), or as a last
> > > resort.  This is currently only used for the physical address range for
> > > ISA DMA.
> > >
> > > I've attached a patch that abuses the ISA DMA range, extending it to
> > > 4GB.  See if this patch enables you to boot.
> > >
> > >
> > It does!  Everything is fixed now.
> > 
> > What now?  Can I help somehow?
> > 
> > ~ % sysctl -a| grep maxuser
> > kern.maxusers: 33049
> > ~ % dmesg| grep mfi
> > mfi0: <ThunderBolt> port 0x8000-0x80ff mem
> > 0xc7a60000-0xc7a63fff,0xc7a00000-0xc7a3ffff irq 26 at device 0.0 on pci1
> > mfi0: Using MSI
> > mfi0: Megaraid SAS driver Ver 4.23
> > mfi0: MaxCmd = 3f0 MaxSgl = 46 state = b75003f0
> > mfi0: 1436 (407894536s/0x0020/info) - Shutdown command received from host
> > mfi0: 1437 (boot + 4s/0x0020/info) - Firmware initialization started
> > (PCI ID 005b/1000/0690/15d9)
> > mfi0: 1438 (boot + 4s/0x0020/info) - Firmware version 3.190.05-1669
> > mfi0: 1439 (boot + 5s/0x0020/info) - Package version 23.7.0-0029
> > mfi0: 1440 (boot + 5s/0x0020/info) - Board Revision
> > mfi0: 1441 (boot + 25s/0x0002/info) - Inserted: PD 10(e0xfc/s0)
> > mfi0: 1442 (boot + 25s/0x0002/info) - Inserted: PD 10(e0xfc/s0) Info:
> > enclPd=fc, scsiType=0, portMap=00, sasAddr=4433221103000000,0000000000000000
> > mfi0: 1443 (boot + 26s/0x0001/info) - Policy change on VD 00/0 to
> > [ID=00,dcp=65,ccp=64,ap=0,dc=0] from [ID=00,dcp=65,ccp=65,ap=0,dc=0]
> > mfi0: 1444 (407894583s/0x0020/info) - Time established as 12/04/12
> > 0:03:03; (37 seconds since power on)
> > mfi0: 1445 (407894819s/0x0020/info) - Host driver is loaded and operational
> > mfid0 on mfi0
> > mfid0: 2861022MB (5859373056 sectors) RAID volume (no label) is optimal
> > Trying to mount root from ufs:/dev/mfid0p2 [rw]...
> > 
> > 
> > 
> > _______________________________________________
> > freebsd-hackers@freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> > To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"
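
The reserve-range idea from the quoted 2012 exchange above (keep low physical addresses out of general allocations except as a last resort, so that requests constrained to below 4GB still succeed) can be sketched as a toy model. This is only an illustration under stated assumptions, not FreeBSD code and not the patch mentioned in the thread: every name here (toy_phys, toy_alloc_any, toy_alloc_below_4g) is hypothetical, memory is modeled as 256 chunks of 1 GB each, and the "lowest free chunk first" order is an assumption that mirrors the low-addresses-first behavior suspected in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only (hypothetical names, not kernel code):
 * physical memory is an array of 1 GB chunks. */
#define TOY_CHUNKS     256  /* a 256 GB machine, as in the original report */
#define TOY_LOW_CHUNKS 4    /* chunks that lie below the 4 GB boundary */

struct toy_phys {
    bool   used[TOY_CHUNKS];
    size_t reserve_below;   /* general allocations avoid chunks [0, reserve_below) */
};

void toy_init(struct toy_phys *p, size_t reserve_below)
{
    assert(reserve_below <= TOY_CHUNKS);
    for (size_t i = 0; i < TOY_CHUNKS; i++)
        p->used[i] = false;
    p->reserve_below = reserve_below;
}

/* Take the lowest free chunk in [lo, hi); return its index, or -1 if none. */
static long toy_take(struct toy_phys *p, size_t lo, size_t hi)
{
    for (size_t i = lo; i < hi; i++) {
        if (!p->used[i]) {
            p->used[i] = true;
            return (long)i;
        }
    }
    return -1;
}

/* General-purpose allocation (stands in for callout wheels, nbufs, hash
 * tables): prefer chunks above the reserve and dip into the reserve only
 * as a last resort. */
long toy_alloc_any(struct toy_phys *p)
{
    long c = toy_take(p, p->reserve_below, TOY_CHUNKS);
    if (c < 0)
        c = toy_take(p, 0, p->reserve_below);
    return c;
}

/* Constrained allocation (stands in for a DMA tag whose lowaddr is the
 * 32-bit maximum): the chunk must lie below 4 GB. */
long toy_alloc_below_4g(struct toy_phys *p)
{
    return toy_take(p, 0, TOY_LOW_CHUNKS);
}
```

In this model, calling toy_init() with a reserve of 0 and then making eight general allocations consumes chunks 0 through 7, after which toy_alloc_below_4g() fails; with a reserve of TOY_LOW_CHUNKS the same eight allocations land at chunk 4 and above, and the constrained request still succeeds.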



