From owner-freebsd-hackers@FreeBSD.ORG  Fri Jan 10 21:42:44 2014
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id CCCF036C;
 Fri, 10 Jan 2014 21:42:44 +0000 (UTC)
Received: from elvis.mu.org (elvis.mu.org [192.203.228.196])
 by mx1.freebsd.org (Postfix) with ESMTP id A83B212E7;
 Fri, 10 Jan 2014 21:42:44 +0000 (UTC)
Received: from Alfreds-MacBook-Pro.local (unknown [216.38.154.18])
 by elvis.mu.org (Postfix) with ESMTPSA id D9F4B1A3C2D;
 Fri, 10 Jan 2014 13:42:43 -0800 (PST)
Message-ID: <52D06954.9000304@freebsd.org>
Date: Fri, 10 Jan 2014 13:42:44 -0800
From: Alfred Perlstein <alfred@freebsd.org>
Organization: FreeBSD
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9;
 rv:24.0) Gecko/20100101 Thunderbird/24.2.0
MIME-Version: 1.0
To: Alan Cox <alc@rice.edu>, 
 Hans Petter Selasky <hans.petter.selasky@bitfrost.no>,
 Neel Natu <neel@FreeBSD.org>, 
 FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject: usb + other drivers stop working on 128GB+ memory machines
References: <50BDB148.1060607@mu.org>
In-Reply-To: <50BDB148.1060607@mu.org>
X-Forwarded-Message-Id: <50BDB148.1060607@mu.org>
Content-Type: multipart/mixed; boundary="------------010602060907090305070104"
X-Content-Filtered-By: Mailman/MimeDel 2.1.17
Cc: Tommy Stiansen <ts@norse-corp.com>
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
 <freebsd-hackers.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-hackers>, 
 <mailto:freebsd-hackers-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-hackers/>
List-Post: <mailto:freebsd-hackers@freebsd.org>
List-Help: <mailto:freebsd-hackers-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-hackers>,
 <mailto:freebsd-hackers-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 10 Jan 2014 21:42:45 -0000

This is a multi-part message in MIME format.
--------------010602060907090305070104
Content-Type: text/plain; charset=ISO-8859-15; format=flowed
Content-Transfer-Encoding: 7bit

Hey Alan, Neel and Hans,

We're testing FreeBSD 10 here and still having problems: once we go over 
128GB of memory, USB stops working.  When we artificially limit 
memory to 128GB or lower, we are OK.

Is there any chance we can revisit this patch so that large memory 
systems don't use up the lower memory space which seems to be needed by 
some drivers?

I'm having a bit of trouble explaining to people that too much memory == 
no keyboard on FreeBSD.

I have the patch that seemed to work for us before.  Any chance this can 
go into FreeBSD soon?


-Alfred


-------- Original Message --------
Subject: 	Re: Questions about FreeBSD amd64 memory layout.
Date: 	Tue, 04 Dec 2012 00:16:08 -0800
From: 	Alfred Perlstein <bright@mu.org>
To: 	Alan Cox <alc@rice.edu>
CC: 	Alan Cox <alc@FreeBSD.org>, Xin LI <delphij@delphij.net>


On 12/3/12 11:23 PM, Alan Cox wrote:
> On 12/03/2012 18:15, Alfred Perlstein wrote:
>> Hello Alan,
>>
>> The other day I ran a copy of FreeBSD 9.1 with my maxusers patches
>> (from current).
>>
>> The machine had 256 gigs of RAM.
>>
>> Due to that much memory, maxusers was upwards of 24860.
>>
>> What then happened was that the mfi driver, and I think also the USB
>> driver, stopped working.
>>
>> The mfi driver stopped working because it got the following error:
>> mfi0: Cannot allocate verbuf_h_dmamap memory
>>
>> This appears to be due to this in the mfi driver:
>>>          /* Start: LSIP200113393 */
>>>          if (bus_dma_tag_create( sc->mfi_parent_dmat,    /* parent */
>>>                                  1, 0,                   /* algnmnt, boundary */
>>>                                  BUS_SPACE_MAXADDR_32BIT,/* lowaddr */
>>>                                  BUS_SPACE_MAXADDR,      /* highaddr */
>>>                                  NULL, NULL,             /* filter, filterarg */
>>>                                  MEGASAS_MAX_NAME*sizeof(bus_addr_t), /* maxsize */
>>>                                  1,                      /* msegments */
>>>                                  MEGASAS_MAX_NAME*sizeof(bus_addr_t), /* maxsegsize */
>>>                                  0,                      /* flags */
>>>                                  NULL, NULL,             /* lockfunc, lockarg */
>>>                                  &sc->verbuf_h_dmat)) {
>>>                  device_printf(sc->mfi_dev, "Cannot allocate verbuf_h_dmat DMA tag\n");
>>>                  return (ENOMEM);
>>>          }
>>>          if (bus_dmamem_alloc(sc->verbuf_h_dmat, (void **)&sc->verbuf,
>>>              BUS_DMA_NOWAIT, &sc->verbuf_h_dmamap)) {
>>>                  device_printf(sc->mfi_dev, "Cannot allocate verbuf_h_dmamap memory\n");
>> What I'm thinking is happening is that by the time we get to the mfi
>> driver, enough of the memory below 4GB is used up by callout wheels,
>> nbufs, various hash tables, etc. that we wind up unable to get memory
>> in this region.
>>
>> This could be (and probably is) a wrong assumption, but it's what makes
>> sense to me right now.
>>
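
To make the constraint in the quoted tag concrete: in bus_dma, lowaddr and
highaddr bound a window of addresses the device cannot reach, namely every
address greater than lowaddr and less than or equal to highaddr.  With
lowaddr = BUS_SPACE_MAXADDR_32BIT and highaddr = BUS_SPACE_MAXADDR, that
window is everything above 4GB.  The following is a minimal userland sketch
of that check, not kernel code.  The names SKETCH_MAXADDR_32BIT,
SKETCH_MAXADDR and sketch_dma_reachable() are hypothetical, and the constant
values are assumed to match amd64.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel constants; values assumed for amd64. */
#define SKETCH_MAXADDR_32BIT 0xFFFFFFFFULL
#define SKETCH_MAXADDR       0xFFFFFFFFFFFFFFFFULL

/*
 * Return 1 if the buffer [paddr, paddr + size) lies entirely outside the
 * excluded window (lowaddr, highaddr], i.e. the device can reach all of it.
 * Return 0 otherwise, including for empty or wrapping ranges.
 */
static int
sketch_dma_reachable(uint64_t paddr, uint64_t size,
    uint64_t lowaddr, uint64_t highaddr)
{
	uint64_t last;

	assert(lowaddr <= highaddr);
	if (size == 0)
		return (0);
	if (paddr > UINT64_MAX - (size - 1))
		return (0);		/* range would wrap around */
	last = paddr + size - 1;
	if (last <= lowaddr)
		return (1);		/* entirely at or below lowaddr */
	if (paddr > highaddr)
		return (1);		/* entirely above highaddr */
	return (0);
}
```

Under these assumptions, a 4096-byte buffer at 0x1000 is reachable with the
mfi tag's bounds, while one at 0x100000000 is not.  If early boot-time
allocations have already consumed the free pages below 4GB, no page that
passes this check is left for the driver.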
>
> I can believe it, or more precisely I know of nothing that immediately
> disproves it.
>
>
>> I'm wondering how the kernel map gets populated, and whether it would
>> be possible, and advisable, to change the allocation strategy so that
>> allocations come from the tail end of physical memory instead of the
>> front.
>>
>
> There is no intentional "allocation strategy" in the sense that you are
> using the phrase here.  Much of the VM system, including the physical
> memory allocator, is initialized early in the boot process, in fact,
> before callout wheels, nbufs, etc. are allocated.  So, the standard
> physical memory allocator is being used for callout wheels, nbufs, etc.,
> and this allocator takes pages from the cache/free page queues in
> whatever arbitrary order they happen to be in.  I can believe that we
> currently initialize the cache/free page queues in an order that results
> in the allocation of pages from low physical addresses first.
>
> The physical memory allocator does, however, have a way of dealing with
> low physical address ranges that you don't want to allocate from except
> explicitly, e.g., contigmalloc()/kmem_alloc_contig(), or as a last
> resort.  This is currently only used for the physical address range for
> ISA DMA.
>
> I've attached a patch that abuses the ISA DMA range, extending it to
> 4GB.  See if this patch enables you to boot.
>
>
It does!  Everything is fixed now.

What now?  Can I help somehow?

~ % sysctl -a| grep maxuser
kern.maxusers: 33049
~ % dmesg| grep mfi
mfi0: <ThunderBolt> port 0x8000-0x80ff mem 0xc7a60000-0xc7a63fff,0xc7a00000-0xc7a3ffff irq 26 at device 0.0 on pci1
mfi0: Using MSI
mfi0: Megaraid SAS driver Ver 4.23
mfi0: MaxCmd = 3f0 MaxSgl = 46 state = b75003f0
mfi0: 1436 (407894536s/0x0020/info) - Shutdown command received from host
mfi0: 1437 (boot + 4s/0x0020/info) - Firmware initialization started (PCI ID 005b/1000/0690/15d9)
mfi0: 1438 (boot + 4s/0x0020/info) - Firmware version 3.190.05-1669
mfi0: 1439 (boot + 5s/0x0020/info) - Package version 23.7.0-0029
mfi0: 1440 (boot + 5s/0x0020/info) - Board Revision
mfi0: 1441 (boot + 25s/0x0002/info) - Inserted: PD 10(e0xfc/s0)
mfi0: 1442 (boot + 25s/0x0002/info) - Inserted: PD 10(e0xfc/s0) Info: enclPd=fc, scsiType=0, portMap=00, sasAddr=4433221103000000,0000000000000000
mfi0: 1443 (boot + 26s/0x0001/info) - Policy change on VD 00/0 to [ID=00,dcp=65,ccp=64,ap=0,dc=0] from [ID=00,dcp=65,ccp=65,ap=0,dc=0]
mfi0: 1444 (407894583s/0x0020/info) - Time established as 12/04/12 0:03:03; (37 seconds since power on)
mfi0: 1445 (407894819s/0x0020/info) - Host driver is loaded and operational
mfid0 on mfi0
mfid0: 2861022MB (5859373056 sectors) RAID volume (no label) is optimal
Trying to mount root from ufs:/dev/mfid0p2 [rw]...
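
For readers following the thread, here is a rough userland sketch of the idea
behind the attached vmparam.h change: pages at or above 4GB go on a free list
that is tried first, so memory below 4GB is only handed out once high memory
is exhausted.  The list indices and the 4GB threshold mirror the patch.  The
16MB ISA DMA boundary, the assumption that lists are tried in index order, and
the names sk_freelist_for() and sk_alloc_page() are illustrative assumptions,
not the kernel's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Indices mirror the patch; everything else here is a hypothetical sketch. */
enum {
	SK_FREELIST_HIGHMEM = 0,
	SK_FREELIST_DEFAULT = 1,
	SK_FREELIST_ISADMA = 2,
	SK_NFREELIST = 3
};

#define SK_HIGHMEM_ADDRESS ((uint64_t)1 << 32)	/* 4GB, as in the patch */
#define SK_ISADMA_BOUNDARY ((uint64_t)16 << 20)	/* 16MB, assumed */

/* Pick the free list that a page at physical address paddr belongs to. */
static int
sk_freelist_for(uint64_t paddr)
{
	if (paddr >= SK_HIGHMEM_ADDRESS)
		return (SK_FREELIST_HIGHMEM);
	if (paddr >= SK_ISADMA_BOUNDARY)
		return (SK_FREELIST_DEFAULT);
	return (SK_FREELIST_ISADMA);
}

/*
 * Take one page from the lowest-indexed non-empty list, where
 * free_pages[i] is the number of free pages on list i.
 * Returns the list index used, or -1 if every list is empty.
 */
static int
sk_alloc_page(uint64_t free_pages[SK_NFREELIST])
{
	assert(free_pages != NULL);
	for (int i = 0; i < SK_NFREELIST; i++) {
		if (free_pages[i] > 0) {
			free_pages[i]--;
			return (i);
		}
	}
	return (-1);
}
```

With this ordering, general-purpose allocations such as callout wheels and
hash tables would draw from above 4GB first, leaving the low range available
to drivers whose DMA tags require it.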


--------------010602060907090305070104
Content-Type: text/plain; charset=UTF-8; x-mac-type="0"; x-mac-creator="0";
 name="hack2.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="hack2.patch"

Index: amd64/include/vmparam.h
===================================================================
--- amd64/include/vmparam.h	(revision 243366)
+++ amd64/include/vmparam.h	(working copy)
@@ -106,10 +106,13 @@
  * accessible by ISA DMA and VM_FREELIST_ISADMA is for physical pages
  * that are below that address.
  */
-#define	VM_NFREELIST		2
-#define	VM_FREELIST_DEFAULT	0
-#define	VM_FREELIST_ISADMA	1
+#define	VM_NFREELIST		3
+#define	VM_FREELIST_HIGHMEM	0
+#define	VM_FREELIST_DEFAULT	1
+#define	VM_FREELIST_ISADMA	2
 
+#define	VM_HIGHMEM_ADDRESS	((vm_paddr_t)1 << 32)
+
 /*
  * An allocation size of 16MB is supported in order to optimize the
  * use of the direct map by UMA.  Specifically, a cache line contains

--------------010602060907090305070104--