From: Alan Cox
Date: Mon, 27 Oct 2014 01:59:24 -0500
To: Svatopluk Kraus
Cc: alc@freebsd.org, FreeBSD Arch
Subject: Re: vm_page_array and VM_PHYSSEG_SPARSE
Message-ID: <544DED4C.3010501@rice.edu>
References: <5428AF3B.1030906@rice.edu> <54497DC1.5070506@rice.edu>
List-Id: Discussion related to FreeBSD architecture

On 10/24/2014 06:33, Svatopluk Kraus wrote:

> On Fri, Oct 24, 2014 at 12:14 AM, Alan Cox wrote:
> On 10/08/2014 10:38, Svatopluk Kraus wrote:
>> On Mon, Sep 29, 2014 at 3:00 AM, Alan Cox wrote:
>>> On 09/27/2014 03:51, Svatopluk Kraus wrote:
>>> On Fri, Sep 26, 2014 at 8:08 PM, Alan Cox wrote:
>>>> On Wed, Sep 24, 2014 at 7:27 AM, Svatopluk Kraus wrote:

>>>>> Hi,

>>>>> Michal and I are finishing the new ARM pmap-v6 code. There is one problem we have dealt with somehow, but now we would like to do it better. It is about physical pages that are allocated before the vm subsystem is initialized. While such pages can later be found in vm_page_array when the VM_PHYSSEG_DENSE memory model is used, that is not true for the VM_PHYSSEG_SPARSE memory model. And the ARM world uses the VM_PHYSSEG_SPARSE model.

>>>>> It really would be nice to utilize vm_page_array for such preallocated physical pages even when the VM_PHYSSEG_SPARSE memory model is used. Things could be much easier then. In our case, it is about the pages that are used for level 2 page tables. In the VM_PHYSSEG_SPARSE model, we have two sets of such pages: the first are preallocated and the second are allocated after the vm subsystem has been initialized. We must deal with each set differently, so the code is more complex and so is the debugging.

>>>>> Thus we need some way to say that a part of physical memory should be included in vm_page_array, but that the pages from that region should not be put on the free list during initialization. We think that such a possibility could be useful in general. There could be a need for physical space which:

>>>>> (1) is needed only during boot and can later be freed and given to the vm subsystem,

>>>>> (2) is needed for something else, and the vm_page_array code could be reused for it without some kind of duplication.

>>>>> There is already some code which deals with blacklisted pages in the vm_page.c file. So the easiest way to deal with the presented situation is to add a callback to that part of the code which would be able to exclude either a whole phys_avail[i], phys_avail[i+1] region or single pages. As the biggest phys_avail region is used for vm subsystem allocations, some more coding would be needed. (However, blacklisted pages are not dealt with on that part of the region.)

>>>>> We would like to know if there is any objection:

>>>>> (1) to dealing with the presented problem,
>>>>> (2) to dealing with it in the presented way.

>>>>> Some help would be very much appreciated. Thanks.
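For readers who have not looked at the two models: under VM_PHYSSEG_DENSE the page array spans every frame between the lowest and highest managed physical address, so any address in that span maps to a vm_page by simple arithmetic, while under VM_PHYSSEG_SPARSE only the ranges registered with vm_phys get vm_page structures. A simplified sketch of the lookup difference follows; it is not the actual FreeBSD source, though the globals and the vm_phys_seg fields are meant to follow vm/vm_page.c and vm/vm_phys.h.

#include <vm/vm.h>
#include <vm/vm_param.h>
#include <vm/vm_page.h>
#include <vm/vm_phys.h>

/*
 * VM_PHYSSEG_DENSE: one vm_page exists for every frame between the
 * lowest and highest managed address, so the lookup is arithmetic.
 */
static vm_page_t
dense_phys_to_page(vm_paddr_t pa)
{

	return (&vm_page_array[atop(pa) - first_page]);
}

/*
 * VM_PHYSSEG_SPARSE: only addresses that fall inside a vm_phys segment
 * have a vm_page.  Pages carved out of phys_avail[] before
 * vm_page_startup(), such as the preallocated L2 page tables described
 * above, belong to no segment, so the lookup comes back NULL.
 */
static vm_page_t
sparse_phys_to_page(vm_paddr_t pa)
{
	struct vm_phys_seg *seg;
	int i;

	for (i = 0; i < vm_phys_nsegs; i++) {
		seg = &vm_phys_segs[i];
		if (pa >= seg->start && pa < seg->end)
			return (&seg->first_page[atop(pa - seg->start)]);
	}
	return (NULL);
}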
>>>> As an experiment, try modifying vm_phys.c to use dump_avail instead of phys_avail when sizing vm_page_array. On amd64, where the same problem exists, this allowed me to use VM_PHYSSEG_SPARSE. Right now, this is probably my preferred solution. The catch is that not all architectures implement dump_avail, but my recollection is that arm does.

>>> Frankly, I would prefer this too, but there is one big open question: What is dump_avail for?

>>> dump_avail[] is solving a similar problem in the minidump code, hence the prefix "dump_" in its name. In other words, the minidump code couldn't use phys_avail[] either, because it didn't describe the full range of physical addresses that might be included in a minidump, so dump_avail[] was created.

>>> There is already precedent for what I'm suggesting. dump_avail[] is already (ab)used outside of the minidump code on x86 to solve this same problem in x86/x86/nexus.c, and on arm in arm/arm/mem.c.

>>> Using it for vm_page_array initialization and segmentation means that phys_avail must be a subset of it. And this must be stated and be visible enough. Maybe it should even be checked in code. I like the idea of thinking about dump_avail as something that describes all memory in a system, but it's not how dump_avail is defined in archs now.
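One possible shape for the check mentioned above, sketched as a hypothetical helper (nothing like it exists in the tree; phys_avail[] and dump_avail[] are the usual machine-dependent arrays of start/end pairs terminated by a zero entry):

#include <sys/param.h>
#include <sys/systm.h>

extern vm_paddr_t phys_avail[];	/* provided by MD startup code */
extern vm_paddr_t dump_avail[];	/* provided by MD startup code */

/*
 * Hypothetical sanity check: verify that every phys_avail[] range lies
 * inside some dump_avail[] range, the invariant that sizing
 * vm_page_array from dump_avail[] relies on.
 */
static void
check_phys_avail_is_subset(void)
{
	int i, j;

	for (i = 0; phys_avail[i + 1] != 0; i += 2) {
		for (j = 0; dump_avail[j + 1] != 0; j += 2) {
			if (phys_avail[i] >= dump_avail[j] &&
			    phys_avail[i + 1] <= dump_avail[j + 1])
				break;
		}
		if (dump_avail[j + 1] == 0)
			panic("phys_avail range %#jx-%#jx not covered by dump_avail[]",
			    (uintmax_t)phys_avail[i], (uintmax_t)phys_avail[i + 1]);
	}
}

Something like this could hypothetically be called once from vm_page_startup() before the page array is sized.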
>>> When you say "it's not how dump_avail is defined in archs now", I'm not sure whether you're talking about the code or the comments. In terms of code, dump_avail[] is a superset of phys_avail[], and I'm not aware of any code that would have to change. In terms of comments, I did a grep looking for comments defining what dump_avail[] is, because I couldn't remember any. I found one ... on arm. So, I don't think it's an onerous task changing the definition of dump_avail[]. :-)

>>> Already, as things stand today with dump_avail[] being used outside of the minidump code, one could reasonably argue that it should be renamed to something like phys_exists[].

>>> I will experiment with it on Monday then. However, it's not only about how the memory segments are created in vm_phys.c; it's also about how the vm_page_array size is computed in vm_page.c.

>>> Yes, and there is also a place in vm_reserv.c that needs to change. I've attached the patch that I developed and tested a long time ago. It still applies cleanly and runs ok on amd64.

>> Well, I've created and tested a minimalistic patch which - I hope - is committable. It runs ok on a pandaboard (arm-v6) and solves the presented problem. I would really appreciate it if this could be committed. Thanks.

> Sorry for the slow reply. I've just been swamped with work lately. I finally had some time to look at this in the last day or so.

> The first thing that I propose to do is commit the attached patch. This patch changes pmap_init() on amd64, armv6, and i386 so that it no longer consults phys_avail[] to determine the end of memory. Instead, it calls a new function provided by vm_phys.c to obtain the same information from vm_phys_segs[].

> With this change, the new variable phys_managed in your patch wouldn't need to be a global. It could be a local variable in vm_page_startup() that we pass as a parameter to vm_phys_init() and vm_reserv_init().

> More generally, the long-term vision that I have is that we would stop using phys_avail[] after vm_page_startup() has completed. It would only be used during initialization. After that we would use vm_phys_segs[] and functions provided by vm_phys.c.

> I understand. The patch and the long-term vision are fine with me. I just was not bold enough to pass phys_managed as a parameter to vm_phys_init() and vm_reserv_init(). However, I certainly was thinking about it. While reading the comment above vm_phys_get_end(): do we care whether the last usable address is 0xFFFFFFFF?

To date, this hasn't been a problem. However, handling 0xFFFFFFFF is easy. So, the final version of the patch that I committed this weekend does so.

Can you please try the attached patch? It replaces phys_avail[] with vm_phys_segs[] in arm's busdma.

> Do you think that the rest of my patch, adjusted for the changes due to your patch, is ok?

Basically, yes. I do, however, think that

+#if defined(__arm__)
+	phys_managed = dump_avail;
+#else
+	phys_managed = phys_avail;
+#endif

should also be conditioned on VM_PHYSSEG_SPARSE.
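Spelled out, that suggestion would make the hunk above read roughly as follows (a sketch of the idea only, not necessarily the form that ends up being committed):

/*
 * Use dump_avail[] for phys_managed only where the sparse model is
 * actually in effect; everywhere else phys_avail[] already covers all
 * managed pages.
 */
#if defined(__arm__) && defined(VM_PHYSSEG_SPARSE)
	phys_managed = dump_avail;
#else
	phys_managed = phys_avail;
#endif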
>> BTW, while I was inspecting all the archs, I think that maybe it's time to do what was done for busdma not long ago. There is a lot of similar code across the archs which deals with physical memory and could be generalized and put into kern/subr_physmem.c for common use. All work with physical memory could be reduced to two arrays of regions:

>> phys_present[] ... describes all present physical memory regions
>> phys_exclude[] ... describes various exclusions from phys_present[]

>> Each excluded region would be labeled with flags that say what kind of exclusion it is. Flags like NODUMP, NOALLOC, NOMANAGE, NOBOUNCE, and NOMEMRW could be combined. This idea is taken from sys/arm/arm/physmem.c.

>> All the other arrays, like phys_managed[], phys_avail[], and dump_avail[], would be created from phys_present[] and phys_exclude[]. This way the bootstrap code in the archs could be simplified and unified. For example, dealing with either hw.physmem or a page with PA 0x00000000 could be made transparent.

>> I'm prepared to volunteer if the idea is ripe. However, I will be looking for a mentor.

> I've never really looked at arm/arm/physmem.c before. Let me do that before I comment on this.

> No problem. This could be a long-term aim. However, I hope the VM_PHYSSEG_SPARSE problem can be dealt with in MI code in the meantime. In any case, thanks for your help.
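For reference, sys/arm/arm/physmem.c already works roughly this way, with arm_physmem_hardware_region(), arm_physmem_exclude_region(), and exclusion flags such as EXFLAG_NOALLOC and EXFLAG_NODUMP. A rough sketch of what the proposed machine-independent interface could look like; every name here is hypothetical except the array names and flag names, which come from the proposal above:

#include <sys/types.h>

#define	PHYS_EXFLAG_NODUMP	0x01	/* exclude from dump_avail[] */
#define	PHYS_EXFLAG_NOALLOC	0x02	/* exclude from phys_avail[] */
#define	PHYS_EXFLAG_NOMANAGE	0x04	/* exclude from phys_managed[] */
#define	PHYS_EXFLAG_NOBOUNCE	0x08	/* not usable for bounce pages */
#define	PHYS_EXFLAG_NOMEMRW	0x10	/* not accessible via /dev/mem */

/* Register a region of physical memory that is present in the system. */
void	physmem_hardware_region(vm_paddr_t pa, vm_size_t sz);

/* Record an exclusion to be carved out of the present regions. */
void	physmem_exclude_region(vm_paddr_t pa, vm_size_t sz, uint32_t flags);

/*
 * Build the conventional arrays (phys_avail[], dump_avail[],
 * phys_managed[], ...) from phys_present[] minus the matching
 * exclusions, late in the machine-dependent startup.
 */
void	physmem_init_kernel_globals(void);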
[Attachment: busdma_arm1.patch]

Index: arm/arm/busdma_machdep-v6.c
===================================================================
--- arm/arm/busdma_machdep-v6.c	(revision 273699)
+++ arm/arm/busdma_machdep-v6.c	(working copy)
@@ -54,8 +54,9 @@ __FBSDID("$FreeBSD$");
 #include <sys/uio.h>
 
 #include <vm/vm.h>
+#include <vm/vm_param.h>
 #include <vm/vm_page.h>
-#include <vm/vm_map.h>
+#include <vm/vm_phys.h>
 #include <vm/vm_extern.h>
 #include <vm/vm_kern.h>
 
@@ -277,16 +278,18 @@ SYSINIT(busdma, SI_SUB_KMEM+1, SI_ORDER_FIRST, bus
  * express, so we take a fast out.
  */
 static int
-exclusion_bounce_check(vm_offset_t lowaddr, vm_offset_t highaddr)
+exclusion_bounce_check(vm_paddr_t lowaddr, vm_paddr_t highaddr)
 {
+	struct vm_phys_seg *seg;
 	int i;
 
 	if (lowaddr >= BUS_SPACE_MAXADDR)
 		return (0);
 
-	for (i = 0; phys_avail[i] && phys_avail[i + 1]; i += 2) {
-		if ((lowaddr >= phys_avail[i] && lowaddr < phys_avail[i + 1]) ||
-		    (lowaddr < phys_avail[i] && highaddr >= phys_avail[i]))
+	for (i = 0; i < vm_phys_nsegs; i++) {
+		seg = &vm_phys_segs[i];
+		if ((lowaddr >= seg->start && lowaddr < seg->end) ||
+		    (lowaddr < seg->start && highaddr >= seg->start))
 			return (1);
 	}
 	return (0);
Index: arm/arm/busdma_machdep.c
===================================================================
--- arm/arm/busdma_machdep.c	(revision 273699)
+++ arm/arm/busdma_machdep.c	(working copy)
@@ -70,10 +70,11 @@ __FBSDID("$FreeBSD$");
 
 #include <vm/uma.h>
 #include <vm/vm.h>
+#include <vm/vm_param.h>
 #include <vm/vm_extern.h>
 #include <vm/vm_kern.h>
 #include <vm/vm_page.h>
-#include <vm/vm_map.h>
+#include <vm/vm_phys.h>
 
 #include <machine/atomic.h>
 #include <machine/bus.h>
@@ -326,17 +327,19 @@ run_filter(bus_dma_tag_t dmat, bus_addr_t paddr)
  * express, so we take a fast out.
  */
 static __inline int
-_bus_dma_can_bounce(vm_offset_t lowaddr, vm_offset_t highaddr)
+_bus_dma_can_bounce(vm_paddr_t lowaddr, vm_paddr_t highaddr)
 {
+	struct vm_phys_seg *seg;
 	int i;
 
 	if (lowaddr >= BUS_SPACE_MAXADDR)
 		return (0);
 
-	for (i = 0; phys_avail[i] && phys_avail[i + 1]; i += 2) {
-		if ((lowaddr >= phys_avail[i] && lowaddr <= phys_avail[i + 1])
-		    || (lowaddr < phys_avail[i] &&
-		    highaddr > phys_avail[i]))
+	for (i = 0; i < vm_phys_nsegs; i++) {
+		seg = &vm_phys_segs[i];
+		if ((lowaddr >= seg->start && lowaddr <= seg->end)
+		    || (lowaddr < seg->start &&
+		    highaddr > seg->start))
 			return (1);
 	}
 	return (0);