Date:      Thu, 18 Jan 2018 09:55:15 -0600
From:      Alan Cox <alc@rice.edu>
To:        Konstantin Belousov <kostikbel@gmail.com>, Nathan Whitehorn <nwhitehorn@freebsd.org>
Cc:        Marius Strobl <marius@freebsd.org>, svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org
Subject:   Re: svn commit: r327950 - in head/sys/powerpc: aim include powerpc ps3
Message-ID:  <2d1644f8-c056-2b07-97b1-7ac211cf8e1c@rice.edu>
In-Reply-To: <20180118153532.GR55707@kib.kiev.ua>
References:  <20180115111812.GF1684@kib.kiev.ua> <f6350c61-55d1-9bf7-c4b3-e10fb329a42a@freebsd.org> <20180115170603.GJ1684@kib.kiev.ua> <9e5554d7-6a0c-5910-8cb6-74f98259536f@freebsd.org> <20180115175335.GK1684@kib.kiev.ua> <bb27ba01-8383-6b85-8b2b-65227ff46efc@freebsd.org> <20180116193208.GA12364@alchemy.franken.de> <11a7fdd6-cfd6-26c1-ae3c-7d8a63924d5a@freebsd.org> <20180117094413.GF55707@kib.kiev.ua> <57f837ce-1209-1e9a-158f-7eac5ae6d59a@freebsd.org> <20180118153532.GR55707@kib.kiev.ua>

On 01/18/2018 09:35, Konstantin Belousov wrote:
> On Thu, Jan 18, 2018 at 07:24:11AM -0800, Nathan Whitehorn wrote:
>>
>> On 01/17/18 01:44, Konstantin Belousov wrote:
>>> On Tue, Jan 16, 2018 at 09:30:29PM -0800, Nathan Whitehorn wrote:
>>>> On 01/16/18 11:32, Marius Strobl wrote:
>>>>> On Mon, Jan 15, 2018 at 03:20:49PM -0800, Nathan Whitehorn wrote:
>>>>>> On 01/15/18 09:53, Konstantin Belousov wrote:
>>>>>>> On Mon, Jan 15, 2018 at 09:32:56AM -0800, Nathan Whitehorn wrote:
>>>>>>>> That seems fine to me. I don't think a less-clumsy way that does not
>>>>>>>> involve extra indirection is possible. The PHYS_TO_DMAP() returning NULL
>>>>>>>> is about the best thing I can come up with from a clumsiness standpoint
>>>>>>>> since plenty of code checks for null pointers already, but doesn't
>>>>>>>> cleanly handle the rarer case where you want to test for the existence
>>>>>>>> of direct maps in general without testing some potemkin address.
>>>>>>>>
>>>>>>>> My one reservation about PMAP_HAS_DMAP or the like as a selector is that
>>>>>>>> it does not encode the full shape of the problem: one could imagine
>>>>>>>> having a direct map that only covers a limited range of RAM (I am not
>>>>>>>> sure whether the existence of dmaplimit on amd64 implies this can happen
>>>>>>>> with non-device memory in real life), for example. These cases are
>>>>>>>> currently covered by an assert() in PHYS_TO_DMAP(), whereas having
>>>>>>>> PHYS_TO_DMAP() return NULL allows a more flexible signalling and the
>>>>>>>> potential for the calling code to do something reasonable to handle the
>>>>>>>> error. A single global flag can't convey information at this kind of
>>>>>>>> granularity. Is this a reasonable concern? Or am I overthinking things?
>>>>>>> IMO it is overreaction.  amd64 assumes that all normal memory is covered
>>>>>>> by DMAP.  It must never fail.   See, for instance, the implementation
>>>>>>> of the sf bufs for it.
>>>>>>>
>>>>>>> If device memory not covered by DMAP can exist, it is the driver's problem.
>>>>>>> For instance, for NVDIMMs I wrote specific mapping code which establishes
>>>>>>> kernel mapping for it, when not covered by EFI memory map and correspondingly
>>>>>>> not included into DMAP.
>>>>>>>
>>>>>> Fair enough. Here's a patch with a new flag (DIRECT_MAP_AVAILABLE). I've
>>>>>> also retooled the sfbuf code to use this rather than its own flags that
>>>>>> mean the same things. The sparc64 part of the patch is untested.
>>>>>> -Nathan
>>>>>> Index: sparc64/include/vmparam.h
>>>>>> ===================================================================
>>>>>> --- sparc64/include/vmparam.h	(revision 328006)
>>>>>> +++ sparc64/include/vmparam.h	(working copy)
>>>>>> @@ -240,10 +240,12 @@
>>>>>>     */
>>>>>>    #define	ZERO_REGION_SIZE	PAGE_SIZE
>>>>>>    
>>>>>> +#include <machine/tlb.h>
>>>>>> +
>>>>>>    #define	SFBUF
>>>>>>    #define	SFBUF_MAP
>>>>>> -#define	SFBUF_OPTIONAL_DIRECT_MAP	dcache_color_ignore
>>>>>> -#include <machine/tlb.h>
>>>>>> -#define	SFBUF_PHYS_DMAP(x)		TLB_PHYS_TO_DIRECT(x)
>>>>>>    
>>>>>> +#define DIRECT_MAP_AVAILABLE	dcache_color_ignore
>>>>>> +#define	PHYS_TO_DMAP(x)	(DIRECT_MAP_AVAILABLE ? TLB_PHYS_TO_DIRECT(x) : 0)
>>>>> What dcache_color_ignore actually indicates is the presence of
>>>>> hardware unaliasing support, in other words the ability to enter
>>>>> duplicate cacheable mappings into the MMU. While a direct map is
>>>>> available and used by MD code on all supported CPUs down to US-I,
>>>>> the former feature is only implemented in the line of Fujitsu SPARC64
>>>>> processors. IIRC, the sfbuf(9) code can't guarantee that there isn't
>>>>> already a cacheable mapping from a different VA to the same PA,
>>>>> which is why it employs dcache_color_ignore. Is that a general
>>>>> constraint of all MI PHYS_TO_DMAP users or are there consumers
>>>>> which can guarantee that they are the only users of a mapping
>>>>> to the same PA?
>>>>>
>>>>> Marius
>>>>>
>>>> With the patch, there are four uses of this in the kernel: the sfbuf
>>>> code, a diagnostic check on page zeroing, part of the EFI runtime code,
>>>> and part of the Linux KBI compat. The second looks safe from this
>>>> perspective and at least some of the others (EFI runtime) are irrelevant
>>>> on sparc64. But I really have no idea what was intended for the
>>>> semantics of this API -- I didn't even know it *was* an MI API until
>>>> this commit. Maybe kib can comment? If this is outside the semantics of
>>>> PHYS_TO_DMAP, then we need to keep the existing sfbuf code.
>>> sfbufs cannot guarantee that there is no other mapping of the page when
>>> the sfbuf is created.  For instance, one of the uses of sfbufs is to map
>>> the image page 0 to read ELF headers when doing the image activation.
>>> The image might be mapped by other processes, and we do not control the
>>> address at which it is mapped.
>>>
>>> So the direct map accesses must work regardless of the presence of other
>>> page mappings, and the check for dcache_color_ignore is needed to allow
>>> MI code to take advantage of DMAP.
>>>
>> So: what do you want to happen with PHYS_TO_DMAP()? Do we want to claim
>> to MI that a direct map is "available" in such circumstances, or
>> "unavailable"? Should sfbuf retain a separate API? I have no preferences
>> here and just want to close out this issue.
> Perhaps DMAP should be conditionally available to the MI layer, same as
> on powerpc? I.e. your patch cited above looks right to me, unless I
> misunderstand Marius' response.
>
Yes, it should.  Only the sparc64 machines where dcache_color_ignore is
true should attempt to use the direct map in MI code.  The
machine-dependent uiomove_fromphys() on sparc64 shows the hoops one has
to jump through in order to correctly use the direct map on sparc64.
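
For illustration only, here is a minimal userland sketch of the pattern
discussed above: MI code tests DIRECT_MAP_AVAILABLE first and takes a
slower mapping path when the direct map must not be used. The macro names
DIRECT_MAP_AVAILABLE and PHYS_TO_DMAP and the variable dcache_color_ignore
come from the quoted patch. Everything else (FAKE_DIRECT_BASE, the
TLB_PHYS_TO_DIRECT stand-in, fallback_map(), mi_map_page()) is hypothetical
and is not the actual kernel code.

```c
#include <stdint.h>

/*
 * Stand-in for the sparc64 MD variable: nonzero when the CPU has
 * hardware unaliasing support.  In the kernel this is set by MD code.
 */
static int dcache_color_ignore = 0;

/*
 * Hypothetical direct-map base, used only for this sketch.  The real
 * TLB_PHYS_TO_DIRECT() is provided by <machine/tlb.h>.
 */
#define	FAKE_DIRECT_BASE	UINT64_C(0xfffff80000000000)
#define	TLB_PHYS_TO_DIRECT(pa)	(FAKE_DIRECT_BASE | (uint64_t)(pa))

/* Shape of the definitions in the quoted patch. */
#define	DIRECT_MAP_AVAILABLE	dcache_color_ignore
#define	PHYS_TO_DMAP(x)	(DIRECT_MAP_AVAILABLE ? TLB_PHYS_TO_DIRECT(x) : 0)

/*
 * Hypothetical slow path standing in for a temporary, sfbuf-style
 * mapping.  It only fabricates an address so the sketch can be tested.
 */
static uint64_t
fallback_map(uint64_t pa)
{

	return (UINT64_C(0x1000000) + (pa & 0xfff));
}

/*
 * Hypothetical MI consumer: use the direct map only when the MD layer
 * says it is safe, otherwise fall back to the slow path.
 */
static uint64_t
mi_map_page(uint64_t pa)
{

	if (DIRECT_MAP_AVAILABLE)
		return (PHYS_TO_DMAP(pa));
	return (fallback_map(pa));
}
```

With dcache_color_ignore left at 0, mi_map_page() never touches the direct
map; setting it to 1 makes the same call return the direct-map address.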
