Date:      Fri, 24 Oct 2014 19:33:07 +0000 (GMT)
From:      Rui Paulo <rpaulo@me.com>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, Rui Paulo <rpaulo@FreeBSD.org>
Subject:   Re: svn commit: r273598 - in head: include sys/dev/acpica
Message-ID:  <33decfcd-e77c-4e4c-8161-9f4a232213c6@me.com>

On Oct 24, 2014, at 12:20 PM, Konstantin Belousov <kostikbel@gmail.com> wrote:

On Fri, Oct 24, 2014 at 06:39:16PM +0000, Rui Paulo wrote:
> Author: rpaulo
> Date: Fri Oct 24 18:39:15 2014
> New Revision: 273598
> URL: https://svnweb.freebsd.org/changeset/base/273598
>
> Log:
> HPET: create /dev/hpetN as a way to access HPET from userland.
>
> In some cases, TSC is broken and special applications might benefit
> from memory mapping HPET and reading the registers to count time.
> Most often the main HPET counter is 32-bit only[1], so this only gives
> the application a 300 second window based on the default HPET
> interval.
> Other applications, such as Intel's DPDK, expect /dev/hpet to be
> present and use it to count time as well.
>
> Although we have an almost userland version of gettimeofday() which
> uses rdtsc in userland, it's not always possible to use it, depending
> on how broken the multi-socket hardware is.
Yes, and hpet userland mapping would be better handled through the same
fake-vdso framework. As designed, it has discriminator to inform
userspace about algorithm, and can happily utilize HPET timecounter
automatically mapped by kernel into the process address space.

I'm aware of that, but I found the vdso a bit confusing and decided to work on that later.

> +static int
> +hpet_open(struct cdev *cdev, int oflags, int devtype, struct thread *td)
> +{
> +        struct hpet_softc *sc;
> +
> +        sc = cdev->si_drv1;
> +        if (!sc->mmap_allow)
> +                return (EPERM);
> +        if (atomic_cmpset_32(&sc->devinuse, 0, 1) == 0)
> +                return (EBUSY);
This is extra-weird.
The devinuse business disallows simultaneous opens, which prevents
other process from opening and mapping. But if the original caller
does mmap and close, second process now is allowed to open and mmap.

That said, why do you need this devinuse at all ?

Hmm, I wanted to avoid multiple mmap's, but that doesn't work like you said. I may just remove this restriction.

> +static int
> +hpet_mmap(struct cdev *cdev, vm_ooffset_t offset, vm_paddr_t *paddr,
> +    int nprot, vm_memattr_t *memattr)
> +{
> +        struct hpet_softc *sc;
> +
> +        sc = cdev->si_drv1;
> +        if (offset > rman_get_size(sc->mem_res))
> +                return (EINVAL);
> +        if (!sc->mmap_allow_write && (nprot & PROT_WRITE))
> +                return (EPERM);
> +        *paddr = rman_get_start(sc->mem_res) + offset;
What is the memattr for the backing page ? Is it set to non-cached
mode somehow ? I was not able to find place where would this happen.

I expect it to be set to non-cached since it's a device anyway, but I don't know where it is. During my testing, I did not see any problems with cached values, though.

> +        sc->pdev = make_dev(&hpet_cdevsw, 0, UID_ROOT, GID_WHEEL,
> +            0600, "hpet%d", device_get_unit(dev));
> +        if (sc->pdev) {
> +                sc->pdev->si_drv1 = sc;
> +                sc->mmap_allow = 1;
> +                TUNABLE_INT_FETCH("hw.acpi.hpet.mmap_allow",
> +                    &sc->mmap_allow);
> +                sc->mmap_allow_write = 1;
> +                TUNABLE_INT_FETCH("hw.acpi.hpet.mmap_allow_write",
> +                    &sc->mmap_allow_write);
> +                SYSCTL_ADD_INT(device_get_sysctl_ctx(dev),
> +                    SYSCTL_CHILDREN(device_get_sysctl_tree(dev)),
> +                    OID_AUTO, "mmap_allow",
> +                    CTLFLAG_RW, &sc->mmap_allow, 0,
> +                    "Allow userland to memory map HPET");
Why is mmap_allow per-instance, while mmap_allow_write taken from
the global tunable ?

Are you asking why there's no sysctl for it?
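
For reference, the "300 second window" from the commit log can be reproduced with a short calculation. This is only a sketch: the 14.31818 MHz rate is an assumption (a common HPET frequency on PC hardware, not something stated in this thread), and the helper names are made up for illustration. The second helper shows the usual wrap-safe way to take a delta between two reads of a 32-bit counter.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumption: 14.31818 MHz is a common HPET rate. Real code should
 * derive the rate from the counter period the device reports.
 */
#define HPET_FREQ_HZ_ASSUMED	UINT64_C(14318180)

/* Whole seconds until a 32-bit counter running at freq_hz wraps. */
static uint64_t
hpet32_wrap_seconds(uint64_t freq_hz)
{

	assert(freq_hz != 0);
	return ((UINT64_C(1) << 32) / freq_hz);
}

/*
 * Ticks elapsed between two reads of a 32-bit counter. The result is
 * taken modulo 2^32, so it stays correct across at most one wrap.
 */
static uint32_t
hpet32_delta(uint32_t then, uint32_t now)
{

	return (now - then);
}
```

With the assumed rate the counter wraps after just under 300 seconds, so an application has to sample more often than that for hpet32_delta() to remain meaningful.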
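
The devinuse point above can be illustrated with a small userland model. This is a sketch, not the driver code: the model_* names are hypothetical, C11 atomics stand in for atomic_cmpset_32(), and it assumes that close clears the flag (hpet_close() is not quoted in this thread). It shows that the flag only serializes opens, while mappings made before close stay alive.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the relevant parts of struct hpet_softc. */
struct model_softc {
	atomic_uint	devinuse;	/* 1 while a descriptor is open */
	unsigned int	mappings;	/* mappings outlive close() */
};

/* Exclusive open: succeeds only if nobody holds the device open. */
static int
model_open(struct model_softc *sc)
{
	unsigned int expected = 0;

	if (!atomic_compare_exchange_strong(&sc->devinuse, &expected, 1))
		return (EBUSY);
	return (0);
}

/* A mapping is counted here; nothing ties it to the open descriptor. */
static void
model_mmap(struct model_softc *sc)
{

	sc->mappings++;
}

/* Assumption: close releases the flag but leaves mappings intact. */
static void
model_close(struct model_softc *sc)
{

	atomic_store(&sc->devinuse, 0);
}

/* Two processes each open, mmap and close; returns live mappings. */
static unsigned int
model_two_mappers(void)
{
	struct model_softc sc = { 0 };
	int i;

	for (i = 0; i < 2; i++) {
		if (model_open(&sc) != 0)
			break;
		model_mmap(&sc);
		model_close(&sc);
	}
	return (sc.mappings);
}
```

In this model both callers end up with a mapping, which is why the flag does not prevent multiple mmaps.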


