Date: Fri, 24 Oct 2014 19:33:07 +0000 (GMT)
From: Rui Paulo <rpaulo@me.com>
To: Konstantin Belousov <kostikbel@gmail.com>
Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, Rui Paulo <rpaulo@FreeBSD.org>
Subject: Re: svn commit: r273598 - in head: include sys/dev/acpica
Message-ID: <33decfcd-e77c-4e4c-8161-9f4a232213c6@me.com>

On Oct 24, 2014, at 12:20 PM, Konstantin Belousov <kostikbel@gmail.com> wrote:

> On Fri, Oct 24, 2014 at 06:39:16PM +0000, Rui Paulo wrote:
> > Author: rpaulo
> > Date: Fri Oct 24 18:39:15 2014
> > New Revision: 273598
> > URL: https://svnweb.freebsd.org/changeset/base/273598
> >
> > Log:
> >   HPET: create /dev/hpetN as a way to access HPET from userland.
> >
> >   In some cases, TSC is broken and special applications might benefit
> >   from memory mapping HPET and reading the registers to count time.
> >   Most often the main HPET counter is 32-bit only[1], so this only gives
> >   the application a 300 second window based on the default HPET
> >   interval.
> >   Other applications, such as Intel's DPDK, expect /dev/hpet to be
> >   present and use it to count time as well.
> >
> >   Although we have an almost userland version of gettimeofday() which
> >   uses rdtsc in userland, it's not always possible to use it, depending
> >   on how broken the multi-socket hardware is.
> Yes, and hpet userland mapping would be better handled through the same
> fake-vdso framework. As designed, it has a discriminator to inform
> userspace about the algorithm, and can happily utilize an HPET timecounter
> automatically mapped by the kernel into the process address space.

I'm aware of that, but I found the vdso a bit confusing and decided to
work on that later.

> > +static int
> > +hpet_open(struct cdev *cdev, int oflags, int devtype, struct thread *td)
> > +{
> > +	struct hpet_softc *sc;
> > +
> > +	sc = cdev->si_drv1;
> > +	if (!sc->mmap_allow)
> > +		return (EPERM);
> > +	if (atomic_cmpset_32(&sc->devinuse, 0, 1) == 0)
> > +		return (EBUSY);
> This is extra-weird.
> The devinuse business disallows simultaneous opens, which prevents
> another process from opening and mapping. But if the original caller
> does mmap and close, a second process is now allowed to open and mmap.
>
> That said, why do you need this devinuse at all?

Hmm, I wanted to avoid multiple mmap's, but that doesn't work, like you
said. I may just remove this restriction.

> > +static int
> > +hpet_mmap(struct cdev *cdev, vm_ooffset_t offset, vm_paddr_t *paddr,
> > +    int nprot, vm_memattr_t *memattr)
> > +{
> > +	struct hpet_softc *sc;
> > +
> > +	sc = cdev->si_drv1;
> > +	if (offset > rman_get_size(sc->mem_res))
> > +		return (EINVAL);
> > +	if (!sc->mmap_allow_write && (nprot & PROT_WRITE))
> > +		return (EPERM);
> > +	*paddr = rman_get_start(sc->mem_res) + offset;
> What is the memattr for the backing page? Is it set to non-cached
> mode somehow? I was not able to find the place where this would happen.

I expect it to be set to non-cached since it's a device anyway, but I
don't know where it is. During my testing, I did not see any problems
with cached values, though.

> > +	sc->pdev = make_dev(&hpet_cdevsw, 0, UID_ROOT, GID_WHEEL,
> > +	    0600, "hpet%d", device_get_unit(dev));
> > +	if (sc->pdev) {
> > +		sc->pdev->si_drv1 = sc;
> > +		sc->mmap_allow = 1;
> > +		TUNABLE_INT_FETCH("hw.acpi.hpet.mmap_allow",
> > +		    &sc->mmap_allow);
> > +		sc->mmap_allow_write = 1;
> > +		TUNABLE_INT_FETCH("hw.acpi.hpet.mmap_allow_write",
> > +		    &sc->mmap_allow_write);
> > +		SYSCTL_ADD_INT(device_get_sysctl_ctx(dev),
> > +		    SYSCTL_CHILDREN(device_get_sysctl_tree(dev)),
> > +		    OID_AUTO, "mmap_allow",
> > +		    CTLFLAG_RW, &sc->mmap_allow, 0,
> > +		    "Allow userland to memory map HPET");
> Why is mmap_allow per-instance, while mmap_allow_write is taken from
> the global tunable?

Are you asking why there's no sysctl for it?
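
For reference, the "300 second window" in the commit log follows from the counter width and the tick period. The sketch below is only an illustration, not code from the commit: the helper names are hypothetical, and the tick period of 69841279 fs (about 14.318 MHz) used in the usage note is an assumed typical value, since the thread does not state the actual period. It also shows why a userland reader of a 32-bit counter can still compute correct intervals across a single wrap, using unsigned subtraction.

```c
#include <assert.h>
#include <stdint.h>

/* The HPET reports its tick period in femtoseconds (1 s = 10^15 fs). */
#define FS_PER_SEC	1000000000000000ULL

/*
 * Seconds until a 32-bit counter wraps, given the tick period in fs.
 * Hypothetical helper, not part of the committed driver.
 */
static double
hpet_wrap_seconds(uint32_t period_fs)
{
	assert(period_fs > 0);
	return (4294967296.0 * (double)period_fs / (double)FS_PER_SEC);
}

/*
 * Ticks elapsed between two 32-bit samples. Unsigned subtraction is
 * modulo 2^32, so the result is correct across at most one wrap.
 */
static uint32_t
hpet_delta_ticks(uint32_t then, uint32_t now)
{
	return (now - then);
}

/*
 * Convert a tick delta to nanoseconds (1 ns = 10^6 fs). Both inputs
 * are below 2^32, so the 64-bit product cannot overflow.
 */
static uint64_t
hpet_ticks_to_ns(uint32_t ticks, uint32_t period_fs)
{
	return ((uint64_t)ticks * period_fs / 1000000ULL);
}
```

With the assumed period, hpet_wrap_seconds(69841279) comes out just under 300 seconds, which matches the window quoted in the log. An application that samples the counter more often than that and accumulates hpet_delta_ticks() results in a 64-bit variable would not lose time to wraparound.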