Date: Fri, 24 Oct 2014 22:20:16 +0300 From: Konstantin Belousov <kostikbel@gmail.com> To: Rui Paulo <rpaulo@FreeBSD.org> Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org Subject: Re: svn commit: r273598 - in head: include sys/dev/acpica Message-ID: <20141024192016.GD1877@kib.kiev.ua> In-Reply-To: <201410241839.s9OIdG0E077139@svn.freebsd.org> References: <201410241839.s9OIdG0E077139@svn.freebsd.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Fri, Oct 24, 2014 at 06:39:16PM +0000, Rui Paulo wrote: > Author: rpaulo > Date: Fri Oct 24 18:39:15 2014 > New Revision: 273598 > URL: https://svnweb.freebsd.org/changeset/base/273598 > > Log: > HPET: create /dev/hpetN as a way to access HPET from userland. > > In some cases, TSC is broken and special applications might benefit > from memory mapping HPET and reading the registers to count time. > Most often the main HPET counter is 32-bit only[1], so this only gives > the application a 300 second window based on the default HPET > interval. > Other applications, such as Intel's DPDK, expect /dev/hpet to be > present and use it to count time as well. > > Although we have an almost userland version of gettimeofday() which > uses rdtsc in userland, it's not always possible to use it, depending > on how broken the multi-socket hardware is. Yes, and hpet userland mapping would be better handled through the same fake-vdso framework. As designed, it has discriminator to inform userspace about algorithm, and can happilly utilize HPET timecounter automatically mapped by kernel into the process address space. > > Install the acpi_hpet.h so that applications can use the HPET register > definitions. > > [1] I haven't found a system where HPET's main counter uses more than > 32 bit. There seems to be a discrepancy in the Intel documentation > (claiming it's a 64-bit counter) and the actual implementation (a > 32-bit counter in a 64-bit memory area). > > MFC after: 1 week > Relnotes: yes > > Modified: > head/include/Makefile > head/sys/dev/acpica/acpi_hpet.c > > Modified: head/include/Makefile > ============================================================================== > --- head/include/Makefile Fri Oct 24 17:40:32 2014 (r273597) > +++ head/include/Makefile Fri Oct 24 18:39:15 2014 (r273598) > @@ -159,6 +159,8 @@ copies: > cd ${.CURDIR}/../sys/dev/acpica; \ > ${INSTALL} -C -o ${BINOWN} -g ${BINGRP} -m 444 acpiio.h \ > ${DESTDIR}${INCLUDEDIR}/dev/acpica > + ${INSTALL} -C -o ${BINOWN} -g ${BINGRP} -m 444 acpi_hpet.h \ > + ${DESTDIR}${INCLUDEDIR}/dev/acpica > cd ${.CURDIR}/../sys/dev/agp; \ > ${INSTALL} -C -o ${BINOWN} -g ${BINGRP} -m 444 agpreg.h \ > ${DESTDIR}${INCLUDEDIR}/dev/agp > @@ -243,7 +245,7 @@ symlinks: > done > .endfor > cd ${.CURDIR}/../sys/dev/acpica; \ > - for h in acpiio.h; do \ > + for h in acpiio.h acpi_hpet.h; do \ > ln -fs ../../../../sys/dev/acpica/$$h \ > ${DESTDIR}${INCLUDEDIR}/dev/acpica; \ > done > > Modified: head/sys/dev/acpica/acpi_hpet.c > ============================================================================== > --- head/sys/dev/acpica/acpi_hpet.c Fri Oct 24 17:40:32 2014 (r273597) > +++ head/sys/dev/acpica/acpi_hpet.c Fri Oct 24 18:39:15 2014 (r273598) > @@ -35,11 +35,13 @@ __FBSDID("$FreeBSD$"); > #include "opt_apic.h" > #endif > #include <sys/param.h> > +#include <sys/conf.h> > #include <sys/bus.h> > #include <sys/kernel.h> > #include <sys/module.h> > #include <sys/proc.h> > #include <sys/rman.h> > +#include <sys/mman.h> > #include <sys/time.h> > #include <sys/smp.h> > #include <sys/sysctl.h> > @@ -106,6 +108,23 @@ struct hpet_softc { > char name[8]; > } t[32]; > int num_timers; > + struct cdev *pdev; > + int mmap_allow; > + int mmap_allow_write; > + int devinuse; > +}; > + > +static d_open_t hpet_open; > +static d_close_t hpet_close; > +static d_mmap_t hpet_mmap; > + > +static struct cdevsw hpet_cdevsw = { > + .d_version = D_VERSION, > + .d_flags = D_TRACKCLOSE, > + .d_name = "hpet", > + .d_open = hpet_open, > + .d_close = hpet_close, > + .d_mmap = hpet_mmap, > }; > > static u_int hpet_get_timecount(struct timecounter *tc); > @@ -317,6 +336,47 @@ hpet_find_irq_rid(device_t dev, u_long s > } > } > > +static int > +hpet_open(struct cdev *cdev, int oflags, int devtype, struct thread *td) > +{ > + struct hpet_softc *sc; > + > + sc = cdev->si_drv1; > + if (!sc->mmap_allow) > + return (EPERM); > + if (atomic_cmpset_32(&sc->devinuse, 0, 1) == 0) > + return (EBUSY); This is extra-weird. The devinuse business disallows simultaneous opens, which prevents other process from opening and mapping. But if the original caller does mmap and close, second process now is allowed to open and mmap. That said, why do you need this devinuse at all ? > + else > + return (0); > +} > + > +static int > +hpet_close(struct cdev *cdev, int fflag, int devtype, struct thread *td) > +{ > + struct hpet_softc *sc; > + > + sc = cdev->si_drv1; > + sc->devinuse = 0; > + > + return (0); > +} > + > +static int > +hpet_mmap(struct cdev *cdev, vm_ooffset_t offset, vm_paddr_t *paddr, > + int nprot, vm_memattr_t *memattr) > +{ > + struct hpet_softc *sc; > + > + sc = cdev->si_drv1; > + if (offset > rman_get_size(sc->mem_res)) > + return (EINVAL); > + if (!sc->mmap_allow_write && (nprot & PROT_WRITE)) > + return (EPERM); > + *paddr = rman_get_start(sc->mem_res) + offset; What is the memattr for the backing page ? Is it set to non-cached mode somehow ? I was not able to find place where would this happen. > + > + return (0); > +} > + > /* Discover the HPET via the ACPI table of the same name. */ > static void > hpet_identify(driver_t *driver, device_t parent) > @@ -701,6 +761,26 @@ hpet_attach(device_t dev) > maxhpetet++; > } > } > + > + sc->pdev = make_dev(&hpet_cdevsw, 0, UID_ROOT, GID_WHEEL, > + 0600, "hpet%d", device_get_unit(dev)); > + if (sc->pdev) { > + sc->pdev->si_drv1 = sc; > + sc->mmap_allow = 1; > + TUNABLE_INT_FETCH("hw.acpi.hpet.mmap_allow", > + &sc->mmap_allow); > + sc->mmap_allow_write = 1; > + TUNABLE_INT_FETCH("hw.acpi.hpet.mmap_allow_write", > + &sc->mmap_allow_write); > + SYSCTL_ADD_INT(device_get_sysctl_ctx(dev), > + SYSCTL_CHILDREN(device_get_sysctl_tree(dev)), > + OID_AUTO, "mmap_allow", > + CTLFLAG_RW, &sc->mmap_allow, 0, > + "Allow userland to memory map HPET"); Why is mmap_allow is per-instance, while mmap_allow_write taken from the global tunable ? > + } else > + device_printf(dev, "could not create /dev/hpet%d\n", > + device_get_unit(dev)); > + > return (0); > } >
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20141024192016.GD1877>