Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 24 Oct 2014 22:45:46 +0300
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Rui Paulo <rpaulo@me.com>
Cc:        svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, Rui Paulo <rpaulo@FreeBSD.org>
Subject:   Re: svn commit: r273598 - in head: include sys/dev/acpica
Message-ID:  <20141024194546.GE1877@kib.kiev.ua>
In-Reply-To: <33decfcd-e77c-4e4c-8161-9f4a232213c6@me.com>

index | next in thread | previous in thread | raw e-mail

On Fri, Oct 24, 2014 at 07:33:07PM +0000, Rui Paulo wrote:
> On Oct 24, 2014, at 12:20 PM, Konstantin Belousov <kostikbel@gmail.com> wrote:
> 
> On Fri, Oct 24, 2014 at 06:39:16PM +0000, Rui Paulo wrote:
> > Author: rpaulo
> > Date: Fri Oct 24 18:39:15 2014
> > New Revision: 273598
> > URL: https://svnweb.freebsd.org/changeset/base/273598
> > 
> > Log:
> > HPET: create /dev/hpetN as a way to access HPET from userland.
> > 
> > In some cases, TSC is broken and special applications might benefit
> > from memory mapping HPET and reading the registers to count time.
> > Most often the main HPET counter is 32-bit only[1], so this only gives
> > the application a 300 second window based on the default HPET
> > interval.
> > Other applications, such as Intel's DPDK, expect /dev/hpet to be
> > present and use it to count time as well.
> > 
> > Although we have an almost userland version of gettimeofday() which
> > uses rdtsc in userland, it's not always possible to use it, depending
> > on how broken the multi-socket hardware is.
> Yes, and hpet userland mapping would be better handled through the same
> fake-vdso framework. As designed, it has discriminator to inform
> userspace about algorithm, and can happilly utilize HPET timecounter
> automatically mapped by kernel into the process address space.
> š
> I'm aware of that, but I found the vdso a bit confusing and decided to work on that later.
> 
> > +static int
> > +hpet_open(struct cdev *cdev, int oflags, int devtype, struct thread *td)
> > +{
> > + š š šstruct hpet_softc *sc;
> > +
> > + š š š šsc = cdev->si_drv1;
> > + š š š šif (!sc->mmap_allow)
> > + š š š š šreturn (EPERM);
> > + š š šif (atomic_cmpset_32(&sc->devinuse, 0, 1) == 0)
> > + š š š š š šreturn (EBUSY);
> This is extra-weird.
> The devinuse business disallows simultaneous opens, which prevents
> other process from opening and mapping. But if the original caller
> does mmap and close, second process now is allowed to open and mmap.
> 
> That said, why do you need this devinuse at all ?
> š
> Hmm, I wanted to avoid multiple mmap's, but that doesn't work like you said. šI may just remove this restriction.š
> 
This is probably best.

> > +static int
> > +hpet_mmap(struct cdev *cdev, vm_ooffset_t offset, vm_paddr_t *paddr,
> > + int nprot, vm_memattr_t *memattr)
> > +{
> > + š šstruct hpet_softc *sc;
> > +
> > + š šsc = cdev->si_drv1;
> > + š š š šif (offset > rman_get_size(sc->mem_res))
> > + š š š š š š š šreturn (EINVAL);
> > + š šif (!sc->mmap_allow_write && (nprot & PROT_WRITE))
> > + š š šreturn (EPERM);
> > + š š š*paddr = rman_get_start(sc->mem_res) + offset;
> What is the memattr for the backing page ? Is it set to non-cached
> mode somehow ? I was not able to find place where would this happen.
> š
> I expect it to be set to non-cached since it's a device anyway, but I don't know where it is. šDuring my testing, I did not see any problems with cached values, though.
> 
I am not claiming that it is wrong, only that I do not see an easy reason
why it is right.  Just printing the *memattr would provide the confidence.

> > + šsc->pdev = make_dev(&hpet_cdevsw, 0, UID_ROOT, GID_WHEEL,
> > + š š š š 0600, "hpet%d", device_get_unit(dev));
> > + š š š šif (sc->pdev) {
> > + š š š š š š š šsc->pdev->si_drv1 = sc;
> > + š š šsc->mmap_allow = 1;
> > + š š š š š š š šTUNABLE_INT_FETCH("hw.acpi.hpet.mmap_allow",
> > + š š š š š š &sc->mmap_allow);
> > + š š š š š šsc->mmap_allow_write = 1;
> > + š š š š š šTUNABLE_INT_FETCH("hw.acpi.hpet.mmap_allow_write",
> > + š š š š š š š š &sc->mmap_allow_write);
> > + š š š š š š š šSYSCTL_ADD_INT(device_get_sysctl_ctx(dev),
> > + š š š š š š š š SYSCTL_CHILDREN(device_get_sysctl_tree(dev)),
> > + š š š OID_AUTO, "mmap_allow",
> > + š š š š š CTLFLAG_RW, &sc->mmap_allow, 0,
> > + š š š š š š š š "Allow userland to memory map HPET");
> Why is mmap_allow is per-instance, while mmap_allow_write taken from
> the global tunable ?
> š
> Are you asking why there's no sysctl for it?

IMO the allow-write must be controllable per-instance, or just managed
by /dev/hpet* unix permissions.  Having one global knob, which is
consulted at the module load, is not flexible.

What is the use-case for writing to HPET page ? To manually micro-adjust
the timer ?  Then, you probably want to disable write for instance used
by system timecounter or eventtimer, but allow for HPET utilized by
other consumers.


home | help

Want to link to this message? Use this
URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20141024194546.GE1877>