Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 8 Aug 2014 22:57:20 +1000 (EST)
From:      Ian Smith <smithi@nimnet.asn.au>
To:        Bruce Evans <brde@optusnet.com.au>
Cc:        Anthony Jenkins <Anthony.B.Jenkins@att.net>, freebsd-acpi@freebsd.org
Subject:   Re: [PATCH] ACPI CMOS region support - submission?
Message-ID:  <20140808185539.V62404@sola.nimnet.asn.au>
In-Reply-To: <20140806040804.P849@besplex.bde.org>
References:  <53DD61BD.7050508@att.net> <20140805175541.G62404@sola.nimnet.asn.au> <20140806040804.P849@besplex.bde.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Wed, 6 Aug 2014 06:49:12 +1000, Bruce Evans wrote:
 > On Tue, 5 Aug 2014, Ian Smith wrote:
 > 
 > > On Sat, 2 Aug 2014 18:10:05 -0400, Anthony Jenkins wrote:
 > > 
 > > > I made a few minor changes since the last incarnation:
 > > >
 > > >  - Defined the CMOS address/data register addresses as macros
 > > >  - Defined the (apparent) I/O delay as a macro
 > > 
 > > Looks good, Anthony.
 > > 
 > > > I also verified the ACPI CMOS region code only accesses up to
 > > > register 63 (0x3F - in previous emails I mistakenly said 0x7F).
 > > 
 > > Is 0x3f what the ACPI spec limits ACPI access to?
 > > 
 > > This is mildly surprising, as all modern RTC chips are at least 128
 > > bytes; noone's actually used a MC146818 in over ten years, I gather.
 > 
 > BIOSes at least used to use much more.  It is good for ACPI to not allow
 > clobbering the BIOS state.
 > 
 > > > If/when this gets in, I'd like to add sysctl controls to e.g. allow
 > > > ACPI access to the date/time registers (I currently return failure
 > > > when attempting to write them via ACPI).  I don't see anything in the
 > > > spec (after re-reading it) that disallows ACPI from touching those.
 > > 
 > > I don't know about the spec, but I can't see allowing ACPI write access
 > > to time/alarm/date registers as being a sensible idea; this could allow
 > > completely out-of-band modification of time/alarm/date regs without the
 > > necessary precautions needed for setting these, namely use of the SET
 > > bit in status reg B (0x0b) to disable updates while setting time & date,
 > > and anyway why allow changing time/date without the OS knowing about it?
 > > Reading should be no problem, though without proper observance of UIP
 > > bit timing, data may not be consistent.
 > 
 > They might need to be read on resume.  I can't see how resume can possibly
 > work without reading them or some clock to reset the time.  But this should
 > be done by calling the standard time reading functions, not at a low level.

Well, we've long - not sure about 'always' - saved current timestamp to 
RTC clocktime (ts_to_ct) on suspend and recalled on resume (ct_to_ts), 
shown when bootverbose is enabled, x86 anyway; this has nothing to do 
with ACPI.

Actually, 9.x and presumably 10.x needs debug.clocktime=1 to see the 
messages re these, previously (eg below from 8.2-R) you see these by 
default every 30 minutes, having booted verbose, which was annoying:

Aug  8 00:41:32 t23 kernel: ts_to_ct(1407458492.000380514) = [2014-08-08 00:41:32]
Aug  8 01:11:31 t23 kernel: ts_to_ct(1407460291.799401540) = [2014-08-08 01:11:31]
Aug  8 01:41:31 t23 kernel: ts_to_ct(1407462091.598502848) = [2014-08-08 01:41:31]
Aug  8 02:11:31 t23 kernel: ts_to_ct(1407463891.397341911) = [2014-08-08 02:11:31]
Aug  8 02:41:31 t23 kernel: ts_to_ct(1407465691.196399111) = [2014-08-08 02:41:31]
Aug  8 03:11:31 t23 kernel: ts_to_ct(1407467490.995395001) = [2014-08-08 03:11:30]
Aug  8 03:41:30 t23 kernel: ts_to_ct(1407469290.794307971) = [2014-08-08 03:41:30]

This reveals some warts; seconds are always rounded down, so that on 
resume you can expect, on average, to lose half a second; worst case 
almost a second.  Thus the first ntpd correction after (even immediate) 
resume is often positive, maybe more than a second, sometimes two or so.

Similarly when restoring time from RTC on resume, there's no synching 
with UIP so, on average, another half second will be dropped before ntpd 
picks up the pieces.  Whether sync is worth the bother or time is moot.

 > > I guess there may be a case for messing with the alarm bytes, though we
 > > don't currently allow use of the alarm for wake from poweroff (S5) or
 > > STR (S3) states, as doth Linux.  I looked into this some years ago when
 > > there were a few requests for the feature, however it would involve some
 > > major reworking to always preserve the alarm interrupt bit through init
 > > and suspend/resume, and appropriate clearing of same after use, along
 > > with a utility to setup the alarm .. a potential to leave open, perhaps?
 > 
 > I use the seconds alarm for pps, but don't use suspend.

Is it accurate enough for pps?  Are you using ntpd to update your RTC?

Ok, and others may want to use alarm or periodic interrupts for whatever 
purpose, possibly at much higher rates than once per second, so there's 
still a case for maintaining the reg select optimisation noted below ..

 > > Which leads to my other concern here: that you are allowing out-of-band
 > > access to especially status reg C (0x0c), which when read clears all
 > > enabled interrupts, and the other status regs whose settings are also
 > > too important to allow screwing with.
 > 
 > Yes, reading it would break interrupt handling.
 > 
 > > We used to use the RTC periodic interrupt as a clock source for a 128Hz
 > > interrupt (1024Hz while profiling) but since mav@'s reworking of all the
 > > clocks and timers sometime prior to 9.0-REL (IIRC), that's now rarely if
 > > ever used as such, but remains an available clock or timer source.
 > > 
 > > Reg A (0x0a) is r/w and sets clock divider and rate selection bits.  Do
 > > we want ACPI to be able to mess with these?  I think not.  Readonly, ok.
 > > 
 > > Reg B (0x0b) is r/w and contains the aforementioned SET bit, the three
 > > interrupt enable bits (PIE, AIE, UIE), the SQWE, DM, 24/12 and DSE bits,
 > > again none of which should be allowed to be written to out-of-band.
 > > 
 > > And reg C (0x0c) is read-only, and as mentioned clears interrupts; again
 > > not something that should be permitted without the OS knowing about it.
 > > 
 > > I suppose you have the MC146818 spec to hand?
 > > 
 > > Couple of things from your patch:
 > > 
 > > -static	int	rtc_reg = -1;
 > > 
 > > Indeed; I never could figure out the advantage in this, as how rarely
 > > would you want to read or write the same reg twice in a row anyway?
 > 
 > About 99.9999% of accesses are to the status register for handling RTC
 > interrupts.  The RTC is almost never used except for the 128Hz interrupt,
 > so the index register is normally constant at the single value used by
 > this interrupt (RTC_INTR).  With mav's changes, it is now rarely used.

Right, I had completely missed its purpose.  So perhaps did Anthony?

 > Removing this is a good re-pessimization.  rtcin() does 4 i/o's with the
 > pessimization.  This takes about 5 usec.  At 128 Hz this only wastes
 > 0.064% of a CPU.  At 1024 Hz it wastes 0.512%.  The non-pessimized
 > version only does 1 i/o so it wastes 1/4 as much.  The wastage should
 > be much larger.  1024 Hz is far too slow for profiling a modern system.
 > It is about right for a 4.77 MHz 8086, except the RTC overhead is too
 > high for that (I used the i8254).  Scaling this for a 2 GHz modern CPU
 > gives 429 kHz (or more, since instructions per cycle is more).  The
 > overhead for that with the pessimization would be 214% of a CPU.
 > Without the pessimization, it would only be 53.7% of a CPU.  Still too
 > much.  The RTC is just unsuitable for profiling interrupts at a useful
 > rate.  The i8254 is better and the lapic timer is much better, but
 > profhz has only been increased to 8192.  Other things don't scale so
 > well.  There are other i/o's for interrupts that make it difficult
 > to go much higher than 429 KHz though 429 Khz is certainly possible
 > using less than 100% of a CPU.

I'm struggling to see why it's a 'good' re-pessimization?  You suggest 
below that adding another one or two I/O delays would be even better?

 > > static	u_char	rtc_statusa = RTCSA_DIVIDER | RTCSA_NOPROF;
 > > static	u_char	rtc_statusb = RTCSB_24HR;
 > > 
 > > +static void acpi_cmos_read(ACPI_PHYSICAL_ADDRESS address, UINT8 *buf,
 > > UINT32 buflen)
 > > {
 > > +	UINT32 offset;
 > 
 > Please no Windows typedefs in non-Windows code.

I don't write in C at all, so I'll let style comments through to the 
keeper .. in this case Anthony :)

 > It is a minor layering violation to have acpi functions in non-acpi drivers.
 > At least use the driver's spellings and not the Windows ones in drivers.

I originally thought the former also, but Anthony has been encouraged by 
someone here to add the ACPI CMOS Region methods here.

Linux has a separate driver for this, acpi_cmos_rtc.c, on top of its 
usual rtc-cmos.c, which is itself way more complex than ours, much of 
that in supporting wake-on-alarm but also supporting 256 bytes of NVRAM, 
which uses a second MC146818-like chip - so still only using 0-0x7f byte 
access on each one.  Linux also just ignores the issue of NMIs; quote:

/* Most newer x86 systems have two register banks, the first used
 * for RTC and NVRAM and the second only for NVRAM.  Caller must
 * own rtc_lock ... and we won't worry about access during NMI.
 */

 > > 	RTC_LOCK;
 > > 
 > > +	for (offset = 0U; offset < buflen; ++offset) {
 > > +		IO_DELAY();
 > > +		outb(IO_RTC_ADDR, ((address + offset) | 0x80) & 0xFF);
 > > +		IO_DELAY();
 > > +		buf[offset] = inb(IO_RTC_DATA);
 > 
 > This should just call rtcin() in a loop instead of breaking it.  Then
 > there would be no need to put this in the driver.

Yes, I think that'd be the right way to go; leave rtcin() and writertc() 
alone, warts and all allowing r/w of locations 0-0xff for its consumers, 
and as you say, calling rtcin() in the loop.  This preserves POLA and 
further would allow moving the ACPI stuff to anywhere? more appropriate, 
I guess ACPICA when that's updated to ACPI 5 or wherever this came in.

 > > Sorry, but I don't get '| 0x80' there, and below in acpi_cmos_write ?
 > 
 > 0x80 disables NMI on some systems (mostly old ones?).  Old books (Van
 > Gilluwe) say to use it.  FreeBSD never bothered with it.  Locking makes
 > it unecessary, modulo likely bugs in the NMI handler.
 > 
 > It is also necessary disable ordinary interrupts and be careful about
 > reenabling NMIs.  FreeBSD's locking disables ordinary interrupts as a
 > side effect of using spinlocks, except in my version where spinlocks
 > don't block fast interrupt handlers.  The remaining care is just to
 > turn NMIs back on in the RTC at the end of the loop before unlocking
 > (normally) reenables ordinary interrupts.  This releases any pending
 > NMIs.

Ok, but if we don't bother disabling NMI, no need to reenable it, right? 
And that bit 0x80 isn't 'in' the RTC, it must be separately decoded on 
the address bus to a latch or such ..

 > Spinlocks don't block NMIs, so the NMI handler must be careful not to
 > block endlessly on a spinlock or call any code that depends on spinlocks
 > for locking, like the code here.  Fast interrupt handlers should be
 > similarly careful, but aren't except in my version.
 > 
 > NMIs are rare except for pmc and SMP stopping, and the NMI handler never
 > bothered about correctness in FreeBSD.  isa_nmi() for x86 normally just
 > prints things (using log()) and returns for the system to panic.
 > Printing is supposed to work when called in the "any" context, but it
 > is quite broken now.  It has locks galore.  When an NMI occurs while
 > one of these locks is held, then the NMI handler deadlocks on the lock
 > if it is a (non-recursive) spinlock held by the same CPU.  If the lock
 > is held by another CPU, then there is some chance that the lock will
 > go away, and the bug is just that the printing is delayed.  If the
 > lock is a recursive spinlock held by the same CPU, then the locking
 > just doesn't work.  The buggy locking in printf() mostly uses non-
 > recursive spinlocks.
 > 
 > The sloppiness in the NMI handler is not much of a problem for the RTC.
 > The NMI handler just doesn't normally go near the RTC code, so it won't
 > deadlock.  Unless there is something like a resettodr() call for shutdown
 > and panic() stumbles into this.

Interesting, but I'm just going to take your word for all of that :)

 > > +int
 > > +rtcin(int reg)
 > > +{
 > > +	u_char val;
 > > +
 > > +	acpi_cmos_read(reg & 0xFF, &val, 1);
 > 
 > rtcin() is pessimized too.
 > 
 > The pessimization factor must also be increased from 4 to 5 or 6 (1
 > more i/o to turn off the NMI disable bit at the end, and probably
 > another to delay after this).

But we really don't want to get into the NMI handling, unless getting 
right into it, yeah?  I suspect right now it'd be added complication.

 > > ...
 > > static int
 > > +is_datetime_reg(ACPI_PHYSICAL_ADDRESS address)
 > > +{
 > > +	return address == 0x00 ||
 > > +		address == 0x02 ||
 > > +		address == 0x04 ||
 > > +		address == 0x04 ||
 > > +		(address >= 0x06 && address <= 0x09);
 > > +}
 > 
 > Lots of style bugs.  The KNF style is:
 > 
 > 	<empty line after null declarations>
 > 	return (address == 0x00 || address == 0x02 || address == 0x04 ||
 > 	    address == 0x04 || (address >= 0x06 && address <= 0x09));
 > 
 > This includes:
 > - silly parentheses around return values
 > - no splitting of statements after every subexpression to make split
 >   statements even more split
 > - continuation indent is 4 spaces.

<Not waving my bat anywhere near any of those balls>

 > > I guess that second 0x04 should be 0x06?  But why are you limiting to
 > > even addresses, ie time regs but not the alarm regs?  If anything, the
 > > alarm regs seem more likely something BIOS/AML may? want to know about?

Actually, I missed the negative connotation of is_datetime_reg; it does 
in fact attempt to allow only access to the alarm regs, though using
multibyte access might defeat that.  But further thoughts on what ACPI 
should or shouldn't have write access to, in a later message to Anthony.

 > > +static ACPI_STATUS
 > > +acpi_rtc_cmos_handler(UINT32 function, ACPI_PHYSICAL_ADDRESS address,
 > > UINT32 width,
 > > +			    UINT64 *value, void *context, void
 > > *region_context)
 > 
 > More style bugs.  Now the line splitting doesn't even keep the line
 > short enough, and the continuation indentation isn't even 1 tab or gnu style
 > (-lp).
 > 
 > > +{
 > > +	struct atrtc_softc *sc;
 > > +
 > > +	sc = (struct atrtc_softc *)context;
 > > +	if (!value || !sc)
 > > +		return AE_BAD_PARAMETER;
 > > +	if (width > 32 || (width & 0x07) || address >= 64U)
 > > +		return AE_BAD_PARAMETER;
 > 
 > Missing some Windows bad style (U suffix on 64 but not on 32).
 > 
 > > 
 > > Ok, address is limited to 0x3f here, though strictly address should be
 > > less than (64 - width>>3) for multibyte access?  I guess non-ACPI access
 > > via rtcin() and writertc() will still be able to access registers, well,
 > > from 0 to 0xff ..
 > > 
 > > +	if (function == ACPI_WRITE &&
 > > +			(is_datetime_reg(address) ||
 > > +				 (width > 8 && address <= 0x09)))
 > > +		return AE_BAD_PARAMETER;
 > 
 > Larger indentation errors than above.  The missing silly parentheses on
 > return values seem to be consistent.
 > 
 > > 
 > > I'd exclude AT LEAST 0x0a - 0x0d for write, certainly 0x0c from read.
 > > 
 > > + printf("%s: %-5s%02u address=%04lx value=%08x\n", + __FUNCTION__,
 > > function == ACPI_READ ? "READ" : "WRITE", + width >> 3, address,
 > > *((UINT32 *)value));
 > > + return AE_OK;
 > > 
 > > The printf is good to see.  Maybe you'd eventually want to limit this to
 > > when verbose booting is on?  It would be useful to see with suspend and
 > > resume messages, or anywhere else ACPI wanted R/W access to 'CMOS' RAM?
 > 
 > It's so ugly it is obviously only for debugging :-).
 > 
 > > I tried hunting down a list of what CMOS locations various systems use,
 > > and find some anecdotal info but nothing like 'manuf X uses bytes Y-Z'.
 > 
 > Why do we have to care about acpi doing unsafe accesses?  Only the
 > block access seems dangerous.  With scattered accesses like
 > rtcin(RTC_FOO) using the system (non-acpi) spelling for 'FOO', it is
 > very easy to see any dangerous ones.

You can't easily 'see' what BIOS AML code is doing without examining it 
specifically, and it's both vendor- and user-updateable.  This is why I 
think it's important to print exactly which parts of RTC/NVRAM are being
accessded, apart from developing some doc on what different vendors do.

 > rtcin() used to be only used for
 > memory sizing and in the fd driver (for dubious disk equipment byte
 > reads; the equipment bytes just tell you about equipment that systems
 > might have had in 1985, so it is almost useless, while other BIOS bytes
 > are even more useless since they are not standard).  It is now used in:
 > - nvram driver.  This seems to allow anyone with access to the device
 >   to read and/or write the offsets 14-127.  So they can manage or
 >   corrupt some BIOS bytes but not the ones used by FreeBSD :-).  acpia
 >   should be even more trustable, but might need read access to the
 >   clock bytes.

I don't think it makes sense.  We already look after writing datetime to 
RTC on suspend, and on ntpd corrections.  We read it back on boot, or 
resume, albeit both ways ignoring synching seconds properly.  Why would 
we want to let AML code use OS ACPI services to write to those fields, 
without having actually requested a service, when the OS thinks it is in 
charge of managing time?  Perhaps I just have a naive view on this?

Other bytes in (so-called) NVRAM that a particular machine wants saved 
to and perhaps later reloaded from there across boot or suspend/resume - 
that this code is designed to deal with in the first place - is all very 
well, and _perhaps_ some BIOS might want to manage updating the alarm 
bytes this way (although normally configured in BIOS SETUP), but any 
potential to mess with date or time itself seems out-of-bounds to me.

Further to /sys/dev/nvram/nvram.c .. it allows access to bytes 0x0e 
(EQUIPMENT) through 0x7f, but also checks then recalculates the checksum 
for bytes 0x10 through 0x2d, stored in 0x2e and 0x2f.  So if ACPI wants 
to write anywhere in that area, it has to update the checksum too.  We 
still have yet to see examples of reads or writes that this code allows.

 > - fb driver.  This reads the video equipment byte.  This is so hackish
 >   that RTC_EQUIPMENT is still not defined in rtc.h but is defined here.
 >   EQUIPMENT is the standard name for this byte (see van Gilluwe), but
 >   it is not the only equipment-related byte.  There is also the diskette
 >   equipment byte.  Many other equipment-related bytes that worked in
 >   1985 are also not defined in rtc.h.  They are also not used in FreeBSD.
 > - acpi_support/icpi_ibm.c uses a large subset of RTC space.  It uses
 >   rtcin(), but has its own defines for 6 RTC registers between 0x64 and
 >   0x6e (for things like brightness and volume).  Obviously very
 >   vendor-specific.

Yes, which is why I went looking for what other vendors want to use, to 
no avail.  None of our other acpi_support modules appear to use NVRAM, 
or at least, via rtcin() and writertc().

 > - i386/xen/clock.c is a copy of the old x86 clock.c.  It duplicates
 >   almost everything including the badly named readrtc() BCD translation
 >   wrapper, but uses the systtem rtcin() and writertc().
 > Write accesses used to be protected by writerc() being static, but this
 > was broken by exporting it for nvram and xen.

At least nothing outside /sys/ accesses rtcin() or writertc().

One other thing; I don't really know where to go to check Linux code; 
google landed me at mit.edu a lot so I gathered bits there (Linux 3.x) 
but couldn't find a modern definition for tHeir CMOS_READ / CMOS_WRITE 
macros.  This looks too old, maybe:

http://members.xoom.virgilio.it/ozma/mulinux/Download/El-Torito_iso/usr.163
from 1993 ..

#define CMOS_READ(addr) ({ \
outb_p((addr),RTC_PORT(0)); \
inb_p(RTC_PORT(1)); \
})
#define CMOS_WRITE(val, addr) ({ \
outb_p((addr),RTC_PORT(0)); \
outb_p((val),RTC_PORT(1)); \
})

.. but if that's all, why are we bothering with any I/O delay at all?  
Or do outb_p() and inb_p() take care of that?  Yes, too many questions!

Thanks for all the fish.

cheers, Ian



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20140808185539.V62404>