Date: Wed, 8 May 2019 01:45:42 +1000 (EST) From: Bruce Evans <brde@optusnet.com.au> To: John Baldwin <jhb@freebsd.org> Cc: Mark Johnston <markj@freebsd.org>, src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: Re: svn commit: r347063 - head/sys/kern Message-ID: <20190508001838.B1127@besplex.bde.org> In-Reply-To: <52484f6b-fdae-565b-6c03-37a63d56ad30@FreeBSD.org> References: <201905032126.x43LQilu092655@repo.freebsd.org> <335d828e-ac61-bc59-bac3-f80f27b951c7@FreeBSD.org> <20190506184502.GA35464@raichu> <52484f6b-fdae-565b-6c03-37a63d56ad30@FreeBSD.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Mon, 6 May 2019, John Baldwin wrote: > On 5/6/19 11:45 AM, Mark Johnston wrote: >> On Mon, May 06, 2019 at 11:07:18AM -0700, John Baldwin wrote: >>> On 5/3/19 2:26 PM, Mark Johnston wrote: >>>> Author: markj >>>> Date: Fri May 3 21:26:44 2019 >>>> New Revision: 347063 >>>> URL: https://svnweb.freebsd.org/changeset/base/347063 >>>> >>>> Log: >>>> Disallow excessively small times of day in clock_settime(2). This actually disallows negative timespecs in clock_settime(), with collateral disallowment for small positive times. I don't like disallowing either. The bounds checking in the clock code is badly implemented, so it crashes on such times even if the clock hardware supports them, but clock_settime() is privileged and root never makes mistakes. This breaks: - testing of setting negative times - my enclosed test to demonstrate bugs in the bounds checking. >>>> >>>> Reported by: syzkaller >>>> Reviewed by: cem, kib >>>> MFC after: 1 week >>>> Sponsored by: The FreeBSD Foundation >>>> Differential Revision: https://reviews.freebsd.org/D20151 >>>> >>>> Modified: >>>> head/sys/kern/kern_time.c >>>> >>>> Modified: head/sys/kern/kern_time.c >>>> ============================================================================== >>>> --- head/sys/kern/kern_time.c Fri May 3 21:13:09 2019 (r347062) >>>> +++ head/sys/kern/kern_time.c Fri May 3 21:26:44 2019 (r347063) >>>> @@ -412,7 +412,9 @@ kern_clock_settime(struct thread *td, clockid_t clock_ >>>> if (ats->tv_nsec < 0 || ats->tv_nsec >= 1000000000 || >>>> ats->tv_sec < 0) >>>> return (EINVAL); >>>> - if (!allow_insane_settime && ats->tv_sec > 8000ULL * 365 * 24 * 60 * 60) >>>> + if (!allow_insane_settime && >>>> + (ats->tv_sec > 8000ULL * 365 * 24 * 60 * 60 || >>>> + ats->tv_sec < utc_offset())) >>>> return (EINVAL); The test of the upper bound is broken. It not only uses the long long abomination, but uses large unsigned type which gives unsigned poisoning for all supported sizes of time_t. So negative times were already broken here (without allow_insane_settime). They are converted to large positive times so are treated as insanely large. utc_offset() uses a too-small signed type with no bounds checking so it has overflow bugs but doesn't give unsigned poisoning. time_t is signed on all supported arches. It could be uint32_t so as to work until 2106 on 32-bit arches, but this would give unsigned poisoning and there is too much broken code for changing it to that to just work. So the new code doesn't have unsigned poisoning, and it disallows most negative values even when utc_offset() is negative. >>>> /* XXX Don't convert nsec->usec and back */ >>>> TIMESPEC_TO_TIMEVAL(&atv, ats); >>> >>> Pardon my ignorance, but I can't see why you are checking against utc_offset() >>> vs some small constant? None of the discussion in the review mentioned the >>> reason for using this particular value, and I didn't see any comparisons >>> against utc_offset or kernadjtz in kern_clock_setttime() or settime() that >>> would have underflowed or panicked. Can you give a bit more detail on why >>> utc_offset() is the lower bound? Thanks. >> >> I chose it because we subtract utc_offset() from the time passed in to >> clock_settime(); see settime_task_func(). That subtraction caused the >> underflow that later caused the observed panics. Not underflow, but overflow. Underflow is when rounding a small nonzero floating point value gives a denormal value of 0. This was the main overflow bug demonstrated by my test program. > Ok, thanks. A few things I didn't see anyone else note in the review then: > > 1) This subtraction is actually not done for all rtc drivers, so it seems > like we might block small times for RTC clocks that set > CLOCKF_GETTIME_NO_ADJ. It is probably bogus to check it for them. The check is misplaced for them -- this level can't know the details. > 2) utc_offset can be negative for machines using local time in timezones > "before" UTC. This is why it is less bad to check utc_offset() than 0. Otherwise, it would be impossible to test setting the time to the Epoch in these timezones. Hmm, the offset is confusing, especially its sign. In these timezones, when the time is the Epoch, the local time is before the Epoch so the RTC hardware needs to be able to represent year 1969 even if utc_offset() is sane. But some clock drivers convery 1969 to 2069, and the sanity check in clock_ct_to_ts() is insane -- it doesn't allow years before 1970, although signed time_t has no problems before year 1901. So testing of setting the time to the Epoch in such timezones would show that even if the RTC can be set to 1969, the kernel time can't be restored to the Epoch from the RTC. I don't know of any panics from buggy bounds checks for this. The worst that can happen is that adjkerntz is preposterous. Then utc_offset() is preposterous and adding it can change a 32-bit time_t to anything and a 64-bit time_t by a lot > I suppose we don't think any FreeBSD machines actually need to set the > running clock to 0 anyway so fixing it here rather than rejecting invalid > values only for RTCs that can't handle it is probably ok, but the > connection doesn't feel obvious that we are rejecting times that might > be non-representable in RTCs. We do some up-front checks because the layering for passing up details for individual RTCs would be onerous. I only like the up-front check for the year being <= 2037 even with 64-bit time_t, and some for hours/minutes/seconds/etc being in range. libkern still has horrible inline functions for bcd conversions, with errors mishandled by KASSERT()s, because passing up the errors would be to hard. But the KASSERT()s are worse. I think they are (mostly?) unnecessary now, since most bcd conversions are in clock_bcd_to_ts() which does correct bounds checking resulting in error EINVAL. Move all the bcd conversions to there after the validbcd() checks and it would be obvious that the KASSERT()s are not needed. Here is my test program. It was last edited on Dec 14 1901 (sic), after using itself to set the time back to then. I didn't report the bug then, but reported it a year or 2 ago. XX #include <sys/time.h> XX #include <limits.h> XX XX int XX main(void) XX { XX struct timeval tv; XX struct timezone tz; XX int r; XX XX r = gettimeofday(&tv, &tz); XX if (r != 0) XX err(1, "gettimeofday"); XX tv.tv_usec = 0; XX #ifdef NEG Compile with -DNEG to get the panic in old versions. XX tv.tv_sec = 0; XX /* XX * The RTC time wants to be tv_sec + adjkerntz - tz_minuteswest. XX * Make this go negative (and panic) by putting a positive value XX * in tz_minuteswest large enough to kill any non-perverse value XX * in adjkerntz. We can get the panic (and lots of races) more XX * easily using sysctl to put perverse values in adjkerntz before XX * calling settimeofday. XX */ XX tz.tz_minuteswest = 24 * 60; /* 1 day before 1970 */ Someone axed tz, so before this commit, to get the panic it was necessary to set adjkerntz using its sysctl instead of setting tz using settimeofday(). After this commit, I think the panic is "fixed". Set adjkerntz to test the fix. XX #elif YETANOTHERBUG XX tv.tv_sec = INT_MAX - 2; /* works */ XX tv.tv_sec = INT_MAX - 1; /* should give 2038; actually 1901 */ XX tv.tv_sec = INT_MAX; /* should give 2038; actually 1901 */ XX tz.tz_minuteswest = 0; XX #else XX tv.tv_sec = INT_MAX - 2; XX tz.tz_minuteswest = -24 * 60; /* 1 day after end of 32-bit time_t */ XX #endif XX r = settimeofday(&tv, &tz); Everything needs to use adjkerntz instead of tz. tz.tz_minuteswest gave extra possibilities for overflow, but the above only sets it to +- 1 day so give a small nonzero combined setting after adding any sane value adjkerntz, to avoid touching adjkerntz directly. Without tz, just use adjkerntz instead. XX if (r != 0) XX err(1, "1st settimeofday"); XX /* XX * One of many bugs in settimeofday() is that its tz_minuteswest XX * setting doesn't apply until the next call. Make this call. XX */ XX r = settimeofday(&tv, &tz); Another bug that was fixed by removing tz. XX if (r != 0) XX err(1, "2nd settimeofday"); XX return (0); XX } Insane values in adjkerntz can be used to overflow to negative years: set the RTC to year 2037, or just leave it as 2019 and use a larger adjkerntz. Then adding 1 year in adjkerntz gives 2038 which overflows to ~1901. Or start near midnight in 2037; then even a sane positive utc_offset() gives year 2038, but doesn't overflow 32-bit time_t since the overflow occurs in the middle of the year. This overflow is in the misnamed function subr_rtc.c:read_clocks(). It blindly adds the unchecked utc_offset() to the checked CLOCK_GETTIME() timespec. Root never makes mistakes, but can probably demonstrate panics here even more easily that using clock_settime(), since sysctl(8) is much easier to use than clock_settime(2). Bruce
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20190508001838.B1127>