Date: Tue, 25 May 2021 14:34:10 +0200
From: Dawid Górecki <dgr@semihalf.com>
To: Cy Schubert <Cy.Schubert@cschubert.com>
Cc: Marcin Wojtas <mw@semihalf.com>, Jessica Clarke <jrtc27@freebsd.org>, shawn.webb@hardenedbsd.org, Marcin Wojtas <mw@freebsd.org>, "src-committers@freebsd.org" <src-committers@freebsd.org>, "dev-commits-src-all@freebsd.org" <dev-commits-src-all@freebsd.org>, "dev-commits-src-main@freebsd.org" <dev-commits-src-main@freebsd.org>
Subject: Re: git: af949c590bd8 - main - Disable stack gap for ntpd during build.
Message-ID: <CAGJeAm4_W70UPonp7MdmhsbX3zPCD4JKZBZknTzES-pp5p7NXg@mail.gmail.com>
In-Reply-To: <202105211753.14LHrpAN004663@slippy.cwsent.com>
References: <202105211334.14LDYqoa004343@gitrepo.freebsd.org> <04F25FD0-7863-4AC1-A257-EF0F1EB90659@freebsd.org> <CAPv3WKeV1Oz8Gbv0LBFD03J6k3k%2B2XMBEvi28DuMM8LVq8cjrQ@mail.gmail.com> <02078965-24BE-4F23-92D5-5E8E54A0C3E7@freebsd.org> <202105211446.14LEk8kZ009266@slippy.cwsent.com> <CAPv3WKe4O--Jne20ozpMfLe3XvyPZXawUx%2BLgvOF8bsDEVsa7g@mail.gmail.com> <202105211753.14LHrpAN004663@slippy.cwsent.com>
Hi Cy,

On Fri, May 21, 2021 at 10:53:51AM -0700, Cy Schubert wrote:
> In message <CAPv3WKe4O--Jne20ozpMfLe3XvyPZXawUx+LgvOF8bsDEVsa7g@mail.gmail.com>,
> Marcin Wojtas writes:
> > Hi Cy,
> >
> > On Fri, 21 May 2021 at 16:46, Cy Schubert <Cy.Schubert@cschubert.com> wrote:
> > >
> > > In message <02078965-24BE-4F23-92D5-5E8E54A0C3E7@freebsd.org>,
> > > Jessica Clarke writes:
> > > > > On 21 May 2021, at 15:11, Marcin Wojtas <mw@semihalf.com> wrote:
> > > > >
> > > > > Hi Jess
> > > > >
> > > > > On Fri, 21 May 2021 at 15:39, Jessica Clarke <jrtc27@freebsd.org> wrote:
> > > > >>
> > > > >> On 21 May 2021, at 14:34, Marcin Wojtas <mw@FreeBSD.org> wrote:
> > > > >>>
> > > > >>> The branch main has been updated by mw:
> > > > >>>
> > > > >>> URL: https://cgit.FreeBSD.org/src/commit/?id=af949c590bd8a00a5973b5875d7e0fa6832ea64a
> > > > >>>
> > > > >>> commit af949c590bd8a00a5973b5875d7e0fa6832ea64a
> > > > >>> Author: Marcin Wojtas <mw@FreeBSD.org>
> > > > >>> AuthorDate: 2021-05-21 09:29:22 +0000
> > > > >>> Commit: Marcin Wojtas <mw@FreeBSD.org>
> > > > >>> CommitDate: 2021-05-21 13:33:06 +0000
> > > > >>>
> > > > >>> Disable stack gap for ntpd during build.
> > > > >>>
> > > > >>> When starting, ntpd calls setrlimit(2) to limit maximum size of its
> > > > >>> stack. The stack limit chosen by ntpd is 200K, so when stack gap
> > > > >>> is enabled, the stack gap is larger than this limit, which results
> > > > >>> in ntpd crashing.
> > > > >>
> > > > >> Isn’t the bug that the unusable gap counts as usage?
> > > > >>
> > > > >> Jess
> > > > >>
> > > > >
> > > > > An alternative solution was submitted
> > > > > (https://reviews.freebsd.org/D29832), so that to extend the limit for
> > > > > ntpd, but eventually it was recommended to simple disable the stack
> > > > > gap for it until it's fixed upstream (see the last comment in the
> > > > > linked revision).
> > > >
> > > > That’s my point, there is nothing to “fix” upstream. NTPD uses less than
> > > > 200K of stack, thus it is perfectly reasonable for it to set its limit to
> > > > that. The fact that FreeBSD decides to count an arbitrary,
> > > > non-deterministic amount of additional unusable virtual address space
> > > > towards that limit is not its fault, but a bug in FreeBSD that needs to be
> > > > fixed as it’s entirely unreasonable for applications to have to account
> > > > for that.
> > >
> > > This latest problem is not stack gap. It is PIE.
> > >
> >
> > I have to disagree.
>
> We are talking cross purposes. Your examples later on in your email prove
> my point.
>
> > ntpd does not start because of stack gap, not PIE, even though it may
> > seem like PIE causes this. This is due to the fact that stack gap is
> > disabled if PIE is disabled. Because of that value of sysctl
> > kern.elf64.aslr.stack_gap does not matter when kern.elf64.aslr.pie_enable
> > is set to 0. When pie_enabled is set to 1 and stack gap is enabled, then
> > ntpd fails to start, but when pie_enabled is set to 1 and stack_gap
> > is set to 0, then ntpd starts without any issue. We verified this on
> > FreeBSD-CURRENT snapshot from 2021-05-20.
>
> I verified the PIE problem on a -CURRENT as of my comments in the review.
> Enabling stack gap and disabling PIE resolved the issue. The reason for
> stack gap is not a problem is that ntpd disables stack gap at line 441 of
> ntpd.c.
>
> Furthermore enabling stack gap and disabling PIE circumvents the problem. I
> tested this myself and left that note in the review.
>
> Enable stack gap and disable PIE: It works. But look at line 441 of ntpd.c
> to see stack gap disabled before ntpd forks itself.

The issue is caused by the stack gap, not by PIE, although it can look as if
the pie_enable sysctl causes it.
The ASLR stack gap is only created for a binary of type ET_DYN if
kern.elf.aslr.pie_enable is set to 1. For a binary of type ET_EXEC,
kern.elf.aslr.enable has to be set to 1 instead. Otherwise the value of
kern.elf.aslr.stack_gap is ignored and the system behaves as if it were set
to 0. The code governing this behavior can be found in sys/kern/imgact_elf.c,
lines 1175-1196 and 1228-1232, and in sys/kern/kern_exec.c, lines 1547-1557.

About procctl: FreeBSD in fact has two different stack gaps. One is located
at the bottom of the stack; the other has a random size and is located at
the top of the stack. The second gap is related to ASLR, while the first
exists to prevent a stack overflow from overwriting nearby mappings. procctl
only affects the first gap; the second one, which is causing the segfault,
is not affected by procctl.

>
> >
> > The fact that this is a stack gap issue can be verified using following
> > procedure:
> > 1. Install FreeBSD-CURRENT snapshot from 2021-05-20 using default
> > configuration.
> > 2. On a newly installed system start ntpd. With default configuration
> > it should start successfully.
> > 3. Set sysctl kern.elf64.aslr.pie_enable=1 and start ntpd. This time ntpd
> > should fail. An entry indicating that ntpd was killed because of signal
> > 11 should be visible in /var/log/messages.
> > 4. Set sysctl kern.elf64.aslr.stack_gap=0 and start ntpd once again. This
> > time ntpd should start even though pie_enable is set to 1.
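
To make the rule above easier to check against the sysctl listings in this thread, here is a minimal shell sketch of it. It only models the description given in this message: the function name gap_effective and its argument order are made up for illustration, it does not read any live sysctl values, and it ignores procctl and any per-binary ELF feature flags.

```shell
#!/bin/sh
# Hypothetical helper: models the rule described in this message only.
# It is not kernel code and does not query the running system.
# Arguments: ELF type (ET_DYN or ET_EXEC), then the values of
# aslr.enable, aslr.pie_enable and aslr.stack_gap.
gap_effective() {
    elf_type=$1
    aslr_enable=$2
    pie_enable=$3
    stack_gap=$4

    # Pick the sysctl that gates ASLR for this kind of binary.
    case "$elf_type" in
        ET_DYN)  aslr_on=$pie_enable ;;
        ET_EXEC) aslr_on=$aslr_enable ;;
        *) echo "unknown"; return 1 ;;
    esac

    # stack_gap only takes effect when ASLR is on for the binary.
    if [ "$aslr_on" -eq 1 ] && [ "$stack_gap" -gt 0 ]; then
        echo "active"
    else
        echo "ignored"
    fi
}

gap_effective ET_DYN 0 1 3    # pie_enable=1, stack_gap=3 (the failing test case)
gap_effective ET_DYN 0 1 0    # pie_enable=1, stack_gap=0 (ntpd starts)
gap_effective ET_DYN 1 0 3    # enable=1, pie_enable=0 with a PIE binary
gap_effective ET_EXEC 1 0 3   # enable=1 with a non-PIE binary
```

Feeding it the values from a `sysctl kern.elf64.aslr` listing, together with the type of the ntpd binary, shows whether a non-zero stack_gap is expected to take effect at all.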
> >
> > Exact log from the boot it was tested:
> > root@freebsd-ntpd-test:~ # sysctl -a | grep aslr
> > kern.elf32.aslr.stack_gap: 3
> > kern.elf32.aslr.honor_sbrk: 1
> > kern.elf32.aslr.pie_enable: 0
> > kern.elf32.aslr.enable: 0
> > kern.elf64.aslr.stack_gap: 3
> > kern.elf64.aslr.honor_sbrk: 1
> > kern.elf64.aslr.pie_enable: 0
> > kern.elf64.aslr.enable: 0
> > vm.aslr_restarts: 0
> > root@freebsd-ntpd-test:~ # ntpd
> > root@freebsd-ntpd-test:~ # ps aux | grep ntpd
> > root 826 0.0 0.2 22060 6960 - Ss 17:38 0:00.01 ntpd
> > root 828 0.0 0.1 12976 2416 0 S+ 17:38 0:00.00 grep ntpd
> > root@freebsd-ntpd-test:~ # killall ntpd
> > root@freebsd-ntpd-test:~ # ps aux | grep ntpd
> > root 831 0.0 0.1 12976 2416 0 S+ 17:38 0:00.00 grep ntpd
> > root@freebsd-ntpd-test:~ # sysctl kern.elf64.aslr.pie_enable=1
> > kern.elf64.aslr.pie_enable: 0 -> 1
>
> This causes the problem.

Yes, this seems to cause the problem. However, what really happens is that
enabling pie_enable also enables the stack gap. While pie_enable was set to
0, stack_gap was ignored.

>
> > root@freebsd-ntpd-test:~ # ntpd
> > root@freebsd-ntpd-test:~ # ps aux | grep ntpd
> > root 836 0.0 0.1 14128 2452 0 S+ 17:39 0:00.00 grep ntpd
> > root@freebsd-ntpd-test:~ # cat /var/log/messages | tail
> > May 21 17:38:25 freebsd-ntpd-test ntpd[826]: ntpd exiting on signal 15 (Terminated)
> > May 21 17:39:14 freebsd-ntpd-test ntpd[833]: ntpd 4.2.8p15-a (1): Starting
> > May 21 17:39:14 freebsd-ntpd-test ntpd[833]: Command line: ntpd
> > May 21 17:39:14 freebsd-ntpd-test ntpd[833]: ----------------------------------------------------
> > May 21 17:39:14 freebsd-ntpd-test ntpd[833]: ntp-4 is maintained by Network Time Foundation,
> > May 21 17:39:14 freebsd-ntpd-test ntpd[833]: Inc. (NTF), a non-profit 501(c)(3) public-benefit
> > May 21 17:39:14 freebsd-ntpd-test ntpd[833]: corporation. Support and training for ntp-4 are
> > May 21 17:39:14 freebsd-ntpd-test ntpd[833]: available at https://www.nwtime.org/support
> > May 21 17:39:14 freebsd-ntpd-test ntpd[833]: ----------------------------------------------------
> > May 21 17:39:14 freebsd-ntpd-test kernel: pid 834 (ntpd), jid 0, uid 0: exited on signal 11 (core dumped)

This happened with kern.elf64.aslr.pie_enable=1 and
kern.elf64.aslr.stack_gap=3.

> > root@freebsd-ntpd-test:~ # sysctl kern.elf64.aslr.stack_gap=0
> > kern.elf64.aslr.stack_gap: 3 -> 0
> > root@freebsd-ntpd-test:~ # sysctl -a | grep aslr
> > kern.elf32.aslr.stack_gap: 3
> > kern.elf32.aslr.honor_sbrk: 1
> > kern.elf32.aslr.pie_enable: 0
> > kern.elf32.aslr.enable: 0
> > kern.elf64.aslr.stack_gap: 0
> > kern.elf64.aslr.honor_sbrk: 1
> > kern.elf64.aslr.pie_enable: 1
>
> This is the problem.

At this point the stack gap was disabled while pie_enable was still set to 1.

>
> > kern.elf64.aslr.enable: 0
> > vm.aslr_restarts: 1
> > root@freebsd-ntpd-test:~ # ntpd
> > root@freebsd-ntpd-test:~ # ps aux | grep ntpd
> > root 845 0.0 0.2 22060 6924 - Ss 17:40 0:00.01 ntpd
> > root 847 0.0 0.1 12976 2440 0 S+ 17:40 0:00.00 grep ntpd

Here the ntpd daemon started with pie_enable set to 1 and stack_gap set to 0.
No segfault.

> > root@freebsd-ntpd-test:~ # cat /var/log/messages | tail
> > May 21 17:39:14 freebsd-ntpd-test kernel: pid 834 (ntpd), jid 0, uid 0: exited on signal 11 (core dumped)
> > May 21 17:40:52 freebsd-ntpd-test ntpd[844]: ntpd 4.2.8p15-a (1): Starting
> > May 21 17:40:52 freebsd-ntpd-test ntpd[844]: Command line: ntpd
> > May 21 17:40:52 freebsd-ntpd-test ntpd[844]: ----------------------------------------------------
> > May 21 17:40:52 freebsd-ntpd-test ntpd[844]: ntp-4 is maintained by Network Time Foundation,
> > May 21 17:40:52 freebsd-ntpd-test ntpd[844]: Inc. (NTF), a non-profit 501(c)(3) public-benefit
> > May 21 17:40:52 freebsd-ntpd-test ntpd[844]: corporation. Support and training for ntp-4 are
> > May 21 17:40:52 freebsd-ntpd-test ntpd[844]: available at https://www.nwtime.org/support
> > May 21 17:40:52 freebsd-ntpd-test ntpd[844]: ----------------------------------------------------
> > May 21 17:40:52 freebsd-ntpd-test ntpd[845]: leapsecond file ('/var/db/ntpd.leap-seconds.list'): stat failed: No such file or directory
> > root@freebsd-ntpd-test:~ # killall ntpd
> >
> > Best regards,
> > Marcin
>
> Running on my firewall, which has had this same ASLR configuration for
> about a year.
>
> cwfw# sysctl kern.elf64.aslr
> kern.elf64.aslr.stack_gap: 3
> kern.elf64.aslr.honor_sbrk: 1
> kern.elf64.aslr.pie_enable: 0
> kern.elf64.aslr.enable: 1
> cwfw# ps auxww | grep ntpd
> ntpd 1499 0.0 0.1 22044 5776 - Ss 09:30 0:00.28 /usr/sbin/ntpd -p /var/db/ntp/ntpd.pid -c /etc/ntp.conf -f /var/db/ntp/ntpd.drift -g
> root 3032 0.0 0.0 13044 2384 0 S+ 10:49 0:00.00 grep ntpd
> cwfw# uptime
> 10:49AM up 1:20, 1 user, load averages: 1.06, 1.02, 0.97
> cwfw# uname -a
> FreeBSD cwfw 14.0-CURRENT FreeBSD 14.0-CURRENT #151 komquats-n246804-af949c590bd8-dirty: Fri May 21 07:09:32 PDT 2021 root@cwsys:/export/obj/opt/src/git-src/amd64.amd64/sys/PROD2 amd64
> cwfw#
>
> My laptop:
>
> slippy# sysctl kern.elf64.aslr
> kern.elf64.aslr.stack_gap: 3
> kern.elf64.aslr.honor_sbrk: 1
> kern.elf64.aslr.pie_enable: 0
> kern.elf64.aslr.enable: 1
> slippy# ps auxww | grep ntpd
> ntpd 2100 0.0 0.1 22036 8600 - Ss 09:35 0:00.27 /usr/sbin/ntpd -p /var/db/ntp/ntpd.pid -c /etc/ntp.conf -f /var/db/ntp/ntpd.drift -g
> root 4632 0.0 0.0 13040 2724 1 S+ 10:51 0:00.00 grep ntpd
> slippy# uptime
> 10:51AM up 1:17, 0 users, load averages: 0.11, 0.16, 0.16
> slippy# uname -a
> FreeBSD slippy 14.0-CURRENT FreeBSD 14.0-CURRENT #155 komquats-n246804-af949c590bd8-dirty: Fri May 21 07:07:22 PDT 2021 root@cwsys:/export/obj/opt/src/git-src/amd64.amd64/sys/BREAK amd64
> slippy#
>
> One of my poudriere machines:
>
> cwsys# sysctl kern.elf64.aslr
> kern.elf64.aslr.stack_gap: 3
> kern.elf64.aslr.honor_sbrk: 1
> kern.elf64.aslr.pie_enable: 0
> kern.elf64.aslr.enable: 1
> cwsys# ps auxww | grep ntpd
> ntpd 4039 0.0 0.1 22040 7340 - Ss 09:34 0:00.46 /usr/sbin/ntpd -p /var/db/ntp/ntpd.pid -c /etc/ntp.conf -f /var/db/ntp/ntpd.drift -g
> root 6385 0.0 0.0 13044 2712 2 S+ 10:52 0:00.01 grep ntpd
> cwsys# uptime
> 10:52AM up 1:19, 2 users, load averages: 0.26, 0.25, 0.24
> cwsys# uname -a
> FreeBSD cwsys 14.0-CURRENT FreeBSD 14.0-CURRENT #155 komquats-n246804-af949c590bd8-dirty: Fri May 21 07:07:22 PDT 2021 root@cwsys:/export/obj/opt/src/git-src/amd64.amd64/sys/BREAK amd64
> cwsys#
>
> Three examples of stack gap enabled and PIE disabled. When I enable PIE,
> ntpd fails.

If the binaries are of type ET_DYN, the stack gap is actually disabled in
this configuration. Since this is 14.0-CURRENT, that is most likely the
case, unless PIE was explicitly disabled with the WITHOUT_PIE build option.

>
>
> --
> Cheers,
> Cy Schubert <Cy.Schubert@cschubert.com>
> FreeBSD UNIX: <cy@FreeBSD.org> Web: https://FreeBSD.org
> NTP: <cy@nwtime.org> Web: https://nwtime.org
>
> The need of the many outweighs the greed of the few.
>
>

You can also see that ntpd segfaults on 13.0-RELEASE, but there
kern.elf64.aslr.enable has to be set to 1 rather than
kern.elf64.aslr.pie_enable. This is because 13 is built with the WITHOUT_PIE
option, so the executables are of type ET_EXEC. Again, setting stack_gap to 0
there makes the problem disappear.

Setting kern.elf64.aslr.stack_gap to a value greater than 0 does not
necessarily mean that the stack gap is actually enabled, so the whole
situation can be confusing.

Best regards,
Dawid