From: Max Laier
Organization: FreeBSD
To: freebsd-arch@freebsd.org
Cc: Poul-Henning Kamp
Date: Mon, 12 Dec 2005 16:43:33 +0100
Subject: Re: printf behaviour with illegal or malformed format string

On Monday 12 December 2005 13:14, Poul-Henning Kamp wrote:
> Obligatory bikeshed avoidance notice:
>
>    >> Please read all the way to the bottom of this email before you reply <<
>
> Given that illegal
> or malformed format strings to the printf family
> of functions mostly result in harmless misformatting but occasionally
> in coredumps and maybe sometimes in security issues, what is the
> desired behaviour of libc's printf implementation?
>
> A very good example is
>
>	printf("%hf", 1.0);
>
> The 'h' modifier is not legal for %f format, and it is therefore a good
> bet that the programmer was confused and we know that the program
> contains at least one error.
>
>
> Our first line of defence against this kind of error is compile-time
> checking by GCC, but we cannot rely on the string being sane in libc,
> we still need to do error checking.
>
> The context for the above is that I'm working on adding extensibility
> to our printf, compatible with the GLIBC (see 12.13 in the glibc
> manual).  Obviously, gcc cannot compile-time check such extensions
> for us, and therefore the question gains a bit more relevance.
>
> In an ideal world, the printf family of functions would have been
> defined to return EINVAL in this case.  Almost nobody checks the
> return values of printf-like functions however and those few that
> do all presume that it is an I/O error, so such an approach is
> unlikely to gain us much if anything.
>
> Another alternative is to spit out the format string unformatted,
> possibly with an attached notice, but this doesn't really seem to
> help anybody either, but at least indicates what the problem is.
>
>
> I'm leaning towards doing what phkmalloc has migrated to over time:
> Make a variable which can select between "normal/paranoia" and force
> it to paranoia for (uid==0 || gid==0 || setuid || setgid).
>
> If the variable is set, a bogus format string will result in abort(3).
>
> If it is not set, the format string will be output unformatted in
> the message "WARNING: Illegal printf() format string: \"...\"".

I agree in principle but would like to ask if we need to revisit some of the
error cases.
Especially with regard to 64bit porting there are some
"artifacts" that might cause serious pain for ported applications if the
above is adopted.

Specifically, right now the following will warn "long long int format, int64_t
arg (arg 2)" on our 64bit architectures, while it is required on (at least)
i386:

	int64_t i = 1;
	printf("%lld", i);

Many other platforms allow it for 64bit architectures as well.  Since on all
our 64bit architectures sizeof(long) == sizeof(long long) (as far as I am
aware), I am not convinced this should be a (fatal) error.  There might be
other similar cases.

So the question is, how strict should this check be?  Are there cases where we
are better off with a "just do it" solution?

As a community service, there is a right way to do this (according to C99):

	int64_t i = 1;
	printf("%" PRIi64 "\n", i);

but it's obvious this is not going to be adopted.  The other often used
workaround is:

	int64_t i = 1;
	printf("%jd\n", (intmax_t)i);

or:

	printf("%lld\n", (long long)i);

which kind of defeats the idea behind using C99 types.  Note that:

	printf("%jd\n", i);

seems to work as well, but I am not sure this is correct.

-- 
/"\  Best regards,                      | mlaier@freebsd.org
\ /  Max Laier                          | ICQ #67774661
 X   http://pf4freebsd.love2party.net/  | mlaier@EFnet
/ \  ASCII Ribbon Campaign              | Against HTML Mail and News
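
P.S. For reference, the three variants discussed above (the <inttypes.h>
macro, the intmax_t cast and the long long cast) can be put side by side in a
small sketch.  The function names fmt_pri, fmt_jd, fmt_lld and
all_formats_agree are made up for illustration and are not part of libc; the
only assumption is a C99 <inttypes.h>.  As far as I can tell, passing an
int64_t to %jd without the cast only works where int64_t and intmax_t happen
to have the same representation, so the sketch keeps the cast.

```c
#include <inttypes.h>	/* int64_t, intmax_t, PRIi64 */
#include <stdio.h>	/* snprintf */
#include <string.h>	/* strcmp */

/* Hypothetical helper: format v with the C99 PRIi64 macro, no cast needed. */
int
fmt_pri(char *buf, size_t len, int64_t v)
{
	return (snprintf(buf, len, "%" PRIi64, v));
}

/* Hypothetical helper: %jd, with a cast so the argument really is intmax_t. */
int
fmt_jd(char *buf, size_t len, int64_t v)
{
	return (snprintf(buf, len, "%jd", (intmax_t)v));
}

/* Hypothetical helper: %lld, with a cast so the argument really is long long. */
int
fmt_lld(char *buf, size_t len, int64_t v)
{
	return (snprintf(buf, len, "%lld", (long long)v));
}

/* Return 1 if all three variants produce the same text for v, else 0. */
int
all_formats_agree(int64_t v)
{
	char a[32], b[32], c[32];

	fmt_pri(a, sizeof(a), v);
	fmt_jd(b, sizeof(b), v);
	fmt_lld(c, sizeof(c), v);
	return (strcmp(a, b) == 0 && strcmp(b, c) == 0);
}
```

Calling all_formats_agree(INT64_MIN) and all_formats_agree(INT64_MAX) is a
quick way to check that a given platform treats the three spellings alike.
None of them should trip a strict format check, since in each case the
argument type matches the conversion exactly.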