From: John-Mark Gurney
Date: Thu, 31 Jul 2014 14:09:37 -0700
To: Phil Shafer
Subject: Re: XML Output: libxo - provide single API to output TXT, XML, JSON and HTML
Message-ID: <20140731210937.GV43962@funkthat.com>
In-Reply-To: <201407311839.s6VIdlMK096434@idle.juniper.net>
References: <20140731175547.GO43962@funkthat.com> <201407311839.s6VIdlMK096434@idle.juniper.net>
Cc: sjg@freebsd.org, arch@freebsd.org, marcel@freebsd.org

Phil Shafer wrote this message on Thu, Jul 31, 2014 at 14:39 -0400:
> John-Mark Gurney writes:
> >Return an error?  printf can return an error, yet most people don't
> >check it..  so no real difference in API/bugs...
>
> My concern is emitting half a string, where the half we don't emit
> is something important.  I don't want to make the opposite of an
> injection attack, where arranging some daemon to call xo_emit with
> a broken UTF-8 string allows an evil-doer to fix their evil content
> into the other half of the string.
>
> I'm escaping XML, JSON, and HTML content already, so the simplest
> scheme is to:
>
> a) UTF-8 check the format string;
>    if it fails, nothing is emitted
> b) for each format descriptor, check the content generated;
>    if it fails, nothing is emitted from the xo_emit call;
>    anything already generated is discarded
>
> Simple and easy.  Seem reasonable?  The other option would be to
> discard only that specific format descriptor or only that field
> description.
>
>     xo_emit("{:good/%d}{:bad/%d%s}{:ugly}", 0, 55, "\xff\x01\xff", "cat");
>
> Does the "cat" get emitted?  Is "55" emitted?
>
> If "ugly" was phil, and the bogus
> string blocked the generation of that vital bit of info, life could
> be bad.

I agree...

> Unfortunately, even this isn't a simple fix for "w", which wants
> to call wcsftime() to get wide values for month and day-of-the-week
> names.  Does wcsrtombs() convert this to UTF-8?  Is there a locale
> for UTF-8?

Well, from my understanding there can't be a "locale" that is UTF-8,
as a locale contains more than just character encoding...  It also
includes month/day names, sorting, etc...

I think you can get a C locale (the default) w/ UTF-8 by setting the
correct environment variables, but I don't know them well enough to
say...  Should we add a locale that does this?  There is UTF-8 in
/usr/share/locale, but if you set LANG to it, things don't work..

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."
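
For reference, a rough sketch in C of the all-or-nothing check from the
quoted (a)/(b) scheme.  This is not libxo code: utf8_valid() and
emit_checked() are hypothetical names made up for illustration, the
format string is plain printf style rather than libxo field syntax, and
the generated output is checked once as a whole instead of once per
format descriptor.  The point is only that nothing reaches the caller's
buffer unless every byte passed the UTF-8 check.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: return 1 if the n bytes at s are well-formed UTF-8. */
int
utf8_valid(const unsigned char *s, size_t n)
{
    size_t i = 0, k, len;
    unsigned long cp, min;

    while (i < n) {
        unsigned char c = s[i];

        if (c < 0x80) {                 /* plain ASCII */
            i++;
            continue;
        } else if ((c & 0xe0) == 0xc0) {
            len = 2; cp = c & 0x1f; min = 0x80;
        } else if ((c & 0xf0) == 0xe0) {
            len = 3; cp = c & 0x0f; min = 0x800;
        } else if ((c & 0xf8) == 0xf0) {
            len = 4; cp = c & 0x07; min = 0x10000;
        } else {
            return 0;                   /* stray continuation or bad lead byte */
        }

        if (n - i < len)
            return 0;                   /* sequence truncated */
        for (k = 1; k < len; k++) {
            if ((s[i + k] & 0xc0) != 0x80)
                return 0;               /* not a continuation byte */
            cp = (cp << 6) | (s[i + k] & 0x3f);
        }
        /* Reject overlong forms, surrogates, and values past U+10FFFF. */
        if (cp < min || cp > 0x10ffff || (cp >= 0xd800 && cp <= 0xdfff))
            return 0;
        i += len;
    }
    return 1;
}

/*
 * Hypothetical all-or-nothing emitter: format into a scratch buffer and
 * copy to out only if both the format string and the generated text are
 * valid UTF-8.  Returns the length written, or -1 with out untouched.
 */
int
emit_checked(char *out, size_t outsz, const char *fmt, ...)
{
    char tmp[256];
    va_list ap;
    int len;

    assert(out != NULL && fmt != NULL);

    /* (a) check the format string itself */
    if (!utf8_valid((const unsigned char *)fmt, strlen(fmt)))
        return -1;

    va_start(ap, fmt);
    len = vsnprintf(tmp, sizeof(tmp), fmt, ap);
    va_end(ap);

    if (len < 0 || (size_t)len >= sizeof(tmp) || (size_t)len >= outsz)
        return -1;                      /* formatting error or no room */

    /* (b) check what was generated; on failure everything is discarded */
    if (!utf8_valid((const unsigned char *)tmp, (size_t)len))
        return -1;

    memcpy(out, tmp, (size_t)len + 1);  /* include the terminating NUL */
    return len;
}
```

With the arguments from the quoted xo_emit() example, a call such as
emit_checked(buf, sizeof(buf), "%d %d%s %s", 0, 55, "\xff\x01\xff", "cat")
returns -1 and leaves buf alone, so neither "55" nor "cat" comes out.
Whether dropping the good fields along with the bad one is the right
policy is exactly the open question in the thread.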