From owner-svn-src-all@freebsd.org Sun Jun 7 18:58:58 2020
In-Reply-To: <202006070432.0574Wc1L063319@repo.freebsd.org>
References: <202006070432.0574Wc1L063319@repo.freebsd.org>
From: Oliver Pinter
Date: Sun, 7 Jun 2020 20:58:55 +0200
Subject: Re: svn commit: r361884 - in head/usr.bin/sed: . tests
To: Kyle Evans
Cc: "src-committers@freebsd.org", "svn-src-all@freebsd.org", "svn-src-head@freebsd.org"

On Sunday, June 7, 2020, Kyle Evans wrote:

> Author: kevans
> Date: Sun Jun 7 04:32:38 2020
> New Revision: 361884
> URL: https://svnweb.freebsd.org/changeset/base/361884
>
> Log:
>   sed: attempt to learn about hex escapes (e.g. \x27)
>
>   Somewhat predictably, software often wants to use \x27/\x24 among others so
>   that they can decline worrying about ugly escaping, if said escaping is even
>   possible.
>   Right now, this software is using these and getting the wrong
>   results, as we'll interpret those as x27 and x24 respectively. Some examples
>   of this, when an exp-run was ran, were science/octopus and misc/vifm.
>
>   Go ahead and process these at all times. We allow either one or two digits,
>   and the tests account for both. If extra digits are specified, e.g. \x2727,
>   then the third and fourth digits are interpreted literally as one might
>   expect.
>
>   PR: 229925
>   MFC after: 2 weeks

Could you please add an entry for this to the release notes? :)

>
> Modified:
>   head/usr.bin/sed/compile.c
>   head/usr.bin/sed/tests/sed2_test.sh
>
> Modified: head/usr.bin/sed/compile.c
> ==============================================================================
> --- head/usr.bin/sed/compile.c	Sun Jun 7 03:11:34 2020	(r361883)
> +++ head/usr.bin/sed/compile.c	Sun Jun 7 04:32:38 2020	(r361884)
> @@ -49,6 +49,7 @@ static const char sccsid[] = "@(#)compile.c 8.1 (Berke
>  #include
>  #include
>  #include
> +#include
>  #include
>  #include
>  #include
> @@ -365,6 +366,51 @@ nonsel: /* Now parse the command */
>  	}
>  }
>
> +static int
> +hex2char(const char *in, char *out, int len)
> +{
> +	long ord;
> +	char *endptr, hexbuf[3];
> +
> +	hexbuf[0] = in[0];
> +	hexbuf[1] = len > 1 ? in[1] : '\0';
> +	hexbuf[2] = '\0';
> +
> +	errno = 0;
> +	ord = strtol(hexbuf, &endptr, 16);
> +	if (*endptr != '\0' || errno != 0)
> +		return (ERANGE);
> +	*out = (char)ord;
> +	return (0);
> +}
> +
> +static bool
> +hexdigit(char c)
> +{
> +	int lc;
> +
> +	lc = tolower(c);
> +	return isdigit(lc) || (lc >= 'a' && lc <= 'f');
> +}
> +
> +static bool
> +dohex(const char *in, char *out, int *len)
> +{
> +	int tmplen;
> +
> +	if (!hexdigit(in[0]))
> +		return (false);
> +	tmplen = 1;
> +	if (hexdigit(in[1]))
> +		++tmplen;
> +	if (hex2char(in, out, tmplen) == 0) {
> +		*len = tmplen;
> +		return (true);
> +	}
> +
> +	return (false);
> +}
> +
>  /*
>   * Get a delimited string.
>   * P points to the delimiter of the string; d points
>   * to a buffer area. Newline and delimiter escapes are processed; other
> @@ -377,6 +423,7 @@ nonsel: /* Now parse the command */
>  static char *
>  compile_delimited(char *p, char *d, int is_tr)
>  {
> +	int hexlen;
>  	char c;
>
>  	c = *p++;
> @@ -412,6 +459,12 @@ compile_delimited(char *p, char *d, int is_tr)
>  				}
>  				p += 2;
>  				continue;
> +			} else if (*p == '\\' && p[1] == 'x') {
> +				if (dohex(&p[2], d, &hexlen)) {
> +					++d;
> +					p += hexlen + 2;
> +					continue;
> +				}
>  			} else if (*p == '\\' && p[1] == '\\') {
>  				if (is_tr)
>  					p++;
> @@ -431,7 +484,7 @@ compile_delimited(char *p, char *d, int is_tr)
>  static char *
>  compile_ccl(char **sp, char *t)
>  {
> -	int c, d;
> +	int c, d, hexlen;
>  	char *s = *sp;
>
>  	*t++ = *s++;
> @@ -459,6 +512,10 @@ compile_ccl(char **sp, char *t)
>  				*t = '\t';
>  				s++;
>  				break;
> +			case 'x':
> +				if (dohex(&s[2], t, &hexlen))
> +					s += hexlen + 1;
> +				break;
>  			}
>  		}
>  	}
> @@ -499,7 +556,7 @@ static char *
>  compile_subst(char *p, struct s_subst *s)
>  {
>  	static char lbuf[_POSIX2_LINE_MAX + 1];
> -	int asize, size;
> +	int asize, hexlen, size;
>  	u_char ref;
>  	char c, *text, *op, *sp;
>  	int more = 1, sawesc = 0;
> @@ -562,6 +619,21 @@ compile_subst(char *p, struct s_subst *s)
>  						break;
>  					case 't':
>  						*p = '\t';
> +						break;
> +					case 'x':
> +#define ADVANCE_N(s, n)	\
> +	do {	\
> +		char *adv = (s);	\
> +		while (*(adv + (n) - 1) != '\0') {	\
> +			*adv = *(adv + (n));	\
> +			++adv;	\
> +		}	\
> +		*adv = '\0';	\
> +	} while (0);
> +						if (dohex(&p[1], p, &hexlen)) {
> +							ADVANCE_N(p + 1,
> +							    hexlen);
> +						}
>  						break;
>  					}
>  				}
>
> Modified: head/usr.bin/sed/tests/sed2_test.sh
> ==============================================================================
> --- head/usr.bin/sed/tests/sed2_test.sh	Sun Jun 7 03:11:34 2020	(r361883)
> +++ head/usr.bin/sed/tests/sed2_test.sh	Sun Jun 7 04:32:38 2020	(r361884)
> @@ -88,10 +88,39 @@ escape_subst_body()
>  	atf_check -o 'inline:abcx\n' sed 's/[ \r\t]//g' c
>  }
>
> +atf_test_case hex_subst
> +hex_subst_head()
> +{
> +	atf_set "descr" "Verify proper conversion of hex escapes"
> +}
> +hex_subst_body()
> +{
> +	printf "test='foo'" > a
> +	printf "test='27foo'" > b
> +	printf "\rn" > c
> +	printf "xx" > d
> +
> +	atf_check -o 'inline:test="foo"' sed 's/\x27/"/g' a
> +	atf_check -o "inline:'test'='foo'" sed 's/test/\x27test\x27/g' a
> +
> +	# Make sure we take trailing digits literally.
> +	atf_check -o "inline:test=\"foo'" sed 's/\x2727/"/g' b
> +
> +	# Single digit \x should work as well.
> +	atf_check -o "inline:xn" sed 's/\xd/x/' c
> +
> +	# Invalid digit should cause us to ignore the sequence. This test
> +	# invokes UB, escapes of an ordinary character. A future change will
> +	# make regex(3) on longer tolerate this and we'll need to adjust what
> +	# we're doing, but for now this will suffice.
> +	atf_check -o "inline:" sed 's/\xx//' d
> +}
> +
>  atf_init_test_cases()
>  {
>  	atf_add_test_case inplace_command_q
>  	atf_add_test_case inplace_hardlink_src
>  	atf_add_test_case inplace_symlink_src
>  	atf_add_test_case escape_subst
> +	atf_add_test_case hex_subst
>  }
> _______________________________________________
> svn-src-head@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/svn-src-head
> To unsubscribe, send any mail to "svn-src-head-unsubscribe@freebsd.org"
>
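
For a release-notes entry it may help to state the rule from the commit log on its own: \x takes one or two hex digits, any further digits are literal, and a \x followed by a non-hex character is left alone. The following is only a minimal standalone sketch of that rule, not the code from compile.c; hexval() and decode_hex_escapes() are hypothetical names invented for the illustration, and the sketch assumes the caller supplies a destination buffer at least as large as the source string.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: value of one hex digit, or -1 if c is not one. */
static int
hexval(char c)
{
	unsigned char uc = (unsigned char)c;

	if (isdigit(uc))
		return (uc - '0');
	if (isxdigit(uc))
		return (tolower(uc) - 'a' + 10);
	return (-1);
}

/*
 * Hypothetical helper: copy src to dst, replacing each \xH or \xHH with the
 * byte it names. Anything else, including \x followed by a non-hex
 * character, is copied unchanged. dst must have room for strlen(src) + 1
 * bytes. Returns the number of bytes written, not counting the final NUL.
 */
static size_t
decode_hex_escapes(const char *src, char *dst)
{
	size_t n = 0;

	assert(src != NULL && dst != NULL);
	while (*src != '\0') {
		int hi, lo;

		if (src[0] == '\\' && src[1] == 'x' &&
		    (hi = hexval(src[2])) >= 0) {
			src += 3;		/* skip \, x and first digit */
			lo = hexval(*src);	/* optional second digit */
			if (lo >= 0) {
				hi = hi * 16 + lo;
				src++;
			}
			dst[n++] = (char)hi;
			continue;
		}
		dst[n++] = *src++;
	}
	dst[n] = '\0';
	return (n);
}
```

With this sketch, the input \x2727 decodes to a single quote followed by the literal characters 27, \xd decodes to a carriage return, and \xx is copied through unchanged, which mirrors the cases the new hex_subst test exercises. Note that \x00 would store a NUL byte, so the result should then be treated as a counted buffer rather than a C string.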