From owner-freebsd-current@FreeBSD.ORG Tue Aug 26 04:49:24 2014 Return-Path: Delivered-To: freebsd-current@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 95A6E7B7; Tue, 26 Aug 2014 04:49:24 +0000 (UTC) Received: from kefka.worrbase.com (unknown [IPv6:2620:8d:8000:e49:f435:9b0c:255b:e411]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 32A01351F; Tue, 26 Aug 2014 04:49:24 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]); by kefka.worrbase.com (OpenSMTPD) with ESMTP id 49d76b64; Mon, 25 Aug 2014 21:49:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=worrbase.com; h= content-transfer-encoding:content-type:content-type:in-reply-to :references:subject:subject:mime-version:user-agent:from:from :date:date:message-id:received:received; s=mail; t=1409028554; x=1410842955; bh=/4W5Ebf31tZy/x8OfApTYkgfriBM6WqCM5qVlHBY/FI=; b= BV6IvN1r35VEwDGfYneIu1anDYgUkpPEsFNSGqrKUz7lyfSmT3PE1FxbgWUv6s57 Ns1u0F3z0ispGvWlEBMY9TN4cn7SsJvOuw7uAmEsDYp6o6xAORbupQTfoHCan9Nd N3aYxyvFTxyU+YsVh52ESUmaEkQgZh2cC4Qqg/ycA4jij6K783nTnHgt3rO9Z0OA KjyTGf0kLkDmaqBthKefLfF9PNbUkY6yLUYJyNYqT1Qn1TWAH+NIsVDbwZB1rjL1 1nw8zrt3D4mTkl2LSw59sc0OwGukFiPVkoidLCb83aer++HIDyfYvElHxoygszaE bH8f3nt+OJDFNBkROm+QNqWmZw9OQZjix75viiQdHRVJX4KGGuTRcJeEIjBUhOWR BJZ7zjCNO1lCsCxBxp4MpwmE/OWy7+aYN13SeyNGfEo7BUPQT+0/+XqGJOlnKfmR cVac70V9Em5i9c2oou+IwrvKCAV/cpCJz+/7VtnbR23Kup3BNqUqTdb6izYpOHLL OHC46lPtAq7Ier7e82xjLXSBQmzptwVraMiMnjM3RNuodAYi3qnZXhts4mviSteS ptdXUJwkwSae6qrwAdfSSvTPVTIdSk0SdI/Xc+pESpOiT4OZVP4LLVlAcmyY70ib pNZ593RJrT2w+fC7KMIyqxnfgZWEVEI727RiDXKWhtM= X-Virus-Scanned: amavisd-new at worrbase.com Received: from kefka.worrbase.com ([IPv6:::1]) by localhost (kefka.worrbase.com [IPv6:::1]) (amavisd-new, port 10026) with ESMTP id drtLKmETIofO; Mon, 25 Aug 2014 21:49:14 -0700 (PDT) Received: from [10.0.1.3] (c-50-161-97-206.hsd1.ca.comcast.net [50.161.97.206]); by kefka.worrbase.com (OpenSMTPD) with ESMTPSA id 1e53ee64; TLS version=TLSv1/SSLv3 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NO; Mon, 25 Aug 2014 21:49:14 -0700 (PDT) Message-ID: <53FC11C8.5050803@worrbase.com> Date: Mon, 25 Aug 2014 21:49:12 -0700 From: William Orr User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:31.0) Gecko/20100101 Thunderbird/31.0 MIME-Version: 1.0 To: Alan Somers , FreeBSD CURRENT Subject: Re: Inconsistent behavior with dd(1) References: <3C86F281-D618-4B93-BBA3-2DA33AC407EC@worrbase.com> <20140816173439.GZ83475@funkthat.com> <53F24D60.4080107@worrbase.com> In-Reply-To: <53F24D60.4080107@worrbase.com> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 26 Aug 2014 04:49:24 -0000 On 8/18/14 12:00 PM, William Orr wrote: > Reply inline. > > On 08/16/2014 10:34 AM, John-Mark Gurney wrote: >> Alan Somers wrote this message on Fri, Aug 15, 2014 at 10:42 -0600: >>> On Thu, Aug 14, 2014 at 11:55 PM, William Orr wrote: >>>> Hey, >>>> >>>> I found some inconsistent behavior with dd(1) when it comes to specifying arguments in -CURRENT. >>>> >>>> [ worr on terra ] ( ~ ) % dd if=/dev/zero of=/dev/null count=18446744073709551616 >>>> dd: count: Result too large >>>> [ worr on terra ] ( ~ ) % dd if=/dev/zero of=/dev/null count=18446744073709551617 >>>> dd: count: Result too large >>>> [ worr on terra ] ( ~ ) % dd if=/dev/zero of=/dev/null count=18446744073709551615 >>>> dd: count cannot be negative >>>> [ worr on terra ] ( ~ ) % dd if=/dev/zero of=/dev/null count=-18446744073709551615 >>>> 1+0 records in >>>> 1+0 records out >>>> 512 bytes transferred in 0.000373 secs (1373071 bytes/sec) >>>> [ worr on terra ] ( ~ ) % dd if=/dev/zero of=/dev/null count=-1 >>>> dd: count cannot be negative >>>> >>>> ??? >>>> >>>> Any chance someone has the time and could take a look? https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=191263 >>>> >>>> Thanks, >>>> William Orr >>>> >>>> ??? >>> >>> IMHO, this is a bug in strtouq(3), not in dd(1). Why should it parse >>> negative numbers at all, when there is stroq(3) for that purpose? The >>> standard is clear that it must, though. Oddly enough, stroq would >>> probably not accept -18446744073709551615, even though strtouq does. >>> Specific comments on your patch below: >>> >>> >>>> Here???s the patch: >>>> >>>> Index: bin/dd/args.c >>>> =================================================================== >>>> --- bin/dd/args.c (revision 267712) >>>> +++ bin/dd/args.c (working copy) >>>> @@ -186,46 +186,31 @@ >>>> static void >>>> f_bs(char *arg) >>>> { >>>> - uintmax_t res; >>>> - >>>> - res = get_num(arg); >>>> - if (res < 1 || res > SSIZE_MAX) >>>> - errx(1, "bs must be between 1 and %jd", (intmax_t)SSIZE_MAX); >>>> - in.dbsz = out.dbsz = (size_t)res; >>>> + in.dbsz = out.dbsz = get_num(arg); >>>> + if (in.dbsz < 1 || out.dbsz < 1) >>> Why do you need to check both in and out? Aren't they the same? >>> Also, you eliminated the check for overflowing SSIZE_MAX. That's not >>> ok, because these values get passed to places that expect signed >>> numbers, for example in dd.c:303. >> The type of dbsz is size_t, so really: >> >>>> + errx(1, "bs must be between 1 and %ju", (uintmax_t)-1); >> This should be SIZE_MAX, except there isn't a define for this? So maybe >> the code really should be: >> (uintmax_t)(size_t)-1 >> >> to get the correct value for SIZE_MAX... >> >> Otherwise on systems that uintmax_t is >32bits and size_t is 32bits, >> the error message will be wrong... > Yes, this should probably be SIZE_MAX rather than that cast. Same with > the others > >>>> } >>>> >>>> static void >>>> f_cbs(char *arg) >>>> { >>>> - uintmax_t res; >>>> - >>>> - res = get_num(arg); >>>> - if (res < 1 || res > SSIZE_MAX) >>>> - errx(1, "cbs must be between 1 and %jd", (intmax_t)SSIZE_MAX); >>>> - cbsz = (size_t)res; >>>> + cbsz = get_num(arg); >>>> + if (cbsz < 1) >>>> + errx(1, "cbs must be between 1 and %ju", (uintmax_t)-1); >>>> } >>> Again, you eliminated the check for SSIZE_MAX, but cbsz must be signed. >> What do you mean by this? cbsz is size_t which is unsigned... > I believe he's referring to this use of cbsz/in.dbsz/out.dbsz: > > https://svnweb.freebsd.org/base/head/bin/dd/dd.c?revision=265698&view=markup#l171 > > Really, this is more wrong since there is math inside of a malloc(3) > call without any overflow handling. By virtue of making this max out at > a ssize_t, it becomes more unlikely that you'll have overflow. > > This math should probably be done ahead of time with proper overflow > handling. I'll include that in my next patch, if there's no objection. > > I don't see any other reason why in.dbsz, out.dbsz or cbsz should be > signed, but it's very possible that I didn't look hard enough. > >> Again, the cast above is wrong... Maybe we should add a SIZE_MAX >> define so we don't have to see the double cast... >> >>>> static void >>>> f_count(char *arg) >>>> { >>>> - intmax_t res; >>>> - >>>> - res = (intmax_t)get_num(arg); >>>> - if (res < 0) >>>> - errx(1, "count cannot be negative"); >>>> - if (res == 0) >>>> - cpy_cnt = (uintmax_t)-1; >>> This is a special case. See dd_in(). I think that eliminating this >>> special case will have the unintended effect of breaking count=0. >>> >>>> - else >>>> - cpy_cnt = (uintmax_t)res; >>>> + cpy_cnt = get_num(arg); >>>> } >>>> >>>> static void >>>> f_files(char *arg) >>>> { >>>> - >> Don't eliminate these blank lines.. they are intentional per style(9): >> /* Insert an empty line if the function has no local variables. */ >> >>>> files_cnt = get_num(arg); >>>> if (files_cnt < 1) >>>> - errx(1, "files must be between 1 and %jd", (uintmax_t)-1); >>>> + errx(1, "files must be between 1 and %ju", (uintmax_t)-1); >>> Good catch. >>> >>>> } >>>> >>>> static void >>>> @@ -241,14 +226,10 @@ >>>> static void >>>> f_ibs(char *arg) >>>> { >>>> - uintmax_t res; >>>> - >>>> if (!(ddflags & C_BS)) { >>>> - res = get_num(arg); >>>> - if (res < 1 || res > SSIZE_MAX) >>>> - errx(1, "ibs must be between 1 and %jd", >>>> - (intmax_t)SSIZE_MAX); >>>> - in.dbsz = (size_t)res; >>>> + in.dbsz = get_num(arg); >>>> + if (in.dbsz < 1) >>>> + errx(1, "ibs must be between 1 and %ju", (uintmax_t)-1); >>> Again, you eliminated the check for SSIZE_MAX, but dbsz must be signed. >> If dbsz must be signed, we should change it's definition to ssize_t >> instead of size_t... Can you point to the line that says this? >> >> In investigating this, it looks like we may have a bug in ftruncate in >> that out.offset * out.dbsz may overflow and return incorrect results... >> We should probably check that the output (cast to off_t) is greater than >> both the inputs before calling ftruncate... This is safe as both are >> unsigned... > Yeah, there probably ought to be integer overflow handling here as well. > >>>> } >>>> } >>>> >>>> @@ -262,14 +243,10 @@ >>>> static void >>>> f_obs(char *arg) >>>> { >>>> - uintmax_t res; >>>> - >>>> if (!(ddflags & C_BS)) { >>>> - res = get_num(arg); >>>> - if (res < 1 || res > SSIZE_MAX) >>>> - errx(1, "obs must be between 1 and %jd", >>>> - (intmax_t)SSIZE_MAX); >>>> - out.dbsz = (size_t)res; >>>> + out.dbsz = get_num(arg); >>>> + if (out.dbsz < 1) >>>> + errx(1, "obs must be between 1 and %ju", (uintmax_t)-1); >>>> } >>>> } >>> Again, you eliminated the check for SSIZE_MAX, but dbsz must be signed. >>> >>>> @@ -378,11 +355,14 @@ >>>> uintmax_t num, mult, prevnum; >>>> char *expr; >>>> >>>> + if (val[0] == '-') >>>> + errx(1, "%s: cannot be negative", oper); >>>> + >>> In general, I like this part of the diff. Every user of get_num >>> checks for negative values, so why not move the check into get_num >>> itself? But you changed it from a numeric check to a text check, and >>> writing text parsers makes me nervous. I can't see any problems, >>> though. > Funnily enough this part of the diff was wrong. I didn't account for > spaces, so I'll add that in my upcoming diff. > >>>> errno = 0; >>>> num = strtouq(val, &expr, 0); >>>> if (errno != 0) /* Overflow or underflow. */ >>>> err(1, "%s", oper); >>>> - >>>> + >>>> if (expr == val) /* No valid digits. */ >>>> errx(1, "%s: illegal numeric value", oper); >>>> >>>> Index: bin/dd/dd.c >>>> =================================================================== >>>> --- bin/dd/dd.c (revision 267712) >>>> +++ bin/dd/dd.c (working copy) >>>> @@ -284,8 +284,6 @@ >>>> >>>> for (;;) { >>>> switch (cpy_cnt) { >>>> - case -1: /* count=0 was specified */ >>>> - return; >>> Again, I don't think this will do what you want it to do. Previously, >>> leaving count unspecified resulted in cpy_cnt being 0, and specifying >>> count=0 set cpy_cnt to -1. With your patch, setting count=0 will have >>> the same effect as leaving it unspecified. > Nope. It didn't do what I wanted. I'll submit an updated diff with this > fixed as well as the other things I mentioned, provided there's no > objection to my direction. > >>>> case 0: >>>> break; >>>> default: >>>> >>>> Here is a new version of this patch. I've now changed the members of the struct that are used as ssize_ts to ssize_ts, and have fixed casts appropriately. I have also opted for strtoull over the deprecated strtouq. I changed some struct members that were declared as uintmax_t's to size_t's. Any comments? Criticisms? Gross attacks on my character? Index: args.c =================================================================== --- args.c (revision 270645) +++ args.c (working copy) @@ -41,6 +41,7 @@ #include +#include #include #include #include @@ -171,8 +172,7 @@ */ if (in.offset > OFF_MAX / (ssize_t)in.dbsz || out.offset > OFF_MAX / (ssize_t)out.dbsz) - errx(1, "seek offsets cannot be larger than %jd", - (intmax_t)OFF_MAX); + errx(1, "seek offsets cannot be larger than %jd", OFF_MAX); } static int @@ -186,37 +186,28 @@ static void f_bs(char *arg) { - uintmax_t res; - res = get_num(arg); - if (res < 1 || res > SSIZE_MAX) - errx(1, "bs must be between 1 and %jd", (intmax_t)SSIZE_MAX); - in.dbsz = out.dbsz = (size_t)res; + in.dbsz = out.dbsz = get_num(arg); + if (out.dbsz < 1 || out.dbsz > SSIZE_MAX) + errx(1, "bs must be between 1 and %jd", SSIZE_MAX); } static void f_cbs(char *arg) { - uintmax_t res; - res = get_num(arg); - if (res < 1 || res > SSIZE_MAX) - errx(1, "cbs must be between 1 and %jd", (intmax_t)SSIZE_MAX); - cbsz = (size_t)res; + cbsz = get_num(arg); + if (cbsz < 1 || cbsz > SSIZE_MAX) + errx(1, "cbs must be between 1 and %jd", SSIZE_MAX); } static void f_count(char *arg) { - intmax_t res; - res = (intmax_t)get_num(arg); - if (res < 0) - errx(1, "count cannot be negative"); - if (res == 0) - cpy_cnt = (uintmax_t)-1; - else - cpy_cnt = (uintmax_t)res; + cpy_cnt = get_num(arg); + if (cpy_cnt == 0) + cpy_cnt = -1; } static void @@ -225,7 +216,7 @@ files_cnt = get_num(arg); if (files_cnt < 1) - errx(1, "files must be between 1 and %jd", (uintmax_t)-1); + errx(1, "files must be between 1 and %ju", SIZE_MAX); } static void @@ -241,14 +232,11 @@ static void f_ibs(char *arg) { - uintmax_t res; if (!(ddflags & C_BS)) { - res = get_num(arg); - if (res < 1 || res > SSIZE_MAX) - errx(1, "ibs must be between 1 and %jd", - (intmax_t)SSIZE_MAX); - in.dbsz = (size_t)res; + in.dbsz = get_num(arg); + if (in.dbsz < 1 || in.dbsz > SSIZE_MAX) + errx(1, "ibs must be between 1 and %ju", SSIZE_MAX); } } @@ -262,14 +250,11 @@ static void f_obs(char *arg) { - uintmax_t res; if (!(ddflags & C_BS)) { - res = get_num(arg); - if (res < 1 || res > SSIZE_MAX) - errx(1, "obs must be between 1 and %jd", - (intmax_t)SSIZE_MAX); - out.dbsz = (size_t)res; + out.dbsz = get_num(arg); + if (out.dbsz < 1 || out.dbsz > SSIZE_MAX) + errx(1, "obs must be between 1 and %jd", SSIZE_MAX); } } @@ -378,11 +363,17 @@ uintmax_t num, mult, prevnum; char *expr; + while (isspace(val[0])) + val++; + + if (val[0] == '-') + errx(1, "%s: cannot be negative", oper); + errno = 0; - num = strtouq(val, &expr, 0); + num = strtoull(val, &expr, 0); if (errno != 0) /* Overflow or underflow. */ err(1, "%s", oper); - + if (expr == val) /* No valid digits. */ errx(1, "%s: illegal numeric value", oper); Index: conv.c =================================================================== --- conv.c (revision 270645) +++ conv.c (working copy) @@ -133,7 +133,7 @@ */ ch = 0; for (inp = in.dbp - in.dbcnt, outp = out.dbp; in.dbcnt;) { - maxlen = MIN(cbsz, in.dbcnt); + maxlen = MIN(cbsz, (size_t)in.dbcnt); if ((t = ctab) != NULL) for (cnt = 0; cnt < maxlen && (ch = *inp++) != '\n'; ++cnt) @@ -146,7 +146,7 @@ * Check for short record without a newline. Reassemble the * input block. */ - if (ch != '\n' && in.dbcnt < cbsz) { + if (ch != '\n' && (size_t)in.dbcnt < cbsz) { (void)memmove(in.db, in.dbp - in.dbcnt, in.dbcnt); break; } @@ -228,7 +228,7 @@ * translation has to already be done or we might not recognize the * spaces. */ - for (inp = in.db; in.dbcnt >= cbsz; inp += cbsz, in.dbcnt -= cbsz) { + for (inp = in.db; (size_t)in.dbcnt >= cbsz; inp += cbsz, in.dbcnt -= cbsz) { for (t = inp + cbsz - 1; t >= inp && *t == ' '; --t) ; if (t >= inp) { Index: dd.c =================================================================== --- dd.c (revision 270645) +++ dd.c (working copy) @@ -168,10 +168,10 @@ * record oriented I/O, only need a single buffer. */ if (!(ddflags & (C_BLOCK | C_UNBLOCK))) { - if ((in.db = malloc(out.dbsz + in.dbsz - 1)) == NULL) + if ((in.db = malloc((size_t)out.dbsz + in.dbsz - 1)) == NULL) err(1, "input buffer"); out.db = in.db; - } else if ((in.db = malloc(MAX(in.dbsz, cbsz) + cbsz)) == NULL || + } else if ((in.db = malloc(MAX((size_t)in.dbsz, cbsz) + cbsz)) == NULL || (out.db = malloc(out.dbsz + cbsz)) == NULL) err(1, "output buffer"); @@ -343,7 +343,7 @@ ++st.in_full; /* Handle full input blocks. */ - } else if ((size_t)n == in.dbsz) { + } else if ((size_t)n == (size_t)in.dbsz) { in.dbcnt += in.dbrcnt = n; ++st.in_full; @@ -493,7 +493,7 @@ outp += nw; st.bytes += nw; - if ((size_t)nw == n && n == out.dbsz) + if ((size_t)nw == n && n == (size_t)out.dbsz) ++st.out_full; else ++st.out_part; Index: dd.h =================================================================== --- dd.h (revision 270645) +++ dd.h (working copy) @@ -38,10 +38,9 @@ typedef struct { u_char *db; /* buffer address */ u_char *dbp; /* current buffer I/O address */ - /* XXX ssize_t? */ - size_t dbcnt; /* current buffer byte count */ - size_t dbrcnt; /* last read byte count */ - size_t dbsz; /* block size */ + ssize_t dbcnt; /* current buffer byte count */ + ssize_t dbrcnt; /* last read byte count */ + ssize_t dbsz; /* block size */ #define ISCHR 0x01 /* character device (warn on short) */ #define ISPIPE 0x02 /* pipe-like (see position.c) */ @@ -57,13 +56,13 @@ } IO; typedef struct { - uintmax_t in_full; /* # of full input blocks */ - uintmax_t in_part; /* # of partial input blocks */ - uintmax_t out_full; /* # of full output blocks */ - uintmax_t out_part; /* # of partial output blocks */ - uintmax_t trunc; /* # of truncated records */ - uintmax_t swab; /* # of odd-length swab blocks */ - uintmax_t bytes; /* # of bytes written */ + size_t in_full; /* # of full input blocks */ + size_t in_part; /* # of partial input blocks */ + size_t out_full; /* # of full output blocks */ + size_t out_part; /* # of partial output blocks */ + size_t trunc; /* # of truncated records */ + size_t swab; /* # of odd-length swab blocks */ + size_t bytes; /* # of bytes written */ struct timespec start; /* start time of dd */ } STAT; Index: position.c =================================================================== --- position.c (revision 270645) +++ position.c (working copy) @@ -178,7 +178,7 @@ n = write(out.fd, out.db, out.dbsz); if (n == -1) err(1, "%s", out.name); - if ((size_t)n != out.dbsz) + if (n != out.dbsz) errx(1, "%s: write failure", out.name); } break;