Date: Sat, 28 Nov 2009 16:59:54 GMT
From: "D'Arcy Cain" <darcy@NetBSD.org>
To: freebsd-gnats-submit@FreeBSD.org
Subject: bin/140976: comm(1) mishandles lines with tabs
Message-ID: <200911281659.nASGxsoS077804@www.freebsd.org>
Resent-Message-ID: <200911281700.nASH03P7076024@freefall.freebsd.org>

>Number: 140976
>Category: bin
>Synopsis: comm(1) mishandles lines with tabs
>Confidential: no
>Severity: non-critical
>Priority: medium
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Sat Nov 28 17:00:03 UTC 2009
>Closed-Date:
>Last-Modified:
>Originator: D'Arcy Cain
>Release: 7.2-RELEASE
>Organization: NetBSD developer
>Environment:
FreeBSD shell.vex.net 7.2-RELEASE FreeBSD 7.2-RELEASE #0: Fri May 1 08:49:13 UTC 2009 root@walker.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC i386
>Description:
If an input file contains tabs, comm(1) may not handle it correctly. In fact, the problem occurs with any character that compares lower than newline.
>How-To-Repeat:
Run this script. The two tests should print the same thing.

#! /bin/sh

TEST=/tmp/$$.test
trap "rm -f $TEST.*" 0

cat << MOUSE > $TEST.1a
a b c d e f e f g h i
MOUSE

cat << MOUSE > $TEST.1b
a b e f g
MOUSE

tr ' ' '\t' < $TEST.1a > $TEST.2a
tr ' ' '\t' < $TEST.1b > $TEST.2b

echo "Test 1 (spaces) output:"
comm -12 $TEST.1a $TEST.1b
echo ""
echo "Test 2 (tabs) output:"
comm -12 $TEST.2a $TEST.2b
>Fix:
http://cvsweb.netbsd.org/bsdweb.cgi/src/usr.bin/comm/comm.c.diff?r1=1.17&r2=1.18&only_with_tag=MAIN&f=h is how I fixed it on NetBSD, but you have a much different version of comm.c.

The basic fix is to not read the newline. The newline is the separator between lines, not part of the line, and keeping it means it is erroneously included in the comparisons. sort(1) gets this right, and that is where the problem shows up: comm(1) does not agree with the sorting criteria that sort(1) uses.

In NetBSD current there is a library function called getline which more or less does what the getline included in comm.c does, except that it doesn't return the newline. Perhaps you should pull that in and use it instead. Don't forget to change your printf statements to add the newline.
>Release-Note:
>Audit-Trail:
>Unformatted:
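
To make the comparison problem concrete, here is a minimal C sketch. It is not the code from either NetBSD's or FreeBSD's comm.c; the function names cmp_with_newline and cmp_without_newline are hypothetical, and it assumes plain bytewise (C locale, ASCII) comparison. It shows that when the trailing newline takes part in the comparison, a line that continues with a tab (byte 9) sorts before a shorter line that ends there (newline is byte 10), while a comparison that stops at the newline orders them the other way round.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: compare two newline-terminated lines with the
 * newline included in the comparison, as the report describes.
 */
int
cmp_with_newline(const char *a, const char *b)
{
	assert(a != NULL && b != NULL);
	return strcmp(a, b);
}

/*
 * Hypothetical helper: compare only the bytes before the first newline
 * (or before the NUL if there is none).  A line that is a proper prefix
 * of the other compares lower.
 */
int
cmp_without_newline(const char *a, const char *b)
{
	size_t la, lb, n;
	int c;

	assert(a != NULL && b != NULL);
	la = strcspn(a, "\n");	/* length of a without its terminator */
	lb = strcspn(b, "\n");	/* length of b without its terminator */
	n = la < lb ? la : lb;

	c = memcmp(a, b, n);
	if (c != 0)
		return c;
	return (la > lb) - (la < lb);
}
```

With these helpers, cmp_with_newline("a\tb\n", "a\n") is negative, but cmp_without_newline("a\tb\n", "a\n") is positive. For the space-separated pair "a b\n" and "a\n" both functions return a positive value, since a space (byte 32) compares higher than newline, which would explain why only the tab test misbehaves.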