Date:      Sat, 9 Feb 2002 21:46:00 -0500
From:      Garance A Drosihn <drosih@rpi.edu>
To:        "Brian F. Feldman" <green@FreeBSD.ORG>, "M. Warner Losh" <imp@village.org>
Cc:        fyre@orbital.wiretapped.net, tim@robbins.dropbear.id.au, freebsd-standards@FreeBSD.ORG
Subject:   Re: diff problem
Message-ID:  <p05101421b88b76a6aa12@[128.113.24.47]>
In-Reply-To: <200202091752.g19HqFP11551@green.bikeshed.org>
References:  <200202091752.g19HqFP11551@green.bikeshed.org>

At 12:52 PM -0500 2/9/02, Brian F. Feldman wrote:
>"M. Warner Losh" <imp@village.org> wrote:
>> I'd like to revert this change.  The main reason is that gnupatch can
>> reconstruct files that have missing stuff at the end, and subversion
>> whines that FreeBSD's diff is bogus.
>
>Do I get to whine that subversion's non-standard use of pseudo-diff lines
>is bogus, then?  All told, I'd actually rather our diff would refuse to
>act upon files that have no end-line.  Encourage people to use non-stupid
>editing tools that conform to the way things have been done for decades.

I think it would be impossible to convince anyone that a file should
be called "binary" simply because it is missing the final newline
character.

You are suggesting that we penalize the person who is running 'diff',
but that person may not be the same person as the one who generated
the problematic file.  Even if I agreed with the stated objective,
this behavior hassles the wrong person.

The official standards do not say anything on this issue, one way or
another.  In the absence of such standards, the only standards we
can refer to are de facto standards.  You blithely invoked the phrase
'non-standard behavior'.  Here is a list of behaviors that I find on
the OS's I can get to:

The behavior of 'diff' where the first file has the final newline and
the second file is missing it:
     1) diff stupidly thinks the second file is missing the entire
        last *line*, not just the newline character.  It generates
        a change to delete the entire last line from the first file.
        Applying the patch to the first file gives you a file which
        matches neither the first file nor the second file.
        [NeXTSTEP 3.3 - a 9-year-old system...]
     2) diff does not realize there is any difference in the two
        files.  No change is generated.  [irix]
     3) diff realizes the last lines of the two files are different,
        so it prints out a "change", but the change is such that it
        is impossible to tell what the difference is.  It deletes
        the line and adds it back in, and there is no difference
        between the old and new lines (in the output).  [freebsd]
     4) diff realizes the last lines are different, prints out the
        change-lines to stdout.  It still deletes and adds the exact
        same text for the final line, but it also writes a warning
        message to stderr:
           Warning: missing newline at end of file gadtest2
        So, the patch generated by 'diff' cannot reconstruct the
        second file from the first file, but at least the user gets
        some kind of warning message.  [aix, solaris]
     5) diff realizes the last line is missing the newline, and
        writes out a specific marker-line to stdout.  This makes
        it possible to construct the second file from the first
        file and the diff.  [linux, netbsd, openbsd, darwin, and
        of course gnu-diff when it is installed anywhere]
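
To make behavior 5 concrete, here is a minimal sketch in Python.  It is
only an illustration, not the code of any diff listed above: Python's
difflib does not emit a marker on its own, so the hypothetical helper
diff_with_marker() adds one after any output line that lacks its
newline.  The marker text is the one GNU diff prints; the sample
strings are made up.

```python
import difflib

# Marker line as printed by GNU diff after a line with no trailing newline.
MARKER = "\\ No newline at end of file"


def diff_with_marker(old_text, new_text, old_name="old", new_name="new"):
    """Return unified-diff lines, flagging a missing final newline.

    Hypothetical helper for illustration only.
    """
    # keepends=True keeps each "\n", so "beta\n" and "beta" compare unequal.
    old_lines = old_text.splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)

    out = []
    for line in difflib.unified_diff(old_lines, new_lines, old_name, new_name):
        if line.endswith("\n"):
            out.append(line[:-1])
        else:
            # Only a content line taken from a file without a final
            # newline can reach this branch; say so explicitly.
            out.append(line)
            out.append(MARKER)
    return out


if __name__ == "__main__":
    # First file has the final newline, second file is missing it.
    for line in diff_with_marker("alpha\nbeta\n", "alpha\nbeta"):
        print(line)
```

Without the marker line, the output would show only "-beta" followed by
"+beta", which is behavior 3: a change is reported, but nothing tells
the reader (or a patch program) what the change actually is.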

I think that the last behavior is the most useful and thus the most
desirable one, from a purely practical point of view.  The point of
'diff' is to tell the person the difference between two files; it is
not there to slap the wrists of someone for using the "wrong tool"
to work on some file.  When it comes to following standards, freebsd
can't claim to be following anyone that I can see.  We've invented
our own behavior, and that behavior is the least useful behavior of
any system which could be called "standard" (I assume that none of
us refer to 'irix' when we think of 'standard unix behavior').

If we have to fix our patch to understand the extra line, then let's
do that.  If it does come time to address this issue in a written
standard, I doubt anyone who is already in group-5 will ever agree
to making their diff less useful.  Even if people hate
text files which are missing the final newline character, the freebsd
version of 'diff' does nothing useful for them.  It merely irritates
them by saying "there is a difference between these two lines, and I
am not going to tell you what it is".  It isn't just slapping their
wrists, it's slapping their wrists and not letting them know why.
While this is a fun exercise in user-hostility, I fail to see any
practical benefit from it.

-- 
Garance Alistair Drosehn            =   gad@eclipse.acs.rpi.edu
Senior Systems Programmer           or  gad@freebsd.org
Rensselaer Polytechnic Institute    or  drosih@rpi.edu




