Date:      Mon, 23 Jun 1997 10:14:37 -0500 (EST)
From:      John Fieber <jfieber@indiana.edu>
To:        Brian Somers <brian@awfulhak.org>
Cc:        Annelise Anderson <andrsn@andrsn.stanford.edu>, Chuck Robey <chuckr@glue.umd.edu>, "Jordan K. Hubbard" <jkh@time.cdrom.com>, kleon@bellsouth.net, freebsd-hackers@FreeBSD.ORG, freebsd-doc@FreeBSD.ORG, freebsd-questions@FreeBSD.ORG
Subject:   Re: Handbook - ascii form?? 
Message-ID:  <Pine.BSF.3.96.970623093950.313P-100000@fallout.campusview.indiana.edu>
In-Reply-To: <199706230803.JAA09485@awfulhak.demon.co.uk>

On Mon, 23 Jun 1997, Brian Somers wrote:

> [cc'd also to -questions & -doc]
> 
> > On Mon, 23 Jun 1997, Chuck Robey wrote:
> > 
> > > OK, there's one problem with that.  I think that the ftp server is letting
> > > folks download handbook.ascii as ascii text, which is eating the backspace
> > > keys.  Gotta download this as binary!
> > 
> > That's right, if it's downloaded as a binary file it retains the ^H etc.
> > formatting codes; otherwise it doesn't.
> 
> Which brings us back to the question.  Why does .ascii have non-ascii
> characters.  A diff between .latin1 and .ascii says that only the
> '-' at the end of lines is missing in the .ascii version :(  Surely
> .latin1 should have the overstrikes and .ascii shouldn't ?
> 
> Is this a "sgml" bug ?

Okay, let's get this straight:

1) ^H >>IS<< ASCII.

2) An underscore followed by a ^H, followed by a letter is a
   common idiom for creating underlining that originates in the
   days of typewriters and teletypes.  The idiom is widely,
   but not universally, supported.

3) Similarly, a letter followed by a ^H, followed by the same
   letter is a common idiom for creating boldface.

4) The non-HTML renditions of the handbook/FAQ come from groff
   which uses these idioms for underlining and boldface.  This
   ^H debate has >>nothing<< to do with SGML.

5) The difference between the .ascii and .latin1 generated by
   sgmlfmt(1) is that the former uses only 7-bit codes, while
   the latter uses ISO 8859-1 encoding for characters above 127.
   The differences show up in a few of our authors' names that
   use diacritics, in bulleted lists (bullets instead of a
   lowercase o), in automatic hyphenation (a soft hyphen, 0xAD,
   rather than a regular hyphen*), and probably in some other
   odds and ends.
   
6) If the ^H is missing from a downloaded file, chances are it
   got stripped by rogue software.  Claims were made that this
   did indeed happen, but I don't believe the details of the
   software used were mentioned--these are essential things to
   provide with any bug report.

7) col -b is the most reliable way to undo the underline/boldface
   that groff does, but it will not work if the ^H characters
   got eaten in transit.

8) Earlier versions of sgmlfmt(1) sent all groff output through
   col -b by default.  If there is a consensus that this would
   be better, it is trivial to change.
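
To make points 2, 3 and 7 concrete, here is a minimal Python sketch of
the two overstrike idioms and of the "last character written to each
column wins" rule that col -b applies.  It is an illustration only, not
how groff or col are implemented.  The function names (underline, bold,
strip_overstrikes) are made up for this example, and it assumes a single
line with no tabs or other control characters.

```python
BACKSPACE = "\b"  # ^H, ASCII code 0x08


def underline(text: str) -> str:
    """Underscore, ^H, letter: the underlining idiom from point 2."""
    return "".join("_" + BACKSPACE + ch for ch in text)


def bold(text: str) -> str:
    """Letter, ^H, same letter: the boldface idiom from point 3."""
    return "".join(ch + BACKSPACE + ch for ch in text)


def strip_overstrikes(line: str) -> str:
    """Keep only the last character written to each column.

    Hypothetical helper that mimics the effect of col -b on one line.
    """
    cells = []  # one entry per output column
    col = 0     # current column position
    for ch in line:
        if ch == BACKSPACE:
            col = max(col - 1, 0)   # move back, never past column 0
        else:
            if col < len(cells):
                cells[col] = ch     # overwrite what was struck before
            else:
                cells.append(ch)
            col += 1
    return "".join(cells)


sample = bold("NAME") + " " + underline("file")
print(strip_overstrikes(sample))                         # NAME file
print(strip_overstrikes(sample.replace(BACKSPACE, "")))  # NNAAMMEE _f_i_l_e
```

The second print shows why this cannot help once the ^H characters have
been eaten in transit: without them the doubled letters and underscores
are just ordinary text in their own columns.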


...and now back to your regularly scheduled broadcast...

-john


* This could be considered a bug because there is little
  agreement on the correct handling of the soft-hyphen character.
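
As a rough illustration of points 1 and 5, the Python sketch below
checks whether a byte string stays within 7-bit codes.  The helper name
is_seven_bit and the sample strings are assumptions for the example,
not taken from actual sgmlfmt(1) output.

```python
SOFT_HYPHEN = "\u00ad"  # soft hyphen, ISO 8859-1 code 0xAD


def is_seven_bit(data: bytes) -> bool:
    """True if every byte is below 0x80, i.e. plain 7-bit ASCII."""
    return all(byte < 0x80 for byte in data)


# An underlined "x": underscore, ^H (0x08), letter.  All 7-bit.
overstruck = "_\bx".encode("ascii")

# A line ending in a soft hyphen, as an ISO 8859-1 rendition might.
hyphenated = ("hyphen" + SOFT_HYPHEN).encode("iso-8859-1")

print(is_seven_bit(overstruck))  # True
print(is_seven_bit(hyphenated))  # False
print(hex(hyphenated[-1]))       # 0xad
```

The overstruck sample passes because ^H is an ASCII control code, while
the soft hyphen is the byte that pushes a line out of the 7-bit range.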




