Date: Mon, 23 Jun 1997 10:14:37 -0500 (EST)
From: John Fieber <jfieber@indiana.edu>
To: Brian Somers <brian@awfulhak.org>
Cc: Annelise Anderson <andrsn@andrsn.stanford.edu>, Chuck Robey <chuckr@glue.umd.edu>, "Jordan K. Hubbard" <jkh@time.cdrom.com>, kleon@bellsouth.net, freebsd-hackers@FreeBSD.ORG, freebsd-doc@FreeBSD.ORG, freebsd-questions@FreeBSD.ORG
Subject: Re: Handbook - ascii form??
Message-ID: <Pine.BSF.3.96.970623093950.313P-100000@fallout.campusview.indiana.edu>
In-Reply-To: <199706230803.JAA09485@awfulhak.demon.co.uk>

On Mon, 23 Jun 1997, Brian Somers wrote:

> [cc'd also to -questions & -doc]
>
> > On Mon, 23 Jun 1997, Chuck Robey wrote:
> >
> > > OK, there's one problem with that. I think that the ftp server is letting
> > > folks download handbook.ascii as ascii text, which is eating the backspace
> > > keys. Gotta download this as binary!
> >
> > That's right, if it's downloaded as a binary file it retains the ^H etc.
> > formatting codes; otherwise it doesn't.
>
> Which brings us back to the question. Why does .ascii have non-ascii
> characters. A diff between .latin1 and .ascii says that only the
> '-' at the end of lines is missing in the .ascii version :( Surely
> .latin1 should have the overstrikes and .ascii shouldn't ?
>
> Is this a "sgml" bug ?

Okay, let's get this straight:

1) ^H >>IS<< ASCII.

2) An underscore followed by a ^H, followed by a letter is a common idiom for creating underlining that originates in the days of typewriters and teletypes. The idiom is widely, but not universally, supported.

3) Similarly, a letter followed by a ^H, followed by the same letter is a common idiom for creating boldface.

4) The non-HTML renditions of the handbook/FAQ come from groff, which uses these idioms for underlining and boldface. This ^H debate has >>nothing<< to do with SGML.

5) The difference between the .ascii and .latin1 generated by sgmlfmt(1) is that the former uses only 7-bit codes, while the latter uses ISO 8859-1 encoding for characters above 127. The differences show up in a few of our authors' names that use diacritics, in bulleted lists, which have bullets instead of a lowercase o, in automatic hyphenation, which uses a soft hyphen (0xAD) rather than a regular hyphen*, and probably in some other odds and ends.

6) If the ^H is missing from a downloaded file, chances are it got stripped by rogue software. Claims were made that this did indeed happen, but I don't believe the details of the software used were mentioned--these are essential things to provide with any bug report.

7) col -b is the most reliable way to undo the underline/boldface that groff does, but it will not work if the ^H characters got eaten in transit.

8) Earlier versions of sgmlfmt(1) sent all groff output through col -b by default. If there is a consensus that this would be better, it is trivial to change.

...and now back to your regularly scheduled broadcast...

-john

* This could be considered a bug because there is little agreement on the correct handling of the character.
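
For anyone who wants to see the overstrike idioms from points 2, 3 and 7 in action, here is a minimal Python sketch. It is not the col(1) implementation, only a rough approximation of what col -b does to backspaces (keep the last character written to each column). The helper names underline, bold and strip_overstrikes are made up for this illustration.

```python
BS = "\b"  # ^H, ASCII code 8: a 7-bit control character


def underline(word: str) -> str:
    """Underscore, backspace, letter: the underlining idiom."""
    return "".join("_" + BS + ch for ch in word)


def bold(word: str) -> str:
    """Letter, backspace, same letter: the boldface idiom."""
    return "".join(ch + BS + ch for ch in word)


def strip_overstrikes(text: str) -> str:
    """Rough approximation of `col -b`: a backspace moves the column
    back by one, and only the last character written to each column
    position is kept."""
    out_lines = []
    for line in text.split("\n"):
        cells = []  # one entry per column position
        col = 0     # current column
        for ch in line:
            if ch == BS:
                col = max(col - 1, 0)
            else:
                if col < len(cells):
                    cells[col] = ch   # overstrike: replace what was there
                else:
                    cells.append(ch)
                col += 1
        out_lines.append("".join(cells))
    return "\n".join(out_lines)


formatted = underline("FreeBSD") + " " + bold("Handbook")
plain = strip_overstrikes(formatted)          # "FreeBSD Handbook"

# Point 7: once the ^H characters are eaten in transit, the stray
# underscores and doubled letters can no longer be told apart from
# ordinary text, so the same cleanup has nothing to work with.
eaten = underline("ok").replace(BS, "")       # "_o_k"
still_broken = strip_overstrikes(eaten)       # "_o_k"
```

The column-based approach is used here because it treats both idioms the same way: whichever character lands last in a column wins, which is the letter in both the underline and the boldface case.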