From owner-freebsd-hackers Mon Jun 23 08:15:54 1997
Return-Path:
Received: (from root@localhost)
	by hub.freebsd.org (8.8.5/8.8.5) id IAA08680
	for hackers-outgoing; Mon, 23 Jun 1997 08:15:54 -0700 (PDT)
Received: from fallout.campusview.indiana.edu (fallout.campusview.indiana.edu [149.159.1.1])
	by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id IAA08640;
	Mon, 23 Jun 1997 08:15:28 -0700 (PDT)
Received: from localhost (jfieber@localhost)
	by fallout.campusview.indiana.edu (8.8.5/8.8.5) with SMTP id KAA06556;
	Mon, 23 Jun 1997 10:14:37 -0500 (EST)
Date: Mon, 23 Jun 1997 10:14:37 -0500 (EST)
From: John Fieber
Reply-To: John Fieber
To: Brian Somers
cc: Annelise Anderson, Chuck Robey, "Jordan K. Hubbard",
	kleon@bellsouth.net, freebsd-hackers@FreeBSD.ORG,
	freebsd-doc@FreeBSD.ORG, freebsd-questions@FreeBSD.ORG
Subject: Re: Handbook - ascii form??
In-Reply-To: <199706230803.JAA09485@awfulhak.demon.co.uk>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-hackers@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

On Mon, 23 Jun 1997, Brian Somers wrote:

> [cc'd also to -questions & -doc]
>
> > On Mon, 23 Jun 1997, Chuck Robey wrote:
> >
> > > OK, there's one problem with that.  I think that the ftp server is letting
> > > folks download handbook.ascii as ascii text, which is eating the backspace
> > > keys.  Gotta download this as binary!
> >
> > That's right, if it's downloaded as a binary file it retains the ^H etc.
> > formatting codes; otherwise it doesn't.
>
> Which brings us back to the question.  Why does .ascii have non-ascii
> characters.  A diff between .latin1 and .ascii says that only the
> '-' at the end of lines is missing in the .ascii version :(  Surely
> .latin1 should have the overstrikes and .ascii shouldn't ?
>
> Is this a "sgml" bug ?

Okay, let's get this straight:

1) ^H >>IS<< ASCII.
2) An underscore followed by a ^H, followed by a letter, is a common
   idiom for creating underlining that originates in the days of
   typewriters and teletypes.  The idiom is widely, but not
   universally, supported.
3) Similarly, a letter followed by a ^H, followed by the same letter,
   is a common idiom for creating boldface.
4) The non-HTML renditions of the handbook/FAQ come from groff, which
   uses these idioms for underlining and boldface.  This ^H debate
   has >>nothing<< to do with SGML.
5) The difference between the .ascii and .latin1 generated by
   sgmlfmt(1) is that the former uses only 7-bit codes, while the
   latter uses ISO 8859-1 encoding for characters above 127.  The
   differences show up in a few of our authors' names that use
   diacritics, in bulleted lists, which have bullets instead of a
   lowercase o, in automatic hyphenation, which uses a soft hyphen
   (hex AD) rather than a regular hyphen*, and probably in some other
   odds and ends.
6) If the ^H is missing from a downloaded file, chances are it got
   stripped by rogue software.  Claims were made that this did indeed
   happen, but I don't believe the details of the software used were
   mentioned--these are essential things to provide with any bug
   report.
7) col -b is the most reliable way to undo the underline/boldface
   that groff does, but it will not work if the ^H characters got
   eaten in transit.
8) Earlier versions of sgmlfmt(1) sent all groff output through
   col -b by default.  If there is a consensus that this would be
   better, it is trivial to change.

...and now back to your regularly scheduled broadcast...

-john

* This could be considered a bug because there is little agreement on
  the correct handling of the character.
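
For anyone who wants to see points 1, 2, 3, and 7 in action, here is a minimal Python sketch of the overstrike idioms. It is only an illustration, not what groff or col actually run: the function names are made up for this example, and strip_overstrikes() is a rough stand-in for col -b that handles only the simple "character, ^H, character" case.

```python
import re

BS = "\x08"  # ^H, the ASCII backspace control character (code 8)


def underline(text):
    """Overstrike underlining: underscore, ^H, then the letter."""
    return "".join("_" + BS + ch for ch in text)


def bold(text):
    """Overstrike boldface: letter, ^H, then the same letter again."""
    return "".join(ch + BS + ch for ch in text)


def strip_overstrikes(text):
    """Approximate `col -b` (an assumption, not its real algorithm):
    drop every character that is immediately backspaced over, so only
    the last character written to each position survives."""
    return re.sub(".\x08", "", text)


sample = underline("FreeBSD") + " " + bold("Handbook")

# Every code in the marked-up text is below 128, so it is 7-bit ASCII.
print(all(ord(ch) < 128 for ch in sample))

# With the ^H characters intact, the plain text is recoverable.
print(strip_overstrikes(sample))

# If something in transit ate the ^H characters, there is nothing left
# for the stripping step to key on, and the doubled text stays doubled.
damaged = sample.replace(BS, "")
print(strip_overstrikes(damaged))
```

The last case is why a file that lost its ^H characters in an ASCII-mode transfer cannot be cleaned up after the fact: the underscores and doubled letters are then indistinguishable from ordinary text.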