Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 04 Sep 2023 22:07:38 +0000
From:      bugzilla-noreply@freebsd.org
To:        doc@FreeBSD.org
Subject:   [Bug 273245] textproc/groff: groff_mdoc(7): output from 'man 7 groff_mdoc' is badly broken
Message-ID:  <bug-273245-9-xha1EPcTCy@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-273245-9@https.bugs.freebsd.org/bugzilla/>
References:  <bug-273245-9@https.bugs.freebsd.org/bugzilla/>

next in thread | previous in thread | raw e-mail | index | archive | help
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=3D273245

--- Comment #6 from G. Branden Robinson <g.branden.robinson@gmail.com> ---
Hi Wolfgang,

I cloned the freebsd-src repository to have a look at the 35 cases within it
that concerned you.

> for i in freebsd-{src,ports,doc};do (cd $i && printf "$i "; git grep 'rof=
f.* -man[^d]' |wc -l );done
> freebsd-src 35

$ git grep -n 'roff.* -man[^d]'
contrib/byacc/aclocal.m4:1047:${NROFF_NOTE}     [\$](SHELL) -c "tbl [\$]*.$=
2 |
nroff -man | col -bx" >[\$]@
contrib/byacc/aclocal.m4:1053:${GROFF_NOTE}     [\$](SHELL) -c "tbl [\$]*.$=
2 |
groff -man" >[\$]@
contrib/byacc/aclocal.m4:1056:${GROFF_NOTE}     GROFF_NO_SGR=3Dstupid [\$](=
SHELL)
-c "tbl [\$]*.$2 | nroff -rHY=3D0 -Tascii -man | col -bx" >[\$]@
contrib/byacc/configure:8368:${NROFF_NOTE}      \$(SHELL) -c "tbl \$*.1 | n=
roff
-man | col -bx" >\$@
contrib/byacc/configure:8374:${GROFF_NOTE}      \$(SHELL) -c "tbl \$*.1 | g=
roff
-man" >\$@
contrib/byacc/configure:8377:${GROFF_NOTE}      GROFF_NO_SGR=3Dstupid \$(SH=
ELL)
-c "tbl \$*.1 | nroff -rHY=3D0 -Tascii -man | col -bx" >\$@

byacc is maintained by Thomas Dickey.  He's using Autoconf macros to produce
output from sources that are known to be in man(7) format.

https://github.com/freebsd/freebsd-src/blob/main/contrib/byacc/yacc.1#L30

`CF_MAKE_DOCS` appears to be an Autoconf macro private to the byacc
distribution; I see no other occurrences in `freebsd-src`.  He also has a
Autoconf test to determine how to generate HTML from man(7) pages.

https://github.com/freebsd/freebsd-src/blob/main/contrib/byacc/aclocal.m4#L=
1580

Note in particular the here document at line 1694.

https://github.com/freebsd/freebsd-src/blob/main/contrib/byacc/aclocal.m4#L=
1694

contrib/dialog/makefile.in:145:@NROFF_NOTE@     GROFF_NO_SGR=3Dstupid $(SHE=
LL) -c
"tbl $< | nroff -rHY=3D0 -Tascii -man | col -bx" >$@
contrib/dialog/makefile.in:151:@GROFF_NOTE@     $(SHELL) -c "tbl $< | groff
-man" >$@

Dialog is another Thomas Dickey project.

It also builds inputs it knows to be in man(7) format.

$ grep -nrFw .TH contrib/dialogcontrib/dialog/dialog.3:50:.TH \*D 3 "" "$Da=
te:
2021/01/17 18:02:44 $"
contrib/dialog/dialog.1:51:.TH \*D 1 "" "$Date: 2021/01/17 17:25:01 $"
contrib/dialog/configure:8028:.TH HEAD1 HEAD2 HEAD3 HEAD4 HEAD5
contrib/dialog/aclocal.m4:5919:.TH HEAD1 HEAD2 HEAD3 HEAD4 HEAD5
$ grep -nrFw .Dd contrib/dialog || echo NONE # look for mdoc(7) documents
NONE

Next...

contrib/ee/ee.1:5:.\"    nroff -man ee.1

$ head contrib/ee/ee.1=20
.\"
.\"
.\"  To format this reference page, use the command:
.\"
.\"    nroff -man ee.1
.\"
.\"  $Header: /home/hugh/sources/old_ae/RCS/ee.1,v 1.22 2001/12/16 04:49:27
hugh Exp $
.\"
.\"
.TH ee 1 "" "" ""

The man page is telling us explicitly (in a comment) what macro package to =
use
to format it, and unsurprisingly getting it right.

Next...

contrib/ldns/makewin.sh:243:for x in man1/*.1; do groff -man -Tascii -Z "$x=
" |
grotty -cbu > cat1/"$(basename "$x" .1).txt"; done
contrib/ldns/makewin.sh:246:for x in man3/*.3; do groff -man -Tascii -Z "$x=
" |
grotty -cbu > cat3/"$(basename "$x" .3).txt"; done

Again we have renderings of known documents.  Let's see what package they u=
se.

$ find contrib/ldns -name "*.[13]" | xargs grep -nEw '\.(Dd|TH)'
contrib/ldns/drill/drill.1:2:.TH drill 1 "28 May 2006"
contrib/ldns/packaging/ldns-config.1:1:.TH ldns-config 1 "22 Sep 2011"

So that's two more correct uses of '-man'.

Next...

contrib/ncurses/aclocal.m4:5607:                nroff -man \$TMP >\$TMP.out
contrib/ncurses/configure:14517:                nroff -man \$TMP >\$TMP.out

Another Thomas Dickey project.  These come from his Autoconf macro
`CF_MAN_PAGES`.  I'll skip ahead here and note that I'm familiar with the
ncurses man pages, having recently proposed patches to them.=20
https://lists.gnu.org/archive/html/bug-ncurses/2023-09/

They are exclusively in man(7) format, not mdoc(7).

Here again we have a case of a maintainer knowing what format is required, =
and
using it.

Next...

contrib/tcp_wrappers/Banners.Makefile:12:# sequences as described in the
hosts_access.5 manual page (`nroff -man'
contrib/tcp_wrappers/CHANGES:2:configuration checker. See the `tcpdchk.8'
manual page (`nroff -man'
contrib/tcp_wrappers/CHANGES:349:have all rules within a single file. See
"nroff -man hosts_options.5"
contrib/tcp_wrappers/Makefile:575:# and hosts_options.5 manual pages (`nroff
-man' format).
contrib/tcp_wrappers/README:240:hosts_access.5 manual page, which is in `nr=
off
-man' format. A later
contrib/tcp_wrappers/README:257:The hosts_options.5 manual page (`nroff -ma=
n'
format) documents an
contrib/tcp_wrappers/README:395:documented in the hosts_options.5 document,
which is in `nroff -man'
contrib/tcp_wrappers/README:432:`nroff -man' format) can guide the requests=
 to
the right server.  These
contrib/tcp_wrappers/README:453:given in the hosts_options.5 manual page
(`nroff -man' format). An
contrib/tcp_wrappers/README:897:hosts_access.5, which is in `nroff -man'
format. This is a lengthy
contrib/tcp_wrappers/README:904:The examples in the hosts_access.5 document
(`nroff -man' format) show
contrib/tcp_wrappers/README:912:hosts_options.5 document (`nroff -man' form=
at).
contrib/tcp_wrappers/README:918:program is described in the tcpdchk.8 docum=
ent
(`nroff -man' format).
contrib/tcp_wrappers/README:929:described in the tcpdmatch.8 document (`nro=
ff
-man' format).
contrib/tcp_wrappers/README:967:programs.  The hosts_access.3 manual page
(`nroff -man' format)
contrib/tcp_wrappers/options.c:4:  * manual page (source file: hosts_option=
s.5,
"nroff -man" format).

These are all source comments or text file contents, and do not drive
construction of anything; they therefore cannot cause failures.  Neverthele=
ss,
let us see what macro package is employed by "tcp_wrappers".

$ find contrib/tcp_wrappers -name "*.[1-9]" | xargs grep -nEw '\.(Dd|TH)'
contrib/tcp_wrappers/hosts_options.5:1:.TH HOSTS_OPTIONS 5
contrib/tcp_wrappers/tcpdmatch.8:1:.TH TCPDMATCH 8
contrib/tcp_wrappers/tcpd.8:1:.TH TCPD 8
contrib/tcp_wrappers/tcpdchk.8:1:.TH TCPDCHK 8
contrib/tcp_wrappers/hosts_access.5:1:.TH HOSTS_ACCESS 5
contrib/tcp_wrappers/hosts_access.3:1:.TH HOSTS_ACCESS 3

It would appear once again that the upstream maintainer is familiar with th=
eir
own man pages.

Next...

contrib/tcsh/tcsh.man2html:13:# in the exact same style of nroff -man, i.e.=
 any
other manpage.

This is a comment.  Some context might be helpful.

$ git grep -C2 -n 'roff.* -man[^d]' contrib/tcsh/
contrib/tcsh/tcsh.man2html-11-#
contrib/tcsh/tcsh.man2html-12-# Designed for tcsh manpage. Guaranteed not to
work on manpages not written
contrib/tcsh/tcsh.man2html:13:# in the exact same style of nroff -man, i.e.=
 any
other manpage.
contrib/tcsh/tcsh.man2html-14-#
contrib/tcsh/tcsh.man2html-15-# Makes links FROM items which are both a) in
particular sections (see

Given that context "guaranteed *not* to work on [other man pages]", it does=
 not
seem fair to hold this source remark as evidence militating against groff's
change.

Next...

contrib/tzcode/workman.sh:18:.rm }F" | nroff -man - ${1+"$@"} | perl -ne '

More context is warranted here, too.

$ head -n 18 contrib/tzcode/workman.sh
#! /bin/sh
# Convert manual page troff stdin to formatted .txt stdout.

# This file is in the public domain, so clarified as of
# 2009-05-17 by Arthur David Olson.

if (type nroff && type perl) >/dev/null 2>&1; then

  # Tell groff not to emit SGR escape sequences (ANSI color escapes).
  GROFF_NO_SGR=3D1
  export GROFF_NO_SGR

  echo ".am TH
.hy 0
.na
..
.rm }H
.rm }F" | nroff -man - ${1+"$@"} | perl -ne '

Your concern might, at first glance, seem warranted here; the tool does pur=
port
to be of general use.  However, closer inspection reveals that it was writt=
en
in ignorance or deliberate neglect of the mdoc(7) package altogether; obser=
ve
how it appends to the 'TH' macro, which is unused in mdoc(7).

It would be straightforward to make this script handle mdoc(7) as well; sim=
ply
append the same two requests to the `Dd` macro.  The removals of '}H' and '=
}F'
strings/macros/diversions suggests a familiarity with the AT&T Unix man(7)
implementation or its descendants in USG/System III/System V proprietary Un=
ix
or BSD prior to Networking Release/2 (when the Berkeley CSRG replaced Unix
troff with groff).=20

Further, if nroff or perl programs (or shell functions) are unavailable, th=
is
shell script proceeds to use mandoc anwyay.

$ tail -n 6 contrib/tzcode/workman.sh
elif (type mandoc && type col) >/dev/null 2>&1; then
  mandoc -man -T ascii "$@" | col -bx
else
  echo >&2 "$0: please install nroff and perl, or mandoc and col"
  exit 1
fi

Next...

usr.bin/man/man.conf.5:116:NROFF_JA     /usr/local/bin/groff -man
-dlang=3Dja_JP.eucJP
usr.bin/man/man.conf.5:117:TROFF_JA     /usr/local/bin/groff -man
-dlang=3Dja_JP.euc.jp

This is from an example in FreeBSD's man.conf(5) page.  The assignment to t=
he
"lang" string has no particular effect in groff, and to the best of my
knowledge it never has.  This may be evidence of jgroff, a fork of groff th=
at
was made unnecessary about in the Debian Project about 20 years ago
(https://www.debian.org/security/2002/dsa-107 ) and was superseded by groff
support in the 1.21 release about 13 years ago.=20
https://lists.gnu.org/archive/html/groff/2010-12/msg00051.html

(Also, is that a typo adding a period in line 117.)  I suggest that the
foregoing might be bitrotted.

Last...

usr.bin/man/man.sh:1074:NROFF=3D'groff -S -P-h -Wall -mtty-char -man'
usr.bin/man/man.sh:1078:TROFF=3D'groff -S -man'

This is a case that should certainly be addressed, if this script still
fulfills the purpose claimed for it in its initial commit in 2010:
"Implementaiton [sic] of man, manpath, whatis, and apropos written entirely=
 in
sh."

I strongly recommend s/-man/&doc/ here so that FreeBSD users will continue =
to
have a positive experience.

Also, for what it's worth, the `-S` option has been unnecessary; "safer" mo=
de
has been the default in groff since version 1.12, released on 14 December 1=
999.

However, 32-33 false positives in a set of 35 suggests to me that your scan=
ning
criteria could benefit from sensitivity tuning.  An error rate of over 90% =
is
generally considered unusable in serious measurement applications.

Regards,
Branden

--=20
You are receiving this mail because:
You are on the CC list for the bug.=



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?bug-273245-9-xha1EPcTCy>