Date: Thu, 28 Feb 2002 08:41:50 +0000
From: Nik Clayton <nik@freebsd.org>
To: Peter Pentchev <roam@ringlet.net>
Cc: freebsd-doc@FreeBSD.org
Subject: Re: FreeBSD web build failed on freefall.freebsd.org
Message-ID: <20020228084150.K4562@canyon.nothing-going-on.org>
References: <200202181359.g1IDxip44049@freefall.freebsd.org> <20020218180524.A1671@straylight.oblivion.bg>

On Mon, Feb 18, 2002 at 06:05:24PM +0200, Peter Pentchev wrote:
> It seems that we stumbled upon a DSSSL/DocBook bug: a <row><entry>
> does not like whitespace, even if the whitespace lies between
> a </para> and the next <para>.  Maybe an entry was never meant to
> contain more than one paragraph?

It's an SGML 'bug'.  From DocBook: The Definitive Guide[1], and the
section dealing with the <entry> element:

==
Pernicious Mixed Content

The content model of the Entry element exhibits a nasty peculiarity
that we call "pernicious mixed content".  Every other element in
DocBook contains either block elements or inline elements (including
#PCDATA) unambiguously.  In these cases, the meaning of line breaks and
spaces are well understood; they are insignificant between block
elements and significant (to the SGML parser, anyway) where inline
markup can occur.

Table entries are different; they can contain either block or inline
elements, but not both at the same time.  In other words, one Entry in
a table might contain a paragraph or a list while another contains
simply #PCDATA or another inline markup, but no single Entry can
contain both.

Because the content model of an Entry allows both kinds of markup, each
time the SGML parser encounters an Entry, it has to decide what variety
of markup it contains.  SGML parsers are forbidden to use more than a
single token of lookahead to reach this decision.  In practical terms,
what this means is that a line feed or space after an Entry start tag
causes the parser to decide that the cell contains inline markup.
Subsequent discovery of a paragraph or another block element causes a
parsing error.

All of these are legal:

  <entry>3.1415927</entry>

  <entry>General <emphasis>#PCDATA</emphasis></entry>

  <entry><para>
  A paragraph of text
  </para></entry>

However, each of these is an error:

  <entry>
  Error, cannot have a line break before a block element
  <para>
  A paragraph of text.
  </para></entry>

  <entry><para>
  A paragraph of text.
  </para>
  Error, cannot have a line break between block elements
  <para>
  A paragraph of text.
  </para></entry>

  <entry><para>
  A paragraph of text.
  </para>
  Error, cannot have a line break after a block element
  </entry>

When designing a DTD, it is wise to avoid pernicious mixed content.
Unfortunately, the only way to correct the pernicious mixed content
problem that already exists in DocBook is to require some sort of
wrapper (a block element, or an inline like Phrase) around #PCDATA
within table Entrys.  This is annoying and inconvenient in a great many
tables in which #PCDATA cells predominate and, in addition, differ from
CALS.
==

N

[1] www.docbook.org

-- 
FreeBSD: The Power to Serve             http://www.freebsd.org/               (__)
FreeBSD Documentation Project           http://www.freebsd.org/docproj/    \\\'',)
                                                                            \/  \ ^
   --- 15B8 3FFC DDB4 34B0 AA5F  94B7 93A8 0764 2C37 E375 ---               .\._/_)
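
The rule in the excerpt above (an Entry holds either block elements or
inline content, and any stray text, even a line break, next to a block
element is an error) can be approximated with a quick pre-flight check
before running the real build.  What follows is only a rough sketch in
Python, not an SGML parser: the names BLOCK_TAGS, direct_children,
is_pernicious and pernicious_entries are made up for illustration,
BLOCK_TAGS is an assumed, incomplete subset of DocBook's block elements,
and the regular expressions assume fully tagged markup with no SGML tag
minimization and no nested <entry> elements.

```python
import re

# Assumption: a small, illustrative subset of DocBook block elements.
BLOCK_TAGS = {"para", "itemizedlist", "orderedlist", "programlisting", "screen"}

# Hypothetical helpers; regexes assume explicit start and end tags everywhere.
ENTRY_RE = re.compile(r"<entry\b[^>]*>(.*?)</entry>", re.DOTALL | re.IGNORECASE)
TAG_RE = re.compile(r"<(/?)([A-Za-z][\w.-]*)[^>]*>")


def direct_children(content):
    """Return an entry's direct children as ('text', s) or ('elem', name) pairs."""
    parts, depth, pos = [], 0, 0
    for m in TAG_RE.finditer(content):
        closing, name = m.group(1), m.group(2).lower()
        # Text is only a direct child when we are not inside another element.
        if depth == 0 and content[pos:m.start()]:
            parts.append(("text", content[pos:m.start()]))
        if closing:
            depth = max(depth - 1, 0)
        else:
            if depth == 0:
                parts.append(("elem", name))
            depth += 1
        pos = m.end()
    if depth == 0 and content[pos:]:
        parts.append(("text", content[pos:]))
    return parts


def is_pernicious(content):
    """True if the entry mixes a block element with text or inline children.

    Whitespace-only text counts as text, which is the whole point of the check.
    """
    children = direct_children(content)
    is_block = [kind == "elem" and value in BLOCK_TAGS for kind, value in children]
    return any(is_block) and not all(is_block)


def pernicious_entries(sgml):
    """Return the source text of every <entry> that fails the check."""
    return [m.group(0) for m in ENTRY_RE.finditer(sgml) if is_pernicious(m.group(1))]


if __name__ == "__main__":
    sample = "<row><entry>3.1415927</entry>\n<entry><para>A</para>\n<para>B</para></entry></row>"
    for bad in pernicious_entries(sample):
        print("suspect entry:", repr(bad))
```

Whitespace inside nested elements (for example between two <para>s in
a <listitem>) is deliberately ignored, because only the direct children
of the Entry matter here.  Anything this flags still needs to be
confirmed against the real parser.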