From owner-freebsd-questions Wed Jan 23 15:22:13 2002
Delivered-To: freebsd-questions@freebsd.org
Date: Wed, 23 Jan 2002 18:24:41 +0000 (GMT)
From: "Philip M. Gollucci"
Subject: MS+HTML -> Unix
Message-ID: <20020123182426.E63410-100000@sduwebship.student.umd.edu>

Say I have a webpage where I want to offer people the ability to upload
either a .txt or a .html file. Now these people are basically computer
illiterate, and don't even know that UNIX is different from Microsh$t.

At any rate, they will use "Save as (HTML)" from MS Word 97/2000,
"Save as (txt)", or worse yet, "Save as RTF", and then upload that.
Big surprise: it gets it really wrong, meaning the file doesn't format
correctly in any browser, before or after they use the site. tidy told
me one file had over 300 errors, and that was just against HTML 4.01,
not XHTML 1.0.

Is there any way I can, on the fly, take the messed-up HTML file I get
and convert it to what they meant to give me?

Important cases:
  - Parallel columns not in a table
  - Bullets
  - Tags actually being closed, so the whole page isn't underlined

I've seen the demoronizer port, but I don't know that much about it,
and I don't think it's quite what I want. Basically, I have to take the
HTML given to me and make the HTML they meant.

Any great ideas?

------------------------------------------------------------------------------
Philip M. Gollucci (p6m7g8)
philip@p6m7g8.com
301.314.3118

Science, Discovery, & the Universe (UMCP)
Webmaster & Webship Teacher
URL: http://www.sdu.umd.edu

EJPress.com
Database/PERL Programmer & System Admin
URL: http://www.ejournalpress.com

Resume: http://www.p6m7g8.com/resume.txt