Date: Wed, 23 Jan 2002 18:24:41 +0000 (GMT)
From: "Philip M. Gollucci" <philip@sduwebship.student.umd.edu>
To: <freebsd-questions@FreeBSD.ORG>
Subject: MS+HTML -> Unix
Message-ID: <20020123182426.E63410-100000@sduwebship.student.umd.edu>
Say I have a webpage where I want to offer people the ability to upload
either a .txt or a .html file. These people are basically computer
illiterate, and don't even know that UNIX is different from Microsh$t.
At any rate, they will use "Save as (HTML)" from MS Word 97/2000, "Save as
(txt)", or worse yet, "Save as RTF".
Then they upload that.
Big surprise: the result is badly malformed, meaning it doesn't render
correctly in any browser, before or after it goes through the site.
For one file, tidy reported over 300 errors, and that was just against
HTML 4.01, not XHTML 1.0.
Is there any way I can, on the fly, take the messed-up HTML file I get and
convert it to what they meant to give me?
Important cases:
Parallel columns not in a table
Bullets
<DIR> tags
Actually closing <u> tags so the whole page isn't underlined.
I've seen the demoronizer port, but I don't know that much about it, and I
don't think it's quite what I want.
Basically I have to take the HTML given to me and make the HTML they mean.
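
To make two of the cases above concrete (unclosed <u> tags and <DIR>
lists), here is a minimal Python sketch of the kind of cleanup I mean. It is
an illustration under assumptions, not a complete solution: the names
Repair and repair are hypothetical, it uses only the standard library's
html.parser module, it silently drops comments and doctype declarations,
and it does nothing about parallel columns or Word-specific markup.

```python
from html import escape
from html.parser import HTMLParser

# Tags that never take a closing tag (partial list, enough for a sketch).
VOID = {"br", "hr", "img", "meta", "link", "input"}
# Deprecated list containers mapped to <ul> (an assumption about what is wanted).
RENAME = {"dir": "ul", "menu": "ul"}


class Repair(HTMLParser):
    """Re-emit HTML, closing unclosed tags and renaming <dir> to <ul>."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []    # output fragments
        self.stack = []  # currently open (non-void) tags

    def handle_starttag(self, tag, attrs):
        tag = RENAME.get(tag, tag)
        parts = [tag]
        for name, value in attrs:
            if value is None:
                parts.append(name)
            else:
                parts.append('%s="%s"' % (name, escape(value, quote=True)))
        self.out.append("<%s>" % " ".join(parts))
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        tag = RENAME.get(tag, tag)
        if tag not in self.stack:
            return  # stray closing tag with no matching open tag: drop it
        # Close everything opened since the matching open tag, e.g. a
        # forgotten </u> inside a paragraph.
        while self.stack:
            top = self.stack.pop()
            self.out.append("</%s>" % top)
            if top == tag:
                break

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def close(self):
        super().close()
        # Close whatever is still open at the end of the document.
        while self.stack:
            self.out.append("</%s>" % self.stack.pop())


def repair(text):
    """Return text with unclosed tags closed and <dir> rewritten as <ul>."""
    parser = Repair()
    parser.feed(text)
    parser.close()
    return "".join(parser.out)


print(repair("<p><u>hi</p>"))      # <p><u>hi</u></p>
print(repair("<DIR><LI>a</DIR>"))  # <ul><li>a</li></ul>
```

This only balances tags; it cannot guess intent. For example, a second <p>
opened before the first is closed ends up nested rather than becoming a
sibling, so a real cleanup would need per-tag rules on top of this.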
Any great ideas?
END
------------------------------------------------------------------------------
Philip M. Gollucci (p6m7g8) philip@p6m7g8.com 301.314.3118
Science, Discovery, & the Universe (UMCP)
Webmaster & Webship Teacher
URL: http://www.sdu.umd.edu
EJPress.com
Database/PERL Programmer & System Admin
URL : http://www.ejournalpress.com
Resume : http://www.p6m7g8.com/resume.txt
