From: Warren Block
Date: Sat, 9 Sep 2017 13:35:17 +0000 (UTC)
To: doc-committers@freebsd.org, svn-doc-all@freebsd.org, svn-doc-head@freebsd.org
Subject: svn commit: r50814 - head/share/tools/convert2utf8

Author: wblock
Date: Sat Sep 9 13:35:17 2017
New Revision: 50814
URL: https://svnweb.freebsd.org/changeset/doc/50814

Log:
  Exclude all
  files in htdocs/releases, and better check and sanitize the exclude path
  used.

Modified:
  head/share/tools/convert2utf8/convert2utf8

Modified: head/share/tools/convert2utf8/convert2utf8
==============================================================================
--- head/share/tools/convert2utf8/convert2utf8 Sat Sep 9 11:02:58 2017 (r50813)
+++ head/share/tools/convert2utf8/convert2utf8 Sat Sep 9 13:35:17 2017 (r50814)
@@ -48,7 +48,7 @@ my $svn = '/usr/local/bin/svn';
 my $xargs = '/usr/bin/xargs';
 
 my $docdir = '/usr/doc/en_US.ISO8859-1';
-my $exclpath = 'htdocs/releases/*/*.html';
+my $exclpath = 'htdocs/releases/*/*';
 my $verbose = 0;
 
 sub usage {
@@ -64,8 +64,9 @@ sub usage {
     print "encoding directories like en_US.ISO8859-1 to UTF-8. The\n";
     print "documentation directory must be a Subversion checkout. After\n";
     print "files are converted, the directory is renamed to *.UTF-8.\n\n";
-    print "The default exclude path prevents conversion of statically-\n";
-    print "generated HTML files in htdocs/releases/*/*.html.\n";
+    print "The default exclude path prevents conversion of all files\n";
+    print "htdocs/releases/*/*. These files are typically statically-\n";
+    print "generated historical files.\n";
     exit 0;
 }
 
@@ -94,10 +95,17 @@ sub make_clean {
 }
 
 sub find_files {
-    my ($dn) = @_;
+    my ($dn,$exclpath) = @_;
+
+    my $fullexclpath = "$dn/$exclpath";
+    # sanitize exclude path, find is very picky about matches
+    # convert multiple slashes to single
+    $fullexclpath =~ s%/{2,}%/%;
+    die "** exclude path '$exclpath' must be relative (under '$docdir')\n" if $exclpath =~ m%^/%;
+
     print "finding files to be converted in '$dn'\n" if $verbose;
-    print "excluding files matching \"$dn$exclpath/*\"\n" if $verbose;
-    return map(/^(.*):/, `$find $dn -not -path \"$dn$exclpath\" -type f -print0 | $xargs -0 $file | $grep 'XML\\|SGML\\|BSD'`);
+    print "excluding files matching \"$fullexclpath/*\"\n" if $verbose;
+    return map(/^(.*):/, `$find $dn -not -path \"$fullexclpath\" -type f -print0 | $xargs -0 $file | $grep 'XML\\|SGML\\|BSD'`);
 }
 
 sub convert_file {
@@ -149,11 +157,6 @@ sub main {
     $docdir = $opt_d if $opt_d;
     $exclpath = $opt_e if $opt_e;
 
-    # sanitize exclude path, find is very picky about matches
-    # convert multiple slashes to single
-    $exclpath =~ s%/{2,}%/%;
-    die "** exclude path '$exclpath' must be relative (under '$docdir')\n" if $exclpath =~ m%^/%;
-
     my $basedir = basename($docdir);
     die "** '$docdir' does not have a standard ISO xy_AB directory name\n" unless $basedir =~ /^([a-z]{2}_[A-Z]{1,3})\./;
     my $isolang = $1;
@@ -179,7 +182,7 @@ sub main {
 
     make_clean($docdir);
 
-    my @files = find_files($docdir);
+    my @files = find_files($docdir, $exclpath);
 
     for my $f (@files) {
         convert_file($f, $basedir, $isolang, $fromcode, $tocode);
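
The path handling that this change moves into find_files can be tried in isolation. What follows is a minimal sketch, not code from convert2utf8. The helper name full_exclude_path is hypothetical. The committed substitution has no /g modifier, so it rewrites only the first run of slashes; the sketch assumes that every run should be collapsed and adds /g.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not part of convert2utf8: build the pattern that
# would be passed to find(1) as the argument of "-not -path".
sub full_exclude_path {
    my ($dn, $exclpath) = @_;

    # Reject absolute exclude paths, as the check in the commit does.
    die "** exclude path '$exclpath' must be relative (under '$dn')\n"
        if $exclpath =~ m%^/%;

    my $full = "$dn/$exclpath";

    # Assumption: /g is added so that every run of slashes is collapsed.
    # The substitution in the commit has no /g and therefore rewrites
    # only the first run it finds.
    $full =~ s%/{2,}%/%g;

    return $full;
}

# A trailing slash on the directory and a doubled slash in the exclude
# path both normalize to the same pattern as the clean inputs.
print full_exclude_path('/usr/doc/en_US.ISO8859-1', 'htdocs/releases/*/*'), "\n";
print full_exclude_path('/usr/doc/en_US.ISO8859-1/', 'htdocs//releases/*/*'), "\n";
```

Both calls print /usr/doc/en_US.ISO8859-1/htdocs/releases/*/*. Joining the directory and the exclude path with an explicit slash before normalizing means the result does not depend on whether the directory was given with a trailing slash.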