Date: Mon, 31 Jul 2017 16:22:20 +0000 (UTC)
From: Guido Falsi <madpilot@FreeBSD.org>
To: ports-committers@freebsd.org, svn-ports-all@freebsd.org, svn-ports-head@freebsd.org
Subject: svn commit: r446984 - in head/www: . py-html5-parser
Message-ID: <201707311622.v6VGMKnX023649@repo.freebsd.org>
Author: madpilot
Date: Mon Jul 31 16:22:20 2017
New Revision: 446984
URL: https://svnweb.freebsd.org/changeset/ports/446984

Log:
  A fast implementation of the HTML 5 parsing spec for Python. Parsing
  is done in C using a variant of the gumbo parser. The gumbo parse
  tree is then transformed into an lxml tree, also in C, yielding
  parse times that can be a thirtieth of the html5lib parse times.
  That is a speedup of 30x. This differs, for instance, from the gumbo
  python bindings, where the initial parsing is done in C but the
  transformation into the final tree is done in python.

  WWW: https://html5-parser.readthedocs.io/

Added:
  head/www/py-html5-parser/
  head/www/py-html5-parser/Makefile   (contents, props changed)
  head/www/py-html5-parser/distinfo   (contents, props changed)
  head/www/py-html5-parser/pkg-descr   (contents, props changed)
Modified:
  head/www/Makefile

Modified: head/www/Makefile
==============================================================================
--- head/www/Makefile	Mon Jul 31 16:02:03 2017	(r446983)
+++ head/www/Makefile	Mon Jul 31 16:22:20 2017	(r446984)
@@ -1668,6 +1668,7 @@
     SUBDIR += py-horizon
     SUBDIR += py-hpack
     SUBDIR += py-html
+    SUBDIR += py-html5-parser
     SUBDIR += py-html5lib
     SUBDIR += py-http-parser
     SUBDIR += py-httpie

Added: head/www/py-html5-parser/Makefile
==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ head/www/py-html5-parser/Makefile	Mon Jul 31 16:22:20 2017	(r446984)
@@ -0,0 +1,19 @@
+# $FreeBSD$
+
+PORTNAME=	html5-parser
+PORTVERSION=	0.4.3
+CATEGORIES=	www python
+MASTER_SITES=	CHEESESHOP
+PKGNAMEPREFIX=	${PYTHON_PKGNAMEPREFIX}
+
+MAINTAINER=	madpilot@FreeBSD.org
+COMMENT=	Fast implementation of the HTML 5 parsing spec for Python
+
+LICENSE=	APACHE20
+
+BUILD_DEPENDS=	${PYTHON_PKGNAMEPREFIX}lxml>=3.8.0:devel/py-lxml
+
+USES=		pkgconfig python
+USE_PYTHON=	autoplist distutils
+
+.include <bsd.port.mk>

Added: head/www/py-html5-parser/distinfo
==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ head/www/py-html5-parser/distinfo	Mon Jul 31 16:22:20 2017	(r446984)
@@ -0,0 +1,3 @@
+TIMESTAMP = 1501237401
+SHA256 (html5-parser-0.4.3.tar.gz) = dd5e3647c5919439c41600172ef96b5fdbf278028bd4000476f87412c4fb7b9c
+SIZE (html5-parser-0.4.3.tar.gz) = 261906

Added: head/www/py-html5-parser/pkg-descr
==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ head/www/py-html5-parser/pkg-descr	Mon Jul 31 16:22:20 2017	(r446984)
@@ -0,0 +1,9 @@
+A fast implementation of the HTML 5 parsing spec for Python. Parsing
+is done in C using a variant of the gumbo parser. The gumbo parse
+tree is then transformed into an lxml tree, also in C, yielding
+parse times that can be a thirtieth of the html5lib parse times.
+That is a speedup of 30x. This differs, for instance, from the gumbo
+python bindings, where the initial parsing is done in C but the
+transformation into the final tree is done in python.
+
+WWW: https://html5-parser.readthedocs.io/
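
For readers unfamiliar with distinfo: the SHA256 and SIZE lines in the diff above pin the exact distfile the port is built from. The ports framework performs this check itself (for example via `make checksum`). The Python below is only a rough sketch of the same idea, and the helper names `parse_distinfo` and `verify_distfile` are hypothetical, not part of any ports tooling.

```python
import hashlib
import re
from pathlib import Path

# Line formats copied from the distinfo file in this commit.
_SHA_RE = re.compile(r"^SHA256 \((?P<name>[^)]+)\) = (?P<digest>[0-9a-f]{64})$")
_SIZE_RE = re.compile(r"^SIZE \((?P<name>[^)]+)\) = (?P<size>\d+)$")

DISTINFO = """\
TIMESTAMP = 1501237401
SHA256 (html5-parser-0.4.3.tar.gz) = dd5e3647c5919439c41600172ef96b5fdbf278028bd4000476f87412c4fb7b9c
SIZE (html5-parser-0.4.3.tar.gz) = 261906
"""


def parse_distinfo(text):
    """Hypothetical helper: map each distfile name to its recorded sha256 and size."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        m = _SHA_RE.match(line)
        if m:
            entries.setdefault(m["name"], {})["sha256"] = m["digest"]
            continue
        m = _SIZE_RE.match(line)
        if m:
            entries.setdefault(m["name"], {})["size"] = int(m["size"])
    return entries


def verify_distfile(path, expected):
    """Hypothetical helper: True if the file at path matches the recorded size and digest."""
    data = Path(path).read_bytes()
    return (
        len(data) == expected["size"]
        and hashlib.sha256(data).hexdigest() == expected["sha256"]
    )


if __name__ == "__main__":
    entry = parse_distinfo(DISTINFO)["html5-parser-0.4.3.tar.gz"]
    print(entry["size"], entry["sha256"][:16])
```

Given a downloaded tarball, `verify_distfile("html5-parser-0.4.3.tar.gz", entry)` would report whether it matches the values committed here. Comparing both size and digest mirrors the two fields distinfo records.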