Date: Mon, 12 Jan 2015 16:52:41 +0000
From: Matthew Seaman <m.seaman@infracaninophile.co.uk>
To: freebsd-questions@freebsd.org
Subject: Re: pkg-info: supplying word boundary in regex
Message-ID: <54B3FBD9.8090009@infracaninophile.co.uk>
In-Reply-To: <20150112154038.GA41662@holstein.holy.cow>
References: <20150112154038.GA41662@holstein.holy.cow>

On 2015/01/12 15:40, parv@pair.com wrote:
> (Running pkg 1.4.4 on FreeBSD 8-STABLE here.)
>
> Could somebody tell me please the syntax for word boundary in regex
> for pkg-info ...
>
>   pkg info -x '...'
>
>
> ... ?
>
> The pkg-info manual page says ...
>
>   -x, --regex
>     Treat pkg-name as a regular expression according to the "modern"
>     or "extended" syntax of re_format(7).
>
> ... and re_format man page says ...
>
>   There are two special cases of bracket expressions: the bracket
>   expressions `[[:<:]]' and `[[:>:]]' match the null string at the
>   beginning and end of a word respectively.  A word is defined as a
>   sequence of word characters which is neither preceded nor followed
>   by word characters.  A word character is an alnum character (as
>   defined by ctype(3)) or an underscore.
>
> ... then specifying a word boundary as "x[[:>:]]" (to match "x" at the
> end of a "word") causes "Invalid regex" error.  For example, to get
> result only for "tex" (avoiding packages with "text" as string in a
> package name[0]) ...
>
>   # pkg info -x 'tex[[:>:]]'
>   pkg: sqlite error while executing iterator in file \
>     pkgdb_iterator.c:905: Invalid regex

Normally I'd suggest trying:

   # env DEBUG=4 pkg info -x 'tex[[:>:]]'

which should display exactly what SQL is being run.  However it appears
that the error occurs before Sqlite does any querying -- it must be
while compiling the RE.

>
> Does pkg overstate its support for re_format(7) then?
>
>
>   - parv
>
>
> [0] Yes, I realize if a name happens to be "text-tex" I would also
>   get a result & that would be expected.  Common case is that
>   packages here have only either "text" or "tex" exclusively in
>   the name.
>

There is always "pkg info -x 'tex$'" to anchor the RE at the end of a
string.

However that doesn't really answer your question.  The re_format(7)
support comes straight out of libc -- it's compiled as a loadable module
for sqlite, so that queries of the form:

   SELECT foo FROM bar WHERE foo REGEXP 'some-re' ;

should be using re_format(7) style REGEXPs.  Admittedly, we haven't
tested all the odd corner cases for regular expression syntax.  Can you
raise an issue on github please?

	Cheers,

	Matthew
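
As a possible stopgap while `[[:>:]]' is rejected, the end-of-word
boundary can be approximated in plain extended RE syntax: require that
"tex" be followed either by a non-word character or by the end of the
string.  The following is only a sketch using grep -E rather than pkg
itself; the package names are made up for illustration, and whether
pkg's REGEXP accepts the same pattern is an assumption that has not
been verified here.

```shell
# Hypothetical package names, for illustration only.
names='tex-basic-3.14
texinfo-5.2
gettext-0.19
text-tex-1.0
latex'

# Emulate "tex at the end of a word" without [[:>:]]:
# "tex" must be followed by a non-word character (anything that is not
# alnum or underscore) or by the end of the string.
end_of_word_tex() {
    printf '%s\n' "$names" | grep -E 'tex([^[:alnum:]_]|$)'
}

end_of_word_tex
```

Unlike 'tex$', this form also matches names where "tex" ends a word in
the middle of the string (for example before a "-"), which is closer to
what `[[:>:]]' is documented to do.  It does consume the following
character, so it is not a true zero-width match.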