From: Miguel C
Date: Thu, 8 Sep 2022 00:06:20 +0100
Subject: Re: Slightly OT: How to grep for two different things in a file
To: Aryeh Friedman, FreeBSD Mailing List
List-Archive: https://lists.freebsd.org/archives/freebsd-questions

Maybe I didn't understand the complexity here, but doesn't grep -E or egrep work here?

egrep "string1|string2" ?

Also works with -r, and you can use it inside -exec in find too... but if you want to search for more than one string, AFAIK this is the easiest way.
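
A minimal sketch of what the alternation does, using made-up files in a temporary directory (the names a.java, b.java and c.java and their contents are purely illustrative, not taken from the real code base). Note that grep -E 'x|y' selects lines matching either pattern, so with -l it lists files containing at least one of the strings, which can be more files than those containing both:

```shell
# Illustrative only: the files below are hypothetical test data.
dir=$(mktemp -d)
printf 'String tid;\nString p = "/tmp";\n' > "$dir/a.java"   # contains both strings
printf 'String tid;\n' > "$dir/b.java"                        # contains only tid
printf 'String other;\n' > "$dir/c.java"                      # contains neither

# -r recurses, -l prints only the names of files with at least one matching line.
either=$(grep -rlE 'tid|/tmp' "$dir" | sort)
printf '%s\n' "$either"
```

Here both a.java and b.java are listed, so the alternation on its own answers "either string" rather than "both strings".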



On Wed, Sep 7, 2022, 23:56 Andreas Kusalananda Kähäri <andreas.kahari@abc.se> wrote:
On Wed, Sep 07, 2022 at 06:00:36PM -0400, Aryeh Friedman wrote:
> I have 2 patterns I need to find in a given set of files.  A file only
> matches if it contains *BOTH* patterns but not in any given
> relationship as to where they are in the file.  In the past I have
> used piped greps when both patterns are on the same line but in my
> current case they are almost certainly not on the same line.
>
> For example my two patterns are "tid" (String variable name) and
> "/tmp" [String literal] (i.e. the full string is the concatenation of
> the two patterns), I would do:
>
> grep -Ri tid src/java|grep -i /tmp
>
> But since /tmp is in a symbolic constant defined elsewhere (in a
> different Java file) I need to find programmatically either the name
> of the constant (has different names in different classes) and then do
> the piped grep above with it or I need to look for the two patterns
> separately and say a file is only accepted if it has both.
>
> P.S. The reason for this is I am attempting to audit my code base to
> see what classes leave behind orphaned temp files.
>
> --
> Aryeh M. Friedman, Lead Developer, http://www.PetiteCloud.org

I don't see an example of the stuff you talk about after "But since
/tmp is in a symbolic constant defined elsewhere..." so I don't fully
understand what that would involve and will therefore ignore it.
Instead, the following answers the subject question, "How to grep for
two different things in a file".

        find src/java -type f \
                -exec grep -qF 'tid' {} \; \
                -exec grep -qF '/tmp' {} \; \
                -print

or call an in-line script,

        find src/java -type f -exec sh -c '
                for pathname do
                        if grep -qF "tid" "$pathname" &&
                           grep -qF "/tmp" "$pathname"
                        then
                                printf "%s has both tid and /tmp\n" "$pathname"
                        fi
                done' sh {} +

In any case, the point is to first test a file for one of the strings,
and if that succeeds, test the same file for the other string, then
report the file as accepted if that other string was also found.
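
The two-step test described above can be tried out on throwaway data. This is only a sketch: the temporary directory and the files a.java, b.java and c.java are hypothetical stand-ins for src/java. Each -exec acts as a test, so the second grep runs only for files where the first one succeeded, and -print is reached only when both did:

```shell
# Illustrative only: the files below are hypothetical test data.
dir=$(mktemp -d)
printf 'String tid;\nString p = "/tmp";\n' > "$dir/a.java"   # both strings, on different lines
printf 'String tid;\n' > "$dir/b.java"                        # only tid
printf 'String p = "/tmp";\n' > "$dir/c.java"                 # only /tmp

# A file is printed only if both greps exit successfully for it.
matches=$(find "$dir" -type f \
        -exec grep -qF 'tid' {} \; \
        -exec grep -qF '/tmp' {} \; \
        -print)
printf '%s\n' "$matches"
```

Only a.java is reported, even though the two strings are not on the same line.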

See grep(1) for what -F and -q do.  I dropped the -i option as I
assumed that you actually know the case, at least when looking for
"/tmp".

Also, https://unix.stackexchange.com/questions/389705
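
As a quick illustration of the -F and -q options mentioned above (a sketch on made-up input, not on the real files): -F treats the pattern as a fixed string rather than a regular expression, and -q suppresses output so that only the exit status tells whether a match was found.

```shell
# Two sample lines; in a regular expression the dot in 'a.b' matches any character.
sample=$(printf 'a.b\naxb\n')

regex_count=$(printf '%s\n' "$sample" | grep -c 'a.b')    # regex: both lines match
fixed_count=$(printf '%s\n' "$sample" | grep -cF 'a.b')   # fixed string: only the literal a.b
printf 'regex: %s, fixed: %s\n' "$regex_count" "$fixed_count"

# -q prints nothing; the result is carried by the exit status alone.
if printf '%s\n' "$sample" | grep -qF 'axb'; then
        quiet_status=found
else
        quiet_status=missing
fi
printf '%s\n' "$quiet_status"
```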


--
Andreas (Kusalananda) Kähäri
SciLifeLab, NBIS, ICM
Uppsala University, Sweden
