From: Ronald Klop
To: Aristedes Maniatis, freebsd-ports@FreeBSD.org
Subject: Re: pkgs contain non URL safe characters
Date: Tue, 1 Mar 2022 12:57:56 +0100
List-Id: Porting software to FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-ports

On 2/17/22 03:05, Aristedes Maniatis wrote:
> Just to check this behaviour, I used tcpdump to see what the request looked like from pkg-fetch.
>
>     123.ish.com.au.15580 > pkg0.twn.freebsd.org.http: Flags [P.], cksum 0x80e0 (incorrect -> 0xfc82), seq 1:184, ack 1, win 1027, options [nop,nop,TS val 975600196 ecr 3136747760], length 183: HTTP, length: 183
>     GET /FreeBSD:13:amd64/quarterly/All/openjdk11-11.0.13+8.1.pkg HTTP/1.1
>     Host: pkgmir.geo.freebsd.org
>     Accept: */*
>     User-Agent: pkg/1.17.5
>     Range: bytes=6733824-
>     Connection: close
>
> You can see in there that the + is not URL encoded. Is it expected that pkg uses URL standards for its repository? If not, any advice on how to host a repository on a commercial service like AWS cloudfront?
>
> Should we rewrite all our files with + symbols to spaces? Should pkg names only contain URL safe characters? Or should pkg-fetch be fixed to encode URLs?
>
> I took a quick look at the source for pkg.c and where it calls fetchXGet but I can't understand where any URL encoding might happen.
>
> Ari
>
> On 14/2/2022 11:18am, Aristedes Maniatis wrote:
>> Some packages contain "+" symbol which is a way of encoding spaces in a URL. This means that I'm having trouble hosting our pkg repository behind cloudfront/S3.
>>
>> I wasn't sure where to post this issue, so I put more details here: https://github.com/freebsd/poudriere/issues/976
>>
>> Is there a workaround for this issue? Could pkg-fetch escape such characters when interacting with a http repository?
>>
>> Cheers
>>
>> Ari

Hi,

I looked into this a bit and have not seen another answer on the ML yet.

I think this describes it pretty clearly, and it also points to the official HTTP specifications:
https://stackoverflow.com/questions/2678551/when-should-space-be-encoded-to-plus-or-20

TL;DR: The + character is not special in the path part of the URL; it only stands for a space in form-encoded data such as the query string. The request sent by pkg is compliant with the specs.

I'm aware that there is a difference between what the specs say and what browsers and servers do in real life. Why does Cloudfront decode a + to a space in this part of the URL?

Regards,
Ronald.
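
To make the distinction concrete, here is a minimal Python sketch using only the standard `urllib.parse` module. It is an illustration of the two decoding rules, not a description of what pkg, Cloudfront, or S3 actually do internally. The path is the one from the tcpdump output above; the variable names are mine.

```python
from urllib.parse import quote, unquote, unquote_plus

# Path taken from the request captured in the quoted tcpdump output.
path = "/FreeBSD:13:amd64/quarterly/All/openjdk11-11.0.13+8.1.pkg"

# Plain percent-decoding, as applied to a URL path: "+" stays a literal "+".
path_decoded = unquote(path)

# Form decoding (application/x-www-form-urlencoded, typical for query
# strings): "+" is turned into a space. Applying this rule to a path is
# what produces the wrong file name.
form_decoded = unquote_plus(path)

# Possible client-side workaround (an assumption, not something pkg does):
# percent-encode "+" as %2B so that either decoder recovers the same name.
encoded = quote(path, safe="/:")

print(path_decoded)
print(form_decoded)
print(encoded)
```

Whether a particular server or CDN maps a `%2B` request back to the object stored with a `+` in its name is something you would have to test against that service.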