Re: Clarification regarding URI (RFC3986) spec followed by HTTP (RFC9110)

Raghu Saxena <poiasdpoiasd@live.com> Wed, 25 January 2023 10:54 UTC

Message-ID: <MEYP282MB3564CAEFF922DFEEEE32813DA3CE9@MEYP282MB3564.AUSP282.PROD.OUTLOOK.COM>
Date: Wed, 25 Jan 2023 18:54:15 +0800
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.6.1
Subject: Re: Clarification regarding URI (RFC3986) spec followed by HTTP (RFC9110)
To: ietf@ietf.org
References: <MEYP282MB3564A385B6CECB0E9E92A630A3CE9@MEYP282MB3564.AUSP282.PROD.OUTLOOK.COM> <634ad97a-9081-1831-9c07-999a3c8e1bbf@gmx.de>
Content-Language: en-US
From: Raghu Saxena <poiasdpoiasd@live.com>
In-Reply-To: <634ad97a-9081-1831-9c07-999a3c8e1bbf@gmx.de>
Content-Type: multipart/signed; micalg="pgp-sha256"; protocol="application/pgp-signature"; boundary="------------wHteuBQ75HqzgxvQW0PomRwn"
Archived-At: <https://mailarchive.ietf.org/arch/msg/ietf/0DKimfh26JCoedTMTUSj2x3yD1g>

On 1/25/23 17:47, Julian Reschke wrote:
> On 25.01.2023 10:04, Raghu Saxena wrote:
>> To whomever it may concern,
>>
>> I am writing to seek clarification regarding the URI spec (RFC3986)
>> followed by HTTP, specifically about percent-encoding arbitrary octets
>> (which do not comprise a valid UTF-8 sequence). In the last paragraph of
>> RFC3986 Section 2.5
>> (https://www.rfc-editor.org/rfc/rfc3986.html#section-2.5), it says, quote:
>>
>>  >  When a new URI scheme defines a component that represents textual
>>     data consisting of characters from the Universal Character Set [UCS],
>>     the data should first be encoded as octets according to the UTF-8
>>     character encoding [STD63]; then only those octets that do not
>>     correspond to characters in the unreserved set should be percent-
>>     encoded.
>>
>> This implies that URI schemes defined after RFC3986 must follow UTF-8
>> encoding in their URIs. However, the original HTTP/1.1 RFC (2616) was
>> dated June 1999, and so would not have had to "abide" by the UTF-8 rule.
>>
>> In fact, many web servers allow and process GET requests with
>> percent-encoded octets, which they decode as raw bytes and have the
>> application level logic handle how to process them.
>>
>> However, since HTTP's latest RFC is 9110, dated June 2022 (post
>> RFC3986), does it mean the UTF-8 rule now applies to it? I would think
>> not, since this would be a breaking change. But some comments on github
>> indicate that this is as per the spec ()
>
> Pointer?
>
My apologies, the comment is here: 
https://github.com/sindresorhus/got/issues/420#issuecomment-345416645


>> tl;dr - Is it compliant with the HTTP specification to send arbitrary
>> bytes, which do not represent a valid UTF-8 sequence, via
>> percent-encoding in the URL query parameter?
>
> Yes.
>
> The http scheme was not re-defined by RFCs after RFC 2616 (in fact, it
> was defined even before that).
>
> Best regards, Julian
>
Thanks for the clarification regarding schemes not being re-defined. I
will ask the library author to reconsider.
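
For reference, here is a minimal sketch of the case I am asking about, 
using only Python's standard urllib.parse. The example.com URL and the 
parameter name "q" are placeholders of mine, not taken from the library 
in question. It percent-encodes octets that are not valid UTF-8, and 
shows that the receiver gets them back intact only if it decodes to raw 
bytes rather than to text:

```python
from urllib.parse import quote, unquote, unquote_to_bytes

# Textual data, handled as RFC 3986 Section 2.5 describes for new schemes:
# encode to UTF-8 first, then percent-encode the resulting octets.
text_param = quote("é".encode("utf-8"), safe="")

# Arbitrary octets that do NOT form a valid UTF-8 sequence.
raw = b"\xff\xfe\x00abc"
encoded = quote(raw, safe="")

# Hypothetical URL, for illustration only.
url = "http://example.com/search?q=" + encoded

# Decoding to bytes round-trips the original octets exactly.
decoded_bytes = unquote_to_bytes(encoded)

# Decoding to str assumes UTF-8 and substitutes U+FFFD for invalid
# octets (the default errors="replace"), so the data is lost.
decoded_text = unquote(encoded)

print(text_param)
print(url)
print(decoded_bytes == raw)
print("\ufffd" in decoded_text)
```

So a client or server that insists on treating the query as UTF-8 text 
cannot carry such a value, while one that hands the raw bytes to the 
application can.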

Regards,

Raghu Saxena

(P.S. Sorry for the personal reply prior to this - my first time using 
mailing lists)