Re: [Int-area] Where/How is the features innovation, happening? Re: 202112271737.AYC

Jiayihao <jiayihao@huawei.com> Thu, 13 January 2022 07:58 UTC

From: Jiayihao <jiayihao@huawei.com>
To: Tom Herbert <tom@herbertland.com>
CC: "Abraham Y. Chen" <aychen@avinta.com>, "int-area@ietf.org" <int-area@ietf.org>
Thread-Topic: [Int-area] Where/How is the features innovation, happening? Re: 202112271737.AYC
Thread-Index: AQHX+5ceWKxDFFe/zkuEJARHwXoT9KxK2RgAgAxgX8CAAAJkAIAGYPVQgAAJdoCAAv0fgA==
Date: Thu, 13 Jan 2022 07:58:27 +0000
Message-ID: <1a9176b279ee4ddb92f21b43dcf1897a@huawei.com>
References: <7c509337-31b5-c0d2-020e-aca6fc9d344e@avinta.com> <ce16122f3387479ca4456325a6fb0a6b@huawei.com> <0a3711b9-0969-2eb1-15e6-3aa8354901d0@avinta.com> <CALx6S37zAk-hdk-a5PGnFYaow9mhb1XmC=4PpqMkdVredq8Zhg@mail.gmail.com> <3a323b07b379455fb688cc548ec70a49@huawei.com> <CALx6S34xzTT+Ch_CcAXAZvsLpdq=1xKhszgUVR1iQx40k3ZgjQ@mail.gmail.com> <b6758de9c62b4e96bb8cda3263453e06@huawei.com> <CALx6S34fY9i43YX72yp7UEMuGEKSVeNxCbAA7-qSoYCCL30YKQ@mail.gmail.com>
In-Reply-To: <CALx6S34fY9i43YX72yp7UEMuGEKSVeNxCbAA7-qSoYCCL30YKQ@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/int-area/Xxv9QFNO2xAVEi6H95hY2H6dTYc>
Subject: Re: [Int-area] Where/How is the features innovation, happening? Re: 202112271737.AYC

Thanks for sharing. I hope you don’t mind if I rephrase it as follows:

TA6 (IPv6 Temporary address) can be classified into 3 types:
A. X devices to 1 prefix (X >> 1): good for privacy, and looks like an upgraded version of NAT in IPv6 scenarios.
B. 1 device to 1 prefix: no use for privacy.
C. 1 device to Y prefixes (Y >> 1): good for privacy, but wasteful of addresses, and we may soon be back to the days of IPv4 when we lacked address space.

Then Type A might be the only outlet for TA6 that maintains the end-to-end principle as well. However, to make TA6-TypeA really work, the network provider and the device still need to agree on the following points:

1. The device gets a lot of sub-address slots (as you guessed, 1M), and such slots should be scattered enough to avoid a consequence like TA6-TypeB. (It does not matter whether addresses come from DHCP or SLAAC, but I worry about the cost of SLAAC because there may be too many duplicate address detection packets.)
2. The device should treat each address as disposable, one per connection.
3. The device should regularly refresh these slots (return the used slots and request new ones).

I guess this is the (only) direction in which TA6 can improve privacy (an upgraded NAT).

Thanks for sharing your insight on the ID/Loc split. I am wondering whether that is a variant of TA6-TypeA. Or perhaps it helps privacy because it is like another variant of tunneling.
Similarly, in the draft (Table 1): https://www.ietf.org/archive/id/draft-jia-flex-ip-address-structure-00.html#tab1
A hierarchical address was proposed, and it suits the scenario you described: address segment 1 (outer address) is used for routing outside the network provider, and segment 2 (inner address) is used for routing inside the network provider.

I may need more time to think about it, but it is really good to learn and explore!

Many thanks,
Yihao


From: Tom Herbert <tom@herbertland.com>
Sent: January 12, 2022 1:37
To: Jiayihao <jiayihao@huawei.com>
Cc: Abraham Y. Chen <aychen@avinta.com>; int-area@ietf.org
Subject: Re: [Int-area] Where/How is the features innovation, happening? Re: 202112271737.AYC



On Tue, Jan 11, 2022 at 2:02 AM Jiayihao <jiayihao@huawei.com> wrote:
Hello Tom,

Thanks for sharing draft-herbert-ipv6-prefix-address-privacy. I have concerns about the following points after going through the draft and related references.

- Both IPv6 Temporary Address (TA6 for short) and NAT only enhance privacy against third parties, not against the Law Enforcement party (a party with super power).

If so, let’s focus on privacy against those third parties:

- If a prefix is assigned to a certain entity (like a house, or a certain laboratory), privacy is still hard to achieve because such an entity can be correlated to just a few users, and such a mapping can easily be figured out in reality.

In general, if X (X>=1) endpoints are assigned Y (Y>=1) shared prefixes in TA6 scenarios, I feel the only ways privacy can be enhanced are:
1. X >> Y, so the case is pretty similar to NAT
2. Y >> X, so it is as if every endpoint has infinite prefixes, and I just assume that each endpoint can use a different prefix for each flow. (However, in reality we would really worry about a prefix consumption attack, which looks like another kind of DDoS.)

Hi Yihao,

Thanks for the question. We need to consider the vastness of the IPv6 address space, and it's not prefixes that are assigned to end hosts, but rather blocks of individual addresses that are pseudo randomly assigned out of a provider's address prefix. Suppose that some provider only has a 64 bit prefix for assigning these addresses, and let's say they need to serve 1B users and each user has a generous 1M temporary addresses at any given time. The number of addresses in the prefix is 1.8*10^19 and the number of addresses used at any time is 10^15, or only about 0.005% of the provider's address space.
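As a rough check of the arithmetic above, here is a minimal Python sketch. The function name `utilization` and its parameters are only illustrative; the inputs (a 64 bit prefix, 1B users, 1M addresses per user) are the ones assumed in the paragraph.

```python
def utilization(prefix_bits, users, addrs_per_user):
    """Return (total addresses under the prefix, addresses in use, fraction in use)."""
    total = 2 ** (128 - prefix_bits)   # addresses available under the provider prefix
    in_use = users * addrs_per_user    # temporary addresses held at any given time
    return total, in_use, in_use / total

# Figures assumed in the message: /64 prefix, 1B users, 1M addresses each.
total, in_use, fraction = utilization(64, 10**9, 10**6)
print(f"total={total:.2e} in_use={in_use:.0e} fraction={fraction:.4%}")
```

Changing `prefix_bits` or `addrs_per_user` shows how quickly the headroom shrinks or grows under other assumptions.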

If we compare this to NATv4, temporary IPv6 addresses have much better scaling. Even in the best case scenario where a provider had an 8 bit IPv4 prefix, so that they could assign 2^24 addresses to NAT and combine them with 16 bits of port numbers, that is only 40 bits of source address space. If the mappings include destination address and port then that could provide a few more bits, but still well short of the usable address space in the example above.
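The comparison can be put in bits with a small sketch. The helper names below are hypothetical, and the defaults are just the assumptions stated in the text (an 8 bit IPv4 prefix with 16 bits of port for NAT, a 64 bit IPv6 provider prefix).

```python
def nat44_source_bits(prefix_len=8, port_bits=16):
    """Bits of source space for NATv4: host bits under the prefix plus port bits."""
    return (32 - prefix_len) + port_bits

def ipv6_block_bits(prefix_len=64):
    """Bits of address space left under an IPv6 provider prefix."""
    return 128 - prefix_len

nat_bits = nat44_source_bits()        # 24 + 16
v6_bits = ipv6_block_bits()           # 128 - 64
ratio = 2 ** (v6_bits - nat_bits)     # how many times larger the IPv6 space is
print(nat_bits, v6_bits, ratio)
```

Adding a few bits to `port_bits` to account for mappings keyed on destination address and port still leaves the NAT space far smaller.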

As for prefix assignment, the practice of some mobile providers of giving each end device its own 64 bit prefix seems wasteful to me and would lead to prefix exhaustion as you mentioned. I suppose one of the ostensible reasons to do this is privacy: each end device could create its own unique addresses from its prefix. But as already discussed, if all these addresses share the same network prefix that correlates to the user's device then there's really no privacy benefit.

Any other case cannot provide good privacy.
HOWEVER, the suffix/IID in TA6 is irrelevant NO MATTER whether it is case 1 or case 2!

For this reason, I am pretty pessimistic: TA6 never seems to provide a practical privacy enhancement compared to IPv4, nor do the referenced drafts provide any proof of one.

---

Besides, as you mention the ID/locator split, I doubt that it works, because although the ID provides explicit identity, locators still and always provide IMPLICIT identity, just as IPv4 did. IMO the ID/locator split cannot remove the implicit nature by which a locator is always a shadow of identity.


I would confine the ID/locator split to only internal provider use. Externally, the only specific information that should be exposed in an IP address is the provider's network prefix that routes a packet to the provider's network. The externally visible address should not reveal any of the internal structure of the provider's network and definitely no information that could be used to deduce sensitive privacy information about the user like location or human identity. When a packet enters the provider's network it needs to be mapped to the real location of the address. This already happens today in mobile networks and virtual networks where the real location of the host is not set in the address. Once the real location is determined, the packet is tunneled using an encapsulation to the location and at the tunnel endpoint it's decapsulated and the originally sent packet is delivered to the user. ID/locator split in IPv6 addresses is just an alternative for doing this compared to using a more expensive encapsulation like GRE or VXLAN and similarly would not be exposed to the Internet.
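A minimal sketch of the ingress step described above, assuming the provider keeps a table from the externally visible address to an internal locator. The table contents, the function name `ingress_forward`, and the addresses are all hypothetical (documentation and ULA ranges), and a real deployment would use a distributed mapping system rather than an in-memory dictionary.

```python
import ipaddress

# Hypothetical provider-side mapping: externally visible address ->
# internal locator of the node currently serving that host.
LOCATOR_MAP = {
    ipaddress.IPv6Address("2001:db8:0:1:5a3c:91ff:2b7e:10d4"):
        ipaddress.IPv6Address("fd00:10::7"),
}

def ingress_forward(dst):
    """Map an arriving packet's destination to its internal locator.

    Returns a dict describing the encapsulated packet, or None when the
    destination is unknown (the packet would be dropped).
    """
    dst = ipaddress.IPv6Address(dst)
    locator = LOCATOR_MAP.get(dst)
    if locator is None:
        return None
    # The outer header carries the locator; the original destination is
    # preserved inside and restored at the tunnel endpoint.
    return {"outer_dst": locator, "inner_dst": dst}

print(ingress_forward("2001:db8:0:1:5a3c:91ff:2b7e:10d4"))
```

The external address carries nothing about the locator; only the provider's table links the two.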

Tom


Best Regards,
Yihao

From: Tom Herbert <tom@herbertland.com>
Sent: January 7, 2022 23:38
To: Jiayihao <jiayihao@huawei.com>
Cc: Abraham Y. Chen <aychen@avinta.com>; int-area@ietf.org
Subject: Re: [Int-area] Where/How is the features innovation, happening? Re: 202112271737.AYC



On Thu, Jan 6, 2022 at 11:48 PM Jiayihao <jiayihao@huawei.com> wrote:
Hi Tom,

I love your argument on RFC7721 and agree that quantitative analysis is needed to prove that periodically shifting the source address is more secure, and if so, the timing of the address shift should be calculated.

Apparently a simple answer to that is OTP, as you detailed. However, the sad thing is that if every endpoint (mostly the requestor/client in IPv6) shifts its source address for each connection, is that *seriously affordable* for the network (provider) that these endpoints are attached to?

Hi Yihao,

It's unclear to me whether it's affordable without a specific proposal on how to do it. In order to do this it would require a service provider to manage and track a very large block of independent host addresses; however, given how other databases have been able to scale to very large datasets, I tend to believe there is a feasible solution. I wrote a draft outlining the problem and general solutions in https://www.ietf.org/archive/id/draft-herbert-ipv6-prefix-address-privacy-00.txt

If not, is that the reason you prefer CGNAT to IPv6 Temporary addresses (RFC7721) for IP layer privacy?

Objectively, it seems that CGNAT, under the right conditions, does provide better privacy for users in IP addressing. The strength of CGNAT in terms of privacy has also been anecdotally demonstrated by law enforcement's concerns about it. It pains me to say that because the whole point of a 128 bit address space in IPv6 was to eliminate the NAT abomination :-)

Tom


Happy new year!
Thanks,
Yihao

From: Tom Herbert <tom@herbertland.com>
Sent: December 31, 2021 2:30
To: Abraham Y. Chen <aychen@avinta.com>
Cc: Jiayihao <jiayihao@huawei.com>; int-area@ietf.org
Subject: Re: [Int-area] Where/How is the features innovation, happening? Re: 202112271737.AYC



On Mon, Dec 27, 2021 at 7:00 PM Abraham Y. Chen <aychen@avinta.com> wrote:
Hi, YiHao:

0)    Hope you had a Merry Christmas as well!

1)    Re: Ur. Pts 1) & 2):    Allow me to modify and expand your definitions of the abbreviations, ICP & ISP, a bit to streamline our discussion, then focusing on related meanings of the two keyword prefixes, "C" and "A" in the middle of them:

    A.    ICP (Internet Content Provider):    This is the same as you are using.

    B.    IAP (Internet Access Provider):    This will represent the ISP that you are referring to.

    C.    ISP (Internet Service Provider):    This will be used as the general expression that covers both ICP and IAP above.

    With these, I agree in general with your analysis.

2)    From the above, there is a simpler (layman's instead of engineer's) way to look at this riddle. Let's consider the old fashioned postal service. A letter itself is the "Content". The envelope has the "Address". The postal service cares only about what is on the envelope. In fact, it is common practice, though rarely stated explicitly, for one letter to have multiple layers of envelopes, each opened by an "Addressee" who then forwards it to the next "Addressee" according to the "Address" on the inner envelope. On a larger scale, postal services put envelopes destined for the same city in one bag. Then, bags destined for the same country go in one container, etc. This process is refined to multiple levels depending on the volume of the mail and the facilities (routes) available for delivery. Then, the containers are opened progressively along the destination route. No wonder the US Postal Service claimed (during the early days of the Internet) that the mail system was the first "packet switching" system.

3)    So, in this analogy, the "Address" on each and every envelope has to be in the clear (not coded or encrypted in any sense) for the mail handlers to work with. Only the innermost "Content", the letter itself, can hold Confidential information (or be encrypted if the sender wishes). Under this scenario, the LE (Law Enforcement) is allowed only to track suspected mail by the "Addresses". And, any specific surveillance is only authorized by a court, case by case. While no one can prevent LE from bypassing this procedure, cases built by violating this requirement would be grounds for being thrown out of court.

4)    However, in the Internet environment, many, if not most, Addresses are dynamic. There is no way to specify an IP Address for surveillance of a suspect. This gives the LE the perfect excuse to scoop up everything and then analyze it offline. This gives them plenty of time to try various ways to decrypt the encoded messages and the opportunity to sift through everything for incidental "surprise bonus finds". The result is that practically no privacy is left for anyone. This means that all of the schemes for scrambling IP Addresses are useless in the end. So, why do we bother with doing so, at all?

Abe,

Happy New Year!

Your argument seems to be that we shouldn't bother with things like security or encryption at all :-) While it's true that anything sent into the Internet can be intercepted and analyzed offline, it's clearly the intent of security and privacy mechanisms to make offline analysis of data ineffective or at least cost prohibitive. For encryption the calculation is pretty straightforward: the complexity and cost of breaking a cipher is generally correlated to the key size. So for any given key size, it can be determined what sort of resources are required to break the code. This is a continuous escalation as attackers gain access to more computational resources and there are breakthroughs like in quantum computing that require rethinking encryption.  But regardless, the effectiveness of encryption at any given point of time is quantifiable.

For security and privacy in IP addresses I believe we should be similarly taking a quantitative approach. This is where RFC7721 fails. The recommendation of RFC7721 is that for better security, use temporary addresses with shorter lifetimes. But the RFC doesn't attempt to quantify the relationship between address lifetime and the security that's offered or even say what specific lifetime is recommended for optimal security. For instance, if the user changes their interface address twice a day instead of once a day does that halve the chances that someone may breach their security by correlating two different flows sourced from the user? Probably not. But, what if they change their address every five minutes? How much better is that than changing the address once a day? It's intuitive that it should be better security, but is it _really_ better? And if it is better, are the benefits worth the aggravation of changing the address? This is quite similar to some companies that have a policy that everyone needs to change their passwords periodically. Studies have shown that there is little quantitative value in doing this and in fact the net effect is likely less security and increased user aggravation-- even so, companies will continue to do this because it's easier to stick with the inertia of intuition.

The fix for the password problem is one time passwords (OTP) and IMO that hints at the fix for the address security problems described in RFC7721: essentially we need single use source addresses for each connection.  The security effects of single use addresses are quantifiable, i.e. given sample packets from two independent flows generated by the same user, without additional information it isn't possible for a third party to correlate that they are sourced by the same user.
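To make the idea concrete, here is a minimal sketch of drawing a fresh, pseudo-random source address for each connection. The class name `SingleUseAllocator`, the example prefix (from the documentation range), and the in-memory set of used addresses are assumptions for illustration only; they are not taken from any draft or RFC.

```python
import ipaddress
import secrets

class SingleUseAllocator:
    """Hand out each address under a prefix at most once (illustrative only)."""

    def __init__(self, prefix):
        self.net = ipaddress.IPv6Network(prefix)
        self.used = set()  # addresses already handed out

    def next_address(self):
        host_bits = 128 - self.net.prefixlen
        while True:
            # Pick a random suffix so consecutive addresses share no pattern.
            suffix = secrets.randbits(host_bits)
            addr = self.net.network_address + suffix
            if addr not in self.used:
                self.used.add(addr)
                return addr

alloc = SingleUseAllocator("2001:db8:1:2::/64")
first = alloc.next_address()    # source address for one connection
second = alloc.next_address()   # a different address for the next one
print(first, second)
```

Because the suffixes are drawn independently, two addresses from the same allocator reveal nothing beyond the shared provider prefix.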

Tom



Happy New Year!


Abe (2021-12-27 21:59)





On 2021-12-23 22:26, Jiayihao wrote:
Hello Abe,

Users are unwilling to be watched by any party (ISP, and ICP also) except the users themselves. Actually I would like to divide the argument into 2 cases: the network layer and below (not completely but mostly controlled by the ISP); the transport layer and above (not completely but mostly controlled by the ICP).

1) For the transport layer and above, Encryption Everywhere (like TLS) is a good tool for providing user privacy. However, it is only a tool against ISPs, while ICPs survive and keep gaining revenue (even by selling data, as in the negative news about Facebook, or Meta, whatever you call it). As discussed, it is not the network's fault, because IP already provides peer-to-peer. You may blame CGNAT in ISPs for increasingly pushing a C/S mode in place of P2P, as in China where IPv4 addresses are scarce and CGNAT is almost everywhere. However, I don’t find the situation any better in the U.S., where most IPv4 addresses are located. It is a business choice to overwrite the mode to be peer-ICP-peer (C/S mode) at the application layer, rather than use the P2P mode natively provided by IP.

In this case, there are trust points and they are ICPs.

2) For the network layer and below, ISPs and IP still provide a pure P2P network, and encryption in TLS does not blind the ISP at the IP layer, since the IP header is still in plaintext and almost entirely controlled by the ISP. That is to say, in an access network scenario, the access network provider can see every trace of every user at the network layer (although not the encrypted payload). To counter this, one can use a proxy (e.g., VPN, Tor) to bypass the trace analysis, just as CGNAT does. The only difference is that the detour points (proxies) belong to a third party, not the ISP.

In this case, there are trust points and they are third party proxies.

The bottom line is that trust points are everywhere, explicitly or implicitly, and privacy can be leaked from every (trust) point that you trust (or have business with). No matter what network system you have, whether it is PSTN or ATM, these trust points are the weak points for your privacy, and the only thing users can beg for is that *ALL* trust points 1) are well behaved/don’t be evil; 2) have systems advanced enough that they can’t be hacked by others; 3) are protected by law.

I would say that is pretty challenging, but also something to look forward to.
The network itself just cannot be bypassed in reaching that.

Merry Christmas,
Yihao


From: Abraham Y. Chen <aychen@avinta.com>
Sent: December 23, 2021 10:01
To: Jiayihao <jiayihao@huawei.com>
Cc: tom@herbertland.com; int-area@ietf.org
Subject: Re: [Int-area] Where/How is the features innovation, happening? Re: 202112221726.AYC
Importance: High


Hi, YiHao:

0)    I am glad that you distilled the complex and elusive privacy / security tradeoff issues to a very unique and concise perspective.

1)    Yes, the IPv4 CG-NAT and IPv6 Temporary address may seem to provide some privacy protection. However, with the availability of the computing power, these (and others such as VPN) approaches may be just ostrich mentality.  On the other hand, they provide the perfect excuse for the government (at least US) to justify for "mass surveillance". For example, the following is a recent news report which practically defeats all current "privacy protection" attempts.

    https://www.usatoday.com/story/news/2021/12/08/federal-court-upholds-terrorism-conviction-mass-surveillance-case/6440325001/

[jiayihao] there is no doubt.

2)    Rather than contradicting efforts, it is time to review whether any of these schemes such as mapping techniques really is effective for the perceived "protection". As much of the current science fiction type crime scene detective novel / movie / TV program hinted, the government probably has more capability to zero-in on anyone than an ordinary citizen can imagine, anyway. And, businesses have gathered more information about us than they will ever admit. Perhaps we should "think out of the box" by going back to the PSTN days of definitive subscriber identification systems, so that accordingly we will behave appropriately on the Internet, and the government will be allowed to only monitor suspected criminals by filing explicit (although in secret) requests, case by case, to the court for approval?



Happy Holidays!





Abe (2021-12-22 21:00 EST)



Hello Tom,



The privacy countermeasures for IPv4/IPv6 are interestingly different.

IPv4 usually utilizes CGNAT, i.e., M(hosts)-to-N(IPs), where M >> N so that the host could remain anonymous.

IPv6 usually utilizes Temporary addresses, i.e., 1(host)-to-M(IPs[at least suffix level]), where M >> 1 so that the host could remain anonymous.

HOWEVER, I don't feel either approach achieves privacy perfectly, because the access network has a global perspective on the M-to-N or 1-to-M mapping.

For this reason, it is hard to be convinced that IPv4/6 itself can achieve perfect privacy.



Thanks,

Yihao Jia



-----------



I believe CGNAT is better than IPv6 in terms of privacy in addressing.
In fact one might argue that IPv4 provides better privacy and security
than IPv6 in this regard. Temporary addresses are not single use which
means the attacker can correlate addresses from a user between
unrelated flows during the quantum the temporary address is used. When
a user changes their address, the attacker can continue monitoring if
it is signaled that the address changed. Here is a fairly simple
exploit I derived to do that (from
draft-herbert-ipv6-prefix-address-privacy-00).

The exploit is:

      o An attacker creates an "always connected" app that provides some
        seemingly benign service and users download the app.

      o The app includes some sort of persistent identity. For instance,
        this could be an account login.

      o The backend server for the app logs the identity and IP address
        of a user each time they connect

      o When an address change happens, existing connections on the user
        device are disconnected. The app will receive a notification and
        immediately attempt to reconnect using the new source address.

      o The backend server will see the new connection and log the new
        IP address as being associated with the specific user. Thus,
        the server has a real-time record of users and the IP address
        they are using.

      o The attacker intercepts packets at some point in the Internet.
        The addresses in the captured packets can be time correlated
        with the server database to deduce identities of parties in
        communications that are unrelated to the app.

The only way I see to mitigate this sort of surveillance is single use
addresses. That is effectively what  CGNAT can provide.
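The time-correlation step, and why single use addresses defeat it, can be illustrated with a toy model. Everything below is hypothetical: the log format (time, user, address), the function names, and the sample data are invented for illustration, and the model ignores address release and reuse.

```python
from bisect import bisect_right

def build_index(log):
    """Group (time, user, address) log entries by address, sorted by time."""
    index = {}
    for t, user, addr in sorted(log):
        index.setdefault(addr, []).append((t, user))
    return index

def who_held(index, addr, t):
    """Return the user last logged with addr at or before time t, else None."""
    entries = index.get(addr, [])
    times = [ts for ts, _ in entries]
    i = bisect_right(times, t) - 1
    return entries[i][1] if i >= 0 else None

# Toy log: alice changes address at t=400, bob connects at t=150.
log = [
    (100, "alice", "2001:db8::a"),
    (400, "alice", "2001:db8::b"),
    (150, "bob", "2001:db8::c"),
]
index = build_index(log)

# A packet seen elsewhere at t=250 from an address reused across flows
# matches a log entry; an address never seen by the app does not.
print(who_held(index, "2001:db8::a", 250))
print(who_held(index, "2001:db8::ffff", 250))
```

With a temporary address shared across flows, the lookup succeeds for traffic unrelated to the app. With single use addresses, the address on an unrelated flow never appears in the log, so the lookup comes back empty.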



Tom
