Re: [dnssd] Adoption call for draft-sctl-advertising-proxy

Simon Lin <simonlin@google.com> Tue, 17 August 2021 02:11 UTC

From: Simon Lin <simonlin@google.com>
Date: Tue, 17 Aug 2021 10:10:47 +0800
Message-ID: <CADPZrgRkcYG=DgvuZxAHS6caYX8bF-qfq-jHbwb+KJMDgvP6Fw@mail.gmail.com>
To: mellon@fugue.com
Cc: Jonathan Hui <jonhui@google.com>, dnssd@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/dnssd/q0a7fVVk5ktU3W3cFb7RXfTJlmQ>
Subject: Re: [dnssd] Adoption call for draft-sctl-advertising-proxy

> It generally only happens when the Thread OMR prefix changes. So in a production network it should be rare. But the Thread network can fluctuate a bit on startup as the mesh settles, and this can result in changes to the OMR prefix. So although this shouldn’t happen much, the time when it is most likely is right after a power outage or right when the devices are first installed.

Agreed, the OMR prefix can change for the Thread devices of one
partition when two partitions merge.

Are you implying that the SRP server IP address in the SRP Unicast
Dataset is using the OMR address of the SRP server?

In such a case, an OMR prefix change would lead to an SRP server IP
change in the Unicast Dataset, which in turn would force SRP clients
to switch to the new SRP server IP.
However, if the SRP servers are using ML-EIDs in the Unicast Dataset,
the SRP server IP won't change when partitions merge, which will lead
to fewer SRP server switches on SRP clients.

Nevertheless, if an SRP client migrates from one partition to another
and each partition has a different SRP server, the SRP client will
have to re-register, which will cause name conflicts (assuming no SRP
replication).
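
The migration case can be sketched with a toy model in Python. This is
only an illustration under simplifying assumptions: MdnsLink and
SrpServer are hypothetical classes, "probing" is reduced to a lookup
in a shared table, and there is no SRP replication between the two
servers. It is not how any real advertising proxy is implemented.

```python
class MdnsLink:
    """Toy stand-in for the infrastructure link shared by both SRP servers."""

    def __init__(self):
        self.owners = {}  # name -> id of the server currently advertising it

    def probe(self, name, server_id):
        """Return True if server_id may advertise name, False on a conflict."""
        owner = self.owners.get(name)
        if owner is not None and owner != server_id:
            return False
        self.owners[name] = server_id
        return True


class SrpServer:
    """Toy SRP server that advertises registrations on the mDNS link."""

    def __init__(self, server_id, link):
        self.server_id = server_id
        self.link = link
        self.registrations = {}  # name -> client key

    def register(self, name, client_key):
        held_by = self.registrations.get(name)
        if held_by is not None and held_by != client_key:
            return "srp-conflict"   # a different client owns the name here
        if not self.link.probe(name, self.server_id):
            return "mdns-conflict"  # the other server still advertises it
        self.registrations[name] = client_key
        return "ok"


link = MdnsLink()
server_a = SrpServer("A", link)  # SRP server of the first partition
server_b = SrpServer("B", link)  # SRP server of the second partition

first = server_a.register("sensor", "key-1")   # initial registration
second = server_b.register("sensor", "key-1")  # same client after migrating

print(first, second)
```

Server B has no record of the client's key, so it cannot tell that the
name on the link belongs to the same client, and the re-registration
ends in a conflict.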

So, it's foreseeable that partitioning may cause many SRP name
conflicts when SRP replication is not adopted. That being said, can we
argue that SRP replication is less useful when TREL can significantly
reduce the chance of partitioning?

--
Simon Lin


On Mon, Aug 16, 2021 at 11:17 PM <mellon@fugue.com> wrote:
>
> It generally only happens when the Thread OMR prefix changes. So in a production network it should be rare. But the Thread network can fluctuate a bit on startup as the mesh settles, and this can result in changes to the OMR prefix. So although this shouldn’t happen much, the time when it is most likely is right after a power outage or right when the devices are first installed.
>
> --
> mellon@fugue.com
>
> On August 16, 2021 at 03:40:32, Simon Lin (simonlin@google.com) wrote:
>>
>> On the other hand we see conflicts all the time with SRP because of the two-SRP-server mDNS conflict detection algorithm, so it's pretty clear that SRP + mDNS conflict detection is a recipe for creating conflicts.
>>
>>
>> Could you explain why there are so many conflicts because of the
>> two-SRP-server mDNS conflict detection algorithm?
>>
>> Are the SRP servers using SRP Anycast TLV or Unicast TLV?
>>
>> I would expect name conflicts to happen more often if SRP servers are
>> using Anycast TLVs because an SRP client can change SRP server for each
>> registration or renewal.
>>
>> When SRP servers are using Unicast TLVs, an SRP client should stick to
>> the same SRP server for registration and renewal, even after the SRP
>> client is rebooted.
>> An SRP client may change SRP server when there are multiple request
>> failures, but I would expect that to happen less often.
>> So, I would not expect many name conflicts for SRP servers using Unicast TLVs.
>>
>> On Sat, Aug 14, 2021 at 6:40 AM Ted Lemon <mellon@fugue.com> wrote:
>>
>>
>>
>> On August 13, 2021 at 6:24:51 PM, Jonathan Hui (jonhui@google.com) wrote:
>>
>>
>> Another thing to be aware of is that because of the way mDNS works, if you are using mDNS for discovery, a name conflict when names aren't being defended will sit around in the cache for a few minutes. If names are defended, this doesn't happen; one alternative is to just defend names and count on SRP to deal with conflicts, but the downside of this is that now there is a delay before the new information appears.
>>
>> This delay could potentially be minimized by retrying to register as soon as the SRP replication update has been acknowledged by all SRP replication peers—right now my code (when name defense is enabled) just waits two minutes before re-attempting, which is a lot longer than should be necessary.
>>
>>
>> Sounds like a good idea to explore. My initial concern is that waiting for positive confirmation from all peers can lead to other failure modes resulting from poor implementation or connectivity. At the same time, those same assumptions are what would lead us back to needing other mitigations.
>>
>> The problem with anything that's ad-hoc and not integrated into the infrastructure is that there's no model that guarantees success. It's always going to be best effort. The goal has to be to make "best" as good as possible, but have a backup strategy for when it's not quite good enough.
>>
>>