Re: [RTGWG] gRPC dial-out, oc-version, data types, draft-openconfig-rtgwg-gnmi-spec-01

Rob Shakir <robjs@google.com> Mon, 24 September 2018 15:24 UTC

From: Rob Shakir <robjs@google.com>
Date: Mon, 24 Sep 2018 08:24:28 -0700
Message-ID: <CAHd-QWsGLsnKQz0R16esO8r2SEhOzcKq1GkF7cHaUV=-6fZRvA@mail.gmail.com>
Subject: Re: [RTGWG] gRPC dial-out, oc-version, data types, draft-openconfig-rtgwg-gnmi-spec-01
To: Thomas.Graf@swisscom.com
Cc: rtgwg@ietf.org, paolo@pmacct.net, Matthias.Arnold@swisscom.com, Carl Lebsack <csl@google.com>, Anees Shaikh <aashaikh@google.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/rtgwg/0tq2nBRg4xk3D5zyvqqbkpnDZy4>

Hi Thomas,

On Sun, Sep 23, 2018 at 2:50 AM <Thomas.Graf@swisscom.com> wrote:

> To Rob and co-authors of the gNMI draft,
>
> First of all I want to congratulate on this draft. Speaking for Swisscom,
> from a service provider perspective, Streaming Telemetry is an important
> piece of the Network Telemetry framework. We see with gNMI a big potential
> in replacing SNMP in the long run.
>
> We are looking forward to the next draft and would like to give some input
> regarding gRPC dial-out, openconfig version support and data types.
>

Thanks for the comments. Please note that, as stated in the draft, we
maintain the authoritative version of the gNMI specification on GitHub at
https://github.com/openconfig/reference/blob/master/rpc/gnmi/gnmi-specification.md,
rather than as an IETF draft. Changes will be published there first,
since this work is not being pursued through the IETF.

> We are working since a few months with Huawei, Cisco, Juniper and pmacct
> (Paolo Lucente, open source collector daemon) together by using gNMI and
> openconfig YANG models to achieve the goal of an open and vendor
> independent Streaming Telemetry data collection, which integrates into a
> Big Data (message broker and schema registration) architecture.
>
> We understood that the draft-openconfig-rtgwg-gnmi-spec-01 scope is gRPC
> dial-in where in draft-openconfig-rtgwg-gnmi-spec-00 in Appendix B was
> stated that dial-out is on the road map for the next draft.
>
> For the reader it would be beneficial to describe the difference target
> data collection/transport use cases where dial-in vs. dial-out suits the
> best. In particular covering the high availability aspect which we think is
> key for autonomous (closed loop) operation.
>
Carl Lebsack (csl@google.com, now CC'd) and I are working on extending gNMI
to meet dial-out use cases. We've talked to operators that are interested
in this in the context of CPE management, as well as within datacentre
environments. One of the things that we're trying to ensure is that we
understand the impact to implementations that need to support both dial-in
and dial-out modes of operation, and how we can make supporting and testing
both implementations as simple as possible. This benefits both
implementors (naturally) and operators, since it ensures that the gNMI
implementations that ship are well tested in both modes of operation. To
this end, we're working with the gRPC team to understand the different
options for implementing dial-out. I hope that we'll have an initial
proposal in the next few weeks.

Such changes are normally discussed in a GitHub issue in the
github.com/openconfig/reference repo before being merged into the
specification. If you're interested, I'd encourage you to watch that
repo so that you are notified when the issue is opened.


> Regarding gNMI data type support
>
>
> https://github.com/openconfig/reference/blob/master/rpc/gnmi/gnmi-specification.md#23-structured-data-types
>
>
> We believe that less would help in better adoption and interoperability.
> We favor JSON_IETF and Proto.
>

There are a couple of reasons that we support more than just these two
encodings today:

   - In gNMI we support the concept of a mixed schema device. This can be
   used to support different models (e.g., vendor native, and OpenConfig), or
   mixed modelled and unmodelled data within the same transaction. The ASCII
   encoding gives us a means to carry this unmodelled data in gNMI. A more
   detailed explanation is in the mixed schema addendum to the gNMI spec:
   <https://github.com/openconfig/reference/blob/master/rpc/gnmi/mixed-schema.md>
   - We support carrying binary-marshalled protobufs directly on the wire
   in gNMI using the bytes type -- the reason not to use proto.Any here is
   that we end up with redundancy on the wire as to what message type is being
   used - since both the path and the type in the Any message give information
   as to how to unmarshal the protobuf contents. This is described in more
   detail in the protobuf value addendum to the spec:
   <https://github.com/openconfig/reference/blob/master/rpc/gnmi/protobuf-vals.md>
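
To illustrate the redundancy argument in the second point, here is a rough
sketch only, not taken from the spec or any implementation. The decoder
registry, the InterfaceCounters type and the helper names are all
hypothetical, and JSON stands in for binary protobuf marshalling so that the
example is self-contained. The point is simply that the path alone is enough
to choose the unmarshalling code, so a type URL alongside the bytes would
repeat information the receiver already has.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class InterfaceCounters:
    """Hypothetical message type carried as opaque bytes."""
    in_octets: int
    out_octets: int


def decode_counters(raw: bytes) -> InterfaceCounters:
    # Assumption: JSON is a stand-in for protobuf unmarshalling.
    return InterfaceCounters(**json.loads(raw))


# Hypothetical registry: schema path -> decoder for the bytes at that path.
DECODERS: Dict[str, Callable[[bytes], object]] = {
    "/interfaces/interface/state/counters": decode_counters,
}


def strip_keys(path: str) -> str:
    """Drop list keys such as [name=eth0] so the path names a schema node."""
    out, depth = [], 0
    for ch in path:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        elif depth == 0:
            out.append(ch)
    return "".join(out)


def unmarshal(path: str, raw: bytes) -> object:
    # The path selects the decoder; no per-message type URL is consulted.
    return DECODERS[strip_keys(path)](raw)


update_path = "/interfaces/interface[name=eth0]/state/counters"
update_bytes = json.dumps({"in_octets": 10, "out_octets": 20}).encode()
counters = unmarshal(update_path, update_bytes)
```

Here the receiver never needs a type name on the wire: the schema node that
the path resolves to is what picks decode_counters.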

So, there are implementation cases where we need to support these different
encodings. However, I do agree that fewer options are better for
interoperability. What we have done at Google is to define a minimum viable
feature specification that we've used with internal and external developers
to describe what must be supported for both configuration management and
streaming telemetry use cases. We've discussed making this a public
specification - if you'd be interested in commenting on it, that would be
valuable. We can take it up off-list if so.

> Regarding openconfig version support.
>
> We understood from
>
>
> https://github.com/openconfig/reference/blob/master/rpc/gnmi/gnmi-specification.md#322-the-capabilityresponse-message
>
> and
>
>
> https://github.com/openconfig/reference/blob/master/rpc/gnmi/gnmi-specification.md#261-the-modeldata-message
>
> that the supported data model version can be obtained before metrics are
> sent by a notification message.
>
> Within the notification message, the data model version is missing
> https://github.com/openconfig/reference/blob/master/rpc/gnmi/gnmi-specification.md#21-reusable-notification-message-format
>
>
> This means that the mapping between device and model version has to be
> stored and maintained on the data collection side. This could be simplified
> by sending the data model version also in the notification message.
>

Do you have a concrete example of where this is an issue? At the moment, if
the collector is schema-aware, it seems that it would need to select which
version of the models to use before unmarshalling, and that choice would not
change during the lifetime of a Subscribe RPC (or, more likely, until the
software version on the device changes). Do you have a case where that
decision needs to be made more often than once per connection?

Alternatively, did I misunderstand, and the issue here is maintaining state
w.r.t. the connection at the collector side?
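
For concreteness, this is roughly the collector-side state I have in mind.
It is a sketch under assumptions, not an existing implementation: the
SubscribeSession class and its methods are hypothetical, and the target name
and version string are made-up examples. Only the ModelData fields (name,
organization, version) follow the message in the specification. The model
versions are recorded once from the CapabilityResponse and then attached to
each update for the lifetime of the Subscribe RPC.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass(frozen=True)
class ModelData:
    # Field names mirror the gNMI ModelData message.
    name: str
    organization: str
    version: str


@dataclass
class SubscribeSession:
    """Hypothetical per-connection state held by a collector."""
    target: str
    models: Dict[str, ModelData] = field(default_factory=dict)

    def record_capabilities(self, supported: List[ModelData]) -> None:
        # Called once per connection, with the models from CapabilityResponse.
        self.models = {m.name: m for m in supported}

    def annotate(self, model_name: str, path: str, value: Any) -> Dict[str, Any]:
        # Attach the cached model version to an update on the collector side,
        # so downstream consumers do not need a separate device/version map.
        model = self.models[model_name]
        return {
            "target": self.target,
            "model": model.name,
            "version": model.version,
            "path": path,
            "value": value,
        }


# Example values below are illustrative only.
session = SubscribeSession(target="router1")
session.record_capabilities(
    [ModelData("openconfig-interfaces", "OpenConfig working group", "2.3.0")]
)
record = session.annotate(
    "openconfig-interfaces",
    "/interfaces/interface[name=eth0]/state/oper-status",
    "UP",
)
```

If the device is upgraded, the connection is re-established and
record_capabilities runs again, so the cached versions are refreshed at that
point rather than per notification.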


> The maintenance of the openconfig model version mapping has been
> challenging. Depending on router software upgrades and vendor this can
> change during operation. One of the three mentioned vendors which is
> exposed the longest to this environment, has a quite big version mapping
> map which outlines where we are heading to. On top, the version information
> is difficult to obtain on
> https://github.com/openconfig/public/tree/master/release/models since it
> is embedded within the description of the openconfig YANG models.
> draft-openconfig-netmod-model-catalog-02.txt is addressing this with the
> right approach and we believe is an important piece of this puzzle.
>

Agreed -- Anees has been working on putting together catalogue entries for
the OpenConfig models. I'll let him comment on current progress and where
we intend to publish those entries. I agree that making the information
about what is supported where available is an important part of the overall
solution.

Given your interest in both OpenConfig and gNMI, I'd encourage you to
engage directly on GitHub for both, and consider joining OpenConfig as
an operator member. We're particularly interested in addressing the
implementation challenges for folks that are actually deploying these
technologies. Experience in production environments, inside and outside of
Google, has guided a significant proportion of the features and changes
we've made to both gNMI and OC.

Thanks again for the comments.

Cheers,
r.