Re: [netmod] instance file parsing

Andy Bierman <andy@yumaworks.com> Tue, 04 December 2018 17:23 UTC

References: <CABCOCHQYR_iY7Lp=0m8D-GQ8+Rzuijaa8_41bJw6ZvWPc+Cw_w@mail.gmail.com> <b2ee0537-465a-fbc0-1b94-000da5993b05@cisco.com> <20181130192830.g2622izshwn4f55z@anna.jacobs.jacobs-university.de> <6281fd79-2284-46c0-350c-770dcdade29d@cisco.com> <20181203113633.v3nfuen66ngjjp6l@anna.jacobs.jacobs-university.de> <CABCOCHRC_Z7QJhV-sMrUyX_d49UM8C=x9rgOtTc9iP2Wz52+VQ@mail.gmail.com> <20181203181323.dfbtlkumwp7ui5m6@anna.jacobs.jacobs-university.de> <357e815d-cf6e-c093-0212-947fc8c7344d@cisco.com>
In-Reply-To: <357e815d-cf6e-c093-0212-947fc8c7344d@cisco.com>
From: Andy Bierman <andy@yumaworks.com>
Date: Tue, 4 Dec 2018 09:23:30 -0800
Message-ID: <CABCOCHTfD9DAbn17DtyANCFDdeLTfPS75worZWRs09UcJuY=5w@mail.gmail.com>
To: Robert Wilton <rwilton@cisco.com>
Cc: NetMod WG <netmod@ietf.org>
Content-Type: multipart/alternative; boundary="000000000000e6b45c057c3586ae"
Archived-At: <https://mailarchive.ietf.org/arch/msg/netmod/SJvOXsV_bppH8kMV48uUDRZcH7g>
Subject: Re: [netmod] instance file parsing

On Tue, Dec 4, 2018 at 9:14 AM Robert Wilton <rwilton@cisco.com> wrote:

>
> On 03/12/2018 18:13, Juergen Schoenwaelder wrote:
> > On Mon, Dec 03, 2018 at 09:56:48AM -0800, Andy Bierman wrote:
> >> On Mon, Dec 3, 2018 at 3:36 AM Juergen Schoenwaelder <
> >> j.schoenwaelder@jacobs-university.de> wrote:
> >>
> >>> On Mon, Dec 03, 2018 at 10:21:20AM +0000, Robert Wilton wrote:
> >>>> On 30/11/2018 19:28, Juergen Schoenwaelder wrote:
> >>>>> I doubt it makes sense to carry an entire yang library schema with the
> >>>>> instance data. The modules are actually known via the namespaces.
> >>>> Knowing the module name or namespace isn't sufficient to be able to parse
> >>>> the data:
> >>>>   - E.g. if the contents are in the format of RFC7895bis, but the client
> >>>>     tries to parse it using the schema from RFC7895, then it will fail.
> >>>>   - Once NBC changes are allowed via YANG versioning it will become more
> >>>>     necessary to know the revision or version of the modules to be able to
> >>>>     parse the data.  E.g. not even just the latest module revision will
> >>>>     necessarily work.  It is entirely plausible that different server
> >>>>     revisions may be generating instance data using slightly different
> >>>>     schema.
> >>> Perhaps this is a problem of the versioning solution then. ;-)
> >>>
> >>>> If the server has deviations then the client may also need to know those
> >>>> deviations to correctly parse the file.
> >>>>
> >>>> So, I think that it is pretty clear that knowing the right schema is
> >>>> required to be able to correctly parse and interpret the instance data.
> >>>>
> >>>> I think that there are 4 ways that this can be achieved:
> >>>>   1) embed the necessary schema information into the YANG instance data.
> >>>>   2) put the necessary schema information online somewhere and have a URI
> >>>>      reference to it.
> >>>>   3) some combination of (1) and (2), e.g. packages defined centrally, with
> >>>>      deviations listed in the file.
> >>>>   4) the schema is determined using some out-of-band mechanism, or possibly
> >>>>      it is hard coded.
> >>> Perhaps we need to define first what "parse" means. I am not sure
> >>> parsing requires knowing all schema details. Anyway, if all schema
> >>> details are needed, then the simplest solution seems to be to read the
> >>> schema from another instance file. This then only requires
> >>> bootstrapping the schema model version.
> >>>
> >>>
> >> I agree with Rob.
> >> The solution has to support accurate parsing of the instance data.
> >> This means that the parser has the correct schema tree to use for
> >> validating an instance document against the schema.
> >>
> >> Since the new yang-library can have a different schema tree for each
> >> datastore, clearly the option of specifying the datastore is needed.
> > We need to get the terminology straight. I can parse XML and JSON
> > files without the need to have schema information available. So I
> > think the word 'parsing' is quite misleading here. And I can extract
> > data out of XML and JSON files that I find interesting without schema
> > information as well. So it's a certain type of tool that may take
> > advantage of being schema aware, but not all tools need to be schema
> > aware.
> >
> >>>> I don't think that there is a one size fits all answer here.  I think
> >>>> that it depends on the use case.  Certainly, I think that facilitating
> >>>> 1-3 is useful, but they should be optional rather than mandatory.  I.e.
> >>>> defining nodes for these doesn't seem to cost much if a server isn't
> >>>> obliged to populate them.
> >>>>
> >>>> I do think that YANG packages (themselves defined in instance data
> >>>> documents) could be very helpful here.  I.e. rather than listing all the
> >>>> modules, instead list the packages + any deviations from that.  I'm
> >>>> presuming here that the definitions of the packages are available via a
> >>>> URI.
> >>>
> >>>
> >> Yes -- this has been my goal for YANG Packages since the start.
> >> By using nested packages and offline caching, the entire YANG library for a
> >> device could be recognized in a few URIs.
> > An instance file storing the schema tree is an offline cache. It falls
> > out of the instance file format for free. Yes, packages are something
> > else entirely, but so far no work in this direction is chartered, so
> > we should get instance file formats defined without solving another
> > bigger problem first.
>
> In the IETF 103 discussion on YANG versioning, it seemed to be the
> consensus in the room that the versioning design team need to consider
> also versioning of sets of modules, rather than just individual
> modules.  I.e. something along the lines of YANG packaging.
>
>

I am not suggesting YANG Packages has to be solved in order to
work on this instance document format. The IETF thinks YANG modules
are the appropriate level of abstraction.

Issues like "the extra metadata takes up too much space" are entirely
subjective. Obviously a parser does not need any schema to parse anydata.
A schema is only needed if the anydata is expected to conform to a real
schema.

It is optional for the writer to add the extra metadata.
It is optional for the reader to use the extra metadata.
Not sure why the solution should forbid this functionality.
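
To make the "optional for both sides" point concrete, here is a rough
sketch of a reader that uses the extra metadata when it is present and
otherwise falls back to the defaults discussed in the quoted text further
down (latest revision, all features, no deviations). The member names
"schema-info" and "data", the module name "example-mod", and the function
name are all invented for illustration; none of them come from any draft.

```python
import json

# Hypothetical defaults, mirroring the three assumptions quoted in this
# thread: latest revision, all features present, no deviations.
DEFAULT_SCHEMA_HINTS = {
    "revision": "latest",
    "features": "all",
    "deviations": [],
}


def read_instance_document(text, use_metadata=True):
    """Return (schema_hints, data) for a JSON instance document.

    The "schema-info" and "data" member names are assumptions made up
    for this sketch, not part of any defined instance file format.
    """
    doc = json.loads(text)  # generic JSON parsing needs no YANG schema
    hints = dict(DEFAULT_SCHEMA_HINTS)
    if use_metadata:
        # The writer may have omitted the metadata; the reader may ignore it.
        hints.update(doc.get("schema-info", {}))
    return hints, doc.get("data", {})


# One document written with the optional metadata, one without it.
with_meta = json.dumps({
    "schema-info": {"revision": "2018-02-20", "deviations": ["example-dev"]},
    "data": {"example-mod:top": {"leaf-a": 1}},
})
without_meta = json.dumps({"data": {"example-mod:top": {"leaf-a": 1}}})

print(read_instance_document(with_meta))
print(read_instance_document(without_meta))
print(read_instance_document(with_meta, use_metadata=False))
```

A schema-aware tool would use the returned hints to pick the schema tree
to validate against; a generic tool can ignore them and just use the data.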



> Thanks,
> Rob
>


Andy




>
>
> >
> >
> >>>>> If you want to capture the schema, dump the relevant yang library into
> >>>>> another instance file.
> >>>> That just means another file to carry around and manage.
> >>> Yes, but compared to solutions that require new and/or much more
> >>> elaborate data formats, this is very cheap and efficient, in
> >>> particular for systems where the schema itself is relatively
> >>> static. Also in terms of engineering time this is a rather cheap
> >>> solution since you do not need to invent a new way to document and
> >>> communicate a schema.
> >>>
> >>>
> >> I agree the cost of an extra instance document is not a problem, especially
> >> since it would be optional and only used if the defaults are not sufficient:
> >>     - latest revision of module will be OK
> >>     - assume all features present will be OK
> >>     - assume no deviations will be OK
> >>
> >> If the defaults are not OK, then the parser will incorrectly accept or
> >> reject certain instance data, unless the correct schema tree is obtained
> >> somehow.
> > Or the tool processing instance file content just does not care about
> > the schema and it simply assumes that a certain (module,path)
> > combination has fixed semantics. I am a big fan of being able to use
> > generic tools to filter and process XML or JSON encoded data.
> >
> > /js
> >
>