[AVTCORE] Magnus Westerlund's Discuss on draft-ietf-payload-rtp-ttml-03: (with DISCUSS and COMMENT)

Magnus Westerlund via Datatracker <noreply@ietf.org> Mon, 21 October 2019 14:02 UTC

From: Magnus Westerlund via Datatracker <noreply@ietf.org>
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-payload-rtp-ttml@ietf.org, Roni Even <roni.even@huawei.com>, avtcore-chairs@ietf.org, roni.even@huawei.com, avt@ietf.org
Reply-To: Magnus Westerlund <magnus.westerlund@ericsson.com>
Message-ID: <157166654391.31879.7510825796211658153.idtracker@ietfa.amsl.com>
Date: Mon, 21 Oct 2019 07:02:23 -0700
Archived-At: <https://mailarchive.ietf.org/arch/msg/avt/weH77Umf07lPbEUX0HP7iPVjSIk>
Subject: [AVTCORE] Magnus Westerlund's Discuss on draft-ietf-payload-rtp-ttml-03: (with DISCUSS and COMMENT)

Magnus Westerlund has entered the following ballot position for
draft-ietf-payload-rtp-ttml-03: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about IESG DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-payload-rtp-ttml/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

James and WG,

I have a couple of issues on which I would like your feedback as to whether
they should be corrected before proceeding to publication. Note that they are
for discussion; where things have already been discussed and there is
consensus, please reference that so that I can take it into consideration when
we resolve these.

1. Section 4.1:
        Timestamp:
        The RTP Timestamp encodes the time of the text in the packet.

        As timed text is a medium that has duration, from a start time to an
        end time, and the RTP timestamp is a single time tick in the chosen
        clock resolution, the above text is not clear. I would think the start
        time of the document would be the most useful to include?

        I think the text in 4.2.1.2 combined with the above attempts to imply
        that the RTP timestamp will be the 0 reference for the time-expression?

        I think this needs a bit more clarification. Not having studied
        TTML2/1 in detail, I might be missing important details. But some more
        information on how the document timebase:media time line connects to
        the RTP timestamp appears necessary.
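
        One possible reading of the question in item 1 can be sketched in
        Python. The sketch assumes that the RTP timestamp of a document is the
        zero reference for the time expressions inside it, which is exactly the
        point that needs confirming. The function names and the 1000 Hz clock
        rate are illustrative assumptions, not taken from the draft.

```python
RTP_TS_MODULUS = 2 ** 32  # RTP timestamps are 32 bits wide and wrap around


def media_offset_to_ticks(offset_seconds, clock_rate):
    """Convert a media-time offset in seconds to RTP clock ticks."""
    return round(offset_seconds * clock_rate)


def absolute_rtp_time(doc_rtp_timestamp, offset_seconds, clock_rate):
    """RTP time at which a cue starts.

    Assumption: the packet timestamp is the zero reference for the
    time expressions in the document.
    """
    ticks = media_offset_to_ticks(offset_seconds, clock_rate)
    return (doc_rtp_timestamp + ticks) % RTP_TS_MODULUS


# A cue with begin="2.5s" in a document sent with RTP timestamp 1000,
# using a hypothetical 1000 Hz clock.
cue_start = absolute_rtp_time(1000, 2.5, 1000)
```

        If the draft intends a different relation, for example the timestamp
        marking the time of the first cue rather than the document epoch, only
        absolute_rtp_time would change.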

2. A Discuss Discuss: Timed Text is directly associated with one or more
video and audio streams and requires synchronization with these other media
streams to function correctly. This leads to two questions.

        First of all, is application/ttml+xml actually the right top-level
        media type? If using SDP, that forces one, unless one has BUNDLE, to
        use a different RTP session. Many media types that have this property
        of being associated with other media types have been registered under
        all relevant top-level media types.

        Secondly, this payload format may need some references to mechanisms in
        RTP and signalling that have the purpose of associating media streams?
        I also assume that we have the interesting cases with localization,
        where different languages have different time lines for the text and
        for how long it is shown, as there are different traditions in
        different countries and languages for how one makes subtitles.

        This may also point to the need for discussing the pick one out of n
        mechanism that a manifest may need.

3. Section 7.1:

        It may be appropriate to use the same Synchronization
   Source and Clock Rate as the related media.

        Using the same SSRC as another media stream in the same RTP session is
        a no-no. If you meant to use multiple RTP sessions and associate the
        streams by using the same SSRC in the different sessions, yes, it works
        but is not recommended. This points to the need for a clearer
        discussion of how to achieve linkage and of the reasons why the same
        RTP timestamp may or may not be useful.

4. Fragmentation:
        I think the fragmentation of a TTML document across multiple RTP
        payloads is a bit insufficiently described. I have the impression that
        it is hard to do something more clever than to fill each RTP payload
        up to the MTU limitation and send them out in sequence. However, I
        think there should be a firm requirement that the packets of a single
        document use consecutive RTP sequence numbers. Also, the re-assembly
        process appears to have two parts for detecting what belongs together:
        the same timestamp, and the marker bit set on the last packet of the
        document. A receiver can lose the last packet of the previous document
        and still know that it has received everything for the following
        document. However, if there are multiple losses, inspection of the
        re-assembled document will be necessary to determine if the correct
        beginning is present. I have the impression that a proper section
        discussing these matters of fragmentation and re-assembly is necessary
        for good interoperability and function.
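
        The fragmentation and re-assembly rules discussed in item 4 can be
        sketched in Python as follows. This is a minimal illustration under
        stated assumptions, not the procedure defined by the draft: the Packet
        class, the function names and the prev_marker_seq argument are
        hypothetical, and duplicate packets are assumed to have been removed
        before re-assembly.

```python
from dataclasses import dataclass

SEQ_MOD = 65536  # RTP sequence numbers are 16 bits wide and wrap around


@dataclass
class Packet:
    seq: int        # RTP sequence number
    timestamp: int  # identical for every fragment of one document
    marker: bool    # set only on the last fragment of a document
    payload: bytes


def fragment(document, timestamp, first_seq, max_payload):
    """Split one document into packets with consecutive sequence numbers."""
    chunks = [document[i:i + max_payload]
              for i in range(0, len(document), max_payload)] or [b""]
    return [Packet((first_seq + n) % SEQ_MOD, timestamp,
                   n == len(chunks) - 1, chunk)
            for n, chunk in enumerate(chunks)]


def reassemble(packets, prev_marker_seq=None):
    """Re-assemble the received packets that share one timestamp.

    Returns (data, start_confirmed). data is None when the marker packet
    is missing or there is a gap between received fragments.
    start_confirmed is True only if the first fragment directly follows
    the previous document's marker packet; otherwise the content itself
    has to be inspected to check that the beginning is present.
    """
    markers = [p for p in packets if p.marker]
    if len(markers) != 1:
        return None, False
    end_seq = markers[0].seq
    # Order by distance before the marker packet, which handles wrap-around.
    ordered = sorted(packets,
                     key=lambda p: (end_seq - p.seq) % SEQ_MOD,
                     reverse=True)
    for a, b in zip(ordered, ordered[1:]):
        if (b.seq - a.seq) % SEQ_MOD != 1:
            return None, False
    start_confirmed = (prev_marker_seq is not None and
                       (ordered[0].seq - prev_marker_seq) % SEQ_MOD == 1)
    return b"".join(p.payload for p in ordered), start_confirmed
```

        When start_confirmed is False the sketch still returns the joined
        bytes, so that a caller can inspect the XML to decide whether the
        beginning of the document is actually present.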

5. Lack of definition of parameter types in the media type when using SDP
Offer/answer.

As the application/ttml+xml media type does contain parameters (charset and
profile), there is a need to define what SDP O/A interpretations they need to
have. See Section 3.4.2.1 of RFC 8088 for discussion of these different types.


----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

A. Section 6.
        To my understanding, a TTML document basically cannot be encoded any
        better. A poor generator can create unnecessarily verbose XML that
        could be shorter, but there is no possibility here to trade off media
        quality for a lower bit-rate. I think that should be made more
        explicit in Section 6.

B. Section 7.
        Wouldn't 90 kHz be the better default? 1 kHz is the minimum that will
        work decently from an RTCP report perspective. However, if the timed
        text is primarily going to be synchronized with video, 90 kHz does
        ensure that (sub-)frame precise timing is possible to express. I don't
        see any need for raster-line-specific timing for timed text, so the
        SMPTE 27 MHz clock is not needed. And using a non-default rate for
        subtitling radio etc. appears fine.
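
        The frame-precision argument in comment B can be checked with a little
        exact arithmetic in Python. The frame rates below are common examples
        chosen for illustration and the function name is hypothetical; none of
        them come from the draft.

```python
from fractions import Fraction


def ticks_per_frame(clock_rate, frame_rate):
    """Exact number of RTP clock ticks in one video frame period."""
    return Fraction(clock_rate) / Fraction(frame_rate)


ntsc = Fraction(30000, 1001)  # 29.97 fps

for rate in (1000, 90000):
    for fps in (25, ntsc, 50):
        ticks = ticks_per_frame(rate, fps)
        exact = "integer" if ticks.denominator == 1 else "not an integer"
        print(f"{rate} Hz, {fps} fps: {ticks} ticks per frame ({exact})")
```

        With a 90 kHz clock, one frame at 29.97 fps is exactly 3003 ticks,
        whereas with a 1 kHz clock it is 1001/30 ticks, so frame boundaries
        cannot be expressed exactly at that rate.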

C. Repair operations and relation to documents. Based on the basic properties
of TTML documents, I do think the repair operations should primarily target
single documents, as there are likely seconds between documents, while the
fragments of a document will be sent within a rather short interval. That
recommendation would be good to include.