Re: [AVTCORE] Magnus Westerlund's Discuss on draft-ietf-payload-rtp-ttml-03: (with DISCUSS and COMMENT)

James Sandford <james.sandford@bbc.co.uk> Wed, 16 October 2019 15:17 UTC

From: James Sandford <james.sandford@bbc.co.uk>
To: Magnus Westerlund <magnus.westerlund@ericsson.com>, The IESG <iesg@ietf.org>
CC: "draft-ietf-payload-rtp-ttml@ietf.org" <draft-ietf-payload-rtp-ttml@ietf.org>, Roni Even <roni.even@huawei.com>, "avtcore-chairs@ietf.org" <avtcore-chairs@ietf.org>, "avt@ietf.org" <avt@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/avt/iLFuUazBePFQbiNfg9zwzaOwUDU>

Thank you for the detailed review, Magnus. I've responded to your points in-line.

Regards,
James

>----------------------------------------------------------------------
>DISCUSS:
>----------------------------------------------------------------------
>
>James and WG,
>
>I do have a couple of issues I want to have your feedback on if they should be
>corrected or not before proceeding to publication. Note they are for discussion
>and in cases where things have been discussed and there is consensus please
>reference that so that I can take that into consideration when we resolve these.
>
>1. Section 4.1:
>        Timestamp:
>        The RTP Timestamp encodes the time of the text in the packet.
>
>        As timed text is a media that has duration, from a start time to an end
>        time, and the RTP timestamp is a single time tick in the chosen clock
>        resolution, the above text is not clear. I would think the start time of
>        the document would be the most useful to include?

Suggested updated text for the Timestamp definition: 

    The RTP Timestamp encodes the start time of the document in User Data Words.

>        I think the text in 4.2.1.2 combined with the above attempts to imply
>        that the RTP timestamp will be the 0 reference for the time-expression?
>
>        I think this needs a bit more clarification. Not having detailed
>        studied TTML2/1 I might be missing important details. But some more
>        information how the document timebase:media time line connects to the
>        RTP timestamp appears necessary.

Suggested updated text for the last sentence of Paragraph 2 of 4.2.1.2: 

    Computed TTML media times are offset relative to E in accordance with Section I.2 of [TTML2].

I'm hesitant to expand on the calculation of timing beyond that in this document. It is discussed at length in TTML2 which is included as a normative reference in this document. I'd like to avoid blurring the scope of this document into that of TTML2 if possible.
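
As a rough illustration only (not text proposed for the draft), the offset could be sketched as below. It assumes that E corresponds to the RTP Timestamp carried by the document's packets and that the stream uses a 1 kHz clock; the function name `media_time_to_rtp` and its parameters are hypothetical and do not come from the draft or from TTML2.

```python
RTP_TS_MODULUS = 2 ** 32  # RTP timestamps are 32-bit and wrap around


def media_time_to_rtp(epoch_ts: int, media_time_s: float, clock_rate: int = 1000) -> int:
    """Map a computed TTML media time (in seconds) onto the RTP timeline.

    Assumption: epoch_ts is the RTP Timestamp of the document (the epoch E),
    and media times are simple offsets from it at the given clock rate.
    """
    ticks = round(media_time_s * clock_rate)
    return (epoch_ts + ticks) % RTP_TS_MODULUS


# A cue that begins 2.5 s into a document whose epoch is RTP time 5000.
cue_start = media_time_to_rtp(5000, 2.5)
```

The modulo step is only there because the RTP timestamp field is 32 bits wide; the authoritative timing rules remain those in TTML2.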

>2. A Discuss Discuss: As Timed Text is directly associated with one or more
>video and audio streams and requires synchronization with these other media
>streams to function correctly. This leads to two questions.
>
>        First of all is application/ttml+xml actually the right top-level media
>        type? If using SDP that forces one, unless one has BUNDLE, to use a
>        different RTP session. Many media types having this type of property
>        of being associated with some other media types have registered media
>        types in all relevant top-level media types.

I think we need to be careful with assumptions here. Timed Text MAY be associated with Video and/or Audio streams but there is no requirement to do so. Just as there is no requirement for Video to be associated with Audio or vice versa.

I'm cautious about opening a can of worms with regard to media types. But my personal opinion is that TTML is not video and it is not audio. It may be associated with them but it is not those. It therefore shouldn't have registered types for video, audio, etc. If the association of different top-level media types is currently difficult, that is an issue I believe should be addressed outside of the scope of this document.

>        Secondly, this payload format may need some references to mechanisms in
>        RTP and signalling that have the purpose of associating media streams? I
>        also assume that we have the interesting cases with localization that
>        different languages have different time lines for the text and how long
>        it shows as there are different traditions in different countries and
>        languages for how one makes subtitles.
>
>        This may also point to the need for discussing the pick one out of n
>        mechanism that a manifest may need.

I believe this is outside of the scope of this document in the same way it is for audio equivalents and other text formats.

>3. Section 7.1:
>
>        It may be appropriate to use the same Synchronization
>   Source and Clock Rate as the related media.
>
>        Using the same SSRC as another media stream in the same RTP Session is
>        a no-no. If you meant to use multiple RTP sessions and associate them
>        using the same SSRC in different sessions, yes it works but is not
>        recommended. This points to the need for a clearer discussion of how to
>        achieve linkage and the reasons why the same RTP timestamp may be
>        useful or not.

Thanks for spotting this. It should refer to the Reference Clock, not Synchronization Source.

>4. Fragmentation:
>        I think the fragmentation of a TTML document across multiple RTP
>        payloads is a bit insufficiently described. I have the impression that
>        it is hard to do something more clever than to fill each RTP payload to
>        the MTU limitation, and send them out in sequence. However, I think a
>        firm requirement is needed to use consecutive RTP sequence numbers for
>        a single document. Also the re-assembly process appears to have two
>        parts for detecting what belongs together: same timestamp, and last
>        packet of document should have marker bit set. A receiver can lose
>        the last packet in the previous document, and still know that it has
>        received everything for the following document. However, if the losses
>        are multiple, inspection of the re-assembled document will be
>        necessary to determine if the correct beginning is present. I have the
>        impression that a proper section discussing these matters of
>        fragmentation and re-assembly is necessary for good interoperability
>        and function.

This has been discussed in response to Adam Roach's Discuss.
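
For readers following only this thread, here is a minimal sketch of the behaviour described in the quoted point: payloads filled up to a size limit, consecutive sequence numbers, one shared timestamp, and the marker bit on the final packet. It is an illustration under stated assumptions, not the draft's normative procedure. The `Packet` class and the `fragment` and `reassemble` functions are hypothetical names, and real RTP header handling is omitted.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

SEQ_MOD = 65536  # RTP sequence numbers are 16-bit and wrap around


@dataclass
class Packet:
    seq: int
    timestamp: int
    marker: bool
    payload: bytes


def fragment(document: bytes, timestamp: int, first_seq: int, max_payload: int) -> List[Packet]:
    """Split one document into packets with consecutive sequence numbers."""
    chunks = [document[i:i + max_payload] for i in range(0, len(document), max_payload)] or [b""]
    last = len(chunks) - 1
    return [Packet((first_seq + i) % SEQ_MOD, timestamp, i == last, chunk)
            for i, chunk in enumerate(chunks)]


def reassemble(packets: List[Packet], prev_end_seq: Optional[int] = None) -> Optional[Tuple[bytes, bool]]:
    """Return (document, start_verified), or None if a loss is detected.

    start_verified is True only when the sequence number of the previous
    document's final packet is known and the first packet follows it directly.
    """
    if not packets:
        return None
    base = packets[0].seq
    # Order by signed distance from the first packet so wrap-around sorts correctly.
    ordered = sorted(packets, key=lambda p: ((p.seq - base + 32768) % SEQ_MOD) - 32768)
    if any(p.timestamp != ordered[0].timestamp for p in ordered):
        return None
    for prev, cur in zip(ordered, ordered[1:]):
        if cur.seq != (prev.seq + 1) % SEQ_MOD:
            return None  # gap inside the document
    if not ordered[-1].marker or any(p.marker for p in ordered[:-1]):
        return None  # end of document missing or misplaced
    start_verified = prev_end_seq is not None and ordered[0].seq == (prev_end_seq + 1) % SEQ_MOD
    return b"".join(p.payload for p in ordered), start_verified
```

When `start_verified` is False, the receiver in this sketch would still have to inspect the reassembled document to confirm that its beginning is present, which is the case raised in the quoted text.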

>
>----------------------------------------------------------------------
>COMMENT:
>----------------------------------------------------------------------
>
>A. Section 6.
>        To my understanding the TTML document is basically not possible to
>        encode better. A poor generator can create unnecessarily verbose XML
>        which could be shorter, but there is no possibility here to trade off
>        media quality for lower bit-rate. I think that should be made more
>        explicit in Section 6.

Again, this has been discussed in response to Adam's Discuss.

>B. Section 7.
>        Wouldn't using 90kHz be the better default? 1kHz is the minimum from
>        RTCP report that will work decently. However, if the timed text is
>        primarily going to be synchronized with video, 90k does ensure that
>        (sub-)frame precise timing is possible to express. I don't see any need
>        for raster-line-specific timing for timed text so the SMPTE 27 MHz
>        clock is not needed. And using non default for subtitling radio etc
>        appears fine.

Could you justify a 90k default? Such a high rate isn't appropriate for most timed text use cases. As stated in my response to "2." above, there is no requirement for TTML to be used alongside video so defining a default rate based on video would be inappropriate. Just as we don't expect video to use a 48k rate or audio a 90k rate. The default for TTML should be appropriate to timed text and should not make assumptions about specific implementations.
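
To put the two candidate rates in numbers, the sketch below computes how many clock ticks one video frame spans and how long a 32-bit RTP timestamp takes to wrap. The frame rates are illustrative assumptions rather than values from the draft, and the helper names are hypothetical.

```python
from fractions import Fraction


def ticks_per_frame(clock_rate: int, frame_rate: Fraction) -> Fraction:
    """Number of RTP clock ticks spanned by one frame (exact arithmetic)."""
    return Fraction(clock_rate) / frame_rate


def wrap_seconds(clock_rate: int) -> Fraction:
    """Seconds until a 32-bit RTP timestamp wraps at the given clock rate."""
    return Fraction(2 ** 32, clock_rate)


# Illustrative frame rates: 25 fps and 30000/1001 (about 29.97) fps.
for rate in (1000, 90000):
    for fps in (Fraction(25), Fraction(30000, 1001)):
        print(rate, fps, ticks_per_frame(rate, fps), float(wrap_seconds(rate)))
```

At 1 kHz a 25 fps frame is exactly 40 ticks while a 30000/1001 fps frame is not a whole number of ticks; at 90 kHz both are whole numbers, but the timestamp wraps roughly 90 times sooner.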

>C. Repair operations and relation to documents. Based on basic properties of
>TTML documents, I do think the repair operations should be highly targeting
>single documents as there is likely seconds between documents, while the
>fragments of a document will be sent in a rather short interval. That
>recommendation would be good to include.

Also discussed in response to Adam's Discuss.