Re: [AVTCORE] WG Last Call: "RTP-mixer formatting of multi-party Real-time text" - SFU
Gunnar Hellström <gunnar.hellstrom@ghaccess.se> Thu, 10 December 2020 16:49 UTC
From: Gunnar Hellström <gunnar.hellstrom@ghaccess.se>
To: Bernard Aboba <bernard.aboba@gmail.com>
Cc: IETF AVTCore WG <avt@ietf.org>
References: <CAOW+2duJwBizifn94qcRfpZ6cqRjRVyueyoofox0AWjkcJm02g@mail.gmail.com> <68866CAE-C81B-4C23-9DB5-CA8B57C1E3DC@brianrosen.net> <CAOW+2dt88EX1bj27zurn7XX-Ct24CFi_5SRyGObvGjwEDRuR_A@mail.gmail.com> <CAOW+2dtwOEG6=OEQarQTxQKnUkBAKCCArQXZQUP_QTTbALK1iw@mail.gmail.com> <58a73f79-60ad-442c-3162-d2cd52f025fe@ghaccess.se> <b8784aa8-a5ef-544c-7315-c64767211387@ghaccess.se> <CAOW+2duqBWJrq8ihp9Of+4JfYcgJeAJw8tG9T7QDn_6kxC3hOA@mail.gmail.com> <48539ee9-ba2f-ff8e-b71f-f3cf64b5d59d@ghaccess.se> <bd1ee906-b759-028f-9108-0109fce490c7@ghaccess.se>
Message-ID: <8d7f249a-3e22-602f-8d3a-ef339146d2dd@ghaccess.se>
Date: Thu, 10 Dec 2020 17:49:17 +0100
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.5.1
MIME-Version: 1.0
In-Reply-To: <bd1ee906-b759-028f-9108-0109fce490c7@ghaccess.se>
Content-Type: multipart/alternative; boundary="------------17984BA20C77F5010FE94FEB"
Content-Language: sv
Archived-At: <https://mailarchive.ietf.org/arch/msg/avt/EAlALSiS4eIuSVSo7oRUiPHtZ2E>
After looking a bit more at the multi-stream issue, I want to reduce the ambition I announced below. We should not put too many varied methods in one draft. It would then be too complicated to require or state compliance with them.

So, I want to back off to just mentioning the possible benefits of multi-stream RTP and SFU. I want to indicate that it would be possible to negotiate between the specified RTP-mixer method and other multi-party aware methods, but say that specifying any such method in more detail is for further study.

This conclusion also matches what is needed for referencing from NENA specifications.

/Gunnar

On 2020-12-09 at 20:48, Gunnar Hellström wrote:
>
> Bernard,
>
> I am editing the draft to include a specification of use of multiple
> RTP streams in one RTP session to each participant. There seem to be
> so many ways to negotiate and use such a topology that I do not think I
> should dictate the use of one specific way. Therefore I suggest that the
> sections about that topology are made general and need exact
> specification at implementation time.
>
> The other methods (the RTP-mixer method and the method
> for multi-party unaware endpoints) are intended to be sufficiently
> specified to result in interoperability, e.g. between emergency
> services and mobile services.
>
> I do not see the same possibility for the multi-stream method. It is
> likely that RTT will be included in an environment where the other
> media and the application call for use of specific RTP topologies and
> specific ways to negotiate their use. Therefore I intend to write
> a short description with general statements and references to which
> considerations from the RTP-mixer method should be valid also for the
> RTP multi-stream methods.
>
> I also intend to check and adjust the use of terms such as "multi-party
> aware" to make clear when it means one or both multi-party aware methods.
>
> A couple of further notes inline.
> I got some help to understand more;
>
> On 2020-12-08 at 23:45, Gunnar Hellström wrote:
>>
>> Bernard, some proposed conclusions below:
>>
>> On 2020-12-08 at 20:10, Bernard Aboba wrote:
>>> Gunnar asked:
>>>
>>> "Questions:
>>>
>>> 1. Is 3GPP TS 26.114 Annex T an example of sdp for the kind of SFU
>>> described in RFC 7667 section 3.7, with one media section per stream in
>>> a lower number of streams than participants? If not, how does a normal
>>> sdp look for an SFU?
>>>
>>> [BA] Annex T appears to use payload multiplexing (one payload type
>>> per stream) for simulcast, which is not commonly deployed.
>>> It is more common to use SSRC multiplexing (one SSRC per stream),
>>> all on the same payload type. Here are examples of what simulcast
>>> SDP looks like in various browsers:
>>> A playground for Simulcast without an SFU - webrtcHacks
>>> <https://webrtchacks.com/a-playground-for-simulcast-without-an-sfu/>
>>>
>> [GH] Thanks. That example was with Bundle and WebRTC. If that is the
>> environment you imagine, we will not be allowed to use RTP transport
>> for RTT. Instead, RTT uses the data channel as specified in RFC 8865.
>> Only audio and video use RTP.
>>
>> But I assume that SFU technology could also be used for
>> traditional SIP. What we need to consider for the current draft is
>> then whether an sdp indicating capability for use of an SFU or some other
>> form of multi-stream RTP will have something characteristic of that
>> capability, e.g. an m-section for each stream.
>>
>> I think it is quite likely that it must be visible in the sdp that a
>> party has multi-stream RTP capability.
>>
>> Then the "rtt-mix" attribute is sufficient to indicate capability for
>> the RTP-mixer based multi-party solution, and we only need to say
>> that other solutions may be implemented.
>>
>> I need to check the wording so that the draft does not preclude other
>> multi-party RTP based RTT solutions.
>>
> [GH] I will just assume that it is visible in the sdp that a
> multi-party session with RTP multi-stream topology is offered, or a
> capability to use it, and that a party who is not capable of that
> method can indicate that fact in an offer or answer. I will not say how.
>>
>>>
>>>
>>> 2. RFC 7667 section 3.7 seems to say that the sequence number is
>>> regenerated in sequence by the middlebox. Is that right? That would make
>>> it impossible to detect whether total loss of text occurred. Recovery can
>>> be done based on timestamp analysis, but not detection of unrecoverable
>>> loss.
>>>
>>> [BA] The sequence number needs to be regenerated when an SFU
>>> switches between simulcast streams sent by a participant, sending
>>> only a single stream onward to the viewer. This is needed since
>>> each simulcast stream has its own sequence number space, and
>>> endpoints typically do not support receiving simulcast.
>>
>> There are regulatory requirements to detect and mark cases of
>> suspected text loss. So, if there is packet loss from a participant
>> to the middlebox, that sequence gap must not simply be ignored,
>> resulting in an unbroken sequence number series after the SFU. Can
>> the sequence number from the source be copied to the transmission
>> from the SFU?
>>
>> Does video not also have that need?
>>
> [GH] During active transmission from one source, the sequence number
> needs to step up at the same rate as the sequence numbers of incoming
> packets, so that gaps arising on the leg between the transmitter and the
> middlebox can be revealed and acted on by the receiving party. So, the
> offset adjustment is only intended to make the sequence look
> consecutive even if there was a pause in transmission of a source.
> Then it works for total loss detection for RTT.
>>
>>>
>>> However an RTT sender would not send simulcast (e.g. multiple
>>> versions of the same text stream).
>>> So the SFU would potentially
>>> forward only a single stream per participant, with no CSRCs. The
>>> question is whether a "multi-party aware endpoint" would be prepared
>>> to receive multiple SSRCs (and no CSRCs), with each SSRC
>>> representing a single RTT source.
>> In order to have interoperability, the "rtt-mix" attribute must only
>> mean the RTP-mixer based solution. Both mixers and endpoints could
>> also have multi-stream capability, but the intention to open a
>> multi-party session with multi-stream needs to have its own
>> characteristics.
> [GH] I will differentiate the terms, most likely so that multi-party
> aware means either of the two methods, RTP-mixer and multi-stream RTP,
> with specific wording for each method where the text is about only
> one method.
>>>
>>> 3. How is the source conveyed? There is apparently an SSRC to SSRC
>>> mapping taking place. Is there a lower number of SSRCs used in
>>> transmission from the middlebox than the total number of participants?
>>> If so, how is switch of source indicated? By RTCP?
>>>
>>> [BA] I am suggesting a scenario where there would be one SSRC per
>>> source. The SFU could perhaps replace the SSRC received from a
>>> participant with an SSRC of its own, but each text stream would have
>>> a unique SSRC up to the maximum participant limit.
> [GH] But the mapping between NAME and SSRC is lost when RTCP SDES is
> just sent along with a replaced SSRC. I assume that there are
> methods to sort that out by some signaling that I do not know. Since
> I will not be specific, I will not bother about such detail.
>>>
>>>
>> RTT is specific in that no data flows when the user does not send any
>> information. In the other media there is always content.
> [GH] So in a way it can be said to be "SFU by nature". I might
> mention that. It eases switching. In the normal case no switching will
> be needed. All simultaneous senders can likely always be allowed to
> transmit on to the receivers.
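
To make the "one SSRC per source, no CSRCs" scenario above concrete, here is a minimal receiver-side sketch. It is an illustration only, not taken from the draft or from any RTP library: the class `PerSourceText` and its methods are hypothetical names, the text is assumed to be already decoded from the RTP payload, and the SSRC-to-name mapping is assumed to arrive through some signaling that the thread deliberately leaves unspecified.

```python
from collections import defaultdict


class PerSourceText:
    """Hypothetical receiver state: one text buffer per received SSRC."""

    def __init__(self):
        self.buffers = defaultdict(str)  # SSRC -> text received so far
        self.names = {}                  # SSRC -> display name, if known

    def set_name(self, ssrc, name):
        # Assumption: the name is learned by some means outside this sketch.
        self.names[ssrc] = name

    def on_text(self, ssrc, text):
        # Each SSRC represents a single RTT source, so no CSRC lookup is needed.
        self.buffers[ssrc] += text

    def render(self):
        # Fall back to the SSRC in hex when no name is known for a source.
        return [
            f"{self.names.get(ssrc, hex(ssrc))}: {text}"
            for ssrc, text in self.buffers.items()
        ]


view = PerSourceText()
view.on_text(0x1111, "Hel")
view.on_text(0x2222, "Hi")
view.on_text(0x1111, "lo")
view.set_name(0x2222, "Bob")
lines = view.render()
```

Because text from simultaneous senders arrives on separate SSRCs, the receiver can keep the sources apart without any switching decision by the middlebox.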
>>
>>>
>>> If we find that it is likely possible to use the SFU, then I would
>>> anyway think that we just include general hints of its use in this
>>> draft, and let further work detail it. Would that be an acceptable
>>> conclusion?
> [GH] I am continuing on that line.
>>>
>>> [BA] My question was whether a "multi-party aware" endpoint could
>>> support an SFU (SSRCs but no CSRCs), and how the negotiation would
>>> work (whether it was supported or not). If it can work and you can
>>> indicate how it would be done (or how it would fail if it cannot be
>>> done), that would be fine with me.
>>> "
>>
>> I need another example of sdp for an SFU session establishment in a
>> traditional SIP environment to be able to answer.
>>
> [GH] I changed my mind. My general approach does not need any more sdp
> examples.
>>
> Thanks,
>
> Gunnar
>
>>
>> Thanks,
>>
>> Gunnar
>>
>>>
>>> On Mon, Dec 7, 2020 at 3:03 PM Gunnar Hellström
>>> <gunnar.hellstrom@ghaccess.se> wrote:
>>>
>>>     Bernard, I indicated that I would like to discuss the SFU.
>>>
>>>     You said: "Also, some clarifications of desired behavior by SFUs
>>>     might be helpful." and mentioned SFU in a couple of other comments.
>>>
>>>     I like the idea of having the possibility to use E2E encryption.
>>>     I also imagined that with separate RTP streams it would be possible
>>>     to better detect whether unrecoverable loss of text has occurred
>>>     after some lost packets. But I become unsure whether that is
>>>     possible when I read section 3.7 of RFC 7667.
>>>
>>>     I need to understand the SFU description in RFC 7667 better to
>>>     judge whether it really can carry RFC 2198 coded text while
>>>     maintaining the possibility to detect unrecoverable loss.
>>>
>>>     Before IETF 108, the draft had both an RTP multi-stream method
>>>     and an RTP single-stream mixer method (and also the method for
>>>     multi-party unaware endpoints). Then we decided to move on without
>>>     the multi-stream method.
>>>     It is a bit inconsistent to bring it in again, but I can see the
>>>     possibility to arrange for E2E encryption as a good reason. The
>>>     earlier multi-stream method was based on the RTP-translator and
>>>     required the mixer to recover from packet loss and insert marks
>>>     for unrecoverable loss, and therefore could not be used for E2E
>>>     encryption. Therefore it might be valid to rethink the earlier
>>>     decision.
>>>
>>>     Questions:
>>>
>>>     1. Is 3GPP TS 26.114 Annex T an example of sdp for the kind of SFU
>>>     described in RFC 7667 section 3.7, with one media section per
>>>     stream in a lower number of streams than participants? If not,
>>>     how does a normal sdp look for an SFU?
>>>
>>>     2. RFC 7667 section 3.7 seems to say that the sequence number is
>>>     regenerated in sequence by the middlebox. Is that right? That
>>>     would make it impossible to detect whether total loss of text
>>>     occurred. Recovery can be done based on timestamp analysis, but
>>>     not detection of unrecoverable loss.
>>>
>>>     3. How is the source conveyed? There is apparently an SSRC to SSRC
>>>     mapping taking place. Is there a lower number of SSRCs used in
>>>     transmission from the middlebox than the total number of
>>>     participants? If so, how is switch of source indicated? By RTCP?
>>>
>>>     If we find that it is likely possible to use the SFU, then I would
>>>     anyway think that we just include general hints of its use in this
>>>     draft, and let further work detail it. Would that be an acceptable
>>>     conclusion?
>>>
>>>     Regards
>>>
>>>     Gunnar
>>>
>>>     --
>>>
>>>     Gunnar Hellström
>>>     GHAccess
>>>     gunnar.hellstrom@ghaccess.se
>>>
>> --
>> Gunnar Hellström
>> GHAccess
>> gunnar.hellstrom@ghaccess.se
> --
> Gunnar Hellström
> GHAccess
> gunnar.hellstrom@ghaccess.se
--
Gunnar Hellström
GHAccess
gunnar.hellstrom@ghaccess.se
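
The sequence-number handling discussed in the thread (an offset adjustment that keeps the outgoing series consecutive across a pause or source switch, while stepping at the same rate as incoming packets so that gaps stay visible) could be sketched roughly as follows. This is one possible reading of that description, not the behavior specified by RFC 7667 or the draft. `SeqRewriter`, `rewrite`, and `missing_between` are hypothetical names, and the sketch assumes a middlebox that forwards several sources onto one outgoing sequence number space.

```python
SEQ_MOD = 1 << 16  # RTP sequence numbers are 16 bits and wrap around


class SeqRewriter:
    """Hypothetical per-outgoing-stream sequence number rewriter."""

    def __init__(self, first_out_seq=0):
        self.next_out = first_out_seq % SEQ_MOD
        self.current_source = None
        self.offset = 0

    def rewrite(self, source_id, in_seq):
        if source_id != self.current_source:
            # Source switch or resume after a pause: choose the offset so
            # that this packet continues the outgoing series without a jump.
            self.offset = (self.next_out - in_seq) % SEQ_MOD
            self.current_source = source_id
        # Within one source the offset is fixed, so a gap on the incoming
        # leg shows up as the same gap on the outgoing leg.
        out_seq = (in_seq + self.offset) % SEQ_MOD
        # Advance only when the packet is ahead (ignore late, reordered ones).
        if (out_seq + 1 - self.next_out) % SEQ_MOD < SEQ_MOD // 2:
            self.next_out = (out_seq + 1) % SEQ_MOD
        return out_seq


def missing_between(prev_seq, cur_seq):
    """Number of packets missing between two received sequence numbers."""
    return (cur_seq - prev_seq - 1) % SEQ_MOD


rw = SeqRewriter(first_out_seq=100)
out = [rw.rewrite("A", s) for s in (5000, 5001, 5003)]  # 5002 lost before the middlebox
out += [rw.rewrite("B", s) for s in (65535, 0)]         # switch to B, with wrap-around
out.append(rw.rewrite("A", 5004))                       # A resumes after its pause
```

In this sketch a receiver would still see one missing packet between the second and third forwarded packets, which is what loss marking for RTT needs. One limitation is worth noting: if the first packet after a switch is itself lost before the middlebox, the gap cannot be seen with this approach alone.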