Re: [AVTCORE] WG Last Call: "RTP-mixer formatting of multi-party Real-time text" - SFU

Bernard Aboba <bernard.aboba@gmail.com> Tue, 08 December 2020 19:10 UTC

From: Bernard Aboba <bernard.aboba@gmail.com>
Date: Tue, 08 Dec 2020 11:10:11 -0800
Message-ID: <CAOW+2duqBWJrq8ihp9Of+4JfYcgJeAJw8tG9T7QDn_6kxC3hOA@mail.gmail.com>
To: Gunnar Hellström <gunnar.hellstrom@ghaccess.se>
Cc: IETF AVTCore WG <avt@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/avt/_YXqX8scZfkA53et6AjJD5oUrjk>
Subject: Re: [AVTCORE] WG Last Call: "RTP-mixer formatting of multi-party Real-time text" - SFU

Gunnar asked:

"Questions:

1. Is 3GPP TS 26.114 Annex T an example of SDP for the kind of SFU
described in RFC 7667 section 3.7, with one media section per stream and
fewer streams than participants? If not, what does typical SDP look
like for an SFU?

[BA] Annex T appears to use payload multiplexing (one payload type per
stream) for simulcast, which is not commonly deployed.
It is more common to use SSRC multiplexing (one SSRC per stream), all on
the same payload type.  Here are examples of what simulcast SDP looks like
in various browsers:
A playground for Simulcast without an SFU - webrtcHacks
<https://webrtchacks.com/a-playground-for-simulcast-without-an-sfu/>

2. RFC 7667 section 3.7 seems to say that the middlebox regenerates the
sequence numbers in sequence. Is that right? That would make it
impossible to detect whether total loss of text occurred. Recovery can
be done based on timestamp analysis, but unrecoverable loss cannot be
detected.

[BA] The sequence number needs to be regenerated when an SFU switches
between simulcast streams sent by a participant, sending only a single
stream onward to the viewer.  This is needed since each simulcast stream
has its own sequence number space, and endpoints typically do not support
receiving simulcast.
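
As a rough illustration of the renumbering described above, here is a
minimal Python sketch. The class name SeqRewriter and its fields are
hypothetical, not taken from RFC 7667 or any SFU implementation, and the
sketch assumes packets of the active stream arrive in order. It maps
whichever simulcast stream is currently forwarded onto one continuous
outgoing sequence number space:

```python
class SeqRewriter:
    """Hypothetical helper: maps the per-stream RTP sequence numbers of
    simulcast streams onto a single outgoing sequence number space."""

    def __init__(self):
        self.active_ssrc = None  # SSRC of the stream currently forwarded
        self.offset = 0          # added to incoming seq, modulo 2**16
        self.last_out = None     # last outgoing sequence number sent

    def rewrite(self, ssrc, seq):
        if ssrc != self.active_ssrc:
            # Stream switch: choose the offset so that this packet gets
            # the next outgoing sequence number.
            if self.last_out is None:
                self.offset = 0
            else:
                self.offset = (self.last_out + 1 - seq) & 0xFFFF
            self.active_ssrc = ssrc
        out = (seq + self.offset) & 0xFFFF
        self.last_out = out
        return out
```

Because this sketch applies a fixed offset while a stream stays active,
a gap in the incoming sequence numbers of that stream remains a gap on
the outgoing side. That is a property of this particular sketch, not a
statement about what deployed SFUs do.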

However, an RTT sender would not send simulcast (i.e. multiple versions of
the same text stream).  So the SFU would potentially forward only a single
stream per participant, with no CSRCs. The question is whether a
"multi-party aware endpoint" would be prepared to receive multiple SSRCs
(and no CSRCs), with each SSRC representing a single RTT source.
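
To make the question concrete, here is a minimal Python sketch of a
receiver that treats each SSRC as one text source. The names parse_rtp
and MultiSourceTextReceiver are hypothetical. The sketch assumes plain
text payloads with whole UTF-8 characters in each packet, and it ignores
RFC 2198 redundancy, header extensions, padding, loss, and reordering:

```python
import struct


def parse_rtp(packet: bytes):
    """Return (ssrc, seq, csrc_count, payload) from a raw RTP packet.

    Only the fixed 12-byte header and the CSRC list are handled."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the fixed RTP header")
    b0, _b1, seq, _ts, ssrc = struct.unpack("!BBHII", packet[:12])
    if b0 >> 6 != 2:
        raise ValueError("not RTP version 2")
    csrc_count = b0 & 0x0F
    payload = packet[12 + 4 * csrc_count:]
    return ssrc, seq, csrc_count, payload


class MultiSourceTextReceiver:
    """Hypothetical endpoint logic: one text buffer per received SSRC."""

    def __init__(self):
        self.text_by_ssrc = {}

    def on_packet(self, packet: bytes) -> int:
        ssrc, _seq, _cc, payload = parse_rtp(packet)
        # The source is identified by the SSRC alone; no CSRC is needed.
        previous = self.text_by_ssrc.get(ssrc, "")
        self.text_by_ssrc[ssrc] = previous + payload.decode("utf-8")
        return ssrc
```

An endpoint built this way would present text per source by looking up
text_by_ssrc, with the mapping from SSRC to a display name left to
whatever signaling the conference uses.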

3. How is the source conveyed? There is apparently an SSRC to SSRC
mapping taking place. Is there a lower number of SSRCs used in
transmission from the middlebox than the total number of participants?
If so, how is a switch of source indicated? By RTCP?

[BA] I am suggesting a scenario where there would be one SSRC per source.
The SFU could perhaps replace the SSRC received from a participant with an
SSRC of its own, but each text stream would have a unique SSRC up to the
maximum participant limit.
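
A minimal Python sketch of that scenario follows. The class SsrcMapper,
its method names, and the use of a participant identifier as the key are
all assumptions made for illustration. Each participant gets one unique
outgoing SSRC, and allocation stops at the participant limit:

```python
import secrets


class SsrcMapper:
    """Hypothetical SFU helper: one unique outgoing SSRC per participant,
    up to a fixed participant limit."""

    def __init__(self, max_participants: int):
        self.max_participants = max_participants
        self.out_ssrc_by_participant = {}

    def outgoing_ssrc(self, participant_id: str) -> int:
        # Reuse the SSRC already assigned to a known participant.
        if participant_id in self.out_ssrc_by_participant:
            return self.out_ssrc_by_participant[participant_id]
        if len(self.out_ssrc_by_participant) >= self.max_participants:
            raise RuntimeError("participant limit reached")
        used = set(self.out_ssrc_by_participant.values())
        # Draw random 32-bit values until one is unused by this SFU.
        while True:
            candidate = secrets.randbits(32)
            if candidate not in used:
                break
        self.out_ssrc_by_participant[participant_id] = candidate
        return candidate
```

With this shape the number of SSRCs sent by the SFU never drops below
the number of active text sources, so no source-switch indication is
needed in the sketch.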

If we find that it is likely possible to use the SFU, then I would
still suggest that we include only general hints on its use in this
draft, and let further work detail it. Would that be an acceptable
conclusion?

[BA] My question was whether a "multi-party aware" endpoint could support
an SFU (SSRCs but no CSRCs), and how the negotiation would work (whether it
was supported or not).  If it can work and you can indicate how it would be
done (or how it would fail if it cannot be done), that would be fine with
me.
"

On Mon, Dec 7, 2020 at 3:03 PM Gunnar Hellström <
gunnar.hellstrom@ghaccess.se> wrote:

> Bernard, I indicated that I would like to discuss the SFU.
>
> You said: "Also, some clarifications of desired behavior by SFUs might
> be helpful." and mentioned SFU in a couple of other comments.
>
> I like the idea of making E2E encryption possible. I also
> imagined that with separate RTP streams it would be easier to
> detect whether unrecoverable loss of text has occurred after some lost
> packets. But I became unsure whether that is possible when I read
> section 3.7 of RFC 7667.
>
> I need to understand the SFU description in RFC 7667 better to judge if
> it really can carry RFC 2198 coded text while keeping the ability to
> detect unrecoverable loss.
>
> Before IETF 108, the draft had both an RTP multi-stream method and an
> RTP single-stream mixer method (and also the method for multi-party
> unaware endpoints). Then we decided to move on without the multi-stream
> method. It is a bit inconsistent to bring it in again, but I can see
> that enabling E2E encryption could be a good reason. The
> earlier multi-stream method was based on the RTP translator and required
> the mixer to recover from packet loss and insert marks for
> unrecoverable loss, and therefore could not be used with E2E encryption.
> Therefore it might be valid to rethink the earlier decision.
>
> Questions:
>
> 1. Is 3GPP TS 26.114 Annex T an example of SDP for the kind of SFU
> described in RFC 7667 section 3.7, with one media section per stream and
> fewer streams than participants? If not, what does typical SDP look
> like for an SFU?
>
> 2. RFC 7667 section 3.7 seems to say that the middlebox regenerates the
> sequence numbers in sequence. Is that right? That would make it
> impossible to detect whether total loss of text occurred. Recovery can
> be done based on timestamp analysis, but unrecoverable loss cannot be
> detected.
>
> 3. How is the source conveyed? There is apparently an SSRC to SSRC
> mapping taking place. Is there a lower number of SSRCs used in
> transmission from the middlebox than the total number of participants?
> If so, how is a switch of source indicated? By RTCP?
>
>
> If we find that it is likely possible to use the SFU, then I would
> still suggest that we include only general hints on its use in this
> draft, and let further work detail it. Would that be an acceptable
> conclusion?
>
>
> Regards
>
> Gunnar
>
> --
>
> Gunnar Hellström
> GHAccess
> gunnar.hellstrom@ghaccess.se
>
>
