Re: [AVTCORE] Source switching performance in draft-hellstrom-avtcore-multi-party-rtt-source-01.txt - full mux

James Hamlin <james.hamlin@purple.us> Wed, 18 March 2020 11:50 UTC

From: James Hamlin <james.hamlin@purple.us>
To: Gunnar Hellström <gunnar.hellstrom@omnitor.se>, "avtcore@ietf.org" <avtcore@ietf.org>
Thread-Topic: Source switching performance in draft-hellstrom-avtcore-multi-party-rtt-source-01.txt - full mux
Thread-Index: AQHV/EBo9FPR1VbXgkO7VcwmthmDkKhNu3YAgAArNbQ=
Date: Wed, 18 Mar 2020 11:50:19 +0000
Message-ID: <1584532219395.21333@purple.us>
References: <158300358958.4740.11384667574242789327@ietfa.amsl.com> <910582c0-dcb5-eea4-2075-eef6085750f4@omnitor.se> <1584146721401.88908@purple.us> <61971f62-cff3-6fcb-1035-341e1244d255@omnitor.se> <1584360674813.99071@purple.us> <3a8a924a-8e36-6015-dee0-5951457dd39f@omnitor.se> <1584440349163.6938@purple.us>, <fe12b835-0d26-a2d9-104e-0522ab7fd8a6@omnitor.se>
In-Reply-To: <fe12b835-0d26-a2d9-104e-0522ab7fd8a6@omnitor.se>
Archived-At: <https://mailarchive.ietf.org/arch/msg/avt/VvTODGjs0rb-OGnMhgQIkxm8Sro>
Subject: Re: [AVTCORE] Source switching performance in draft-hellstrom-avtcore-multi-party-rtt-source-01.txt - full mux

Hi Gunnar


I think your last point here is probably the most significant: both this proposal and the timestamp proposal require a change to RFC4103 itself and that may just cause too many problems.


Best regards


James


James Hamlin
Contractor
Purple, a Division of ZP Better Together, LLC
purplevrs.com

The information contained in this e-mail message is intended only for the personal and confidential use of the recipient(s) named above. If you have received this communication in error, please notify us immediately by e-mail, and delete the original message.

________________________________
From: Gunnar Hellström <gunnar.hellstrom@omnitor.se>
Sent: 17 March 2020 21:02
To: James Hamlin; avtcore@ietf.org
Subject: Re: Source switching performance in draft-hellstrom-avtcore-multi-party-rtt-source-01.txt - full mux


Hi James,

Your multiplexed mixing proposal has the very interesting benefit that text from all sources fitting into one packet (within the MTU limits you discuss) gets the same switching delay: 300 ms if we stay at a 300 ms interval, or 100 ms if we decrease the transmission interval to 100 ms. (I prefer 100.)


There are some considerations to sort out:


1. What is the exact coding specification for the packet?


2. Is there a need to agree on a maximum number of sources per packet?


3. How will the SDP be specified?


4. Can this be seen as an update to RFC 4103?


1. Coding: SDP fmtp parameters tell how many redundant generations are included in the packets. a=fmtp:100 96/96/96 tells that all sources are always represented with one original and two redundant transmissions. (RFC 4103 specifies that the number of generations may be varied during the session. That statement needs to be negated. It is no problem in practice; the issue is merely how to express that new requirement.)


jeh: I had thought a=fmtp:100 96/96/96/96/96/96/96/96/96/96/96/96/96/96/96 would be necessary because that would indicate the correct number of blocks for 5 participants with 3 generations, but that's getting far too silly as we don't necessarily know the number of participants at the negotiation stage. So just a=fmtp:100 96/96/96 conveys the formats of the anticipated generations correctly. In the context of a capability flag, we can be sure that an implementation knows not to expect only 3 header blocks.


2. Agree on a maximum number of sources? I am not sure if a maximum per packet is needed. Or can that be left to implementation to try to keep within the MTU, and if more sources are active, then distribute between more packets?


jeh: I think it could be left to the mixer. The maximum number of sources you can list in a packet is 15 (the CC field is 4 bits), and that would create 60 bytes of header just for the CSRC list. But if most had only typed a couple of characters then the MTU might not be a limit. If MTU becomes limiting then the mixer would need to buffer text or reduce the transmission interval. It could require most participants to be muted for very large conferences.


3. SDP. Negotiation of a=rtt is needed, and the number of generations needs to be specified. The number of generations that a party intends to send shall be specified in the fmtp attribute. Example: a=fmtp:100 96/96/96 means one original and two redundant generations for all sources in all packets. How many sources are carried in the packet can vary from packet to packet and will be detected by the mechanism that extracts the text from the packets.



4. Can this be an update to RFC 4103, or is an RFC 4103bis needed? I hope it can be an update, because RFC 4103 is now mentioned in so many other documents. I think this solution has a more thorough influence on RFC 4103 than the previously discussed methods, but can hopefully go as an update.


jeh: Perhaps this is the thing that kills this idea. On careful reading, your current proposal doesn't break anything in 4103. 4103 doesn't mention a CSRC list but doesn't preclude it. This proposal does change the arrangement of text blocks and so alters 4103. Similarly the timestamp mechanism also breaks 4103's rules.
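
The fmtp discussion in points 1 and 3 above can be illustrated with a small sketch. It assumes the attribute is written in the usual "a=fmtp:<format> <pt>/<pt>/..." form, that the number of slash-separated payload types equals the number of generations (one original plus the redundant ones), and that a multiplexed packet carries one header block per source per generation. The function names are hypothetical and not taken from any draft.

```python
def generations_from_fmtp(line: str) -> int:
    """Count generations (original + redundant) listed in a RED fmtp line."""
    attr, _, value = line.partition(":")
    if attr.strip() != "a=fmtp":
        raise ValueError("not an fmtp attribute")
    # value looks like "100 96/96/96": RED payload type, then one PT per generation
    _red_pt, _, params = value.strip().partition(" ")
    payload_types = [pt for pt in params.strip().split("/") if pt]
    if not payload_types:
        raise ValueError("no payload types listed")
    return len(payload_types)


def expected_header_blocks(sources: int, generations: int) -> int:
    """Assumed multiplexed layout: one block per source per generation."""
    return sources * generations


gens = generations_from_fmtp("a=fmtp:100 96/96/96")
print(gens, expected_header_blocks(5, gens))
```

With three generations negotiated, five active sources give the 15 blocks mentioned above, without the fmtp line having to enumerate them.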


Thanks,


Gunnar







Den 2020-03-17 kl. 11:19, skrev James Hamlin:

Hi Gunnar


I'll look at pseudo-multi-party and reply separately (I think it's clear that this has to be made to work for compatibility with current implementations).


I think the balance between the current CSRC proposal and the timestamp mechanism is now between the complexity of the timestamp algorithm and the benefit of spreading out redundant generations over time, to benefit from statistical independence of packet loss.


I think it should be possible just to multiplex all participants and all redundant generation slots in one packet. The transmission interval could then remain at 300 ms and redundant generations would spread out over 900 ms, giving better reliability. The recovery algorithm is not complicated. I think the limit would be packet size / MTU. But with 5 participants entering 10 chars in each 300 ms and each char being 4 bytes (the longest for UTF-8), and extending to 5 generations, you get 5 * 10 * 4 * 5 = 1000 bytes of text and then 61 bytes of header = 1061 bytes, which would still be OK with a typical MTU. And it's still possible to reduce the transmission interval or buffer some participants for 1 generation if the packet size gets too big.
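
A rough sketch of this size estimate follows. It assumes the layout of the diagram in this message: a 12-byte fixed RTP header, 4 bytes per CSRC entry, a 4-byte block header per source per generation, and one final 1-byte header. Under those assumptions the 61-byte figure appears to correspond to 3 sources with 3 generations; for 5 sources with 5 generations the same layout gives 133 bytes of header, which still leaves the packet well below a 1500-byte MTU. All names are illustrative.

```python
RTP_FIXED_HEADER = 12   # bytes, fixed part of the RTP header
CSRC_ENTRY = 4          # bytes per CSRC list member
BLOCK_HEADER = 4        # bytes per redundancy block header
FINAL_HEADER = 1        # bytes, last header with F=0 (as drawn in the diagram)


def header_bytes(sources: int, generations: int) -> int:
    """Header size for the multiplexed layout assumed above."""
    return (RTP_FIXED_HEADER
            + CSRC_ENTRY * sources
            + BLOCK_HEADER * sources * generations
            + FINAL_HEADER)


def text_bytes(sources: int, chars: int, bytes_per_char: int, generations: int) -> int:
    """Worst-case text volume when every source fills every generation slot."""
    return sources * chars * bytes_per_char * generations


print(header_bytes(3, 3))                            # layout drawn in the diagram
print(header_bytes(5, 5) + text_bytes(5, 10, 4, 5))  # 5 sources, 5 generations
```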


IMHO this looks worth thinking about.


       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |V=2|P|X| CC=3  |M|  "RED" PT   |   sequence number of primary  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |               timestamp of primary encoding "P"               |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |           synchronization source (SSRC) identifier            |
      +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
      |  CSRC list member 1                                           |
      |  CSRC list member 2                                           |
      |  CSRC list member 3                                           |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |1|   T140 PT   |  timestamp offset CSRC1R2 |      block length |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |1|   T140 PT   |  timestamp offset CSRC1R1 |      block length |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |1|   T140 PT   |  timestamp offset CSRC1P  |      block length |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |1|   T140 PT   |  timestamp offset CSRC2R2 |      block length |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |1|   T140 PT   |  timestamp offset CSRC2R1 |      block length |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |1|   T140 PT   |  timestamp offset CSRC2P  |      block length |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |1|   T140 PT   |  timestamp offset CSRC3R2 |      block length |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |1|   T140 PT   |  timestamp offset CSRC3R1 |      block length |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |1|   T140 PT   |  timestamp offset CSRC3P  |      block length |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |0|   T140 PT   | "CSRC1R2" T.140 encoded redundant data        |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+---------------+
      |   |          "CSRC1R1" T.140 encoded redundant data   |       |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+         +-+-+-+
      |              "CSRC1P" T.140 encoded primary data              |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |              "CSRC2R2" T.140 encoded redundant data           |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+---------------+
      |   |          "CSRC2R1" T.140 encoded redundant data           |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+         +-+-+-+
      |        |      "CSRC2P" T.140 encoded primary data             |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |              "CSRC3R2" T.140 encoded redundant data           |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+---------------+
      |   |          "CSRC3R1" T.140 encoded redundant data           |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |        |     "CSRC3P"  T.140 encoded primary data     |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
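
Each 4-byte block header in the diagram above follows the RFC 2198 redundancy header layout: a 1-bit F flag, a 7-bit block payload type, a 14-bit timestamp offset and a 10-bit block length. A minimal sketch of packing and unpacking one such header is shown here; the function names are hypothetical, and the per-source ordering of the blocks is the proposal in this message, not something RFC 2198 defines.

```python
import struct


def pack_block_header(pt: int, ts_offset: int, length: int) -> bytes:
    """Pack one 4-byte redundancy block header with F=1."""
    if not (0 <= pt < 1 << 7 and 0 <= ts_offset < 1 << 14 and 0 <= length < 1 << 10):
        raise ValueError("field out of range")
    word = (1 << 31) | (pt << 24) | (ts_offset << 10) | length
    return struct.pack("!I", word)  # network byte order


def unpack_block_header(data: bytes) -> dict:
    """Split a 4-byte block header into its fields."""
    (word,) = struct.unpack("!I", data[:4])
    return {
        "f": word >> 31,
        "pt": (word >> 24) & 0x7F,
        "ts_offset": (word >> 10) & 0x3FFF,
        "length": word & 0x3FF,
    }


hdr = pack_block_header(pt=96, ts_offset=600, length=4)
print(hdr.hex(), unpack_block_header(hdr))
```

The 14-bit offset limits how far back a redundant generation can reach (16383 timestamp units), and the 10-bit length limits each block to 1023 bytes.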


Best regards James




________________________________
From: Gunnar Hellström <gunnar.hellstrom@omnitor.se><mailto:gunnar.hellstrom@omnitor.se>
Sent: 16 March 2020 22:12
To: James Hamlin; avtcore@ietf.org<mailto:avtcore@ietf.org>
Subject: Re: Source switching performance in draft-hellstrom-avtcore-multi-party-rtt-source-01.txt


Hi James,


Den 2020-03-16 kl. 13:11, skrev James Hamlin:

Hi Gunnar


Many thanks for taking the time to go through this so thoroughly.


I think we have 2 main aspects to this work:-

  1.  Compatibility with existing implementations
  2.  Choosing an efficient mechanism for the future

For the first of these, it seems to me that the only solution is for a mixer to be able to do inline participant labeling and buffering to produce a presentable single text stream. Current implementations of RFC4103 will simply not understand switching between participants, nor will they make any visual indication of which text belongs to which participant, so the mixer needs to do that.
Yes, it is possible to implement a limited functionality "pseudo-multi-party" text mixer for presenting multi-party rtt on a point-to-point rtt terminal. A procedure is included as Appendix A in this draft:

https://datatracker.ietf.org/doc/draft-hellstrom-mmusic-multi-party-rtt/

When the mixer has text from more than one source to transmit, it looks for suitable switching moments in the text from the source it is currently transmitting. When a phrase or a sentence is complete, a line separator is issued, or there is a long pause, the mixer decides to switch source: it inserts a line separator and a label for the source that is next in turn, and then that source's text. This method can be used but has its limitations. It only displays one source at a time in real time. Erasure across a switch does not work. It assumes that the receiver accepts the chat-style layout in one display area, etc.
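
The switching procedure described above can be sketched roughly as follows. This is a simplified offline illustration, not the procedure from the draft's Appendix A: it assumes each source's text is already buffered, treats only sentence-ending punctuation as a switching moment (pauses are not modelled), uses "\n" as a stand-in for the line separator, and uses a "[label]" prefix whose format is invented for the example.

```python
import re
from collections import deque


def split_at_switch_points(text: str) -> list:
    """Split after '.', '!' or '?', keeping the punctuation with its sentence."""
    return [part for part in re.split(r"(?<=[.!?])\s*", text) if part]


def pseudo_mix(buffers: dict, separator: str = "\n") -> str:
    """Interleave sources one sentence at a time, labelling each switch."""
    queues = {label: deque(split_at_switch_points(text))
              for label, text in buffers.items()}
    lines = []
    while any(queues.values()):
        for label, queue in queues.items():
            if queue:
                # a switch: separator, label of the next source, then its text
                lines.append(f"[{label}] {queue.popleft()}")
    return separator.join(lines)


print(pseudo_mix({"Alice": "Hi there. How are you?", "Bob": "Fine."}))
```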

Each application area should decide what to do with old terminals that do not support multi-party: implement the switching method above, upgrade the terminals, or refuse to involve the terminals in multi-party sessions.






For the second we have established that it's possible to: allow the different redundant blocks in a packet to be for different participants; or use timestamps to resolve the correct redundant text to use and to have each packet associated with just one participant. I can also imagine putting all text generations of each source in the CSRC list in each packet which adds 12 bytes of header space per CSRC (assuming 2 redundant generations); I'll write a separate mail about that.
Sounds interesting.
After that, I think we have a fairly complete set of solutions to choose from.

Thanks,

Gunnar


Some other comments inline.

Best regards

James




________________________________
From: Gunnar Hellström <gunnar.hellstrom@omnitor.se><mailto:gunnar.hellstrom@omnitor.se>
Sent: 14 March 2020 11:17
To: James Hamlin; avtcore@ietf.org<mailto:avtcore@ietf.org>
Subject: Re: Source switching performance in draft-hellstrom-avtcore-multi-party-rtt-source-01.txt


Hi James,

Thanks for an interesting proposal.

Let us extend the information about the packet contents of your example a bit:



seq         01  02  03  04  05  06  07  08  09  10  11  12  13  14  15

CSRC=source  1   2   3   1   2   3   1   2   3   1   2   3   1   2   3

Timestamp   91  92  93  94  95  96  97  98  99 100 101 102 103 104 105


R2 t offset  6   6   6   6   6   6   6   6   6   6   6   6   6   6   6

R1 t offset  3   3   3   3   3   3   3   3   3   3   3   3   3   3   3

R2                                   A   M   X   B   N   Y   C   O   Z

R1                       A   M   X   B   N   Y   C   O   Z

P            A   M   X   B   N   Y   C   O   Z


Lost                             X   X


(The timestamps and timestamp offsets ("t offset") are shown in 100 ms, in reality it will be in milliseconds)


The SSRC of the packet is always the mixer's SSRC.

The source is indicated in the CSRC-list that in this method has only one member = the SSRC of the source represented in the packet.

jeh: Agreed: My mistake.


The timestamp is created by the mixer when sending, and the timestamp offsets make it possible to calculate the timestamps the redundant texts had when they were transmitted as originals.


The receiver must store essential data from a number of packets: the sequence number, the source (= CSRC), and the timestamp.

The receiver must also store, for each source, the timestamps for which text has been received (either with real contents or empty).

So, let us see what happens if both packets 06 and 07 are lost.


In packets 1 to 5, we have received and put in display areas for source 1: "AB", for source 2: "MN", for source 3: "X"


08 is received and the gap (06 and 07) is detected. Each of the two lost packets carried two redundant elements, making a need for retrieval of up to 4 text elements; the gap is remembered, so we need to do the recovery analysis. The source (2) and other essential data is noted. The original timestamp of R2 is calculated as Timestamp-R2 t offset = 98-6 = 92. Checking back in the list of received packets we find that we got a packet with timestamp 92 (and indeed, it contained text from source 2), so there is no need to recover R2. Then the original timestamp of R1 is calculated as Timestamp-R1 t offset = 98-3 = 95. Checking back in the list of received packets we find that we got a packet with timestamp 95 (and indeed, it contained text from source 2), so there is no need to recover R1. The Primary text in packet 08 ("O") is retrieved and put in the display area for source 2. It is noted that we have got text for timestamp 98 for source 2. The gap is still 4.


09 is received.  The gap (4 elements ) is remembered so we need to do the recovery analysis.  The source (3) and other essential data is noted. The original timestamp of R2 is calculated as Timestamp-R2 t offset = 99-6 = 93. Checking back in the list of received packets we find that we got a packet with timestamp 93 (and indeed, it contained text from source 3), so there is no need to recover R2. Then the original timestamp of R1 is calculated as Timestamp-R1 t offset = 99-3 = 96. Checking back in the list of received packets we find that we never got a packet with timestamp 96 , so we recover R1 ("Y") and insert it in the display area of source 3. The Primary text in packet 09 ("Z") is retrieved and put in the display area for source 3. It is noted that we got text from timestamps 96 and 99 for source 3 (the gap can now be reduced to 3)



10 is received.  The gap (3 ) is remembered so we need to do the recovery analysis.  The source (1) and other essential data is noted. The original timestamp of R2 is calculated as Timestamp-R2 t offset = 100-6 = 94. Checking back in the list of received packets we find that we got a packet with timestamp 94 (and indeed, it contained text from source 1), so there is no need to recover R2. Then the original timestamp of R1 is calculated as Timestamp-R1 t offset = 100-3 = 97. Checking back in the list of received packets we find that we never got a packet with timestamp 97 , so we recover R1 ("C") and insert it in the display area of source 1. The Primary text in packet 10 is empty so there is nothing to put in the display area for source 1. It is noted that we got text from timestamps 97 and 100 for source 1 (the gap can now be reduced to 2)



11 is received.  The gap (2 ) is remembered so we need to do the recovery analysis.  The source (2) and other essential data is noted. The original timestamp of R2 is calculated as Timestamp-R2 t offset = 101-6 = 95. Checking back in the list of received packets we find that we got a packet with timestamp 95 (and indeed, it contained text from source 2), so there is no need to recover R2. Then the original timestamp of R1 is calculated as Timestamp-R1 t offset = 101-3 = 98. Checking back in the list of received packets we find that we got a packet with timestamp 98 , so we do not need to recover R1. The Primary text in packet 11 is empty so there is nothing to put in the display area for source 2. It is noted that we got text from timestamp 101 for source 2. (we did not recover anything, so the gap is still 2)


12 is received.  The gap (2 ) is remembered so we need to do the recovery analysis.  The source (3) and other essential data is noted. The original timestamp of R2 is calculated as Timestamp-R2 t offset = 102-6 = 96. Checking back in the list of received packets we find that we never got a packet with timestamp 96, but from packet 09 we recovered R1 from timestamp 96. So we shall not recover anything from R2 here. Then the original timestamp of R1 is calculated as Timestamp-R1 t offset = 102-3 = 99. Checking back in the list of received packets we find that we got a packet with timestamp 99, so we do not need to recover R1. The Primary text in packet 12 is empty so there is nothing to put in the display area for source 3. It is noted that we got text for timestamp 102 for source 3 (the gap can now be reduced to 1)


13 is received.  The gap (1 ) is remembered so we need to do the recovery analysis.  The source (1) and other essential data is noted. The original timestamp of R2 is calculated as Timestamp-R2 t offset = 103-6 = 97. Checking back in the list of received packets we find that we already recovered text for timestamp 97, so nothing is recovered and nothing inserted from R2 in the display area of source 1 . Then the original timestamp of R1 is calculated as Timestamp-R1 t offset = 103-3 = 100. Checking back in the list of received packets we find that we got a packet with timestamp 100, so we do not need to recover R1. The Primary text in packet 13 is empty so there is nothing to put in the display area for source 1. It is noted that we got text for timestamp 103 for source 1 (the gap can now be reduced to 0)


14 is received. The gap is now 0 so we do not need to do any recovery analysis. The source (2) and other essential data is noted. The Primary text in packet 14 is empty so there is nothing to put in the display area for source 2. It is noted that we got text for timestamp 104 for source 2.


After this, we have in the display areas:  for 1: "ABC" for 2: "MNO", for 3: "XYZ", so everything is recovered.
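
The walkthrough above can be checked with a small simulation. The sketch below rebuilds the 15 packets of the table (timestamps in units of 100 ms, as in the table), drops packets 06 and 07, and applies a simplified version of the rule used in the narrative: a redundant element is taken only if no text has yet been noted for its original timestamp and source. It assumes round-robin switching so that the redundant elements in a packet belong to the packet's own source, and it does not model the gap counter. The class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    seq: int
    source: int
    ts: int
    r2: str   # second-generation redundancy (sent 6 units earlier)
    r1: str   # first-generation redundancy (sent 3 units earlier)
    p: str    # primary text


def build_packets() -> list:
    """Recreate the example table: sources 1..3 send ABC, MNO, XYZ."""
    primary = dict(zip(range(1, 10), "AMXBNYCOZ"))
    return [Packet(seq, (seq - 1) % 3 + 1, 90 + seq,
                   primary.get(seq - 6, ""), primary.get(seq - 3, ""),
                   primary.get(seq, ""))
            for seq in range(1, 16)]


def receive(packets: list, lost: set) -> dict:
    """Return the text shown per source after timestamp-based recovery."""
    covered = {}   # source -> original timestamps already accounted for
    display = {}   # source -> text in its display area
    for pkt in packets:
        if pkt.seq in lost:
            continue
        seen = covered.setdefault(pkt.source, set())
        text = display.get(pkt.source, "")
        # oldest generation first, so recovered text keeps its order
        for offset, chunk in ((6, pkt.r2), (3, pkt.r1), (0, pkt.p)):
            original_ts = pkt.ts - offset
            if original_ts not in seen:
                seen.add(original_ts)
                text += chunk
        display[pkt.source] = text
    return display


print(receive(build_packets(), lost={6, 7}))
```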


I am sorry, the narrative above may be hard to follow. It could probably be converted to some table format if we need to do it again for other cases.


So, yes, this method also works.

I see a couple of differences in characteristics between this "timestamp method" and the "CSRClist method":


1. The recovery time from loss to recovery can with the timestamp method be 200 times the number of simultaneous sending sources in milliseconds. Thus with 5 sources: 1 second. With the CSRClist method, it is steady 200 milliseconds. (assuming a transmission interval of 100 ms and round robin mixer switching.)


2. The recovery capacity in packets in sequence is 2*the number of simultaneous sources = 10 packets for 5 sources. With the CSRC method it is 2 packets. (again assuming round robin mixer switching )


3. The complexity of the procedure is higher but still manageable for the timestamp method.

jeh: Yes. I had thought this would be simpler, but the timestamp logic gets complicated.


4. The number of packets to store essential information about is higher for the timestamp method ( I think it is 4*the number of active sources). For the CSRC list method, it is 4 packets and less information per packet.
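
The figures in points 1 and 2 can be reproduced with a short calculation. The sketch assumes round-robin switching, two redundant generations and a 100 ms transmission interval, as stated above; the generalisation to other intervals and generation counts is an assumption made for illustration, and the function names are hypothetical.

```python
def timestamp_method(sources: int, interval_ms: int = 100, redundant: int = 2) -> dict:
    """Worst-case recovery time and tolerated burst loss, timestamp method."""
    return {
        "max_recovery_ms": redundant * interval_ms * sources,
        "burst_loss_packets": redundant * sources,
    }


def csrc_list_method(interval_ms: int = 100, redundant: int = 2) -> dict:
    """Same figures for the CSRC-list method, independent of source count."""
    return {
        "max_recovery_ms": redundant * interval_ms,
        "burst_loss_packets": redundant,
    }


print(timestamp_method(5))   # 5 simultaneous sources
print(csrc_list_method())
```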


You say: "The advantage of this approach is that the format of the packet doesn't change. The current arrangement where all the text in a packet is for one participant is preserved."


I do not see the CSRClist method as a change in packet format.

jeh: Agreed: I should have said that differently: the change is that the text over the redundant and primary block is no longer continuous; it's for different participants.


The mixer needs in both methods to include a CSRC -list. The difference is that in the CSRClist method, the list has more members. It is still within the format description of RTP. The source of the redundant parts will vary in a packet, but the composition of the contents of the packet for transmission from the mixer is as usual for a single sender: Put what was sent next to last in the packet as R2, put what was sent last as R1, and put the new text chunk as P. The only addition is the rule that the CSRC list is populated with the sources in the strict order.
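
The composition rule described above can be sketched as a small mixer class. The sketch assumes that the CSRC list follows the same order as the payload blocks (R2, R1, P); that ordering, the use of None for slots that have no source yet at session start, and the class name are assumptions made for illustration rather than the draft's definition.

```python
from collections import deque


class CsrcListMixer:
    """Keeps the last transmissions and emits R2, R1, P with matching sources."""

    def __init__(self, redundant: int = 2):
        # history of (source, text) for the most recent transmissions
        self.history = deque([(None, "")] * redundant, maxlen=redundant)

    def next_packet(self, source: int, text: str) -> dict:
        blocks = list(self.history) + [(source, text)]   # R2, R1, P
        self.history.append((source, text))              # oldest entry drops out
        return {
            "csrc": [src for src, _ in blocks],
            "blocks": [chunk for _, chunk in blocks],
        }


mixer = CsrcListMixer()
for src, chunk in ((1, "A"), (2, "M"), (3, "X")):
    packet = mixer.next_packet(src, chunk)
print(packet)
```

Because every block carries its own source in the CSRC list, the mixer can switch source on every packet without first flushing the redundancy of the previous source.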


Summary: both methods seem possible. It will be interesting to get more comments.


Thanks,


Gunnar








Den 2020-03-14 kl. 01:45, skrev James Hamlin:
Hi Gunnar


I've also been thinking through the possibility of a sender switching source without clearing all redundant generations first. Clearly, a sender that did this today would cause problems for existing receivers. But checking timestamps at the receiver should fix this.


Consider three senders 1, 2 and 3 which send text "ABC", "MNO" and "XYZ". The block below shows this text being sent taking round-robin turns with the participants.


seq    01  02  03  04  05  06  07  08  09  10  11  12  13  14  15

part    1   2   3   1   2   3   1   2   3   1   2   3   1   2   3

                            -

R2                              A   M   X   B   N   Y   C   O   Z

R1                  A   M   X   B   N   Y   C   O   Z

P       A   M   X   B   N   Y   C   O   Z


If sequence 06 is lost and the receiver sees sequence 07, then it may assume that the lost packet was for participant 1 and use the redundant character "B". This would lead to character "B" being duplicated in the output for participant 1. But it is possible for a receiver to get the correct result: the timestamp for the redundant text in packet 07 will not be higher than the most recent timestamp previously received for that participant, and so it shouldn't be used. The following packet 08 has no applicable redundant text either, but the timestamp of the R1 in packet 09 will be greater than the most recent one received for its participant and so is usable.


The advantage of this approach is that the format of the packet doesn't change. The current arrangement where all the text in a packet is for one participant is preserved. Tracking the timestamp adds some implementation effort but I think it's minimal. It does also mean the mixer needs to synchronize timestamps across the media sources but it is in a position to do so.

Best regards

James




________________________________
From: Gunnar Hellström <gunnar.hellstrom@omnitor.se><mailto:gunnar.hellstrom@omnitor.se>
Sent: 13 March 2020 16:41
To: avtcore@ietf.org<mailto:avtcore@ietf.org>; James Hamlin
Subject: Source switching performance in draft-hellstrom-avtcore-multi-party-rtt-source-01.txt


Hi,

I want to follow-up on the good discussion on source switching performance a couple of days ago, under the subject "[AVTCORE] Improved RTP-mixer performance for RFC 2198 and RFC 4103 redundancy coding"

Two parts in the performance increase solution.
Two actions are proposed in draft-hellstrom-avtcore-multi-party-rtt-source:
a) Reduce the packet transmission interval from 300 to 100 ms.
b) Use a strict relation between members in the CSRC list and the parts of the payload that are original text, first-generation redundancy and second-generation redundancy, so that the mixer can switch source for every new packet and the sources of text recovered from redundancy can be assessed by the receiver.

I think it is worthwhile to move forward with the complete improvement a) and b) proposed in the draft. It will cause less complexity, lower delays and lower risk for stalling in case of many participants entering new text simultaneously.

Here is my reasoning:

I have the following view of the achievable performance improvements for different cases:

1. With the original source switching with RFC 4103 and an RTP-mixer using 300 ms transmission interval and not allowing a mix of sources in one packet, there can be one source switching per second by the mixer with an introduced delay of up to one second.

2. By just reducing the transmission interval from 300 to 100 ms, it will be possible to have three source switches per second with an introduced delay of up to one second. (with just two parties sending text simultaneously, the delay will be maximum 300 ms. )

3. And by applying the proposal from the multi-party-rtt-source draft with the CSRC-list as a source list for the redundancy, and also using 100 ms transmission interval, there can be switching between five sources per second with an introduced delay of max 500 ms. With just two parties typing simultaneously, the delay will be a maximum of 100 ms.

The delays are extreme values from when all sources start to type simultaneously.  It was agreed that at least the improvement from the reduced transmission interval is needed.

Case 1 and 2 are a bit complex for the mixer to implement. From the moment it has text queued for transmission from another source B than the one currently transmitted A, then the mixer needs to stop adding new text from A to the packets, but still send two more packets with the agreed transmission interval, progressing the latest transmitted original text to first generation redundancy and then again one more packet with the text as second level redundancy. Not until that is done, the mixer is allowed to start taking text from the transmission queue from B to transmit. This is the background of the 1 s vs 300 ms delays in case 1 and 2.
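
The delay figures for cases 1 and 2 follow from the flush rule just described. As a rough sketch, assuming two redundant generations and that a switch therefore costs about three transmission intervals (the packet with the last original text plus two more that age it through both redundancy levels), the numbers come out as below. The function names are hypothetical.

```python
def switch_cost_ms(interval_ms: int, redundant: int = 2) -> int:
    """Approximate time before the mixer may start sending another source."""
    return interval_ms * (redundant + 1)


def switches_per_second(interval_ms: int, redundant: int = 2) -> int:
    """Whole source switches that fit into one second."""
    return 1000 // switch_cost_ms(interval_ms, redundant)


for interval in (300, 100):
    print(interval, switch_cost_ms(interval), switches_per_second(interval))
```

At 300 ms this gives roughly 900 ms per switch (about one switch per second), and at 100 ms it gives 300 ms per switch (three switches per second), in line with cases 1 and 2.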

In case 3, there is much less complexity. When there is something from B in queue for transmission, the mixer can decide to insert that in next packet and add the redundancy from earlier transmissions from A, because their sources are included in the CSRC list in the same packet.

Therefore I want to move on with the complete solution in case 3.

-----------------------------------

Influence on the multi-party capability negotiation:
There is an installed base of RTT implementations without multi-party awareness. The receiver needs to take an active part in planning the multi-party RTT presentation. Therefore a capability negotiation is needed. A simple SDP attribute a=rtt-mix without value is proposed in the draft.

It is important to let this attribute mean capability of the complete solution, case (3). If there is a temptation to have different levels of implementation, some only implementing the shorter transmission interval (2) and some implementing the complete solution (3), then there will be a need for two different attributes, or one attribute with a list of parameter values for the two cases. That would complicate the evaluation of the negotiation. Therefore I would prefer that the attribute can mean capability to use the complete mixing solution (3).

Regards

Gunnar

Den 2020-02-29 kl. 20:13, skrev internet-drafts@ietf.org<mailto:internet-drafts@ietf.org>:

A New Internet-Draft is available from the on-line Internet-Drafts directories.


        Title           : Indicating source of multi-party Real-time text
        Author          : Gunnar Hellstrom
        Filename        : draft-hellstrom-avtcore-multi-party-rtt-source-01.txt
        Pages           : 13
        Date            : 2020-02-29

Abstract:
   Real-time text mixers need to identify the source of each transmitted
   text chunk so that it can be presented in suitable grouping with
   other text from the same source.  An enhancement for RFC 4103 real-
   time text is provided, suitable for a centralized conference model
   that enables source identification, for use by text mixers and
   conference-enabled participants.  The mechanism builds on use of the
   CSRC list in the RTP packet.  A capability exchange is specified so
   that it can be verified that a participant can handle the multi-party
   coded real-time text stream.  The capability is indicated by an sdp
   media attribute "rtt-mix".


The IETF datatracker status page for this draft is:
https://datatracker.ietf.org/doc/draft-hellstrom-avtcore-multi-party-rtt-source/

There are also htmlized versions available at:
https://tools.ietf.org/html/draft-hellstrom-avtcore-multi-party-rtt-source-01
https://datatracker.ietf.org/doc/html/draft-hellstrom-avtcore-multi-party-rtt-source-01

A diff from the previous version is available at:
https://www.ietf.org/rfcdiff?url2=draft-hellstrom-avtcore-multi-party-rtt-source-01




--

+ + + + + + + + + + + + + +

Gunnar Hellström
Omnitor
gunnar.hellstrom@omnitor.se<mailto:gunnar.hellstrom@omnitor.se>
+46 708 204 288
