[avtext] AVTEXT Minutes for IETF 93

Jonathan Lennox <jonathan@vidyo.com> Thu, 13 August 2015 16:27 UTC

From: Jonathan Lennox <jonathan@vidyo.com>
To: "avtext@ietf.org" <avtext@ietf.org>
Thread-Topic: AVTEXT Minutes for IETF 93
Thread-Index: AQHQ1eTt06i2FYjgE02WeqZuJO90tw==
Date: Thu, 13 Aug 2015 16:27:20 +0000
Message-ID: <71FDF98A-ECAE-4580-A1FA-27953CF5CF42@vidyo.com>
Accept-Language: en-US
Content-Language: en-US
Content-Type: text/plain; charset="utf-8"
Content-ID: <0E9D30238FBED54C9B1A382AF86A2D6E@vidyo.com>
Content-Transfer-Encoding: base64
MIME-Version: 1.0
Archived-At: <http://mailarchive.ietf.org/arch/msg/avtext/ciFDwF6S0X1U8JYcpbyDjOxnTRo>
Subject: [avtext] AVTEXT Minutes for IETF 93
Precedence: list

Here are the minutes for AVTEXT 93 in Prague.  Please send the chairs any corrections or additions.

AVTEXT Audio/Video Transport Extensions

Thursday, 23 July, 2015 17:40 - 19:10 CEST (Room: Berlin/Brussels)

Chairs: Jonathan Lennox and Magnus Westerlund (filling in for Keith Drage)
Responsible Area Director: Ben Campbell
Notetakers: Emil Ivov, Varun Singh

Agenda bash and status update
=====

WG status:
    grouping taxonomy approved
    stream pause requires a few changes after AD review
    splicing notification: ready for pub req, will proceed once pause is advanced

Action Item: Ben to confirm that Bo responded to all his AD review
items for stream pause, send to IETF last call


==========================================
RTP Header Extension for source description
Magnus Westerlund, presenting
==========================================

Status: Ready for WGLC

Action Items:
Simon Perreault will review draft
Magnus and/or Roni will submit 5285bis to AVTCORE

Discussion:

Issue: switching between one and two byte header extension format is not clear in RFC 5285
Mo Zanaty: the issue is even worse because two-byte header support is not mandatory
Magnus: we suggest we clarify 5285
Roni Even: +1
Colin Perkins: +1
Stephan Wenger: Deprecate the one-byte headers? Is there much deployment of it?
Many people: yes.
Jonathan: Most current deployment is one-byte, not clear there is deployment of two-byte
Mo: Suggest to mandate support for both extensions and then require that two byte header extensions use
IDs bigger than 15 so that they can be negotiated.
Magnus or Roni will submit 5285bis to AVTCORE
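
For illustration only (this sketch is not from the meeting), the format constraint behind Mo's suggestion can be written as a small check. It assumes the RFC 5285 field sizes: one-byte elements carry IDs 1-14 with 1-16 bytes of data, while two-byte elements carry IDs 1-255 with 0-255 bytes of data. The name `required_format` is hypothetical.

```python
ONE_BYTE_MAX_ID = 14    # ID 15 is reserved and ID 0 is padding in the one-byte form
ONE_BYTE_MAX_LEN = 16   # the 4-bit length field stores (length - 1)
TWO_BYTE_MAX_ID = 255   # 8-bit ID field
TWO_BYTE_MAX_LEN = 255  # 8-bit length field, zero-length data allowed


def required_format(elements):
    """Return "one-byte" or "two-byte" for a list of (ext_id, data_length) pairs.

    Hypothetical helper: an element forces the two-byte form if its ID or its
    data length cannot be expressed in the one-byte form.
    """
    needs_two_byte = False
    for ext_id, length in elements:
        if not 1 <= ext_id <= TWO_BYTE_MAX_ID:
            raise ValueError(f"extension ID {ext_id} is out of range")
        if not 0 <= length <= TWO_BYTE_MAX_LEN:
            raise ValueError(f"data length {length} is out of range")
        if ext_id > ONE_BYTE_MAX_ID or not 1 <= length <= ONE_BYTE_MAX_LEN:
            needs_two_byte = True
    return "two-byte" if needs_two_byte else "one-byte"
```

Under this reading, an endpoint that negotiates only IDs of 14 or below, with short data, never needs the two-byte form, which is why reserving higher IDs for two-byte elements would make the requirement visible at negotiation time.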

Jonathan: Who read this?
5/6 hands
Jonathan: Who will review?
Simon Perreault

Jonathan: many things would likely want to use this.
Roni: there’s also a dependency in clue
Colin: We should check if something in bundle would also need to be fixed to address the issues Magnus pointed out

Jonathan: OK, we are ready for last call!


==========================================
Layer Refresh Request (LRR) RTCP feedback message
Jonathan Lennox
==========================================

Action Items:
Bernard Aboba and Mo Zanaty to review H.264-SVC layer refresh text.
Stephan to work with JCT if a new SEI message is needed with H.265 for temporal layer refresh.
Jonathan will work with co-authors for VP8 text.

Jonathan to send text to the list about what FIR means for MRST/MRMT.

Discussion:

Jonathan: this is now a WG document
Deltas: not many. main one: recognizing layer refresh point
H.264/SVC is complete
ACTION: Bernard A. and Mo Z. are volunteering for reviews of this part
H.265
Stephan Wenger: is anyone interested at all in non-scalable nested coding structures?
Jonathan: yes for VP8 and VP9.
Stephan: given the current pace of SDOs, that’s likely going to exist in H.265 as well, so I volunteer to do that.
ACTION ITEM: Stephan will take care of this for JCT
ACTION ITEM: Jonathan will nudge colleagues to do reviews for VP8
ACTION ITEM: review the description of how you recognize layer refresh?

Mo: not clear the VP8 Y bit is sufficient to recognize all temporal sync points.
Jonathan: false negatives aren't the end of the world, false positives are bad.

Next slide: Temporal switch points are complicated!
Justin gets up to explain … says he’s confused
Mo explains: these are just three layers of reference frames

Open issue: what does FIR mean for spatial Multi stream scalability
QUESTION to WG:
Refresh layer or refresh the whole source? Only relevant for H.264-SVC. Question: do we want to define this? Do we want to do this here?

Stephan Wenger: wasn’t there some kind of consensus somewhere (payload) that new payload formats should have some form of FIR mapping?
Jonathan: shakes head
Stephan: an issue for SHVC payload format. 
Stephan: I will talk to Ye-Kui.
Jonathan: Two issues: what should the semantic be, and how is it instantiated for any given codec.
My recommendation: FIR means refresh all layers and we need to define something else for per layer.
Mo: I sort of thought that’s what it meant already. After all it’s a FULL intra refresh
Bernard: For IMTC structures it doesn't make a difference.

Jonathan: If no one is currently doing anything where it makes a difference, not an issue.

CONSENSUS: FIR is a full refresh.

Jonathan Question: where do we say that?
Mo: let’s just have this doc update 6190
Jonathan: probably update CCM, not 6190.

Some more discussions showing there’s actually no consensus.

Justin: Does this mean that FIR has different behavior for MRST than for simulcast?
Jonathan: Yes.
Justin: I think that's confusing and awkward.
Mo: Maybe have FIR refresh all simulcast streams?
[Unhappiness]
Stephan: Multiple prediction coding coming.
Jonathan: My inclination is to say that MRST and simulcast are different, despite the asymmetry.
Stephan: Possibly recommend that FIR be sent for all layers simultaneously.
Justin: Can we just say that LIR is per-layer and FIR is per-SSRC?
Jonathan: For MRST/MRMT those are the same thing. Stephan's suggestion would be straightforward for
MRST, but for MRMT the multiple FIRs will not be the same packet so there might be sync issues.

Magnus: write it up and send this to the list (ACTION ITEM)
Stephan agreed
Jonathan agreed: will do!

==========================================
RTCP feedback message for image control
Roni Even
==========================================

Action items:
Roni: Consider timestamp for which picture request is relative to
Roni will draft a liaison statement for 3GPP


Discussion:

Motivations: a picture of Matt Damon
also: get a detailed image for part of an image. camera zoom and move. zoom on participant in an MCU

Proposal: have an RTCP message that describes the area they want zoomed or moved. Requires consent of sender.
Reference is based on current picture/quality rate

There’s also a notification/ack message from the sender

Mo: question. you shouldn’t do this with absolutes but with relatives because resolutions change.
Roni: idea is to have pixels based on the current view.
Mo: but there’s no way to know which frame this relates to
Randell J (through Jabber): same question pointing out the race conditions

Ben: coordinates are unsigned?
Roni: can be negative so that you can move outside of your view

Stephen: using reference timestamps should fix most of the race conditions
Roni: Agreed, we will consider this (ACTION ITEM)

Randell: offsets could be made floats (0-1 of the source width) and be relative to the absolute source
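
As a rough sketch of Randell's point (not taken from the draft), a region expressed as fractions of the source stays valid when the encoded resolution changes, because pixel values are derived only at the moment of use. The names `NormalizedRegion` and `to_pixels` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NormalizedRegion:
    """Requested view as fractions (0.0-1.0) of the full source picture."""
    x: float       # left edge as a fraction of source width
    y: float       # top edge as a fraction of source height
    width: float   # region width as a fraction of source width
    height: float  # region height as a fraction of source height


def to_pixels(region, source_width, source_height):
    """Map a normalized region to (left, top, width, height) in pixels."""
    return (
        round(region.x * source_width),
        round(region.y * source_height),
        round(region.width * source_width),
        round(region.height * source_height),
    )


region = NormalizedRegion(x=0.25, y=0.5, width=0.5, height=0.25)
at_720p = to_pixels(region, 1280, 720)
at_360p = to_pixels(region, 640, 360)
```

The same request maps to the same part of the picture at both resolutions, so the receiver does not need to know which frame size the sender was using when the request arrives.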

Peter T. question: Is this a feedback or a control message?
Roni: it’s both
Peter: is this weird?
Roni: no
Peter: how do you do reliability?
Roni: we have an ICN response message. also
Colin: also there’s a standard convention for doing this
Roni: yes, that’s what we are using here

Peter T: how are you referring to the stream that you mean?
Roni: by SSRC
Peter: what if the SSRC changes?
Roni: then you're receiving a new stream so you have a new view
Roni: Relatedly, in multipoint BFCP allows you to say who controls the camera

Stephan Wenger: I think the document is underspecified. You need to
say whether your source window is with respect to decoded bits or only
with respect to bits that are meant for human consumption

Ben C.: clarification on the IANA issue - they registered parameters with IANA, but they're held up
because we don't have expert reviewers.
Roni: yes, I was just saying this is why I didn't know they had done it.
Ben C.: We also need to think about whether we want to continue work that's divergent from what they've done, whether there's harm done by having two ways of doing things.

Magnus: we need to feed back our criticism to the original authors before we start this work

Ben: Is this in a release?
Magnus: Yes, but they have a process for fixing it.

Mo: suggestion about semantics on the ICN message: would be useful if sender gave current viewport
Roni: makes sense

Randell (via Jabber): we already do dynamic resolution scaling in Firefox, so using non-pixel-based values is a win. Also means you don't have to have memory of sizes, and avoids confusion when using layered encodings.

Jonathan: I think we are better off helping 3GPP fix their own message than doing our own competitive
message, if possible.
Cullen: +1
Jonathan: Rachel and Roni, go to a 3GPP meeting and help them fix this. Also, please draft text of a
liaison statement.

Roni: can we ask if people care about this?
Magnus: ok, WG QUESTION: WHO CARES ABOUT THIS AT ALL?
Cullen: maybe some of our video surveillance people would.
Bo: maybe I care too. But I think we should try to work it out in 3GPP.  (PROMISE)
Jonathan: you could be the person who catches the liaison statement in 3GPP?
Bo: Myself or a close colleague.

Stephan Wenger: Let’s stop using poor design of H.281 as an excuse. No one has given valid reasons for replacing it.
Jonathan: The 3GPP spec recommends both H.281 and an RTCP feedback message, presumably for different
circumstances.
Stephen Botzko: we tried to replace this with H.282 but no one implemented it. So I agree with Stephan.

ACTION ITEM AND CONSENSUS: Roni will write a liaison to 3GPP and we’ll see what happens there


==========================================
Video Frame Info (RTP Header Extension)
Mo Zanaty
==========================================

Status: Consensus to adopt as WG document.
Action Item: Chairs to work with ADs to add milestone.

Discussion:

Motivation: we need payload-agnostic RTP switching (like in SFUs). The reason is that the payload is often encrypted and, even if it is not, it could be in an unknown format.
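
To make the motivation concrete, here is a minimal sketch of the temporal-only case, assuming a header extension exposes a temporal layer ID in the clear. The field and function names are hypothetical and are not taken from the draft. The sketch only drops higher layers; it does not handle switching up, which needs the sync points discussed under LRR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PacketInfo:
    """What a forwarding unit can see without parsing the (possibly encrypted) payload."""
    sequence_number: int
    temporal_id: int  # hypothetical value read from a cleartext header extension


def select_for_receiver(packets, max_temporal_id):
    """Keep only packets whose temporal layer is at or below the receiver's target."""
    return [p for p in packets if p.temporal_id <= max_temporal_id]


incoming = [
    PacketInfo(1, 0),
    PacketInfo(2, 2),
    PacketInfo(3, 1),
    PacketInfo(4, 2),
    PacketInfo(5, 0),
]
forwarded = select_for_receiver(incoming, max_temporal_id=1)
```

The forwarding decision uses nothing from the codec payload, which is the property the motivation asks for.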

Peter Thatcher: original idea was to be codec-agnostic, but there is codec-specific information?
Mo: If you need spatial/quality information, yes.  H.265's combined layerId forces this.
Peter: Would it be possible to have some way to specify the semantic explicitly?
Mo: I'm not sure how that's easier than doing it per-payload-type. Perhaps we could write
recommendations for how future codecs do things.

Stephan Wenger: Some stuff is still missing in this document.
Design suggestion: let’s just say, you grab the first N bits of your codec packet and copy it here.
This is not future proof and we can make it future proof

Jonathan: you would certainly need TL0 pic index.  For spatial scalability would also need something equivalent to the VP9 scalability structure / scalability update which tells you what layers you can be expecting to receive.
Also suggestion:  you could restrict this to temporal only as the behavior is much better defined and understood.
If we do keep these layer IDs, we could arrange things so they are the same as in the LRR spec.

Bernard Aboba: +1 on the TL0 pic index.
Mo: OK, I’ll take this up on the list

Jonathan: how many people know what this is about? MANY
          how many people think we should work on it? MANY
          anyone thinking we shouldn’t?

Stephan: We need to understand Selective Forwarding Unit architecture better before we work on it.
Bernard: We need this today, have multiple proprietary implementations of it, and we can learn more
working on it.
Mo: Bernard’s document outlines a problem, this document can be viewed as a solution to that problem
Stephan: we shouldn’t work on this until we study it well and know exactly what we want to do.
Cullen: I am on the other end. let’s just get it done to address today’s problems.
Bernard: The proprietary arches only solve the temporal case. This draft addresses today’s problems and
even goes a bit beyond. So that’s good enough.

Magnus: Who supports this as a WG doc? 10 PLUS HANDS
Magnus: Against? none

Magnus: will work with ADs to add milestone
