Re: [clue] I-D Action: draft-ietf-clue-telepresence-requirements-04.txt

Hello Paul,

There was a bounding attribute "the scene area" which was removed. As I
mentioned at the time, the idea was to give an indication of how the
capture areas were located in the room.

With regard to audio, I think we need to revisit that with respect to
the spatial attributes.
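
As a rough illustration of the unbounded pyramid Paul describes in the
quoted message below, here is a minimal Python sketch. It assumes the
area of capture is a convex quadrilateral whose corners are listed in
perimeter order; the function name in_capture_pyramid and the sample
coordinates are hypothetical and are not taken from the framework.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def _sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a: Vec3, b: Vec3) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def in_capture_pyramid(point_of_capture: Vec3,
                       area_of_capture: Sequence[Vec3],
                       target: Vec3) -> bool:
    """Return True if target lies inside the unbounded pyramid whose apex is
    point_of_capture and whose edges pass through the four corners of
    area_of_capture (assumed convex, corners listed in perimeter order)."""
    centroid = tuple(sum(c[i] for c in area_of_capture) / 4.0 for i in range(3))
    to_centroid = _sub(centroid, point_of_capture)
    to_target = _sub(target, point_of_capture)
    for i in range(4):
        a = _sub(area_of_capture[i], point_of_capture)
        b = _sub(area_of_capture[(i + 1) % 4], point_of_capture)
        normal = _cross(a, b)  # normal of the side face through the apex
        if _dot(normal, to_centroid) < 0:  # orient the normal inward
            normal = (-normal[0], -normal[1], -normal[2])
        if _dot(normal, to_target) < 0:  # target is outside this face
            return False
    return True


# Hypothetical example: capture point at the origin, capture area in the
# plane y = 2 (coordinates are illustrative only).
camera = (0.0, 0.0, 0.0)
area = [(-1.0, 2.0, -1.0), (1.0, 2.0, -1.0), (1.0, 2.0, 1.0), (-1.0, 2.0, 1.0)]
near = in_capture_pyramid(camera, area, (0.0, 1.0, 0.0))
far = in_capture_pyramid(camera, area, (0.0, 100.0, 0.0))
side = in_capture_pyramid(camera, area, (5.0, 1.0, 0.0))
```

In this sketch a target 100 units along the capture axis still counts as
inside the pyramid, because the point of capture and area of capture
alone put no limit on depth.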

Regards, Christian

On 19/07/2013 12:42 AM, Paul Kyzivat wrote:
> Christian,
>
> You raise a point that has bothered me for some time:
>
> We describe the spatial attributes of a capture by point of capture 
> and area of capture (a quadrilateral). Implicitly this describes an 
> unbounded pyramid. In reality, it is bounded by the geometry of the 
> room and the stuff in it, but we don't have that info. This pyramid 
> may be meaningful in some sense for video, though we lack other 
> information such as depth of field. But as John replied, I guess it 
> means very little for audio.
>
> The fundamental question is: does this provide sufficient information 
> for the consumer to properly adapt the received media to his available 
> equipment? I don't know the answer to that, but I suspect that it will 
> allow some approximation, but probably not enough to do the best 
> possible job. Is there more that we ought to be providing to help with 
> that job? Or does doing more provide diminishing returns?
>
>     Thanks,
>     Paul
>
> On 7/17/13 11:01 PM, Christian Groves wrote:
>> Hello Mary,
>>
>> I did a review of the requirements and Appendix to see if they are met.
>> My comments are below. Sorry for the length of the email but I figured
>> it would be easier for people to read my comment [CNG] along with the
>> requirement.
>>
>> REQMT-1: The solution MUST support a description of the spatial
>> arrangement of source video images sent in video streams
>> which enables a satisfactory reproduction at the receiver
>> of the original scene. This applies to each site in a
>> point to point or a multipoint meeting and refers to the
>> spatial ordering within a site, not to the ordering of
>> images between sites.
>>
>> [CNG] Supported – via Capture Area attribute.
>>
>> Use case point to point symmetric, and all other use cases.
>>
>> REQMT-1a: The solution MUST support a means of allowing
>> the preservation of the order of images in the
>> captured scene. For example, if John is to
>> Susan's right in the image capture, John is
>> also to Susan's right in the rendered image.
>>
>> [CNG] Supported – via Capture Area attribute.
>>
>> REQMT-1b: The solution MUST support a means of allowing
>> the preservation of order of images in the
>> scene in two dimensions - horizontal and
>> vertical.
>>
>> [CNG] Supported – via Capture Area attribute.
>>
>> REQMT-1c: The solution MUST support a means to identify
>> the point of capture of individual video
>> captures in three dimensions.
>>
>> [CNG] Supported – via Point of capture attribute.
>>
>> REQMT-1d: The solution MUST support a means to identify
>> the extent of individual video captures in
>> three dimensions.
>>
>> [CNG] Partial – Capture area attribute allows the specification of a
>> plane of capture in 3 dimensions. However depth of the capture (needed
>> for a 3D area) is not supported.
>>
>> REQMT-2: The solution MUST support a description of the spatial
>> arrangement of captured source audio sent in audio streams
>> which enables a satisfactory reproduction at the receiver
>> in a spatially correct manner. This applies to each site
>> in a point to point or a multipoint meeting and refers to
>> the spatial ordering within a site, not the ordering of
>> channels between sites.
>>
>> [CNG] Supported – Via Capture Area attribute.
>>
>> Use case point to point symmetric, and all use cases,
>> especially heterogeneous.
>>
>> REQMT-2a: The solution MUST support a means of preserving
>> the spatial order of audio in the captured
>> scene. For example, if John sounds as if he is
>> at Susan's right in the captured audio, John's
>> voice is also placed at Susan's right in the
>> rendered image.
>>
>> [CNG] Supported – Via Capture Area attribute.
>>
>> REQMT-2b: The solution MUST support a means to identify
>> the number and spatial arrangement of audio
>> channels including monaural, stereophonic
>> (2.0), and 3.0 (left, center, right) audio
>> channels.
>>
>> [CNG] Partial – the Audio Channel Format attribute currently only
>> supports “mono” and “stereo”. It doesn’t support the 3.0 format.
>>
>> REQMT-2c: The solution MUST NOT preclude the use of
>> binaural audio. [Edt. This is an outstanding
>> issue. Text will be changed when the issue is
>> resolved.]
>>
>> [CNG] Partial? – Binaural isn’t mentioned in the framework, so it isn’t
>> precluded as such. There is no indication in CLUE regarding a binaural
>> capture.
>>
>> REQMT-2d: The solution MUST support a means to identify
>> the point of capture of individual audio
>> captures in three dimensions.
>>
>> [CNG] Supported – via Point of capture attribute.
>>
>> REQMT-2e: The solution MUST support a means to identify
>> the extent of individual audio captures in
>> three dimensions.
>>
>> [CNG] Partial – The Capture area attribute is based on capturing a plane
>> in Cartesian space. There is no depth parameter. Whether the plane
>> concept holds for audio as it does for video needs further thought.
>>
>> REQMT-3: The solution MUST support a mechanism to enable a
>> satisfactory spatial matching between audio and video
>> streams coming from the same endpoints.
>>
>> [CNG] Supported - Given that an Advertiser places captures in a
>> particular scene, it is possible to indicate that audio and video
>> streams are from the same endpoint.
>>
>> Use case is point to point symmetric, and all use cases.
>>
>> REQMT-3a: The solution MUST enable individual audio
>> streams to be associated with one or more video
>> image captures, and individual video image
>> captures to be associated with one or more
>> audio captures, for the purpose of rendering
>> proper position.
>>
>> [CNG] Supported (Partial?) – It is possible to deduce that audio and
>> video relate to the same capture area in a scene by analysing the
>> Capture Area parameters. It is possible to relate individual audio
>> captures to video captures if the Advertisement is constructed in a
>> limited way. However, it’s not possible to deduce the relationships
>> between individual captures by utilising the Scene, CSE and capture
>> concepts. CSEs only relate to one media type, so it’s not possible to
>> link audio CSEs to video CSEs.
>>
>> REQMT-3b: The solution MUST enable individual audio
>> streams to be rendered in any desired spatial
>> position.
>>
>> [CNG] Supported? – I think it’s assumed that a Consumer can do whatever
>> it likes with respect to rendering decisions. Ultimately that’s local
>> policy.
>>
>> [Edt: Rendering is an open issue. Text will
>> be changed when it is resolved.]
>>
>> REQMT-4: The solution MUST enable interoperability between
>> endpoints that have a different number of similar devices.
>> For example, one endpoint may have 1 screen, 1 speaker, 1
>> camera, 1 mic, and another endpoint may have 3 screens, 2
>> speakers, 3 cameras and 2 mics. Or, in a multi-point
>> conference, one endpoint may have one screen, another may
>> have 2 screens and a third may have 3 screens. This
>> includes endpoints where the number of devices of a given
>> type is zero.
>>
>> Use case is asymmetric point to point and multipoint.
>>
>> [CNG] Supported – CLUE enables different capture capabilities to be
>> signalled. It makes no assumption as to the rendering.
>>
>> REQMT-5: The solution MUST support means of enabling
>> interoperability between telepresence endpoints where
>> cameras are of different picture aspect ratios.
>>
>> [CNG] Supported – CLUE doesn’t describe “aspect ratio” but allows
>> different capture areas to be defined. This is a means of describing
>> aspect ratio.
>>
>> REQMT-6: The solution MUST provide scaling information which
>> enables rendering of a video image at the actual size of
>> the captured scene.
>>
>> [CNG] Supported - Capture Area, Capture Point, Point on Line of Capture
>> and Scene Scale allow this.
>>
>> REQMT-7: The solution MUST support means of enabling
>> interoperability between telepresence endpoints where
>> displays are of different resolutions.
>>
>> [CNG] Supported? – CLUE allows encoding parameters to be associated with
>> captures which could state the resolution of the captured image. However,
>> there is no way to indicate “display” resolution. Perhaps the
>> requirement needs to be reworded?
>>
>> REQMT-8: The solution MUST support methods for handling different
>> bit rates in the same conference.
>>
>> [CNG] Supported – CLUE allows encoding parameters to be associated with
>> captures which could state the bit rate.
>>
>> REQMT-9: The solution MUST support means of enabling
>> interoperability between endpoints that send and receive
>> different numbers of media streams.
>>
>> Use case heterogeneous and multipoint.
>>
>> [CNG] Supported – CLUE allows different numbers of captures to be sent.
>>
>> REQMT-10: The solution MUST make it possible for endpoints without
>> support for telepresence extensions to participate in a
>> telepresence session with those that do.
>>
>> [CNG] Supported – The support of a CLUE channel will be negotiated
>> between endpoints. From the current signalling draft it seems a basic
>> set of media can be established without CLUE. Endpoints that don’t
>> support CLUE obviously won’t be able to determine spatial information,
>> etc.
>>
>> REQMT-11: The solution MUST support a mechanism for determining
>> whether or not an endpoint or MCU is capable of
>> telepresence extensions.
>>
>> [CNG] Supported? – The negotiation of a CLUE channel via SDP is one
>> method in the signalling draft. Another indication at the SIP level may
>> be beneficial such as a feature tag. This is yet to be documented.
>>
>> REQMT-12: The solution MUST support a means to enable more than two
>> sites to participate in a teleconference.
>>
>> Use case multipoint.
>>
>> [CNG] Partial – This is on the agenda and there’s basic support i.e. via
>> the composed attribute and “scene-switch-policy” CSE attribute. However
>> no method is yet defined to keep the source information.
>>
>> REQMT-13: The solution MUST support both transcoding and switching
>> approaches to providing multipoint conferences.
>>
>> [CNG] Supported – CLUE supports these two methods. However further work
>> is needed on the details.
>>
>> REQMT-14: The solution MUST support mechanisms to make possible for
>> either or both site switching or segment switching. [Edt:
>> This needs rewording. Deferred until layout discussion is
>> resolved.]
>>
>> [CNG] Partial – The “scene-switch-policy” CSE attribute supports this to
>> some level but further work is needed in this area.
>>
>> REQMT-15: The solution MUST support mechanisms for presentations in
>> such a way that:
>>
>> * Presentations can have different sources
>>
>> * Presentations can be seen by all
>>
>> * There can be variation in placement, number and size of
>>   Presentations
>>
>> [CNG] Partial – CLUE allows an endpoint to indicate whether a capture
>> is related to a presentation through the use of the presentation
>> attribute. The spatial attributes (i.e. capture area) allow the
>> placement and size to be indicated. The number can be inferred from
>> the number of captures. CLUE doesn’t have any policy on who can see
>> the presentations. The CLUE Advertisement mechanism may include
>> presentation captures to all people within the conference. Perhaps the
>> 2nd bullet can be removed, or clarified to say that it’s not related
>> to policy?
>>
>> REQMT-16:The solution MUST include extensibility mechanisms.
>>
>> [CNG] Supported? – whilst not explicitly documented I think there’s
>> general agreement this is needed.
>>
>> REQMT-17: The solution must support a mechanism for allowing
>> information about media captures to change during a
>> conference.
>>
>> [CNG] Supported? – Keeping the CLUE signalling channel open would allow
>> for this and people seem in agreement this should be possible. Further
>> detailed work is needed in the signalling draft to describe this, e.g.
>> incremental vs full updates, interaction with bearer signalling (i.e.
>> SDP O/A), etc.
>>
>> REQMT-18: The solution MUST provide a mechanism for the secure
>> exchange of information about the media captures.
>>
>> [CNG] Partial? – It is assumed that CLUE will operate over DTLS/SCTP;
>> however, there’s no documentation regarding CLUE security.
>>
>> Appendix A. open issues
>>
>> OPEN-1 Binaural Audio [REQMT-2C] The need to support binaural
>> audio is unresolved, and the "MUST NOT preclude" language in
>> this requirement is problematic. The authors believe this
>> requirement needs to be either changed or withdrawn,
>> depending on how the issue is resolved.
>>
>> [CNG] Not supported - This is still unresolved.
>>
>> OPEN-2 Reference to Rendering [REQMT-3b] This is the only
>> requirement which refers to rendering. It may also be empty,
>> since receivers can render audio captures as they wish.
>> This is deferred until broader discussion on rendering
>> requirements is concluded.
>>
>> [CNG] I think we should frame the requirements with regard to captures
>> as that’s the way the framework is written. Perhaps add some text to say
>> that rendering is based on the receiver’s local policy.
>>
>> OPEN-3 Conference modes [REQMT-14] The wording of this requirement
>> is problematic in part because the conference modes (site
>> switching and segment switching) are not defined. It at
>> least needs rewording. This is deferred until broader
>> discussion on layout is concluded.
>>
>> [CNG] Yes this is still open.
>>
>> OPEN-4 Need to capture requirement that attributes can change at any
>> time during the call.
>>
>> [CNG] Yes although it seems REQMT-17 covers this.
>>
>> OPEN-5 Need to add requirement for three dimensions in the right
>> place
>>
>> [CNG] Requirements 1d and 2e do capture an aspect of 3D.
>>
>> OPEN-6 Multi-view, is there a requirement needed?
>>
>> [CNG] Even if there’s no requirement we appear to support it because we
>> allow both the capture area and capture point to be sent for captures.
>> An Advertiser could describe multiple capture points capturing the same
>> area of capture.
>>
>>
>> Regards, Christian
>>
>> On 17/07/2013 6:07 AM, Mary Barnes wrote:
>>> There really were no changes other than to refresh this draft. I was
>>> going to extend the security section with more detail, but I really
>>> can't add much more detail without referencing things that are defined
>>> in the framework. So, I think the next step is for me to work with
>>> Mark on the security for the framework.
>>>
>>> There are some open issues identified in the appendix of this document
>>> that we need to figure out whether they are issues that need
>>> resolution for this document. I will open issues in the tracker and we
>>> can discuss each one and perhaps make some decisions before Berlin.
>>>
>>> We also need to consider whether the current framework meets these
>>> requirements. If some requirements are not met, we need to decide
>>> whether it's because they'll be met by the solution documents or won't
>>> be met at all. In which case, we need to decide whether we actually
>>> need them for the solution.
>>>
>>> Regards,
>>> Mary.
>>>
>>>
>>> On Tue, Jul 16, 2013 at 11:29 AM, <internet-drafts@ietf.org> wrote:
>>>
>>>     A New Internet-Draft is available from the on-line Internet-Drafts
>>>     directories.
>>>     This draft is a work item of the ControLling mUltiple streams for
>>>     tElepresence Working Group of the IETF.
>>>
>>>     Title : Requirements for Telepresence Multi-Streams
>>>     Author(s) : Allyn Romanow
>>>     Stephen Botzko
>>>     Mary Barnes
>>>     Filename : draft-ietf-clue-telepresence-requirements-04.txt
>>>     Pages : 14
>>>     Date : 2013-07-15
>>>
>>>     Abstract:
>>>     This memo discusses the requirements for a specification that
>>>     enables telepresence interoperability, by describing the
>>>     relationship between multiple RTP streams. In addition, the
>>>     problem statement and definitions are also covered herein.
>>>
>>>
>>>     The IETF datatracker status page for this draft is:
>>>     https://datatracker.ietf.org/doc/draft-ietf-clue-telepresence-requirements
>>>
>>>     There's also a htmlized version available at:
>>>     http://tools.ietf.org/html/draft-ietf-clue-telepresence-requirements-04
>>>
>>>     A diff from the previous version is available at:
>>>     http://www.ietf.org/rfcdiff?url2=draft-ietf-clue-telepresence-requirements-04
>>>
>>>     Internet-Drafts are also available by anonymous FTP at:
>>>     ftp://ftp.ietf.org/internet-drafts/
>>>
>>>     _______________________________________________
>>>     I-D-Announce mailing list
>>>     I-D-Announce@ietf.org
>>>     https://www.ietf.org/mailman/listinfo/i-d-announce
>>>     Internet-Draft directories: http://www.ietf.org/shadow.html
>>>     or ftp://ftp.ietf.org/ietf/1shadow-sites.txt
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> clue mailing list
>>> clue@ietf.org
>>> https://www.ietf.org/mailman/listinfo/clue