Re: [clue] I-D Action: draft-ietf-clue-telepresence-requirements-04.txt

Paul Kyzivat <pkyzivat@alum.mit.edu> Thu, 18 July 2013 14:42 UTC

Christian,

You raise a point that has bothered me for some time:

We describe the spatial attributes of a capture by point of capture and 
area of capture (a quadrilateral). Implicitly this describes an 
unbounded pyramid. In reality, it is bounded by the geometry of the room 
and the stuff in it, but we don't have that info. This pyramid may be 
meaningful in some sense for video, though we lack other information 
such as depth of field. But as John replied, I guess it means very 
little for audio.
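
To make the "unbounded pyramid" concrete, here is a minimal sketch in
Python. It assumes the point of capture is the apex and the area of
capture is a convex quadrilateral given as four corners in order. The
function name, the tuple representation, and the choice of y as the depth
axis in the example values are my own illustration, not anything defined
in the framework. It shows that a point at any depth along the viewing
direction is "inside", which is exactly the lack of a depth bound
discussed above.

```python
from typing import Sequence, Tuple

Vec = Tuple[float, float, float]

def _sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _cross(a: Vec, b: Vec) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def in_capture_pyramid(apex: Vec, corners: Sequence[Vec], x: Vec) -> bool:
    """Return True if x lies in the unbounded pyramid whose apex is the
    point of capture and whose edges pass through the corners of the
    (convex) area of capture. Hypothetical helper, for illustration only."""
    n = len(corners)
    centroid = tuple(sum(c[i] for c in corners) / n for i in range(3))
    for i in range(n):
        # Plane through the apex and one edge of the quadrilateral.
        normal = _cross(_sub(corners[i], apex),
                        _sub(corners[(i + 1) % n], apex))
        inside_sign = _dot(normal, _sub(centroid, apex))
        side = _dot(normal, _sub(x, apex))
        # x must be on the same side of every face as the centroid.
        if inside_sign * side < 0:
            return False
    return True

# Example values (assumed): camera at the origin, capture area one unit away.
apex = (0.0, 0.0, 0.0)
area = [(-1.0, 1.0, -1.0), (1.0, 1.0, -1.0), (1.0, 1.0, 1.0), (-1.0, 1.0, 1.0)]
```

With these values, a point just in front of the camera and a point a
thousand units beyond the capture area are both reported as inside, while
a point off to the side or behind the camera is not. Nothing in the two
attributes says where the room ends.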

The fundamental question is: does this provide sufficient information
for the consumer to properly adapt the received media to his available
equipment? I don't know the answer, but I suspect that it allows some
approximation, though probably not enough to do the best possible job.
Is there more that we ought to be providing to help with that job? Or
does doing more yield diminishing returns?

	Thanks,
	Paul

On 7/17/13 11:01 PM, Christian Groves wrote:
> Hello Mary,
>
> I did a review of the requirements and Appendix to see if they are met.
> My comments are below. Sorry for the length of the email but I figured
> it would be easier for people to read my comment [CNG] along with the
> requirement.
>
> REQMT-1: The solution MUST support a description of the spatial
> arrangement of source video images sent in video streams
> which enables a satisfactory reproduction at the receiver
> of the original scene. This applies to each site in a
> point to point or a multipoint meeting and refers to the
> spatial ordering within a site, not to the ordering of
> images between sites.
>
> [CNG] Supported – via Capture Area attribute.
>
> Use case point to point symmetric, and all other use cases.
>
> REQMT-1a: The solution MUST support a means of allowing
> the preservation of the order of images in the
> captured scene. For example, if John is to
> Susan's right in the image capture, John is
> also to Susan's right in the rendered image.
>
> [CNG] Supported – via Capture Area attribute.
>
> REQMT-1b: The solution MUST support a means of allowing
> the preservation of order of images in the
> scene in two dimensions - horizontal and
> vertical.
>
> [CNG] Supported – via Capture Area attribute.
>
> REQMT-1c: The solution MUST support a means to identify
> the point of capture of individual video
> captures in three dimensions.
>
> [CNG] Supported – via Point of capture attribute.
>
> REQMT-1d: The solution MUST support a means to identify
> the extent of individual video captures in
> three dimensions.
>
> [CNG] Partial – Capture area attribute allows the specification of a
> plane of capture in 3 dimensions. However depth of the capture (needed
> for a 3D area) is not supported.
>
> REQMT-2: The solution MUST support a description of the spatial
> arrangement of captured source audio sent in audio streams
> which enables a satisfactory reproduction at the receiver
> in a spatially correct manner. This applies to each site
> in a point to point or a multipoint meeting and refers to
> the spatial ordering within a site, not the ordering of
> channels between sites.
>
> [CNG] Supported – Via Capture Area attribute.
>
> Use case point to point symmetric, and all use cases,
> especially heterogeneous.
>
> REQMT-2a: The solution MUST support a means of preserving
> the spatial order of audio in the captured
> scene. For example, if John sounds as if he is
> at Susan's right in the captured audio, John's
> voice is also placed at Susan's right in the
> rendered image.
>
> [CNG] Supported – Via Capture Area attribute.
>
> REQMT-2b: The solution MUST support a means to identify
> the number and spatial arrangement of audio
> channels including monaural, stereophonic
> (2.0), and 3.0 (left, center, right) audio
> channels.
>
> [CNG] Partial – the Audio Channel Format attribute currently only
> supports “mono” and “stereo”. It doesn’t support the 3.0 format.
>
> REQMT-2c: The solution MUST NOT preclude the use of
> binaural audio. [Edt. This is an outstanding
> issue. Text will be changed when the issue is
> resolved.]
>
> [CNG] Partial? – Binaural isn’t mentioned in the framework so isn’t
> precluded as such. There is no indication in CLUE regarding a binaural
> capture.
>
> REQMT-2d: The solution MUST support a means to identify
> the point of capture of individual audio
> captures in three dimensions.
>
> [CNG] Supported – via Point of capture attribute.
>
> REQMT-2e: The solution MUST support a means to identify
> the extent of individual audio captures in
> three dimensions.
>
> [CNG] Partial – The Capture area attribute is based on capturing a plane
> in Cartesian space. There is no depth parameter. Whether the plane
> concept holds for audio like it does video needs further thought.
>
> REQMT-3: The solution MUST support a mechanism to enable a
> satisfactory spatial matching between audio and video
> streams coming from the same endpoints.
>
> [CNG] Supported - Given that an Advertiser places captures in a
> particular scene, it is possible to indicate that audio and video streams
> are from the same endpoint.
>
> Use case is point to point symmetric, and all use cases.
>
> REQMT-3a: The solution MUST enable individual audio
> streams to be associated with one or more video
> image captures, and individual video image
> captures to be associated with one or more
> audio captures, for the purpose of rendering
> proper position.
>
> [CNG] Supported (Partial?) – It is possible to deduce that audio and
> video relate to the same capture area in a scene by analysing the
> Capture Area parameters. It is possible to relate individual audio
> captures to video captures if the Advertisement is constructed in a
> limited way. However, it’s not possible to deduce the relationships
> between individual captures by utilising the Scene, CSE and capture
> concepts. A CSE only relates to one media type, so it’s not possible to
> link audio CSEs to video CSEs.
>
> REQMT-3b: The solution MUST enable individual audio
> streams to be rendered in any desired spatial
> position.
>
> [CNG] Supported? – I think it’s assumed that a Consumer can do whatever
> it likes with respect to rendering decisions. Ultimately that’s local
> policy.
>
> [Edt: Rendering is an open issue. Text will
> be changed when it is resolved.]
>
> REQMT-4: The solution MUST enable interoperability between
> endpoints that have a different number of similar devices.
> For example, one endpoint may have 1 screen, 1 speaker, 1
> camera, 1 mic, and another endpoint may have 3 screens, 2
> speakers, 3 cameras and 2 mics. Or, in a multi-point
> conference, one endpoint may have one screen, another may
> have 2 screens and a third may have 3 screens. This
> includes endpoints where the number of devices of a given
> type is zero.
>
> Use case is asymmetric point to point and multipoint.
>
> [CNG] Supported – CLUE enables different capture capabilities to be
> signalled. It makes no assumption as to the rendering.
>
> REQMT-5: The solution MUST support means of enabling
> interoperability between telepresence endpoints where
> cameras are of different picture aspect ratios.
>
> [CNG] Supported – CLUE doesn’t describe “aspect ratio” but allows
> different capture areas to be defined. This is a means of describing
> aspect ratio.
>
> REQMT-6: The solution MUST provide scaling information which
> enables rendering of a video image at the actual size of
> the captured scene.
>
> [CNG] Supported - Capture Area, Capture Point, Point on Line of Capture
> and Scene Scale allow this.
>
> REQMT-7: The solution MUST support means of enabling
> interoperability between telepresence endpoints where
> displays are of different resolutions.
>
> [CNG] Supported? – CLUE allows encoding parameters to be associated with
> captures which could state the resolution of the captured image. However,
> there is no way to indicate “display” resolution. Perhaps the
> requirement needs to be reworded?
>
> REQMT-8: The solution MUST support methods for handling different
> bit rates in the same conference.
>
> [CNG] Supported – CLUE allows encoding parameters to be associated with
> captures which could state the bit rate.
>
> REQMT-9: The solution MUST support means of enabling
> interoperability between endpoints that send and receive
> different numbers of media streams.
>
> Use case heterogeneous and multipoint.
>
> [CNG] Supported – CLUE allows different numbers of captures to be sent.
>
> REQMT-10: The solution MUST make it possible for endpoints without
> support for telepresence extensions to participate in a
> telepresence session with those that do.
>
> [CNG] Supported – The support of a CLUE channel will be negotiated
> between endpoints. From the current signalling draft it seems a basic
> set of media can be established without CLUE. Endpoints that don’t
> support CLUE obviously won’t be able to determine spatial information etc…
>
> REQMT-11: The solution MUST support a mechanism for determining
> whether or not an endpoint or MCU is capable of
> telepresence extensions.
>
> [CNG] Supported? – The negotiation of a CLUE channel via SDP is one
> method in the signalling draft. Another indication at the SIP level may
> be beneficial such as a feature tag. This is yet to be documented.
>
> REQMT-12: The solution MUST support a means to enable more than two
> sites to participate in a teleconference.
>
> Use case multipoint.
>
> [CNG] Partial – This is on the agenda and there’s basic support i.e. via
> the composed attribute and “scene-switch-policy” CSE attribute. However
> no method is yet defined to keep the source information.
>
> REQMT-13: The solution MUST support both transcoding and switching
> approaches to providing multipoint conferences.
>
> [CNG] Supported – CLUE supports these two methods. However further work
> is needed on the details.
>
> REQMT-14: The solution MUST support mechanisms to make possible for
> either or both site switching or segment switching. [Edt:
> This needs rewording. Deferred until layout discussion is
> resolved.]
>
> [CNG] Partial – The “scene-switch-policy” CSE attribute supports this to
> some level but further work is needed in this area.
>
> REQMT-15: The solution MUST support mechanisms for presentations in
> such a way that:
>
> * Presentations can have different sources
> * Presentations can be seen by all
> * There can be variation in placement, number and size of
>   Presentations
>
> [CNG] Partial – CLUE allows an endpoint to indicate whether a capture is
> related to a presentation through the use of the presentation attribute.
> The spatial attributes (i.e. capture area) allow the placement and size to
> be indicated. The number can be inferred from the number of captures.
> CLUE doesn’t have any policy on who can see the presentations. The CLUE
> Advertisement mechanism may include presentation captures to all people
> within the conference. Perhaps the 2nd bullet can be removed or
> clarified that it’s not related to policy?
>
> REQMT-16: The solution MUST include extensibility mechanisms.
>
> [CNG] Supported? – whilst not explicitly documented I think there’s
> general agreement this is needed.
>
> REQMT-17: The solution must support a mechanism for allowing
> information about media captures to change during a
> conference.
>
> [CNG] Supported? – Keeping the CLUE signalling channel open would allow
> for this and people seem in agreement this should be possible. Further
> detailed work is needed in the signalling draft to describe this, e.g.
> incremental vs. full updates, interaction with bearer signalling (i.e.
> SDP O/A), etc.
>
> REQMT-18: The solution MUST provide a mechanism for the secure
> exchange of information about the media captures.
>
> [CNG] Partial? – It is assumed that CLUE will operate over DTLS/SCTP;
> however, there’s no documentation regarding CLUE security.
>
> Appendix A. open issues
>
> OPEN-1 Binaural Audio [REQMT-2C] The need to support binaural
> audio is unresolved, and the "MUST NOT preclude" language in
> this requirement is problematic. The authors believe this
> requirement needs to be either changed or withdrawn,
> depending on how the issue is resolved.
>
> [CNG] Not supported - This is still unresolved.
>
> OPEN-2 Reference to Rendering [REQMT-3b] This is the only
> requirement which refers to rendering. It may also be empty,
> since receivers can render audio captures as they wish.
> This is deferred until broader discussion on rendering
> requirements is concluded.
>
> [CNG] I think we should frame the requirements with regards to captures
> as that’s the way the framework is written. Perhaps add some text to say
> that rendering is based on the receiver’s local policy.
>
> OPEN-3 Conference modes [REQMT-14] The wording of this requirement
> is problematic in part because the conference modes (site
> switching and segment switching) are not defined. It at
> least needs rewording. This is deferred until broader
> discussion on layout is concluded.
>
> [CNG] Yes this is still open.
>
> OPEN-4 Need to capture requirement that attributes can change at any
> time during the call.
>
> [CNG] Yes although it seems REQMT-17 covers this.
>
> OPEN-5 Need to add requirement for three dimensions in the right
> place
>
> [CNG] Requirements 1d and 2e do capture an aspect of 3D.
>
> OPEN-6 Multi-view, is there a requirement needed?
>
> [CNG] Even if there’s no requirement we appear to support it because we
> allow both the capture area and capture point to be sent for captures.
> An Advertiser could describe multiple capture points capturing the same
> area of capture.
>
>
> Regards, Christian
>
> On 17/07/2013 6:07 AM, Mary Barnes wrote:
>> There really were no changes other than to refresh this draft. I was
>> going to extend the security section with more detail, but I really
>> can't add much more detail without referencing things that are defined
>> in the framework. So, I think the next step is for me to work with
>> Mark on the security for the framework.
>>
>> There are some open issues identified in the appendix of this document
>> that we need to figure out whether they are issues that need
>> resolution for this document. I will open issues in the tracker and we
>> can discuss each one and perhaps make some decisions before Berlin.
>>
>> We also need to consider whether the current framework meets these
>> requirements. If some requirements are not met, we need to decide
>> whether it's because they'll be met by the solution documents or won't
>> be met at all. In which case, we need to decide whether we actually
>> need them for the solution.
>>
>> Regards,
>> Mary.
>>
>>
>> On Tue, Jul 16, 2013 at 11:29 AM, <internet-drafts@ietf.org> wrote:
>>
>>
>>     A New Internet-Draft is available from the on-line Internet-Drafts
>>     directories.
>>     This draft is a work item of the ControLling mUltiple streams for
>>     tElepresence Working Group of the IETF.
>>
>>     Title     : Requirements for Telepresence Multi-Streams
>>     Author(s) : Allyn Romanow
>>                 Stephen Botzko
>>                 Mary Barnes
>>     Filename  : draft-ietf-clue-telepresence-requirements-04.txt
>>     Pages     : 14
>>     Date      : 2013-07-15
>>
>>     Abstract:
>>     This memo discusses the requirements for a specification that enables
>>     telepresence interoperability, by describing the relationship between
>>     multiple RTP streams. In addition, the problem statement and
>>     definitions are also covered herein.
>>
>>
>>     The IETF datatracker status page for this draft is:
>>
>> https://datatracker.ietf.org/doc/draft-ietf-clue-telepresence-requirements
>>
>>
>>     There's also a htmlized version available at:
>>
>> http://tools.ietf.org/html/draft-ietf-clue-telepresence-requirements-04
>>
>>     A diff from the previous version is available at:
>>
>> http://www.ietf.org/rfcdiff?url2=draft-ietf-clue-telepresence-requirements-04
>>
>>
>>
>>     Internet-Drafts are also available by anonymous FTP at:
>>     ftp://ftp.ietf.org/internet-drafts/
>>
>>     _______________________________________________
>>     I-D-Announce mailing list
>>     I-D-Announce@ietf.org
>>     https://www.ietf.org/mailman/listinfo/i-d-announce
>>     Internet-Draft directories: http://www.ietf.org/shadow.html
>>     or ftp://ftp.ietf.org/ietf/1shadow-sites.txt
>>
>>
>>
>>
>> _______________________________________________
>> clue mailing list
>> clue@ietf.org
>> https://www.ietf.org/mailman/listinfo/clue
>