Re: [AVTCORE] Question

Julius Friedman <juliusfriedman@gmail.com> Tue, 06 January 2015 14:17 UTC

In-Reply-To: <87d26sefaw.fsf@hobgoblin.ariadne.com>
References: <CACFvNHXhFQe2CEKsv4q637SoGa-vKuhztN+a1ywK1ewU_Sa6KA@mail.gmail.com> <87d26sefaw.fsf@hobgoblin.ariadne.com>
Date: Tue, 06 Jan 2015 09:17:02 -0500
Message-ID: <CACFvNHUGDyJDKA0vpPQeKN1H4TCU_8VzuP2Y-V22Jx4XuYJW9w@mail.gmail.com>
From: Julius Friedman <juliusfriedman@gmail.com>
To: "Dale R. Worley" <worley@ariadne.com>, avt@ietf.org, mmusic@ietf.org
Content-Type: multipart/alternative; boundary="047d7bd909a0f9ddc0050bfc7550"
Archived-At: http://mailarchive.ietf.org/arch/msg/avt/abtOnvbH-lCy7uWK39WVSpofJ9Q
Subject: Re: [AVTCORE] Question

> But how would we know that the media is 10 minutes long if it was not
> already stored?  If it was truly live, we'd only know that it was
> *scheduled* to be 10 minutes long, not that it will actually *be* 10
> minutes long.

- In that case we don't know if tomorrow will come because it isn't
guaranteed today; let's be serious.

I have a server which is aggregating a stream which is stored; my server
doesn't store the stream.

Only authorized users can seek the stream on the server to move it in time.

The public has to watch from where it is to where it ends and know that
seeking is not supported remotely.

In such a case I know the end time and that seeking isn't supported but the
client doesn't.

This is a common task ('aggregating a stream') and SHOULD be handled; the
fact that Range implies seek-ability is, for lack of a better word,
`over-reaching`.

I will be adding something for this in my server and I know for a fact that
others would benefit.

That aside,

What about my questions regarding the 'source' and 'ssrc' parameters? Those
aren't proposals, those are genuine questions.

What about my questions regarding the Binary Interleaved Data? The wording
on Disabling 'Retransmission' and the 'Safe' characters '\$'  or '\#'..

As for my other 'issues' / errata for RFC2435, has anything been determined
on them?

What about the one for RFC3550, has anyone sat down to do the math on that
one yet?

I will include those questions here again and I would appreciate a single
complete and thorough reply.

Since the binary framing was apparently an afterthought (based on
information from the drafts which show interleaved data in a 'DATA' RTSP
response) I think this was a remnant from that in some way, or at least
that's the best I can possibly imagine.

Why the framing octet would also be listed as a 'safe' character is beyond
me; there were several other characters / octets which definitely could
have been used e.g. 0xC3 -> 0xFF.

The document makes so many statements about ease of parsing and then
literally and figuratively destroys this concept with bad decisions such as
the framing character as well as the interesting syntax for the markup.

I will paste some of those here:

1.4 <https://tools.ietf.org/html/rfc2326#section-1.4> Protocol Properties

   RTSP has the following properties:

....

   Easy to parse:
          RTSP can be parsed by standard HTTP or MIME parsers.



...



4 <https://tools.ietf.org/html/rfc2326#section-4> RTSP Message

   RTSP is a text-based protocol and uses the ISO 10646 character set in
   UTF-8 encoding (RFC 2279 <https://tools.ietf.org/html/rfc2279> [21]).
   Lines are terminated by CRLF, but receivers should be prepared to
   also interpret CR and LF by themselves as line terminators.

     Text-based protocols make it easier to add optional parameters in a
     self-describing manner. Since the number of parameters and the
     frequency of commands is low, processing efficiency is not a
     concern. Text-based protocols, if done carefully, also allow easy
     implementation of research prototypes in scripting languages such
     as Tcl, Visual Basic and Perl.

     The 10646 character set avoids tricky character set switching, but
     is invisible to the application as long as US-ASCII is being used.
     This is also the encoding used for RTCP. ISO 8859-1 translates
     directly into Unicode with a high-order octet of zero. ISO 8859-1
     characters with the most-significant bit set are represented as
     1100001x 10xxxxxx. (See RFC 2279 <https://tools.ietf.org/html/rfc2279>
     [21])

   RTSP messages can be carried over any lower-layer transport protocol
   that is 8-bit clean.

   Requests contain methods, the object the method is operating upon and
   parameters to further describe the method. Methods are idempotent,
   unless otherwise noted. Methods are also designed to require little
   or no state maintenance at the media server.



This literally becomes a nightmare when you have to reconstruct TCP
sequences.

This is only exacerbated by the fact the document allows packets up to the
maximum holding size in this mode AND specifies that packets with an
unknown context must be skipped...

Skipped how? Skipped as in past their header, past their data? Can that
text even be there?

We are talking about TCP. In such a case the receiver has already received
a previous segment which contains only some of the data; how would it skip
data it has yet to receive?

I will explain why this is a security issue. Since there is no sequence
information in the binary header, it must be trusted to contain valid length
values. Without inspecting the latter octets one can make no such
assumption about the type of binary data contained therein, especially if
there is no context assigned.

Thus if I can get a single TCP packet into a stream I can very easily
inject a lot of 'receive' space, thus exponentially increasing my chances of
getting another packet into the stream.

I think that there should at the very least be a correlation between
'Block-Size' and the frame size such that frame sizes MUST NOT ever be
larger than that value, if that is not already required.
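
A minimal sketch of the kind of check I mean, assuming the 4 octet
interleaved header (`$`, one channel octet, two length octets in network
byte order). The name `parse_interleaved_header` and the `max_frame_size`
parameter (standing in for a negotiated 'Block-Size') are hypothetical and
only for illustration:

```python
import struct
from typing import Tuple

MAGIC = 0x24  # '$', the framing octet of the interleaved header


def parse_interleaved_header(header: bytes, max_frame_size: int) -> Tuple[int, int]:
    """Return (channel, length), or raise ValueError if the header is not acceptable."""
    if len(header) < 4:
        raise ValueError("need 4 octets for the interleaved header")
    magic, channel, length = struct.unpack("!BBH", header[:4])
    if magic != MAGIC:
        raise ValueError("not an interleaved frame")
    # Proposed rule: never trust a declared length above the negotiated size.
    if length > max_frame_size:
        raise ValueError(f"frame of {length} bytes exceeds limit of {max_frame_size}")
    return channel, length
```

With a limit of 1500, a header declaring 1328 bytes is accepted, while one
declaring 48344 bytes is rejected before any 'receive' space is reserved.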

Furthermore I think that something needs to be done to specify how to deal
with nuisances such as 'more / less bytes available than indicated in
frame' especially when we also have 'Rtsp' messages which can be
interleaved at the same time in between frames (which can contain "$").

So to re-iterate my understanding of how things are in toto, I will outline
an example below: a Rtsp Binary Interleaved sender tells a Rtsp Binary
Interleaved client there is a frame of 48344 bytes on channel 0.

The client received only 1328 bytes.

47016 bytes remain.

The client starts to receive again.

At this point there should only be the rest of the data in the frame (47016
bytes)

This data will come over several MSS sized segments, during which time the
socket may also be written to again (before the segment is transmitted to
the client).
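
The bookkeeping the client is forced into can be sketched as follows. This
is only an illustration under my own assumptions; the class name
`FrameAssembler` is hypothetical and not taken from any implementation:

```python
class FrameAssembler:
    """Collects the payload of one interleaved frame across several reads."""

    def __init__(self, declared_length: int) -> None:
        self.declared_length = declared_length
        self.buffer = bytearray()

    @property
    def remaining(self) -> int:
        # Bytes the sender is still committed to delivering for this frame.
        return self.declared_length - len(self.buffer)

    def feed(self, data: bytes) -> bytes:
        """Consume up to `remaining` bytes; return whatever follows the frame."""
        take = min(self.remaining, len(data))
        self.buffer += data[:take]
        return data[take:]


frame = FrameAssembler(48344)   # length declared in the '$' header
frame.feed(bytes(1328))         # the first read delivers only 1328 bytes
print(frame.remaining)          # 47016
```

Until `remaining` reaches zero the client has no way to tell frame data from
anything else written to the socket, which is exactly the problem.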

{
    If the latter data happens to get to the receiver before the other
portions of the segment we have TCP 're-ordering' based on the sequence
number and may see multiple ACKs for the early packets.

    If any data is missed we will have a re-transmission.

    The data being 'received' on the socket by the receiver coming from the
server comes from the data portion of the tcp packet which now contains
data from a previous segment.
}

The sender is now committed to sending the additional bytes as indicated;
however, if the sender stops sending this data or it's truncated, it will
never make it to the receiver. The stack will never be able to re-assemble
the packet and now framing sync is lost.

At this point ANY data consumed from the initial frame header and data
which was received must ALSO be discarded but there is NO WAY for the
server to indicate this to the client... (This is why UDP is commonly used,
you get what you get and you don't get what you don't)

There are a lot of articles outlining why TCP and RTP don't get along,
mostly citing the overhead of re-transmissions, so with Rtsp you would think
we would have attempted to solve this, not make it worse with bad
encapsulation.

Something needs to be done about this because it's also a security concern,
e.g. if a client were to send a rtcp packet to the server which is very
large and contains no data he could attempt to exhaust the resources of
that server (if they didn't monitor their rtcp bandwidth properly) or at
least force a higher amount of traffic.

So, information needs to be clear that there is no such thing as a packet
of less than 8 bytes under rtcp and 16 bytes under rtp. (Unless compression
is being used; ROHC or otherwise)

The null packet or any packet under 16 bytes would be used for 'spacing',
not an 'unknown' context, especially when I can have 255 channels; what
unknown context am I going to use then?

That is 4 octets for the `$` header, and the rest for the packet header
which can then be inspected to be the same version, payloadtype and ssrc as
required by the client.
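
As a rough sketch of that inspection, assuming the minimum sizes given above
and the usual RTP header layout (version in the top two bits of the first
octet, payload type in the low seven bits of the second, ssrc in octets 8
to 11). The function names here are hypothetical:

```python
import struct

INTERLEAVED_HEADER = 4   # '$', channel, 16 bit length
MIN_RTCP_FRAME = 8       # minimum stated above, including the '$' header
MIN_RTP_FRAME = 16       # minimum stated above, including the '$' header


def meets_minimum(frame: bytes, is_rtcp: bool) -> bool:
    """Reject anything shorter than the smallest meaningful frame."""
    return len(frame) >= (MIN_RTCP_FRAME if is_rtcp else MIN_RTP_FRAME)


def rtp_frame_is_plausible(frame: bytes, expected_pt: int, expected_ssrc: int) -> bool:
    """Check a whole interleaved frame ('$' header plus RTP packet)."""
    if not meets_minimum(frame, is_rtcp=False) or frame[0] != 0x24:
        return False
    (declared,) = struct.unpack("!H", frame[2:4])
    if declared != len(frame) - INTERLEAVED_HEADER:
        return False  # more / less bytes available than indicated in frame
    rtp = frame[INTERLEAVED_HEADER:]
    version = rtp[0] >> 6
    payload_type = rtp[1] & 0x7F
    (ssrc,) = struct.unpack("!I", rtp[8:12])
    return version == 2 and payload_type == expected_pt and ssrc == expected_ssrc
```

A frame that fails this check can be noted and dropped without the receiver
committing to whatever length the `$` header claims.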

This also prevents a client from sending a "GET_PARAMETER" under
interleaved transport with weird headers and a body in an attempt to cause
an underflow, which is something the current specification does nothing to
protect against.

"GET_PARAMETER rtsp://someuri/ RTSP/1.0\r\nCSeq:1\r\nDate:
xx:xx:xx\r\nContent-Length:24\r\n$UserAgent
$009\r\n$\0\0\aRTSP/1.$\0\0\r\n\r\nDatawith$IsNotAGoodIdea!\0\0"
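
To show why scanning for `$` alone is not enough, here is a small sketch
using a simplified, well-formed variant of the message above (my own
construction, keeping a `$` in a header and in the body). The helper name
`split_rtsp_message` is hypothetical:

```python
def split_rtsp_message(buf: bytes):
    """Split one text Rtsp message off the front of buf using Content-Length.

    Returns (message, rest), or None if the message is not complete yet.
    """
    end_of_headers = buf.find(b"\r\n\r\n")
    if end_of_headers < 0:
        return None
    body_length = 0
    # Skip the request line, then look for Content-Length among the headers.
    for line in buf[:end_of_headers].split(b"\r\n")[1:]:
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            body_length = int(value.strip())
    total = end_of_headers + 4 + body_length
    if len(buf) < total:
        return None
    return buf[:total], buf[total:]


sample = (
    b"GET_PARAMETER rtsp://someuri/ RTSP/1.0\r\n"
    b"CSeq: 1\r\n"
    b"User-Agent: $009\r\n"
    b"Content-Length: 24\r\n"
    b"\r\n"
    b"Datawith$IsNotAGoodIdea!"
    b"$\x00\x00\x00"          # an (empty) interleaved frame follows the message
)

naive_hits = [i for i, octet in enumerate(sample) if octet == 0x24]
message, rest = split_rtsp_message(sample)
print(len(naive_hits))   # 3 candidate '$' octets, only the last starts a frame
print(rest)              # b'$\x00\x00\x00'
```

A receiver that resynchronizes by searching for the next `$` would stop on
the first two candidates, both of which are ordinary message text.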

Such a check would allow applications to still take note of when a large
frame was seen without being forced to act upon it and potentially lose
data or, depending on their implementation, allow remote code execution.

I urge the committee to seriously look into these issues as they can have a
large impact on both the interpretation of the standard and its security
features.

https://isecpartners.github.io/fuzzing/vulnerabilities/2013/12/30/vlc-vulnerability.html


On Mon, Jan 5, 2015 at 10:52 PM, Dale R. Worley <worley@ariadne.com> wrote:

> Julius Friedman <juliusfriedman@gmail.com> writes:
> > I would like to address your response:
> >
> > "In any case, why would we want to prevent seeking on a stored media
> > object?  I can see why if the media is an un-stored media stream, but in
> > that case, the software likely doesn't know the total length that the
> > media will have."
> >
> > I will justify how "likely" this is as follows:
> >
> > 1) A Live 10 minute `media` which has been playing for 3 minutes, leaving
> > me 7 minutes in play but unable to seek.
> >    a) If the stream was stored or recorded this obviously wouldn't be an
> > issue
>
> But how would we know that the media is 10 minutes long if it was not
> already stored?  If it was truly live, we'd only know that it was
> *scheduled* to be 10 minutes long, not that it will actually *be* 10
> minutes long.
>
> > 2) I have a server which requires authorization to seek but not to play.
>
> Why would we want to support that?
>
> > 3) I am a client and I have started playing and 'PAUSE' is not supported
> >
> > 4) I am a server re-streaming from a server where "Range" is not
> > supported
>
> I'm no expert in RTSP, but my understanding is that these two situations
> are only expected to happen when the media is truly live, that is, the
> server does not know how long the media will be.
>
> > "Perhaps referring to the mailing list discussion for this RFC would
> > reveal the answer."
> >
> > Which mailing list is that? 'mmusic' they have also been addressed,
> > several times. I never get a response from them.
>
> I assume you mean "I have asked before on mmusic and received no
> response."  Perhaps nobody else cares about the proposed capability?
>
> I mean the mailing list discussion surrounding this document when it was
> made an RFC.  (Better, look up the Internet-Draft name and refer to the
> appropriate mailing list archive for the time when the I-D was being
> discussed.)
>
> Dale
>