RE: DoS attack ?

John C Klensin <> Thu, 06 December 2001 18:58 UTC

Date: Thu, 06 Dec 2001 13:54:20 -0500
From: John C Klensin <>
Subject: RE: DoS attack ?
In-reply-to: <>
To: Nicolas Popp <>, YangWoo Ko <>
Message-id: <128855563.1007646860@P2>
MIME-version: 1.0
X-Mailer: Mulberry/2.1.1 (Win32)
Content-type: text/plain; charset=us-ascii
Content-transfer-encoding: 7BIT
Content-disposition: inline
References: <7FC3066C236FD511BC5900508BAC86FE4364D0@trestles.inte>

--On Thursday, 06 December, 2001 10:09 -0800 Nicolas Popp wrote:

> You can also do what most search engines would.
> You return a (small) range of ranked results in the set of
> results and your last result is a referral back to you for the
> next range in the set...Then you try to detect automated
> crawlers that recursively follow the referrals and slow them
> down to a halt.
> So, just from that standpoint, it could be useful for the
> protocol to support the notion of results set range (query) as
> well as referral (response).

Brief observations:


Been there and done that.  Does not scale well to very large
databases (VLDBs).  Returning results by sorted ranking doesn't
either, unless you retrieve everything (or high-performance
record pointers to everything) on the server, sort it there, and
then, for efficiency, return part of it and cache the rest.  If
you are trying to conserve bandwidth or improve user
presentation speed, this works (although it still doesn't scale
terribly well).  If you are trying to conserve server resources,
it is usually bad news.
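
A rough sketch of why ranked ranges cost the server more than they
appear to, assuming an in-memory list of matching records and a
scoring function. The names `ranked_range`, `records`, and `score`
are hypothetical and not part of any proposed protocol:

```python
import heapq


def ranked_range(records, score, start, count):
    """Return the records ranked [start, start + count) by descending score.

    Only `count` records go back to the client, but every matching record
    has to be scored, and the best `start + count` of them have to be held
    and ordered before the requested slice can be cut out.
    """
    # Hypothetical server-side step: scan and score the whole match set.
    top = heapq.nlargest(start + count, records, key=score)
    return top[start:start + count]


# Asking for ranks 2..4 still scans all ten records.
page = ranked_range(list(range(10)), score=lambda r: r, start=2, count=3)
```

The work grows with the size of the match set and with how deep the
requested range is, not with the number of records returned. That is
where the pressure to cache the sorted result between requests comes
from.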

<Incorporate several observations about race conditions and
server-based search state here>

How would you feel about getting back a randomly chosen subset
of specified maximum size each time, sampled with replacement?
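
One possible reading of that idea, sketched under assumptions: the
server draws `max_size` times with replacement from the match set and
drops repeated draws, so each response holds at most `max_size`
records and nothing is remembered between queries. The function name
and parameters are hypothetical:

```python
import random


def random_subset(matches, max_size, rng=None):
    """Return at most `max_size` records chosen at random from `matches`.

    Draws are made with replacement. Repeated draws collapse to a single
    record, so the response may be smaller than `max_size`.
    """
    rng = rng or random.Random()
    if not matches or max_size <= 0:
        return []
    # Each draw is independent; no cursor or per-client state is kept.
    picked = {rng.randrange(len(matches)) for _ in range(max_size)}
    return [matches[i] for i in sorted(picked)]


sample = random_subset(list(range(100)), 5, random.Random(0))
```

Because every query is answered independently, a client that repeats
the query gets another random draw rather than the "next" range, and
the server has no result set to sort or cache.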

Let's discuss next week.