Re: DoS attack ?

YangWoo Ko <> Sat, 08 December 2001 01:21 UTC

Date: Sat, 08 Dec 2001 10:18:26 +0900
From: YangWoo Ko <>
Subject: Re: DoS attack ?
In-reply-to: <>
To: Nicolas Popp <>
Cc: 'Patrik Fältström' <>, 'John C Klensin' <>, YangWoo Ko <>,
Message-id: <>
MIME-version: 1.0
Content-type: text/plain; charset=euc-kr
Content-disposition: inline
User-Agent: Mutt/1.3.23i
References: <>

Dear all,

I am reading and writing email on a public PC at the airport, so I am not
sure that I have understood the discussion well. My concern is that the
databases are spread over the Internet, which is not the case for a search
engine. If we take this into consideration, there are two additional
points:
(1) it will be terribly hard to coordinate all databases so that they cache
    or react consistently.
(2) sending very fuzzy queries to all databases and gathering the responses
    will be another issue in front of us.
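
A minimal sketch of point (2), assuming three independently run databases
that each cap their own responses. The database names, the record data, and
the functions `query_one` and `fan_out` are all hypothetical and are not
part of any protocol discussed in this thread.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for independently operated databases.
DATABASES = {
    "db-a": ["example", "examples", "exam"],
    "db-b": ["example", "sample"],
    "db-c": ["temple", "exam", "ample"],
}

def query_one(name, term, limit):
    """Substring match against one database, capped at `limit` records."""
    hits = sorted(r for r in DATABASES[name] if term in r)
    # The third element tells the client that this database held back records.
    return name, hits[:limit], len(hits) > limit

def fan_out(term, limit=2):
    """Send the same fuzzy query to every database and merge the responses."""
    names = sorted(DATABASES)
    with ThreadPoolExecutor() as pool:
        responses = list(pool.map(lambda n: query_one(n, term, limit), names))
    # Merge and de-duplicate; the client cannot rank across databases here.
    merged = sorted({r for _, hits, _ in responses for r in hits})
    truncated_by = [name for name, _, truncated in responses if truncated]
    return merged, truncated_by
```

Calling `fan_out("am")` merges four distinct records but also reports that
`db-a` truncated its answer, so the client cannot tell what it is missing
unless every database signals truncation in the same way.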

My best regards

On Fri, Dec 07, 2001 at 09:09:54AM -0800, Nicolas Popp wrote:
> >(1)One practical path is to give in the protocol a way for the server to
> >say "I'm sorry 
> >(2) As soon as you do "paged results", you force the server to keep state.
> No. You simply refetch (the referral query sends me back all the facets and
> the new range). All large search engines are stateless and handle paged
> results. Their caching strategy is typically based on query keyword
> frequency: the cache is built at indexing time based on the query
> distribution, more than by keeping the "last requests" (they actually do a
> little bit of that too). Since they are stateless, their response is
> consistent with the query: at any time they return the set of results for
> the query within the specified range, but that set is not immutable. The
> fact that the index is slowly changing makes that strategy acceptable from
> a user perspective.
> Having said that, I don't disagree with (1). I just think you need to do
> both.
> -Nico
> -----Original Message-----
> From: Patrik Fältström []
> Sent: Thursday, December 06, 2001 10:34 PM
> To: Nicolas Popp; 'John C Klensin'; YangWoo Ko
> Cc:
> Subject: RE: DoS attack ?
> --On 01-12-06 10.09 -0800 Nicolas Popp <> wrote:
> > You can also do what most search engines would.
> > You return a (small) range of ranked results in the set of results and
> > your last result is a referral back to you for the next range in the
> > set...Then you try to detect automated crawlers that recursively follow
> > the referrals and slow them down to a halt.
> > 
> > So, just from that standpoint, it could be useful for the protocol to
> > support the notion of results set range (query) as well as referral
> > (response).
> We have been through this when looking at other protocols....and I would
> urge you to learn from earlier mistakes (and successes).
> (1) One practical path is to give in the protocol a way for the server to
> say "I'm sorry, but I will not do that operation you requested. Instead I
> did the following". This generic response can be "you only got 10 records
> even though the result set is larger".
> (2) As soon as you do "paged results", you force the server to keep state.
> Depending on whether the protocol is stateful or stateless, it is harder or
> easier for the server to know when to remove the cached search. Further, as
> soon as you start doing paged results, you end up getting problems with
> sorting the result, handling of database changes between the two fetches
> (i.e. can the server re-issue the query for the second fetch, or does the
> server really have to cache the result set and return the second part at
> the second fetch) and a million other problems.
> So, my suggestion is "don't go there".
>     paf
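
The two positions quoted above can be sketched together: a server that keeps
no per-client state, re-runs the query on every fetch, caps the range it is
willing to return, and hands back a referral for the next range. This is
only an illustration. The index contents, the `search` function, and the
field names `results`, `total`, and `referral` are assumptions, not
something either author specified.

```python
# Hypothetical ranked index; a real server would query its own database.
INDEX = ["alpha", "beta", "delta", "epsilon", "gamma"]

def search(prefix, start=0, count=2, max_count=2):
    """Re-run the query on every call and return only the requested range."""
    # The server may decline to return as many records as were asked for.
    count = min(count, max_count)
    matches = [r for r in INDEX if r.startswith(prefix)]
    page = matches[start:start + count]
    end = start + len(page)
    # The referral carries everything needed for the next fetch, so nothing
    # has to be remembered on the server between requests.
    referral = None
    if end < len(matches):
        referral = {"prefix": prefix, "start": end, "count": count}
    return {"results": page, "total": len(matches), "referral": referral}
```

A client pages by calling `search(**response["referral"])` until the
referral is `None`. If a record is added to `INDEX` between two fetches, the
second page can repeat or skip an entry, which is the database-change
problem raised in (2); the stateless design accepts that instead of caching
the result set.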

YangWoo Ko :
We Invent Enterprise Software Solutions
and Make You Secure & Powerful.