[Din] distributed search engine
Stan Srednyak <stan.sredn@gmail.com> Fri, 02 July 2021 18:10 UTC
From: Stan Srednyak <stan.sredn@gmail.com>
Date: Fri, 02 Jul 2021 14:09:59 -0400
Message-ID: <CAE-786g_VpQLXkjXhRGuQkK+qes-RzLRL4FJ9ViSatHkiCwS-w@mail.gmail.com>
To: Din@irtf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/din/cv-OVx3xb8xgoOCaWDNj2ae9Ikw>
Subject: [Din] distributed search engine
Hi DINRG,

There seems to be high demand for the development of a decentralized Internet. Decentralization must address many areas, social networks in particular, but it is of utmost importance to develop decentralized search engines, since search engines are the gateway to the web. The search engine industry has been monopolized by a few large companies. This centralization has had many negative effects (e.g., manipulation of the rank), but one that is particularly conspicuous is the secrecy of the ranking algorithms. It is highly desirable to have open ranking algorithms and to let users choose from a variety of them.

Some time ago I started working on the design and implementation of a decentralized search engine. The basic idea of my approach is that, while the computational power needed to realize a distributed search engine is immense, it is possible to design a communication protocol that orchestrates data collection, analysis, ranking, and the serving of search queries, and that splits the workload among participating nodes. The participating nodes in my design are the computers of ordinary Internet users. I have developed the algorithms needed to realize search and ranking operations on a distributed network of user computers.

One of the challenges lies in achieving acceptable latency (<1 second) when serving search transactions. According to my estimates, it is possible to achieve latency comparable to that of existing search engines. In addition, I have shown that it is possible to guarantee the computation and delivery of the true rank (the one actually requested by the user) to end users. Of course, there is the obvious problem that in decentralized architectures nodes may try to manipulate the rank and place some pages unjustifiably high or low. Nonetheless, it is possible to design a network communication protocol so that malicious nodes are highly unlikely to succeed in manipulating the rank, as long as their total fraction is below a certain threshold. Some of the details of the project can be found at https://rorur.com.

To incentivize people to maintain "search nodes" (analogous to Ethereum nodes), I have proposed an architecture that allows individuals and companies to advertise on this network, much as they do on the usual search engines, with the difference that the revenue is distributed to the node maintainers. There are some details on how to achieve this securely, and some of them can be found on the site linked above.

I will be rolling out the first stage of this project quite soon, and I would like to know whether there is any interest in it. Of course, this has to be a collaborative project. It is impossible to run it on individual hardware (although it is possible to deploy it in a centralized data center). There will be several stages in the deployment, in particular several versions of the communication protocol. I will describe these stages in a forthcoming publication. There are various ways to participate in the project, from maintaining a node, to software development, to algorithm design. I will be very happy to hear from you.

I think this project ties in really well with the spirit of this group and, more generally, with the spirit of the IETF. As explained in the white paper, this project, if successful, can lead to the transformation of the web into a "knowledge system". A longer discussion is needed here, but in brief, it may allow for the creation of personalized search and personal knowledge graphs. It can also be instrumental in creating a more robust Internet infrastructure.
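
The threshold claim made earlier in this message can be illustrated with a small calculation. This is only a sketch under assumptions that do not come from the white paper: each rank computation is replicated on `replicas` nodes sampled independently and uniformly from the network, a fixed `malicious_fraction` of nodes collude, and the client accepts the answer returned by a strict majority of replicas. The function name and both parameters are hypothetical. Under this simple binomial model, the chance that colluding nodes control the majority falls quickly with the number of replicas whenever the malicious fraction is below one half.

```python
from math import comb


def majority_corruption_probability(malicious_fraction: float, replicas: int) -> float:
    """Probability that a strict majority of the sampled replicas is malicious.

    Assumes a binomial model: each replica is malicious independently with
    probability `malicious_fraction`. This is an illustrative assumption,
    not the protocol described on the project site.
    """
    if not 0.0 <= malicious_fraction <= 1.0:
        raise ValueError("malicious_fraction must be in [0, 1]")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")

    p = malicious_fraction
    needed = replicas // 2 + 1  # smallest count that forms a strict majority
    return sum(
        comb(replicas, i) * p**i * (1.0 - p) ** (replicas - i)
        for i in range(needed, replicas + 1)
    )


if __name__ == "__main__":
    # Compare a few replication factors at a 20% malicious fraction.
    for k in (1, 3, 11, 21):
        print(k, majority_corruption_probability(0.2, k))
```

An odd number of replicas avoids ties in the majority vote. A real deployment would also have to deal with Sybil identities and non-uniform node sampling, which this model ignores.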

I will try to develop this project in close collaboration with the IETF community, because the issues it addresses concern fundamental aspects of the web, at the level of protocols and data routing. If I am not mistaken in my calculations, distributed search operations can be added on top of the standard protocol stack and thus become part of everyday web operation.

Best regards,
Stan Srednyak