Queries(1)

In essence, queries are equivalent to broadcast messages: they are packets of data that are duplicated to all the nodes in one region or in the entire network.

In their "philosophy", however, they are a special kind of broadcast message. Their goal is to send a small message that triggers some action in one or more nodes in the network. The transfer therefore has to be as fast as possible, which implies a very small message size (1 to 100 bytes(2)).
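The size constraint can be illustrated with a minimal sketch. The function name make_query and the two constants are hypothetical; only the 1 to 100 byte range comes from the text above, and a real implementation may enforce it differently.

```python
MIN_QUERY_SIZE = 1    # bytes, lower bound quoted in the text
MAX_QUERY_SIZE = 100  # bytes, upper bound quoted in the text


def make_query(payload: bytes) -> bytes:
    """Return the payload unchanged if it fits the query size range.

    Hypothetical helper: it only checks the size, nothing else.
    """
    size = len(payload)
    if not MIN_QUERY_SIZE <= size <= MAX_QUERY_SIZE:
        raise ValueError(
            f"query payload must be {MIN_QUERY_SIZE} to {MAX_QUERY_SIZE} bytes, got {size}"
        )
    return payload
```

Anything larger than the limit would be rejected here; it would presumably have to travel by some other mechanism than a query.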

An example: Gnutella [1]

One of the best-known examples of a protocol that uses queries is Gnutella.

There are two kinds of queries in Gnutella. The first one asks the question "Do you have a file with this word in its name?". Every node in the network is expected to search its hard disk and re-transmit the query to the rest of the network(3).

When a node finds one or more matching files on its hard disk, it makes one query of the second kind per file, meaning "I have a file named AAA and my IP is BBB.CCC.DDD.EEE". Sooner or later(4), the node that made the initial search query will receive the results.
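The search described above can be sketched as a small simulation. This is not the Gnutella wire protocol: the function flood_search, its dictionaries, and the node names are all hypothetical, and the "seen" set (each node handles a given query only once) is an assumption added so that the flood terminates on a network with loops.

```python
from collections import deque


def flood_search(neighbors, files, ips, origin, word):
    """Flood a search query from `origin` and collect the result queries.

    neighbors: dict node -> list of directly connected nodes
    files:     dict node -> list of file names stored on that node
    ips:       dict node -> IP address string
    Returns a list of (file name, IP) pairs, one per matching file.
    """
    seen = {origin}  # assumption: a node processes the same query only once
    pending = deque([origin])
    results = []
    while pending:
        node = pending.popleft()
        if node != origin:
            # Each node searches its own files for the word...
            for name in files.get(node, []):
                if word in name:
                    # ...and emits one result query per matching file.
                    results.append((name, ips[node]))
        # Then the node re-transmits the query to its neighbors.
        for peer in neighbors.get(node, []):
            if peer not in seen:
                seen.add(peer)
                pending.append(peer)
    return results


neighbors = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a", "d"], "d": ["b", "c"]}
files = {"b": ["song.mp3"], "d": ["song_live.mp3", "notes.txt"]}
ips = {"a": "10.0.0.1", "b": "10.0.0.2", "c": "10.0.0.3", "d": "10.0.0.4"}
found = flood_search(neighbors, files, ips, "a", "song")
# found == [("song.mp3", "10.0.0.2"), ("song_live.mp3", "10.0.0.4")]
```

Note that the search query itself carries no address for node "a"; only the result queries include an IP, and only because the answering nodes chose to put it in the contents.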

Why it is anonymous

As the previous example shows, whatever produces a query is under no obligation to say where it came from. Thus, queries can be anonymous. If a service needs to know the origin of a query, the sender just has to state it in the contents of the query (as in the "search result" query in Gnutella).

To trace the path of a query, someone would need to control a large part of the network's nodes. Also, to "know" which queries are produced by a node, someone would need to compare all of its input queries with its output queries. But since a node can decide by itself to connect to any node it wants, that would again be very difficult.
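The input/output comparison can be sketched as follows. The function locally_produced and the query identifiers are hypothetical; the sketch assumes an observer who can match the same query on its way in and on its way out, which the text does not guarantee.

```python
def locally_produced(incoming, outgoing):
    """Return the outgoing queries that were never seen coming in.

    incoming: query identifiers observed entering the node
    outgoing: query identifiers observed leaving the node
    """
    received = set(incoming)
    return [query for query in outgoing if query not in received]


# Observer sees every connection of the node: only "q3" originated there.
full_view = locally_produced(["q1", "q2"], ["q1", "q2", "q3"])

# Observer misses the connection that delivered "q2": the relayed query
# is now wrongly attributed to the node.
partial_view = locally_produced(["q1"], ["q1", "q2", "q3"])
```

The second call suggests why the comparison is fragile: a single unobserved connection is enough to make relayed queries look locally produced.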

So we can say that, in practice, tracing and monitoring are difficult enough that queries can be called anonymous(5), and that this anonymity is thus independent of the protocol (and of IP addresses).

Notes

(1) This term was highly "inspired" by the Gnutella protocol, as you can read below.

(2) Put another way, queries are optimized for the lowest possible ping time. This implies that if you send too many queries for the available bandwidth, the transfer rate could greatly affect the ping time. This is what once happened to Gnutella.

(3) In ANet, the clients cannot filter queries, as is now usually done in Gnutella. Thus, the clients are "passive" with respect to the transmission part of the ANet daemon. But ANet will support "filtering modules" that the user can install in the daemon for some specific services.

(4) In Gnutella, it is usually "later" rather than "sooner"... A good idea in Gnutella would be to cache those results when several similar searches are made. Note that in ANet queries are not cached: only part of the information (the checksum) is cached. If you want caching, use static data instead[2].

(5) I am not implying that ANet is 100% anonymous, nor that we will make sure that it will become 100% anonymous.

References

[1] Semi-Official Gnutella Web Site. External link.
[2] Benad, "Static Data". Local link.

Last update for this document: September 1, 2001, at 18:45:18 PST