Static data is the second of the three kinds of communication built into the ANet protocol (queries, static data, and two-way data transfers, or TWDT).
Basically, static data works the same way as queries, but it is meant for data that is bigger and stays relevant for a longer period of time. Queries, in contrast, are smaller but are distributed across the network faster(2).
Static data is meant for data that does not change often, that is, data that is not dynamic. For example, web pages that rarely change could be considered "static" data. If the data has to be generated "on-the-fly" (dynamic), then you should instead send a query that initiates a two-way data transfer.
Let's say you have some data that you want distributed to, and kept in the database of, as many computers as possible. In that case, you need to distribute it as static data. Every computer that keeps that data then becomes a "duplicate" of you, and you also become a "duplicate" of the rest of the network. The advantage is that each computer has a local copy of your data and doesn't need to ask you "Do you have that data, and if so, can you give me a copy?".
So, the network somehow becomes a database, where each node has a copy of the same data as all the other nodes(3).
Because static data is substantially bigger than queries, we cannot afford to send all the static data from a node to all its adjacent nodes. So, uploading static data from a node to another node has to be a two-step process.
The first step is to ask "Do you already have that data?". Actually, it is more like "Here's a list of what I have. What do you want?". The other node then replies with a list of the data it wants.
Then, the second step is to actually upload the requested data from one node to the other one.
Note, though, that a node can never say "I want this data" without first being offered the chance to download it (through the "Here's what I have" list). Thus, distributing static data is a passive process: a node doesn't need to "trigger" any action to receive static data(4).
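The two-step exchange described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Node` class, its method names, and the in-memory dictionary are hypothetical and not part of the ANet protocol, and keys are treated as plain strings.

```python
class Node:
    """Hypothetical in-memory stand-in for an ANet node (not from the ANet spec)."""

    def __init__(self, name):
        self.name = name
        self.store = {}  # key -> static data contents (bytes)

    def offer(self):
        """Step 1, sender side: "Here's a list of what I have"."""
        return set(self.store)

    def wanted(self, offered_keys):
        """Step 1, receiver side: reply with the offered keys not held locally."""
        return {key for key in offered_keys if key not in self.store}

    def upload(self, requested_keys, offered_keys):
        """Step 2, sender side: send only data that was both offered and requested."""
        return {
            key: self.store[key]
            for key in requested_keys
            if key in offered_keys and key in self.store
        }

    def receive(self, packets):
        """Step 2, receiver side: keep the uploaded data locally."""
        self.store.update(packets)


def distribute(sender, receiver):
    """Run one offer / request / upload round; return the keys transferred."""
    offered = sender.offer()
    requested = receiver.wanted(offered)
    packets = sender.upload(requested, offered)
    receiver.receive(packets)
    return set(packets)


a = Node("a")
a.store = {"k1": b"page one", "k2": b"page two"}
b = Node("b")
b.store = {"k2": b"page two"}

print(distribute(a, b))  # {'k1'}: only the missing data is uploaded
print(distribute(a, b))  # set(): nothing left to send
```

The receiver never names a key on its own; it can only pick from the list the sender offered, which is what makes the process passive.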
Already, we have a huge problem: how can we uniquely identify a static data "packet"(5)? Remember that the network is distributed, so no single node can control the creation of all the keys that uniquely identify static data packets. From any one point in the network, there is no safe way to verify that the "key" you want to create is not already in use, or to ensure that no other node will use the same key for a different packet.
In practice, there would be no problem if the key for a packet were a checksum of its contents. But then, if we change the contents of the data, its key changes too. So, this kind of key is useful only when the data never changes. Otherwise, there's no way to find "the latest version of that packet".
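To make the checksum trade-off concrete, here is a minimal sketch. The text only says "checksum"; using SHA-256 from Python's standard `hashlib` module is an assumption made for illustration, and `checksum_key` is a hypothetical helper name.

```python
import hashlib


def checksum_key(contents: bytes) -> str:
    """Derive a key from the contents alone (SHA-256 is an assumed choice)."""
    return hashlib.sha256(contents).hexdigest()


v1 = b"<html>Hello</html>"
v2 = b"<html>Hello, world</html>"  # the same page after an edit

key_v1 = checksum_key(v1)
key_v2 = checksum_key(v2)

# Any node can compute the same key for the same contents, with no
# central authority handing out keys.
print(key_v1 == checksum_key(v1))  # True

# But editing the contents yields a different key, so nothing links
# the new version back to the old one.
print(key_v1 == key_v2)  # False
```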
The solution is to have two keys for each packet.
The "primary" key identifies the expected contents of the packet. It is a kind of file name, so that users can "know" what contents the packet should have. The primary key is not unique: several packets can have the exact same primary key, and it's up to the user to figure out which packet is the "right" one.
The "secondary" key uniquely identifies the actual contents of the packet; it is basically a checksum. So, it can tell apart the different "versions" of a packet and, if it is a digital signature, identify the source that made the packet. With the secondary key, a node can check whether the data has been tampered with and destroy any data detected as "broken"(6).
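The two-key scheme might look like the following sketch. Everything named here (`StaticPacket`, `make_packet`, `is_intact`, `LocalStore`) is hypothetical, and the secondary key is assumed to be a plain SHA-256 checksum. A plain checksum detects altered contents but, unlike the digital signature mentioned above, says nothing about who made the packet.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class StaticPacket:
    primary_key: str    # human-meaningful name; NOT unique
    contents: bytes
    secondary_key: str  # checksum of contents; identifies this exact version


def make_packet(primary_key: str, contents: bytes) -> StaticPacket:
    """Build a packet whose secondary key is derived from its contents."""
    return StaticPacket(primary_key, contents, hashlib.sha256(contents).hexdigest())


def is_intact(packet: StaticPacket) -> bool:
    """Recompute the checksum; a mismatch means the data is "broken"."""
    return hashlib.sha256(packet.contents).hexdigest() == packet.secondary_key


class LocalStore:
    """Hypothetical local database of static data packets."""

    def __init__(self):
        # primary key -> {secondary key -> packet}
        self._by_primary = defaultdict(dict)

    def add(self, packet: StaticPacket) -> bool:
        """Keep the packet only if its contents match its secondary key."""
        if not is_intact(packet):
            return False  # discard data detected as tampered with
        self._by_primary[packet.primary_key][packet.secondary_key] = packet
        return True

    def versions(self, primary_key: str) -> list:
        """All packets sharing a primary key; the user picks the "right" one."""
        return list(self._by_primary.get(primary_key, {}).values())


store = LocalStore()
old = make_packet("index.html", b"version 1")
new = make_packet("index.html", b"version 2")
store.add(old)
store.add(new)
print(len(store.versions("index.html")))  # 2

tampered = StaticPacket("index.html", b"altered", old.secondary_key)
print(store.add(tampered))  # False
```

Indexing by primary key first keeps the lookup a user would actually perform ("show me everything called index.html") cheap, while the secondary key keeps the versions apart.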
(1) While "static" is the right term, "data" might be too generic. Maybe the term "static packet" or "static record" would be better...
(2) Because static data is large, a node will send it to a node it is connected to only if the receiving node doesn't already have the data. Thus, nodes must allow other nodes to ask them what static data they have.
(3) "Somehow" is the key word here. ANet is not an implementation of a database, since while it can be used to define storage, it doesn't define any transaction. It is up to the programmers of the database clients to define and enforce the transaction rules.
(4) Actually, a node cannot trigger an action to receive static data on its own; that requires client interaction, through queries.
(5) With long-term memory structures, the smallest unit is a file. But in networking (especially IP), it is a packet. Since ANet is more a networking protocol than a long-term memory structure, I suppose that it is better to use the term "static data packet" than "file", even though it sounds less "intuitive".
(6) So, I just completely avoided all the "uniqueness" problems that are starting to plague Freenet. You see, it's that simple...
(7) In Gnutella, you have to constantly ask, through searches, what the other nodes have. So, some users are "spamming" the network with searches like "a", "b", and so on. With ANet, it could be possible to use static data for file lists, so that you know at all times what's on the network. Also, doing "stupid" searches will affect only your computer, as you search in the file lists that are already in your computer.
About the references... Benad, "Anonymous Two-Way Data Transfers".
Last update for this document: September 1, 2001, at 18:52:12 PST