How to set up a small DHT using the Bamboo-DHT implementation

"Bamboo is a either based on Pastry, a re-engineering of the Pastry protocols, or an entirely new DHT, depending on how you want to look at it."
- Bamboo-DHT website

Introductory notes and make


Bamboo-DHT is the implementation used to build OpenDHT, the service running on PlanetLab. In this how-to I show what I did to get a small ring of Bamboo nodes up and running. To start, you will need Java installed and the Bamboo-DHT sources downloaded. Compilation is straightforward: just issue "make" in the topmost folder. NOTE: I am using Linux and NOT Windows. For the visualizer to work with your network you have to modify it; see the Visualizer section. Also note that I am writing this from memory, so some parts may be a bit hazy.

Configuration file


I used the configuration file planetlab/openhash.cfg as a base for my own configuration. The first step is to modify the global stage, which contains the node's identity.

<global>
    <initargs>
        node_id ${NodeID}
    </initargs>
</global>

This should be changed to something like localhost:<port_#>. I used port 3600. Any other port works too, but remember that Bamboo-DHT reserves the port defined here and also ports <port_#> + 1 and <port_#> + 2. If the port is 3600, the node listens for routing messages and the like on port 3600, for HTTP messages on port 3601 and for RPC messages on port 3602. Take this into account at least when running multiple nodes on one machine.

<global>
    <initargs>
        node_id localhost:3600
    </initargs>
</global>
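The port layout described above can be sketched in a few lines of Python. This is only a planning helper of my own, not part of Bamboo-DHT; the function names are made up for illustration, and the +1/+2 offsets are simply the ones stated in this how-to.

```python
def bamboo_ports(base_port):
    """Return the three ports a node claims, following the layout in this how-to."""
    return {
        "routing": base_port,      # the port written in node_id
        "http": base_port + 1,     # HTTP messages / web interface
        "rpc": base_port + 2,      # RPC messages (the Gateway stage port)
    }


def ports_overlap(base_a, base_b):
    """True if two nodes on the same machine would claim a common port."""
    a = set(bamboo_ports(base_a).values())
    b = set(bamboo_ports(base_b).values())
    return bool(a & b)
```

When running several nodes on one machine, spacing the node_id ports at least three apart (3600, 3603, 3606, ...) keeps ports_overlap() false for every pair.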

The second step is to remove the mac_key_file initarg from the Network stage, if you do not want access control with MACs.

<Network>
    class bamboo.network.Network
    <initargs>
        mac_key_file /home/ucb_bamboo/pl-mac.key
    </initargs>
</Network>

After the change it should look like this:

<Network>
    class bamboo.network.Network
    <initargs>
    </initargs>
</Network>

The Router stage has many different initargs; some of them are listed here. Their meaning is easy to understand from the names if the basics of DHTs are familiar.

                debug_level             0
                gateway                 
                periodic_ping_period    20
                ls_alarm_period         4
                near_rt_alarm_period    0
                far_rt_alarm_period     10
                leaf_set_size           3
                digit_values            2
                ignore_proximity        false
                location_cache_size     0
                immediate_join          true

The Router stage should be modified to list the Bamboo-DHT nodes in your network that act as gateways.

<Router>
    class bamboo.router.Router
    <initargs>
        gateway_count 2
        gateway_0 <name-of-router-node-1>:<port>
        gateway_1 <name-of-router-node-2>:<port>
        leaf_set_size 4
        digit_values 2
        immediate_join true
    </initargs>
</Router>
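If you have more than a couple of gateways, writing the gateway_count and gateway_N lines by hand gets error-prone. A minimal Python sketch that produces them from a list follows; the helper name and the example addresses are my own assumptions, and only the initarg names come from the configuration above.

```python
def router_gateway_args(gateways):
    """Build the gateway_count / gateway_N initarg lines for the Router stage.

    `gateways` is a list of "host:port" strings, using the node_id ports
    of the nodes that act as gateways.
    """
    lines = ["gateway_count %d" % len(gateways)]
    for index, gateway in enumerate(gateways):
        lines.append("gateway_%d %s" % (index, gateway))
    return "\n".join(lines)


# Hypothetical gateway addresses, for illustration only.
print(router_gateway_args(["10.1.0.4:3600", "10.1.0.5:3600"]))
```

The printed lines can be pasted inside the Router stage's initargs block.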

The DataManager is responsible for storing all the data and maintaining it, including removing old data after its TTL expires. In this example the stage is initialized to require two ACKs per piece of data.

<DataManager>
    class bamboo.dmgr.DataManager
    <initargs>
        required_acks 2
    </initargs>
</DataManager>

The StorageManager stage handles the actual storing of the data. Just fill in where you want to put the storage file (like "tmp/mystorage").

<StorageManager>
    class bamboo.db.StorageManager
    <initargs>
        homedir <location-of-the-cache>
    </initargs>
</StorageManager>

The Dht stage is the DHT itself. It tells the system which stage is the actual storage and how many times every key-value pair should be replicated over the ring. Remember that this has to be some safe value; there are multiple papers discussing and calculating this value. Just remember that if you have fewer nodes in the system than min_replica_count specifies, you will not get a successful put.

<Dht>
    class bamboo.dht.Dht
    <initargs>
        storage_manager_stage StorageManager
        min_replica_count 8
    </initargs>
</Dht>
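Since a small test ring easily has fewer nodes than the min_replica_count shown above, it may be worth checking your plan before starting anything. The following Python sketch only encodes the rule stated in this how-to; the function name is hypothetical and nothing here talks to Bamboo-DHT.

```python
def check_replication(node_count, min_replica_count):
    """Return a warning string if puts are expected to fail, else None."""
    if node_count < min_replica_count:
        return ("ring has %d node(s) but min_replica_count is %d: "
                "puts are not expected to succeed"
                % (node_count, min_replica_count))
    return None


# A three-node test ring with the min_replica_count of 8 from the example.
print(check_replication(3, 8))
```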

The Gateway stage tells the DHT which port to listen on for RPC messages. See also the discussion about ports above. This has to be the + 2 one: for example, if the node id says 3600, then this has to be 3602.

<Gateway>
    class bamboo.dht.Gateway
    <initargs>
        port <port>
    </initargs>
</Gateway>

The WebInterface stage is what its name says. It tells the system that the web interface should be started and that it should show the storage manager named in the initargs. The address of a node's web interface follows from the node id in the global stage: just type "http://<nodes-ip>:<port>" into the browser's URL field. Remember that the web interface uses the HTTP port (+1 to the one used in the node id).

<WebInterface>
    class bamboo.www.WebInterface
    <initargs>
        storage_manager_stage StorageManager
        </initargs>
</WebInterface>
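Deriving the web interface URL from a node id is a one-liner, sketched here in Python. The helper name is my own; the +1 offset is the one described above, and the example address is the private one used later in the Visualizer section.

```python
def web_interface_url(node_id):
    """Turn a node_id such as "10.1.0.4:3600" into its web interface URL."""
    host, port = node_id.rsplit(":", 1)
    # The web interface listens on the HTTP port, one above the node_id port.
    return "http://%s:%d/" % (host, int(port) + 1)


print(web_interface_url("10.1.0.4:3600"))
```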

The last stage is a virtual coordinate system that is built on top of the whole system. I am not sure whether it is needed, but it did not cause any trouble when I left it in the configuration file.

<Vivaldi>
    class bamboo.vivaldi.Vivaldi
    <initargs>
        vc_type 2.5d
        generate_pings true
        eavesdrop_pings false
        use_reverse_ping true
        ping_period 10000
        version 1
    </initargs>
</Vivaldi>

Here is a sample configuration file that I have used for testing.

Compilation


This should be easy. Just run "make" in the main folder of the Bamboo sources. I assume that you have a Java compiler installed. Check out the Visualizer section below if you want to build and use the stand-alone visualizer to track your ring.

Start-up


You have to start each node separately from the bamboo/bin folder using the command "run-java bamboo.lss.DustDevil <config-file>". Every node has to have its own configuration file. It is possible to generate and start them by script, as was done with the configuration file I took as the starting point; that is easier for bigger rings. For testing in small environments a few handwritten configuration files are enough. Remember to start the gateways first so that there is someone listening for the join messages.
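For anything beyond a handful of nodes, the per-node configuration files and the start order can be scripted. The Python sketch below is one way to do it, under these assumptions: the template keeps the ${NodeID} placeholder from the global stage, other ${...} placeholders (if any) should be left alone, and the function and file names are mine. It only builds the command lines; actually launching them (for example with subprocess from the bamboo/bin folder) is left out.

```python
import string


def render_config(template_text, node_id):
    """Fill in ${NodeID}; safe_substitute leaves other placeholders untouched."""
    return string.Template(template_text).safe_substitute(NodeID=node_id)


def start_commands(config_files, gateway_configs):
    """Return one command per node, with the gateway nodes listed first."""
    ordered = [c for c in config_files if c in gateway_configs]
    ordered += [c for c in config_files if c not in gateway_configs]
    return [["run-java", "bamboo.lss.DustDevil", c] for c in ordered]


template = "node_id ${NodeID}"
print(render_config(template, "localhost:3600"))
for command in start_commands(["a.cfg", "gw.cfg", "b.cfg"], {"gw.cfg"}):
    print(" ".join(command))
```

Keeping the gateways at the front of the list mirrors the advice above: the nodes named in the Router stage must already be listening when the others try to join.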

Visualizer


If you want to use a more sophisticated visualizer to examine your ring, you can use the stand-alone visualizer. By default it points to PlanetLab and OpenDHT, but with small changes you will get it working with your own ring.

Open the "bamboo/vis/FetchNodeInfoThread.java" file in your favorite code editor and search for the line that says "public final String [] GATEWAY_NODES_IP =". Change it to list your gateways, or their web interface URLs to be exact. For example, the following is for a node running in private address space at 10.1.0.4 with port 3600 in its node id, so its web interface is on port 3601.

public final String [] GATEWAY_NODES_IP = { "http://10.1.0.4:3601/", };

Remember that after the changes you have to recompile at least the visualizer. To run the visualizer, use the command "run-java bamboo.vis.Vis <webinterface-URL>".

End notes


Now the Bamboo-DHT ring should be up and running. For more explanation of Bamboo-DHT and its setup, please consult the Bamboo-DHT website, the doc directory in the sources or the Bamboo-Users mailing list.

For some working implementations using Bamboo-DHT, you can check the HIPL code base to see how it uses the OpenDHT instantiation of Bamboo-DHT. You can also see the OpenDHT site's user guide. There are a few Python scripts that can put, get and remove data.
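To give an idea of what such a script looks like, here is a rough Python 3 sketch of put and get over XML-RPC. It is written from my recollection of the OpenDHT scripts, so treat the call shapes as assumptions to check against the user guide: put(key, value, ttl, application) returning 0/1/2, and get(key, maxvals, placemark, application) returning a list of values plus a placemark. The helper names and the application string are mine.

```python
import hashlib
from xmlrpc.client import Binary

# Result codes as I remember them from the OpenDHT scripts (assumption).
PUT_RESULTS = {0: "Success", 1: "Capacity", 2: "Again"}


def dht_key(name):
    """Keys are the 20-byte SHA-1 digest of a human-readable name."""
    return Binary(hashlib.sha1(name.encode("utf-8")).digest())


def put(proxy, name, value, ttl_seconds, app="howto-sketch"):
    """Store `value` (bytes) under `name`; returns a readable result string."""
    code = proxy.put(dht_key(name), Binary(value), ttl_seconds, app)
    return PUT_RESULTS.get(code, "Unknown (%r)" % code)


def get_all(proxy, name, max_vals=10, app="howto-sketch"):
    """Fetch every value stored under `name`, following placemarks."""
    key = dht_key(name)
    placemark = Binary(b"")
    values = []
    while True:
        vals, placemark = proxy.get(key, max_vals, placemark, app)
        values.extend(v.data for v in vals)
        if not placemark.data:  # an empty placemark means no more values
            break
    return values
```

To use it, create an xmlrpc.client.ServerProxy with the XML-RPC URL of one of your gateway nodes and pass it in as proxy. I have not verified which of a node's ports serves XML-RPC on a self-built ring, so check the Bamboo documentation for that.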

All of these manuals/tutorials are provided as is. They worked for me and that is all the help I give with them, so if I forgot something or there is a typo you can inform me, but do not expect me to solve your problems :) Oh, and I almost forgot: use them at your own risk.