Follow these steps to analyse data using Tane.

1) Prepare your data.

   The input data for Tane must be a set of comma-separated records
   without extra whitespace. For example, if you take a look at the
   original/testdata.orig file, you'll see lines like:

       128059,1,1,1,1,2,5,5,1,1,2
       1285531,1,1,1,1,2,1,3,1,1,2
       1287775,5,1,1,2,2,2,3,1,1,2
       144888,8,10,10,8,5,10,7,8,1,4
       145447,8,4,4,1,2,9,3,3,1,4
       167528,4,1,1,1,2,1,3,6,1,2

   Put your data in the directory called original. The name of the file
   must end with .orig, for example data.orig. The entries can be
   arbitrary strings (excluding ','), not necessarily numeric. For
   example:

       monday,raining,15,abc
       tuesday,shining,24,dfg

   etc.

2) Generate a description file for your data.

   Take a look at the description file descriptions/testdata.dsc. Make
   a copy of that file and replace the word testdata (in the line
   DataIn = ...) with the name of your data file (in this case data).
   The name of the description file must end with .dsc, for example
   data.dsc.

3) Generate the data for Tane.

   You should now have two files ready:

       original/data.orig
       descriptions/data.dsc

   Change to the original directory and run the data generation
   command:

       $ cd original
       $ ../bin/select.bin ../descriptions/data.dsc

   You should now find two generated files in the data directory:
   data.dat and data.rel.

4) Run the analysis from the tane root directory. For example:

       $ ./bin/tane 11 100 11 data/testdata.dat

   Here the first 11 is the level at which you want to stop the search,
   100 is the number of records, counted from the beginning of the data
   file, that you want to use in the search, and the last 11 is the
   number of attributes you want to use in the search. The test data
   has a total of 11 attributes and 100 records, so the whole data set
   is used in the analysis. The last argument, data/testdata.dat, is
   the name of the data file.
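The record and attribute counts passed to tane, and the format rules from
step 1, can be checked before running the analysis. The following Python
sketch is not part of Tane; the function name check_records is made up for
this example. It assumes that "extra whitespace" means leading or trailing
whitespace around a field, and that blank lines can be skipped; neither
assumption is confirmed by the Tane documentation.

```python
def check_records(lines):
    """Return (record_count, attribute_count) for comma-separated records.

    Raises ValueError if a field has leading/trailing whitespace or if
    the records do not all have the same number of fields.
    """
    n_attrs = None
    n_records = 0
    for lineno, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if line == "":
            continue  # assumption: blank lines are skipped, not reported
        fields = line.split(",")
        for field in fields:
            # assumption: "extra whitespace" = whitespace around a field
            if field != field.strip():
                raise ValueError(
                    f"line {lineno}: extra whitespace in field {field!r}")
        if n_attrs is None:
            n_attrs = len(fields)
        elif len(fields) != n_attrs:
            raise ValueError(
                f"line {lineno}: expected {n_attrs} fields, "
                f"found {len(fields)}")
        n_records += 1
    return n_records, n_attrs or 0


# Two records taken from the test data shown in step 1.
sample = [
    "128059,1,1,1,1,2,5,5,1,1,2\n",
    "1285531,1,1,1,1,2,1,3,1,1,2\n",
]
records, attrs = check_records(sample)
print(records, attrs)
```

To check a real file, pass an open file object instead of the sample list,
for example check_records(open("original/data.orig")). The two numbers
returned are the largest values that make sense for the record-count and
attribute-count arguments of tane.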
The analysis prints lines like:

    1 -> 10
    1 -> 11
    1 2 -> 3
    1 2 -> 4
    1 2 -> 6
    1 2 -> 9
    1 4 -> 3
    1 3 -> 4

These are the dependencies found during the analysis. The line '1 -> 10'
means that attribute number 1 determines attribute number 10. Replace the
numbers with the actual attribute names to make the output more readable.

Between the dependency lines there is some logging information, which you
can ignore, such as:

    Level == 4 #candidates == 330 avg.elements == 6 (14186/2257)

i.e., at search level 4 there are 330 candidates left to analyse.

You can also write the result of the search to a file:

    $ ./bin/tane 11 100 11 data/testdata.dat > results.txt

results.txt uses Unix line endings (a single newline character at the end
of each line), so on Windows it is displayed correctly in WordPad but not
in Notepad.
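Replacing the attribute numbers with names and dropping the logging lines
can be automated. The Python sketch below is not part of Tane; the function
name name_dependencies and the attribute names in the example are made up.
It assumes that every dependency line has exactly the form shown above
(one or more attribute numbers, '->', one attribute number) and that
attributes are numbered from 1. Any line that does not match is treated as
logging output and skipped.

```python
import re

# One or more attribute numbers, "->", then a single attribute number.
DEP_LINE = re.compile(r"^(\d+(?:\s+\d+)*)\s*->\s*(\d+)$")


def name_dependencies(output_lines, names):
    """Rewrite 'N M -> K' lines using attribute names.

    names[0] is the name of attribute number 1, names[1] of number 2,
    and so on. Lines that are not dependencies are skipped.
    """
    result = []
    for line in output_lines:
        match = DEP_LINE.match(line.strip())
        if match is None:
            continue  # logging line such as "Level == 4 ..."
        lhs = [names[int(n) - 1] for n in match.group(1).split()]
        rhs = names[int(match.group(2)) - 1]
        result.append(", ".join(lhs) + " -> " + rhs)
    return result


# Hypothetical names for the 11 attributes of the test data.
names = ["id"] + [f"col{i}" for i in range(2, 12)]
output = [
    "1 -> 10",
    "Level == 4 #candidates == 330 avg.elements == 6 (14186/2257)",
    "1 2 -> 3",
]
for dependency in name_dependencies(output, names):
    print(dependency)
```

To process a saved run, pass the lines of the result file instead of the
example list, for example name_dependencies(open("results.txt"), names).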