M. Shahriar Hossain

Appendix: Construction of Data

Let the domain be D{a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p}, which has 16 elements. A 75% imprecise record then contains 12 elements, so an instance of a 75% imprecise record R can be any subset of D with twelve random elements. Consider such a 75% imprecise record R{a, c, d, f, g, h, j, k, m, n, o, p}. We construct records with lower percentages of imprecision by removing random elements from R, one at a time. This is illustrated in Table 1.

Table 1. Construction of records with different percentages of imprecision.

Percentage of Imprecision	Records
75%	a, c, d, f, g, h, j, k, m, n, o, p
68.75%	a, c, d, f, h, j, k, m, n, o, p
62.5%	a, c, d, f, h, j, k, m, n, p
56.25%	a, d, f, h, j, k, m, n, p
50%	a, d, f, h, j, k, m, p
43.75%	a, d, f, h, k, m, p
37.5%	d, f, h, k, m, p
31.25%	d, f, h, k, m
25%	d, h, k, m
18.75%	d, h, k
12.5%	h, k
6.25% (a single atomic element, so the record is precise; this is effectively 0% imprecision)	k
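
The percentages in Table 1 follow from dividing the record length by the domain length. The short Python sketch below illustrates that arithmetic; the function name imprecision_percent and the variable names are hypothetical and do not come from the original experiments.

```python
import string

# Domain D{a, ..., p} from the example above (16 elements).
DOMAIN = list(string.ascii_lowercase[:16])


def imprecision_percent(record, domain):
    """Percentage of the domain covered by the record (hypothetical helper)."""
    return 100.0 * len(record) / len(domain)


# First and last rows of Table 1.
full_record = ["a", "c", "d", "f", "g", "h", "j", "k", "m", "n", "o", "p"]
print(imprecision_percent(full_record, DOMAIN))  # 75.0
print(imprecision_percent(["k"], DOMAIN))        # 6.25
```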

To preserve randomness at large scale, a total of 30000 records were generated for each percentage file. One random element is eliminated from a 75% imprecise record at each step until the record is reduced to a single atomic element. This is how the different percentages of imprecision are generated from a 75% imprecise record.

Hence, for each record in the 75% imprecise data file, there exist corresponding records in the other percentage files, each randomly shortened by a certain number of elements. That is, if the 75% imprecise file contains a record R{a, c, e, f, g, h, j, k, m, n, o, p}, then the 68.75% imprecise file contains a corresponding record that is a subset of R of length |R|-1; similarly, the 62.5% imprecise file contains a corresponding record that is again a subset of R, of length |R|-2. This process continues until the size of the record becomes 1.
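
The shortening procedure described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the element to remove is drawn uniformly at random, and the names random_record and shrink_chain are hypothetical; the original generator may have differed in these details.

```python
import random
import string


def random_record(domain, size, rng):
    """Pick `size` distinct random elements of the domain (assumed uniform)."""
    return sorted(rng.sample(domain, size))


def shrink_chain(record, rng):
    """Return [R, R minus 1 element, ..., a single element].

    Each entry is obtained from the previous one by deleting one randomly
    chosen element, so every entry is a subset of the one before it.
    """
    current = list(record)
    chain = [list(current)]
    while len(current) > 1:
        current.pop(rng.randrange(len(current)))  # drop one random element
        chain.append(list(current))
    return chain


rng = random.Random(0)  # fixed seed only to make the sketch repeatable
domain = list(string.ascii_lowercase[:16])
record = random_record(domain, 12, rng)  # a 75% imprecise record
for step in shrink_chain(record, rng):
    print(100.0 * len(step) / len(domain), step)
```

Entry i of the returned chain would go to the i-th percentage file, so that all files stay aligned record by record.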

Each of our test case data files contained 30000 records. All the data files were run using different types of hierarchies. The domain length of our test cases is 32. Hence, each record of the 75% imprecise data file contains 24 elements (75% of 32).

Data Files (total 24 files):

Imprecision	Data File
75 %	0.txt
71.875 %	1.txt
68.75 %	2.txt
65.625 %	3.txt
62.5 %	4.txt
59.375 %	5.txt
56.25 %	6.txt
53.125 %	7.txt
50 %	8.txt
46.875 %	9.txt
43.75 %	10.txt
40.625 %	11.txt
37.5 %	12.txt
34.375 %	13.txt
31.25 %	14.txt
28.125 %	15.txt
25 %	16.txt
21.875 %	17.txt
18.75 %	18.txt
15.625 %	19.txt
12.5 %	20.txt
9.375 %	21.txt
6.25 %	22.txt
3.125 %	23.txt
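
The file numbering in the list above is regular: file i.txt holds records with 24 - i elements out of a domain of 32. The Python sketch below reproduces the list under that reading; the function name file_info and the constants are hypothetical names introduced for illustration.

```python
DOMAIN_SIZE = 32   # domain length of the test cases
MAX_ELEMENTS = 24  # record length in the 75% imprecise file (0.txt)


def file_info(index):
    """Return (file name, record length, imprecision %) for a file index."""
    size = MAX_ELEMENTS - index
    percent = 100.0 * size / DOMAIN_SIZE
    return f"{index}.txt", size, percent


for i in range(MAX_ELEMENTS):
    name, size, percent = file_info(i)
    print(f"{percent} %\t{name}\t({size} elements)")
```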

Partition Tree Files (total 12 files):