Appendix: Construction of Data
Where the domain is D{a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p}, which has 16 elements, a 75% imprecise record contains 12 elements. An instance of a 75% imprecise record R can therefore be any subset of D with twelve random elements. Consider such a 75% imprecise record R{a, c, d, f, g, h, j, k, m, n, o, p}. We construct records with lower percentages of imprecision by randomly removing elements from R. This is illustrated in Table 1.
Table 1. Construction of data with different percentages of imprecision.
Percentage of Imprecision | Records
75% | a, c, d, f, g, h, j, k, m, n, o, p
68.75% | a, c, d, f, h, j, k, m, n, o, p
62.5% | a, c, d, f, h, j, k, m, n, p
56.25% | a, d, f, h, j, k, m, n, p
50% | a, d, f, h, j, k, m, p
43.75% | a, d, f, h, k, m, p
37.5% | d, f, h, k, m, p
31.25% | d, f, h, k, m
25% | d, h, k, m
18.75% | d, h, k
12.5% | h, k
6.25% (a record with a single element is precise, so this case effectively means 0% imprecision) | k
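The percentages in Table 1 are consistent with dividing the record size by the domain size (12 of 16 elements gives 75%). A minimal Python sketch of that calculation follows; the function name `imprecision_percent` is hypothetical, and the formula is inferred from the table rather than stated explicitly in the text.

```python
def imprecision_percent(record, domain):
    """Share of the domain covered by a record, as a percentage.

    Assumption: imprecision = |record| / |domain| * 100, inferred from Table 1.
    """
    record, domain = set(record), set(domain)
    if not record <= domain:
        raise ValueError("record must be a subset of the domain")
    return 100.0 * len(record) / len(domain)


DOMAIN = "abcdefghijklmnop"  # the 16-element domain D

print(imprecision_percent("acdfghjkmnop", DOMAIN))  # 75.0
print(imprecision_percent("k", DOMAIN))             # 6.25
```

Under this formula a single-element record still scores 6.25%, which is why the last row of Table 1 is treated as fully precise data.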
To preserve randomness at a large scale, a total of 30000 records were generated for each percentage file. One random element is eliminated from a 75% imprecise record at each step until the record is reduced to a single atomic element. This is how the different percentages of imprecision are generated from a 75% imprecise record.
Hence, for each record in the 75% imprecise data file, there exist corresponding records in the other percentage files that are randomly shortened by a certain number of elements at each percentage. That is, if the 75% imprecise file contains a record R{a, c, e, f, g, h, j, k, m, n, o, p}, then the 68.75% imprecise file contains a corresponding record that is a subset of R of length |R|-1. Similarly, the 62.5% imprecise file contains a corresponding record that is again a subset of R, of length |R|-2. This process continues until the size of the record becomes 1.
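The elimination process described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the generator actually used: the helper name `shrink_chain`, the fixed seed, and the choice to shrink each record from the previous level (as the rows of Table 1 do) are all assumptions.

```python
import random


def shrink_chain(record, rng):
    """Return [R, R minus 1 element, ..., single element].

    Each entry is obtained by dropping one random element from the
    previous entry, so every record is a subset of the one before it.
    """
    current = list(record)
    chain = [tuple(current)]
    while len(current) > 1:
        current.pop(rng.randrange(len(current)))  # drop one random element
        chain.append(tuple(current))
    return chain


rng = random.Random(0)  # fixed seed only to make the sketch reproducible
domain = "abcdefghijklmnop"
record = sorted(rng.sample(domain, 12))  # a random 75% imprecise record
chain = shrink_chain(record, rng)

# One line per percentage level, in the layout of Table 1.
for level in chain:
    pct = 100.0 * len(level) / len(domain)
    print(f"{pct:g}% | {', '.join(level)}")
```

Repeating this for each of the 30000 starting records and writing the k-th entry of every chain to the k-th percentage file would give files whose records correspond line by line.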
Each of our test case data files contained 30000 records. All the data files were run with different types of hierarchies. The domain size in our test cases is 32. Hence, each record of the 75% imprecise data file contains 24 elements.
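As a quick arithmetic check, a domain of 32 elements with records of 24 elements down to 1 element gives 24 imprecision levels in steps of 3.125%, which matches the 24 data files. The constant names in this Python sketch are illustrative.

```python
DOMAIN_SIZE = 32
MAX_ELEMENTS = 24  # 75% of 32

# One level per record size, from 24 elements down to 1.
levels = [100.0 * k / DOMAIN_SIZE for k in range(MAX_ELEMENTS, 0, -1)]

print(len(levels))  # 24
print(levels[:3])   # [75.0, 71.875, 68.75]
print(levels[-1])   # 3.125
```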
Data Files (total 24 files):
Imprecision (%) | Data File
75 |
71.875 |
68.75 |
65.625 |
62.5 |
59.375 |
56.25 |
53.125 |
50 |
46.875 |
43.75 |
40.625 |
37.5 |
34.375 |
31.25 |
28.125 |
25 |
21.875 |
18.75 |
15.625 |
12.5 |
9.375 |
6.25 |
3.125 |
Partition Tree Files (total 12 files):