The core idea behind NovoSpark's visualization technique is to present each
multidimensional observation as a single observation curve. With this
approach, if two data observations are close, their observation curves have
very similar shapes, whereas if the records differ, the curves look
significantly different as well.
Let us illustrate this with an example. Suppose we have two 10-dimensional
observations, A and B, as shown below.
|
A: {53.78, 1, 17.56, 2.54, 6.36, 0.16, 4.63, 8.1, 3.28, 1.9}
B: {50.53, 1.4, 19.05, 2.34, 5.95, 1.53, 3.63, 7.82, 2.98, 2.48}
|
The observation curves
below are the visual representations of observations A
and B.
|
|

Figure 1. Observation curve A
|

Figure 2. Observation curve B
|
|
Let us now put these two images together. As you may have noticed,
observation curves A and B look very similar, which means that the original
observations are very much alike too.
The closer the observation curves are to each other, the more similar the
original data observations are.
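The text does not state the formula NovoSpark uses to build an observation curve, so the following Python sketch substitutes a different, well-known technique, Andrews curves, purely to illustrate the same idea: each observation becomes a function of one variable, and nearby observations produce nearby curves. The function names and the sampling grid are assumptions made for this illustration.

```python
import math

A = [53.78, 1, 17.56, 2.54, 6.36, 0.16, 4.63, 8.1, 3.28, 1.9]
B = [50.53, 1.4, 19.05, 2.34, 5.95, 1.53, 3.63, 7.82, 2.98, 2.48]

def andrews_curve(obs, t):
    """Andrews curve value at t (a stand-in, not NovoSpark's formula):
    x1/sqrt(2) + x2*sin(t) + x3*cos(t) + x4*sin(2t) + x5*cos(2t) + ..."""
    total = obs[0] / math.sqrt(2)
    for i, x in enumerate(obs[1:], start=1):
        k = (i + 1) // 2  # harmonic number: 1, 1, 2, 2, 3, 3, ...
        total += x * (math.sin(k * t) if i % 2 == 1 else math.cos(k * t))
    return total

def max_gap(a, b, samples=200):
    """Largest vertical gap between two curves on a grid over [-pi, pi]."""
    grid = [-math.pi + 2 * math.pi * s / (samples - 1) for s in range(samples)]
    return max(abs(andrews_curve(a, t) - andrews_curve(b, t)) for t in grid)

gap_ab = max_gap(A, B)
```

Because each curve value in this stand-in mapping is a weighted sum of the data values, the gap between the two curves can never exceed the sum of the absolute differences between A and B, so similar records necessarily yield similar curves.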
|

Figure 3. Observation curves A and B
|
|
The approach establishes a one-to-one correspondence between data records
and observation curves.
The order of observation parameters, or data columns, affects the shapes
of observation curves. For instance, if we rearrange the first three
parameters in observations A and B (the third parameter moves to the
front), the shapes of the observation curves change too, as shown in
Figure 4.
A (new): {17.56, 53.78, 1, 2.54, 6.36, 0.16, 4.63, 8.1, 3.28, 1.9}
B (new): {19.05, 50.53, 1.4, 2.34, 5.95, 1.53, 3.63, 7.82, 2.98, 2.48}
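The rearranged records listed above can be reproduced by applying the same column permutation to both observations. This is a minimal Python sketch; `ORDER` and `reorder` are illustrative names, not part of any NovoSpark interface.

```python
A = [53.78, 1, 17.56, 2.54, 6.36, 0.16, 4.63, 8.1, 3.28, 1.9]
B = [50.53, 1.4, 19.05, 2.34, 5.95, 1.53, 3.63, 7.82, 2.98, 2.48]

# Hypothetical permutation: position i of the new record takes column
# ORDER[i] of the original record. Only the first three columns move.
ORDER = [2, 0, 1, 3, 4, 5, 6, 7, 8, 9]

def reorder(obs, order):
    """Return a copy of the observation with its columns rearranged."""
    return [obs[i] for i in order]

A_new = reorder(A, ORDER)
B_new = reorder(B, ORDER)
```

The data values are unchanged; only their positions differ, which is enough to change the curve shapes because each position contributes to the curve differently.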
|

Figure 4. Observation curves A and B with swapped parameters
|
|
An observation curve is a two-dimensional image of a multidimensional
data observation.
When the curves are rendered in three-dimensional space, the third
dimension, also called the "Z-dimension", represents either the distance in
multidimensional space or the time span between two observations, and many
interesting data properties become visible.
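When the Z-dimension stands for distance in multidimensional space, the spacing between two curves can be computed directly from the records. The sketch below assumes the Euclidean metric, which the text does not specify; `z_offset` is an illustrative name.

```python
import math

A = [53.78, 1, 17.56, 2.54, 6.36, 0.16, 4.63, 8.1, 3.28, 1.9]
B = [50.53, 1.4, 19.05, 2.34, 5.95, 1.53, 3.63, 7.82, 2.98, 2.48]

# Assumption: Euclidean distance between the two 10-dimensional records.
# math.dist requires Python 3.8 or later.
z_offset = math.dist(A, B)  # about 4.07 for these two records
```

For time-ordered data, the same role would be played by the time difference between two observations instead of their distance.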
|

Figure 5. Observation curves A and B shown in 3D space
|
|
A straight path between observation A and observation B in
multidimensional space can be represented as a surface
connecting the two observation curves. Any observation with intermediate
data values, like the one highlighted in red in Figure 6, lies on this
surface.
The image in Figure 7 below was obtained by connecting 38 individual
observation curves in an ordered dataset.
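An observation with intermediate data values on the straight path from A to B can be written as a linear interpolation between the two records. The following Python sketch illustrates this; the function name `intermediate` and the parameter `t` are assumptions introduced for the example.

```python
A = [53.78, 1, 17.56, 2.54, 6.36, 0.16, 4.63, 8.1, 3.28, 1.9]
B = [50.53, 1.4, 19.05, 2.34, 5.95, 1.53, 3.63, 7.82, 2.98, 2.48]

def intermediate(a, b, t):
    """Point on the straight segment from a (t = 0) to b (t = 1)."""
    return [x + t * (y - x) for x, y in zip(a, b)]

midpoint = intermediate(A, B, 0.5)  # halfway between A and B
```

Sweeping `t` from 0 to 1 traces the whole path; drawing the curve of each intermediate record side by side would give a surface of the kind described above.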
|

Figure 6. The shortest path between observations A and B
|
|

Figure 7. Dynamic process on a 3D view
|
|
Three-dimensional visualization space provides an opportunity to look at
the same image from different angles. For instance, the image below
represents a "left side" 2D projection view of the dataset above.

Figure 8. Dynamic process on a 2D view
|
|
Let us get back to observations A and B. If
we want to see the differences between them in more
detail, one technique that can be used for this purpose is
a transformation of the original data.
|
|
The original data can be normalized so that the data values lie within
the [0, 1] range. For instance, here is the result of applying
normalization to observations A and B.
The original data (the larger value in each column maps to 1 and the
smaller to 0):
A: {53.78, 1.0, 17.56, 2.54, 6.36, 0.16, 4.63, 8.10, 3.28, 1.90}
B: {50.53, 1.4, 19.05, 2.34, 5.95, 1.53, 3.63, 7.82, 2.98, 2.48}
The normalized data:
A: {1, 0, 0, 1, 1, 0, 1, 1, 1, 0}
B: {0, 1, 1, 0, 0, 1, 0, 0, 0, 1}
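The normalized values above are consistent with per-column min-max scaling, although the text does not name the exact formula. The Python sketch below assumes that scheme; `normalize_columns` is an illustrative name.

```python
A = [53.78, 1.0, 17.56, 2.54, 6.36, 0.16, 4.63, 8.10, 3.28, 1.90]
B = [50.53, 1.4, 19.05, 2.34, 5.95, 1.53, 3.63, 7.82, 2.98, 2.48]

def normalize_columns(rows):
    """Min-max scale each column of a list of records into [0, 1].

    Assumption: (value - column_min) / (column_max - column_min).
    A constant column is mapped to 0.0 to avoid division by zero.
    """
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    spans = [max(c) - min(c) for c in columns]
    return [
        [(v - lo) / span if span else 0.0
         for v, lo, span in zip(row, lows, spans)]
        for row in rows
    ]

A_norm, B_norm = normalize_columns([A, B])
```

With only two observations, every column contains just its own minimum and maximum, so each normalized value is exactly 0 or 1, which exaggerates the differences between A and B. A larger dataset would produce values spread across the whole range.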
|

Figure 9. Observation curves A and B after normalization
|
|
In cases where different
observation parameters are measured in different units or scales, the
shapes of observation curves may be obscured by the nibs appearing
on both sides of the curves. To eliminate them and enhance the
general appearance of the curves, a filter can be applied, as shown in
Figure 10.
|

Figure 10. Observation curves A and B after normalization and filter
|
|
To see the differences
between observation curves in even greater detail, a color palette can
be used to identify the curves' levels by the colors they are rendered
with in the image. If you imagine stretching the curves along the Z-axis and
looking at the resulting color bars from the top, you obtain a spectrum
view of each observation.
The images below show the
spectrum bars of observations A and B. You can see that
they look very similar, which indicates that the original data
observations are very similar as well.
|
|

Figure 11. Observation curve A and its spectrum bar
|

Figure 12. Observation curve B and its spectrum bar