Time Indexing Features
Time-indexing defines a set of operations on a data container and
stores all time-based data in its own well-defined formats.
Everything is presented as streams of time,
with data attached at particular times. The difference
between the time-indexing architecture presented here and
existing approaches to storing data in which time is a key element
is that time-indexing provides a
consistent and coherent framework for doing any time-based
selection and processing.
The time-indexes rely on the fact that time is well ordered,
and this ordering is maintained by an index.
In fact, they are time-ordered containers of data.
They give access to data at given timestamps, but
do nothing special with the data.
The indexes themselves are data agnostic.
This is an unusual property for a data container.
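The idea of a time-ordered, data-agnostic container can be sketched in a few lines. This is only an illustration, not the technology's actual interface: the class name `TimeIndex`, the methods `append` and `at_or_before`, and the use of plain integers as timestamps are all assumptions made for the example.

```python
import bisect
from typing import Any, List, Optional, Tuple


class TimeIndex:
    """Hypothetical minimal time-ordered container (illustration only)."""

    def __init__(self) -> None:
        self._times: List[int] = []  # timestamps, kept in ascending order
        self._data: List[Any] = []   # payloads, never inspected by the index

    def append(self, timestamp: int, payload: Any) -> None:
        # Ordering is maintained by refusing timestamps earlier than the last one.
        if self._times and timestamp < self._times[-1]:
            raise ValueError("timestamps must be non-decreasing")
        self._times.append(timestamp)
        self._data.append(payload)

    def at_or_before(self, timestamp: int) -> Optional[Tuple[int, Any]]:
        # Binary search is possible only because the timestamps are ordered.
        pos = bisect.bisect_right(self._times, timestamp)
        if pos == 0:
            return None
        return self._times[pos - 1], self._data[pos - 1]


# The index stores whatever it is given and does nothing special with it.
index = TimeIndex()
index.append(10, "some text")
index.append(20, b"\x00\x01")
index.append(30, {"price": 101.5})
found = index.at_or_before(25)
```

In this sketch the lookup at time 25 returns the entry stored at time 20, and the index never examines the payload to do so.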
From Nanoseconds to Millennia
Time-indexes have been presented in general terms, but this section
gives a more detailed discussion.
Time for most people is usually represented by a value on a clock or a watch,
and usually consists of an hour and a number of minutes. Sometimes people
consider the date with the year, the month, and the day to be correlated
with the time. In most instances, this is an adequate representation of time.
There are situations, however, where time has to be considered in a different
scope than the usual perception. Time may be viewed at a very small scale of
increments, resolving to microseconds or nanoseconds, or at a very large scale
resolving to millions of years.
The time values allowed in this technology encompass all of the
above scenarios. Therefore, a time-index can store data with timestamps
being seconds apart, or it can store data with timestamps
just microseconds apart. Alternatively, it can store data with timestamps that are
thousands of years apart. Such a wide range of timestamps gives this technology
a broad range of applicability where time is involved.
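One way to picture a timestamp that covers both extremes is an integer count of nanoseconds, as in the sketch below. The text does not say how the technology actually represents time, so this representation, the helper names `from_seconds` and `from_years`, and the use of a mean Gregorian year are assumptions for illustration only.

```python
NS_PER_SECOND = 1_000_000_000
SECONDS_PER_YEAR = 31_556_952  # mean Gregorian year of 365.2425 days


def from_seconds(seconds: int, nanos: int = 0) -> int:
    """Return a timestamp in nanoseconds (hypothetical representation)."""
    return seconds * NS_PER_SECOND + nanos


def from_years(years: int) -> int:
    """Return a span of whole mean years, expressed in nanoseconds."""
    return years * SECONDS_PER_YEAR * NS_PER_SECOND


# Timestamps microseconds apart, seconds apart, and a million years apart.
microsecond_gap = from_seconds(0, 1_000)
one_second_gap = from_seconds(1)
million_years = from_years(1_000_000)

# Python integers have arbitrary precision, so one number type covers all scales.
needs_more_than_64_bits = million_years > 2**63 - 1
```

A span of a million years counted in nanoseconds does not fit in a signed 64-bit integer, which is why the sketch relies on arbitrary-precision integers rather than a fixed-width type.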
By having such accuracy and spread in the times held in a time-index,
data can be presented back to the user at that level of accuracy.
This means that it becomes possible for an application writer to choose how to
present data to end-users. It can be done in bulk, like a report, or it
can be done in real-time.
That is, data is presented one item at a time, with the gap between
successive items equal to the gap between their timestamps.
Such real-time data
presenters are ideal in multi-media or simulation frameworks.
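A real-time presenter of this kind might look like the following sketch, which waits between items for as long as their timestamps are apart. The function name `replay`, the use of timestamps in seconds, and the injectable `sleep` parameter are assumptions for the example.

```python
import time
from typing import Any, Callable, Iterable, Iterator, List, Tuple


def replay(entries: Iterable[Tuple[float, Any]],
           sleep: Callable[[float], None] = time.sleep) -> Iterator[Any]:
    """Yield payloads one at a time, pausing by the gap between timestamps."""
    previous = None
    for timestamp, payload in entries:
        if previous is not None:
            sleep(timestamp - previous)  # wait as long as the timestamps are apart
        previous = timestamp
        yield payload


# Record the delays instead of really sleeping, to show what would happen.
delays: List[float] = []
entries = [(0.0, "frame-a"), (0.5, "frame-b"), (2.0, "frame-c")]
presented = list(replay(entries, sleep=delays.append))
```

With the default `sleep`, the same loop would present the three items over two seconds of wall-clock time. A bulk report would simply iterate over the entries without pausing.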
If an application needs the type of the data that is being
held in a time-index, this can be held as well. Whether it be
a number, a string, a text file, an image, an audio segment,
or a video frame, all of these can be stored.
The type held is not a feature of the whole index.
Instead, each
individual element in a time-index can hold the type of the data
held for that entry, even if the type varies from element to element.
Furthermore, the size of the
data for each element is not limited.
Each element can hold data that is a different size to any of the other elements.
Finally, the total number of
time and data values that can be held in each index is also unlimited,
meaning that the number of elements can run to
billions.
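The per-element type and size can be pictured with a small record like the one below. The field names and the type-tag strings are hypothetical; the text does not specify how the type is recorded.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Entry:
    """Hypothetical time-index element carrying its own type and data."""
    timestamp: int
    data_type: str  # assumed type tag, recorded per element, not per index
    data: bytes     # payload of any length


# Three elements in one index, each with a different type and size.
entries: List[Entry] = [
    Entry(1, "number", b"42"),
    Entry(2, "text/plain", "hello".encode("utf-8")),
    Entry(3, "image/png", bytes(2048)),  # stand-in bytes, not a real image
]

sizes = [len(entry.data) for entry in entries]
types = [entry.data_type for entry in entries]
```

Because the type tag travels with each element, an application can decide how to decode an entry without knowing anything about its neighbours.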
Basically, a time-index can hold a few items of data or can expand to be a massive
data set. The times held are resolved to any accuracy required.
Such attributes mean that it becomes possible to reconsider how
data is stored, what data is stored, and for how long data is stored.
Data Security and Data Integrity
Once data is in an index it cannot be changed;
it is immutable.
There are no operations to change
data held at a particular timestamp.
Data can only be appended to the end of a time-index.
There are also no operations to change timestamps; they too are immutable.
The lack of modify or update operations may seem like a major drawback,
but the converse applies. The advantage is that data is secure
because it can never be altered. Data integrity is also maintained,
because no part of the data can be removed.
Consider how today's computer systems usually replace
existing data with the latest version. The original data
is considered to be out-of-date, and its value is lost forever.
Both data security
and integrity are lost when the update occurs.
When using time-indexing, rather than changing a data value,
a new value is appended to the index at the time the change occurs.
With the
time-indexing technology presented here, every
version of the data can be saved.
Asking the time-index for the latest version of the data
gets the most up-to-date value, but
one can go back in time and find previous values.
No data is ever lost.
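The append-instead-of-update behaviour can be sketched as follows. The class `VersionedIndex` and its methods `append`, `latest`, and `value_at` are hypothetical names chosen for this example. The point is that the class deliberately offers no way to modify or delete an entry.

```python
import bisect
from typing import Any, List


class VersionedIndex:
    """Hypothetical append-only index with no update or delete method."""

    def __init__(self) -> None:
        self._times: List[int] = []
        self._values: List[Any] = []

    def append(self, timestamp: int, value: Any) -> None:
        # A change is recorded as a new entry; earlier entries stay untouched.
        if self._times and timestamp < self._times[-1]:
            raise ValueError("can only append at or after the last timestamp")
        self._times.append(timestamp)
        self._values.append(value)

    def latest(self) -> Any:
        # The most recently appended value is the most up-to-date one.
        return self._values[-1]

    def value_at(self, timestamp: int) -> Any:
        # Go back in time: the value that was current at the given timestamp.
        pos = bisect.bisect_right(self._times, timestamp)
        if pos == 0:
            raise KeyError("no value recorded at or before this time")
        return self._values[pos - 1]


address = VersionedIndex()
address.append(100, "1 Old Road")
address.append(200, "2 New Street")  # the "update" is simply a later entry

current = address.latest()
earlier = address.value_at(150)
```

Here the second call to `append` plays the role that an overwrite would play in a conventional system, yet the address that was valid at time 150 remains retrievable.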
Continuous vs Discrete
Once time-indexing has been added to an application, the way
the time-index is used will vary from application to application.
As the data in these applications can have different internal
structures, the time-indexing can be used differently.
The main variance in the way time-indexing is used for
data depends on whether that data is continuous, such as
multi-media streams, or discrete, such as stock prices or log file entries.
Continuous data is time sensitive within the data itself, whereas discrete data
is time sensitive for each piece of data as a whole.
An example of continuous data is audio data.
One audio file can be split into many
subcomponents (audio samples), with each audio sample having its own
time-index item. In this case, the single audio file will be divided
such that there are many time-index items for the one audio file.
An example of discrete data is a stock price.
A stock price system could hold each value
of the price, with each version being indexed within one time-index
item. In this case, there is one version of the stock value for
each time-index item.
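The two usage patterns can be contrasted in a short sketch. The function `index_audio`, the chunk size, the audio parameters, and the price values are all assumptions made for illustration and do not come from the technology itself.

```python
from typing import List, Tuple

NS_PER_SECOND = 1_000_000_000


def index_audio(samples: bytes, bytes_per_second: int,
                chunk_bytes: int) -> List[Tuple[int, bytes]]:
    """Continuous data: split one audio buffer into many timestamped items."""
    items = []
    for offset in range(0, len(samples), chunk_bytes):
        # The timestamp of each chunk follows from its position in the stream.
        timestamp_ns = offset * NS_PER_SECOND // bytes_per_second
        items.append((timestamp_ns, samples[offset:offset + chunk_bytes]))
    return items


# One second of silent 8 kHz, 8-bit mono audio becomes four index items.
audio = bytes(8000)
audio_items = index_audio(audio, bytes_per_second=8000, chunk_bytes=2000)

# Discrete data: each stock price is a complete value with its own item.
price_items = [
    (1 * NS_PER_SECOND, 101.5),
    (7 * NS_PER_SECOND, 101.7),
    (9 * NS_PER_SECOND, 100.9),
]
```

For the audio, the timing lives inside the stream and the chunks only make sense in sequence. For the prices, each item is meaningful on its own at the moment it was recorded.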
Sharing and Overlaying
By having a well-defined set of operations and a set of common formats,
the need for different formats to hold time-related data can be significantly
reduced.
Data which was in a file and originally intended for just one tool,
can be utilized
in a range of tools by using plug-in components such as data formatters.
The number of tools that can access the same data is increased.
Time-indexing is, therefore, a catalyst for a high degree of data sharing.
Data sharing not only means that different tools can utilize a
wider range of data sets, it also allows the same data set to
be used concurrently by a single tool. It also means that
the same data can be viewed by different tools without having
the intermediate step of converting and copying data from one
format to another to suit the tool.
Time-indexing not only promotes data sharing, it also promotes data overlaying,
where data sets are aggregated and visualized together in the same tool.
By overlaying previously unrelated data it becomes possible to observe and
expose new patterns and relationships in that data.
These patterns and relationships may not have been previously understood
as no formal connection had been made between them.
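Overlaying can be sketched as a merge of several time-ordered data sets into a single stream ordered by time. The function `overlay` and the two sample data sets are hypothetical, and each input is assumed to be already sorted by timestamp, as a time-index would be.

```python
import heapq
from typing import Any, Iterable, List, Tuple


def overlay(**indexes: Iterable[Tuple[int, Any]]) -> List[Tuple[int, str, Any]]:
    """Merge named, time-ordered data sets into one time-ordered stream."""
    tagged = (
        [(timestamp, name, value) for timestamp, value in entries]
        for name, entries in indexes.items()
    )
    # heapq.merge keeps the combined output ordered by the timestamp key.
    return list(heapq.merge(*tagged, key=lambda item: item[0]))


# Two previously unrelated data sets, viewed together on one time line.
combined = overlay(
    temperature=[(1, 20.5), (4, 21.0)],
    sales=[(2, 7), (3, 9)],
)
```

Each merged item keeps the name of the data set it came from, so a viewing tool could plot the two series against the same time axis.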
By converting existing applications to use time-indexing they
can be made to interact with data originally destined for other
applications.
New application areas
that have not traditionally held time data
can be augmented with time-indexing to increase
their functionality and effectiveness.
Summary
To summarize, time-indexing assumes that time is ordered, with times
ranging from millions of years down to nanoseconds. The indexes themselves
can hold any kind of data, with billions of separate data items being held.
The main properties of time-indexing
are time and data immutability
ensuring data security and data integrity,
data sharing, concurrent access, data overlaying,
plus the ability to hold both continuous and discrete data.