There is a data acquisition system that collects measurements from environmental sensors that measure the velocity of water flowing through a river or channel. Each measurement produces a fixed number of values (e.g. date, time, temperature, pressure) plus a list of velocity values.
Initially the sensors provided three velocity values, so I simply stored each value in its own column of a single table in a Firebird database. Later, sensors were introduced that can output up to nine velocity values, so I simply added six more columns. Even though most sensors use fewer than nine values, I figured it wouldn't be a problem if most of the columns just contained zeroes.
Now, however, I am facing a new generation of sensors that can output anywhere from 1 to 256 values, and I assume it won't be very efficient to add another 247 columns, especially since most of the measurements will still contain only 3 to 9 values.
Because the measurements are collected every 10 minutes, and the database holds all data for 30-50 sensors, the amount of data becomes quite significant after a few years, yet it must still be possible to generate overviews/graphs for any arbitrary time period.
So what would be the most efficient way to store a variable-length list of values?
Since each record has its own unique ID, I suppose I could just store all velocity values in a separate table, each value tagged with its record ID. I just have the feeling that this wouldn't be very efficient and that it would get slow after a while.
Databases can handle large amounts of data in a single table if you use efficient indexes. So you can use this table structure:
create table measurements (
    id integer,
    seq integer,        -- between 1 and 256
    ts timestamp,       -- Timestamp of the measurement
    value decimal(...)
)
Create an index on id, seq and one on ts. That will allow you to search the data efficiently. If you distrust your database, just insert a few million rows and run a couple of selects to see how well it fares.
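A quick way to try this is a throwaway script. The sketch below is only an illustration, not a Firebird benchmark: it assumes Python's built-in sqlite3 module as a stand-in database, stores ts as integer seconds to keep things simple, and the index names and the helpers build_db and timed are made up for the example. Point the same statements at your real database and raise the row count to get meaningful numbers.

```python
import sqlite3
import time

def build_db(n_measurements=20_000, values_per_measurement=5):
    """Create the narrow table in memory and fill it with synthetic rows."""
    con = sqlite3.connect(":memory:")  # assumption: SQLite stands in for Firebird
    con.execute(
        "CREATE TABLE measurements ("
        " id INTEGER, seq INTEGER, ts INTEGER, value REAL)"
    )
    rows = (
        (m, s, m * 600, 0.1 * s)  # synthetic: one measurement every 600 seconds
        for m in range(n_measurements)
        for s in range(1, values_per_measurement + 1)
    )
    con.executemany("INSERT INTO measurements VALUES (?, ?, ?, ?)", rows)
    # One composite index for lookups by record, one for time ranges.
    con.execute("CREATE INDEX idx_meas_id_seq ON measurements (id, seq)")
    con.execute("CREATE INDEX idx_meas_ts ON measurements (ts)")
    con.commit()
    return con

def timed(con, sql, params):
    """Run a query and return (rows, elapsed seconds)."""
    start = time.perf_counter()
    rows = con.execute(sql, params).fetchall()
    return rows, time.perf_counter() - start

con = build_db()
by_id, t_id = timed(
    con,
    "SELECT seq, value FROM measurements WHERE id = ? ORDER BY seq",
    (12_345,),
)
by_ts, t_ts = timed(
    con,
    "SELECT id, seq, value FROM measurements WHERE ts BETWEEN ? AND ?",
    (600_000, 606_000),
)
print(f"by id: {len(by_id)} rows in {t_id * 1000:.3f} ms")
print(f"by ts: {len(by_ts)} rows in {t_ts * 1000:.3f} ms")
```

Timings from an in-memory toy like this say little about a production server, so treat it as a template for the experiment rather than as evidence.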
For comparison: I have an Oracle database here with 112 million rows, and I can select a record by timestamp or ID within 120 ms (0.12 s).
You could store serialized data in a text field, for example by JSON-encoding the measurements as:
[<velocity-value-1>, <velocity-value-2>, ...]
Then, inside your code, deserialize the values after querying.
This should work well if you only filter your queries by the other fields, and not by the stored values. If you do filter by the values, using them in WHERE clauses will be a nightmare.
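Here is a minimal sketch of that round trip. It assumes Python with the built-in sqlite3 and json modules rather than Firebird; the column name velocities, the helpers save and load_between, and the sample readings are all invented for illustration.

```python
import json
import sqlite3

con = sqlite3.connect(":memory:")  # assumption: SQLite stands in for Firebird
con.execute(
    "CREATE TABLE measurements ("
    " id INTEGER PRIMARY KEY, ts TEXT, temperature REAL, velocities TEXT)"
)

def save(con, ts, temperature, velocities):
    """Store one measurement, JSON-encoding the velocity list into a text field."""
    cur = con.execute(
        "INSERT INTO measurements (ts, temperature, velocities) VALUES (?, ?, ?)",
        (ts, temperature, json.dumps(velocities)),
    )
    return cur.lastrowid

def load_between(con, start, end):
    """Filter by the ordinary ts column, then deserialize the lists in code."""
    rows = con.execute(
        "SELECT id, ts, velocities FROM measurements"
        " WHERE ts BETWEEN ? AND ? ORDER BY ts",
        (start, end),
    )
    return [(mid, ts, json.loads(blob)) for mid, ts, blob in rows]

# Made-up sample readings with different list lengths.
save(con, "2024-01-01T00:00:00", 4.5, [0.8, 1.1, 0.9])
save(con, "2024-01-01T00:10:00", 4.6, [0.7, 1.0, 1.2, 1.3, 0.9])

loaded = load_between(con, "2024-01-01T00:00:00", "2024-01-01T23:59:59")

# Filtering on the stored values cannot use a plain WHERE clause here,
# so it happens in application code after every candidate row is loaded.
fast = [mid for mid, _, values in loaded if max(values) > 1.2]
print(loaded)
print(fast)
```

The last step shows the trade-off: a filter on the velocities means fetching and decoding every row in the time range first.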
I'd go with a second table:
table measurements (Id, DateTime, Temperature, Pressure)
table velocity (Id, MeasurementId, Sequence, Value)
Velocity.Sequence is the index of the velocity value within that measurement (1-256).
Populate these tables with data as close to real-world as possible and test your SQL statements to find the best indexes.
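As a starting point for that kind of test, here is a rough sketch of the two tables with one plausible pair of indexes. It assumes Python's built-in sqlite3 module in place of Firebird; the column types, the index names, the helper add_measurement and the sample readings are my own guesses for illustration, so adapt them to your real schema.

```python
import sqlite3

con = sqlite3.connect(":memory:")  # assumption: SQLite stands in for Firebird
con.executescript("""
CREATE TABLE measurements (
    Id          INTEGER PRIMARY KEY,
    DateTime    TEXT,
    Temperature REAL,
    Pressure    REAL
);
CREATE TABLE velocity (
    Id            INTEGER PRIMARY KEY,
    MeasurementId INTEGER REFERENCES measurements (Id),
    Sequence      INTEGER,   -- 1-256, position within the measurement
    Value         REAL
);
-- Candidate indexes: time-range lookups and per-measurement lookups.
CREATE INDEX idx_measurements_datetime ON measurements (DateTime);
CREATE UNIQUE INDEX idx_velocity_meas_seq ON velocity (MeasurementId, Sequence);
""")

def add_measurement(con, when, temperature, pressure, velocities):
    """Insert one measurement row plus one velocity row per value."""
    cur = con.execute(
        "INSERT INTO measurements (DateTime, Temperature, Pressure)"
        " VALUES (?, ?, ?)",
        (when, temperature, pressure),
    )
    mid = cur.lastrowid
    con.executemany(
        "INSERT INTO velocity (MeasurementId, Sequence, Value) VALUES (?, ?, ?)",
        [(mid, seq, v) for seq, v in enumerate(velocities, start=1)],
    )
    return mid

# Made-up sample readings with different numbers of velocity values.
add_measurement(con, "2024-01-01 00:00:00", 4.5, 1013.0, [1.0, 2.0, 3.0])
add_measurement(con, "2024-01-01 00:10:00", 4.6, 1012.5, [2.0, 4.0])

# Overview for a time period: value count and mean velocity per measurement.
overview = con.execute(
    """
    SELECT m.DateTime, COUNT(v.Id), AVG(v.Value)
    FROM measurements m
    JOIN velocity v ON v.MeasurementId = m.Id
    WHERE m.DateTime BETWEEN ? AND ?
    GROUP BY m.Id, m.DateTime
    ORDER BY m.DateTime
    """,
    ("2024-01-01 00:00:00", "2024-01-01 23:59:59"),
).fetchall()

# Filtering on the values themselves is an ordinary WHERE clause here.
high = con.execute(
    "SELECT DISTINCT MeasurementId FROM velocity WHERE Value > ?", (3.5,)
).fetchall()
print(overview)
print(high)
```

Each measurement only takes as many velocity rows as it has values, so sensors with 3 values and sensors with 256 share the same schema without padding.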