- Provide high insertion rate
- Provide a good compression rate to store more data on expensive SSDs
- Engine should be SSD friendly (fewer writes per time period to help with SSD wear)
- Provide a reasonable response time (within ~50 ms) on SELECT queries on hot,
recently inserted data
Looking at these requirements, I actually think that TokuDB might be a good
fit for this task.
There are solutions in the NoSQL space that are optimized for this scenario, such as Cassandra or OpenTSDB. Admittedly, using one of these will have an impact on the application side.
Most of the time, when the requirements dictate looking into different solutions, the easiest things to estimate are the initial costs: development (nb: this includes not only pure development, but also learning costs, etc.) and hardware.
Unfortunately, we often fail to take the long-term costs into consideration:
- maintenance costs (hardware, operations, enhancements)
- opportunity costs (features that the current architecture won’t be able to support because they are either impossible or too expensive)
- risk costs of failed initial designs (the technical debt costs)
Way too often we optimize for the initial costs while almost completely ignoring the ongoing ones. The general excuse is that familiarity delivers faster; its more scientific-sounding forms are "time to market is essential" and "premature optimization is the root of all evil".
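To make the trade-off concrete, here is a rough back-of-the-envelope sketch of how the cost categories above could be added up over a time horizon. Everything in it is an assumption for illustration: the `Option` class, its field names, the simple additive model, and all of the numbers (arbitrary cost units) are hypothetical and not taken from any real project.

```python
from dataclasses import dataclass


@dataclass
class Option:
    """Hypothetical cost profile for one candidate architecture."""
    name: str
    initial: float               # one-off: development, learning, hardware
    maintenance_per_year: float  # hardware, operations, enhancements
    opportunity_per_year: float  # features that are impossible or too expensive
    redesign_cost: float         # cost of redoing a failed initial design
    redesign_probability: float  # guessed chance that a redesign is needed

    def total_cost(self, years: int) -> float:
        # Ongoing costs grow with the horizon; the initial cost does not.
        ongoing = (self.maintenance_per_year + self.opportunity_per_year) * years
        # Risk is modeled as an expected value (probability times cost).
        risk = self.redesign_cost * self.redesign_probability
        return self.initial + ongoing + risk


def cheapest(options: list[Option], years: int) -> str:
    """Return the name of the option with the lowest total cost."""
    return min(options, key=lambda o: o.total_cost(years)).name


# Made-up numbers: the familiar engine is cheaper to start with,
# the specialized store is cheaper to keep running.
familiar = Option("familiar engine", 50, 40, 20, 80, 0.5)
specialized = Option("specialized store", 120, 25, 5, 80, 0.25)

for years in (1, 3, 5):
    print(years, familiar.total_cost(years), specialized.total_cost(years),
          cheapest([familiar, specialized], years))
```

With these invented figures the familiar option wins over a one-year horizon and loses over three or five years. The real numbers will differ for every project; the point is only that the comparison changes once the ongoing costs are included.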
Original title and link: Considering TokuDB as an engine for timeseries data… or Cassandra or OpenTSDB (©myNoSQL)