The idea behind deferred update processing is to postpone updating the existing record and instead store incoming deltas as new records. Record update operations thus become simple write operations, with the corresponding performance. The deferred updates technique elaborated here fits well when a system handles a lot of updates to stored data and write performance is the main concern, while read speed requirements are not that strict. Any of the following cases, alone or in combination, may indicate that one can benefit from the technique (the list is not complete):
- updates are spread evenly across the whole of a large dataset
- a low share of “true updates” among write operations (i.e. most writes carry completely new data rather than updates of existing data)
- a good portion of the data stored or updated may never be accessed
- the system should be able to handle high write peaks without major performance degradation
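
To make the idea concrete, here is a minimal in-memory sketch in Python. It is not the linked project's API and does not talk to HBase: the class `DeferredUpdateStore`, its methods, and the `sum_counters` merge function are hypothetical names chosen for illustration, and a plain dictionary stands in for the table. It only shows the shape of the technique: writes append a delta without reading anything, and the merge is deferred until a read (or an optional compaction step, which is an assumption of this sketch).

```python
from collections import defaultdict
from typing import Callable, Dict, List


class DeferredUpdateStore:
    """Hypothetical in-memory stand-in for a table; not the HBase API."""

    def __init__(self, merge: Callable[[dict, dict], dict]):
        self._merge = merge
        # Each key maps to the list of delta records written so far, in order.
        self._records: Dict[str, List[dict]] = defaultdict(list)

    def write(self, key: str, delta: dict) -> None:
        # Append-only: the existing record is never read or modified here.
        self._records[key].append(delta)

    def read(self, key: str) -> dict:
        # The deferred work happens here: fold all stored deltas into one view.
        result: dict = {}
        for delta in self._records.get(key, []):
            result = self._merge(result, delta)
        return result

    def compact(self, key: str) -> None:
        # Optional step: replace several deltas with a single merged record.
        if len(self._records.get(key, [])) > 1:
            self._records[key] = [self.read(key)]

    def record_count(self, key: str) -> int:
        return len(self._records.get(key, []))


def sum_counters(current: dict, delta: dict) -> dict:
    """Example merge function: add numeric fields of the delta to the current view."""
    merged = dict(current)
    for field, amount in delta.items():
        merged[field] = merged.get(field, 0) + amount
    return merged


store = DeferredUpdateStore(sum_counters)
store.write("user42", {"clicks": 1})
store.write("user42", {"clicks": 2, "views": 5})
merged_view = store.read("user42")
```

In this sketch the cost of an update moves from the write path to the read path, which is why the approach suits the cases listed above: data that is never read is never merged, and a burst of writes costs no more than the same number of plain inserts.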
Sounds like CQRS event stores.
Project available on ☞ GitHub.
Original title and link: Deferring Processing Updates to Increase HBase Write Performance (NoSQL databases © myNoSQL)