Roshi is a set CRDT store modeled as a stateless web server in front of N (clusters) x M (shards) redis servers. It has 3 operations: Insert, Select, and Delete
Each element is a {"key", "member", "score"}
- A
keyis the set name essentially. You select based onkey.keys can be merged on select. - A
memberis the unique name of a member of the set - A
scoreis a numeric value that has multiple uses.- If two elements with the same
keyandmemberare inserted, a select will only return the one with the highestscore - If an element is deleted, the
scoreof the delete must be at least as high as thescorein the database. - One can think of the
scoreas a timestamp, because that's SoundCloud's use for it.
- If two elements with the same
key = <raffle_id>member = <bundle_id>:<trigger_id>:<score_n>wherescore_n = 1,2,3,4for an entry with a score of 4score = <timestamp_of_action>EntryCount(raffle_id) = Count(Select(raffle_id))SelectRandom(raffle_id) = Nth(Select(raffle_id), Random() % EntryCount(raffle_id))
SelectRandom and EntryCount are fast, but memory usage increased by a factor of avg_entry_score over option B
key = <raffle_id>member = <bundle_id>:<trigger_id>score = <trigger_score>EntryCount(raffle_id) = Sum(Map(Select(raffle_id), :score))SelectRandom(raffle_id) = IterateUntil0(Select(raffle_id), Random() % EntryCount(raffle_id))
SelectRandom is still slow, but memory usage is less (by a factor of avg_entry_score
I like Option A, and then you could go back through and squash the extra entries after the fact, which would allow for the same memory usage as Option B.