Mergeable Summaries for Heavy Hitters

202503031501
Status: #idea
Tags: UtahDB

Mergeable Summaries for Heavy Hitters - Misra-Gries and SpaceSaving Algorithms

Misra-Gries Algorithm

Info

Dropping things as a batch improves runtime, which is why we prefer to remove all 0’s aggressively, instead of dropping as we need space.

Merging MG Summaries

Attention

We could alternatively sort it, and prune everything after the largest k elements

Tldr

When you merge multiple MG summaries, you are not worse-off in terms of error


References

  1. Paper