Embodiments of a method and apparatus for garbage collection using an
encoding based on a probabilistic summary data structure are described. In
one embodiment, a method comprises generating a probabilistic summary
data structure that represents active blocks of data within a storage
device based on identifications of the active blocks or the data within
the active blocks. The method also includes performing garbage collection
of at least a portion of the storage device based on the probabilistic
summary data structure.
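
The two steps above (summarizing the active blocks, then collecting garbage against that summary) can be illustrated with a minimal sketch. It assumes a Bloom filter as the probabilistic summary data structure and an in-memory dictionary as the storage device; the text names neither. The names `BloomFilter`, `collect_garbage`, `num_bits`, and `num_hashes` are hypothetical and chosen only for illustration.

```python
import hashlib


class BloomFilter:
    """Assumed probabilistic summary: it never reports an added key as absent,
    but it may occasionally report a key that was never added as present."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting SHA-256 with an index byte.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, key):
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(key))


def collect_garbage(storage, active_ids):
    """Return the blocks that survive collection.

    storage    -- dict mapping block identification (bytes) to block data
    active_ids -- iterable of identifications of the active blocks
    """
    # Step 1: generate the summary from the identifications of active blocks.
    summary = BloomFilter()
    for block_id in active_ids:
        summary.add(block_id)
    # Step 2: keep every block the summary reports as possibly active.
    return {bid: data for bid, data in storage.items() if bid in summary}
```

Under these assumptions an active block is never reclaimed, because the filter has no false negatives. A false positive only causes an inactive block to be retained until a later pass, so the summary trades some reclaimed space for a compact representation of the active set.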