Datafly algorithm

[1][2] Anonymization is achieved by automatically generalizing, substituting, inserting, and removing information as appropriate without losing many of the details found within the data.

The Datafly algorithm has been criticized for trying to achieve anonymization by overgeneralization.

The algorithm selects the attribute with the greatest number of distinct values as the one to generalize first.
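The selection heuristic described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `attribute_to_generalize`, the representation of the table as a list of dictionaries, and the sample rows are assumptions made for the example and do not come from the original description.

```python
def attribute_to_generalize(rows, quasi_identifier):
    """Return the quasi-identifier attribute with the most distinct values.

    rows: list of dicts mapping attribute name -> value (assumed layout).
    quasi_identifier: list of attribute names to consider.
    """
    distinct_counts = {
        attr: len({row[attr] for row in rows}) for attr in quasi_identifier
    }
    # On a tie, max() keeps the first attribute in quasi_identifier order.
    return max(quasi_identifier, key=lambda attr: distinct_counts[attr])


# Hypothetical sample data, for illustration only.
rows = [
    {"zip": "02138", "birth_year": 1964, "sex": "F"},
    {"zip": "02139", "birth_year": 1964, "sex": "M"},
    {"zip": "02141", "birth_year": 1965, "sex": "F"},
]
chosen = attribute_to_generalize(rows, ["zip", "birth_year", "sex"])
```

Here `zip` has three distinct values while the other attributes have two each, so `zip` would be generalized first.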

[4] Input: Private Table PT; quasi-identifier QI = (A_1, ..., A_n); k-anonymity constraint k; domain generalization hierarchies DGH_{A_i}, where i = 1, ..., n, with accompanying functions f_{A_i}; and loss, which is a limit on the percentage of tuples that can be suppressed.

PT[id] is the set of unique identifiers or keys for each tuple.
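One way to read these inputs is as the parameters of a generalize-then-suppress loop. The following Python sketch is a simplified illustration under stated assumptions, not the published pseudocode: the table is a list of dictionaries, each hierarchy is given as a list of functions applied one level at a time, `loss` is treated as a fraction between 0 and 1, and the loop stops once the tuples in groups smaller than k number no more than `loss` times the table size. The function name `datafly` and the sample data are hypothetical.

```python
from collections import Counter


def datafly(rows, quasi_identifier, k, hierarchies, loss):
    """Simplified Datafly-style sketch (assumptions noted in the lead-in).

    rows: list of dicts (the private table PT).
    hierarchies: attr -> list of functions, one per generalization level.
    loss: assumed to be the fraction of tuples that may be suppressed.
    """
    table = [dict(row) for row in rows]  # work on a copy of PT
    level = {attr: 0 for attr in quasi_identifier}
    max_suppressed = loss * len(table)

    def key(row):
        return tuple(row[attr] for attr in quasi_identifier)

    while True:
        freq = Counter(key(row) for row in table)
        # Tuples whose quasi-identifier values occur fewer than k times.
        outliers = sum(count for count in freq.values() if count < k)
        if outliers <= max_suppressed:
            break
        candidates = [a for a in quasi_identifier
                      if level[a] < len(hierarchies[a])]
        if not candidates:
            break  # hierarchies exhausted; the rest is suppressed below
        # Generalize the attribute with the most distinct values.
        attr = max(candidates, key=lambda a: len({row[a] for row in table}))
        step = hierarchies[attr][level[attr]]
        for row in table:
            row[attr] = step(row[attr])
        level[attr] += 1

    # Suppress any tuples that still violate the k requirement.
    freq = Counter(key(row) for row in table)
    return [row for row in table if freq[key(row)] >= k]


# Hypothetical data and hierarchies, for illustration only.
rows = [
    {"zip": "02138", "birth_year": 1964},
    {"zip": "02139", "birth_year": 1964},
    {"zip": "02141", "birth_year": 1965},
    {"zip": "02142", "birth_year": 1965},
]
hierarchies = {
    "zip": [lambda z: z[:4] + "*", lambda z: z[:3] + "**"],
    "birth_year": [lambda y: f"{y // 10 * 10}s"],
}
result = datafly(rows, ["zip", "birth_year"], k=2,
                 hierarchies=hierarchies, loss=0.0)
```

In this example a single generalization of `zip` to four digits is enough to place every tuple in a group of size two, so no tuples are suppressed and `birth_year` is left unchanged.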