The subject disclosure pertains to a flexible framework for record
matching. The framework facilitates design of a record matching query,
or package, composed of a set of well-defined primitive operators
(e.g., relational, data cleaning, ...), which can ultimately be
executed to match records. To assist design of such packages, a learning
technique based on examples is provided. More specifically, a set of
matching and non-matching record pairs can be input and employed to
facilitate automatic package generation. A generated package can
subsequently be transformed manually and/or automatically into a
semantically equivalent form optimized for execution.
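
By way of illustration only, the idea of composing a package from primitive operators and deriving it from example pairs can be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the names normalize, similarity, learn_threshold, and make_package are hypothetical, the string similarity from Python's difflib stands in for whatever similarity operator a package would actually use, and the midpoint threshold is a deliberately simple substitute for the example-based learning technique.

```python
from difflib import SequenceMatcher


def normalize(record):
    """Hypothetical data-cleaning operator: lowercase and collapse whitespace."""
    return {k: " ".join(str(v).lower().split()) for k, v in record.items()}


def similarity(a, b, field):
    """Hypothetical similarity operator over one field (difflib as a stand-in)."""
    return SequenceMatcher(None, a[field], b[field]).ratio()


def learn_threshold(matching, non_matching, field):
    """Assumed learning step: midpoint between the lowest score among matching
    example pairs and the highest score among non-matching example pairs."""
    pos = [similarity(normalize(a), normalize(b), field) for a, b in matching]
    neg = [similarity(normalize(a), normalize(b), field) for a, b in non_matching]
    return (min(pos) + max(neg)) / 2


def make_package(field, threshold):
    """Compose the operators into an executable package: clean both inputs,
    then keep index pairs whose similarity meets the threshold."""
    def package(left, right):
        left_clean = [normalize(r) for r in left]
        right_clean = [normalize(r) for r in right]
        return [
            (i, j)
            for i, a in enumerate(left_clean)
            for j, b in enumerate(right_clean)
            if similarity(a, b, field) >= threshold
        ]
    return package


# Example pairs supplied as input (illustrative data only).
matching_pairs = [
    ({"name": "John  Smith"}, {"name": "john smith"}),
    ({"name": "Acme Corp"}, {"name": "ACME Corp."}),
]
non_matching_pairs = [
    ({"name": "John Smith"}, {"name": "Mary Jones"}),
]

threshold = learn_threshold(matching_pairs, non_matching_pairs, "name")
package = make_package("name", threshold)

# Execute the generated package on two small record sets.
matches = package(
    [{"name": "Jane Doe"}],
    [{"name": "jane  doe"}, {"name": "Robert Brown"}],
)
print(matches)
```

The nested loop compares every record pair and is chosen here only for clarity. A semantically equivalent form optimized for execution, as contemplated above, could reorder or replace these steps without changing which pairs are returned.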