A method and apparatus for processing (transporting) data, such as in a data
warehouse system. In one embodiment, the data are received from a source and compared to
data in a lookup cache comprising a subset of data from a first data set (e.g.,
a dimension table). Instances of the data not present in the lookup cache (that is,
new data) are identified. Information corresponding to these instances is generated
(e.g., a unique identifier is associated with each of these instances), and the
first data set is updated accordingly. The lookup cache is then updated with the
new data and the unique identifiers. Accordingly, the information (data) in the
lookup cache and in the first data set remains synchronized. The lookup cache
therefore does not need to be rebuilt (e.g., to update a second data set such as a fact table).
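
The flow described above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the names process_batch, dimension_table, lookup_cache, and next_id are hypothetical, in-memory dictionaries stand in for the dimension table and the cache, and a simple counter stands in for whatever mechanism generates the unique identifiers. As in the description, a cache miss is treated as new data.

```python
from itertools import count


def process_batch(source_values, dimension_table, lookup_cache, next_id):
    """Resolve each source value to an identifier, adding new values as needed.

    dimension_table: dict mapping identifier -> value (the "first data set").
    lookup_cache:    dict mapping value -> identifier (a subset of the table).
    next_id:         iterator yielding unused identifiers (an assumption here).
    """
    resolved = []
    for value in source_values:
        if value not in lookup_cache:
            # Cache miss: treat the value as new data and generate an identifier.
            new_id = next(next_id)
            # Update the first data set (dimension table) ...
            dimension_table[new_id] = value
            # ... and update the cache in place, so it never has to be rebuilt.
            lookup_cache[value] = new_id
        resolved.append(lookup_cache[value])
    return resolved


# Hypothetical starting state: the cache holds only part of the table.
dimension_table = {1: "Alice", 2: "Bob"}
lookup_cache = {"Alice": 1}
next_id = count(3)

# Identifiers that could then be written to a second data set (e.g., a fact table).
fact_keys = process_batch(["Alice", "Carol", "Carol"],
                          dimension_table, lookup_cache, next_id)
```

Because the cache is updated at the same moment as the dimension table, the second occurrence of "Carol" is resolved from the cache without another table update, and the same cache can be reused when loading the fact table.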