Systems and methods are provided for training neural networks and other
systems with heterogeneous data. Heterogeneous data are partitioned into
a number of data categories. A user or system may then assign an
importance indication to each category, as well as an order value that
affects training times and their distribution (a higher order favors
larger categories and longer training times). Using these as input
parameters, the ordered training generates a distribution of training
iterations (across data categories) and a single training data stream so
that the distribution of data samples in the stream is identical to the
distribution of training iterations. Finally, the data stream is used to
train a recognition system (e.g., an electronic ink recognition system).
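
The flow described above can be illustrated with a minimal sketch. The abstract does not state how importance and order are combined, so the rule used here (iterations for a category = importance × category size raised to the order) is purely an assumption chosen so that a higher order favors larger categories and lengthens training. The function names `iteration_counts` and `build_stream` and the sample data are likewise hypothetical.

```python
import random
from collections import Counter
from itertools import cycle, islice


def iteration_counts(categories, importance, order):
    """Assumed rule: iterations = importance * (category size ** order)."""
    return {
        name: round(importance[name] * len(samples) ** order)
        for name, samples in categories.items()
    }


def build_stream(categories, counts, seed=0):
    """Build one training stream whose per-category sample counts
    equal the per-category iteration counts."""
    stream = []
    for name, samples in categories.items():
        # Repeat (cycle through) a category's samples until its quota is filled.
        stream.extend((name, s) for s in islice(cycle(samples), counts[name]))
    # Seeded shuffle so categories are interleaved reproducibly.
    random.Random(seed).shuffle(stream)
    return stream


# Hypothetical heterogeneous data: one large and one small category.
categories = {"large": list(range(100)), "small": list(range(4))}
importance = {"large": 1.0, "small": 1.0}

counts = iteration_counts(categories, importance, order=0.5)
stream = build_stream(categories, counts)
stream_counts = Counter(name for name, _ in stream)
```

Under this assumed rule, an order of 0.5 allots 10 iterations to the large category and 2 to the small one, while an order of 1 would allot 100 and 4, so raising the order both lengthens training and shifts it toward the larger category. Because the stream is built directly from the iteration counts, its per-category sample distribution matches the iteration distribution by construction.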