A non-volatile memory and associated methods provide cached page copying
using a minimum number of data latches for each memory cell. Multi-bit data is
read in parallel from each memory cell of a group associated with a first
word line. The read data is organized into multiple data groups that are
shuttled out of the memory group by group, in a predetermined order, for
data processing. Any modified data is returned to update the respective
data group. The predetermined order is such that, as more of
the data groups are processed and available for programming, more of the
higher programmed states are decodable. Adaptive full-sequence
programming is performed concurrently with the processing. The
programming copies the read data to another group of memory cells
associated with a second word line, typically in a different erase block,
with the data preferably compensated for perturbative effects due to a word
line adjacent to the first word line.
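
The effect of the processing order on what can be programmed early may be illustrated with a minimal sketch. The abstract does not specify the bit-to-state code or the order of the data groups, so the 3-bit code (inverted binary, erased state reading as 111), the page names and the function names below are all assumptions made for illustration only. The sketch relies on the fact that programming can only raise a cell's threshold, so the lowest state consistent with the bits available so far is a safe interim target.

```python
# Hypothetical 3-bit code: state index -> (upper, middle, lower) bits.
# Inverted binary, so the erased state 0 reads as (1, 1, 1).
# This mapping is an assumption; the abstract does not define one.
CODE = {s: tuple(1 - ((s >> shift) & 1) for shift in (2, 1, 0))
        for s in range(8)}
PAGE_INDEX = {"upper": 0, "middle": 1, "lower": 2}


def candidate_states(known_bits):
    """Return the states consistent with the page bits known so far."""
    return [s for s, bits in CODE.items()
            if all(bits[PAGE_INDEX[page]] == bit
                   for page, bit in known_bits.items())]


def interim_target(known_bits):
    """Lowest consistent state: safe to program toward, because
    programming only ever raises a cell's threshold."""
    return min(candidate_states(known_bits))


def trace(target_state, order):
    """Interim target after each data group (page) becomes available."""
    known = {}
    targets = []
    for page in order:
        known[page] = CODE[target_state][PAGE_INDEX[page]]
        targets.append(interim_target(known))
    return targets


if __name__ == "__main__":
    for order in (("upper", "middle", "lower"),
                  ("lower", "middle", "upper")):
        print(order, [trace(s, order) for s in (0, 5, 7)])
```

Under these assumptions, a cell destined for state 7 can be driven toward states 4, 6 and 7 in turn when the upper page is returned first, whereas returning the lower page first leaves it at interim state 1 until the last page arrives. This is only one way to read "more of the higher programmed states are decodable"; the actual code and order are those of the full specification.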