Memory efficient array transposition via multi pass tiling

A schedule can be generated for physically transposing an array such that when the array is transferred from a first memory type to a second memory type, the number of block transfers performed is minimized. The array can be rearranged to ensure that most or all data elements in any block read into internal memory are used before that block's internal storage is reused. The algorithm can compute an offline schedule and then execute that schedule. The method can assemble the schedule during one or more passes with an algorithm. Scheduling passes can apply a permutation to a representation of the array's structure and then tile the representation to ensure efficient use of internal memory. Tiling may alter the permutation, so the algorithm can be reinvoked and run on the tiled representation. The algorithm can be run on successively retiled representations until no additional tiling is required. Optimizations for schedule generation and/or execution include: simplification of the data structure representing the array; alteration of the order in which tiles are accessed; inversion of permutations; processing non-canonical input; and the production of stream output. The method can also be modified for specialized architectures, including: block I/O architecture; fully associative caches; set-associative caches; and multi-level memory hierarchies.

Web www.patentalert.com

< Systems and methods for interacting with a user interface of a media player

> Adaptive web crawling using a statistical model

~ 00446