Article by Timothy J. Callahan and John Wawrzynek, published in the Proceedings of the International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES) 2000.
Abstract: The Garp compiler and architecture have been developed in parallel, in part to investigate whether features of the architecture facilitate rapid, automatic compilation utilizing Garp's rapidly reconfigurable coprocessor. Previously reported work on compiling to Garp has drawn heavily on techniques from software compilation rather than high-level synthesis. That trend continues in this paper, which describes the extension of those techniques to support pipelined execution of loops on the coprocessor. Even though it targets hardware, our approach resembles VLIW software pipelining much more than it resembles hardware synthesis retiming algorithms.
This paper presents a simple, uniform scheme for pipelining the hardware execution of a broad class of loops. The loops can have multiple control paths, multiple exits (including exits resulting from hyperblock path exclusion), data-dependent exits, and arbitrary memory accesses.
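For illustration only, the following C function is a hypothetical loop of the kind described above; it is not taken from the paper, and the name `scan_sum` and its parameters are assumptions made for this sketch. It combines two control paths in the body, a counted exit, two data-dependent exits, and an indirect memory access. Exits caused by hyperblock path exclusion come from the compiler's choice of paths and are not visible at the source level, so they are not shown.

```c
#include <stddef.h>

/* Hypothetical example loop; not code from the paper. */
int scan_sum(const int *data, const int *index, size_t n, int limit)
{
    int sum = 0;
    for (size_t i = 0; i < n; i++) {   /* exit 1: trip count reached */
        int v = data[index[i]];        /* indirect, data-dependent load */
        if (v < 0)                     /* exit 2: data-dependent exit */
            break;
        if (v & 1)                     /* two control paths in the body */
            sum += v;
        else
            sum -= v / 2;
        if (sum > limit)               /* exit 3: data-dependent exit */
            break;
    }
    return sum;
}
```

A loop of this shape is hard to pipeline by hand because the decision to start iteration i+1 depends on values loaded and computed in iteration i, which is why a uniform scheme that covers such cases is of interest.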