While microprocessors have been the dominant devices for general-purpose computing over the last decade, there is still a large gap between the computational efficiency of microprocessors and that of custom silicon. Reconfigurable devices, such as FPGAs, have come closer to closing that gap, offering a 10x benefit in computational density over microprocessors, and often offering another potential 10x improvement in yielded functional density on fine-grained operations. On highly regular computations, reconfigurable architectures are clearly superior to traditional processor architectures. On tasks with high functional diversity, microprocessors use silicon more efficiently than reconfigurable devices. The BRASS project is developing a coupled architecture that allows a reconfigurable array and a processor core to cooperate efficiently on computational tasks, exploiting the strengths of both architectures.
We are developing an architecture and a prototype component that will combine a processor and a high-performance reconfigurable array on a single chip. The reconfigurable array extends the usefulness and efficiency of the processor by providing the means to tailor its circuits for special tasks. The processor improves the efficiency of the reconfigurable array for irregular, general-purpose computation.
We anticipate that a processor combined with reconfigurable resources can achieve a significant performance improvement over either a separate processor or a separate reconfigurable device on an interesting range of problems drawn from embedded computing applications. As such, we hope to demonstrate that this composite device is an ideal system element for embedded processing.
Reconfigurable devices have proven extremely efficient for certain types of processing tasks. The key to their cost/performance advantage is that conventional processors are often limited by instruction bandwidth and execution restrictions, or by an insufficient number or type of functional units. Reconfigurable logic exploits more program parallelism. By dedicating significantly less instruction memory per active computing element, reconfigurable devices achieve a 10x improvement in functional density over microprocessors. At the same time, this lower memory ratio allows reconfigurable devices to deploy active capacity at a finer-grained level, so they realize a higher yield of their raw capacity than conventional processors, sometimes by as much as 10x.
The high functional density characteristic of reconfigurable devices comes at the expense of the high functional diversity characteristic of microprocessors. Microprocessors have evolved to a highly optimized configuration with clear cost/performance advantages over reconfigurable arrays for a large set of tasks with high functional diversity. By combining a reconfigurable array with a processing core, we hope to achieve the best of both worlds.
While it is possible to combine a conventional processor with commercial reconfigurable devices at the circuit board level, integration radically changes the I/O costs and design point for both devices, resulting in a qualitatively different system. Notably, the lower on-chip communication costs allow efficient cooperation between the processor and array at a finer grain than is sensible with discrete designs.
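To make the granularity argument concrete, here is a minimal sketch of a break-even model for handing a kernel from the processor to the array. The model, the function names, and all cycle counts are illustrative assumptions, not measurements from the BRASS prototype: a kernel is worth offloading only if its time on the array plus the round-trip communication overhead is less than its time on the processor.

```python
def offload_pays_off(cpu_cycles, array_cycles, transfer_cycles):
    """True if array execution plus communication beats the processor alone.

    All three arguments are hypothetical cycle counts for one kernel.
    """
    return array_cycles + transfer_cycles < cpu_cycles


def min_profitable_kernel(speedup, transfer_cycles):
    """Break-even kernel size, measured in processor cycles.

    Assumes the array runs the kernel `speedup` times faster than the
    processor, so offload wins when
        cpu / speedup + transfer < cpu
    which rearranges to
        cpu > transfer * speedup / (speedup - 1).
    Kernels larger than the returned value are worth offloading.
    """
    if speedup <= 1:
        raise ValueError("the array must be faster than the processor")
    return transfer_cycles * speedup / (speedup - 1)


# Assumed overheads: 900 cycles across a board-level bus,
# 9 cycles for an on-chip coupling (illustrative values only).
board_level = min_profitable_kernel(speedup=10, transfer_cycles=900)
on_chip = min_profitable_kernel(speedup=10, transfer_cycles=9)
print(board_level, on_chip)
```

Under these assumed numbers, cutting the communication overhead by 100x lowers the smallest profitable kernel by the same factor, which is the sense in which integration permits cooperation at a finer grain.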
Integrating reconfigurable devices with processors is an active area of research for many groups around the world; see our summary of past and present efforts for an overview.