Computing On Many Cores - LIRMM, Laboratoire d'Informatique, de Robotique et de Microélectronique de Montpellier
Journal article: Concurrency and Computation: Practice and Experience, 2017

Computing On Many Cores


This paper presents a new method to parallelize programs, adapted to manycore processors. The method relies on parallelizing hardware and a new programming style. A manycore design is presented, built from a highly simplified new core microarchitecture with no branch predictor, no data memory and a three-stage pipeline. Cores are multithreaded, run out of order but not speculatively, and fork new threads. The new programming style is based on functions and avoids data structures. The hardware creates a concurrent thread at each function call. Loops are replaced by semantically equivalent divide-and-conquer functions. Instead of computing on data structures, we compute in parallel on scalars, favouring distribution and eliminating inter-thread communications. We illustrate our method on a sum reduction, a matrix multiplication and a sort. C implementations using no arrays are parallelized. From loop templates, a MapReduce model can be implemented and dynamically deployed by the hardware. We compare our method to pthread parallelization, showing that (i) our parallel execution is deterministic, (ii) thread management is cheap, (iii) parallelism is implicit and (iv) both functions and loops are parallelized. Implicit parallelism makes parallel code easy to write; deterministic parallel execution makes it easy to debug.
Main file: cc-pe.pdf (361.92 KB)
Origin: files produced by the author(s)

Dates and versions

lirmm-01302904, version 1 (15-04-2016)



Bernard Goossens, David Parello, Katarzyna Porada, Djallal Rahmoune. Computing On Many Cores. Concurrency and Computation: Practice and Experience, 2017, 29 (15), pp.e4120. ⟨10.1002/cpe.4120⟩. ⟨lirmm-01302904⟩


