The Eighteenth International Conference on
Raleigh, North Carolina. September 12-16, 2009.
Polyhedral-Model Guided Loop-Nest Auto-Vectorization
Konrad Trifunovic, Ayal Zaks, Albert Cohen and Dorit Nuzman
Optimizing compilers strive to construct efficient executables by applying sequences of transformations. Additional transformations are constantly being devised, with various mutual interactions among them, thereby exacerbating the notoriously difficult phase-ordering problem: that of deciding which transformations to apply and in which order. Fortunately, new infrastructures such as the polyhedral compilation framework host a variety of transformations, facilitating the efficient exploration and configuration of multiple transformation sequences. Many powerful optimizations, however, remain external to the polyhedral framework, with potential mutual interactions that need to be considered.

In this paper we examine the interactions between loop transformations of the polyhedral compilation framework and subsequent vectorization optimizations targeting fine-grain SIMD data-level parallelism. Automatic vectorization involves many low-level, target-specific considerations and transformations, which currently exclude it from being part of the polyhedral framework.

In order to consider potential interactions among polyhedral loop transformations and vectorization, we first model the performance impact of the different loop transformations and vectorization strategies, and then show how this cost model can be integrated seamlessly into the polyhedral representation. This predictive modelling then facilitates efficient exploration and educated decision making on how to best apply various polyhedral loop transformations while considering the subsequent effects of different vectorization schemes. Our work demonstrates the feasibility and benefit of tuning the polyhedral model in the context of vectorization. Experimental results confirm that our model makes accurate predictions, providing speedups of over 2x on average over traditional innermost-loop vectorization on PowerPC970 and Cell-SPU SIMD platforms.