[0/2,vect] Improve vectorization of epilogues

Message ID 46aba879-1e0b-13d9-0592-68972495a283@arm.com


Andre Vieira (lists) Aug. 23, 2019, 4:50 p.m.

In this patch series I address PR 88915 in the first patch by enabling
vectorization of epilogues when doing loop versioning, and in the second
patch I turn --param vect-epilogues-nomask=1 on by default.
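
For readers unfamiliar with the terms, here is a minimal sketch of the kind
of loop this is about. It is a hypothetical kernel, not code from the
patches or from any benchmark. Because dst and src are not restrict
qualified, the vectorizer may need to version the loop behind a runtime
alias check, and because n is unknown at compile time, the iterations left
over after the main vector loop are handled in an epilogue. Whether GCC
actually versions this loop or vectorizes its epilogue depends on the
target, the compiler version and the cost model.

```c
#include <stddef.h>

/* Hypothetical example kernel (the name add_u8 is an assumption made for
   illustration).  dst and src may alias as far as the compiler knows, so
   a vectorized copy of the loop would have to be guarded by a runtime
   overlap check (loop versioning).  The trip count n is not a known
   multiple of the vector length, so leftover iterations run in an
   epilogue loop, which is the part the series aims to vectorize too.  */
void add_u8(unsigned char *dst, const unsigned char *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (unsigned char)(dst[i] + src[i]);
}
```

Compiling something like this at -O3 with and without
--param vect-epilogues-nomask=1, and comparing the -fopt-info-vec output,
is one way to see whether the epilogue was vectorized on a given target.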

I benchmarked intrate SPEC2017 for both aarch64 and x86_64 (AVX512).

This patch gives aarch64 a 7% improvement for x264_r on SPEC2017, with
all other intrate benchmarks staying the same. On x86_64 with AVX512 I
do see a 3% drop on exchange2_r and a 1% drop on xz_r. Other benchmarks
either go up a little or stay the same. x264_r again shows the highest
gain, with a 16% improvement, and the intrate geomean goes up by 1%.

Andre Vieira (2):
[PATCH 1/2][vect]PR 88915: Vectorize epilogues when versioning loops
[PATCH 2/2][vect]Make vect-epilogues-nomask=1 default