Add option to repeat the kernel on the device NUM times to increase benchmark accuracy