It looks like KSHIFTRQ could be used as an alternative: it right-shifts k0 so that its upper 32 bits move into the lower 32 bits, which can then be copied to a general-purpose register with KMOVD. For example:
.check_next_dword:
    add eax, 32
    kshiftrq k0, k0, 32    ; shift the high 32 bits down into the low 32 bits
    kmovd ebx, k0
    ...
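To make the data flow concrete, here is a rough, portable C model of what the snippet above computes. It is only a sketch: it treats k0 as a plain `uint64_t`, and it assumes the surrounding loop is looking for the first set bit of the mask, which the excerpt does not show. The helper name `first_set_bit` is made up for illustration.

```c
#include <stdint.h>

/* Hypothetical helper (not from the original code): models the 64-bit mask
   register k0 as a uint64_t and returns the index of its lowest set bit,
   or -1 if the mask is empty, reading the mask only 32 bits at a time. */
static int first_set_bit(uint64_t k0)
{
    int offset = 0;                      /* plays the role of eax */

    for (int half = 0; half < 2; ++half) {
        uint32_t low = (uint32_t)k0;     /* kmovd ebx, k0 */
        if (low != 0) {
            int bit = 0;
            while (((low >> bit) & 1u) == 0) {
                ++bit;                   /* what bsf/tzcnt would do in asm */
            }
            return offset + bit;
        }
        offset += 32;                    /* add eax, 32 */
        k0 >>= 32;                       /* kshiftrq k0, k0, 32 */
    }
    return -1;                           /* no bit set in either half */
}
```

For a mask with only bit 32 set, the first pass sees a zero low dword, the shift brings the high dword down, and the second pass reports index 32.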
And yes, vxorps zmm0, zmm0, zmm0 will set zmm0 to zero: according to the vxorps reference, it XORs the two source operands and, since no mask is specified, writes the full result to the destination (see also this SO question about zeroing a zmm register).