Channel: AVX512BW: handle 64-bit mask in 32-bit code with bsf / tzcnt? - Stack Overflow

AVX512BW: handle 64-bit mask in 32-bit code with bsf / tzcnt?

This is my code for a 'strlen' function using AVX512BW:

vxorps          zmm0, zmm0, zmm0   ; ZMM0 = 0
vpcmpeqb        k0, zmm0, [ebx]    ; ebx points to the string, aligned to a 64-byte boundary
kortestq        k0, k0             ; 0x00 found ?
jnz             chk_0x00

For 'chk_0x00' on x86-64 there is no problem; we can handle it like this:

chk_0x00:
kmovq   rbx, k0
tzcnt   rbx, rbx
add     rax, rbx

Here we have a 64-bit register, so the whole mask fits in it. My question is about 32-bit x86, where there is no 64-bit general-purpose register, so I have to reserve 8 bytes of memory and check the two DWORDs of the mask one at a time. This is my current approach, and I want to know if there is a better one:

chk_0x00:
kmovd   ebx, k0           ; low DWORD of the mask -> ebx
test    ebx, ebx          ; 0x00 found in the low DWORD?
jz      .check_next_dword
bsf     ebx, ebx          ; index of the first 0x00 byte (0..31)
add     eax, ebx
jmp     .done
.check_next_dword:
add     eax, 32           ; no 0x00 in the low DWORD, so skip 32 bytes of length
sub     esp, 8            ; reserve 8 bytes on the stack
kmovq   [esp], k0         ; store the full 64-bit mask from k0
mov     ebx, [esp+4]      ; high DWORD of the mask -> ebx
bsf     ebx, ebx          ; nonzero here, because kortestq saw a set bit
add     eax, ebx
add     esp, 8            ; release the reserved stack space
.done:

In the 32-bit version, I use 'kmovd' to move the low DWORD of the mask into ebx, but I don't know how to get at the high DWORD. So I reserve 8 bytes on the stack, store the whole 8-byte mask there, load the high DWORD into ebx, and check it the same way. Is there a better solution? I suspect my way is not fast enough. Also, is it correct to use vxorps to zero a zmm register?
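
To make the intended arithmetic easy to check without AVX-512 hardware, here is a minimal C model of the two-DWORD scan described above. It is only a sketch: the names ctz32 and first_set_bit64 are hypothetical, the loop stands in for bsf / tzcnt, and lo and hi stand in for the two halves of k0.

```c
#include <assert.h>
#include <stdint.h>

/* Count trailing zeros of a nonzero 32-bit value.
   Stands in for bsf/tzcnt; hypothetical helper, not from the original post. */
static unsigned ctz32(uint32_t x) {
    unsigned n = 0;
    assert(x != 0);              /* like bsf, undefined for a zero input */
    while ((x & 1u) == 0) {
        x >>= 1;
        n++;
    }
    return n;
}

/* Index (0..63) of the lowest set bit of a 64-bit mask that is
   available only as two 32-bit halves. At least one half must be nonzero,
   which the kortestq / jnz pair guarantees in the assembly version. */
static unsigned first_set_bit64(uint32_t lo, uint32_t hi) {
    if (lo != 0) {
        return ctz32(lo);        /* first 0x00 byte is in bytes 0..31 */
    }
    return 32u + ctz32(hi);      /* otherwise it is in bytes 32..63 */
}
```

In the assembly, the high half might also be reachable without the stack by shifting the mask inside a mask register (for example kshiftrq k1, k0, 32 followed by kmovd ebx, k1). That is an untested suggestion, not something from the code above.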

