HPC Magazine February 2014 - An Introduction to Performance Programming - part I.
Listing 3: Assembly code of the saxpy function from Listing 1, modified to use mulpd (which performs two floating-point operations simultaneously).
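..___tag_value_saxpy.1:
        xorl      %eax, %eax                # i = 0
        movslq    %edi, %rdi                # sign-extend n to 64 bits
        testq     %rdi, %rdi
        jle       ..B1.5                    # nothing to do if n <= 0
..B1.3:
        movupd    (%rsi,%rax,8), %xmm1      # load two doubles from x (unaligned)
        mulpd     %xmm0, %xmm1              # multiply both lanes by xmm0 (a)
        addpd     (%rdx,%rax,8), %xmm1      # add two doubles from y
        movupd    %xmm1, (%rdx,%rax,8)      # store two doubles back to y
        incq      %rax                      # advance the index by one element
        cmpq      %rdi, %rax
        jl        ..B1.3                    # loop while i < n
..B1.5:
        ret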
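To make the two-lane pattern concrete, here is a minimal sketch in plain C of what a packed loop computes. It is an illustration only, not the article's Listing 1: the function names `saxpy_scalar` and `saxpy_pairs` and the signature `(int n, double a, const double *x, double *y)` are assumptions. The sketch emulates the two lanes of an xmm register with two scalar statements instead of using real SIMD instructions. Note that it advances the index by two per iteration, since each packed operation covers two elements, whereas the listing as printed advances it by one.

```c
#include <assert.h>

/* Scalar reference: y[i] = a * x[i] + y[i], one element per iteration. */
void saxpy_scalar(int n, double a, const double *x, double *y)
{
    assert(n <= 0 || (x && y));
    for (int i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

/* Hypothetical two-lane version: handles elements i and i+1 together,
 * mirroring the two double-precision lanes of an xmm register. */
void saxpy_pairs(int n, double a, const double *x, double *y)
{
    assert(n <= 0 || (x && y));
    int i = 0;
    for (; i + 1 < n; i += 2) {
        /* corresponds to movupd + mulpd: two x values, both scaled by a */
        double lo = a * x[i];
        double hi = a * x[i + 1];
        /* corresponds to addpd + movupd: add two y values, store both */
        y[i]     = lo + y[i];
        y[i + 1] = hi + y[i + 1];
    }
    /* scalar remainder when n is odd */
    if (i < n)
        y[i] = a * x[i] + y[i];
}
```

Calling both functions on copies of the same input should give identical results. The remainder step matters because a packed load on the last element of an odd-length array would read past its end.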