This test was run on 20/12/2024 at 16:00:51.

Results for a cyclic layout with add C

benchmark | layout | add C | M | N | K | plots | cycles | ideal | utilization |
---|---|---|---|---|---|---|---|---|---|
dense_gemm_cyclic_32x32x32 | cyclic | yes | 32 | 32 | 32 | yes | 117 | 80 | 0.6837606837606838 |
dense_gemm_cyclic_32x32x48 | cyclic | yes | 32 | 32 | 48 | yes | 145 | 112 | 0.7724137931034483 |
dense_gemm_cyclic_32x32x64 | cyclic | yes | 32 | 32 | 64 | no | 192 | 144 | 0.75 |
dense_gemm_cyclic_32x48x32 | cyclic | yes | 32 | 48 | 32 | yes | 173 | 120 | 0.6936416184971098 |
dense_gemm_cyclic_32x48x48 | cyclic | yes | 32 | 48 | 48 | yes | 211 | 168 | 0.7962085308056872 |
dense_gemm_cyclic_32x48x64 | cyclic | yes | 32 | 48 | 64 | no | 288 | 216 | 0.75 |
dense_gemm_cyclic_32x64x32 | cyclic | yes | 32 | 64 | 32 | no | 229 | 160 | 0.6986899563318777 |
dense_gemm_cyclic_32x64x48 | cyclic | yes | 32 | 64 | 48 | no | 277 | 224 | 0.8086642599277978 |
dense_gemm_cyclic_32x64x64 | cyclic | yes | 32 | 64 | 64 | no | 384 | 288 | 0.75 |
dense_gemm_cyclic_48x32x32 | cyclic | yes | 48 | 32 | 32 | yes | 173 | 120 | 0.6936416184971098 |
dense_gemm_cyclic_48x32x48 | cyclic | yes | 48 | 32 | 48 | yes | 213 | 168 | 0.7887323943661971 |
dense_gemm_cyclic_48x32x64 | cyclic | yes | 48 | 32 | 64 | no | 288 | 216 | 0.75 |
dense_gemm_cyclic_48x48x32 | cyclic | yes | 48 | 48 | 32 | yes | 257 | 180 | 0.7003891050583657 |
dense_gemm_cyclic_48x48x48 | cyclic | yes | 48 | 48 | 48 | yes | 312 | 252 | 0.8076923076923077 |
dense_gemm_cyclic_48x48x64 | cyclic | yes | 48 | 48 | 64 | no | 432 | 324 | 0.75 |
dense_gemm_cyclic_48x64x32 | cyclic | yes | 48 | 64 | 32 | no | 341 | 240 | 0.7038123167155426 |
dense_gemm_cyclic_48x64x48 | cyclic | yes | 48 | 64 | 48 | no | 411 | 336 | 0.8175182481751825 |
dense_gemm_cyclic_48x64x64 | cyclic | yes | 48 | 64 | 64 | no | 576 | 432 | 0.75 |
dense_gemm_cyclic_64x32x32 | cyclic | yes | 64 | 32 | 32 | no | 229 | 160 | 0.6986899563318777 |
dense_gemm_cyclic_64x32x48 | cyclic | yes | 64 | 32 | 48 | no | 281 | 224 | 0.797153024911032 |
dense_gemm_cyclic_64x32x64 | cyclic | yes | 64 | 32 | 64 | no | 384 | 288 | 0.75 |
dense_gemm_cyclic_64x48x32 | cyclic | yes | 64 | 48 | 32 | no | 341 | 240 | 0.7038123167155426 |
dense_gemm_cyclic_64x48x48 | cyclic | yes | 64 | 48 | 48 | no | 413 | 336 | 0.8135593220338984 |
dense_gemm_cyclic_64x48x64 | cyclic | yes | 64 | 48 | 64 | no | 576 | 432 | 0.75 |
dense_gemm_cyclic_64x64x32 | cyclic | yes | 64 | 64 | 32 | no | 453 | 320 | 0.7064017660044151 |
dense_gemm_cyclic_64x64x48 | cyclic | yes | 64 | 64 | 48 | no | 545 | 448 | 0.8220183486238533 |
dense_gemm_cyclic_64x64x64 | cyclic | yes | 64 | 64 | 64 | no | 768 | 576 | 0.75 |
average | | | | | | | | | 0.7502518358352567 |

Results for a cyclic layout without add C

benchmark | layout | add C | M | N | K | plots | cycles | ideal | utilization |
---|---|---|---|---|---|---|---|---|---|
dense_matmul_cyclic_32x32x32 | cyclic | no | 32 | 32 | 32 | yes | 95 | 80 | 0.8421052631578947 |
dense_matmul_cyclic_32x32x48 | cyclic | no | 32 | 32 | 48 | yes | 130 | 112 | 0.8615384615384616 |
dense_matmul_cyclic_32x32x64 | cyclic | no | 32 | 32 | 64 | no | 164 | 144 | 0.8780487804878049 |
dense_matmul_cyclic_32x48x32 | cyclic | no | 32 | 48 | 32 | yes | 139 | 120 | 0.8633093525179856 |
dense_matmul_cyclic_32x48x48 | cyclic | no | 32 | 48 | 48 | yes | 190 | 168 | 0.8842105263157894 |
dense_matmul_cyclic_32x48x64 | cyclic | no | 32 | 48 | 64 | no | 244 | 216 | 0.8852459016393442 |
dense_matmul_cyclic_32x64x32 | cyclic | no | 32 | 64 | 32 | no | 183 | 160 | 0.8743169398907104 |
dense_matmul_cyclic_32x64x48 | cyclic | no | 32 | 64 | 48 | no | 250 | 224 | 0.896 |
dense_matmul_cyclic_32x64x64 | cyclic | no | 32 | 64 | 64 | no | 324 | 288 | 0.8888888888888888 |
dense_matmul_cyclic_48x32x32 | cyclic | no | 48 | 32 | 32 | yes | 139 | 120 | 0.8633093525179856 |
dense_matmul_cyclic_48x32x48 | cyclic | no | 48 | 32 | 48 | yes | 190 | 168 | 0.8842105263157894 |
dense_matmul_cyclic_48x32x64 | cyclic | no | 48 | 32 | 64 | no | 244 | 216 | 0.8852459016393442 |
dense_matmul_cyclic_48x48x32 | cyclic | no | 48 | 48 | 32 | yes | 205 | 180 | 0.8780487804878049 |
dense_matmul_cyclic_48x48x48 | cyclic | no | 48 | 48 | 48 | yes | 280 | 252 | 0.9 |
dense_matmul_cyclic_48x48x64 | cyclic | no | 48 | 48 | 64 | no | 364 | 324 | 0.8901098901098901 |
dense_matmul_cyclic_48x64x32 | cyclic | no | 48 | 64 | 32 | no | 271 | 240 | 0.8856088560885609 |
dense_matmul_cyclic_48x64x48 | cyclic | no | 48 | 64 | 48 | no | 370 | 336 | 0.9081081081081082 |
dense_matmul_cyclic_48x64x64 | cyclic | no | 48 | 64 | 64 | no | 484 | 432 | 0.8925619834710744 |
dense_matmul_cyclic_64x32x32 | cyclic | no | 64 | 32 | 32 | no | 183 | 160 | 0.8743169398907104 |
dense_matmul_cyclic_64x32x48 | cyclic | no | 64 | 32 | 48 | no | 250 | 224 | 0.896 |
dense_matmul_cyclic_64x32x64 | cyclic | no | 64 | 32 | 64 | no | 324 | 288 | 0.8888888888888888 |
dense_matmul_cyclic_64x48x32 | cyclic | no | 64 | 48 | 32 | no | 271 | 240 | 0.8856088560885609 |
dense_matmul_cyclic_64x48x48 | cyclic | no | 64 | 48 | 48 | no | 370 | 336 | 0.9081081081081082 |
dense_matmul_cyclic_64x48x64 | cyclic | no | 64 | 48 | 64 | no | 484 | 432 | 0.8925619834710744 |
dense_matmul_cyclic_64x64x32 | cyclic | no | 64 | 64 | 32 | no | 359 | 320 | 0.8913649025069638 |
dense_matmul_cyclic_64x64x48 | cyclic | no | 64 | 64 | 48 | no | 490 | 448 | 0.9142857142857143 |
dense_matmul_cyclic_64x64x64 | cyclic | no | 64 | 64 | 64 | no | 644 | 576 | 0.8944099378881988 |
average | | | | | | | | | 0.8854226979371725 |

Results for a banked layout with add C

benchmark | layout | add C | M | N | K | plots | cycles | ideal | utilization |
---|---|---|---|---|---|---|---|---|---|
dense_gemm_banked_32x32x32 | banked | yes | 32 | 32 | 32 | yes | 105 | 80 | 0.7619047619047619 |
dense_gemm_banked_32x32x48 | banked | yes | 32 | 32 | 48 | yes | 138 | 112 | 0.8115942028985508 |
dense_gemm_banked_32x32x64 | banked | yes | 32 | 32 | 64 | no | 170 | 144 | 0.8470588235294118 |
dense_gemm_banked_32x48x32 | banked | yes | 32 | 48 | 32 | yes | 154 | 120 | 0.7792207792207793 |
dense_gemm_banked_32x48x48 | banked | yes | 32 | 48 | 48 | yes | 202 | 168 | 0.8316831683168316 |
dense_gemm_banked_32x48x64 | banked | yes | 32 | 48 | 64 | no | 250 | 216 | 0.864 |
dense_gemm_banked_32x64x32 | banked | yes | 32 | 64 | 32 | no | 202 | 160 | 0.7920792079207921 |
dense_gemm_banked_32x64x48 | banked | yes | 32 | 64 | 48 | no | 266 | 224 | 0.8421052631578947 |
dense_gemm_banked_32x64x64 | banked | yes | 32 | 64 | 64 | no | 330 | 288 | 0.8727272727272727 |
dense_gemm_banked_48x32x32 | banked | yes | 48 | 32 | 32 | yes | 154 | 120 | 0.7792207792207793 |
dense_gemm_banked_48x32x48 | banked | yes | 48 | 32 | 48 | yes | 202 | 168 | 0.8316831683168316 |
dense_gemm_banked_48x32x64 | banked | yes | 48 | 32 | 64 | no | 250 | 216 | 0.864 |
dense_gemm_banked_48x48x32 | banked | yes | 48 | 48 | 32 | yes | 226 | 180 | 0.7964601769911505 |
dense_gemm_banked_48x48x48 | banked | yes | 48 | 48 | 48 | yes | 298 | 252 | 0.8456375838926175 |
dense_gemm_banked_48x48x64 | banked | yes | 48 | 48 | 64 | no | 370 | 324 | 0.8756756756756757 |
dense_gemm_banked_48x64x32 | banked | yes | 48 | 64 | 32 | no | 298 | 240 | 0.8053691275167785 |
dense_gemm_banked_48x64x48 | banked | yes | 48 | 64 | 48 | no | 394 | 336 | 0.8527918781725888 |
dense_gemm_banked_48x64x64 | banked | yes | 48 | 64 | 64 | no | 490 | 432 | 0.8816326530612245 |
dense_gemm_banked_64x32x32 | banked | yes | 64 | 32 | 32 | no | 202 | 160 | 0.7920792079207921 |
dense_gemm_banked_64x32x48 | banked | yes | 64 | 32 | 48 | no | 266 | 224 | 0.8421052631578947 |
dense_gemm_banked_64x32x64 | banked | yes | 64 | 32 | 64 | no | 330 | 288 | 0.8727272727272727 |
dense_gemm_banked_64x48x32 | banked | yes | 64 | 48 | 32 | no | 298 | 240 | 0.8053691275167785 |
dense_gemm_banked_64x48x48 | banked | yes | 64 | 48 | 48 | no | 394 | 336 | 0.8527918781725888 |
dense_gemm_banked_64x48x64 | banked | yes | 64 | 48 | 64 | no | 490 | 432 | 0.8816326530612245 |
dense_gemm_banked_64x64x32 | banked | yes | 64 | 64 | 32 | no | 393 | 320 | 0.8142493638676844 |
dense_gemm_banked_64x64x48 | banked | yes | 64 | 64 | 48 | no | 522 | 448 | 0.8582375478927203 |
dense_gemm_banked_64x64x64 | banked | yes | 64 | 64 | 64 | no | 650 | 576 | 0.8861538461538462 |
average | | | | | | | | | 0.8348218771479536 |

Results for a banked layout without add C

benchmark | layout | add C | M | N | K | plots | cycles | ideal | utilization |
---|---|---|---|---|---|---|---|---|---|
dense_matmul_banked_32x32x32 | banked | no | 32 | 32 | 32 | yes | 90 | 80 | 0.8888888888888888 |
dense_matmul_banked_32x32x48 | banked | no | 32 | 32 | 48 | yes | 122 | 112 | 0.9180327868852459 |
dense_matmul_banked_32x32x64 | banked | no | 32 | 32 | 64 | no | 154 | 144 | 0.935064935064935 |
dense_matmul_banked_32x48x32 | banked | no | 32 | 48 | 32 | yes | 130 | 120 | 0.9230769230769231 |
dense_matmul_banked_32x48x48 | banked | no | 32 | 48 | 48 | yes | 178 | 168 | 0.9438202247191011 |
dense_matmul_banked_32x48x64 | banked | no | 32 | 48 | 64 | no | 226 | 216 | 0.9557522123893806 |
dense_matmul_banked_32x64x32 | banked | no | 32 | 64 | 32 | no | 170 | 160 | 0.9411764705882353 |
dense_matmul_banked_32x64x48 | banked | no | 32 | 64 | 48 | no | 234 | 224 | 0.9572649572649573 |
dense_matmul_banked_32x64x64 | banked | no | 32 | 64 | 64 | no | 298 | 288 | 0.9664429530201343 |
dense_matmul_banked_48x32x32 | banked | no | 48 | 32 | 32 | yes | 130 | 120 | 0.9230769230769231 |
dense_matmul_banked_48x32x48 | banked | no | 48 | 32 | 48 | yes | 178 | 168 | 0.9438202247191011 |
dense_matmul_banked_48x32x64 | banked | no | 48 | 32 | 64 | no | 226 | 216 | 0.9557522123893806 |
dense_matmul_banked_48x48x32 | banked | no | 48 | 48 | 32 | yes | 190 | 180 | 0.9473684210526315 |
dense_matmul_banked_48x48x48 | banked | no | 48 | 48 | 48 | yes | 262 | 252 | 0.9618320610687023 |
dense_matmul_banked_48x48x64 | banked | no | 48 | 48 | 64 | no | 334 | 324 | 0.9700598802395209 |
dense_matmul_banked_48x64x32 | banked | no | 48 | 64 | 32 | no | 250 | 240 | 0.96 |
dense_matmul_banked_48x64x48 | banked | no | 48 | 64 | 48 | no | 346 | 336 | 0.9710982658959537 |
dense_matmul_banked_48x64x64 | banked | no | 48 | 64 | 64 | no | 442 | 432 | 0.9773755656108597 |
dense_matmul_banked_64x32x32 | banked | no | 64 | 32 | 32 | no | 170 | 160 | 0.9411764705882353 |
dense_matmul_banked_64x32x48 | banked | no | 64 | 32 | 48 | no | 234 | 224 | 0.9572649572649573 |
dense_matmul_banked_64x32x64 | banked | no | 64 | 32 | 64 | no | 298 | 288 | 0.9664429530201343 |
dense_matmul_banked_64x48x32 | banked | no | 64 | 48 | 32 | no | 250 | 240 | 0.96 |
dense_matmul_banked_64x48x48 | banked | no | 64 | 48 | 48 | no | 346 | 336 | 0.9710982658959537 |
dense_matmul_banked_64x48x64 | banked | no | 64 | 48 | 64 | no | 442 | 432 | 0.9773755656108597 |
dense_matmul_banked_64x64x32 | banked | no | 64 | 64 | 32 | no | 330 | 320 | 0.9696969696969697 |
dense_matmul_banked_64x64x48 | banked | no | 64 | 64 | 48 | no | 458 | 448 | 0.9781659388646288 |
dense_matmul_banked_64x64x64 | banked | no | 64 | 64 | 64 | no | 586 | 576 | 0.9829351535836177 |
average | | | | | | | | | 0.9534837103880086 |
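
The utilization values in the tables above are consistent with `ideal / cycles` (for example, 80 / 90 ≈ 0.8889 for `dense_matmul_banked_32x32x32`), and each average row appears to be the arithmetic mean of that column. The following is a minimal sketch under those two assumptions. The function names and the choice of three sample rows are illustrative only and are not taken from the benchmark harness.

```python
from statistics import mean

# (benchmark, cycles, ideal) copied from three rows of the banked table without add C.
ROWS = [
    ("dense_matmul_banked_32x32x32", 90, 80),
    ("dense_matmul_banked_32x32x48", 122, 112),
    ("dense_matmul_banked_32x32x64", 154, 144),
]


def utilization(cycles: int, ideal: int) -> float:
    """Assumed definition: ideal cycle count divided by measured cycle count."""
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    return ideal / cycles


def average_utilization(rows) -> float:
    """Assumed definition: arithmetic mean of the per-benchmark utilizations."""
    return mean(utilization(cycles, ideal) for _, cycles, ideal in rows)


if __name__ == "__main__":
    # Print the sample rows in a layout similar to the tables above.
    for name, cycles, ideal in ROWS:
        print(f"{name} | {cycles} | {ideal} | {utilization(cycles, ideal)}")
    print(f"average | {average_utilization(ROWS)}")
```

If these assumptions hold, running the same functions over all 27 rows of a table should reproduce its average row, up to floating-point rounding.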