CI run c51755f

  • Run: link
  • Time: Denver 2025-08-26 06:25:54 MDT • Brussels 2025-08-26 14:25:54 CEST

GEMM Deployment Summary

  • Workflow: AIE Deployment Gemm
  • Commit: c51755f6173c3ac917414fcf9be370204269a30c
  • Runner: venus
  • Run time: Denver 2025-08-26 06:25:39 MDT • Brussels 2025-08-26 14:25:39 CEST
  • Run: #49; Attempt 1

| HW | M | K | N | Status | Note |
|----|---|---|---|--------|------|
| single_col | 128 | 128 | 128 | ✅ success | |
| single_col | 128 | 128 | 256 | ❌ failed | Unexpected command state in qds_device::wait(). |
| single_col | 128 | 128 | 64 | ✅ success | |
| single_col | 128 | 256 | 128 | ❌ failed | Mismatch between expected and actual output values. |
| single_col | 128 | 256 | 256 | ❌ failed | Unexpected command state in qds_device::wait(). |
| single_col | 128 | 256 | 64 | ❌ failed | Data mismatch during matrix multiplication execution. |
| single_col | 128 | 64 | 128 | ✅ success | |
| single_col | 128 | 64 | 256 | ❌ failed | Unexpected command state in qds_device::wait(). |
| single_col | 128 | 64 | 64 | ✅ success | |
| single_col | 256 | 128 | 128 | ✅ success | |
| single_col | 256 | 128 | 256 | ❌ failed | Unexpected command state in qds_device::wait(). |
| single_col | 256 | 128 | 64 | ✅ success | |
| single_col | 256 | 256 | 128 | ❌ failed | Output values do not match expected results. |
| single_col | 256 | 256 | 256 | ❌ failed | Unexpected command state in qds_device::wait(). |
| single_col | 256 | 256 | 64 | ❌ failed | Data mismatch during matrix multiplication execution. |
| single_col | 256 | 64 | 128 | ✅ success | |
| single_col | 256 | 64 | 256 | ❌ failed | Unexpected command state in qds_device::wait(). |
| single_col | 256 | 64 | 64 | ✅ success | |
| single_col | 64 | 128 | 128 | ✅ success | |
| single_col | 64 | 128 | 256 | ✅ success | |
| single_col | 64 | 128 | 64 | ✅ success | |
| single_col | 64 | 256 | 128 | ❌ failed | Data mismatch during matrix multiplication execution. |
| single_col | 64 | 256 | 256 | ❌ failed | Unexpected command state in qds_device::wait(). |
| single_col | 64 | 256 | 64 | ❌ failed | Data mismatch during matrix multiplication execution. |
| single_col | 64 | 64 | 128 | ✅ success | |
| single_col | 64 | 64 | 256 | ✅ success | |
| single_col | 64 | 64 | 64 | ✅ success | |
| single_core | 128 | 128 | 128 | ✅ success | |
| single_core | 128 | 128 | 256 | ✅ success | |
| single_core | 128 | 128 | 64 | ✅ success | |
| single_core | 128 | 256 | 128 | ✅ success | |
| single_core | 128 | 256 | 256 | ✅ success | |
| single_core | 128 | 256 | 64 | ✅ success | |
| single_core | 128 | 64 | 128 | ✅ success | |
| single_core | 128 | 64 | 256 | ✅ success | |
| single_core | 128 | 64 | 64 | ✅ success | |
| single_core | 256 | 128 | 128 | ✅ success | |
| single_core | 256 | 128 | 256 | ✅ success | |
| single_core | 256 | 128 | 64 | ✅ success | |
| single_core | 256 | 256 | 128 | ✅ success | |
| single_core | 256 | 256 | 256 | ✅ success | |
| single_core | 256 | 256 | 64 | ✅ success | |
| single_core | 256 | 64 | 128 | ✅ success | |
| single_core | 256 | 64 | 256 | ✅ success | |
| single_core | 256 | 64 | 64 | ✅ success | |
| single_core | 64 | 128 | 128 | ✅ success | |
| single_core | 64 | 128 | 256 | ✅ success | |
| single_core | 64 | 128 | 64 | ✅ success | |
| single_core | 64 | 256 | 128 | ✅ success | |
| single_core | 64 | 256 | 256 | ✅ success | |
| single_core | 64 | 256 | 64 | ✅ success | |
| single_core | 64 | 64 | 128 | ✅ success | |
| single_core | 64 | 64 | 256 | ✅ success | |
| single_core | 64 | 64 | 64 | ✅ success | |
| whole_array | 128 | 128 | 128 | ❌ failed | Undefined symbols during linking process. |
| whole_array | 128 | 128 | 256 | ❌ failed | Undefined symbols during linking process. |
| whole_array | 128 | 128 | 64 | ❌ failed | Undefined symbols during linking process. |
| whole_array | 128 | 256 | 128 | ✅ success | |
| whole_array | 128 | 256 | 256 | ❌ failed | Unexpected command state in qds_device::wait(). |
| whole_array | 128 | 256 | 64 | ✅ success | |
| whole_array | 128 | 64 | 128 | ❌ failed | Undefined symbols during linking process. |
| whole_array | 128 | 64 | 256 | ❌ failed | Undefined symbols during linking process. |
| whole_array | 128 | 64 | 64 | ❌ failed | Undefined symbols during linking process. |
| whole_array | 256 | 128 | 128 | ❌ failed | Undefined symbols during linking process. |
| whole_array | 256 | 128 | 256 | ❌ failed | Undefined symbols during linking process. |
| whole_array | 256 | 128 | 64 | ❌ failed | Undefined symbols during linking process. |
| whole_array | 256 | 256 | 128 | ✅ success | |
| whole_array | 256 | 256 | 256 | ❌ failed | Unexpected command state in qds_device::wait(). |
| whole_array | 256 | 256 | 64 | ✅ success | |
| whole_array | 256 | 64 | 128 | ✅ success | |
| whole_array | 256 | 64 | 256 | ❌ failed | Unexpected command state in qds_device::wait(). |
| whole_array | 256 | 64 | 64 | ✅ success | |
| whole_array | 64 | 128 | 128 | ❌ failed | Undefined symbols during linking process. |
| whole_array | 64 | 128 | 256 | ❌ failed | Undefined symbols during linking process. |
| whole_array | 64 | 128 | 64 | ❌ failed | Undefined symbols during linking process. |
| whole_array | 64 | 256 | 128 | ✅ success | |
| whole_array | 64 | 256 | 256 | ✅ success | |
| whole_array | 64 | 256 | 64 | ✅ success | |
| whole_array | 64 | 64 | 128 | ✅ success | |
| whole_array | 64 | 64 | 256 | ✅ success | |
| whole_array | 64 | 64 | 64 | ✅ success | |

Totals: ✅ 53 • ❌ 28 • All: 81
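
The failure total can be cross-checked by grouping the Note strings from the status table into classes. The sketch below is illustrative only: the category names and the `tally` helper are hypothetical, the per-note counts were tallied by hand from the rows above, and treating the three differently worded mismatch notes as a single class is an assumption, not something the report states.

```python
from collections import Counter

# Note strings copied verbatim from the Note column of the status table.
# The category labels are hypothetical; merging the three mismatch wordings
# into one class is an assumption.
CATEGORIES = {
    "Unexpected command state in qds_device::wait().": "device_wait",
    "Mismatch between expected and actual output values.": "output_mismatch",
    "Data mismatch during matrix multiplication execution.": "output_mismatch",
    "Output values do not match expected results.": "output_mismatch",
    "Undefined symbols during linking process.": "link_error",
}


def tally(rows):
    """Count failures per (hw, category); rows are (hw, note) pairs."""
    return Counter((hw, CATEGORIES.get(note, "other")) for hw, note in rows)


# Failed rows per HW and note, counted by hand from the status table.
failed_rows = (
    [("single_col", "Unexpected command state in qds_device::wait().")] * 7
    + [("single_col", "Data mismatch during matrix multiplication execution.")] * 4
    + [("single_col", "Mismatch between expected and actual output values.")] * 1
    + [("single_col", "Output values do not match expected results.")] * 1
    + [("whole_array", "Undefined symbols during linking process.")] * 12
    + [("whole_array", "Unexpected command state in qds_device::wait().")] * 3
)

counts = tally(failed_rows)
for (hw, category), n in sorted(counts.items()):
    print(f"{hw:12s} {category:16s} {n}")
print("failed total:", sum(counts.values()))
```

Under this grouping, every single_core configuration passes, the link errors occur only on whole_array, and the output mismatches occur only on single_col.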

[single_col] M=128 K=128 N=128

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile3,1 | 32 | 61,286 | 1,150.00 | 28.49 | 44.52 | 17.11 | 26.73 |
| tile2,1 | 32 | 61,287 | 1,150.00 | 28.49 | 44.52 | 17.11 | 26.73 |

[single_col] M=128 K=128 N=64

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile3,1 | 16 | 28,468 | 1,150.00 | 28.49 | 44.52 | 18.42 | 28.78 |
| tile2,1 | 16 | 28,469 | 1,150.00 | 28.49 | 44.52 | 18.42 | 28.78 |

[single_col] M=128 K=64 N=128

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 16 | 31,437 | 1,150.00 | 28.49 | 44.52 | 16.68 | 26.06 |
| tile3,1 | 16 | 31,438 | 1,150.00 | 28.49 | 44.52 | 16.68 | 26.06 |

[single_col] M=128 K=64 N=64

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 8 | 12,992 | 1,150.00 | 28.49 | 44.52 | 20.18 | 31.53 |
| tile3,1 | 8 | 12,992 | 1,150.00 | 28.49 | 44.52 | 20.18 | 31.53 |

[single_col] M=256 K=128 N=128

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile3,1 | 64 | 120,675 | 1,150.00 | 28.49 | 44.52 | 17.38 | 27.15 |
| tile2,1 | 64 | 120,675 | 1,150.00 | 28.49 | 44.52 | 17.38 | 27.15 |

[single_col] M=256 K=128 N=64

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 32 | 58,171 | 1,150.00 | 28.49 | 44.52 | 18.03 | 28.17 |
| tile3,1 | 32 | 58,172 | 1,150.00 | 28.49 | 44.52 | 18.03 | 28.16 |

[single_col] M=256 K=64 N=128

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 32 | 59,718 | 1,150.00 | 28.49 | 44.52 | 17.56 | 27.44 |
| tile3,1 | 32 | 59,718 | 1,150.00 | 28.49 | 44.52 | 17.56 | 27.44 |

[single_col] M=256 K=64 N=64

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 16 | 27,232 | 1,150.00 | 28.49 | 44.52 | 19.25 | 30.08 |
| tile3,1 | 16 | 27,233 | 1,150.00 | 28.49 | 44.52 | 19.25 | 30.08 |

[single_col] M=64 K=128 N=128

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile3,1 | 16 | 31,509 | 1,150.00 | 28.49 | 44.52 | 16.64 | 26.00 |
| tile2,1 | 16 | 32,024 | 1,150.00 | 28.49 | 44.52 | 16.37 | 25.58 |

[single_col] M=64 K=128 N=256

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile3,1 | 32 | 79,008 | 1,150.00 | 28.49 | 44.52 | 13.27 | 20.74 |
| tile2,1 | 32 | 79,521 | 1,150.00 | 28.49 | 44.52 | 13.19 | 20.60 |

[single_col] M=64 K=128 N=64

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile3,1 | 8 | 13,592 | 1,150.00 | 28.49 | 44.52 | 19.29 | 30.14 |
| tile2,1 | 8 | 14,105 | 1,150.00 | 28.49 | 44.52 | 18.59 | 29.04 |

[single_col] M=64 K=64 N=128

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 8 | 13,315 | 1,134.00 | 28.90 | 45.15 | 19.69 | 30.76 |
| tile3,1 | 8 | 13,831 | 1,134.00 | 28.90 | 45.15 | 18.95 | 29.61 |

[single_col] M=64 K=64 N=256

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 16 | 46,593 | 1,134.00 | 28.90 | 45.15 | 11.25 | 17.58 |
| tile3,1 | 16 | 47,610 | 1,134.00 | 28.90 | 45.15 | 11.01 | 17.21 |

[single_col] M=64 K=64 N=64

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile3,1 | 4 | 5,781 | 1,134.00 | 28.90 | 45.15 | 22.67 | 35.43 |
| tile2,1 | 4 | 5,783 | 1,134.00 | 28.90 | 45.15 | 22.67 | 35.41 |
[single_core] M=128 K=128 N=128

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 64 | 121,493 | 1,445.44 | 22.67 | 35.42 | 17.26 | 26.97 |

[single_core] M=128 K=128 N=256

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 128 | 243,883 | 1,150.00 | 28.49 | 44.52 | 17.20 | 26.87 |

[single_core] M=128 K=128 N=64

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 32 | 59,234 | 1,150.00 | 28.49 | 44.52 | 17.70 | 27.66 |

[single_core] M=128 K=256 N=128

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 128 | 234,014 | 1,150.00 | 28.49 | 44.52 | 17.92 | 28.01 |

[single_core] M=128 K=256 N=256

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 256 | 471,469 | 1,150.00 | 28.49 | 44.52 | 17.79 | 27.80 |

[single_core] M=128 K=256 N=64

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 64 | 115,293 | 1,150.00 | 28.49 | 44.52 | 18.19 | 28.42 |

[single_core] M=128 K=64 N=128

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 32 | 55,813 | 1,150.00 | 28.49 | 44.52 | 18.79 | 29.36 |

[single_core] M=128 K=64 N=256

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 64 | 112,896 | 1,150.00 | 28.49 | 44.52 | 18.58 | 29.02 |

[single_core] M=128 K=64 N=64

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 16 | 27,265 | 1,150.00 | 28.49 | 44.52 | 19.23 | 30.05 |

[single_core] M=256 K=128 N=128

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 128 | 239,557 | 1,150.00 | 28.49 | 44.52 | 17.51 | 27.36 |

[single_core] M=256 K=128 N=256

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 255 | 479,560 | 1,149.20 | 28.51 | 44.55 | 17.49 | 27.33 |

[single_core] M=256 K=128 N=64

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 64 | 118,622 | 1,150.00 | 28.49 | 44.52 | 17.68 | 27.62 |

[single_core] M=256 K=256 N=128

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 256 | 462,736 | 1,150.00 | 28.49 | 44.52 | 18.13 | 28.33 |

[single_core] M=256 K=256 N=256

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 512 | 928,941 | 1,150.00 | 28.49 | 44.52 | 18.06 | 28.22 |

[single_core] M=256 K=256 N=64

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 128 | 229,659 | 1,150.00 | 28.49 | 44.52 | 18.26 | 28.54 |

[single_core] M=256 K=64 N=128

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 64 | 112,809 | 1,150.00 | 28.49 | 44.52 | 18.59 | 29.05 |

[single_core] M=256 K=64 N=256

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 128 | 227,247 | 1,347.15 | 24.32 | 38.01 | 18.46 | 28.84 |

[single_core] M=256 K=64 N=64

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 32 | 55,759 | 1,150.00 | 28.49 | 44.52 | 18.81 | 29.38 |

[single_core] M=64 K=128 N=128

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 32 | 61,242 | 1,150.00 | 28.49 | 44.52 | 17.12 | 26.75 |

[single_core] M=64 K=128 N=256

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 64 | 124,790 | 1,150.00 | 28.49 | 44.52 | 16.81 | 26.26 |

[single_core] M=64 K=128 N=64

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 16 | 29,463 | 1,150.00 | 28.49 | 44.52 | 17.79 | 27.80 |

[single_core] M=64 K=256 N=128

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 64 | 119,647 | 1,150.00 | 28.49 | 44.52 | 17.53 | 27.39 |

[single_core] M=64 K=256 N=256

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 128 | 242,727 | 1,150.00 | 28.49 | 44.52 | 17.28 | 27.00 |

[single_core] M=64 K=256 N=64

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 32 | 58,107 | 1,150.00 | 28.49 | 44.52 | 18.05 | 28.20 |

[single_core] M=64 K=64 N=128

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 16 | 26,863 | 1,134.00 | 28.90 | 45.15 | 19.52 | 30.50 |

[single_core] M=64 K=64 N=256

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 32 | 54,975 | 1,134.00 | 28.90 | 45.15 | 19.07 | 29.80 |

[single_core] M=64 K=64 N=64

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 8 | 12,808 | 1,134.00 | 28.90 | 45.15 | 20.47 | 31.98 |
[whole_array] M=128 K=256 N=128

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 32 | 58,111 | 1,150.00 | 28.49 | 44.52 | 18.04 | 28.19 |
| tile2,2 | 32 | 58,111 | 1,150.00 | 28.49 | 44.52 | 18.04 | 28.19 |
| tile3,2 | 32 | 59,072 | 1,150.00 | 28.49 | 44.52 | 17.75 | 27.74 |
| tile3,1 | 32 | 59,073 | 1,150.00 | 28.49 | 44.52 | 17.75 | 27.74 |

[whole_array] M=128 K=256 N=64

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 16 | 27,340 | 1,150.00 | 28.49 | 44.52 | 19.18 | 29.96 |
| tile2,2 | 16 | 27,340 | 1,150.00 | 28.49 | 44.52 | 19.18 | 29.96 |
| tile3,2 | 16 | 28,476 | 1,150.00 | 28.49 | 44.52 | 18.41 | 28.77 |
| tile3,1 | 16 | 28,477 | 1,150.00 | 28.49 | 44.52 | 18.41 | 28.77 |

[whole_array] M=256 K=256 N=128

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 64 | 115,293 | 1,150.00 | 28.49 | 44.52 | 18.19 | 28.42 |
| tile2,2 | 64 | 115,294 | 1,150.00 | 28.49 | 44.52 | 18.19 | 28.42 |
| tile3,1 | 64 | 116,197 | 1,150.00 | 28.49 | 44.52 | 18.05 | 28.20 |
| tile3,2 | 64 | 116,199 | 1,150.00 | 28.49 | 44.52 | 18.05 | 28.20 |

[whole_array] M=256 K=256 N=64

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,2 | 32 | 55,931 | 1,150.00 | 28.49 | 44.52 | 18.75 | 29.29 |
| tile2,1 | 32 | 55,931 | 1,150.00 | 28.49 | 44.52 | 18.75 | 29.29 |
| tile3,1 | 32 | 57,300 | 1,150.00 | 28.49 | 44.52 | 18.30 | 28.59 |
| tile3,2 | 32 | 57,300 | 1,150.00 | 28.49 | 44.52 | 18.30 | 28.59 |

[whole_array] M=256 K=64 N=128

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 16 | 34,981 | 1,150.00 | 28.49 | 44.52 | 14.99 | 23.42 |
| tile2,2 | 16 | 34,981 | 1,150.00 | 28.49 | 44.52 | 14.99 | 23.42 |
| tile3,1 | 16 | 37,122 | 1,150.00 | 28.49 | 44.52 | 14.12 | 22.07 |
| tile3,2 | 16 | 37,123 | 1,150.00 | 28.49 | 44.52 | 14.12 | 22.07 |

[whole_array] M=256 K=64 N=64

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 8 | 12,991 | 1,150.00 | 28.49 | 44.52 | 20.18 | 31.53 |
| tile2,2 | 8 | 12,991 | 1,150.00 | 28.49 | 44.52 | 20.18 | 31.53 |
| tile3,1 | 8 | 13,501 | 1,150.00 | 28.49 | 44.52 | 19.42 | 30.34 |
| tile3,2 | 8 | 13,504 | 1,150.00 | 28.49 | 44.52 | 19.41 | 30.33 |

[whole_array] M=64 K=256 N=128

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile3,1 | 16 | 29,701 | 1,150.00 | 28.49 | 44.52 | 17.65 | 27.58 |
| tile2,1 | 16 | 29,710 | 1,150.00 | 28.49 | 44.52 | 17.65 | 27.57 |
| tile2,2 | 16 | 30,523 | 1,150.00 | 28.49 | 44.52 | 17.18 | 26.84 |
| tile3,2 | 16 | 30,524 | 1,150.00 | 28.49 | 44.52 | 17.18 | 26.84 |

[whole_array] M=64 K=256 N=256

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile3,1 | 32 | 62,487 | 1,150.00 | 28.49 | 44.52 | 16.78 | 26.22 |
| tile2,1 | 32 | 62,493 | 1,150.00 | 28.49 | 44.52 | 16.78 | 26.22 |
| tile2,2 | 32 | 63,369 | 1,150.00 | 28.49 | 44.52 | 16.55 | 25.85 |
| tile3,2 | 32 | 63,371 | 1,150.00 | 28.49 | 44.52 | 16.55 | 25.85 |

[whole_array] M=64 K=256 N=64

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 8 | 13,045 | 1,150.00 | 28.49 | 44.52 | 20.10 | 31.40 |
| tile3,1 | 8 | 13,046 | 1,150.00 | 28.49 | 44.52 | 20.09 | 31.40 |
| tile3,2 | 8 | 13,561 | 1,150.00 | 28.49 | 44.52 | 19.33 | 30.20 |
| tile2,2 | 8 | 13,568 | 1,150.00 | 28.49 | 44.52 | 19.32 | 30.19 |

[whole_array] M=64 K=64 N=128

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,2 | 4 | 5,874 | 1,150.00 | 28.49 | 44.52 | 22.31 | 34.87 |
| tile3,2 | 4 | 5,874 | 1,150.00 | 28.49 | 44.52 | 22.31 | 34.87 |
| tile2,1 | 4 | 5,875 | 1,150.00 | 28.49 | 44.52 | 22.31 | 34.86 |
| tile3,1 | 4 | 5,876 | 1,150.00 | 28.49 | 44.52 | 22.31 | 34.85 |

[whole_array] M=64 K=64 N=256

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile3,1 | 8 | 12,991 | 1,150.00 | 28.49 | 44.52 | 20.18 | 31.53 |
| tile2,1 | 8 | 12,992 | 1,150.00 | 28.49 | 44.52 | 20.18 | 31.53 |
| tile2,2 | 8 | 12,992 | 1,150.00 | 28.49 | 44.52 | 20.18 | 31.53 |
| tile3,2 | 8 | 12,992 | 1,150.00 | 28.49 | 44.52 | 20.18 | 31.53 |

[whole_array] M=64 K=64 N=64

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile3,1 | 2 | 2,317 | 1,150.00 | 28.49 | 44.52 | 28.28 | 44.20 |
| tile3,2 | 2 | 2,317 | 1,150.00 | 28.49 | 44.52 | 28.28 | 44.20 |
| tile2,1 | 2 | 2,317 | 1,150.00 | 28.49 | 44.52 | 28.28 | 44.20 |
| tile2,2 | 2 | 2,317 | 1,150.00 | 28.49 | 44.52 | 28.28 | 44.20 |
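
The derived columns in the per-tile tables appear to follow from the three measured columns (Kernels, Total cycles, Avg cycles per kernel). The sketch below shows one way to reproduce them. It rests on two assumptions that the report does not state but that are consistent with the printed numbers: each kernel invocation performs 32×32×32 = 32,768 MACs, and the per-core peak is 64 MACs/cycle. The function and parameter names are hypothetical.

```python
def derived_metrics(kernels, total_cycles, avg_cycles_per_kernel,
                    macs_per_kernel=32 * 32 * 32,   # assumed kernel tile size
                    peak_macs_per_cycle=64):        # assumed per-core peak
    """Recompute the MACs/cycle and peak-efficiency columns for one tile row."""
    # Kernel-level rate: work of one kernel call over its average duration.
    kernel_rate = macs_per_kernel / avg_cycles_per_kernel
    # System-level rate: all work done on the tile over its total cycle count,
    # which also includes time outside the kernel calls.
    system_rate = kernels * macs_per_kernel / total_cycles
    return {
        "macs_per_cycle_kernel": round(kernel_rate, 2),
        "peak_eff_kernel_pct": round(100 * kernel_rate / peak_macs_per_cycle, 2),
        "macs_per_cycle_system": round(system_rate, 2),
        "peak_eff_system_pct": round(100 * system_rate / peak_macs_per_cycle, 2),
    }


# Inputs taken from the [single_col] M=128 K=128 N=128, tile3,1 row.
row = derived_metrics(kernels=32, total_cycles=61_286, avg_cycles_per_kernel=1150.0)
print(row)
```

Read this way, the gap between the kernel and system columns is the share of each tile's total cycles spent outside kernel execution.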