CI run aa95c34

  • Run: link
  • Time: Denver 2025-08-26 06:05:46 MDT • Brussels 2025-08-26 14:05:46 CEST

GEMM Deployment Summary

  • Workflow: AIE Deployment Gemm
  • Commit: aa95c34904de8c06ac42327c4fea38827a3ccef9
  • Runner: venus
  • Run time: Denver 2025-08-26 06:05:25 MDT • Brussels 2025-08-26 14:05:25 CEST
  • Run: #47; Attempt 1

| HW | M | K | N | Status | Note |
|----|---|---|---|--------|------|
| single_col | 128 | 128 | 128 | ✅ success | |
| single_col | 128 | 128 | 256 | ❌ failed | Unexpected command state in qds_device::wait(). |
| single_col | 128 | 128 | 64 | ✅ success | |
| single_col | 128 | 256 | 128 | ❌ failed | Mismatch between expected and actual output values. |
| single_col | 128 | 256 | 256 | ❌ failed | Unexpected command state in qds_device::wait(). |
| single_col | 128 | 256 | 64 | ❌ failed | Data mismatch during matrix multiplication execution. |
| single_col | 128 | 64 | 128 | ✅ success | |
| single_col | 128 | 64 | 256 | ❌ failed | Unexpected command state in qds_device::wait(). |
| single_col | 128 | 64 | 64 | ✅ success | |
| single_col | 256 | 128 | 128 | ✅ success | |
| single_col | 256 | 128 | 256 | ❌ failed | Unexpected command state in qds_device::wait(). |
| single_col | 256 | 128 | 64 | ✅ success | |
| single_col | 256 | 256 | 128 | ❌ failed | Output values do not match expected results. |
| single_col | 256 | 256 | 256 | ❌ failed | Unexpected command state in qds_device::wait(). |
| single_col | 256 | 256 | 64 | ❌ failed | Data mismatch during matrix multiplication execution. |
| single_col | 256 | 64 | 128 | ✅ success | |
| single_col | 256 | 64 | 256 | ❌ failed | Unexpected command state in qds_device::wait(). |
| single_col | 256 | 64 | 64 | ✅ success | |
| single_col | 64 | 128 | 128 | ✅ success | |
| single_col | 64 | 128 | 256 | ❌ failed | Maximum relative error of 100% in output values. |
| single_col | 64 | 128 | 64 | ✅ success | |
| single_col | 64 | 256 | 128 | ❌ failed | Data mismatch during matrix multiplication execution. |
| single_col | 64 | 256 | 256 | ❌ failed | Unexpected command state in qds_device::wait(). |
| single_col | 64 | 256 | 64 | ❌ failed | Data mismatch during matrix multiplication execution. |
| single_col | 64 | 64 | 128 | ✅ success | |
| single_col | 64 | 64 | 256 | ❌ failed | Maximum relative error reached 100% during execution. |
| single_col | 64 | 64 | 64 | ✅ success | |
| single_core | 128 | 128 | 128 | ✅ success | |
| single_core | 128 | 128 | 256 | ❌ failed | Peano not added to PATH causing potential conflicts. |
| single_core | 128 | 128 | 64 | ✅ success | |
| single_core | 128 | 256 | 128 | ✅ success | |
| single_core | 128 | 256 | 256 | ✅ success | |
| single_core | 128 | 256 | 64 | ✅ success | |
| single_core | 128 | 64 | 128 | ✅ success | |
| single_core | 128 | 64 | 256 | ✅ success | |
| single_core | 128 | 64 | 64 | ✅ success | |
| single_core | 256 | 128 | 128 | ✅ success | |
| single_core | 256 | 128 | 256 | ❌ failed | Peano not added to PATH causing potential conflicts. |
| single_core | 256 | 128 | 64 | ✅ success | |
| single_core | 256 | 256 | 128 | ✅ success | |
| single_core | 256 | 256 | 256 | ✅ success | |
| single_core | 256 | 256 | 64 | ✅ success | |
| single_core | 256 | 64 | 128 | ✅ success | |
| single_core | 256 | 64 | 256 | ✅ success | |
| single_core | 256 | 64 | 64 | ✅ success | |
| single_core | 64 | 128 | 128 | ✅ success | |
| single_core | 64 | 128 | 256 | ✅ success | |
| single_core | 64 | 128 | 64 | ✅ success | |
| single_core | 64 | 256 | 128 | ✅ success | |
| single_core | 64 | 256 | 256 | ✅ success | |
| single_core | 64 | 256 | 64 | ✅ success | |
| single_core | 64 | 64 | 128 | ✅ success | |
| single_core | 64 | 64 | 256 | ✅ success | |
| single_core | 64 | 64 | 64 | ✅ success | |
| whole_array | 128 | 128 | 128 | ❌ failed | Undefined symbols during linking process. |
| whole_array | 128 | 128 | 256 | ❌ failed | Undefined symbols during linking process. |
| whole_array | 128 | 128 | 64 | ❌ failed | Undefined symbols during linking process. |
| whole_array | 128 | 256 | 128 | ✅ success | |
| whole_array | 128 | 256 | 256 | ❌ failed | Unexpected command state in qds_device::wait(). |
| whole_array | 128 | 256 | 64 | ✅ success | |
| whole_array | 128 | 64 | 128 | ❌ failed | Undefined symbols during linking process. |
| whole_array | 128 | 64 | 256 | ❌ failed | Undefined symbols during linking process. |
| whole_array | 128 | 64 | 64 | ❌ failed | Undefined symbols during linking process. |
| whole_array | 256 | 128 | 128 | ❌ failed | Undefined symbols during linking process. |
| whole_array | 256 | 128 | 256 | ❌ failed | Undefined symbols during linking process. |
| whole_array | 256 | 128 | 64 | ❌ failed | Undefined symbols during linking process. |
| whole_array | 256 | 256 | 128 | ✅ success | |
| whole_array | 256 | 256 | 256 | ❌ failed | Unexpected command state in qds_device::wait(). |
| whole_array | 256 | 256 | 64 | ✅ success | |
| whole_array | 256 | 64 | 128 | ✅ success | |
| whole_array | 256 | 64 | 256 | ❌ failed | Unexpected command state in qds_device::wait(). |
| whole_array | 256 | 64 | 64 | ✅ success | |
| whole_array | 64 | 128 | 128 | ❌ failed | Undefined symbols during linking process. |
| whole_array | 64 | 128 | 256 | ❌ failed | Undefined symbols during linking process. |
| whole_array | 64 | 128 | 64 | ❌ failed | Undefined symbols during linking process. |
| whole_array | 64 | 256 | 128 | ✅ success | |
| whole_array | 64 | 256 | 256 | ✅ success | |
| whole_array | 64 | 256 | 64 | ✅ success | |
| whole_array | 64 | 64 | 128 | ✅ success | |
| whole_array | 64 | 64 | 256 | ✅ success | |
| whole_array | 64 | 64 | 64 | ✅ success | |

Totals: ✅ 49 • ❌ 32 • All: 81
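
The totals above can be re-derived from the status rows. The following is a minimal sketch, not part of the CI workflow itself: the `tally` helper and the `sample` list are hypothetical names, and the sample holds only a few rows copied from the summary for illustration. It assumes each data row has the shape `HW M K N <icon> <status> [note]`, with or without pipe separators.

```python
from collections import Counter


def tally(lines):
    """Count (hardware, status) pairs from summary rows.

    Header, separator, and malformed rows are skipped.
    """
    counts = Counter()
    for line in lines:
        # Treat pipe-separated and space-separated rows the same way.
        parts = line.replace("|", " ").split()
        # A data row has three integer dimensions (M, K, N) after the HW name.
        if len(parts) < 6 or not all(p.isdigit() for p in parts[1:4]):
            continue
        hw, status = parts[0], parts[5]
        counts[(hw, status)] += 1
    return counts


# A few rows copied from the summary above (illustrative subset only).
sample = [
    "HW M K N Status Note",
    "single_col 128 128 128 ✅ success",
    "single_col 128 128 256 ❌ failed Unexpected command state in qds_device::wait().",
    "single_core 128 128 256 ❌ failed Peano not added to PATH causing potential conflicts.",
    "whole_array 64 64 64 ✅ success",
]

counts = tally(sample)
for (hw, status), n in sorted(counts.items()):
    print(f"{hw:12s} {status:8s} {n}")
```

Run over all 81 rows, the same tally gives 12 successes and 15 failures for single_col, 25 and 2 for single_core, and 12 and 15 for whole_array, which matches the 49 and 32 reported in the totals line.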

[single_col] M=128 K=128 N=128

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile3,1 | 32 | 61,281 | 1,150.00 | 28.49 | 44.52 | 17.11 | 26.74 |
| tile2,1 | 32 | 61,281 | 1,150.00 | 28.49 | 44.52 | 17.11 | 26.74 |

[single_col] M=128 K=128 N=64

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 16 | 28,469 | 1,150.00 | 28.49 | 44.52 | 18.42 | 28.78 |
| tile3,1 | 16 | 28,471 | 1,150.00 | 28.49 | 44.52 | 18.41 | 28.77 |

[single_col] M=128 K=64 N=128

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile3,1 | 16 | 31,471 | 1,150.00 | 28.49 | 44.52 | 16.66 | 26.03 |
| tile2,1 | 16 | 31,471 | 1,150.00 | 28.49 | 44.52 | 16.66 | 26.03 |

[single_col] M=128 K=64 N=64

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 8 | 12,992 | 1,150.00 | 28.49 | 44.52 | 20.18 | 31.53 |
| tile3,1 | 8 | 12,992 | 1,150.00 | 28.49 | 44.52 | 20.18 | 31.53 |

[single_col] M=256 K=128 N=128

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile3,1 | 64 | 120,665 | 1,150.00 | 28.49 | 44.52 | 17.38 | 27.16 |
| tile2,1 | 64 | 120,666 | 1,150.00 | 28.49 | 44.52 | 17.38 | 27.16 |

[single_col] M=256 K=128 N=64

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile3,1 | 32 | 58,164 | 1,150.00 | 28.49 | 44.52 | 18.03 | 28.17 |
| tile2,1 | 32 | 58,165 | 1,150.00 | 28.49 | 44.52 | 18.03 | 28.17 |

[single_col] M=256 K=64 N=128

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 32 | 59,865 | 1,150.00 | 28.49 | 44.52 | 17.52 | 27.37 |
| tile3,1 | 32 | 59,865 | 1,150.00 | 28.49 | 44.52 | 17.52 | 27.37 |

[single_col] M=256 K=64 N=64

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile3,1 | 16 | 27,233 | 1,150.00 | 28.49 | 44.52 | 19.25 | 30.08 |
| tile2,1 | 16 | 27,233 | 1,150.00 | 28.49 | 44.52 | 19.25 | 30.08 |

[single_col] M=64 K=128 N=128

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile3,1 | 16 | 31,515 | 1,150.00 | 28.49 | 44.52 | 16.64 | 25.99 |
| tile2,1 | 16 | 32,025 | 1,150.00 | 28.49 | 44.52 | 16.37 | 25.58 |

[single_col] M=64 K=128 N=64

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile3,1 | 8 | 13,596 | 1,150.00 | 28.49 | 44.52 | 19.28 | 30.13 |
| tile2,1 | 8 | 14,111 | 1,150.00 | 28.49 | 44.52 | 18.58 | 29.03 |

[single_col] M=64 K=64 N=128

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 8 | 13,316 | 1,134.00 | 28.90 | 45.15 | 19.69 | 30.76 |
| tile3,1 | 8 | 13,831 | 1,134.00 | 28.90 | 45.15 | 18.95 | 29.61 |

[single_col] M=64 K=64 N=64

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile3,1 | 4 | 5,781 | 1,134.00 | 28.90 | 45.15 | 22.67 | 35.43 |
| tile2,1 | 4 | 5,781 | 1,134.00 | 28.90 | 45.15 | 22.67 | 35.43 |

[single_core] M=128 K=128 N=128

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 64 | 121,242 | 1,441.52 | 22.73 | 35.52 | 17.30 | 27.03 |

[single_core] M=128 K=128 N=64

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 32 | 59,234 | 1,150.00 | 28.49 | 44.52 | 17.70 | 27.66 |

[single_core] M=128 K=256 N=128

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 128 | 234,012 | 1,150.00 | 28.49 | 44.52 | 17.92 | 28.01 |

[single_core] M=128 K=256 N=256

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 256 | 471,474 | 1,150.00 | 28.49 | 44.52 | 17.79 | 27.80 |

[single_core] M=128 K=256 N=64

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 64 | 115,293 | 1,150.00 | 28.49 | 44.52 | 18.19 | 28.42 |

[single_core] M=128 K=64 N=128

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 32 | 55,813 | 1,150.00 | 28.49 | 44.52 | 18.79 | 29.36 |

[single_core] M=128 K=64 N=256

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 64 | 112,896 | 1,150.00 | 28.49 | 44.52 | 18.58 | 29.02 |

[single_core] M=128 K=64 N=64

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 16 | 27,265 | 1,150.00 | 28.49 | 44.52 | 19.23 | 30.05 |

[single_core] M=256 K=128 N=128

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 128 | 239,555 | 1,150.00 | 28.49 | 44.52 | 17.51 | 27.36 |

[single_core] M=256 K=128 N=64

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 64 | 118,622 | 1,150.00 | 28.49 | 44.52 | 17.68 | 27.62 |

[single_core] M=256 K=256 N=128

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 256 | 462,734 | 1,150.00 | 28.49 | 44.52 | 18.13 | 28.33 |

[single_core] M=256 K=256 N=256

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 512 | 928,943 | 1,150.00 | 28.49 | 44.52 | 18.06 | 28.22 |

[single_core] M=256 K=256 N=64

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 128 | 229,659 | 1,150.00 | 28.49 | 44.52 | 18.26 | 28.54 |

[single_core] M=256 K=64 N=128

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 64 | 112,809 | 1,150.00 | 28.49 | 44.52 | 18.59 | 29.05 |

[single_core] M=256 K=64 N=256

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 128 | 227,223 | 1,346.96 | 24.33 | 38.01 | 18.46 | 28.84 |

[single_core] M=256 K=64 N=64

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 32 | 55,759 | 1,150.00 | 28.49 | 44.52 | 18.81 | 29.38 |

[single_core] M=64 K=128 N=128

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 32 | 61,242 | 1,150.00 | 28.49 | 44.52 | 17.12 | 26.75 |

[single_core] M=64 K=128 N=256

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 64 | 124,792 | 1,150.00 | 28.49 | 44.52 | 16.81 | 26.26 |

[single_core] M=64 K=128 N=64

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 16 | 29,463 | 1,150.00 | 28.49 | 44.52 | 17.79 | 27.80 |

[single_core] M=64 K=256 N=128

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 64 | 119,650 | 1,150.00 | 28.49 | 44.52 | 17.53 | 27.39 |

[single_core] M=64 K=256 N=256

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 128 | 242,727 | 1,150.00 | 28.49 | 44.52 | 17.28 | 27.00 |

[single_core] M=64 K=256 N=64

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 32 | 58,107 | 1,150.00 | 28.49 | 44.52 | 18.05 | 28.20 |

[single_core] M=64 K=64 N=128

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 16 | 26,863 | 1,134.00 | 28.90 | 45.15 | 19.52 | 30.50 |

[single_core] M=64 K=64 N=256

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 32 | 54,975 | 1,134.00 | 28.90 | 45.15 | 19.07 | 29.80 |

[single_core] M=64 K=64 N=64

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 8 | 12,808 | 1,134.00 | 28.90 | 45.15 | 20.47 | 31.98 |

[whole_array] M=128 K=256 N=128

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,2 | 32 | 58,111 | 1,150.00 | 28.49 | 44.52 | 18.04 | 28.19 |
| tile2,1 | 32 | 58,111 | 1,150.00 | 28.49 | 44.52 | 18.04 | 28.19 |
| tile3,2 | 32 | 58,631 | 1,150.00 | 28.49 | 44.52 | 17.88 | 27.94 |
| tile3,1 | 32 | 58,631 | 1,150.00 | 28.49 | 44.52 | 17.88 | 27.94 |

[whole_array] M=128 K=256 N=64

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,2 | 16 | 27,340 | 1,150.00 | 28.49 | 44.52 | 19.18 | 29.96 |
| tile2,1 | 16 | 27,340 | 1,150.00 | 28.49 | 44.52 | 19.18 | 29.96 |
| tile3,2 | 16 | 28,169 | 1,150.00 | 28.49 | 44.52 | 18.61 | 29.08 |
| tile3,1 | 16 | 28,169 | 1,150.00 | 28.49 | 44.52 | 18.61 | 29.08 |

[whole_array] M=256 K=256 N=128

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 64 | 115,293 | 1,150.00 | 28.49 | 44.52 | 18.19 | 28.42 |
| tile2,2 | 64 | 115,293 | 1,150.00 | 28.49 | 44.52 | 18.19 | 28.42 |
| tile3,2 | 64 | 116,647 | 1,150.00 | 28.49 | 44.52 | 17.98 | 28.09 |
| tile3,1 | 64 | 116,647 | 1,150.00 | 28.49 | 44.52 | 17.98 | 28.09 |

[whole_array] M=256 K=256 N=64

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 32 | 55,931 | 1,150.00 | 28.49 | 44.52 | 18.75 | 29.29 |
| tile2,2 | 32 | 55,931 | 1,150.00 | 28.49 | 44.52 | 18.75 | 29.29 |
| tile3,2 | 32 | 56,090 | 1,150.00 | 28.49 | 44.52 | 18.69 | 29.21 |
| tile3,1 | 32 | 56,091 | 1,150.00 | 28.49 | 44.52 | 18.69 | 29.21 |

[whole_array] M=256 K=64 N=128

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,2 | 16 | 34,705 | 1,150.00 | 28.49 | 44.52 | 15.11 | 23.60 |
| tile2,1 | 16 | 34,706 | 1,150.00 | 28.49 | 44.52 | 15.11 | 23.60 |
| tile3,2 | 16 | 36,713 | 1,150.00 | 28.49 | 44.52 | 14.28 | 22.31 |
| tile3,1 | 16 | 36,714 | 1,150.00 | 28.49 | 44.52 | 14.28 | 22.31 |

[whole_array] M=256 K=64 N=64

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,2 | 8 | 12,992 | 1,150.00 | 28.49 | 44.52 | 20.18 | 31.53 |
| tile2,1 | 8 | 12,992 | 1,150.00 | 28.49 | 44.52 | 20.18 | 31.53 |
| tile3,2 | 8 | 13,384 | 1,150.00 | 28.49 | 44.52 | 19.59 | 30.60 |
| tile3,1 | 8 | 13,385 | 1,150.00 | 28.49 | 44.52 | 19.58 | 30.60 |

[whole_array] M=64 K=256 N=128

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile3,1 | 16 | 29,591 | 1,150.00 | 28.49 | 44.52 | 17.72 | 27.68 |
| tile2,1 | 16 | 29,602 | 1,150.00 | 28.49 | 44.52 | 17.71 | 27.67 |
| tile3,2 | 16 | 29,724 | 1,150.00 | 28.49 | 44.52 | 17.64 | 27.56 |
| tile2,2 | 16 | 29,727 | 1,150.00 | 28.49 | 44.52 | 17.64 | 27.56 |

[whole_array] M=64 K=256 N=256

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile3,1 | 32 | 62,876 | 1,150.00 | 28.49 | 44.52 | 16.68 | 26.06 |
| tile2,1 | 32 | 62,880 | 1,150.00 | 28.49 | 44.52 | 16.68 | 26.06 |
| tile2,2 | 32 | 63,738 | 1,150.00 | 28.49 | 44.52 | 16.45 | 25.71 |
| tile3,2 | 32 | 63,744 | 1,150.00 | 28.49 | 44.52 | 16.45 | 25.70 |

[whole_array] M=64 K=256 N=64

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 8 | 13,042 | 1,150.00 | 28.49 | 44.52 | 20.10 | 31.41 |
| tile3,1 | 8 | 13,045 | 1,150.00 | 28.49 | 44.52 | 20.10 | 31.40 |
| tile3,2 | 8 | 13,840 | 1,150.00 | 28.49 | 44.52 | 18.94 | 29.60 |
| tile2,2 | 8 | 13,844 | 1,150.00 | 28.49 | 44.52 | 18.94 | 29.59 |

[whole_array] M=64 K=64 N=128

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile2,1 | 4 | 5,874 | 1,150.00 | 28.49 | 44.52 | 22.31 | 34.87 |
| tile3,1 | 4 | 5,874 | 1,150.00 | 28.49 | 44.52 | 22.31 | 34.87 |
| tile2,2 | 4 | 5,876 | 1,150.00 | 28.49 | 44.52 | 22.31 | 34.85 |
| tile3,2 | 4 | 5,876 | 1,150.00 | 28.49 | 44.52 | 22.31 | 34.85 |

[whole_array] M=64 K=64 N=256

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile3,2 | 8 | 12,991 | 1,150.00 | 28.49 | 44.52 | 20.18 | 31.53 |
| tile3,1 | 8 | 12,991 | 1,150.00 | 28.49 | 44.52 | 20.18 | 31.53 |
| tile2,1 | 8 | 12,992 | 1,150.00 | 28.49 | 44.52 | 20.18 | 31.53 |
| tile2,2 | 8 | 12,992 | 1,150.00 | 28.49 | 44.52 | 20.18 | 31.53 |

[whole_array] M=64 K=64 N=64

| Tile | Kernels | Total cycles | Avg cycles per kernel | MACs/cycle (kernel) | Peak eff. kernel % | MACs/cycle (system) | Peak eff. system % |
|------|---------|--------------|-----------------------|---------------------|--------------------|---------------------|--------------------|
| tile3,2 | 2 | 2,317 | 1,150.00 | 28.49 | 44.52 | 28.28 | 44.20 |
| tile3,1 | 2 | 2,317 | 1,150.00 | 28.49 | 44.52 | 28.28 | 44.20 |
| tile2,2 | 2 | 2,317 | 1,150.00 | 28.49 | 44.52 | 28.28 | 44.20 |
| tile2,1 | 2 | 2,317 | 1,150.00 | 28.49 | 44.52 | 28.28 | 44.20 |
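
The report does not say how the derived columns are computed. The numbers are consistent with two assumptions, both inferred from the values rather than stated anywhere in this run: each kernel invocation performs a 32×32×32 tile of multiply-accumulates (32,768 MACs), and the peak rate is 64 MACs per cycle per tile. The sketch below recomputes the columns under those assumptions; the constants and function names are hypothetical and not taken from the workflow.

```python
# Assumed, not stated in the report: inferred from the reported values.
KERNEL_MACS = 32 * 32 * 32   # MACs per kernel invocation (32x32x32 tile)
PEAK_MACS_PER_CYCLE = 64     # per-tile peak used for the efficiency columns


def kernel_metrics(avg_cycles_per_kernel):
    """Return (MACs/cycle, peak efficiency %) for the kernel alone."""
    rate = KERNEL_MACS / avg_cycles_per_kernel
    return rate, 100.0 * rate / PEAK_MACS_PER_CYCLE


def system_metrics(kernels, total_cycles):
    """Return (MACs/cycle, peak efficiency %) including all overhead on a tile."""
    rate = kernels * KERNEL_MACS / total_cycles
    return rate, 100.0 * rate / PEAK_MACS_PER_CYCLE


def kernels_per_tile(m, k, n, tiles):
    """Kernel invocations per tile if the work is split evenly across tiles."""
    return (m * k * n) // KERNEL_MACS // tiles


# [single_col] M=128 K=128 N=128, tile3,1: 32 kernels, 61,281 total cycles.
k_rate, k_eff = kernel_metrics(1150.0)
s_rate, s_eff = system_metrics(32, 61281)
print(f"kernel: {k_rate:.2f} MACs/cycle, {k_eff:.2f} %")
print(f"system: {s_rate:.2f} MACs/cycle, {s_eff:.2f} %")
print("kernels per tile:", kernels_per_tile(128, 128, 128, tiles=2))
```

Under these assumptions the row reproduces as 28.49 and 44.52 for the kernel columns and 17.11 and 26.74 for the system columns, with 32 kernels per tile. The gap between kernel and system efficiency is the share of total cycles spent outside the kernel body on that tile.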