Snitch Cluster peripherals

This section documents the registers exposed by the Snitch cluster to interface with various cluster-level peripherals, including the performance counters.

Summary

Name Offset Length Description
snitch_cluster_peripheral.PERF_COUNTER_ENABLE_0 0x0 8 Enable particular performance counter and start tracking.
snitch_cluster_peripheral.PERF_COUNTER_ENABLE_1 0x8 8 Enable particular performance counter and start tracking.
snitch_cluster_peripheral.PERF_COUNTER_ENABLE_2 0x10 8 Enable particular performance counter and start tracking.
snitch_cluster_peripheral.PERF_COUNTER_ENABLE_3 0x18 8 Enable particular performance counter and start tracking.
snitch_cluster_peripheral.PERF_COUNTER_ENABLE_4 0x20 8 Enable particular performance counter and start tracking.
snitch_cluster_peripheral.PERF_COUNTER_ENABLE_5 0x28 8 Enable particular performance counter and start tracking.
snitch_cluster_peripheral.PERF_COUNTER_ENABLE_6 0x30 8 Enable particular performance counter and start tracking.
snitch_cluster_peripheral.PERF_COUNTER_ENABLE_7 0x38 8 Enable particular performance counter and start tracking.
snitch_cluster_peripheral.PERF_COUNTER_ENABLE_8 0x40 8 Enable particular performance counter and start tracking.
snitch_cluster_peripheral.PERF_COUNTER_ENABLE_9 0x48 8 Enable particular performance counter and start tracking.
snitch_cluster_peripheral.PERF_COUNTER_ENABLE_10 0x50 8 Enable particular performance counter and start tracking.
snitch_cluster_peripheral.PERF_COUNTER_ENABLE_11 0x58 8 Enable particular performance counter and start tracking.
snitch_cluster_peripheral.PERF_COUNTER_ENABLE_12 0x60 8 Enable particular performance counter and start tracking.
snitch_cluster_peripheral.PERF_COUNTER_ENABLE_13 0x68 8 Enable particular performance counter and start tracking.
snitch_cluster_peripheral.PERF_COUNTER_ENABLE_14 0x70 8 Enable particular performance counter and start tracking.
snitch_cluster_peripheral.PERF_COUNTER_ENABLE_15 0x78 8 Enable particular performance counter and start tracking.
snitch_cluster_peripheral.HART_SELECT_0 0x80 8 Select from which hart in the cluster, starting from 0, the event should be counted.
snitch_cluster_peripheral.HART_SELECT_1 0x88 8 Select from which hart in the cluster, starting from 0, the event should be counted.
snitch_cluster_peripheral.HART_SELECT_2 0x90 8 Select from which hart in the cluster, starting from 0, the event should be counted.
snitch_cluster_peripheral.HART_SELECT_3 0x98 8 Select from which hart in the cluster, starting from 0, the event should be counted.
snitch_cluster_peripheral.HART_SELECT_4 0xa0 8 Select from which hart in the cluster, starting from 0, the event should be counted.
snitch_cluster_peripheral.HART_SELECT_5 0xa8 8 Select from which hart in the cluster, starting from 0, the event should be counted.
snitch_cluster_peripheral.HART_SELECT_6 0xb0 8 Select from which hart in the cluster, starting from 0, the event should be counted.
snitch_cluster_peripheral.HART_SELECT_7 0xb8 8 Select from which hart in the cluster, starting from 0, the event should be counted.
snitch_cluster_peripheral.HART_SELECT_8 0xc0 8 Select from which hart in the cluster, starting from 0, the event should be counted.
snitch_cluster_peripheral.HART_SELECT_9 0xc8 8 Select from which hart in the cluster, starting from 0, the event should be counted.
snitch_cluster_peripheral.HART_SELECT_10 0xd0 8 Select from which hart in the cluster, starting from 0, the event should be counted.
snitch_cluster_peripheral.HART_SELECT_11 0xd8 8 Select from which hart in the cluster, starting from 0, the event should be counted.
snitch_cluster_peripheral.HART_SELECT_12 0xe0 8 Select from which hart in the cluster, starting from 0, the event should be counted.
snitch_cluster_peripheral.HART_SELECT_13 0xe8 8 Select from which hart in the cluster, starting from 0, the event should be counted.
snitch_cluster_peripheral.HART_SELECT_14 0xf0 8 Select from which hart in the cluster, starting from 0, the event should be counted.
snitch_cluster_peripheral.HART_SELECT_15 0xf8 8 Select from which hart in the cluster, starting from 0, the event should be counted.
snitch_cluster_peripheral.PERF_COUNTER_0 0x100 8 Performance counter. Set corresponding PERF_COUNTER_ENABLE bits depending on what performance metric you would like to track.
snitch_cluster_peripheral.PERF_COUNTER_1 0x108 8 Performance counter. Set corresponding PERF_COUNTER_ENABLE bits depending on what performance metric you would like to track.
snitch_cluster_peripheral.PERF_COUNTER_2 0x110 8 Performance counter. Set corresponding PERF_COUNTER_ENABLE bits depending on what performance metric you would like to track.
snitch_cluster_peripheral.PERF_COUNTER_3 0x118 8 Performance counter. Set corresponding PERF_COUNTER_ENABLE bits depending on what performance metric you would like to track.
snitch_cluster_peripheral.PERF_COUNTER_4 0x120 8 Performance counter. Set corresponding PERF_COUNTER_ENABLE bits depending on what performance metric you would like to track.
snitch_cluster_peripheral.PERF_COUNTER_5 0x128 8 Performance counter. Set corresponding PERF_COUNTER_ENABLE bits depending on what performance metric you would like to track.
snitch_cluster_peripheral.PERF_COUNTER_6 0x130 8 Performance counter. Set corresponding PERF_COUNTER_ENABLE bits depending on what performance metric you would like to track.
snitch_cluster_peripheral.PERF_COUNTER_7 0x138 8 Performance counter. Set corresponding PERF_COUNTER_ENABLE bits depending on what performance metric you would like to track.
snitch_cluster_peripheral.PERF_COUNTER_8 0x140 8 Performance counter. Set corresponding PERF_COUNTER_ENABLE bits depending on what performance metric you would like to track.
snitch_cluster_peripheral.PERF_COUNTER_9 0x148 8 Performance counter. Set corresponding PERF_COUNTER_ENABLE bits depending on what performance metric you would like to track.
snitch_cluster_peripheral.PERF_COUNTER_10 0x150 8 Performance counter. Set corresponding PERF_COUNTER_ENABLE bits depending on what performance metric you would like to track.
snitch_cluster_peripheral.PERF_COUNTER_11 0x158 8 Performance counter. Set corresponding PERF_COUNTER_ENABLE bits depending on what performance metric you would like to track.
snitch_cluster_peripheral.PERF_COUNTER_12 0x160 8 Performance counter. Set corresponding PERF_COUNTER_ENABLE bits depending on what performance metric you would like to track.
snitch_cluster_peripheral.PERF_COUNTER_13 0x168 8 Performance counter. Set corresponding PERF_COUNTER_ENABLE bits depending on what performance metric you would like to track.
snitch_cluster_peripheral.PERF_COUNTER_14 0x170 8 Performance counter. Set corresponding PERF_COUNTER_ENABLE bits depending on what performance metric you would like to track.
snitch_cluster_peripheral.PERF_COUNTER_15 0x178 8 Performance counter. Set corresponding PERF_COUNTER_ENABLE bits depending on what performance metric you would like to track.
snitch_cluster_peripheral.CL_CLINT_SET 0x180 8 Set bits in the cluster-local CLINT. Writing a 1 at location i sets the cluster-local interrupt of hart i.
snitch_cluster_peripheral.CL_CLINT_CLEAR 0x188 8 Clear bits in the cluster-local CLINT. Writing a 1 at location i clears the cluster-local interrupt of hart i.
snitch_cluster_peripheral.HW_BARRIER 0x190 8 Hardware barrier register. Loads to this register will block until all cores have performed the load.
snitch_cluster_peripheral.ICACHE_PREFETCH_ENABLE 0x198 8 Controls prefetching of the instruction cache.
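
The three banked register groups in the table above follow a regular layout: instance n sits 8 bytes after instance n-1. The following is a minimal C sketch of that arithmetic. The offsets are copied from the table; the macro and function names are illustrative and are not taken from any Snitch header.

```c
#include <assert.h>
#include <stdint.h>

/* Offsets copied from the summary table; all names here are illustrative. */
#define PERF_COUNTER_ENABLE_BASE 0x000u
#define HART_SELECT_BASE         0x080u
#define PERF_COUNTER_BASE        0x100u
#define REG_STRIDE               8u   /* every register is 8 bytes long */
#define NUM_PERF_COUNTERS        16u  /* instances 0..15 */

/* Byte offset of PERF_COUNTER_ENABLE_<n> within the peripheral block. */
static inline uint32_t perf_counter_enable_offset(uint32_t n) {
    assert(n < NUM_PERF_COUNTERS);
    return PERF_COUNTER_ENABLE_BASE + n * REG_STRIDE;
}

/* Byte offset of HART_SELECT_<n> within the peripheral block. */
static inline uint32_t hart_select_offset(uint32_t n) {
    assert(n < NUM_PERF_COUNTERS);
    return HART_SELECT_BASE + n * REG_STRIDE;
}

/* Byte offset of PERF_COUNTER_<n> within the peripheral block. */
static inline uint32_t perf_counter_offset(uint32_t n) {
    assert(n < NUM_PERF_COUNTERS);
    return PERF_COUNTER_BASE + n * REG_STRIDE;
}
```

These offsets are relative to the base address of the cluster peripheral block, which is platform-specific and not given on this page.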

PERF_COUNTER_ENABLE

Enable a particular performance counter and start tracking.

- Reset default: 0x0
- Reset mask: 0x7fffffff

Instances

Name Offset
PERF_COUNTER_ENABLE_0 0x0
PERF_COUNTER_ENABLE_1 0x8
PERF_COUNTER_ENABLE_2 0x10
PERF_COUNTER_ENABLE_3 0x18
PERF_COUNTER_ENABLE_4 0x20
PERF_COUNTER_ENABLE_5 0x28
PERF_COUNTER_ENABLE_6 0x30
PERF_COUNTER_ENABLE_7 0x38
PERF_COUNTER_ENABLE_8 0x40
PERF_COUNTER_ENABLE_9 0x48
PERF_COUNTER_ENABLE_10 0x50
PERF_COUNTER_ENABLE_11 0x58
PERF_COUNTER_ENABLE_12 0x60
PERF_COUNTER_ENABLE_13 0x68
PERF_COUNTER_ENABLE_14 0x70
PERF_COUNTER_ENABLE_15 0x78

Fields

Bits Type Reset Name
63:31 Reserved
30 rw 0x0 ICACHE_STALL
29 rw 0x0 ICACHE_DOUBLE_HIT
28 rw 0x0 ICACHE_PREFETCH
27 rw 0x0 ICACHE_HIT
26 rw 0x0 ICACHE_MISS
25 rw 0x0 DMA_BUSY
24 rw 0x0 DMA_B_DONE
23 rw 0x0 DMA_W_BW
22 rw 0x0 DMA_W_DONE
21 rw 0x0 DMA_R_BW
20 rw 0x0 DMA_R_DONE
19 rw 0x0 DMA_AR_BW
18 rw 0x0 DMA_AR_DONE
17 rw 0x0 DMA_AW_BW
16 rw 0x0 DMA_AW_DONE
15 rw 0x0 DMA_BUF_R_STALL
14 rw 0x0 DMA_BUF_W_STALL
13 rw 0x0 DMA_W_STALL
12 rw 0x0 DMA_R_STALL
11 rw 0x0 DMA_AR_STALL
10 rw 0x0 DMA_AW_STALL
9 rw 0x0 RETIRED_ACC
8 rw 0x0 RETIRED_I
7 rw 0x0 RETIRED_LOAD
6 rw 0x0 RETIRED_INSTR
5 rw 0x0 ISSUE_CORE_TO_FPU
4 rw 0x0 ISSUE_FPU_SEQ
3 rw 0x0 ISSUE_FPU
2 rw 0x0 TCDM_CONGESTED
1 rw 0x0 TCDM_ACCESSED
0 rw 0x0 CYCLE
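
The bit positions in the table above translate directly into single-bit masks. The sketch below shows one way to express them in C. The enum and function names are hypothetical, and the example builds a mask for a single event only, since this page does not state how a counter behaves when several enable bits are set at once.

```c
#include <stdint.h>

/* Bit positions copied from the field table; names are illustrative. */
enum perf_event_bit {
    PERF_EVT_CYCLE             = 0,
    PERF_EVT_TCDM_ACCESSED     = 1,
    PERF_EVT_TCDM_CONGESTED    = 2,
    PERF_EVT_ISSUE_FPU         = 3,
    PERF_EVT_ISSUE_FPU_SEQ     = 4,
    PERF_EVT_ISSUE_CORE_TO_FPU = 5,
    PERF_EVT_RETIRED_INSTR     = 6,
    PERF_EVT_RETIRED_LOAD      = 7,
    PERF_EVT_RETIRED_I         = 8,
    PERF_EVT_RETIRED_ACC       = 9,
    PERF_EVT_DMA_AW_STALL      = 10,
    PERF_EVT_DMA_AR_STALL      = 11,
    PERF_EVT_DMA_R_STALL       = 12,
    PERF_EVT_DMA_W_STALL       = 13,
    PERF_EVT_DMA_BUF_W_STALL   = 14,
    PERF_EVT_DMA_BUF_R_STALL   = 15,
    PERF_EVT_DMA_AW_DONE       = 16,
    PERF_EVT_DMA_AW_BW         = 17,
    PERF_EVT_DMA_AR_DONE       = 18,
    PERF_EVT_DMA_AR_BW         = 19,
    PERF_EVT_DMA_R_DONE        = 20,
    PERF_EVT_DMA_R_BW          = 21,
    PERF_EVT_DMA_W_DONE        = 22,
    PERF_EVT_DMA_W_BW          = 23,
    PERF_EVT_DMA_B_DONE        = 24,
    PERF_EVT_DMA_BUSY          = 25,
    PERF_EVT_ICACHE_MISS       = 26,
    PERF_EVT_ICACHE_HIT        = 27,
    PERF_EVT_ICACHE_PREFETCH   = 28,
    PERF_EVT_ICACHE_DOUBLE_HIT = 29,
    PERF_EVT_ICACHE_STALL      = 30
};

/* Bits 30:0 are defined; bits 63:31 are reserved (reset mask 0x7fffffff). */
#define PERF_COUNTER_ENABLE_VALID_MASK 0x7fffffffull

/* Value to write to a PERF_COUNTER_ENABLE_<n> register for one event. */
static inline uint64_t perf_event_mask(enum perf_event_bit bit) {
    return (1ull << bit) & PERF_COUNTER_ENABLE_VALID_MASK;
}
```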

PERF_COUNTER_ENABLE . ICACHE_STALL

Incremented for instruction cache stalls. This is a hart-local signal.

PERF_COUNTER_ENABLE . ICACHE_DOUBLE_HIT

Incremented for instruction cache double hits. This is a hart-local signal.

PERF_COUNTER_ENABLE . ICACHE_PREFETCH

Incremented for instruction cache prefetches. This is a hart-local signal.

PERF_COUNTER_ENABLE . ICACHE_HIT

Incremented for instruction cache hits. This is a hart-local signal.

PERF_COUNTER_ENABLE . ICACHE_MISS

Incremented for instruction cache misses. This is a hart-local signal.

PERF_COUNTER_ENABLE . DMA_BUSY

Incremented whenever the DMA is busy. This is a DMA-local signal.

PERF_COUNTER_ENABLE . DMA_B_DONE

Incremented whenever a B handshake occurs. This is a DMA-local signal.

PERF_COUNTER_ENABLE . DMA_W_BW

Whenever a W handshake occurs, the counter is incremented by the number of bytes transferred in this cycle. This is a DMA-local signal.

PERF_COUNTER_ENABLE . DMA_W_DONE

Incremented whenever a W handshake occurs. This is a DMA-local signal.

PERF_COUNTER_ENABLE . DMA_R_BW

Whenever an R handshake occurs, the counter is incremented by the number of bytes transferred in this cycle. This is a DMA-local signal.

PERF_COUNTER_ENABLE . DMA_R_DONE

Incremented whenever an R handshake occurs. This is a DMA-local signal.

PERF_COUNTER_ENABLE . DMA_AR_BW

Whenever an AR handshake occurs, the counter is incremented by the number of bytes transferred for this transaction. This is a DMA-local signal.

PERF_COUNTER_ENABLE . DMA_AR_DONE

Incremented whenever an AR handshake occurs. This is a DMA-local signal.

PERF_COUNTER_ENABLE . DMA_AW_BW

Whenever an AW handshake occurs, the counter is incremented by the number of bytes transferred for this transaction. This is a DMA-local signal.

PERF_COUNTER_ENABLE . DMA_AW_DONE

Incremented whenever an AW handshake occurs. This is a DMA-local signal.

PERF_COUNTER_ENABLE . DMA_BUF_R_STALL

Incremented whenever r_valid = 1 but r_ready = 0. This is a DMA-local signal.

PERF_COUNTER_ENABLE . DMA_BUF_W_STALL

Incremented whenever w_ready = 1 but w_valid = 0. This is a DMA-local signal.

PERF_COUNTER_ENABLE . DMA_W_STALL

Incremented whenever w_valid = 1 but w_ready = 0. This is a DMA-local signal.

PERF_COUNTER_ENABLE . DMA_R_STALL

Incremented whenever r_ready = 1 but r_valid = 0. This is a DMA-local signal.

PERF_COUNTER_ENABLE . DMA_AR_STALL

Incremented whenever ar_valid = 1 but ar_ready = 0. This is a DMA-local signal.

PERF_COUNTER_ENABLE . DMA_AW_STALL

Incremented whenever aw_valid = 1 but aw_ready = 0. This is a DMA-local signal.

PERF_COUNTER_ENABLE . RETIRED_ACC

Offloaded instructions retired by the core. This is a hart-local signal.

PERF_COUNTER_ENABLE . RETIRED_I

Base instructions retired by the core. This is a hart-local signal.

PERF_COUNTER_ENABLE . RETIRED_LOAD

Load instructions retired by the core. This is a hart-local signal.

PERF_COUNTER_ENABLE . RETIRED_INSTR

Instructions retired by the core, both offloaded and not. Does not count instructions issued independently by the FPU sequencer. This is a hart-local signal.

PERF_COUNTER_ENABLE . ISSUE_CORE_TO_FPU

Incremented whenever the core issues an FPU instruction. This is a hart-local signal.

PERF_COUNTER_ENABLE . ISSUE_FPU_SEQ

Incremented whenever the FPU Sequencer issues an FPU instruction. Might not be available if the hardware doesn't support FREP. Note that all FP instructions offloaded by the core to the FPU are routed through the sequencer (although not necessarily buffered) and thus are also counted. The instructions issued independently by the FPU sequencer could thus be calculated as ISSUE_FPU_SEQ_PROPER = ISSUE_FPU_SEQ - ISSUE_CORE_TO_FPU. This is a hart-local signal.
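
The subtraction ISSUE_FPU_SEQ_PROPER = ISSUE_FPU_SEQ - ISSUE_CORE_TO_FPU can be sketched as a small helper. This assumes both values were read from counters tracking the same hart over the same interval. The function name is hypothetical, and clamping to zero is an added safeguard against sampling skew rather than documented hardware behavior.

```c
#include <stdint.h>

/* Instructions issued independently by the FPU sequencer, derived from two
 * counter readings taken for the same hart over the same interval.
 * Clamps to 0 if skew between the two reads makes the difference negative
 * (an assumption of this sketch, not documented behavior). */
static inline uint64_t issue_fpu_seq_proper(uint64_t issue_fpu_seq,
                                            uint64_t issue_core_to_fpu) {
    if (issue_core_to_fpu > issue_fpu_seq) {
        return 0;
    }
    return issue_fpu_seq - issue_core_to_fpu;
}
```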

PERF_COUNTER_ENABLE . ISSUE_FPU

Operations performed in the FPU. Includes both operations initiated by the sequencer and by the core. When the Xfrep extension is available, this counter is equivalent to ISSUE_FPU_SEQ (see description of ISSUE_FPU_SEQ). If the Xfrep extension is not supported, then it is equivalent to ISSUE_CORE_TO_FPU. This is a hart-local signal.

PERF_COUNTER_ENABLE . TCDM_CONGESTED

Incremented whenever an access towards the TCDM is made but the arbitration logic didn't grant the access (due to congestion). It's strictly less than TCDM_ACCESSED. This is a cluster-global signal.

PERF_COUNTER_ENABLE . TCDM_ACCESSED

Increased whenever the TCDM is accessed. Each individual access is tracked, so if n cores access the TCDM, n will be added. Accesses are tracked at the TCDM, so it doesn't matter whether the cores or for example the SSR hardware accesses the TCDM. This is a cluster-global signal.
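
Because TCDM_CONGESTED and TCDM_ACCESSED are both cluster-global, one possible use is to sample them over the same interval and compare them. The helper below is an illustrative derived metric, not something defined by this page. It returns the fraction of tracked accesses that were not granted and treats an interval with no accesses as zero congestion.

```c
#include <stdint.h>

/* Fraction of TCDM accesses that were not granted, computed from two counter
 * readings covering the same interval. Returns 0.0 when nothing was accessed.
 * The name and the choice of a ratio are assumptions of this sketch. */
static inline double tcdm_congestion_ratio(uint64_t congested,
                                           uint64_t accessed) {
    if (accessed == 0) {
        return 0.0;
    }
    return (double)congested / (double)accessed;
}
```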

PERF_COUNTER_ENABLE . CYCLE

Cycle counter. Counts up as long as the cluster is powered.

HART_SELECT

Select from which hart in the cluster, starting from 0, the event should be counted. The hart can be selected individually for each performance counter. If a hart index beyond the cluster's total number of harts is selected, the selection wraps and the hart corresponding to hart_select % total_harts_in_cluster is selected.

- Reset default: 0x0
- Reset mask: 0x3ff

Instances

Name Offset
HART_SELECT_0 0x80
HART_SELECT_1 0x88
HART_SELECT_2 0x90
HART_SELECT_3 0x98
HART_SELECT_4 0xa0
HART_SELECT_5 0xa8
HART_SELECT_6 0xb0
HART_SELECT_7 0xb8
HART_SELECT_8 0xc0
HART_SELECT_9 0xc8
HART_SELECT_10 0xd0
HART_SELECT_11 0xd8
HART_SELECT_12 0xe0
HART_SELECT_13 0xe8
HART_SELECT_14 0xf0
HART_SELECT_15 0xf8

Fields

Bits Type Reset Name Description
63:10 Reserved
9:0 rw x HART_SELECT Select source of per-hart performance counter
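
The wrap-around rule described for HART_SELECT (hart_select % total_harts_in_cluster) can be modelled in software as follows. The function name is hypothetical. Masking to bits 9:0 mirrors the documented field width; how the hardware treats writes to the reserved bits is not stated here, so that step is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define HART_SELECT_MASK 0x3ffu /* 10-bit field, bits 9:0 */

/* Hart that ends up selected for a given HART_SELECT value, following the
 * documented modulo rule. total_harts_in_cluster must be nonzero. */
static inline uint32_t effective_hart(uint32_t hart_select,
                                      uint32_t total_harts_in_cluster) {
    assert(total_harts_in_cluster > 0);
    return (hart_select & HART_SELECT_MASK) % total_harts_in_cluster;
}
```

For example, in a cluster with 9 harts, a HART_SELECT value of 11 would select hart 2 under this model.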

PERF_COUNTER

Performance counter. Set corresponding PERF_COUNTER_ENABLE bits depending on what performance metric you would like to track.

- Reset default: 0x0
- Reset mask: 0xffffffffffff

Instances

Name Offset
PERF_COUNTER_0 0x100
PERF_COUNTER_1 0x108
PERF_COUNTER_2 0x110
PERF_COUNTER_3 0x118
PERF_COUNTER_4 0x120
PERF_COUNTER_5 0x128
PERF_COUNTER_6 0x130
PERF_COUNTER_7 0x138
PERF_COUNTER_8 0x140
PERF_COUNTER_9 0x148
PERF_COUNTER_10 0x150
PERF_COUNTER_11 0x158
PERF_COUNTER_12 0x160
PERF_COUNTER_13 0x168
PERF_COUNTER_14 0x170
PERF_COUNTER_15 0x178

Fields

Bits Type Reset Name Description
63:48 Reserved
47:0 rw x PERF_COUNTER Performance counter
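
Putting the three register groups together, a measurement could be sketched as: select a hart, enable an event, then read the counter before and after the region of interest. The code below is a minimal sketch under several assumptions: the helper names are hypothetical, the peripheral base pointer is supplied by the caller, 64-bit accesses are supported, writing HART_SELECT before PERF_COUNTER_ENABLE is an arbitrary ordering, and the counter is assumed to wrap at 48 bits rather than saturate.

```c
#include <assert.h>
#include <stdint.h>

/* Offsets and widths copied from this page; all names are illustrative. */
#define PERF_COUNTER_ENABLE_BASE 0x000u
#define HART_SELECT_BASE         0x080u
#define PERF_COUNTER_BASE        0x100u
#define REG_STRIDE               8u
#define NUM_PERF_COUNTERS        16u
#define PERF_COUNTER_MASK        0xffffffffffffull /* 48-bit field, bits 47:0 */

/* Pointer to the 64-bit register at a byte offset from the peripheral base. */
static inline volatile uint64_t *reg64(volatile uint8_t *base, uint32_t offset) {
    return (volatile uint64_t *)(base + offset);
}

/* Route an event of one hart to counter n (assumed ordering, see lead-in). */
static inline void perf_counter_configure(volatile uint8_t *base, uint32_t n,
                                          uint64_t hart, uint64_t enable_bits) {
    assert(n < NUM_PERF_COUNTERS);
    *reg64(base, HART_SELECT_BASE + n * REG_STRIDE) = hart;
    *reg64(base, PERF_COUNTER_ENABLE_BASE + n * REG_STRIDE) = enable_bits;
}

/* Read counter n, keeping only the 48 defined bits. */
static inline uint64_t perf_counter_read(volatile uint8_t *base, uint32_t n) {
    assert(n < NUM_PERF_COUNTERS);
    return *reg64(base, PERF_COUNTER_BASE + n * REG_STRIDE) & PERF_COUNTER_MASK;
}

/* Events counted between two readings, assuming at most one 48-bit wrap. */
static inline uint64_t perf_counter_delta(uint64_t start, uint64_t end) {
    return (end - start) & PERF_COUNTER_MASK;
}
```

On a target, `base` would be the address of the cluster peripheral block, which is platform-specific and not given on this page. On a host, the same functions can be exercised against a plain `uint64_t` array standing in for the register file.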

CL_CLINT_SET

Set bits in the cluster-local CLINT. Writing a 1 at location i sets the cluster-local interrupt of hart i, where i is relative to the first hart in the cluster, ignoring the cluster base hart ID.

- Offset: 0x180
- Reset default: 0x0
- Reset mask: 0xffffffff

Fields

Bits Type Reset Name Description
63:32 Reserved
31:0 wo x CL_CLINT_SET Set cluster-local interrupt of hart i

CL_CLINT_CLEAR

Clear bits in the cluster-local CLINT. Writing a 1 at location i clears the cluster-local interrupt of hart i, where i is relative to the first hart in the cluster, ignoring the cluster base hart ID.

- Offset: 0x188
- Reset default: 0x0
- Reset mask: 0xffffffff

Fields

Bits Type Reset Name Description
63:32 Reserved
31:0 wo x CL_CLINT_CLEAR Clear cluster-local interrupt of hart i
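
Since CL_CLINT_SET and CL_CLINT_CLEAR are write-only and act per bit, raising or clearing the interrupt of one hart needs only a single write of a one-hot value, with no read-modify-write. The sketch below illustrates this. The helper names are hypothetical, the peripheral base pointer is supplied by the caller, and 64-bit stores are assumed to be supported.

```c
#include <assert.h>
#include <stdint.h>

/* Offsets copied from this page; all names are illustrative. */
#define CL_CLINT_SET_OFFSET   0x180u
#define CL_CLINT_CLEAR_OFFSET 0x188u
#define CL_CLINT_NUM_BITS     32u /* field spans bits 31:0 */

/* One-hot mask for a hart index relative to the first hart in the cluster. */
static inline uint64_t cl_clint_hart_bit(uint32_t local_hart) {
    assert(local_hart < CL_CLINT_NUM_BITS);
    return 1ull << local_hart;
}

/* Raise the cluster-local interrupt of one hart. */
static inline void cl_clint_set(volatile uint8_t *periph_base,
                                uint32_t local_hart) {
    *(volatile uint64_t *)(periph_base + CL_CLINT_SET_OFFSET) =
        cl_clint_hart_bit(local_hart);
}

/* Clear the cluster-local interrupt of one hart. */
static inline void cl_clint_clear(volatile uint8_t *periph_base,
                                  uint32_t local_hart) {
    *(volatile uint64_t *)(periph_base + CL_CLINT_CLEAR_OFFSET) =
        cl_clint_hart_bit(local_hart);
}
```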

HW_BARRIER

Hardware barrier register. Loads to this register will block until all cores have performed the load. At this stage we know that they reached the same point in the control flow, i.e., the cores are synchronized.

- Offset: 0x190
- Reset default: 0x0
- Reset mask: 0xffffffff

Fields

Bits Type Reset Name Description
63:32 Reserved
31:0 ro x HW_BARRIER Hardware barrier register.
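
From software, the barrier described above amounts to a single volatile load that every core executes. The sketch below assumes a caller-supplied peripheral base pointer and a 64-bit load; the function name is hypothetical. The meaning of the returned value is not documented on this page, so callers would normally ignore it.

```c
#include <stdint.h>

#define HW_BARRIER_OFFSET 0x190u /* offset copied from this page */

/* Perform the barrier load. On hardware this is expected to block until all
 * cores have issued the same load; the volatile qualifier keeps the compiler
 * from removing or reordering the access. */
static inline uint32_t cluster_hw_barrier(volatile uint8_t *periph_base) {
    uint64_t value = *(volatile uint64_t *)(periph_base + HW_BARRIER_OFFSET);
    return (uint32_t)value; /* defined field is bits 31:0 */
}
```

When run against an ordinary memory buffer on a host, the load returns immediately, so this only demonstrates the access pattern and not the blocking behavior.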

ICACHE_PREFETCH_ENABLE

Controls prefetching of the instruction cache.

- Offset: 0x198
- Reset default: 0x1
- Reset mask: 0x1

Fields

Bits Type Reset Name Description
63:1 Reserved
0 wo 0x1 ICACHE_PREFETCH_ENABLE Enable instruction prefetching.
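
Because the field is write-only, software cannot read back the current prefetch setting and would have to track it separately if needed. The following is a minimal sketch of toggling the bit, assuming a caller-supplied peripheral base pointer and a 64-bit store; the function name is hypothetical.

```c
#include <stdint.h>

#define ICACHE_PREFETCH_ENABLE_OFFSET 0x198u /* offset copied from this page */

/* Enable (nonzero) or disable (zero) instruction cache prefetching.
 * Only bit 0 is defined, so the value written is always 0 or 1. */
static inline void icache_prefetch_set(volatile uint8_t *periph_base,
                                       int enable) {
    *(volatile uint64_t *)(periph_base + ICACHE_PREFETCH_ENABLE_OFFSET) =
        enable ? 1u : 0u;
}
```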