ZigZag - Deep Learning Hardware Design Space Exploration
This repository presents the new version of our tried-and-tested hardware Architecture-Mapping Design Space Exploration (DSE) Framework for Deep Learning (DL) accelerators. ZigZag bridges the gap between algorithmic DL decisions and their acceleration cost on specialized accelerators through fast and accurate hardware cost estimation.
Mapping Class Reference

Collects information about a complete mapping (spatial and temporal).

Public Member Functions

def __init__ (self, Accelerator accelerator, SpatialMappingPerMemLvl|SpatialMappingInternal spatial_mapping, TemporalMapping temporal_mapping, LayerNode layer_node, bool access_same_data_considered_as_no_access=False)
 
def combine_spatial_temporal_mapping_dict (self)
 Combine the spatial and temporal mapping dictionaries into combined_mapping_dict by inserting spatial loops above temporal loops at each level.
 
list[bool] get_psum_flags (self)
 This function generates a list "psum_flag" that identifies whether each output memory level holds partial or final outputs.
 
def gen_data_precision_dict (self)
 This function generates a dictionary that collects the data precision of each operand at each architectural level.
 
def gen_r_ir_loop_list (self)
 Given the combined mapping, generate the relevant/irrelevant (r/ir) loop size list at each level for each operand.
 
def calc_data_size (self)
 Based on the r loop size list, calculate the data size held by each architectural level.
 
def calc_effective_data_size (self)
 Calculate the effective data size used to derive the allowed memory update window in the latency calculation.
 
def calc_data_access (self)
 Based on the ir loop size list and the total MAC Op count, calculate the data access at each memory level in a bottom-up way.
 
def calc_req_mem_bw_and_data_transfer_rate (self)
 This function calculates the average and instantaneous required memory bandwidth and the periodic data transfer pattern.
 
def disable_data_traffic_external (self)
 This function sets all data traffic between the top-level memory and the external world to 0 in unit_mem_data_movement.
 

Public Attributes

 accelerator
 
 spatial_mapping
 
 temporal_mapping
 
 layer_node
 
 operand_list
 
 access_same_data_considered_as_no_access
 
 mem_level
 
 combined_mapping_dict_1s1t_reform
 
 combined_mapping_dict_1s2t_reform
 
 psum_flag
 
 combined_mapping_dict_1s1t
 
 combined_mapping_dict_1s2t
 
 data_precision_dict
 
 r_loop_size_per_level
 
 r_loop_size_per_level2
 
 ir_loop_size_per_level
 
 r_loop_size_cabl
 
 r_loop_size_cabl2
 
 ir_loop_size_cabl
 
 ir_loop_size_cabl2
 
 output_ir_loop_size_caal
 
 data_elem_per_level_unrolled
 
 data_bit_per_level_unrolled
 
 data_elem_per_level
 
 data_bit_per_level
 
 effective_data_elem
 
 effective_data_bit
 
 data_access_raw
 
 data_access_raw2
 

Detailed Description

Collects information about a complete mapping (spatial and temporal).

NOTE: Mapping is hardware-unaware, i.e., it does not take in hardware information such as memory bandwidth, access cost, or size.

Constructor & Destructor Documentation

◆ __init__()

def __init__ (self, Accelerator accelerator, SpatialMappingPerMemLvl | SpatialMappingInternal spatial_mapping, TemporalMapping temporal_mapping, LayerNode layer_node, bool access_same_data_considered_as_no_access = False)

Member Function Documentation

◆ calc_data_access()

def calc_data_access (   self)

Based on the ir loop size list and the total MAC Op count, calculate the data access at each memory level in a bottom-up way.
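
The bottom-up idea can be illustrated with a minimal sketch. It assumes that the access count at a level equals the total MAC count divided by the product of the irrelevant (reuse) loop sizes at that level and all levels below it. The function name and the flat list layout are hypothetical and are not taken from the ZigZag source.

```python
from itertools import accumulate
from operator import mul


def data_access_per_level(total_mac_count: int, ir_loop_size_per_level: list[int]) -> list[int]:
    """Return the access count at each memory level, innermost level first.

    ir_loop_size_per_level[i] is the product of the irrelevant (reuse) loop
    sizes mapped at level i. Reuse accumulates bottom-up, so level i sees
    total_mac_count / (product of ir sizes at levels 0..i) accesses.
    This is an illustrative assumption, not the library's exact formula.
    """
    ir_current_and_below = list(accumulate(ir_loop_size_per_level, mul))
    return [total_mac_count // reuse for reuse in ir_current_and_below]


accesses = data_access_per_level(total_mac_count=1024, ir_loop_size_per_level=[4, 2, 8])
print(accesses)  # [256, 128, 16]
```

Each higher level benefits from all the reuse captured below it, which is why the counts shrink toward the outermost memory.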

◆ calc_data_size()

def calc_data_size (   self)

Based on the r loop size list, calculate the data size held by each architectural level.
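
As a rough illustration, the sketch below assumes that the number of elements held at a level is the cumulative product of the relevant (r) loop sizes at that level and below, and that the size in bits is that count times the operand precision at the level. The function name and argument layout are hypothetical.

```python
from itertools import accumulate
from operator import mul


def data_size_per_level(
    r_loop_size_per_level: list[int], precision_per_level: list[int]
) -> tuple[list[int], list[int]]:
    """Return (elements, bits) held at each level, innermost level first.

    Assumption: a level must hold every element indexed by the relevant
    loops at that level and at all levels below it.
    """
    elems = list(accumulate(r_loop_size_per_level, mul))
    bits = [e * p for e, p in zip(elems, precision_per_level)]
    return elems, bits


elems, bits = data_size_per_level([2, 4, 8], precision_per_level=[8, 8, 16])
print(elems)  # [2, 8, 64]
print(bits)   # [16, 64, 1024]
```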

◆ calc_effective_data_size()

def calc_effective_data_size (   self)

Calculate the effective data size used to derive the allowed memory update window in the latency calculation.

The effective data size is data_elem_per_level_unrolled divided by the size of the top r (relevant) loops.
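
A minimal sketch of this division follows. It assumes that the "top r loops" are the consecutive relevant loops at the outermost end of a level's loop nest, and that loops are stored innermost first as (dimension, size) pairs. Both helper names and the loop representation are hypothetical.

```python
def top_r_loop_size(loops: list[tuple[str, int]], relevant_dims: set[str]) -> int:
    """Product of the consecutive relevant loops at the top of a loop nest.

    `loops` is ordered innermost first, so the scan runs from the end and
    stops at the first irrelevant loop.
    """
    size = 1
    for dim, loop_size in reversed(loops):
        if dim not in relevant_dims:
            break
        size *= loop_size
    return size


def effective_data_elem(data_elem_unrolled: int, loops: list[tuple[str, int]], relevant_dims: set[str]) -> int:
    """Divide the unrolled element count by the size of the top r loops."""
    return data_elem_unrolled // top_r_loop_size(loops, relevant_dims)


# Output operand: K, OX, OY and B are assumed relevant; C is a reduction loop.
level_loops = [("C", 2), ("K", 4), ("OX", 8)]
output_relevant = {"K", "OX", "OY", "B"}
top_r = top_r_loop_size(level_loops, output_relevant)
effective = effective_data_elem(128, level_loops, output_relevant)
print(top_r, effective)  # 32 4
```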

◆ calc_req_mem_bw_and_data_transfer_rate()

def calc_req_mem_bw_and_data_transfer_rate (   self)

This function calculates the average and instantaneous required memory bandwidth and the periodic data transfer pattern.

◆ combine_spatial_temporal_mapping_dict()

def combine_spatial_temporal_mapping_dict (   self)

Combine the spatial and temporal mapping dictionaries into combined_mapping_dict by inserting spatial loops above temporal loops at each level.

  • combined_mapping_dict_1s1t: each level's spatial and temporal mappings are merged. Each level's data size is the total data size.
  • combined_mapping_dict_1s2t: each level's spatial mapping is merged into the temporal mapping of level+1. Each level's data size is the unrolled data size.
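
The two variants can be sketched as follows. This is a simplified illustration, not the ZigZag implementation: it assumes one operand, the same number of spatial and temporal levels, and loops stored innermost first as (dimension, size) pairs. The function names are hypothetical.

```python
Loop = tuple[str, int]


def combine_1s1t(spatial: list[list[Loop]], temporal: list[list[Loop]]) -> list[list[Loop]]:
    """Merge each level's spatial loops above that level's temporal loops."""
    return [t + s for s, t in zip(spatial, temporal)]


def combine_1s2t(spatial: list[list[Loop]], temporal: list[list[Loop]]) -> list[list[Loop]]:
    """Merge each level's spatial loops into the temporal loops of level+1."""
    combined = [list(temporal[0])]
    for level in range(1, len(temporal)):
        combined.append(spatial[level - 1] + temporal[level])
    if spatial[-1]:
        # Assumption: spatial loops of the top level get their own extra entry.
        combined.append(list(spatial[-1]))
    return combined


spatial = [[("K", 4)], []]
temporal = [[("C", 2)], [("OX", 8)]]
one_s_one_t = combine_1s1t(spatial, temporal)
one_s_two_t = combine_1s2t(spatial, temporal)
print(one_s_one_t)  # [[('C', 2), ('K', 4)], [('OX', 8)]]
print(one_s_two_t)  # [[('C', 2)], [('K', 4), ('OX', 8)]]
```

In the 1s1t result, level 0 includes the spatially unrolled K loop, so its data size covers all parallel units together. In the 1s2t result, level 0 contains only its own temporal loops, so its data size is that of a single unrolled unit.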

◆ disable_data_traffic_external()

def disable_data_traffic_external (   self)

This function sets all data traffic between the top-level memory and the external world to 0 in unit_mem_data_movement.

◆ gen_data_precision_dict()

def gen_data_precision_dict (   self)

This function generates a dictionary that collects the data precision of each operand at each architectural level.

◆ gen_r_ir_loop_list()

def gen_r_ir_loop_list (   self)

Given the combined mapping, generate the relevant/irrelevant (r/ir) loop size list at each level for each operand.
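
A minimal sketch of the split for a single operand is shown below. It assumes that a loop is relevant to an operand when its dimension indexes that operand, and that the per-level size is the product of the matching loop sizes. The function name, the loop representation, and the example dimension sets are hypothetical.

```python
from math import prod


def r_ir_loop_sizes(
    combined_mapping: list[list[tuple[str, int]]], relevant_dims: set[str]
) -> tuple[list[int], list[int]]:
    """Return per-level products of relevant and irrelevant loop sizes."""
    r_sizes, ir_sizes = [], []
    for loops in combined_mapping:
        r_sizes.append(prod(size for dim, size in loops if dim in relevant_dims))
        ir_sizes.append(prod(size for dim, size in loops if dim not in relevant_dims))
    return r_sizes, ir_sizes


# Weight operand of a convolution: K, C, FX and FY are assumed relevant.
mapping = [[("C", 2), ("K", 4)], [("OX", 8)]]
r_sizes, ir_sizes = r_ir_loop_sizes(mapping, {"K", "C", "FX", "FY"})
print(r_sizes)   # [8, 1]
print(ir_sizes)  # [1, 8]
```

A level with no loops of a given kind gets size 1, because the product of an empty sequence is 1.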

TODO cleanup

◆ get_psum_flags()

list[bool] get_psum_flags (   self)

This function generates a list "psum_flag" that identifies whether each output memory level holds partial or final outputs.

E.g., psum_flag = [True, True, False] means that there are 3 memory levels for output and only the outermost memory level holds the final output; the 1st and 2nd memory levels need to store partial outputs for some time. For indexing convenience, an extra False is appended to the end of the psum_flag list.
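
One way to derive such flags is sketched below. It assumes that an output memory level holds partial sums whenever a reduction (output-irrelevant) loop of size greater than 1 is mapped at any level above it, and that the extra False is appended after the per-level flags. The function name and input layout are hypothetical.

```python
def psum_flags(output_ir_loop_size_per_level: list[int]) -> list[bool]:
    """Return one flag per output memory level plus a trailing False.

    output_ir_loop_size_per_level[i] is the product of the reduction loop
    sizes mapped at memory level i, innermost level first.
    """
    sizes = output_ir_loop_size_per_level
    flags = []
    for level in range(len(sizes)):
        # Accumulation is still pending if any reduction loop sits above this level.
        flags.append(any(size > 1 for size in sizes[level + 1:]))
    flags.append(False)  # extra entry for indexing convenience
    return flags


flags = psum_flags([2, 4, 3])
print(flags)  # [True, True, False, False]
```

With three output memory levels, the first three entries match the example above and the fourth is the appended False.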

TODO cleanup


The documentation for this class was generated from the following file: