AutoDSE Tutorial

Design Space Definition

The candidate pragmas for AutoDSE can take any of the following forms:

#pragma ACCEL PIPELINE auto{pragma_name}
#pragma ACCEL PARALLEL factor=auto{pragma_name}
#pragma ACCEL TILE factor=auto{pragma_name}

You can either let AutoDSE augment the code with the candidate pragmas or do it yourself. To make AutoDSE do it, run one of the following commands based on your use case:

ds_generator [-I<include dir>] <kernel file> # only adds the candidate pragmas to the kernel code and stores the generated code in `rose_merlinkernel_[KERNEL_NAME].c`

autodse <project dir> <working dir> <kernel file> <fastgen|accurategen> [<database file>]  # adds the candidate pragmas and then runs DSE

Each of the above commands produces a file named ds_info.json, which describes the design space along with AutoDSE’s settings. Each pragma is defined in the following form:

"__PARA__L0" : {
		"ds_type" : "PARALLEL",
		"options" : "[x for x in [1,2,4,7,8,14,16,32] if x*__TILE__L0<=32]",
		"default" : 1,
		"order" : "0 if x&(x-1)==0 else 1"
}

Please refer to this file for a complete example of a design space (DS) definition for the GEMM kernel from the MachSuite benchmark.
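The `options` and `order` strings in the entry above read like Python expressions. As a minimal sketch, assuming they are evaluated as Python with dependent parameters such as `__TILE__L0` bound by name, and assuming a lower `order` value means "try first", the candidate values for `__PARA__L0` could be enumerated as follows. The helpers `candidate_values` and `ordered_values` are hypothetical and not part of AutoDSE.

```python
# Hypothetical helpers; AutoDSE's own evaluation logic may differ.
PARAM = {
    "ds_type": "PARALLEL",
    "options": "[x for x in [1,2,4,7,8,14,16,32] if x*__TILE__L0<=32]",
    "default": 1,
    "order": "0 if x&(x-1)==0 else 1",
}


def candidate_values(param, deps):
    """Evaluate the 'options' expression with dependent parameters bound."""
    # Dependencies go into the globals dict so that names used inside the
    # list comprehension resolve on every Python 3 version.
    # Builtins are disabled; only evaluate files you trust anyway.
    return eval(param["options"], {"__builtins__": {}, **deps})


def ordered_values(param, deps):
    """Sort the candidates by the 'order' expression (assumed: lower first)."""
    def rank(x):
        return eval(param["order"], {"__builtins__": {}, "x": x})

    # sorted() is stable, so values with equal rank keep their listed order.
    return sorted(candidate_values(param, deps), key=rank)
```

With `__TILE__L0` set to 4, the condition `x*__TILE__L0<=32` keeps only 1, 2, 4, 7, and 8, and the `order` expression ranks the powers of two ahead of 7.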

AutoDSE Settings

The rest of the settings in the config file determine how AutoDSE is run. The following table describes each of them.

| Setting | Description |
| --- | --- |
| design-space.max-part-num | maximum number of DS partitions allowed |
| evaluate.command.bitgen | Merlin command to generate the bitstream |
| evaluate.command.hls | Merlin command to run the HLS synthesis |
| evaluate.command.transform | Merlin command for applying its code transformations |
| evaluate.max-util.[XX] | maximum allowed utilization for resource XX |
| project.backup | type of project backup; choices: BACKUP_ERROR, NO_BACKUP, or BACKUP_ALL |
| … | number of top designs generated as Merlin projects in fast mode |
| search.algorithm.gradient.fine-grained-first | if set to true, the bottleneck optimizer starts optimizing from the innermost loops |
| search.algorithm.gradient.latency-threshold | minimum latency threshold we wish to achieve |
| search.algorithm.gradient.quality-type | how to measure the quality of a design; choices: performance or finite-difference |
| … | exploration strategy; use bottleneck to implement the paper’s approach, exhaustive for exhaustive search, or hybrid for a combination of the two |
| timeout.bitgen | time limit (in minutes) for generating the bitstream |
| timeout.exploration | time limit (in minutes) for DSE |
| timeout.hls | time limit (in minutes) for the HLS synthesis |
| timeout.transform | time limit (in minutes) for Merlin’s code transformation |
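All four `timeout.*` limits are given in minutes, so a script that wraps the tools may want them in seconds. The sketch below assumes the settings have already been loaded into a flat Python dict keyed by the dotted names listed above; how they are laid out inside the config file is not specified here. The values are made up for illustration, and `timeouts_in_seconds` is a hypothetical helper.

```python
def timeouts_in_seconds(settings):
    """Return {stage: seconds} for every 'timeout.<stage>' entry (in minutes)."""
    prefix = "timeout."
    result = {}
    for name, minutes in settings.items():
        if not name.startswith(prefix):
            continue  # not a timeout setting
        if minutes <= 0:
            raise ValueError(f"{name} must be positive, got {minutes}")
        result[name[len(prefix):]] = minutes * 60
    return result


# Illustrative values only; they are not AutoDSE defaults.
example = {
    "timeout.exploration": 240,
    "timeout.transform": 5,
    "timeout.hls": 80,
    "timeout.bitgen": 480,
    "design-space.max-part-num": 4,
}
limits = timeouts_in_seconds(example)
```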

Output Structure

The generated files are organized as follows:

+-- evaluate          # backup directory which includes some of the explored Merlin projects

+-- output            # includes the Merlin project for top designs and a summary of them 

+-- logs              # all the log files
|   +-- dse.log         # contains all the messages printed in the console
|   +-- eval.log        # shows how many jobs were run and whether or not they were successful
|   +-- partX_log.log   # includes a summary of the status of explorer in DS partition X
|   +-- partX_expr.log  # includes a more detailed summary of the explorer in DS partition X

+-- result.db        # includes a database of the explored designs

+-- summary_[XX].rpt # summarizes all the points explored in mode XX (`fast` or `accurate`)
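To inspect an exploration, it can help to collect the per-partition logs programmatically. The following is a minimal sketch, assuming the directory layout shown above and assuming that `X` in `partX_log.log` and `partX_expr.log` is a decimal partition index. The function name `partition_logs` is hypothetical.

```python
import re
from pathlib import Path


def partition_logs(work_dir):
    """Map partition index -> {'log': Path, 'expr': Path} found under logs/."""
    pattern = re.compile(r"part(\d+)_(log|expr)\.log$")
    found = {}
    for path in sorted(Path(work_dir, "logs").glob("part*_*.log")):
        match = pattern.match(path.name)
        if match:
            index, kind = int(match.group(1)), match.group(2)
            found.setdefault(index, {})[kind] = path
    return found
```

Files such as `dse.log` and `eval.log` do not match the pattern and are skipped.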

General Tips

  1. When AutoDSE finishes, it saves all the explored design points in a file (by default: redis.db) as a Redis database. If such a file exists, you can resume the DSE by re-running the DSE command:
dse <project dir> <working dir> <config file> <fast|accurate> [<database file>]
  2. Since AutoDSE runs the HLS tool to assess each design point, the timeout values may need to be changed from kernel to kernel. If AutoDSE’s final report shows a high number of timed-out designs, adjust the respective time limit and rerun the tool.
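
The resume step from the first tip can be scripted. The sketch below only assembles the `dse` command line quoted in that tip when the database file is present; it does not run anything. `resume_command` is a hypothetical helper, and the caller supplies all paths.

```python
from pathlib import Path


def resume_command(project_dir, work_dir, config_file, mode, db_file):
    """Return the argv list for resuming DSE, or None if no database exists."""
    if mode not in ("fast", "accurate"):
        raise ValueError(f"mode must be 'fast' or 'accurate', got {mode!r}")
    if not Path(db_file).is_file():
        return None  # nothing to resume from
    return ["dse", str(project_dir), str(work_dir), str(config_file),
            mode, str(db_file)]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once you have confirmed that the arguments match your setup.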