Add multicore support for hardware configuration
commit 97973c31e0
parent c2d3c37eb6
2 changed files with 47 additions and 33 deletions
@ -14,7 +14,7 @@ Thus, the algorithm is used in spacecraft to decrease the amount of image data t
Since resources in space are limited, the design needs to be energy efficient and use a small hardware area.
This shifts the focus of the codesign towards energy efficiency over throughput or execution time.
However, fast execution times are still highly relevant, and a good balance between the two needs to be explored.

Notably, current RISC-V space processors have no vector processing units, which makes vector processing an interesting aspect to explore.

## Method

### Development and evaluation

@ -24,6 +24,7 @@ For parallelisation, the (OpenMP library)[https://www.openmp.org/] will be used.
To test and evaluate the software implementation, it will run in the gem5 simulator. The hardware configuration is also done in configuration files for gem5.
The mock data for the images will be generated in C with arbitrary values. This does not matter, since the values do not affect the run time.
When measuring performance, the sequential time spent generating the mock data and freeing the memory will be deducted, so that the result reflects only the algorithm itself.
For the parts where the problem size increases, performance will be measured in cycles per DCT block.
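
The metric described above can be sketched as follows. This is a minimal illustration, assuming the overhead is obtained from a separate run that only generates and frees the mock data; the function name and parameters are hypothetical and not taken from the project.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Cycles per DCT block with the sequential setup/teardown cost removed.
 *   total_cycles:    cycles for the whole run (hypothetical input)
 *   overhead_cycles: cycles spent generating and freeing the mock data
 *   blocks:          number of DCT blocks processed
 */
static double cycles_per_block(uint64_t total_cycles,
                               uint64_t overhead_cycles,
                               uint64_t blocks)
{
    assert(blocks > 0);                      /* avoid division by zero */
    assert(total_cycles >= overhead_cycles); /* overhead is part of the total */
    return (double)(total_cycles - overhead_cycles) / (double)blocks;
}
```

Normalising by the number of blocks keeps runs with different problem sizes comparable, since a larger image simply contributes more blocks.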

### Building

@ -38,7 +39,7 @@ The following flags will be used based on what functionality is needed:
- `-lm` for the math library
- `-fopenmp` for the OpenMP library
- `-O[level]` for different optimisation levels
- `-march=rv64imafcv` for the RISC-V ISA
- `-march=rv64imadcv` for the RISC-V ISA
- `-mabi=lp64d` for the RISC-V ABI

### Simulating
@ -51,15 +52,27 @@ The python script for this project is tailored for this project specifically, th
- `--l2` for the L2 cache size
- `--vlen` for the vector length
- `--elen` for the element length
- `--cores` for the number of cores

To run the simulation and output the result, the following command is used:

```bash
../gem5/build/RISCV/gem5.opt -d stats/ ./riscv_hw.py --l1i 16kB --l1d 64kB --l2 256kB --vlen 256 --elen 64
../gem5/build/RISCV/gem5.opt -d stats/ ./riscv_hw.py --l1i 16kB --l1d 64kB --l2 256kB --vlen 256 --elen 64 --cores 1
```

## Implementation

### Initial hardware configuration
For the initial, naive software implementation, the hardware is configured as follows:
- L1 instruction cache size: 16kB
- L1 data cache size: 64kB
- L2 cache size: 256kB
- Vector length (VLEN): 256 bits
- Element length (ELEN): 64 bits
- Number of threads: 1
- L1 cache associativity: 2
- L2 cache associativity: 8
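
To make the configuration above more concrete, the sketch below derives two quantities from it: how many elements fit in one vector register, and how many sets each cache has. It is an illustration only. The helper names are hypothetical, a register grouping of LMUL = 1 is assumed, and the 64-byte cache line and the reading of `kB` as 1024 bytes are assumptions that are not stated in this document.

```c
#include <assert.h>
#include <stdint.h>

/* Elements per vector register at LMUL = 1: VLEN / SEW (both in bits). */
static uint32_t elems_per_vreg(uint32_t vlen_bits, uint32_t sew_bits)
{
    assert(sew_bits > 0 && vlen_bits % sew_bits == 0);
    return vlen_bits / sew_bits;
}

/* Number of sets in a set-associative cache: size / (ways * line size). */
static uint32_t cache_sets(uint32_t size_bytes, uint32_t ways,
                           uint32_t line_bytes)
{
    assert(ways > 0 && line_bytes > 0);
    return size_bytes / (ways * line_bytes);
}
```

With a vector length of 256 bits, `elems_per_vreg(256, 64)` gives 4 and `elems_per_vreg(256, 32)` gives 8. Under the assumed 64-byte line, `cache_sets(64 * 1024, 2, 64)` gives 512 sets for the L1 data cache.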

### Constants and definitions
Throughout the code, several constants and type definitions are used to make it easy to try different configurations. They are defined as follows:
- `DCT_SIZE` is the size of the DCT block
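
One way such definitions could look is sketched below. The names `DCT_SIZE`, `TOTAL_DCT_BLOCKS` and `element_t` appear in the project, but the values and the element type chosen here are assumptions for illustration, and the helper functions are hypothetical.

```c
#include <stddef.h>

/* Assumed values; the project's actual settings are not shown here. */
#ifndef DCT_SIZE
#define DCT_SIZE 8            /* assumed side length of one square DCT block */
#endif

#ifndef TOTAL_DCT_BLOCKS
#define TOTAL_DCT_BLOCKS 64   /* assumed number of mock blocks */
#endif

typedef float element_t;      /* assumed element type of a block */

/* Bytes occupied by the elements of one DCT block. */
static size_t block_bytes(void)
{
    return (size_t)DCT_SIZE * DCT_SIZE * sizeof(element_t);
}

/* Bytes occupied by the elements of all mock blocks together. */
static size_t total_bytes(void)
{
    return (size_t)TOTAL_DCT_BLOCKS * block_bytes();
}
```

The `#ifndef` guards let a value be overridden at compile time, for example with `-DDCT_SIZE=16`, without editing the source.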
@ -75,9 +88,6 @@ This will be done by allocating DCT-blocks heap memory and filling them with dat
It is important to generate all the data rather than reuse the same matrices, so that cache hits and misses are realistic.
The memory allocation is done in the following way:

### Initial hardware configuration

```c
element_t ***mock_matrices = (element_t ***) malloc(TOTAL_DCT_BLOCKS * sizeof(element_t**));
for (int i = 0; i < TOTAL_DCT_BLOCKS; i++) {
```