RISC-V CPU Pipeline Simulation
1. Introduction
RISC-V is an open-source architecture and instruction set standard originating from
Berkeley. This project requires you to implement a RISC-V CPU pipeline simulator based on
the standard five-stage pipeline. You will need to implement a subset of the instructions
from the RV32I instruction set specified in RISC-V Specification 2.2. Implementing a
complete CPU simulator can effectively exercise system programming capabilities and
deepen understanding of architecture-related knowledge.
2. Project introduction
2.1. Project requirements
The most important part of this project is to implement a RISC-V CPU pipeline simulator.
The specific requirements are as follows:
• A command-line argument parser module that parses the path to the RISC-V binary file
given on the command line. It also provides an option to enable or disable printing
a history log at the end of execution. Please make sure your simulator can be run as
Simulator xxx.riscv, where xxx.riscv is the path to the RISC-V binary.
• Load ELF files (already implemented in the template).
• A history module (a reference structure is provided in the template).
• Memory management (a reference structure is provided in the template; it is needed by
some instructions).
• Simulate the required instructions (see Section 2.7), including handling system calls (see
Section 2.6).
• Handle data hazards, control hazards and memory access hazards.
2.2. Possible Structure
NOTE: This is just a possible structure. You can design your own structure.
The overview diagram (Figure 1) of the simulator code architecture is shown below. The
entry point of the simulator is Main.cpp, which includes parsing parameters, loading ELF
files, initializing the simulator module, and finally calling the simulate() function to enter
the execution of the simulator. Unless an error occurs while the simulator is running, the
simulate() function will, in theory, not return.
The simulator itself is designed as one large class, the Simulator class. The data in the
Simulator class includes the PC, general registers, pipeline registers, execution history
recorders, memory modules and branch prediction modules (not required, and will not affect
your score).
Among them, because the memory module and branch prediction module are relatively
independent, they are implemented as two separate classes MemoryManager and
BranchPredictor.
The core function of the simulator is simulate(), which performs cycle-level simulation. In
each cycle, it executes the five functions fetch(), decode(), execute(), accessMemory() and
writeBack(), each of which takes as input the pipeline register from the previous cycle and
writes to the pipeline register for the next cycle. At the end of a cycle, the contents of the
new registers are copied into those used as inputs.
During execution, each function handles content related to data hazards, control hazards
and memory access hazards and records historical information at appropriate places.
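The double-buffered pipeline registers described above can be sketched roughly as follows. This is a minimal illustration, not the template's code: the names PipeReg, MiniPipeline and step() are hypothetical, and the stage bodies only pass the instruction slot along so that the register copy at the end of each cycle is easy to follow.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, heavily simplified pipeline register (illustration only).
struct PipeReg {
    bool bubble = true;   // true when the slot holds no instruction
    uint32_t pc = 0;      // address of the instruction in this slot
    uint32_t inst = 0;    // raw instruction word (unused in this sketch)
};

struct MiniPipeline {
    uint32_t pc = 0;
    uint64_t cycles = 0;

    // Registers read during the current cycle ...
    PipeReg fReg, dReg, eReg, mReg;
    // ... and registers written during the current cycle.
    PipeReg fRegNew, dRegNew, eRegNew, mRegNew;

    void fetch() {
        assert(pc % 4 == 0);               // RV32I instructions are 4 bytes long
        fRegNew = PipeReg{false, pc, 0};   // a real fetch would read memory here
        pc += 4;
    }
    void decode()       { dRegNew = fReg; }   // a real stage would decode fReg
    void execute()      { eRegNew = dReg; }   // a real stage would compute results
    void accessMemory() { mRegNew = eReg; }   // a real stage would load or store
    void writeBack()    { /* a real stage would commit mReg to the register file */ }

    // One simulated cycle: every stage reads the old registers and writes the
    // new ones, then the new registers become the inputs of the next cycle.
    void step() {
        fetch();
        decode();
        execute();
        accessMemory();
        writeBack();
        fReg = fRegNew;
        dReg = dRegNew;
        eReg = eRegNew;
        mReg = mRegNew;
        ++cycles;
    }
};
```

Because every stage reads only the old registers, the order in which the five functions are called within step() does not change the result in this sketch. Stalls and bubbles would be added by choosing not to overwrite, or by clearing, individual registers before the copy.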
Figure 1: Simulator Architecture
2.3. Memory Management
The function of MemoryManager is to provide a simple and easy-to-use memory access
interface for the simulator, which must support arbitrary memory size and memory address
access, and can detect illegal memory address access. What you need to do is to load the
different sections from the ELF file into the correct memory locations based on each section’s
virtual memory address and memory size (not file size). Then, when simulating the
execution of read/write instructions, you just need to parse the memory address and directly
operate with this memory address in the MemoryManager, without the need for any
conversions in between.
The sample implementation of MemoryManager uses a mechanism similar to the two-level
page table used in the x86 architecture (a single-level page table is also acceptable).
Specifically, it logically divides the 32-bit memory space (4GB) into pages of 4KB (2^12
bytes), using the first 10 bits of the memory address as the index into the level-one page
table, the next 10 bits as the index into the level-two page table, and the last 12 bits as the
offset within a single page.
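The 10/10/12 address split can be sketched as below, together with a small sparse memory that allocates pages on demand and rejects accesses to unmapped pages. The names PageIndex, splitAddress and SparseMemory are hypothetical and are not taken from the template; treat this as one possible layout under those assumptions.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <memory>

struct PageIndex {
    uint32_t level1;  // bits 31..22: index into the level-one table
    uint32_t level2;  // bits 21..12: index into the level-two table
    uint32_t offset;  // bits 11..0: byte offset inside the 4KB page
};

constexpr PageIndex splitAddress(uint32_t addr) {
    return PageIndex{(addr >> 22) & 0x3FF, (addr >> 12) & 0x3FF, addr & 0xFFF};
}

class SparseMemory {
    using Page = std::array<uint8_t, 4096>;
    using Level2 = std::array<std::unique_ptr<Page>, 1024>;

public:
    // Maps the zero-filled 4KB page containing addr if it is not mapped yet.
    void addPage(uint32_t addr) {
        const PageIndex i = splitAddress(addr);
        if (!level1_[i.level1]) level1_[i.level1] = std::make_unique<Level2>();
        auto& page = (*level1_[i.level1])[i.level2];
        if (!page) page = std::make_unique<Page>();
        assert(isMapped(addr));
    }

    bool isMapped(uint32_t addr) const { return find(addr) != nullptr; }

    // Both accessors return false on an illegal (unmapped) address.
    bool setByte(uint32_t addr, uint8_t value) {
        uint8_t* p = find(addr);
        if (p == nullptr) return false;
        *p = value;
        return true;
    }
    bool getByte(uint32_t addr, uint8_t& out) const {
        const uint8_t* p = find(addr);
        if (p == nullptr) return false;
        out = *p;
        return true;
    }

private:
    // Walks both table levels; returns nullptr when either level is missing.
    uint8_t* find(uint32_t addr) const {
        const PageIndex i = splitAddress(addr);
        const auto& l2 = level1_[i.level1];
        if (!l2) return nullptr;
        const auto& page = (*l2)[i.level2];
        if (!page) return nullptr;
        return &(*page)[i.offset];
    }

    std::array<std::unique_ptr<Level2>, 1024> level1_;
};
```

For example, address 0x1012c splits into level-one index 0, level-two index 0x10 and offset 0x12c. Each level-one entry covers 2^22 bytes (4MB), so only the tables and pages that are actually touched consume host memory.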
NOTE: This is a sample memory manager. You can create your own design. But you have to
make sure that your ELF file is loaded into the correct position. With the above
implementation, you don’t need to allocate 4GB of memory at once; you only need to
allocate as needed. Of course, you can also directly allocate the memory required for loading
the ELF file along with an additional stack area (Figure 2).
2.4. ELF Load and Initialization
Implement the ELF file loader and initialize the simulator according to Section 3. The ELF
file loader is responsible for loading the ELF file into the simulator’s memory, and the
initialization process is responsible for setting the initial state of the simulator, including
the initial value of the PC, the initial values of the general registers, and the initial value
of the stack pointer.
Figure 2: Memory Layout
Figure 2 shows the typical layout of a simple computer’s program memory with the text,
various data, and stack and heap sections. The text and data segments are placed in their
corresponding positions when you load the ELF file. After loading the ELF file, initialize the
stack by setting the stack pointer to the top of memory and adjusting the stack size as
needed, for example, to 4MB. Heap management is typically handled by software, so you
don’t need to worry about it.
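A minimal sketch of this initialization step is shown below. The stack top address 0x80000000 is an arbitrary value chosen for illustration (the handout does not fix one), and CpuState and initState are hypothetical names. The only facts relied on are that x2 is the stack pointer (sp) in the standard RISC-V calling convention and that the stack grows toward lower addresses.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

struct CpuState {
    uint32_t pc = 0;
    std::array<uint32_t, 32> reg{};  // x0..x31, all zero initially
};

constexpr uint32_t kStackTop = 0x80000000u;          // assumption: arbitrary choice
constexpr uint32_t kStackSize = 4u * 1024u * 1024u;  // 4MB, as suggested above
constexpr std::size_t kSp = 2;                       // x2 is sp

// Lowest address that belongs to the stack region.
constexpr uint32_t stackLimit() { return kStackTop - kStackSize; }

// Sets the PC to the ELF entry point, clears the registers, and points sp at
// the top of the stack region.
void initState(CpuState& cpu, uint32_t entry) {
    assert(entry % 4 == 0);  // RV32I instructions are 4-byte aligned
    cpu.pc = entry;
    cpu.reg.fill(0);
    cpu.reg[kSp] = kStackTop;
}
```

With this choice, the memory manager would also need to map the pages from stackLimit() up to kStackTop before execution starts, so that the first stack store does not count as an illegal access.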
2.5. Simulator Implementation
For the RISC-V pipeline simulator, you need to implement the five-stage pipeline, including:
• Fetch: all instructions in the RV32I instruction set have a fixed length of 4 bytes.
• Decode: translates instructions into RISC-V assembly format strings. In addition, it mimics
hardware implementations by abstracting common fields such as op1, op2, and dest from
instructions.
• Execute: executes the corresponding behavior for each instruction type. In addition, it
checks for data hazards, control hazards, and memory access hazards based on the current
instruction and the state of the decode stage, and takes action accordingly. At this point,
a jump or branch instruction learns whether the jump is taken, and bubbles are inserted
into the pipeline registers when the wrong path was fetched.
• Memory access: performs memory read and write operations, detects data hazards, and
performs forwarding. When detecting data hazards, it needs to consider both general data
hazards and the situation where the pipeline stalled in the previous cycle due to a memory
access hazard. In addition, the priority of forwarding sources must be taken into account.
• Write back: writes execution results back to the register file, and handles data hazards
as before.
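As a rough illustration of the Decode stage, the fixed 32-bit RV32I encoding lets the common fields and the immediates be extracted with shifts and masks. The bit positions below follow the base instruction formats in the RISC-V specification; the names Fields, decodeFields, signExtend and immI/immS/immB/immU/immJ are hypothetical helpers, not part of the template.

```cpp
#include <cassert>
#include <cstdint>

struct Fields {
    uint32_t opcode;  // bits 6..0
    uint32_t rd;      // bits 11..7
    uint32_t funct3;  // bits 14..12
    uint32_t rs1;     // bits 19..15
    uint32_t rs2;     // bits 24..20
    uint32_t funct7;  // bits 31..25
};

inline Fields decodeFields(uint32_t inst) {
    assert((inst & 0x3) == 0x3);  // every 32-bit RISC-V encoding ends in binary 11
    return Fields{inst & 0x7F,         (inst >> 7) & 0x1F,  (inst >> 12) & 0x7,
                  (inst >> 15) & 0x1F, (inst >> 20) & 0x1F, (inst >> 25) & 0x7F};
}

// Sign-extends the low `bits` bits of value (value must fit in `bits` bits).
constexpr int32_t signExtend(uint32_t value, int bits) {
    const uint32_t sign = 1u << (bits - 1);
    return static_cast<int32_t>((value ^ sign) - sign);
}

// I-type: imm[11:0] = inst[31:20]
constexpr int32_t immI(uint32_t inst) { return signExtend(inst >> 20, 12); }

// S-type: imm[11:5] = inst[31:25], imm[4:0] = inst[11:7]
constexpr int32_t immS(uint32_t inst) {
    return signExtend(((inst >> 25) << 5) | ((inst >> 7) & 0x1F), 12);
}

// B-type: imm[12] = inst[31], imm[11] = inst[7],
//         imm[10:5] = inst[30:25], imm[4:1] = inst[11:8]
constexpr int32_t immB(uint32_t inst) {
    return signExtend(((inst >> 31) << 12) | (((inst >> 7) & 0x1) << 11) |
                      (((inst >> 25) & 0x3F) << 5) | (((inst >> 8) & 0xF) << 1), 13);
}

// U-type: imm[31:12] = inst[31:12], low 12 bits are zero
constexpr int32_t immU(uint32_t inst) {
    return static_cast<int32_t>(inst & 0xFFFFF000u);
}

// J-type: imm[20] = inst[31], imm[19:12] = inst[19:12],
//         imm[11] = inst[20], imm[10:1] = inst[30:21]
constexpr int32_t immJ(uint32_t inst) {
    return signExtend(((inst >> 31) << 20) | (inst & 0xFF000u) |
                      (((inst >> 20) & 0x1) << 11) | (((inst >> 21) & 0x3FF) << 1), 21);
}
```

For example, the word 0x00003197 decodes to opcode 0x17 (auipc) with rd = 3 (gp) and a U-type immediate of 0x3000, and 0xFFF00093 is addi x1, x0, -1.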
For RISC-V pipeline hazards, I recommend the following links:
• RISC-V Pipeline Hazards from Berkeley
• RISC-V Pipeline Hazards from Washington
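The forwarding priority mentioned for the Memory access stage can be sketched as follows. This assumes the classic five-stage arrangement in which the value produced by the most recent instruction (in the EX/MEM register) takes precedence over the older one (in the MEM/WB register). It also assumes that "memory access hazard" refers to the load-use case, where a load is immediately followed by an instruction that reads the loaded register. Pending, forwardOperand and needsLoadUseStall are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical summary of a register write that is still in flight.
struct Pending {
    bool writesReg = false;
    uint32_t rd = 0;
    uint32_t value = 0;
};

// Selects the value of source register rs for the execute stage.
uint32_t forwardOperand(uint32_t rs, uint32_t regFileValue,
                        const Pending& exMem, const Pending& memWb) {
    assert(rs < 32);
    if (rs == 0) return 0;  // x0 is hard-wired to zero and is never forwarded
    if (exMem.writesReg && exMem.rd == rs) return exMem.value;  // newest result
    if (memWb.writesReg && memWb.rd == rs) return memWb.value;  // older result
    return regFileValue;    // no hazard: use the register file
}

// Load-use check: the loaded value is not available in time to be forwarded
// to the very next instruction, so one bubble is needed. For simplicity this
// sketch compares both source fields even for instructions that ignore rs2.
bool needsLoadUseStall(bool prevIsLoad, uint32_t prevRd,
                       uint32_t rs1, uint32_t rs2) {
    return prevIsLoad && prevRd != 0 && (prevRd == rs1 || prevRd == rs2);
}
```

If both in-flight instructions write x5, forwardOperand returns the EX/MEM value, because that instruction is later in program order and its result is the one the consumer must see.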
2.6. System Call
This project uses the following system calls. A system call is triggered by the ecall
instruction. The a7 register holds the system call number, the a0 register holds the system
call parameter, and the return value is saved in the a0 register.
System Call Name   System Call Number   Parameter                       Return Value
Print string       0                    The initial address of string   None
Print char         1                    The value of char               None
Print number       2                    The value of number             None
Exit program       3                    None                            None
Read char          4                    None                            The value of char
Read number        5                    None                            The value of number
Detailed information about the system calls can be found in test-release/lib.c.
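One way to dispatch on the table above is sketched below. The function and type names are hypothetical. The sketch also assumes that "number" means a signed 32-bit integer and that strings are NUL-terminated; both assumptions should be checked against test-release/lib.c. Register indices follow the standard RISC-V naming, where a0 is x10 and a7 is x17.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <istream>
#include <ostream>
#include <sstream>  // lets callers substitute string streams for testing

enum class EcallResult { Continue, Exit, Unknown };

// reg is the general register file; readByte is a caller-supplied accessor
// for simulated memory (hypothetical interface for this sketch).
EcallResult handleEcall(std::array<uint32_t, 32>& reg,
                        const std::function<uint8_t(uint32_t)>& readByte,
                        std::istream& in, std::ostream& out) {
    assert(readByte);  // a memory accessor must be provided
    constexpr std::size_t kA0 = 10;  // a0 = x10: parameter and return value
    constexpr std::size_t kA7 = 17;  // a7 = x17: system call number

    switch (reg[kA7]) {
    case 0:  // Print string: a0 holds the address of the first character
        for (uint32_t addr = reg[kA0];; ++addr) {
            const uint8_t c = readByte(addr);
            if (c == 0) break;  // assumption: NUL-terminated string
            out << static_cast<char>(c);
        }
        return EcallResult::Continue;
    case 1:  // Print char
        out << static_cast<char>(reg[kA0]);
        return EcallResult::Continue;
    case 2:  // Print number (assumption: signed 32-bit)
        out << static_cast<int32_t>(reg[kA0]);
        return EcallResult::Continue;
    case 3:  // Exit program
        return EcallResult::Exit;
    case 4: {  // Read char: result goes to a0
        char c = 0;
        in.get(c);
        reg[kA0] = static_cast<uint8_t>(c);
        return EcallResult::Continue;
    }
    case 5: {  // Read number: result goes to a0
        int32_t n = 0;
        in >> n;
        reg[kA0] = static_cast<uint32_t>(n);
        return EcallResult::Continue;
    }
    default:
        return EcallResult::Unknown;
    }
}
```

Passing the streams as parameters lets the simulator use std::cin and std::cout in normal runs while a unit test supplies std::istringstream and std::ostringstream.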
2.7. Required Instructions
The following list gives the instructions that you need to implement in the simulator. Refer
to the RISC-V Specification 2.2 for detailed information about these instructions.
"lui", "auipc", "jal", "jalr", "beq", "bne", "blt",
"bge", "bltu", "bgeu", "lb", "lh", "lw", "lbu",
"lhu", "sb", "sh", "sw", "addi", "slti", "sltiu",
"xori", "ori", "andi", "slli", "srli", "srai", "add",
"sub", "sll", "slt", "sltu", "xor", "srl", "sra",
"or", "and", "ecall"
2.8. History
The simulator needs to record the number of cycles and the number of instructions executed
during the simulation process, and output the number of cycles and the number of
instructions executed when the input parameters indicate that these need to be printed.
NOTE: This part is not tested by the test scripts, but you need to implement it and provide
the usage of it in your ReadMe.md. Please make sure your ReadMe is clear and detailed.
2.9. Advanced Features
You can implement the following advanced features to improve the simulator:
• Implement a branch prediction module to improve the performance of the simulator.
• Implement a cache module to improve the performance of the simulator. This will be
related to the next project.
• Implement an out-of-order execution module to improve the performance of the simulator.
• Some other advanced features that you are interested in.
NOTE: These advanced features are not necessary for this project, and they will not affect
your score. If you have interest, you can implement them.
2.10. Test Cases
We provide some test cases for you to verify your simulator. You can find them in the
test-release directory. We also have other programs to further verify your simulator; all of
these test cases will count toward your final score.
How to run the test cases:
• Download the test-release.zip file from the course platform.
• Unzip the test-release.zip file in the root directory of your project. You will get the
test-release directory and run-test-release.sh.
• Run the run-test-release.sh script in the root directory of your project, for example:
bash run-test-release.sh
• Please make sure your executable file is named Simulator and is located in the build
directory.
The example output:
> bash run-test-release.sh
Comparing ./test-release/add.out and ./test-release/add.ref
Succeed! Files ./test-release/add.out ./test-release/add.ref are the same
Comparing ./test-release/mul-div.out and ./test-release/mul-div.ref
Succeed! Files ./test-release/mul-div.out ./test-release/mul-div.ref are the same
Comparing ./test-release/n!.out and ./test-release/n!.ref
Succeed! Files ./test-release/n!.out ./test-release/n!.ref are the same
Comparing ./test-release/qsort.out and ./test-release/qsort.ref
Succeed! Files ./test-release/qsort.out ./test-release/qsort.ref are the same
Comparing ./test-release/simple-function.out and ./test-release/simple-function.ref
Succeed! Files ./test-release/simple-function.out ./test-release/simple-function.ref are the same
5 / 5 tests pass!
3. ELF File Loader
3.1. ELF File Format
There are three main types of object files in the ELF (Executable and Linking Format)
format:
• Relocatable file: holds code and data suitable for linking with other object files.
• Executable file: holds a program suitable for execution.
• Shared object file: holds code and data suitable for linking in two contexts.
Object files participate in program linking (building a program) and program execution
(running a program). For convenience and efficiency, the object file format provides parallel
views of a file’s contents, reflecting the differing needs of these activities. Figure 3 shows the
basic structure of an ELF object file.
Figure 3: Object File Format
Sections in the object file format:
• Sections are used during the linking and compilation process.
• They represent different types of data within the ELF file, such as code (.text), initialized
data (.data), uninitialized data (.bss), the symbol table (.symtab), the string table (.strtab),
relocation information (.rel.text, .rel.data), and debugging information.
• Sections contain information that is useful for linking and for debugging, but they are not
necessarily loaded into memory when the program is executed.
• The ELF file contains a section header table that lists all sections and their attributes.
Segment in object file format:
• Segments are used during the execution process.
• They are typically a collection of sections that need to be loaded into memory as a unit.
In summary, sections are for organization and use during compilation and linking, while
segments are for mapping the ELF file into memory during execution. An object file
segment contains one or more sections (see “Segment Contents” in the ELF specification).
3.2. Program Loading
As the system creates or augments a process image, it logically copies a file’s segment to a
virtual memory segment. Virtual addresses and file offsets for SYSTEM V architecture
segments are congruent modulo 4KB (0x1000) or larger powers of 2, which means when you
divide the virtual address and the file offset by 4KB, the remainders are the same. Because
4KB is the maximum page size, the files will be suitable for mapping regardless of physical
page size. Figure 4 shows the basic structure of an ELF executable file.
Figure 4: Executable File
Although the example’s file offsets and virtual addresses are congruent modulo 4KB for both
text and data, up to four file pages hold impure text or data (depending on page size and file
system block size).
• The first text page contains the ELF header, the program header table, and other info.
• The last text page holds a copy of the beginning of data.
• The first data page has a copy of the end of text.
• The last data page may contain file information not relevant to the running process.
Figure 5: Process Image Segments
Logically, the system enforces the memory permissions as if each segment were complete
and separate; segments’ addresses are adjusted to ensure each logical page in the address
space has a single set of permissions. In the example (Figure 4) above, the region of the file
holding the end of text and the beginning of data will be mapped twice; at one virtual
address for text and at a different virtual address for data.
The end of the data segment requires special handling for uninitialized data, which the
system defines to begin with zero values. (This region is often called the .bss segment, for
Block Started by Symbol: the portion of a program’s memory reserved for variables that have
not been given an explicit initial value by the programmer.) Thus if a file’s last data page
includes information not in the logical memory page, the extraneous data must be set to
zero, not the unknown contents of the executable file. “Impurities” in the other three pages
are not logically part of the process image; whether the system expunges them is
unspecified. The memory image (Figure 5) for this program follows, assuming 4KB (0x1000)
pages.
In this project, you do not need to care about “Impurities” in the pages. Just deal with the
uninitialized data.
3.3. Program Loading Example
Here is an example of loading an ELF file into memory. The following is the output of the
simulator when loading the add.riscv file. You need to allocate memory for each segment
according to MSize (memory size). For the bytes between FSize (file size) and MSize, you
need to fill the memory with 0.
> ./Simulator ../test-inclass/add.riscv -s -v
==========ELF Information==========
Type: ELF32
Encoding: Little Endian
ISA: RISC-V(0xf3)
Number of Sections: 14
ID Name Address Size
[0] 0x0 0
[1] .text 0x100e8 8636
[2] .eh_frame 0x13000 4
[3] .init_array 0x13008 16
[4] .fini_array 0x13018 8
[5] .data 0x13020 2472
[6] .sdata 0x139c8 32
[7] .sbss 0x139e8 56
[8] .bss 0x13a20 1416
[9] .comment 0x0 45
[10] .riscv.attributes 0x0 28
[11] .symtab 0x0 4632
[12] .strtab 0x0 1478
[13] .shstrtab 0x0 118
Number of Segments: 3
ID Flags Address FSize MSize
[0] 0x4 0x0 28 0
[1] 0x5 0x10000 8868 8868
[2] 0x6 0x13000 2536 4008
===================================
Memory Pages:
0x0-0x400000:
0x10000-0x11000
0x11000-0x12000
0x12000-0x13000
0x13000-0x14000
Fetched instruction 0x00003197 at address 0x1012c
NOTE: We have provided a sample ELF file loader. You may use it as is or modify it to suit
your needs. Please ensure you understand it before using.
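The MSize/FSize rule from this section can be sketched as below: the in-memory image of a segment has MSize bytes, the first FSize bytes come from the file, and the remainder is zero. The helper names buildSegmentImage and pagesFor are hypothetical and independent of the provided loader.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Builds the memory image of one loadable segment: memSize bytes in total,
// the first fileSize bytes copied from the file, the rest zero-filled.
std::vector<uint8_t> buildSegmentImage(const uint8_t* fileData,
                                       std::size_t fileSize,
                                       std::size_t memSize) {
    assert(fileSize <= memSize);  // the file part never exceeds the memory part
    std::vector<uint8_t> image(memSize, 0);
    std::copy(fileData, fileData + fileSize, image.begin());
    return image;
}

// Range of 4KB pages [first, end) that must be mapped for a segment.
struct PageRange {
    uint64_t first;  // base address of the first page
    uint64_t end;    // base address one past the last page
};

constexpr uint64_t kPageSize = 0x1000;

PageRange pagesFor(uint32_t vaddr, uint32_t memSize) {
    const uint64_t start = vaddr;
    const uint64_t stop = start + memSize;  // 64-bit to avoid wrap-around
    return PageRange{start & ~(kPageSize - 1),
                     (stop + kPageSize - 1) & ~(kPageSize - 1)};
}
```

Applied to the output above, segment [1] (address 0x10000, MSize 8868) covers the pages 0x10000 to 0x13000, and segment [2] (address 0x13000, MSize 4008) covers 0x13000 to 0x14000, which agrees with the four pages printed under "Memory Pages". For segment [2], bytes 2536 to 4007 of the image are zero.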
3.4. Some other information
The entry point can be found in the ELF header (the e_entry field). It gives the virtual
address to which the system first transfers control, thus starting the process. If the file has
no associated entry point, it holds zero.
You have the option to create your own ELF file loader or utilize existing libraries like elfio.
Your choice won’t affect your score, but I recommend writing it yourself for a better
understanding of the ELF file format and program loading process.
For more information about the ELF file format, you can refer to the following links: cmu-elf
NOTE: All ELF files used for testing are little-endian. Be sure to handle the file’s
endianness accordingly.
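For those writing their own loader, reading e_entry from a little-endian ELF32 header might look like the sketch below. The offsets used (e_machine at byte 18, e_entry at byte 24, a 52-byte header) come from the ELF32 header layout, and 0xF3 is the RISC-V machine code that also appears in the loader output in Section 3.3. The names readLe16, readLe32 and entryPoint are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>

// Assemble little-endian values byte by byte, so the result does not depend
// on the endianness of the host machine.
inline uint16_t readLe16(const uint8_t* p) {
    assert(p != nullptr);
    return static_cast<uint16_t>(p[0] | (p[1] << 8));
}

inline uint32_t readLe32(const uint8_t* p) {
    assert(p != nullptr);
    return static_cast<uint32_t>(p[0]) | (static_cast<uint32_t>(p[1]) << 8) |
           (static_cast<uint32_t>(p[2]) << 16) | (static_cast<uint32_t>(p[3]) << 24);
}

// Returns e_entry if data starts with a little-endian ELF32 RISC-V header,
// otherwise std::nullopt.
std::optional<uint32_t> entryPoint(const uint8_t* data, std::size_t size) {
    constexpr std::size_t kEhdrSize = 52;  // size of the ELF32 header
    if (data == nullptr || size < kEhdrSize) return std::nullopt;
    if (data[0] != 0x7F || data[1] != 'E' || data[2] != 'L' || data[3] != 'F')
        return std::nullopt;                              // bad magic number
    if (data[4] != 1) return std::nullopt;                // EI_CLASS: 1 = 32-bit
    if (data[5] != 1) return std::nullopt;                // EI_DATA: 1 = little-endian
    if (readLe16(data + 18) != 0xF3) return std::nullopt; // e_machine: RISC-V
    return readLe32(data + 24);                           // e_entry
}
```

The returned value would then become the initial PC. A return value of zero means the file has no associated entry point, as noted above.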
4. Submission
For this project, you must use C, C++ or Rust to implement the simulator. If you use Python,
you will receive a score of 0. You need to submit the following files:
• src/*, include all source code files
• include/*, include all header files if you use C/C++
• CMakeLists.txt, the cmake file for your project if you use C/C++
• Cargo.toml, the cargo file for your project if you use Rust
• project-report.pdf, a detailed introduction to your project. It needs to include the
following:
‣ the usage of your simulator
‣ the implementation details of your simulator (how to handle hazards, how to implement
the pipeline, etc.)
‣ the history information of your simulator
‣ the environment, how to compile and run your project
‣ your understanding of memory management, ELF file loader.
• test-release/*, include all test cases provided by us, do not change the file name
• build.sh, a script to build your project which should be able to compile your project just
by running bash build.sh
Please compress all files into a single zip file and submit it to the course platform. The file
name should be your student ID, like xxxxxxxxx.zip. Please make sure your project can be
compiled and run on the Linux platform. If your project cannot be compiled and run, you
will receive a 0 score. Please ensure that your simulator can be compiled by cmake with gcc/
g++ or by cargo. Other build tools are not acceptable.
If you have any questions or have some suggestions about the submission process, please
feel free to ask me (TA ZHANG Yanglin) in the course group or send an email
(lucky@lucky9.cyou / 119010446@link.cuhk.edu.cn) to me.
5. Grading
For this assignment, you are to submit a RISC-V CPU pipeline simulator. If you have
difficulty completing this, you may submit a sequential version of the simulator; however,
you will receive a maximum of 30% of the score.
The overall score will be calculated as follows:
• Not provided test cases: 45%
• Provided test cases: 25%
• History (like Section 2.8): 5%
• Report: 20%
• Code style and comments: 5%
• Advanced features (bonus): 5%
Some matters need attention:
• The code should be well-structured and easy to understand.
• The ReadMe.md should be clear and easy to understand. Please provide detailed
introduction about your simulator, including the usage of your simulator, the
implementation details of your simulator, the history information of your simulator, how
to compile and run your project, and other information that you consider important.
• Do not plagiarize. If we discover that you have plagiarized, you will not only receive a
score of zero for this project, but you will also fail this course directly. Additionally, we
will report your actions to the Registry office. If plagiarism-detection software flags your
submission and the TA confirms that you have indeed plagiarized, we will notify you via
email.
• Please ensure that your project can pass the aforementioned test scripts (Section 2.10); we
will provide you with some example tests. If your project does not yield the expected
output, it will initially receive a score of zero. Following that, you may contact us with
your test scripts, but the score for the part of your project pertaining to code style and
comments will directly be 0.
6. Development Environment
6.1. RISC-V Environment Installation and Configuration
NOTE: This section is for project test environment setup. You do not need to do this in your
project.
For convenience, this experiment is based entirely on the RV32I instruction set, with
reference to the RISC-V Specification 2.2 standard.
The following steps were taken to configure the environment:
• Downloaded riscv-tools from GitHub, then configured, compiled and installed
riscv-gnu-toolchain for the Linux platform.
• To use the official simulator as a reference, downloaded, compiled and installed riscv-qemu
from GitHub.
Note that when compiling riscv-gnu-toolchain, you must specify that both the toolchain and
the C standard library use the RV32I instruction set. Otherwise, the compiler will use
extended instruction sets such as RV32C and RV32D. Even if the compiler is set to emit only
RV32I instructions at compile time, it would still link in standard library functions that use
extended instructions. Therefore, to obtain an ELF program that uses only standard RV32I
instructions, rebuild riscv-gnu-toolchain with the following options:
mkdir build; cd build
../configure --with-arch=rv32i --prefix=/path/to/riscv32i
make -j$(nproc)
During compilation, use -march=rv32i to let the compiler generate ELF programs for the
RV32I standard instruction set:
riscv32-unknown-elf-gcc -march=rv32i add.c lib.c -o add.riscv
Disassemble the ELF program with the following command:
riscv32-unknown-elf-objdump -D add.riscv > add.s
7. Collection of potentially useful links
• RISC-V Pipeline Hazards from Berkeley
• RISC-V Pipeline Hazards from Washington
• elfio
• cmu-elf
• cmake-tutorial
• introduction to modern cmake
8. About template
We have provided a template code using C++. You can refer to this code for your work, or
modify it as per your requirements. We are not responsible for any errors that arise from
your use of the template. This template is provided for reference only and its accuracy is not
guaranteed (except for loading ELF). For those who choose to complete this project using
Rust, I believe you have sufficient skills to do so without a template. However, you may
still refer to the C++ template if necessary.