RISC-V Assembler Reference

This document gives an overview of RISC-V assembly language. First, an introduction to assembler and linker concepts, then sections describing assembler directives, pseudo- instructions, relocation functions, and assembler concepts such as labels, relative and absolute addressing, immediate values, constants and finally control and status registers.

The homepage contains a listing of instruction in the I (Base Integer Instruction Set) and M (Multiply and Divide) extension. For detailed information on the RISC-V instruction set refer to the RISC-V ISA Specification.

Concepts

This section briefly covers some high level concepts that are required to understand the process of assembling and linking executable code from source files.

Assembly file

An assembly file contains assembly language directives, macros and instructions. It can be emitted by a compiler or it can be handwritten. An assembly file is the input file to the assembler. The standard extensions for assembly files are .s and .S, with the later indicating that the assembly file should be pre-processed using the C preprocessor.

Relocatable Object file

A relocatable object file contains compiled object code and data emitted by the assembler. An object file cannot be run, rather it is used as input to the linker. The standard extension for object files is .o. The most common cross-platform file format for RISC-V executables is the ELF (Electronic Linker Format) object file format. The objdump utility can be used to disassemble an object file, objcopy can be used to copy and extract sections from ELF files and the nm utility can list symbols in an object file.

ELF Header

An ELF file has an ELF header that contains magic to indicate the file is ELF formatted, the architecture of the binary, the endianness of the binary (little-endian for RISC-V), the ELF file type (Relocatable Object File, Executable File, Shared Library), the number of program headers and their offset in the file, the number of section headers and their offset in the file, fields indicating the ELF version and ABI (Application Binary Interface) version of the file and finally flags indicating various ABI options such as RVC compression and which floating- point ABI that the executable code in the binary conforms to.

Program Header

Program Headers provide size and offsets of loadable segments within an executable file or shared object along with protection attributes used by the operating system (read, write and exec). Program headers are not present in relocatable object files and are primarily for use by the operating system to and dynamic linker to map code and data into memory.

Section Header

Section Headers provice size, offset, type, alignment and flags of the sections contained within the ELF file. Section headers are not required to execute a static binary but are necessary for dynamic linking as well as program linking. Various section types refer to the location of the symbol table, relocations and dynamic symbols in the ELF binary file.

Sections

An object file is made up of multiple sections, with each section corresponding to distinct types of executable code or data. There are a variety of different section types. This list shows the four most common sections:

Program linking

Program linking is the process of reading multiple relocatable object files, merging the sections from each of the source files, calculating the new addresses for symbols and applying relocation fixups to text or data that is pointed to in relocation entries.

Linker Script

A linker script is a text source file that is optionally input to the linker and it contains rules for the linker to use when calculating the load address and alignment of the various sections when creating an executable output file. The standard extension for linker scripts is .ld.

Assembler Directives

The assembler implements a number of directives that control the assembly of instructions into an object file. These directives give the ability to include arbitrary data in the object file, control exporting of symbols, selection of sections, alignment of data, assembly options for compression, position dependent and position independent code.

The following are assembler directives for emitting data:

Directive Arguments Description
.2byte   16-bit comma separated words (unaligned)
.4byte   32-bit comma separated words (unaligned)
.8byte   64-bit comma separated words (unaligned)
.half   16-bit comma separated words (naturally aligned)
.word   32-bit comma separated words (naturally aligned)
.dword   64-bit comma separated words (naturally aligned)
.byte   8-bit comma separated words
.dtpreldword   64-bit thread local word
.dtprelword   32-bit thread local word
.sleb128 expression signed little endian base 128, DWARF
.uleb128 expression unsigned little endian base 128, DWARF
.asciz “string” emit string (alias for .string)
.string “string” emit string
.incbin “filename” emit the included file as a binary sequence of octets
.zero integer zero bytes

The following are assembler directives for control of alignment:

Directive Arguments Description
.align integer align to power of 2 (alias for .p2align)
.balign b,[pad_val=0] byte align
.p2align p2,[pad_val=0],max align to power of 2

The following are assembler directives for definition and exporing of symbols:

Directive Arguments Description
.globl symbol_name emit symbol_name to symbol table (scope GLOBAL)
.local symbol_name emit symbol_name to symbol table (scope LOCAL)
.equ name, value constant definition

The following directives are for selection of sections:

Directive Arguments Description
.text   emit .text section (if not present) and make current
.data   emit .data section (if not present) and make current
.rodata   emit .rodata section (if not present) and make current
.bss   emit .bss section (if not present) and make current
.comm symbol_name,size,align emit common object to .bss section
.common symbol_name,size,align emit common object to .bss section
.section [{.text,.data,.rodata,.bss}] emit section (if not present, default .text) and make current

The following directives includes options, macros and other miscellaneous functions:

Directive Arguments Description
.option {rvc,norvc,pic,nopic,push,pop} RISC-V options
.macro name arg1 [, argn] begin macro definition \argname to substitute
.endm   end macro definition
.file “filename” emit filename FILE LOCAL symbol table
.ident “string” accepted for source compatibility
.size symbol, symbol accepted for source compatibility
.type symbol, @function accepted for source compatibility

Assembler Pseudo-instructions

The assembler implements a number of convenience psuedo-instructions that are formed from instructions in the base ISA, but have implicit arguments or in some case reversed arguments, that result in distinct semantics.

The following table lists RISC-V assembler pseudo instructions:

Pesudo-instruction Expansion Description
nop addi zero,zero,0 No operation
li rd, expression (several expansions) Load immediate
la rd, symbol (several expansions) Load address
mv rd, rs1 addi rd, rs, 0 Copy register
not rd, rs1 xori rd, rs, -1 One’s complement
neg rd, rs1 sub rd, x0, rs Two’s complement
negw rd, rs1 subw rd, x0, rs Two’s complement Word
sext.w rd, rs1 addiw rd, rs, 0 Sign extend Word
seqz rd, rs1 sltiu rd, rs, 1 Set if = zero
snez rd, rs1 sltu rd, x0, rs Set if ≠ zero
sltz rd, rs1 slt rd, rs, x0 Set if < zero
sgtz rd, rs1 slt rd, x0, rs Set if > zero
fmv.s frd, frs1 fsgnj.s frd, frs, frs Single-precision move
fabs.s frd, frs1 fsgnjx.s frd, frs, frs Single-precision absolute value
fneg.s frd, frs1 fsgnjn.s frd, frs, frs Single-precision negate
fmv.d frd, frs1 fsgnj.d frd, frs, frs Double-precision move
fabs.d frd, frs1 fsgnjx.d frd, frs, frs Double-precision absolute value
fneg.d frd, frs1 fsgnjn.d frd, frs, frs Double-precision negate
beqz rs1, offset beq rs, x0, offset Branch if = zero
bnez rs1, offset bne rs, x0, offset Branch if ≠ zero
blez rs1, offset bge x0, rs, offset Branch if ≤ zero
bgez rs1, offset bge rs, x0, offset Branch if ≥ zero
bltz rs1, offset blt rs, x0, offset Branch if < zero
bgtz rs1, offset blt x0, rs, offset Branch if > zero
bgt rs, rt, offset blt rt, rs, offset Branch if >
ble rs, rt, offset bge rt, rs, offset Branch if ≤
bgtu rs, rt, offset bltu rt, rs, offset Branch if >, unsigned
bleu rs, rt, offset bltu rt, rs, offset Branch if ≤, unsigned
j offset jal x0, offset Jump
jr offset jal x1, offset Jump register
ret jalr x0, x1, 0 Return from subroutine

Relocation Functions

The relocation function directives create synthesize operand values that are resolved at program link time and are used as immediate parameters to specific instructions. The sections on absolute and relative addressing give examples of using the relocation functions.

The following table lists assembler functions used to generate relocations:

Assembler Notation Description Instructions
%hi(symbol) Absolute (HI20) lui
%lo(symbol) Absolute (LO12) loads, stores, adds
%pcrel_hi(symbol) PC-relative (HI20) auipc
%pcrel_lo(label) PC-relative (LO12) loads, stores, adds
%tprel_hi(symbol) TLS LE (Local Exec) auipc
%tprel_lo(label) TLS LE (Local Exec) loads, stores, adds
%tprel_add(offset) TLS LE (Local Exec) add

Labels

Text labels are used as branch, unconditional jump targets and symbol offsets. Text labels are added to the symbol table of the compiled module.

loop:
        j loop

Numeric labels are used for local references. References to local labels are suffixed with ‘f’ for a forward reference or ‘b’ for a backwards reference.

1:
        j 1b

Absolute Addressing

Absolute addresses are used in position dependent code. An absolute address is formed using two instructions, the U-Type lui (Load Upper Immediate) instruction to load bits[31:20] and an I-Type or S-Type instruction such as addi (add immediate), lw (load word) or sw (store word) that fills in the low 12 bits relative to the upper immediate.

The following example shows how to load an absolute address:

.section .text
.globl _start
_start:
	    lui  a1,      %hi(msg)       # load msg(hi)
	    addi a1, a1,  %lo(msg)       # load msg(lo)
	    jalr ra, puts
2:	    j    2b

.section .rodata
msg:
	    .string "Hello World\n"

which generates the following assembler output and relocations as seen by objdump:

0000000000000000 <_start>:
   0:	000005b7          	lui	a1,0x0
			0: R_RISCV_HI20	msg
   4:	00858593          	addi	a1,a1,8 # 8 <.L21>
			4: R_RISCV_LO12_I	msg

Relative Addressing

Relative addresses are used in position independent code. A PC-relative address is formed using two instructions, the U-Type auipc (Add Upper Immediate Program Counter) instruction to load bits[31:20] relative to the program counter of the auipc instruction followed by an I-Type or S-Type instruction such as addi (add immediate), lw (load word) or sw (store word).

The following example shows how to load a PC-relative address:

.section .text
.globl _start
_start:
1:	    auipc a1,     %pcrel_hi(msg) # load msg(hi)
	    addi  a1, a1, %pcrel_lo(1b)  # load msg(lo)
	    jalr  ra, puts
2:	    j     2b

.section .rodata
msg:
	    .string "Hello World\n"

which generates the following assembler output and relocations as seen by objdump:

0000000000000000 <_start>:
   0:	00000597          	auipc	a1,0x0
			0: R_RISCV_PCREL_HI20	msg
   4:	00858593          	addi	a1,a1,8 # 8 <.L21>
			4: R_RISCV_PCREL_LO12_I	.L11

Load Immediate

The li (load immediate) instruction is an assembler pseudo instruction that is used to synthesize constants. The li pseudo instruction will emit a sequence starting with lui followed by addi and slli (shift left logical immediate) to construct constants by shifting and adding.

The following example shows the li psuedo instruction being used to load an immediate value:

.section .text
.globl _start
_start:

.equ CONSTANT, 0xcafebabe

        li a0, CONSTANT

which generates the following assembler output as seen by objdump:

0000000000000000 <_start>:
   0:	00032537          	lui     a0,0x32
   4:	bfb50513          	addi    a0,a0,-1029
   8:	00e51513          	slli    a0,a0,0xe
   c:	abe50513          	addi    a0,a0,-1346

Load Address

The la (load address) instruction is an assembler pseudo- instruction used to load the address of a symbol or label. The instruction can emit absolute or relative addresses depending on the -fpic or -fno-pic assembler command line options or an .options pic or .options nopic assembler directive. The pseduo-instruction emits a relocation so that the address of the symbol can be fixed up during program linking.

The following example uses the la psuedo instruction to load a symbol address:

.section .text
.globl _start
_start:

        la a0, msg

.section .rodata
msg:
	    .string "Hello World\n"

which generates the following assembler output and relocations as seen by objdump:

0000000000000000 <_start>:
   0:	00000517          	auipc	a0,0x0
			0: R_RISCV_PCREL_HI20	msg
   4:	00850513          	addi	a0,a0,8 # 8 <_start+0x8>
			4: R_RISCV_PCREL_LO12_I	.L11

Constants

Constants are emitted to the symbol table of the object but they do not take any space in the code or data sections. Constants can be referenced in expressions which emit relocations.

The following example shows loading a constant using the %hi and %lo assembler functions.

.equ UART_BASE, 0x40003000

        lui a0,      %hi(UART_BASE)
        addi a0, a0, %lo(UART_BASE)

This example uses the li pseudo instruction to load a constant and writes a string using polled IO to a UART:

.equ UART_BASE,  0x40003000
.equ REG_RBR,    0
.equ REG_TBR,    0
.equ REG_IIR,    2
.equ IIR_TX_RDY, 2
.equ IIR_RX_RDY, 4

.section .text
.globl _start
_start:
1:      auipc a0, %pcrel_hi(msg)    # load msg(hi)
        addi  a0, a0, %pcrel_lo(1b)  # load msg(lo)
2:      jal   ra, puts
3:      j     3b

puts:
        li    a2, UART_BASE
1:      lbu   a1, (a0)
        beqz  a1, 3f
2:      lbu   a3, REG_IIR(a2)
        andi  a3, a3, IIR_TX_RDY
        beqz  a3, 2b
        sb    a1, REG_TBR(a2)
        addi  a0, a0, 1
        j     1b
3:      ret

.section .rodata
msg:
	    .string "Hello World\n"

Control and Status Registers

Control and status registers are typically used to update privileged processor state however there are a few non-privileged instructions that access control and status registers such as the CSR pseudo-instructions rdcycle, rdtime, rdinstret for access to counters and frcsr, frrm, frflags, fscsr, fsrm, fsflags, fsrmi and fsflagsi for controling round mode and accessing floating point accrued exception state.

The following instructions allow reading, writing, setting and clearing bits in CSRs (control and status registers):

CSR Operation Description
CSRRW rd, csr, rs1 Control and Status Register Atomic Read and Write
CSRRS rd, csr, rs1 Control and Status Register Atomic Read and Set Bits
CSRRC rd, csr, rs1 Control and Status Register Atomic Read and Clear Bits
CSRRWI rd, csr, imm5 Control and Status Register Atomic Read and Write Immediate
CSRRSI rd, csr, imm5 Control and Status Register Atomic Read and Set Bits Immediate
CSRRCI rd, csr, imm5 Control and Status Register Atomic Read and Write Immediate

The following code sample shows how to enable interrupts, enable timer interuppts, and then set and wait for a timer interrupt to occur. The example uses CSR instructions and access to a platform specific MMIO (memory mapped input output) region:

.equ RTC_BASE,      0x40000000
.equ TIMER_BASE,    0x40004000

# setup machine trap vector
1:      auipc   t0, %pcrel_hi(mtvec)        # load mtvec(hi)
        addi    t0, t0, %pcrel_lo(1b)       # load mtvec(lo)
        csrrw   zero, mtvec, t0

# set mstatus.MIE=1 (enable M mode interrupt)
        li      t0, 8
        csrrs   zero, mstatus, t0

# set mie.MTIE=1 (enable M mode timer interrupts)
        li      t0, 128
        csrrs   zero, mie, t0

# read from mtime
        li      a0, RTC_BASE
        ld      a1, 0(a0)

# write to mtimecmp
        li      a0, TIMER_BASE
        li      t0, 1000000000
        add     a1, a1, t0
        sd      a1, 0(a0)

# loop
loop:
        wfi
        j loop

# break on interrupt
mtvec:
        csrrc   t0, mcause, zero
        bgez    t0, fail       # interrupt causes are less than zero
        slli    t0, t0, 1      # shift off high bit
        srli    t0, t0, 1
        li      t1, 7          # check this is an m_timer interrupt
        bne t0, t1, fail
        j pass

pass:
        la      a0, pass_msg
        jal     puts
        j       shutdown

fail:
        la      a0, fail_msg
        jal     puts
        j       shutdown

.section .rodata

pass_msg:
        .string "PASS\n"

fail_msg:
        .string "FAIL\n"

References

This project is a continuation of previous work by michaeljclark.