The ARM instruction set


The ARM instruction set, described in this section, may be subject to processor-specific restrictions and changes. Particular combinations of instructions must be avoided where noted, as unpredictable results may otherwise occur. Refer to the appropriate ARM processor datasheet for a precise definition of the instruction set, and also refer to companion application notes for information on relevant restrictions and changes.

The most significant variations are those between ARM processors with 26- and 32-bit program counters.

Conditional execution and the `S' bit

All ARM instructions are conditional and are only executed if their condition field matches the N, Z, C and V condition flags of the program status register (PSR). For full details of the processor status flags refer to the ARM Datasheet for the appropriate ARM device. The default condition field setting is execute always; other conditions are specified by appending a two-character condition mnemonic to the instruction mnemonic. A conditionally executed sequence of instructions will usually be shorter and sometimes even faster than a branched-around sequence, because it will not cause breaks in the CPU pipeline.

----------------------------------------------------------
Mnemonic  |Condition                  |CPU condition flags
----------------------------------------------------------
EQ        |EQual                      |Z set              
----------------------------------------------------------
NE        |Not Equal                  |Z clear            
----------------------------------------------------------
CS        |Carry Set/unsigned Higher  |C set              
          |or Same                    |                   
----------------------------------------------------------
CC        |Carry Clear/unsigned LOwer |C clear            
          |than                       |                   
----------------------------------------------------------
MI        |Negative (MInus)           |N set              
----------------------------------------------------------
PL        |Positive (PLus)            |N clear            
----------------------------------------------------------
VS        |oVerflow Set               |V set              
----------------------------------------------------------
VC        |oVerflow Clear             |V clear            
----------------------------------------------------------
HI        |HIgher unsigned            |C set and Z clear  
----------------------------------------------------------
LS        |Lower or Same unsigned     |C clear or Z set   
----------------------------------------------------------
GE        |Greater than or Equal to   |(N and V) set or   
          |                           |(N and V) clear    
----------------------------------------------------------
LT        |Less Than                  |(N set and V clear)
          |                           |or (N clear and V  
          |                           |set)               
----------------------------------------------------------
GT        |Greater Than               |((N and V) set or  
          |                           |clear) and Z clear 
----------------------------------------------------------
LE        |Less than or Equal to      |(N set and V clear)
          |                           |or (N clear and V  
          |                           |set) or Z set      
----------------------------------------------------------

HS (Higher or Same) and LO (LOwer than) are synonyms for CS and CC respectively.
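
As an illustration only, the condition table can be modelled in a few lines of Python. This is a sketch and not part of any ARM tool: the names CONDITIONS and condition_passes, and the use of boolean arguments for the N, Z, C and V flags, are assumptions made for this example.

```python
# Illustrative model of the condition table above (all names are hypothetical).
# Each entry maps a condition mnemonic to a test on the N, Z, C and V flags.
CONDITIONS = {
    "EQ": lambda n, z, c, v: z,
    "NE": lambda n, z, c, v: not z,
    "CS": lambda n, z, c, v: c,
    "CC": lambda n, z, c, v: not c,
    "MI": lambda n, z, c, v: n,
    "PL": lambda n, z, c, v: not n,
    "VS": lambda n, z, c, v: v,
    "VC": lambda n, z, c, v: not v,
    "HI": lambda n, z, c, v: c and not z,
    "LS": lambda n, z, c, v: (not c) or z,
    "GE": lambda n, z, c, v: n == v,
    "LT": lambda n, z, c, v: n != v,
    "GT": lambda n, z, c, v: (n == v) and not z,
    "LE": lambda n, z, c, v: (n != v) or z,
}
# HS and LO are synonyms for CS and CC.
CONDITIONS["HS"] = CONDITIONS["CS"]
CONDITIONS["LO"] = CONDITIONS["CC"]


def condition_passes(mnemonic, n, z, c, v):
    """Return True if an instruction with this condition would execute."""
    return bool(CONDITIONS[mnemonic.upper()](bool(n), bool(z), bool(c), bool(v)))
```

For example, with Z and C set and N and V clear (the flags left by comparing two equal values), the model reports that EQ, HS, LS, GE and LE pass while NE, HI, LT and GT do not.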

Condition flags are set by executed ALU instructions which have the `S' bit set, and by executed comparison instructions. The S bit is set by appending `S' to the instruction mnemonic.

Register Names and `.'

Fifteen registers (R0 to R14), the program counter (PC), and the processor status register (PSR), are all directly accessible to the programmer. Register R15 contains the PC, and in 26-bit address ARMs it contains the PSR too. In 32-bit address ARMs the PSR is separate, and is manipulated by separate instructions.

R14 is used as the subroutine link register, saving a copy of R15 when a Branch with Link instruction is executed (see Branch instructions - B and BL). R13 is conventionally used as a stack pointer.

The non-user processor modes each have their own R13 and R14, and in 32-bit ARMs, PSR registers. FIQ mode additionally has its own R8-R12. When a mode change occurs, because of interrupts, SWIs or traps, R14 of the new mode is set to a copy of R15, and in 32-bit ARMs the PSR of the new mode is copied from the PSR of the old mode. For further details of banked registers and mode changes, consult the appropriate ARM datasheet for the target processor.

Within an assembly language source, the current value of the program counter (PC) can be referred to as `.'. Usually, `.' is 8 bytes ahead of the instruction using it because of pipelining. For example:

LDR R0,[.-8+offset]

loads a word at offset bytes from the current instruction. Please refer to the appropriate ARM datasheet for precise details.

Branch instructions - B and BL

The syntax of these instructions is:

B{L}{condition} expression

where the expression evaluates to the branch destination address. If the address is within ±32MB of the current program counter, it can be expressed directly as an offset. On 32-bit address ARMs, branches of more than 32MB have to be effected by loading the destination address directly into the PC, or by adding a long offset to the PC using a value loaded into a register. Branch with Link saves the PC into R14 of the current bank. To return, use:

MOV    PC, R14

or

LDMFD  SP!, {...,PC}

if the link register has been saved on a stack. Note that these instructions will not restore the original PSR.

The assembler automatically compensates for the effects of pipelining and prefetching when calculating offsets.
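
The compensation can be sketched in Python as follows. This is an illustrative model rather than the assembler's actual code: the function name encode_branch and its parameters are assumptions, and it relies on the PC reading 8 bytes ahead of the branch and on the offset being held as a signed 24-bit word count.

```python
def encode_branch(instruction_address, target_address, link=False, cond=0xE):
    """Build the 32-bit encoding of B or BL (hypothetical helper).

    cond defaults to 0xE, the 'execute always' condition.
    """
    # The PC reads as the branch's own address plus 8 because of pipelining.
    offset = target_address - (instruction_address + 8)
    if offset % 4 != 0:
        raise ValueError("branch offset must be a whole number of words")
    word_offset = offset >> 2
    # The offset field is a signed 24-bit word count, giving roughly +/-32MB.
    if not -(1 << 23) <= word_offset < (1 << 23):
        raise ValueError("target is outside the +/-32MB branch range")
    opcode = 0b1011 if link else 0b1010  # bits 27-24: 101L
    return (cond << 28) | (opcode << 24) | (word_offset & 0xFFFFFF)
```

Under these assumptions a branch to its own address has a byte offset of -8, so encode_branch(0x8000, 0x8000) yields &EAFFFFFE.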

Data processing instructions - MOV and MVN

The syntax of these instructions is:

opcode{condition}{S} destination,operand2

The destination must be a register. Operand2 may be any of:

#constant-expression{,constant-rotation}

register {,shift #constant-expression}

where shift is one of:

    LSL        Logical shift left
    LSR        Logical shift right
    ASR        Arithmetic shift right
    ROR        Rotate right

register {,shift register}

register, RRX

For simple constants (e.g. #&FC000003), the assembler will generate the appropriate rotation for you.
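
The search for a rotation can be sketched in Python. This is a minimal model, assuming the usual ARM immediate form of an 8-bit value rotated right by twice a 4-bit rotation field; the names rol32 and encode_immediate are hypothetical and do not come from the assembler.

```python
def rol32(value, amount):
    """Rotate a 32-bit value left by amount bits."""
    amount %= 32
    return ((value << amount) | (value >> (32 - amount))) & 0xFFFFFFFF


def encode_immediate(value):
    """Return (imm8, rotate_field) such that imm8 rotated right by
    2 * rotate_field equals value, or None if no such pair exists."""
    value &= 0xFFFFFFFF
    for rotate_field in range(16):
        # Rotating left undoes the rotate-right the processor will apply.
        imm8 = rol32(value, 2 * rotate_field)
        if imm8 < 0x100:
            return imm8, rotate_field
    return None
```

With this model, encode_immediate(0xFC000003) returns (0xFF, 3): the byte &FF rotated right by 6 bits. A value such as &101 has no such representation, so the function returns None.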

Data processing instructions - ADC, ADD, AND, BIC, EOR, ORR, RSB, RSC, SBC and SUB

The syntax of this group of instructions is:

opcode{condition}{S} destination,operand1,operand2

The destination and operand1 must both be registers, and operand2 should be as described for the MOV and MVN instructions (see Data processing instructions - MOV and MVN).

With ADD and ADC a carry is generated by 32-bit overflows; for subtractions it is generated if, and only if, underflow (a borrow) did not occur.

With ADD, ADC, SUB, SBC, RSB and RSC the V flag is set if signed overflow occurred, i.e. when the carry into bit 31 was not equal to the carry out of that bit.
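
The carry and overflow rules above can be checked with a small Python model. This is a sketch only: the function names are hypothetical, and subtraction is modelled as adding the bitwise complement of the second operand plus one, which gives the "carry set means no borrow" behaviour described above.

```python
MASK = 0xFFFFFFFF


def add_with_flags(a, b, carry_in=0):
    """Return (result, N, Z, C, V) for a 32-bit addition."""
    a &= MASK
    b &= MASK
    total = a + b + carry_in
    result = total & MASK
    n = result >> 31
    z = int(result == 0)
    c = int(total > MASK)  # carry out of bit 31
    # Carry into bit 31, found by adding the low 31 bits only.
    carry_into_31 = ((a & 0x7FFFFFFF) + (b & 0x7FFFFFFF) + carry_in) >> 31
    v = carry_into_31 ^ c  # signed overflow when the two carries differ
    return result, n, z, c, v


def sub_with_flags(a, b):
    """Model SUB as a + NOT(b) + 1, so C is set when no borrow occurs."""
    return add_with_flags(a, ~b & MASK, 1)
```

For instance, add_with_flags(0x7FFFFFFF, 1) gives a result of &80000000 with N and V set and C clear, while sub_with_flags(5, 3) gives 2 with only C set.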

The N and C flags may also be affected if a shift or rotation was involved in the construction of operand2.

Data processing instructions - CMN, CMP, TEQ and TST

The syntax of these instructions is:

opcode{condition}{P} operand1,operand2

Each of these instructions preserves its operands and produces no result other than updated PSR flags. Operand1 must be a register, and operand2 must be as described for MOV and MVN (see Data processing instructions - MOV and MVN).

If P is not specified, the PSR condition flags are set to the ALU condition flags after the operation (as described above), and the instructions behave as conventional status-setting comparisons.

With 26-bit ARMs, use of P allows direct manipulation of the PSR, as described below. P must not be used with 32-bit ARMs: instead use MSR and MRS (see section PSR transfer - MSR and MRS).

In 26-bit user mode, {opcode}P moves the result of the operation to the PSR, and sets the N, Z, C and V flags from the top four bits of the result. In other 26-bit modes it sets the N, Z, C, V, I and F flags from the top six bits, and the mode bits from the bottom two bits of the result. A typical use of {opcode}P would be to change modes.

PSR transfer - MSR and MRS

The syntax of these instructions is:

MSR{condition} psrl,operand2

MRS{condition} destination,psrs

These instructions are available on 32-bit ARMs only. R15 cannot be used as the destination register. Please refer to your ARM datasheet for precise details.

psrl can be one of CPSR, CPSR_all, CPSR_flg, CPSR_ctl, SPSR, SPSR_all, SPSR_flg, or SPSR_ctl. (CPSR and CPSR_all are synonyms as are SPSR and SPSR_all).

psrs can be one of SPSR, SPSR_all, CPSR or CPSR_all.

Operand2 is as described in section Data processing instructions - MOV and MVN.

In user mode the instructions behave as follows:

MSR CPSR_all, op2        ; CPSR{N,Z,C,V} <- op2
MSR CPSR_flg, op2        ; CPSR{N,Z,C,V} <- op2
MSR CPSR_ctl, op2        ; No effect
MRS Rd, CPSR             ; Rd <- CPSR{N,Z,C,V,I,F,M[4:0]}
MSR SPSR, op2            ; Not valid in user mode
MRS Rd, SPSR             ; Not valid in user mode

In privileged modes the instructions behave as follows:

MSR CPSR_all, op2        ; CPSR{N,Z,C,V,I,F,M[4:0]} <- op2
MSR CPSR_flg, op2        ; CPSR{N,Z,C,V} <- op2
MSR CPSR_ctl, op2        ; CPSR{I,F,M[4:0]} <- op2
MRS Rd, CPSR             ; Rd <- CPSR{N,Z,C,V,I,F,M[4:0]}
MSR SPSR_all, Rm         ; SPSR_mode{N,Z,C,V,I,F,M[4:0]} <- Rm
MSR SPSR_flg, Rm         ; SPSR_mode{N,Z,C,V} <- Rm
MSR SPSR_ctl, Rm         ; SPSR_mode{I,F,M[4:0]} <- Rm
MRS Rd, SPSR             ; Rd <- SPSR_mode{N,Z,C,V,I,F,M[4:0]}
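
The effect of the field suffixes on a write to the CPSR can be modelled in Python. This is an illustrative sketch, not a definition: the function name msr_cpsr is hypothetical, and the bit positions (N, Z, C, V in bits 31 to 28, I in bit 7, F in bit 6, M[4:0] in bits 4 to 0) are assumed from the usual 32-bit ARM PSR layout and should be checked against the datasheet for the target processor.

```python
FLAG_MASK = 0xF0000000     # N, Z, C, V (assumed bits 31-28)
CONTROL_MASK = 0x000000DF  # I (bit 7), F (bit 6), M[4:0] (bits 4-0)

FIELD_MASKS = {
    "all": FLAG_MASK | CONTROL_MASK,
    "flg": FLAG_MASK,
    "ctl": CONTROL_MASK,
}


def msr_cpsr(cpsr, op2, field="all", privileged=False):
    """Return the new CPSR after MSR CPSR_<field>, op2 (hypothetical model)."""
    mask = FIELD_MASKS[field]
    if not privileged:
        # In user mode only the condition flags can be changed.
        mask &= FLAG_MASK
    return (cpsr & ~mask & 0xFFFFFFFF) | (op2 & mask)
```

In this model a user-mode MSR CPSR_ctl leaves the CPSR unchanged, matching the "No effect" entry in the user-mode listing.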

Single data transfer

These instructions come in two forms called pre-indexed and post-indexed. The syntax of pre-indexed instructions is:

opcode{condition}{B} register,[base{,index}]{!}

Post-indexed instructions take the form:

opcode{condition}{B} register,[base]{,index}

B specifies a byte instead of a word transfer (i.e. 8 bits instead of 32). Register is the destination of the load or source of the store. Base must be a register; for pre-indexed addressing, index is added to it to yield the load or store address; with post-indexed addressing, base gives the address for the load or store, and base+index is the value written back to base. In the pre-indexed case ! enables writeback of base+index to base.

Index may be one of the following:

#{-}12-bit-constant-expression
{-}register {, shift #5-bit-constant-expression}

Shift is explained in section Data processing instructions - MOV and MVN. In this second form the value of index is the value in register shifted as specified.
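
The difference between the two addressing forms can be summarised with a short Python model. It is a sketch under simple assumptions: index is taken as an already evaluated signed integer, and the function name transfer_address is hypothetical.

```python
MASK = 0xFFFFFFFF


def transfer_address(base, index, pre_indexed, writeback=False):
    """Return (address used for the transfer, value left in base)."""
    if pre_indexed:
        # [base, index]{!}: the index is applied before the transfer;
        # base is only updated when ! is given.
        address = (base + index) & MASK
        new_base = address if writeback else base
    else:
        # [base], index: the transfer uses base unchanged, and
        # base + index is always written back afterwards.
        address = base
        new_base = (base + index) & MASK
    return address, new_base
```

For example, with base holding &1000 and an index of 4, the pre-indexed form without ! transfers at &1004 and leaves base at &1000, whereas the post-indexed form transfers at &1000 and leaves base at &1004.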

LDR can also be used to generate literal constants, program counter relative constant addresses and external addresses. The syntax is:

LDR register,=expression

If expression is a numeric constant, then a MOV or MVN will be used rather than an LDR if the constant can be constructed by either of these instructions. Otherwise, the assembler will generate a program-relative LDR, and if the desired literal does not already exist within the addressable range of this LDR, it will place the literal in the next literal pool (see also LTORG in section Organisational directives - END, ORG, LTORG and KEEP).

Additionally, LDR or STR can be used to transfer data to or from an address specified by a label (optionally with an offset) as follows:

opcode{cond}{B} register,label-expression

When used in this form, label-expression must either be addressable PC-relative from this instruction, or must be a register-relative label created using the `^' directive with a register operand (see section Describing the layout of store - ^ and #).

Block data transfer

The syntax of these instructions is:

opcode{condition}type base{!},register-list{^}

The opcode is combined with one of eight instruction types with the mnemonics DB, DA, IB, IA, FD, ED, FA, and EA; the meaning of FD, ED, FA and EA varies according to whether a load or store is performed. In detail:

--------------------------------------------------------
STMDB      |Decrement base Before the store             
--------------------------------------------------------
STMDA      |Decrement base After the store              
--------------------------------------------------------
STMIB      |Increment base Before the store             
--------------------------------------------------------
STMIA      |Increment base After the store              
--------------------------------------------------------
LDMDB      |Decrement base Before the load              
--------------------------------------------------------
LDMDA      |Decrement base After the load               
--------------------------------------------------------
LDMIB      |Increment base Before the load              
--------------------------------------------------------
LDMIA      |Increment base After the load               
--------------------------------------------------------
STMFD      |Push registers to a Full stack, Descending  
           |(STMDB)                                     
--------------------------------------------------------
STMED      |Push registers to an Empty stack, Descending
           |(STMDA)                                     
--------------------------------------------------------
STMFA      |Push registers to a Full stack, Ascending   
           |(STMIB)                                     
--------------------------------------------------------
STMEA      |Push registers to an Empty stack, Ascending 
           |(STMIA)                                     
--------------------------------------------------------
LDMFD      |Pop registers from a Full stack, Descending 
           |(LDMIA)                                     
--------------------------------------------------------
LDMED      |Pop registers from an Empty stack,          
           |Descending (LDMIB)                          
--------------------------------------------------------
LDMFA      |Pop registers from a Full stack, Ascending  
           |(LDMDA)                                     
--------------------------------------------------------
LDMEA      |Pop registers from an Empty stack, Ascending
           |(LDMDB)                                     
--------------------------------------------------------

A full stack is one in which the stack pointer points to the last data item written to it, and an empty stack is one where the stack pointer points to the first free slot in it. A descending stack grows from high memory addresses to low, and an ascending stack vice versa.

Base contains the starting address for the transfer and can be any register except R15. If present, ! requests writeback of the updated base address to base after the instruction is executed.
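
The addresses touched by each instruction type can be sketched in Python. This model is illustrative only: the function name block_transfer is hypothetical, words are taken to be 4 bytes, and it assumes the usual ARM behaviour that the lowest-numbered register in the list always occupies the lowest address.

```python
# Stack-oriented mnemonics and their equivalents, as listed in the table above.
STACK_ALIASES = {
    ("STM", "FD"): "DB", ("STM", "ED"): "DA",
    ("STM", "FA"): "IB", ("STM", "EA"): "IA",
    ("LDM", "FD"): "IA", ("LDM", "ED"): "IB",
    ("LDM", "FA"): "DA", ("LDM", "EA"): "DB",
}


def block_transfer(base, register_count, mode):
    """Return (addresses in ascending order, base value after writeback)."""
    size = 4 * register_count
    if mode == "IA":
        lowest = base
    elif mode == "IB":
        lowest = base + 4
    elif mode == "DA":
        lowest = base - size + 4
    elif mode == "DB":
        lowest = base - size
    else:
        raise ValueError("mode must be IA, IB, DA or DB")
    addresses = [lowest + 4 * i for i in range(register_count)]
    new_base = base + size if mode[0] == "I" else base - size
    return addresses, new_base
```

Pushing three registers with STMFD (that is, STMDB) from a stack pointer of &1000 uses addresses &FF4, &FF8 and &FFC and leaves the pointer at &FF4; the matching LDMFD (LDMIA) reads the same three addresses and restores the pointer to &1000.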

Register-list is a comma-separated list of registers, or register ranges enclosed in {}. A register range is two register names joined by a hyphen, and represents the registers named and all those between them. The directive RLIST (see section Miscellaneous directives - ALIGN, NOFP, RLIST and ENTRY) can also be used to create a list of registers to be used. In user mode ^ sets the S bit to load the PSR along with the PC; in privileged modes it forces transfer of the user mode registers.

Multiplies

The syntax of these instructions is:

MUL{condition}{S} destination,operand1,operand2
MLA{condition}{S} destination,operand1,operand2,operand3

The destination and all operands must be registers. MUL multiplies operand1 by operand2, and places the result in the destination register. MLA multiplies operand1 by operand2, adds operand3 to the product and places the result in the destination register. Both instructions work with signed and unsigned integers. For details of how to make multiply instructions execute quickly, see the appropriate ARM datasheet, or the Cookbook.
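
The reason both instructions work for signed and unsigned integers is that only the low 32 bits of the product are kept, and those bits are the same under either interpretation. A minimal Python sketch (the function names are hypothetical) shows this:

```python
MASK = 0xFFFFFFFF


def mul(operand1, operand2):
    """Model MUL: keep the low 32 bits of the product."""
    return (operand1 * operand2) & MASK


def mla(operand1, operand2, operand3):
    """Model MLA: multiply, add operand3, keep the low 32 bits."""
    return (operand1 * operand2 + operand3) & MASK


def as_signed(value):
    """Reinterpret a 32-bit pattern as a two's complement integer."""
    return value - (1 << 32) if value & 0x80000000 else value
```

Multiplying &FFFFFFFE (which is -2 when read as signed) by 3 gives &FFFFFFFA, which reads as -6 when signed and as the truncated unsigned product otherwise.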

Certain combinations of operands should be avoided and are warned against by the assembler. The destination register should not be the same as operand1 as this will give a meaningless result. R15 should not be used as a destination register, nor as an operand. See the appropriate ARM datasheet for further details.

Single data swap

SWP swaps a byte or word quantity between a register and memory, locking the memory bus in the process to preserve atomic operation (where supported by external hardware). The syntax is:

SWP{condition}{B} destination,source,[base]

Destination, source and base must all be registers. B sets the width of the transfer to byte rather than word. The memory address is that in base; its contents are read, the source register is written to it, and the old memory contents are then stored in destination. The same register can serve as source and destination. R15 may not be used as the swap address, the source or the destination.

Software interrupt/supervisor call

This instruction is used by programs to communicate with the host operating system. The syntax is:

SWI{condition} constant-expression

The expression value is truncated to 24 bits (i.e. between &0 and &FFFFFF); it is ignored by the processor but is interpreted by operating system software.

Pseudo-instructions

The Assembler supports several pseudo-instructions which are translated into the appropriate combination of ARM instructions at assembly time.

--------------------------------------------------------
ADR        |Assemble address to register                
--------------------------------------------------------

Because the ARM has no `load effective address' instruction the assembler provides ADR, which will always assemble to produce ADD, SUB, MOV or MVN instructions to generate the address. The syntax is:

ADR{condition}{L} register,expression

The expression can be register-relative, program-relative or numeric. ADR must assemble to one instruction, whereas ADRL allows a wider range of effective addresses to be assembled in two instructions.
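
For the program-relative case, the choice between ADD and SUB can be sketched in Python. This is only a model of the idea, not the assembler's algorithm: the function names are hypothetical, the register-relative and numeric cases (and ADRL) are not covered, and it assumes the PC reads 8 bytes ahead and that the offset must fit the 8-bit rotated immediate form.

```python
def is_encodable(value):
    """True if value is an 8-bit constant rotated right by an even amount."""
    value &= 0xFFFFFFFF
    for rotate in range(0, 32, 2):
        # Rotate left by `rotate` bits to undo the processor's rotate-right.
        rotated = ((value << rotate) | (value >> (32 - rotate))) & 0xFFFFFFFF
        if rotated < 0x100:
            return True
    return False


def adr_program_relative(instruction_address, target_address):
    """Return ("ADD" or "SUB", immediate) for a single-instruction ADR,
    or None if the offset cannot be encoded in one instruction."""
    offset = target_address - (instruction_address + 8)
    if offset >= 0 and is_encodable(offset):
        return "ADD", offset   # ADD register,PC,#offset
    if offset < 0 and is_encodable(-offset):
        return "SUB", -offset  # SUB register,PC,#-offset
    return None
```

In this model an ADR at &8000 that refers to &8018 becomes an ADD of &10 to the PC, and one that refers to its own address becomes a SUB of 8.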

--------------------------------------------------------
NOP        |No operation                                
--------------------------------------------------------

This generates the preferred no-operation code for a given ARM processor, which is often MOV R0,R0. NOP is really a directive and so cannot be used conditionally; not executing a no-operation is the same as executing it, so conditional execution is pointless.