ARM Optimization

Performance improvement on ARM machines


Project EducOOo / Epitech 2011 (see OOo4Kids performances improvement on ARM machines)


Important: If you use this information, remember that everything you find here is under the CC BY-SA license, and you must credit your sources in what you write.

This section contains information from several sources: Christine Rochange 2006 (University Paul Sabatier), Tarik Graba (Telecom Paris Tech) and especially ARM.com.

ARM Instruction Set (armv7+)

ARM is a load/store architecture, where data-processing operations only operate on register contents, not directly on memory contents.

ARMv7 is documented as a set of architecture profiles. Three profiles have been defined as follows:

  1. ARMv7-A: the application profile for systems supporting the "ARM" and "Thumb" instruction sets, and requiring virtual address support in the memory management model.
  2. ARMv7-R: the realtime profile for systems supporting the ARM and Thumb instruction sets, and requiring physical address only support in the memory management model.
  3. ARMv7-M: the microcontroller profile for systems supporting only the Thumb instruction set, and where overall size and deterministic operation for an implementation are more important than absolute performance.

While profiles were formally introduced with the ARMv7 development, the A-profile and R-profile have implicitly existed in earlier versions, associated with the Virtual Memory System Architecture (VMSA) and Protected Memory System Architecture (PMSA) respectively.

Memory and Registers (specific to ARM processors)

Syntax of an asm source file (written in assembler)

Instruction Set

The data processing instructions

Mathematical and logical operations

ADD        r0,r1,r2          @ r0 ← r1 + r2         add
ADC        r0,r1,r2          @ r0 ← r1 + r2 + C     add with Carry
SUB        r0,r1,r2          @ r0 ← r1 - r2         subtract
SBC        r0,r1,r2          @ r0 ← r1 - r2 + C - 1 subtract with Carry
RSB        r0,r1,r2          @ r0 ← r2 - r1         reverse subtract
RSC        r0,r1,r2          @ r0 ← r2 - r1 + C - 1 reverse subtract with Carry
AND        r0,r1,r2          @ r0 ← r1 & r2         bitwise AND
ORR        r0,r1,r2          @ r0 ← r1 | r2         bitwise OR
EOR        r0,r1,r2          @ r0 ← r1 ^ r2         bitwise exclusive OR (XOR)
BIC        r0,r1,r2          @ r0 ← r1 & ~r2        bit clear: clear the bits of r1 selected by r2
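The listing above can be checked against a small model. The following is a minimal sketch in Python (not ARM code), assuming registers hold unsigned 32-bit values; the function names are illustrative and simply mirror the mnemonics. `adds` is an assumed helper that also returns the carry out, standing in for an ADD that updates the flags.

```python
MASK = 0xFFFFFFFF  # registers are 32 bits wide

def adds(r1, r2):
    """ADD that also reports the carry out of bit 31."""
    total = r1 + r2
    return total & MASK, total >> 32

def adc(r1, r2, c):
    """r1 + r2 + C, wrapped to 32 bits."""
    return (r1 + r2 + c) & MASK

def sub(r1, r2):
    """r1 - r2, wrapped to 32 bits."""
    return (r1 - r2) & MASK

def sbc(r1, r2, c):
    """r1 - r2 + C - 1: C = 0 means a borrow is pending."""
    return (r1 - r2 + c - 1) & MASK

def rsb(r1, r2):
    """Reverse subtract: r2 - r1."""
    return (r2 - r1) & MASK

def bic(r1, r2):
    """Clear in r1 every bit that is set in r2."""
    return r1 & ~r2 & MASK

# 64-bit addition built from two 32-bit halves: add the low words,
# then add the high words together with the carry.
low, carry = adds(0xFFFFFFFF, 0x00000001)
high = adc(0x00000000, 0x00000000, carry)
print(hex(high), hex(low))  # prints: 0x1 0x0
```

The ADD/ADC pair is the usual reason ADC exists: the carry produced by the low word is fed into the high word, so values wider than one register can be added.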

Moves between registers

MOV        r0,r2             @ r0 ← r2               copy r2 into r0
MVN        r0,r2             @ r0 ← ~r2              copy the bitwise complement of r2 into r0

Compare

CMP        r1,r2             @ CPSR ← cc(r1 - r2)    compare
CMN        r1,r2             @ CPSR ← cc(r1 + r2)    compare negative 
TST        r1,r2             @ CPSR ← cc(r1 & r2)    test bits
TEQ        r1,r2             @ CPSR ← cc(r1 ⊕ r2)    test bitwise equality

These instructions only update the condition flags in the CPSR; the computed result itself is discarded.
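To make the flag behaviour concrete, here is a minimal Python sketch (not ARM code) of the flags CMP and TST produce, assuming operands are unsigned 32-bit values. The names `flags_cmp` and `flags_tst` are illustrative. For a subtraction, ARM sets C when no borrow occurs.

```python
MASK = 0xFFFFFFFF

def flags_cmp(a, b):
    """Return (N, Z, C, V) as CMP a, b would set them."""
    result = (a - b) & MASK
    n = result >> 31                        # sign bit of the result
    z = int(result == 0)                    # result is zero
    c = int(a >= b)                         # no borrow in unsigned terms
    v = ((a ^ b) & (a ^ result)) >> 31      # signed overflow
    return n, z, c, v

def flags_tst(a, b):
    """Return (N, Z) as TST a, b would set them."""
    result = a & b
    return result >> 31, int(result == 0)

print(flags_cmp(5, 5))            # prints: (0, 1, 1, 0)
print(flags_cmp(3, 5))            # prints: (1, 0, 0, 0)
print(flags_tst(0b1010, 0b0101))  # prints: (0, 1)
```

Equal operands give Z = 1, which is what a following conditional instruction such as a branch-if-equal tests.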

Multiplication

Multiply instructions that produce the low 32 bits of the result:

  1. MUL Multiplies the values of two registers together, truncates the result to 32 bits, and stores the result in a third register.
  2. MLA Multiplies the values of two registers together, adds the value of a third register, truncates the result to 32 bits, and stores the result in a fourth register. This can be used to perform multiply-accumulate operations.
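  3. MLS Multiplies the values of two registers together, subtracts the product from the value of a third register, truncates the result to 32 bits, and stores the result in a fourth register.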

MUL and MLA can optionally set the N (Negative) and Z (Zero) condition code flags; MLS cannot.

No distinction is made between signed and unsigned variants. Only the least significant 32 bits of the result are stored in the destination register, and the sign of the operands does not affect this value.
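The claim that signedness does not matter for the low 32 bits can be checked with a short model. This is a minimal Python sketch (not ARM code), assuming 32-bit registers; `to_u32` is an illustrative helper that gives the two's-complement bit pattern of a Python integer.

```python
MASK = 0xFFFFFFFF

def to_u32(x):
    """Two's-complement 32-bit pattern of a Python integer."""
    return x & MASK

def mul(rn, rm):
    """Low 32 bits of rn * rm."""
    return (rn * rm) & MASK

def mla(rn, rm, ra):
    """Low 32 bits of rn * rm + ra (multiply-accumulate)."""
    return (rn * rm + ra) & MASK

def mls(rn, rm, ra):
    """Low 32 bits of ra - rn * rm (multiply-subtract)."""
    return (ra - rn * rm) & MASK

# -3 * 7 computed on the unsigned bit pattern of -3 gives the bit pattern of -21.
print(mul(to_u32(-3), 7) == to_u32(-21))  # prints: True

# The upper bits of the product are simply lost.
print(mul(0x10000, 0x10000))              # prints: 0
```

Because only the low word is kept, the same instruction serves both signed and unsigned code; a wider result requires the long multiply instructions, which are outside this sketch.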

There are important differences with the other arithmetic operations:

- The second operand cannot be an immediate value.
- The destination register cannot be the same as the first operand register (this restriction applies to architectures before ARMv6).
- If the S bit is set (condition codes updated), the V flag is not changed and the C flag is not meaningful.

Shift

The second operand can be shifted before the operation is applied.

Example:

AND         r0,r1,r2,LSL #6       @ r0 ← r1 & (r2 << 6), i.e. r1 & (r2 * 64)

The shifts available are:

- Logical Shift Left (LSL) = shift of 0 to 31 positions to the left, filling with 0.
- Logical Shift Right (LSR) = shift of 1 to 32 positions to the right, filling with 0.
- Arithmetic Shift Left (ASL) = same as LSL.
- Arithmetic Shift Right (ASR) = shift of 1 to 32 positions to the right, replicating the sign bit.
- ROtate Right (ROR) = circular shift of 1 to 31 positions to the right.
- Rotate Right eXtended (RRX) = shift of exactly one position to the right, inserting the Carry bit at the top.

The ranges above apply to constant shift amounts. The number of positions for the shift can be defined by a constant value (preceded by #) or by a register.
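The shift types can be modelled as follows. This is a minimal Python sketch (not ARM code), assuming unsigned 32-bit register values and shift amounts inside the valid ranges; the function names mirror the mnemonics and are illustrative only.

```python
MASK = 0xFFFFFFFF

def lsl(x, n):
    """Logical shift left: zeros enter on the right."""
    return (x << n) & MASK

def lsr(x, n):
    """Logical shift right: zeros enter on the left."""
    return x >> n

def asr(x, n):
    """Arithmetic shift right: the sign bit (bit 31) is replicated."""
    signed = x - (1 << 32) if x & 0x80000000 else x
    return (signed >> n) & MASK

def ror(x, n):
    """Rotate right: bits leaving on the right re-enter on the left."""
    n %= 32
    return ((x >> n) | (x << (32 - n))) & MASK

def rrx(x, carry):
    """Rotate right extended: one position, Carry enters at bit 31."""
    return (carry << 31) | (x >> 1)

# A shifted second operand, as in: AND r0, r1, r2, LSL #6
r1, r2 = 0xFFFF, 0x3
r0 = r1 & lsl(r2, 6)
print(hex(r0))                    # prints: 0xc0
print(hex(asr(0x80000000, 4)))    # prints: 0xf8000000
print(hex(ror(0x1, 1)))           # prints: 0x80000000
```

Note that a left shift by n multiplies by 2 to the power n, so `LSL #6` scales the operand by 64.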

Data Transfer Instructions
