ARM Optimization

Project EducOOo / Epitech 2011 (see OOo4Kids performance improvement on ARM machines)


 * Student : Pédro Moreno
 * Mentor : Eric Bachard

Important: If you use this information, remember that everything you find here is under the CC BY-SA license, and you must cite your sources in what you write.

This section contains information from several sources: Christine Rochange 2006 (University Paul Sabatier), Tarik Graba (Telecom Paris Tech) and especially ARM.com.

= ARM Instruction Set (armv7+) =

ARM is a load/store architecture, where data-processing operations only operate on register contents, not directly on memory contents.

ARMv7 is documented as a set of architecture profiles. Three profiles have been defined as follows:


 * 1) ARMv7-A: the application profile for systems supporting the "ARM" and "Thumb" instruction sets, and requiring virtual address support in the memory management model.
 * 2) ARMv7-R: the realtime profile for systems supporting the ARM and Thumb instruction sets, and requiring physical address only support in the memory management model.
 * 3) ARMv7-M: the microcontroller profile for systems supporting only the Thumb instruction set, and where overall size and deterministic operation for an implementation are more important than absolute performance.

While profiles were formally introduced with the ARMv7 development, the A-profile and R-profile have implicitly existed in earlier versions, associated with the Virtual Memory System Architecture (VMSA) and Protected Memory System Architecture (PMSA) respectively.

Memory and Registers (specific to ARM processors)

Syntax of an asm source file (written in assembler)

Mathematical and logical operations
 ADD       r0,r1,r2          @ r0 ← r1 + r2          addition
 ADC       r0,r1,r2          @ r0 ← r1 + r2 + C      addition with Carry
 SUB       r0,r1,r2          @ r0 ← r1 - r2          subtract
 SBC       r0,r1,r2          @ r0 ← r1 - r2 + C - 1  subtract with Carry
 RSB       r0,r1,r2          @ r0 ← r2 - r1          reverse subtract
 RSC       r0,r1,r2          @ r0 ← r2 - r1 + C - 1  reverse subtract with Carry
 AND       r0,r1,r2          @ r0 ← r1 & r2          and
 ORR       r0,r1,r2          @ r0 ← r1 | r2          or
 EOR       r0,r1,r2          @ r0 ← r1 ^ r2          XOR
 BIC       r0,r1,r2          @ r0 ← r1 & ~r2         clear the bits of r1 selected by r2
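The carry conventions in this table are easy to misread, so here is a minimal Python sketch of the register-level semantics. It assumes 32-bit registers modelled as Python integers masked to 32 bits; the helper names (`add`, `adc`, `add64`, ...) and the `MASK` constant are illustrative only and do not belong to any ARM toolchain.

```python
MASK = 0xFFFFFFFF  # assumption: registers are modelled as 32-bit unsigned values

def add(r1, r2):
    return (r1 + r2) & MASK

def adc(r1, r2, c):
    # c is the Carry flag (0 or 1)
    return (r1 + r2 + c) & MASK

def sub(r1, r2):
    return (r1 - r2) & MASK

def sbc(r1, r2, c):
    # Carry acts as "NOT borrow": c == 1 means no borrow is pending
    return (r1 - r2 + c - 1) & MASK

def rsb(r1, r2):
    return (r2 - r1) & MASK

def rsc(r1, r2, c):
    return (r2 - r1 + c - 1) & MASK

def bic(r1, r2):
    # clear in r1 every bit that is set in r2
    return r1 & ~r2 & MASK

def add64(a_lo, a_hi, b_lo, b_hi):
    # 64-bit addition built from a flag-setting add on the low words
    # followed by ADC on the high words
    lo_full = a_lo + b_lo
    carry = lo_full >> 32          # carry out of the low-word addition
    return lo_full & MASK, adc(a_hi, b_hi, carry)
```

For instance, `add64(0xFFFFFFFF, 0, 1, 0)` propagates the carry from the low word into the high word, which is the usual reason ADC exists.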

Movement between registers
 MOV       r0,r2             @ r0 ← r2               copy r2 into r0
 MVN       r0,r2             @ r0 ← ~r2              copy the bitwise complement of r2 into r0
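MVN inverts every bit of its operand (a bitwise NOT), which is different from a logical negation. A short Python sketch, again assuming 32-bit registers and using the hypothetical helper names `mov` and `mvn`:

```python
MASK = 0xFFFFFFFF  # assumption: 32-bit registers

def mov(r2):
    return r2 & MASK

def mvn(r2):
    # bitwise complement, truncated to 32 bits
    return ~r2 & MASK
```

With this model, `mvn(0)` yields `0xFFFFFFFF`, the 32-bit pattern of -1 in two's complement.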

Compare
 CMP       r1,r2             @ CPSR ← cc(r1 - r2)    compare
 CMN       r1,r2             @ CPSR ← cc(r1 + r2)    compare negative
 TST       r1,r2             @ CPSR ← cc(r1 & r2)    test bits
 TEQ       r1,r2             @ CPSR ← cc(r1 ^ r2)    test bitwise equality

These instructions only update the condition flags in the CPSR; the computed result is discarded.
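To make the effect on the condition flags concrete, the sketch below models CMP and TST in Python. It assumes both operands are already 32-bit unsigned values; `cmp_flags`, `tst_flags` and `to_signed` are hypothetical helper names. For TST only N and Z are modelled here; the carry produced by a shifted operand is left out.

```python
MASK = 0xFFFFFFFF  # assumption: 32-bit registers

def to_signed(x):
    # reinterpret a 32-bit pattern as a two's-complement integer
    return x - (1 << 32) if x & 0x80000000 else x

def cmp_flags(r1, r2):
    result = (r1 - r2) & MASK
    n = result >> 31                 # N: bit 31 of the result
    z = int(result == 0)             # Z: result is zero
    c = int(r1 >= r2)                # C: set when the subtraction needs no borrow
    signed = to_signed(r1) - to_signed(r2)
    v = int(not -(1 << 31) <= signed < (1 << 31))  # V: signed overflow
    return {"N": n, "Z": z, "C": c, "V": v}

def tst_flags(r1, r2):
    result = r1 & r2 & MASK
    return {"N": result >> 31, "Z": int(result == 0)}
```

Comparing equal values sets Z and C, while comparing 3 with 5 clears C (a borrow occurred) and sets N.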

Multiplication
Multiply instructions that produce the bottom 32 bits of the result:

 * 1) MUL Multiplies the values of two registers together, truncates the result to 32 bits, and stores the result in a third register.
 * 2) MLA Multiplies the values of two registers together, adds the value of a third register, truncates the result to 32 bits, and stores the result in a fourth register. This can be used to perform multiply-accumulate operations.
 * 3) MLS Multiplies the values of two registers together, subtracts the product from the value of a third register, truncates the result to 32 bits, and stores the result in a fourth register.

MUL and MLA, the two normal multiply instructions, can optionally set the N (Negative) and Z (Zero) condition code flags; MLS has no flag-setting form.

No distinction is made between signed and unsigned variants. Only the least significant 32 bits of the result are stored in the destination register, and the sign of the operands does not affect this value.
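The claim that signedness does not matter for the low 32 bits can be checked with a small Python model. It assumes 32-bit registers; `mul`, `mla`, `mls` and `to_unsigned` are illustrative helper names, and the operand order of `mla`/`mls` (two factors, then the accumulator) is a choice made for this sketch.

```python
MASK = 0xFFFFFFFF  # assumption: 32-bit registers

def to_unsigned(x):
    # 32-bit two's-complement pattern of a (possibly negative) integer
    return x & MASK

def mul(rm, rs):
    return (rm * rs) & MASK

def mla(rm, rs, rn):
    # multiply-accumulate: rm * rs + rn, truncated to 32 bits
    return (rm * rs + rn) & MASK

def mls(rm, rs, rn):
    # multiply-subtract: rn - rm * rs, truncated to 32 bits
    return (rn - rm * rs) & MASK
```

Multiplying the bit pattern of -3 (`0xFFFFFFFD`) by 7 gives the bit pattern of -21, whether the operand is read as signed or as the unsigned value 4294967293; only the discarded upper bits differ.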

There are important differences with the other arithmetic operations:


 * - The second operand cannot be an immediate value.
 * - The destination register cannot be the same as the first operand register (a restriction of architectures before ARMv6).
 * - If the S bit is set (condition flags updated), the V flag is left unchanged and the C flag is not meaningful.

Shift
The second operand can be shifted before the operation is applied.

example:
 AND       r0,r1,r2,LSL #6      @ r0 ← r1 & (r2 << 6)

The shifts available are:


 * - Logical Shift Left (LSL) = shift from 0 to 31 positions to the left, filling the vacated bits with 0.
 * - Logical Shift Right (LSR) = shift from 0 to 32 positions to the right, filling the vacated bits with 0.
 * - Arithmetic Shift Left (ASL) = same as LSL.
 * - Arithmetic Shift Right (ASR) = shift from 0 to 32 positions to the right, replicating the sign bit.
 * - ROtate Right (ROR) = circular shift (rotation) from 0 to 32 positions to the right.
 * - Rotate Right eXtended (RRX) = shift by exactly one position to the right, with the Carry bit moved into the top bit.

The number of positions for the shift can be defined by a constant value (preceded by #) or by a register.
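The shift types listed above can be sketched in Python as follows, assuming 32-bit registers. The function names mirror the mnemonics but are only illustrative helpers; `rrx` returns the shifted value together with the bit that falls out on the right, which becomes the new carry when the instruction updates the flags.

```python
MASK = 0xFFFFFFFF  # assumption: 32-bit registers

def lsl(x, n):
    return (x << n) & MASK

def lsr(x, n):
    return (x & MASK) >> n

def asr(x, n):
    x &= MASK
    if x & 0x80000000:      # negative value: reinterpret as signed
        x -= 1 << 32
    return (x >> n) & MASK  # Python's >> on negative ints keeps the sign

def ror(x, n):
    n %= 32
    x &= MASK
    return ((x >> n) | (x << (32 - n))) & MASK

def rrx(x, carry):
    x &= MASK
    # carry enters at bit 31, bit 0 leaves as the new carry
    return (carry << 31) | (x >> 1), x & 1
```

A shifted second operand then simply feeds the operation: with `r1 = 0xFFFF` and `r2 = 3`, `AND r0,r1,r2,LSL #6` corresponds to `r1 & lsl(r2, 6)`, that is 192.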