Lab 3 AArch64

It’s time for the AArch64 architecture. First of all, the basic code to print a message:

.text
.globl _start

_start:
mov x0, 1 /* file descriptor: 1 is stdout */
adr x1, msg /* message location (memory address) */
mov x2, len /* message length (bytes) */

mov x8, 64 /* write is syscall #64 */
svc 0 /* invoke syscall */

mov x0, 0 /* status -> 0 */
mov x8, 93 /* exit is syscall #93 */
svc 0 /* invoke syscall */

.data
msg: .ascii "Loop: \n"
len = . - msg /* message length */

This simply prints “Loop: ” followed by a newline. Now we’ll print the numbers 0-9, as we did in x86_64 (a little faster this time, since we know more or less what to do). For the loop, we first define the start and end values, as we did in x86_64:

start = 0 /* starting value for the loop index; note that this is a symbol (constant), not a variable */
max = 10 /* loop exits when the index hits this number (loop condition is i<max) */

In _start, we put the start value into a register. The registers in AArch64 have different names, so instead of %r15 we’ll use x7. The operand order in mov is also reversed: the destination comes first. Where in x86_64 we wrote mov $start,%r15, we now replace %r15 with x7, drop the $ prefix from start, and swap the operands:

_start:
mov x7, start

To change the space into a number, we no longer address the position through a variable. Instead, at the beginning of the loop we load the address of the msg label into a register (adr x0, msg). Next we need a register holding the character to print, so we add 48 (the ASCII code for the character 0) to the counter in x7 to get the matching digit character (add x1, x7, 48). Finally, we overwrite the second space in the message, which sits at byte offset 6 (counting from 0), with that character. The strb instruction stores a single byte, so the line feed after the number is left untouched and does not need to be rewritten (strb w1, [x0, 6]).

Note: strb takes the 32-bit name of the source register, so use w1 instead of x1.

If we then increment the counter (add x7, x7, 1) and add the comparison for our “assembly for loop” (cmp x7, max and b.ne loop), the code looks like the listing below and prints the 0-9 loop:

.text
.globl _start

start = 0 /* starting value for the loop index; note that this is a symbol (constant), not a variable */
max = 10 /* loop exits when the index hits this number (loop condition is i<max) */

_start:
mov x7, start /* loop index */

loop:
adr x0, msg /* load x0 with address msg */
add x1, x7, 48 /* turn counter into string number */
strb w1, [x0, 6] /* change second space to number */
mov x0, 1 /* file descriptor: 1 is stdout */
adr x1, msg /* message location (memory address) */
mov x2, len /* message length (bytes) */

mov x8, 64 /* write is syscall #64 */
svc 0 /* invoke syscall */

add x7, x7, 1 /* increment index */
cmp x7, max /* see if we're done */
b.ne loop /* loop if we're not */

mov x0, 0 /* status -> 0 */
mov x8, 93 /* exit is syscall #93 */
svc 0 /* invoke syscall */

.data
msg: .ascii "Loop:  \n" /* two spaces after the colon; the second one (offset 6) is replaced by the digit */
len = . - msg /* message length */

To loop up to 30, we need to change max (max = 31) and add one more space to the message, so that there is room for two digits. We also have to split the counter into its digits: we put the number 10 into a register to use as the divisor (mov x3, 10), compute the quotient in x1 (udiv x1, x7, x3) and the remainder in x2 (msub x2, x1, x3, x7), and turn both into digit characters by adding 48 (add x1, x1, 48 and add x2, x2, 48). The two characters are then stored in consecutive bytes of the message.

Note 2: msub x2, x1, x3, x7 computes x7 - (x1 * x3) and puts the result in x2 (our remainder). For example, if our number is 12 (x7 = 12), then x3 is 10 and x1 is 1 (12 divided by 10 gives a quotient of 1), so x2 = 12 - (1 * 10) = 12 - 10 = 2.
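
To see the division step on its own, here is a minimal sketch that splits one fixed number into two digits and prints it. The symbol value (27) is just a sample I picked, and the message uses # characters as visible placeholders for the two digit positions (offsets 6 and 7) instead of spaces; both are assumptions for illustration, not part of the lab code.

```asm
.text
.globl _start

value = 27                      /* hypothetical sample number to split into digits */

_start:
    mov     x7, value           /* number to convert */
    mov     x3, 10              /* divisor */
    udiv    x1, x7, x3          /* x1 = x7 / 10 (quotient, tens digit) */
    msub    x2, x1, x3, x7      /* x2 = x7 - (x1 * x3) (remainder, ones digit) */
    add     x1, x1, 48          /* tens digit -> ASCII character */
    add     x2, x2, 48          /* ones digit -> ASCII character */

    adr     x0, msg             /* address of the message */
    strb    w1, [x0, 6]         /* store tens character at offset 6 */
    strb    w2, [x0, 7]         /* store ones character at offset 7 */

    mov     x0, 1               /* file descriptor: 1 is stdout */
    adr     x1, msg             /* message location (memory address) */
    mov     x2, len             /* message length (bytes) */
    mov     x8, 64              /* write is syscall #64 */
    svc     0                   /* invoke syscall */

    mov     x0, 0               /* status -> 0 */
    mov     x8, 93              /* exit is syscall #93 */
    svc     0                   /* invoke syscall */

.data
msg:    .ascii  "Loop: ##\n"    /* the two # bytes are overwritten by the digits */
len = . - msg                   /* message length */
```

Assembled and linked the same way as the other programs, this should print “Loop: 27”.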

Now, to remove the leading 0 from the 0-9 part of the loop, we do the same as in x86_64. We add another block of code under its own label that replaces the 0 with a space (mov x1, 32) when the first character equals 48 (cmp x1, 48 and b.eq space), and then carry on with the loop whether or not the replacement happened (b continue).
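
Putting the pieces together, the 0-30 loop could look roughly like the sketch below. It keeps the label names space and continue from the description above. The exact placement of the space block after the exit code and the # placeholder characters in the message are my own choices for illustration, so treat this as one possible layout rather than the definitive solution.

```asm
.text
.globl _start

start = 0                       /* starting value for the loop index */
max = 31                        /* loop exits when the index hits this number */

_start:
    mov     x7, start           /* loop index */

loop:
    mov     x3, 10              /* divisor */
    udiv    x1, x7, x3          /* x1 = quotient (tens digit) */
    msub    x2, x1, x3, x7      /* x2 = x7 - (x1 * x3) (ones digit) */
    add     x1, x1, 48          /* tens digit -> ASCII character */
    add     x2, x2, 48          /* ones digit -> ASCII character */

    cmp     x1, 48              /* is the tens character a 0? */
    b.eq    space               /* if so, swap it for a space */

continue:
    adr     x0, msg             /* load x0 with address of msg */
    strb    w1, [x0, 6]         /* tens character (or space) at offset 6 */
    strb    w2, [x0, 7]         /* ones character at offset 7 */

    mov     x0, 1               /* file descriptor: 1 is stdout */
    adr     x1, msg             /* message location (memory address) */
    mov     x2, len             /* message length (bytes) */
    mov     x8, 64              /* write is syscall #64 */
    svc     0                   /* invoke syscall */

    add     x7, x7, 1           /* increment index */
    cmp     x7, max             /* see if we're done */
    b.ne    loop                /* loop if we're not */

    mov     x0, 0               /* status -> 0 */
    mov     x8, 93              /* exit is syscall #93 */
    svc     0                   /* invoke syscall */

space:
    mov     x1, 32              /* 32 is the ASCII code for a space */
    b       continue            /* go back and print the line */

.data
msg:    .ascii  "Loop: ##\n"    /* the two # bytes are overwritten on every pass */
len = . - msg                   /* message length */
```

With this layout the single-digit lines come out with a space in place of the leading 0, and the lines from 10 to 30 show both digits.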


Last words:

Programming in assembly felt to me like programming in general: like solving a puzzle, working out where each piece has to go. It was also like jumping from a 100-piece puzzle to a 3000-piece one. Good as a challenge, but not something I want as my daily work. I found AArch64 easier than x86_64, but maybe that was because it took me fewer hours to finish as I became more used to assembly code. It was my first time programming in assembly, so it gave me some headaches (literally).

Edit:
The structure of the Makefile is the same:

loop:   loop.s
        as      -g      -o      loop.o  loop.s
        ld              -o      loop    loop.o
