Posts

Showing posts from November, 2022

Plan for the SPO600 Project - Stage 1

Introduction: Hello everyone. In this blog I am going to discuss Stage 1 of my project for SPO600. Stage 1 is about planning the project. In this project we have to create a proof-of-concept tool that builds functions using automatic vectorization. On AArch64, automatic vectorization can target three SIMD implementations: Advanced SIMD, SVE, and SVE2. Some modern versions of the GCC compiler make it possible to choose among these implementations at runtime using the ifunc (indirect function) mechanism. However, ifunc requires additional setup and configuration by the software developer before use. To learn more about automatic vectorization, please visit this link. To learn about ifunc, please visit this link. For a brief overview of ASIMD, SVE, and SVE2, please read my previous blog at this link. The primary focus of this project is to eliminate the setup done by the software developer and automate the process by writing a tool. This tool will enable the developer to build thr...

Auto-vectorization in AArch64.

Auto-vectorization: Hi everyone, I am writing a blog about the concept of automatic vectorization in parallel computing, which can help reduce the cycles and chains in loops. It means that, instead of a scalar implementation, code can be converted to perform vector operations, that is, a single operation applied to multiple operands at once. The GCC compiler is advanced enough to perform vectorization to optimize the code and boost performance. To learn more about auto-vectorization, please visit this link. It can be enabled with optimization flags such as -O3 and -ftree-vectorize. There are three SIMD options on AArch64: Advanced SIMD (Single Instruction, Multiple Data), SVE, and SVE2. SIMD: Single Instruction, Multiple Data is a type of parallel processing. SIMD describes computers with multiple processing elements that perform the same operation on multiple data points simultaneously. To learn more about SIMD, please visit this link. We can build the program using the following commands for an armv8 system: gcc...
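As a rough illustration of the kind of loop the compiler can vectorize, here is a minimal C sketch. The function name add_arrays and its parameters are made up for this example and do not come from the original post.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example function (not from the original post).
 * Adds two arrays element by element. Each iteration is independent
 * of the others, so the compiler is free to process several elements
 * per instruction. `restrict` promises that the arrays do not overlap,
 * which removes one common obstacle to vectorization. */
void add_arrays(int32_t *restrict out, const int32_t *restrict a,
                const int32_t *restrict b, size_t n)
{
    assert(out != NULL && a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}
```

Compiling a file like this with gcc -O3 (which turns on -ftree-vectorize) should let GCC replace the scalar additions with vector instructions. Adding -fopt-info-vec asks GCC to report which loops it vectorized.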

AArch64 Looping first 50 even numbers.

Hi everyone, I am writing a blog about a practice exercise in assembly language for AArch64. Recently, I was studying the assembly language commands and registers for both x86_64 and AArch64 systems. I completed several labs, which gave me a good understanding of the registers and assembler commands for both systems. Here is the link to them. I decided to do some practice exercises to improve my understanding of the registers and to memorize those commands. I decided to make a program which will loop through the first 50 numbers and print only the even ones. Here is the source code of the program: .text .globl _start min = 0 max = 51 /* loop exits when the index hits this number (loop condition is i<max) */ _start: mov x19, min mov x21, 10 loop: /*Start of loop*/ adr x25, msg /*setting the address of string*/ udiv x20, x19, x21 /*unsigned divide - calculating first digit*/ ...
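The loop logic described above (count from min while i < max, keep only the even values) can be sketched in C. This is only an illustration of the idea, not a translation of the assembly program; the function collect_even and its parameters are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the original post).
 * Walks i from min up to, but not including, max, the same loop
 * condition as the assembly version (i < max), and stores each even
 * value in out. Stops early if the buffer (cap entries) fills up.
 * Returns how many values were stored. */
size_t collect_even(unsigned min, unsigned max, unsigned *out, size_t cap)
{
    assert(out != NULL);
    size_t count = 0;
    for (unsigned i = min; i < max && count < cap; i++) {
        if ((i & 1u) == 0) {      /* lowest bit clear means even */
            out[count++] = i;
        }
    }
    return count;
}
```

With min = 0 and max = 51, as in the program above, this collects the even numbers from 0 to 50.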

Lab 4 - 64 bit Assembly Language Lab - Part 2

This is the second part of lab 4 for my course SPO600. I am continuing from my previous blog, part 1 of lab 4, which can be found at this link - Lab4-Part-1. In the previous part I covered three tasks in which I had to write assembly language programs for aarch64 and x86_64. We wrote a loop along with a loop counter and printed the counter with each increment from 0 to 9. In this blog I am going to cover the remaining tasks. Task 4: Extend code for aarch64 to loop up to 30 with 2 digits. In this task, we have to update our previous code for aarch64 and extend the loop to count up to 30. Here we have to print 2 digits instead of one, so on aarch64 we have to use two different registers to store the first digit and the second digit of the counter. I used the udiv command to calculate the quotient, which is the first digit, and stored it in a register. Then, I used the msub command to calculate the remainder, which is the second digit, and stored it in another register. ...
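The udiv and msub steps described above boil down to a little arithmetic, which can be sketched in C. This is an illustration only, assuming a divisor of 10 and a counter below 100; the function split_digits is a hypothetical name and is not part of the lab code.

```c
#include <assert.h>

/* Hypothetical helper (not from the lab code).
 * Splits a counter in the range 0..99 into two ASCII digits. */
void split_digits(unsigned n, char *tens, char *ones)
{
    assert(n < 100);
    unsigned q = n / 10;        /* like udiv: quotient = first digit      */
    unsigned r = n - q * 10;    /* like msub: n - (q * 10) = second digit */
    *tens = (char)('0' + q);    /* convert each digit to its ASCII code   */
    *ones = (char)('0' + r);
}
```

For a counter value of 30 this gives the characters '3' and '0'. For a single-digit value such as 7 it gives '0' and '7'.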

Lab 4 - 64 bit Assembly Language Lab - Part 1

I am writing lab 4 for my course SPO600. This lab is an introduction to 64 bit assembly language for aarch64 and x86_64. The lab is divided into 5 tasks. In these tasks, I have to work with assembly language on both the x86_64 and aarch64 platforms. To access these platforms, we have to connect over SSH to two servers provided by the professor. The first is the Israel server, which runs on an aarch64 system; the second is the Portugal server, which runs on an x86_64 system. Task 1. Build and run aarch64 programs. The first task is to build and run aarch64 programs. I connected to the Israel server and configured some settings for it. Then, I was provided with some examples to work with. To access these examples, I had to unpack a .tgz file, which was new to me. I researched on the web and found this link. The following command on Linux can be used to unpack gzip-compressed tar (.tgz) files. tar zxvf fileNameHere.tgz After unpacking the tar, I received very basic asse...

6502 - Menu Driven Color selector (October Blog)

I recently worked on a 6502 assembly program based on a number guessing game. In this program, I learned to take user input, handle strings, and perform mathematical operations. Here is the link to the blog about the program - Number Guesser. After completing the program, I wanted to practice writing more 6502 assembly. To do that, I thought of making a menu-driven color selector program that displays a color on the bitmap display according to user input. First of all, the program handles the user input; I decided to take a single-digit input (0-9). The user can press Enter to select the option, or press Backspace to re-enter the number. The program displays the color menu in the text output screen. When the user enters an option, the color of the whole bitmap changes. Here is the menu: Here is the code for the program: ; ROM routines define SCINIT $ff81 ; initialize/clear screen define CHRIN $ffcf ; input...