Software Portability Project - Stage 2.1 Initial Implementation

arman valaee
Dec 9, 2022

Updated: Dec 9, 2022

In this stage of the project, I am going to implement the design that I planned in the previous related blog posts. You can find those posts through the following links.

Also, if you want to learn more about the iFunc feature, especially on Aarch64 machines, the blog post "GNU iFunc mechanism on Aarch64" can be helpful.

In stage 1.2 I explained that I was going to use C++ as the language for this project, but as I continued my research and gained a better understanding of the project - thanks to Chris Tyler, my professor - I decided to go with Python.

There are several reasons for this decision. The main one is that implementing this program in Python is much easier:

It is a less rigidly structured language than C++,
string manipulation is fast and easy,
and working with files, including reading and writing, requires fewer steps.

There is no reason to go with a more structured language for this project. I initially chose C++ as I have more experience with it.

I have less experience with Python, but fortunately, Python is a pretty simple language and there is no big learning curve as you move to Python from a language such as C++.

Now let's get to the initial implementation of the project.

Implementation - GitHub Repository

To begin my implementation, I created a repository on GitHub and licensed it under the GNU General Public License v2.

The task is to take a function.c file that may be implemented for a specific machine architecture and make it runnable on every type of Aarch64 computer.

The main.c and function.c files are supposed to be built with level 3 optimization (-O3), which includes auto-vectorization.

Not every Aarch64 computer supports every vectorization mechanism, such as SVE and SVE2.

I will get to the main.c file later, but now let's take a look at our sample function.c file:

/*

        adjust_channels :: adjust red/green/blue colour channels in an image
        
        The function returns an adjusted image in the original location.
        
        Copyright (C)2022 Seneca College of Applied Arts and Technology
        Written by Chris Tyler
        Distributed under the terms of the GNU GPL v2
        
*/

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

// ----------------------------------------------------------------- Naive implementation in C

#include <sys/param.h>

void adjust_channels(unsigned char *image, int x_size, int y_size, 
        float red_factor, float green_factor, float blue_factor) {

        printf("Using adjust_channels() implementation #1 - Naive (autovectorizable)\n");
        
/*

        The image is stored in memory as pixels of 3 bytes, representing red/green/blue values.
        Each of these values is multiplied by the corresponding adjustment factor, with 
        saturation, and then stored back to the original memory location.
        
        This simple implementation causes int to float to int conversions.
        
*/

        for (int i = 0; i < x_size * y_size * 3; i += 3) {
                image[i]   = MIN((float)image[i]   * red_factor,   255);
                image[i+1] = MIN((float)image[i+1] * blue_factor,  255);
                image[i+2] = MIN((float)image[i+2] * green_factor, 255);
        }
}

This is a simple function called adjust_channels which scales the red, green, and blue channels of the input image in place.

As I mentioned, this is a sample file. Since the function name, return type, and arguments may be different in other files, my program should be able to work with any provided function file.

I also need three copies of this function, one for each of the 3 vectorization mechanisms we are covering on Aarch64 machines.

Step 1 - Getting the Function Prototype and Name

There are many ways to do this, and it doesn't matter whether the prototype or the name is retrieved first.

I believe the easiest way to do this is to use the makeheaders command-line tool.

This command will take the implementation of the function and create a .h file with the prototype of the given function. In this case, the given function file is function.c, and the targeted function is called adjust_channels.

You can read more about makeheaders over here.

Command: makeheaders function.c

Using the created function.h file I was able to store the function prototype in a local variable as a string.

I retrieved the function name using the find() string method in Python. Everything between the index of the first space character and the first '(' character gives us our function name.

To make it more accurate, we can find the index of the first '(' character and remove everything that comes after it, including the '(' itself. Then, using the rfind() method, we can get the index of the last space before the function name.

There might be some special situations in which this won't work, but I can't think of any at this point.
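The find/rfind approach described above can be sketched as follows. This is a minimal illustration, not the exact code from ifuncCreator.py: the helper name extract_function_name is hypothetical, and the handling of a '*' glued to the name (a pointer return type such as char *foo(...)) is an assumption about one special case the simple version could trip over.

```python
def extract_function_name(proto: str) -> str:
    """Return the function name from a one-line C prototype string."""
    paren = proto.find("(")
    if paren == -1:
        raise ValueError("no '(' found, this does not look like a prototype")
    # Keep only the text before the first '(' (return type + name).
    head = proto[:paren].rstrip()
    # The name starts right after the last space in that text.
    name = head[head.rfind(" ") + 1:]
    # Assumption: a pointer return type may leave '*' attached to the name.
    return name.lstrip("*")


proto = "void adjust_channels(unsigned char *image,int x_size,int y_size);"
print(extract_function_name(proto))                        # adjust_channels
print(extract_function_name("char *make_label(int id);"))  # make_label
```

More exotic declarations, such as a function returning a function pointer, would still need a real C parser.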

Step 2 - Adding the iFunc Attribute to the Function Prototype

This is a simple task as we already have the prototype stored in a local variable.

We can simply add __attribute__ (( ifunc("resolver") )) before the function prototype.

The only notable detail is the name passed as the iFunc argument. We can simply use resolver here and later give the resolver function the same name.

We can also get more creative and put an appropriate argument name here since we already have the function name.

This is how I did it:

iFuncProto = '__attribute__ (( ifunc("resolve_' + funcName + '") )) ' + proto

funcName: This is the name of our function, which I retrieved in step 1.

proto: This is the original function prototype, read from function.h in step 1.

iFuncProto: This is the complete prototype with the iFunc attribute added in front of it.

Note: With this method, it is important that the name of the resolver function matches this format. The program will not work properly if they are different.

I will explain how I managed to match the argument in the iFunc function with this format.
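One way to keep the two names from drifting apart is to build the resolver name in a single place and reuse it everywhere. The sketch below is an assumption about how that could be organised; the helper names resolver_name and add_ifunc_attribute are hypothetical and do not come from the repository.

```python
def resolver_name(func_name: str) -> str:
    """Single source of truth for the resolver function's name."""
    return "resolve_" + func_name


def add_ifunc_attribute(proto: str, func_name: str) -> str:
    """Prefix a C prototype with the GCC ifunc attribute."""
    return '__attribute__ (( ifunc("' + resolver_name(func_name) + '") )) ' + proto


proto = "void adjust_channels(unsigned char *image,int x_size,int y_size);"
print(add_ifunc_attribute(proto, "adjust_channels"))
```

Because both the attribute and the resolver definition would call resolver_name(), a later change to the naming scheme cannot leave them mismatched.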

Step 3 - Pragmas

In this step, we need to place a pragma directive into the output file to select between our targets.

Our targets are:

armv8-a
armv8-a+sve
armv8-a+sve2

I stored these targets as strings in local variables to make it easier to add them to our altered file later.

This is how one of them looks:

	#pragma GCC target "arch=armv8-a+sve2"
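As a rough sketch, the three pragma strings could be generated from the target list instead of being typed out by hand. The names TARGETS and make_pragma are hypothetical, and the trailing blank line after each pragma is only an assumption about the desired output layout.

```python
# Suffix used for each implementation -> GCC architecture string
TARGETS = {
    "SIMD": "armv8-a",
    "SVE": "armv8-a+sve",
    "SVE2": "armv8-a+sve2",
}


def make_pragma(arch: str) -> str:
    """Return a '#pragma GCC target' line for the given architecture string."""
    return '#pragma GCC target "arch=' + arch + '"\n\n'


pragmas = {suffix: make_pragma(arch) for suffix, arch in TARGETS.items()}
print(pragmas["SVE2"], end="")
```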

Step 4 - Pasting the Function 3 Times & Changing Their Names

In the early stages of this program, I opened the function.c file using open("function.c", "r") and stored its content in a local variable (funcOrigin) as a string.

It is important to open it in read-only mode since we don't want to change this file's content.

Since we already have the function name locally stored, we can easily make a copy of the original function and change its name using the replace() function.

I prepared 3 variables to store the name of the functions after modification.

# Adjusting function names based on different implementations
funcNameSIMD = funcName + "_SIMD"
funcNameSVE = funcName + "_SVE"
funcNameSVE2 = funcName + "_SVE2"

funcOriginSIMD = funcOrigin.replace(funcName, funcNameSIMD)
funcOriginSVE = funcOrigin.replace(funcName, funcNameSVE)
funcOriginSVE2 = funcOrigin.replace(funcName, funcNameSVE2)

This is necessary so we don't have any function name conflict in our code.

An interesting side effect of the second code snippet above is that replace() changes every occurrence of the function name, so it also makes a small modification to the function bodies.

Take a look at the original function.c code at the top of this page. You might notice that the function name is repeated in a print statement. This code snippet changes that occurrence as well, so the message matches the new function name.

Later, this will help us understand which function was run after building the program.
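Because replace() rewrites every occurrence of the text, it would also touch any longer identifier that merely contains the function name. A possible refinement, shown here as a sketch rather than what the tool currently does, is a whole-word regular expression; rename_function and adjust_channels_helper are made-up names for illustration.

```python
import re


def rename_function(source: str, old_name: str, new_name: str) -> str:
    """Replace old_name only where it appears as a whole identifier."""
    pattern = r"\b" + re.escape(old_name) + r"\b"
    return re.sub(pattern, new_name, source)


source = "void adjust_channels(int x) { adjust_channels_helper(x); }"
print(rename_function(source, "adjust_channels", "adjust_channels_SVE"))
# void adjust_channels_SVE(int x) { adjust_channels_helper(x); }
```

The name inside the print statement is still renamed, since "adjust_channels()" in the message is a whole word followed by a parenthesis.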

Step 5 - Adding the Resolver Function

The iFunc resolver function is the heart of our code. This is where the code determines the system's capabilities and chooses the appropriate function implementation.

We also need to include this function in our altered function.c file.

There are various ways to include this function in our code with the proper names.

The changes we need to make to this function are pretty much limited to the resolver function's name and the 3 return statements.

The resolver function's name should match the name passed to the iFunc attribute on the function prototype, as I explained in Step 2.

The note in Step 2 is directly related to this name: the two must match.

In addition, the 3 return statements that redirect to the 3 implementations of the function should match their names.

We don't know these names before running the tool, so we have to determine them while it runs.

I believe there are multiple ways to do it but decided to go with the following steps as my approach:

Storing the sample iFunc resolver function as a text file.
Replacing the 3 return statements + the resolver argument name with placeholders.
Replacing these placeholders with appropriate variables: funcNameSIMD, funcNameSVE, funcNameSVE2, and "resolve_" + funcName for the resolver argument. (As shown in step 2 code snippet.)

The actual text file will not be modified, instead, it will get copied into a local variable.
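The placeholder steps above can be sketched like this. The placeholder spellings (%RESOLVER%, %SIMD%, %SVE%, %SVE2%) and the helper fill_resolver are assumptions made for illustration, and the template is inlined as a string here instead of being read from resolver.txt so that the example is self-contained.

```python
# Shortened stand-in for the contents of resolver.txt (placeholders are hypothetical)
RESOLVER_TEMPLATE = """static void (*%RESOLVER%(void)) {
        long hwcaps  = getauxval(AT_HWCAP);
        long hwcaps2 = getauxval(AT_HWCAP2);

        if (hwcaps2 & HWCAP2_SVE2) {
                return %SVE2%;
        } else if (hwcaps & HWCAP_SVE) {
                return %SVE%;
        } else {
                return %SIMD%;
        }
};
"""


def fill_resolver(template: str, func_name: str) -> str:
    """Return a copy of the template with every placeholder filled in."""
    replacements = {
        "%RESOLVER%": "resolve_" + func_name,
        "%SIMD%": func_name + "_SIMD",
        "%SVE%": func_name + "_SVE",
        "%SVE2%": func_name + "_SVE2",
    }
    filled = template
    for placeholder, value in replacements.items():
        filled = filled.replace(placeholder, value)
    return filled


print(fill_resolver(RESOLVER_TEMPLATE, "adjust_channels"))
```

Since strings are immutable in Python, the template itself is never changed; each replace() call returns a new copy.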

Using all these local variables, we can easily create the function_altered.c file which includes:

#include <sys/auxv.h>,
Function prototype with the iFunc resolver attribute (adjust_channels in this case),
3 Different implementations of the function with different auto-vectorization mechanisms,
A pragma for each of the 3 functions,
And the iFunc resolver function.

This code snippet shows how I put them together:

# Create the final function_altered.c file with the 3 implementations of function.c, their pragmas, and the ifunc resolver
# The with statement closes the file automatically once the write is finished
with open("function_altered.c", "w") as ffunctionAltered:
    ffunctionAltered.write(includeSysAux + iFuncProto + pragmaSIMD + funcOriginSIMD +
                seperator + pragmaSVE + funcOriginSVE +
                seperator + pragmaSVE2 + funcOriginSVE2 +
                seperator + resolverFunc)

Note: The separator string is used purely for code formatting and has no technical function. In fact, it is just a comment line!

Executing the Tool -> ifuncCreator.py

This script expects exactly 2 arguments. It will not work without them, and passing more than 2 results in an error message.

The first one should be function.c and the second one is function.h.

resolver.txt should also be available in the same folder but there is no need to pass it as an argument.
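A minimal sketch of such an argument check is shown below. The function name parse_args, the error messages, and the extension check are assumptions for illustration, not the tool's actual behaviour.

```python
def parse_args(argv):
    """Return (c_file, h_file) from an argv-style list, or raise ValueError."""
    # argv[0] is the script name, so exactly 2 user arguments means len(argv) == 3.
    if len(argv) != 3:
        raise ValueError("usage: python ifuncCreator.py function.c function.h")
    c_file, h_file = argv[1], argv[2]
    # Assumption: the two files are told apart by their extensions.
    if not c_file.endswith(".c") or not h_file.endswith(".h"):
        raise ValueError("expected a .c file followed by a .h file")
    return c_file, h_file


print(parse_args(["ifuncCreator.py", "function.c", "function.h"]))
```

In the real script, the list passed in would be sys.argv.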

Run command:
python ifuncCreator.py function.c function.h

This command creates the function_altered.c file.

The following code snippet is from this altered file. You can easily spot the differences between it and the original function.c, which you can review at the top of this post.

#include <sys/auxv.h>

__attribute__ (( ifunc("resolve_adjust_channels") )) void adjust_channels(unsigned char *image,int x_size,int y_size,float red_factor,float green_factor,float blue_factor);

#pragma GCC target "arch=armv8-a"

/*

        adjust_channels_SIMD :: adjust red/green/blue colour channels in an image
        
        The function returns an adjusted image in the original location.
        
        Copyright (C)2022 Seneca College of Applied Arts and Technology
        Written by Chris Tyler
        Distributed under the terms of the GNU GPL v2
        
*/

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

// ----------------------------------------------------------------- Naive implementation in C

#include <sys/param.h>

void adjust_channels_SIMD(unsigned char *image, int x_size, int y_size, 
        float red_factor, float green_factor, float blue_factor) {

        printf("Using adjust_channels_SIMD() implementation #1 - Naive (autovectorizable)\n");
        
/*

        The image is stored in memory as pixels of 3 bytes, representing red/green/blue values.
        Each of these values is multiplied by the corresponding adjustment factor, with 
        saturation, and then stored back to the original memory location.
        
        This simple implementation causes int to float to int conversions.
        
*/

        for (int i = 0; i < x_size * y_size * 3; i += 3) {
                image[i]   = MIN((float)image[i]   * red_factor,   255);
                image[i+1] = MIN((float)image[i+1] * blue_factor,  255);
                image[i+2] = MIN((float)image[i+2] * green_factor, 255);
        }
}



// -----------------------------------------------------------------



#pragma GCC target "arch=armv8-a+sve"

/*

        adjust_channels_SVE :: adjust red/green/blue colour channels in an image
        
        The function returns an adjusted image in the original location.
        
        Copyright (C)2022 Seneca College of Applied Arts and Technology
        Written by Chris Tyler
        Distributed under the terms of the GNU GPL v2
        
*/

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

// ----------------------------------------------------------------- Naive implementation in C

#include <sys/param.h>

void adjust_channels_SVE(unsigned char *image, int x_size, int y_size, 
        float red_factor, float green_factor, float blue_factor) {

        printf("Using adjust_channels_SVE() implementation #1 - Naive (autovectorizable)\n");
        
/*

        The image is stored in memory as pixels of 3 bytes, representing red/green/blue values.
        Each of these values is multiplied by the corresponding adjustment factor, with 
        saturation, and then stored back to the original memory location.
        
        This simple implementation causes int to float to int conversions.
        
*/

        for (int i = 0; i < x_size * y_size * 3; i += 3) {
                image[i]   = MIN((float)image[i]   * red_factor,   255);
                image[i+1] = MIN((float)image[i+1] * blue_factor,  255);
                image[i+2] = MIN((float)image[i+2] * green_factor, 255);
        }
}



// -----------------------------------------------------------------



#pragma GCC target "arch=armv8-a+sve2"

/*

        adjust_channels_SVE2 :: adjust red/green/blue colour channels in an image
        
        The function returns an adjusted image in the original location.
        
        Copyright (C)2022 Seneca College of Applied Arts and Technology
        Written by Chris Tyler
        Distributed under the terms of the GNU GPL v2
        
*/

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

// ----------------------------------------------------------------- Naive implementation in C

#include <sys/param.h>

void adjust_channels_SVE2(unsigned char *image, int x_size, int y_size, 
        float red_factor, float green_factor, float blue_factor) {

        printf("Using adjust_channels_SVE2() implementation #1 - Naive (autovectorizable)\n");
        
/*

        The image is stored in memory as pixels of 3 bytes, representing red/green/blue values.
        Each of these values is multiplied by the corresponding adjustment factor, with 
        saturation, and then stored back to the original memory location.
        
        This simple implementation causes int to float to int conversions.
        
*/

        for (int i = 0; i < x_size * y_size * 3; i += 3) {
                image[i]   = MIN((float)image[i]   * red_factor,   255);
                image[i+1] = MIN((float)image[i+1] * blue_factor,  255);
                image[i+2] = MIN((float)image[i+2] * green_factor, 255);
        }
}



// -----------------------------------------------------------------

// Resolver function - this function picks which of the
// implementations will be executed when foo() is called
//
// The resolver function is only run once, the first time
// that foo() is called.
//
static void (*resolve_adjust_channels(void)) {
        // Each of these two variables is populated with
        // a bitfield indicating specific hardware 
        // capabilities. hwcaps includes a bit for SVE,
        // and hwcaps2 includes a bit for SVE2
        //
        long hwcaps  = getauxval(AT_HWCAP);
        long hwcaps2 = getauxval(AT_HWCAP2);

        printf("\n### Resolver function - selecting the implementation to use for adjust_channels()\n");
        if (hwcaps2 & HWCAP2_SVE2) {
                return adjust_channels_SVE2;
        } else if (hwcaps & HWCAP_SVE) {
                return adjust_channels_SVE;
        } else {
                return adjust_channels_SIMD;
        }
};

After running the tool, our folder needs to contain the 5 files listed below:

function.c
function.h
main.c
resolver.txt
adjust_channels.h

Now we can test the functionality of our created tool using these commands.

gcc -g -O3 main.c function.c -o main

gcc -g -O3 main.c function_altered.c -o ifuncMain -> Will use SIMD

qemu-aarch64 gcc -g -O3 main.c function_altered.c -o ifuncMain -> Will use SVE2

I tried to run these commands from ifuncCreator.py as shell commands using the os.system() function, but it did not work.

I still haven't figured out why, but I am planning to work on it in the next stage of the project.
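I have not diagnosed the os.system() problem, so the following is only a possible alternative: the standard subprocess module runs a command from a list of arguments and captures its exit code and output, which makes failures easier to inspect. The helpers build_command and run_command are hypothetical, and the demonstration runs the Python interpreter rather than gcc so that it works on any machine.

```python
import subprocess
import sys


def build_command(sources, output):
    """Return the gcc argument list for building the given source files."""
    return ["gcc", "-g", "-O3", *sources, "-o", output]


def run_command(args):
    """Run a command given as a list of arguments; return (exit code, output)."""
    result = subprocess.run(args, capture_output=True, text=True)
    return result.returncode, result.stdout + result.stderr


print(build_command(["main.c", "function_altered.c"], "ifuncMain"))

# Demonstration with a command that exists everywhere Python does
code, output = run_command([sys.executable, "-c", "print('ok')"])
print(code, output.strip())
```

Passing the arguments as a list avoids shell quoting issues, and a non-zero exit code together with the captured stderr text shows why a build failed.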

Testing

Testing will help us understand whether our program is working correctly and find its limitations. Using this information, we can develop the program further.

I am planning to cover the testing methods and their results in another blog post!