GSOC 2024 Proposal
Table of Contents
1. General Information
Name Getz Mikalsen Email getz@sdf.org Phone +46737301192 IRC getz at libera.chat Matrix @getz:envs.net Timezone UTC+1
I can dedicate a significant amount of my time during the summer. With no other major obligations, I plan to work around 40 hours per week on the proposed tasks and objectives.
As for the timeline, I will align my schedule with the suggested Google Summer of Code program dates. The project will aim to commence on the official coding start date.
I am committed to maintaining open communication with my mentor and the FreeBSD community throughout the project's duration. This will ensure that I stay on track, address any challenges promptly, and incorporate feedback effectively. Regular progress updates and discussions will be a priority to facilitate a smooth and successful collaboration as suggested by GSOC.
1.1. Biography
For the past four years, I have been an active FreeBSD user, running it on my personal laptop. Currently, I serve as a System Administrator at our University's Computer Society, Datorföreningen vid LU/LTH. In this role, I manage a diverse range of computer systems, including legacy machines like VAX, Sun, and SGI, as well as modern servers running FreeBSD, Debian, and OpenBSD. My responsibilities encompass system upgrades, security patch installation, troubleshooting, and backup and recovery procedures.
As an Embedded Electronics Engineering student pursuing a Master's Programme at Lund University (LTH) in Sweden [1], I have gained extensive knowledge in processor design, computer architecture, and I am currently implementing the RISC-V base ISA as a soft core on an FPGA. Furthermore, I have an interest in reverse engineering and actively participate at crackmes.one (previously crackmes.de). Reading and writing assembly code is fun and at times tedious, -O2 is a mess.
Mentor Robert Clausecker <fuz@FreeBSD.org> volunteers to mentor the project.
We have discussed the project on #freebsd.soc on libera.chat
2. Porting amd64 libc SIMD enhancements to arm64 (aarch64)
2.1. Project Description
The proposed project aims to enhance the performance of the FreeBSD libc library by expanding its performance enhancements to the aarch64 architecture. This will be done by introducing NEON and groundwork for SVE support for optimized string manipulation functions.
This will involve addressing the lack of the pmovmskb instruction in NEON
through clever techniques. Furthermore, the project will lay the groundwork for
future SVE support by introducing an "aarch64_archlevel.h"
header file to
handle different architectural levels. The scope of the project will be limited
to porting to NEON due to time constraints and lack of SVE capable hardware.
I currently have access to NEON capable hardware in the form of an M1 Macbook and two Raspberry Pi 5.
2.1.1. libc Porting and Optimization
The initial phase of the project (before midterm evaluation) is suggested to concentrate on porting and optimizing smaller libc functions from the src/lib/libc/amd64/string directory, such as strlen. These functions will serve as a foundation for optimizing other string manipulation routines, including strchrnul, strcat, and related functions.
2.2. Technical Approach
As an initial step, I have begun porting the strlen function to the aarch64 architecture with NEON optimizations. The amd64 implementation leverages the PCMPEQB, PMOVMSKB, and BSF instructions to efficiently scan the string bytes. For aarch64, I discovered the equivalent optimization technique using the SHRN, FMOV, RBIT, CLZ, and bit shifting instructions, as used in the ARM "Optimized Routines" codebase [2]. Important to keep in mind is to avoid the costly moves back and forth to GPRs.
2.3. Significance for FreeBSD
Enhancing the libc library's support for aarch64 architectures is crucial for ensuring FreeBSD's continued relevance and performance on modern ARM-based systems. As the adoption of ARM processors in server and embedded environments continues to grow, optimizing the libc library for these architectures will enable FreeBSD to deliver optimal performance and efficiency on a broader range of hardware platforms.
I am committed to becoming a long-term FreeBSD contributor, actively maintaining and further improving the optimized code and hope that this GSOC project will be a good starting point.
2.4. Applicant Background
As an Embedded Electronics Engineering student pursuing a Master's Programme at Lund University (LTH) in Sweden, I am passionate about performance optimization and low-power computing. Single Instruction Multiple Data (SIMD) technologies like NEON and SVE offer significant potential for improving the efficiency of critical software components, such as the libc library, on ARM-based systems. By leveraging these instruction set extensions, we can unlock substantial performance gains while minimizing power consumption, which is crucial in the embedded and mobile computing domains.
I'm currently involved in a research project with a group at LTH creating an in-memory computing platform using memristor arrays. This project ends before the summer.
I previously maintained of a branch of GerdaOS [3] for the JioPhone1, a featurephone released in India. Until I broke the device while debugging. :D I have completed relevant courses in computer architecture, digital design, operating systems and effective C programming. I am confident in my ability to contribute effectively to the proposed project.
I have a strong background in programming languages such as C, C++, Rust, and shell scripting (z{sh}, sed, awk). Additionally, I have hands-on experience with patching FreeBSD and am familiar with the workflow, "git pull && make buildworld && make buildkernel".
Im familiar with x86-64 and MIPS assembly which positions me well to grasp ARM/NEON concepts quickly.
3. Deliverables
Introduce a new ARCHLEVEL flag for the aarch64 architecture, allowing selection
between base, NEON, and SVE instruction set extensions. This will involve
creating an aarch64_archlevel.h
header file to manage the different levels of
architectural support.
Update the simd(7)
manual page to document the new SIMD-accelerated string
functions for aarch64, including details on the ARCHLEVEL flag and its usage.
And port the functions with already existing SIMD implementations for amd64 to aarch64.
All development and contributions will be hosted in a publicly accessible fork of the FreeBSD source code repository (freebsd-src git repository). The primary deliverable functions for this project are the following:
Functions to be Ported ------------------------ HEADER FUNCTION NOTES string.h stpcpy String copy functions stpncpy strcat strncat strcpy strncpy strlcpy strlcat strchrnul strrchr strcspn strspn strpbrk strsep String tokenisation functions strcmp String comparison functions strncmp memcpy Memory copy functions memccpy memset Memory initialisation functions memchr Memory search functions memrchr memcmp Memory comparison function strlen String length timingsafe_bcmp Constant time functions timingsafe_memcmp Additional functions that will benefit from these enhancements: strings.h index *legacy, through strchr rindex *legacy, through strrchr bcmp *legacy, through memcmp bcopy *legacy, through memmove bzero *legacy, through memset explicit_bzero through memset memset_s through memset mempcpy through memmove memcpy through memmove strdup through memcpy, strlen strtok through strtok_r
3.1. Testing
Performance benchmarking will be carried out the same way fuz did when he wrote libc SIMD enhancements for amd64 using his strfperf [4] utility. Functionality tests are located in src/lib/libc/tests/string, If further tests are needed they will be written.
For example /src/lib/libc/tests/string/memcmp_test.c
has support for custom
memcmp function. Helping development tremendously.
4. Project Schedule
The submitter is based in Copenhagen, Denmark.
I intend to continue studying ARM intrinsics ahead of the start date and provide a proof of concept SIMD port for strlen(3) for the first week. Then gather feedback and investigate the need for additional unit tests for the ported functions.
This work will build from the same algorithmic ideas developed in the SIMD enhancened functions provided by fuz last year [5].
4.1. Timeline
350 hours are allocated for the project as per the GSOC guidelines for a large project.
I request the coding start date to be moved up from May 27 to May 31st due to collisions with an University exam on the 30th
May 1-26 Community bonding
May 31 Coding start
July 8 Midterm goals reached
July 12 Midterm eval deadline
August 26 - September 2 Final GSoC contributor evaluations
[1] https://www.lunduniversity.lu.se/lubas/i-uoh-lu-TAEEE
[2] https://github.com/ARM-software/optimized-routines
[3] https://archive.luxferre.top/gerdaos