
Wrapping compiler builtins


Jakob Stoklund Olesen <jolesen@...>
 

Hi,

 

I’m new to Zephyr. It’s an impressive project!

 

I’ve been compiling parts of Zephyr in different environments in order to run unit tests on macOS and using the native_posix board on a Linux machine. I hit some trivial portability issues with the Zephyr code, particularly:

  • Macros like __used and __unused are already defined by a system header on macOS.
  • Integer overflow built-ins like __builtin_add_overflow() were introduced in GCC 5, so they don’t build with GCC 4.8.

 

I see that header files like toolchain/xcc.h define these built-ins for compilers that don’t have them, but redefining built-ins this way has a few problems:

  1. C identifiers starting with a double underscore are reserved for the implementation (compiler + standard libraries), so redefining them could cause problems with future compiler releases. See https://en.cppreference.com/w/c/language/identifier#Reserved_identifiers.
  2. The precise semantics of some built-ins can be impossible to replicate in compilers that don’t support them. For example __builtin_add_overflow() is generic over its argument types in a way that would be difficult to imitate even with C++ templates. See https://gcc.gnu.org/onlinedocs/gcc/Integer-Overflow-Builtins.html.
  3. Some built-ins have unfortunate corner-case behaviors that reflect their history more than good design. For example, __builtin_ctz(x) invokes undefined behavior when x is 0, which can be surprising.

I think both 2) and 3) have the potential to cause subtle security issues that only show up on some platforms.
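
To make point 2 concrete, here is a minimal sketch of how the result of __builtin_add_overflow() depends on the type of the destination pointer rather than on the operand types. It assumes a compiler that provides the built-in (GCC 5 or later, or a recent Clang). The function names add_into_u8() and add_into_i32() are made up for this illustration and are not part of Zephyr.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustration only: both helpers add two plain ints, but the built-in
 * decides "overflow" by checking whether the mathematically exact sum
 * fits in the type that *out points to.
 */

/* Reports overflow whenever a + b is outside 0..255. */
static inline bool add_into_u8(int a, int b, uint8_t *out)
{
	return __builtin_add_overflow(a, b, out);
}

/* Reports overflow only when a + b does not fit in int32_t. */
static inline bool add_into_i32(int a, int b, int32_t *out)
{
	return __builtin_add_overflow(a, b, out);
}
```

With a = 200 and b = 100, the first helper reports an overflow and stores the truncated value 44, while the second stores 300 and reports none. A fallback macro that only inspects the operand types would be hard pressed to get both cases right.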

 

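Other performance-conscious open source projects also use compiler built-ins, but deal with these problems by wrapping built-ins in inline functions. Two examples are:

  • Mozilla: https://hg.mozilla.org/mozilla-central/file/tip/mfbt/MathAlgorithms.h
  • LLVM: https://llvm.org/doxygen/MathExtras_8h_source.html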

These libraries of inline functions address the problems with naked built-ins by:

  1. Defining inline functions with normal names that aren’t reserved. The implementation uses compiler built-ins when available, so the generated machine code is normally identical after inlining.
  2. Defining functions with semantics that don’t extend the C language. For example, there would be separate xxx_add_overflow() functions for each integer type needed.
  3. Avoiding undefined behavior and surprising corner cases. For example CountTrailingZeroes32(0) is defined to return 32.
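
As a rough sketch of what such wrappers could look like in C: every name below (SKETCH_HAS_BUILTIN, u32_add_overflow, u64_add_overflow, u32_count_trailing_zeros) is hypothetical and chosen only for illustration, not taken from Zephyr, Mozilla, or LLVM. The feature detection is deliberately simplistic and assumes that unsigned int is at least 32 bits wide.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature-test helper: __has_builtin is not available everywhere. */
#if defined(__has_builtin)
#define SKETCH_HAS_BUILTIN(x) __has_builtin(x)
#else
#define SKETCH_HAS_BUILTIN(x) 0
#endif

/* Clang reports an old __GNUC__, so check __has_builtin first, then GCC >= 5. */
#if SKETCH_HAS_BUILTIN(__builtin_add_overflow) || \
	(defined(__GNUC__) && (__GNUC__ >= 5))
#define SKETCH_HAVE_OVERFLOW_BUILTINS 1
#else
#define SKETCH_HAVE_OVERFLOW_BUILTINS 0
#endif

/* Stores the wrapped sum in *result and returns true if the addition overflowed. */
static inline bool u32_add_overflow(uint32_t a, uint32_t b, uint32_t *result)
{
#if SKETCH_HAVE_OVERFLOW_BUILTINS
	return __builtin_add_overflow(a, b, result);
#else
	*result = a + b;	/* unsigned arithmetic wraps by definition */
	return *result < a;
#endif
}

/* One function per type keeps the semantics within plain C. */
static inline bool u64_add_overflow(uint64_t a, uint64_t b, uint64_t *result)
{
#if SKETCH_HAVE_OVERFLOW_BUILTINS
	return __builtin_add_overflow(a, b, result);
#else
	*result = a + b;
	return *result < a;
#endif
}

/* Counts trailing zero bits; defined to return 32 for x == 0. */
static inline unsigned int u32_count_trailing_zeros(uint32_t x)
{
	if (x == 0) {
		return 32;	/* never hand 0 to the built-in */
	}
#if SKETCH_HAS_BUILTIN(__builtin_ctz) || defined(__GNUC__)
	return (unsigned int)__builtin_ctz(x);
#else
	unsigned int n = 0;

	while ((x & 1u) == 0) {
		x >>= 1;
		n++;
	}
	return n;
#endif
}
```

Callers would then write u32_add_overflow(len, extra, &total) instead of the naked built-in, and the zero check in u32_count_trailing_zeros() typically costs a single compare after inlining.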

 

Would you be interested in pull requests that add something like this to Zephyr? The goal would be to eliminate naked built-ins from the cross-platform parts of the code. The architecture-specific built-ins are much less of a problem, I think.

 

Thanks,

Jakob

 


Kumar Gala
 

On Apr 30, 2019, at 12:37 PM, Jakob Stoklund Olesen <jolesen@...> wrote:

> Hi,
>
> I’m new to Zephyr. It’s an impressive project!
>
> I’ve been compiling parts of Zephyr in different environments in order to run unit tests on macOS and using the native_posix board on a Linux machine. I hit some trivial portability issues with the Zephyr code, particularly:
>   • Macros like __used and __unused are already defined by a system header on macOS.
>   • Integer overflow built-ins like __builtin_add_overflow() were introduced in GCC 5, so they don’t build with GCC 4.8.
>
> I see that header files like toolchain/xcc.h define these built-ins for compilers that don’t have them, but redefining built-ins this way has a few problems:
>   1. C identifiers starting with a double underscore are reserved for the implementation (compiler + standard libraries), so redefining them could cause problems with future compiler releases. See https://en.cppreference.com/w/c/language/identifier#Reserved_identifiers.
>   2. The precise semantics of some built-ins can be impossible to replicate in compilers that don’t support them. For example __builtin_add_overflow() is generic over its argument types in a way that would be difficult to imitate even with C++ templates. See https://gcc.gnu.org/onlinedocs/gcc/Integer-Overflow-Builtins.html.
>   3. Some built-ins have unfortunate corner-case behaviors that reflect their history more than good design. For example, __builtin_ctz(x) invokes undefined behavior when x is 0, which can be surprising.
>
> I think both 2) and 3) have the potential to cause subtle security issues that only show up on some platforms.
>
> Other performance-conscious open source projects also use compiler built-ins, but deal with these problems by wrapping built-ins in inline functions. Two examples are:
>   • Mozilla: https://hg.mozilla.org/mozilla-central/file/tip/mfbt/MathAlgorithms.h
>   • LLVM: https://llvm.org/doxygen/MathExtras_8h_source.html
>
> These libraries of inline functions address the problems with naked built-ins by:
>   1. Defining inline functions with normal names that aren’t reserved. The implementation uses compiler built-ins when available, so the generated machine code is normally identical after inlining.
>   2. Defining functions with semantics that don’t extend the C language. For example, there would be separate xxx_add_overflow() functions for each integer type needed.
>   3. Avoiding undefined behavior and surprising corner cases. For example CountTrailingZeroes32(0) is defined to return 32.
>
> Would you be interested in pull requests that add something like this to Zephyr? The goal would be to eliminate naked built-ins from the cross-platform parts of the code. The architecture-specific built-ins are much less of a problem, I think.
>
> Thanks,
> Jakob

There is interest in this, as we want to support various toolchains beyond GCC, so we need to look at cleaning this up. I’d suggest working up a simple proof-of-concept PR as the best way to discuss this.

- k


Sigvart Hovland
 

I would say there is interest in such a pull request, since it would enable more toolchains to build Zephyr.

 

> Would you be interested in pull requests that add something like this to Zephyr? The goal would be to eliminate naked built-ins from the cross-platform parts of the code. The architecture-specific built-ins are much less of a problem, I think.