Abramo Bagnara

On 19/12/21 23:36, Nicolas Pitre wrote:
On Sun, 19 Dec 2021, Abramo Bagnara wrote:

On 19/12/21 02:43, Nicolas Pitre wrote:
On Fri, 17 Dec 2021, Abramo Bagnara wrote:

In the specific case of your PR: have you considered that if, instead of
repeatedly incrementing buf, you incremented a buffer offset, you would
avoid the ugly NULL subtraction?
Well... I took this as a challenge and attempted it.

And it didn't work satisfactorily.

The problem is that the buf pointer accomplishes two roles: it serves as
a buffer offset as well as a memory boundary indicator used for argument
alignment. Going with a buffer base and an explicit offset doesn't allow
for proper alignment without also taking the buffer base into account.
If the buffer base is NULL then you need yet another variable to hold
the misalignment. This complicates the code, increases register
pressure, makes the resulting binary bigger, and doesn't look any better
in the end.
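(Editorial illustration, not code from the thread: the alignment concern above can be sketched like this, with hypothetical names. With a single running pointer, alignment is one operation on its address; with a (base, offset) pair, the base address must enter the computation too.)

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch. With one running pointer, aligning for the next
   argument needs only the pointer's own address: */
static char *align_ptr(char *buf, size_t align)
{
	uintptr_t a = (uintptr_t)buf;

	return (char *)((a + align - 1) & ~(uintptr_t)(align - 1));
}

/* With a (base, offset) pair, aligning the offset alone is correct only
   if the base itself is suitably aligned -- and a NULL base contributes
   no address at all, which is what forces the extra bookkeeping: */
static size_t align_offset(const char *base, size_t off, size_t align)
{
	uintptr_t a = (uintptr_t)base + off;
	uintptr_t aligned = (a + align - 1) & ~(uintptr_t)(align - 1);

	return off + (size_t)(aligned - a);
}
```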

So my efforts resulted in the following PR instead which I hope would be
an acceptable middle ground.
As long as there is still undefined behavior in the code, I'd like you
to resign yourself to the fact that this cannot be accepted.
No. I disagree with such a dogmatic stance.
But let's start with a few facts of life.
Did you know it is illegal to change a light bulb yourself if you live
in Victoria, Australia?
Did you know that the Queen Elizabeth hotel in Montreal is required by
law to provide a stall and food to its clients' horses?
Do you really think those laws are actively enforced?
The law-enforcement analogy is completely misguided: our case is not about your freedom to do something (I assure you that you are perfectly free to introduce undefined behaviors into your code).

OTOH it is about the *compiler's* freedom to do things to your code that you don't expect, but that it is allowed to do.

Are you the kind of person to never drive faster than 70km/h on the
highway because that's the posted speed limit and therefore the law
despite surrounding traffic going at 115km/h? Obviously it is safer to
drive at 70km/h in theory, but in practice, some circumstances would
make it a safety hazard.

The short summary is that you are stating that pointer arithmetic with a NULL
pointer, despite being declared undefined behavior in all C standards and
despite not being a documented GCC extension, is safe.
Yes. And let me be precise: with a typed pointer. I fully agree that:
long foo(int *x) { return x - NULL; }
is undefined and has no sensible semantics. But, on the other hand:
long foo(int *x) { int *y = NULL; return x - y; }
is totally sensible.

You offer as proof the assembly generated for pointer differences,
showing that it is translated as you expect.

You seem to have forgotten that in some obsolete compilers the offsetof macro
was implemented this way, and that this was changed ages ago precisely because
of that undefined behavior (GCC has __builtin_offsetof for exactly this
reason). This is tangible proof that even the GCC authors do not rely on code
generation working correctly when implementing such a macro with null pointer
arithmetic.
I disagree: gcc authors are in a perfect position to know and ensure for
themselves that using a NULL pointer is fine. But they certainly could
have provided a wrapper just to give the C-standard orthodoxy some peace
of mind.
In that case they would have added a documented extension; since they haven't, you can deduce they don't want to, i.e. they remain free to do whatever they like with it.

Let me give a real-life example (without any horses):

The gcc manual says that, as an extension, zero-length arrays are allowed in *C* (and they are used in zephyr), but I have tried and no error is given in C++ either.

But then I try to run this program:

#include <cstdio>

struct S {
    // ctor/dtor bodies reconstructed for illustration; the originals
    // were lost from this message
    S() { printf("ctor %p\n", static_cast<void *>(this)); }
    ~S() { printf("dtor %p\n", static_cast<void *>(this)); }
    int empty[0];
};

int main() {
    S vals[4];
}

And I don't get the expected output.

oops... they told me they were free to do what they want... why didn't I listen?

(I don't want to ruin the suspense by telling you how clang and gcc behave.)

But let's have a look at the most influential and prevalent Open Source
project out there for a minute. Linux does this:
#undef offsetof
#ifdef __compiler_offsetof
#define offsetof(TYPE, MEMBER) __compiler_offsetof(TYPE, MEMBER)
#else
#define offsetof(TYPE, MEMBER) ((size_t)&((TYPE *)0)->MEMBER)
#endif
And elsewhere:
#define __compiler_offsetof(a, b) __builtin_offsetof(a, b)
Someone who has no alternatives is forced to extreme measures; this is a fact of life.

But since the Linux kernel, AFAIK, can be compiled only by gcc and clang, I suspect this is dead code.

So far so good. Yet we can find these:
#define typeof_member(T, m) typeof(((T*)0)->m)
#define container_of(ptr, type, member) ({ \
void *__mptr = (void *)(ptr); \
static_assert(__same_type(*(ptr), ((type *)0)->member) || \
__same_type(*(ptr), void), \
"pointer type mismatch in container_of()"); \
((type *)(__mptr - offsetof(type, member))); })
Isn't it ironic that the extra statement used to improve code
correctness is itself relying on an "undefined" behavior?
Please don't confuse unevaluated or compile-time only constructs with code generation, they don't play the same game and they don't have the same risks.
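(Editorial aside, not from the thread: for readers unfamiliar with the container_of macro quoted above, here is a simplified, self-contained sketch of how it is used. The kernel version adds a compile-time type check via static_assert; the struct and function names below are purely illustrative.)

```c
#include <stddef.h>

/* Simplified container_of for illustration; the Linux kernel version
   layers a static_assert type check on top of this core expression. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct list_head {
	struct list_head *next;
};

struct item {
	int value;
	struct list_head node;
};

/* Given a pointer to the embedded member, recover the enclosing struct: */
static struct item *item_from_node(struct list_head *n)
{
	return container_of(n, struct item, node);
}
```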

All compilers capable of compiling Zephyr that I have access to do
produce the right result, *all* the time. This is true whether or not
the pointer being NULL is known at compile time (when it is, the
compiler also performs constant-propagation optimizations on it). It is
not because something is "undefined" that it is necessarily "forbidden".
No, it is not forbidden; it is in the domain of "you can do what you want, but be prepared to pay the price", and the price is unspecified.

The fact that such an operation remains "undefined" by the standard is
analogous to requiring hotels to stock hay. Maybe it made sense on
I have not said it relates only to the standards; standards can be extended by implementations. I said that it is undefined by the standard *and* it is not a compiler extension, and therefore it is a no-no.

antique CPU architectures where a NULL didn't match the value 0 but some
special magic value the hardware would trigger a fault with. But I
challenge you to find an architecture whose ABI doesn't define NULL as
zero today.
The fact that the ABI represents NULL as zero is completely irrelevant: in all current architectures the signed integer representation is two's complement, yet C arithmetic is still undefined on signed overflow.

I also understand that you have a task to accomplish and it is evaluated
in terms of standard violation occurrence reduction. I'm therefore
proposing this change on top of my latest patch (yes, I'm adding more
My day job is completely irrelevant here, and I invite you not to make this personal.

-#define BUF_OFFSET (size_t)(buf - buf0)
+#define BUF_OFFSET ((uintptr_t)buf - (uintptr_t)buf0)
This should fool the conformance scanning tool just as well as many of
the other conformance changes being proposed.
To say that many of the proposed changes are aimed at fooling the conformance tool is rather offensive, don't you think?

Would that make your life easier?
The ease of my life is far out of scope.

Back to technical:

No, the conversion to uintptr_t is not enough to fix that, as in many places the code author relies on undefined NULL arithmetic.

As I said, one of the possible solutions is to use a buffer offset that is applied only when the buffer is not NULL.
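(A minimal editorial sketch of that suggestion, with hypothetical names rather than the actual Zephyr code: keep a size_t offset and touch memory only when the base is non-NULL, so a measuring pass with a NULL base never performs NULL arithmetic.)

```c
#include <stddef.h>
#include <string.h>

struct out_buf {
	char *base;   /* NULL in "measuring" mode */
	size_t off;
};

static void out_emit(struct out_buf *b, const void *data, size_t len)
{
	if (b->base != NULL)
		memcpy(b->base + b->off, data, len);
	b->off += len;   /* well-defined whether or not base is NULL */
}
```

A first pass with base == NULL leaves the required size in off; a second pass over a real buffer writes the bytes, with no pointer arithmetic ever applied to NULL.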

However, it is definitely not my role to say what should be accepted; this is a job for the official reviewers and, I guess, the steering committee for important cases.

That said, personally I think the attitude toward safety and toward missing standard/compiler guarantees shown in this thread has room for improvement, but I'm sure it will improve as mutual trust and awareness increase.

Abramo Bagnara

