On Sun, 12 Dec 2021, Abramo Bagnara wrote:
Who in their right mind would believe that sprinkling typecasts around
makes the code any safer or more understandable? And that's not
mentioning those _double_ typecasts being added too!!! We should instead
aim for cast _reduction_ to let the compiler warn more about type
mismatches and air out the code, not add new ones!
Note: Sometimes double casts are necessary, but we do hide them behind
special accessors.
Specifically, double casts are needed in two situations:
1) A char argument to ctype predicates should first be converted to
unsigned char (as the C standard requires) and then converted to
int to avoid mixing signed and unsigned. That said, nothing
prevents us from using helpers that accept a char argument or from
embedding such double casts in a macro, e.g.,
#define CTYPE_CHAR(c) ((int)(unsigned char)(c))
That's certainly better. And this provides you with the opportunity to
document the reason for such a macro alongside its definition.
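For illustration, a minimal sketch of how such a macro could be used and
documented; the count_digits helper below is purely hypothetical:

  #include <ctype.h>
  #include <stddef.h>

  /*
   * Convert a possibly signed char to the int value the <ctype.h>
   * predicates expect: first to unsigned char (as the C standard
   * requires), then to int to avoid mixing signed and unsigned.
   */
  #define CTYPE_CHAR(c) ((int)(unsigned char)(c))

  /* Hypothetical helper: count the decimal digits in a string. */
  static size_t count_digits(const char *s)
  {
          size_t n = 0;

          for (; *s != '\0'; s++) {
                  if (isdigit(CTYPE_CHAR(*s)) != 0) {
                          n++;
                  }
          }
          return n;
  }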
2) A narrower computation is assigned to a wider or differently
signed result, e.g.,
u64 = u32_a - u32_b;
In this case we should write
u64 = (uint64_t)(uint32_t)(u32_a - u32_b);
This is used to make it explicit that (1) the type in which the
difference should be computed is uint32_t and (2) that the
(possibly wrapped) result should be converted to the destination
type.
Without that, the risk is that someone writes something like
u64 = u32_a + u32_b;
without realizing that an unwanted wrap may happen.
Well, you aren't preventing any wrapping by adding a bunch of casts.
Of course, if we want an unsigned 64-bit addition, we should write
u64 = (uint64_t)u32_a + u32_b;
Or even:
u64 = u32_a;
u64 += u32_b;
Two lines, but still more enjoyable to read.
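To make the difference concrete, a small self-contained sketch (the
values here are made up):

  #include <stdint.h>
  #include <inttypes.h>
  #include <stdio.h>

  int main(void)
  {
          uint32_t u32_a = 3000000000u;
          uint32_t u32_b = 2000000000u;
          uint64_t u64;

          /* The addition is done in 32 bits and wraps first, then the
           * wrapped result is widened: prints 705032704
           * (5000000000 mod 2^32). */
          u64 = u32_a + u32_b;
          printf("%" PRIu64 "\n", u64);

          /* Widening one operand first performs the addition in 64 bits:
           * prints 5000000000. */
          u64 = (uint64_t)u32_a + u32_b;
          printf("%" PRIu64 "\n", u64);

          return 0;
  }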
Again, it is possible to encapsulate the cast used to express the
"deliberately done" property in a macro.
Only if that was the actual intent.
I doubt such a thing was deliberate. Most likely a value wrap is simply
never a possible outcome in a given code context, and no one deemed it
necessary to spell that out explicitly.
Assigning to a u64 is indeed a bit suspicious in such a case. Maybe the
u64 is unnecessary to start with, and if so that's what needs fixing;
papering over it with a cast pipeline doesn't improve the code.
Why do we have to make all loop index vars unsigned? It is a very common
idiom to test for "i < 0" in loops. Making some signed and some unsigned
is just adding to the inconsistency for no gain.
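The idiom in question might look something like this (clear_backwards is
a made-up name, just a sketch):

  void clear_backwards(int *buf, int n)
  {
          int i = n;

          for (;;) {
                  i--;
                  if (i < 0) {    /* the "i < 0" test: never true if i
                                     were made unsigned */
                          break;
                  }
                  buf[i] = 0;
          }
  }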
Why do we need to test against an unsigned zero everywhere? Does anyone
know about a compiler implementation where a signed 0 vs an unsigned
0U makes any difference?
s32 < 0U and s32 < 0 make a *lot* of difference: this is one of the
reasons why MISRA insists so much on not mixing signed and unsigned.
Here I disagree.
s32 < 5U is wrong.
s32 < 5 is OK
u32 < 5U is OK
u32 < 5 is OK
So the unqualified literal is less cluttered, and always OK.
The qualified 5U is ugly, and wrong in mixed signedness cases.
So the safest thing to do, and the most convivial for developers and
reviewers alike, is to _NOT_ qualify literals as unsigned by default.
The signed/unsigned semantics are already carried by the variables those
literals are used with.
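To illustrate the mixed case, a minimal sketch:

  #include <stdio.h>

  int main(void)
  {
          int s32 = -1;

          /* Mixed signedness: s32 is converted to unsigned int, -1
           * becomes a huge positive value, and the test is false. */
          if (s32 < 5U) {
                  printf("s32 < 5U: true\n");
          } else {
                  printf("s32 < 5U: false (surprise)\n");
          }

          /* Unqualified literal: behaves as expected, prints "true". */
          if (s32 < 5) {
                  printf("s32 < 5: true\n");
          }

          return 0;
  }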
We might consider deviating for signed constants whose value is unchanged
by promotion and the usual arithmetic conversions. Would this be OK for you?
That would be an improvement indeed.
And why cast a pointer difference to a size_t? The official type for a
pointer difference is ptrdiff_t and the standard offers no explicit
guarantee that sizeof(ptrdiff_t) equals sizeof(size_t). People who live
by the standard will tell you that size_t _could_ be smaller and
therefore silently truncate the result.
Actually, a standard C implementation must ensure that the maximum
(positive) value of ptrdiff_t is representable as size_t.
OK.
But still, what's the point? Just to get rid of the signed nature of it?
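For reference, the two ways of holding a pointer difference side by side
(the function name is invented):

  #include <stddef.h>

  void pointer_diff_example(const char *start, const char *end)
  {
          /* ptrdiff_t is the type the standard assigns to a pointer
           * difference; being signed, it can also express end < start. */
          ptrdiff_t diff = end - start;

          /* Casting to size_t drops the signed nature; if end happens to
           * precede start, the value silently becomes huge and positive. */
          size_t len = (size_t)(end - start);

          (void)diff;
          (void)len;
  }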
And a real kicker is this change:
- * @return 0 on success, nonzero on failure
+ * @return false on success, true on failure
Who really thinks that the above is idiomatic and an improvement in code
understandability? It is not that MISRA requires it in this
particular case, but it certainly skewed the mindset of the person who
made that change.
Here I really don't follow you: if the function was *already* returning
a Boolean, why is this change troublesome?
Because it is not returning a boolean. It is returning a non-zero value
on failure. That's not a boolean. The common operating system idiom in
that case is to return an error code, typically negative.
If the particular error reason is unnecessary, then it is of course a
valid simplification to return a boolean instead. But it should be true
for success and false for failure, which is by far the most common
expectation. And then the name of the function should probably be
modified as well to make it clear to users that the function is no
longer following the error code return paradigm. Otherwise it is best to
leave it as is.
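To spell out the two conventions being contrasted here (device_init and
device_is_ready are invented names, not the code under review):

  #include <stdbool.h>
  #include <stddef.h>
  #include <errno.h>

  struct device { int flags; };    /* hypothetical */

  /* Common operating system idiom: 0 on success, a negative error code
   * on failure, so callers can tell why it failed. */
  static int device_init(struct device *dev)
  {
          if (dev == NULL) {
                  return -EINVAL;
          }
          dev->flags = 1;
          return 0;
  }

  /* If the error reason is truly unneeded, a boolean can replace it, but
   * then true should mean success and the name should make the changed
   * convention obvious to callers. */
  static bool device_is_ready(const struct device *dev)
  {
          return (dev != NULL) && (dev->flags != 0);
  }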
Nonetheless I think that the right mindset is that non-adherence
to MISRA rules should be justified carefully, and not the opposite
(i.e., only following the rules that change nothing or that we are
already used to).
I think both cases should be justified carefully. Some rules might have
been justified back when people wrote C code K&R style but are just an
annoyance with no real benefit nowadays.
I've also seen too many times a bug being introduced because the code
was modified just to make it [name your favorite static analysis tool /
coding standard here]-compliant. It is often the case that the code is
questionably written in the first place, and adding typecasts to it
makes it worse even if the scanning tool is then happy.
Nicolas