Topics

RFC - C99 issue regarding unnamed union/structs


Flavio Ceolin
 

Hi guys,

One problem with have in Zephyr regarding C99 is that we are using a lot
of unnamed struct/unions. One example is:

"""
struct z_kernel {
/* For compatibility with pre-SMP code, union the first CPU
* record with the legacy fields so code can continue to use
* the "_kernel.XXX" expressions and assembly offsets.
*/
union {
struct _cpu cpus[CONFIG_MP_NUM_CPUS];
#ifndef CONFIG_SMP
struct {
/* nested interrupt count */
u32_t nested;

/* interrupt stack pointer base */
char *irq_stack;

/* currently scheduled thread */
struct k_thread *current;
};
#endif
};

...
"""

Named this structs/unions is OK, just a matter of text-replace around
the project. The problem is that we have some symbols being generated by
macros and I didn't find a clean way to fix it without extra changes.
The macro in question is GEN_OFFSET_SYM(). Take as example the following
code:

"""
GEN_OFFSET_SYM(_kernel_t, current);
"""
If I change it for something like:
"""
GEN_OFFSET_SYM(_kernel_t, cpu.state.current);
"""

It won't work because this macro concats parameters 1 and 2 to make a
new symbol. One solution is using a variadic parameter hack:

"""
#define GEN_OFFSET_SYM_INTERNAL_0(S, M) \
GEN_ABSOLUTE_SYM(__##S##_##M##_##OFFSET, offsetof(S, M))
#define GEN_OFFSET_SYM_INTERNAL_1(S, M, ALIAS) \
GEN_ABSOLUTE_SYM(__##S##_##ALIAS##_##OFFSET, offsetof(S, M))
#define GET_OVERRIDE(_1, _2, _3, NAME, ...) NAME

#define GEN_OFFSET_SYM(...) \
GET_OVERRIDE(__VA_ARGS__, GEN_OFFSET_SYM_INTERNAL_1, \
GEN_OFFSET_SYM_INTERNAL_0)(__VA_ARGS__)
"""

The problem with this solution is that C99 requires at least one
argument in a variadic macro. In other words, it also violates C99.

The other possible solution is change (or add a new) macro that receives three
parameters (the third is an alias).

Do you guys have other solution ? if now which one you prefer, add a new
macro or change the current one (and consequently all invocations).


Regards,
Flavio Ceolin


Peter A. Bigot
 

I don't have a solution, but since the question came up I am interested in any reaction to https://github.com/zephyrproject-rtos/zephyr/issues/9552#issuecomment-449718078 since that's relevant to the topic.

Peter

On 1/14/19 1:47 PM, Flavio Ceolin wrote:
Hi guys,

One problem with have in Zephyr regarding C99 is that we are using a lot
of unnamed struct/unions. One example is:


Flavio Ceolin
 

Hi Peter,

I don't have a solution, but since the question came up I am interested
in any reaction to
https://github.com/zephyrproject-rtos/zephyr/issues/9552#issuecomment-449718078
since that's relevant to the topic.
I've given my opnion there :)


Peter

On 1/14/19 1:47 PM, Flavio Ceolin wrote:
Hi guys,

One problem with have in Zephyr regarding C99 is that we are using a lot
of unnamed struct/unions. One example is:
Flavio Ceolin


Thomas Törnblom
 

While on the subject of language compliance, what about the frequent usage of non-standard gcc extensions that limits portability?

I've started looking at making IAR Embedded Workbench a supported tool chain for Zephyr, and while I have not looked at the cmake build tools yet, I've tried to build one of the examples (hello) using our IDE, and have run into several non-standard things like:

1) Empty structs (struct _caller_saved in kernel_arch_thread.h)

2) Packed structs, "struct __packed__ ..." vs "__packed struct .." or "struct __attribute__((__packed__)) ...", the latter two variants are accepted by our tools, the first is not ( _k_thread_stack_element in kernel.h)

3) Zero sized arrays (several places in the include files for logging)

4) Ellipses/ranges in case (printk.c)

5) alloca

Of these alloca() is probably the most problematic for us as we have no alloca, and no easy way to implement it either as we have no frame pointer. In one place I could use a variable size array, but the rb code requires more work, and I would really prefer not having to use the variable size array either.

Comments?

/Thomas

Den 2019-01-14 kl. 22:11, skrev Peter A. Bigot:

I don't have a solution, but since the question came up I am interested in any reaction to https://github.com/zephyrproject-rtos/zephyr/issues/9552#issuecomment-449718078 since that's relevant to the topic.

Peter

On 1/14/19 1:47 PM, Flavio Ceolin wrote:
Hi guys,

One problem with have in Zephyr regarding C99 is that we are using a lot
of unnamed struct/unions. One example is:




--

Thomas Törnblom, Product Engineer
IAR Systems AB
Box 23051, Strandbodgatan 1
SE-750 23 Uppsala, SWEDEN
Mobile: +46 76 180 17 80 Fax: +46 18 16 78 01
E-mail: thomas.tornblom@... Website: www.iar.com
Twitter: www.twitter.com/iarsystems


Paul Sokolovsky
 

On Mon, 14 Jan 2019 11:47:19 -0800
"Flavio Ceolin" <flavio.ceolin@intel.com> wrote:

Hi guys,

One problem with have in Zephyr regarding C99 is that we are using a
lot of unnamed struct/unions.
Anonymous unions are important part of the C language. An obvious
solution to the problem is switching to the C11 standard.

One example is:

"""
struct z_kernel {
/* For compatibility with pre-SMP code, union the first CPU
* record with the legacy fields so code can continue to use
* the "_kernel.XXX" expressions and assembly offsets.
*/
union {
struct _cpu cpus[CONFIG_MP_NUM_CPUS];
#ifndef CONFIG_SMP
struct {
/* nested interrupt count */
u32_t nested;

/* interrupt stack pointer base */
char *irq_stack;

/* currently scheduled thread */
struct k_thread *current;
};
#endif
};

...
"""
[]

--
Best Regards,
Paul

Linaro.org | Open source software for ARM SoCs
Follow Linaro: http://www.facebook.com/pages/Linaro
http://twitter.com/#!/linaroorg - http://www.linaro.org/linaro-blog


Andy Ross
 

On 1/15/2019 4:25 AM, Paul Sokolovsky wrote:
On Mon, 14 Jan 2019 11:47:19 -0800
"Flavio Ceolin" <flavio.ceolin@intel.com> wrote:
One problem with have in Zephyr regarding C99 is that we are using a
lot of unnamed struct/unions.
Anonymous unions are important part of the C language. An obvious
solution to the problem is switching to the C11 standard.
Yeah, I have a ton of sympathy for poor Flavio having to deal with all
these conflicting rules and requirements, but not a lot for this
particular rule that was clearly solved by the C standards committee
the better part of a decade ago.

That said, the specific example given is just there for compatibility,
to reduce the impact of the SMP introduction. There's no reason we
can't go in and change all instances of e.g. "_kernel.irq_stack" to
"_kernel.cpus[0].irq_stack" (or actually "_current_cpu.irq_stack").

IIRC the change was looking to get a little messy when dealing with
the assembly offset generation, but a little code motion never hurt
anything.

Andy


Andy Ross
 

In order:

1. Empty structs are really useful, for example the spot you're noticing that allows arch code to define its own set of private data, which might be zero-length.  Obviously that could be worked around with appropriate preprocessor trickery (e.g. CONFIG_ARCH_HAS_CALLER_SAVED_STRUCT or something similarly wordy), but not as cleanly.

2. That looks wrong.   We should surely be using a consistent way to specify packed structs.

3. Flexible arrays are valid C99.  You mean a literal zero in the brackets in the declaration?  Yeah, that's the wrong syntax, the standard specifies "[]" for this.

4. That's a "case '1' ... '9':" expression to detect decimal digits in the middle of an otherwise giant switch.   Sure, I guess we could turn that into nine separate case labels, but wow is it nicer this way and boy is it a useful feature that the compiler gives us to express this.  Hint, hint.

5. Dynamic stack allocation (via either variable arrays or alloca()) is problematic for a various reasons (some dumb, but some good -- note that systemd just got hit by a local root exploit last week due to an attacker-controlled alloca!), and I'm actually the worst offender in the current tree.   These are likely all going to have to go away soon, or at least be reworked into a form such that the code will build without dynamic stacks.

Andy


On 1/15/2019 12:47 AM, Thomas Törnblom wrote:
While on the subject of language compliance, what about the frequent usage of non-standard gcc extensions that limits portability?

I've started looking at making IAR Embedded Workbench a supported tool chain for Zephyr, and while I have not looked at the cmake build tools yet, I've tried to build one of the examples (hello) using our IDE, and have run into several non-standard things like:

1) Empty structs (struct _caller_saved in kernel_arch_thread.h)

2) Packed structs, "struct __packed__ ..." vs "__packed struct .." or "struct __attribute__((__packed__)) ...", the latter two variants are accepted by our tools, the first is not ( _k_thread_stack_element in kernel.h)

3) Zero sized arrays (several places in the include files for logging)

4) Ellipses/ranges in case (printk.c)

5) alloca

Of these alloca() is probably the most problematic for us as we have no alloca, and no easy way to implement it either as we have no frame pointer. In one place I could use a variable size array, but the rb code requires more work, and I would really prefer not having to use the variable size array either.

Comments?

/Thomas

Den 2019-01-14 kl. 22:11, skrev Peter A. Bigot:
I don't have a solution, but since the question came up I am interested in any reaction to https://github.com/zephyrproject-rtos/zephyr/issues/9552#issuecomment-449718078 since that's relevant to the topic.

Peter

On 1/14/19 1:47 PM, Flavio Ceolin wrote:
Hi guys,

One problem with have in Zephyr regarding C99 is that we are using a lot
of unnamed struct/unions. One example is:




--

Thomas Törnblom, Product Engineer
IAR Systems AB
Box 23051, Strandbodgatan 1
SE-750 23 Uppsala, SWEDEN
Mobile: +46 76 180 17 80 Fax: +46 18 16 78 01
E-mail: thomas.tornblom@... Website: www.iar.com
Twitter: www.twitter.com/iarsystems


Flavio Ceolin
 

Hi,

In order:

1. Empty structs are really useful, for example the spot you're noticing
that allows arch code to define its own set of private data, which might
be zero-length.  Obviously that could be worked around with appropriate
preprocessor trickery (e.g. CONFIG_ARCH_HAS_CALLER_SAVED_STRUCT or
something similarly wordy), but not as cleanly.
I'm with Thomas, this is a GCC extension we should rely on them.


2. That looks wrong.   We should surely be using a consistent way to
specify packed structs.
Thumbs up


3. Flexible arrays are valid C99.  You mean a literal zero in the
brackets in the declaration?  Yeah, that's the wrong syntax, the
standard specifies "[]" for this.
Literal zero is GCC extension and about flexible array "[]" it can be
used *only* as the last element in a struct (and this struct cannot be
used in array or inside other struct/union - GCC allows it).



4. That's a "case '1' ... '9':" expression to detect decimal digits in
the middle of an otherwise giant switch.   Sure, I guess we could turn
that into nine separate case labels, but wow is it nicer this way and
boy is it a useful feature that the compiler gives us to express this. 
Hint, hint.

5. Dynamic stack allocation (via either variable arrays or alloca()) is
problematic for a various reasons (some dumb, but some good -- note that
systemd just got hit by a local root exploit last week due to an
attacker-controlled alloca!), and I'm actually the worst offender in the
current tree.   These are likely all going to have to go away soon, or
at least be reworked into a form such that the code will build without
dynamic stacks.
Yeah, remove alloca usage is something that is being tracked (in some
issues actually). e.g >https://github.com/zephyrproject-rtos/zephyr/issues/9892

Andy


On 1/15/2019 12:47 AM, Thomas Törnblom wrote:
While on the subject of language compliance, what about the frequent
usage of non-standard gcc extensions that limits portability?

I've started looking at making IAR Embedded Workbench a supported tool
chain for Zephyr, and while I have not looked at the cmake build tools
yet, I've tried to build one of the examples (hello) using our IDE,
and have run into several non-standard things like:

1) Empty structs (struct _caller_saved in kernel_arch_thread.h)

2) Packed structs, "struct __packed__ ..." vs "__packed struct .." or
"struct __attribute__((__packed__)) ...", the latter two variants are
accepted by our tools, the first is not ( _k_thread_stack_element in
kernel.h)

3) Zero sized arrays (several places in the include files for logging)

4) Ellipses/ranges in case (printk.c)

5) alloca

Of these alloca() is probably the most problematic for us as we have
no alloca, and no easy way to implement it either as we have no frame
pointer. In one place I could use a variable size array, but the rb
code requires more work, and I would really prefer not having to use
the variable size array either.

Comments?

/Thomas

Den 2019-01-14 kl. 22:11, skrev Peter A. Bigot:
I don't have a solution, but since the question came up I am
interested in any reaction to
https://github.com/zephyrproject-rtos/zephyr/issues/9552#issuecomment-449718078
since that's relevant to the topic.

Peter

On 1/14/19 1:47 PM, Flavio Ceolin wrote:
Hi guys,

One problem with have in Zephyr regarding C99 is that we are using a
lot
of unnamed struct/unions. One example is:

--

*Thomas Törnblom*, /Product Engineer/
IAR Systems AB
Box 23051, Strandbodgatan 1
SE-750 23 Uppsala, SWEDEN
Mobile: +46 76 180 17 80 Fax: +46 18 16 78 01
E-mail: thomas.tornblom@iar.com <mailto:thomas.tornblom@iar.com>
Website: www.iar.com <http://www.iar.com>
Twitter: www.twitter.com/iarsystems <http://www.twitter.com/iarsystems>
Regards,
Flavio Ceolin


Thomas Törnblom
 



Den 2019-01-15 kl. 17:45, skrev Andy Ross:
In order:

1. Empty structs are really useful, for example the spot you're noticing that allows arch code to define its own set of private data, which might be zero-length.  Obviously that could be worked around with appropriate preprocessor trickery (e.g. CONFIG_ARCH_HAS_CALLER_SAVED_STRUCT or something similarly wordy), but not as cleanly.
Would there be a problem adding a single anonymous member to this to make it standards compliant?

After all, in C++ empty structs take up space while they don't with gcc.


2. That looks wrong.   We should surely be using a consistent way to specify packed structs.

3. Flexible arrays are valid C99.  You mean a literal zero in the brackets in the declaration?  Yeah, that's the wrong syntax, the standard specifies "[]" for this.

Correct, just the extern declarations, which should be changed to "[]".


4. That's a "case '1' ... '9':" expression to detect decimal digits in the middle of an otherwise giant switch.   Sure, I guess we could turn that into nine separate case labels, but wow is it nicer this way and boy is it a useful feature that the compiler gives us to express this.  Hint, hint.

I agree that it sure looks convenient, but that switch is ~110 lines of code, unrolling that case makes it 119 lines, and standard C.
 

5. Dynamic stack allocation (via either variable arrays or alloca()) is problematic for a various reasons (some dumb, but some good -- note that systemd just got hit by a local root exploit last week due to an attacker-controlled alloca!), and I'm actually the worst offender in the current tree.   These are likely all going to have to go away soon, or at least be reworked into a form such that the code will build without dynamic stacks.

I'm glad to see that the issue with alloca is at least being discussed (https://github.com/zephyrproject-rtos/zephyr/issues/9892).

I see it is a problem also with other vendors tools (https://github.com/zephyrproject-rtos/zephyr/issues/10182).

Andy



Would it be OK to fix issues 1 - 4 in a PR?

/Thomas


On 1/15/2019 12:47 AM, Thomas Törnblom wrote:
While on the subject of language compliance, what about the frequent usage of non-standard gcc extensions that limits portability?

I've started looking at making IAR Embedded Workbench a supported tool chain for Zephyr, and while I have not looked at the cmake build tools yet, I've tried to build one of the examples (hello) using our IDE, and have run into several non-standard things like:

1) Empty structs (struct _caller_saved in kernel_arch_thread.h)

2) Packed structs, "struct __packed__ ..." vs "__packed struct .." or "struct __attribute__((__packed__)) ...", the latter two variants are accepted by our tools, the first is not ( _k_thread_stack_element in kernel.h)

3) Zero sized arrays (several places in the include files for logging)

4) Ellipses/ranges in case (printk.c)

5) alloca

Of these alloca() is probably the most problematic for us as we have no alloca, and no easy way to implement it either as we have no frame pointer. In one place I could use a variable size array, but the rb code requires more work, and I would really prefer not having to use the variable size array either.

Comments?

/Thomas

Den 2019-01-14 kl. 22:11, skrev Peter A. Bigot:
I don't have a solution, but since the question came up I am interested in any reaction to https://github.com/zephyrproject-rtos/zephyr/issues/9552#issuecomment-449718078 since that's relevant to the topic.

Peter

On 1/14/19 1:47 PM, Flavio Ceolin wrote:
Hi guys,

One problem with have in Zephyr regarding C99 is that we are using a lot
of unnamed struct/unions. One example is:




--

Thomas Törnblom, Product Engineer
IAR Systems AB
Box 23051, Strandbodgatan 1
SE-750 23 Uppsala, SWEDEN
Mobile: +46 76 180 17 80 Fax: +46 18 16 78 01
E-mail: thomas.tornblom@... Website: www.iar.com
Twitter: www.twitter.com/iarsystems

--

Thomas Törnblom, Product Engineer
IAR Systems AB
Box 23051, Strandbodgatan 1
SE-750 23 Uppsala, SWEDEN
Mobile: +46 76 180 17 80 Fax: +46 18 16 78 01
E-mail: thomas.tornblom@... Website: www.iar.com
Twitter: www.twitter.com/iarsystems


Andy Ross
 

Thomas Törnblom wrote:
Would there be a problem adding a single anonymous member to this to
make it standards compliant?

After all, in C++ empty structs take up space while they don't with gcc.
And we'd never use this trick in C++ for that reason. :)

I'd personally prefer to see this done via rework that doesn't
introduce needless memory. I know I mocked it, but an
ARCH_HAS_WHATEVER kconfig really isn't an awful thing.

Note that some smarter refactoring could probably find space for this
info in other arch-defined spots that don't have this problem (like on
the stack of the caller instead of in the struct thread, which is done
for various things already and could be unified). That's getting way
beyond language standards compliance though.

Would it be OK to fix issues 1 - 4 in a PR?
I'm not the gatekeeper really, but I wouldn't object, no.

I'd also look more favorably on and be more likely to recommend
toolchains that implement broadly-used industry extensions. :)

Andy


Thomas Törnblom
 



Den 2019-01-16 kl. 18:31, skrev Andy Ross:

I'd also look more favorably on and be more likely to recommend
toolchains that implement broadly-used industry extensions. :)

By "broadly-used industry extensions" you mean "gcc extensions" I guess?

I did a quick test using some other C compilers I have, SunStudio 11/12, Keil uVision 5, VisualStudio 2015, and none of them accepts the "..." extension to the case labels in switch statements, at least not out of the box, so I would hardly call that "broadly-used industry extensions".

To be fair, I did find that SunStudio 12 Update 1 appears to have introduced this.

I might file an RFE for this ;)


Andy


/Thomas

--

Thomas Törnblom, Product Engineer
IAR Systems AB
Box 23051, Strandbodgatan 1
SE-750 23 Uppsala, SWEDEN
Mobile: +46 76 180 17 80 Fax: +46 18 16 78 01
E-mail: thomas.tornblom@... Website: www.iar.com
Twitter: www.twitter.com/iarsystems