RFC: Stopping Zephyr networking from being painful by enabling error logging by default


Paul Sokolovsky
 

Hello,

I have more than once mentioned the issue that Zephyr IP networking is
very hard to configure. It's almost impossible to configure
a slightly-above-trivial app from scratch: something won't work with it,
usually silently.

A usual response would be "enable debug logging", but here it goes in
the vicious cycle, because it's hard to enable network debug logging in
Zephyr. It requires setting CONFIG_NET_LOG, then if you're lucky, you
discover CONFIG_NET_LOG_GLOBAL, then you also need to figure out that
you need to set CONFIG_SYS_LOG_NET_LEVEL to a cryptic numeric value.

If you think that's enough, then nah, because CONFIG_NET_LOG_GLOBAL
doesn't really enable logging globally. Then maybe you trace another
config option, after which you will likely either get a flood of DEBUG
level logging, or find out that an important condition is not logged at
all.

Debugging the configuration and debug logging itself is quite painful:
after a year on Zephyr, I still haven't mastered it, so what about
newcomers?

So, I'd like to propose making the following changes:

1. Enable CONFIG_NET_LOG=y, CONFIG_NET_LOG_GLOBAL=y,
CONFIG_SYS_LOG_NET_LEVEL=2 (WARN) by default.

2. Describe (in the docs, or otherwise) that CONFIG_NET_LOG=n is the
master switch to disable all logging at once.

3. Make sure that CONFIG_NET_LOG_GLOBAL=y actually enables logging for
anything network related.

4. Make sure that any important (to user) conditions are actually
reported with NET_WARN() or NET_ERR(), so they will be visible to a user
by default.
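
For illustration, items 1 and 2 would amount to roughly the following
prj.conf-style fragment. This is only a sketch of the proposed defaults
using the option names from this mail; the level numbering in the
comment (1 = ERROR, 2 = WARN, 3 = INFO, 4 = DEBUG) is my reading of the
SYS_LOG levels and should be checked against the tree.

```conf
# Proposed defaults (sketch): network warnings and errors are visible
# out of the box.
CONFIG_NET_LOG=y
CONFIG_NET_LOG_GLOBAL=y
# Assumed numbering: 1 = ERROR, 2 = WARN, 3 = INFO, 4 = DEBUG
CONFIG_SYS_LOG_NET_LEVEL=2
```

An application that wants no network logging at all (e.g. a production
build) would then only need to override the master switch with
CONFIG_NET_LOG=n.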


--
Best Regards,
Paul

Linaro.org | Open source software for ARM SoCs
Follow Linaro: http://www.facebook.com/pages/Linaro
http://twitter.com/#!/linaroorg - http://www.linaro.org/linaro-blog


Boie, Andrew P
 

> So, I'd like to propose making the following changes:
>
> 1. Enable CONFIG_NET_LOG=y, CONFIG_NET_LOG_GLOBAL=y,
> CONFIG_SYS_LOG_NET_LEVEL=2 (WARN) by default.

I would take this even further. It never made sense to me why the logging stuff, by default, prints nothing even in error conditions.
I think for all subsystems we should default to enabling warnings/errors.

Andrew


Jukka Rissanen
 

Hi Paul,

On Wed, 2017-09-13 at 23:02 +0300, Paul Sokolovsky wrote:
> Hello,
>
> I have more than once mentioned the issue that Zephyr IP networking
> is very hard to configure. It's almost impossible to configure
> a slightly-above-trivial app from scratch: something won't work with
> it, usually silently.

I am probably biased here and looking at the code too closely, but what
exactly is hard when configuring the IP stack? Is it difficult to
figure out the suitable config options, are the help texts in Kconfig
options too short, or what? Please elaborate on this.


> A usual response would be "enable debug logging", but here it goes in
> the vicious cycle, because it's hard to enable network debug logging
> in Zephyr. It requires setting CONFIG_NET_LOG, then if you're lucky,
> you discover CONFIG_NET_LOG_GLOBAL, then you also need to figure out
> that you need to set CONFIG_SYS_LOG_NET_LEVEL to a cryptic numeric
> value.

I never use CONFIG_NET_LOG_GLOBAL because it just prints out too
much data, which slows down everything in the device and makes debugging
even harder. I would actually propose that we remove that option, but if
someone finds it useful to have, then we can keep it.


> If you think that's enough, then nah, because CONFIG_NET_LOG_GLOBAL
> doesn't really enable logging globally. Then maybe you trace another
> config option, after which you will likely either get a flood of
> DEBUG level logging, or find out that an important condition is not
> logged at all.

As the name implies, CONFIG_NET_LOG_GLOBAL is only for networking.
It might also miss some new net debug options.


> Debugging the configuration and debug logging itself is quite
> painful: after a year on Zephyr, I still haven't mastered it, so what
> about newcomers?
>
> So, I'd like to propose making the following changes:
>
> 1. Enable CONFIG_NET_LOG=y, CONFIG_NET_LOG_GLOBAL=y,
> CONFIG_SYS_LOG_NET_LEVEL=2 (WARN) by default.

We could set the log level to 2 (warn) as you suggest; there are not
many warnings in the net stack anyway.

I disagree with enabling logging by default: it bloats the binary and
also increases ram / stack usage. Normally you do not need to have
debugging enabled anyway, and if you need it, then it is easy to set
CONFIG_NET_LOG=y, enable individual component logging or global
logging, and then increase the log level.
Perhaps we could have better documentation about this; could you
perhaps send a patch describing how to do it?


> 2. Describe (in the docs, or otherwise) that CONFIG_NET_LOG=n is the
> master switch to disable all logging at once.
>
> 3. Make sure that CONFIG_NET_LOG_GLOBAL=y actually enables logging
> for anything network related.
>
> 4. Make sure that any important (to user) conditions are actually
> reported with NET_WARN() or NET_ERR(), so they will be visible to
> a user by default.


Cheers,
Jukka


Tomasz Bursztyka
 

Hi Andrew,

>> So, I'd like to propose making the following changes:
>>
>> 1. Enable CONFIG_NET_LOG=y, CONFIG_NET_LOG_GLOBAL=y,
>> CONFIG_SYS_LOG_NET_LEVEL=2 (WARN) by default.
> I would take this even further. It never made sense to me why the logging stuff, by default, prints nothing even in error conditions.
> I think for all subsystems we should default to enabling warnings/errors.

That's a syslog issue then. Error level as a default would be nice.

Tomasz


Tomasz Bursztyka
 

Hi guys,

>> A usual response would be "enable debug logging", but here it goes in
>> the vicious cycle, because it's hard to enable network debug logging
>> in Zephyr. It requires setting CONFIG_NET_LOG, then if you're lucky,
>> you discover CONFIG_NET_LOG_GLOBAL, then you also need to figure out
>> that you need to set CONFIG_SYS_LOG_NET_LEVEL to a cryptic numeric
>> value.
> I never use CONFIG_NET_LOG_GLOBAL because it just prints out too
> much data, which slows down everything in the device and makes debugging
> even harder. I would actually propose that we remove that option, but if
> someone finds it useful to have, then we can keep it.

I have been using it when porting new devices; it's an easy option to just see how
the whole thing behaves at boot/init time.

But indeed, it's not a good option when you need to debug an app.

It's off by default anyway. We can improve the docs there, saying that this should not
be used unless the person knows what to do with it,
i.e. getting all errors/warnings, but not at debug level!

>> If you think that's enough, then nah, because CONFIG_NET_LOG_GLOBAL
>> doesn't really enable logging globally. Then maybe you trace another
>> config option, after which you will likely either get a flood of
>> DEBUG level logging, or find out that an important condition is not
>> logged at all.
> As the name implies, CONFIG_NET_LOG_GLOBAL is only for networking.
> It might also miss some new net debug options.

Most probably yes.

Besides all that, debugging the net stack is complex mostly because it's a complex stack.
We could probably add documentation about how to use the logging options effectively;
that's, I guess, the best thing to do right now.


Tomasz


Paul Sokolovsky
 

Hello Andrew,

On Thu, 14 Sep 2017 00:01:09 +0000
"Boie, Andrew P" <andrew.p.boie@intel.com> wrote:

>> So, I'd like to propose making the following changes:
>>
>> 1. Enable CONFIG_NET_LOG=y, CONFIG_NET_LOG_GLOBAL=y,
>> CONFIG_SYS_LOG_NET_LEVEL=2 (WARN) by default.
> I would take this even further. It never made sense to me why the
> logging stuff, by default, prints nothing even in error conditions. I
> think for all subsystems we should default to enabling
> warnings/errors.

Thanks for the support. I'm not very familiar with logging in other
subsystems, so I wanted to start with a humble, pilot proposal first,
though I agree that enabling warnings/errors by default in general makes
sense.

There are of course drawbacks in doing so, so the "risks" and how to
manage them should be considered/planned. My ideas are:

1. We should make sure that we're all on the same page regarding
understanding that any such logging, regardless of its level
(error/warning/info/debug), is *debug* logging. Well, that's why
it's disabled by default. So, we shouldn't be too shy about adding such
logging, including warnings/errors - it's not for production. But the
talk is about making it easy for both novice and regular users to
leverage it, i.e. enabling it by default.

2. From the above, there should be an easy way (ideally, a single global
option) to disable all logging. It should be at the users' fingertips -
online help, READMEs, documentation, etc.

3. This approach is non-linear, in the sense that enabling global
warning/error logging will definitely be useful for users, but one
turn of a knob to "debug" level, and the output will be overwhelming.
So, how to deal with that should also be well documented.

> Andrew


--
Best Regards,
Paul



Paul Sokolovsky
 

On Thu, 14 Sep 2017 09:33:22 +0300
Jukka Rissanen <jukka.rissanen@linux.intel.com> wrote:

> Hi Paul,
>
> On Wed, 2017-09-13 at 23:02 +0300, Paul Sokolovsky wrote:
>> Hello,
>>
>> I have more than once mentioned the issue that Zephyr IP networking
>> is very hard to configure. It's almost impossible to configure
>> a slightly-above-trivial app from scratch: something won't work with
>> it, usually silently.
> I am probably biased here and looking at the code too closely, but what
> exactly is hard when configuring the IP stack? Is it difficult to
> figure out the suitable config options, are the help texts in Kconfig
> options too short, or what? Please elaborate on this.

It's that there are too many options, and the need for a particular
option, or the effect of some option, is hard to anticipate; if one
doesn't anticipate it, the application usually silently fails to do
anything useful, or just crashes.

>> A usual response would be "enable debug logging", but here it goes
>> in the vicious cycle, because it's hard to enable network debug
>> logging in Zephyr. It requires setting CONFIG_NET_LOG, then if
>> you're lucky, you discover CONFIG_NET_LOG_GLOBAL, then you also need
>> to figure out that you need to set CONFIG_SYS_LOG_NET_LEVEL to
>> a cryptic numeric value.
> I never use CONFIG_NET_LOG_GLOBAL because it just prints out too
> much data, which slows down everything in the device and makes
> debugging even harder. I would actually propose that we remove that
> option, but if someone finds it useful to have, then we can keep it.

That's why I propose to enable it only with "warning" logging level.
That will be useful for users to spot misconfigurations/understand
misbehavior. But as I mentioned in the previous reply on the thread
(to Andrew), this leads to a problem that if a user adjusts just
the logging level to "debug", they get flooded with spam-logging.

We should anticipate this problem and give users instructions on how to
deal with it (as in: Kconfig help for CONFIG_NET_LOG_GLOBAL should warn
against its usage with DEBUG, and instead suggest disabling it
and enabling individual module logging).
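
As a sketch of what such help text could point users to: when
debug-level output is needed, turn the global option off and enable
only the module being investigated. The per-module option name below
(CONFIG_NET_DEBUG_ARP) is not named anywhere in this thread; it is an
assumed example, so check the Kconfig of your tree for the actual
per-module options. The DEBUG value 4 is likewise an assumption about
the level numbering.

```conf
CONFIG_NET_LOG=y
# Keep the global switch off at debug level to avoid a flood of output
CONFIG_NET_LOG_GLOBAL=n
# Assumed numbering: 4 = DEBUG
CONFIG_SYS_LOG_NET_LEVEL=4
# Assumed per-module switch; substitute the module you are debugging
CONFIG_NET_DEBUG_ARP=y
```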



>> If you think that's enough, then nah, because CONFIG_NET_LOG_GLOBAL
>> doesn't really enable logging globally. Then maybe you trace another
>> config option, after which you will likely either get a flood of
>> DEBUG level logging, or find out that an important condition is not
>> logged at all.
> As the name implies, CONFIG_NET_LOG_GLOBAL is only for networking.
> It might also miss some new net debug options.

ARP is also networking, BSD Sockets is also networking, SLIP is also
networking. I, a Random J. User, expect to see errors/warnings in all
of them by default (and am ready to be taught how to deal with
debug-level logging enabled).



>> Debugging the configuration and debug logging itself is quite
>> painful: after a year on Zephyr, I still haven't mastered it, so what
>> about newcomers?
>>
>> So, I'd like to propose making the following changes:
>>
>> 1. Enable CONFIG_NET_LOG=y, CONFIG_NET_LOG_GLOBAL=y,
>> CONFIG_SYS_LOG_NET_LEVEL=2 (WARN) by default.
> We could set the log level to 2 (warn) as you suggest; there are not
> many warnings in the net stack anyway.
>
> I disagree with enabling logging by default: it bloats the binary and
> also increases ram / stack usage.

Above you write that there are not so many warnings in the stack. So,
bloating the binary shouldn't be that big a problem. If it is, we need to
optimize the logging and its ram/stack usage. (But we definitely need
to add more warns/errors, that's the whole point.)

> Normally you do not need to have
> debugging enabled anyway,

My experience with Zephyr shows the contrary.

> and if you need it, then it is easy to set
> CONFIG_NET_LOG=y,

So, the talk is about making it uber-easy: users get the errors by
default, and it is easy to *disable* them (by setting
CONFIG_NET_LOG=n).

> enable individual component logging

Enabling individual component logging is easy? But you need to know that
you need to enable it individually, know how to do that, and know which
individual components exist. That's already *very* hard for a novice.

> or global
> logging, and then increase the log level.

You lost me (a random novice user) here.

> Perhaps we could have better documentation about this; could you
> perhaps send a patch describing how to do it?

So, the talk is about making it work with the least user surprise
for novice and casual users - by enabling the err/warn logging by
default. Then, for users who go as far as using Zephyr in production
(where logging should be disabled), there should be an easy way to
disable it (1 config option). Yes, it should be well documented, and
I'll be happy to contribute to such documentation. (Sorry, expecting
that people start on a new project by thoroughly reading the docs is,
well, unrealistic.)

> Cheers,
> Jukka

Thanks,

--
Best Regards,
Paul



Nashif, Anas
 

Hi,
I fully agree with your first paragraph, but then I got completely lost as to how enabling logging is going to solve the issue you have presented. Frankly, the first thing I do when trying out an application is turn off all the logging, because it gets in the way, it is too verbose, and it also disturbs the net shell, which is enabled in many applications.

The main problem I see in many of the networking samples is the fact that we have developers designing the applications for their needs and for their own testing environment, i.e. we do have most configurations with whatever is needed to run in Qemu and in many cases with logging enabled and with kernel and networking options that worked for "someone" at "some point".

The main issues as I see them are the following:
- We do need safe defaults that can be lowered or increased, depending on usage and memory constraints. We basically want configuration options that enable a feature safely without having to worry about details. This might increase the binary size and memory usage, but it should be possible to tune the values down if needed.

Just picking a random example:

CONFIG_NET_PKT_RX_COUNT=16
CONFIG_NET_PKT_TX_COUNT=16
CONFIG_NET_BUF_RX_COUNT=16
CONFIG_NET_BUF_TX_COUNT=16

CONFIG_NET_IF_UNICAST_IPV6_ADDR_COUNT=2
CONFIG_NET_IF_MCAST_IPV6_ADDR_COUNT=4

CONFIG_NET_MAX_CONTEXTS=16

Why do I need to deal with all of that as a "random novice user"? In most cases people will just copy and paste those without knowing what is going on. We should either set those to safe default values or adjust them automatically based on configured features using Kconfig.

- We do need to generalize the test settings, maintain them in one place, and keep the configurations of samples generic and hw oriented as much as possible. In a test environment we could then merge the test settings and do the qemu magic without clobbering sample configurations with local test settings like


CONFIG_NET_APP_SETTINGS=y
CONFIG_NET_APP_MY_IPV6_ADDR="2001:db8::1"
CONFIG_NET_APP_PEER_IPV6_ADDR="2001:db8::2"
CONFIG_NET_APP_MY_IPV4_ADDR="192.0.2.1"
CONFIG_NET_APP_PEER_IPV4_ADDR="192.0.2.2"


- It would help if all network samples behaved the same, i.e. targeted similar use cases. Depending on the features being demonstrated that might not always be possible, but take protocols, for example, and anything that would require a full network setup: we could define a setup that is easily done by the novice user and build on top of that. The most basic setup I can think of (this is just an example, more options should be possible):

- DHCPv4
- IPv4

Here I can connect my board over Ethernet to a local network with DHCP and send and receive data immediately. Of course this is very simplified, but if we can generalize similar functionality with similar and unified configuration options, then we will make it easy.
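
As a rough sketch, such a baseline could be as small as the fragment below. The option names are not taken from this thread; they are the ones I would expect for IPv4 and DHCPv4 and should be verified against the tree's Kconfig.

```conf
# Assumed option names; verify against the Kconfig in your tree
CONFIG_NETWORKING=y
CONFIG_NET_IPV4=y
CONFIG_NET_DHCPV4=y
```

Everything else (buffer counts, context limits and so on) would ideally come from safe defaults rather than from the sample's own configuration.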

- OK, it is not all bad news: Jukka and the network stack developers introduced this nice netapp interface, which makes it very easy to get started and eliminates tons of code that used to be in the application by moving it into the IP stack. This was a huge improvement already; we need to continue optimizing in this direction.

- Finally, logging. It can be really useful and I agree we need to enable some type of logging by default (which can be turned off easily). I always get lost in the logging configuration of network applications: you would assume that CONFIG_SYS_LOG=n would turn off everything, but it does not, and there are too many variants and no easy way to understand what enables/disables what. So we do need to revisit this and think about how we can easily enable/disable logging and keep it on by default without impacting binary size and performance.


Anas

-----Original Message-----
From: zephyr-devel-bounces@lists.zephyrproject.org [mailto:zephyr-devel-bounces@lists.zephyrproject.org] On Behalf Of Paul Sokolovsky
Sent: Wednesday, September 13, 2017 4:02 PM
To: devel@lists.zephyrproject.org
Subject: [Zephyr-devel] RFC: Stopping Zephyr networking from being painful by enabling error logging by default

_______________________________________________
Zephyr-devel mailing list
Zephyr-devel@lists.zephyrproject.org
https://lists.zephyrproject.org/mailman/listinfo/zephyr-devel


Paul Sokolovsky
 

Hello Anas,

On Thu, 14 Sep 2017 13:22:11 +0000
"Nashif, Anas" <anas.nashif@intel.com> wrote:

> Hi,
> I fully agree with your first paragraph, but then I got completely
> lost as to how enabling logging is going to solve the issue you have
> presented.

We can approach it from another side. Let me present some cases I faced
(some of them based on other users' feedback), and you try to guess
what's wrong in each of them. 3 cases are presented, ranging from
warm-up (with answer presented) to "hard cases nobody knows how to
solve".

1. You build an app for a new board and it crashes on startup. The app
works on another board. What's wrong?
https://jira.zephyrproject.org/browse/ZEP-2105 tells what was wrong for
this particular case reported by a user.

Anas' answer: (please time how long it took to arrive at the
"correct" answer)
Paul's answer: The system should log the error condition
2. You write an app which sends data outside. It doesn't work - just
hangs in there. What's wrong? Umm, yeah. More data: another similar app
works well, so you can exclude firewall and DNS setup. Still not
enough data? You fire up Wireshark and see that Zephyr sends out ARP
requests for 0.0.0.0. So, what's wrong? This is actually quite
solvable by speculation - except when it's the end of the day, and you're
just trying to get the flaming feature tested, not doing guesswork.

Anas' answer: ???
Paul's answer: The system should log the error (or warning) condition
precluding it from working

3. If the above went ok, see what's wrong in
https://jira.zephyrproject.org/browse/ZEP-2593 - so far, nobody was
able to present a truly useful trail to debug it.

Anas' answer: ???
Paul's answer: The system should log the error (or warning) condition
precluding it from working properly

> Frankly, the first thing I do when trying out an
> application is turn off all the logging, because it gets in the way,
> it is too verbose,

This only shows how twisted and confusing the net logging
situation is: I complain there's too little (error) logging, and you don't
believe me it's true. Likewise, you complain about too much
(useless) logging, and I don't believe you either (really, I don't know
of in-tree samples which have debug-level logging enabled by default).

So I'd rather believe you then, and that's exactly the nature of my
proposal: let's enable useful logging by default, and make sure that
too-much-details logging, not interesting to a casual user, is
disabled by default.

> and it also disturbs the net shell, which is enabled
> in many applications.

That also shows how twisted a situation we have: a proposal to enable
a little error logging faces opposition that "it'll bloat binaries", and
at the same time "net shell is enabled in many applications", which is
of course much more bloated than just some error logging. It's also
barely usable due to lack of command history support and broken paste
from host (so you need to type in long commands manually again and
again).

> The main problem I see in many of the networking samples is the fact
> The main issues as I see them are the following:
> - We do need safe defaults that can be lowered or increased,
> - We do need to generalize the test setting and just maintain them in
> - It would help if we had all network samples behave the same, i.e.

There are many issues with the network stack/apps, and we can't solve
them all at once. In this thread, I present a proposal for useful error
logging in the net stack.

> - OK, it is not all bad news: Jukka and the network stack developers
> introduced this nice netapp interface, which makes it very easy to get
> started and eliminates tons of code that used to be in the
> application by moving it into the IP stack. This was a huge
> improvement already; we need to continue optimizing in this
> direction.

Right, there have been a bunch of improvements in the stack lately,
that's why I submit this RFC - I think we should be ready now to tackle
the "error logging" issue.

> - Finally, logging. It can be really useful and I agree we need to
> enable some type of logging by default (which can be turned off
> easily). I always get lost in the logging configuration of network
> applications: you would assume that CONFIG_SYS_LOG=n would turn off
> everything, but it does not, and there are too many variants and no
> easy way to understand what enables/disables what. So we do need to
> revisit this and think about how we can easily enable/disable logging
> and keep it on by default without impacting binary size and
> performance.

Cool, thanks. My RFC is exactly about how to make the first few steps in
that direction.

> Anas
[]

--
Best Regards,
Paul



Nashif, Anas
 

Should I assume you started replying before you read my final comment about logging? :-)

Anas

-----Original Message-----
From: Paul Sokolovsky [mailto:paul.sokolovsky@linaro.org]
Sent: Thursday, September 14, 2017 12:43 PM
To: Nashif, Anas <anas.nashif@intel.com>
Cc: devel@lists.zephyrproject.org
Subject: Re: [Zephyr-devel] RFC: Stopping Zephyr networking from being painful by enabling error logging by default



Paul Sokolovsky
 

On Thu, 14 Sep 2017 13:22:43 +0200
Tomasz Bursztyka <tomasz.bursztyka@linux.intel.com> wrote:

>>> you discover CONFIG_NET_LOG_GLOBAL

[]

> I have been using it when porting new devices; it's an easy option to
> just see how the whole thing behaves at boot/init time.
>
> But indeed, it's not a good option when you need to debug an app.
>
> It's off by default anyway. We can improve the docs there, saying that
> this should not be used unless the person knows what to do with it,
> i.e. getting all errors/warnings, but not at debug level!

Right, I had the same idea, mentioned in the other mails.

[]

>> As the name implies, CONFIG_NET_LOG_GLOBAL is only for
>> networking. It might also miss some new net debug options.
> Most probably yes.

I posted https://github.com/zephyrproject-rtos/zephyr/pull/1507 showing
how this can be tackled in a scalable way (without introducing unneeded
Kconfig interdependencies, etc.).

> Besides all that, debugging the net stack is complex mostly because
> it's a complex stack.

Yes, and we can make it simpler.

> We could probably add documentation about how to use the logging
> options effectively; that's, I guess, the best thing to do right now.

Extending on what was said in the previous mails, people first want to
see stuff working (or telling them what's wrong), and then maybe
they'll dig into documentation. Seeing hangs and crashes in this age
will likely prompt a "let's move on to something else" approach rather
than give encouragement to read long docs.

> Tomasz

--
Best Regards,
Paul



Paul Sokolovsky
 

On Thu, 14 Sep 2017 17:03:12 +0000
"Nashif, Anas" <anas.nashif@intel.com> wrote:

> Should I assume you started replying before you read my final comment
> about logging? :-)

Well, just as you tried to present the issues in the net stack from
different sides, I'm likewise trying to present an extended perspective
on the issue ;-). If we agree that logging can use tweaking, then good:
I already have a few simple PRs in the pipeline, which need reviewing,
then hopefully extending, and merging ;-).

> Anas

-----Original Message-----
From: Paul Sokolovsky [mailto:paul.sokolovsky@linaro.org]
Sent: Thursday, September 14, 2017 12:43 PM
To: Nashif, Anas <anas.nashif@intel.com>
Cc: devel@lists.zephyrproject.org
Subject: Re: [Zephyr-devel] RFC: Stopping Zephyr networking from
being painful by enabling error logging by default

Hello Anas,

On Thu, 14 Sep 2017 13:22:11 +0000
"Nashif, Anas" <anas.nashif@intel.com> wrote:

Hi,
I fully agree with your first paragraph but then I got completely
lost how enabling logging is going to solve the issue you have
presented.
We can approach it from another side. Let me present some cases I
faced (some of them based on other users' feedback), and you try to
guess what's wrong in each of them. Three cases are presented, ranging
from a warm-up (with the answer given) to "hard cases nobody knows how
to solve".

1. You build an app for a new board and it crashes on startup. The
app works on another board. What's wrong?
https://jira.zephyrproject.org/browse/ZEP-2105 tells what was wrong
for this particular case reported by a user.

Anas' answer: (please time how much time it took to arrive at the
"correct" answer)
Paul's answer: The system should log the error condition

2. You write an app which sends data outside. It doesn't work - just
hangs in there. What's wrong? Umm, yeah. More data: another similar
app works well, so you can exclude firewall and DNS setup. Still not
enough data? You fire up Wireshark and see that Zephyr sends out ARP
requests for 0.0.0.0. So, what's wrong? This is actually quite
solvable by speculation, except when it's the end of the day and you
are just trying to get the flaming feature tested, not doing guesswork.

Anas' answer: ???
Paul's answer: The system should log the error (or warning) condition
precluding it from working

3. If the above went ok, see what's wrong in
https://jira.zephyrproject.org/browse/ZEP-2593 - so far, nobody was
able to present a truly useful trail to debug it.

Anas' answer: ???
Paul's answer: The system should log the error (or warning) condition
precluding it from working properly
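
To make "the system should log the condition" concrete for case 2, here
is a minimal sketch in plain C of a pre-send check that warns when an
ARP target is the unspecified address 0.0.0.0. The names
(ipv4_is_unspecified, check_arp_target) are hypothetical and are not
Zephyr's API; in the stack itself the message would presumably go
through NET_WARN() rather than fprintf().

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not Zephyr API): returns 1 when the IPv4 address
 * is the unspecified address 0.0.0.0, 0 otherwise. */
static int ipv4_is_unspecified(const uint8_t addr[4])
{
    static const uint8_t zero[4] = {0, 0, 0, 0};

    return memcmp(addr, zero, sizeof(zero)) == 0;
}

/* Hypothetical pre-send check: warn loudly instead of failing silently.
 * Returns 0 when the ARP request may be sent, -1 when it makes no sense. */
static int check_arp_target(const uint8_t target[4])
{
    if (ipv4_is_unspecified(target)) {
        fprintf(stderr,
                "WARN: ARP target is 0.0.0.0; is a peer or gateway "
                "address configured?\n");
        return -1;
    }
    return 0;
}
```

A caller would run check_arp_target() before queuing the request and
bail out on -1, so the misconfiguration shows up on the console instead
of as a silent hang.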

Frankly, the first thing I do when trying out an application is
turn off all the logging, because it gets in the way, it is too
verbose
This only shows how twisted and confusing the net logging situation
is: I complain there's too little (error) logging, and you don't
believe me that it's true. Likewise, you complain about too much
(useless) logging, and I don't believe you either (really, I don't
know of any in-tree samples which have debug-level logging enabled by
default).

So I'd rather believe you then, and that's exactly the nature of my
proposal: let's enable useful logging by default, and make sure that
too-much-details logging, not interesting to a casual user, is
disabled by default.

and it also disturbs the net shell which is enabled in many
applications.
That also shows how twisted a situation we have: a proposal to
enable a little error logging faces the objection that "it'll bloat
binaries", and at the same time "net shell is enabled in many
applications", which is of course much more bloated than just some
error logging. It's also barely usable due to the lack of command
history support and broken paste from the host (so you need to type in
long commands manually again and again).

The main problem I see in many of the networking samples is the
fact The main issues as I see them are the following:
- We do need safe defaults that can be lowered or increased,
- We do need to generalize the test setting and just maintain them
in
- It would help if we had all network samples behave the same,
i.e.
There're many issues with network stack/apps, and we can't solve them
all at once. In this thread, I present a proposal for useful error
logging in the net stack.

- Ok, it is not all bad news, Jukka and the network stack
developers introduced this nice netapp interface which makes it
very easy to get started and does eliminate tons of code that used
to be in the application and moved it to the ip stack, so this was
a huge improvement already, we need to continue optimizing into
this direction.
Right, there has been a bunch of improvements in the stack lately,
that's why I submitted this RFC - I think we should be ready now to
tackle the "error logging" issue.

- Finally, logging. It can be really useful and I agree we need to
enable some type of logging by default (which can be turned off
easily). I always get lost in the configuration of logging of
network applications, you would assume that CONFIG_SYS_LOG=n would
turn off everything, but it does not and there are too many
variants and no easy way to understand what enables/disables what.
So we do need to revisit this and think how we can easily
enable/disable logging and keep it on by default without impacting
binary size and performance.
Cool, thanks. My RFC is exactly about how to make the first few steps
in that direction.


Anas
[]

--
Best Regards,
Paul



--
Best Regards,
Paul



Piotr Mienkowski
 

Hi,

On 14.09.2017 08:33, Jukka Rissanen wrote:

Debugging the configuration and debug logging itself is quite painful,
after a year on Zephyr, I still didn't master it, what to say about
newcomers?

So, I'd like to propose to make following changes:

1. Enable CONFIG_NET_LOG=y, CONFIG_NET_LOG_GLOBAL=y,
CONFIG_SYS_LOG_NET_LEVEL=2 (WARN) by default.
We could set the log level to 2 (warn) as you suggest, there were not
many warnings in the net stack anyway.

I disagree with enabling logging by default, it bloats the binary and
also increases ram / stack usage. Normally you do not need to have
debugging enabled anyway, and if you need it, then it is easy to set
CONFIG_NET_LOG=y, enable individual component logging or global
logging, and then increase the log level.
One more vote to set the default log level to 2 (warning) but not to
enable logging by default for the reasons Jukka has mentioned. It
certainly makes sense to provide sane defaults (level 2 is a good idea,
maybe even level 3 if we clean up _INF messages, seeing assigned MAC
address, IP number in the log wouldn't be all that bad) but we should
avoid making choices on behalf of the users. I prefer to enable options
that I need when I need them rather than disable all those I don't only
because someone believed they are good for me. Zephyr documentation
specifically mentions that it's targeting small memory footprint
devices. Few things eat up memory quite so reliably as a couple of printfs.
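
For a sense of what a default of level 2 means in practice, a minimal
sketch of compile-time level filtering in plain C follows. Level 2 =
WARN comes from the thread; the other level numbers and all DEMO_*
names are assumptions made up for illustration, not Zephyr's SYS_LOG
macros. Because the comparison is against a compile-time constant, an
optimizing compiler can usually discard the disabled calls together
with their format strings, so only ERR and WRN messages cost anything.

```c
#include <stdio.h>

/* Level numbering: 2 = WARN as in the thread; the rest is assumed. */
#define DEMO_LEVEL_ERR 1
#define DEMO_LEVEL_WRN 2
#define DEMO_LEVEL_INF 3
#define DEMO_LEVEL_DBG 4

/* Hypothetical build-time knob, defaulting to WARN. */
#ifndef DEMO_LOG_LEVEL
#define DEMO_LOG_LEVEL DEMO_LEVEL_WRN
#endif

static int demo_emitted; /* counts messages that were actually printed */

/* The first variadic argument must be a string literal (the format). */
#define DEMO_LOG(level, tag, ...)                 \
    do {                                          \
        if ((level) <= DEMO_LOG_LEVEL) {          \
            demo_emitted++;                       \
            printf("[" tag "] " __VA_ARGS__);     \
            printf("\n");                         \
        }                                         \
    } while (0)

#define DEMO_ERR(...) DEMO_LOG(DEMO_LEVEL_ERR, "ERR", __VA_ARGS__)
#define DEMO_WRN(...) DEMO_LOG(DEMO_LEVEL_WRN, "WRN", __VA_ARGS__)
#define DEMO_INF(...) DEMO_LOG(DEMO_LEVEL_INF, "INF", __VA_ARGS__)
#define DEMO_DBG(...) DEMO_LOG(DEMO_LEVEL_DBG, "DBG", __VA_ARGS__)

/* Emits one message per level and returns how many got through. */
static int demo_run(void)
{
    demo_emitted = 0;
    DEMO_ERR("cannot allocate buffer");
    DEMO_WRN("interface has no address");
    DEMO_INF("queue depth %d", 3);
    DEMO_DBG("entering %s", "demo_run");
    return demo_emitted;
}
```

With the default level, demo_run() lets only the ERR and WRN lines
through; building with -DDEMO_LOG_LEVEL=3 would add the INF line.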

That said, we could enable logging for all sample applications but not
for the main project. Maybe that was the intention all along and I
misunderstood it.

Cheers,
Piotr


Paul Sokolovsky
 

Hello Piotr,

On Thu, 14 Sep 2017 22:03:19 +0200
Piotr Mienkowski <piotr.mienkowski@gmail.com> wrote:

[]

I disagree with enabling logging by default, it bloats the binary
and also increases ram / stack usage. Normally you do not need to
have debugging enabled anyway, and if you need it, then it is easy
to set CONFIG_NET_LOG=y, enable individual component logging or
global logging, and then increase the log level.
One more vote to set the default log level to 2 (warning) but not to
enable logging by default for the reasons Jukka has mentioned. It
certainly makes sense to provide sane defaults (level 2 is a good
idea, maybe even level 3 if we clean up _INF messages, seeing
assigned MAC address, IP number in the log wouldn't be all that bad)
but we should avoid making choices on behalf of the users.
I would put it differently: we should make choices for the benefit of
the users.

I prefer
to enable options that I need when I need them rather than disable
all those I don't only because someone believed they are good for me.
I don't think discussion in this direction would be productive. Because
I too don't like choices made for me, and don't appreciate someone
thinking that code size is more important than error reporting, and
making me spend hours again and again debugging a whole range of
issues, from trivial to complex.

Instead, we should think what would be in the interest of users, how to
let them engage with Zephyr easily, and keep them afterwards.

We should think beyond that: we should think about why, 8 months after
the project landed on GitHub, it has barely over 300 stars (which is
good for a personal spare-time project and zilch for something aiming
to influence the landscape). We should think about why the Zephyr TSC
receives feedback
(https://docs.google.com/presentation/d/1L3t6V9dr2IhUlz6f4Tc_gz1-zmt00O2aspO3IfyEtBs/edit#slide=id.g1f87755cc1_0_39)
from prospective users which says "Zephyr project would get even more
credibility if there would be device manufacturers & hobbyist".

Indeed, why would somebody make commercial projects based on Zephyr
with all the investment required, if nobody invests their spare fun
time into it? Then you might wonder whether there's a correlation
between that and it being agonizingly hard to configure Zephyr and
debug misconfigurations.

Zephyr documentation specifically mentions that it's targeting small
memory footprint devices.
But above that, it targets users, so any premature optimizations of
code size at the expense of user experience are, well, strange.

Few things eat up memory quite so reliably as a couple of printfs.
The network stack has eaten up memory much more reliably than printfs,
so adding a few won't change the picture much, but may improve the user
experience considerably.

That said, we could enable logging for all sample applications but not
for the main project. Maybe that was the intention all along and I
misunderstood it.
No, this issue has been talked about for a while now, so it deserves a
solution, not half-measures. What I really mean is this: if you take a
new Linux distro, it ships with printk's in the kernel enabled, so if
something goes wrong during the installation or afterwards, you see it
right away, rather than receiving kind suggestions to dig into
documentation looking for god-knows-what and to build your kernel
differently just to approach answering the question "what may be
wrong". I mean that, just applied to Zephyr. *Afterwards* someone can
debug their stuff, and optimize code size by disabling logging.


Cheers,
Piotr
--
Best Regards,
Paul



Luiz Augusto von Dentz
 

Hi Paul,

On Fri, Sep 15, 2017 at 10:55 AM, Paul Sokolovsky
<paul.sokolovsky@linaro.org> wrote:
Hello Piotr,

On Thu, 14 Sep 2017 22:03:19 +0200
Piotr Mienkowski <piotr.mienkowski@gmail.com> wrote:

[]

I disagree with enabling logging by default, it bloats the binary
and also increases ram / stack usage. Normally you do not need to
have debugging enabled anyway, and if you need it, then it is easy
to set CONFIG_NET_LOG=y, enable individual component logging or
global logging, and then increase the log level.
One more vote to set the default log level to 2 (warning) but not to
enable logging by default for the reasons Jukka has mentioned. It
certainly makes sense to provide sane defaults (level 2 is a good
idea, maybe even level 3 if we clean up _INF messages, seeing
assigned MAC address, IP number in the log wouldn't be all that bad)
but we should avoid making choices on behalf of the users.
I would put it differently: we should make choices for the benefit of
the users.

I prefer
to enable options that I need when I need them rather than disable
all those I don't only because someone believed they are good for me.
I don't think discussion in this direction would be productive. Because
I too don't like choices made for me, and don't appreciate someone
thinking that code size is more important than error reporting, and
making me spend hours again and again debugging a whole range of
issues, from trivial to complex.

Instead, we should think what would be in the interest of users, how to
let them engage with Zephyr easily, and keep them afterwards.

We should think beyond that: we should think about why, 8 months after
the project landed on GitHub, it has barely over 300 stars (which is
good for a personal spare-time project and zilch for something aiming
to influence the landscape). We should think about why the Zephyr TSC
receives feedback
(https://docs.google.com/presentation/d/1L3t6V9dr2IhUlz6f4Tc_gz1-zmt00O2aspO3IfyEtBs/edit#slide=id.g1f87755cc1_0_39)
from prospective users which says "Zephyr project would get even more
credibility if there would be device manufacturers & hobbyist".
When you use a presentation which says:

Main benefits taking zephyr in use would come from good ble and tcp/ip stacks
...

I understand it can be frustrating to not have proper logs when facing
issues, but things like this happen when we are trying to move as fast
as we can, perhaps too fast? Though I agree that hobbyists would
probably help fill these gaps.

Indeed, why would somebody make commercial projects based on Zephyr
with all the investment required, if nobody invests their spare fun
time into it? Then you might wonder whether there's a correlation
between that and it being agonizingly hard to configure Zephyr and
debug misconfigurations.

Zephyr documentation specifically mentions that it's targeting small
memory footprint devices.
But above that, it targets users, so any premature optimizations of
code size at the expense of user experience are, well, strange.

Few things eat up memory quite so reliably as a couple of printfs.
The network stack has eaten up memory much more reliably than printfs,
so adding a few won't change the picture much, but may improve the user
experience considerably.

That said, we could enable logging for all sample applications but not
for the main project. Maybe that was the intention all along and I
misunderstood it.
No, this issue has been talked about for a while now, so it deserves a
solution, not half-measures. What I really mean is this: if you take a
new Linux distro, it ships with printk's in the kernel enabled, so if
something goes wrong during the installation or afterwards, you see it
right away, rather than receiving kind suggestions to dig into
documentation looking for god-knows-what and to build your kernel
differently just to approach answering the question "what may be
wrong". I mean that, just applied to Zephyr. *Afterwards* someone can
debug their stuff, and optimize code size by disabling logging.
Comparing Zephyr with Linux is not really fair: they target completely
different environments, especially where runtime is concerned, so we
have to keep things in perspective. We could perhaps have debug builds
using Kconfig that select whatever makes sense to help with the initial
development/prototyping phase, things like _assert support, warning
logging, etc.

Note, this can all be achieved without a wall of text complaining
about things that don't work for you. From the responses here,
everyone seems quite positive about the idea of having better logging,
so there is no point in coming up with more rants about it.


Cheers,
Piotr
--
Best Regards,
Paul



--
Luiz Augusto von Dentz


Paul Sokolovsky
 

Hello Luiz,

On Fri, 15 Sep 2017 12:26:10 +0300
Luiz Augusto von Dentz <luiz.dentz@gmail.com> wrote:

[]

We should think beyond that: we should think about why, 8 months after
the project landed on GitHub, it has barely over 300 stars (which is
good for a personal spare-time project and zilch for something aiming
to influence the landscape). We should think about why the Zephyr TSC
receives feedback
(https://docs.google.com/presentation/d/1L3t6V9dr2IhUlz6f4Tc_gz1-zmt00O2aspO3IfyEtBs/edit#slide=id.g1f87755cc1_0_39)
from prospective users which says "Zephyr project would get even
more credibility if there would be device manufacturers &
hobbyist".
When you use a presentation which says:

Main benefits taking zephyr in use would come from good ble and
tcp/ip stacks ...
Well, when someone comes by and says "You're doing great, but we won't
use your stuff and can't even tell where we will be able to" (and
that arguably summarizes that presentation), I'd spend less time on
being flattered by a polite conversation starter, and more time on the
second part of the message ;-).

I understand it can be frustrating to not have proper logs when facing
issues, but things like this happen when we are trying to move as fast
as we can, perhaps too fast?
It's true that Zephyr development moves fast overall, and yet some
things are, conversely, pretty slow: for example, here we are at
version 1.9, discussing whether it makes sense to report error
messages or not.

[]

Note, this can all be achieved without a wall of text complaining
about things that don't work for you. From the responses here,
everyone seems quite positive about the idea of having better logging,
so there is no point in coming up with more rants about it.
Yeah, sorry about that. I mentioned that I "sent probes" on this
before, and quite anticipated the possible response ("let's not change
things, everything works well as it is"). So, I appreciate that
everyone agrees we can change *something* in logging; I'm just trying
to convey that we should change a bunch of stuff consistently to make a
real difference.

Anyway, I'm off to patches on this stuff.

[]

--
Best Regards,
Paul



david zuhn
 

I don't think discussion in this direction would be productive. Because
I too don't like choices made for me, and don't appreciate someone
thinking that code size is more important than error reporting, and
making me spend hours again and again debugging a whole range of
issues, from trivial to complex.

This is an incredibly astute comment. 

I am coming to the RTOS world from years of Unix & Linux development, with a recent foray into the Arduino world, which has led me to slightly larger systems which end up in the RTOS space.   I do not consider myself an experienced RTOS developer, but I am an experienced developer who is now trying to look into other systems.   Issues like code size or power consumption are not my primary concern right now.  I *am* very concerned with getting things running, achieving basic functionality.  

As I get things running, then I might become concerned with trying to optimize the configuration. As a hobbyist, I'm buying boards that are way overpowered and overfeatured.   But I'm buying 1 or 2, not 10,000 at a time.   I don't need to try hard to fit into the 32K RAM component instead of the 64K piece because the $1.93 price differential is 20% of my profit margin.   I'm just trying to make something work.   

As such, I have expectations that capabilities that are touted as features of a product should be relatively easy to understand out of the box.   Documentation is critical.  Examples are great.   I don't believe that providing sparse comments in Kconfig files constitutes good documentation.  Having to fully understand the multitude of config options and how they interact in order to get basic functionality (like an IP stack) working seems newcomer-hostile.   Yup, maybe the extra 6K of code size matters to the large-scale production oriented user.   But not out of the box to the hobbyist.  Right now, I've got a part with 512K of flash.     

I do understand that the microcontroller universe is a complex space, and it's not the same as Unix.   But if you have to already be an expert in working in the microcontroller space to work in the microcontroller space, there's a chicken and egg problem.   Right now it seems like one has to be completely knowledgeable about the microcontroller itself, all of Zephyr, *and* the Linux kernel config system in order to work with Zephyr.   That's a tall order.
 
 *Afterwards* someone can debug their stuff, and optimize code size by disabling logging.

Once someone has been able to develop something worthwhile, they will have also picked up on the skills needed to consider the steps needed for optimization.   But if I can't even get my basic functionality working, I'll never even consider using Zephyr.  Something else out there will have been able to meet my hobbyist needs.

I'm seeing now the layers of abstraction that the Arduino developers put into play, keeping me from having to worry about quite a number of things. Some of those things are now issues that I need to address because I've hit the limits of the abstraction. But to step into a project and codebase that is focused on tiniest-device-production and not user entry is problematic. It doesn't have to be one XOR the other. And being able to tweak my system to achieve the tiniest-device capabilities is a good thing. But in my experience, I found it hard to achieve basic capabilities with Zephyr. I'm finding it easier to achieve those capabilities in other freely available RTOS packages.

david zuhn

 

 


Boie, Andrew P
 

My views on this topic:

1) Right now we have very inconsistent error/debug logging. In lots of places in the kernel, when a bad situation is encountered the problem is simply reported with a printk(). In other places we are using SYS_LOG_*. The default experience for the user is that all the SYS_LOG messages are suppressed, but the printks are emitted. This is not an ideal default configuration, in fact words like "horrible" and "baffling" come to mind when considering it.

2) I still think that SYS_LOG should be turned on by default for error and warning situations, just like printk() is on by default.

3) If people are concerned about code size, there should be some kind of global flag which suppresses all debug output, including printk(). In the future, for very RAM constrained devices we could implement a backend to SYS_LOG which uses tricks like storing format strings completely outside the binary, in an external file used to decode raw log data, stuff like that.

4) The way SYS_LOG_* is configured in Kconfig is currently a confusing disaster and I look forward to seeing what Paul comes up with to clean it up. I think the difficulty in using this mechanism is at least partly why large parts of the kernel do not use it and just do printks instead.

5) We may consider making printk() a thin wrapper for a particular level of SYS_LOG().
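
Point 3's idea of keeping format strings out of the binary can be sketched roughly as follows (plain C; every name here is hypothetical and no such SYS_LOG backend is implied to exist): the device records only a message ID plus a raw argument, and a host-side table turns the record back into text.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Shared between device and host: numeric message IDs only. */
enum msg_id { MSG_NO_ADDR = 0, MSG_PKT_DROPPED = 1 };

/* What the device would store or transmit: no strings, just numbers. */
struct raw_log {
    uint16_t id;
    uint32_t arg;
};

/* Device side: building a record needs no format string at all. */
static struct raw_log device_emit(enum msg_id id, uint32_t arg)
{
    struct raw_log rec = { (uint16_t)id, arg };

    return rec;
}

/* Host side: the format strings live here, outside the device image. */
static const char *const host_formats[] = {
    [MSG_NO_ADDR]     = "iface %u: no IP address configured",
    [MSG_PKT_DROPPED] = "dropped packet of %u bytes",
};

/* Decodes one record into text; returns -1 for an unknown ID,
 * otherwise the value returned by snprintf(). */
static int host_decode(const struct raw_log *rec, char *out, size_t out_len)
{
    size_t count = sizeof(host_formats) / sizeof(host_formats[0]);

    if (rec->id >= count) {
        return -1;
    }
    return snprintf(out, out_len, host_formats[rec->id],
                    (unsigned int)rec->arg);
}
```

The trade-off in this sketch is that raw logs are unreadable without the matching table, so device image and decoder have to be kept in sync.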

Andrew


Nashif, Anas
 

Re (1): in this spirit, I would also enable CONFIG_ASSERT by default...
Re (4): SYS_LOG itself is not the issue; it is the layers added on top of it to enable logging in subsystems such as the IP stack that are confusing, to say the least, as has been discussed in this thread. We will have situations where we do want to use printk and family to print out kernel-level exceptions and oopses even if logging was disabled for whatever reason, although SYS_LOG could be optimized to handle such cases as well. We do have logger hooks that can, for example, write to a file system instead of the console in production systems, so depending on printk for all messages (and assuming a console is always connected) is already not ideal.
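
The logger-hook idea mentioned above can be illustrated with a small sketch in plain C: messages go through a replaceable function pointer, so the sink can be a console, a file, or (here) a memory buffer. The names log_set_hook, log_msg and memory_hook are made up for this example and are not the actual SYS_LOG hook API.

```c
#include <stdarg.h>
#include <stdio.h>

/* A sink receives the format string and its arguments. */
typedef void (*log_hook_t)(const char *fmt, va_list ap);

/* Default sink: print to the console. */
static void console_hook(const char *fmt, va_list ap)
{
    vprintf(fmt, ap);
}

/* Alternative sink: keep the last message in memory, standing in for
 * a file system or flash backend on a production device. */
static char mem_sink[128];

static void memory_hook(const char *fmt, va_list ap)
{
    vsnprintf(mem_sink, sizeof(mem_sink), fmt, ap);
}

static log_hook_t current_hook = console_hook;

/* Installs a new sink; passing NULL restores the console sink. */
static void log_set_hook(log_hook_t hook)
{
    current_hook = hook ? hook : console_hook;
}

/* Single entry point used by callers; the sink decides where it goes. */
static void log_msg(const char *fmt, ...)
{
    va_list ap;

    va_start(ap, fmt);
    current_hook(fmt, ap);
    va_end(ap);
}
```

After log_set_hook(memory_hook), a call such as log_msg("link %s, code %d", "down", 7) no longer touches the console; the text ends up in mem_sink instead.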

This goes beyond the original thread topic and is probably worth a bug/enhancement request to be tracked.

Anas
