Re: [net] net samples not working?

최형준 <hj210.choi@...>
 

 

 

--------- Original Message ---------
Sender : 최형준 <hj210.choi@...> S5 (Principal Engineer)/Principal Engineer/IoT Lab (S/W Center)/Samsung Electronics
Date : 2017-02-09 09:54 (GMT+9)
Title : FW: Re: [Zephyr-devel] [net] net samples not working?

Hello Richard,

> the SLIP-TAP driver seems not to be working in qemu for cortex-m targets, too.
> I can not get any network traffic, either between two
> qemu_cortex_m3 targets or to the PC.
>
> Has anyone tried networking in qemu with qemu_cortex_m3 target?

I did, and it works well, even though it printed some error messages about "net_buf_get_reserve()".
I used the "echo_server" sample app.

> I don't think that this is a configuration problem.
> The SLIP-TAP driver was disabled in the cortex_m3 defconfig.
> I enabled it, but no luck.
>
> Would be great to get some help on this!

I'll share the conf file for qemu_cortex_m3 that I used when I tested it.
Please create this file on your PC (e.g.: zephyr/samples/net/echo_server/prj_qemu_cortex_m3.conf).

The "prj_qemu_cortex_m3.conf" file has the configuration below.

--------------------------------------------
CONFIG_NETWORKING=y
CONFIG_NET_IPV6=y
CONFIG_NET_IPV4=y
CONFIG_NET_UDP=y
CONFIG_NET_TCP=y
CONFIG_TEST_RANDOM_GENERATOR=y
CONFIG_NET_BUF_LOG=y
CONFIG_SYS_LOG_NET_BUF_LEVEL=2
CONFIG_NET_LOG=y
CONFIG_SYS_LOG_NET_LEVEL=2
CONFIG_NET_SLIP_TAP=y
CONFIG_SYS_LOG_SHOW_COLOR=y
CONFIG_INIT_STACKS=y
CONFIG_PRINTK=y
CONFIG_NET_STATISTICS=y
CONFIG_NET_NBUF_RX_COUNT=14
CONFIG_NET_NBUF_TX_COUNT=14
CONFIG_NET_NBUF_DATA_COUNT=30
CONFIG_NET_IF_UNICAST_IPV6_ADDR_COUNT=3
CONFIG_NET_IF_MCAST_IPV6_ADDR_COUNT=2
CONFIG_NET_MAX_CONTEXTS=10

CONFIG_NET_SHELL=y

CONFIG_NET_SAMPLES_IP_ADDRESSES=y
CONFIG_NET_SAMPLES_MY_IPV6_ADDR="2001:db8::1"
CONFIG_NET_SAMPLES_PEER_IPV6_ADDR="2001:db8::2"
CONFIG_NET_SAMPLES_MY_IPV4_ADDR="192.0.2.1"
CONFIG_NET_SAMPLES_PEER_IPV4_ADDR="192.0.2.2"

CONFIG_UART_PIPE_ON_DEV_NAME="UART_1"
--------------------------------------------

I think this problem is caused by "CONFIG_UART_PIPE_ON_DEV_NAME".
You can check the connection status of your echo_server app with the "net shell" after running it.
(e.g.: > select net, then > conn)

Here is my test scenario.
a. Run the echo_server app on qemu_cortex_m3. (e.g.: make BOARD=qemu_cortex_m3 V=1 DEBUG=1 run)
b. Send an echo message from the host (Linux). (e.g.: $ echo HELLO | nc -u 192.0.2.1 4242)
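For anyone who prefers a small program over nc for step b, here is a rough sketch of the same UDP round trip using plain POSIX sockets on the host. The function name udp_echo_once and its parameters are made up for this illustration; only the address 192.0.2.1 and port 4242 come from the scenario above.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: send msg to ip:port over UDP and wait up to
 * timeout_s seconds for a reply. Returns the number of reply bytes stored
 * in reply, or -1 on any error (bad address, socket failure, or no answer
 * before the timeout). */
static ssize_t udp_echo_once(const char *ip, uint16_t port, const char *msg,
                             char *reply, size_t reply_len, int timeout_s)
{
    struct sockaddr_in peer;
    struct timeval tv = { .tv_sec = timeout_s, .tv_usec = 0 };
    ssize_t n = -1;
    int fd;

    assert(ip != NULL && msg != NULL && reply != NULL);

    memset(&peer, 0, sizeof(peer));
    peer.sin_family = AF_INET;
    peer.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &peer.sin_addr) != 1) {
        return -1; /* not a valid dotted-quad IPv4 address */
    }

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) {
        return -1;
    }

    /* Give up if the echo does not arrive in time. */
    if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) == 0 &&
        sendto(fd, msg, strlen(msg), 0,
               (struct sockaddr *)&peer, sizeof(peer)) >= 0) {
        n = recvfrom(fd, reply, reply_len, 0, NULL, NULL);
    }

    close(fd);
    return n;
}
```

Calling udp_echo_once("192.0.2.1", 4242, "HELLO\n", buf, sizeof(buf), 2) would roughly mirror the nc command; a return value of -1 means no echo came back within two seconds.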

 

I hope this helps.

Best Regards,
Hyungjun.


Re: [net] net samples not working?

Richard Peters <mail@...>
 

Hello Paul,

> So, please join the party - we definitely need more eyes and hands.

>> the SLIP-TAP driver seems not to be working in qemu for cortex-m
>> targets, too.
>> I can not get any network traffic, either between two
>> qemu_cortex_m3 targets or to the PC.
>>
>> Has anyone tried networking in qemu with qemu_cortex_m3 target?

have submitted a patch for this as my first contribution to an
open-source project :-)


Re: dhcp integration into the platform

Anas Nashif
 

Hi

On Thu, Feb 9, 2017 at 4:53 AM, Marcus Shawcroft <marcus.shawcroft@...> wrote:
> Hi,
>
> This evening I took a look at how we might better integrate dhcpv4
> into the platform.
>
> In the current tree, in order to use dhcp the application is expected
> to start the dhcp client explicitly. There is no synchronization
> between network link up / link down events and the dhcp state machine.
>
> The dhcp state machine will start issuing discover messages
> irrespective of the link status. Once the link comes up, dhcp will
> typically run through request/ack and install an IP address and GW, etc.
> Should the link drop and subsequently come up again, the dhcp state
> machine remains oblivious.
>
> Better integration of dhcp into the stack looks reasonably straightforward.
>
> I propose the following:
>
> - dhcp_start() is modified to initialize the dhcp context, but remain
> in the INIT state.
> - net_if is modified to generate network management link up / down events
> - dhcpv4 is modified to capture link up and link down events from net mgmt
> - dhcpv4 enters discover on link up
> - dhcpv4 performs unset_dhcpv4_on_iface for link down
> - unset_dhcpv4_on_iface needs to call net_if_ipv4_addr_rm (it should
> probably do that anyway)
> - dhcp_start is renamed and hooked in by SYS_INIT ?
> - A new empty dhcp_start is defined and marked deprecated (to be nice
> to apps that might currently call it).
> - The current public include/dhcp.h exposed interface is removed.

What we see here are the first signs of a connection manager :)

We do need DHCP to act as a service. When writing an application that relies on DHCP, it makes sense to have the initialisation and setup of the networking device handled consistently, without having to reimplement the code over and over again in every application.

I can also see the DNS service being set up if we get a DNS server address via DHCP, setting the context of the DNS client to allow resolution queries without having to do that by hand.

+1 for the approach described above.

 

> There are a few details I've not thought through properly yet, notably:
>
> - The net_if layer already has link_up and link_down events. These
> however are defined to fire when a net_if is enabled or disabled, not
> when its link goes up or down, hence either these events need to be
> redefined or we introduce two new events to represent the link up and
> down.
>
> - I'm not sure what the appropriate way of managing the removal of a
> public .h file should be.


Well, this API is not released yet, so I think it should be easy to remove, but we are running out of time for 1.7...


Anas

 

> - There are several ways of wiring up the network management link up /
> down notifiers:
> 1) Drivers do it directly.
> 2) Drivers call the net_if layer which in turn issues the network
> management events.
>
> Before I take this any further I'd appreciate feedback on the sanity of
> the approach and indeed whether such patches would be welcome.
>
> Cheers
> /Marcus
> _______________________________________________
> Zephyr-devel mailing list
> Zephyr-devel@lists.zephyrproject.org
> https://lists.zephyrproject.org/mailman/listinfo/zephyr-devel


dhcp integration into the platform

Marcus Shawcroft <marcus.shawcroft@...>
 

Hi,

This evening I took a look at how we might better integrate dhcpv4
into the platform.

In the current tree, in order to use dhcp the application is expected
to start the dhcp client explicitly. There is no synchronization
between network link up / link down events and the dhcp state machine.

The dhcp state machine will start issuing discover messages
irrespective of the link status. Once the link comes up, dhcp will
typically run through request/ack and install an IP address and GW, etc.
Should the link drop and subsequently come up again, the dhcp state
machine remains oblivious.

Better integration of dhcp into the stack looks reasonably straightforward.

I propose the following:

- dhcp_start() is modified to initialize the dhcp context, but remain
in the INIT state.
- net_if is modified to generate network management link up / down events
- dhcpv4 is modified to capture link up and link down events from net mgmt
- dhcpv4 enters discover on link up
- dhcpv4 performs unset_dhcpv4_on_iface for link down
- unset_dhcpv4_on_iface needs to call net_if_ipv4_addr_rm (it should
probably do that anyway)
- dhcp_start is renamed and hooked in by SYS_INIT ?
- A new empty dhcp_start is defined and marked deprecated (to be nice
to apps that might currently call it).
- The current public include/dhcp.h exposed interface is removed.
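
To make the proposed behaviour concrete, here is a minimal sketch of a link-driven DHCP state machine. It is an illustration only: the type, state, and event names below are hypothetical and are not the actual Zephyr dhcpv4 code, and the addr_installed flag merely stands in for the work that unset_dhcpv4_on_iface / net_if_ipv4_addr_rm would do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names; these are not the Zephyr dhcpv4 types. */
enum dhcp_state { DHCP_INIT, DHCP_DISCOVER, DHCP_REQUEST, DHCP_BOUND };
enum dhcp_event { EV_LINK_UP, EV_LINK_DOWN, EV_OFFER, EV_ACK };

struct dhcp_ctx {
    enum dhcp_state state;
    bool addr_installed; /* stands in for the IPv4 address on the iface */
};

/* Proposed dhcp_start() behaviour: set up the context but stay in INIT. */
static void dhcp_ctx_init(struct dhcp_ctx *ctx)
{
    assert(ctx != NULL);
    ctx->state = DHCP_INIT;
    ctx->addr_installed = false;
}

/* Advance the state machine; events that do not apply are ignored. */
static void dhcp_handle_event(struct dhcp_ctx *ctx, enum dhcp_event ev)
{
    assert(ctx != NULL);

    switch (ev) {
    case EV_LINK_UP:
        /* Discover only starts once the link is actually up. */
        if (ctx->state == DHCP_INIT) {
            ctx->state = DHCP_DISCOVER;
        }
        break;
    case EV_LINK_DOWN:
        /* Drop the lease and the address, then wait for the next link up. */
        ctx->addr_installed = false;
        ctx->state = DHCP_INIT;
        break;
    case EV_OFFER:
        if (ctx->state == DHCP_DISCOVER) {
            ctx->state = DHCP_REQUEST;
        }
        break;
    case EV_ACK:
        if (ctx->state == DHCP_REQUEST) {
            ctx->state = DHCP_BOUND;
            ctx->addr_installed = true;
        }
        break;
    }
}
```

The point of the sketch is that a link down always returns the client to INIT with no address installed, so a later link up restarts discovery instead of leaving the state machine oblivious.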

There are a few details I've not thought through properly yet, notably:

- The net_if layer already has link_up and link_down events. These
however are defined to fire when a net_if is enabled or disabled, not
when its link goes up or down, hence either these events need to be
redefined or we introduce two new events to represent the link up and
down.

- I'm not sure what the appropriate way of managing the removal of a
public .h file should be.

- There are several ways of wiring up the network management link up /
down notifiers:
1) Drivers do it directly.
2) Drivers call the net_if layer which in turn issues the network
management events.

Before I take this any further I'd appreciate feedback on the sanity of
the approach and indeed whether such patches would be welcome.

Cheers
/Marcus


Re: Inconsistent and ever-changing device naming in Zephyr

Paul Sokolovsky
 

Hello Daniel,

On Mon, 6 Feb 2017 16:59:26 +0000
Daniel Thompson <daniel.thompson@linaro.org> wrote:

[]

>> a) Establish (well, reconfirm/clarify) and uphold consistent naming
>> conventions across ports, architectures, and boards.
>>
>> b) Send a strong message to application developers that they should
>> not rely on any patterns of Zephyr device naming, and even on the
>> naming at all, and instead should treat it as a configuration
>> parameter, which ultimately should be set by a user (and thus apps
>> should make such configuration explicit and easy for users).
>
> Or c) inherit the device names from device tree?
>
> I don't actually remember that much about the proposed DT tooling but
> it should certainly be included in this discussion.
>
> It might be made slightly more complex by the presence of aliased
> names in DT (often used to bridge SoC datasheet namings into board
> based names). I guess Zephyr might prefer to use the most-derived
> alias rather than the SoC datasheet name?

I'm not familiar enough with devicetree details to imagine how exactly
that would be done, but Andy's response confirms it's possible to do
something along those lines. But as far as I can see, that would still
depend on establishing common conventions for naming. Perhaps doing it
via DT would make it easier to achieve consistency, and to verify and
maintain it.

But the current question is whether consistency is desirable at all,
i.e. whether maintainers of individual SoCs would each agree to make
some changes to their code/config, and Zephyr maintainers would agree to
uphold it afterwards. Given that DT still has some way to go before it
is widely used in Zephyr, discussion or at least consideration of this
"consistent naming" idea might start in parallel.

> Daniel.

Thanks,
Paul

Linaro.org | Open source software for ARM SoCs
Follow Linaro: http://www.facebook.com/pages/Linaro
http://twitter.com/#!/linaroorg - http://www.linaro.org/linaro-blog


Re: [net] net samples not working?

Paul Sokolovsky
 

Hello Richard,

Let me start by saying that I agree that Zephyr networking does have
issues. They say that any bug is shallow given enough eyes, and Zephyr
networking clearly lacks enough eyes. So, I was in about the same
situation as you 2 weeks ago: after working on other things and
holidays, I was back to networking, and while trying to do something I
faced a 1st, 2nd, 3rd, 4th, etc. issue in a row. Well, I took care to
report them, fix those I could, research the others, ping other folks
who work on networking, etc. I'm happy to report that after these 2
weeks, I see visible improvements, thanks to both the core team's and
contributors' work.

So, please join the party - we definitely need more eyes and hands.

Please see below.

On Wed, 8 Feb 2017 13:39:06 +0100
Richard Peters <mail@richardpeters.de> wrote:

> Hi,
>
> the SLIP-TAP driver seems not to be working in qemu for cortex-m
> targets, too.
> I can not get any network traffic, either between two
> qemu_cortex_m3 targets or to the PC.
>
> Has anyone tried networking in qemu with qemu_cortex_m3 target?

I did. It didn't work. Looking into that has been on my to-do list since
July last year. But I find it a relatively complex (well, wide) task, and
there is higher-priority stuff anyway. So, to make qemu_cortex_m3 work,
you'd need to understand: a) basic facts about such networking, i.e. that
there should be 2 UARTs, one for the console, another for networking; b)
exact details of how QEMU does UART emulation on the "output" side (which
connects to a POSIX object on the host); c) exact details of the MCU/board
that qemu_cortex_m3 emulates, its UART setup, etc.; d) other magic
ingredients.

If you know these points well, I encourage you to look into it. If not,
you can use qemu_x86 for now, and there are many more lower-hanging, and
oftentimes more important, tasks.

> I don't think that this is a configuration problem.
> The SLIP-TAP driver was disabled in the cortex_m3 defconfig.
> I enabled it, but no luck.
>
> Would be great to get some help on this!

Re: your other questions from the initial mail: I personally haven't
gone beyond testing the echo_server application so far. What's the point
of trying MQTT/ZOAP/OCF/whatever if basic TCP and pings don't work? Well,
both are working now, but I'd still recommend starting by getting
echo_server to work well. It should work well now for UDP/TCP * IPv4/IPv6
on QEMU. Just in case, a walkthrough is here:
https://wiki.zephyrproject.org/view/Networking-with-Qemu .

By all means, file the issues you see, but feel free to have a look
at what's already filed:

https://jira.zephyrproject.org/browse/ZEP-1673?jql=project%20%3D%20ZEP%20AND%20component%20%3D%20%22Networking%20%2F%20IP%20Stack%22%20AND%20status%20!%3D%20Closed

And somebody (well, many people!) needs to triage, investigate, then
test solutions for already filed issues too ;-).

> Regards,
> Richard

--
Best Regards,
Paul


Re: Inconsistent and ever-changing device naming in Zephyr

Andy Gross
 

On 6 February 2017 at 10:59, Daniel Thompson <daniel.thompson@linaro.org> wrote:
On 06/02/17 13:34, Paul Sokolovsky wrote:

>> Hello,
>>
>> Writing on this was in my TODO at least since November, but I expected
>> this to be a controversial topic, so kind of skipped it. Controversial,
>> as in: different parties find quite different, if not opposite,
>> solutions to the problem to be "obvious". Well, the breakage continues,
>> so this should be raised.
>>
>> (Recent) facts:
>>
>> 1. Update to Kinetis (BOARD=frdm_k64f) port broke GPIO support in a
>> (complex) Zephyr-based application, Zephyr.js:
>> https://github.com/01org/zephyr.js/issues/665
>>
>> 2. A day or two ago, Arduino 101 I2C support in Zephyr.js was updated,
>> apparently because a Zephyr upgrade broke it. Today it became known that
>> the fix for Arduino 101 broke FRDM-K64F.
>> https://github.com/01org/zephyr.js/issues/676
>>
>> Investigating the issues, their causes are:
>>
>> 1. frdm_k64f GPIO port naming changed from "GPIO_0" to (!!) "gpio_porta".
>>
>> 2. Name of (one of) I2C device for arduino_101 changed from "I2C_0" to
>> "I2C_SS_0".
>>
>> This is not the first time device names have changed in Zephyr (see the
>> reference above to November 2016), but it's the first time (at least in
>> the last half-year) that the naming has changed in relatively well
>> developed ports, which leads to a cascade of breakage. And in one case,
>> it can be seen how inconsistent the naming (and changes to it) can be.
>>
>> The question is what to do about this. Possible high-level approaches
>> can be:
>>
>> a) Establish (well, reconfirm/clarify) and uphold consistent naming
>> conventions across ports, architectures, and boards.
>>
>> b) Send a strong message to application developers that they should
>> not rely on any patterns of Zephyr device naming, and even on the naming
>> at all, and instead should treat it as a configuration parameter, which
>> ultimately should be set by a user (and thus apps should make such
>> configuration explicit and easy for users).
>
> Or c) inherit the device names from device tree?
>
> I don't actually remember that much about the proposed DT tooling but it
> should certainly be included in this discussion.
>
> It might be made slightly more complex by the presence of aliased names in
> DT (often used to bridge SoC datasheet namings into board based names). I
> guess Zephyr might prefer to use the most-derived alias rather than the SoC
> datasheet name?

The alias {} node can indeed be used for this. But unless we define a
specific format, parsing this might be interesting. However, we have
some solutions for that if we want to go that route. A board-specific
yaml file could specify the alias format. It needs to be something
that can co-exist with existing DT files that might be imported from
Linux.

Maybe something along the lines of:
zephyr,uart-0 = &node@xxxxxxxx;
zephyr,uart-1 = &node@xxxxxxxx;
and so on and so forth. These can ghost the aliases already present
if they have something like uart0 = &node@xxxxxxxx; The zephyr,XXXXX
names need to be standardized for each device type. We strip the zephyr
part and the generated string name becomes UART-0 or UART0 or whatever
standard we want.

I think this solves both the unified naming and the assigning of the
names to specific nodes.
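
The "strip the zephyr part" step could look roughly like the sketch below. This is only an illustration of the string transformation; the function name alias_to_dev_name and the keep_dash switch are hypothetical, and no existing DT tooling is implied.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: turn a "zephyr,<type>-<n>" alias into an upper-case
 * device name such as "UART-0" (keep_dash != 0) or "UART0" (keep_dash == 0).
 * Returns 0 on success, -1 if the prefix is missing or out is too small. */
static int alias_to_dev_name(const char *alias, int keep_dash,
                             char *out, size_t out_len)
{
    static const char prefix[] = "zephyr,";
    const size_t plen = sizeof(prefix) - 1;
    size_t n = 0;

    assert(alias != NULL && out != NULL);

    if (out_len == 0) {
        return -1;
    }
    if (strncmp(alias, prefix, plen) != 0 || alias[plen] == '\0') {
        return -1; /* not a "zephyr,..." alias */
    }

    for (const char *p = alias + plen; *p != '\0'; p++) {
        if (*p == '-' && !keep_dash) {
            continue; /* "uart-0" becomes "UART0" */
        }
        if (n + 1 >= out_len) {
            return -1; /* no room for this character plus the NUL */
        }
        out[n++] = (char)toupper((unsigned char)*p);
    }
    out[n] = '\0';
    return 0;
}
```

With this, "zephyr,uart-0" yields "UART-0" or "UART0" depending on which convention is chosen, while a plain "uart0" alias is rejected and left to the existing handling.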


Daily Gerrit Digest

donotreply@...
 

NEW within last 24 hours:
- https://gerrit.zephyrproject.org/r/10993 : tests/common: Add tests for sys_dlist_*
- https://gerrit.zephyrproject.org/r/11000 : drivers: Convert FOR_EACH macro instances to use CONTAINER
- https://gerrit.zephyrproject.org/r/10999 : Bluetooth: Convert FOR_EACH macro instances to use CONTAINER
- https://gerrit.zephyrproject.org/r/10998 : net: Convert FOR_EACH macro instances to use CONTAINER
- https://gerrit.zephyrproject.org/r/10997 : kernel: Use SYS_DLIST_FOR_EACH_CONTAINER
- https://gerrit.zephyrproject.org/r/10992 : dlist: Introduce CONTAINER macros
- https://gerrit.zephyrproject.org/r/10996 : Bluetooth: AVDTP: Moving structures to headerfile
- https://gerrit.zephyrproject.org/r/10952 : drivers/net/ieee802154: nRF5 802.15.4 radio driver
- https://gerrit.zephyrproject.org/r/10953 : samples/net: ieee802154: Add configuration for nrf5
- https://gerrit.zephyrproject.org/r/10956 : build: Separate out prebuilt kernel logic from toplevel
- https://gerrit.zephyrproject.org/r/10954 : samples/net/ieee802154: Update example with nrf5 802.15.4
- https://gerrit.zephyrproject.org/r/10951 : ext: Integrate Nordic's 802.15.4 radio driver into Zephyr
- https://gerrit.zephyrproject.org/r/10950 : ext: Import Nordic 802.15.4 radio driver
- https://gerrit.zephyrproject.org/r/10991 : eth/mcux: Add temporary workaround to unbreak IPv6 ND features.
- https://gerrit.zephyrproject.org/r/10955 : Xtensa port: Allow user to stop simulation using ctrl+c outside sanitycheck.

UPDATED within last 24 hours:
- https://gerrit.zephyrproject.org/r/10881 : sensor/mma865x: Add driver for MMA865x 3 Axis Accelerometer Family
- https://gerrit.zephyrproject.org/r/10927 : tests/slist: Exercise CONTAINER macros
- https://gerrit.zephyrproject.org/r/10926 : slist: Introduce CONTAINER macros
- https://gerrit.zephyrproject.org/r/10684 : driver: mcr20a: cleanup and use semaphore for PHY access
- https://gerrit.zephyrproject.org/r/10685 : drivers: mcr20a: refactor transceiver interrupt processing
- https://gerrit.zephyrproject.org/r/10944 : drivers: Remove unnecessary CONFIG_SYS_POWER_DEEP_SLEEP
- https://gerrit.zephyrproject.org/r/10948 : samples/mbedtls_dtls_client: Fix wild write in entropy_source
- https://gerrit.zephyrproject.org/r/10946 : samples/mbedtls_dtls_client: Use k_uptime_get_32()
- https://gerrit.zephyrproject.org/r/10880 : bbc_microbit: Enable MAG3110
- https://gerrit.zephyrproject.org/r/10947 : samples/mbedtls_dtls_server: Use k_uptime_get_32()
- https://gerrit.zephyrproject.org/r/10646 : hexiwear_k64: Add RST board documentation
- https://gerrit.zephyrproject.org/r/6481 : net: Moved net_if_ipv6_addr_lookup_by_iface() to net_if.c
- https://gerrit.zephyrproject.org/r/8746 : net: uip: Fix clearing of router solicitation message
- https://gerrit.zephyrproject.org/r/10906 : section_tags.h: cleanup
- https://gerrit.zephyrproject.org/r/10905 : toolchain: gcc.h: add indirection to _GENERIC_SECTION() macro
- https://gerrit.zephyrproject.org/r/10903 : sw_isr_table.h: clean up definition
- https://gerrit.zephyrproject.org/r/10879 : sensor/mag3110: Add mag3110 three axis magnetometer driver.
- https://gerrit.zephyrproject.org/r/6716 : Bluetooth: SDP: Server: Refactor data element structure header
- https://gerrit.zephyrproject.org/r/10920 : kw41z: add base DTS support
- https://gerrit.zephyrproject.org/r/10801 : board: add nucleo_l476rg documentation
- https://gerrit.zephyrproject.org/r/10902 : eth/mcux: Add basic PHY support.
- https://gerrit.zephyrproject.org/r/10860 : v2m_beetle: uart: Add DTS support to UART driver
- https://gerrit.zephyrproject.org/r/10859 : v2m_beetle: Add base DTS support
- https://gerrit.zephyrproject.org/r/10896 : xtensa: cleanup fatal error handling
- https://gerrit.zephyrproject.org/r/10885 : sysgen: Fixed macro evaluation issue in kernel_main.c with compiling with xcc.
- https://gerrit.zephyrproject.org/r/10848 : tinycrypt: Fixed compilation error on Xtensa caused by bad definition of bool.
- https://gerrit.zephyrproject.org/r/10882 : bbc_microbit: Enable MMA8653
- https://gerrit.zephyrproject.org/r/10720 : flash/stm32: driver for STM32F4x series
- https://gerrit.zephyrproject.org/r/10854 : cc3200: dts: Add base DTS support for TI CC3200
- https://gerrit.zephyrproject.org/r/10912 : tests: kernel: import obj_tracing test to unified kernel
- https://gerrit.zephyrproject.org/r/10628 : RFC: DTS: Kinetis: Add FRDM_K64F support
- https://gerrit.zephyrproject.org/r/10853 : drivers: Add Atmel SAM RTC driver
- https://gerrit.zephyrproject.org/r/10855 : cc3200: Enable device tree usage for CC3200
- https://gerrit.zephyrproject.org/r/10852 : samples: tickless: Sample app for tickless kernel
- https://gerrit.zephyrproject.org/r/10851 : power: tickless: hpet: Stop periodic timer
- https://gerrit.zephyrproject.org/r/10850 : power: tickless: Add TICKLESS_KERNEL kconfig option
- https://gerrit.zephyrproject.org/r/9890 : sys_bitfield*(): use 'void *' instead of memaddr_t
- https://gerrit.zephyrproject.org/r/10804 : net/mqtt: Add support for IBM BlueMix Watson topic format
- https://gerrit.zephyrproject.org/r/10807 : net/mqtt: Add BT support to MQTT publisher sample

MERGED within last 24 hours:
- https://gerrit.zephyrproject.org/r/10994 : net: tcp: Remove multiple checks on nbuf protocol family
- https://gerrit.zephyrproject.org/r/10995 : net: tcp: Remove multiple times of nbuf_compact() call
- https://gerrit.zephyrproject.org/r/10972 : tests: add filesystem api test
- https://gerrit.zephyrproject.org/r/10990 : net: Ref net_buf using net_nbuf_ref
- https://gerrit.zephyrproject.org/r/10973 : Merge net branch into master
- https://gerrit.zephyrproject.org/r/10963 : Merge bluetooth branch into master
- https://gerrit.zephyrproject.org/r/10988 : Bluetooth: GATT: fixing unsubscription
- https://gerrit.zephyrproject.org/r/10989 : Bluetooth: Fix incorrect checks for command buffer user data
- https://gerrit.zephyrproject.org/r/10971 : samples: power: Remove mention of specific versions in README
- https://gerrit.zephyrproject.org/r/10964 : kernel: init: use C implementation for STACK_CANARY_INIT
- https://gerrit.zephyrproject.org/r/10962 : xcc: add ccache support
- https://gerrit.zephyrproject.org/r/10966 : samples: net: irc_bot: fix build break
- https://gerrit.zephyrproject.org/r/10968 : samples: net: irc_bot: add testcase.ini
- https://gerrit.zephyrproject.org/r/10967 : samples: net: irc_bot: fix size_t related build warnings
- https://gerrit.zephyrproject.org/r/10959 : net: context: fix net context / net conn leak
- https://gerrit.zephyrproject.org/r/10958 : net: ip: change some error messages to NET_ERR
- https://gerrit.zephyrproject.org/r/10938 : rtc_qmsi: Call QMSI 1.4 context save/restore functions.
- https://gerrit.zephyrproject.org/r/10941 : aonpt_qmsi: Call QMSI 1.4 context save/restore functions.
- https://gerrit.zephyrproject.org/r/10940 : quark_se_ss: Remove enter_arc_state and use QMSI functions
- https://gerrit.zephyrproject.org/r/10723 : ext qmsi: Update to QMSI 1.4 RC2
- https://gerrit.zephyrproject.org/r/10908 : gpio: Error GPIO_INT with GPIO_DIR_OUT consistently.
- https://gerrit.zephyrproject.org/r/10909 : gpio: Error pin out of range consistently.
- https://gerrit.zephyrproject.org/r/10910 : gpio: Error unsupported access_op consistently.
- https://gerrit.zephyrproject.org/r/9449 : tests: add systhreads test case
- https://gerrit.zephyrproject.org/r/10888 : doc: add permalinks to document headings
- https://gerrit.zephyrproject.org/r/6586 : misc: Let the compiler choose whether to omit frame pointer
- https://gerrit.zephyrproject.org/r/10949 : checkpatch: Remove reference to legacy IP stack
- https://gerrit.zephyrproject.org/r/10911 : kernel: Remove redundant TICKLESS_IDLE_SUPPORTED option
- https://gerrit.zephyrproject.org/r/10849 : doc: rename doxygen configuration file and build from doc/
- https://gerrit.zephyrproject.org/r/9680 : quark_se: Fix restore info address
- https://gerrit.zephyrproject.org/r/9681 : quark_se: Add shared GDT in RAM memory to linker
- https://gerrit.zephyrproject.org/r/9682 : quark_d2000: Add shared GDT memory to linker
- https://gerrit.zephyrproject.org/r/9460 : Bluetooth: AVDTP: Add AVDTP Discover Function Definition
- https://gerrit.zephyrproject.org/r/10894 : doc/net: Add L2 and device driver document
- https://gerrit.zephyrproject.org/r/10841 : net/dns: Fix inline documentation
- https://gerrit.zephyrproject.org/r/10832 : mbedtls: add arduino 101 configuration to ssl client sample
- https://gerrit.zephyrproject.org/r/10904 : tests: remove old ARM vector table test
- https://gerrit.zephyrproject.org/r/10890 : xt-run: don't leave dead emulator processes lying around
- https://gerrit.zephyrproject.org/r/10874 : i2c: Implement consistent i2c no msgs behaviour
- https://gerrit.zephyrproject.org/r/10873 : i2c: Remove unused definition.
- https://gerrit.zephyrproject.org/r/10875 : i2c/stm32lx: Fix layout.
- https://gerrit.zephyrproject.org/r/10876 : i2c/dw: Switch from EPERM to EIO
- https://gerrit.zephyrproject.org/r/10877 : i2c: Name parameters consistently.
- https://gerrit.zephyrproject.org/r/10878 : i2c: Elaborate API return values
- https://gerrit.zephyrproject.org/r/10901 : xt-sim: add support for 'make debug'
- https://gerrit.zephyrproject.org/r/10919 : Xtensa port: Fixed memory corruption in interrupt handler exit function.
- https://gerrit.zephyrproject.org/r/10945 : Xtensa port: Fixed defintion of MAX_HEAP_SIZE, thus, compilation of new_lib.


Re: [net] net samples not working?

Jukka Rissanen
 

On Wed, 2017-02-08 at 12:39 +0100, Richard Peters wrote:
>>> Sample: echo_server / echo_client
>>>
>>> On 'make server' on server side and 'make client' on client side:
>>>
>>> Some frames get sent and received successfully, and then my screen is
>>> flooded by these messages on the client side:
>>>
>>> [net/buf] [ERR] net_buf_alloc_debug: net_nbuf_get_reserve():329: Failed to get free buffer
>>
>> You are running out of buffers, please increase the TX/RX/DATA buf
>> count.
>
> Maybe should this be patched in the out-of-the-box samples?

Sure, can you provide patches?

>>> 2nd Problem
>>> ===========
>>>
>>> Sample: echo_server / echo_client
>>>
>>> On 'make BOARD=qemu_cortex_m3 server' on server side and 'make
>>> BOARD=qemu_cortex_m3 client' on both sides:
>>>
>>> does not build, because the Makefile contains this line:
>>>
>>> CONF_FILE ?= prj_$(BOARD).conf
>>>
>>> But there is no prj_qemu_cortex_m3.conf.
>>
>> Could you provide such a conf file and send a patch?
>
> I hope so, have to fiddle around with this :)
>
> What is the workflow for the other issues?
> Should I create tickets in jira?

Yes, creating tickets in jira is a good idea in this case.

> Regards,
> Richard

Cheers,
Jukka


Re: Problems managing NBUF DATA pool in the networking stack

Luiz Augusto von Dentz
 

Hi Marcus,

On Wed, Feb 8, 2017 at 12:37 PM, Marcus Shawcroft
<marcus.shawcroft@gmail.com> wrote:
On 8 February 2017 at 07:04, Jukka Rissanen
<jukka.rissanen@linux.intel.com> wrote:

>> One option would be to split the DATA pool in two, so one pool each for
>> sending and receiving. Then again, this does not solve much as you might
>> still get to a situation where all the buffers are exhausted.
>
> Running out of resources is bad; deadlock, especially undetected
> deadlock, is worse. Avoiding the deadlock where the RX path starves
> the rest of the system of resources requires that the resources the RX
> path can consume are separate from the resources available to the TX
> path(s). Limiting resource consumption by the RX path is
> straightforward: buffers come from a fixed-size pool, and when the pool
> is empty we drop packets. Now we have a situation where RX cannot starve
> TX, we just need to ensure that multiple TX paths cannot deadlock each
> other.
>
> Dealing with resource exhaustion on the TX side is harder. In a
> system with multiple TX paths, either there need to be sufficient TX
> resources that all TX paths can acquire sufficient resources to
> proceed in parallel, or there need to be sufficient resources for any
> one path to make progress, along with a mechanism to serialize those
> paths. The former solution is probably a non-starter for a small
> system because the number of buffers required is likely to be
> unreasonably large. The latter solution I think implies that no TX
> path can block waiting for resources unless it currently holds no
> resources, i.e. blocking to get a buffer is ok, blocking to extend a
> buffer or to get a second buffer is not ok.
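
The two rules above (RX draws from its own pool and drops when it is empty; a TX path may only wait for a buffer while it holds none) can be sketched as follows. Everything here is hypothetical: the pools are reduced to free counters, and none of the names correspond to the real nbuf API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical fixed-size pool: only the free count is modelled here. */
struct buf_pool {
    int free_count;
};

/* Per-path bookkeeping: how many buffers this TX path already holds. */
struct tx_path {
    int held;
};

enum tx_result { TX_GOT_BUF, TX_MAY_BLOCK, TX_MUST_FAIL };

/* RX never waits: if the RX pool is empty the packet is dropped. */
static bool rx_try_alloc(struct buf_pool *rx_pool)
{
    assert(rx_pool != NULL);
    if (rx_pool->free_count == 0) {
        return false; /* caller drops the packet */
    }
    rx_pool->free_count--;
    return true;
}

/* TX may wait for its first buffer only. A path that already holds
 * buffers must fail instead of blocking, so it cannot deadlock others. */
static enum tx_result tx_alloc(struct buf_pool *tx_pool, struct tx_path *path)
{
    assert(tx_pool != NULL && path != NULL);
    if (tx_pool->free_count > 0) {
        tx_pool->free_count--;
        path->held++;
        return TX_GOT_BUF;
    }
    return path->held == 0 ? TX_MAY_BLOCK : TX_MUST_FAIL;
}

/* Return everything a path holds, e.g. after sending or on failure. */
static void tx_release_all(struct buf_pool *tx_pool, struct tx_path *path)
{
    assert(tx_pool != NULL && path != NULL);
    tx_pool->free_count += path->held;
    path->held = 0;
}
```

Because the RX and TX counters are separate, exhausting the RX pool leaves the TX pool untouched, and a path that gets TX_MUST_FAIL is expected to call tx_release_all() and start over rather than wait.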
While I agree we should prevent the remote from consuming all the buffers
and possibly starving the TX, this is probably due to the echo_server
design, which deep copies the buffers from RX to TX. In a normal
application the RX would be processed and unrefed, causing the data
buffers to return to the pool immediately. Even if we split the RX into
a separate pool, any context can just ref the buffer, causing the RX to
starve again. So at least in this aspect it seems to be a bug in the
application; otherwise we will end up with each and every context
having its own exclusive pool.

That said, it is perhaps not a bad idea to design an optional callback
for the net_context to provide its own pool. We have something like
that for L2CAP channels:

/** Channel alloc_buf callback
 *
 *  If this callback is provided the channel will use it to allocate
 *  buffers to store incoming data.
 *
 *  @param chan The channel requesting a buffer.
 *
 *  @return Allocated buffer.
 */
struct net_buf *(*alloc_buf)(struct bt_l2cap_chan *chan);

This is how we allocate a net_buf from the IP stack, which has a much
bigger MTU than Bluetooth, and that way we also avoid starving the
Bluetooth RX pool when reassembling the segments. This will most
likely be necessary for protocols that need to implement their own
fragmentation and reassembly, because in that case the lifetime of the
buffers cannot be controlled directly by the stack.

The timeout added to the buffer API helps a bit, but we might still run
out of buffers.

For incremental acquisition of further resources this doesn't help: it
can't guarantee to prevent deadlock, and its use in the software stack
makes reasoning about deadlock harder.

Cheers
/Marcus


--
Luiz Augusto von Dentz


Re: [net] net samples not working?

Richard Peters <mail@...>
 

Hi,

the SLIP-TAP driver seems not to be working in qemu for cortex-m
targets, too.
I cannot get any network traffic through, either between two
qemu_cortex_m3 targets or to the PC.

Has anyone tried networking in qemu with qemu_cortex_m3 target?

I don't think that this is a configuration problem.
The SLIP-TAP driver was disabled in the cortex_m3 defconfig.
I enabled it, but no luck.

Would be great to get some help on this!

Regards,
Richard


What is the code in ./zephyr/tests, and how does it differ from ./zephyr/samples/?

曹子龙
 

Dear all,

I am running some network test patterns on an arduino_due board. I found that both ./zephyr/tests and ./zephyr/samples/ contain network test patterns. Judging by the names they seem related, but each one has its own main entry, so I would like to know what the difference between them is and when I should use the test patterns in the ./zephyr/tests directory.

This is my first time working with Zephyr networking, so please bear with my mistakes. Thank you very much for your kind support.



 


Re: [net] net samples not working?

Richard Peters <mail@...>
 

Sample: echo_server / echo_client

On 'make server' on server side and 'make client' on client side:

Some frames get sent and received successfully and then my screen is
flooded by these
messages on the client side:

[net/buf] [ERR] net_buf_alloc_debug: net_nbuf_get_reserve():329: Failed to get free buffer

You are running out of buffers, please increase the TX/RX/DATA buf
count.

Should this maybe be patched in the out-of-the-box samples?



2nd Problem
===========

Sample: echo_server / echo_client

On 'make BOARD=qemu_cortex_m3 server' on server side and 'make
BOARD=qemu_cortex_m3 client' on both sides:

does not build, because the Makefile contains this line:

CONF_FILE ?= prj_$(BOARD).conf

But there is no prj_qemu_cortex_m3.conf.

Could you provide such a conf file and send a patch?

I hope so, I have to fiddle around with this :)

What is the workflow for the other issues?
Should I create tickets in Jira?

Regards,
Richard


Re: [net] net samples not working?

Jukka Rissanen
 

Hi Richard,

On Wed, 2017-02-08 at 11:52 +0100, Richard Peters wrote:
Hi Community,

I have several problems with the net examples.
Maybe there are some bugs involved?

1st Problem
===========

Sample: echo_server / echo_client

On 'make server' on server side and 'make client' on client side:

Some frames get sent and received successfully and then my screen is
flooded by these
messages on the client side:

[net/buf] [ERR] net_buf_alloc_debug: net_nbuf_get_reserve():329: Failed to get free buffer

You are running out of buffers, please increase the TX/RX/DATA buf
count.



2nd Problem
===========

Sample: echo_server / echo_client

On 'make BOARD=qemu_cortex_m3 server' on server side and 'make
BOARD=qemu_cortex_m3 client' on both sides:

does not build, because the Makefile contains this line:

CONF_FILE ?= prj_$(BOARD).conf

But there is no prj_qemu_cortex_m3.conf.

Could you provide such a conf file and send a patch?


When I try it with

make BOARD=qemu_cortex_m3  CONF_FILE=prj_slip.conf [server/client]

or

make BOARD=qemu_cortex_m3  CONF_FILE=prj_qemu_x86.conf [server/client]

then the communication between client and server does not work.

The conf file is probably wrong and has wrong settings.



3rd Problem
===========

Sample zoap_server, zoap_client

On 'make server' on server side and 'make client' on client side, I get
on the server side:

[zoap-server] [ERR] udp_receive: Invalid data received (-22)


4th Problem
===========

Sample zoap_server, zoap_client

On 'make BOARD=qemu_cortex_m3 server' on server side and 'make
BOARD=qemu_cortex_m3 client' on client side, I get on the client
side:

Unable add option to request.

No idea about the zoap issues in the 3rd and 4th problems.



5th Problem
===========

Sample: coaps_server, coaps_client

On 'make server' on server side and 'make client' on client side, I get
on the server side:

failed!
mbedtls_ssl_handshake returned -0x7900
Packet without payload
No handler for such request (-22)


6th Problem
===========

Sample: coaps_server, coaps_client

On 'make BOARD=qemu_cortex_m3 server' on server side and 'make
BOARD=qemu_cortex_m3 client' on both sides

same as 2nd Problem.

Ditto.



Tested on:

- Ubuntu 16.04 (64 Bit)
- Zephyr rev: e2bbad9600d655287496e3434641576e39a6582b
- Toolchain: Zephyr-SDK 0.8.2

Cheers,
Jukka


[net] net samples not working?

Richard Peters <mail@...>
 

Hi Community,

I have several problems with the net examples.
Maybe there are some bugs involved?

1st Problem
===========

Sample: echo_server / echo_client

On 'make server' on server side and 'make client' on client side:

Some frames get sent and received successfully and then my screen is
flooded by these
messages on the client side:

[net/buf] [ERR] net_buf_alloc_debug: net_nbuf_get_reserve():329: Failed to get free buffer


2nd Problem
===========

Sample: echo_server / echo_client

On 'make BOARD=qemu_cortex_m3 server' on server side and 'make
BOARD=qemu_cortex_m3 client' on both sides:

does not build, because the Makefile contains this line:

CONF_FILE ?= prj_$(BOARD).conf

But there is no prj_qemu_cortex_m3.conf.

When I try it with

make BOARD=qemu_cortex_m3 CONF_FILE=prj_slip.conf [server/client]

or

make BOARD=qemu_cortex_m3 CONF_FILE=prj_qemu_x86.conf [server/client]

then the communication between client and server does not work.


3rd Problem
===========

Sample zoap_server, zoap_client

On 'make server' on server side and 'make client' on client side, I get
on the server side:

[zoap-server] [ERR] udp_receive: Invalid data received (-22)


4th Problem
===========

Sample zoap_server, zoap_client

On 'make BOARD=qemu_cortex_m3 server' on server side and 'make
BOARD=qemu_cortex_m3 client' on client side, I get on the client side:

Unable add option to request.


5th Problem
===========

Sample: coaps_server, coaps_client

On 'make server' on server side and 'make client' on client side, I get
on the server side:

failed!
mbedtls_ssl_handshake returned -0x7900
Packet without payload
No handler for such request (-22)


6th Problem
===========

Sample: coaps_server, coaps_client

On 'make BOARD=qemu_cortex_m3 server' on server side and 'make
BOARD=qemu_cortex_m3 client' on both sides

same as 2nd Problem.


Tested on:

- Ubuntu 16.04 (64 Bit)
- Zephyr rev: e2bbad9600d655287496e3434641576e39a6582b
- Toolchain: Zephyr-SDK 0.8.2


Re: Problems managing NBUF DATA pool in the networking stack

Marcus Shawcroft <marcus.shawcroft@...>
 

On 8 February 2017 at 07:04, Jukka Rissanen
<jukka.rissanen@linux.intel.com> wrote:

One option would be to split the DATA pool in two, with one pool for
sending and one for receiving. Then again, this does not solve much as
you might still get to a situation where all the buffers are exhausted.

Running out of resources is bad; deadlock, especially undetected
deadlock, is worse. Avoiding the deadlock where the RX path starves
the rest of the system of resources requires that the resources the RX
path can consume are separate from the resources available to the TX
path(s). Limiting resource consumption by the RX path is
straightforward: buffers come from a fixed-size pool, and when the pool
is empty we drop packets. Now we have a situation where RX cannot
starve TX; we just need to ensure that multiple TX paths cannot
deadlock each other.

Dealing with resource exhaustion on the TX side is harder. In a
system with multiple TX paths, either there need to be sufficient TX
resources that all TX paths can acquire what they need to
proceed in parallel, or there need to be sufficient resources for any
one path to make progress along with a mechanism to serialize those
paths. The former solution is probably a non-starter for a small
system because the number of buffers required is likely to be
unreasonably large. The latter solution, I think, implies that no TX
path can block waiting for resources unless it currently holds no
resources, i.e. blocking to get a buffer is OK, blocking to extend a
buffer or to get a second buffer is not OK.
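
The rule in the last sentence can be sketched with a toy pool that only counts free buffers. This is a minimal illustration under stated assumptions, not Zephyr code: struct tx_pool, struct tx_path, tx_alloc() and tx_release_all() are hypothetical names invented for this example. The allocator never blocks itself; it only tells the caller whether blocking would be safe.

```c
#include <assert.h>
#include <stddef.h>

/* Toy TX pool: only counts free buffers (hypothetical, not a Zephyr type). */
struct tx_pool {
	size_t free_bufs;
};

/* One TX path and the number of buffers it currently holds. */
struct tx_path {
	size_t held;
};

enum alloc_result {
	ALLOC_OK,          /* a buffer was granted */
	ALLOC_WOULD_BLOCK, /* pool empty, caller holds nothing: waiting is safe */
	ALLOC_FAIL         /* pool empty, caller holds buffers: must not wait */
};

/* Try to take one buffer for the given path. */
enum alloc_result tx_alloc(struct tx_pool *pool, struct tx_path *path)
{
	assert(pool != NULL && path != NULL);

	if (pool->free_bufs > 0) {
		pool->free_bufs--;
		path->held++;
		return ALLOC_OK;
	}

	/* Pool is empty: only a path that holds nothing may wait. */
	return path->held == 0 ? ALLOC_WOULD_BLOCK : ALLOC_FAIL;
}

/* Give back everything a path holds, e.g. after ALLOC_FAIL. */
void tx_release_all(struct tx_pool *pool, struct tx_path *path)
{
	assert(pool != NULL && path != NULL);

	pool->free_bufs += path->held;
	path->held = 0;
}
```

A path that gets ALLOC_FAIL is expected to release what it holds (or send what it has) before trying again, so a waiting path can never be the one keeping the pool empty.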

The timeout added to the buffer API helps a bit, but we might still run
out of buffers.

For incremental acquisition of further resources this doesn't help: it
can't guarantee to prevent deadlock, and its use in the software stack
makes reasoning about deadlock harder.

Cheers
/Marcus


Re: Problems managing NBUF DATA pool in the networking stack

Jukka Rissanen
 

Hi Piotr,

On Tue, 2017-02-07 at 17:56 +0100, Piotr Mienkowski wrote:
Hi,
There seems to be a conceptual issue in the way networking buffers are
currently set up. I was thinking about entering a Jira bug report but
maybe it's just me missing some information or otherwise
misunderstanding how the networking stack is supposed to be used.
I'll briefly describe the problem here based on the Zephyr echo_server
sample application.
Currently, if the echo_server application receives a large amount of
data, e.g. when a large file is sent via ncat, the application will
lock up and stop responding. The only way out is to reset the device.
This problem is very easily observed with the eth_sam_gmac Ethernet
driver and should be just as easy to spot with eth_mcux. Due to a
different driver architecture it may be more difficult to observe
with eth_enc28j60.
The problem is as follows. Via Kconfig we define the RX, TX and data
buffer pools. Let's say like this:
CONFIG_NET_NBUF_RX_COUNT=14
CONFIG_NET_NBUF_TX_COUNT=14
CONFIG_NET_NBUF_DATA_COUNT=72
The number of RX and TX buffers corresponds to the number of RX/TX
frames which may be simultaneously received/sent. The data buffer
count tells us how much storage we reserve for the actual data. This
pool is shared between the RX and TX paths. If we receive a large
amount of data, the RX path will consume all available data buffers,
leaving none for the TX path. If an application then tries to reserve
data buffers for the TX path, as echo_server does in its
build_reply_buf() function, it will get stuck waiting forever for a
free data buffer. The echo_server application gets stuck on the
following line:
frag = net_nbuf_get_data(context);
The simplified sequence of events in the echo_server application is
as follows: receive RX frame -> reserve data buffers for TX frame ->
copy data from RX frame to TX frame -> free resources associated with
RX frame -> send TX frame.
One way to avoid this is to define a number of data buffers large
enough that the RX path cannot exhaust the available data pool. Taking
into account that the data buffer size is 128 bytes, as defined by the
following Kconfig parameter,
CONFIG_NET_NBUF_DATA_SIZE=128
and the maximum frame size is 1518 or 1536 bytes, one RX frame can use
up to 12 data buffers. In our example we would need to reserve more
than 12*14 = 168 data buffers to ensure correct behavior, and even more
in the case of the eth_sam_gmac Ethernet driver.
After recent updates to the networking stack, the functions reserving
RX/TX/DATA buffers have a timeout parameter. That would prevent the
lock-up but it still does not really solve the issue.
Is there a better way to manage this?

One option would be to split the DATA pool in two, with one pool for
sending and one for receiving. Then again, this does not solve much as
you might still get to a situation where all the buffers are exhausted.

The timeout added to the buffer API helps a bit, but we might still run
out of buffers.

One should allocate as many buffers to the DATA pool as possible, but
this really depends on the hardware and the application, of course.


Thanks and regards,
Piotr

Cheers,
Jukka


Re: Inconsistent and ever-changing device naming in Zephyr

Maureen Helm
 

Hi Paul,

-----Original Message-----
From: Paul Sokolovsky [mailto:paul.sokolovsky@linaro.org]
Sent: Monday, February 06, 2017 7:34 AM
To: devel@lists.zephyrproject.org; Anas Nashif <anas.nashif@intel.com>;
Maureen Helm <maureen.helm@nxp.com>; sakari.poussa@intel.com;
geoff@linux.intel.com; Daniel Thompson <daniel.thompson@linaro.org>
Subject: Inconsistent and ever-changing device naming in Zephyr

Hello,

Writing on this has been on my TODO list at least since November, but I expected
this to be a controversial topic, so I kind of skipped it. Controversial, as in:
different parties find quite different, if not opposite, solutions to the problem
to be "obvious". Well, the breakage continues, so this should be raised.

(Recent) facts:

1. Update to Kinetis (BOARD=frdm_k64f) port broke GPIO support in a
(complex) Zephyr-based application, Zephyr.js:
https://github.com/01org/zephyr.js/issues/665

2. A day or two ago, Arduino 101 I2C support in Zephyr.js was updated, apparently
because a Zephyr upgrade broke it. Today it became known that the fix for
Arduino 101 broke FRDM-K64F.
https://github.com/01org/zephyr.js/issues/676

Investigating the issues, the causes of them are:

1. frdm_k64f GPIO port naming changed from "GPIO_0" to (!!) "gpio_porta".

This is my fault. I was not aware the port names were being used this way. I tend to use the config names rather than the strings themselves. For example, in boards/arm/frdm_k64f/Kconfig.defconfig:

config FXOS8700_GPIO_NAME
	default GPIO_MCUX_PORTC_NAME

Please, please let me know if you encounter problems like this again.


2. Name of (one of) I2C device for arduino_101 changed from "I2C_0" to
"I2C_SS_0".


This is not the first time device names have changed in Zephyr (see the reference
above to November 2016), but it is the first time (at least in the last half-year)
that the naming has changed in relatively well-developed ports, which leads to a
cascade of breakage. And in this one case, it can be seen how inconsistent the
naming (and changes to it) can be.

The question is what to do about this. Possible high-level approaches can
be:

a) Establish (well, reconfirm/clarify) and uphold consistent naming conventions
across ports, architectures, and boards.

Since you're most affected, mind making a proposal?


b) Send a strong message to application developers that they should not rely
on any patterns of Zephyr device naming, and even on the naming at all, and
instead should treat it as a configuration parameter, which ultimately should be
set by a user (and thus apps should make such configuration explicit and easy
for users).
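
Approach b) could look roughly like the sketch below. Everything in it is illustrative: APP_GPIO_DEV_NAME, struct toy_device and toy_get_binding() are made-up names, and the lookup only stands in for a name-based binding call such as Zephyr's device_get_binding(). The point is that the application never hard-codes the string at the call site, so a rename is absorbed by one configuration value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical application-level setting. In a real build this would come
 * from the application's configuration instead of a default in the source. */
#ifndef APP_GPIO_DEV_NAME
#define APP_GPIO_DEV_NAME "GPIO_0"
#endif

/* Toy device record, standing in for a real driver instance. */
struct toy_device {
	const char *name;
};

/* Look a device up by name; returns NULL when no device has that name. */
const struct toy_device *toy_get_binding(const struct toy_device *devs,
					 size_t count, const char *name)
{
	assert(devs != NULL && name != NULL);

	for (size_t i = 0; i < count; i++) {
		if (strcmp(devs[i].name, name) == 0) {
			return &devs[i];
		}
	}

	return NULL;
}
```

With a device table that only contains "gpio_porta", a lookup of the default "GPIO_0" returns NULL, which is the kind of breakage described above. Overriding APP_GPIO_DEV_NAME at build time would adapt the application without touching its source.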

Whether a) is a viable solution in turn depends on a paradigmatic decision about
what aim Zephyr is trying to achieve:

I. Is it a common platform, literally, an Operating System, providing consistent
means to develop software *across* various hardware?

II. Or is it a loose collection of APIs for working with hardware, with the aim of
getting the most (or something) out of a particular piece of hardware, without
pledging much for consistency and portability across hardware?


It would be nice to get opinions of both the core maintainers and the
maintainers of particular ports, as well as specific suggestions how to deal with
the bugs above.


Thanks,
Paul



Problems managing NBUF DATA pool in the networking stack

Piotr Mienkowski
 

Hi,

There seems to be a conceptual issue in the way networking buffers are currently set up. I was thinking about entering a Jira bug report but maybe it's just me missing some information or otherwise misunderstanding how the networking stack is supposed to be used. I'll briefly describe the problem here based on the Zephyr echo_server sample application.

Currently, if the echo_server application receives a large amount of data, e.g. when a large file is sent via ncat, the application will lock up and stop responding. The only way out is to reset the device. This problem is very easily observed with the eth_sam_gmac Ethernet driver and should be just as easy to spot with eth_mcux. Due to a different driver architecture it may be more difficult to observe with eth_enc28j60.

The problem is as follows. Via Kconfig we define the RX, TX and data buffer pools. Let's say like this:

CONFIG_NET_NBUF_RX_COUNT=14
CONFIG_NET_NBUF_TX_COUNT=14
CONFIG_NET_NBUF_DATA_COUNT=72

The number of RX and TX buffers corresponds to the number of RX/TX frames which may be simultaneously received/sent. The data buffer count tells us how much storage we reserve for the actual data. This pool is shared between the RX and TX paths. If we receive a large amount of data, the RX path will consume all available data buffers, leaving none for the TX path. If an application then tries to reserve data buffers for the TX path, as echo_server does in its build_reply_buf() function, it will get stuck waiting forever for a free data buffer. The echo_server application gets stuck on the following line:

frag = net_nbuf_get_data(context);

The simplified sequence of events in the echo_server application is as follows: receive RX frame -> reserve data buffers for TX frame -> copy data from RX frame to TX frame -> free resources associated with RX frame -> send TX frame.

One way to avoid this is to define a number of data buffers large enough that the RX path cannot exhaust the available data pool. Taking into account that the data buffer size is 128 bytes, as defined by the following Kconfig parameter,

CONFIG_NET_NBUF_DATA_SIZE=128

and the maximum frame size is 1518 or 1536 bytes, one RX frame can use up to 12 data buffers. In our example we would need to reserve more than 12*14 = 168 data buffers to ensure correct behavior, and even more in the case of the eth_sam_gmac Ethernet driver.
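
The sizing argument can be checked with a short calculation. The sketch below is only an illustration: the helper names are made up for this example, and it assumes each frame occupies whole data buffers with no per-fragment overhead or reserved header space.

```c
#include <assert.h>
#include <stddef.h>

/* Number of fixed-size data buffers needed to hold one frame
 * (ceiling division). Hypothetical helper, not part of Zephyr. */
size_t data_bufs_per_frame(size_t frame_size, size_t data_buf_size)
{
	assert(data_buf_size > 0);

	return (frame_size + data_buf_size - 1) / data_buf_size;
}

/* Data buffers the RX path can pin if every RX buffer carries a
 * maximum-size frame. */
size_t rx_worst_case_data_bufs(size_t rx_count, size_t frame_size,
			       size_t data_buf_size)
{
	return rx_count * data_bufs_per_frame(frame_size, data_buf_size);
}
```

With the values from the example (14 RX buffers, 128-byte data buffers, 1518-byte frames) the worst case is 14 * 12 = 168 data buffers, well above the configured 72, so the RX path alone can drain the shared pool.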

After recent updates to the networking stack, the functions reserving RX/TX/DATA buffers have a timeout parameter. That would prevent the lock-up but it still does not really solve the issue.

Is there a better way to manage this?

Thanks and regards,
Piotr
