dhcp integration into the platform
Marcus Shawcroft <marcus.shawcroft@...>
Hi,
This evening I took a look at how we might better integrate dhcpv4 into the platform. In the current tree, in order to use dhcp the application is expected to start the dhcp client explicitly. There is no synchronization between network link up / link down events and the dhcp state machine. The dhcp state machine will start issuing discover messages irrespective of the link status; once the link comes up, dhcp will typically run through request/ack and install an IP address, GW, etc. Should the link drop and subsequently come up again, the dhcp state machine remains oblivious.

Better integration of dhcp into the stack looks reasonably straightforward. I propose the following:

- dhcp_start() is modified to initialize the dhcp context, but remain in the INIT state.
- net_if is modified to generate network management link up / down events.
- dhcpv4 is modified to capture link up and link down events from net mgmt.
- dhcpv4 enters discover on link up.
- dhcpv4 performs unset_dhcpv4_on_iface for link down.
- unset_dhcpv4_on_iface needs to call net_if_ipv4_addr_rm (it should probably do that anyway).
- dhcp_start is renamed and hooked in by SYS_INIT ?
- A new empty dhcp_start is defined and marked deprecated (to be nice to apps that might currently call it).
- The current public include/dhcp.h exposed interface is removed.

There are a few details I've not thought through properly yet, notably:

- The net_if layer already has link_up and link_down events. These however are defined to fire when a net_if is enabled or disabled, not when its link goes up or down, hence either these events need to be redefined or we introduce two new events to represent the link up and down.
- I'm not sure what the appropriate way of managing the removal of a public .h file should be.
- There are several ways of wiring up the network management link up / down notifiers:
  1) Drivers do it directly.
  2) Drivers call the net_if layer which in turn issues the network management events.

Before I take this any further I'd appreciate feedback on the sanity of the approach and indeed whether such patches would be welcome.

Cheers
/Marcus
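To make the proposed behaviour concrete, here is a minimal sketch of a dhcp context that stays in INIT until a link up event arrives and drops its address on link down. It is only an illustration under stated assumptions: the types and functions (dhcp_ctx, dhcp_ctx_init, dhcp_on_link_event, dhcp_on_ack) are hypothetical and are not the Zephyr dhcpv4 or net mgmt APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model; none of these names are Zephyr APIs. */
enum dhcp_state { DHCP_INIT, DHCP_DISCOVER, DHCP_BOUND };
enum link_event { LINK_UP, LINK_DOWN };

struct dhcp_ctx {
	enum dhcp_state state;
	bool addr_installed;
};

/* Initialize the context but remain in INIT: no discover is sent yet. */
static void dhcp_ctx_init(struct dhcp_ctx *ctx)
{
	assert(ctx != NULL);
	ctx->state = DHCP_INIT;
	ctx->addr_installed = false;
}

/* Stand-in for a completed request/ack exchange. */
static void dhcp_on_ack(struct dhcp_ctx *ctx)
{
	if (ctx->state == DHCP_DISCOVER) {
		ctx->state = DHCP_BOUND;
		ctx->addr_installed = true;
	}
}

/* React to link events instead of running regardless of link status. */
static void dhcp_on_link_event(struct dhcp_ctx *ctx, enum link_event ev)
{
	switch (ev) {
	case LINK_UP:
		if (ctx->state == DHCP_INIT) {
			ctx->state = DHCP_DISCOVER;
		}
		break;
	case LINK_DOWN:
		/* Models unset_dhcpv4_on_iface removing the address. */
		ctx->addr_installed = false;
		ctx->state = DHCP_INIT;
		break;
	}
}
```

With this shape, a link that drops and comes back simply re-enters discover, which is the case the current tree misses.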
|
|
Re: Inconsistent and ever-changing device naming in Zephyr
Paul Sokolovsky
Hello Daniel,
On Mon, 6 Feb 2017 16:59:26 +0000
Daniel Thompson <daniel.thompson@...> wrote:

[]
> a) Establish (well, reconfirm/clarify) and uphold consistent naming
> Or c) inherit the device names from device tree?

I'm not familiar enough with devicetree details to imagine how exactly that would be done, but Andy's response confirms it's possible to do something along those lines. But as far as I can see, that would still depend on establishing common conventions for naming. Perhaps doing it via DT would make it easier to achieve consistency, and to verify/maintain it. But the current question is whether consistency is desirable at all, i.e. whether maintainers of individual SoCs would each agree to make some changes to their code/config, and Zephyr maintainers would agree to uphold it afterwards. Given that DT still has some way to go before it is widely used in Zephyr, discussion or at least consideration of this "consistent naming" idea might start in parallel.

> Daniel.

Thanks,
Paul

Linaro.org | Open source software for ARM SoCs
Follow Linaro: http://www.facebook.com/pages/Linaro
http://twitter.com/#!/linaroorg - http://www.linaro.org/linaro-blog
|
|
Re: [net] net samples not working?
Paul Sokolovsky
Hello Richard,
Let me start by saying that I agree that Zephyr networking does have issues. They say that any bug is shallow given enough eyes, and Zephyr networking clearly lacks enough eyes. So, I was in about the same situation as you 2 weeks ago: after working on other things and holidays, I was back to networking, and while trying to do something I faced a 1st, 2nd, 3rd, 4th, etc. issue in a row. Well, I took care to report them, fix those I could, research the others, ping other folks who work on networking, etc. I'm happy to report that after these 2 weeks, I see visible improvements, thanks to both the core team's and contributors' work. So, please join the party - we definitely need more eyes and hands.

Please see below.

On Wed, 8 Feb 2017 13:39:06 +0100
Richard Peters <mail@...> wrote:

> Hi,

I did. It didn't work. Looking into that has been on my to-do list since July last year. But I find it a relatively complex (well, wide) task, and there is higher-priority stuff anyway. So, to make qemu_cortex_m3 work, you'd need to understand:

a) basic facts about such networking, i.e. that there should be 2 UARTs, one for console, another for networking;
b) exact details of how QEMU does UART emulation on the "output" side (which connects to a POSIX object on the host);
c) exact details of the MCU/board qemu_cortex_m3 emulates, its UART setup, etc.;
d) other magic ingredients.

If you know these points well, I encourage you to look into it. If not, you can use qemu_x86 for now, and there are many more lower-hanging, and oftentimes more important, tasks.

Re: your other questions from the initial mail: I personally haven't gone beyond testing the echo_server application so far. What's the point of trying MQTT/ZOAP/OCF/whatever if basic TCP and pings don't work? Well, both are working now, but I'd still recommend starting by getting things to play well with echo_server. It should work well now for UDP/TCP * IPv4/IPv6 on QEMU. Just in case, a walkthrough is here: https://wiki.zephyrproject.org/view/Networking-with-Qemu .

By all means, file the issues you see, but feel free to have a look at what's already filed: https://jira.zephyrproject.org/browse/ZEP-1673?jql=project%20%3D%20ZEP%20AND%20component%20%3D%20%22Networking%20%2F%20IP%20Stack%22%20AND%20status%20!%3D%20Closed

And somebody (well, many people!) needs to triage, investigate, then test solutions for already filed issues too ;-).

--
Best Regards,
Paul
|
|
Re: Inconsistent and ever-changing device naming in Zephyr
Andy Gross
On 6 February 2017 at 10:59, Daniel Thompson <daniel.thompson@...> wrote:
On 06/02/17 13:34, Paul Sokolovsky wrote:

The alias {} node can indeed be used for this. But unless we define a specific format, parsing this might be interesting. However, we have some solutions for that if we want to go that route. A board-specific yaml file could specify the alias format. It needs to be something that can co-exist with existing DT files that might be imported from Linux.

Maybe something along the lines of:

    zephyr,uart-0 = &node@xxxxxxxx;
    zephyr,uart-1 = &node@xxxxxxxx;

and so on and so forth. These can ghost the aliases already present if they have something like

    uart0 = &node@xxxxxxxx;

The zephyr,XXXXX names need to be standardized for each device type. We strip the zephyr part and the generated string name becomes UART-0 or UART0 or whatever standard we want.

I think this solves both the unified naming and the assigning of the names to specific nodes.
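The "strip the zephyr part" step could look roughly like the sketch below. This is an assumption-laden illustration, not an existing tool: alias_to_dev_name is a hypothetical helper, and the choice of upper-casing and keeping the dash (giving UART-0 rather than UART0) is just one of the options mentioned above.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: turn an alias such as "zephyr,uart-0" into the
 * device name "UART-0". Returns 0 on success, or -1 if the alias lacks
 * the "zephyr," prefix, has nothing after it, or does not fit in out.
 */
static int alias_to_dev_name(const char *alias, char *out, size_t out_len)
{
	static const char prefix[] = "zephyr,";
	const size_t plen = sizeof(prefix) - 1;
	size_t i, n;

	assert(alias != NULL && out != NULL);

	if (strncmp(alias, prefix, plen) != 0) {
		return -1;
	}

	n = strlen(alias + plen);
	if (n == 0 || n + 1 > out_len) {
		return -1;
	}

	/* Copy the part after the prefix, upper-cased. */
	for (i = 0; i < n; i++) {
		out[i] = (char)toupper((unsigned char)alias[plen + i]);
	}
	out[n] = '\0';

	return 0;
}
```

Aliases without the prefix, such as a plain uart0 imported from a Linux DT file, are rejected here, so they can stay in the file without affecting the generated names.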
|
|
Daily Gerrit Digest
donotreply@...
NEW within last 24 hours:
- https://gerrit.zephyrproject.org/r/10993 : tests/common: Add tests for sys_dlist_*
- https://gerrit.zephyrproject.org/r/11000 : drivers: Convert FOR_EACH macro instances to use CONTAINER
- https://gerrit.zephyrproject.org/r/10999 : Bluetooth: Convert FOR_EACH macro instances to use CONTAINER
- https://gerrit.zephyrproject.org/r/10998 : net: Convert FOR_EACH macro instances to use CONTAINER
- https://gerrit.zephyrproject.org/r/10997 : kernel: Use SYS_DLIST_FOR_EACH_CONTAINER
- https://gerrit.zephyrproject.org/r/10992 : dlist: Introduce CONTAINER macros
- https://gerrit.zephyrproject.org/r/10996 : Bluetooth: AVDTP: Moving structures to headerfile
- https://gerrit.zephyrproject.org/r/10952 : drivers/net/ieee802154: nRF5 802.15.4 radio driver
- https://gerrit.zephyrproject.org/r/10953 : samples/net: ieee802154: Add configuration for nrf5
- https://gerrit.zephyrproject.org/r/10956 : build: Separate out prebuilt kernel logic from toplevel
- https://gerrit.zephyrproject.org/r/10954 : samples/net/ieee802154: Update example with nrf5 802.15.4
- https://gerrit.zephyrproject.org/r/10951 : ext: Integrate Nordic's 802.15.4 radio driver into Zephyr
- https://gerrit.zephyrproject.org/r/10950 : ext: Import Nordic 802.15.4 radio driver
- https://gerrit.zephyrproject.org/r/10991 : eth/mcux: Add temporary workaround to unbreak IPv6 ND features.
- https://gerrit.zephyrproject.org/r/10955 : Xtensa port: Allow user to stop simulation using ctrl+c outside sanitycheck.

UPDATED within last 24 hours:
- https://gerrit.zephyrproject.org/r/10881 : sensor/mma865x: Add driver for MMA865x 3 Axis Accelerometer Family
- https://gerrit.zephyrproject.org/r/10927 : tests/slist: Exercise CONTAINER macros
- https://gerrit.zephyrproject.org/r/10926 : slist: Introduce CONTAINER macros
- https://gerrit.zephyrproject.org/r/10684 : driver: mcr20a: cleanup and use semaphore for PHY access
- https://gerrit.zephyrproject.org/r/10685 : drivers: mcr20a: refactor transceiver interrupt processing
- https://gerrit.zephyrproject.org/r/10944 : drivers: Remove unnecessary CONFIG_SYS_POWER_DEEP_SLEEP
- https://gerrit.zephyrproject.org/r/10948 : samples/mbedtls_dtls_client: Fix wild write in entropy_source
- https://gerrit.zephyrproject.org/r/10946 : samples/mbedtls_dtls_client: Use k_uptime_get_32()
- https://gerrit.zephyrproject.org/r/10880 : bbc_microbit: Enable MAG3110
- https://gerrit.zephyrproject.org/r/10947 : samples/mbedtls_dtls_server: Use k_uptime_get_32()
- https://gerrit.zephyrproject.org/r/10646 : hexiwear_k64: Add RST board documentation
- https://gerrit.zephyrproject.org/r/6481 : net: Moved net_if_ipv6_addr_lookup_by_iface() to net_if.c
- https://gerrit.zephyrproject.org/r/8746 : net: uip: Fix clearing of router solicitation message
- https://gerrit.zephyrproject.org/r/10906 : section_tags.h: cleanup
- https://gerrit.zephyrproject.org/r/10905 : toolchain: gcc.h: add indirection to _GENERIC_SECTION() macro
- https://gerrit.zephyrproject.org/r/10903 : sw_isr_table.h: clean up definition
- https://gerrit.zephyrproject.org/r/10879 : sensor/mag3110: Add mag3110 three axis magnetometer driver.
- https://gerrit.zephyrproject.org/r/6716 : Bluetooth: SDP: Server: Refactor data element structure header
- https://gerrit.zephyrproject.org/r/10920 : kw41z: add base DTS support
- https://gerrit.zephyrproject.org/r/10801 : board: add nucleo_l476rg documentation
- https://gerrit.zephyrproject.org/r/10902 : eth/mcux: Add basic PHY support.
- https://gerrit.zephyrproject.org/r/10860 : v2m_beetle: uart: Add DTS support to UART driver
- https://gerrit.zephyrproject.org/r/10859 : v2m_beetle: Add base DTS support
- https://gerrit.zephyrproject.org/r/10896 : xtensa: cleanup fatal error handling
- https://gerrit.zephyrproject.org/r/10885 : sysgen: Fixed macro evaluation issue in kernel_main.c with compiling with xcc.
- https://gerrit.zephyrproject.org/r/10848 : tinycrypt: Fixed compilation error on Xtensa caused by bad definition of bool.
- https://gerrit.zephyrproject.org/r/10882 : bbc_microbit: Enable MMA8653
- https://gerrit.zephyrproject.org/r/10720 : flash/stm32: driver for STM32F4x series
- https://gerrit.zephyrproject.org/r/10854 : cc3200: dts: Add base DTS support for TI CC3200
- https://gerrit.zephyrproject.org/r/10912 : tests: kernel: import obj_tracing test to unified kernel
- https://gerrit.zephyrproject.org/r/10628 : RFC: DTS: Kinetis: Add FRDM_K64F support
- https://gerrit.zephyrproject.org/r/10853 : drivers: Add Atmel SAM RTC driver
- https://gerrit.zephyrproject.org/r/10855 : cc3200: Enable device tree usage for CC3200
- https://gerrit.zephyrproject.org/r/10852 : samples: tickless: Sample app for tickless kernel
- https://gerrit.zephyrproject.org/r/10851 : power: tickless: hpet: Stop periodic timer
- https://gerrit.zephyrproject.org/r/10850 : power: tickless: Add TICKLESS_KERNEL kconfig option
- https://gerrit.zephyrproject.org/r/9890 : sys_bitfield*(): use 'void *' instead of memaddr_t
- https://gerrit.zephyrproject.org/r/10804 : net/mqtt: Add support for IBM BlueMix Watson topic format
- https://gerrit.zephyrproject.org/r/10807 : net/mqtt: Add BT support to MQTT publisher sample

MERGED within last 24 hours:
- https://gerrit.zephyrproject.org/r/10994 : net: tcp: Remove multiple checks on nbuf protocol family
- https://gerrit.zephyrproject.org/r/10995 : net: tcp: Remove multiple times of nbuf_compact() call
- https://gerrit.zephyrproject.org/r/10972 : tests: add filesystem api test
- https://gerrit.zephyrproject.org/r/10990 : net: Ref net_buf using net_nbuf_ref
- https://gerrit.zephyrproject.org/r/10973 : Merge net branch into master
- https://gerrit.zephyrproject.org/r/10963 : Merge bluetooth branch into master
- https://gerrit.zephyrproject.org/r/10988 : Bluetooth: GATT: fixing unsubscription
- https://gerrit.zephyrproject.org/r/10989 : Bluetooth: Fix incorrect checks for command buffer user data
- https://gerrit.zephyrproject.org/r/10971 : samples: power: Remove mention of specific versions in README
- https://gerrit.zephyrproject.org/r/10964 : kernel: init: use C implementation for STACK_CANARY_INIT
- https://gerrit.zephyrproject.org/r/10962 : xcc: add ccache support
- https://gerrit.zephyrproject.org/r/10966 : samples: net: irc_bot: fix build break
- https://gerrit.zephyrproject.org/r/10968 : samples: net: irc_bot: add testcase.ini
- https://gerrit.zephyrproject.org/r/10967 : samples: net: irc_bot: fix size_t related build warnings
- https://gerrit.zephyrproject.org/r/10959 : net: context: fix net context / net conn leak
- https://gerrit.zephyrproject.org/r/10958 : net: ip: change some error messages to NET_ERR
- https://gerrit.zephyrproject.org/r/10938 : rtc_qmsi: Call QMSI 1.4 context save/restore functions.
- https://gerrit.zephyrproject.org/r/10941 : aonpt_qmsi: Call QMSI 1.4 context save/restore functions.
- https://gerrit.zephyrproject.org/r/10940 : quark_se_ss: Remove enter_arc_state and use QMSI functions
- https://gerrit.zephyrproject.org/r/10723 : ext qmsi: Update to QMSI 1.4 RC2
- https://gerrit.zephyrproject.org/r/10908 : gpio: Error GPIO_INT with GPIO_DIR_OUT consistently.
- https://gerrit.zephyrproject.org/r/10909 : gpio: Error pin out of range consistently.
- https://gerrit.zephyrproject.org/r/10910 : gpio: Error unsupported access_op consistently.
- https://gerrit.zephyrproject.org/r/9449 : tests: add systhreads test case
- https://gerrit.zephyrproject.org/r/10888 : doc: add permalinks to document headings
- https://gerrit.zephyrproject.org/r/6586 : misc: Let the compiler choose whether to omit frame pointer
- https://gerrit.zephyrproject.org/r/10949 : checkpatch: Remove reference to legacy IP stack
- https://gerrit.zephyrproject.org/r/10911 : kernel: Remove redundant TICKLESS_IDLE_SUPPORTED option
- https://gerrit.zephyrproject.org/r/10849 : doc: rename doxygen configuration file and build from doc/
- https://gerrit.zephyrproject.org/r/9680 : quark_se: Fix restore info address
- https://gerrit.zephyrproject.org/r/9681 : quark_se: Add shared GDT in RAM memory to linker
- https://gerrit.zephyrproject.org/r/9682 : quark_d2000: Add shared GDT memory to linker
- https://gerrit.zephyrproject.org/r/9460 : Bluetooth: AVDTP: Add AVDTP Discover Function Definition
- https://gerrit.zephyrproject.org/r/10894 : doc/net: Add L2 and device driver document
- https://gerrit.zephyrproject.org/r/10841 : net/dns: Fix inline documentation
- https://gerrit.zephyrproject.org/r/10832 : mbedtls: add arduino 101 configuration to ssl client sample
- https://gerrit.zephyrproject.org/r/10904 : tests: remove old ARM vector table test
- https://gerrit.zephyrproject.org/r/10890 : xt-run: don't leave dead emulator processes lying around
- https://gerrit.zephyrproject.org/r/10874 : i2c: Implement consistent i2c no msgs behaviour
- https://gerrit.zephyrproject.org/r/10873 : i2c: Remove unused definition.
- https://gerrit.zephyrproject.org/r/10875 : i2c/stm32lx: Fix layout.
- https://gerrit.zephyrproject.org/r/10876 : i2c/dw: Switch from EPERM to EIO
- https://gerrit.zephyrproject.org/r/10877 : i2c: Name parameters consistently.
- https://gerrit.zephyrproject.org/r/10878 : i2c: Elaborate API return values
- https://gerrit.zephyrproject.org/r/10901 : xt-sim: add support for 'make debug'
- https://gerrit.zephyrproject.org/r/10919 : Xtensa port: Fixed memory corruption in interrupt handler exit function.
- https://gerrit.zephyrproject.org/r/10945 : Xtensa port: Fixed defintion of MAX_HEAP_SIZE, thus, compilation of new_lib.
|
|
Re: [net] net samples not working?
Jukka Rissanen
On Wed, 2017-02-08 at 12:39 +0100, Richard Peters wrote:
>> Sample: echo_server / echo_client
>> You are running out of buffers, please increase the TX/RX/DATA buf
> Maybe should this be patched in the out-of-the-box samples?

Sure, can you provide patches?

>> Could you provide such a conf file and send a patch?
> I hope so, have to fiddle around with this :)

Yes, creating tickets in Jira is a good idea in this case.
Cheers, Jukka
|
|
Re: Problems managing NBUF DATA pool in the networking stack
Luiz Augusto von Dentz
Hi Marcus,
On Wed, Feb 8, 2017 at 12:37 PM, Marcus Shawcroft <marcus.shawcroft@...> wrote:
> On 8 February 2017 at 07:04, Jukka Rissanen

While I agree we should prevent the remote from consuming all the buffers and possibly starving the TX, this is probably due to the echo_server design, which deep copies the buffers from RX to TX; in a normal application the RX would be processed and unrefed, causing the data buffers to return to the pool immediately. Even if we split the RX into a separate pool, any context can just ref the buffer, causing the RX to starve again, so at least in this aspect it seems to be a bug in the application; otherwise we will end up with each and every context having its own exclusive pool.

That said, it is perhaps not a bad idea to design an optional callback for the net_context to provide its own pools. We have something like that for L2CAP channels:

    /** Channel alloc_buf callback
     *
     * If this callback is provided the channel will use it to allocate
     * buffers to store incoming data.
     *
     * @param chan The channel requesting a buffer.
     *
     * @return Allocated buffer.
     */
    struct net_buf *(*alloc_buf)(struct bt_l2cap_chan *chan);

This is how we allocate net_buf from the IP stack, which has a much bigger MTU than Bluetooth, and that way we also avoid starving the Bluetooth RX pool when reassembling the segments. Actually, this most likely will be necessary in case there are protocols that need to implement their own fragmentation and reassembly, because in that case the lifetime of the buffers cannot be controlled directly by the stack.

>> The timeout to buffer API helps a bit but still we might run out of
> For incremental acquisition of further resources this doesn't help, it

--
Luiz Augusto von Dentz
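A rough sketch of how such an optional callback might be wired up for a context, falling back to the shared pool when the callback is not set. Everything here is hypothetical and simplified for illustration (struct buf, struct buf_pool, struct ctx, ctx_get_data_buf); it is not the net_context or net_buf API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a data buffer and a fixed-size pool. */
struct buf {
	int in_use;
};

struct buf_pool {
	struct buf *bufs;
	size_t count;
};

/* Take the first free buffer, or return NULL when the pool is empty. */
static struct buf *pool_get(struct buf_pool *pool)
{
	size_t i;

	for (i = 0; i < pool->count; i++) {
		if (!pool->bufs[i].in_use) {
			pool->bufs[i].in_use = 1;
			return &pool->bufs[i];
		}
	}
	return NULL;
}

/* Hypothetical context with an optional allocation callback. */
struct ctx {
	struct buf *(*alloc_buf)(struct ctx *ctx); /* may be NULL */
	struct buf_pool *private_pool;
};

/* Use the context's own allocator if provided, else the shared pool. */
static struct buf *ctx_get_data_buf(struct ctx *ctx, struct buf_pool *shared)
{
	assert(ctx != NULL && shared != NULL);

	if (ctx->alloc_buf != NULL) {
		return ctx->alloc_buf(ctx);
	}
	return pool_get(shared);
}

/* Example callback: allocate from the context's private pool. */
static struct buf *alloc_from_private(struct ctx *ctx)
{
	return pool_get(ctx->private_pool);
}
```

A context that sets the callback can exhaust only its own pool, so it cannot starve contexts that rely on the shared one.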
|
|
Re: [net] net samples not working?
Richard Peters <mail@...>
Hi,
the SLIP-TAP driver seems not to be working in qemu for cortex-m targets, too. I cannot get any network traffic through, either between two qemu_cortex_m3 targets or to the PC. Has anyone tried networking in qemu with the qemu_cortex_m3 target? I don't think that this is a configuration problem. The SLIP-TAP driver was disabled in the cortex_m3 defconfig. I enabled it, but no luck. It would be great to get some help on this!

Regards,
Richard
|
|
What is the code in ./zephyr/tests, and how does it differ from ./zephyr/samples/?
曹子龙
Dear all,

I am running some network tests on the arduino_due board. I found that both ./zephyr/tests and ./zephyr/samples/ contain some network test code. Judging from the names they seem related, but each one has its own main entry. So I want to know what the difference between them is, and when I should use the tests in the ./zephyr/tests directory. This is my first time working with Zephyr networking, so please bear with my mistakes. Thank you very much for your kind support.
|
|
Re: [net] net samples not working?
Richard Peters <mail@...>
> Sample: echo_server / echo_client
> You are running out of buffers, please increase the TX/RX/DATA buf

Maybe this should be patched in the out-of-the-box samples?

> Could you provide such a conf file and send a patch?

I hope so, I have to fiddle around with this :)

What is the workflow for the other issues? Should I create tickets in Jira?

Regards,
Richard
|
|
Re: [net] net samples not working?
Jukka Rissanen
Hi Richard,
On Wed, 2017-02-08 at 11:52 +0100, Richard Peters wrote:
> Hi Community,

You are running out of buffers, please increase the TX/RX/DATA buf count.

Could you provide such a conf file and send a patch?

Conf file is probably wrong and has wrong settings.

No idea about the zoap issues in the 3rd and 4th.

Ditto

Cheers,
Jukka
|
|
[net] net samples not working?
Richard Peters <mail@...>
Hi Community,
I have several problems with the net examples. Maybe there are some bugs involved?

1st Problem
===========
Sample: echo_server / echo_client

On 'make server' on server side and 'make client' on client side:
Some frames get sent and received successfully, and then my screen is flooded by these messages on the client side:

[net/buf] [ERR] net_buf_alloc_debug: net_nbuf_get_reserve():329: Failed to get free buffer

2nd Problem
===========
Sample: echo_server / echo_client

On 'make BOARD=qemu_cortex_m3 server' on server side and 'make BOARD=qemu_cortex_m3 client' on client side, on both sides:
It does not build, because the Makefile contains this line:

CONF_FILE ?= prj_$(BOARD).conf

But there is no prj_qemu_cortex_m3.conf.

When I try it with

make BOARD=qemu_cortex_m3 CONF_FILE=prj_slip.conf [server/client]

or

make BOARD=qemu_cortex_m3 CONF_FILE=prj_qemu_x86.conf [server/client]

then the communication between client and server does not work.

3rd Problem
===========
Sample: zoap_server, zoap_client

On 'make server' on server side and 'make client' on client side, I get on the server side:

[zoap-server] [ERR] udp_receive: Invalid data received (-22)

4th Problem
===========
Sample: zoap_server, zoap_client

On 'make BOARD=qemu_cortex_m3 server' on server side and 'make BOARD=qemu_cortex_m3 client' on client side, I get on the client side:

Unable add option to request.

5th Problem
===========
Sample: coaps_server, coaps_client

On 'make server' on server side and 'make client' on client side, I get on the server side:

failed! mbedtls_ssl_handshake returned -0x7900
Packet without payload
No handler for such request (-22)

6th Problem
===========
Sample: coaps_server, coaps_client

On 'make BOARD=qemu_cortex_m3 server' on server side and 'make BOARD=qemu_cortex_m3 client' on client side, on both sides:
Same as 2nd Problem.

Tested on:
- Ubuntu 16.04 (64 Bit)
- Zephyr rev: e2bbad9600d655287496e3434641576e39a6582b
- Toolchain: Zephyr-SDK 0.8.2
|
|
Re: Problems managing NBUF DATA pool in the networking stack
Marcus Shawcroft <marcus.shawcroft@...>
On 8 February 2017 at 07:04, Jukka Rissanen <jukka.rissanen@...> wrote:

Running out of resources is bad; deadlock, especially undetected deadlock, is worse. Avoiding the deadlock where the RX path starves the rest of the system of resources requires that the resources the RX path can consume are separate from the resources available to the TX path(s). Limiting resource consumption by the RX path is straightforward: buffers come from a fixed-size pool, and when the pool is empty we drop packets. Now we have a situation where RX cannot starve TX; we just need to ensure that multiple TX paths cannot deadlock each other.

Dealing with resource exhaustion on the TX side is harder. In a system with multiple TX paths, either there need to be sufficient TX resources that all TX paths can acquire what they need to proceed in parallel, or there need to be sufficient resources for any one path to make progress along with a mechanism to serialize those paths. The former solution is probably a non-starter for a small system because the number of buffers required is likely to be unreasonably large. The latter solution I think implies that no TX path can block waiting for resources unless it currently holds no resources, i.e. blocking to get a buffer is ok, blocking to extend a buffer or to get a second buffer is not ok.

> The timeout to buffer API helps a bit but still we might run out of

For incremental acquisition of further resources this doesn't help: it can't guarantee to prevent deadlock, and its use in the software stack makes reasoning about deadlock harder.

Cheers
/Marcus
|
|
Re: Problems managing NBUF DATA pool in the networking stack
Jukka Rissanen
Hi Piotr,
On Tue, 2017-02-07 at 17:56 +0100, Piotr Mienkowski wrote:
> Hi,

One option would be to split the DATA pool in two, so there is one pool for sending and one for receiving. Then again, this does not solve much, as you might still get into a situation where all the buffers are exhausted. The timeout to buffer API helps a bit but still we might run out of buffers. One should allocate as many buffers to the DATA pool as possible, but this really depends on the hw and applications of course.

> Thanks and regards,

Cheers,
Jukka
|
|
Re: Inconsistent and ever-changing device naming in Zephyr
Maureen Helm
Hi Paul,
This is my fault. I was not aware the port names were being used this way. I tend to use the config names rather than the strings themselves. For example, in boards/arm/frdm_k64f/Kconfig.defconfig:

    config FXOS8700_GPIO_NAME
        default GPIO_MCUX_PORTC_NAME

Please, please let me know if you encounter problems like this again.

Since you're most affected, mind making a proposal?
|
|
Problems managing NBUF DATA pool in the networking stack
Piotr Mienkowski
Hi,

There seems to be a conceptual issue in the way networking buffers are currently set up. I was thinking about entering a Jira bug report, but maybe it's just me missing some information or otherwise misunderstanding how the networking stack is supposed to be used. I'll briefly describe the problem here based on the Zephyr echo_server sample application.

Currently, if the echo_server application receives a large amount of data, e.g. when a large file is sent via ncat, the application will lock up and stop responding. The only way out is to reset the device. This problem is very easily observed with the eth_sam_gmac Ethernet driver and should be just as easy to spot with eth_mcux. Due to a different driver architecture it may be more difficult to observe with eth_enc28j60.

The problem is as follows. Via Kconfig we define the RX, TX and data buffer pools. Let's say like this:

    CONFIG_NET_NBUF_RX_COUNT=14
    CONFIG_NET_NBUF_TX_COUNT=14
    CONFIG_NET_NBUF_DATA_COUNT=72

The number of RX and TX buffers corresponds to the number of RX/TX frames which may be simultaneously received/sent. The data buffers count tells us how much storage we reserve for the actual data. This pool is shared between the RX and TX paths. If we receive a large amount of data, the RX path will consume all available data buffers, leaving none for the TX path. If an application then tries to reserve data buffers for the TX path, e.g. echo_server does it in the build_reply_buf() function, it will get stuck waiting forever for a free data buffer. The echo_server application gets stuck on the following line:

    frag = net_nbuf_get_data(context);

The simplified sequence of events in the echo_server application is as follows: receive RX frame -> reserve data buffers for TX frame -> copy data from RX frame to TX frame -> free resources associated with RX frame -> send TX frame.

One way to avoid it is to define a number of data buffers large enough that the RX path cannot exhaust the available data pool. The data buffer size is 128 bytes, as defined by the following Kconfig parameter:

    CONFIG_NET_NBUF_DATA_SIZE=128

and the maximum frame size is 1518 or 1536 bytes, so one RX frame can use up to 12 data buffers. In our example we would need to reserve more than 12*14 data buffers to ensure correct behavior. In the case of the eth_sam_gmac Ethernet driver, even more.

After recent updates to the networking stack, the functions reserving RX/TX/DATA buffers have a timeout parameter. That would prevent the lock up, but it still does not really solve the issue.

Is there a better way to manage this?

Thanks and regards,
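The sizing arithmetic above can be written out as a small worked example. The helper names below are made up for illustration; the numbers are the ones from the message (128-byte data buffers, frames of up to 1518 or 1536 bytes, 14 RX buffers, 72 data buffers).

```c
#include <assert.h>

/* Data buffers needed to hold one frame, rounding up. */
static unsigned int bufs_per_frame(unsigned int frame_size,
				   unsigned int data_size)
{
	assert(data_size > 0);
	return (frame_size + data_size - 1) / data_size;
}

/* Data buffers the RX path can hold if every RX frame is full size. */
static unsigned int rx_worst_case_bufs(unsigned int rx_count,
				       unsigned int frame_size,
				       unsigned int data_size)
{
	return rx_count * bufs_per_frame(frame_size, data_size);
}
```

With these values, bufs_per_frame(1518, 128) and bufs_per_frame(1536, 128) both give 12, and rx_worst_case_bufs(14, 1536, 128) gives 168, well above the 72 data buffers configured in the example, so the RX path alone can drain the shared pool.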
|
|
Re: CONFIG_SEMAPHORE_GROUPS
Benjamin Walsh <benjamin.walsh@...>
On Tue, Feb 07, 2017 at 11:42:22AM +0000, Shtivelman, Bella wrote:
> Hi,

OK, so it seems that, since you can disable the CONFIG_SEMAPHORE_GROUPS feature, you are _not_ using said feature.

In theory, this if statement should always be hit in handle_sem_group():

    if (!(thread->base.thread_state & _THREAD_DUMMY)) {
        /*
         * The awakened thread is a real thread and thus was not
         * involved in a semaphore group operation.
         */
        return 0;
    }

and handle_sem_group() should never do anything else, unless the thread_state of some thread got corrupted.

Can you verify that the thread->base.thread_state _THREAD_DUMMY bit is actually set when you get that crash?

Looking at it, I think I found a bug in handle_sem_group():

      if (desc->sem != sem) {
          sem_thread = CONTAINER_OF(desc, struct _sem_thread, desc);
          struct k_thread *dummy_thread =
              (struct k_thread *)&sem_thread->dummy;

          if (_is_thread_timeout_expired(dummy_thread)) {
    +         sys_dlist_remove(node);
    +         node = next;
              continue;
          }
          _abort_thread_timeout(dummy_thread);
          _unpend_thread(dummy_thread);

          sys_dlist_remove(node);
      }

If you _are_ using sem groups, I would suggest using the new k_poll() API instead: semaphore groups are part of the legacy API, and the implementation is pretty bad w.r.t. interrupt locking duration.

Regards,
Ben
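The general pattern behind the suggested fix (unlink the node and advance to the saved next pointer before the continue) can be shown on a toy list. This is a self-contained sketch with a hypothetical minimal circular list; it does not use the Zephyr sys_dlist API and is not the actual handle_sem_group() loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal circular doubly linked list with a sentinel head. */
struct node {
	struct node *prev;
	struct node *next;
	int expired;
};

static void list_init(struct node *head)
{
	head->prev = head;
	head->next = head;
}

static void list_append(struct node *head, struct node *n)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void list_remove(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = NULL;
	n->next = NULL;
}

/* Remove every expired node; returns how many were removed. */
static int remove_expired(struct node *head)
{
	struct node *n = head->next;
	int removed = 0;

	while (n != head) {
		/* Save the successor before the node can be unlinked. */
		struct node *next = n->next;

		if (n->expired) {
			list_remove(n);
			removed++;
			n = next;	/* advance before skipping the rest */
			continue;
		}

		/* Work on non-expired nodes would go here. */
		n = next;
	}
	return removed;
}
```

The point of the sketch is that every path through the loop body, including the early continue, both leaves the list consistent and moves on to the next node.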
|
|
Daily Gerrit Digest
donotreply@...
NEW within last 24 hours:
- https://gerrit.zephyrproject.org/r/10947 : samples/mbedtls_dtls_server: Use k_uptime_get_32()
- https://gerrit.zephyrproject.org/r/10946 : samples/mbedtls_dtls_client: Use k_uptime_get_32()
- https://gerrit.zephyrproject.org/r/10949 : checkpatch: Remove reference to legacy IP stack
- https://gerrit.zephyrproject.org/r/10918 : Xtensa port: Used plain C implementation for STACK_CANARY_INIT.
- https://gerrit.zephyrproject.org/r/10948 : samples/mbedtls_dtls_client: Fix wild write in entropy_source
- https://gerrit.zephyrproject.org/r/10945 : Xtensa port: Fixed defintion of MAX_HEAP_SIZE, thus, compilation of new_lib.
- https://gerrit.zephyrproject.org/r/10901 : xt-sim: add support for 'make debug'
- https://gerrit.zephyrproject.org/r/10910 : gpio: Error unsupported access_op consistently.
- https://gerrit.zephyrproject.org/r/10927 : tests/slist: Exercise CONTAINER macros
- https://gerrit.zephyrproject.org/r/10926 : slist: Introduce CONTAINER macros
- https://gerrit.zephyrproject.org/r/10896 : xtensa: cleanup fatal error handling
- https://gerrit.zephyrproject.org/r/10920 : kw41z: add base DTS support
- https://gerrit.zephyrproject.org/r/10919 : Xtensa port: Fixed memory corruption in interrupt handler exit function.
- https://gerrit.zephyrproject.org/r/10902 : eth/mcux: Add basic PHY support.
- https://gerrit.zephyrproject.org/r/10894 : doc/net: Add L2 and device driver document
- https://gerrit.zephyrproject.org/r/10886 : errno.h: Changed errno.h to make it safe to be included in assembly files.
- https://gerrit.zephyrproject.org/r/10913 : Add packer verify and merge jobs
- https://gerrit.zephyrproject.org/r/10912 : tests: kernel: import obj_tracing test to unified kernel
- https://gerrit.zephyrproject.org/r/10911 : kernel: Remove redundant TICKLESS_IDLE_SUPPORTED option
- https://gerrit.zephyrproject.org/r/10903 : sw_isr_table.h: clean up definition
- https://gerrit.zephyrproject.org/r/10909 : gpio: Error pin out of range consistently.
- https://gerrit.zephyrproject.org/r/10908 : gpio: Error GPIO_INT with GPIO_DIR_OUT consistently.
- https://gerrit.zephyrproject.org/r/10905 : toolchain: gcc.h: add indirection to _GENERIC_SECTION() macro
- https://gerrit.zephyrproject.org/r/10906 : section_tags.h: cleanup
- https://gerrit.zephyrproject.org/r/10904 : tests: remove old ARM vector table test
- https://gerrit.zephyrproject.org/r/10889 : make_zephyr_sdk.sh: Bump version to 0.9.1
- https://gerrit.zephyrproject.org/r/10890 : xt-run: don't leave dead emulator processes lying around
- https://gerrit.zephyrproject.org/r/10888 : doc: add permalinks to document headings
- https://gerrit.zephyrproject.org/r/10885 : sysgen: Fixed macro evaluation issue in kernel_main.c with compiling with xcc.
- https://gerrit.zephyrproject.org/r/10883 : boards: add panther board

UPDATED within last 24 hours:
- https://gerrit.zephyrproject.org/r/10859 : v2m_beetle: Add base DTS support
- https://gerrit.zephyrproject.org/r/10848 : tinycrypt: Fixed compilation error on Xtensa caused by bad definition of bool.
- https://gerrit.zephyrproject.org/r/10690 : tests: Introduced new config option to add extra stack size for tests.
- https://gerrit.zephyrproject.org/r/10881 : sensor/mma865x: Add driver for MMA865x 3 Axis Accelerometer Family
- https://gerrit.zephyrproject.org/r/10879 : sensor/mag3110: Add mag3110 three axis magnetometer driver.
- https://gerrit.zephyrproject.org/r/10720 : flash/stm32: driver for STM32F4x series
- https://gerrit.zephyrproject.org/r/10755 : flash/nrf5: fix invalid write access
- https://gerrit.zephyrproject.org/r/6720 : Bluetooth: A2DP: Stream End Point Registration
- https://gerrit.zephyrproject.org/r/10882 : bbc_microbit: Enable MMA8653
- https://gerrit.zephyrproject.org/r/10880 : bbc_microbit: Enable MAG3110
- https://gerrit.zephyrproject.org/r/7492 : Bluetooth: A2DP: Added Preset Structure
- https://gerrit.zephyrproject.org/r/6586 : misc: Let the compiler choose whether to omit frame pointer
- https://gerrit.zephyrproject.org/r/10853 : drivers: Add Atmel SAM RTC driver
- https://gerrit.zephyrproject.org/r/10860 : v2m_beetle: uart: Add DTS support to UART driver
- https://gerrit.zephyrproject.org/r/9460 : Bluetooth: AVDTP: Add AVDTP Discover Function Definition
- https://gerrit.zephyrproject.org/r/10807 : net/mqtt: Add BT support to MQTT publisher sample
- https://gerrit.zephyrproject.org/r/10628 : RFC: DTS: Kinetis: Add FRDM_K64F support
- https://gerrit.zephyrproject.org/r/9663 : Bluetooth: AVDTP: Add AVDTP Receive Function
- https://gerrit.zephyrproject.org/r/10806 : Bluetooth: AVDTP: Handling Discover response
- https://gerrit.zephyrproject.org/r/10670 : xtensa port: Added arch .ini file to support xt-sim
- https://gerrit.zephyrproject.org/r/10645 : Bluetooth: HFP HF: Handling AG Network error
- https://gerrit.zephyrproject.org/r/10834 : Xtensa port: Ensure tests exit when running in simulator.
- https://gerrit.zephyrproject.org/r/9449 : tests: add systhreads test case
- https://gerrit.zephyrproject.org/r/10858 : stm32: uart: Add DTS support to STM32 UART driver
- https://gerrit.zephyrproject.org/r/9820 : tests: add zephyr adc driver api test case
- https://gerrit.zephyrproject.org/r/10857 : nucleo_l476rg: Enable device tree usage for Nucleo
- https://gerrit.zephyrproject.org/r/10626 : RFC: DTS: Add base DTS and YAML definitions
- https://gerrit.zephyrproject.org/r/10854 : cc3200: dts: Add base DTS support for TI CC3200
- https://gerrit.zephyrproject.org/r/10627 : RFC: DTS: Kinetis: Add base support for Kinetis
- https://gerrit.zephyrproject.org/r/10856 : stm32: Add base DTS support for Nucleo board
- https://gerrit.zephyrproject.org/r/10855 : cc3200: Enable device tree usage for CC3200
- https://gerrit.zephyrproject.org/r/10629 : RFC: DTS: Kinetis: Add support for Hexiwear K64
- https://gerrit.zephyrproject.org/r/7738 : RFC: Add support for Device Tree
- https://gerrit.zephyrproject.org/r/10801 : board: add nucleo_l476rg documentation
- https://gerrit.zephyrproject.org/r/10718 : tests: dma: update dma loop transfer app
- https://gerrit.zephyrproject.org/r/10693 : tests: dma: update dma block transfer app
- https://gerrit.zephyrproject.org/r/10788 : xtensa: apply overlay to newlib
- https://gerrit.zephyrproject.org/r/10787 : xtensa: add recipes-devtools-xtensa
- https://gerrit.zephyrproject.org/r/10134 : drivers: QMSI PWM: simplify driver reentrancy code using IS_ENABLED macro
- https://gerrit.zephyrproject.org/r/9681 : quark_se: Add shared GDT in RAM memory to linker
- https://gerrit.zephyrproject.org/r/9682 : quark_d2000: Add shared GDT memory to linker
- https://gerrit.zephyrproject.org/r/10723 : DO NOT MERGE ext qmsi: update to QMSI 1.4
- https://gerrit.zephyrproject.org/r/9680 : quark_se: Fix restore info address
- https://gerrit.zephyrproject.org/r/10541 : drivers: dma_shim: update dma qmsi shim driver
- https://gerrit.zephyrproject.org/r/10877 :
i2c: Name parameters consistently. - https://gerrit.zephyrproject.org/r/10804 : net/mqtt: Add support for IBM BlueMix Watson topic format - https://gerrit.zephyrproject.org/r/10646 : hexiwear_k64: Add RST board documentation - https://gerrit.zephyrproject.org/r/9890 : sys_bitfield*(): use 'void *' instead of memaddr_t - https://gerrit.zephyrproject.org/r/10599 : arm: cmsis: Convert _ClearFaults to use direct CMSIS register access - https://gerrit.zephyrproject.org/r/10598 : arm: cmsis: Convert printing of MMFSR, BFSR, and UFSR to CMSIS - https://gerrit.zephyrproject.org/r/10596 : arm: cmsis: Convert _Scb*FaultIs* & _ScbIs*Fault to use CMSIS register access - https://gerrit.zephyrproject.org/r/7190 : build: Fix unconditional re-link of libzephyr.a - https://gerrit.zephyrproject.org/r/10862 : ARC: fix I2C SPI and GPIO default name issue for ARC - https://gerrit.zephyrproject.org/r/10725 : ext: qmsi: wait for AON periodic timer alarm's being cleared - https://gerrit.zephyrproject.org/r/10849 : doc: rename doxygen configuration file and build from doc/ - https://gerrit.zephyrproject.org/r/10874 : i2c: Implement consistent i2c no msgs behaviour - https://gerrit.zephyrproject.org/r/10875 : i2c/stm32lx: Fix layout. - https://gerrit.zephyrproject.org/r/10876 : i2c/dw: Switch from EPERM to EIO - https://gerrit.zephyrproject.org/r/10878 : i2c: Elaborate API return values - https://gerrit.zephyrproject.org/r/10873 : i2c: Remove unused definition. 
MERGED within last 24 hours: - https://gerrit.zephyrproject.org/r/10943 : net: nbuf: Fix net_nbuf_compact() API - https://gerrit.zephyrproject.org/r/10942 : net: nbuf: Remove unused net_nbuf_push() API - https://gerrit.zephyrproject.org/r/10925 : net: tests: 15.4: Increase max data size and fix config option - https://gerrit.zephyrproject.org/r/10937 : Bluetooth: L2CAP: Make l2cap_br_send_conn_rsp return void - https://gerrit.zephyrproject.org/r/10936 : Bluetooth: SDP: Make bt_sdp_create_pdu static - https://gerrit.zephyrproject.org/r/10934 : Bluetooth: Remove some dead code - https://gerrit.zephyrproject.org/r/10928 : net: arp: Fix the ethernet header location - https://gerrit.zephyrproject.org/r/10924 : net: fragment: Fix the 802.15.4 fragmentation - https://gerrit.zephyrproject.org/r/10935 : Bluetooth: L2CAP: Remove dead code - https://gerrit.zephyrproject.org/r/10930 : Bluetooth: Controller: Use LL_ASSERT instead of BT_ASSERT - https://gerrit.zephyrproject.org/r/10933 : Bluetooth: Use assert when getting net buf with K_FOREVER - https://gerrit.zephyrproject.org/r/10915 : net: nbuf: Add helper to print fragment chain - https://gerrit.zephyrproject.org/r/10916 : net: nbuf: Remove dead code in net_nbuf_compact() - https://gerrit.zephyrproject.org/r/10917 : net: nbuf: Fix double free in net_nbuf_compact() - https://gerrit.zephyrproject.org/r/10914 : Bluetooth: samples: Add missing README.rst files - https://gerrit.zephyrproject.org/r/10897 : REVERTME: disable xip test on xtensa - https://gerrit.zephyrproject.org/r/10899 : Xtensa port: Fixed compilation of cpp_synchronization legacy test. - https://gerrit.zephyrproject.org/r/10900 : Xtensa port: Implemented STACK_CANARY_INIT for Xtensa cores. - https://gerrit.zephyrproject.org/r/10895 : Xtensa port: Connect Xtensa timer to tick IRQ in legacy test_context. 
- https://gerrit.zephyrproject.org/r/10887 : doc: frdm_k64f: Document Eth PHY known issue - https://gerrit.zephyrproject.org/r/10863 : net: nbuf: Add timeout to net_buf getters - https://gerrit.zephyrproject.org/r/10861 : net: todo: Add CAN support entry - https://gerrit.zephyrproject.org/r/10837 : net: context: keep randomly assigned port for TCP bind() calls - https://gerrit.zephyrproject.org/r/10865 : drivers/net/ieee802154: Change configuration prefix - https://gerrit.zephyrproject.org/r/10867 : net: ipv6: Fix pending buf leak when NA is received - https://gerrit.zephyrproject.org/r/10868 : net: ipv6: Free received NA net_buf - https://gerrit.zephyrproject.org/r/10869 : drivers/eth/mcux: Free net_buf using net_nbuf_unref - https://gerrit.zephyrproject.org/r/10842 : Xtensa port: Set Swap function result to -EAGAIN. - https://gerrit.zephyrproject.org/r/10644 : Bluetooth: HFP HF: Remove unused variable 'buf' - https://gerrit.zephyrproject.org/r/10871 : Xtensa port: Removed unsupported c++ flags cuasing xt-c++ to throw an error. - https://gerrit.zephyrproject.org/r/10872 : Bluetooth: GATT: set subscribe value to zero for unsubscription - https://gerrit.zephyrproject.org/r/10835 : Xtensa port: xt*-run requires options to be passed before file to be ran. - https://gerrit.zephyrproject.org/r/10547 : Bluetooth: HFP HF: Disconnect rfcomm on SLC error - https://gerrit.zephyrproject.org/r/10798 : Bluetooth: GATT: introduce volatile subscription flag - https://gerrit.zephyrproject.org/r/10830 : Xtensa port: Moved coprocessor stack area on bottom of stack, just after TCS. - https://gerrit.zephyrproject.org/r/10870 : samples: net: Fix invalid memory access for TCP - https://gerrit.zephyrproject.org/r/10683 : drivers: mcr20a: control access to SPI with semaphore - https://gerrit.zephyrproject.org/r/10864 : drivers/ieee802154: Split drivers Kconfig
|
|
CONFIG_SEMAPHORE_GROUPS
Shtivelman, Bella <bella.shtivelman@...>
Hi,
I have a question regarding the impact of the CONFIG_SEMAPHORE_GROUPS flag. In my sample application I experienced weird failures in k_sem_give() when it is called from a UDP receive callback. After debugging, I found that the failure occurs in sem_give_common(), more precisely when evaluating this condition:

    if (handle_sem_group(sem, thread))

where handle_sem_group is defined as follows:

    #ifdef CONFIG_SEMAPHORE_GROUPS
    static int handle_sem_group(struct k_sem *sem, struct k_thread *thread)
    {
    …
    }
    #else
    #define handle_sem_group(sem, thread) 0
    #endif

Disassembly shows the following (the faulting instruction address is 0x0040f38c):

    0040f36c: ands.w r5, r3, #1
    0040f370: beq.n 0x40f42a <sem_give_common+226>
    177 list = (sys_dlist_t *)dummy->desc.thread->base.swap_data;
    0040f372: ldr r3, [r4, #48] ; 0x30
    0040f374: ldr.w r8, [r3, #12]
    132 return list->head == list;
    0040f378: ldr.w r4, [r8]
    145 return sys_dlist_is_empty(list) ? NULL : list->head;
    0040f37c: cmp r8, r4
    0040f37e: it eq
    0040f380: moveq r4, #0
    176 return (!node || node == list->tail) ? NULL : node->next;
    0040f382: cbz r4, 0x40f390 <sem_give_common+72>
    0040f384: ldr.w r3, [r8, #4]
    0040f388: cmp r3, r4
    0040f38a: beq.n 0x40f394 <sem_give_common+76>
    0040f38c: ldr r5, [r4, #0]
    0040f38e: b.n 0x40f396 <sem_give_common+78>
    0040f390: mov r5, r4
    0040f392: b.n 0x40f396 <sem_give_common+78>
    0040f394: movs r5, #0
    0040f396: ldr r3, [r4, #12]
    0040f398: cmp r6, r3
    0040f39a: beq.n 0x40f3c8 <sem_give_common+128>
    0040f39c: ldr.w r3, [r4, #-8]
    0040f3a0: adds r3, #2
    0040f3a2: beq.n 0x40f382 <sem_give_common+58>
    0040f3a4: sub.w r9, r4, #40 ; 0x28
    0040f3a8: sub.w r0, r4, #24
    0040f3ac: bl 0x40f2a0 <_abort_timeout>
    0040f3b0: mov r0, r9
    0040f3b2: bl 0x40f290 <sys_dlist_remove>

The default value of CONFIG_SEMAPHORE_GROUPS is “y”, and after setting it to “n” in my prj.conf the issue no longer occurs. However, I am not sure this is the correct way to handle it. Are you familiar with this issue? Am I missing anything here?
Regards, Bella.
|
|
Re: Design Philosophy 'Minimal runtime error checking'...
Chuck Jordan <Chuck.Jordan@...>
As Anas mentioned, I still think we need two levels of "asserts". There might be one level which, as you state, is there only to detect early user errors during development and can be turned off in production. But then I feel we should have a "fatal" or "panic" level of assert macro that covers unrecoverable errors in a way that can *not* be disabled *at all*. The reason for those is to detect memory corruption or any other unrecoverable situation in a way that provides information to the developers if it ever happens in the wild. This typically takes very little ROM space (it just injects an invalid instruction that triggers a fault, whose handler then recovers the PC, outputs or stores it somehow, and resets the IC). This is the approach that we (Nordic) have followed for unrecoverable BLE errors for years, and it has proven extremely valuable for finding rare conditions and getting information about them from customers, especially given the wide range of behavior that devices encounter in the wild from other peers.
Regards,
Carles

[Chuck Jordan] "injects an invalid instruction that triggers a fault" is an implementation detail. Actually, when you have power management in the mix, or when you have multi-core, there may be many things you need to do BEFORE triggering a reset when a fault happens. You may have to flush all files and close them, you may have to save state to non-volatile memories, you may have to signal other CPUs that you are about to reset, etc. It can be quite complicated depending on the topology of the hardware.
|
|