Hi Geoff,
While it is probably a good idea to look at research solutions, we actually need to figure out what does, and what doesn't, make sense for Zephyr. Pretty much any layer that requires a lot more memory, and that includes threads that require dedicated stacks, buffer pools, and complexity in general, is imo a big no-no for Zephyr. That said, net_buf, which is what nbuf uses, is based on the skb concept from Linux; the pools work a bit differently though, since we don't use dynamic memory allocation. So it is not that we haven't looked at any prior art, it is just that we don't have any plans for the queuing disciplines/network schedulers that you perhaps have in mind.
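For reference, a statically allocated net_buf pool looks roughly like this; the pool name, counts and sizes below are illustrative, not from the tree:

/* Sketch: a build-time net_buf pool, no heap allocation involved.
 * 16 buffers of 128 bytes each, no user data, no destroy callback.
 */
#include <net/buf.h>

NET_BUF_POOL_DEFINE(example_pool, 16, 128, 0, NULL);

static void example(void)
{
	/* Wait up to 100 ms for a free buffer */
	struct net_buf *buf = net_buf_alloc(&example_pool, K_MSEC(100));

	if (!buf) {
		return; /* pool exhausted */
	}

	/* ... fill and send the buffer ... */

	net_buf_unref(buf); /* return it to the pool */
}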
On Tue, Feb 14, 2017 at 5:24 PM, Geoff Thorpe <geoff.thorpe@nxp.com> wrote:

While I don't personally have answers to these buffer-management questions, I am certain they are well-studied, because they are intermingled with lots of other well-studied questions and use-cases that influence buffer handling, like flow control, QoS, order restoration, order preservation, bridging, forwarding, tunneling, VLANs, and so on. If I recall, the "obvious solutions" usually aren't - i.e. they're either not obvious or not (general) solutions. The buffer-handling change that remediates one problematic use-case usually causes some other equally valid use-case to degenerate.
I guess I'm just saying that we should find prior art and best practice, rather than trying to derive it from first principles and experimentation.
Do we already have in our midst anyone who has familiarity with NPUs, OpenDataPlane, etc? If not, I can put out some feelers.
Cheers,
Geoff
-----Original Message-----
From: zephyr-devel-bounces@lists.zephyrproject.org [mailto:zephyr-devel-bounces@lists.zephyrproject.org] On Behalf Of Jukka Rissanen
Sent: February-14-17 8:46 AM
To: Piotr Mieńkowski <piotr.mienkowski@gmail.com>; zephyr-devel@lists.zephyrproject.org
Subject: Re: [Zephyr-devel] Problems managing NBUF DATA pool in the networking stack
Hi Piotr,
On Tue, 2017-02-14 at 02:26 +0100, Piotr Mieńkowski wrote:
Hi,
While I agree we should prevent the remote end from consuming all the buffers and possibly starving the TX path, this is probably due to the echo_server design, which deep copies the buffers from RX to TX; in a normal application...

Indeed, the echo server could perhaps be optimized not to deep copy, thus removing the issue. The wider question here is whether or not we want a design rule that effectively states that all applications should consume and unref their RX buffers before attempting to allocate TX buffers. This may be convenient for some applications, but I'm not convinced that is always the case. Such a design rule effectively states that an application that needs to retain or process information from request to response must now have somewhere to store all of that information between buffers, and it rules out any form of incremental processing of an RX buffer interleaved with the construction of the TX message.

If you read the entire email it would be clearer that I did not suggest it was fine to rule out incremental processing; in fact I suggested adding pools per net_context, so that the stack itself will not have to drop its own packets and stop working because some context is taking all its buffers just to create clones.

So, what should be the final solution to the NBUF DATA issue? Do we want to redesign the echo_server sample application to use shallow copy? Should we introduce an NBUF DATA pool per context, or separate NBUF DATA pools for TX and RX? Something else?
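To make the deep vs. shallow distinction concrete, here is a rough sketch using the generic net_buf calls (the echo_server itself goes through the nbuf helpers; tx_pool and the omitted error handling are simplifications for illustration):

/* Deep copy: allocate a second buffer and duplicate the payload, so
 * RX and TX each pin their own data buffer until unref'd.
 */
struct net_buf *deep = net_buf_alloc(&tx_pool, K_FOREVER);

net_buf_add_mem(deep, rx->data, rx->len);

/* Shallow copy: take another reference on the RX buffer and hand that
 * to the TX path; no extra data buffer is consumed.
 */
struct net_buf *shallow = net_buf_ref(rx);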
In my opinion, enforcing too much granularity on the allocation of data buffers, i.e. having a separate nbuf data pool per context, and maybe another one for the networking stack, will not be optimal. Firstly, Kconfig would become even more complex and users would have a hard time figuring out a safe set of options. What if we know one context will not use many data buffers and another one will use a lot? Should we still assign the same number of data buffers per context? Secondly, every separate data pool will add some spare buffers as a margin of error. Thirdly, the Ethernet driver, which reserves data buffers for the RX path, has no notion of a context; it doesn't know which packets are meant for the networking stack and which ones for the application, so it would not know from which data pool to take the buffers. It can only distinguish between the RX and TX paths.
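To illustrate the granularity problem, per-context pools would mean one build-time reservation, plus margin, per context (names and sizes here are hypothetical):

/* One pool, and one safety margin, per context... */
NET_BUF_POOL_DEFINE(ctx0_data_pool, 8 + 2 /* margin */, 128, 0, NULL);
NET_BUF_POOL_DEFINE(ctx1_data_pool, 8 + 2 /* margin */, 128, 0, NULL);
/* ...and another pool, margin included, for every context we add,
 * even when one context could happily lend its idle buffers to
 * another.
 */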
In principle, having shared resources is not a bad design approach. However, we probably should have a way to guarantee a minimum amount of buffers for the TX path. As a software engineer, if I need to design a TX path in my networking application and I know that I have some fixed amount of data buffers available, I should be able to manage it. The same task becomes much more difficult if my fixed amount of data buffers can at any given moment drop to zero for reasons which are beyond my control. This is the case currently.

I agree that having too fine-grained a setup for the buffers is bad and should be avoided. The current setup of RX, TX and shared DATA buffers has worked quite well for UDP. For TCP the situation gets much more difficult, as TCP might hold on to an nbuf until an ack is received for the pending packet. The TCP code should not affect the other parts of the IP stack and starve them of buffers.
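One possible shape for such a guarantee, sketched with hypothetical names: a small TX-only pool that the RX path can never drain, tried before the shared pool:

NET_BUF_POOL_DEFINE(tx_reserved_pool, 4, 128, 0, NULL);

static struct net_buf *alloc_tx_data(struct net_buf_pool *shared)
{
	/* The reserved buffers are invisible to RX, so at least four
	 * TX buffers are always (eventually) available.
	 */
	struct net_buf *buf = net_buf_alloc(&tx_reserved_pool, K_NO_WAIT);

	if (!buf) {
		/* Fall back to the shared DATA pool when the
		 * reserved pool is momentarily empty.
		 */
		buf = net_buf_alloc(shared, K_NO_WAIT);
	}

	return buf;
}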
One option is to have a separate pool for TCP data nbufs that could be shared by all the TCP contexts. The TCP code could allocate the buffers that need to wait for an ack from this pool instead of from the global data pool. This would avoid allocating a separate pool for each context, which is sub-optimal for memory consumption.
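A rough sketch of that idea, with hypothetical names and sizes: segments waiting for an ack are cloned into a TCP-only pool, so pending TCP data can never starve the global DATA pool:

/* Shared by all TCP contexts; only unacked segments live here. */
NET_BUF_POOL_DEFINE(tcp_retransmit_pool, 16, 128, 0, NULL);

static struct net_buf *tcp_hold_for_ack(struct net_buf *seg)
{
	struct net_buf *clone =
		net_buf_alloc(&tcp_retransmit_pool, K_NO_WAIT);

	if (clone) {
		/* Keep a private copy until the ack arrives; the
		 * original segment's DATA buffer can be released
		 * right away.
		 */
		net_buf_add_mem(clone, seg->data, seg->len);
	}

	return clone;
}

static void tcp_ack_received(struct net_buf *clone)
{
	net_buf_unref(clone); /* back to tcp_retransmit_pool only */
}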
Cheers, Jukka
--
Luiz Augusto von Dentz