Re: Problems managing NBUF DATA pool in the networking stack

Luiz Augusto von Dentz

Hi Geoff,

While it is probably a good idea to look at well-researched solutions,
we actually need to work out what makes sense, and what doesn't make
sense, for Zephyr. Pretty much any layer that requires a lot more
memory, and that includes threads that require a dedicated stack, buffer
pools, and complexity in general, is IMO a big no-no for Zephyr. That
said, net_buf, which is what nbuf uses, is based on the skb
concept from Linux. The pools work a bit differently, though, since we
don't use memory allocation. It is not that we haven't looked at any
prior art; it is just that we don't have any plans for the queuing
disciplines/network schedulers that perhaps you have in mind.

On Tue, Feb 14, 2017 at 5:24 PM, Geoff Thorpe <> wrote:
While I don't personally have answers to these buffer-management questions, I am certain they are well-studied, because they are intermingled with lots of other well-studied questions and use-cases that influence buffer handling, like flow-control, QoS, order-restoration, order-preservation, bridging, forwarding, tunneling, VLANs, and so on. If I recall, the "obvious solutions" usually aren't - i.e. they're either not obvious or not (general) solutions. The buffer-handling change to remediate one problematic use-case usually causes some other equally valid use-case to degenerate.

I guess I'm just saying that we should find prior art and best practice, rather than trying to derive it from first principles and experimentation.

Do we already have in our midst anyone who has familiarity with NPUs, OpenDataPlane, etc? If not, I can put out some feelers.


-----Original Message-----
From: [] On Behalf Of Jukka Rissanen
Sent: February-14-17 8:46 AM
To: Piotr Mieńkowski <>;
Subject: Re: [Zephyr-devel] Problems managing NBUF DATA pool in the networking stack

Hi Piotr,

On Tue, 2017-02-14 at 02:26 +0100, Piotr Mieńkowski wrote:
While I agree we should prevent the remote to consume all the
and possible starve the TX, this is probably due to echo_server
that deep copies the buffers from RX to TX, in a normal
Indeed the echo server could perhaps be optimized not to deep
thus removing the issue. The wider question here is whether or
not we
want a design rule that effectively states that all applications
should consume and unref their rx buffers before attempting to
allocate tx buffers. This may be convenient for some
but I'm not convinced that is always the case. Such a design
effectively states that an application that needs to retain or
information from request to response must now have somewhere to
all of that information between buffers and rules out any form of
incremental processing of an rx buffer interleaved with the
construction of the tx message.
If you read the entire email it would be clearer that I did not
suggest it was fine to rule out incremental processing, in fact I
suggested to add pools per net_context that way the stack itself
not have to drop its own packets and stop working because some
is taking all its buffers just to create clones.
So, what should be the final solution to the NBUF DATA issue? Do we
want to redesign the echo_server sample application to use shallow
copies, introduce an NBUF DATA pool per context, or have separate NBUF
DATA pools for TX and RX? Something else?
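
To make the shallow-copy option concrete, here is a minimal sketch of why it relieves pressure on the data pool. It is not the Zephyr nbuf/net_buf API: `struct data_buf`, `buf_alloc()`, `buf_ref()`, `buf_unref()`, `buf_deep_copy()` and `pool_free_count()` are hypothetical names used only for illustration, and the pool is assumed to be a small static array. A deep copy takes a second buffer from the pool, while a shallow copy only bumps a reference count on the buffer that already holds the data.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define POOL_COUNT 2   /* illustrative pool size */
#define DATA_SIZE  64  /* illustrative buffer size */

/* Hypothetical reference-counted data buffer; ref == 0 means "free". */
struct data_buf {
    int ref;
    size_t len;
    unsigned char data[DATA_SIZE];
};

static struct data_buf pool[POOL_COUNT];

/* Take a free buffer from the static pool, or NULL if none is left. */
static struct data_buf *buf_alloc(void)
{
    for (size_t i = 0; i < POOL_COUNT; i++) {
        if (pool[i].ref == 0) {
            pool[i].ref = 1;
            pool[i].len = 0;
            return &pool[i];
        }
    }
    return NULL;
}

/* Shallow copy: share the same buffer, no extra pool entry is used. */
static struct data_buf *buf_ref(struct data_buf *b)
{
    assert(b->ref > 0);
    b->ref++;
    return b;
}

/* Drop one reference; the buffer becomes free when the count hits 0. */
static void buf_unref(struct data_buf *b)
{
    assert(b->ref > 0);
    b->ref--;
}

/* Deep copy: needs a second buffer from the pool and may fail. */
static struct data_buf *buf_deep_copy(const struct data_buf *src)
{
    struct data_buf *dst = buf_alloc();

    if (dst != NULL) {
        memcpy(dst->data, src->data, src->len);
        dst->len = src->len;
    }
    return dst;
}

/* Number of buffers currently free in the pool. */
static size_t pool_free_count(void)
{
    size_t n = 0;

    for (size_t i = 0; i < POOL_COUNT; i++) {
        if (pool[i].ref == 0) {
            n++;
        }
    }
    return n;
}
```

In this sketch an echo path that calls `buf_ref()` on the RX buffer leaves the rest of the pool available, whereas `buf_deep_copy()` can fail as soon as the remote side has filled the pool with RX data.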

In my opinion, enforcing too much granularity on the allocation of data
buffers, i.e. having a separate nbuf data pool per context and maybe
another one for the networking stack, will not be optimal. Firstly,
Kconfig would become even more complex and users would have a hard time
figuring out a safe set of options. What if we know that one context
will not use many data buffers and another one will use a lot? Should
we still assign the same number of data buffers per context? Secondly,
every separate data pool will add some spare buffers as a margin of
error. Thirdly, the Ethernet driver, which reserves data buffers for
the RX path, has no notion of context: it doesn't know which packets
are meant for the networking stack and which for the application. It
would not know which data pool to take the buffers from. It can only
distinguish between the RX and TX paths.

In principle, having shared resources is not a bad design approach.
However, we probably should have a way to guarantee a minimum number
of buffers for the TX path. As a software engineer, if I need to
design a TX path in my networking application and I know that I have
some fixed number of data buffers available, I should be able to
manage it. The same task becomes much more difficult if my fixed
number of data buffers can at any given moment become zero for
reasons which are beyond my control. This is the case currently.
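
One possible way to keep a single shared pool while still guaranteeing a minimum for TX is sketched below. This is only an illustration of the accounting, not the Zephyr nbuf API: `struct shared_pool`, `shared_pool_take()` and `shared_pool_give()` are hypothetical names, and the sketch assumes `tx_reserved` is no larger than `total`. RX is refused a buffer whenever granting it would eat into the part of the reservation that TX has not claimed yet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical accounting for one shared data pool; not the Zephyr nbuf API. */
enum buf_dir { BUF_RX, BUF_TX };

struct shared_pool {
    size_t total;       /* buffers in the pool */
    size_t tx_reserved; /* minimum number TX can always obtain */
    size_t used[2];     /* buffers currently held, indexed by buf_dir */
};

/* Try to take one buffer for the given direction; true on success. */
static bool shared_pool_take(struct shared_pool *p, enum buf_dir dir)
{
    size_t free_bufs = p->total - p->used[BUF_RX] - p->used[BUF_TX];
    /* Part of the TX reservation that TX has not claimed yet. */
    size_t tx_unmet = p->tx_reserved > p->used[BUF_TX]
                          ? p->tx_reserved - p->used[BUF_TX]
                          : 0;

    if (free_bufs == 0) {
        return false;
    }
    /* RX must leave the unclaimed TX reservation untouched. */
    if (dir == BUF_RX && free_bufs <= tx_unmet) {
        return false;
    }
    p->used[dir]++;
    return true;
}

/* Return one buffer previously taken for the given direction. */
static void shared_pool_give(struct shared_pool *p, enum buf_dir dir)
{
    assert(p->used[dir] > 0);
    p->used[dir]--;
}
```

With `total = 4` and `tx_reserved = 2`, RX can hold at most two buffers while TX is idle, so the TX path always knows it can get its two buffers no matter what the remote side sends.
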
I agree that having too fine-grained a setup for the buffers is bad and
should be avoided. The current setup of RX, TX and shared DATA buffers
has worked quite well for UDP. For TCP the situation gets much more
difficult, as TCP might hold an nbuf for a while until an ack is
received for the pending packets. The TCP code should not affect the
other parts of the IP stack or starve them of buffers.

One option is to have a separate pool for TCP data nbufs that could be
shared by all the TCP contexts. The TCP code could allocate all the
buffers that need to wait for an ack from this pool instead of the
global data pool. This would avoid allocating a separate pool for each
context, which is sub-optimal for memory consumption.
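
A minimal sketch of that idea, assuming two statically sized free-list pools, might look as follows. None of these names come from Zephyr: `struct pool`, `struct buf`, `pool_init()`, `pool_get()`, `buf_put()`, `data_alloc()` and the pool sizes are hypothetical and chosen only to show that TCP buffers waiting for an ack cannot drain the global data pool.

```c
#include <assert.h>
#include <stddef.h>

#define BUF_SIZE 128 /* illustrative buffer size */

struct pool;

/* Hypothetical data buffer that remembers which pool it came from. */
struct buf {
    struct buf *next;
    struct pool *owner;
    unsigned char data[BUF_SIZE];
};

/* Fixed-size pool kept as a singly linked free list. */
struct pool {
    struct buf *free_list;
    size_t free_count;
};

/* Put all n entries of the static storage array on the pool's free list. */
static void pool_init(struct pool *p, struct buf *storage, size_t n)
{
    p->free_list = NULL;
    p->free_count = 0;
    for (size_t i = 0; i < n; i++) {
        storage[i].owner = p;
        storage[i].next = p->free_list;
        p->free_list = &storage[i];
        p->free_count++;
    }
}

/* Take one buffer from the pool, or NULL if the pool is empty. */
static struct buf *pool_get(struct pool *p)
{
    struct buf *b = p->free_list;

    if (b != NULL) {
        p->free_list = b->next;
        p->free_count--;
        b->next = NULL;
    }
    return b;
}

/* Return a buffer to whichever pool it was allocated from. */
static void buf_put(struct buf *b)
{
    assert(b != NULL && b->owner != NULL);
    b->next = b->owner->free_list;
    b->owner->free_list = b;
    b->owner->free_count++;
}

enum proto { PROTO_UDP, PROTO_TCP };

static struct buf global_storage[4]; /* illustrative sizes */
static struct buf tcp_storage[2];
static struct pool global_pool;
static struct pool tcp_pool;

static void pools_setup(void)
{
    pool_init(&global_pool, global_storage, 4);
    pool_init(&tcp_pool, tcp_storage, 2);
}

/* All TCP contexts share tcp_pool; everything else uses global_pool. */
static struct buf *data_alloc(enum proto proto)
{
    return pool_get(proto == PROTO_TCP ? &tcp_pool : &global_pool);
}
```

When the TCP pool runs dry in this sketch, further TCP allocations fail, but UDP and the rest of the stack still find the global pool at full size.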


Zephyr-devel mailing list
Luiz Augusto von Dentz
