Networking stack - Ethernet driver design


Piotr Mieńkowski <piotr.mienkowski at gmail.com...>
 

Hi all,

I have a few questions/discussion points related to the new networking
stack in context of Ethernet driver development.

1. Currently Ethernet drivers are by default initialized before the
networking stack (as set by CONFIG_ETH_INIT_PRIORITY). That's going
to be problematic for a zero-copy implementation of an Ethernet
driver. Such a driver will need to reserve, during the initialization
phase, a set of net data buffers where the incoming packets can be
stored. Later, when the complete frame is received, these buffers
will be passed to the higher layer. However, the net buffer pool is
initialized by the networking stack, so reserving net buffers is only
possible after the networking stack has been initialized. That
implies that the Ethernet driver should be initialized after the
networking stack. Secondly, even in the case of the more typical
implementation of the Ethernet driver, the one which has its own set
of RX/TX buffers and copies data between them and net buffers, as
currently done in Zephyr, the driver will start working immediately
after being initialized. If it receives a frame just at that moment,
it will try to pass it to the higher layer even if the rest of the
networking stack has not yet been initialized. Once again, that
implies that the Ethernet driver should be initialized after the
networking stack.
2. Modern Ethernet modules in most SoC devices have the ability to
generate IP, TCP and UDP checksums in hardware. Is it possible to
tell the networking stack not to compute these checksums in software?
3. One of the parameters to NET_DEVICE_INIT is MTU. Shouldn't we have a
set of predefined constants provided by the networking stack for
some typical interfaces, e.g. NET_MTU_ETHERNET set to 1500? Should
we configure the MTU value in Kconfig?
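
A minimal sketch of what point 3 could look like, assuming a shared header with a predefined constant plus a build-time override. Only the name NET_MTU_ETHERNET comes from the message; CONFIG_NET_MTU_OVERRIDE and net_mtu_select() are hypothetical names used for illustration and are not existing Zephyr symbols.

```c
#include <stdint.h>

/* Name taken from the suggestion above; 1500 bytes is the standard
 * Ethernet payload size. */
#define NET_MTU_ETHERNET 1500U

/* Hypothetical build-time override standing in for a Kconfig symbol.
 * 0 means "use the per-link default". */
#ifndef CONFIG_NET_MTU_OVERRIDE
#define CONFIG_NET_MTU_OVERRIDE 0U
#endif

/* Hypothetical helper: choose the MTU a driver would pass when it
 * registers its interface. */
uint16_t net_mtu_select(uint16_t link_default)
{
	if (CONFIG_NET_MTU_OVERRIDE != 0U) {
		return (uint16_t)CONFIG_NET_MTU_OVERRIDE;
	}
	return link_default;
}
```

With the override left at 0, a driver calling net_mtu_select(NET_MTU_ETHERNET) would get the link default back unchanged.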

Regards,
Piotr


Chuck Jordan <Chuck.Jordan@...>
 

From: Piotr Mienkowski [mailto:piotr.mienkowski(a)gmail.com]
Sent: Wednesday, January 04, 2017 9:25 AM
To: devel(a)lists.zephyrproject.org
Subject: [devel] Networking stack - Ethernet driver design


Hi all,

I have a few questions/discussion points related to the new networking stack in context of Ethernet driver development.

1. Currently Ethernet drivers are by default initialized before the networking stack (as set by CONFIG_ETH_INIT_PRIORITY). That's going to be problematic for a zero-copy implementation of Ethernet driver. Such driver will need to reserve during the initialization phase a set of net data buffers where the incoming packets can be stored. Later when the complete frame is received these buffers will be passed to the higher layer. However, the net buffer pool is initialized by the networking stack so reserving net buffers is only possible after networking stack was initialized. That implies that the Ethernet driver should be initialized after the networking stack. Secondly, even in case of the more typical implementation of the Ethernet driver, the one which has its own set of RX/TX buffers and copies data between them and net buffers, as currently done in Zephyr, the driver will start working immediately after being initialized. If it receives a frame just at that moment it will try to pass it to the higher layer even if the rest of the networking stack was not yet initialized. Once again that implies that the Ethernet driver should be initialized after the networking stack is.
[chuckj] I would think if the Enet driver comes alive FIRST, and receives something, but no protocol stack has attached itself yet, that it could simply DROP the packet. ENET allows you to drop packets anywhere, anytime. So I could imagine the ENET driver coming alive first, doing its own initialization, acquiring its own buffers. Further, it should be agnostic as to which protocol stack attaches to it. There are protocols that have something OTHER than TCP/IP over ENET. For example, ATM packets over ENET, or any other encapsulation. Although, admittedly, this is very unlikely to appear with Zephyr.
So one way to design this is that the DRIVER owns the buffers, and that the buffers must always be returned to the driver when the protocol stack has consumed the packet. The API the protocol stack uses is a call to read a packet, getting a new buffer, and then another API to return the buffer later – to keep it zero copy. Sending would first have the application acquire a buffer, then fill it with a packet, then transmit it – with an attempt to utilize scatter-gather techniques to avoid memory copying (i.e. payload is not copied).
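
The driver-owned buffer scheme described above could be sketched roughly as follows. This is an illustration under stated assumptions, not Zephyr's net_buf API: every name here (eth_drv, eth_buf_acquire, eth_buf_release, eth_attach, eth_rx_complete, example_stack_rx) is hypothetical. It shows the two ideas from the reply: frames are dropped while no protocol stack is attached, and a buffer handed to the stack must be returned to the driver's pool.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_RX_BUF_COUNT 4U
#define ETH_RX_BUF_SIZE  1536U

/* One receive buffer, owned by the driver's pool. */
struct eth_buf {
	uint8_t data[ETH_RX_BUF_SIZE];
	size_t len;
	bool in_use;
};

/* Callback a protocol stack registers to receive frames. */
typedef void (*eth_rx_cb_t)(struct eth_buf *buf);

struct eth_drv {
	struct eth_buf pool[ETH_RX_BUF_COUNT];
	eth_rx_cb_t rx_cb;     /* NULL until a stack attaches */
	unsigned int dropped;  /* frames dropped with no stack attached */
};

/* Take a free buffer from the pool, or NULL if all are in use. */
struct eth_buf *eth_buf_acquire(struct eth_drv *drv)
{
	for (size_t i = 0; i < ETH_RX_BUF_COUNT; i++) {
		if (!drv->pool[i].in_use) {
			drv->pool[i].in_use = true;
			drv->pool[i].len = 0;
			return &drv->pool[i];
		}
	}
	return NULL;
}

/* Return a buffer to the pool once its consumer is done with it. */
void eth_buf_release(struct eth_buf *buf)
{
	buf->in_use = false;
}

void eth_attach(struct eth_drv *drv, eth_rx_cb_t cb)
{
	drv->rx_cb = cb;
}

/* Called when the hardware has filled 'buf' with a complete frame. */
void eth_rx_complete(struct eth_drv *drv, struct eth_buf *buf, size_t len)
{
	if (drv->rx_cb == NULL) {
		/* No stack attached yet: drop the frame, recycle the buffer. */
		drv->dropped++;
		eth_buf_release(buf);
		return;
	}
	buf->len = len;
	drv->rx_cb(buf); /* the stack must call eth_buf_release() later */
}

/* Example consumer: counts payload bytes, then returns the buffer. */
size_t example_bytes_seen;

void example_stack_rx(struct eth_buf *buf)
{
	example_bytes_seen += buf->len;
	eth_buf_release(buf);
}
```

A real driver would also need locking or interrupt masking around the pool, and the transmit path with scatter-gather is left out here.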

2. Modern Ethernet modules in most of the SoC devices have ability to generate IP, TCP and UDP checksums in hardware. Is it possible to tell networking stack not to compute these checksums in software?
[chuckj] This would be another configurable thing, I guess. Also the ENDIAN swap. The 2-byte GAP between the ENET header and the IP header can also be a problem if the hardware requires 32-bit alignment. There are a lot of little things like this that some hardware can handle and some hardware cannot. I suppose all of these things should be configurable.
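
The alignment point can be illustrated with a little arithmetic. An Ethernet header is 14 bytes long (two 6-byte MAC addresses plus a 2-byte EtherType), so if a frame starts on a 32-bit boundary, the IP header that follows does not. The sketch below assumes the common workaround of reserving 2 bytes in front of the frame; the reply does not say where the gap sits, so that placement, and the names ip_hdr_offset and is_word_aligned, are assumptions made for illustration.

```c
#include <stddef.h>

#define ETH_HDR_LEN  14U /* dst MAC (6) + src MAC (6) + EtherType (2) */
#define IP_ALIGN_PAD 2U  /* assumed padding placed before the frame */

/* Offset of the IP header when the frame starts at 'frame_offset'
 * within a 32-bit aligned buffer. */
size_t ip_hdr_offset(size_t frame_offset)
{
	return frame_offset + ETH_HDR_LEN;
}

/* Nonzero if 'offset' falls on a 32-bit boundary. */
int is_word_aligned(size_t offset)
{
	return (offset % 4U) == 0U;
}
```

With no padding, the IP header sits at offset 14, which is not a multiple of 4. With the 2-byte pad it moves to offset 16, which is.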

3. One of the parameters to NET_DEVICE_INIT is MTU. Shouldn't we have a set of predefined constants provided by the networking stack for some typical interfaces. E.g. NET_MTU_ETHERNET set to 1500? Should we configure the MTU value in Kconfig?
[chuckj] Most of the time MTU 1500 is probably just fine. Where I have seen people set this bigger is when streaming video (or something else) requires packets that are a lot bigger for efficiency. So YES, having this configurable might be a good idea, although on low-end Zephyr devices, with modest memory sizes and a modest MAC/PHY, it's probably unlikely to be set big.
When TCP-offload devices are used, I would think there is a large set of things to configure.
Regards,
Piotr


Tomasz Bursztyka
 

Hi Piotr,

1. Currently Ethernet drivers are by default initialized before the
networking stack (as set by CONFIG_ETH_INIT_PRIORITY). That's
going to be problematic for a zero-copy implementation of Ethernet
driver. Such driver will need to reserve during the initialization
phase a set of net data buffers where the incoming packets can be
stored. Later when the complete frame is received these buffers
will be passed to the higher layer. However, the net buffer pool
is initialized by the networking stack so reserving net buffers is
only possible after networking stack was initialized. That implies
that the Ethernet driver should be initialized after the
networking stack. Secondly, even in case of the more typical
implementation of the Ethernet driver, the one which has its own
set of RX/TX buffers and copies data between them and net buffers,
as currently done in Zephyr, the driver will start working
immediately after being initialized. If it receives a frame just
at that moment it will try to pass it to the higher layer even if
the rest of the networking stack was not yet initialized. Once
again that implies that the Ethernet driver should be initialized
after the networking stack is.
You are mixing different issues here: your own driver design and known
current limitations in net stack.
I will address your driver design issue in your patches (however, apply
the style comments first).

Currently, all net interfaces are up and running as soon as they are
initialized. This was done like that because it was just easier to move
on with more important things.
Work is being done to modify this behavior (up/down iface state,
etc...). Drivers will always be initialized before the network stack;
there is currently no design that requires the other way round, even
yours.

2. Modern Ethernet modules in most of the SoC devices have ability to
generate IP, TCP and UDP checksums in hardware. Is it possible to
tell networking stack not to compute these checksums in software?
For now, no. There is no generic CRC API that could - depending on
configuration and hardware - either call a software routine or run it on
hardware directly.
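
A generic checksum API of the kind described here might look roughly like the sketch below. It is an illustration only, not an existing Zephyr interface: net_csum_ops, net_csum and net_csum_sw are hypothetical names. The software fallback is the Internet checksum from RFC 1071 (a ones' complement sum rather than a CRC), which is what the IP, TCP and UDP headers use.

```c
#include <stddef.h>
#include <stdint.h>

/* Signature shared by the software routine and any hardware hook. */
typedef uint16_t (*net_csum_fn_t)(const uint8_t *data, size_t len);

/* Software Internet checksum (RFC 1071): ones' complement of the
 * ones' complement sum of the data taken as big-endian 16-bit words.
 * The 32-bit accumulator is sufficient for inputs up to 64 KiB. */
uint16_t net_csum_sw(const uint8_t *data, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2) {
		sum += ((uint32_t)data[i] << 8) | data[i + 1];
	}
	if (i < len) {
		/* Odd trailing byte is padded with a zero low byte. */
		sum += (uint32_t)data[i] << 8;
	}
	while (sum >> 16) {
		/* Fold the carries back into the low 16 bits. */
		sum = (sum & 0xFFFFU) + (sum >> 16);
	}
	return (uint16_t)~sum;
}

/* Per-interface operations; 'compute' stays NULL when the hardware
 * offers no checksum offload. */
struct net_csum_ops {
	net_csum_fn_t compute;
};

/* Use the hardware hook when one is registered, else software. */
uint16_t net_csum(const struct net_csum_ops *ops,
		  const uint8_t *data, size_t len)
{
	if (ops != NULL && ops->compute != NULL) {
		return ops->compute(data, len);
	}
	return net_csum_sw(data, len);
}
```

Hardware that inserts the checksum itself during transmission would need a different hook, for example a per-interface flag telling the stack to skip the computation, which this sketch does not cover.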

3. One of the parameters to NET_DEVICE_INIT is MTU. Shouldn't we have
a set of predefined constants provided by the networking stack for
some typical interfaces. E.g. NET_MTU_ETHERNET set to 1500? Should
we configure the MTU value in Kconfig?
Don't bother with that. MTU is not taken into account in the network
stack; this is a big, known, crappy limitation.
There is ongoing work to fix it properly.

Br,

Tomasz


Piotr Mieńkowski <piotr.mienkowski at gmail.com...>
 

Hi Thomas and Chuck,

Thanks for the feedback. Regarding comments from Thomas:


You are mixing different issues here: your own driver design and known
current limitations in net stack.
Currently, all net interface are up and running as soon as they are
initialized. This was done like that because it was just easier to
move on with more important things.
Work is being done to modify this behavior (up/down iface state,
etc...). Drivers will always be initialized before the network stack
Just to make sure I'm not misunderstood: I was not implying that
initializing the Ethernet driver before the networking stack is a bad
idea, only that it does not play well with the current design. If the
architecture is being worked on and some things will change, there is
no issue.

Where can I find the list of known current limitations of the networking
stack? Is it documented?

2. Modern Ethernet modules in most of the SoC devices have ability
to generate IP, TCP and UDP checksums in hardware. Is it possible
to tell networking stack not to compute these checksums in software?
For now, no. There is no generic CRC API that could - depending on
configuration and hardware - either call a software routine or run it
on hardware directly.
Shall I create a Jira issue so this is not forgotten?

Regards,
Piotr


Tomasz Bursztyka
 

Hi Piotr,

Where can I find the list of known current limitations of the
networking stack? Is it documented?
No. We have a TODO which is more like a 1:1 with Jira tickets, but
that's all.


2. Modern Ethernet modules in most of the SoC devices have ability
to generate IP, TCP and UDP checksums in hardware. Is it
possible to tell networking stack not to compute these checksums
in software?
For now, no. There is no generic CRC API that could - depending on
configuration and hardware - either call a software routine or run it
on hardware directly.
Shall I create a Jira issue so this is not forgotten?
Definitely, yes, go ahead.


Tomasz