Re: BSD Sockets in mainline, and how that affects design decisions for the rest of IP stack (e.g. send MTU handling)


Paul Sokolovsky
 

Hello Jukka,

On Wed, 11 Oct 2017 13:06:25 +0300
Jukka Rissanen <jukka.rissanen@linux.intel.com> wrote:

[]
A solution originally proposed was that the mentioned API functions
should take an MTU into account, and not allow a user to add more
data
than MTU allows (accounting also for protocol headers). This
solution is rooted in the well-known POSIX semantics of "short
writes" - an application can request an arbitrary amount of data to
be written, but
a system is free to process less data, based on system resource
availability. Amount of processed data is returned, and an
application
is expected to retry the operation for the remaining data. It was
posted as https://github.com/zephyrproject-rtos/zephyr/pull/119 .
Again,
at that time, there was no consensus about way to solve it, so it
was implemented only for BSD Sockets API.
We can certainly implement something like this for the net_context
APIs. There is at least one issue with this as it is currently not
easy to pass information to application how much data we are able to
send, so currently it would be either that we could send all the data
or none of it.
To clarify, there's no need "to pass information to application" per
se. However, the IP stack itself has to know the size of packet headers
at the time of packet creation. IIRC, that was the concern raised for
#119. So, I propose to work towards resolving that issue, and there're
at least 2 approaches:

1. To actually make the stack work like that ("plan ahead"), which might
require some non-trivial refactoring.
2. Or, just conservatively reserve the highest value for the header
size, even if that may mean that some packets will contain less
payload than maximum.

[]

Note that currently we do not have IPv4 fragmentation support
implemented, and IPv6 fragmentation is also disabled by default.
Reason for this is that the fragmentation requires lot of extra
memory to be used which might not be necessary in usual cases. Having
TCP segments split needs much less memory.
Perhaps. But POSIX "short write" approach would require ~ zero extra
memory.

I would like to raise an additional argument while POSIX-inspired
approach may be better.
I would say there is no better or worse approach here. Just a
different point of view.


Consider a case when an application wants to
send a big amount of constant data, e.g. 900KB. It can be a system
with e.g. 1MB of flash and 64KB of RAM, an app sitting in ~100KB
of flash, the rest containing constant data to send. Following an
"split oversized packet" approach wouldn't help - an app wouldn't be
able to create an oversized packet of 900K - there's simply not
enough
RAM for it. So, it would need to handle such a case differently
anyway.
Of course your application is constrained by available memory and
other limits by your hw.
I don't think this comment does good to the usecase presented. Of
course, any app is constrained by such, but the algorithm based on POSIX
"short write" approach allows to cover wider usecases in a simpler
manner.

[]

Please note that BSD socket API is fully optional and not always
available. You cannot rely it to be present especially if you want to
minimize memory consumption. We need more general solution instead of
something that is only available for BSD sockets.
As clarified in the response to Anas, in no way I propose BSD Sockets
API as an alternative to native API. However, I do propose some
design/implementation choices from BSD Sockets API to be adopted for
the native API.

[]

There has not been any public talk in mailing list about
userspace/kernel separation and how it affects IP stack etc. so it is
a bit difficult to say anything about this.
That's true, but we can/should think how it may be affected, or we'll
be caught in the cold water with them, and may come up with random
on-spot designs to address such future requirements, instead of
something well thought out.

[]

--
Best Regards,
Paul

Linaro.org | Open source software for ARM SoCs
Follow Linaro: http://www.facebook.com/pages/Linaro
http://twitter.com/#!/linaroorg - http://www.linaro.org/linaro-blog

Join devel@lists.zephyrproject.org to automatically receive all group messages.