BSD Sockets in mainline, and how that affects design decisions for the rest of IP stack (e.g. send MTU handling)


Paul Sokolovsky
 

Hello,

Some six months ago, there was an RFC on this list to implement a BSD
Sockets API compatibility layer for Zephyr. The majority of that
functionality went into the 1.9 release (with some additional pieces
still landing).

Before and while working on sockets, a number of issues with the
native stack were discovered and documented, and solutions for some of
them were proposed. At that time the solutions were rather tentative
and experimental, and there was no consensus on how to resolve the
issues, so as a proof of concept they were implemented only in the
sockets layer.

An example is handling of the send MTU, originally
https://jira.zephyrproject.org/browse/ZEP-1998 , now
https://github.com/zephyrproject-rtos/zephyr/issues/3439 . The essence
of the issue is that the native networking API functions for creating
an outgoing packet don't control packet size in any way. It's easy to
create an oversized packet which will then fail during the actual send
operation.

The solution originally proposed was that the mentioned API functions
should take the MTU into account and not allow a user to add more data
than the MTU allows (also accounting for protocol headers). This
solution is rooted in the well-known POSIX semantics of "short
writes": an application can request an arbitrary amount of data to be
written, but the system is free to process less, based on system
resource availability. The amount of data actually processed is
returned, and the application is expected to retry the operation for
the remaining data. This was posted as
https://github.com/zephyrproject-rtos/zephyr/pull/119 . Again, at that
time there was no consensus on how to solve the issue, so it was
implemented only for the BSD Sockets API.
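
For illustration, here is what those semantics look like from the
application's side - a minimal sketch of the retry loop, written
against the plain POSIX socket API (the send_all() helper and its
arguments are made up for this example):

#include <sys/types.h>
#include <sys/socket.h>

/* Keep calling send() until all of 'len' bytes are accepted.
 * The stack may take fewer bytes than requested (a "short write"),
 * e.g. to fit the data into one MTU-sized packet; the caller just
 * retries with the remainder.
 */
ssize_t send_all(int sock, const void *buf, size_t len)
{
    const char *p = buf;
    size_t remaining = len;

    while (remaining > 0) {
        ssize_t out = send(sock, p, remaining, 0);

        if (out < 0) {
            return -1; /* a real error; errno is set */
        }

        /* Short write: only 'out' bytes were accepted. */
        p += out;
        remaining -= out;
    }

    return (ssize_t)len;
}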

Much later,
https://github.com/zephyrproject-rtos/zephyr/pull/1330 was posted. It
works in the following way: it allows an application to create an
oversized packet, but the stack then does a separate pass over it and
splits it into several packets of valid length. A comment it
immediately received (not from me) was that this patch just
duplicates, in an ad hoc way, the IP fragmentation support required by
the TCP/IP protocol suite.

I would like to raise an additional argument for why the
POSIX-inspired approach may be better. Consider a case where an
application wants to send a large amount of constant data, e.g. 900KB.
It could be a system with e.g. 1MB of flash and 64KB of RAM, with the
app sitting in ~100KB of flash and the rest holding constant data to
send. The "split oversized packet" approach wouldn't help here - the
app couldn't create an oversized packet of 900K in the first place, as
there's simply not enough RAM for it. So it would need to handle such
a case differently anyway. The POSIX-based approach, on the other
hand, handles it right away: any application needs to be prepared to
retry the operation until completion anyway, so the amount of data
doesn't matter.


That's the essence of the question this RFC poses: given that the
POSIX-based approach is already in mainline, does it make sense to go
for Zephyr-special, ad hoc solutions to the problem (and, as mentioned
at the beginning, there are more issues with a similar choice to
make)?

Answering "yes" may have interesting implications. For example, the
code in https://github.com/zephyrproject-rtos/zephyr/pull/1330 is
not needed for applications using BSD Sockets. There's at least another
issue solved on BSD Sockets level, but not on the native API. There's
an ongoing effort to separate kernel and userspace, and BSD Sockets
offer an automagic solution for that, while native API allows a user
app to access straight to the kernel networking buffer, so there's a
lot to solve there yet. Going like that, it may turn out that native
adhoc API, which initially was intended to small and efficient, will
grow bigger and more complex (== harder to stabilize, containing more
bugs) than something based on well tried and tested approach like POSIX.

So, it would be nice if the networking stack and overall Zephyr
architecture stakeholders considered both this particular issue and
the overall implications for the design/implementation. There are many
more details than presented above, and the devil is definitely in the
details; there's no absolutely "right" solution, it's a compromise. I
hope that Jukka and Tomasz, who are proponents of the second (GH-1330)
approach, can correct me on its benefits.


Thanks,
Paul

Linaro.org | Open source software for ARM SoCs
Follow Linaro: http://www.facebook.com/pages/Linaro
http://twitter.com/#!/linaroorg - http://www.linaro.org/linaro-blog


Jukka Rissanen
 

Hi,

On Tue, 2017-10-10 at 21:50 +0300, Paul Sokolovsky wrote:
> Hello,
>
> A solution originally proposed was that the mentioned API functions
> should take an MTU into account, and not allow a user to add more
> data than MTU allows (accounting also for protocol headers). This
> solution is rooted in the well-known POSIX semantics of "short
> writes" - an application can request an arbitrary amount of data to
> be written, but a system is free to process less data, based on
> system resource availability. Amount of processed data is returned,
> and an application is expected to retry the operation for the
> remaining data. It was posted as
> https://github.com/zephyrproject-rtos/zephyr/pull/119 . Again, at
> that time, there was no consensus about way to solve it, so it was
> implemented only for BSD Sockets API.

We can certainly implement something like this for the net_context
APIs. There is at least one issue with this: it is currently not easy
to pass information back to the application about how much data we are
able to send, so at the moment it would be all or nothing - either we
could send all the data or none of it.
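
(As a purely hypothetical sketch - no such API exists in Zephyr today,
and the name, signature and semantics below are made up for
illustration - reporting a partial send at the net_context level could
look like this:

#include <stddef.h>
#include <stdint.h>

struct net_context; /* opaque handle, as in Zephyr's net_context.h */

/* Hypothetical: queue up to 'len' bytes for sending. Like POSIX
 * send(), the stack is free to accept fewer bytes (e.g. one MTU's
 * worth); '*accepted' reports how many bytes were actually taken,
 * so the caller can retry with the remainder.
 */
int net_context_send_partial(struct net_context *context,
                             const uint8_t *data, size_t len,
                             size_t *accepted, int32_t timeout);
)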


> Much later,
> https://github.com/zephyrproject-rtos/zephyr/pull/1330 was posted. It
> works in following way: it allows an application to create an
> oversized packet, but a stack does a separate pass over it and splits
> this packet into several packets with a valid length. A comment
> immediately received (not by me) was that this patch just duplicates
> in an adhoc way IP fragmentation support as required by TCP/IP
> protocol.

Note that currently we do not have IPv4 fragmentation support
implemented, and IPv6 fragmentation is also disabled by default. The
reason is that fragmentation requires a lot of extra memory, which
might not be necessary in the usual cases. Splitting TCP segments
needs much less memory.


> I would like to raise an additional argument why the POSIX-inspired
> approach may be better.

I would say there is no better or worse approach here, just a
different point of view.


> Consider a case when an application wants to send a big amount of
> constant data, e.g. 900KB. It can be a system with e.g. 1MB of flash
> and 64KB of RAM, an app sitting in ~100KB of flash, the rest
> containing constant data to send. Following an "split oversized
> packet" approach wouldn't help - an app wouldn't be able to create an
> oversized packet of 900K - there's simply not enough RAM for it. So,
> it would need to handle such a case differently anyway.

Of course your application is constrained by the available memory and
other limits of your hw.

> But POSIX-based approach, would allow to handle it right away - any
> application need to be prepared to retry operation until completion
> anyway, the amount of data is not important.


> That's the essence of the question this RFC poses: given that the
> POSIX-based approach is already in the mainline, does it make sense
> to go for a Zephyr-special, adhoc solutions for a problem (and as
> mentioned at the beginning, there can be more issues with a similar
> choice).

Please note that the BSD socket API is fully optional and not always
available. You cannot rely on it being present, especially if you want
to minimize memory consumption. We need a more general solution
instead of something that is only available for BSD sockets.



Answering "yes" may have interesting implications. For example, the
code in https://github.com/zephyrproject-rtos/zephyr/pull/1330 is
not needed for applications using BSD Sockets. There's at least
another
issue solved on BSD Sockets level, but not on the native API. There's
an ongoing effort to separate kernel and userspace, and BSD Sockets
offer an automagic solution for that, while native API allows a user
app to access straight to the kernel networking buffer, so there's a
lot to solve there yet. Going like that, it may turn out that native
adhoc API, which initially was intended to small and efficient, will
grow bigger and more complex (== harder to stabilize, containing more
bugs) than something based on well tried and tested approach like
POSIX.
There has not been any public talk in mailing list about
userspace/kernel separation and how it affects IP stack etc. so it is a
bit difficult to say anything about this.



> So, it would be nice if the networking stack, and overall Zephyr
> architecture stakeholders consider both a particular issue and
> overall implications on the design/implementation. There're many more
> details than presented above, and the devil is definitely in details,
> there's no absolutely "right" solution, it's a compromise. I hope
> that Jukka and Tomasz, who are proponents of the second (GH-1330)
> approach can correct me on the benefits of it.

You are unnecessarily framing this as a for-or-against scenario. I
have an example application in
https://github.com/zephyrproject-rtos/zephyr/pull/980 that needs to
send a large (several kB) file to the outside world using HTTP, and I
am trying to solve that efficiently. The application will not use BSD
sockets.



> Thanks,
> Paul

Jukka


Nashif, Anas
 

Paul,

You gave very detailed background information and listed issues we had in the past, but it was not clear what you are proposing. We do have sockets already - are you suggesting we should move everything to use sockets? Is the socket interface ready for that?
Then there are the usual comments made whenever we discuss the IP stack regarding memory usage and footprint (here made by Jukka). Can we please quantify this and provide more data and context? For example, I would be interested in numbers showing how much more memory/flash we consume when sockets are used vs. the same implementation using low-level APIs. What is the penalty, and is it justifiable, given that using sockets would give us a more portable solution and would allow the random user/developer to implement protocols more easily?

So my request is to have a more detailed proposal, going into the history of this, how we can move forward from here, and what such a proposal would mean for existing code and protocols not using sockets...

Anas



Luiz Augusto von Dentz
 

Hi Anas,

On Wed, Oct 11, 2017 at 5:56 PM, Nashif, Anas <anas.nashif@intel.com> wrote:
> Paul,
>
> You gave very detailed background information and listed issues we had in the past but it was not clear what you are proposing, we do have sockets already, are you suggesting we should move everything to use sockets? Is the socket interface ready for this?
> Then there is the usual comments being made whenever we discuss the IP stack related to memory usage and footprint (here made by Jukka), can we please quantify this and provide more data and context? For example I would be interested in numbers showing how much more memory/flash do we consume when sockets are used vs the same implementation using low level APIs. What is the penalty and is it justifiable, given that using sockets would give us a more portable solution and would allow the random user/developer to implement protocols more easily.
Afaik a lot of RAM is spent on buffers, and if we can't do zero-copy,
that means at the very least one extra buffer has to exist to move
data around. Fine-tuning the buffer size is also tricky: small chunks
are preferred, but they take several more calls and copies into the
stack; on the other hand, bigger buffers may bump the memory footprint
but provide better latency. Btw, this sort of trade-off will only
multiply with the addition of kernel and userspace separation:
regardless of which layer it sits in, at some point the kernel will
have to copy data from userspace, in which case we may have not just
one copy per socket but 2 (socket->stack->driver), or perhaps 3 if the
driver uses a HAL not compatible with net_buf.

> So my request is to have a more details proposals with going into the history of this and how we can move forward from here and what such a proposal would mean to existing code and protocols not using sockets...
>
> Anas




--
Luiz Augusto von Dentz


Paul Sokolovsky
 

Hello,

On Wed, 11 Oct 2017 14:56:02 +0000
"Nashif, Anas" <anas.nashif@intel.com> wrote:

> Paul,
>
> You gave very detailed background information and listed issues we
> had in the past but it was not clear what you are proposing,
Yes, looking at Jukka's response, I must have failed miserably to
convey what I propose. I propose:

1. To reject the approach to send MTU handling in
https://github.com/zephyrproject-rtos/zephyr/pull/1330

2. To adopt the approach from
https://github.com/zephyrproject-rtos/zephyr/pull/119 , which however
may need further work to address the concerns raised against it.


> we do
> have sockets already, are you suggesting we should move everything to
> use sockets?
No, I don't suggest that (here).

> Is the socket interface ready for this? Then there is
> the usual comments being made whenever we discuss the IP stack
> related to memory usage and footprint (here made by Jukka), can we
> please quantify this and provide more data and context? For example I
> would be interested in numbers showing how much more memory/flash do
> we consume when sockets are used vs the same implementation using low
> level APIs.
To have such numbers, socket-based implementations of various
application-level protocols would first need to exist. They currently
don't, and I personally don't think it's a worthy investment of
effort, at least in the current state of affairs, when there are still
known issues in the underlying stack.

So, I'm left with just speculating that it's better to cross-adopt
approaches between the native API and the sockets API, instead of
letting them diverge. That was the point of my post.

> What is the penalty and is it justifiable, given that
> using sockets would give us a more portable solution and would allow
> the random user/developer to implement protocols more easily.
>
> So my request is to have a more details proposals with going into the
> history of this and how we can move forward from here and what such a
> proposal would mean to existing code and protocols not using
> sockets...
That's exactly what I tried to do - go through the history of the
question, with the relevant links. Hopefully the summary above
clarifies the essence of the proposal.

Thanks,
Paul




--
Best Regards,
Paul

Linaro.org | Open source software for ARM SoCs
Follow Linaro: http://www.facebook.com/pages/Linaro
http://twitter.com/#!/linaroorg - http://www.linaro.org/linaro-blog


Paul Sokolovsky
 

Hello Jukka,

On Wed, 11 Oct 2017 13:06:25 +0300
Jukka Rissanen <jukka.rissanen@linux.intel.com> wrote:

[]
> > A solution originally proposed was that the mentioned API functions
> > should take an MTU into account, and not allow a user to add more
> > data than MTU allows (accounting also for protocol headers). This
> > solution is rooted in the well-known POSIX semantics of "short
> > writes" - an application can request an arbitrary amount of data to
> > be written, but a system is free to process less data, based on
> > system resource availability. Amount of processed data is returned,
> > and an application is expected to retry the operation for the
> > remaining data. It was posted as
> > https://github.com/zephyrproject-rtos/zephyr/pull/119 . Again,
> > at that time, there was no consensus about way to solve it, so it
> > was implemented only for BSD Sockets API.
> We can certainly implement something like this for the net_context
> APIs. There is at least one issue with this as it is currently not
> easy to pass information to application how much data we are able to
> send, so currently it would be either that we could send all the data
> or none of it.
To clarify, there's no need "to pass information to the application"
per se. However, the IP stack itself has to know the size of the
packet headers at the time of packet creation. IIRC, that was the
concern raised against #119. So, I propose that we work towards
resolving that issue, and there are at least 2 approaches:

1. Actually make the stack work like that ("plan ahead"), which might
require some non-trivial refactoring.
2. Or just conservatively reserve the largest possible header size,
even if that may mean that some packets will carry less payload than
the maximum (see the sketch below).
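
A rough sketch of option 2 (the names and header sizes are
illustrative, not an existing Zephyr API): by always reserving the
worst-case header size, the payload budget is known up front, before
the packet is built:

#include <stddef.h>

#define MAX_IP_HDR  40  /* IPv6 base header, no extension headers */
#define MAX_TCP_HDR 60  /* TCP header including maximum options */

/* How much of 'requested' can go into one packet: a "short write"
 * style answer, computed against a conservative header reservation.
 */
static size_t max_payload(size_t mtu, size_t requested)
{
    size_t budget;

    if (mtu <= MAX_IP_HDR + MAX_TCP_HDR) {
        return 0; /* MTU too small for this conservative reserve */
    }
    budget = mtu - (MAX_IP_HDR + MAX_TCP_HDR);

    return requested < budget ? requested : budget;
}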

[]

> Note that currently we do not have IPv4 fragmentation support
> implemented, and IPv6 fragmentation is also disabled by default.
> Reason for this is that the fragmentation requires lot of extra
> memory to be used which might not be necessary in usual cases. Having
> TCP segments split needs much less memory.

Perhaps. But the POSIX "short write" approach would require ~zero
extra memory.

> > I would like to raise an additional argument why the POSIX-inspired
> > approach may be better.
> I would say there is no better or worse approach here. Just a
> different point of view.
>
> > Consider a case when an application wants to send a big amount of
> > constant data, e.g. 900KB. It can be a system with e.g. 1MB of
> > flash and 64KB of RAM, an app sitting in ~100KB of flash, the rest
> > containing constant data to send. Following an "split oversized
> > packet" approach wouldn't help - an app wouldn't be able to create
> > an oversized packet of 900K - there's simply not enough RAM for it.
> > So, it would need to handle such a case differently anyway.
> Of course your application is constrained by available memory and
> other limits by your hw.
I don't think this comment does justice to the usecase presented. Of
course any app is constrained by such limits, but an algorithm based
on the POSIX "short write" approach covers the wider set of usecases
in a simpler manner.

[]

> Please note that BSD socket API is fully optional and not always
> available. You cannot rely it to be present especially if you want to
> minimize memory consumption. We need more general solution instead of
> something that is only available for BSD sockets.

As clarified in my response to Anas, in no way do I propose the BSD
Sockets API as an alternative to the native API. However, I do propose
that some design/implementation choices from the BSD Sockets API be
adopted for the native API.

[]

> There has not been any public talk in mailing list about
> userspace/kernel separation and how it affects IP stack etc. so it is
> a bit difficult to say anything about this.

That's true, but we can and should think about how the stack may be
affected, or we'll be caught cold by those requirements and may come
up with random on-the-spot designs to address them, instead of
something well thought out.

[]

--
Best Regards,
Paul

Linaro.org | Open source software for ARM SoCs
Follow Linaro: http://www.facebook.com/pages/Linaro
http://twitter.com/#!/linaroorg - http://www.linaro.org/linaro-blog


Boie, Andrew P
 

> > There has not been any public talk in mailing list about
> > userspace/kernel separation and how it affects IP stack etc. so it
> > is a bit difficult to say anything about this.
> That's true, but we can/should think how it may be affected, or we'll
> be caught in the cold water with them, and may come up with random
> on-spot designs to address such future requirements, instead of
> something well thought out.
The userspace work has progressed to the point where we have enough confidence in the API design to open up the design discussion to a larger audience; until now enough things have been in flux (or uncertain) such that we've kept the discussion to the working group we established for it.

What we are trying to do is get something feature-complete into the tree for the upcoming 1.10 release, with partial test case coverage and initial documentation, on an 'experimental' basis; i.e. APIs and policies are subject to change. Then polish everything up for the 1.11 release, which would be the official debut of this feature.

I have to admit my knowledge of the network stack is quite poor, but broadly speaking, here is a set of recently drafted slides which goes into some detail about what sort of kernel objects are accessible from user threads and the restrictions they have. We expect to expose a subset of the existing kernel APIs to user threads, and all driver APIs which don't involve registration of callbacks. Please feel free to leave comments in the document, or on this list.

https://docs.google.com/presentation/d/195ciwFwv7s0MX4AvL0KFB1iHm1_gRoXmn54mjS5fki8/edit?usp=sharing

I suspect the biggest implication for the network stack is that it uses registration of callbacks heavily, and it's forbidden for user threads to directly register callbacks that run in supervisor mode. But you can get around this (for example) by having the callback do minimal processing of the incoming data and signal a semaphore to have a user-mode worker thread do the rest of the work. We are also looking into supporting user-mode workqueues. We also don't (yet) have a clear picture of what support for the k_poll APIs we will have in userspace.
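
A rough sketch of that callback-plus-worker pattern, using Zephyr's
k_sem primitive (the data types, names and the single-slot buffer are
made up for illustration; a real implementation would use a proper
queue in memory the user thread is allowed to access):

#include <zephyr.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct rx_slot {
    uint8_t data[64];
    size_t len;
};

/* Would live in a memory region the user-mode worker can read. */
static struct rx_slot rx_slot;

K_SEM_DEFINE(rx_sem, 0, 1);

/* Runs in supervisor mode (e.g. a driver RX callback): do the bare
 * minimum - copy the data out and signal the worker.
 */
void rx_callback(const uint8_t *data, size_t len)
{
    rx_slot.len = len > sizeof(rx_slot.data) ? sizeof(rx_slot.data) : len;
    memcpy(rx_slot.data, data, rx_slot.len);
    k_sem_give(&rx_sem);
}

/* Runs as a user-mode thread: semaphores are reachable from
 * userspace via system calls, so the worker can block here and do
 * the heavy processing outside supervisor mode.
 */
void rx_worker(void *p1, void *p2, void *p3)
{
    for (;;) {
        k_sem_take(&rx_sem, K_FOREVER);
        /* ... process rx_slot.data / rx_slot.len ... */
    }
}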

There's also the question of memory buffers: some care would need to be taken that any buffers used by the stack that are exposed to the application contain purely data and no internal data structures private to the kernel. This constraint is why we don't provide system call interfaces to the k_queue APIs.

Ideally in the fullness of time, we could migrate some parts of the network protocol stack to run in user mode, which I think would enhance the security of the system.

At the moment, current implementation effort is centered around getting our test cases running in user mode, and getting started on the formal documentation.

HTH,
Andrew


Paul Sokolovsky
 

Hello Andrew,

On Wed, 11 Oct 2017 18:53:24 +0000
"Boie, Andrew P" <andrew.p.boie@intel.com> wrote:

> > > There has not been any public talk in mailing list about
> > > userspace/kernel separation and how it affects IP stack etc. so
> > > it is a bit difficult to say anything about this.
> > That's true, but we can/should think how it may be affected, or
> > we'll be caught in the cold water with them, and may come up with
> > random on-spot designs to address such future requirements, instead
> > of something well thought out.
> The userspace work has progressed to the point where we have enough
> confidence in the API design to open up the design discussion to a
> larger audience; until now enough things have been in flux (or
> uncertain) such that we've kept the discussion to the working group
> we established for it.
>
> What we are trying to do is get something feature-complete into the
> tree for the upcoming 1.10 release, with partial test case coverage
> and initial documentation, on an 'experimental' basis; i.e. APIs and
> policies are subject to change. Then polish everything up for the
> 1.11 release, which would be the official debut of this feature.
>
> I have to admit my knowledge of the network stack is quite poor, but
> broadly speaking here are a set of slides recently drafted which goes
> into some detail about what sort of kernel objects are accessible
> from user threads and the sort of restrictions they have. We expect
> to expose a subset of existing kernel APIs to user threads, and all
> driver APIs which don't involve registration of callbacks. Please
> feel free to leave comments in the document, or on this list.
>
> https://docs.google.com/presentation/d/195ciwFwv7s0MX4AvL0KFB1iHm1_gRoXmn54mjS5fki8/edit?usp=sharing
Thanks for sharing these. What caught my attention is "No good way to
assert validity of k_mem_block object passed to k_mem_pool_free() -
Just tell userspace to use heap memory APIs in newlib! No need to
re-invent the C library..."

That's pretty much what I talked about: with new requirements and
challenges, we may find that a well-known and proven API like BSD
Sockets is a very good way to address them, instead of continuing to
add complexity to the existing ad hoc APIs.


> I suspect the biggest implication for the network stack is that it
> uses registration of callbacks heavily, and it's forbidden to allow
> user threads to directly register callbacks that run in supervisor
> mode. But you can get around this (for example) by having the
> callback do minimal processing of the incoming data and signal a
> semaphore to have a user mode worker thread do the rest of the work.
That's half of the work BSD Sockets already do - they put network
packets, as delivered via a callback, into a per-socket fifo.

> We are also looking into supporting user-mode workqueues. We also
> don't (yet) have a clear picture on what support for k_poll APIs we
> will have for userspace.
>
> There's also the question of memory buffers, there would need to be
> some care taken that any buffers used by the stack that are exposed
> to the application contain purely data and no internal data
> structures private to the kernel. This constraint is why we don't
> provide system call interfaces to k_queue APIs.
net_pkts and net_bufs as used by the native networking API share this
problem - they contain internal kernel data. Not only that, they are
also allocated from a pool, and are small objects which can't be
protected by an MPU or MMU individually. This means that one
application could access/corrupt the networking data of other apps.

And above you write about protecting the kernel from userspace, but is
there a requirement to protect one userspace entity (a thread in our
case, as we don't support processes) from another? I hope there is,
because it doesn't make much sense to go the long way of kernel vs
userspace separation and not think about task separation. Just imagine
that there could be one thread running OTA, and another thread running
an application-level 3rd-party lib. We don't want a vulnerability in
the latter to compromise the OTA process.

The solution to the problem is well known - don't try to export
kernel-level objects (like network buffers) to userspace; just copy
*data* there as needed. That's the 2nd part of what BSD Sockets do.

> Ideally in the fullness of time, we could migrate some parts of the
> network protocol stack to run in user mode, which I think would
> enhance the security of the system.
>
> At the moment, current implementation effort is centered around
> getting our test cases running in user mode, and getting started on
> the formal documentation.
>
> HTH,
> Andrew


--
Best Regards,
Paul

Linaro.org | Open source software for ARM SoCs
Follow Linaro: http://www.facebook.com/pages/Linaro
http://twitter.com/#!/linaroorg - http://www.linaro.org/linaro-blog


Boie, Andrew P
 

> That's pretty much what I talked about - that with new requirements
> and challenges, we may find out that a well-known and proven API like
> BSD Sockets is a very good way to address them, instead of continuing
> to add complexity to existing adhoc APIs.
What I stated is that when faced with an existing Zephyr API that works only in supervisor mode, rather than doing a parallel implementation in user mode, we should first see if the C library already provides a suitable equivalent.
I don't know if this is a valid comparison to whatever your agenda is with the BSD sockets API - my feeling is that it isn't - but at any rate I am strictly neutral on that matter. You need to talk to Jukka.

And above you write about protecting kernel from userspace, but is there a
requirement to protect one userspace entity (a thread in our case, as we don't
support processes) from another? I hope there's, because it doesn't make much
sense to go so long way of kernel vs userspace separation and don't think about
task separation. Just imagine that the could be a thread running OTA, and
another thread running an application level 3rd-party lib. We don't want
vulnerability in the latter to compromise OTA process.
You're not going to believe this, but we did actually consider this situation.
The memory domain APIs exist for exactly this purpose; user threads otherwise only have access to their own stack. They are in-tree and documented, although only working on ARM at the moment.

Andrew


Tomasz Bursztyka
 

Hi guys,

> It was
> posted as https://github.com/zephyrproject-rtos/zephyr/pull/119 .
> Again, at that time, there was no consensus about way to solve it,
> so it was implemented only for BSD Sockets API.
>
> Much later,
> https://github.com/zephyrproject-rtos/zephyr/pull/1330 was posted. It
> works in following way: it allows an application to create an
> oversized packet
(...)
> There're many more
> details than presented above, and the devil is definitely in details,
> there's no absolutely "right" solution, it's a compromise. I hope
> that Jukka and Tomasz, who are proponents of the second (GH-1330)
> approach can correct me on the benefits of it.
Actually I missed the fact that PR 1330 was about MTU handling. It does not sound generic enough.

In the end I don't approve of either of the proposed solutions. Let me explain why:

First, let's not rush on this MTU handling just yet, though it is much needed.
We first need this:

-> https://github.com/zephyrproject-rtos/zephyr/issues/3283

it will simplify a lot how packets are allocated. I haven't touched the MTU
stuff since I did the net_pkt move, because of this feature we'll use.

I foresee a lot of possible improvements once this issue is resolved: certainly
MTU handling and better memory management than the current frag model, but also
a better response to low memory (we could, after all, send a smaller-than-MTU
TCP segment asap if only a small amount of memory was available, and continue
with the rest, etc.).

Tomasz


Paul Sokolovsky
 

Hello Tomasz,

Thanks for responding and bringing this discussion back up - it got
backlogged (so I've been doing homework on it in the background).

On Wed, 25 Oct 2017 18:13:18 +0200
Tomasz Bursztyka <tomasz.bursztyka@linux.intel.com> wrote:

> Hi guys,
>
> > It was
> > posted as https://github.com/zephyrproject-rtos/zephyr/pull/119 .
> > Again, at that time, there was no consensus about way to solve it,
> > so it was implemented only for BSD Sockets API.
> >
> > Much later,
> > https://github.com/zephyrproject-rtos/zephyr/pull/1330 was posted.
> > It works in following way: it allows an application to create an
> > oversized packet
> (...)
>
> > There're many more
> > details than presented above, and the devil is definitely in
> > details, there's no absolutely "right" solution, it's a compromise.
> > I hope that Jukka and Tomasz, who are proponents of the second
> > (GH-1330) approach can correct me on the benefits of it.
> Actually I missed the fact PR 1330 was about MTU handling. Does not
> sound generic enough.

> In the end I don't approve both of the proposed solution.

That sounds fresh, thanks ;-)

> Let me explain why:
>
> First, let's not rush on this MTU handling just yet, though it is
> much needed. We first need this:
>
> -> https://github.com/zephyrproject-rtos/zephyr/issues/3283

Ack, that's a good thing to do...


> it will simplify a lot how packet are allocated. I haven't touched
> MTU stuff since I did the net_pkt move because of this feature we'll
> use.
>
> I foresee a lot of possible improvements with this issue resolved:
> certainly MTU handling, better memory management than current frag
> model, but also better response against low memory

... but I don't see how it directly relates to the topic of this RFC,
which is selecting a paradigm for dealing with the fact that we have
finite units of buffering, and how that should affect the user-facing
API design.

There's definitely a lot to improve and optimize in our IP stack, and
the issue you mention is one of them. But it's going to be just that -
an optimization. What we're discussing here is how to structure the
API:

1. Accept that the amount of buffering we can do is very finite, and
make applications aware of that and ready to handle it - the
POSIX-inspired way. If done that way, we can just use a network packet
as the buffering unit and further optimize that handling.

2. Keep pretending that we can buffer a mini-infinite amount of data.
It's mini-infinite because we still won't be able to buffer more than
RAM allows (actually, more than the TX slab allows), and that's still
too little, so it won't work for "real" amounts of data, which will
still need to fall back to the handling of p.1 above. Packet buffers
are still used for buffering, but looking at Jukka's implementation,
they are used as generic data buffers and require pretty heavy
post-processing - first splitting oversized buffers into
packet-friendly sizes (#1330), then stuffing protocol headers in front
(we already do that, and it's pretty awful and not zero-copy at all),
etc. And all of that happens with no free memory available - it was
already spent buffering that "mini-infinite" amount of data.


You also say that you don't like either of these choices. Well, there
are only so many ways to do it. What do you have in mind?


> (we could after
> all send asap a tinier than MTU TCP segment if there was only a small
> amount of memory available, and continue with the rest etc...).

That's how sockets already work - they ask for the user's data to be
added to a packet, and if less was added, they pass that info back to
the app (for it to retry). This whole discussion is about making that
available to the native API too (governed also by other constraints,
like the MTU size).


Tomasz


--
Best Regards,
Paul

Linaro.org | Open source software for ARM SoCs
Follow Linaro: http://www.facebook.com/pages/Linaro
http://twitter.com/#!/linaroorg - http://www.linaro.org/linaro-blog


Tomasz Bursztyka
 

On 26/10/2017 14:37, Paul Sokolovsky wrote:
> Hello Tomasz,
>
> Thanks for responding and bringing up this discussion - it got
> backlogged (so I'm doing homework on it in the background).
>
> On Wed, 25 Oct 2017 18:13:18 +0200
> Tomasz Bursztyka <tomasz.bursztyka@linux.intel.com> wrote:
>
> > Hi guys,
> >
> > > It was
> > > posted as https://github.com/zephyrproject-rtos/zephyr/pull/119 .
> > > Again, at that time, there was no consensus about way to solve
> > > it, so it was implemented only for BSD Sockets API.
> > >
> > > Much later,
> > > https://github.com/zephyrproject-rtos/zephyr/pull/1330 was
> > > posted. It works in following way: it allows an application to
> > > create an oversized packet
> > (...)
> >
> > > There're many more
> > > details than presented above, and the devil is definitely in
> > > details, there's no absolutely "right" solution, it's a
> > > compromise. I hope that Jukka and Tomasz, who are proponents of
> > > the second (GH-1330) approach can correct me on the benefits of
> > > it.
> > Actually I missed the fact PR 1330 was about MTU handling. Does not
> > sound generic enough.
> >
> > In the end I don't approve both of the proposed solution.
> That sounds fresh, thanks ;-)

> > Let me explain why:
> >
> > First, let's not rush on this MTU handling just yet, though it is
> > much needed. We first need this:
> >
> > -> https://github.com/zephyrproject-rtos/zephyr/issues/3283
> Ack, that's good thing to do...
>
> > it will simplify a lot how packet are allocated. I haven't touched
> > MTU stuff since I did the net_pkt move because of this feature
> > we'll use.
> >
> > I foresee a lot of possible improvements with this issue resolved:
> > certainly MTU handling, better memory management than current frag
> > model, but also better response against low memory
> ... but I don't see how it directly relates to the topic of this RFC,
> which is selecting paradigm to deal with the case that we have finite
> units of buffering, and how that should affect user-facing API
> design.
I was indeed only responding on the MTU handling (as both PRs do, in a way).

> There're definitely a lot to improve and optimize in our IP stack,
> and the issue you mention is one of them. But it's going to be just
> that - the optimization. But what we discuss is how to structure API:
>
> 1. Accept that the amount of buffering we can do is very finite, and
> make applications be aware of that and ready to handle - the POSIX
> inspired way. If done that way, we can just use a network packet as
> a buffering unit and further optimize that handling.
>
> 2. Keep pretending that we can buffer mini-infinite amount of data.
> It's mini-infinite because we still won't be able to buffer more than
> RAM allows (actually, more than TX slab allows), and that's still too
> little, so won't work for "real" amounts of data, which still will
> need to fall back to p.1 handling above. Packet buffers are still
> used for buffering, but looking at Jukka's implementation, they are
> used as generic data buffers, and require pretty heavy
> post-processing - first splitting oversized buffers into
> packet-friendly sizes (#1330), stuffing protocol headers in front (we
> already do that, and that's pretty awful and not zero-copy at all),
> etc. Again, all that happens with no free memory available - it was
> already spent to buffer that "mini-infinite" amount of data.
>
> You also say that you don't like any of these choices. Well, there're
> only so many ways to do. What do you have in mind?
As I am not using the user APIs at all, I can't tell which would be
the best. But the issue seems to be found mostly in the API usage.

With sockets you have one opaque type to access the network and read/write on it.
The data buffering is then split in two: the user manages his raw data in his own
buffers (and copies to/from them when sending/receiving), while the actual network
data (the encapsulated raw data) is handled behind the scenes by the socket on
send/recv. Both the sending and receiving logic are easy, as long as you have
enough memory for the user's buffers.

In Zephyr, both are found at once in net_pkt. The user does not have to create
his own buffers: he just picks from the net slabs, populates the buffer,
finalizes it, and done.

From a memory usage point of view, the latter is easier and more efficient -
as long as the net stack is doing it well, obviously, like properly handling
the MTU, etc. - but mostly on _tx_ only. On rx, as the data is scattered
across buffer fragments, the user has to add logic on his side to parse the
received data (the encapsulated one). Thus net_frag_read() and the derived
functions, which can be a bit complicated to grasp, I guess.
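
(For illustration, here is roughly what such RX-side logic amounts to - a
sketch that linearizes a fragmented packet's payload by walking the fragment
chain. This is not the actual net_frag_read() implementation, and it assumes
the net_pkt-era layout where pkt->frags points to a chain of net_buf
fragments:

#include <stdint.h>
#include <string.h>
#include <net/net_pkt.h>

/* Copy up to 'dst_len' payload bytes from a fragmented packet into a
 * flat, caller-provided buffer; returns the number of bytes copied.
 */
static size_t pkt_copy_payload(struct net_pkt *pkt,
                               uint8_t *dst, size_t dst_len)
{
    size_t copied = 0;
    struct net_buf *frag;

    for (frag = pkt->frags; frag && copied < dst_len;
         frag = frag->frags) {
        size_t chunk = frag->len;

        if (chunk > dst_len - copied) {
            chunk = dst_len - copied;
        }
        memcpy(dst + copied, frag->data, chunk);
        copied += chunk;
    }

    return copied;
}
)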

About the "mini-inifinite", that's up to the user to handle net_pkt_append() returned amount.
A bit like send() in socket. Though the difference here if of course data still needs to be send.


When I changed net_nbuf to net_pkt, my point in doing so was exactly that
net_pkt should represent one unique IP packet. From that, I would say
net_pkt_append() must not go over the MTU.

Note however that this MTU can be either the HW MTU or an IP-based one.
For instance, on 15.4/6lo you don't want to limit IPv6 packets to the HW MTU;
instead you use the IPv6 (minimum) MTU of 1280 bytes, which in turn is
fragmented through 6lo to fit into as many 15.4 frames as necessary. But when
not using 6lo, it would need to use the HW MTU only. (There is a possibility
to handle bigger IPv6 packets through IPv6 fragmentation; I can't remember
whether that generates as many net_pkt as necessary, or does it all in one,
which would go against the net_pkt usage.)

As I am not working with the user API, I don't have a good overview of how
it's supposed to work.

Well, maybe my blabbering can help you guys decide which way is best.

Tomasz


Paul Sokolovsky
 

Hello,

On Wed, 11 Oct 2017 13:06:25 +0300
Jukka Rissanen <jukka.rissanen@linux.intel.com> wrote:

[]

> You are unnecessarily creating this scenario about pro or against
> solution. I have an example application in
> https://github.com/zephyrproject-rtos/zephyr/pull/980 that needs to
> send large (several kb) file to outside world using HTTP, and I am
> trying to solve it efficiently. The application will not use BSD
> sockets.
So, this thread got backlogged somehow (it was bumped by Tomasz
yesterday), and I decided to approach it from the other side - to look
into your usecase (#980) and see how easy it would be to convert it to
rely on https://github.com/zephyrproject-rtos/zephyr/pull/119 instead.

Before getting to that, I'd like to mention another thing that
happened in the meantime:
https://github.com/zephyrproject-rtos/zephyr/pull/4402 got merged. It
actually uses the technique I proposed to send largish files (larger
than one network packet), and it passes a more or less non-trivial
load test (10000 iterations with Apache Bench). That's an improvement
over a few months ago, when it was easy to deadlock it with far fewer
iterations. So the solution can now be called "tried and tested", in a
sense. (There are still deadlocks happening (e.g. #4216) which we need
to investigate.)


Anyway, back to your
https://github.com/zephyrproject-rtos/zephyr/pull/980. Specifically, I
reviewed its commit
https://github.com/zephyrproject-rtos/zephyr/pull/980/commits/2092924326dc59eea16eb3327a385666431c39e7#diff-1118253502f0844ef016460a07db48df
"samples: net: rpl: Simple RPL border router application".

Looking through it, I got an idea why the socket sample is subject to
deadlocks (#4216) while other samples maybe aren't: they have comments
like:

+#if defined(CONFIG_NET_L2_BLUETOOTH)
+#error "TCP connections over Bluetooth need CONFIG_NET_CONTEXT_NET_PKT_POOL "\
+ "defined."
+#endif /* CONFIG_NET_L2_BLUETOOTH */

Instead of investigating the cause of the deadlocks, they work around them with:

+#if defined(CONFIG_NET_CONTEXT_NET_PKT_POOL)
+NET_PKT_TX_SLAB_DEFINE(http_srv_tx, 64);
+NET_PKT_DATA_POOL_DEFINE(http_srv_data, 64);

That's 8K with our default fragment size of 128 bytes.

But that's actually not what your sample app does - it doesn't define
CONFIG_NET_CONTEXT_NET_PKT_POOL. Instead it defines:

+CONFIG_NET_BUF_RX_COUNT=128
+CONFIG_NET_BUF_TX_COUNT=128

That's 16KB of RX and 16KB of TX buffers.


So, let's summarize:

Your application, with a 16KB send buffer and patch
https://github.com/zephyrproject-rtos/zephyr/pull/1330, can send files
of several KB in size. A few simple questions:

1. What happens if your app needs to send a file of 17KB?
2. What happens if there isn't 16KB available for send buffers, but
only 1-2K?

The answer is obvious: it won't work.


At the same time, my proposal is all about making an API which allows
any app to send 1MB (or larger) files with 1KB (or less) of buffers.


I agree with what you wrote - there are different ways to approach
problems and many ways to implement them. But we are designing an
embedded IP stack, constrained by hardware resources ("Zephyr runs in
8K"). It doesn't make sense to implement 2 solutions. We should choose
the one which covers more usecases with fewer resources.


Now, as a reminder, I started looking into this with the idea of
seeing how the sample app could be converted to the
short-write-and-retry approach. I found that it's not directly
possible at the level of the app, due to peculiarities of the HTTP API
used:
https://github.com/zephyrproject-rtos/zephyr/pull/980#pullrequestreview-71843524

I have shared my concerns about the existing HTTP API (e.g.
https://github.com/zephyrproject-rtos/zephyr/issues/3796), and my
concern that its rewrite doesn't solve enough issues. But all this
time, I treated the matter of the HTTP API exactly as "there are
different ways to do it, and one way shouldn't be much worse than
another". I'm afraid, though, that we have reached a point where the
design of the HTTP API affects the design of the IP stack, and not in
the right direction. I suggest we pause and try to rework it (the HTTP
API), even from the basics if needed, using ground requirements like
"relying on more buffering than the absolute bare minimum is a bad
thing".


I'm also pretty sad to come out with such a suggestion, because you
have a pretty cool and useful app on your hands, while I have just
some useless demo which has barely started to work. But I explained
the problem with it: your app works because it requires more resources
than needed, and thus it won't work as well on other hardware. And as
experience shows, every app so far has had various problems, so by
taking the time to rebase it on a more generic, simpler API, we can
solve many yet-to-be-exposed problems.


Thanks for your consideration.

[]

--
Best Regards,
Paul

Linaro.org | Open source software for ARM SoCs
Follow Linaro: http://www.facebook.com/pages/Linaro
http://twitter.com/#!/linaroorg - http://www.linaro.org/linaro-blog