Re: dhcp integration into the platform


Piotr Mienkowski
 

On 24.03.2017 12:07, Luiz Augusto von Dentz wrote:
Hi Marcus,

On Fri, Mar 24, 2017 at 11:59 AM, Marcus Shawcroft
<marcus.shawcroft@gmail.com> wrote:
Hi,


On 23 March 2017 at 19:26, Luiz Augusto von Dentz <luiz.dentz@gmail.com> wrote:
Hi Marcus,

On Thu, Mar 23, 2017 at 6:51 PM, Marcus Shawcroft
<marcus.shawcroft@gmail.com> wrote:
Hi,

The network interface patches proposed as a result of this thread have
generated a fair amount of discussion both in patch reviews and in
IRC. Now would seem like a good time to summarize where we are and
pull together some of the various discussion points that have been
raised.

Current status:

A bunch of preparatory patches to dhcpv4 have been merged. Notable changes:
- Public interface now provides net_dhcpv4_start(iface) and
net_dhcpv4_stop(iface).
- Various initialization issues that would prevent concurrent dhcpv4
operation on multiple ifaces are resolved.
- dhcpv4 will now remove leased resources from the network stack on
lease lapse/release.

There is one more small but significant dhcpv4 patch outstanding that
catches L2 up/down events and kicks the appropriate dhcpv4 machinery
per interface. This patch is currently blocked pending necessary
support in net_if (see below). Once this patch is in place an
application will be able to start (and stop dhcpv4) per interface as
now. Once started dhcpv4 will catch L2 up/down events and acquire,
renew and release leases as required, Eventually the responsibility
to call call net_dhcpv4_start/stop() may be moved from an application
to a 'connection manager'... but that is for the future.

The 'net_if' patches are in their third iteration and have generated
by far the most discussion.

The objective of the net_if patches is to arrange for L2 up/down
network management events to be raised when a functional L2 iface
becomes available for use, or conversely becomes unavailable. These
events can then be caught by dhcpv4 in order for dhcpv4 to manage
IP/L3 configuration.

In the current net_if implementation there are two significant
functions: net_if_up and net_if_down(). These functions call the
underlying L2 enable() callback, set and clear the net_if
NET_IF_ENABLED flag and raise NET_EVENT_IF_UP/DOWN network management
events.

After re-reading various comments and discussion on the existing patch
set I've come to the conclusion that there are two different world
views of the conceptual purpose of net_if_up() and net_if_down().

View 1:
net_if_up/down provide an interface for a higher/management layer to
communicate downwards and mark an iface as enabled or disabled
(irrespective of the state of the underlying L2)

This world view is supported by:
- these functions call down to the enable() callback in the underlying
L2 (ie they direcly call L2 telling it whether to enable or disable).
- in the absence of a connection manager the network stack hardwires a
call to net_if_up() for every iface at system boot (net_if_post_init).

View 2:
net_if_up/down provide an interface for an underlying L2 to
communicate upwards that an iface as up/working.

This world view is supported by:
- the bluetooth stack calls net_if_up/down on ipsp connect/disconnect
- the net_if_up/down terminology suggests this behaviour (as opposed
to being explicitly called enable/disable)

Conceptually there are four APIs here: enable/disable and up/down.
The former two provide a management interface that allows a higher
layer to requested that an iface is enabled or disabled, likely called
by a connection manager or equivalent. The latter two allow the stack
below the iface to report upwards whether or not an enabled iface
actually has a link up or not.

The l2 enable callback conceptually belongs with the enable/disable
interface. The network management event up/down signalling
conceptually belongs with the up/down interface.

In the current tree I think we have a slightly odd merge of the two
concepts where some code treats net_if_up/down() as if they implement
enable/disable semantics, while other code treats
net_if_up()/net_if_down() as if they implement up/down semantics.
Notably we have the network stack initialization code hardwiring
net_if_up() on all interfaces and we have L2 enable hung on
net_if_up/down() both of these behaviours associated with an
enable/disable semantic yet we also have the BT stack using
net_if_up/down() as a notification mechanism that L2 is up/down. (It
appears to me that for an iface associated with BT, the iface will be
up'd at system boot and then re-up'd on ipsp connect )
Can anyone add some insight on this issue? The code looks to me as if
net_if_up/down() were originally added to provide an enable/disable
semantic, but we have BT using them in what appears to be an up/down
semantic. In the context of BT ifaces' , why do we have net_if_up()
being called twice on the iface. Initially at system boot and then
again on ipsp_connect ?
Enable/disable was added as an interface to net_if_up and net_if_down,
but since these functions are only supposed to be called from the L2
driver we could actually remove the enable/disable interface since it
is just calling the L2 driver which is what is calling net_if_up in
the first place.
I guess we end up with this design because we wanted net_if_up to turn
the NET_IF_UP during the init procedure, that way drivers that don't
have link detection, those that don't implement enable callback, just
bypass it. Is this that hard to figure out from the code?

Various points that have come up in discussion (please correct me if I
misrepresent or miss some point of view):

1) Should we have enable/disable. The general view seems to be that
we don't have a solid use case for enable/disable therefore we should
not have them.

2) BT advertise should be disabled if a network interface is
disabled(). IMO this is actually the use case that suggests we should
keep enable/disable.

3) Should we have 1 or 2 net_if flags. The general view seems to be
that we should have only 1, I think in practice this is driven by
whether we keep or remove an enable/disable API.

4) Physical interfaces should not power up power down as a result of
L2 enable/disable, that should all be handled via a separate power
management API.


There are (at least) two ways forward:

1) We drop the enable/disable semantic. This implies:
- We remove the L2 enable() callback completely.
- We remove hardwired net_if_up() calls when the network stack boots.
- Every network device needs to call net_if_up() once it has a link
established (and net_if_down when it drops).
This may be racy since the net_if_post_init may not have been called
yet which means the RX and TX thread may not be ready.

- BT stays as it is (advertise hardwired on)
Note that the enable/disable semantic was introduced for L2 link
detection,
which is why it is an L2/LL API. Now from the discussion we
had in the IRC what we seem to be really missing is a L3/IP interface
to tell when that layer is available so the application can start
sending packets. We did agree that we need to make samples that react
to L3/IP being up not L2/LL which should probably remain just to start
the procedure to acquire IP address, etc, so by given this option it
either means we did not understand each other or you did not agree
after all the discussions we had.
I agree, there is a need for a mechanism for applications to
synchronize with L3. However, that is not related to the issue of how
dhcp synchronizes with L2, hence I've deliberately not addressed it in
this thread. That can wait until we have L2 sorted.
dhcp shall subscribe for NET_EVENT_IF_UP, this is in fact already used
in some sample, which should probably be changed to wait for an IP
event instead. Perhaps you want to change the semantic of
NET_EVENT_IF_UP
to signal when both L2 and L3 are up, that is of course possible but
then we need a new event to signal L2 is up, e.g: NET_EVENT_L2_UP,
dhcp would subscribe to that and later emit NET_EVENT_IF_UP when IP is
up, that is all possible but it doesn't have anything to do with
l2->enable/disable as these callbacks are for L2 as the name suggests.

2) We keep the enable/disable semantic. This implies:
- We split net_if_up/down into net_if_enable/disable() and
net_if_up/down() such that net_if_enable calls l2->enable() while
net_if_up/down deals with NET_IF_UP and raising net_event_if_up/down
- The hardwired net_if_up() calls at network stack boot are switched
to call net_if_enable()
- The BT L2 enable callback is used to turn advertising on/off
- Every network device needs to call net_if_up() once it has a link
established (and net_if_down when it drops). (BT already does this)
Either this is mixing layer L3/IP states with L2/LL, or you do want to
introduce runtime RFKILL concept, which is it? If this is for L3/IP
then that should not mess up with L2 API, for runtime RFKILL this
should be done in the L1 driver so we disable everything, including
interrupts and could possible power down the radio. Thoughts?
Neither. I don't believe there is any L3 in either proposal here, if
you think there is then can you be specific about where you see it?

My understanding of rfkill is that it is a mechanism to allow some
management agent or the user to kill all RF output in a device. The
reference to BT advertising above is actually in response to your IRC
comment that if the interface is disabled we should not be advertising
in bluetooth, I interpreted which I've interpreted as BT should not
advertise IPSP rather than as "RFKILL everything", did i miss
understand?

My intention here is not to change the BT behaviour. My interest is
in finding a way forward to split the current mixed net_if/L2
enable/disable/up/down behaviour embedded in net_if_up() and
net_if_down() into a distinct enable/disable and up/down, specifically
splitting out the up/down semantic such that network devices can
communicate link up/down upwards through the stack and dhcp can catch
those notifications.
So, what is the final outcome of the discussion. Which one of the two
proposals Marcus has made is the preferred one? Is there a third option?
It seems it is still unclear how to handle data link layer connection
lost / reestablished events, e.g. which function should be called by an
Ethernet driver when it determines that network cable was dis-/re-connected?

Piotr

Join devel@lists.zephyrproject.org to automatically receive all group messages.