dhcp integration into the platform


Marcus Shawcroft <marcus.shawcroft@...>
 

Hi,

This evening I took a look at how we might better integrate dhcpv4
into the platform.

In the current tree, in order to use dhcp the application is expected
to start the dhcp client explicitly. There is no synchronization
between network link up / link down events and the dhcp state machine.

The dhcp state machine will start issuing discover messages
irrespective of the link status, once the link comes up, dhcp will
typically run through request/ack and install an IP number and GW etc.
Should the link drop and subsequently up again the dhcp state machine
remains oblivious.

Better integration of dhcp into the stack looks reasonably straight forward.

I propose the following:

- dhcp_start() is modified to initialize the dhcp context, but remain
in the INIT state.
- net_if is modified to generate network management link up / down events
- dhcpv4 is modified to capture link up and link down events from net mgmt
- dhcpv4 enters discover on link up
- dhcpv4 performs unset_dhcpv4_on_iface for link down
- unset_dhcpv4_on_iface needs to net_if_ipv4_addr_rm (it should
probably do that anyway)
- dhcp_start is renamed and hooked in by SYS_INIT ?
- A new empty dhcp_start is defined and marked deprecated (to be nice
to apps that might currently call it).
- The current public include/dhcp.h exposed interface is removed.

There are a few details Ive not thought through properly yet, notably:

- The net_if layer already has link_up and link_down events. These
however are defined to fire when a net_if is enabled or disabled, not
when its link goes up or down, hence either these events need to be
redefined or we introduce two new events to represent the link up and
down.

- I'm not sure what the appropriate way of managing the removal of a
public .h file should be.

- There are several way of wiring up the network management link up /
down notifiers:
1) Drivers do it directly.
2) Drivers call the net_if layer which in turn issues the network
management events.

Before I take this any further I'd appreciate feed back on sanity of
the approach and indeed whether such patches would be welcome.

Cheers
/Marcus


Anas Nashif
 

Hi

On Thu, Feb 9, 2017 at 4:53 AM, Marcus Shawcroft <marcus.shawcroft@...> wrote:
Hi,

This evening I took a look at how we might better integrate dhcpv4
into the platform.

In the current tree, in order to use dhcp the application is expected
to start the dhcp client explicitly.  There is no synchronization
between network link up / link down events and the dhcp state machine.

The dhcp state machine will start issuing discover messages
irrespective of the link status, once the link comes up, dhcp will
typically run through request/ack and install an IP number and GW etc.
Should the link drop and subsequently up again the dhcp state machine
remains oblivious.

Better integration of dhcp into the stack looks reasonably straight forward.

I propose the following:

- dhcp_start() is modified to initialize the dhcp context, but remain
in the INIT state.
- net_if is modified to generate network management link up / down events
- dhcpv4 is modified to capture link up and link down events from net mgmt
- dhcpv4 enters discover on link up
- dhcpv4 performs unset_dhcpv4_on_iface for link down
- unset_dhcpv4_on_iface needs to net_if_ipv4_addr_rm (it should
probably do that anyway)
- dhcp_start is renamed and hooked in by SYS_INIT ?
- A new empty dhcp_start is defined and marked deprecated (to be nice
to apps that might currently call it).
- The current public include/dhcp.h exposed interface is removed.

What we see here are the first signs of a connection manager :)

We do need DHCP to act as a service, when writing an application that relies on DHCP it makes sense to have the initialisation and setup of the networking device consistently with having to reimplement the code over and over again in every application.

I can also see the DNS service being setup if we get a DNS server address via DHCP and set the context of the DNS client to allow resolution queries without having to do that by foot.

+1 for the approach described above.

 

There are a few details Ive not thought through properly yet, notably:

- The net_if layer already has link_up and link_down events.  These
however are defined to fire when a net_if is enabled or disabled, not
when its link goes up or down, hence either these events need to be
redefined or we introduce two new events to represent the link up and
down.

- I'm not sure what the appropriate way of managing the removal of a
public .h file should be.


Well, this API is not released yet, so I think it should be easy to remove, but we are running out of time for 1.7...


Anas

 

- There are several way of wiring up the network management link up /
down notifiers:
1) Drivers do it directly.
2) Drivers call the net_if layer which in turn issues the network
management events.

Before I take this any further I'd appreciate feed back on sanity of
the approach and indeed whether such patches would be welcome.

Cheers
/Marcus
_______________________________________________
Zephyr-devel mailing list
Zephyr-devel@lists.zephyrproject.org
https://lists.zephyrproject.org/mailman/listinfo/zephyr-devel


Ravi kumar Veeramally
 

Hi Marcus,

One more thing, we should avoid sending dhcpv4 discover messages
on unsupported interfaces. E.g. If we have 15.4 and Ethernet, both
IPv6 and IPv4 are enabled. We should not start DHCPv4 on 15.4 or
any unsupported interface. (we can not enable Kconfig options
per interface).

- Ravi.

On 02/09/2017 01:23 AM, Marcus Shawcroft wrote:
Hi,

This evening I took a look at how we might better integrate dhcpv4
into the platform.

In the current tree, in order to use dhcp the application is expected
to start the dhcp client explicitly. There is no synchronization
between network link up / link down events and the dhcp state machine.

The dhcp state machine will start issuing discover messages
irrespective of the link status, once the link comes up, dhcp will
typically run through request/ack and install an IP number and GW etc.
Should the link drop and subsequently up again the dhcp state machine
remains oblivious.

Better integration of dhcp into the stack looks reasonably straight forward.

I propose the following:

- dhcp_start() is modified to initialize the dhcp context, but remain
in the INIT state.
- net_if is modified to generate network management link up / down events
- dhcpv4 is modified to capture link up and link down events from net mgmt
- dhcpv4 enters discover on link up
- dhcpv4 performs unset_dhcpv4_on_iface for link down
- unset_dhcpv4_on_iface needs to net_if_ipv4_addr_rm (it should
probably do that anyway)
- dhcp_start is renamed and hooked in by SYS_INIT ?
- A new empty dhcp_start is defined and marked deprecated (to be nice
to apps that might currently call it).
- The current public include/dhcp.h exposed interface is removed.

There are a few details Ive not thought through properly yet, notably:

- The net_if layer already has link_up and link_down events. These
however are defined to fire when a net_if is enabled or disabled, not
when its link goes up or down, hence either these events need to be
redefined or we introduce two new events to represent the link up and
down.

- I'm not sure what the appropriate way of managing the removal of a
public .h file should be.

- There are several way of wiring up the network management link up /
down notifiers:
1) Drivers do it directly.
2) Drivers call the net_if layer which in turn issues the network
management events.

Before I take this any further I'd appreciate feed back on sanity of
the approach and indeed whether such patches would be welcome.

Cheers
/Marcus
_______________________________________________
Zephyr-devel mailing list
Zephyr-devel@lists.zephyrproject.org
https://lists.zephyrproject.org/mailman/listinfo/zephyr-devel


Shtivelman, Bella <bella.shtivelman@...>
 

Hi,

In the existing DHCP design it’s not enough for application just subscribe to NET_EVENT_IF_UP event.

For our client application we observed situations when the link was up, but IPV4 address still hasn’t been assigned yet.

The issue was solved by subscribing to NET_EVENT_IPV4_ADDR_ADD event.

So maybe in the new design it should be taken into account as well.

 

Thanks,

Bella

 

From: zephyr-devel-bounces@... [mailto:zephyr-devel-bounces@...] On Behalf Of Anas Nashif
Sent: Thursday, February 09, 2017 01:56
To: Marcus Shawcroft <marcus.shawcroft@...>
Cc: Rissanen, Jukka <jukka.rissanen@...>; Bursztyka, Tomasz <tomasz.bursztyka@...>; devel@...
Subject: Re: [Zephyr-devel] dhcp integration into the platform

 

Hi

 

On Thu, Feb 9, 2017 at 4:53 AM, Marcus Shawcroft <marcus.shawcroft@...> wrote:

Hi,

This evening I took a look at how we might better integrate dhcpv4
into the platform.

In the current tree, in order to use dhcp the application is expected
to start the dhcp client explicitly.  There is no synchronization
between network link up / link down events and the dhcp state machine.

The dhcp state machine will start issuing discover messages
irrespective of the link status, once the link comes up, dhcp will
typically run through request/ack and install an IP number and GW etc.
Should the link drop and subsequently up again the dhcp state machine
remains oblivious.

Better integration of dhcp into the stack looks reasonably straight forward.

I propose the following:

- dhcp_start() is modified to initialize the dhcp context, but remain
in the INIT state.
- net_if is modified to generate network management link up / down events
- dhcpv4 is modified to capture link up and link down events from net mgmt
- dhcpv4 enters discover on link up
- dhcpv4 performs unset_dhcpv4_on_iface for link down
- unset_dhcpv4_on_iface needs to net_if_ipv4_addr_rm (it should
probably do that anyway)
- dhcp_start is renamed and hooked in by SYS_INIT ?
- A new empty dhcp_start is defined and marked deprecated (to be nice
to apps that might currently call it).
- The current public include/dhcp.h exposed interface is removed.

 

What we see here are the first signs of a connection manager :)

 

We do need DHCP to act as a service, when writing an application that relies on DHCP it makes sense to have the initialisation and setup of the networking device consistently with having to reimplement the code over and over again in every application.

 

I can also see the DNS service being setup if we get a DNS server address via DHCP and set the context of the DNS client to allow resolution queries without having to do that by foot.

 

+1 for the approach described above.

 

 


There are a few details Ive not thought through properly yet, notably:

- The net_if layer already has link_up and link_down events.  These
however are defined to fire when a net_if is enabled or disabled, not
when its link goes up or down, hence either these events need to be
redefined or we introduce two new events to represent the link up and
down.

- I'm not sure what the appropriate way of managing the removal of a
public .h file should be.

 

 

Well, this API is not released yet, so I think it should be easy to remove, but we are running out of time for 1.7...

 

 

Anas

 

 


- There are several way of wiring up the network management link up /
down notifiers:
1) Drivers do it directly.
2) Drivers call the net_if layer which in turn issues the network
management events.

Before I take this any further I'd appreciate feed back on sanity of
the approach and indeed whether such patches would be welcome.

Cheers
/Marcus
_______________________________________________
Zephyr-devel mailing list
Zephyr-devel@...
https://lists.zephyrproject.org/mailman/listinfo/zephyr-devel

 

---------------------------------------------------------------------
A member of the Intel Corporation group of companies

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.


Gil Pitney
 

In the existing DHCP design it’s not enough for application just subscribe
to NET_EVENT_IF_UP event.

For our client application we observed situations when the link was up, but
IPV4 address still hasn’t been assigned yet.

The issue was solved by subscribing to NET_EVENT_IPV4_ADDR_ADD event.

So maybe in the new design it should be taken into account as well.
+1.

For example, echo_client application starts DHCP (if configured), but
does not wait for the IPv4 address to be acquired before proceeding
(and so uses an incorrect static IP address instead).

But perhaps, the event should be something like NET_DHCP_IPV4_ADDR_ACQUIRED?


Marcus Shawcroft <marcus.shawcroft@...>
 

On 23 February 2017 at 19:35, Gil Pitney <gil.pitney@linaro.org> wrote:
In the existing DHCP design it’s not enough for application just subscribe
to NET_EVENT_IF_UP event.

For our client application we observed situations when the link was up, but
IPV4 address still hasn’t been assigned yet.

The issue was solved by subscribing to NET_EVENT_IPV4_ADDR_ADD event.

So maybe in the new design it should be taken into account as well.
+1.

For example, echo_client application starts DHCP (if configured), but
does not wait for the IPv4 address to be acquired before proceeding
(and so uses an incorrect static IP address instead).

But perhaps, the event should be something like NET_DHCP_IPV4_ADDR_ACQUIRED?
Hi

Is it important to distinguish how the address was acquired?

/Marcus


Gil Pitney
 

Anticipating application initialization code like this:

#ifdef CONFIG_NET_DHCPV4
<wait on semaphore signaled by a NET_DHCP_IPV4_ADDR_ACQUIRED callback >
#else
<use pre-configured static ip address>
#endif

It probably doesn't matter how the address is acquired, but the
application needs to somehow know to wait for the IPv4 address
notification before proceeding. If DHCPV4 is configured, at least
the app knows it needs to wait in that case.

Could also remove the #ifdef in the app, have the app ask the net
interface if DHCP is supported (or if IP to be assigned by other
means), and if so, register to wait on a semaphore and wait on it,
otherwise proceed with a static IP address.



On 23 February 2017 at 12:49, Marcus Shawcroft
<marcus.shawcroft@gmail.com> wrote:
On 23 February 2017 at 19:35, Gil Pitney <gil.pitney@linaro.org> wrote:
In the existing DHCP design it’s not enough for application just subscribe
to NET_EVENT_IF_UP event.

For our client application we observed situations when the link was up, but
IPV4 address still hasn’t been assigned yet.

The issue was solved by subscribing to NET_EVENT_IPV4_ADDR_ADD event.

So maybe in the new design it should be taken into account as well.
+1.

For example, echo_client application starts DHCP (if configured), but
does not wait for the IPv4 address to be acquired before proceeding
(and so uses an incorrect static IP address instead).

But perhaps, the event should be something like NET_DHCP_IPV4_ADDR_ACQUIRED?
Hi

Is it important to distinguish how the address was acquired?

/Marcus


Marcus Shawcroft <marcus.shawcroft@...>
 

On 23 February 2017 at 22:55, Gil Pitney <gil.pitney@linaro.org> wrote:
Anticipating application initialization code like this:

#ifdef CONFIG_NET_DHCPV4
<wait on semaphore signaled by a NET_DHCP_IPV4_ADDR_ACQUIRED callback >
#else
<use pre-configured static ip address>
#endif
An interface always gets an address, hence I think that for both
static and DHCP cases the following would work:

<wait on semaphore signaled by a NET_EVENT_IPV4_ADDR_ADD callback >

Cheers
/Marcus


Gil Pitney
 

As long as all apps with static IPv4 addresses call net_if_ipv4_addr_add().

So, yes, that should work!

Thanks

On 23 February 2017 at 15:06, Marcus Shawcroft
<marcus.shawcroft@gmail.com> wrote:
On 23 February 2017 at 22:55, Gil Pitney <gil.pitney@linaro.org> wrote:
Anticipating application initialization code like this:

#ifdef CONFIG_NET_DHCPV4
<wait on semaphore signaled by a NET_DHCP_IPV4_ADDR_ACQUIRED callback >
#else
<use pre-configured static ip address>
#endif
An interface always gets an address, hence I think that for both
static and DHCP cases the following would work:

<wait on semaphore signaled by a NET_EVENT_IPV4_ADDR_ADD callback >

Cheers
/Marcus


Marcus Shawcroft <marcus.shawcroft@...>
 

Hi,

The network interface patches proposed as a result of this thread have
generated a fair amount of discussion both in patch reviews and in
IRC. Now would seem like a good time to summarize where we are and
pull together some of the various discussion points that have been
raised.

Current status:

A bunch of preparatory patches to dhcpv4 have been merged. Notable changes:
- Public interface now provides net_dhcpv4_start(iface) and
net_dhcpv4_stop(iface).
- Various initialization issues that would prevent concurrent dhcpv4
operation on multiple ifaces are resolved.
- dhcpv4 will now remove leased resources from the network stack on
lease lapse/release.

There is one more small but significant dhcpv4 patch outstanding that
catches L2 up/down events and kicks the appropriate dhcpv4 machinery
per interface. This patch is currently blocked pending necessary
support in net_if (see below). Once this patch is in place an
application will be able to start (and stop dhcpv4) per interface as
now. Once started dhcpv4 will catch L2 up/down events and acquire,
renew and release leases as required, Eventually the responsibility
to call call net_dhcpv4_start/stop() may be moved from an application
to a 'connection manager'... but that is for the future.

The 'net_if' patches are in their third iteration and have generated
by far the most discussion.

The objective of the net_if patches is to arrange for L2 up/down
network management events to be raised when a functional L2 iface
becomes available for use, or conversely becomes unavailable. These
events can then be caught by dhcpv4 in order for dhcpv4 to manage
IP/L3 configuration.

In the current net_if implementation there are two significant
functions: net_if_up and net_if_down(). These functions call the
underlying L2 enable() callback, set and clear the net_if
NET_IF_ENABLED flag and raise NET_EVENT_IF_UP/DOWN network management
events.

After re-reading various comments and discussion on the existing patch
set I've come to the conclusion that there are two different world
views of the conceptual purpose of net_if_up() and net_if_down().

View 1:
net_if_up/down provide an interface for a higher/management layer to
communicate downwards and mark an iface as enabled or disabled
(irrespective of the state of the underlying L2)

This world view is supported by:
- these functions call down to the enable() callback in the underlying
L2 (ie they direcly call L2 telling it whether to enable or disable).
- in the absence of a connection manager the network stack hardwires a
call to net_if_up() for every iface at system boot (net_if_post_init).

View 2:
net_if_up/down provide an interface for an underlying L2 to
communicate upwards that an iface as up/working.

This world view is supported by:
- the bluetooth stack calls net_if_up/down on ipsp connect/disconnect
- the net_if_up/down terminology suggests this behaviour (as opposed
to being explicitly called enable/disable)

Conceptually there are four APIs here: enable/disable and up/down.
The former two provide a management interface that allows a higher
layer to requested that an iface is enabled or disabled, likely called
by a connection manager or equivalent. The latter two allow the stack
below the iface to report upwards whether or not an enabled iface
actually has a link up or not.

The l2 enable callback conceptually belongs with the enable/disable
interface. The network management event up/down signalling
conceptually belongs with the up/down interface.

In the current tree I think we have a slightly odd merge of the two
concepts where some code treats net_if_up/down() as if they implement
enable/disable semantics, while other code treats
net_if_up()/net_if_down() as if they implement up/down semantics.
Notably we have the network stack initialization code hardwiring
net_if_up() on all interfaces and we have L2 enable hung on
net_if_up/down() both of these behaviours associated with an
enable/disable semantic yet we also have the BT stack using
net_if_up/down() as a notification mechanism that L2 is up/down. (It
appears to me that for an iface associated with BT, the iface will be
up'd at system boot and then re-up'd on ipsp connect )

Various points that have come up in discussion (please correct me if I
misrepresent or miss some point of view):

1) Should we have enable/disable. The general view seems to be that
we don't have a solid use case for enable/disable therefore we should
not have them.

2) BT advertise should be disabled if a network interface is
disabled(). IMO this is actually the use case that suggests we should
keep enable/disable.

3) Should we have 1 or 2 net_if flags. The general view seems to be
that we should have only 1, I think in practice this is driven by
whether we keep or remove an enable/disable API.

4) Physical interfaces should not power up power down as a result of
L2 enable/disable, that should all be handled via a separate power
management API.


There are (at least) two ways forward:

1) We drop the enable/disable semantic. This implies:
- We remove the L2 enable() callback completely.
- We remove hardwired net_if_up() calls when the network stack boots.
- Every network device needs to call net_if_up() once it has a link
established (and net_if_down when it drops).
- BT stays as it is (advertise hardwired on)

2) We keep the enable/disable semantic. This implies:
- We split net_if_up/down into net_if_enable/disable() and
net_if_up/down() such that net_if_enable calls l2->enable() while
net_if_up/down deals with NET_IF_UP and raising net_event_if_up/down
- The hardwired net_if_up() calls at network stack boot are switched
to call net_if_enable()
- The BT L2 enable callback is used to turn advertising on/off
- Every network device needs to call net_if_up() once it has a link
established (and net_if_down when it drops). (BT already does this)

In option 1 we remove the mechanism we have to communicate to the BT
stack that advertising should be on/off.

IMHO route 2 is a better way forward.

Thoughts?

/Marcus


Luiz Augusto von Dentz
 

Hi Marcus,

On Thu, Mar 23, 2017 at 6:51 PM, Marcus Shawcroft
<marcus.shawcroft@gmail.com> wrote:
Hi,

The network interface patches proposed as a result of this thread have
generated a fair amount of discussion both in patch reviews and in
IRC. Now would seem like a good time to summarize where we are and
pull together some of the various discussion points that have been
raised.

Current status:

A bunch of preparatory patches to dhcpv4 have been merged. Notable changes:
- Public interface now provides net_dhcpv4_start(iface) and
net_dhcpv4_stop(iface).
- Various initialization issues that would prevent concurrent dhcpv4
operation on multiple ifaces are resolved.
- dhcpv4 will now remove leased resources from the network stack on
lease lapse/release.

There is one more small but significant dhcpv4 patch outstanding that
catches L2 up/down events and kicks the appropriate dhcpv4 machinery
per interface. This patch is currently blocked pending necessary
support in net_if (see below). Once this patch is in place an
application will be able to start (and stop dhcpv4) per interface as
now. Once started dhcpv4 will catch L2 up/down events and acquire,
renew and release leases as required, Eventually the responsibility
to call call net_dhcpv4_start/stop() may be moved from an application
to a 'connection manager'... but that is for the future.

The 'net_if' patches are in their third iteration and have generated
by far the most discussion.

The objective of the net_if patches is to arrange for L2 up/down
network management events to be raised when a functional L2 iface
becomes available for use, or conversely becomes unavailable. These
events can then be caught by dhcpv4 in order for dhcpv4 to manage
IP/L3 configuration.

In the current net_if implementation there are two significant
functions: net_if_up and net_if_down(). These functions call the
underlying L2 enable() callback, set and clear the net_if
NET_IF_ENABLED flag and raise NET_EVENT_IF_UP/DOWN network management
events.

After re-reading various comments and discussion on the existing patch
set I've come to the conclusion that there are two different world
views of the conceptual purpose of net_if_up() and net_if_down().

View 1:
net_if_up/down provide an interface for a higher/management layer to
communicate downwards and mark an iface as enabled or disabled
(irrespective of the state of the underlying L2)

This world view is supported by:
- these functions call down to the enable() callback in the underlying
L2 (ie they direcly call L2 telling it whether to enable or disable).
- in the absence of a connection manager the network stack hardwires a
call to net_if_up() for every iface at system boot (net_if_post_init).

View 2:
net_if_up/down provide an interface for an underlying L2 to
communicate upwards that an iface as up/working.

This world view is supported by:
- the bluetooth stack calls net_if_up/down on ipsp connect/disconnect
- the net_if_up/down terminology suggests this behaviour (as opposed
to being explicitly called enable/disable)

Conceptually there are four APIs here: enable/disable and up/down.
The former two provide a management interface that allows a higher
layer to requested that an iface is enabled or disabled, likely called
by a connection manager or equivalent. The latter two allow the stack
below the iface to report upwards whether or not an enabled iface
actually has a link up or not.

The l2 enable callback conceptually belongs with the enable/disable
interface. The network management event up/down signalling
conceptually belongs with the up/down interface.

In the current tree I think we have a slightly odd merge of the two
concepts where some code treats net_if_up/down() as if they implement
enable/disable semantics, while other code treats
net_if_up()/net_if_down() as if they implement up/down semantics.
Notably we have the network stack initialization code hardwiring
net_if_up() on all interfaces and we have L2 enable hung on
net_if_up/down() both of these behaviours associated with an
enable/disable semantic yet we also have the BT stack using
net_if_up/down() as a notification mechanism that L2 is up/down. (It
appears to me that for an iface associated with BT, the iface will be
up'd at system boot and then re-up'd on ipsp connect )

Various points that have come up in discussion (please correct me if I
misrepresent or miss some point of view):

1) Should we have enable/disable. The general view seems to be that
we don't have a solid use case for enable/disable therefore we should
not have them.

2) BT advertise should be disabled if a network interface is
disabled(). IMO this is actually the use case that suggests we should
keep enable/disable.

3) Should we have 1 or 2 net_if flags. The general view seems to be
that we should have only 1, I think in practice this is driven by
whether we keep or remove an enable/disable API.

4) Physical interfaces should not power up power down as a result of
L2 enable/disable, that should all be handled via a separate power
management API.


There are (at least) two ways forward:

1) We drop the enable/disable semantic. This implies:
- We remove the L2 enable() callback completely.
- We remove hardwired net_if_up() calls when the network stack boots.
- Every network device needs to call net_if_up() once it has a link
established (and net_if_down when it drops).
- BT stays as it is (advertise hardwired on)
Note that the enable/disable semantic was introduced for L2 link
detection, which is why it is an L2/LL API. Now from the discussion we
had in the IRC what we seem to be really missing is a L3/IP interface
to tell when that layer is available so the application can start
sending packets. We did agree that we need to make samples that react
to L3/IP being up not L2/LL which should probably remain just to start
the procedure to acquire IP address, etc, so by given this option it
either means we did not understand each other or you did not agree
after all the discussions we had.

2) We keep the enable/disable semantic. This implies:
- We split net_if_up/down into net_if_enable/disable() and
net_if_up/down() such that net_if_enable calls l2->enable() while
net_if_up/down deals with NET_IF_UP and raising net_event_if_up/down
- The hardwired net_if_up() calls at network stack boot are switched
to call net_if_enable()
- The BT L2 enable callback is used to turn advertising on/off
- Every network device needs to call net_if_up() once it has a link
established (and net_if_down when it drops). (BT already does this)
Either this is mixing layer L3/IP states with L2/LL, or you do want to
introduce runtime RFKILL concept, which is it? If this is for L3/IP
then that should not mess up with L2 API, for runtime RFKILL this
should be done in the L1 driver so we disable everything, including
interrupts and could possible power down the radio. Thoughts?

In option 1 we remove the mechanism we have to communicate to the BT
stack that advertising should be on/off.

IMHO route 2 is a better way forward.

Thoughts?

/Marcus


--
Luiz Augusto von Dentz


Gil Pitney
 

Option 1) would be ideal for the upcoming WiFi offload devices (like
TI CC3220), which do full TCP/IP offload onto a co-processor,
essentially bypassing the L2 layer.

In that case, there is no need for an l2->enable() call.

It seems that Option 1) makes the most sense for the offload use case:
- Every network device needs to call net_if_up() once it has a link
established (and net_if_down when it drops).
The DHCP client is also offloaded onto the network coprocessor, but
could be configured off by default to use the Zephyr DHCP. Still, it
would be nice to allow DHCP to be offloaded in the future as well
(saving code space, power).


On 23 March 2017 at 12:26, Luiz Augusto von Dentz <luiz.dentz@gmail.com> wrote:
Hi Marcus,

On Thu, Mar 23, 2017 at 6:51 PM, Marcus Shawcroft
<marcus.shawcroft@gmail.com> wrote:
Hi,

The network interface patches proposed as a result of this thread have
generated a fair amount of discussion both in patch reviews and in
IRC. Now would seem like a good time to summarize where we are and
pull together some of the various discussion points that have been
raised.

Current status:

A bunch of preparatory patches to dhcpv4 have been merged. Notable changes:
- Public interface now provides net_dhcpv4_start(iface) and
net_dhcpv4_stop(iface).
- Various initialization issues that would prevent concurrent dhcpv4
operation on multiple ifaces are resolved.
- dhcpv4 will now remove leased resources from the network stack on
lease lapse/release.

There is one more small but significant dhcpv4 patch outstanding that
catches L2 up/down events and kicks the appropriate dhcpv4 machinery
per interface. This patch is currently blocked pending necessary
support in net_if (see below). Once this patch is in place an
application will be able to start (and stop dhcpv4) per interface as
now. Once started dhcpv4 will catch L2 up/down events and acquire,
renew and release leases as required, Eventually the responsibility
to call call net_dhcpv4_start/stop() may be moved from an application
to a 'connection manager'... but that is for the future.

The 'net_if' patches are in their third iteration and have generated
by far the most discussion.

The objective of the net_if patches is to arrange for L2 up/down
network management events to be raised when a functional L2 iface
becomes available for use, or conversely becomes unavailable. These
events can then be caught by dhcpv4 in order for dhcpv4 to manage
IP/L3 configuration.

In the current net_if implementation there are two significant
functions: net_if_up and net_if_down(). These functions call the
underlying L2 enable() callback, set and clear the net_if
NET_IF_ENABLED flag and raise NET_EVENT_IF_UP/DOWN network management
events.

After re-reading various comments and discussion on the existing patch
set I've come to the conclusion that there are two different world
views of the conceptual purpose of net_if_up() and net_if_down().

View 1:
net_if_up/down provide an interface for a higher/management layer to
communicate downwards and mark an iface as enabled or disabled
(irrespective of the state of the underlying L2)

This world view is supported by:
- these functions call down to the enable() callback in the underlying
L2 (ie they direcly call L2 telling it whether to enable or disable).
- in the absence of a connection manager the network stack hardwires a
call to net_if_up() for every iface at system boot (net_if_post_init).

View 2:
net_if_up/down provide an interface for an underlying L2 to
communicate upwards that an iface as up/working.

This world view is supported by:
- the bluetooth stack calls net_if_up/down on ipsp connect/disconnect
- the net_if_up/down terminology suggests this behaviour (as opposed
to being explicitly called enable/disable)

Conceptually there are four APIs here: enable/disable and up/down.
The former two provide a management interface that allows a higher
layer to requested that an iface is enabled or disabled, likely called
by a connection manager or equivalent. The latter two allow the stack
below the iface to report upwards whether or not an enabled iface
actually has a link up or not.

The l2 enable callback conceptually belongs with the enable/disable
interface. The network management event up/down signalling
conceptually belongs with the up/down interface.

In the current tree I think we have a slightly odd merge of the two
concepts where some code treats net_if_up/down() as if they implement
enable/disable semantics, while other code treats
net_if_up()/net_if_down() as if they implement up/down semantics.
Notably we have the network stack initialization code hardwiring
net_if_up() on all interfaces and we have L2 enable hung on
net_if_up/down() both of these behaviours associated with an
enable/disable semantic yet we also have the BT stack using
net_if_up/down() as a notification mechanism that L2 is up/down. (It
appears to me that for an iface associated with BT, the iface will be
up'd at system boot and then re-up'd on ipsp connect )

Various points that have come up in discussion (please correct me if I
misrepresent or miss some point of view):

1) Should we have enable/disable. The general view seems to be that
we don't have a solid use case for enable/disable therefore we should
not have them.

2) BT advertise should be disabled if a network interface is
disabled(). IMO this is actually the use case that suggests we should
keep enable/disable.

3) Should we have 1 or 2 net_if flags. The general view seems to be
that we should have only 1, I think in practice this is driven by
whether we keep or remove an enable/disable API.

4) Physical interfaces should not power up power down as a result of
L2 enable/disable, that should all be handled via a separate power
management API.


There are (at least) two ways forward:

1) We drop the enable/disable semantic. This implies:
- We remove the L2 enable() callback completely.
- We remove hardwired net_if_up() calls when the network stack boots.
- Every network device needs to call net_if_up() once it has a link
established (and net_if_down when it drops).
- BT stays as it is (advertise hardwired on)
Note that the enable/disable semantic was introduced for L2 link
detection, which is why it is an L2/LL API. Now from the discussion we
had in the IRC what we seem to be really missing is a L3/IP interface
to tell when that layer is available so the application can start
sending packets. We did agree that we need to make samples that react
to L3/IP being up not L2/LL which should probably remain just to start
the procedure to acquire IP address, etc, so by given this option it
either means we did not understand each other or you did not agree
after all the discussions we had.

2) We keep the enable/disable semantic. This implies:
- We split net_if_up/down into net_if_enable/disable() and
net_if_up/down() such that net_if_enable calls l2->enable() while
net_if_up/down deals with NET_IF_UP and raising net_event_if_up/down
- The hardwired net_if_up() calls at network stack boot are switched
to call net_if_enable()
- The BT L2 enable callback is used to turn advertising on/off
- Every network device needs to call net_if_up() once it has a link
established (and net_if_down when it drops). (BT already does this)
Either this is mixing layer L3/IP states with L2/LL, or you do want to
introduce runtime RFKILL concept, which is it? If this is for L3/IP
then that should not mess up with L2 API, for runtime RFKILL this
should be done in the L1 driver so we disable everything, including
interrupts and could possible power down the radio. Thoughts?

In option 1 we remove the mechanism we have to communicate to the BT
stack that advertising should be on/off.

IMHO route 2 is a better way forward.

Thoughts?

/Marcus


--
Luiz Augusto von Dentz


Marcus Shawcroft <marcus.shawcroft@...>
 

Hi,


On 23 March 2017 at 19:26, Luiz Augusto von Dentz <luiz.dentz@gmail.com> wrote:
Hi Marcus,

On Thu, Mar 23, 2017 at 6:51 PM, Marcus Shawcroft
<marcus.shawcroft@gmail.com> wrote:
Hi,

The network interface patches proposed as a result of this thread have
generated a fair amount of discussion both in patch reviews and in
IRC. Now would seem like a good time to summarize where we are and
pull together some of the various discussion points that have been
raised.

Current status:

A bunch of preparatory patches to dhcpv4 have been merged. Notable changes:
- Public interface now provides net_dhcpv4_start(iface) and
net_dhcpv4_stop(iface).
- Various initialization issues that would prevent concurrent dhcpv4
operation on multiple ifaces are resolved.
- dhcpv4 will now remove leased resources from the network stack on
lease lapse/release.

There is one more small but significant dhcpv4 patch outstanding that
catches L2 up/down events and kicks the appropriate dhcpv4 machinery
per interface. This patch is currently blocked pending necessary
support in net_if (see below). Once this patch is in place an
application will be able to start (and stop dhcpv4) per interface as
now. Once started dhcpv4 will catch L2 up/down events and acquire,
renew and release leases as required, Eventually the responsibility
to call call net_dhcpv4_start/stop() may be moved from an application
to a 'connection manager'... but that is for the future.

The 'net_if' patches are in their third iteration and have generated
by far the most discussion.

The objective of the net_if patches is to arrange for L2 up/down
network management events to be raised when a functional L2 iface
becomes available for use, or conversely becomes unavailable. These
events can then be caught by dhcpv4 in order for dhcpv4 to manage
IP/L3 configuration.

In the current net_if implementation there are two significant
functions: net_if_up and net_if_down(). These functions call the
underlying L2 enable() callback, set and clear the net_if
NET_IF_ENABLED flag and raise NET_EVENT_IF_UP/DOWN network management
events.

After re-reading various comments and discussion on the existing patch
set I've come to the conclusion that there are two different world
views of the conceptual purpose of net_if_up() and net_if_down().

View 1:
net_if_up/down provide an interface for a higher/management layer to
communicate downwards and mark an iface as enabled or disabled
(irrespective of the state of the underlying L2)

This world view is supported by:
- these functions call down to the enable() callback in the underlying
L2 (ie they direcly call L2 telling it whether to enable or disable).
- in the absence of a connection manager the network stack hardwires a
call to net_if_up() for every iface at system boot (net_if_post_init).

View 2:
net_if_up/down provide an interface for an underlying L2 to
communicate upwards that an iface as up/working.

This world view is supported by:
- the bluetooth stack calls net_if_up/down on ipsp connect/disconnect
- the net_if_up/down terminology suggests this behaviour (as opposed
to being explicitly called enable/disable)

Conceptually there are four APIs here: enable/disable and up/down.
The former two provide a management interface that allows a higher
layer to requested that an iface is enabled or disabled, likely called
by a connection manager or equivalent. The latter two allow the stack
below the iface to report upwards whether or not an enabled iface
actually has a link up or not.

The l2 enable callback conceptually belongs with the enable/disable
interface. The network management event up/down signalling
conceptually belongs with the up/down interface.

In the current tree I think we have a slightly odd merge of the two
concepts where some code treats net_if_up/down() as if they implement
enable/disable semantics, while other code treats
net_if_up()/net_if_down() as if they implement up/down semantics.
Notably we have the network stack initialization code hardwiring
net_if_up() on all interfaces and we have L2 enable hung on
net_if_up/down() both of these behaviours associated with an
enable/disable semantic yet we also have the BT stack using
net_if_up/down() as a notification mechanism that L2 is up/down. (It
appears to me that for an iface associated with BT, the iface will be
up'd at system boot and then re-up'd on ipsp connect )
Can anyone add some insight on this issue? The code looks to me as if
net_if_up/down() were originally added to provide an enable/disable
semantic, but we have BT using them in what appears to be an up/down
semantic. In the context of BT ifaces' , why do we have net_if_up()
being called twice on the iface. Initially at system boot and then
again on ipsp_connect ?

Various points that have come up in discussion (please correct me if I
misrepresent or miss some point of view):

1) Should we have enable/disable. The general view seems to be that
we don't have a solid use case for enable/disable therefore we should
not have them.

2) BT advertise should be disabled if a network interface is
disabled(). IMO this is actually the use case that suggests we should
keep enable/disable.

3) Should we have 1 or 2 net_if flags. The general view seems to be
that we should have only 1, I think in practice this is driven by
whether we keep or remove an enable/disable API.

4) Physical interfaces should not power up power down as a result of
L2 enable/disable, that should all be handled via a separate power
management API.


There are (at least) two ways forward:

1) We drop the enable/disable semantic. This implies:
- We remove the L2 enable() callback completely.
- We remove hardwired net_if_up() calls when the network stack boots.
- Every network device needs to call net_if_up() once it has a link
established (and net_if_down when it drops).
- BT stays as it is (advertise hardwired on)
Note that the enable/disable semantic was introduced for L2 link
detection,
which is why it is an L2/LL API. Now from the discussion we
had in the IRC what we seem to be really missing is a L3/IP interface
to tell when that layer is available so the application can start
sending packets. We did agree that we need to make samples that react
to L3/IP being up not L2/LL which should probably remain just to start
the procedure to acquire IP address, etc, so by given this option it
either means we did not understand each other or you did not agree
after all the discussions we had.
I agree, there is a need for a mechanism for applications to
synchronize with L3. However, that is not related to the issue of how
dhcp synchronizes with L2, hence I've deliberately not addressed it in
this thread. That can wait until we have L2 sorted.

2) We keep the enable/disable semantic. This implies:
- We split net_if_up/down into net_if_enable/disable() and
net_if_up/down() such that net_if_enable calls l2->enable() while
net_if_up/down deals with NET_IF_UP and raising net_event_if_up/down
- The hardwired net_if_up() calls at network stack boot are switched
to call net_if_enable()
- The BT L2 enable callback is used to turn advertising on/off
- Every network device needs to call net_if_up() once it has a link
established (and net_if_down when it drops). (BT already does this)
Either this is mixing layer L3/IP states with L2/LL, or you do want to
introduce runtime RFKILL concept, which is it? If this is for L3/IP
then that should not mess up with L2 API, for runtime RFKILL this
should be done in the L1 driver so we disable everything, including
interrupts and could possible power down the radio. Thoughts?
Neither. I don't believe there is any L3 in either proposal here, if
you think there is then can you be specific about where you see it?

My understanding of rfkill is that it is a mechanism to allow some
management agent or the user to kill all RF output in a device. The
reference to BT advertising above is actually in response to your IRC
comment that if the interface is disabled we should not be advertising
in bluetooth, I interpreted which I've interpreted as BT should not
advertise IPSP rather than as "RFKILL everything", did i miss
understand?

My intention here is not to change the BT behaviour. My interest is
in finding a way forward to split the current mixed net_if/L2
enable/disable/up/down behaviour embedded in net_if_up() and
net_if_down() into a distinct enable/disable and up/down, specifically
splitting out the up/down semantic such that network devices can
communicate link up/down upwards through the stack and dhcp can catch
those notifications.

Cheers
/Marcus


Luiz Augusto von Dentz
 

Hi Marcus,

On Fri, Mar 24, 2017 at 11:59 AM, Marcus Shawcroft
<marcus.shawcroft@gmail.com> wrote:
Hi,


On 23 March 2017 at 19:26, Luiz Augusto von Dentz <luiz.dentz@gmail.com> wrote:
Hi Marcus,

On Thu, Mar 23, 2017 at 6:51 PM, Marcus Shawcroft
<marcus.shawcroft@gmail.com> wrote:
Hi,

The network interface patches proposed as a result of this thread have
generated a fair amount of discussion both in patch reviews and in
IRC. Now would seem like a good time to summarize where we are and
pull together some of the various discussion points that have been
raised.

Current status:

A bunch of preparatory patches to dhcpv4 have been merged. Notable changes:
- Public interface now provides net_dhcpv4_start(iface) and
net_dhcpv4_stop(iface).
- Various initialization issues that would prevent concurrent dhcpv4
operation on multiple ifaces are resolved.
- dhcpv4 will now remove leased resources from the network stack on
lease lapse/release.

There is one more small but significant dhcpv4 patch outstanding that
catches L2 up/down events and kicks the appropriate dhcpv4 machinery
per interface. This patch is currently blocked pending necessary
support in net_if (see below). Once this patch is in place an
application will be able to start (and stop dhcpv4) per interface as
now. Once started dhcpv4 will catch L2 up/down events and acquire,
renew and release leases as required, Eventually the responsibility
to call call net_dhcpv4_start/stop() may be moved from an application
to a 'connection manager'... but that is for the future.

The 'net_if' patches are in their third iteration and have generated
by far the most discussion.

The objective of the net_if patches is to arrange for L2 up/down
network management events to be raised when a functional L2 iface
becomes available for use, or conversely becomes unavailable. These
events can then be caught by dhcpv4 in order for dhcpv4 to manage
IP/L3 configuration.

In the current net_if implementation there are two significant
functions: net_if_up and net_if_down(). These functions call the
underlying L2 enable() callback, set and clear the net_if
NET_IF_ENABLED flag and raise NET_EVENT_IF_UP/DOWN network management
events.

After re-reading various comments and discussion on the existing patch
set I've come to the conclusion that there are two different world
views of the conceptual purpose of net_if_up() and net_if_down().

View 1:
net_if_up/down provide an interface for a higher/management layer to
communicate downwards and mark an iface as enabled or disabled
(irrespective of the state of the underlying L2)

This world view is supported by:
- these functions call down to the enable() callback in the underlying
L2 (ie they direcly call L2 telling it whether to enable or disable).
- in the absence of a connection manager the network stack hardwires a
call to net_if_up() for every iface at system boot (net_if_post_init).

View 2:
net_if_up/down provide an interface for an underlying L2 to
communicate upwards that an iface as up/working.

This world view is supported by:
- the bluetooth stack calls net_if_up/down on ipsp connect/disconnect
- the net_if_up/down terminology suggests this behaviour (as opposed
to being explicitly called enable/disable)

Conceptually there are four APIs here: enable/disable and up/down.
The former two provide a management interface that allows a higher
layer to requested that an iface is enabled or disabled, likely called
by a connection manager or equivalent. The latter two allow the stack
below the iface to report upwards whether or not an enabled iface
actually has a link up or not.

The l2 enable callback conceptually belongs with the enable/disable
interface. The network management event up/down signalling
conceptually belongs with the up/down interface.

In the current tree I think we have a slightly odd merge of the two
concepts where some code treats net_if_up/down() as if they implement
enable/disable semantics, while other code treats
net_if_up()/net_if_down() as if they implement up/down semantics.
Notably we have the network stack initialization code hardwiring
net_if_up() on all interfaces and we have L2 enable hung on
net_if_up/down() both of these behaviours associated with an
enable/disable semantic yet we also have the BT stack using
net_if_up/down() as a notification mechanism that L2 is up/down. (It
appears to me that for an iface associated with BT, the iface will be
up'd at system boot and then re-up'd on ipsp connect )
Can anyone add some insight on this issue? The code looks to me as if
net_if_up/down() were originally added to provide an enable/disable
semantic, but we have BT using them in what appears to be an up/down
semantic. In the context of BT ifaces' , why do we have net_if_up()
being called twice on the iface. Initially at system boot and then
again on ipsp_connect ?
Enable/disable was added as an interface to net_if_up and net_if_down,
but since these functions are only supposed to be called from the L2
driver we could actually remove the enable/disable interface since it
is just calling the L2 driver which is what is calling net_if_up in
the first place.
I guess we end up with this design because we wanted net_if_up to turn
the NET_IF_UP during the init procedure, that way drivers that don't
have link detection, those that don't implement enable callback, just
bypass it. Is this that hard to figure out from the code?

Various points that have come up in discussion (please correct me if I
misrepresent or miss some point of view):

1) Should we have enable/disable. The general view seems to be that
we don't have a solid use case for enable/disable therefore we should
not have them.

2) BT advertise should be disabled if a network interface is
disabled(). IMO this is actually the use case that suggests we should
keep enable/disable.

3) Should we have 1 or 2 net_if flags. The general view seems to be
that we should have only 1, I think in practice this is driven by
whether we keep or remove an enable/disable API.

4) Physical interfaces should not power up power down as a result of
L2 enable/disable, that should all be handled via a separate power
management API.


There are (at least) two ways forward:

1) We drop the enable/disable semantic. This implies:
- We remove the L2 enable() callback completely.
- We remove hardwired net_if_up() calls when the network stack boots.
- Every network device needs to call net_if_up() once it has a link
established (and net_if_down when it drops).
This may be racy since the net_if_post_init may not have been called
yet which means the RX and TX thread may not be ready.

- BT stays as it is (advertise hardwired on)
Note that the enable/disable semantic was introduced for L2 link
detection,
which is why it is an L2/LL API. Now from the discussion we
had in the IRC what we seem to be really missing is a L3/IP interface
to tell when that layer is available so the application can start
sending packets. We did agree that we need to make samples that react
to L3/IP being up not L2/LL which should probably remain just to start
the procedure to acquire IP address, etc, so by given this option it
either means we did not understand each other or you did not agree
after all the discussions we had.
I agree, there is a need for a mechanism for applications to
synchronize with L3. However, that is not related to the issue of how
dhcp synchronizes with L2, hence I've deliberately not addressed it in
this thread. That can wait until we have L2 sorted.
dhcp shall subscribe for NET_EVENT_IF_UP, this is in fact already used
in some sample, which should probably be changed to wait for an IP
event instead. Perhaps you want to change the semantic of
NET_EVENT_IF_UP
to signal when both L2 and L3 are up, that is of course possible but
then we need a new event to signal L2 is up, e.g: NET_EVENT_L2_UP,
dhcp would subscribe to that and later emit NET_EVENT_IF_UP when IP is
up, that is all possible but it doesn't have anything to do with
l2->enable/disable as these callbacks are for L2 as the name suggests.

2) We keep the enable/disable semantic. This implies:
- We split net_if_up/down into net_if_enable/disable() and
net_if_up/down() such that net_if_enable calls l2->enable() while
net_if_up/down deals with NET_IF_UP and raising net_event_if_up/down
- The hardwired net_if_up() calls at network stack boot are switched
to call net_if_enable()
- The BT L2 enable callback is used to turn advertising on/off
- Every network device needs to call net_if_up() once it has a link
established (and net_if_down when it drops). (BT already does this)
Either this is mixing layer L3/IP states with L2/LL, or you do want to
introduce runtime RFKILL concept, which is it? If this is for L3/IP
then that should not mess up with L2 API, for runtime RFKILL this
should be done in the L1 driver so we disable everything, including
interrupts and could possible power down the radio. Thoughts?
Neither. I don't believe there is any L3 in either proposal here, if
you think there is then can you be specific about where you see it?

My understanding of rfkill is that it is a mechanism to allow some
management agent or the user to kill all RF output in a device. The
reference to BT advertising above is actually in response to your IRC
comment that if the interface is disabled we should not be advertising
in bluetooth, I interpreted which I've interpreted as BT should not
advertise IPSP rather than as "RFKILL everything", did i miss
understand?

My intention here is not to change the BT behaviour. My interest is
in finding a way forward to split the current mixed net_if/L2
enable/disable/up/down behaviour embedded in net_if_up() and
net_if_down() into a distinct enable/disable and up/down, specifically
splitting out the up/down semantic such that network devices can
communicate link up/down upwards through the stack and dhcp can catch
those notifications.

Cheers
/Marcus


--
Luiz Augusto von Dentz


Piotr Mienkowski
 

Hi Marcus,

Thanks for your extensive and very well written emails. Looking at this thread it's clear that there indeed "there are two different world views of the conceptual purpose of net_if_up() and net_if_down()."

To add to the discussion I would like to remind here one design issue related to Ethernet drivers which, I belive, is not yet clarified. As principle Ethernet drivers are initialized before the network stack is. Currently the initialization process involves bringing up the interface and establishing a functioning data link. If at that time Ethernet driver receives an RX frame it will try to forward it to the network stack even if it may not be initialized yet. That's a race condition. One way to solve it would be to have an interface enable/disable function. The data link would be established only when it is explicitly requested by the network stack.

As you have mentioned a few times already we also need to handle a scenario where a user unplugs the network cable and plugs it in some time later possibly on a different LAN. So also for me the option 2) seems like a more natural way forward.

> 2) We keep the enable/disable semantic.  This implies:
> - We split net_if_up/down into net_if_enable/disable() and
> net_if_up/down() such that net_if_enable calls l2->enable() while
> net_if_up/down deals with NET_IF_UP and raising net_event_if_up/down
> - The hardwired net_if_up() calls at network stack boot are switched
> to call net_if_enable()
> - The BT L2 enable callback is used to turn advertising on/off
> - Every network device needs to call net_if_up() once it has a link
> established (and net_if_down when it drops). (BT already does this)

I'm only not sure how an interface power up / power down functionality fits in here. In any case the final say belongs to the network stack maintainers. 

Cheers,
Piotr


Jukka Rissanen
 

Hi Piotr,

On Tue, 2017-03-28 at 11:11 +0200, Piotr Mienkowski wrote:
Hi Marcus,

Thanks for your extensive and very well written emails. Looking at
this thread it's clear that there indeed "there are two different
world views of the conceptual purpose of net_if_up() and
net_if_down()."

To add to the discussion I would like to remind here one design issue
related to Ethernet drivers which, I belive, is not yet clarified. As
principle Ethernet drivers are initialized before the network stack
is. Currently the initialization process involves bringing up the
interface and establishing a functioning data link. If at that time
Ethernet driver receives an RX frame it will try to forward it to the
network stack even if it may not be initialized yet. That's a race
condition. One way to solve it would be to have an interface
enable/disable function. The data link would be established only when
it is explicitly requested by the network stack.
Currently the interface is marked UP when we are ready to receive
packets. So far this has worked well and I propose we keep it like
this.


As you have mentioned a few times already we also need to handle a
scenario where a user unplugs the network cable and plugs it in some
time later possibly on a different LAN. So also for me the option 2)
seems like a more natural way forward.

2) We keep the enable/disable semantic.  This implies:
- We split net_if_up/down into net_if_enable/disable() and
net_if_up/down() such that net_if_enable calls l2->enable() while
net_if_up/down deals with NET_IF_UP and raising
net_event_if_up/down
- The hardwired net_if_up() calls at network stack boot are
switched
to call net_if_enable()
- The BT L2 enable callback is used to turn advertising on/off
- Every network device needs to call net_if_up() once it has a link
established (and net_if_down when it drops). (BT already does this)
I'm only not sure how an interface power up / power down
functionality fits in here. In any case the final say belongs to the
network stack maintainers. 

Cheers,
Piotr
Cheers,
Jukka


Piotr Mienkowski
 

On 24.03.2017 12:07, Luiz Augusto von Dentz wrote:
Hi Marcus,

On Fri, Mar 24, 2017 at 11:59 AM, Marcus Shawcroft
<marcus.shawcroft@gmail.com> wrote:
Hi,


On 23 March 2017 at 19:26, Luiz Augusto von Dentz <luiz.dentz@gmail.com> wrote:
Hi Marcus,

On Thu, Mar 23, 2017 at 6:51 PM, Marcus Shawcroft
<marcus.shawcroft@gmail.com> wrote:
Hi,

The network interface patches proposed as a result of this thread have
generated a fair amount of discussion both in patch reviews and in
IRC. Now would seem like a good time to summarize where we are and
pull together some of the various discussion points that have been
raised.

Current status:

A bunch of preparatory patches to dhcpv4 have been merged. Notable changes:
- Public interface now provides net_dhcpv4_start(iface) and
net_dhcpv4_stop(iface).
- Various initialization issues that would prevent concurrent dhcpv4
operation on multiple ifaces are resolved.
- dhcpv4 will now remove leased resources from the network stack on
lease lapse/release.

There is one more small but significant dhcpv4 patch outstanding that
catches L2 up/down events and kicks the appropriate dhcpv4 machinery
per interface. This patch is currently blocked pending necessary
support in net_if (see below). Once this patch is in place an
application will be able to start (and stop dhcpv4) per interface as
now. Once started dhcpv4 will catch L2 up/down events and acquire,
renew and release leases as required, Eventually the responsibility
to call call net_dhcpv4_start/stop() may be moved from an application
to a 'connection manager'... but that is for the future.

The 'net_if' patches are in their third iteration and have generated
by far the most discussion.

The objective of the net_if patches is to arrange for L2 up/down
network management events to be raised when a functional L2 iface
becomes available for use, or conversely becomes unavailable. These
events can then be caught by dhcpv4 in order for dhcpv4 to manage
IP/L3 configuration.

In the current net_if implementation there are two significant
functions: net_if_up and net_if_down(). These functions call the
underlying L2 enable() callback, set and clear the net_if
NET_IF_ENABLED flag and raise NET_EVENT_IF_UP/DOWN network management
events.

After re-reading various comments and discussion on the existing patch
set I've come to the conclusion that there are two different world
views of the conceptual purpose of net_if_up() and net_if_down().

View 1:
net_if_up/down provide an interface for a higher/management layer to
communicate downwards and mark an iface as enabled or disabled
(irrespective of the state of the underlying L2)

This world view is supported by:
- these functions call down to the enable() callback in the underlying
L2 (ie they direcly call L2 telling it whether to enable or disable).
- in the absence of a connection manager the network stack hardwires a
call to net_if_up() for every iface at system boot (net_if_post_init).

View 2:
net_if_up/down provide an interface for an underlying L2 to
communicate upwards that an iface as up/working.

This world view is supported by:
- the bluetooth stack calls net_if_up/down on ipsp connect/disconnect
- the net_if_up/down terminology suggests this behaviour (as opposed
to being explicitly called enable/disable)

Conceptually there are four APIs here: enable/disable and up/down.
The former two provide a management interface that allows a higher
layer to requested that an iface is enabled or disabled, likely called
by a connection manager or equivalent. The latter two allow the stack
below the iface to report upwards whether or not an enabled iface
actually has a link up or not.

The l2 enable callback conceptually belongs with the enable/disable
interface. The network management event up/down signalling
conceptually belongs with the up/down interface.

In the current tree I think we have a slightly odd merge of the two
concepts where some code treats net_if_up/down() as if they implement
enable/disable semantics, while other code treats
net_if_up()/net_if_down() as if they implement up/down semantics.
Notably we have the network stack initialization code hardwiring
net_if_up() on all interfaces and we have L2 enable hung on
net_if_up/down() both of these behaviours associated with an
enable/disable semantic yet we also have the BT stack using
net_if_up/down() as a notification mechanism that L2 is up/down. (It
appears to me that for an iface associated with BT, the iface will be
up'd at system boot and then re-up'd on ipsp connect )
Can anyone add some insight on this issue? The code looks to me as if
net_if_up/down() were originally added to provide an enable/disable
semantic, but we have BT using them in what appears to be an up/down
semantic. In the context of BT ifaces' , why do we have net_if_up()
being called twice on the iface. Initially at system boot and then
again on ipsp_connect ?
Enable/disable was added as an interface to net_if_up and net_if_down,
but since these functions are only supposed to be called from the L2
driver we could actually remove the enable/disable interface since it
is just calling the L2 driver which is what is calling net_if_up in
the first place.
I guess we end up with this design because we wanted net_if_up to turn
the NET_IF_UP during the init procedure, that way drivers that don't
have link detection, those that don't implement enable callback, just
bypass it. Is this that hard to figure out from the code?

Various points that have come up in discussion (please correct me if I
misrepresent or miss some point of view):

1) Should we have enable/disable. The general view seems to be that
we don't have a solid use case for enable/disable therefore we should
not have them.

2) BT advertise should be disabled if a network interface is
disabled(). IMO this is actually the use case that suggests we should
keep enable/disable.

3) Should we have 1 or 2 net_if flags. The general view seems to be
that we should have only 1, I think in practice this is driven by
whether we keep or remove an enable/disable API.

4) Physical interfaces should not power up power down as a result of
L2 enable/disable, that should all be handled via a separate power
management API.


There are (at least) two ways forward:

1) We drop the enable/disable semantic. This implies:
- We remove the L2 enable() callback completely.
- We remove hardwired net_if_up() calls when the network stack boots.
- Every network device needs to call net_if_up() once it has a link
established (and net_if_down when it drops).
This may be racy since the net_if_post_init may not have been called
yet which means the RX and TX thread may not be ready.

- BT stays as it is (advertise hardwired on)
Note that the enable/disable semantic was introduced for L2 link
detection,
which is why it is an L2/LL API. Now from the discussion we
had in the IRC what we seem to be really missing is a L3/IP interface
to tell when that layer is available so the application can start
sending packets. We did agree that we need to make samples that react
to L3/IP being up not L2/LL which should probably remain just to start
the procedure to acquire IP address, etc, so by given this option it
either means we did not understand each other or you did not agree
after all the discussions we had.
I agree, there is a need for a mechanism for applications to
synchronize with L3. However, that is not related to the issue of how
dhcp synchronizes with L2, hence I've deliberately not addressed it in
this thread. That can wait until we have L2 sorted.
dhcp shall subscribe for NET_EVENT_IF_UP, this is in fact already used
in some sample, which should probably be changed to wait for an IP
event instead. Perhaps you want to change the semantic of
NET_EVENT_IF_UP
to signal when both L2 and L3 are up, that is of course possible but
then we need a new event to signal L2 is up, e.g: NET_EVENT_L2_UP,
dhcp would subscribe to that and later emit NET_EVENT_IF_UP when IP is
up, that is all possible but it doesn't have anything to do with
l2->enable/disable as these callbacks are for L2 as the name suggests.

2) We keep the enable/disable semantic. This implies:
- We split net_if_up/down into net_if_enable/disable() and
net_if_up/down() such that net_if_enable calls l2->enable() while
net_if_up/down deals with NET_IF_UP and raising net_event_if_up/down
- The hardwired net_if_up() calls at network stack boot are switched
to call net_if_enable()
- The BT L2 enable callback is used to turn advertising on/off
- Every network device needs to call net_if_up() once it has a link
established (and net_if_down when it drops). (BT already does this)
Either this is mixing layer L3/IP states with L2/LL, or you do want to
introduce runtime RFKILL concept, which is it? If this is for L3/IP
then that should not mess up with L2 API, for runtime RFKILL this
should be done in the L1 driver so we disable everything, including
interrupts and could possible power down the radio. Thoughts?
Neither. I don't believe there is any L3 in either proposal here, if
you think there is then can you be specific about where you see it?

My understanding of rfkill is that it is a mechanism to allow some
management agent or the user to kill all RF output in a device. The
reference to BT advertising above is actually in response to your IRC
comment that if the interface is disabled we should not be advertising
in bluetooth, I interpreted which I've interpreted as BT should not
advertise IPSP rather than as "RFKILL everything", did i miss
understand?

My intention here is not to change the BT behaviour. My interest is
in finding a way forward to split the current mixed net_if/L2
enable/disable/up/down behaviour embedded in net_if_up() and
net_if_down() into a distinct enable/disable and up/down, specifically
splitting out the up/down semantic such that network devices can
communicate link up/down upwards through the stack and dhcp can catch
those notifications.
So, what is the final outcome of the discussion. Which one of the two
proposals Marcus has made is the preferred one? Is there a third option?
It seems it is still unclear how to handle data link layer connection
lost / reestablished events, e.g. which function should be called by an
Ethernet driver when it determines that network cable was dis-/re-connected?

Piotr