Re: Zephyr DFU protocol


David Brown
 

On Mon, Aug 28, 2017 at 12:45:37PM +0000, Cufi, Carles wrote:

As you might already know, we've been working on the introduction of
DFU (Device Firmware Upgrade) to Zephyr. Several Pull Requests have
been posted dealing with the low-level flash and image access modules
required to store a received image and then boot into it, but that
leaves out one of the key items in the system: the update protocol
that allows an existing running image to obtain an updated one over a
transport mechanism.
My first suggestion: unless we are strictly implementing the USB DFU
protocol, we really should call this something else. DFU is defined
by USB standards, and is a very specific protocol with a very specific
purpose. If what we're looking for is something general across other
transports, we should give it a different name to avoid confusion.

There are several fundamental requirements for such a protocol if we
want it to be future-proof, extensible and practical for embedded
devices:

- Must be packet-based and transport-agnostic
Although this makes sense for non-USB transports, it also precludes
using existing tools for the update when we do have USB as our transport.

My suggestion would be to support DFU for USB, and devise another
protocol for the other transports.

- Must be extensible and flexible
- The server-side implementation (assuming a request/response model) must be relatively simple and require few resources
- Must be compatible with the mcuboot project and model
- At the very least the following transports must be supported: BLE, UART, IP, USB
- A client-side tool (assuming a request/response model) must either exist already or be easily implementable
So this is solved for USB DFU. We would probably have to create tools
for other transports.

With that in mind we proceeded to analyze a few of the existing
protocols out there (the ones we knew about), in order to consider
whether reusing an existing effort was a better approach than
designing and implementing a new protocol from scratch:

1) USB DFU specification[1]
2) Nordic Secure DFU protocol (included in the Nordic SDK)[2]
3) Newt Manager Protocol (part of Mynewt)[3]
4) Distributed DFU over CoAP used in Nordic's Thread SDK[4]

Note: I will use the word "source" to identify the device that
contains the new image, and "target" to identify the one that
receives it, flashes it and then boots into it.

The USB DFU specification does not seem to be a good fit since it
maps specifically to particular USB endpoints and classes, making it
not suitable for other transports without extensive modification.
Using a standard USB class such as CDC ACM as transport, we could
instead map the chosen protocol over a USB physical link.
This is fairly intentional. As I mention above, I would suggest
implementing DFU regardless of other protocols used.

We also see 2 very different image distribution models. In protocols
1, 2 and 3 the source (client) "pushes" an image to the target
(server) after checking that it's applicable based on version
checking and other verifications. In protocol 4 however, the source
acts instead as a server and the targets act as clients that "pull"
images from the source (server) whenever they are available. I
believe that the Linaro DFU implementation also follows the "pull"
paradigm of protocol 4.
They also serve different purposes. DFU (the real one on USB) works
much like a recovery mode. You put the target into DFU mode, and the
USB endpoint becomes a different kind of device than it usually is.
The other upgrade protocols are intended to upgrade live devices. This
is also a good reason to support USB's DFU in addition to whatever
other protocol we come up with.

We believe that the right approach for the sort of ecosystem that
Zephyr targets is the "push" approach, to minimize traffic, reduce
power consumption and also make it possible to use with all
transports. That said, it is important to note that although we are
trying to decide on a default DFU mechanism for Zephyr, all layers
(including the image management) will be independent of it, and it
should therefore be entirely possible to implement an additional
protocol for our users. Furthermore we don't exclude the possibility
of extending the chosen protocol to support a "pull" model as well,
something that should be entirely feasible as long as the protocol of
choice is flexible.
The approach that was demoed at the last Linaro Connect was
pull-based, and essentially had the firmware living on an HTTP server.
It had the advantage of being fairly easy to implement with existing
code in Zephyr.

After analyzing the different options available, we believe the Newt
Manager Protocol (NMP) to be the best-suited option for our current
needs, for reasons outlined below:

- It is proven to work with mcuboot, the default bootloader for Zephyr
- The current mcuboot repository already contains an implementation of NMP for serial recovery
- It uses a "push" model
- It is very simple but also easily extensible
- Uses a simple packet format combining an 8-byte header followed by CBOR[5]-encoded data
- Supports additional functionality on top of basic DFU: stats, filesystem access, date and time setting, etc.
- Already supports the BLE and serial transports
- A command-line tool exists to send images over both BLE and Serial (both Go and JS/Node versions are available)
- It is open source and licensed under the Apache License 2.0
- There are commercial products using it already [6]
I agree that the newtmgr protocol seems to be the best fit for us. Its
serial model would even fit in fairly well with the Zephyr shell,
since it wraps each packet as control character + base64 packet +
control character, which the shell seems to have partial support for
already.
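
To make that wrapping concrete, here is a rough Python sketch of the
idea, not the actual newtmgr implementation. The marker bytes, the
header field layout and all function names are assumptions picked for
illustration. The only things taken from this thread are the 8-byte
header and the "control character + base64 + control character" shape.
A real implementation may also carry a length prefix and a checksum
inside the encoded part, which this sketch leaves out.

```python
import base64
import struct

# Assumed framing bytes, chosen for illustration only.
FRAME_START = b"\x06\x09"
FRAME_END = b"\n"

# Assumed 8-byte header layout (big-endian):
# op, flags, payload length, group, sequence number, command id.
HEADER = struct.Struct(">BBHHBB")


def build_packet(op, group, seq, cmd_id, payload, flags=0):
    """Prefix an already-encoded (e.g. CBOR) payload with the header."""
    return HEADER.pack(op, flags, len(payload), group, seq, cmd_id) + payload


def frame(packet):
    """Wrap a binary packet as: start marker + base64 text + end marker."""
    return FRAME_START + base64.b64encode(packet) + FRAME_END


def unframe(line):
    """Reverse frame(); raise ValueError if the markers are missing."""
    if not (line.startswith(FRAME_START) and line.endswith(FRAME_END)):
        raise ValueError("not a framed packet")
    body = line[len(FRAME_START):-len(FRAME_END)]
    return base64.b64decode(body, validate=True)


def parse_packet(packet):
    """Split a packet into its header fields and its payload bytes."""
    op, flags, length, group, seq, cmd_id = HEADER.unpack_from(packet)
    payload = packet[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    fields = {"op": op, "flags": flags, "group": group,
              "seq": seq, "id": cmd_id}
    return fields, payload


# 0xa0 is the CBOR encoding of an empty map; the other values are made up.
line = frame(build_packet(op=2, group=1, seq=7, cmd_id=0, payload=b"\xa0"))
fields, payload = parse_packet(unframe(line))
```

Because the part between the markers is plain base64 text, a line-based
shell can tell a framed packet apart from typed commands by looking at
the leading control bytes, and pass everything else through untouched.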

David
