Re: Zephyr DFU protocol

Carles Cufi
 

Hi Johann,

Thanks for the feedback.

-----Original Message-----
From: zephyr-devel-bounces@... [mailto:zephyr-devel-bounces@...] On Behalf Of Johann Fischer
Sent: 28 August 2017 15:35
To: zephyr-devel@...
Subject: Re: [Zephyr-devel] Zephyr DFU protocol

Hi,

On 28.08.2017 14:45, Cufi, Carles wrote:

The USB DFU specification does not seem to be a good fit since it maps
specifically to particular USB endpoints and classes, making it not
suitable for other transports without extensive modification. Using a
standard USB class such as CDC ACM as transport, we could instead map
the chosen protocol over a USB physical link.

That surprised me a little. Can you describe in more detail what you
mean by "it maps specifically to particular USB endpoints and
classes"? I think if you have USB, then USB DFU is the most elegant
solution for updates. Or is it about using the same update tool for UART
and USB?
Yes, the whole point here is to find a single protocol, and therefore a single set of update command-line tools, for all transports, so that the only difference among them is a per-transport adaptation layer. That however does *not* prevent Zephyr from also supporting USB DFU or any other DFU mechanism that is widely used and already has a well-established toolset. It is just that I would not recommend using the USB DFU protocol over any other transport as a "universal default protocol".

Regards,

Carles


Re: Zephyr DFU protocol

Johann Fischer
 

Hi,

On 28.08.2017 14:45, Cufi, Carles wrote:
The USB DFU specification does not seem to be a good fit since it maps specifically to particular USB endpoints and classes, making it not suitable for other transports without extensive modification. Using a standard USB class such as CDC ACM as transport, we could instead map the chosen protocol over a USB physical link.
That surprised me a little. Can you describe in more detail what you mean by "it maps specifically to particular USB endpoints and classes"? I think if you have USB, then USB DFU is the most elegant solution for updates. Or is it about using the same update tool for UART and USB?

--
Best Regards,
Johann Fischer


Zephyr DFU protocol

Carles Cufi
 

Hi all,

As you might already know, we've been working on the introduction of DFU (Device Firmware Upgrade) to Zephyr. Several Pull Requests have been posted dealing with the low-level flash and image access modules required to store a received image and then boot into it, but that leaves out one of the key items in the system: the update protocol that allows an existing running image to obtain an updated one over a transport mechanism.

There are several fundamental requirements for such a protocol if we want it to be future-proof, extensible and practical for embedded devices:

- Must be packet-based and transport-agnostic
- Must be extensible and flexible
- The server-side implementation (assuming a request/response model) must be relatively simple and require few resources
- Must be compatible with the mcuboot project and model
- At the very least the following transports must be supported: BLE, UART, IP, USB
- A client-side tool (assuming a request/response model) must either exist already or be easily implementable

With that in mind we proceeded to analyze a few of the existing protocols out there (the ones we knew about), in order to consider whether reusing an existing effort was a better approach than designing and implementing a new protocol from scratch:

1) USB DFU specification[1]
2) Nordic Secure DFU protocol (included in the Nordic SDK)[2]
3) Newt Manager Protocol (part of Mynewt)[3]
4) Distributed DFU over CoAP used in Nordic's Thread SDK[4]

Note: I will use the word "source" to identify the device that contains the new image, and "target" to identify the one that receives it, flashes it and then boots into it.

The USB DFU specification does not seem to be a good fit since it maps specifically to particular USB endpoints and classes, making it not suitable for other transports without extensive modification. Using a standard USB class such as CDC ACM as transport, we could instead map the chosen protocol over a USB physical link.
The Nordic Secure DFU protocol is also very tightly mapped to the Nordic software architecture, including assumptions that the Bluetooth Protocol Stack is decoupled from the bootloader and application images and is permanently available through a set of system calls.

We also see 2 very different image distribution models. In protocols 1, 2 and 3 the source (client) "pushes" an image to the target (server) after checking that it's applicable based on version checking and other verifications. In protocol 4 however, the source acts instead as a server and the targets act as clients that "pull" images from the source (server) whenever they are available. I believe that the Linaro DFU implementation also follows the "pull" paradigm of protocol 4.

We believe that the right approach for the sort of ecosystem that Zephyr targets is the "push" approach, to minimize traffic, reduce power consumption and also make it possible to use with all transports. That said, it is important to note that although we are trying to decide on a default DFU mechanism for Zephyr, all layers (including the image management) will be independent of it, and it should therefore be entirely possible to implement an additional protocol for our users. Furthermore we don't exclude the possibility of extending the chosen protocol to support a "pull" model as well, something that should be entirely feasible as long as the protocol of choice is flexible.

After analyzing the different options available, we believe the Newt Manager Protocol (NMP) to be the best-suited option for our current needs, for the reasons outlined below:

- It is proven to work with mcuboot, the default bootloader for Zephyr
- The current mcuboot repository already contains an implementation of NMP for serial recovery
- It uses a "push" model
- It is very simple but also easily extensible
- Uses a simple packet format combining an 8-byte header followed by CBOR[5]-encoded data
- Supports additional functionality on top of basic DFU: stats, filesystem access, date and time setting, etc.
- Already supports the BLE and serial transports
- A command-line tool exists to send images over both BLE and Serial (both Go and JS/Node versions are available)
- It is open source and licensed under the Apache License 2.0
- There are commercial products using it already [6]
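
To make the "8-byte header followed by CBOR-encoded data" point a bit more concrete, here is a rough sketch of how such a header could be serialized in C. The field names, their order and the big-endian encoding of the 16-bit fields are assumptions made for illustration (none of them are specified in this email); the Mynewt source [11] remains the reference for the actual layout.

```c
#include <stddef.h>
#include <stdint.h>

#define NMP_HDR_SIZE 8u

/* Hypothetical field set: names and order are assumptions, not a spec. */
struct nmp_hdr {
    uint8_t  op;     /* request/response opcode */
    uint8_t  flags;  /* reserved flags */
    uint16_t len;    /* length of the CBOR payload that follows */
    uint16_t group;  /* command group */
    uint8_t  seq;    /* sequence number */
    uint8_t  id;     /* command ID within the group */
};

/* Serialize byte by byte so the result does not depend on struct
 * padding or host endianness. Returns the number of bytes written,
 * or 0 if the buffer is too small. */
size_t nmp_hdr_pack(const struct nmp_hdr *h, uint8_t *buf, size_t buf_len)
{
    if (buf_len < NMP_HDR_SIZE) {
        return 0;
    }
    buf[0] = h->op;
    buf[1] = h->flags;
    buf[2] = (uint8_t)(h->len >> 8);      /* assumed big-endian */
    buf[3] = (uint8_t)(h->len & 0xFFu);
    buf[4] = (uint8_t)(h->group >> 8);    /* assumed big-endian */
    buf[5] = (uint8_t)(h->group & 0xFFu);
    buf[6] = h->seq;
    buf[7] = h->id;
    return NMP_HDR_SIZE;
}

/* Inverse of nmp_hdr_pack(). Returns 0 on success, -1 on short input. */
int nmp_hdr_unpack(struct nmp_hdr *h, const uint8_t *buf, size_t buf_len)
{
    if (buf_len < NMP_HDR_SIZE) {
        return -1;
    }
    h->op    = buf[0];
    h->flags = buf[1];
    h->len   = (uint16_t)(((uint16_t)buf[2] << 8) | buf[3]);
    h->group = (uint16_t)(((uint16_t)buf[4] << 8) | buf[5]);
    h->seq   = buf[6];
    h->id    = buf[7];
    return 0;
}
```

A transport adaptation layer (BLE, UART, IP, USB) would then only need to move the packed header plus the CBOR payload as opaque bytes.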

The protocol itself consists of two different entities, the client sending requests and the server replying with responses.
The client side is typically a higher specced device running a full operating system (computer or portable device), whereas the server is the target of the DFU procedure and receives the image, stores it and then boots into it.
Additionally, the protocol also supports an OIC (now OCF) variant where the target/server exposes a discoverable server resource through the OCF framework over IPv6 and CoAP, making it possible to use it in a "distributed push" model where a single client can discover multiple servers and push an image to them.[7] This is an interesting feature since it enables DFU over IPv6 and CoAP out of the box, even without having to switch to a "pull" model.

Unfortunately the protocol itself is not documented in a specification, and instead the source code of the different implementations must currently be used to examine and understand the protocol. In terms of currently available implementations, there are the following:

- client/source side:
  - newtmgr: Written in Go, this is the official Newt Manager Protocol client. Supports both the standard (over BLE and serial) and OIC (over IP) variants and all additional features [8]
  - node-newtmgr: Unofficial NodeJS reimplementation of newtmgr, supports the standard variant over BLE and serial [9]
  - Adafruit Mynewt Manager iOS application [10]
- server/target side:
  - Mynewt Newt Manager Protocol implementation. Supports both variants and all transports [11]

There's also the choice, not discussed so far, to implement a brand new protocol completely tailored for Zephyr and designed from scratch. This has some advantages, such as being able to define it completely, adapt it to the particularities of Zephyr and let everybody contribute to the choice of protocol, format and standards to use. That said, given that a protocol already exists that has been proven to work with an operating system similar to Zephyr, that clients are already available for both desktop and iOS, and that reusing an existing component could potentially save a lot of development time, as it did with mcuboot, we have not pursued this option further for now.

We are eager to hear from everybody regarding the preliminary choice, including whether you know other, alternative protocols that are not known to us, whether there are requirements that are not met by our proposal or in general opinions and questions.

Regards,

Nordic Team

[1] http://www.usb.org/developers/docs/devclass_docs/DFU_1.1.pdf
[2] http://infocenter.nordicsemi.com/index.jsp?topic=%2Fcom.nordic.infocenter.sdk5.v14.0.0%2Flib_bootloader_dfu.html&cp=4_0_0_3_5_1
[3] http://mynewt.apache.org/latest/os/modules/devmgmt/newtmgr/
[4] http://infocenter.nordicsemi.com/index.jsp?topic=%2Fcom.nordic.infocenter.threadsdk.v0.10.0%2Fthread_example_dfu.html&cp=4_2_0_2_3
[5] https://tools.ietf.org/html/rfc7049
[6] https://www.adafruit.com/product/3574
[7] http://mynewt.apache.org/latest/os/modules/devmgmt/oicmgr/
[8] https://github.com/apache/mynewt-newtmgr
[9] https://github.com/jacobrosenthal/node-newtmgr
[10] https://learn.adafruit.com/adafruit-nrf52-pro-feather/adafruit-mynewt-manager
[11] https://github.com/apache/mynewt-core/tree/master/mgmt


Re: RFC: Replacing Make/Kbuild with CMake

Carles Cufi
 

Hi Martï,

 

Sorry for the huge delay on this, it slipped through the cracks!

 

Regarding Meson, I actually don’t think this is a bad idea at all, the list you mention:

 

- is cross-platform (Windows / Mac / Linux)
- supports cross-compilation
- generates a build system (e.g. for Ninja and IDEs like Visual Studio and XCode)
- only has a hard dependency on Python 3 (but the build files are not written in Python)
- includes converter scripts to help start a transition from Make, CMake, and other build systems
- has a modern build file language ("real" data types, immutable variables, not Turing complete)

 

gives a good overview of features that are comparable to CMake, so here is my take on this after looking a little bit more into the Meson doc:

 

Plus sides of using Meson:

 

+ Only has Python 3 as a dependency (we already require it for other parts of the build)

+ The build file language is definitely very clear, concise and perhaps cleaner than CMake’s

+ The project is run in a very similar fashion to Zephyr (mailing list, IRC, open governance)

+ Extensive and clear documentation (I was really impressed by this)

+ It is modern and designed from scratch, with the benefit of the experience gained from using CMake and make

 

And here are the negative sides of it:

 

- Very young project with a limited set of users, so unknown future

- Mainly used on Linux for big GNOME and similar projects, a very different use case from ours

- Cross-compilation is officially supported, but I could not find large bare-metal (i.e. not targeting Linux) projects using it, so there might be unexpected hurdles there

- Integration with Visual Studio seems to require adding a script in VS itself, whereas in CMake you only need to provide the VS version when you generate. It also seems to support only VS2015

- Speed is unknown? We know that CMake+Ninja is extremely fast

- I could not find documentation for basic cross-platform file operations (copy, move, delete, MD5, etc.). Is one supposed to do this in Python?

 

Thanks again for this, and I do believe there are significant plus sides to Meson, but ultimately it will be the opinion of the majority of our current users and developers that should decide for one or the other.

 

Thanks,

 

Carles

 

From: Marti Bolivar [mailto:marti.bolivar@...]
Sent: 12 April 2017 16:11
To: Cufi, Carles <Carles.Cufi@...>
Cc: Luiz Augusto von Dentz <luiz.dentz@...>; Andersson, Joakim <Joakim.Andersson@...>; devel@...
Subject: Re: [Zephyr-devel] RFC: Replacing Make/Kbuild with CMake

 

Hi Carles,

 

On 12 April 2017 at 08:23, Cufi, Carles <Carles.Cufi@...> wrote:


[snip]

I'd like to add a bit of background here. When it comes to cross-platform (i.e. running natively on Windows) build systems for C/C++ projects, there seem to be only a few well-known and active solutions out there:

* CMake
* SCons

SCons has the advantage of being pure Python code, and given that we already require Python for other build utilities and there are talks of porting Kconfig to Python, this would reduce the number of software packages required to build Zephyr. However it doesn't really have proper IDE generation and is slower, so we opted for prototyping with CMake instead.

The other alternative is to build our own, suited to our particular needs, a bit like Mynewt did with their newt tool. But this is in general frowned upon since it would mean writing yet another piece of software instead of reusing a well-tested one, so for now the idea has remained on the sidelines.

 

I wasn't present at the conversations where the build system change was discussed, but was Meson (http://mesonbuild.com/) considered?

I don't have personal experience using it, but from its documentation, Meson:


- is cross-platform (Windows / Mac / Linux)
- supports cross-compilation
- generates a build system (e.g. for Ninja and IDEs like Visual Studio and XCode)
- only has a hard dependency on Python 3 (but the build files are not written in Python)
- includes converter scripts to help start a transition from Make, CMake, and other build systems
- has a modern build file language ("real" data types, immutable variables, not Turing complete)

The main downside I see relative to CMake is that Meson is less mature. However, it is actively developed and various projects (e.g. GNOME, X, Wayland/Weston) are looking seriously at a move to Meson [1], [2].

Any pointers on how the list was narrowed to SCons and CMake would be appreciated.

 

Thanks,


Re: bitfields

Piotr Mienkowski
 

Hi Jukka,

maybe a bit of a weird coding style question, but for CAN support I need
a CAN ID "struct". The CAN ID consists of an 11 or 29 bit ID, a flag that
says whether it is 29 or 11 bit, an RTR flag and possibly an ERROR flag,
which totals exactly 32 bits.

In Linux canid_t is just a typedef for a u32_t, and macros/defines are
used to access the flags and mask the IDs, something like:

typedef u32_t canid_t;

#define CAN_EFF_FLAG 0x80000000U /* EFF/SFF is set in the MSB */
#define CAN_RTR_FLAG 0x40000000U /* remote transmission request */
#define CAN_ERR_FLAG 0x20000000U /* error message frame */

#define CAN_SFF_MASK 0x000007FFU /* standard frame format (SFF) */
#define CAN_EFF_MASK 0x1FFFFFFFU /* extended frame format (EFF) */

In Zephyr I have also seen some use (DMA for example) of the
"u32_t flag:1;" constructs. So a canid could be something like:

struct canid {
    u32_t id:29;
    u32_t eid:1;
    u32_t rtr:1;
    u32_t err:1;
};

Is there a preference for either of these constructs to encode
bitfields?

I have no preference here, using the bit field values is usually quite
convenient but it really depends how you are using these values.
You could have both ways if you put a union inside struct canid.

I believe you are talking about a solution currently used by Zephyr's
I2C API. There we have:

union dev_config {
    u32_t raw;
    struct __bits {
        u32_t use_10_bit_addr : 1;
        u32_t speed : 3;
        u32_t is_master_device : 1;
        u32_t reserved : 26;
    } bits;
};

This is however incorrect. C99 §6.7.2.1, paragraph 10 says: "The order
of allocation of bit-fields within a unit (high-order to low-order or
low-order to high-order) is implementation-defined.". I.e. - using union
dev_config as an example - compiler is free to map use_10_bit_addr
either to MSB or to LSB. The two methods of specifying bit fields are
not equivalent and should not be mixed.
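
For comparison, a mask-based encoding in the style of the Linux defines quoted above does not depend on how the compiler lays out bit-fields. Below is a minimal sketch, assuming hypothetical helper names (canid_make() and friends are not an existing Zephyr or Linux API) and using uint32_t from <stdint.h> instead of Zephyr's u32_t so that it compiles standalone:

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t canid_t;

#define CAN_EFF_FLAG 0x80000000U /* EFF/SFF is set in the MSB */
#define CAN_RTR_FLAG 0x40000000U /* remote transmission request */
#define CAN_ERR_FLAG 0x20000000U /* error message frame */

#define CAN_SFF_MASK 0x000007FFU /* standard frame format (SFF) */
#define CAN_EFF_MASK 0x1FFFFFFFU /* extended frame format (EFF) */

/* Hypothetical helper: build a CAN ID word from its parts. The bit
 * positions are fixed by the masks, not by the compiler. */
static inline canid_t canid_make(uint32_t id, bool extended, bool rtr)
{
    canid_t v = extended ? ((id & CAN_EFF_MASK) | CAN_EFF_FLAG)
                         : (id & CAN_SFF_MASK);

    if (rtr) {
        v |= CAN_RTR_FLAG;
    }
    return v;
}

/* Hypothetical helper: true if the ID uses the 29-bit extended format. */
static inline bool canid_is_extended(canid_t v)
{
    return (v & CAN_EFF_FLAG) != 0U;
}

/* Hypothetical helper: true if the remote transmission request flag is set. */
static inline bool canid_is_rtr(canid_t v)
{
    return (v & CAN_RTR_FLAG) != 0U;
}

/* Hypothetical helper: extract the 11 or 29 bit identifier. */
static inline uint32_t canid_id(canid_t v)
{
    return v & (canid_is_extended(v) ? CAN_EFF_MASK : CAN_SFF_MASK);
}
```

With this approach the raw 32-bit value has the same meaning on every compiler, which is what you want if it is ever written to a register or sent over the wire.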

I will submit a bug report to remove this usage from I2C API.

- Piotr


Re: bitfields

Jukka Rissanen
 

Hi Erwin,

On Fri, 2017-08-25 at 18:02 +0200, Erwin Rol wrote:
Hello,

maybe a bit of a weird coding style question, but for CAN support I need
a CAN ID "struct". The CAN ID consists of an 11 or 29 bit ID, a flag that
says whether it is 29 or 11 bit, an RTR flag and possibly an ERROR flag,
which totals exactly 32 bits.

In Linux canid_t is just a typedef for a u32_t, and macros/defines are
used to access the flags and mask the IDs, something like:

typedef u32_t canid_t;

#define CAN_EFF_FLAG 0x80000000U /* EFF/SFF is set in the MSB */
#define CAN_RTR_FLAG 0x40000000U /* remote transmission request */
#define CAN_ERR_FLAG 0x20000000U /* error message frame */

#define CAN_SFF_MASK 0x000007FFU /* standard frame format (SFF) */
#define CAN_EFF_MASK 0x1FFFFFFFU /* extended frame format (EFF) */

In Zephyr I have also seen some use (DMA for example) of the
"u32_t flag:1;" constructs. So a canid could be something like:

struct canid {
    u32_t id:29;
    u32_t eid:1;
    u32_t rtr:1;
    u32_t err:1;
};

Is there a preference for either of these constructs to encode
bitfields?

I have no preference here, using the bit field values is usually quite
convenient but it really depends how you are using these values.
You could have both ways if you put a union inside struct canid.


Cheers,
Jukka


Counter API ambiguity.

Michał Kruszewski <mkru@...>
 

In the counter API we can find the following function with its description:

/**
* @brief Set an alarm.
* @param dev Pointer to the device structure for the driver instance.
* @param callback Pointer to the callback function. if this is NULL,
*                 this function unsets the alarm.
* @param count Number of counter ticks.
* @param user_data Pointer to user data.
*
* @retval 0 If successful.
* @retval -ENOTSUP if the counter was not started yet.
* @retval -ENODEV if the device doesn't support interrupt (e.g. free
*                        running counters).
* @retval Negative errno code if failure.
*/
static inline int counter_set_alarm(struct device *dev,
                                    counter_callback_t callback,
                                    u32_t count, void *user_data)
{
    const struct counter_driver_api *api = dev->driver_api;

    return api->set_alarm(dev, callback, count, user_data);
}

The description "@param count Number of counter ticks" is misleading because it does not say explicitly whether count is relative to the current counter value or an absolute counter value.
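
To show why the distinction matters, here is a small sketch of the two possible readings. The helper names are hypothetical (they are not part of the counter API), and it assumes a 32-bit up-counting counter that wraps around:

```c
#include <stdint.h>

/* Reading 1 (relative): the alarm fires `count` ticks after `now`.
 * Unsigned arithmetic wraps modulo 2^32, like the counter itself. */
static inline uint32_t alarm_expiry_relative(uint32_t now, uint32_t count)
{
    return now + count;
}

/* Reading 2 (absolute): the alarm fires when the counter value
 * equals `count`, regardless of the current value. */
static inline uint32_t alarm_expiry_absolute(uint32_t now, uint32_t count)
{
    (void)now;
    return count;
}

/* Ticks an up-counter still has to run from `now` to reach `expiry`. */
static inline uint32_t ticks_until(uint32_t now, uint32_t expiry)
{
    return expiry - now;
}
```

With the counter at 1000 and count set to 500, the relative reading fires after 500 ticks, whereas the absolute reading only fires after the counter has wrapped around, almost 2^32 ticks later.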

Can anyone clarify this, and can we open a PR to add that information to the API documentation so that application developers do not misinterpret it?

Michał Kruszewski



bitfields

Erwin Rol
 

Hello,

maybe a bit of a weird coding style question, but for CAN support I need
a CAN ID "struct". The CAN ID consists of an 11 or 29 bit ID, a flag that
says whether it is 29 or 11 bit, an RTR flag and possibly an ERROR flag,
which totals exactly 32 bits.

In Linux canid_t is just a typedef for a u32_t, and macros/defines are
used to access the flags and mask the IDs, something like:

typedef u32_t canid_t;

#define CAN_EFF_FLAG 0x80000000U /* EFF/SFF is set in the MSB */
#define CAN_RTR_FLAG 0x40000000U /* remote transmission request */
#define CAN_ERR_FLAG 0x20000000U /* error message frame */

#define CAN_SFF_MASK 0x000007FFU /* standard frame format (SFF) */
#define CAN_EFF_MASK 0x1FFFFFFFU /* extended frame format (EFF) */

In Zephyr I have also seen some use (DMA for example) of the
"u32_t flag:1;" constructs. So a canid could be something like:

struct canid {
    u32_t id:29;
    u32_t eid:1;
    u32_t rtr:1;
    u32_t err:1;
};

Is there a preference for either of these constructs to encode
bitfields?

- Erwin


Re: Adding Nucleo-F030R8 support to Zephyr - runtime error

Yannis Damigos
 

Hi Maciej,

On 08/25/2017 03:24 PM, Maciej Dębski wrote:
Gentlemen,

thank you for your quick responses!

As I wanted to provide more info and debug output, I accidentally found the issue.
This little change in arch/arm/core/cortex_m/prep_c.c caused sys fatal error on my f0 board, even before the stm32f0_init.

Here is the commit:
https://github.com/zephyrproject-rtos/zephyr/commit/eb48a0a73c11c2a9cd4c3c91864ca4e0cf52dae8

And here are the specific changes causing problem:

diff --git a/arch/arm/core/cortex_m/prep_c.c b/arch/arm/core/cortex_m/prep_c.c
index d23dd8b..1382379 100644
--- a/arch/arm/core/cortex_m/prep_c.c
+++ b/arch/arm/core/cortex_m/prep_c.c
@@ -22,9 +22,20 @@
 #include <linker/linker-defs.h>
 #include <nano_internal.h>
 #include <arch/arm/cortex_m/cmsis.h>
+#include <string.h>
 
 #ifdef CONFIG_ARMV6_M
-static inline void relocate_vector_table(void) { /* do nothing */ }
+
+#define VECTOR_ADDRESS 0
+static inline void relocate_vector_table(void)
+{
+#if defined(CONFIG_XIP) && (CONFIG_FLASH_BASE_ADDRESS != 0) || \
+    !defined(CONFIG_XIP) && (CONFIG_SRAM_BASE_ADDRESS != 0)
+       size_t vector_size = (size_t)_vector_end - (size_t)_vector_start;
+       memcpy(VECTOR_ADDRESS, _vector_start, vector_size);
+#endif
+}
+
 #elif defined(CONFIG_ARMV7_M)
 #ifdef CONFIG_XIP
 #define VECTOR_ADDRESS ((uintptr_t)&_image_rom_start + \

As I understand it (I may be wrong), the code above does not handle the case of stm32f0 where the main Flash memory is aliased in the boot memory space (0x0000 0000), but still accessible from its original memory space (0x0800 0000).

When I deleted the new body of the relocate_vector_table function (to do nothing as it was) - blinky and hello world started to work fine again.
What should I do now? How should I report it properly?

You could provide a patch to address the issue and/or report it using GitHub issues.

Yannis


RFC: Watchdog API update.

Kruszewski, Michal <Michal.Kruszewski@...>
 

Hello developers,


As the current watchdog API looks like a legacy of QMSI, I would like to propose a refresh.

My RFC adds support for watchdogs with multiple reload channels. It also enables configuring watchdog timer behavior when the CPU is in a sleep state as well as when it is halted by the debugger.

The biggest advantage is that it enables setting the watchdog timeout value in a human-friendly unit: microseconds.


Here is my PR: https://github.com/zephyrproject-rtos/zephyr/pull/1260


I am waiting for feedback!


Regards,

Michał Kruszewski


Re: Adding Nucleo-F030R8 support to Zephyr - runtime error

Maciej Dębski <maciej.debski@...>
 

Gentlemen,

thank you for your quick responses!

As I wanted to provide more info and debug output, I accidentally found the issue.
This little change in arch/arm/core/cortex_m/prep_c.c caused sys fatal error on my f0 board, even before the stm32f0_init.

Here is the commit:
And here are the specific changes causing problem:

diff --git a/arch/arm/core/cortex_m/prep_c.c b/arch/arm/core/cortex_m/prep_c.c
index d23dd8b..1382379 100644
--- a/arch/arm/core/cortex_m/prep_c.c
+++ b/arch/arm/core/cortex_m/prep_c.c
@@ -22,9 +22,20 @@
 #include <linker/linker-defs.h>
 #include <nano_internal.h>
 #include <arch/arm/cortex_m/cmsis.h>
+#include <string.h>
 
 #ifdef CONFIG_ARMV6_M
-static inline void relocate_vector_table(void) { /* do nothing */ }
+
+#define VECTOR_ADDRESS 0
+static inline void relocate_vector_table(void)
+{
+#if defined(CONFIG_XIP) && (CONFIG_FLASH_BASE_ADDRESS != 0) || \
+    !defined(CONFIG_XIP) && (CONFIG_SRAM_BASE_ADDRESS != 0)
+       size_t vector_size = (size_t)_vector_end - (size_t)_vector_start;
+       memcpy(VECTOR_ADDRESS, _vector_start, vector_size);
+#endif
+}
+
 #elif defined(CONFIG_ARMV7_M)
 #ifdef CONFIG_XIP
 #define VECTOR_ADDRESS ((uintptr_t)&_image_rom_start + \


When I deleted the new body of the relocate_vector_table function (to do nothing as it was) - blinky and hello world started to work fine again.
What should I do now? How to report it properly?

Thank you!

Yours faithfully,
Maciej Dębski




On Wed, Aug 23, 2017 at 10:47 AM, Maciej Dębski <maciej.debski@...> wrote:
Hello,

I am developing support for nucleo board, with stm32f030r8 MCU. The goal was to compile and run the samples provided with Zephyr, blinky and hello_world.

I managed to finish the job, all was good, so I have done a pull request. Then, one of the reviewers pointed out that new approach to pinctrl nodes and uart pinctrl configuration in stm32 socs DT files was introduced. I was asked to do appropriate changes.

I modified my code to fit the Zephyr master. Sadly, blinky and hello_world have stopped working. The code itself is compiling and flashing fine. Just the board is reporting a fatal error before even my code is executed.
After checking the code over and over, I think I need help!

I believe most of the values are correct. I just do not fully understand the new dts/arm file structure, which is in Python, maybe I have missed something. Would you be so kind as to look at my code and help me find the issue?


This is my pull request. I would focus on dts/arm and include/dt-bindings.

Yours faithfully,
Maciej Dębski


Re: platforms

Boie, Andrew P
 

Name: Randy Seedle
Email: rseedle@...
Comment:
Do you have any plans to port this operating system to a more feature rich
platform like an Atom motherboard or an ITX motherboard (that uses a
standard x86 part)?
ARM might also work but once again I need a more feature rich platform than
what I have seen listed here.

Can you describe in more detail the specifications of the hardware you would like to see Zephyr running on?
To me it sounds like Linux is a better choice for you; we target platforms that can't run Linux.

Andrew


platforms

Randy Seedle <rseedle@...>
 

Name: Randy Seedle
Email: rseedle@...
Comment:
Do you have any plans to port this operating system to a more feature rich
platform like an Atom motherboard or an ITX motherboard (that uses a standard
x86 part) ?
ARM might also work but once again I need a more feature rich platform than
what I have seen listed here.


Re: Adding Nucleo-F030R8 support to Zephyr - runtime error

Yannis Damigos
 

Hi Maciej,

On 08/23/2017 11:47 AM, Maciej Dębski wrote:
Hello,

I am developing support for nucleo board, with stm32f030r8 MCU. The goal was to compile and run the samples provided with Zephyr, blinky and hello_world.

I managed to finish the job, all was good, so I have done a pull request. Then, one of the reviewers pointed out that new approach to pinctrl nodes and uart pinctrl configuration in stm32 socs DT files was introduced. I was asked to do appropriate changes.

I modified my code to fit the Zephyr master. Sadly, blinky and hello_world have stopped working. The code itself is compiling and flashing fine. Just the board is reporting a fatal error before even my code is executed.
After checking the code over and over, I think I need help!

I believe most of the values are correct. I just do not fully understand the new dts/arm file structure, which is in Python, maybe I have missed something. Would you be so kind as to look at my code and help me find the issue?
I took a look at the PR and the pinctrl nodes and uart pinctrl configuration seem fine. Could you provide more information about the fatal error? What do you mean by "before even my code is executed"?

I will have a closer look at your PR in the coming days, but it is hard to find the problem without the hardware to test it.


https://github.com/zephyrproject-rtos/zephyr/pull/1103

This is my pull request. I would focus on dts/arm and include/dt-bindings.

Yours faithfully,
Maciej Dębski


_______________________________________________
Zephyr-devel mailing list
Zephyr-devel@...
https://lists.zephyrproject.org/mailman/listinfo/zephyr-devel
Yannis


Re: Adding Nucleo-F030R8 support to Zephyr - runtime error

Kumar Gala
 

Happy to help. What issue are you running into?

Also, you can find good help on IRC (#zephyrproject on freenode)

- k

On Aug 23, 2017, at 4:47 AM, Maciej Dębski <maciej.debski@...> wrote:

Hello,

I am developing support for nucleo board, with stm32f030r8 MCU. The goal was to compile and run the samples provided with Zephyr, blinky and hello_world.

I managed to finish the job, all was good, so I have done a pull request. Then, one of the reviewers pointed out that new approach to pinctrl nodes and uart pinctrl configuration in stm32 socs DT files was introduced. I was asked to do appropriate changes.

I modified my code to fit the Zephyr master. Sadly, blinky and hello_world have stopped working. The code itself is compiling and flashing fine. Just the board is reporting a fatal error before even my code is executed.
After checking the code over and over, I think I need help!

I believe most of the values are correct. I just do not fully understand the new dts/arm file structure, which is in Python, maybe I have missed something. Would you be so kind as to look at my code and help me find the issue?

https://github.com/zephyrproject-rtos/zephyr/pull/1103

This is my pull request. I would focus on dts/arm and include/dt-bindings.

Yours faithfully,
Maciej Dębski


Re: Driver model, ISRs and concurrent access

Piotr Mienkowski
 

Hi Tomasz,


1) In general, are driver function APIs supposed to be callable from
an ISR? This limits the amount of kernel primitives a driver
implementation can use, and as I mentioned there is some variation
there. If not all of the driver API functions are allowed to be
called from interrupt context, then should we mark those that are or
are not somehow in the documentation?
Some driver API calls are more likely (or even expected) to be
called from an ISR, whereas others are not, and I think we should make a
clear distinction.
You usually won't be able to call device driver APIs from ISR context,
mostly because most of the drivers are interrupt-based only, and
most of them use kernel functions that are not ISR-safe, as you noticed.
Only UART has a polling mode, which enables console output in ISR mode
as it's useful for debugging.

We thought about providing both modes in the very beginning, but that was
going against the whole preemptive working mode of Zephyr. It could
have been interesting if the kernel provided a very minimalist
mode, a bit like Contiki does, I guess. Without that, it's more of a
hassle to maintain 2 modes in each driver. In the end only UART
has this dual mode, for a good reason.

Anyway, if we had to provide polling APIs, the keyword "poll" would
have to be in the API functions names. It would self-document the
behavior. As UART does.
That all sounds reasonable but what about the DMA driver? To support
continuous, uninterrupted DMA transfer (e.g. audio data), starting the
transfer of the next data block will almost certainly have to happen in
the DMA transfer complete callback, which runs in ISR context. There is no
time to schedule this task in a work queue or thread. It is possible to
support continuous DMA transfer outside of an ISR by using DMA linked list
mode but not every DMA driver supports it.
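
As a rough sketch of that use case: the transfer complete handler queues the next block straight away, without blocking or sleeping. All names here are hypothetical and do not correspond to the current Zephyr DMA API; dma_start_fn stands in for whatever driver call programs the channel, and fake_dma_start() only exists so that the sketch can be exercised without hardware.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical driver hook: programs the DMA channel with one block.
 * It must be callable from ISR context for this scheme to work. */
typedef void (*dma_start_fn)(const void *block, size_t len);

struct audio_stream {
    const uint8_t *data;  /* whole buffer to stream */
    size_t block_len;     /* bytes per DMA block */
    size_t num_blocks;    /* total number of blocks in data */
    size_t next_block;    /* index of the next block to queue */
    dma_start_fn start;   /* driver call used to start a block */
};

/* Meant to be called from the transfer complete callback (ISR context),
 * so it does no locking and never sleeps. Returns 1 if another block
 * was started, 0 if the stream is finished. */
int audio_stream_on_block_done(struct audio_stream *s)
{
    if (s->next_block >= s->num_blocks) {
        return 0;
    }
    s->start(s->data + s->next_block * s->block_len, s->block_len);
    s->next_block++;
    return 1;
}

/* Stand-in for real hardware: only counts how many blocks were started. */
static size_t fake_started;

static void fake_dma_start(const void *block, size_t len)
{
    (void)block;
    (void)len;
    fake_started++;
}
```

The only requirement this places on the driver is that the call behind start() is documented as safe to use from ISR context.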
Well, is there any limitation that prevents you from configuring a new
DMA transfer on the same channel, in ISR mode?
There are no limitations, other than support for calling DMA API in ISR
context.

(current DMA API is quite tricky to use, but that's another issue).
I agree. When writing the I2S driver I came across a few DMA API design
issues. I'll try to write down my findings and send them here in another
email.

I said "usually" you are not going to call a device API in ISR mode
but I was too broad.
DMA seems a good candidate for being (re)configured in such a context.

Sounds like my answer was too focused on interrupt vs polling mode based
APIs, and on data transfer buses.
2) Is there an agreement on concurrent access to hardware resources
and internal driver data? Some of the drivers seem to protect their
internal data, and in some cases through that the hardware itself,
using synchronization mechanisms, but many of those require calling
those APIs from a thread and not from ISR (k_sem_take() etc). Some
even expose this as a Kconfig option from what we can see.
Yep, no consensus on that. QMSI shim drivers introduced - only for
themselves - a re-entrance option.
For the SPI API update, it's made generic only on SPI drivers through
spi_context part. But still no nice and generic device driver way. And
that has been an issue for quite some time already.
Are we talking here about some framework similar to Zephyr's power
management subsystem or simply a well defined policy, e.g. if a call to
a driver API requires access to resources which are in use it should
return immediately with -EBUSY error?

The implementation in QMSI drivers seems to be different. Rather than
returning immediately when the resource is busy, they block, waiting
indefinitely until the resource becomes available.
A well-defined policy, per device type or per API, would be nice,
maybe, yes. Using the same exact types/functions depending on the
policy.

Many APIs are made of synchronous calls. Thus the blocking behavior.
But it also depends on what type of device you are accessing.

Handling -EBUSY can be quite a hassle in some cases. For instance, in
SPI or I2C it's not something you would like.
It would then be up to the caller to retry, and eventually get the
chance to get the resource at some point.
That can be really annoying there.
I agree that the blocking behavior on device busy for SPI or I2C drivers
is very convenient in the majority of cases, however it's also likely
that soon there will be someone writing an application where such
behavior will cause problems. What if data to be sent over SPI or I2C
bus are time sensitive and sending them a second later (due to device
busy wait) would require data to be different? Someone may want to send
data immediately or not at all (in case the device is busy). It's
difficult to predict all possible scenarios. I think we need to be
careful when defining such a strict rule as "all SPI or I2C API calls
should block indefinitely on device busy". In contrast, returning
immediately, while often less handy, provides more flexibility.

I'm aware that Zephyr device driver documentation clearly states that
API calls are intended to be synchronous/blocking. However, the behavior
on device busy feels like a separate issue.

Another possibility, easy to implement, would be to introduce a timeout
value and have it configurable per device. The timeout value would be
set to some sensible default at boot, with the user having the option
to change it later. In this case an API call to a device which is busy
could block indefinitely, return immediately or return after a fixed
timeout.
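
The per-device timeout idea above can be sketched as follows. This is a plain-C model with made-up names (`demo_bus`, `demo_bus_acquire()`, `DEMO_NO_WAIT`, `DEMO_FOREVER`), not an existing Zephyr driver API; waiting is simulated by counting polls instead of sleeping, and the encoding of the special values is an assumption of the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical timeout encoding: 0 = return at once, -1 = wait
 * forever, any positive value = give up after that many polls. */
#define DEMO_NO_WAIT  0
#define DEMO_FOREVER  (-1)

struct demo_bus {
    bool busy;               /* true while a transfer owns the bus */
    int32_t busy_timeout;    /* per-device policy, changeable at run time */
    int32_t busy_polls_left; /* simulation only: polls until the bus frees */
};

/* Simulation only: stands in for "wait one tick and look again". */
static void demo_bus_tick(struct demo_bus *bus)
{
    if (bus->busy && bus->busy_polls_left > 0) {
        bus->busy_polls_left--;
        if (bus->busy_polls_left == 0) {
            bus->busy = false;
        }
    }
}

/* Claim the bus according to the device's configured policy.
 * Returns 0 on success, or -EBUSY if the policy ran out first. */
int demo_bus_acquire(struct demo_bus *bus)
{
    assert(bus != NULL);
    int32_t remaining = bus->busy_timeout;

    while (bus->busy) {
        if (remaining == DEMO_NO_WAIT) {
            return -EBUSY;
        }
        if (remaining != DEMO_FOREVER) {
            remaining--;
        }
        demo_bus_tick(bus);
    }
    bus->busy = true;
    return 0;
}
```

With a single knob per device, the same call covers all three behaviors discussed in this thread: an application that needs "now or never" sets the timeout to `DEMO_NO_WAIT`, while one that prefers the convenient blocking behavior leaves it at `DEMO_FOREVER`.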

Then there are talks about Zephyr having a POSIX layer. We probably
should make sure that whatever gets decided is compatible with POSIX.

On other device types, such as watchdog timers, if there is no
free-slot for installing a configuration, then
-EBUSY sounds relevant. For instance.

So that behavior has to be defined per-API (and well documented).
Piotr


Re: K_NO_WAIT, K_FOREVER and kernel APIs

Carles Cufi
 

Hi Andrew,

Thanks for your reply.

-----Original Message-----
From: Boie, Andrew P [mailto:andrew.p.boie@...]
Sent: 22 August 2017 21:47
To: Cufi, Carles <Carles.Cufi@...>; zephyr-
devel@...
Cc: Singh, Youvedeep <youvedeep.singh@...>
Subject: RE: K_NO_WAIT, K_FOREVER and kernel APIs

1) Calling k_timer_start() with K_NO_WAIT, meaning that one desires
the timer callback function to be executed as soon as possible. This
is useful when one wants to synchronize execution context and start a
cyclic timer from the same callback code.
There has been some offline discussion about this and I think everyone
agrees that we should change the documentation to indicate that
K_NO_WAIT is an acceptable parameter.
I believe Youvedeep is already working on a patch for this.

2) Calling k_sleep() with K_FOREVER as a parameter. This looks
reasonable to the developer since it simplifies the typical idling
loop that one finds in Zephyr using k_cpu_idle(). Instead the kernel
returns immediately, which is unexpected.
And would be awakened by k_wakeup()?
Yes, I don't see why not. For me the semantics of K_FOREVER imply the
amount of time the thread will sleep if left alone, but if a second
thread wakes it up I see no reason for it not to wake up.
I'm confused on the use-case for this, can you provide a more detailed
example?
Why would you do this k_sleep(K_FOREVER) / k_wakeup() dance when
semaphores are designed for this purpose?
You wouldn't normally, but the problem is that the mere presence of a function called k_sleep() and a macro called K_FOREVER leads to confusion. This is aggravated by the fact that kernel asserts can be disabled at compile time, which makes the error hard to track down for an inexperienced developer. At the end of the application code, assuming you don't want to do anything else with the app thread, sleeping forever sort of makes sense in the eyes of coders.


I think that, due to the fact that both of those issues have confused
our kernel API consumers already, we could consider making K_NO_WAIT
and K_FOREVER compatible with all of the kernel APIs.
When you say "all of the kernel APIs", did you have others in mind
besides the two you named above?
Any APIs that take a time or duration parameter as input. To be honest I have not gone through the whole kernel code to find out which others might be affected by this, but I assumed (perhaps wrongly) that there would be more.

In general, what I think we should do is clarify the usage of K_FOREVER and K_NO_WAIT in an "unmissable" way. Either we allow the parameters, or we both document and somehow enforce the fact that they are invalid for some kernel calls, in a way that developers cannot mistakenly use them and get unexplainable results such as the ones I described in my email.
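
One way to picture the "enforce" option is a check that always reports an invalid special value to the caller instead of relying on an assert that can be compiled out. The sketch below is plain C with hypothetical names (`demo_check_timeout()`, `demo_sleep()`, `DEMO_NO_WAIT`, `DEMO_FOREVER`); the numeric encoding of the two special values is an assumption, and none of this is the actual kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Stand-ins for the kernel's special timeout values (assumed encoding). */
#define DEMO_NO_WAIT  0
#define DEMO_FOREVER  (-1)

/* Which special values a given call accepts. */
enum demo_timeout_flags {
    DEMO_ALLOW_NO_WAIT = 1 << 0,
    DEMO_ALLOW_FOREVER = 1 << 1,
};

/* Returns 0 if `timeout` is acceptable for a call with the given
 * flags, -EINVAL otherwise. The result reaches the caller even in
 * builds where asserts are disabled. */
int demo_check_timeout(int32_t timeout, unsigned int flags)
{
    /* Unknown flag bits are a bug in the callee, not in the caller. */
    assert((flags & ~(unsigned int)(DEMO_ALLOW_NO_WAIT |
                                    DEMO_ALLOW_FOREVER)) == 0u);

    if (timeout == DEMO_FOREVER) {
        return (flags & DEMO_ALLOW_FOREVER) ? 0 : -EINVAL;
    }
    if (timeout == DEMO_NO_WAIT) {
        return (flags & DEMO_ALLOW_NO_WAIT) ? 0 : -EINVAL;
    }
    return (timeout > 0) ? 0 : -EINVAL;
}

/* Hypothetical sleep wrapper: rejects "forever" explicitly instead of
 * silently returning at once. */
int demo_sleep(int32_t duration)
{
    int err = demo_check_timeout(duration, DEMO_ALLOW_NO_WAIT);

    if (err != 0) {
        return err;
    }
    /* ... a real implementation would suspend the thread here ... */
    return 0;
}
```

The alternative discussed in the thread, accepting both values everywhere, would amount to passing both flags for every call.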

Thanks,

Carles


Re: Driver model, ISRs and concurrent access

Carles Cufi
 

Hi Tomasz,

Thanks for your reply.

-----Original Message-----
From: zephyr-devel-bounces@... [mailto:zephyr-devel-
bounces@...] On Behalf Of Tomasz Bursztyka
Sent: 22 August 2017 11:34
To: zephyr-devel@...
Subject: Re: [Zephyr-devel] Driver model, ISRs and concurrent access

Hi Carles,

1) In general, are driver function APIs supposed to be callable from
an ISR? This limits the amount of kernel primitives a driver
implementation can use, and as I mentioned there is some variation
there. If not all of the driver API functions are allowed to be called
from interrupt context, then should we mark those that are or are not
somehow in the documentation?
Some driver API calls are more likely (or even expected) to be
called from ISR, whereas others are not, and I think we should make a
clear distinction.

You usually won't be able to call device driver APIs from ISR context,
mostly because most of the drivers are interrupt-based only, and because
most of them use kernel functions that are not ISR-safe, as you noticed.
Only UART has a polling mode, which enables console output in ISR mode
as it's useful for debugging.

We thought to provide both modes in the very beginning, but that was
going against the whole preemptive working mode of Zephyr.
It could have been interesting if the kernel provided a very
minimalist mode, a bit like Contiki does, I guess. Without that, it's
more of a hassle to maintain 2 modes in each driver. Only UART in the
end has this dual mode for a good reason.

Anyway, if we had to provide polling APIs, the keyword "poll" would have
to be in the API function names. It would self-document the behavior,
as UART does.
Right, it's good to hear about the original decisions taken and the reasoning behind them. Honestly, I think they make sense; it is our particular case that is slightly different, since we choose to run parts of the controller from interrupt context. That said, we might reconsider that choice in the future, so for now we will stick to the paradigm that you describe, which is to make driver API calls only from thread mode, since the alternative would be not only complex and time-consuming, but also maybe not useful in the future.


2) Is there an agreement on concurrent access to hardware resources
and internal driver data? Some of the drivers seem to protect their
internal data, and in some cases through that the hardware itself,
using synchronization mechanisms, but many of those require calling
those APIs from a thread and not from ISR (k_sem_take() etc). Some even
expose this as a Kconfig option from what we can see.

Yep, no consensus on that. QMSI shim drivers introduced - only for
themselves - a re-entrance option.
For the SPI API update, it's made generic only on SPI drivers through
spi_context part. But still no nice and generic device driver way.
And that has been an issue for quite some time already.

The fact that it has to be called from non-ISR mode goes back to your
question 1.
Understood. Perhaps we should reach a decision on this and make it known. For now there remains variability in the implementations; this should be its own topic in a future email to reach consensus.

Thanks again,

Carles


Adding Nucleo-F030R8 support to Zephyr - runtime error

Maciej Dębski <maciej.debski@...>
 

Hello,

I am developing support for the Nucleo board with the STM32F030R8 MCU. The goal was to compile and run the samples provided with Zephyr, blinky and hello_world.

I managed to finish the job and all was good, so I opened a pull request. Then one of the reviewers pointed out that a new approach to pinctrl nodes and UART pinctrl configuration in the STM32 SoC DT files had been introduced. I was asked to make the appropriate changes.

I modified my code to fit the Zephyr master. Sadly, blinky and hello_world have stopped working. The code itself compiles and flashes fine, but the board reports a fatal error before my code is even executed.
After checking the code over and over, I think I need help!

I believe most of the values are correct. I just do not fully understand the new dts/arm file structure, which is in Python, so maybe I have missed something. Would you be so kind as to look at my code and help me find the issue?


This is my pull request. I would focus on dts/arm and include/dt-bindings.

Yours faithfully,
Maciej Dębski


Re: K_NO_WAIT, K_FOREVER and kernel APIs

Boie, Andrew P
 

1) Calling k_timer_start() with K_NO_WAIT, meaning that one desires the
timer callback function to be executed as soon as possible. This is useful
when one wants to synchronize execution context and start a cyclic timer
from the same callback code.
There has been some offline discussion about this and I think everyone agrees that we should change the documentation to indicate that K_NO_WAIT is an acceptable parameter.
I believe Youvedeep is already working on a patch for this.

2) Calling k_sleep() with K_FOREVER as a parameter. This looks
reasonable to the developer since it simplifies the typical idling
loop that one finds in Zephyr using k_cpu_idle(). Instead the kernel
returns immediately, which is unexpected.
And would be awakened by k_wakeup()?
Yes, I don't see why not. For me the semantics of K_FOREVER imply the
amount of time the thread will sleep if left alone, but if a second thread
wakes it up I see no reason for it not to wake up.
I'm confused on the use-case for this, can you provide a more detailed example?
Why would you do this k_sleep(K_FOREVER) / k_wakeup() dance when semaphores are designed for this purpose?

I think that, due to the fact that both of those issues have confused our
kernel API consumers already, we could consider making K_NO_WAIT and
K_FOREVER compatible with all of the kernel APIs.
When you say "all of the kernel APIs", did you have others in mind besides the two you named above?

Andrew
