sys_dlist_remove: bus fault with CONFIG_NO_OPTIMIZATIONS


Christopher Friedt
 

Hi list,

I'm experimenting with Zephyr on the kinetis platform and I am encountering a bus fault while running the "samples/net/sockets/dumb_http_server" example.

The bus fault only seems to happen when I have CONFIG_NO_OPTIMIZATIONS=y, and it is triggered in sys_dlist_remove().

Logs show:

initializing network
IPv4 address: xxx.xxx.xxx.xxx
Single-threaded dumb HTTP server waits for a connection on port 8080...
eth_mcux: Enabled 100M full-duplex mode

At this point, I can ping the device, but as soon as I run curl or use a browser to open the static page, there is a bus fault (imprecise data bus error).

The PC was at sys_dlist_remove and the LR was in sys_dlist_get.

I don't have a full stack trace unfortunately because I'm unable to catch this issue with the debugger at present.

It's a bit concerning to me that disabling optimizations would result in a bus fault in such a critical path. Potentially related to atomic operations?

I haven't inspected the sys_dlist*() code yet, but just wanted to probe the list and see if this is a known issue.

Any thoughts?


PS:

Maybe some suggestions for the documentation in this sample project:

Users should add or modify the following options in samples/net/sockets/dumb_http_server/prj.conf:

* CONFIG_NET_CONFIG_MY_IPV4_ADDR
* CONFIG_NET_CONFIG_MY_IPV4_NETMASK
* CONFIG_NET_CONFIG_MY_IPV4_GW
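
For illustration, a prj.conf fragment setting these options might look roughly like the following. The addresses are placeholders from the 192.0.2.0/24 documentation range, not values from this thread, and the CONFIG_NET_CONFIG_SETTINGS line is an assumption about what the sample needs for the static settings to take effect.

```
# Hypothetical static IPv4 settings; replace with values for your network.
CONFIG_NET_CONFIG_SETTINGS=y
CONFIG_NET_CONFIG_MY_IPV4_ADDR="192.0.2.10"
CONFIG_NET_CONFIG_MY_IPV4_NETMASK="255.255.255.0"
CONFIG_NET_CONFIG_MY_IPV4_GW="192.0.2.1"
```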


Maureen Helm
 

Hi Christopher,

It’s possible that a stack overflowed. Kinetis platforms have a “system MPU” instead of the Arm MPU, and memory access violations trigger a bus fault.
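
A minimal sketch of why a stack overflow could surface inside a list routine, assuming the overflow scribbled over a list node that lives next to the overflowing stack. The names below (struct dnode, dlist_init, dlist_append, dlist_remove, dlist_get) are hypothetical and this is not Zephyr's sys_dlist implementation; it only shows that removing a node stores through the node's own prev and next pointers, so garbage in either pointer turns into a store to an arbitrary address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical circular doubly-linked list node (not Zephyr's sys_dnode_t). */
struct dnode {
    struct dnode *next;
    struct dnode *prev;
};

/* An empty list is a head node that points at itself. */
static void dlist_init(struct dnode *head)
{
    head->next = head;
    head->prev = head;
}

static bool dlist_is_empty(const struct dnode *head)
{
    return head->next == head;
}

/* Link node in just before head, i.e. at the tail of the list. */
static void dlist_append(struct dnode *head, struct dnode *node)
{
    node->next = head;
    node->prev = head->prev;
    head->prev->next = node;
    head->prev = node;
}

/*
 * Unlink node. Both stores go through pointers read from the node itself,
 * so if node->prev or node->next was overwritten (for example by a
 * neighbouring stack overflowing), these stores hit an arbitrary address.
 */
static void dlist_remove(struct dnode *node)
{
    assert(node != NULL);
    node->prev->next = node->next;
    node->next->prev = node->prev;
    node->next = NULL;
    node->prev = NULL;
}

/* Pop the first node, or return NULL when the list is empty. */
static struct dnode *dlist_get(struct dnode *head)
{
    if (dlist_is_empty(head)) {
        return NULL;
    }
    struct dnode *node = head->next;
    dlist_remove(node);
    return node;
}
```

With intact nodes, appending two nodes and calling dlist_get() twice returns them in insertion order and leaves the list empty. A fault in the remove step called from the get step would match the reported PC and LR, though that reading is an inference rather than something confirmed in this thread.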

 

Maureen

 

From: devel@... <devel@...> On Behalf Of Christopher Friedt via Lists.Zephyrproject.Org
Sent: Monday, September 30, 2019 7:28 AM
To: devel@...
Cc: devel@...
Subject: [Zephyr-devel] sys_dlist_remove: bus fault with CONFIG_NO_OPTIMIZATIONS

 



Christopher Friedt
 

Hi Maureen,

On Mon., Sep. 30, 2019, 11:26 a.m. Maureen Helm, <maureen.helm@...> wrote:

> Hi Christopher,
>
> It’s possible that a stack overflowed. Kinetis platforms have a “system MPU” instead of the Arm MPU, and memory access violations trigger a bus fault.

That's interesting, because now that I think about it, I did have to disable the MPU to even get to main(). Good catch!

Do you know if Zephyr has a way to instrument stack usage?

Cheers,

Chris



Christopher Friedt
 

Hi Maureen / list,

I just thought I would follow up on-list in case anyone else has a similar issue.

So I simply enabled

CONFIG_STACK_CANARIES=y

because ARMv7-M supports GCC's stack canaries, and found that the "tx_workq" thread did indeed have a stack overflow.
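
On the earlier question of instrumenting stack usage, one common approach, different from the GCC canaries enabled above, is to paint a stack with a known byte before the thread runs and later count how many bytes still hold the pattern. The sketch below is a host-side illustration of that idea only. The function names and the 0xAA fill value are assumptions for the example, and it does not call any Zephyr API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill byte used to mark never-touched stack memory (assumed value). */
#define FILL_PATTERN 0xAAu

/* Paint the whole stack buffer before the thread starts running. */
static void stack_paint(uint8_t *stack, size_t size)
{
    memset(stack, FILL_PATTERN, size);
}

/*
 * Count bytes that still hold the fill pattern. Stacks grow downward on
 * Cortex-M, so untouched memory sits at the low-address end of the buffer.
 * A result of 0 means the stack was used completely and may have overflowed.
 */
static size_t stack_unused(const uint8_t *stack, size_t size)
{
    size_t unused = 0;

    while (unused < size && stack[unused] == FILL_PATTERN) {
        unused++;
    }
    return unused;
}

/* Stand-in for real thread activity: dirty the top `used` bytes. */
static void simulate_use(uint8_t *stack, size_t size, size_t used)
{
    assert(used <= size);
    memset(stack + size - used, 0x55, used);
}
```

For example, painting a 64-byte buffer and simulating 40 bytes of use leaves stack_unused() reporting 24 bytes of headroom. The count is a lower bound on usage, since a frame that happens to store the fill byte is indistinguishable from untouched memory.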

Great catch, Maureen, and thanks too for pointing out that the Kinetis MPU is not an actual Arm MPU. I originally assumed they were the same.

Cheers,

C