nRF52832 hardware cycle count freezing at chip start


Thiago Silveira
 

Hi everyone,

We've been having a strange problem lately. When the nRF52832 powers up, our code starts running, but the hardware cycle count (as returned by k_cycle_get_32(), for example) stays at zero for some time (one to two minutes).
Therefore, when we call k_sleep or k_busy_wait, the code hangs in the k_busy_wait while loop, because the hardware cycle count is not advancing.
After a while, the chip restarts by itself (the watchdog hasn't been configured at this point), and the second time the code runs normally.

To debug this problem, we added a printk inside k_busy_wait's while loop: printk("%d %d %d\n", start_cycles, current_cycles, cycles_to_wait);
Adding this printk led us to the following evidence:
1) that the hardware cycle count is not running the first time;
2) that, after a restart (by whom?), the hardware cycle count is normal and the code runs just fine.
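
A host-side sketch of why a frozen counter turns the busy-wait into an endless loop. It is a simplified model, not Zephyr's actual k_busy_wait source: `busy_wait_cycles`, `frozen_counter`, `running_counter`, and the `max_iters` cap are hypothetical names, added only so the example terminates when run off-target.

```c
#include <stdbool.h>
#include <stdint.h>

/* A cycle source stands in for k_cycle_get_32() in this host-side model. */
typedef uint32_t (*cycle_source_t)(void);

/*
 * Spin until `cycles_to_wait` cycles have elapsed on `now`.
 * Returns true if the wait completed, false if `max_iters` polls went by
 * without enough cycles elapsing (on target, that case is the hang).
 */
bool busy_wait_cycles(cycle_source_t now, uint32_t cycles_to_wait,
                      uint32_t max_iters)
{
    uint32_t start_cycles = now();

    for (uint32_t i = 0; i < max_iters; i++) {
        uint32_t current_cycles = now();

        /* Unsigned subtraction stays correct across counter wrap-around. */
        if ((uint32_t)(current_cycles - start_cycles) >= cycles_to_wait) {
            return true;
        }
    }
    return false;
}

/* Models the faulty boot: the counter never leaves zero. */
uint32_t frozen_counter(void)
{
    return 0;
}

/* Models the healthy boot: the counter advances on every read. */
static uint32_t fake_ticks;

uint32_t running_counter(void)
{
    return fake_ticks++;
}
```

With `frozen_counter` the exit condition can never become true for a non-zero wait, which matches the endless run of `0 0 3` lines in the serial output.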

The serial output and the modified k_busy_wait code are here: https://gist.github.com/durub/edda1fbf6a6c8f1c7f88960d26916ddf
The first line of the chip's output is "[exati-watchdog] [DBG] watchdog_init: -------------=============Watchdog thread scheduled (...)".
The code hangs just after "Protocol initialized.", where we call k_busy_wait.

Can somebody shed some light on this problem? My teammate and I have been struggling with it for the last couple days, but no progress so far.

Additional info:

The watchdog thread was scheduled to initialize 10 seconds after the k_thread_create call, but that never happens (the "-------------=============Watchdog thread created=============-------------" message never appears),
because the hardware cycle count is frozen. As a result, the watchdog is not running the first time, but the chip still restarts by itself after some time.
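
One way to narrow down who triggers the restart is to read the reset reason early in boot. The sketch below decodes a raw value of the nRF52832 POWER RESETREAS register on a host. The bit positions are an assumption taken from the product specification and should be checked against the MDK headers; `reset_reason_name` is a hypothetical helper, not an existing API.

```c
#include <stdint.h>

/* RESETREAS bit positions (assumed from the nRF52832 product specification). */
#define RESETREAS_RESETPIN (1u << 0) /* reset pin */
#define RESETREAS_DOG      (1u << 1) /* watchdog timeout */
#define RESETREAS_SREQ     (1u << 2) /* soft reset request */
#define RESETREAS_LOCKUP   (1u << 3) /* CPU lock-up */

/* Hypothetical helper: name the most telling cause flagged in `resetreas`. */
const char *reset_reason_name(uint32_t resetreas)
{
    if (resetreas & RESETREAS_DOG) {
        return "watchdog";
    }
    if (resetreas & RESETREAS_LOCKUP) {
        return "cpu lock-up";
    }
    if (resetreas & RESETREAS_SREQ) {
        return "soft reset";
    }
    if (resetreas & RESETREAS_RESETPIN) {
        return "reset pin";
    }
    if (resetreas == 0) {
        /* No flag set: the on-chip reset generator fired. */
        return "power-on or brown-out";
    }
    return "other";
}
```

On target, the argument would be the value read from NRF_POWER->RESETREAS. The register is cumulative, so it would normally be cleared after logging by writing the read value back.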

Best regards,
Thiago


Carles Cufi
 

Hi Thiago,

-----Original Message-----
Subject: [Zephyr-users] nRF52832 hardware cycle count freezing at chip
start

Interesting, we haven't heard of this problem before, although there have been some issues with the nRF5x timer driver that we thought we had addressed.

So can you please expand on the following basic info before I try to reproduce:

1) Are you using the latest master or an older release?
2) What board are you building for? Does it have a 32 kHz crystal, and do you enable the 32 kHz timer in your .config?
3) Are you enabling TICKLESS_IDLE or TICKLESS_KERNEL? In fact, could you provide your .config file?
4) When did this problem appear if you're using master? There was a commit merged recently that added TICKLESS_KERNEL support for the nrf RTC driver, I wonder if it broke then.

Thanks,

Carles


Thiago Silveira
 

Hi Carles,

Thanks for the quick response.

1) We're using the latest master.
2) We're building for the nrf52_pca10040. I think we are enabling it: CONFIG_CLOCK_CONTROL_NRF5_K32SRC_XTAL=y
3) TICKLESS_IDLE is enabled, but TICKLESS_KERNEL is disabled.
4) I'm sorry, I can't pinpoint it right now. I'm going to investigate further and report back. We started experiencing this problem at the start of this week, but we were on a two-week development hiatus before then.
The only thing we added was the watchdog. The watchdog code is as follows:

void wdt_init(uint32_t reload_ms) {
        NRF_WDT->CONFIG = 0x01 | 0x08;
        NRF_WDT->CRV = (reload_ms / 1000) * 32678;
        SYS_LOG_WRN("%d: %d or %u", reload_ms, NRF_WDT->CRV, NRF_WDT->CRV);
        NRF_WDT->INTENSET = WDT_INTENSET_TIMEOUT_Msk;
        NRF_WDT->TASKS_START = 1;
}

void wdt_reload(uint8_t channel) {
        NRF_WDT->RR[channel] = NRF_WDT_RR_VALUE;
}
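
A sketch of the reload-value arithmetic, under two assumptions that are not in the post itself: the watchdog counts LFCLK cycles at 32768 Hz (the posted 32678 looks like a transposed digit), and the timeout follows the product specification's (CRV + 1) / 32768 seconds with a minimum CRV of 0xF. `wdt_crv_from_ms` is a hypothetical helper. Multiplying before dividing also keeps the sub-second part that `reload_ms / 1000` discards.

```c
#include <stdint.h>

#define WDT_CLOCK_HZ 32768u      /* LFCLK frequency */
#define WDT_CRV_MIN  0x0000000Fu /* assumed minimum CRV per the product spec */
#define WDT_CRV_MAX  0xFFFFFFFFu

/* Hypothetical helper: convert a timeout in milliseconds to a CRV value. */
uint32_t wdt_crv_from_ms(uint32_t reload_ms)
{
    /* 64-bit intermediate: reload_ms * 32768 overflows 32 bits past ~131 s. */
    uint64_t ticks = ((uint64_t)reload_ms * WDT_CLOCK_HZ) / 1000u;

    /* Assumed relation: the timeout lasts (CRV + 1) clock cycles. */
    uint64_t crv = (ticks > 0) ? ticks - 1 : 0;

    if (crv < WDT_CRV_MIN) {
        return WDT_CRV_MIN;
    }
    if (crv > WDT_CRV_MAX) {
        return WDT_CRV_MAX;
    }
    return (uint32_t)crv;
}
```

For comparison, the integer division in the posted `wdt_init` makes a 1500 ms request program the same value as a 1000 ms request.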

I attached our .config to the original gist, it is there at the end now: https://gist.github.com/durub/edda1fbf6a6c8f1c7f88960d26916ddf

I tried just now to reproduce the problem without any of our code (but using our .config), and the problem still persists. Our main is:
void main() {
        k_busy_wait(100);
}

We've tested this main on both the nrf52_pca10040 and our PCB. A sample from the end of the faulty output:
0 0 3
0 0 3
(a lot of equal lines)
0 0 3
0 0 3
0 shell> 30 30 3
30 56 3

I must say that this problem happens intermittently. Right now, the simple main always works (over many successive resets with nrfjprog --reset -f nrf52):
shell> 38 38 3
38 64 3
shell> 38 38 3
38 65 3
shell> 38 38 3
38 64 3
shell> 38 38 3
38 65 3
shell> 38 38 3
38 65 3
shell> 38 38 3
38 64 3
shell> 38 38 3
38 65 3
shell> 38 38 3
38 64 3
shell> 38 38 3
38 64 3

Thanks a lot,

Thiago

2017-08-18 13:37 GMT-03:00 Cufi, Carles <Carles.Cufi@...>:



Thiago Silveira
 

Hi,

I'm starting to suspect that this is a problem/weird interaction with the watchdog.
When the problem happens, it persists across resets, which is why the simple main was hanging too.
When we power the development kit off and turn it back on later, the problem never happens with the simple main.

Maybe we're configuring the watchdog wrong?

Thanks,

Thiago

2017-08-18 14:41 GMT-03:00 Thiago Silveira <thiago@...>:




Chettimada, Vinayak Kariappa
 

Hi Thiago,


> 2) We're building for the nrf52_pca10040. I think we are enabling it: CONFIG_CLOCK_CONTROL_NRF5_K32SRC_XTAL=y

I assume you mean that you are using an nRF52832 chip on your custom board, with a Zephyr build using BOARD=nrf52_pca10040.
Do you have the 32 kHz crystal mounted on your custom PCB? If not, you will need to use the internal RC oscillator by enabling CONFIG_CLOCK_CONTROL_NRF5_K32SRC_RC=y.
 
> 3) TICKLESS_IDLE is enabled, but TICKLESS_KERNEL is disabled.
> 4) I'm sorry, I can't pinpoint it right now. I'm going to investigate further and report back. We started experiencing this problem at the start of this week, but we were on a two-week development hiatus before then.
> The only thing we added was the watchdog. The watchdog code is as follows:
>
> void wdt_init(uint32_t reload_ms) {
>         NRF_WDT->CONFIG = 0x01 | 0x08;
>         NRF_WDT->CRV = (reload_ms / 1000) * 32678;
>         SYS_LOG_WRN("%d: %d or %u", reload_ms, NRF_WDT->CRV, NRF_WDT->CRV);
>         NRF_WDT->INTENSET = WDT_INTENSET_TIMEOUT_Msk;
>         NRF_WDT->TASKS_START = 1;
> }
>
> void wdt_reload(uint8_t channel) {
>         NRF_WDT->RR[channel] = NRF_WDT_RR_VALUE;
> }
>
> I attached our .config to the original gist, it is there at the end now: https://gist.github.com/durub/edda1fbf6a6c8f1c7f88960d26916ddf
>
> I tried just now to reproduce the problem without any of our code (but using our .config), and the problem still persists. Our main is:
> void main() {
>         k_busy_wait(100);
> }


No watchdog driver for nRF52 has been contributed to Zephyr yet. I am not an expert on this peripheral, but I notice that your code enables the interrupt from the watchdog peripheral. I hope you have set up the interrupt handler correctly, clear the events, kick the dog often enough, etc.
If your application needs a watchdog, I would advise you to implement a watchdog driver following the Zephyr driver model (include/watchdog.h and the drivers/watchdog folder). We will be glad to review your driver and a simple sample application.

> We've tested this main using nrf52_pca10040 and our PCB. Sample of the final of the faulty output:
> 0 0 3
> 0 0 3
> (a lot of equal lines)
> 0 0 3
> 0 0 3
> 0 shell> 30 30 3
> 30 56 3
>
> I must say that this problem is happening intermittently. Now, the simple main is always working (a lot of successive resets with nrfjprog --reset -f nrf52):
> shell> 38 38 3
> 38 64 3
> shell> 38 38 3
> 38 65 3
> shell> 38 38 3
> 38 64 3
> shell> 38 38 3
> 38 65 3
> shell> 38 38 3
> 38 65 3
> shell> 38 38 3
> 38 64 3
> shell> 38 38 3
> 38 65 3
> shell> 38 38 3
> 38 64 3
> shell> 38 38 3
> 38 64 3

Please remember that the nRF52 is an ultra-low-power chip and there is no functional ARM SysTick timer; the system timer is implemented using the NRF_RTC peripheral.
Each tick is one period of the 32 kHz clock. If you print in the busy-wait loop, you will see a lot of lines with the same values until the next 32 kHz tick (*if* the UART TX time is much less than 30.517 us, which I doubt).
OK, you want to wait 100 microseconds, but your printk inside the "for" loop in k_busy_wait consumes more time than that (I am certain), so the loop does not break out correctly.
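
The numbers above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the cycle counter runs at the 32768 Hz LFCLK rate; `usec_to_rtc_cycles` and `rtc_cycle_period_ns` are hypothetical helper names, not Zephyr functions.

```c
#include <stdint.h>

#define LFCLK_HZ     32768u
#define USEC_PER_SEC 1000000u

/* Whole RTC cycles contained in `usec` microseconds (rounded down). */
uint32_t usec_to_rtc_cycles(uint32_t usec)
{
    return (uint32_t)(((uint64_t)usec * LFCLK_HZ) / USEC_PER_SEC);
}

/* Length of one RTC cycle in nanoseconds (rounded down). */
uint32_t rtc_cycle_period_ns(void)
{
    return (uint32_t)(1000000000ull / LFCLK_HZ);
}
```

A 100 us wait works out to 3 whole cycles, which agrees with the third column of the serial output, and one cycle lasts about 30517 ns, the 30.517 us mentioned above.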

Could you please explain the symptoms of the problem without any of your debugging (printk influences k_busy_wait)?

Regards,
Vinayak


Thiago Silveira
 

Hi Vinayak,

> I assume you mean that you are using an nRF52832 chip on your custom board, with a Zephyr build using BOARD=nrf52_pca10040.
> Do you have the 32 kHz crystal mounted on your custom PCB? If not, you will need to use the internal RC oscillator by enabling CONFIG_CLOCK_CONTROL_NRF5_K32SRC_RC=y.

Good question! I checked and yes, we have the 32 kHz crystal mounted on our custom PCB. We also tested with the nRF52 DK (the physical nrf52_pca10040 board).
We do use the nrf52_pca10040 board to build for our custom PCB, with some modifications in our prj.conf to suit our board (mainly UART TX/RX).

> No watchdog driver for nRF52 has been contributed to Zephyr yet. I am not an expert on this peripheral, but I notice that your code enables the interrupt from the watchdog peripheral. I hope you have set up the interrupt handler correctly, clear the events, kick the dog often enough, etc.
> If your application needs a watchdog, I would advise you to implement a watchdog driver following the Zephyr driver model (include/watchdog.h and the drivers/watchdog folder). We will be glad to review your driver and a simple sample application.

We do kick the dog sufficiently, and the watchdog is working fine apart from this initial hiccup. I'm not so sure about the events (other than clearing the channel).
Following your suggestions, I'm going to explore the nRF52832 watchdog a little further and implement a driver following the Zephyr driver model.
Hopefully I could merge this back into upstream for the 1.10 release (as 1.9 is feature frozen)?

> Please remember that the nRF52 is an ultra-low-power chip and there is no functional ARM SysTick timer; the system timer is implemented using the NRF_RTC peripheral.
> Each tick is one period of the 32 kHz clock. If you print in the busy-wait loop, you will see a lot of lines with the same values until the next 32 kHz tick (*if* the UART TX time is much less than 30.517 us, which I doubt).
> OK, you want to wait 100 microseconds, but your printk inside the "for" loop in k_busy_wait consumes more time than that (I am certain), so the loop does not break out correctly.

> Could you please explain the symptoms of the problem without any of your debugging (printk influences k_busy_wait)?

The 100 microseconds sample is just a way to show that k_cycle_get_32() is frozen. The only purpose is to test the NRF_RTC peripheral, not to wait any specified amount of time.
The first time it repeats 211 thousand times, with k_cycle_get_32() returning zero, and the second time it repeats only a dozen times, with k_cycle_get_32() returning increasing values.
That is evidence enough of what is happening, even though the waiting time may not be exactly 100 microseconds. Because waiting is not the intention, I don't think the debug influences the output that much.

However, I think that is a moot point now. I think your advice about the watchdog interrupts and events is correct.
I'm going to explore that further and report back to you guys.

Thanks so much for the help,

Thiago

2017-08-19 2:00 GMT-03:00 Chettimada, Vinayak Kariappa <vinayak.kariappa.chettimada@...>:



Thiago Silveira
 

Hi everyone,

It's been a little over a month now, but I promised I would explore this issue further and report back. :-)
Turns out the problem is caused by activating the watchdog, but the watchdog itself is not the issue.

At first I thought it was related to this erratum: http://infocenter.nordicsemi.com/index.jsp?topic=%2Fcom.nordic.infocenter.nrf52832.EngB.errata%2Fanomaly_832_20.html
And it turns out it kind of is. However, that issue is already handled when the LFCLK is started in _k32src_start. So why isn't the RTC running properly?

To spare you my whole journey, I'll just say that the relevant code is here, right at the start. The TODO and the 'if' say it all:

(...)
static int _k32src_start(struct device *dev, clock_control_subsys_t sub_system)
{
        u32_t lf_clk_src;
        u32_t intenset;

        /* TODO: implement the ref count and re-entrancy guard, if a use-case
         * needs it.
         */

        if ((NRF_CLOCK->LFCLKSTAT & CLOCK_LFCLKSTAT_STATE_Msk)) {
                return 0;
        }
(...)

Once the watchdog is activated, the LFCLK is forced on at all times, because the watchdog needs it. WDT registers are retained across a soft reset, so when the nRF resets, the LFCLK is already running but has not been configured properly.
When _k32src_start is called to start and configure the LFCLK source, it first checks whether the LFCLK is running.
Thanks to the watchdog, it is, so the function returns there and never configures it. The fix is described in the TODO.
So much for a week (and more!) of debugging and going through kernel code.
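
The failure mode can be modeled on a host in a few lines. This is only an illustration under stated assumptions: `struct fake_clock` stands in for the hardware clock state, and `k32src_start_early_return`, `k32src_start_refcounted`, `struct k32src_state`, and the `K32SRC_*` constants are hypothetical names. It shows one possible shape of the guard the TODO mentions and does not reproduce the driver or the actual patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical source identifiers for the model. */
enum { K32SRC_RC = 0, K32SRC_XTAL = 1 };

/* Stand-in for the hardware state that survives a soft reset. */
struct fake_clock {
    bool lfclk_running;
    uint32_t lfclk_src;
};

/* Driver-side state; it lives in RAM and starts at zero on every boot. */
struct k32src_state {
    uint32_t ref_count;
};

/* Mirrors the quoted check: if the clock already runs, nothing is configured. */
int k32src_start_early_return(struct fake_clock *clk, uint32_t wanted_src)
{
    if (clk->lfclk_running) {
        return 0;
    }
    clk->lfclk_src = wanted_src;
    clk->lfclk_running = true;
    return 0;
}

/* Guarded variant: the driver's own count decides, not the hardware flag. */
int k32src_start_refcounted(struct fake_clock *clk, struct k32src_state *st,
                            uint32_t wanted_src)
{
    if (st->ref_count++ > 0) {
        return 0; /* already configured by an earlier caller in this boot */
    }
    clk->lfclk_running = false; /* models stopping a clock left on by the WDT */
    clk->lfclk_src = wanted_src;
    clk->lfclk_running = true;
    return 0;
}
```

Starting from a state where the clock is already running on the RC source, the early-return version leaves the source untouched, while the guarded version configures it once and ignores later calls.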

I'm just glad it's fixed -- now we don't have to wait a couple minutes until code starts running after we flash. We had been disabling the watchdog when developing to remedy this... we ended up forgetting to activate it on a couple of field test runs, doh! :-)

That being said...

1) I have opened a pull request to address this issue, which can be located here: https://github.com/zephyrproject-rtos/zephyr/pull/4096

I would be very glad if you guys (Carles and Vinayak) could review it, as you initially assisted me with this issue.

2) I plan to move forward with merging my nRF5 watchdog driver, which Vinayak suggested I write.

Because of this problem, I was reluctant to merge it. Now that this issue is resolved, I will open a pull request to merge the driver in. I'm aware that Michał Kruszewski is working on a new iteration of the watchdog driver, and I voiced my concerns about the current watchdog driver API to him. Still, I think that having an nRF5 driver, even in the current API form, is only beneficial to Zephyr and its users, as any production-ready application *will* require a proper watchdog. Once the RFC is approved, I or someone else could easily port it to the new API.

Best regards,

Thiago

2017-08-21 14:44 GMT-03:00 Thiago Silveira <thiago@...>:




Chettimada, Vinayak Kariappa
 

Commented in the PR. We can discuss further there. Thank you for your analysis.

 

-Vinayak

 

From: zephyr-users-bounces@... [mailto:zephyr-users-bounces@...] On Behalf Of Thiago Silveira
Sent: Thursday, September 28, 2017 11:25 AM
To: Chettimada, Vinayak Kariappa <vinayak.kariappa.chettimada@...>
Cc: zephyr-users@...
Subject: Re: [Zephyr-users] nRF52832 hardware cycle count freezing at chip start

 

Hi everyone,

It's been a little over a month now, but I promised I would explore this issue further and report back. :-)
It turns out the problem is triggered by activating the watchdog, but the watchdog itself is not the issue.

At first I thought it was related to this erratum: http://infocenter.nordicsemi.com/index.jsp?topic=%2Fcom.nordic.infocenter.nrf52832.EngB.errata%2Fanomaly_832_20.html
And, as it turns out, it kind of is. However, that erratum is already worked around when the LFCLK is activated in _k32src_start. So why isn't the RTC running properly?

To spare you my whole journey, I'll say that the relevant code is here, right at the start. The TODO and the 'if' say it all:

(...)

static int _k32src_start(struct device *dev, clock_control_subsys_t sub_system)
{
        u32_t lf_clk_src;
        u32_t intenset;

        /* TODO: implement the ref count and re-entrancy guard, if a use-case
         * needs it.
         */

        if ((NRF_CLOCK->LFCLKSTAT & CLOCK_LFCLKSTAT_STATE_Msk)) {
                return 0;
        }
(...)

Once the watchdog is activated, the LFCLK is forced on at all times, because the watchdog needs it. WDT registers are retained across a soft reset, so when the nRF resets, the LFCLK source is on but has not been configured properly.

When _k32src_start is called to activate and configure the LFCLK source, it first checks whether the LFCLK is already running.

Thanks to the watchdog, it is, so the function returns there and never configures it. The fix is detailed at the TODO.

So much for a week (and more!) of debugging and going through kernel code.

I'm just glad it's fixed -- now we don't have to wait a couple minutes until code starts running after we flash. We had been disabling the watchdog when developing to remedy this... we ended up forgetting to activate it on a couple of field test runs, doh! :-)


That being said...

1) I have opened a pull request to address this issue, which can be located here: https://github.com/zephyrproject-rtos/zephyr/pull/4096


I would be very glad if you guys (Carles and Vinayak) could review it, as you initially assisted me with this issue.

2) I plan to move forward with merging my nRF5 watchdog driver, which Vinayak suggested I write.

Because of this problem, I was reluctant to merge it. Now that this issue is resolved, I will open a pull request to merge the driver in. I'm aware that Michał Kruszewski is working on a new iteration of the watchdog driver API, and I voiced my concerns about the current watchdog driver API to him, but I think that having an nRF5 driver, even in the current API form, is only beneficial to Zephyr and its users, as any production-ready application *will* require a proper watchdog. Once the RFC is approved, I or someone else could easily port it to the new API.

Best regards,

Thiago

 

2017-08-21 14:44 GMT-03:00 Thiago Silveira <thiago@...>:

Hi Vinayak,

> I assume, you mean, you are using nRF52832 chip and your custom board; and using a Zephyr build using the BOARD=nrf52_pca10040.

> Do you have the 32KHz crystal mounted in your custom PCB? If not, you will need to use the internal RC oscillator by enabling CONFIG_CLOCK_CONTROL_NRF5_K32SRC_RC=y.

 

Good question! I checked and yes, we have the 32KHz crystal mounted in our custom PCB. We also tested with the nRF52 DK (the physical nrf52_pca10040 board).

We do use the nrf52_pca10040 board to build for our custom PCB, with some modifications in our prj.conf to suit our board (mainly UART TX/RX).

 

> Currently there is no watchdog driver for nRF52 contributed yet to Zephyr. I am not an expert on this peripheral, but I do notice in your code, you enable interrupt from the watchdog peripheral, hope you have setup interrupt handler correctly, clear the events and kick the dog sufficiently, etc.

> If your application needs watchdog, I would advise you implement a watchdog driver following the Zephyr driver model (include/watchdog.h and in drivers/watchdog folder). We will be glad to review your driver and a simple sample application. 


We do kick the dog sufficiently, and the watchdog is working fine apart from this initial hiccup. I'm not so sure about the events (other than clearing the channel).
Following your suggestions, I'm going to explore a little further the watchdog in nRF52832 and implement a driver following the Zephyr driver model.
Hopefully I could merge this back into upstream for the 1.10 release (as 1.09 is feature frozen)?

> Please remember that nRF52 is a ultra low power chip and there is no functional ARM systick timer, the system timer is implemented using the NRF_RTC peripheral.

> The resolution of each tick is in 32KHz units. If you print in busywait, you will see lot of lines with same values until each 32KHz (*if* UART tx time is very much less than 30.517 us, which I doubt).

> Ok, you want to wait 100 microseconds and your printk inside the “for” loop in k_busy_wait consumes more time (I am certain) and the loop does not break out correctly.

 

> Could you please explain the symptoms of the problem without any of your debugging (printk influences the k_busy_wait) ?

The 100 microseconds sample is just a way to show that k_cycle_get_32() is frozen. The only purpose is to test the NRF_RTC peripheral, not to wait any specified amount of time.

The first time it repeats 211 thousand times, with k_cycle_get_32() returning zero, and the second time it repeats only a dozen times, with k_cycle_get_32() returning increasing values.

That is evidence enough of what is happening, even though the waiting time may not be exactly 100 microseconds. Because waiting is not the intention, I don't think the debug influences the output that much.

 

However, I think that is a moot point now. I think your advice about the watchdog interrupts and events is correct.

I'm going to explore that further and report back to you guys.

Thanks so much for the help,

Thiago

 

2017-08-19 2:00 GMT-03:00 Chettimada, Vinayak Kariappa <vinayak.kariappa.chettimada@...>:

Hi Thiago,

 

 

2) We're building for the nrf52_pca10040. I think we are enabling it: CONFIG_CLOCK_CONTROL_NRF5_K32SRC_XTAL=y

 

I assume, you mean, you are using nRF52832 chip and your custom board; and using a Zephyr build using the BOARD=nrf52_pca10040.

Do you have the 32KHz crystal mounted in your custom PCB? If not, you will need to use the internal RC oscillator by enabling CONFIG_CLOCK_CONTROL_NRF5_K32SRC_RC=y.

 

3) TICKLESS_IDLE is enabled, but TICKLESS_KERNEL is disabled.
4) I'm sorry, I can't pinpoint it right now to you. I'm going to investigate further here and report back. We've started experiencing this problem at the start of this week, but we were at a two-week development hiatus before then.
The only thing we added was the watchdog. The watchdog code is as follows:

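void wdt_init(uint32_t reload_ms) {
        /* 0x01: keep running in sleep, 0x08: keep running when halted by debugger */
        NRF_WDT->CONFIG = 0x01 | 0x08;
        /* The WDT counts 32768 Hz cycles (was 32678, a typo); multiply before
         * dividing so that sub-second parts of reload_ms are not truncated.
         */
        NRF_WDT->CRV = (uint32_t)(((uint64_t)reload_ms * 32768) / 1000);
        SYS_LOG_WRN("%u: %d or %u", reload_ms, NRF_WDT->CRV, NRF_WDT->CRV);
        NRF_WDT->INTENSET = WDT_INTENSET_TIMEOUT_Msk;
        NRF_WDT->TASKS_START = 1;
}

void wdt_reload(uint8_t channel) {
        NRF_WDT->RR[channel] = NRF_WDT_RR_VALUE;
}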


I attached our .config to the original gist, it is there at the end now: https://gist.github.com/durub/edda1fbf6a6c8f1c7f88960d26916ddf

I tried just now to reproduce the problem without any of our code (but using our .config), and the problem still persists. Our main is:

void main(void) {
        k_busy_wait(100);
}

 

Currently there is no watchdog driver for nRF52 contributed yet to Zephyr. I am not an expert on this peripheral, but I do notice in your code, you enable interrupt from the watchdog peripheral, hope you have setup interrupt handler correctly, clear the events and kick the dog sufficiently, etc.

If your application needs watchdog, I would advise you implement a watchdog driver following the Zephyr driver model (include/watchdog.h and in drivers/watchdog folder). We will be glad to review your driver and a simple sample application. 


