Re: ARM Cortex R intermittent MPU memory access check failures with CONFIG_USERSPACE


Glaropoulos, Ioannis
 

+ Andrew, Stephanos

 

Hi Phil,

Thanks for reporting this. I need to do a more careful looking into it, but it is likely that you have spotted a bug in the Cortex-M implementation.

 

The implementation of mpu_buffer_validate() has not changed since its initial porting – it seems that it does not protect writing RNR and then accessing RBAR, RASR registers aferwards and that seems to have always been the case.

So a thread context-switch interrupting this operation may modify the RNR register, making the read of RBAR, RASR or RLAR (v8m) inside the functions called by mpu_buffer_validate() invalid.

 

If mpu_buffer_validate() is called by thread supervisor mode, it needs to be protected by locking IRQs. In fact, only the PendSV needs to be lock because only that one does MPU reprogramming.

Alternatively RNR could be saved and restored in PendSV handler (normal IRQs do not touch MPU).

 

I’d appreciate if you file a bug report for this for Cortex-M.

 

For Cortex-R, user mode is not assumed to be working (yet) so, strictkly speaking, it is not a bug, but if you work on Cortex-R user mode, this is one thing you need to consider!

 

Ioannis

 

 

 

From: devel@... [mailto:devel@...] On Behalf Of phil.erwin via Lists.Zephyrproject.Org
Sent: Monday, January 27, 2020 9:38 PM
To: devel@...
Cc: devel@...
Subject: [Zephyr-devel] ARM Cortex R intermittent MPU memory access check failures with CONFIG_USERSPACE

 

I'm getting intermittent run-time failures such as:

     syscall z_hdlr_k_uptime_get failed check: Memory region 0x0002ff70 (size 8) write access denied
     ***** Hardware exception *****
     Current thread ID = 0x000221b0
     Faulting instruction address = 0x1a
     Fatal fault in thread 0x000221b0! Aborting.

In this example, k_uptime_get() was called, which is going to save its 64-bit return value into memory (due to the syscall macros).  The syscall gear is checking to see if the memory
address is valid.  The routines wind down into mpu_buffer_validate(), which will call is_user_accessible_region() and finally into get_region_ap().  The code
is failing because get_region_ap() has a critical code section which is not wrapped with an irq_lock()/irq_unlock().  In this failure, we write the MPU index register, then
take an external interrupt, go off to another thread for a while ( which is going to re-load the MPU entries), return to the current thread (again re-loading the MPU entries), and read the attribute register (for the wrong index) and fail.

In reading the current Cortex-M code, I don't see it protecting these critical sections either, so I'm left wondering if the 'M' class will save the index register during
interrupt processing.  If so, I would like to fix the 'R' code in a similar fashion.

There are several of these utility subroutines with this same critical code sections.

For reference, I've included the diff of get_region_ap() that I am going to be testing with locally where I do irq_lock() and irq_unlock().

 /**
  * This internal function returns the access permissions of an MPU region
  * specified by its region index.
@@ -133,7 +136,17 @@ static inline int is_enabled_region(u32_t index)
  */
 static inline u32_t get_region_ap(u32_t r_index)
 {
-       set_region_number(r_index);
+   int
+      attr,
+      key = irq_lock();
+  
+   set_region_number(r_index);
+  
+   attr = get_region_attributes();
 
-       return (get_region_attributes() & MPU_RASR_AP_Msk) >> MPU_RASR_AP_Pos;
+   irq_unlock(key);
+
+   return (attr & MPU_RASR_AP_Msk) >> MPU_RASR_AP_Pos;
 }
+

Join {devel@lists.zephyrproject.org to automatically receive all group messages.