Re: [EXT] [testing-wg] Zephyr: Testing WG weekly call - Mon, 05/24/2021 1:00pm-2:00pm, Please RSVP #cal-reminder

Alexey Brodkin

Hi Hake, all,

Our scenario for reproducing the build problem is quite simple:
  1. Zephyr SDK installed on an NFS share
  2. Massively parallel execution, e.g. 10 instances of the "twister" script building & running the entire test suite for all QEMU platforms.
If we move the Zephyr SDK onto the build machine itself, the issue goes away.
Use of simulated platforms is important because their executions happen in parallel, while execution on real target boards is gated by the very limited number of available boards (we may have a few, but not 10s or even 100s).
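
For anyone who wants to try a similar setup, here is a minimal sketch of the "many instances at once" part. It only shows the launching pattern: the exact twister command line is not given in this mail, so the command below is a harmless placeholder, and run_parallel is a hypothetical helper name. Each real instance would presumably also need its own output directory.

```python
import subprocess
import sys


def run_parallel(cmd, instances):
    """Start `instances` copies of `cmd` at the same time.

    Returns the list of exit codes, in launch order.
    """
    # Start everything first so the processes really overlap...
    procs = [subprocess.Popen(cmd) for _ in range(instances)]
    # ...and only then wait for each of them to finish.
    return [p.wait() for p in procs]


if __name__ == "__main__":
    # Placeholder command (assumption): substitute the real twister
    # invocation used on your build machine.
    placeholder_cmd = [sys.executable, "-c", "pass"]
    print(run_parallel(placeholder_cmd, 10))
```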

Another interesting observation: because we use one and the same technical user to run all "twister" instances (and subsequently all their child processes), we hit the per-user limit on simultaneously existing processes quite early. On our servers "max user processes" ("ulimit -u") is set to 4096 by default. So the first thing we learned is that it's necessary to set it to something much higher, which we now do in order to get the NFS-related problem reproduced.
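
As an illustration, the same limit can be inspected and raised from a Python launcher before it starts any children, since resource limits are inherited by child processes. This is only a sketch under assumptions: Linux, Python's standard "resource" module, and a hypothetical helper name raise_nproc_soft_limit. It can only raise the soft limit up to the hard limit; raising the hard limit itself needs an administrator.

```python
import resource


def raise_nproc_soft_limit(target=None):
    """Raise the soft "max user processes" limit (what `ulimit -u` shows).

    `target` is the wanted soft limit; None means "as high as the hard
    limit allows". Returns (old_soft, new_soft). The limit is never lowered.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    unlimited = resource.RLIM_INFINITY

    if soft == unlimited:
        return soft, soft  # nothing to raise

    if target is None or target == unlimited:
        new_soft = hard
    elif hard != unlimited and target > hard:
        new_soft = hard  # an unprivileged process cannot exceed the hard limit
    else:
        new_soft = target

    if new_soft != unlimited and new_soft <= soft:
        return soft, soft  # would not be an increase, leave it alone

    resource.setrlimit(resource.RLIMIT_NPROC, (new_soft, hard))
    return soft, new_soft


if __name__ == "__main__":
    old, new = raise_nproc_soft_limit()
    print("max user processes (soft):", old, "->", new)
```

If the hard limit is still too low for the number of parallel instances, it has to be changed at the system level instead.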

Now, the NFS-related problem seems to be related to some synchronization issue, as at some point we get a lot of python processes waiting for something with the following stack trace:
$ cat /proc/30876/stack
[<ffffffff9a912076>] futex_wait_queue_me+0xc6/0x130
[<ffffffff9a912e1b>] futex_wait+0x17b/0x280
[<ffffffff9a914b66>] do_futex+0x106/0x5a0
[<ffffffff9a915080>] SyS_futex+0x80/0x190
[<ffffffff9af8dede>] system_call_fastpath+0x25/0x2a
[<ffffffffffffffff>] 0xffffffffffffffff
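
To count how many processes are stuck like this, a small scan over /proc can help. The following is a rough sketch, not something we run in production: the helper names are hypothetical, it assumes the "[<address>] symbol+offset/size" line format shown above, and reading /proc/<pid>/stack normally requires root.

```python
import os


def is_futex_wait(stack_text):
    """Return True if a /proc/<pid>/stack dump contains a futex_wait frame."""
    symbols = []
    for line in stack_text.splitlines():
        if "] " not in line:
            continue
        # "[<addr>] futex_wait+0x17b/0x280" -> "futex_wait"
        frame = line.split("] ", 1)[1]
        symbols.append(frame.split("+", 1)[0])
    return "futex_wait" in symbols


def pids_in_futex_wait(proc_root="/proc"):
    """List PIDs under `proc_root` whose kernel stack shows futex_wait."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stack")) as f:
                if is_futex_wait(f.read()):
                    pids.append(int(entry))
        except OSError:
            # The process exited in the meantime, or we lack permission.
            continue
    return sorted(pids)


if __name__ == "__main__":
    print(pids_in_futex_wait())
```

Note that a futex wait on its own is normal for any thread blocked on a lock or condition variable, so the interesting signal is a large number of such processes that never make progress.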

We're still in the process of debugging that. Once we have more data, we'll happily share it!


From: testing-wg@... <testing-wg@...> on behalf of Hake Huang <hake.huang@...>
Sent: Monday, May 24, 2021 7:50 AM
To: testing-wg@... <testing-wg@...>
Subject: Re: [EXT] [testing-wg] Zephyr: Testing WG weekly call - Mon, 05/24/2021 1:00pm-2:00pm, Please RSVP #cal-reminder

Hi All,


Today's weekly meeting agenda


Follow-up from the last weekly meeting


1.     Can MCUboot be in ROM?

2.     Multi-branch checking in parallel. Alex, did you send out your scenario summary?


In my experience, this happens because your board's UART outputs some abnormal character that looks like an EOF to the console handler.



General topics:


1.     v2.6.0 board testing discussion.

a)      Any common pending issues found in this release?


2.     Zephyr summit topic discussion


3.     Round table discussion






From: testing-wg@... <testing-wg@...> On Behalf Of testing-wg@... Calendar via
Sent: May 24, 2021 20:50
To: testing-wg@...
Subject: [EXT] [testing-wg] Zephyr: Testing WG weekly call - Mon, 05/24/2021 1:00pm-2:00pm, Please RSVP #cal-reminder



Reminder: Zephyr: Testing WG weekly call

When: Monday, 24 May 2021, 1:00pm to 2:00pm, (GMT+00:00) UTC

Where: Microsoft Teams Meeting

An RSVP is requested.

Organizer: testing-wg@...



+1 321-558-6518 United States, Orlando (Toll)

Conference ID: 831 716 531#
