Re: [EXT] [testing-wg] Zephyr: Testing WG weekly call - Mon, 05/24/2021 1:00pm-2:00pm, Please RSVP #cal-reminder


Alexey Brodkin
 

Hi Hake, all,

Our scenario for reproducing the build problem is quite simple:
  1. Zephyr SDK installed on an NFS share
  2. Massively parallel execution, e.g. 10 instances of the "twister" script building & running the entire test suite for all QEMU platforms.
If we move the Zephyr SDK onto the build machine itself, the issue goes away.
Use of simulated platforms is important because their executions happen in parallel, while execution on real target boards is gated by the very limited number of available boards (we may have a few, but not 10s or even 100s).
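
For anyone who wants to try the same kind of load, the parallel part of the scenario above can be sketched roughly as follows. This is only an illustration, not our actual setup: run_instances() is a hypothetical helper, and the command it runs is a placeholder that you would replace with your own "twister" invocation (with the SDK on NFS).

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_instances(command, instances):
    """Run `instances` copies of `command` at the same time; return exit codes."""
    def run_one(_index):
        # Each copy is an independent child process, like a separate twister run.
        result = subprocess.run(command, stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL)
        return result.returncode

    # One worker thread per instance, so all copies really start in parallel.
    with ThreadPoolExecutor(max_workers=instances) as pool:
        return list(pool.map(run_one, range(instances)))


if __name__ == "__main__":
    # Placeholder command (assumption): substitute your real twister command line.
    placeholder = [sys.executable, "-c", "print('twister placeholder')"]
    print(run_instances(placeholder, instances=10))
```

Threads are enough here because each one only waits on its child process; the real work happens in the children.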

Another interesting observation: since we use one and the same technical user to run all "twister" instances (and subsequently all their child processes), we hit the limit on the number of simultaneously existing processes for one user quite early. On our servers "max user processes" ("ulimit -u") is set to 4096 by default. So the first thing we learned is that it's necessary to set it to something much higher, which we now do in order to get the NFS-related problem reproduced.
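
A quick way to check that limit from Python before starting a run is sketched below. It assumes Linux and uses the standard "resource" module; the helper names are made up for illustration. Note it only changes the limit for the calling process and its children, and it cannot go above the hard limit without root, so a system-wide change still has to be done by the admin.

```python
import resource


def nproc_limits():
    """Return the (soft, hard) per-user process limits; `ulimit -u` shows the soft one."""
    return resource.getrlimit(resource.RLIMIT_NPROC)


def raise_soft_nproc_to_hard():
    """Raise the soft limit to the hard limit for this process and its children."""
    _soft, hard = nproc_limits()
    # An unprivileged process may raise its soft limit up to the hard limit.
    resource.setrlimit(resource.RLIMIT_NPROC, (hard, hard))
    return nproc_limits()


if __name__ == "__main__":
    # A value of resource.RLIM_INFINITY (-1) means "unlimited".
    print("before:", nproc_limits())
    print("after: ", raise_soft_nproc_to_hard())
```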

Now, the NFS-related problem seems to be related to some synchronization issue, as at some point we get a lot of python processes waiting for something with the following stack trace:
---------------------->8----------------------
$ cat /proc/30876/stack
[<ffffffff9a912076>] futex_wait_queue_me+0xc6/0x130
[<ffffffff9a912e1b>] futex_wait+0x17b/0x280
[<ffffffff9a914b66>] do_futex+0x106/0x5a0
[<ffffffff9a915080>] SyS_futex+0x80/0x190
[<ffffffff9af8dede>] system_call_fastpath+0x25/0x2a
[<ffffffffffffffff>] 0xffffffffffffffff
---------------------->8----------------------
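
To count how many processes are stuck like this, a small parser for the /proc/<pid>/stack format shown above may help. This is just a sketch: the function names are hypothetical, the sample is the trace from this mail, and reading /proc/<pid>/stack on a live system normally requires root.

```python
import re

# One frame looks like: [<address>] symbol+0xOFFSET/0xSIZE
FRAME_RE = re.compile(r"^\[<[0-9a-f]+>\]\s+(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")

SAMPLE = """\
[<ffffffff9a912076>] futex_wait_queue_me+0xc6/0x130
[<ffffffff9a912e1b>] futex_wait+0x17b/0x280
[<ffffffff9a914b66>] do_futex+0x106/0x5a0
[<ffffffff9a915080>] SyS_futex+0x80/0x190
[<ffffffff9af8dede>] system_call_fastpath+0x25/0x2a
[<ffffffffffffffff>] 0xffffffffffffffff
"""


def stack_symbols(stack_text):
    """Return the kernel symbols in a stack dump, innermost frame first."""
    symbols = []
    for line in stack_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:  # lines without symbol+offset (e.g. the terminator) are skipped
            symbols.append(match.group(1))
    return symbols


def is_waiting_on_futex(stack_text):
    """True if the innermost frame is one of the futex_wait* functions."""
    symbols = stack_symbols(stack_text)
    return bool(symbols) and symbols[0].startswith("futex_wait")


if __name__ == "__main__":
    print(stack_symbols(SAMPLE))
    print(is_waiting_on_futex(SAMPLE))
```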

We're still in the process of debugging that. Once we have more data, we'll happily share it!

-Alexey


From: testing-wg@... <testing-wg@...> on behalf of Hake Huang <hake.huang@...>
Sent: Monday, May 24, 2021 7:50 AM
To: testing-wg@... <testing-wg@...>
Subject: Re: [EXT] [testing-wg] Zephyr: Testing WG weekly call - Mon, 05/24/2021 1:00pm-2:00pm, Please RSVP #cal-reminder
 

Hi All,

 

Today's weekly meeting agenda

 

Follow-up from last weekly meeting

 

1.     Can MCUboot be in ROM?

2.     Multi-branch checking in parallel. Alex, did you send out your scenario summary?

3.     https://github.com/zephyrproject-rtos/zephyr/issues/34571

In my experience, this happens because your board's UART outputs some unexpected character that looks like an EOF to the console handler.

 

 

General topics:

 

1.     v2.6.0 board testing discussion.

a)      Any common pending issues found in this release?

 

2.     Zephyr summit topic discussion

https://github.com/zephyrproject-rtos/zephyr/wiki/2021-Zephyr-Developer-Summit

 

3.     Round table discussion

 

Regards,

Hake

 

 

From: testing-wg@... <testing-wg@...> On Behalf Of testing-wg@... Calendar via lists.zephyrproject.org
Sent: May 24, 2021 20:50
To: testing-wg@...
Subject: [EXT] [testing-wg] Zephyr: Testing WG weekly call - Mon, 05/24/2021 1:00pm-2:00pm, Please RSVP #cal-reminder

 

Reminder: Zephyr: Testing WG weekly call

When: Monday, 24 May 2021, 1:00pm to 2:00pm, (GMT+00:00) UTC

Where: Microsoft Teams Meeting

An RSVP is requested.

Organizer: testing-wg@...

Description:

________________________________________________________________________________

+1 321-558-6518 United States, Orlando (Toll)

Conference ID: 831 716 531#


________________________________________________________________________________