
ARMv7 Cortex-A port for Xilinx Zynq7000

Immo Birnbaum
 

Hi,

I've been following the development of Zephyr ever since I attended Embedded World in Nuremberg back in 2016, and I'd like to thank all those who have contributed to the system's current feature set and hardware support. What the system now offers has made Zephyr a very attractive RTOS not only to me, but also to my employer, who has become interested in using it for several projects in the IoT field. The only catch is that one of our target hardware platforms is not yet supported, not only at the board/SoC level, but at the architecture level. We'd very much like to see Zephyr run on the dual-core Cortex-A9 (SMP isn't a requirement, though) contained in the Xilinx Zynq7000 SoC, despite the current lack of support for ARMv7 Cortex-A CPUs.

So I've sat down, equipped with a Digilent Zedboard (which to my knowledge can still be considered the de facto standard evaluation board for the Zynq7000 and is readily available) and a Lauterbach debugger, and started porting Zephyr to the v7/Cortex-A architecture. I considered this to be an interesting challenge: although I've used ARM-based systems for quite some time now (all the way back to my trusty ol' Apple Newton), my past development experience is on x86- and PowerPC-based systems and RTOSes like QNX or eCos.

I'm now at a point at which I'd call the progress I've made up to now a working proof of concept. Here's what I've done so far:

- Added cortex_a to arch/arm/core, derived from the cortex_r implementation.
- Added Zynq7000 SoC and Zedboard definitions plus device trees.
- Implemented a simple driver for the GIC PL-390 interrupt controller.
- Re-used the existing Xilinx TTC and UART drivers.
- Implemented a driver for the Xilinx GEM GBit Ethernet controller, which includes PHY initialization via MDIO and link monitoring. The PHY part contains some register accesses specific to the Marvell PHY used on the Zedboard, while the PHY reset and auto-negotiation management are generic. All of the driver's features are configurable via menuconfig.
- Implemented an interrupt-capable driver for the Xilinx AXI GPIO IP core, as the Zedboard features switches, pushbuttons and LEDs connected to the processor core via this IP core in the programmable logic part of the system.
- A driver for GPIO connected via the EMIO interface shouldn't be too hard either, although the number of available pins might be an issue here. The AXI GPIO driver is limited to single-channel operation (the IP core has an optional 2nd channel) because, for example, the pin mask of the GPIO callback struct is limited to 32 bits.
- Enabled FPU support and saving of the FPU's registers during a context switch. The register saving is unconditional as Cortex-A doesn't seem to have an equivalent to the Cortex-M's CONTROL register and the flag indicating that the FPU registers are currently in use. With the GNU ARM Embedded Toolchain that I'm currently using, the FPU works using the softfp ABI.
- The MMU is set up in the simplest possible way, with a page table consisting of nothing but 1 MB page entries. The part of the system's memory map where the RAM is located is configured as cacheable/bufferable, which is a requirement for any unaligned accesses (which can for example be found in the IP stack) to work. If the MMU isn't set up this way, the result is an illegal instruction exception. The rest of the memory map is set up as strongly ordered, covering areas such as the SLCR, the peripheral register space and the OCM.
- Added a corresponding QEMU target using the Zynq7000 template.
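
To illustrate the flat MMU setup described above, here is a minimal sketch of how such a table of 1 MB entries could be filled, assuming the ARMv7-A short-descriptor format. This is not the code from the port: the function names and the idea of passing the RAM window as parameters are made up for illustration, and only the descriptor bit positions come from the architecture itself.

```c
#include <assert.h>
#include <stdint.h>

/* ARMv7-A short-descriptor format, first-level "section" entry (1 MB). */
#define SECTION_TYPE   0x2u          /* bits [1:0] = 0b10: section */
#define SECTION_B      (1u << 2)     /* bufferable */
#define SECTION_C      (1u << 3)     /* cacheable */
#define SECTION_AP_RW  (0x3u << 10)  /* AP[1:0] = 0b11: read/write */
#define SECTION_SHIFT  20u           /* 1 MB granularity */
#define SECTION_MASK   0xFFFFFu
#define NUM_SECTIONS   4096u         /* 4 GB address space / 1 MB */

/* Build one identity-mapped entry. With TEX = 0, C = 1, B = 1 the region
 * is normal, cacheable/bufferable memory; with TEX = 0, C = 0, B = 0 it
 * is strongly ordered. */
static uint32_t section_entry(uint32_t index, int is_ram)
{
    uint32_t entry = (index << SECTION_SHIFT) | SECTION_AP_RW | SECTION_TYPE;

    if (is_ram) {
        entry |= SECTION_C | SECTION_B;
    }
    return entry;
}

/* Fill a flat table: RAM is cacheable/bufferable, everything else is
 * strongly ordered. On real hardware the table must be 16 KB aligned;
 * that is the caller's responsibility in this sketch. */
void build_flat_page_table(uint32_t table[NUM_SECTIONS],
                           uint32_t ram_base, uint32_t ram_size)
{
    /* Hypothetical restriction: the RAM window must be 1 MB aligned. */
    assert((ram_base & SECTION_MASK) == 0u);
    assert((ram_size & SECTION_MASK) == 0u);

    uint32_t first = ram_base >> SECTION_SHIFT;
    uint32_t count = ram_size >> SECTION_SHIFT;

    for (uint32_t i = 0; i < NUM_SECTIONS; i++) {
        int is_ram = (i >= first) && ((i - first) < count);

        table[i] = section_entry(i, is_ram);
    }
}
```

Domain and access-permission handling is deliberately minimal here (domain 0, full read/write access), which matches the "simplest possible" spirit of the description but would need tightening for anything beyond a proof of concept.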

Still, the to-do list has quite a few points on it:

- CMSIS for Cortex-A is not yet integrated, e.g. SLCR or FPU register accesses are hand-coded.
- Caches not activated yet, as my own driver implementations don't contain any barriers yet where they probably should have some.
- Only compiled with the GNU ARM Embedded Toolchain so far.
- QEMU only tested to the point where the hello world and philosophers demos are apparently working with a basic IRQ/Timer/UART setup.
- No tests from the testsuite have run so far.
- SMP not addressed at all, but this feature isn't that important to me personally. A single 666 MHz core is quite capable by itself...
- None of the security features are addressed; it's all secure mode-only for now.
- The interrupt controller uses the same workaround as the GIC PL-400 in the Cortex-R port: store a pointer to the driver attached to vector [0] as there is no well-defined interrupt controller such as the Cortex-M's NVIC. Therefore, currently all interrupt vectors are offset by one.
- No high-level/scripted configuration of the MMU possible for now.
- I'm currently forcing the ARM instruction set, so no Thumb code is being generated.
- I haven't yet looked into whether any other FP ABIs can be supported or not, and there are currently no configuration options for the FPU such as on how to handle denormals.
- Last but not least, the Zynq has some more peripherals generally supported by Zephyr but for which no drivers exist for now. As the processor system in the Zynq usually relies on *some* content in the programmable logic part (e.g. the AXI GPIO IP core), a driver for the Xilinx devcfg interface might be an integral part of the SoC support (although u-boot can handle this just as well), but there are also things like CAN, ADC, SPI, and the Zedboard features an OLED display...
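
The "offset by one" workaround mentioned in the interrupt controller item above can be sketched as follows. This is only an illustration of the indexing scheme under my own assumptions, not the code from the port or from the Cortex-R implementation: all names (isr_table, irq_connect, irq_dispatch, NUM_IRQS) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_IRQS 96u /* hypothetical number of hardware interrupt lines */

typedef void (*isr_fn)(const void *arg);

struct isr_entry {
    const void *arg;
    isr_fn isr;
};

/* Slot 0 is reserved for the interrupt controller driver, so the handler
 * for hardware IRQ n lives in slot n + 1. */
static struct isr_entry isr_table[NUM_IRQS + 1u];

/* Store the pointer to the interrupt controller driver in slot 0. */
void isr_table_set_controller(const void *controller_dev)
{
    isr_table[0].arg = controller_dev;
    isr_table[0].isr = NULL;
}

/* Register a handler for hardware IRQ 'irq' (0-based). */
void irq_connect(unsigned int irq, isr_fn isr, const void *arg)
{
    assert(irq < NUM_IRQS);
    isr_table[irq + 1u].isr = isr;
    isr_table[irq + 1u].arg = arg;
}

/* Dispatch hardware IRQ 'irq' to its handler, if one is registered. */
void irq_dispatch(unsigned int irq)
{
    assert(irq < NUM_IRQS);

    const struct isr_entry *entry = &isr_table[irq + 1u];

    if (entry->isr != NULL) {
        entry->isr(entry->arg);
    }
}

/* Example handler: adds the value pointed to by 'arg' to a counter. */
static int isr_hits;

static void counting_isr(const void *arg)
{
    isr_hits += *(const int *)arg;
}
```

The drawback is visible right away: every consumer of the table has to remember the +1, which is why a cleaner solution is desirable in the long run.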

I'd like to contribute my work to the project. Is this of any interest to you all? Are there any must-have items I should tackle beforehand, most likely regarding testing? If so, I'd probably have to start merging my modifications into the current code base, as I'm still working on a code base acquired shortly before the arch/arm path was split up into AARCH32 and AARCH64. What would be the next steps in providing this to the community, considering that this is a bit more than just a stand-alone device driver?

Best regards from Germany,
Immo

Cufi, Carles
 

Hi Immo,

 

Thanks for your email and interest in the Zephyr project. Your contributions are certainly welcome!

Basic Armv8, Cortex-A support for Zephyr was merged recently:
https://github.com/zephyrproject-rtos/zephyr/pull/20263

MMU support is an ongoing effort:
https://github.com/zephyrproject-rtos/zephyr/pull/22446

You can track the progress of Cortex-A support here:
https://github.com/zephyrproject-rtos/zephyr/issues/22411

Please comment on the existing Pull Requests and Issues and feel free to open your own Pull Requests to extend the Cortex-A support in Zephyr. I have copied Carlo Caione on this email, since he has been driving this effort so far.

Regards,
Carles

 

 


Carlo Caione
 

On 11/02/2020 15:33, Cufi, Carles wrote:
Hi Immo,
Hi everyone,

+CC Stephanos Ioannidis (Cortex-R guru)

/snip
Please comment on the existing Pull Requests and Issues and feel free to open your own Pull Requests to extend the Cortex-A support in Zephyr. I have copied Carlo Caione on this email, since he has been driving this effort so far.
This, please. Just a quick note: since ARMv7/Cortex-A is still a 32-bit architecture, it is probably closer to the current ARMv7/Cortex-R work than to my ARMv8/Cortex-A work.

/snip
The only catch is that one of our target hardware platforms is not yet supported, not only at the board/SoC level, but at the architecture level. We'd very much like to see Zephyr run on the dual core (SMP isn't a requirement, though) Cortex-A9 contained in the Xilinx Zynq7000 SoC, despite the lacking support for ARMv7 Cortex-A CPUs.
The Cortex-R architecture should be the closest existing one to ARMv7/Cortex-A if you want a starting point.

/snip
I'm now at a point at which I'd call the progress I've made up to now a working proof of concept. Here's what I've done so far:
/snip

It looks like you have done a lot already and you can definitely start pushing things upstream if they are ready for review and are self-contained / self-testable.

The problem I encountered when working on the ARMv8 port was that upstream wants to have a fully working port before considering the work ready for merging. This means that you must be able to pass as many tests as possible (in theory all of them). The "push early push often" philosophy seems to not apply very well to the Zephyr project when dealing with new architectures.

- The MMU is set up in the simplest possible way, with a page table consisting of nothing but 1 MB page entries. The part of the system's memory map where the RAM is located is configured as cacheable/bufferable, which is a requirement for any unaligned accesses (which can for example be found in the IP stack) to work. If the MMU isn't set up this way, the result is an illegal instruction exception. The rest of the memory map is set up as strongly ordered, covering areas such as the SLCR, the peripheral register space and the OCM.
MMU work for ARMv8 is ongoing at https://github.com/zephyrproject-rtos/zephyr/pull/22446

- Added a corresponding QEMU target using the Zynq7000 template.
Still, the to-do list has quite a few points on it:
- CMSIS for Cortex-A is not yet integrated, e.g. SLCR or FPU register accesses are hand-coded.
Not a problem. That could be added later.

- Caches not activated yet, as my own driver implementations don't contain any barriers yet where they probably should have some.
Not a problem.

- Only compiled with the GNU ARM Embedded Toolchain so far.
Did you try to use the Zephyr SDK?

- QEMU only tested to the point where the hello world and philosophers demos are apparently working with a basic IRQ/Timer/UART setup.
- No tests from the testsuite have run so far.
What's the issue with the remaining tests?

- SMP not addressed at all, but this feature isn't that important to me personally. A single 666 MHz core is quite capable by itself...
SMP not required for an initial submission.

- None of the security features are addressed, it's all secure mode-only for now.
Not a problem.

- The interrupt controller uses the same workaround as the GIC PL-400 in the Cortex-R port: store a pointer to the driver attached to vector [0] as there is no well-defined interrupt controller such as the Cortex-M's NVIC. Therefore, currently all interrupt vectors are offset by one.
Currently worked out at https://github.com/zephyrproject-rtos/zephyr/pull/22718

- No high-level/scripted configuration of the MMU possible for now.
Not required.

- I'm currently forcing the ARM instruction set, there's currently no Thumb code being generated.
Not a problem.

- I haven't yet looked into whether any other FP ABIs can be supported or not, and there are currently no configuration options for the FPU such as on how to handle denormals.
Not a problem.

- Last but not least, the Zynq has some more peripherals generally supported by Zephyr but for which no drivers exist for now. As the processor system in the Zynq usually relies on *some* content in the programmable logic part (e.g. the AXI GPIO IP core), a driver for the Xilinx devcfg interface might be an integral part of the SoC support (although u-boot can handle this just as well), but there are also things like CAN, ADC, SPI, and the Zedboard features an OLED display...
Are you able to use QEMU to emulate working hardware?

I'd like to contribute my work to the project. Is this of any interest to you all? Are there any must-have items I should tackle beforehand, most likely regarding testing? If so, I'd probably have to start merging my modifications into the current code base, as I'm still working on a code base acquired shortly before the arch/arm path was split up into AARCH32 and AARCH64. What would be the next steps in providing this to the community, considering that this is a bit more than just a stand-alone device driver?
IMO the next step is making sure you are able to pass as many tests as possible in a QEMU environment when using the Zephyr SDK. If for some reason some tests cannot pass, just explain why when opening the PR. This should be enough to convince people to review your code.

Cheers and good luck :)

--
Carlo Caione

Henrik Brix Andersen
 

Hello Immo,

That is great news! Thank you for working on this.

I started on adding Cortex-A9/Zynq7000 support to Zephyr a couple of weeks ago, but stalled it while awaiting the Cortex-R port (which is very similar) to stabilise.
I am very interested in this port and I would be more than happy to help in reviewing and testing a pull request for adding this to mainline Zephyr.

I started out with QEMU and the xilinx-zynq-a9 machine definition, but did not get as far as what you describe below.

Please keep us posted on any progress and let me know if you need help in creating a pull request for this.

Best regards,
Brix
--
Henrik Brix Andersen

Immo Birnbaum
 

Hi Carles,

Thanks for those initial pointers! Besides the ARMv8 Cortex-A port, I'll also keep an eye on what's going on with the Cortex-R port since, as Carlo mentioned, the ARMv7 Cortex-A and the Cortex-R are quite close relatives.

Immo Birnbaum
 

Hi Carlo,

Just a quick note: since ARMv7/Cortex-A is still a 32-bit architecture, it is probably closer to the current ARMv7/Cortex-R work than to my ARMv8/Cortex-A work.

To my knowledge, the most significant differences are MMU vs. MPU and the newer interrupt controller in the Cortex-R, which expanded the old controller's basic secure/non-secure model into a multiple-priority model. Other than the FPU stuff, I didn't mess much with the assembly code that implements the context switch.

Did you try to use the Zephyr SDK?

My memory on that is a bit hazy, as I started back in November, but to the best of my recollection I followed the standard 'getting started' procedure at first, including the SDK setup. However, the first Zephyr-compatible board I came across at work was an NXP eval kit based around a Cortex-M33 (the LPCXPRESSO55S69), and compiling the demos with the supplied cross-compiler failed as it did not yet support the M33. That's how I ended up with the alternative ARM toolchain as described in the "3rd party toolchains" section. I can easily set up a fresh development VM and give the Zephyr SDK another try, as the A9 is a proven piece of hardware that has long been supported.

What's the issue with the remaining tests?

Fortunately, there's no technical issue here - it's just an issue of me not yet having had time to read up on how the test framework works.

Are you able to use QEMU to emulate working hardware?

The QEMU template for the Zynq7000 supports all important components (GIC, TTC, UART) plus the two Ethernet controllers. There's also USB which I'm not looking at right now, plus some SPI/I2C support mostly relevant for flash memory connectivity. While the bitstream programming of the FPGA can be simulated, IP cores such as the GPIO don't seem to be on board, so GPIO won't work while the rest will.

IMO the next step is making sure you are able to pass as many tests as possible in a QEMU environment when using the Zephyr SDK.
Will do, but prior to that I'll most likely start by updating my codebase and merging my changes into the new AARCH32/AARCH64 structure. I guess with separate assembly code for each architecture's context switch, that part might be a lot less cluttered than what I have right now.

Thanks for the advice!
Regards,
Immo

Immo Birnbaum
 

Hi Henrik,

I'll keep you updated. Thanks for the offer regarding the pull request preparation, I'll likely get back to you on that :)

Regards,
Immo

Immo Birnbaum
 

Hi all,

Here's my first update on the topic of the Zynq-7000 port. This is the progress so far:

- Set up a fresh development VM which now uses the Zephyr SDK, which works fine for the Cortex-A9.
- Forked the Zephyr repository and set up a feature branch, which can be found here: https://github.com/ibirnbaum/zephyr/tree/armv7_cortex_a
- Merged all of the changes/additions described above into the current code base.
- Updated the AXI GPIO driver to match the GPIO driver API which was heavily modified in the meantime.
- Kicked out my own GIC PL-390 interrupt controller driver and switched over to Stephanos Ioannidis' GICv1 implementation, including updated IRQ descriptors in the device tree files.
- I looked into the testsuite and ran tests on the QEMU Cortex-A9 target, plus a handful of test cases on the actual hardware (as downloading the binary to the board is still a manual process using the Lauterbach TRACE software). The results don't look too bad. For example, these are the results of the 'kernel' test suite:
* 86 test configurations selected, 8 configurations discarded due to filters
* 24 tests skipped (e.g. SMP/ARMv8/userspace related test cases)
* Out of the 62 remaining test cases, 60 pass. One (kernel.timer.tickless) fails to build due to an unresolved symbol (z_clock_uptime). This is odd for two reasons: one, all other 'tickless'-related tests are skipped and two, the Local APIC timer driver seems to be the only timer driver implementing this function. To me, this looks more like a testsuite configuration issue? The other failure is the arch.interrupt test case, which actually runs but fails due to an assertion regarding the expected state of an IRQ to be tested. This is due to the target IRQ selection logic only being implemented for Cortex-M when it comes to the ARM architecture.
The following testsuites pass all test cases on the QEMU target which aren't filtered out for whatever reason (I didn't blacklist anything myself):
- lib
- misc
- portability
- posix
- shell
- subsys
- ztest
I'll have to look into the details as to why, in some cases, more test cases are filtered out than executed. In some cases, e.g. ARMv8-specific stuff, it's pretty obvious, but in others I suspect that I ought to whitelist test cases or subsystems for testing, likely in the target's YAML files? For example, despite having full Ethernet support, the 'net' testsuite in its current state does pretty much nothing.

I'll keep you updated, and I'll look into the ongoing discussions and the mechanics of pull requests. I could start simple: for example, the Xilinx TTC timer driver had a faulty prescaler calculation routine. As this source file is already in the main repository, this might be a good exercise for a pull request. Until then, I'd appreciate any feedback if anyone feels like experimenting with my fork of the repository.
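
For anyone not familiar with the TTC, here is a rough sketch of what a prescaler calculation for it involves. It is not the fix from my fork and not the driver's actual code: the names are hypothetical, and it assumes the Zynq-7000 TTC's 16-bit counter and a 4-bit prescale value N that divides the input clock by 2^(N+1) when the prescaler is enabled. Whether the interval register takes the cycle count as computed here or a value adjusted by one is left open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TTC_COUNTER_MAX 0xFFFFu /* assumption: 16-bit counter */
#define TTC_PS_V_MAX    15u     /* assumption: 4-bit prescale value */

struct ttc_timing {
    bool prescaler_enabled;
    uint8_t ps_v;      /* prescale value N, divides the clock by 2^(N+1) */
    uint16_t interval; /* counter cycles per tick */
};

/* Pick the smallest prescaler for which one tick fits into the counter.
 * Returns false if no valid setting exists for the requested tick rate. */
bool ttc_calc_timing(uint32_t clk_hz, uint32_t ticks_per_sec,
                     struct ttc_timing *out)
{
    assert(out != NULL);

    if (clk_hz == 0u || ticks_per_sec == 0u || ticks_per_sec > clk_hz) {
        return false;
    }

    /* Input clock cycles per tick without any prescaling. */
    uint32_t cycles = clk_hz / ticks_per_sec;

    if (cycles <= TTC_COUNTER_MAX) {
        out->prescaler_enabled = false;
        out->ps_v = 0u;
        out->interval = (uint16_t)cycles;
        return true;
    }

    for (uint32_t n = 0u; n <= TTC_PS_V_MAX; n++) {
        uint32_t scaled = cycles >> (n + 1u); /* divide by 2^(N+1) */

        if (scaled <= TTC_COUNTER_MAX) {
            out->prescaler_enabled = true;
            out->ps_v = (uint8_t)n;
            out->interval = (uint16_t)scaled;
            return true;
        }
    }
    return false;
}
```

Choosing the smallest workable prescaler keeps the timer resolution as fine as possible, at the cost of truncating the interval when the cycle count is not a multiple of the divider.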

Best regards,
Immo

Stephanos Ioannidis
 

Hi Immo,

 

Thanks for looking into this.

For the time being, I believe focusing on the following issues would help in getting these changes upstreamed:

  1. Getting all applicable tests to pass
    1. At least, all applicable `kernel` tests must pass.
    2. Some tests (e.g. `benchmark` and `interrupt`) may require additional work to pass (see #22669 and #22670).
  2. Cleaning up commits (all the usual stuff)
    1. Breaking changes into more manageable chunks/commits
    2. Squashing commits that should really be one commit

The following are some ongoing issues that could be noteworthy:

  • Refactor ARM interrupt system (Cortex-A & Cortex-R) (#22718)
  • Implement benchmark tests for Cortex-R and Cortex-A (#22669)
  • Implement GIC-based ARM interrupt tests (#22670)
  • arch: arm: aarch32: Allow selecting compiler instruction set (#22741)
  • AArch64 / Cortex-A port improvements / TODO (#22411)
  • soc: arm64: Add support for Xilinx ZynqMP APU (#22418)

There is also the #arch-arm channel on Slack where we discuss ARM arch development topics.

Regards,
Stephanos

 

From: devel@... <devel@...> On Behalf Of Immo Birnbaum via Lists.Zephyrproject.Org
Sent: Wednesday, February 26, 2020 9:55 PM
To: devel@...
Cc: devel@...
Subject: Re: [Zephyr-devel] ARMv7 Cortex-A port for Xilinx Zynq7000

 


Jukka Rissanen
 

Hi Immo,

for the networking tests to run, I think you need to say

- netif:eth

in the board YAML file (instead of just "- net").


Cheers,
Jukka

On Wed, 2020-02-26 at 04:54 -0800, Immo Birnbaum wrote:

Immo Birnbaum
 

Hi Jukka,

thanks for the info. I looked at the YAML files of other platforms and eventually came across the "- netif:eth" switch. Once I had that integrated, I had a bit of a fight with the QEMU system configuration in conjunction with the UART pipe and SLIP drivers used by the networking tests. After fixing a minor bug in one of the instantiation macros of the Xilinx UART driver, I eventually came up with a configuration in which both the UART console and the UART pipe driver work. Also, having the Ethernet controller driver perform polling reads/writes to a PHY that doesn't exist in QEMU doesn't help the system boot. Fortunately, I had made the PHY initialization an optional switch in the driver's Kconfig file right from the start.

I don't have the exact number of executed tests at hand right now, but after all those changes, running sanitycheck on tests/net resulted in a significant number of test cases being executed, of which only 2 or 3 failed: the actual tests all passed, but the test cases failed due to assertions. Apparently, more interfaces than expected are being created, in particular in test cases involving VLANs. I'll leave that area as it is for now, as I'm currently investigating the two test cases in tests/kernel that still fail.

Best regards,
Immo

Jukka Rissanen
 

Hi Immo,

the networking unit tests under the tests/net directory should be
self-contained, i.e. they should not need SLIP to be able to run. The
sanitycheck script also compiles the samples/net/ programs, and those
indeed need SLIP or other network connectivity, but these are not run
by sanitycheck (unless you flash them to the actual device).


Cheers,
Jukka

On Fri, 2020-02-28 at 02:53 -0800, Immo Birnbaum wrote:

Immo Birnbaum
 

Hey everyone,

here's another update regarding the ARMv7 port: other than the 'interrupt' test, which I blacklisted, I've now got green lights on all kernel tests running on the QEMU Cortex-A9 (Zynq) target which are not automatically de-selected for whatever reason (61 passed, 23 skipped, 10 discarded). A solution for the 'interrupt' test assertion failure is already being discussed, and I have moved over completely to Stephanos' GIC interrupt handling code (my code base doesn't yet include the most recent modifications of the interrupt handling, but I'll take care of that soon). Any solution to this issue should therefore carry over directly to the ARMv7-A branch, eventually making the blacklisting obsolete.

There's been one major change along the way: I've exchanged the timer that drives the system clock. Some of the tests that failed were related to tickless mode, and the readily available Xilinx TTC driver isn't suitable for this mode. The TTC is a self-reloading 16-bit counter, so dynamically calculated comparator values would require on-the-fly re-calculation of the prescaler, which would lead to varying imprecision in timing. You might also, for example, miss an entire interval if interrupts are globally locked for more than one period. I've therefore switched over to the ARM global timer, for which Carlo has already provided a high-level interface plus an access implementation in aarch64. The ARM global timer is similar to the timers that drive the system clock on the other architectures: it counts up continuously while comparing against an absolute value. While on ARMv8 access is possible via special instructions, on ARMv7 the timer's identical register layout is memory-mapped. So I've added a register-based implementation in aarch32, for which the timer's device tree entry must be supplied with an additional base address. Using this timer, the tickless mode tests passed as well. The fact that the tests failed with the TTC driver might possibly be a test suite issue: the TTC driver doesn't set the tickless capability config flag when it is chosen in the build configuration, yet the test cases didn't get filtered out?
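To illustrate the prescaler problem mentioned above, here is a minimal sketch of fitting an arbitrary timeout into a 16-bit counter. It assumes a power-of-two prescaler with a maximum divider of 2^16; the struct and function names are hypothetical and are not taken from the existing TTC driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical illustration only; not taken from the Zephyr TTC driver. */
struct ttc_fit {
	uint8_t  div_log2; /* clock divider = 2^div_log2 (0 = prescaler off) */
	uint16_t count;    /* 16-bit interval/match value */
	uint64_t actual;   /* input clock cycles actually covered */
};

#define TTC_MAX_DIV_LOG2 16U /* assumed maximum divider of 2^16 */

static struct ttc_fit ttc_fit_cycles(uint64_t cycles)
{
	struct ttc_fit f = { 0U, 0U, 0U };
	uint64_t count;

	assert(cycles > 0U);

	/* Grow the divider until the scaled count fits into 16 bits. */
	while ((cycles >> f.div_log2) > UINT16_MAX &&
	       f.div_log2 < TTC_MAX_DIV_LOG2) {
		f.div_log2++;
	}

	/* Clamp timeouts that exceed even the largest divider. */
	count = cycles >> f.div_log2;
	if (count > UINT16_MAX) {
		count = UINT16_MAX;
	}

	f.count = (uint16_t)count;
	/* The bits shifted out above are lost: this is the timing error. */
	f.actual = (uint64_t)f.count << f.div_log2;
	return f;
}
```

For a timeout of 1111111 input clock cycles, this sketch picks a divider of 32 and a count of 34722, which covers only 1111104 cycles. The size of that error changes with every requested timeout, which is the varying imprecision a continuously counting timer with an absolute comparator avoids.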

There's just one catch: I had to extend the ARM global timer's interface a little in the aarch32 implementation. Carlo's implementation provides the basic functions for reading the current value and writing to the comparator, which I re-implemented for aarch32, but I had to add functions that read and write the state of the interrupt status ("event") bit. Initially I observed that the timer's ISR always ran twice in a row (the IAR register returns the timer's vector twice in a row before returning 0x3FF) when not using the auto-reload feature in tick-based mode (when auto-reload is used, the interrupt works just as expected). This erroneously incremented the comparator value and led to some really wonky timing behaviour. I spent the best part of a day experimenting with various things such as extra barriers, puzzled as to why these double interrupts kept occurring. Eventually, I came across both ARM's and Xilinx's errata documentation: lo and behold, this behaviour is an actual bug in the MPCore silicon (which QEMU even emulates!). The suggested workaround is to clear the event bit *after* updating the comparator (some value must always be written from within the ISR; in tickless mode, this is UINT64_MAX) and to check the bit's value whenever the ISR is entered. During the second, erroneous execution of the ISR, polling the event bit returns 0, as the comparator hasn't matched yet, and the ISR exits upon detecting that no event is pending. This is also how the Linux folks implemented the workaround in their driver. It doesn't prevent the extra ISR execution, but at least it doesn't screw up the system's timing.

Carlo, would you mind if I added a menuconfig option below the ARM global timer in order to enable the (ARMv7-specific) workaround and included the special handling #ifdef'd in arm_arch_timer.c -> arm_arch_timer_compare_isr()? Also, do you see any use in adding an option enabling the auto-update of the comparator in tick-based mode? It doesn't improve the interrupt latency in any way, it would just save two register writes in my case. I guess it's not that relevant, what do you think?
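For illustration, the control flow of the workaround described above could be sketched roughly as follows. This is not the code from the branch: the timer is modelled as a plain struct instead of memory-mapped registers so that the logic can be run on a host, and all names (gt_model, gt_compare_isr and the helpers) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Simplified software model of the timer state. On real hardware these
 * would be memory-mapped register accesses; all names are hypothetical.
 */
struct gt_model {
	uint64_t comparator; /* absolute compare value */
	bool event;          /* interrupt status ("event") bit */
};

static bool gt_event_pending(const struct gt_model *t)
{
	return t->event;
}

static void gt_set_comparator(struct gt_model *t, uint64_t value)
{
	t->comparator = value;
}

static void gt_clear_event(struct gt_model *t)
{
	t->event = false;
}

/*
 * Returns true if a genuine compare event was handled, false if the
 * ISR was entered spuriously (the second, erroneous invocation).
 */
static bool gt_compare_isr(struct gt_model *t, uint64_t next_compare)
{
	assert(t != NULL);

	if (!gt_event_pending(t)) {
		return false; /* nothing pending: leave timing state untouched */
	}

	/* A value must always be written; UINT64_MAX if nothing is scheduled. */
	gt_set_comparator(t, next_compare);

	/* Clear the event bit only after the comparator has been updated. */
	gt_clear_event(t);

	return true;
}
```

Calling gt_compare_isr() twice in a row on a model whose event bit is set handles the first call and returns early on the second, so the comparator is only advanced once even though the ISR runs twice.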

The one thing that works but isn't yet as clean as I would like it to be (and which should also have a matching test case for aarch32) is the FPU register saving. So far, I've completely skipped the CONFIG_FP_SHARING setting wherever I've added a branch for ARMv7: if it's an ARMv7-A and it has an FPU, the register saving is always performed. Yet, since there is no marker for which single thread gets to use the non-shared FPU registers, as there is on Cortex-M, and the register saving basically isn't optional once the FPU is in play, I should probably select the FP_SHARING flag automatically just to keep the semantics clean. I also still have to add the setting of the initial FPSCR and FPEXC values to the arch-specific thread creation code. More on that in the next update, I guess...

I'll have a look into how pull requests work and start off small with the minor bugfixes in the existing UART/TTC drivers. With regard to the rest of my work, I still have to work out with the bigwigs at work whether I personally get to take the credit in the copyright notices of any new sources, or whether they'll mention the company. After all, they provided the resources.

Best regards,
Immo