Situation with net APIs testing
Paul Sokolovsky
Hello,
As I'm approaching the final steps in preparing the BSD Sockets patchset for submission, I'm looking into a way to add some tests for it. Testing networking functionality is hard by default, because in general it requires networking hardware, even if "virtual": for QEMU networking, that means having tools like tunslip6 from the net-tools repo running. During prototyping work, I learnt there are loopback capabilities when binding and connecting to the same netif, but that still requires net-tools running just to get QEMU to start up with networking support.

Well, I took a look at tests/net and saw a whole bunch of tests, whoa! I gave tests/net/tcp a try: some cases passed, some failed, hmm. But then I killed the net-tools/loop-slip-tab.sh script and the test ran in the same manner. Whoa, so we have a means to run networking tests without any requirements on the host side, which means we can run them as part of the sanitycheck testsuite! But 8 tests in tests/net/ have build_only=true; any wonder they're broken?

Anyway, I looked at what's involved in running free of net-tools, and figured it's CONFIG_NET_L2_DUMMY. I added it to my sockets test, and got only a segfault in return. After debugging, it turned out to be the same issue already faced by me and other folks: if there are no netifs defined, the networking code is going to crash (instead of printing a clear error to the user):

https://jira.zephyrproject.org/browse/ZEP-2105

But how do the tests in tests/net/ run without crashing, then? Here's the answer:

    zephyr/tests/net$ grep NET_DEVICE -r * | wc
         22      42    1532

So, almost each and every test defines its own test interface. One would think that if we have 22 repetitive implementations of test interfaces, whose main purpose is to be just loopback interfaces, then we'd have a loopback interface in the main codebase. But nope, as confirmed by Jukka on IRC, we don't.

Summary:

1. Writing networking tests is hard, but in Zephyr it takes extraordinary, agonizing effort. The most annoying part is that all the needed pieces are there, but instead of presenting a nice picture, they form a mess which greets you with crashes if you try to change anything.

2. There are existing big (~20K each) tests which fail, apparently because they aren't run, hence the bitrot. Why do we need these huge, detailed tests if we don't run them? (An alternative explanation is that there's something wrong with my system, and yep, I'd be glad to know what I'm still not doing right with Zephyr after working on it for a year.)

I'd be glad if more experienced developers could confirm whether it's really like the above, or whether I'm missing something. And I'll be happy to work on the above issues, but in the meantime, I'll need to submit BSD Sockets with rather bare and hard-to-run (not automated) tests due to the situation above.

Thanks,

--
Best Regards,
Paul

Linaro.org | Open source software for ARM SoCs
Follow Linaro: http://www.facebook.com/pages/Linaro
http://twitter.com/#!/linaroorg - http://www.linaro.org/linaro-blog
Jukka Rissanen
Hi Paul,
On Wed, 2017-06-14 at 18:10 +0300, Paul Sokolovsky wrote:
> Hello,

When we had gerrit and jenkins, some of the net tests ran slightly longer than what was desired, so they were marked as build only. Now that the situation is different with github and shippable, we can change this. So I will prepare a patch that activates those tests that can be activated.

I looked through tests/net to see the current status of the tests:

ieee802154/crypto
  * This cannot be run on qemu as it requires suitable hw

tcp
  * Test does not pass, needs fixing

mld
  * Test does not pass, needs fixing

ipv6
  * Test does not pass, needs fixing

lib/mqtt_publisher
  * Test requires real qemu to run. This needs to be converted

lib/mqtt_subscriber
  * Test requires real qemu to run. This needs to be converted

buf
  * This test runs ok so build_only=true can be removed.

all
  * This is an intentional compile test that activates all network config options and tries to compile the binary. The resulting binary cannot be run, mostly because of memory requirements and the lack of a suitable test environment. The only issue with this test is that we should remember to add and enable new net config options in this test case.

All other test programs (24 of them), each consisting of quite a few individual tests, are run automatically by CI, so the situation is not as bleak as you indicated here.

I will fix the relevant failing tests, as they have bit rotted since they were written. Converting the two mqtt tests to not use real qemu requires a bit more work.

No, the interface is not a loopback interface, although it might look like one. The purpose of the interface that is created in each of the tests is to simulate a real network, so that we do not have to connect to the outside world and the test is self-contained. So it kind of looks like a loopback interface, but in this case the source and destination IP addresses are not the same (as they would be with a loopback interface): typically we want to test some real behavior of the system, so the src/dest addresses should differ.

The loopback support actually has limited use cases, and we probably need to make it optional (behind a Kconfig option) in the code, as normally there should be no need to send anything back to yourself in the real world.

> then

I am not sure what kind of mess you mean here, but patches are welcome as always to rectify this. Some explanation is given above.

> there's

Hmm, I missed the point of your last sentence.

Cheers,
Jukka
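The distinction Jukka draws above, between a true loopback interface and a per-test interface that simulates a remote peer, can be illustrated with a small model. This is only a sketch in Python, not Zephyr code: `Packet`, `FakeInterface`, `loopback` and `simulated_peer` are hypothetical names invented for the illustration, and the addresses come from the IPv6 documentation prefix.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    payload: bytes


class FakeInterface:
    """Hypothetical test interface: send() passes the packet to a hook,
    and whatever the hook returns is queued as if it had been received."""

    def __init__(self, addr: str,
                 on_send: Callable[[Packet], Optional[Packet]]):
        self.addr = addr
        self._on_send = on_send
        self.rx_queue: List[Packet] = []

    def send(self, pkt: Packet) -> None:
        reply = self._on_send(pkt)
        if reply is not None:
            self.rx_queue.append(reply)


def loopback(pkt: Packet) -> Packet:
    # Loopback: the very same packet comes straight back.
    return pkt


def simulated_peer(pkt: Packet) -> Packet:
    # Simulated network: a pretend remote host answers, so the
    # source and destination addresses are swapped in the reply.
    return Packet(src=pkt.dst, dst=pkt.src, payload=pkt.payload)


LOCAL = "2001:db8::1"
PEER = "2001:db8::2"

lo = FakeInterface(LOCAL, loopback)
lo.send(Packet(LOCAL, LOCAL, b"ping"))

sim = FakeInterface(LOCAL, simulated_peer)
sim.send(Packet(LOCAL, PEER, b"ping"))
```

In the loopback case the received packet has identical source and destination addresses, while in the simulated case the reply arrives from a different address, which is the "real behavior" the existing test interfaces are meant to exercise.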
Paul Sokolovsky
Hello Jukka,
On Thu, 15 Jun 2017 10:46:29 +0300
Jukka Rissanen <jukka.rissanen@...> wrote:

[]

> > as part of sanitycheck testsuite! But, 8 tests of tests/net/ have
> When we had gerrit and jenkins, some of the net tests run slightly

[]

That explains it, thanks.

> All other tests programs (24 pieces), that consists of quite many

Great, thanks for clarifying this. Though I hope you'd agree that seeing such tests as "context" or "tcp" fail does lead to concerns.

> I will fix the relevant failing tests as they have bit rotted after

Nice, thanks for finding time for this!

> Converting two mqtt tests to not use real qemu

[]

> > One would think that if we have 22 repetitive implementations of
> No, the interface is not a loopback interface although it might look

I see. I can imagine they offer more functionality than just a loopback interface. I can also imagine that each of the 22 test device implementations is slightly different, to cater for particular testcases. None of that helps someone wanting to write a new test, unfortunately. How would one know that there's no standard device implementation for dependency-free testing, and having figured that out, how would one choose which of the 22 cases to use as a template?

> The loopback support has limited use cases actually and we probably

I absolutely agree that a loopback device would be mostly useful for development and testing, not for production. It also offers limited testing indeed. But it has one big advantage: tests using it can be easily run using the existing sanitycheck framework. And if a loopback device existed in the main code, such tests would also be easy to write, unlike now. Summing up, I'd like to give implementing one for the mainline a try.

> > then
> I am not sure what kind of mess you mean here but patches are welcome

Well, patches alone won't help here. Writing tests is always hard (for various reasons, including "tests are code, so why not write 'real' code instead?"). So, it would be nice to think about how to facilitate that. My specific proposal is to add a loopback netif; I assume that's ok, so I will go for a patch.

[]

> > there's
> Hmm, I missed the point of your last sentence.

Well, it's the same issue: various things in Zephyr are "harder than necessary", so one can never know whether something really broke or one isn't doing all the things needed to run it successfully. It would be nice to think about making the default config of Zephyr either run OOB, or fail with clear error messages, not crash or lock up. That's again a big meta-task, not something which can be "fixed with a patch", but it would be nice to see whether the different stakeholders of Zephyr agree that there's an issue which needs attention.

Thanks for all the replies!

--
Best Regards,
Paul
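For readers unfamiliar with why a loopback interface makes tests easy to automate, here is a rough sketch of the kind of test Paul describes: a BSD-sockets TCP echo round trip that needs nothing outside the machine it runs on. It is written in Python against a host operating system's loopback address, not against Zephyr, and the helper names `echo_once` and `loopback_roundtrip` are invented for the illustration.

```python
import socket
import threading


def echo_once(server_sock: socket.socket) -> None:
    """Accept one connection and echo everything until the peer
    shuts down its sending side."""
    conn, _ = server_sock.accept()
    with conn:
        while True:
            chunk = conn.recv(1024)
            if not chunk:
                break
            conn.sendall(chunk)


def loopback_roundtrip(payload: bytes) -> bytes:
    """Send payload to an echo server bound to the loopback address
    and return what comes back."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
        server.listen(1)
        port = server.getsockname()[1]

        worker = threading.Thread(target=echo_once, args=(server,))
        worker.start()

        received = bytearray()
        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            client.sendall(payload)
            client.shutdown(socket.SHUT_WR)  # signal end of data to the server
            while True:
                chunk = client.recv(1024)
                if not chunk:
                    break
                received.extend(chunk)

        worker.join()
    return bytes(received)


if __name__ == "__main__":
    print(loopback_roundtrip(b"hello"))
```

Because both ends live on the same interface, the test has no host-side prerequisites, which is the property that would let equivalent tests run under an automated framework such as sanitycheck.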