A possible case of TCP server no-response, was: Re: bufs lost in TCP connection establishment


Paul Sokolovsky
 

Hello Rohit, et al,

Thanks for the information below. I haven't yet had a chance to test
multiple TCP/IPv4 connections in a row, because I have trouble reliably
handling even one. Just as before, my setup is a scripting language
(MicroPython), using a combination of Zephyr and uIP APIs to emulate
BSD sockets. I have, however, proceeded to play with mbedTLS on top of
sockets implemented this way. That means multiple bidirectional packet
exchanges, which exposes quite a number of edge conditions. Packets can
be dropped at any time.

One particularly nasty condition I experienced is that while my usual
testcase normally proceeded beyond the connect() call, sometimes I got
time "strips" when the connect() call just hung (I don't use timeouts,
like BSD sockets by default), lasting for a dozen or so consecutive QEMU
restarts (I'm still using QEMU virtual networking so far). Then it
suddenly started working again, only to fail again soon after. Looking
with Wireshark, the Zephyr app was just sending SYN packets, with zero
replies from the Linux side. At this point, it's easy to start wondering
what is going wrong: Linux handling of the TUN interface, tunslip's
setup of it, socat (which is involved in connecting tunslip and QEMU),
Wireshark, or something else.

I set out to debug just that weird case, with the initial hypothesis
that Linux SYN flood protection might be involved. With some googling
and netstat grepping, I found that when I get those can't-connect time
strips, there's actually an ESTABLISHED connection on the host. That led
me to
http://stackoverflow.com/questions/6825036/what-will-happen-if-i-send-a-syn-packet-to-the-server-when-there-has-already-bee ,
which completed the picture.

So, uIP/Zephyr doesn't use source port randomization, and always opens
a connection from port 1025. While the code snippets I ran included
closing the socket, if something went wrong, like a packet being lost
during the initial connection or TLS exchange, the connection wasn't
closed explicitly, but was left open on the server. After a QEMU
restart, the app started sending SYNs to establish a new connection
from the same source port, which, as explained in the link above, gets
zero response from the Linux host. Eventually Apache on the host side
times out that client connection, and then the situation recovers.
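
To illustrate what source port randomization would change, here is a
minimal host-side sketch in plain Python (not the MicroPython/Zephyr
code discussed above). It assumes a CPython-style socket module that
allows bind() before connect(); the helper names pick_source_port and
connect_from_random_port are made up for this example, and the port
range is the IANA dynamic range rather than anything uIP defines.

```python
import random
import socket

# IANA dynamic/private port range; an assumption for this sketch,
# not something uIP or Zephyr specifies.
EPHEMERAL_LOW = 49152
EPHEMERAL_HIGH = 65535


def pick_source_port(rng=random):
    """Pick a random source port from the dynamic range (hypothetical helper)."""
    return rng.randint(EPHEMERAL_LOW, EPHEMERAL_HIGH)


def connect_from_random_port(host, port, attempts=5, timeout=5.0):
    """Connect to (host, port) from a randomly chosen local port.

    Each attempt uses a fresh socket and a fresh source port, so a stale
    ESTABLISHED entry on the server for a previous (address, port) pair
    is unlikely to match the new SYN.
    """
    last_error = None
    for _ in range(attempts):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.settimeout(timeout)  # avoid hanging forever in connect()
        try:
            sock.bind(("", pick_source_port()))
            sock.connect((host, port))
            return sock
        except OSError as exc:  # port in use, refused, timed out, ...
            last_error = exc
            sock.close()
    raise last_error
```

With a different source port after each restart, the new SYN would not
collide with the connection the server still considers open.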

The solution is thus to check with netstat whether there's an
ESTABLISHED connection on the host side, and to restart the server
process which owns it (Apache in my case).
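
The netstat check can be scripted. Below is a minimal sketch that
assumes the usual Linux net-tools column layout for "netstat -tn"
(Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State); the
function names find_established and run_netstat are invented for this
example.

```python
import subprocess


def find_established(netstat_output, peer_port):
    """Return (local, remote) address pairs for ESTABLISHED TCP
    connections whose remote (foreign) port equals peer_port.

    Assumes net-tools style columns:
    Proto Recv-Q Send-Q Local-Address Foreign-Address State
    """
    matches = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers and anything that is not a TCP row.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local, remote, state = fields[3], fields[4], fields[5]
        if state != "ESTABLISHED":
            continue
        # The port is whatever follows the last ':' in the address.
        if remote.rsplit(":", 1)[-1] == str(peer_port):
            matches.append((local, remote))
    return matches


def run_netstat():
    """Run 'netstat -tn' on the host and return its text output."""
    result = subprocess.run(
        ["netstat", "-tn"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

On the host, find_established(run_netstat(), 1025) would list any
connection still held open for the client's fixed source port. Adding
-p to the netstat options (typically as root) also shows the owning
process, which this sketch does not parse.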

The same situation may surely happen with a real device too, so I hope
this mail may save some readers debugging time.


Thanks,
Paul


On Wed, 7 Sep 2016 15:26:23 +0000
Rohit Grover <Rohit.Grover(a)arm.com> wrote:

Jukka, Paul,

I find that uIP leaks one TX buf with every TCP connection due to
incorrectly managed ref-counts. I'm able to set up and tear down only
as many connections as the value of CONFIG_IP_BUF_TX_SIZE.

The initial buf for the SYN packet gets its ref-count bumped to 2 (by
tcpip_poll_tcp()), but then this count never goes down to 0. It seems
to me that when the tcpip_event is posted to process_thread_tcp()
upon the sending of the SYN buf, the following code fragment

    if (buf && uip_connected(buf)) {
            struct net_context *context = user_data;
            NET_DBG("Connection established context %p\n",
                    user_data);
            context->connection_status = -EALREADY;
            data = INT_TO_POINTER(TCP_WRITE_EVENT);
            goto try_send;
    }

is able to discover the transition to connected state, and cycles back
to call handle_tcp_connection(); but the ref-count of the SYN buf isn't
decremented.

Can you help?

Thanks,
Rohit.



--
Best Regards,
Paul

