Handling can_send in ERROR_PASSIVE state on STM32 #stm32


Federico Giovanardi
 

Hi, I'd like to ask for some clarification on the intended behavior of the CAN interface.

I'm currently using an STM32G0, which uses the can_mcan.c CAN driver. I've followed the sources and extracted the relevant snippet:

int z_impl_can_send(const struct device *dev, const struct can_frame *frame,
		    k_timeout_t timeout, can_tx_callback_t callback,
		    void *user_data)
{
	const struct can_driver_api *api = (const struct can_driver_api *)dev->api;

	if (callback == NULL) {
		struct can_tx_default_cb_ctx ctx;
		int err;

		k_sem_init(&ctx.done, 0, 1);

		err = api->send(dev, frame, timeout, can_tx_default_cb, &ctx);
		if (err != 0) {
			return err;
		}

		k_sem_take(&ctx.done, K_FOREVER);

		return ctx.status;
	}

	return api->send(dev, frame, timeout, callback, user_data);
}

So, when no callback is provided, the can_send function initializes a semaphore, blocks on it, and relies on `can_tx_default_cb` to release it. This implementation assumes
that every submitted frame is either transmitted or generates an error; if a frame "gets lost" and `can_tx_default_cb` never gets called, then nobody is going to give the semaphore and unblock
the caller.

The question is what the expected behavior should be when the node is the only one on the bus. In that case nobody is going to provide an ACK and, at least in
the STM32G0 implementation, the frame will sit in the TX buffer indefinitely. An application that sends a single frame will therefore get stuck, possibly forever, because the CAN transmit error counter
never exceeds 255, the level required to trigger bus-off.
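A minimal host-side model of this situation, assuming the standard CAN fault-confinement rules (TEC grows by 8 per transmit error, error-passive above 127, bus-off above 255, and no TEC increment for an ACK error seen by an error-passive transmitter). The names `struct node`, `ack_error()` and `retransmit_alone()` are hypothetical and are not part of any driver; this only illustrates why a lone node ends up parked in error-passive instead of going bus-off.

```c
#include <assert.h>

/* Fault-confinement thresholds (assumed from the CAN specification). */
#define TEC_ERROR_PASSIVE 128 /* error-passive once TEC > 127 */
#define TEC_BUS_OFF       256 /* bus-off once TEC > 255 */

enum node_state { ERROR_ACTIVE, ERROR_PASSIVE, BUS_OFF };

/* Hypothetical model of a single CAN node; only the TEC matters here. */
struct node {
	int tec; /* transmit error counter */
};

static enum node_state node_state(const struct node *n)
{
	if (n->tec >= TEC_BUS_OFF) {
		return BUS_OFF;
	}
	if (n->tec >= TEC_ERROR_PASSIVE) {
		return ERROR_PASSIVE;
	}
	return ERROR_ACTIVE;
}

/* One transmission attempt that ends in an ACK error (nobody else on the bus). */
static void ack_error(struct node *n)
{
	/*
	 * Exception in the fault-confinement rules: an error-passive
	 * transmitter that sees an ACK error (and no dominant bit during its
	 * passive error flag) does not increment TEC. A bus-off node does not
	 * transmit at all.
	 */
	if (node_state(n) != ERROR_ACTIVE) {
		return;
	}
	n->tec += 8;
}

/* Retry the same frame `attempts` times with no other node to ACK it. */
static enum node_state retransmit_alone(struct node *n, int attempts)
{
	assert(attempts >= 0);
	for (int i = 0; i < attempts; i++) {
		ack_error(n);
	}
	return node_state(n);
}
```

Under these assumptions, the counter climbs to 128 after 16 failed attempts and then stays there no matter how many retransmissions follow, so neither a TX-complete nor a bus-off event ever arrives to fire the callback.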

While it could be fine to hang there if the user called `can_send()` with `K_FOREVER`, I don't think it is acceptable when the user calls it with a finite timeout. I've analyzed the M_CAN
implementation and unfortunately I haven't found a way to set up a timeout in the controller, so I suppose that's the intended behavior.

I also haven't figured out how to call `can_send()` safely in that situation. Querying the bus state beforehand is useless, since the controller will report that everything is OK (because it hasn't accumulated
enough errors yet), and providing a timeout is useless as well, since it won't be honored.

I understand that, by contract, `api->send()` should call the callback, and I'm willing to fix the implementation and provide a patch if necessary, but before doing that I'm asking for
guidance on what the intended behavior should be in this use case.

Moreover, since the M_CAN hardware doesn't provide a hardware timer for frames stuck in the TX queue, should we set up a system timer for each can_frame? Should we use the error flags in the ISR? That seems
complicated and maybe wrong, because the flags are triggered by the error counters, and there's no way to clear the error counters without going through bus-off, yet one frame sitting in the queue doesn't accumulate
enough errors to go bus-off.
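To make the "system timer for each can_frame" option concrete, here is a rough host-side sketch of per-frame deadline tracking. Everything in it is hypothetical (`struct tx_tracker`, `tx_track()`, `tx_complete()`, `tx_expire()`, the `tx_done_cb` signature, the slot count, and the choice of `-ETIMEDOUT`); it is not the can_mcan.c code, and a real driver would also have to ask the controller to abort the pending buffer (M_CAN has a Tx buffer cancellation request register that might serve for this) before reporting the timeout.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TX_SLOTS 3 /* hypothetical number of TX buffers */

/* Hypothetical completion callback: error is 0 or a negative errno. */
typedef void (*tx_done_cb)(int error, void *user_data);

struct tx_slot {
	bool pending;        /* frame queued, completion not yet reported */
	int64_t deadline_ms; /* absolute time after which the frame is stale */
	tx_done_cb cb;
	void *user_data;
};

struct tx_tracker {
	struct tx_slot slots[TX_SLOTS];
};

/* Example callback: stores the result, like a default blocking-send callback would. */
static void store_status(int error, void *user_data)
{
	*(int *)user_data = error;
}

/* Record a queued frame; returns the slot index, or -EBUSY if all slots are taken. */
static int tx_track(struct tx_tracker *t, int64_t now_ms, int64_t timeout_ms,
		    tx_done_cb cb, void *user_data)
{
	assert(cb != NULL);
	for (int i = 0; i < TX_SLOTS; i++) {
		if (!t->slots[i].pending) {
			t->slots[i] = (struct tx_slot){
				.pending = true,
				.deadline_ms = now_ms + timeout_ms,
				.cb = cb,
				.user_data = user_data,
			};
			return i;
		}
	}
	return -EBUSY;
}

/* Report completion exactly once; late or duplicate reports are ignored. */
static void tx_complete(struct tx_tracker *t, int slot, int error)
{
	struct tx_slot *s = &t->slots[slot];

	if (!s->pending) {
		return;
	}
	s->pending = false;
	s->cb(error, s->user_data);
}

/* Periodic check: fail every frame whose deadline has passed. Returns the count. */
static int tx_expire(struct tx_tracker *t, int64_t now_ms)
{
	int expired = 0;

	for (int i = 0; i < TX_SLOTS; i++) {
		if (t->slots[i].pending && now_ms >= t->slots[i].deadline_ms) {
			/* A real driver would request a hardware abort here first. */
			tx_complete(t, i, -ETIMEDOUT);
			expired++;
		}
	}
	return expired;
}
```

The point of clearing `pending` before invoking the callback is that a TX-complete interrupt arriving after the timeout cannot call the callback a second time, which matters when `user_data` points at a context on the stack of a caller that has already returned.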

I hope the explanation is clear.

Regards,
Federico