sfc: Workaround flush failures on Falcon B0

Under certain conditions a PHY may backpressure Falcon B0 in such a way that flushes timeout. In normal circumstances the phy poller would fix the PHY, and the flush could complete. But efx_nic_flush_queues() is always called after efx_stop_all(), so the poller has been stopped. Even if this weren't the case, how long would we have to wait for the poller to fix this? And several callers of efx_nic_flush_queues() are about to reset the device anyway - so we don't need to do anything. Work around this bug by scheduling a reset. Ensure that the MAC is never rewired back into the datapath before the reset runs (we already ignore all rx events anyway). Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-06-01 11:17:51 +00:00
parent 319ba649af
commit fd371e32fe
4 changed files with 17 additions and 9 deletions
--- a/drivers/net/sfc/efx.c
+++ b/drivers/net/sfc/efx.c
@@ -27,6 +27,7 @@
 #include "nic.h"

 #include "mcdi.h"
+#include "workarounds.h"

 /**************************************************************************
 *
@@ -556,10 +557,18 @@ static void efx_fini_channels(struct efx_nic *efx)
 	BUG_ON(efx->port_enabled);

 	rc = efx_nic_flush_queues(efx);
-	if (rc)
+	if (rc && EFX_WORKAROUND_7803(efx)) {
+		/* Schedule a reset to recover from the flush failure. The
+		 * descriptor caches reference memory we're about to free,
+		 * but falcon_reconfigure_mac_wrapper() won't reconnect
+		 * the MACs because of the pending reset. */
+		EFX_ERR(efx, "Resetting to recover from flush failure\n");
+		efx_schedule_reset(efx, RESET_TYPE_ALL);
+	} else if (rc) {
 		EFX_ERR(efx, "failed to flush queues\n");
-	else
+	} else {
 		EFX_LOG(efx, "successfully flushed all queues\n");
+	}

 	efx_for_each_channel(channel, efx) {
 		EFX_LOG(channel->efx, "shut down chan %d\n", channel->channel);