IPoIB: Fix deadlock on RTNL in ipoib_stop()

Commit c8c2afe3 ("IPoIB: Use rtnl lock/unlock when changing device
flags") added a call to rtnl_lock() in ipoib_mcast_join_task(), which
is run from the ipoib_workqueue.  However, ipoib_stop() (which is run
inside rtnl_lock()) flushes this workqueue, which leads to a deadlock
if the join task is pending.

Fix this by simply not flushing the workqueue from ipoib_stop().  It
turns out that we really don't care about workqueue tasks running
during or after ipoib_stop(), as long as we make sure to flush the
workqueue before unregistering a netdev.

This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=1114>.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
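
The lock-ordering problem described in the commit message can be illustrated with a small user-space sketch. This is not driver code: fake_rtnl, fake_join_task(), run_and_flush(), flush_with_lock_held() and flush_after_unlock() are hypothetical names, a pthread mutex stands in for RTNL, and pthread_join() stands in for flushing the workqueue. The task uses pthread_mutex_trylock() so that the conflict is reported as EBUSY rather than hanging; with a blocking lock, as in the real join task, the first ordering would never return.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the RTNL mutex (not the kernel's rtnl_lock()). */
static pthread_mutex_t fake_rtnl = PTHREAD_MUTEX_INITIALIZER;

/*
 * Hypothetical stand-in for the join task: it needs the lock to make
 * progress.  trylock is used so the demo reports the conflict instead
 * of blocking forever.
 */
static void *fake_join_task(void *arg)
{
	int *result = arg;

	*result = pthread_mutex_trylock(&fake_rtnl);
	if (*result == 0)
		pthread_mutex_unlock(&fake_rtnl);
	return NULL;
}

/*
 * Run the task and wait for it to finish, loosely analogous to
 * queueing work and then flushing the workqueue.
 * Returns the task's trylock result (0 or EBUSY), or -1 on error.
 */
static int run_and_flush(void)
{
	pthread_t worker;
	int result = -1;

	if (pthread_create(&worker, NULL, fake_join_task, &result) != 0)
		return -1;
	pthread_join(worker, NULL);
	return result;
}

/* Problematic ordering: wait for the task while holding the lock. */
static int flush_with_lock_held(void)
{
	int result;

	pthread_mutex_lock(&fake_rtnl);
	result = run_and_flush();
	pthread_mutex_unlock(&fake_rtnl);
	return result;
}

/* Safer ordering: wait for the task only once the lock is dropped. */
static int flush_after_unlock(void)
{
	pthread_mutex_lock(&fake_rtnl);
	/* ... stop-time work that does not wait on the task ... */
	pthread_mutex_unlock(&fake_rtnl);
	return run_and_flush();
}
```

In this sketch, flush_with_lock_held() sees the task fail to take the lock, while flush_after_unlock() lets it proceed. This mirrors the idea of moving the flush out of the locked path; it makes no claim about exactly where the driver performs its remaining flush.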
2008-08-19 15:01:32 -07:00
parent ffaa5b984a
commit a77a57a1a2
2 changed files with 18 additions and 11 deletions
--- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
@@ -392,8 +392,16 @@ static int ipoib_mcast_join_complete(int status,
 					   &priv->mcast_task, 0);
 		mutex_unlock(&mcast_mutex);

-		if (mcast == priv->broadcast)
+		if (mcast == priv->broadcast) {
+			/*
+			 * Take RTNL lock here to avoid racing with
+			 * ipoib_stop() and turning the carrier back
+			 * on while a device is being removed.
+			 */
+			rtnl_lock();
 			netif_carrier_on(dev);
+			rtnl_unlock();
+		}

 		return 0;
 	}