Date:      Thu, 2 Aug 2018 08:43:54 +0000 (UTC)
From:      Hans Petter Selasky <hselasky@FreeBSD.org>
To:        src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org
Subject:   svn commit: r337104 - stable/11/sys/dev/mlx5/mlx5_core
Message-ID:  <201808020843.w728hsCa056188@repo.freebsd.org>

Author: hselasky
Date: Thu Aug  2 08:43:54 2018
New Revision: 337104
URL: https://svnweb.freebsd.org/changeset/base/337104

Log:
  MFC r336398:
  Make sure the state variable is set atomically instead of using a mutex in mlx5core.
  
  Device detach and setting error state may deadlock over the interface mutex
  like this:
  
  a) Detach code in mlx5en waits until error state is set while the interface
  mutex is locked.
  b) The set error handler needs to lock the interface mutex before it can
  set the error state.
  
  The solution is to use atomics to set the error state.
  
  Sponsored by:		Mellanox Technologies

Modified:
  stable/11/sys/dev/mlx5/mlx5_core/mlx5_health.c
Directory Properties:
  stable/11/   (props changed)

Modified: stable/11/sys/dev/mlx5/mlx5_core/mlx5_health.c
==============================================================================
--- stable/11/sys/dev/mlx5/mlx5_core/mlx5_health.c	Thu Aug  2 08:42:40 2018	(r337103)
+++ stable/11/sys/dev/mlx5/mlx5_core/mlx5_health.c	Thu Aug  2 08:43:54 2018	(r337104)
@@ -219,21 +219,19 @@ void mlx5_enter_error_state(struct mlx5_core_dev *dev,
 	u32 fatal_error;
 	int lock = -EBUSY;
 
-	mutex_lock(&dev->intf_state_mutex);
-	if (dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR) {
-		goto unlock;
-		return;
-	}
-
 	fatal_error = check_fatal_sensors(dev);
 
 	if (fatal_error || force) {
+		if (xchg(&dev->state, MLX5_DEVICE_STATE_INTERNAL_ERROR) ==
+		    MLX5_DEVICE_STATE_INTERNAL_ERROR)
+			return;
 		if (!force)
 			mlx5_core_err(dev, "internal state error detected\n");
-		dev->state = MLX5_DEVICE_STATE_INTERNAL_ERROR;
 		mlx5_trigger_cmd_completions(dev);
 	}
 
+	mutex_lock(&dev->intf_state_mutex);
+
 	if (force)
 		goto err_state_done;
 
@@ -272,7 +270,6 @@ void mlx5_enter_error_state(struct mlx5_core_dev *dev,
 
 err_state_done:
 	mlx5_core_event(dev, MLX5_DEV_EVENT_SYS_ERROR, 0);
-unlock:
 	mutex_unlock(&dev->intf_state_mutex);
 }
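
The fix described in the log, claiming the error state with an atomic exchange so that only the first caller proceeds, can be sketched outside the kernel in standard C11. This is an illustration only, not the driver's code: `demo_dev`, `demo_state` and `demo_claim_error_state` are hypothetical names, and `atomic_exchange()` from `<stdatomic.h>` stands in for the role played by `xchg()` in the diff above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's device states. */
enum demo_state {
	DEMO_STATE_UP = 0,
	DEMO_STATE_INTERNAL_ERROR = 1,
};

/* Hypothetical, minimal device: only the state word matters here. */
struct demo_dev {
	atomic_int state;
};

/*
 * Move the device into the error state without taking any lock.
 * The exchange stores the new state and returns the previous one in a
 * single atomic step, so exactly one caller observes a previous state
 * other than "error".
 *
 * Returns 1 if this caller performed the transition, 0 if the device
 * was already in the error state.
 */
static int
demo_claim_error_state(struct demo_dev *dev)
{
	int old;

	assert(dev != NULL);
	old = atomic_exchange(&dev->state, DEMO_STATE_INTERNAL_ERROR);
	return (old != DEMO_STATE_INTERNAL_ERROR);
}
```

Because the state change no longer requires a mutex, a thread that already holds a lock while waiting for the error state cannot block the thread that sets it. In this sketch a second call to `demo_claim_error_state()` on the same device returns 0, which corresponds to the early `return` in the patched function.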
 


