From owner-svn-src-stable-11@freebsd.org Mon Mar 26 20:08:22 2018 Return-Path: Delivered-To: svn-src-stable-11@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 15654F66F68; Mon, 26 Mar 2018 20:08:22 +0000 (UTC) (envelope-from hselasky@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id BBDCF70FE2; Mon, 26 Mar 2018 20:08:21 +0000 (UTC) (envelope-from hselasky@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id B6DD0119B6; Mon, 26 Mar 2018 20:08:21 +0000 (UTC) (envelope-from hselasky@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id w2QK8L08006503; Mon, 26 Mar 2018 20:08:21 GMT (envelope-from hselasky@FreeBSD.org) Received: (from hselasky@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id w2QK8LDr006501; Mon, 26 Mar 2018 20:08:21 GMT (envelope-from hselasky@FreeBSD.org) Message-Id: <201803262008.w2QK8LDr006501@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: hselasky set sender to hselasky@FreeBSD.org using -f From: Hans Petter Selasky Date: Mon, 26 Mar 2018 20:08:21 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r331573 - in stable/11/sys/dev/mlx5: . mlx5_core X-SVN-Group: stable-11 X-SVN-Commit-Author: hselasky X-SVN-Commit-Paths: in stable/11/sys/dev/mlx5: . mlx5_core X-SVN-Commit-Revision: 331573 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 26 Mar 2018 20:08:22 -0000 Author: hselasky Date: Mon Mar 26 20:08:21 2018 New Revision: 331573 URL: https://svnweb.freebsd.org/changeset/base/331573 Log: MFC r330600: Add timeout handle to commands with callback in mlx5core. The current implementation does not handle timeout in case of command with callback request, and this can lead to deadlock if the command doesn't get firmware response. Add delayed callback timeout work before posting the command to firmware. In case of real firmware command completion we will cancel the delayed work. In case of firmware command timeout the callback timeout handler will be called and it will simulate firmware completion with timeout error. linux commit 65ee67084589c1783a74b4a4a5db38d7264ec8b5 Sponsored by: Mellanox Technologies Modified: stable/11/sys/dev/mlx5/driver.h stable/11/sys/dev/mlx5/mlx5_core/mlx5_cmd.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/dev/mlx5/driver.h ============================================================================== --- stable/11/sys/dev/mlx5/driver.h Mon Mar 26 20:06:37 2018 (r331572) +++ stable/11/sys/dev/mlx5/driver.h Mon Mar 26 20:08:21 2018 (r331573) @@ -727,6 +727,7 @@ struct mlx5_cmd_work_ent { void *uout; int uout_size; mlx5_cmd_cbk_t callback; + struct delayed_work cb_timeout_work; void *context; int idx; struct completion done; Modified: stable/11/sys/dev/mlx5/mlx5_core/mlx5_cmd.c ============================================================================== --- stable/11/sys/dev/mlx5/mlx5_core/mlx5_cmd.c Mon Mar 26 20:06:37 2018 (r331572) +++ stable/11/sys/dev/mlx5/mlx5_core/mlx5_cmd.c Mon Mar 26 20:08:21 2018 (r331573) @@ -246,6 +246,7 @@ static void poll_timeout(struct mlx5_cmd_work_ent *ent static void free_cmd(struct mlx5_cmd_work_ent *ent) { + cancel_delayed_work_sync(&ent->cb_timeout_work); kfree(ent); } @@ -499,6 +500,30 @@ static void dump_command(struct mlx5_core_dev *dev, pr_debug("\n"); } +static u16 msg_to_opcode(struct mlx5_cmd_msg *in) +{ + struct mlx5_inbox_hdr *hdr = (struct mlx5_inbox_hdr *)(in->first.data); + + return be16_to_cpu(hdr->opcode); +} + +static void cb_timeout_handler(struct work_struct *work) +{ + struct delayed_work *dwork = container_of(work, struct delayed_work, + work); + struct mlx5_cmd_work_ent *ent = container_of(dwork, + struct mlx5_cmd_work_ent, + cb_timeout_work); + struct mlx5_core_dev *dev = container_of(ent->cmd, struct mlx5_core_dev, + cmd); + + ent->ret = -ETIMEDOUT; + mlx5_core_warn(dev, "%s(0x%x) timeout. Will cause a leak of a command resource\n", + mlx5_command_str(msg_to_opcode(ent->in)), + msg_to_opcode(ent->in)); + mlx5_cmd_comp_handler(dev, 1UL << ent->idx); +} + static int set_internal_err_outbox(struct mlx5_core_dev *dev, u16 opcode, struct mlx5_outbox_hdr *hdr) { @@ -731,6 +756,7 @@ static void cmd_work_handler(struct work_struct *work) struct mlx5_cmd_work_ent *ent = container_of(work, struct mlx5_cmd_work_ent, work); struct mlx5_cmd *cmd = ent->cmd; struct mlx5_core_dev *dev = container_of(cmd, struct mlx5_core_dev, cmd); + unsigned long cb_timeout = msecs_to_jiffies(MLX5_CMD_TIMEOUT_MSEC); struct mlx5_cmd_layout *lay; struct semaphore *sem; @@ -761,6 +787,9 @@ static void cmd_work_handler(struct work_struct *work) dump_command(dev, ent, 1); ent->ts1 = ktime_get_ns(); ent->busy = 0; + if (ent->callback) + schedule_delayed_work(&ent->cb_timeout_work, cb_timeout); + /* ring doorbell after the descriptor is valid */ mlx5_core_dbg(dev, "writing 0x%x to command doorbell\n", 1 << ent->idx); /* make sure data is written to RAM */ @@ -805,13 +834,6 @@ static const char *deliv_status_to_str(u8 status) } } -static u16 msg_to_opcode(struct mlx5_cmd_msg *in) -{ - struct mlx5_inbox_hdr *hdr = (struct mlx5_inbox_hdr *)(in->first.data); - - return be16_to_cpu(hdr->opcode); -} - static int wait_func(struct mlx5_core_dev *dev, struct mlx5_cmd_work_ent *ent) { int timeout = msecs_to_jiffies(MLX5_CMD_TIMEOUT_MSEC); @@ -867,6 +889,7 @@ static int mlx5_cmd_invoke(struct mlx5_core_dev *dev, if (!callback) init_completion(&ent->done); + INIT_DELAYED_WORK(&ent->cb_timeout_work, cb_timeout_handler); INIT_WORK(&ent->work, cmd_work_handler); if (page_queue) { cmd_work_handler(&ent->work); @@ -1065,6 +1088,8 @@ void mlx5_cmd_comp_handler(struct mlx5_core_dev *dev, i = ffs(vector) - 1; vector &= ~(1U << i); ent = cmd->ent_arr[i]; + if (ent->callback) + cancel_delayed_work(&ent->cb_timeout_work); ent->ts2 = ktime_get_ns(); memcpy(ent->out->first.data, ent->lay->out, sizeof(ent->lay->out));