From: Ryan Stone
Date: Thu, 29 Sep 2016 14:43:31 -0400
Subject: Callout subsystem doesn't cancel interrupts for canceled callouts
To: FreeBSD Current

At $WORK, we're working on adding support for high-precision RTT calculations in TCP. The goal is to reduce the retransmission timeout significantly to help mitigate the impact of TCP incast. This means that the retransmit callout for TCP sockets gets scheduled significantly more often and with a shorter timeout period, but in the normal case it is expected to be canceled or rescheduled before it times out.

What I have noticed is that when the retransmit callout is canceled or rescheduled, the callout subsystem does not reschedule its currently pending interrupt. The result is that my system takes a significant number of "spurious" timer interrupts where there are no callouts to service, which is having a significant performance impact.

Unfortunately, neither the callout subsystem nor the eventtimers subsystem really seems to be designed for canceling interrupts. It's not easy to find the "next" event in the callout wheel, and the current code doesn't even try when handling an interrupt; the next interrupt is scheduled at a seemingly arbitrary point in the future.
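
To make the failure mode concrete, here is a minimal user-space sketch of it. This is a toy model, not the kernel's code: `struct toy_timer` and all of the `toy_*` functions are hypothetical names invented for illustration, and the re-arm policy is simplified to "arm for whatever callout remains". It only shows how leaving the timer armed after a cancel or a later reschedule produces interrupts with nothing to service.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NONE UINT64_MAX	/* "no deadline" / "timer not armed" */

/* Hypothetical model: a single pending callout and one hardware timer. */
struct toy_timer {
	uint64_t callout_deadline;	/* TOY_NONE when nothing is pending */
	uint64_t armed_at;		/* time at which the timer will fire */
	unsigned spurious;		/* interrupts that found no work */
	unsigned serviced;		/* interrupts that ran the callout */
};

static void
toy_init(struct toy_timer *t)
{
	t->callout_deadline = TOY_NONE;
	t->armed_at = TOY_NONE;
	t->spurious = 0;
	t->serviced = 0;
}

/* (Re)schedule the callout; the timer only ever moves earlier here. */
static void
toy_schedule(struct toy_timer *t, uint64_t deadline)
{
	assert(deadline != TOY_NONE);
	t->callout_deadline = deadline;
	if (deadline < t->armed_at)
		t->armed_at = deadline;
}

/* Cancel the callout but leave the timer armed (the behavior at issue). */
static void
toy_cancel(struct toy_timer *t)
{
	t->callout_deadline = TOY_NONE;
}

/* Advance time to `now`; take an interrupt if the armed time has passed. */
static void
toy_advance(struct toy_timer *t, uint64_t now)
{
	if (t->armed_at > now)
		return;			/* timer has not fired yet */
	if (t->callout_deadline <= now) {
		t->serviced++;
		t->callout_deadline = TOY_NONE;
	} else {
		t->spurious++;		/* woke up with nothing to do */
	}
	/* Simplifying assumption: re-arm for whatever callout remains. */
	t->armed_at = t->callout_deadline;
}
```

Under this model, scheduling for tick 10, rescheduling to tick 20, and then canceling before tick 20 costs two interrupts and services nothing: one at the stale tick-10 deadline and one at tick 20 for a callout that no longer exists.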

I know that when the callout system was reworked, the callout wheel data structure was retained to keep insertion and deletion O(1). However, I question whether that was the right decision, given that when callouts are frequently deleted, as in my case, we incur the significant overhead of a spurious timer interrupt. Does anybody know whether actual performance measurements were taken to justify this decision?
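
For readers less familiar with the trade-off, the following is a rough sketch of a hashed timing wheel in user-space C. It is not the kernel's implementation; `struct toy_wheel`, `struct toy_callout`, `WHEEL_SIZE` and the `wheel_*` functions are hypothetical and chosen only for illustration. Insertion and removal are constant-time list operations, but answering "when is the next event?" means walking the buckets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WHEEL_SIZE 8	/* deliberately tiny, for illustration only */

struct toy_callout {
	struct toy_callout *prev, *next;	/* circular list links */
	uint64_t tick;				/* absolute expiry tick */
	int pending;				/* nonzero while on the wheel */
};

struct toy_wheel {
	struct toy_callout heads[WHEEL_SIZE];	/* one sentinel per bucket */
};

static void
wheel_init(struct toy_wheel *w)
{
	for (size_t i = 0; i < WHEEL_SIZE; i++) {
		w->heads[i].prev = w->heads[i].next = &w->heads[i];
		w->heads[i].tick = 0;
		w->heads[i].pending = 0;
	}
}

/* O(1): hash the tick to a bucket and link at the front of its list. */
static void
wheel_insert(struct toy_wheel *w, struct toy_callout *c, uint64_t tick)
{
	struct toy_callout *h = &w->heads[tick % WHEEL_SIZE];

	assert(!c->pending);
	c->tick = tick;
	c->next = h->next;
	c->prev = h;
	h->next->prev = c;
	h->next = c;
	c->pending = 1;
}

/* O(1): unlink from whichever bucket the callout is on. */
static void
wheel_remove(struct toy_callout *c)
{
	assert(c->pending);
	c->prev->next = c->next;
	c->next->prev = c->prev;
	c->pending = 0;
}

/*
 * O(buckets + callouts): the wheel keeps no ordering across buckets, so
 * finding the earliest tick requires a full scan. Returns UINT64_MAX
 * when the wheel is empty.
 */
static uint64_t
wheel_next_tick(const struct toy_wheel *w)
{
	uint64_t best = UINT64_MAX;

	for (size_t i = 0; i < WHEEL_SIZE; i++) {
		const struct toy_callout *h = &w->heads[i];

		for (const struct toy_callout *c = h->next; c != h; c = c->next)
			if (c->tick < best)
				best = c->tick;
	}
	return best;
}
```

In this sketch, re-arming the timer precisely after every removal would mean calling something like `wheel_next_tick()` on each cancel, which gives up the O(1) deletion the wheel was chosen for. An ordered structure such as a heap would make the next deadline cheap to read at the cost of O(log n) insertion and deletion.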