From: Borja Marcos
Date: Tue, 24 Oct 2017 11:07:31 +0200
To: freebsd-hackers@freebsd.org
Cc: freebsd-security@freebsd.org
Subject: Periodic jobs lockf timeout

Hi,

I’ve come across a problem with the “daily” security job. On an overloaded system with lots of ZFS datasets, lots of files, heavy system load and, to add insult to injury, a ZFS scrub going on, the find runs issued by the periodic checks can take forever. They can take so long that I have found several lockf processes waiting.

Is it sane to have an unlimited timeout for lockf? It would probably be better to have at least a configurable timeout for each category.
It’s really unlikely to see an overlap for a weekly or monthly job, but for daily jobs it would be good to have a sane default, say, an hour or two.

There’s even a parameter in /etc/defaults/periodic.conf, but it seems it’s not used right now:

# Max time to sleep to avoid causing congestion on download servers
anticongestion_sleeptime=3600

The alternative would be to have defaults for a sane timeout for each category, like

daily_lockf_timeout
weekly_lockf_timeout
monthly_lockf_timeout

Thoughts? It’s pretty simple to do, and overlapping periodic jobs are really useless.

Borja.
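
A rough sketch of how the per-category knobs proposed above might be turned into a lockf(1) timeout. The three *_lockf_timeout variables are the hypothetical names from the proposal, not existing periodic.conf settings; the helper name lockf_timeout_args and the lock file path are assumptions made up for illustration. Only the -t option of lockf(1) is relied on, and the script just prints the command line instead of running it.

```shell
#!/bin/sh
# Hypothetical per-category knobs, in seconds. These names come from the
# proposal and are NOT existing periodic.conf variables.
daily_lockf_timeout=3600
weekly_lockf_timeout=7200
monthly_lockf_timeout=7200

# Print the lockf(1) arguments that bound the wait for a category's lock.
# Prints nothing for an unknown category or an unset knob, which would
# leave lockf(1) at its default of waiting indefinitely.
lockf_timeout_args() {
    case "$1" in
        daily|weekly|monthly) ;;
        *) return 0 ;;
    esac
    # Look up the variable named <category>_lockf_timeout.
    eval "_timeout=\${${1}_lockf_timeout:-}"
    if [ -n "$_timeout" ]; then
        echo "-t $_timeout"
    fi
}

# Build (and only print) the command line for the daily run.
# The lock file path is an assumption for illustration.
echo "lockf $(lockf_timeout_args daily) /var/run/periodic.daily.lock periodic daily"
```

With a bounded -t, a daily run that cannot get the lock within the configured time would give up instead of queueing behind the previous run.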