Date: Mon, 7 Feb 2011 17:44:06 +0200
From: Gleb Kurtsou
To: Ivan Voras
Cc: Kostik Belousov, freebsd-fs@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: tmpfs is zero bytes (no free space), maybe a zfs bug?
Message-ID: <20110207154406.GA28877@tops.skynet.lt>
References: <4D36A2CF.1080508@fsn.hu> <20110119084648.GA28278@icarus.home.lan>
 <4D36B85B.8070201@fsn.hu> <20110119150200.GY2518@deviant.kiev.zoral.com.ua>
 <20110207133748.GA16327@tops.skynet.lt>
List-Id: Production branch of FreeBSD source code

On (07/02/2011 15:35), Ivan Voras wrote:
> On 7 February 2011 14:37, Gleb Kurtsou wrote:
>
> > It's up to user to mount tmpfs filesystems of reasonable size to prevent
> > resource exhaustion. Anyway, enormously large tmpfs killing all your
> > process is not the way to go.
>
> Of course not, but as I see it (from admin perspective), tmpfs should
> behave as close to regular processes in consuming memory as possible
> (where possible; obviously it cannot be subject to the OOM killer :)
> ).
Here is the key difference: it's not subject to being killed by the OOM
killer, so it can exhaust all resources for real. I propose to enforce that
the user specifies an upper limit on the filesystem size.

> The problem described in this thread is that there is enough memory in
> various lists and tmpfs still reports "0 bytes free". See my message:
> the machine had more than 8 GB of "free" memory (reported by "top")
> and still "0 bytes free" in tmpfs - and that's not counting inactive
> and other forms of used memory which could be freed or swapped out
> (and also not counting swap).
That's because tmpfs incorrectly checks how much memory is available,
including both swap and RAM. In the VM world that's not so easy.

> By "as close to regular processes in consuming memory" I mean that I
> would expect tmpfs to allocate from the same total pool of memory as
> processes and be subject to the same mechanisms of VM, including swap.
> If that is not possible, I would (again, as an admin) like to extend
> the tmpfs(5) man page and other documentation with information about
> what types of memory will and will not count towards available to
> tmpfs.
>
> > Unless there are objections, I'm planning to do the following:
> >
> > 1. By default set tmpfs size to max(all swap/2, all memory/2) and print
> > warning that filesystem size should be specified manually.
> > Max(swap/2,mem/2) is used as a band-aid for the case when no swap is setup.
>
> You mean as a reservation, maximum limit or something else? If a tmpfs
> with "size" of e.g. 16 GB is configured, will the memory be
> preallocated? wired?
Memory in tmpfs is allocated and freed as needed; there is no preallocated
or reserved memory/swap. It already behaves the way you've described.
I'm against preallocating or reserving memory. There is ramfs in Linux
that does preallocation, but it looks deprecated.

> I don't think there should be default hard size limits to tmpfs - it
> should be able to hold sudden bursts of large temp files (using swap
> if needed), but that could be achieved by configuring a tmpfs whose
> size is RAM+swap if the memory is not preallocated so not a big
> problem.
But there is one in Linux (Documentation/filesystems/tmpfs.txt):

    size:      The limit of allocated bytes for this tmpfs instance. The
               default is half of your physical RAM without swap. If you
               oversize your tmpfs instances the machine will deadlock
               since the OOM handler will not be able to free that memory.

That's actually what I've proposed:
size=mem/2 vs size=max(mem/2,swap/2)

The limit should be there so that the system does not panic.

> > 3. Remove "live" filesystem size checks, i.e. do not depend on
> > free/inact memory.
>
> I'm for it, if it's possible in the light of #1
>
> > 2. Add support for resizing tmpfs on the fly:
> >        mount -u -o size= /tmpfs
>
> ditto.
It's trivial. If it can be resized, change maxsize in the struct; fail
otherwise.

> > Reserving swap for tmpfs might not be what user expects: generally I use
> > tmpfs for work dir for building ports, it's unused most of the time.
>
> It looks like we think the opposite of it :) I would like it to be
> swapped out if needed, making room for running processes etc. as
> regular VM paging algorithms decide. Of course, if that could be
> controlled with a flag we'd both be happy :)
Perhaps there is a bit of a misunderstanding. It will be swapped out and
will behave exactly as it does now, but it will have a sane default
filesystem size limit, and the semantics of calculating available memory
will change: I want it to try hard to allocate memory unless the filesystem
limit is hit, failing only if there is clearly a memory shortage (as I
said, it is up to the user to configure it properly).

> > btw, what linux and opensolaris do when available mem/swap gets low due
> > to tmpfs and how filesystem size determined at real-time?
>
> There's some information here: http://en.wikipedia.org/wiki/Tmpfs
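
To make the sizing proposal in this message a bit more concrete, here is a
rough sketch of it as a toy Python model. It is not tmpfs code. The names
default_tmpfs_size, linux_default_tmpfs_size and TmpfsMount are hypothetical
and exist only for illustration. The rule that a resize fails when the new
limit is below current usage is an assumption; the message only says "fail
otherwise". The model also ignores real memory shortage and checks only the
filesystem size limit.

```python
"""Toy model of the proposed tmpfs sizing policy; all names are hypothetical."""

from dataclasses import dataclass


def default_tmpfs_size(mem_bytes: int, swap_bytes: int) -> int:
    """Proposed default limit: max(all swap/2, all memory/2).

    Taking the max covers machines with no swap configured, where
    swap/2 alone would give a zero-sized filesystem.
    """
    return max(swap_bytes // 2, mem_bytes // 2)


def linux_default_tmpfs_size(mem_bytes: int) -> int:
    """Linux default as quoted from tmpfs.txt: half of physical RAM, no swap."""
    return mem_bytes // 2


@dataclass
class TmpfsMount:
    """A mounted tmpfs reduced to a size limit and a usage counter."""

    max_size: int
    used: int = 0

    def allocate(self, nbytes: int) -> bool:
        """Allocate on demand; refuse only when the size limit would be exceeded.

        Nothing is preallocated or reserved: `used` grows as data is written.
        """
        if self.used + nbytes > self.max_size:
            return False
        self.used += nbytes
        return True

    def resize(self, new_size: int) -> bool:
        """Model of an on-the-fly resize: change the limit or fail.

        Assumption: the resize fails when the new limit is smaller than
        what is already stored in the filesystem.
        """
        if new_size < self.used:
            return False
        self.max_size = new_size
        return True
```

With 8 GB of RAM and no swap, both defaults come out at 4 GB. With 4 GB of
RAM and 16 GB of swap, the proposed default is 8 GB while the Linux rule
still gives 2 GB.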