Date:      Sat, 24 Aug 2002 00:26:32 +0300
From:      Giorgos Keramidas <keramida@FreeBSD.org>
To:        Chris Ptacek <cptacek@sitaranetworks.com>
Cc:        Carlos Carnero <zopewiz@yahoo.com>, freebsd-questions@FreeBSD.org, freebsd-fs@FreeBSD.org
Subject:   Re: optimization changed from TIME to SPACE ?!
Message-ID:  <20020823212631.GA64644@hades.hell.gr>
In-Reply-To: <31269226357BD211979E00A0C9866DAB02BB9988@rios.sitaranetworks.com>
References:  <31269226357BD211979E00A0C9866DAB02BB9988@rios.sitaranetworks.com>

The following postings describe the problems Carlos Carnero is having
with Squid causing space optimization to be always turned on, even
though his /var filesystem is not close to full.  Those of you who are
confident about your ffs/ufs understanding, please correct me if I'm
wrong in my comments below, but I think I've found why this happens.
Any suggestions to avoid excessive fragmentation and avoid triggering
this?  After reading what happens, I'd be indebted if you helped a bit :)

- Giorgos

On 2002-08-23 09:48 +0000, Carlos Carnero wrote:
> > Your /var filesystem is almost 100% full.
>
> I thought so the moment I saw the message, but for
> several months now I monitor that box using SNMP and
> that partition is *always* kept very loose space-wise.

On 2002-08-23 13:43 +0000, Chris Ptacek wrote:
> Actually I have been trying to figure out this myself for a while.
> I am having the same issues (squid cache), TIME to SPACE changes
> with the partition at 50-70% used.

I think I've found it :)))

[[ A little kernel background. ]]

That's an interesting observation.  You are indeed correct.  The code
in /usr/src/sys/ufs/ffs/ffs_alloc.c is what controls when SPACE or
TIME optimization kicks in.  The default optimization is TIME, to make
operations as fast as possible, with the disadvantage that sometimes
blocks or fragments will be allocated in "good" positions (good for
speed), thus wasting space.  The following part of the ffs_alloc.c
source shows when SPACE optimization is switched back to TIME:

        if (fs->fs_minfree <= 5 ||
            fs->fs_cstotal.cs_nffree >
            (off_t)fs->fs_dsize * fs->fs_minfree / (2 * 100))
                break;
        log(LOG_NOTICE, "%s: optimization changed from SPACE to TIME\n",
                fs->fs_fsmnt);
        fs->fs_optim = FS_OPTTIME;
        break;

If fs_minfree, the percentage of space kept as a free reserve, is 5%
or less, then the optimization is NEVER set back to FS_OPTTIME to
enable fast operation.  A filesystem with a free reserve of 5% or less
that has switched to SPACE will stay there, keeping everything a bit
slower.

The next part checks that the total number of free fragments in all
cylinder groups doesn't exceed 50% of the free reserve.  THIS is where
the problem you're seeing lies.  On a relatively empty disk with a
small free reserve and many thousands of small files (typical of Squid
caches), it's easy to have many thousands of small free fragments.  In
that case, as long as the total disk space held in free fragments
exceeds 50% of the free reserve (which is a relatively small amount of
space compared to the total disk size), SPACE optimization won't be
turned off.
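
The check described above can be written out as a small standalone
function.  This is only a sketch: stays_in_space_mode() and its
parameter names are made up for illustration and are not part of
ffs_alloc.c.  It assumes dsize and nffree are counted in the same
units, as the comparison in the quoted code implies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not kernel code: mirrors the quoted
 * SPACE -> TIME check using plain integers instead of struct fs.
 *   dsize   - size of the data area (stands in for fs_dsize)
 *   minfree - free reserve in percent (stands in for fs_minfree)
 *   nffree  - free fragments in all cylinder groups (cs_nffree)
 * Returns true when the filesystem would stay in SPACE mode.
 */
bool
stays_in_space_mode(int64_t dsize, int minfree, int64_t nffree)
{
        /* Sanity checks on the illustrative inputs. */
        assert(dsize >= 0 && nffree >= 0);
        assert(minfree >= 0 && minfree <= 100);

        /* Same expression as the kernel excerpt: half of the reserve. */
        return (minfree <= 5 || nffree > dsize * minfree / (2 * 100));
}
```

With a data area of 1000000 units and an 8% reserve, half of the
reserve is 40000 units, so the function returns true for any nffree
above 40000.  With a reserve of 5% or less it returns true no matter
how few free fragments there are.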

Similar things can be seen where TIME optimization is changed to SPACE,
further down in ffs_alloc.c.  Only in this case the threshold is
different.  When the space occupied by free fragments reaches
(fs_minfree - 2)% of the data area, which is 80% of the free reserve
when fs_minfree is 10, it is considered excess fragmentation and SPACE
optimization kicks in.

        if (fs->fs_cstotal.cs_nffree <
            (off_t)fs->fs_dsize * (fs->fs_minfree - 2) / 100)
                break;
        log(LOG_NOTICE, "%s: optimization changed from TIME to SPACE\n",
                fs->fs_fsmnt);
        fs->fs_optim = FS_OPTSPACE;
        break;

In the code above, fs_cstotal.cs_nffree is the number of free
fragments available in all cylinder groups of the filesystem.
fs_minfree is the percentage of space kept as a free reserve, as set
with tunefs(8).  (fs_minfree - 2) is the percentage of the disk that
free fragments are allowed to take.  If more disk space than that is
tied up in free fragments, then the check fails and SPACE optimization
is turned on.
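
Putting the two excerpts together gives a small state machine with two
different thresholds.  The sketch below is an illustration only:
next_optim() and the enum are invented names, and it models nothing
but the two quoted checks, not the rest of the allocator.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-ins for FS_OPTTIME and FS_OPTSPACE. */
enum optim { OPT_TIME, OPT_SPACE };

/*
 * Hypothetical helper, not kernel code: given the current mode and
 * the counters used by the quoted checks, return the mode the
 * filesystem would be in after one pass through them.
 */
enum optim
next_optim(enum optim cur, int64_t dsize, int minfree, int64_t nffree)
{
        assert(dsize >= 0 && nffree >= 0);
        assert(minfree >= 0 && minfree <= 100);

        if (cur == OPT_SPACE) {
                /* Stay in SPACE while free fragments exceed half the reserve. */
                if (minfree <= 5 || nffree > dsize * minfree / (2 * 100))
                        return (OPT_SPACE);
                return (OPT_TIME);
        }

        /* In TIME mode: stay while free fragments are below (minfree - 2)%. */
        if (nffree < dsize * (minfree - 2) / 100)
                return (OPT_TIME);
        return (OPT_SPACE);
}
```

For a data area of 1000000 units and an 8% reserve, TIME changes to
SPACE once nffree reaches 60000, but SPACE changes back to TIME only
when nffree drops to 40000 or less.  Anywhere in between, the
filesystem keeps whatever mode it already has.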

> From what I have been able to find this has to do with fragmentation in the
> partition and not the disk actually being full.

That was a VERY useful hint in finding out why this happens.

Now that I have understood that this is an interesting interaction
between the free reserve set aside from the total disk space and
fragmentation, perhaps we can find some way to solve the problems
SPACE optimization might cause.

What techniques would you use to reduce fragmentation?  Changes in
block/fragment ratio?  Changes to the default fragment or block size?

Ideas anyone?

-- 
FreeBSD: The Power to Serve <> http://www.FreeBSD.org
FreeBSD 5.0-CURRENT #0: Wed Aug 21 22:08:19 EEST 2002
