Date:      Mon, 21 May 2007 15:34:05 -0400
From:      Kris Kennaway <kris@obsecurity.org>
To:        Gore Jarold <gore_jarold@yahoo.com>
Cc:        freebsd-fs@freebsd.org, Brooks Davis <brooks@freebsd.org>
Subject:   How to report bugs (Re: VERY frustrated with FreeBSD/UFS stability - please help or comment...)
Message-ID:  <20070521193405.GA80086@xor.obsecurity.org>
In-Reply-To: <475187.33232.qm@web63006.mail.re1.yahoo.com>
References:  <20070521174818.GA64826@lor.one-eyed-alien.net> <475187.33232.qm@web63006.mail.re1.yahoo.com>

On Mon, May 21, 2007 at 12:16:33PM -0700, Gore Jarold wrote:

> > > a) am I really the only person in the world that moves
> > > around millions of inodes throughout the day ?  Am I
> > > the only person in the world that has ever filled up a
> > > snapshotted FS (or a quota'd FS, for that matter) ?

There are known panics that may occur when a snapshot grows to fill a
filesystem.  I am told that these are fundamentally difficult to
avoid.

You are certainly not the only person who operates on millions of
inodes, but it is disingenuous to suggest that this is either a
"mainstream" or "simple" workload.  Also, I personally know of several
people who do this without apparent problem, so that is further
evidence that whatever problems you are seeing are something specific
to your workload or configuration, or you are just unlucky.

The larger issue here is that apparently you have been suffering in
silence for many years with your various frustrations and they have
finally exploded into this email.  This is really a poor way to
approach the goal of getting your problems solved: it is fundamentally
a failure of expectations to think that your bugs will somehow get
fixed without being adequately reported.

When you encounter a FreeBSD bug, file a bug report ("PR").  We have
extensive documentation on how to write effective bug reports, but to
summarize some key steps:

1) Provide a reproducible test case.  As Brooks noted, this is a key
step that will greatly increase the chances of your bug being
reproduced, identified and fixed.  You failed to do it before now, so
we were left completely in the dark.

2) Provide sufficient debugging information when the problem occurs.
Consult the FreeBSD Developers' Handbook for a full discussion, but it
involves things like configuring a debugger, obtaining process traces,
enabling additional debugging options, etc.

Without these two things there is really very little that a developer
can do to try and guess what might possibly be happening on your
system.  However, it appears that we might now be making some
progress:

> ssh user@host rm -rf backup.2
> ssh user@host mv backup.1 backup.2
> ssh user@host cp -al backup.0 backup.1
> rsync /files user@host:/backup.0
> 
> The /files in question range from .2 to 2.2 million
> files, all told. This means that when this script
> runs, it first either deletes OR unlinks up to 2
> million items.  Then it does a (presumably) zero cost
> move operation.  Then it does a hard-link-creating cp
> of the same (up to 2 million) items.
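
A sequence like this can be reduced to a standalone test case that
anyone can run on a scratch filesystem.  The following is only a
minimal sketch under stated assumptions: the function names, NFILES,
SCRATCH and the directory layout are hypothetical, the ssh and rsync
steps are left out, and it assumes a cp that supports -l as in the
quoted script.

```shell
#!/bin/sh
# Hypothetical local reproduction of the quoted rotation workload.
# Run it only inside a scratch directory; NFILES and SCRATCH are
# illustrative defaults, not the reporter's real values.

make_tree() {
    # make_tree DIR COUNT: create COUNT small files, 100 per subdirectory
    dir=$1 count=$2 i=0
    while [ "$i" -lt "$count" ]; do
        sub="$dir/d$((i / 100))"
        [ -d "$sub" ] || mkdir -p "$sub"
        echo "$i" > "$sub/f$i"
        i=$((i + 1))
    done
}

rotate() {
    # rotate ROOT: the rm / mv / cp -al steps from the quoted script
    root=$1
    rm -rf "$root/backup.2"
    if [ -d "$root/backup.1" ]; then
        mv "$root/backup.1" "$root/backup.2"
    fi
    # Hard-link copy; assumes cp accepts -a and -l
    cp -al "$root/backup.0" "$root/backup.1"
}

NFILES=${NFILES:-1000}
SCRATCH=${SCRATCH:-$(mktemp -d)}

make_tree "$SCRATCH/backup.0" "$NFILES"
rotate "$SCRATCH"
rotate "$SCRATCH"
rotate "$SCRATCH"
echo "rotated $NFILES files three times in $SCRATCH"
```

Raising NFILES towards the real file count, and pointing SCRATCH at a
filesystem configured like the affected one, would bring the sketch
closer to the reported workload.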

Please provide additional details of how the filesystems in question
are configured, your kernel configuration, hardware configuration, and
the debugging data referred to in 2) above.
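
As a rough illustration of the kind of detail meant here, the basic
facts can be gathered into one text file and attached to the PR.  This
is a sketch only: the section helper, the REPORT variable and the
choice of commands are assumptions, not an official checklist.

```shell
#!/bin/sh
# Sketch: collect basic system details into a single text file.
# The REPORT path and the command list are illustrative assumptions.

section() {
    # section TITLE COMMAND [ARGS...]: print a header, then the output
    title=$1
    shift
    echo "== $title =="
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1 || echo "(exit status $?)"
    else
        echo "($1 not available)"
    fi
    echo
}

REPORT=${REPORT:-${TMPDIR:-/tmp}/pr-details.txt}

{
    section "kernel and release" uname -a
    section "mounted filesystems and options" mount
    section "inode usage" df -i
    section "boot messages (hardware)" dmesg
} > "$REPORT"

echo "wrote $REPORT"
```

The kernel configuration file and any debugger output would still
need to be attached separately.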

Thanks,
Kris
