Date:      Mon, 19 Jul 2010 13:41:24 -0700
From:      Jeremy Chadwick <freebsd@jdc.parodius.com>
To:        "Mikhail T." <mi+thun@aldan.algebra.com>
Cc:        stable@freebsd.org, fs@freebsd.org
Subject:   Re: panic: handle_written_inodeblock: bad size
Message-ID:  <20100719204124.GA21573@icarus.home.lan>
In-Reply-To: <4C44758F.7080209@aldan.algebra.com>
References:  <4C43F35D.5020007@aldan.algebra.com> <20100719113147.GA4786@icarus.home.lan> <4C44758F.7080209@aldan.algebra.com>

On Mon, Jul 19, 2010 at 11:55:59AM -0400, Mikhail T. wrote:
> 19.07.2010 07:31, Jeremy Chadwick wrote:
> >If you boot the machine in single-user, and run fsck manually, are there
> >any errors?
> Thanks, Jeremy... I wish there was a way to learn /which/
> file-system is giving trouble... However, after sending the question
> out last night, I tried to pkg_delete a package on the machine, and
> was very lucky to see a file-system error (inode something or other)
> before the panic struck. That, at least, told me which file-system
> was in trouble (/var). I dumped it out, re-created it, and then
> restored it... Although the dump went smoothly, there were two errors
> at which restore offered to abort. I told it not to and got (most of
> the) file-system restored. (The dump is available to anyone wishing
> to investigate -- contact me privately. I'm not posting it publicly
> because of the passwd-file backup under /var).

I see where you're going with this -- the only reason you knew it was
/var was the inode error you saw before the system crashed.

> So far seems quiet -- no panics for two more hours before I went to bed.
> >Only thing I can think of off the top of my head: there's a known
> >situation (also applies to RELENG_7) where a background fsck doesn't
> >correct all errors after a system crash/unclean shutdown.  I mention
> >this because I see "softdep" in the above stack trace (usually refers to
> >softupdates).  I don't know if this got fixed, but the workaround is to
> >use background_fsck="no" in rc.conf.  Yes, after a crash this means you
> >have to wait for the entire fsck to run.
> When setting up my main machine 4 years ago, I turned off background
> fsck... But I thought things had improved sufficiently since
> then :-( Maybe background fsck should still be disabled by
> default?
> 
> And, IMO, at the very least, *any panic related to a file-system
> must clearly identify the file-system in question*... What do you
> think?

I think that's a reasonable request and would be ideal for situations
like this.  Whether it's possible is a completely different question.

I might be able to code something like this up (would be my first time
messing around in the kernel in this regard -- warning, alert, hazard,
danger Will Robinson!), but it'd probably make more sense for someone
already familiar with that section of code to do it.  I'm much more
familiar with userland stuff; the kernel is "black magic".  ;-)

Assuming work tonight isn't that busy for me, I'll see if I can dedicate
some cycles to printing this information in the error string you saw.
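
For reference, the background fsck workaround quoted above comes down to
one line of configuration.  A minimal sketch, assuming the stock
/etc/rc.conf location and no other fsck-related settings on the box:

```shell
# /etc/rc.conf fragment (assumed location: the stock /etc/rc.conf)
# Disable background fsck so that, after a crash or unclean shutdown,
# the full foreground fsck completes before the file systems are used.
background_fsck="no"
```

The trade-off is the one already mentioned: after a crash, boot waits
for the entire fsck to finish.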

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |



