FreeBSD Mail Archives

Date:      Tue, 5 Nov 1996 10:21:12 +0900 (JST)
From:      Michael Hancock <michaelh@cet.co.jp>
To:        Thomas David Rivers <ponds!rivers@dg-rtp.dg.com>
Cc:        FreeBSD Hackers <Hackers@FreeBSD.ORG>
Subject:   Re: More info on the daily panics...
Message-ID:  <Pine.SV4.3.95.961105095516.725A-100000@parkplace.cet.co.jp>
In-Reply-To: <199611040418.XAA01799@lakes.water.net>

I think you're on the right track.

Continue looking at the initialization code of the inode allocation
code and also start looking at deallocation.
/sys/ufs/ufs/ufs_inode.c:ufs_reclaim() and /sys/kern/vfs_subr.c:vclean().

It could also be a nasty race condition.

Also, please provide some stats on your system including the size of RAM,
Size of Disk, %Free Disk Space, %Free Inodes, the number of fictitious
cylinders, etc.

Regards,


Mike Hancock

btw, I am by no means an fs expert.  I am interested in file systems
though.

On Sun, 3 Nov 1996, Thomas David Rivers wrote:

>  I've applied Bruce's fix; but it really was for a different problem.
> 
>  I got another panic today:
> 
> 	ffs_valloc: dup alloc
> 
>  that's the one I usually get when doing the newfs on the install
> (for 2.1.0 and 2.1.5) if I don't alter the newfs parms.  Just to
> be clear; this is a 2.1.5-Stable kernel with Bruce's ufs_vnops.c
> fix.
> 
>  Here's the gdb -k traceback.  This isn't a debugable kernel (I should
> probably fix that), so there's not too much information:
> 
> 
> Script started on Sun Nov  3 22:39:49 1996
> [ponds.water.net]$ gdb -k kernel.6 vmcore.6
> GDB is free software and you are welcome to distribute copies of it
>  under certain conditions; type "show copying" to see the conditions.
> There is absolutely no warranty for GDB; type "show warranty" for details.
> GDB 4.13 (i386-unknown-freebsd), 
> Copyright 1994 Free Software Foundation, Inc...(no debugging symbols found)...
> IdlePTD 1e4000
> current pcb at 1d5484
> panic: ffs_valloc: dup alloc
> #0  0xf0193c7b in boot ()
> (kgdb) where
> #0  0xf0193c7b in boot ()
> #1  0xf0112b83 in panic ()
> #2  0xf0175183 in ffs_valloc ()
> #3  0xf01813c6 in ufs_makeinode ()
> #4  0xf017ed85 in ufs_create ()
> #5  0xf012cb97 in vn_open ()
> #6  0xf012a3cf in open ()
> #7  0xf019bff6 in syscall ()
> #8  0xf01914bb in Xsyscall ()
> #9  0x33c0 in ?? ()
> #10 0x32cd in ?? ()
> #11 0x327d in ?? ()
> #12 0x31d7 in ?? ()
> #13 0x2f7b in ?? ()
> #14 0x2e76 in ?? ()
> #15 0x3b87 in ?? ()
> #16 0x4854 in ?? ()
> #17 0x474e in ?? ()
> #18 0x467a in ?? ()
> #19 0x1ffe in ?? ()
> #20 0x1e6b in ?? ()
> #21 0x1d21 in ?? ()
> #22 0x16a7 in ?? ()
> #23 0x10d3 in ?? ()
> (kgdb) quit
> [ponds.water.net]$ 
> Script done on Sun Nov  3 22:40:09 1996
> 
>  As you can see from the ls -l on /var/crash - after I enabled
> savedump; I get a crash almost every day (although I went for
> almost a week and ran just fine):
> 
> [ponds.water.net]$ pwd
> /var/crash
> [ponds.water.net]$ ls -l
> total 131684
> -rw-r--r--  1 root  wheel        2 Nov  3 13:21 bounds
> -rw-r--r--  1 root  wheel   956177 Oct 23 01:29 kernel.0
> -rw-r--r--  1 root  wheel   956177 Oct 24 03:07 kernel.1
> -rw-r--r--  1 root  wheel   965207 Oct 30 07:54 kernel.2
> -rw-r--r--  1 root  wheel   965207 Oct 31 03:20 kernel.3
> -rw-r--r--  1 root  wheel   965207 Nov  1 03:21 kernel.4
> -rw-r--r--  1 root  wheel   965207 Nov  2 03:24 kernel.5
> -rw-r--r--  1 root  wheel   965207 Nov  3 13:22 kernel.6
> -rw-rw-r--  1 root  wheel        5 Jul 16 22:37 minfree
> -rw-r--r--  1 root  wheel  8650752 Oct 23 01:29 vmcore.0
> -rw-r--r--  1 root  wheel  8650752 Oct 24 03:07 vmcore.1
> -rw-r--r--  1 root  wheel  8650752 Oct 30 07:54 vmcore.2
> -rw-r--r--  1 root  wheel  8650752 Oct 31 03:20 vmcore.3
> -rw-r--r--  1 root  wheel  8650752 Nov  1 03:21 vmcore.4
> -rw-r--r--  1 root  wheel  8650752 Nov  2 03:23 vmcore.5
> -rw-r--r--  1 root  wheel  8650752 Nov  3 13:22 vmcore.6
> 
> 
> (By the way, the panic on Nov 2nd - vmcore.5, was 
>     panic: ifree: freeing free inode
>  the most frequent one I see...)
> 
> You'll note from the times, these crashes tend to occur at
> roughly the same time.  I've been scanning the cron logs; the
> only thing I see starting up about that time is the 03:15:00 
> /usr/libexec/atrun and a 03:15:00 /usr/lib/newsbin/input/newsrun.
> Both of these run every hour (well, atrun runs more often than that,
> of course) - so I'm not yet sure exactly what command(s) is/are
> causing the panic.
> 
> Almost all the entries in the cron logs look like:
> 
> Oct 23 01:15:01 ponds CRON[3494]: (news) CMD (/usr/lib/newsbin/input/newsrun)
> Oct 23 01:20:00 ponds CRON[3610]: (root) CMD (/usr/libexec/atrun)
> Oct 23 01:30:01 ponds cron[138]: (CRON) STARTUP (fork ok)
> 
> although a few have something besides 'newsrun' starting up...  it all
> seems pretty harmless.
> 
> Let me add that there are no 'at' entries in the queue, for anyone
> on the system...
> 
> Perhaps my next step is to build a debuggable kernel and start
> debugging...
> 
> Let me add that a duplicate alloc, and freeing already free'd inodes,
> etc.... tend to point to a problem in the same area.  Something
> fishy the the inode management in ffs_alloc.c.
> 
> Of course, a reliable reproduction of the problem would help toward
> locating the fix...
> 
> The 'freeing free inode' occurs if cg_inosused for the inode being
> free'd is already clear.  The 'dup alloc' panic occurs if the i_mode
> of the inode which was returned from the allocator was already set
> (indicating the inode was already allocated.)  The allocator here
> is ffs_nodealloccg() - which can return a wrong result if the cg_inosused
> field is clear... 
> 
> isclr() (from /usr/include/sys/param.h) is:
> 
>    #define isclr(a,i)      (((a)[(i)/NBBY] & (1<<((i)%NBBY))) == 0)
> 
> [NBBY is '8' from types.h.]  This seems completely reasonable for
> accessing a bit from a character array...
> 
> The cg_inosused() macro to access the used inodes table would be
> incorrect if CG_MAGIC wasn't set for some reason - then you would
> wind up pointing to trash for the inode table...  
> 
> This is what makes me think that allocating more inodes/blocks
> in a file system tickles this problem...
> 
> Ideas on possible investigation avenues (i.e. hints on where to 
> look) would be appreciated :-)   But, I'm starting to become
> familiar with ffs_valloc.c & the cg structure :-)
> 
> 	- Dave Rivers -

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?Pine.SV4.3.95.961105095516.725A-100000>

Header And Logo

Peripheral Links

Site Navigation

Header And Logo

Peripheral Links

Search

Site Navigation