Date:      Mon, 17 Mar 2003 11:09:25 -0800
From:      Kirk McKusick <mckusick@beastie.mckusick.com>
To:        dwmalone@freebsd.org, El Vampiro <vampiro@rootshell.ru>
Cc:        "Evgueni V. Gavrilov" <aquatique@rusunix.org>, Mike Makonnen <mtm@identd.net>, src-committers@freebsd.org, cvs-src@freebsd.org, cvs-all@freebsd.org
Subject:   kern/42277
Message-ID:  <200303171909.h2HJ9PFL013291@beastie.mckusick.com>

	Date: Mon, 24 Feb 2003 03:04:47 -0500
	From: Mike Makonnen <mtm@identd.net>
	To: Kirk McKusick <mckusick@FreeBSD.org>
	Cc: src-committers@FreeBSD.org, cvs-src@FreeBSD.org, cvs-all@FreeBSD.org
	Subject: Re: cvs commit: src/sys/ufs/ffs ffs_softdep.c

	On Sun, 23 Feb 2003 23:28:41 -0800 (PST)
	Kirk McKusick <mckusick@FreeBSD.org> wrote:

	> mckusick    2003/02/23 23:28:41 PST
	> 
	>   Modified files:
	>     sys/ufs/ffs          ffs_softdep.c 
	>   Log:
	>   When removing the last item from a non-empty worklist, the worklist
	>   tail pointer must be updated.
	>   

	This looks like it might solve kern/42277. Is that correct?

	-- 
	Mike Makonnen  | GPG-KEY: http://www.identd.net/~mtm/mtm.asc
	mtm@identd.net |
	    Fingerprint: D228 1A6F C64E 120A A1C9  A3AA DAE1 E2AF DBCC 68B9

I had hoped that the above patch would solve the problem, but
alas it did not (though that fix should be MFC'ed into -stable
as it is a valid fix). Further investigation with the help of
Evgueni V. Gavrilov has determined that there is a memory
corruption problem in the 128-byte bucket space. Applying the
patch below makes the soft updates panics stop. It does so by
creating an "unused" field in the inodedep structure at the
64-byte offset. The printf showing that the "unused" field has
changed does occur, but at least soft updates continues to work.
Evgueni V. Gavrilov and I have been unable to track down the
memory-corrupting culprit, but perhaps someone else on this list
can help. I am not inclined to check in this patch to -stable
as it fixes a symptom rather than a bug. I am hoping that the
true cause of the corruption can be found and fixed instead.

	Kirk McKusick

=-=-=-=-=

To: "Evgueni V. Gavrilov" <aquatique@rusunix.org>
Subject: Re: kern/42277: crash #4 
In-Reply-To: Your message of "Sat, 01 Mar 2003 13:25:20 +0600."
             <20030301072520.GA52366@rusunix.org> 
Date: Sun, 02 Mar 2003 15:14:15 -0800
From: Kirk McKusick <mckusick@beastie.mckusick.com>

	Date: Sat, 1 Mar 2003 13:25:20 +0600
	From: "Evgueni V. Gavrilov" <aquatique@rusunix.org>
	To: Kirk McKusick <mckusick@beastie.mckusick.com>
	Subject: Re: kern/42277: crash #4
	X-ASK-Info: Whitelist match

	ehlo
	I got one more panic.
	I started upload of vmcore.4.bz2
	The kernel and the sources are the same.

	-- 
	http://aquatique.rusunix.org
	http://rusunix.org

Thanks for your help and patience. I would like to say that I found
the bug, but alas I have not. But I have determined that all four
of your crashes are caused by the same bug. In each case the short
at a 64-byte offset from the beginning of an inodedep is being
decremented. As this offset is the top half of a pointer, the next time
the pointer (now with a value of 0xffff0000) is dereferenced, the
kernel panics. The different crashes show the corruption happening
at different times in the life of the inodedep, usually after it
has been in existence for several seconds, but occasionally sooner.
This sort of trashing most commonly occurs when a previous user of
dynamic memory continues using something that they have freed. So, I
would like to test out this theory on your system. Could you please
apply the patch below. It creates an unused field in the inodedep
structure at the location that is getting trashed, sets it to a
known value and then verifies that value has not changed when it is
done with the inodedep (printing out a warning if it has changed).
If my theory is correct, then the panics will stop and you will get
the console message "free_inodedep: trashed memory 0x12335678". If
it is a soft updates code problem, then the same panics will persist.
Either way, we will have narrowed the scope of possible problems.
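
The arithmetic behind the two values quoted above (0xffff0000 and
0x12335678) can be sketched in user-space C. This is only an
illustration, assuming a 32-bit word whose upper 16 bits receive the
stray decrement; decrement_upper_half is a hypothetical name and is
not part of the kernel sources.

```c
#include <stdint.h>

/*
 * Hypothetical helper: model a stray 16-bit decrement that lands on
 * the upper half of a 32-bit word. The lower half is left untouched.
 */
uint32_t
decrement_upper_half(uint32_t word)
{
	uint16_t upper = (uint16_t)(word >> 16);
	uint16_t lower = (uint16_t)(word & 0xffffu);

	upper = (uint16_t)(upper - 1);	/* 0x0000 wraps to 0xffff */
	return (((uint32_t)upper << 16) | lower);
}
```

Under that assumption a NULL pointer (0x00000000) becomes 0xffff0000,
and the known value 0x12345678 becomes 0x12335678, which is why the
expected console message does not show the original constant.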

	Kirk McKusick

=-=-=-=-=

*** softdep.h	Thu Jun 22 12:27:42 2000
--- softdep.h.new	Sun Mar  2 14:35:26 2003
***************
*** 243,248 ****
--- 243,249 ----
  	off_t	id_savedsize;		/* file size saved during rollback */
  	struct	workhead id_pendinghd;	/* entries awaiting directory write */
  	struct	workhead id_bufwait;	/* operations after inode written */
+ 	int	id_unused;
  	struct	workhead id_inowait;	/* operations waiting inode update */
  	struct	allocdirectlst id_inoupdt; /* updates before inode written */
  	struct	allocdirectlst id_newinoupdt; /* updates when inode written */

*** ffs_softdep.c	Sun Mar  2 14:34:33 2003
--- ffs_softdep.c.new	Sun Mar  2 14:56:41 2003
***************
*** 1012,1017 ****
--- 1012,1018 ----
  	num_inodedep += 1;
  	MALLOC(inodedep, struct inodedep *, sizeof(struct inodedep),
  		M_INODEDEP, M_SOFTDEP_FLAGS);
+ 	inodedep->id_unused = 0x12345678;
  	inodedep->id_list.wk_type = D_INODEDEP;
  	inodedep->id_fs = fs;
  	inodedep->id_ino = inum;
***************
*** 2097,2102 ****
--- 2098,2106 ----
  	    inodedep->id_nlinkdelta != 0 || inodedep->id_savedino != NULL)
  		return (0);
  	LIST_REMOVE(inodedep, id_hash);
+ 	if (inodedep->id_unused != 0x12345678)
+ 		printf("free_inodedep: trashed memory 0x%x\n",
+ 		    inodedep->id_unused);
  	WORKITEM_FREE(inodedep, D_INODEDEP);
  	num_inodedep -= 1;
  	return (1);
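
The same guard-field idea can be tried outside the kernel. The sketch
below is a minimal user-space illustration, not the soft updates code:
struct tracked, tracked_init, and tracked_guard_intact are hypothetical
names, and the guard is simply placed after the payload rather than at
a specific 64-byte offset.

```c
#include <stdint.h>
#include <stdio.h>

#define GUARD_VALUE 0x12345678u	/* same known value as in the patch */

/* Hypothetical structure carrying an otherwise unused guard field. */
struct tracked {
	int		payload;
	uint32_t	guard;
};

/* Set the guard to its known value when the structure is set up. */
void
tracked_init(struct tracked *t, int payload)
{
	t->payload = payload;
	t->guard = GUARD_VALUE;
}

/*
 * Check the guard before the structure is released.
 * Returns 1 if it is intact; prints a warning and returns 0 otherwise.
 */
int
tracked_guard_intact(const struct tracked *t)
{
	if (t->guard != GUARD_VALUE) {
		printf("tracked_guard_intact: trashed memory 0x%x\n",
		    (unsigned int)t->guard);
		return (0);
	}
	return (1);
}
```

Because the guard absorbs the stray write, the neighbouring fields stay
valid; the warning only records that a write happened, not who made it.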

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe cvs-src" in the body of the message



