Date:      Fri, 9 Aug 2002 21:05:23 -0700 (PDT)
From:      Nick Johnson <freebsd@spatula.net>
To:        Greg Lewis <glewis@eyesbeyond.com>
Cc:        freebsd-java@freebsd.org
Subject:   Re: More information about segv in RandomAccessFile native method
Message-ID:  <20020809205406.E13773-100000@turing.morons.org>
In-Reply-To: <20020810132304.A21235@misty.eyesbeyond.com>

Before you go to the trouble, I have some more information still which may
be useful.

The problem completely went away after I fixed a bug in my Java code, so
this is probably a general JVM bug rather than a FreeBSD-specific one.  I
was keeping a Lucene IndexSearcher object and Hits object in a bean with
session scope, so they were getting serialized out to disk, including this
RandomAccessFile, or at least some information about some file(s).  I
believe the file was getting closed after a while when the GC ran.  Later,
when the bean was deserialized, it assumed the file would still be open and
the FD still good, when in fact the FD had been closed and its entry in the
fdmon array had been set to null.

So arguably what should be happening is the JVM should notice this broken
deserialization and flip whatever bits are necessary such that the class
causes a java.io.IOException to be thrown the next time something tries to
access that file or use that FD, rather than causing an assertion failure
somewhere inside the JVM core.  But that's Sun's problem :)  I'll file a
bug report later.
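
To make the "throw an IOException instead" idea concrete, here is a minimal
application-level sketch.  It is not how the JVM handles this internally;
GuardedFile and roundTrip are hypothetical names invented for illustration.
The open handle is marked transient so it never survives serialization, and
any later access reports a plain I/O error instead of touching a stale FD.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.RandomAccessFile;
import java.io.Serializable;

// Hypothetical wrapper for illustration; not part of the JDK or Lucene.
class GuardedFile implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String path;
    // transient: the open handle is never written out with the object,
    // so a deserialized copy comes back with raf == null.
    private transient RandomAccessFile raf;

    GuardedFile(String path) throws IOException {
        this.path = path;
        this.raf = new RandomAccessFile(path, "r");
    }

    long length() throws IOException {
        if (raf == null) {
            // No live handle (deserialized or closed): fail with an
            // ordinary exception rather than using a dead descriptor.
            throw new IOException("stale handle for " + path);
        }
        return raf.length();
    }

    void close() throws IOException {
        if (raf != null) {
            raf.close();
            raf = null;
        }
    }

    // Serialize and deserialize in memory, roughly what a session store
    // does when it writes a bean to disk and reads it back.
    static GuardedFile roundTrip(GuardedFile f)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(f);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return (GuardedFile) in.readObject();
        }
    }
}
```

Calling length() on the object returned by roundTrip throws IOException,
which the caller can catch and handle like any other file error.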

Another, less elegant solution would be to change the assert to an if and
have it return EBADF if the fdmon entry is null.

My solution was to fix my bad assumption that the Hits object was just a
container of data (an assumption I would have dropped sooner had I properly
closed the IndexSearcher object).  Instead I now make my own local copy of
the values contained in the object and then close the IndexSearcher, which
keeps it from getting any foolish ideas about open FDs that are in fact
closed (and nonexistent).  The problem didn't turn up consistently when I
was troubleshooting because I was rarely waiting long enough between tests
for the session to be committed to disk and for the GC to run, and when I
was, it didn't occur to me what I was doing :)
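
The copy-then-close approach can be sketched roughly as follows.  IndexHandle
and ResultView are hypothetical stand-ins for the searcher and its lazy
result object, and SearchSnapshot is an invented name; none of this is the
actual Lucene API.  The point is only that the session ends up holding plain
strings, with nothing tied to an open file.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a lazy result object backed by open files.
interface ResultView {
    int length();
    String title(int i);
}

// Hypothetical stand-in for the searcher that owns the open files.
interface IndexHandle {
    ResultView search(String query);
    void close();
}

// Plain data only: safe to keep in a session and serialize to disk.
class SearchSnapshot implements Serializable {
    private static final long serialVersionUID = 1L;

    private final ArrayList<String> titles;

    private SearchSnapshot(ArrayList<String> titles) {
        this.titles = titles;
    }

    List<String> titles() {
        return Collections.unmodifiableList(titles);
    }

    // Copy every value out of the lazy view, then close the index so no
    // file descriptor outlives this call, even if the search fails.
    static SearchSnapshot capture(IndexHandle index, String query) {
        try {
            ResultView view = index.search(query);
            ArrayList<String> copy = new ArrayList<>(view.length());
            for (int i = 0; i < view.length(); i++) {
                copy.add(view.title(i));
            }
            return new SearchSnapshot(copy);
        } finally {
            index.close();
        }
    }
}
```

Only the SearchSnapshot would go into the session-scoped bean; the searcher
and its result view never leave the method that created them.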

   Nick

On Sat, 10 Aug 2002, Greg Lewis wrote:

> Hi Nick,
>
> On Thu, Aug 08, 2002 at 02:41:49PM -0700, Nick Johnson wrote:
> > Here's some more information... I finally got a core file that was worth
> > something:
>
> Sure did :).
>
> > #0  0x280b4b0c in kill () from /usr/lib/libc.so.4
> > #1  0x280f4eea in abort () from /usr/lib/libc.so.4
> > #2  0x281623d9 in Abort () from /usr/local/jdk1.3.1/jre/lib/i386/classic/libjvm.so
> > #3  0x28189596 in panicHandler () from /usr/local/jdk1.3.1/jre/lib/i386/classic/libjvm.so
> > #4  0x28077874 in userSignalHandler () from /usr/local/jdk1.3.1/jre/lib/i386/green_threads/libhpi.so
> > #5  0x28077834 in intrDispatch () from /usr/local/jdk1.3.1/jre/lib/i386/green_threads/libhpi.so
> > #6  0x28070a51 in intrDispatchMD () from /usr/local/jdk1.3.1/jre/lib/i386/green_threads/libhpi.so
> > #7  0xbfbfffac in ?? ()
> > #8  0x28076068 in sysSeek () from /usr/local/jdk1.3.1/jre/lib/i386/green_threads/libhpi.so
> > #9  0x28175976 in JVM_Lseek () from /usr/local/jdk1.3.1/jre/lib/i386/classic/libjvm.so
> > #10 0x3c6fd734 in Java_java_io_RandomAccessFile_length () from /usr/local/freebsd-jdk1.3.1-p7-gcc31/jre/lib/i386/libjava.so
> > #11 0x3d75550e in dispatchJNINativeMethod () from /usr/local/freebsd-jdk1.3.1-p7-gcc31/jre/lib/i386/libOpenJIT.so
> >
> > sysSeek seems to be defined in sys_api_td.c, and my best guess about
> > what's going on is that this is failing:
> >
> >     mon = fdmon[fd];
> >     sysAssert(mon != NULL);
>
> Can you get the same core with java_g?  If so you should even be able
> to get line numbers, which would confirm your suspicions (either that
> or rebuild after putting a trace statement in at that line).
>
> > Since that would be the one thing in that block of code that strikes me as
> > an obvious candidate for something which might cause a panic.
> >
> > So I guess the question now is this: under what circumstances would
> > fdmon[fd] turn up null when trying to do a seek?
>
> Been too long since I looked at that code, I'll try and dig into it a bit.
>
> --
> Greg Lewis                          Email   : glewis@eyesbeyond.com
> Eyes Beyond                         Web     : http://www.eyesbeyond.com
> Information Technology              FreeBSD : glewis@FreeBSD.org
>





