Date:      Tue, 17 Jun 2014 14:04:52 +0400
From:      Dmitry Sivachenko <trtrmitya@gmail.com>
To:        Ronald Klop <ronald-lists@klop.ws>
Cc:        freebsd-java@freebsd.org
Subject:   Re: JVM BUG(s) - Hadoop's threads hanging
Message-ID:  <1B53E600-B745-459E-98F8-7CEF9FDE77CC@gmail.com>
In-Reply-To: <op.xhjxxemtkndu52@ronaldradial.radialsg.local>
References:  <E14F86A5-C7FE-49B5-8A11-F5237C557AE2@gmail.com> <op.xhjxxemtkndu52@ronaldradial.radialsg.local>


On 16 June 2014, at 18:45, Ronald Klop <ronald-lists@klop.ws> wrote:

>
> Hi,
>
> From your information it is hard to say something about it. The bug can be in FreeBSD, OpenJDK (the Oracle part or in the BSD port part), in Hadoop or in your own code running on top of Hadoop.
>
> My first idea would be to eliminate some of the possibilities.
> - Run a Linux machine with the same versions of the software.
> - Try FreeBSD 9-stable.

I will try at least FreeBSD 9 soon. I have never used Linux, so that would take more time, and it is not so relevant because I want to keep using FreeBSD, not just move to Linux.


> - Try an older version of OpenJDK on FreeBSD.

I already tried the latest versions of openjdk-6/7/8 from ports.

7 and 8 behave the same way (as I described in my original e-mail). Below is the output of jstack for openjdk7 (the java process running tasktracker):

46897 hadoop        147  21    0  1927M   625M uwait  22  14:31   7.86% java

/tmp# jstack -l 46897
46897: Unable to open socket file: target process not responding or HotSpot VM not loaded
The -F option can be used when the target process is not responding
/tmp# jstack -F -l 46897>/tmp/jstack.out
Attaching to process ID 46897, please wait...
Exception in thread "main" java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at sun.tools.jstack.JStack.runJStackTool(JStack.java:136)
        at sun.tools.jstack.JStack.main(JStack.java:102)
Caused by: sun.jvm.hotspot.debugger.UnalignedAddressException: 746f705b762f4867
        at sun.jvm.hotspot.debugger.bsd.BsdDebuggerLocal$1.checkAlignment(BsdDebuggerLocal.java:183)
        at sun.jvm.hotspot.debugger.bsd.BsdDebuggerLocal.readCInteger(BsdDebuggerLocal.java:485)
        at sun.jvm.hotspot.debugger.DebuggerBase.readAddressValue(DebuggerBase.java:454)
        at sun.jvm.hotspot.debugger.bsd.BsdDebuggerLocal.readAddress(BsdDebuggerLocal.java:430)
        at sun.jvm.hotspot.debugger.bsd.BsdAddress.getAddressAt(BsdAddress.java:74)
        at sun.jvm.hotspot.HotSpotTypeDataBase.readVMTypes(HotSpotTypeDataBase.java:154)
        at sun.jvm.hotspot.HotSpotTypeDataBase.<init>(HotSpotTypeDataBase.java:85)
        at sun.jvm.hotspot.bugspot.BugSpotAgent.setupVM(BugSpotAgent.java:573)
        at sun.jvm.hotspot.bugspot.BugSpotAgent.go(BugSpotAgent.java:494)
        at sun.jvm.hotspot.bugspot.BugSpotAgent.attach(BugSpotAgent.java:332)
        at sun.jvm.hotspot.tools.Tool.start(Tool.java:163)
        at sun.jvm.hotspot.tools.JStack.main(JStack.java:86)
        ... 6 more
/tmp#

(jstack.out file is empty)
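
As a side note on the trace above, the value reported by UnalignedAddressException, 746f705b762f4867, can be inspected with a few lines of Java. The sketch below checks 8-byte alignment and renders the eight bytes as ASCII. The class and method names are hypothetical and are not part of the HotSpot serviceability agent; the 8-byte alignment is an assumption for a 64-bit JVM. It only shows what the number looks like, not why the agent read it.

```java
public class AddressCheck {
    // True when addr is a multiple of alignment.
    // Assumes alignment is a power of two (8 here, an assumption for a 64-bit JVM).
    public static boolean isAligned(long addr, long alignment) {
        return (addr & (alignment - 1)) == 0;
    }

    // Renders the 8 bytes of value, most significant byte first, as ASCII.
    // Non-printable bytes are shown as '.'.
    public static String toAscii(long value) {
        StringBuilder sb = new StringBuilder(8);
        for (int shift = 56; shift >= 0; shift -= 8) {
            int b = (int) ((value >>> shift) & 0xff);
            sb.append(b >= 0x20 && b < 0x7f ? (char) b : '.');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The value printed in the UnalignedAddressException message.
        long addr = Long.parseLong("746f705b762f4867", 16);
        System.out.println("aligned to 8: " + isAligned(addr, 8));
        System.out.println("as ASCII:     " + toAscii(addr));
    }
}
```

Running it reports that the value is not 8-byte aligned and that all eight bytes are printable characters (`top[v/Hg`, or the reverse order as laid out in memory on a little-endian machine). That looks more like string data than a pointer, which might mean the agent is reading from the wrong location, but this is only a guess.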




openjdk-6 is different: during the shuffle phase (when portions of intermediate data are copied between data nodes), the java process running tasktracker consumes a lot of CPU (300-400%) and is often in the "vm map" state. Data transfer is very slow (1MB/sec or less on a 1Gbit network). With openjdk7/8 the network is utilized at about 40% (~40MB/sec), which is acceptable, though the question of why it isn't 100MB/sec still stands. So the shuffle phase is almost stuck with openjdk6. But if you wait long enough for it to finish, tasktrackers in the idle state behave as expected (they do not consume CPU). Below is the output of top(1) and jstack:

35291 hadoop        209  22    0  1922M   461M vm map 17  46.5H 336.08% java

/tmp# jstack -l 35291
35291: Unable to open socket file: target process not responding or HotSpot VM not loaded
The -F option can be used when the target process is not responding


/tmp# jstack -F -l 35291>/tmp/jstack.out
Attaching to process ID 35291, please wait...
Exception in thread "main" java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAcces
        at java.lang.reflect.Method.invoke(Method.java:622)
        at sun.tools.jstack.JStack.runJStackTool(JStack.java:136)
        at sun.tools.jstack.JStack.main(JStack.java:102)
Caused by: sun.jvm.hotspot.debugger.UnalignedAddressException: 746f705b762f4867
        at sun.jvm.hotspot.debugger.bsd.BsdDebuggerLocal$1.checkAlignment(BsdDebuggerLocal.java:183)
        at sun.jvm.hotspot.debugger.bsd.BsdDebuggerLocal.readCInteger(BsdDebuggerLocal.java:480)
        at sun.jvm.hotspot.debugger.DebuggerBase.readAddressValue(DebuggerBase.java:454)
        at sun.jvm.hotspot.debugger.bsd.BsdDebuggerLocal.readAddress(BsdDebuggerLocal.java:425)
        at sun.jvm.hotspot.debugger.bsd.BsdAddress.getAddressAt(BsdAddress.java:74)
        at sun.jvm.hotspot.HotSpotTypeDataBase.readVMTypes(HotSpotTypeDataBase.java:154)
        at sun.jvm.hotspot.HotSpotTypeDataBase.<init>(HotSpotTypeDataBase.java:85)
        at sun.jvm.hotspot.bugspot.BugSpotAgent.setupVM(BugSpotAgent.java:572)
        at sun.jvm.hotspot.bugspot.BugSpotAgent.go(BugSpotAgent.java:493)
        at sun.jvm.hotspot.bugspot.BugSpotAgent.attach(BugSpotAgent.java:331)
        at sun.jvm.hotspot.tools.Tool.start(Tool.java:163)
        at sun.jvm.hotspot.tools.JStack.main(JStack.java:86)
        ... 6 more
/tmp#

(/tmp/jstack.out file is empty)
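
To put the transfer rates mentioned above in perspective, here is a rough calculation of link utilization. It assumes the link is 1 Gbit/s Ethernet and ignores protocol overhead; the class and method names are made up for illustration.

```java
public class LinkUtilization {
    // Converts a link speed in Mbit/s to MB/s (8 bits per byte), ignoring protocol overhead.
    public static double rawMegabytesPerSecond(double linkMbitPerSecond) {
        return linkMbitPerSecond / 8.0;
    }

    // Percentage of the raw link capacity used by the observed throughput.
    public static double utilizationPercent(double observedMBps, double linkMbitPerSecond) {
        return 100.0 * observedMBps / rawMegabytesPerSecond(linkMbitPerSecond);
    }

    public static void main(String[] args) {
        double link = 1000.0; // assumption: 1 Gbit/s Ethernet
        System.out.printf("openjdk6:   %.1f%%%n", utilizationPercent(1.0, link));
        System.out.printf("openjdk7/8: %.1f%%%n", utilizationPercent(40.0, link));
    }
}
```

Against the raw 125MB/sec of a gigabit link, 1MB/sec is under 1% and 40MB/sec is about a third; the "about 40%" figure corresponds to taking 100MB/sec as the practical ceiling.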


> - Try a very simple 'Hello world' style application on Hadoop which mimics the thread usage.
>
> Did you ever run your Hadoop application on FreeBSD before without this symptom? If so, what are the differences between then and now?


No, this is my first install of Hadoop, and I use the bundled terasort test suite (hadoop jar /usr/local/share/examples/hadoop/hadoop-examples-1.2.1.jar terasort <...>).
Since the problem is with tasktracker (which does not run user-supplied code; it just schedules tasks and performs cleanups), it is hardly relevant which particular task I execute.
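
For the 'Hello world' style test suggested above, a minimal standalone sketch (outside Hadoop) could look like the following. It starts a number of threads that sit idle on a latch and then prints a thread dump from inside the JVM with Thread.getAllStackTraces(), so it does not depend on jstack being able to attach. The class name, thread names, and thread count are arbitrary choices for illustration; this does not reproduce what tasktracker actually does.

```java
import java.util.Map;
import java.util.concurrent.CountDownLatch;

public class IdleThreadsDemo {
    // Starts n daemon threads that block on the latch until it is released.
    public static Thread[] startIdleWorkers(int n, final CountDownLatch release) {
        Thread[] workers = new Thread[n];
        for (int i = 0; i < n; i++) {
            Thread t = new Thread(new Runnable() {
                public void run() {
                    try {
                        release.await(); // idle here, like a thread with no work
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }
            }, "idle-worker-" + i);
            t.setDaemon(true);
            t.start();
            workers[i] = t;
        }
        return workers;
    }

    // Builds a textual dump of all live threads from inside the JVM.
    public static String dumpThreads() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            sb.append('"').append(t.getName()).append("\" state=").append(t.getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                sb.append("        at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch release = new CountDownLatch(1);
        Thread[] workers = startIdleWorkers(100, release);
        Thread.sleep(1000); // give the workers time to block on the latch
        System.out.print(dumpThreads());
        release.countDown(); // let the workers finish
        for (Thread t : workers) {
            t.join();
        }
    }
}
```

If the sleep is extended, top(1) can be used to see whether a JVM with only idle threads consumes CPU on the same machine. The code avoids lambdas so that it should compile with openjdk 6, 7, and 8.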




