Date: Sat, 16 Dec 2017 19:32:02 +0000
From: bugzilla-noreply@freebsd.org
To: java@FreeBSD.org
Subject: [Bug 224079] java/openjdk8: Elasticsearch won't start after OpenJDK upgrade
Message-ID: <bug-224079-8522-Mu0K897awU@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-224079-8522@https.bugs.freebsd.org/bugzilla/>
References: <bug-224079-8522@https.bugs.freebsd.org/bugzilla/>
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=224079

John W. O'Brien <john@saltant.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #188887|                            |maintainer-approval?
              Flags|                            |

--- Comment #4 from John W. O'Brien <john@saltant.com> ---
Created attachment 188887
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=188887&action=edit
java/openjdk8: Preserve OS-supplied IPv6 interface scope IDs

The problem is in the way u152 started handling IPv6 scope IDs in the
java.net.NetworkInterface class. This patch corrects that defect and allows
elasticsearch (ES) to start. Read on for details of my investigation and
analysis.

My attention was drawn to "[::1%2]" in the ktrace output. This looked wrong to
me.

I adapted the soconnect.d DTrace script from Gregg and Mauro [0] to get a look
at the bind(2) calls (sobind.d).

On a machine with u144 where ES starts:

  PID  PROCESS  FAM  ADDR       SCOPE  PORT
44187  java      28  fe80::1        3  9300
44187  java      28  ::1            0  9300
44187  java      28  127.0.0.1      0  9300
44187  java      28  fe80::1        3  9200
44187  java      28  ::1            0  9200
44187  java      28  127.0.0.1      0  9200

On a machine with u152 where ES fails:

  PID  PROCESS  FAM  ADDR       SCOPE  PORT
65851  java      28  fe80::1        3  9300
65851  java      28  ::1            3  9300
65851  java      28  ::1            3  9301
65851  java      28  ::1            3  9302
65851  java      28  ::1            3  9303
65851  java      28  ::1            3  9304
[repeat through PORT 9400]

Next, I wrote a short C program to exercise getifaddrs(3) and ran it on a few
different systems. I also wrote a C program to try calling bind(2) on an
arbitrary IPv6 address, scope ID, and port.
This is the simplest, most direct way I could think of to demonstrate the
problem and confirm my understanding of the applicable APIs.

FreeBSD 10.4-RELEASE-p3:

$ ./gifa lo0
iface  flags       af  addr       scope  ifidx
lo0    0x00008049  18                -1      3
lo0    0x00008049  28  ::1            0      3
lo0    0x00008049  28  fe80::1        3      3
lo0    0x00008049   2  127.0.0.1     -1      3
$ ./trybind ::1 0 9300 && echo OK
OK
$ ./trybind ::1 3 9300 && echo OK
Could not bind: Can't assign requested address
$ ./trybind ::1 999 9300 && echo OK
Could not bind: Can't assign requested address

RedHat Enterprise Linux 6.9:

$ ./gifa lo
iface  flags       af  addr       scope  ifidx
lo     0x00010049  17                -1      1
lo     0x00010049   2  127.0.0.1     -1      1
lo     0x00010049  10  ::1            0      1
$ ./trybind ::1 0 9300 && echo OK
OK
$ ./trybind ::1 1 9300 && echo OK
OK
$ ./trybind ::1 999 9300 && echo OK
OK

macOS Sierra 10.12.6:

$ ./gifa lo0
iface  flags       af  addr       scope  ifidx
lo0    0x00008049  18                -1      1
lo0    0x00008049   2  127.0.0.1     -1      1
lo0    0x00008049  30  ::1            0      1
lo0    0x00008049  30  fe80::1        1      1
$ ./trybind ::1 0 9300 && echo OK
OK
$ ./trybind ::1 1 9300 && echo OK
OK
$ ./trybind ::1 999 9300 && echo OK
OK

From these results I infer that RHEL and macOS ignore sin6_scope_id unless it's
needed to disambiguate an address known to be scoped, while FreeBSD always
considers the scope ID part of the address and treats scope 0 as the unscoped
scope.

For reference, the POSIX spec [1] states:

"The sin6_scope_id field is a 32-bit integer that identifies a set of
interfaces as appropriate for the scope of the address carried in the sin6_addr
field. For a link scope sin6_addr, the application shall ensure that
sin6_scope_id is a link index. For a site scope sin6_addr, the application
shall ensure that sin6_scope_id is a site index. The mapping of sin6_scope_id
to an interface or set of interfaces is implementation-defined."

Is the loopback address scoped? According to RFC-4007 [2], "::1, is treated as
having link-local scope".
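
For readers who want to repeat the bind(2) experiment, a probe along the lines
of trybind might look like the sketch below. This is not the source of the
program used for the results above; the function names fill_sockaddr6 and
try_bind6 are hypothetical, and the sketch assumes a POSIX system with IPv6
support. It only shows how an address, scope ID, and port end up in a
struct sockaddr_in6 before bind(2) is called.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: build a sockaddr_in6 from a textual IPv6 address,
 * an explicit scope ID, and a port. Returns 0 on success, or -1 if addr
 * is not a valid IPv6 literal.
 */
int fill_sockaddr6(const char *addr, uint32_t scope, uint16_t port,
                   struct sockaddr_in6 *sa)
{
    assert(addr != NULL && sa != NULL);
    memset(sa, 0, sizeof(*sa));
    sa->sin6_family = AF_INET6;
    sa->sin6_port = htons(port);
    /* The scope ID is passed through exactly as given by the caller. */
    sa->sin6_scope_id = scope;
    return inet_pton(AF_INET6, addr, &sa->sin6_addr) == 1 ? 0 : -1;
}

/*
 * Hypothetical helper: try to bind a TCP socket to addr/scope/port.
 * Returns 0 if bind(2) succeeded, otherwise the errno value of the
 * failing call (EINVAL if addr could not be parsed).
 */
int try_bind6(const char *addr, uint32_t scope, uint16_t port)
{
    struct sockaddr_in6 sa;
    int fd;
    int err = 0;

    if (fill_sockaddr6(addr, scope, port, &sa) != 0)
        return EINVAL;

    fd = socket(AF_INET6, SOCK_STREAM, 0);
    if (fd < 0)
        return errno;

    /* Save errno before close(2) has a chance to overwrite it. */
    if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) != 0)
        err = errno;

    close(fd);
    return err;
}
```

A small main() that parses three arguments, calls try_bind6(), and prints
strerror() of a non-zero result would give the same command-line shape as the
./trybind invocations shown above.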

The OpenJDK patch that introduced the breakage is a changeset [3] that modifies
the java.net.NetworkInterface class to unconditionally jam the interface index
into sin6_scope_id (see diff lines 1.926, 1.1274, and 1.1662). This is
unnecessary for any address that is scoped, because the OS will have already
populated sin6_scope_id with the correct link index. This is also incorrect for
any address that is not scoped, because it constitutes a JDK-defined mapping of
scope ID to an interface or set of interfaces, whereas the OS is entitled to
define that mapping.

I have found nowhere else where OpenJDK depends upon finding the interface
index in the sin6_scope_id field.

[0] https://github.com/brendangregg/DTrace-book-scripts/blob/master/Chap6/soconnect.d
[1] http://pubs.opengroup.org/onlinepubs/000095399/basedefs/netinet/in.h.html
[2] https://tools.ietf.org/html/rfc4007#section-4
[3] http://hg.openjdk.java.net/jdk8u/jdk8u/jdk/rev/3dc438e0c8e1

-- 
You are receiving this mail because:
You are the assignee for the bug.