Date: Fri, 4 Jan 2013 11:21:06 -0500 (EST)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Attila Bogár <attila.bogar@linguamatics.com>
Cc: freebsd-fs@freebsd.org
Subject: Re: gssd mystery
Message-ID: <1583693481.1674257.1357316466470.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <20130104160403.42b02209d363359b83695730@linguamatics.com>

Attila Bogar wrote:
> Hi All,
>
> I have NFS server which exports via kerberos security.
> The users and groups come from LDAP via port net/nss-pam-ldapd.
> gssd is linked against the latest heimdal.
> There are multiple LDAP servers for fail over.
>
> A story was the following:
> - NFS daemon locked up
> - top shows that it's in gsslock - or similar - I don't remember the
> exact state -
> - I noticed, that gssd isn't running
> - /etc/rc.d/gssd start
> ... panic, reboot
>
There are a couple of recent commits to head that were MFC'd to stable/9
yesterday that might be useful.
r244331 (MFC'd as r245016) modifies the gssd daemon so that it uses
syslog() when daemonized, so it should leave a message in
/var/log/messages when it exit(1)s, due to a failure.
r244370 (MFC'd as r245018) should keep the kernel from crashing when the
gssd is restarted.

If the gssd daemon crashed, hopefully there is a core dump (/gssd.core).
If you have one of these, please run gdb on it and see where it crashed.

> Unfortunately I don't have a kernel dump, but checking the logs I see
> 3 minutes before the lockup:
> [nslcd] [warning] [d802da] <passwd="someuser"> ldap_start_tls_s()
> failed (uri=ldap://ldap1.linguamatics.com): Can't contact LDAP server:
> Bad file descriptor
> [nslcd] [warning] [d802da] <passwd="someuser"> failed to bind to LDAP
> server ldap://ldap1.linguamatics.com: Can't contact LDAP server: Bad
> file descriptor
> [nslcd] [info] [d802da] <passwd="someuser"> connected to LDAP server
> ldap://ldap2.linguamatics.com
> This may or may not be connected, but I can't see these messages for a
> long time back in history.
>
Might be related. It will do getpwnam() to create a uid/gid-list for a
user principal name.

> Anyway there is some bug around gssd, because it died.
> I don't know if this is a reproducible bug or not yet.
>
> How can be gssd monitored on a production system to figure out the
> reason for death?
>
If there is no core dump, hopefully the r244331 patch will result in a
message in /var/log/messages.

Please let us know if you figure out more about why the gssd died.

Good luck with it, rick

> Attila
>
> --
> Attila Bogár
> Systems Administrator
> Linguamatics - Cambridge, UK
> http://www.linguamatics.com/
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
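
On the monitoring question, here is a minimal sketch of one way to watch gssd from a periodic job. It is not part of gssd or of the commits mentioned above. It assumes the r244331 behaviour described in the reply (gssd logging through syslog when daemonized) and that entries carry the usual syslog program tag, "gssd:" or "gssd[pid]:". The helper names gssd_log_lines and gssd_is_running are hypothetical, and nothing is assumed about the wording of the messages gssd writes.

```python
import re
import subprocess

# Matches a syslog program tag such as "gssd:" or "gssd[1234]:".
# Only the tag is assumed; the message text gssd logs is not.
GSSD_TAG = re.compile(r"(?<![\w/.-])gssd(\[\d+\])?:")


def gssd_log_lines(lines):
    """Return the lines tagged as coming from gssd, newline-stripped."""
    return [line.rstrip("\n") for line in lines if GSSD_TAG.search(line)]


def gssd_is_running():
    """Return True/False from `pgrep -x gssd`, or None if pgrep cannot be run."""
    try:
        result = subprocess.run(
            ["pgrep", "-x", "gssd"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except FileNotFoundError:
        return None  # pgrep is not installed or not on PATH
    # pgrep exits with status 0 only when at least one process matched.
    return result.returncode == 0
```

A cron job could call gssd_is_running() and, when it returns False, pass the open /var/log/messages file to gssd_log_lines() and mail the result, so the last thing gssd logged is captured close to the time it died.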