Date:      Fri, 4 Jan 2013 11:21:06 -0500 (EST)
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        Attila Bogár <attila.bogar@linguamatics.com>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: gssd mystery
Message-ID:  <1583693481.1674257.1357316466470.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <20130104160403.42b02209d363359b83695730@linguamatics.com>

Attila Bogar wrote:
> Hi All,
>
> I have NFS server which exports via kerberos security.
> The users and groups come from LDAP via port net/nss-pam-ldapd.
> gssd is linked against the latest heimdal.
> There are multiple LDAP servers for fail over.
>
> A story was the following:
> - NFS daemon locked up
> - top shows that it's in gsslock - or similar - I don't remember the
> exact state -
> - I noticed, that gssd isn't running
> - /etc/rc.d/gssd start
> ... panic, reboot
>
There are a couple of recent commits to head that were MFC'd to stable/9
yesterday that might be useful. r244331 (MFC'd as r245016) modifies the
gssd daemon so that it uses syslog() when daemonized, so it should leave
a message in /var/log/messages when it exit(1)s, due to a failure.
r244370 (MFC'd as r245018) should keep the kernel from crashing when the
gssd is restarted.

If the gssd daemon crashed, hopefully there is a core dump (/gssd.core).
If you have one of these, please run gdb on it and see where it crashed.
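
As a rough sketch of the gdb step above, the following feeds a "bt"
command to gdb on standard input and prints whatever backtrace it
reports. The binary path /usr/sbin/gssd is an assumption about where
gssd is installed, and the helper names are made up for illustration.

```python
import os
import subprocess

GSSD_BINARY = "/usr/sbin/gssd"  # assumption: install location of gssd
CORE_FILE = "/gssd.core"        # core file path mentioned in this message


def gdb_command(binary=GSSD_BINARY, core=CORE_FILE):
    """Build the argument list for 'gdb <program> <core>'."""
    return ["gdb", binary, core]


def print_backtrace(binary=GSSD_BINARY, core=CORE_FILE):
    """Run gdb on the core file if it exists.

    Returns True if gdb was started, False if there is no core file.
    """
    if not os.path.exists(core):
        print(f"no core file at {core}")
        return False
    # Send "bt" then "quit" on stdin so gdb prints the backtrace and exits.
    subprocess.run(gdb_command(binary, core), input="bt\nquit\n",
                   text=True, check=False)
    return True
```

Calling print_backtrace() with no arguments uses the two paths above;
pass other paths if the binary or the core file lives elsewhere.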

> Unfortunately I don't have a kernel dump, but checking the logs I see
> 3 minutes before the lockup:
> [nslcd] [warning] [d802da] <passwd="someuser"> ldap_start_tls_s()
> failed (uri=ldap://ldap1.linguamatics.com): Can't contact LDAP server:
> Bad file descriptor
> [nslcd] [warning] [d802da] <passwd="someuser"> failed to bind to LDAP
> server ldap://ldap1.linguamatics.com: Can't contact LDAP server: Bad
> file descriptor
> [nslcd] [info] [d802da] <passwd="someuser"> connected to LDAP server
> ldap://ldap2.linguamatics.com
> This may or may not be connected, but I can't see these messages for a
> long time back in history.
>
Might be related. The gssd will do getpwnam() to create a uid/gid list
for a user principal name.
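
To check by hand whether such a lookup still works while LDAP is
misbehaving, something like the sketch below may help. It is not the
gssd code, only an illustration of the same kind of name service lookup
through Python's pwd module; the function name is made up, and
"someuser" is the placeholder name from the log lines above.

```python
import os
import pwd


def uid_gid_list(name):
    """Return (uid, [gids]) for a user name, or None if the lookup fails."""
    try:
        # Goes through the system name service switch, so LDAP may be asked.
        entry = pwd.getpwnam(name)
    except KeyError:
        return None
    # Primary gid plus every supplementary group the user belongs to.
    gids = os.getgrouplist(name, entry.pw_gid)
    return entry.pw_uid, gids


# Example: print(uid_gid_list("someuser"))
```

A result of None, or a long delay before the call returns, would suggest
the lookup path is worth a closer look.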

> Anyway there is some bug around gssd, because it died.
> I don't know if this is a reproducible bug or not yet.
>
> How can be gssd monitored on a production system to figure out the
> reason for death?
>
If there is no core dump, hopefully the r244331 patch will result in
a message in /var/log/messages.
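
For simple monitoring in the meantime, a small check along these lines
could be run periodically. It is only a sketch: it assumes pgrep is
available, it treats any line of /var/log/messages containing the string
"gssd" as relevant, and the function names are made up.

```python
import subprocess

LOG_PATH = "/var/log/messages"  # log file named in this message


def gssd_lines(lines):
    """Keep only the log lines that mention gssd."""
    return [line.rstrip("\n") for line in lines if "gssd" in line]


def gssd_running():
    """True or False from 'pgrep -x gssd'; None if pgrep is unavailable."""
    try:
        result = subprocess.run(["pgrep", "-x", "gssd"],
                                stdout=subprocess.DEVNULL)
    except FileNotFoundError:
        return None
    return result.returncode == 0


def report(log_path=LOG_PATH):
    """Print whether gssd is running, then any gssd lines from the log."""
    print("gssd running:", gssd_running())
    try:
        with open(log_path, errors="replace") as log:
            for line in gssd_lines(log):
                print(line)
    except OSError as exc:
        print(f"cannot read {log_path}: {exc}")
```

Calling report() prints the process state followed by the matching log
lines, which should include the syslog() message once the r244331 change
is in place.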

Please let us know if you figure out more about why the gssd died.

Good luck with it, rick

> Attila
>
> --
> Attila Bogár
> Systems Administrator
> Linguamatics - Cambridge, UK
> http://www.linguamatics.com/