Date:      Wed, 5 Sep 2012 18:29:02 -0400 (EDT)
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        Attila Bogár <attila.bogar@linguamatics.com>
Cc:        freebsd-fs@FreeBSD.org
Subject:   Re: NFS: rpcsec_gss with Linux clients
Message-ID:  <87865688.117574.1346884142683.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <50475A81.2040105@linguamatics.com>

Attila Bogar wrote:
> Hi Rick,
>
> On 02/09/12 00:57, Rick Macklem wrote:
> > This certainly sounds bogus. I can see an argument for 2 TCP
> > connections for trunking, but since a security context should only
> > be
> > destroyed when the client is done with it, doing a DESTROY doesn't
> > make sense? (There is something in the RPC header called a "handle".
> > It identifies the security context, and it would be nice to check
> > the
> > wireshark trace to see if it is the same as the one being used on the
> > other connection?)
> TCP0: -> Linux NFS AUTH_NULL
> TCP0: <- FreeBSD responds
>
> TCP1: -> Linux sends RPCSEC_GSS_INIT
> TCP1: <- FreeBSD responds by establishing GSS Context (it's a 16 byte
> token)
>
> TCP1: -> Linux sends RPCSEC_GSS_DESTROY using the received 16 byte
> token
As I mentioned before, this makes no sense.

> TCP0: -> Linux sends NFS:PUTROOTFS|GETATTR using the same 16 byte
> received gss context token
>
Btw, a GSSAPI token is much larger than 16 bytes. The 16-byte entry is
the RPCSEC_GSS handle (shorthand) for the security context created via
the RPCSEC_GSS_INIT. This shorthand is known to the client and server
and both should know the session key used with this shorthand handle.
This shorthand handle can be used by the client until either it does a
RPCSEC_GSS_DESTROY of it or the server side replies with an authentication
error that tells the client the handle is no longer usable. This latter
case can occur when the handle falls out of the server side cache or
hits its time limit.
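
As a rough illustration of the handle lifecycle described above, here is a
minimal C sketch of a server-side handle cache. It is not the FreeBSD code:
the structure, the function names ctx_insert() and ctx_lookup(), and the
cache size are hypothetical, and the auth error value is quoted from memory
of RFC 2203, so check it against the RFC before relying on it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* auth_stat value as recalled from RFC 2203; verify before relying on it. */
enum { RPCSEC_GSS_CREDPROBLEM = 13 };

#define HANDLE_LEN  16  /* size seen in this trace, not fixed by the protocol */
#define CACHE_SLOTS 8   /* hypothetical cache size */

/* Hypothetical server-side cache entry; not the FreeBSD structure. */
struct ctx_entry {
	int     in_use;
	uint8_t handle[HANDLE_LEN];
	int64_t expires;        /* absolute time, in seconds */
};

static struct ctx_entry cache[CACHE_SLOTS];

/* Remember a handle created by RPCSEC_GSS_INIT. Returns 0, or -1 if full. */
int
ctx_insert(const uint8_t *handle, int64_t expires)
{
	assert(handle != NULL);
	for (int i = 0; i < CACHE_SLOTS; i++) {
		if (!cache[i].in_use) {
			cache[i].in_use = 1;
			memcpy(cache[i].handle, handle, HANDLE_LEN);
			cache[i].expires = expires;
			return (0);
		}
	}
	return (-1);
}

/* Returns 0 if the handle is usable, else an auth error for the client. */
int
ctx_lookup(const uint8_t *handle, int64_t now)
{
	assert(handle != NULL);
	for (int i = 0; i < CACHE_SLOTS; i++) {
		if (!cache[i].in_use ||
		    memcmp(cache[i].handle, handle, HANDLE_LEN) != 0)
			continue;
		if (now >= cache[i].expires) {
			cache[i].in_use = 0;    /* hit its time limit */
			break;
		}
		return (0);
	}
	return (RPCSEC_GSS_CREDPROBLEM);        /* unknown or expired handle */
}
```

A handle that was never inserted, one that was evicted, and one that expired
all look the same to the client here: it gets an authentication error and is
expected to do a fresh RPCSEC_GSS_INIT.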

> >> I don't quite know why, but during the destroy within the
> >> svc_rpc_gss_validate() gss_verify_mic() returns maj_stat =
> >> GSS_S_DEFECTIVE_TOKEN, no matter what heimdal version I use.
> >>
> > That would indicate the encrypted checksum isn't correct. It
> > might be using an algorithm only supported by the newer
> > RPCSEC_GSS_V3?
> It's RPC version 2, GSS version: 1
>
Yea, there's already a spec (or at least a draft of one) for V3 as
well. It's hard to keep up with the spec writers.

> > For DESTROY when it will fail, I'm not sure if marking the
> > context stale makes sense. (I can see an argument for and against
> > doing this.)
> If you "know" other people's context by snooping the wire, you can
> invalidate their client entry on the nfs server by sending a corrupted
> (corrupted, because you don't know their keytab) RPCSEC_GSS_DESTROY
> message.
> I suspect an attacker can force the kerberos clients to re-establish
> the
> security context again and again.
Yes, I think it could be used this way by a bad guy. I tend to resist
mentioning such things on public mailing lists, but since you have done so...;-)
The best protection is not allowing the "world" to do NFS mounts, even
when Kerberos is used.

> I'm not sure this statistically can lead to any advantage breaking the
> keys, kerberos experts may answer this.
>
>
> > I've attached a small patch which disables setting client->cl_state
> > to CLIENT_STALE for this case, which you could try, to see if it
> > helps?
> I'll look at it.
>
Yes, please test it. It seems to have fixed the problem for the other
person reporting a similar issue and I believe it is safe to commit.
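
The policy being tested can be sketched in C as follows. This is not the
attached patch, only an illustration of the idea: the names cl_state and
CLIENT_STALE come from the discussion above, while the enum layout, the
struct, and handle_destroy() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of client states; only CLIENT_STALE is from the thread. */
enum cl_state { CLIENT_ESTABLISHED, CLIENT_STALE };

struct client {
	enum cl_state cl_state;
};

/*
 * Handle an RPCSEC_GSS_DESTROY request. mic_ok says whether the checksum on
 * the request verified against the context's session key.
 * Returns true if the context was retired.
 */
bool
handle_destroy(struct client *client, bool mic_ok)
{
	assert(client != NULL);
	if (!mic_ok)
		return (false);         /* unverified DESTROY: leave context alone */
	client->cl_state = CLIENT_STALE;
	return (true);
}
```

With this behaviour, a DESTROY from someone who does not hold the session
key cannot invalidate another client's context. The cost is that a buggy
client's context stays cached until it expires.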

> > I'd suggest contacting the Linux folks first and see if they are
> > willing to look at the wireshark trace or know of an issue/fix,
> > because it really sounds like a Linux client issue.
>
> > Waiting 4 minutes instead of 5 shouldn't have any real effect,
> > although it might avoid the problem for your case w.r.t. timing.
>
> This is an intuition to test a fix for another bug. I noticed, that
> when my users need long file access, they get a permission denied
> error
> at the gss key change time, which is very annoying after the program
> having run for multiple hours.
>
The client should try and get a new security handle via RPCSEC_GSS_INIT
when the old handle times out (or falls out of the server side handle
cache). If the client causes syscalls to fail at this point (assuming
the user in the client has a valid TGT), I would call that a client
side error, too. (If your Kerberos system is issuing renewable tickets,
running a daemon like krenew that renews TGT tickets before they expire,
is needed for long running apps.)
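
The client behaviour described above (re-establish the context, then retry,
instead of failing the syscall) might look roughly like this in C. Everything
here is a hypothetical sketch: the ops structure, call_with_reinit(), and the
fake_* helpers stand in for real network code, and the auth error value is
quoted from memory of RFC 2203.

```c
#include <assert.h>
#include <stddef.h>

enum { RPC_OK = 0, RPCSEC_GSS_CREDPROBLEM = 13 };

/* Hypothetical client hooks; a real client would do network I/O here. */
struct gss_client_ops {
	int (*call)(void *state);          /* send the RPC with the current handle */
	int (*init_context)(void *state);  /* RPCSEC_GSS_INIT; 0 on success */
};

/* Issue an RPC; if the server rejects the handle, get a new one and retry once. */
int
call_with_reinit(const struct gss_client_ops *ops, void *state)
{
	int rc;

	assert(ops != NULL);
	rc = ops->call(state);
	if (rc != RPCSEC_GSS_CREDPROBLEM)
		return (rc);
	if (ops->init_context(state) != 0)
		return (rc);            /* e.g. the user has no valid TGT */
	return (ops->call(state));      /* retry once with the new handle */
}

/* Fake server state, for demonstration only. */
struct fake_state {
	int handle_valid;
	int inits;
};

int
fake_call(void *state)
{
	struct fake_state *s = state;

	return (s->handle_valid ? RPC_OK : RPCSEC_GSS_CREDPROBLEM);
}

int
fake_init(void *state)
{
	struct fake_state *s = state;

	s->handle_valid = 1;
	s->inits++;
	return (0);
}
```

Only when init_context fails, for example because the TGT has expired and
was not renewed, does the error reach the caller.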

Although the patch to support host based initiator credentials in a keytab
is not in head, once it is applied, the FreeBSD client optionally allows
one of those to be used for client credentials and that avoids the hassle
of TGT expiration for long running apps. (I don't think the Linux client
has support for such a thing, so use of a non-Kerberized mount is probably
preferred for long running apps?)

> > This time is usually the TGT lifetime (12->24hrs), so subtracting
> > 12 sec from it doesn't really make any sense. (I will note that
> > the calculation of cred_lifetime for the GSS_C_INDEFINITE case
> > looks incorrect, since time_uptime gets added twice, but I doubt
> > that's relevant to your problem, since it is set to more than
> > 24hrs.)
> The rpcsec timestamp is valid, so this passes this layer. But when
> it's
> actually handled by the NFS layer, how can this permission denied come
> into the picture? Is there another GSS timestamp check on the upper
> level?
>
The timeout is passed to the server in the GSSAPI token that is a
part of the RPCSEC_GSS_INIT request. That is remembered by the server
side and normally used, except for the ..INDEFINITE case. The timeout
is up to the client, but is usually the time when the TGT used to get
the GSSAPI token for the RPCSEC_GSS_INIT request times out. (Simple, eh:-)
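
A minimal C sketch of turning that timeout into an absolute expiry time,
including the ..INDEFINITE case, is shown below. It is not the FreeBSD
calculation: ctx_expiry() and DEFAULT_LIFETIME are hypothetical, and the
fallback value is only assumed to be "more than 24hrs" as noted in the
quoted text. The point is that the current time is added exactly once on
both paths.

```c
#include <assert.h>
#include <stdint.h>

#ifndef GSS_C_INDEFINITE
#define GSS_C_INDEFINITE 0xffffffffUL   /* value from the GSS-API C bindings */
#endif

/* Hypothetical fallback lifetime in seconds: a little over 24 hours. */
#define DEFAULT_LIFETIME (25 * 60 * 60)

/*
 * now:      current uptime in seconds
 * lifetime: seconds remaining, as reported by the GSS-API for the context
 * Returns the absolute uptime at which the context should be discarded.
 */
int64_t
ctx_expiry(int64_t now, uint32_t lifetime)
{
	if (lifetime == GSS_C_INDEFINITE)
		return (now + DEFAULT_LIFETIME);  /* add "now" only once */
	return (now + (int64_t)lifetime);
}
```

For example, with an uptime of 1000 seconds and a reported lifetime of 3600
seconds, the context expires at uptime 4600.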

>
> > /*
> > * Fill in cred details in the rawcred structure.
> > @@ -990,7 +995,7 @@
> > gss_buffer_desc rpcbuf, checksum;
> > OM_uint32 maj_stat, min_stat;
> > gss_qop_t qop_state;
> > - int32_t rpchdr[128 / sizeof(int32_t)];
> > + int32_t rpchdr[2048 / sizeof(int32_t)];
> > int32_t *buf;
> Note, that I changed the buffer from 128 bytes to 2048 bytes. This is
> as per PR 162009, which is also hanging around.
> I think checking the code for the RPC 128 byte buffers would be nice
> for
> security and other reasons.
>
I'll take a look at this. At this time, I don't know what rpchdr[] is used
for.

> I'm going to send an email to the linux-nfs@ to find out what's going
> on
> this area - maybe this has been already fixed, as I use some old EL6
> and
> Ubuntu 12.04 flavours.
> However dropping the sec context even with failed kerberos ticket
> seems
> like a FreeBSD bug.
>
Yes, I think I agree. You noted a possible DoS attack that a bogus client
could do if the context is deleted for this case. The alternate argument
is that contexts will be left around for a long time when a buggy client
does the RPCSEC_GSS_DESTROY at the correct time, but uses the wrong session
key and/or encrypted checksum alg, so VerifyMIC() fails on it.

It seems that the current Linux client bug causes the RPCSEC_GSS_DESTROY to
happen at the wrong time with the wrong session key and/or encrypted
checksum alg.

rick

> Kind regards,
> Attila
>
> --
> Attila Bogár
> Systems Administrator
> Linguamatics - Cambridge, UK
> http://www.linguamatics.com/


