Date:      Thu, 4 Oct 2012 07:52:59 +0200
From:      Gomes do Vale Victor <ulysse31@gmail.com>
To:        Rick Macklem <rmacklem@uoguelph.ca>
Cc:        "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject:   Re: nfsv4 kerberized and gssname=root and allgsname
Message-ID:  <836B0731-DC60-40DF-8D9E-ADB9D3FD5AB5@gmail.com>
In-Reply-To: <1483416316.1685354.1349303741302.JavaMail.root@erie.cs.uoguelph.ca>
References:  <1483416316.1685354.1349303741302.JavaMail.root@erie.cs.uoguelph.ca>

On 4 Oct 2012, at 00:35, Rick Macklem <rmacklem@uoguelph.ca> wrote:

> Ulysse 31 wrote:
>> 2012/9/29 Rick Macklem <rmacklem@uoguelph.ca>:
>>> Ulysse 31 wrote:
>>>> Hi all,
>>>>
>>>> I am actually working on a freebsd 9 backup server.
>>>> this server would backup the production server via kerberized nfs4
>>>> (since the old backup server, a linux one, was doing so).
>>>> we used on the old backup server a root/<fqdn> kerberos identity,
>>>> which allows the backup server to access all the data.
>>>> I have followed the documentation found at :
>>>>
>>>> http://code.google.com/p/macnfsv4/wiki/FreeBSD8KerberizedNFSSetup
>>>>
>>>> done :
>>>> - added to kernel :
>>>>
>>>> options KGSSAPI
>>>> device crypto
>>>>
>>>> - added to rc.conf :
>>>>
>>>> nfs_client_enable="YES"
>>>> rpc_lockd_enable="YES"
>>>> rpc_statd_enable="YES"
>>>> rpcbind_enable="YES"
>>>> devfs_enable="YES"
>>>> gssd_enable="YES"
>>>>
>>>> - have done sysctl vfs.rpcsec.keytab_enctype=1 and added it to
>>>> /etc/sysctl.conf
>>>>
>>>> We used MIT kerberos implementation, since it is the one used on
>>>> all
>>>> our servers (mostly linux), and we have created and
>>>> /etc/krb5.keytab
>>>> containing the following keys :
>>>> host/<fqdn>
>>>> nfs/<fqdn>
>>>> root/<fqdn>
>>>>
>>>> and, of course, i have used the available patch at :
>>>> http://people.freebsd.org/~rmacklem/rpcsec_gss-9.patch
>>>>
>>>> When i try to mount with the (B) method (the one of the google
>>>> wiki),
>>>> it works as expected, i mean, with a correct user credential, i can
>>>> access to the user data.
>>>> But, when i try to access via the (C) method (the one that i need
>>>> in
>>>> order to do a full backup of the production storage server) i get a
>>>> systematic kernel panic when launch the mount command.
>>>> The mount command looks to something like : mount -t nfs -o
>>>> nfsv4,sec=krb5i,gssname=root,allgssname <production server
>>>> fqdn>:<export_path> <local_path_where_to_mount>
> Just to confirm it, you are saying that exactly the same mount command,
> except without the "allgssname" option, doesn't crash?

No, in fact it's the same command with gssname=nfs instead of
gssname=root that does not crash. When I specify gssname=root, it
panics.
With gssname=nfs and allgssname together, the same command "works", or
rather it mounts and doesn't crash, but it does not allow root access
to the NFS share, since the NetApp expects a root/<fqdn> key to be used
for that.
I don't know if this gives you a hint. I'm going to test this patch;
tell me if you have other ideas.
For now, we have decided to disable kerberized NFS on the new FreeBSD
backup server so that we can put it into production without falling
behind schedule.
Thanks for the help.
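
For reference, the variants discussed above can be laid out side by
side. This is only a sketch: the helper names build_opts and show_mount
and the SERVER, EXPORT and MNT placeholder values are made up for
illustration, and the script prints the commands instead of running
them.

```shell
#!/bin/sh
# Placeholders only; none of these values come from the real setup.
SERVER="${SERVER:-filer.example.org}"
EXPORT="${EXPORT:-/vol/export}"
MNT="${MNT:-/mnt/backup}"

# build_opts GSSNAME [all]
# Builds the -o option string; "all" appends allgssname.
build_opts() {
    opts="nfsv4,sec=krb5i,gssname=$1"
    if [ "$2" = "all" ]; then
        opts="$opts,allgssname"
    fi
    printf '%s\n' "$opts"
}

# show_mount GSSNAME [all]
# Prints the mount command line without executing it.
show_mount() {
    printf 'mount -t nfs -o %s %s:%s %s\n' \
        "$(build_opts "$1" "$2")" "$SERVER" "$EXPORT" "$MNT"
}

show_mount nfs ""     # reported above: mounts, no crash
show_mount nfs all    # reported above: mounts, but no root access
show_mount root all   # reported above: panics in rpc_gss_init
```

Only the gssname value differs between the mount that works and the one
that panics, which is why the comparison is narrowed to that option.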

>
> That is weird, since when I look at the code, there shouldn't be any
> difference between the two mounts, up to the point where it crashes.
>
> The crash seems to indicate that nr_auth is bogus, but I can't see
> how/why that would happen.
>
> I have attached a patch which changes the way nr_auth is set and "might"
> help, although I doubt it. (It is untested, but if you want to try it,
> good luck with it.)
>
> I'll email again if I get something more solid figured out, rick
>
>>>> I have activated the kernel debugging stuff to get some infos, here
>>>> is
>>>> the message :
>>>>
>>>>
>>>> Fatal trap 12: page fault while in kernel mode
>>>> cpuid = 0; apic id = 00
>>>> fault virtual address = 0x368
>>>> fault code = supervisor read data, page not present
>>>> instruction pointer = 0x20:0xffffffff80866ab7
>>>> stack pointer = 0x28:0xffffff804aa39ce0
>>>> frame pointer = 0x28:0xffffff804aa39d30
>>>> code segment = base 0x0, limit 0xfffff, type 0x1b
>>>> = DPL 0, pres 1, long 1, def32 0, gran 1
>>>> processor eflags = interrupt enabled, resume, IOPL = 0
>>>> current process = 701 (mount_nfs)
>>>> trap number = 12
>>>> panic: page fault
>>>> cpuid = 0
>>>> KDB: stack backtrace:
>>>> #0 0xffffffff808ae486 at kdb_backtrace+0x66
>>>> #1 0xffffffff8087885e at panic+0x1ce
>>>> #2 0xffffffff80b82380 at trap_fatal+0x290
>>>> #3 0xffffffff80b826b8 at trap_pfault+0x1e8
>>>> #4 0xffffffff80b82cbe at trap+0x3be
>>>> #5 0xffffffff80b6c57f at calltrap+0x8
>>>> #6 0xffffffff80a78eda at rpc_gss_init+0x72a
>>>> #7 0xffffffff80a79cd6 at rpc_gss_refresh_auth+0x46
>>>> #8 0xffffffff807a5a53 at newnfs_request+0x163
>>>> #9 0xffffffff807bf0f7 at nfsrpc_getattrnovp+0xd7
>>>> #10 0xffffffff807d9b29 at mountnfs+0x4e9
>>>> #11 0xffffffff807db60a at nfs_mount+0x13ba
>>>> #12 0xffffffff809068fb at vfs_donmount+0x100b
>>>> #13 0xffffffff80907086 at sys_nmount+0x66
>>>> #14 0xffffffff80b81c60 at amd64_syscall+0x540
>>>> #15 0xffffffff80b6c867 at Xfast_syscall+0xf7
>>>> Uptime: 2m31s
>>>> Dumping 97 out of 1002 MB:..17%..33%..50%..66%..83%..99%
>>>>
>>>> -----------------------------------------------------------------------
>>>>
>>>> Does anyone as experience something similar ? is their a way to
>>>> correct that ?
>>>> Thanks for the help.
>>>>
>>> Well, you're probably the first person to try doing this in years. I
>>> did
>>> have it working about 4-5years ago. Welcome to the bleeding edge;-)
>>>
>>> Could you do the following w.r.t. above kernel:
>>> # cd /boot/nkernel (or wherever the kernel lives)
>>> # nm kernel | grep rpc_gss_init
>>> - add the offset 0x72a to the address for rpc_gss_init
>>> # addr2line -e kernel.symbols
>>> 0xXXX - the hex number above (address of rpc_gss_init+0x72a)
>>> - email me what it prints out, so I know where the crash is
>>> occurring
>>>
>>> You could also run the following command on the Linux server to
>>> capture
>>> packets during the mount attempt, then email me the xxx.pcap file so
>>> I
>>> can look at it in wireshark, to see what is happening before the
>>> crash.
>>> (I'm guessing nr_auth is somehow bogus, but that's just a guess.:-)
>>> # tcpdump -s 0 -w xxx.pcap host <freebsd-client>
>>
>> Hi,
>>
>> Sorry for the delay i was on travel and no working network connection.
>> Back online for the rest of the week ^^.
>> Thanks for your help, here is what it prints out :
>>
>> root@bsdenc:/boot/kernel # nm kernel | grep rpc_gss_init
>> ffffffff80df07b0 r __set_sysinit_set_sym_svc_rpc_gss_init_sys_init
>> ffffffff80a787b0 t rpc_gss_init
>> ffffffff80a7a580 t svc_rpc_gss_init
>> ffffffff81127530 d svc_rpc_gss_init_sys_init
>> ffffffff80a7a3b0 T xdr_rpc_gss_init_res
>> root@bsdenc:/boot/kernel # addr2line -e kernel.symbols
>> 0xffffffff80a78eda
>> /usr/src/sys/rpc/rpcsec_gss/rpcsec_gss.c:772
>>
>>
>> for the tcpdump from the linux server, i think you may are doing
>> reference to the production nfs server ?
>> if yes, unfortunately it is not linux, it is a netapp filer, so no
>> "real" root access on it (so no tcpdump available :s ).
>> if you were mentioning the old backup server (which is linux but nfs
>> client), i cannot do unmount/mount on it since its production
>> (mountpoint always busy), but i can made a quick VM/testmachine that
>> acts like the linux backup server and do a tcpdump from it.
>> Just let me know. Thanks again.
>>
>> --
>> Ulysse31
>>
>>>
>>> rick
>>>
>>>> --
>>>> Ulysse31
>>>> _______________________________________________
>>>> freebsd-fs@freebsd.org mailing list
>>>> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
>>>> To unsubscribe, send any mail to
>>>> "freebsd-fs-unsubscribe@freebsd.org"
> <rpcsec-crash.patch>
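
As a cross-check of the address arithmetic quoted above: nm shows
rpc_gss_init at ffffffff80a787b0 and the backtrace reports
rpc_gss_init+0x72a, so the sum should be the address that was passed to
addr2line. A minimal sketch, assuming a POSIX sh; the function name
add_offset is hypothetical, and it adds only within the low 32 bits to
avoid relying on unsigned 64-bit shell arithmetic, so it refuses a carry
rather than handling it.

```shell
#!/bin/sh
# add_offset SYMBOL_ADDR OFFSET
# Both arguments are hex without a 0x prefix; SYMBOL_ADDR is 16 digits.
add_offset() {
    hi=${1%????????}     # upper 32 bits, kept as text
    lo=${1#????????}     # lower 32 bits, used for the addition
    sum=$(( 0x$lo + 0x$2 ))
    if [ "$sum" -gt $(( 0xffffffff )) ]; then
        echo "carry into upper 32 bits not handled" >&2
        return 1
    fi
    printf '0x%s%08x\n' "$hi" "$sum"
}

# Symbol address from the nm output, offset from backtrace frame #6.
add_offset ffffffff80a787b0 72a
```

This prints 0xffffffff80a78eda, the same value as frame #6 of the
backtrace and the address given to addr2line.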


