From: Rick Macklem
To: Gomes do Vale Victor
Cc: freebsd-fs@freebsd.org
Date: Thu, 4 Oct 2012 09:26:44 -0400 (EDT)
Subject: Re: nfsv4 kerberized and gssname=root and allgsname
Message-ID: <1625458573.1710053.1349357204427.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <836B0731-DC60-40DF-8D9E-ADB9D3FD5AB5@gmail.com>

Gomes do Vale Victor wrote:
> On 4 Oct 2012 at 00:35, Rick Macklem wrote:
>
> > Ulysse 31 wrote:
> >> 2012/9/29 Rick Macklem :
> >>> Ulysse 31 wrote:
> >>>> Hi all,
> >>>>
> >>>> I am currently working on a FreeBSD 9 backup server.
> >>>> This server would back up the production server via kerberized
> >>>> nfs4 (since the old backup server, a linux one, was doing so).
> >>>> On the old backup server we used a root/ kerberos identity,
> >>>> which allows the backup server to access all the data.
> >>>> I have followed the documentation found at :
> >>>>
> >>>> http://code.google.com/p/macnfsv4/wiki/FreeBSD8KerberizedNFSSetup
> >>>>
> >>>> done :
> >>>> - added to kernel :
> >>>>
> >>>> options KGSSAPI
> >>>> device crypto
> >>>>
> >>>> - added to rc.conf :
> >>>>
> >>>> nfs_client_enable="YES"
> >>>> rpc_lockd_enable="YES"
> >>>> rpc_statd_enable="YES"
> >>>> rpcbind_enable="YES"
> >>>> devfs_enable="YES"
> >>>> gssd_enable="YES"
> >>>>
> >>>> - have done sysctl vfs.rpcsec.keytab_enctype=1 and added it to
> >>>> /etc/sysctl.conf
> >>>>
> >>>> We used the MIT kerberos implementation, since it is the one used
> >>>> on all our servers (mostly linux), and we have created an
> >>>> /etc/krb5.keytab containing the following keys :
> >>>> host/
> >>>> nfs/
> >>>> root/
> >>>>
> >>>> and, of course, I have used the available patch at :
> >>>> http://people.freebsd.org/~rmacklem/rpcsec_gss-9.patch
> >>>>
> >>>> When I try to mount with the (B) method (the one of the google
> >>>> wiki), it works as expected, I mean, with a correct user
> >>>> credential, I can access the user data.
> >>>> But, when I try to access via the (C) method (the one that I need
> >>>> in order to do a full backup of the production storage server) I
> >>>> get a systematic kernel panic when I launch the mount command.
> >>>> The mount command looks something like : mount -t nfs -o
> >>>> nfsv4,sec=krb5i,gssname=root,allgssname
> >>>> fqdn>:
> > Just to confirm it, you are saying that exactly the same mount
> > command, except without the "allgssname" option, doesn't crash?
>
> No, in fact it's the same command with gssname=nfs instead of
> gssname=root that does not crash. When I specify gssname=root it
> panics.
> The same command with gssname=nfs and allgssname together "works",
> well, I should say it mounts and doesn't crash, because it does not
> allow accessing the nfs share as root, since the netapp expects a
> root/fqdn key to be used for that.
> Don't know if this would give you a hint, I'm gonna test this patch.
> Tell me if you have other ideas.
Well, although it doesn't "fix" whatever the bug is, you could try a
/etc/krb5.keytab file with only the "root/fqdn@realm" entry in it.
(That's the way I used to create them.)

> For now we decided to disable kerberised nfs on the new FreeBSD backup
> server in order to go into production with it without delay.
> Thanks for the help.
>
> >
> > That is weird, since when I look at the code, there shouldn't be any
> > difference between the two mounts, up to the point where it crashes.
> >
> > The crash seems to indicate that nr_auth is bogus, but I can't see
> > how/why that would happen.
> >
> > I have attached a patch which changes the way nr_auth is set and
> > "might" help, although I doubt it. (It is untested, but if you want
> > to try it, good luck with it.)
> >
> > I'll email again if I get something more solid figured out, rick
> >
> >>>> I have activated the kernel debugging stuff to get some info,
> >>>> here is the message :
> >>>>
> >>>>
> >>>> Fatal trap 12: page fault while in kernel mode
> >>>> cpuid = 0; apic id = 00
> >>>> fault virtual address = 0x368
> >>>> fault code = supervisor read data, page not present
> >>>> instruction pointer = 0x20:0xffffffff80866ab7
> >>>> stack pointer = 0x28:0xffffff804aa39ce0
> >>>> frame pointer = 0x28:0xffffff804aa39d30
> >>>> code segment = base 0x0, limit 0xfffff, type 0x1b
> >>>> = DPL 0, pres 1, long 1, def32 0, gran 1
> >>>> processor eflags = interrupt enabled, resume, IOPL = 0
> >>>> current process = 701 (mount_nfs)
> >>>> trap number = 12
> >>>> panic: page fault
> >>>> cpuid = 0
> >>>> KDB: stack backtrace:
> >>>> #0 0xffffffff808ae486 at kdb_backtrace+0x66
> >>>> #1 0xffffffff8087885e at panic+0x1ce
> >>>> #2 0xffffffff80b82380 at trap_fatal+0x290
> >>>> #3 0xffffffff80b826b8 at trap_pfault+0x1e8
> >>>> #4 0xffffffff80b82cbe at trap+0x3be
> >>>> #5 0xffffffff80b6c57f at calltrap+0x8
> >>>> #6 0xffffffff80a78eda at rpc_gss_init+0x72a
> >>>> #7 0xffffffff80a79cd6 at rpc_gss_refresh_auth+0x46
> >>>> #8 0xffffffff807a5a53 at newnfs_request+0x163
> >>>> #9 0xffffffff807bf0f7 at nfsrpc_getattrnovp+0xd7
> >>>> #10 0xffffffff807d9b29 at mountnfs+0x4e9
> >>>> #11 0xffffffff807db60a at nfs_mount+0x13ba
> >>>> #12 0xffffffff809068fb at vfs_donmount+0x100b
> >>>> #13 0xffffffff80907086 at sys_nmount+0x66
> >>>> #14 0xffffffff80b81c60 at amd64_syscall+0x540
> >>>> #15 0xffffffff80b6c867 at Xfast_syscall+0xf7
> >>>> Uptime: 2m31s
> >>>> Dumping 97 out of 1002 MB:..17%..33%..50%..66%..83%..99%
> >>>>
> >>>> ------------------------------------------------------------------------
> >>>>
> >>>> Has anyone experienced something similar ? Is there a way to
> >>>> correct that ?
> >>>> Thanks for the help.
> >>>>
> >>> Well, you're probably the first person to try doing this in years.
> >>> I did have it working about 4-5 years ago. Welcome to the bleeding
> >>> edge;-)
> >>>
> >>> Could you do the following w.r.t. above kernel:
> >>> # cd /boot/nkernel (or wherever the kernel lives)
> >>> # nm kernel | grep rpc_gss_init
> >>> - add the offset 0x72a to the address for rpc_gss_init
> >>> # addr2line -e kernel.symbols
> >>> 0xXXX - the hex number above (address of rpc_gss_init+0x72a)
> >>> - email me what it prints out, so I know where the crash is
> >>> occurring
> >>>
> >>> You could also run the following command on the Linux server to
> >>> capture packets during the mount attempt, then email me the
> >>> xxx.pcap file so I can look at it in wireshark, to see what is
> >>> happening before the crash.
> >>> (I'm guessing nr_auth is somehow bogus, but that's just a
> >>> guess.:-)
> >>> # tcpdump -s 0 -w xxx.pcap host
> >>
> >> Hi,
> >>
> >> Sorry for the delay, I was travelling with no working network
> >> connection.
> >> Back online for the rest of the week ^^.
> >> Thanks for your help, here is what it prints out :
> >>
> >> root@bsdenc:/boot/kernel # nm kernel | grep rpc_gss_init
> >> ffffffff80df07b0 r __set_sysinit_set_sym_svc_rpc_gss_init_sys_init
> >> ffffffff80a787b0 t rpc_gss_init
> >> ffffffff80a7a580 t svc_rpc_gss_init
> >> ffffffff81127530 d svc_rpc_gss_init_sys_init
> >> ffffffff80a7a3b0 T xdr_rpc_gss_init_res
> >> root@bsdenc:/boot/kernel # addr2line -e kernel.symbols
> >> 0xffffffff80a78eda
> >> /usr/src/sys/rpc/rpcsec_gss/rpcsec_gss.c:772
> >>
> >>
> >> For the tcpdump from the linux server, I think you may be
> >> referring to the production nfs server ?
> >> If yes, unfortunately it is not linux, it is a netapp filer, so no
> >> "real" root access on it (so no tcpdump available :s ).
> >> If you were mentioning the old backup server (which is linux but
> >> an nfs client), I cannot do unmount/mount on it since it's in
> >> production (mountpoint always busy), but I can make a quick
> >> VM/testmachine that acts like the linux backup server and do a
> >> tcpdump from it.
> >> Just let me know. Thanks again.
> >>
> >> --
> >> Ulysse31
> >>
> >>>
> >>> rick
> >>>
> >>>> --
> >>>> Ulysse31
> >>>> _______________________________________________
> >>>> freebsd-fs@freebsd.org mailing list
> >>>> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> >>>> To unsubscribe, send any mail to
> >>>> "freebsd-fs-unsubscribe@freebsd.org"
> >
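
The address arithmetic requested in the thread (take the address that nm reports for rpc_gss_init, add the backtrace offset 0x72a, and hand the result to addr2line) can be sketched in a few lines of Python. This is only an illustration, not something from the thread: the helper names symbol_address and frame_address are hypothetical, and the sample data is the nm output quoted above.

```python
# Sketch: turn a backtrace frame such as "rpc_gss_init+0x72a" into the
# absolute address that addr2line expects, using nm output as the symbol table.
# symbol_address() and frame_address() are hypothetical helper names.

NM_OUTPUT = """\
ffffffff80df07b0 r __set_sysinit_set_sym_svc_rpc_gss_init_sys_init
ffffffff80a787b0 t rpc_gss_init
ffffffff80a7a580 t svc_rpc_gss_init
ffffffff81127530 d svc_rpc_gss_init_sys_init
ffffffff80a7a3b0 T xdr_rpc_gss_init_res
"""


def symbol_address(nm_output: str, symbol: str) -> int:
    """Return the address of an exactly matching symbol name in nm output."""
    for line in nm_output.splitlines():
        fields = line.split()
        # Defined symbols appear as "<hex address> <type> <name>".
        if len(fields) == 3 and fields[2] == symbol:
            return int(fields[0], 16)
    raise KeyError(symbol)


def frame_address(nm_output: str, frame: str) -> int:
    """Resolve "symbol+0xoffset" (or a bare "symbol") to an absolute address."""
    symbol, _, offset = frame.partition("+")
    return symbol_address(nm_output, symbol) + (int(offset, 16) if offset else 0)


if __name__ == "__main__":
    # Frame #6 from the panic backtrace.
    print(hex(frame_address(NM_OUTPUT, "rpc_gss_init+0x72a")))
```

The exact-name match matters here because grep rpc_gss_init also returns svc_rpc_gss_init and other symbols that merely contain the string; the printed value is what would be typed into addr2line -e kernel.symbols.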