From owner-freebsd-hackers@FreeBSD.ORG Mon May 7 14:35:56 2012 Return-Path: Delivered-To: freebsd-hackers@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 667BF106564A; Mon, 7 May 2012 14:35:56 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 0C2B08FC0C; Mon, 7 May 2012 14:35:54 +0000 (UTC) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id RAA06392; Mon, 07 May 2012 17:34:37 +0300 (EEST) (envelope-from avg@FreeBSD.org) Message-ID: <4FA7DD7C.4070703@FreeBSD.org> Date: Mon, 07 May 2012 17:34:36 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:12.0) Gecko/20120503 Thunderbird/12.0.1 MIME-Version: 1.0 To: Bruce Evans References: <4F8999D2.1080902@FreeBSD.org> <4FA29E1B.7040005@FreeBSD.org> <4FA2A307.2090108@FreeBSD.org> <201205041125.15155.jhb@freebsd.org> <4FA4F36A.6030903@FreeBSD.org> <20120505194459.D1295@besplex.bde.org> In-Reply-To: <20120505194459.D1295@besplex.bde.org> X-Enigmail-Version: 1.5pre Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Cc: freebsd-fs@FreeBSD.org, freebsd-hackers@FreeBSD.org, John Baldwin Subject: Re: [review request] zfsboot/zfsloader: support accessing filesystems within a pool X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 07 May 2012 14:35:56 -0000 on 05/05/2012 13:49 Bruce Evans said the following: > On Sat, 5 May 2012, Andriy Gapon wrote: > >> on 04/05/2012 18:25 John Baldwin said the following: >>> On Thursday, May 03, 2012 11:23:51 am Andriy Gapon wrote: >>>> on 03/05/2012 18:02 Andriy Gapon said the following: >>>>> >>>>> Here's the latest version of the patches: >>>>> http://people.freebsd.org/~avg/zfsboot.patches.4.diff >>>> >>>> I've found a couple of problems in the previous version, so here's another one: >>>> http://people.freebsd.org/~avg/zfsboot.patches.5.diff >>>> The important change is in the first patch (__exec args). >>> >>> A few comments/suggestions on the args bits: >> >> John, >> >> these are excellent suggestions! Thank you! >> Some comments: >>> - Add #ifndef LOCORE guards to the new header around the structure so >>> it can be used in assembly as well as C. >> >> Done. I have had to go into a few btx makefiles and add a necessary include >> path and -DLOCORE to make the header usable from asm. Bruce, first a note that the change that we discussed affects (should affect) only BTX code and as such only boot1/2 -> loader interface. > Ugh, why not use genassym, as is done for all old uses of this header in > locore.s, at least on i386 (5% of the i386 genassym.c is for this). Can not parse 'this header' in this context. We were talking about a new header file, so there could not be any old uses of it :-) Probably you meant sys/i386/include/bootinfo.h ? But, as you say later, it's probably not easy to use genassym with sys/boot code. Not sure if it would be worth while going this path given the possible alternatives. >>> - Move BI_SIZE and ARGOFF into the header as constants. >> >> Done. >> >>> - Add a CTASSERT() in loader/main.c that BI_SIZE == sizeof(struct bootinfo) > > Ugh, BI_SIZE was already used in locore.s. OK, but this is "the other" BI_SIZE. Maybe the name clash is not nice indeed, though. > It wasn't the size of the struct, > but was the offset of the field that gives the size. No CTASSERT() was > needed -- the size is whatever it is, as given by sizeof() on the struct > at the time of compilation of the utility that initializes the struct. > It was a feature that sizeof() and offsetof() can't be used in asm so they > must be translated in genassym and no macros are needed in the header (the > size was fully dynamic, so the asm code only needs the offsetof() values). > Of course, you could use CTASSERT()s to check that the struct layout didn't > get broken. The old code just assumes that the struct is packed by the > programmer and that the arch's struct packing conventions don't change, > so that for example BI_SIZE = offsetof(struct bootinfo, bi_size) never > changes. It seems that boot1/2 -> kernel interface and boo1/2 -> {btxldr, btx} -> loader interfaces are quite independent and a bit different. > genassym is hard to use in boot programs, but the old design was that > boot programs shouldn't use bootinfo in asm and should just use the > target bootinfo.h at compile time (whatever time the target is compiled). I am not sure if it is worthwhile adapting genassym to sys/boot... BTX code needs to know only "some size" of bootinfo. Although it doesn't look like boot1/2 passes anything really useful to loader via bootinfo except for bi_bios_dev. For that matter it looks like maybe only two fields from the whole (x86) bootinfo are useful to (x86) kernel either... > Anyway, LOCORE means "for use in locore.[sS]", so other uses of it, e.g. > in boot programs, are bogus. That's a good point. Maybe we should use some more generic name. Maybe there is even some macro that is always set for .S files that we can check. Oh, thank google, is __ASSEMBLER__ it? It seems like couple of non-x86 headers already use this macro. >> I have added a definition of CTASSERT to boostrap.h as it was not available for >> sys/boot and there were two local definitions of the macro in individual files. >> >> However the assertion would fail right now. >> The backward-compatible value of BI_SIZE (72 == 0x48) covers only part of the > > This isn't backwards compatible. BI_SIZE was decimal 48 (covers everything > up to the bi_size field). I meant backward compatible with the BTX code that I was changing, of course. >> fields in struct bootinfo, those up to the following comment: >> /* Items below only from advanced bootloader */ >> I am a little bit hesitant: should I increase BI_SIZE to cover the whole struct >> bootinfo or should I compare BI_SIZE to offsetof bi_kernend? > > Neither. BI_SIZE shouldn't exist. It defeats the bi_size field. Using the bi_size field may be the proper solution indeed. Even if no data beyond certain offset is ever used by loader. The planned changes to BTX code should make using bi_size easier. >>> Maybe >>> create a 'struct kargs_ext' that looks like this: >>> >>> struct kargs_ext { >>> uint32_t size; >>> char data[0]; >>> }; >> >> I've decided to skip on this. > > Use KNF indentation and KNF field prefixes (ka_) if you add it :-). Generic > field names like `size' and `data' need prefixes more than mos. > > The old struct was: > > % #define N_BIOS_GEOM 8 > % ... > % /* > % * A zero bootinfo field often means that there is no info available. > % * Flags are used to indicate the validity of fields where zero is a > % * normal value. > % */ > % struct bootinfo { > % u_int32_t bi_version; > % u_int32_t bi_kernelname; /* represents a char * */ > % u_int32_t bi_nfs_diskless; /* struct nfs_diskless * */ > % /* End of fields that are always present. */ > > The original size was apparently 12. > > % #define bi_endcommon bi_n_bios_used > > Another style difference. The magic 12 is essentially given by this macro. > This macro is a pseudo-field, like the ones for the copyable and zeroable > regions in struct proc. Its name is in lower case. > > % u_int32_t bi_n_bios_used; > % u_int32_t bi_bios_geom[N_BIOS_GEOM]; > > The struct was broken in 1994 by adding the above 2 fields without providing > any way to distinguish it from the old struct. > > % u_int32_t bi_size; > % u_int8_t bi_memsizes_valid; > % u_int8_t bi_bios_dev; /* bootdev BIOS unit number */ > % u_int8_t bi_pad[2]; > % u_int32_t bi_basemem; > % u_int32_t bi_extmem; > % u_int32_t bi_symtab; /* struct symtab * */ > % u_int32_t bi_esymtab; /* struct symtab * */ > > The above 8 fields were added in 1995 (together with fixing style bugs > like no prefixes for the old field names). Now the struct is determined > by its size according to the bi_size field, and the bi_version field is > not really used (it's much easier to add stuff to the end than to support > multiple versions). This gives a range of old sizes/versions: > > 12: ~1993 (FreeBSD-1) > 48: ~1994 (FreeBSD-1 and/or 2) > 0x48: FreeBSD-2 post-1995 > > But these old sizes are uninteresting since only boot loaders from before > 1993-1995 support only the above fields, and these loaders can't boot current > kernels. > > % /* Items below only from advanced bootloader */ > % u_int32_t bi_kernend; /* end of kernel space */ > % u_int32_t bi_envp; /* environment */ > % u_int32_t bi_modulep; /* preloaded modules */ > > Added in 1998. Still uninteresting, since boot loaders newer than that > are needed to boot current kernels (mainly for elf). > > % uint64_t bi_hcdp; /* DIG64 HCDP table */ > % uint64_t bi_fpswa; /* FPSWA interface */ > % uint64_t bi_systab; /* pa of EFI system table */ > % uint64_t bi_memmap; /* pa of EFI memory map */ > % uint64_t bi_memmap_size; /* size of EFI memory map */ > % uint64_t bi_memdesc_size; /* sizeof EFI memory desc */ > % uint32_t bi_memdesc_version; /* EFI memory desc version */ > > Added in 2010. Are all of these uint64_t types correct? The padding seems > to be broken, so that these fields would not work for amd64: we're at offset > 0x48 for bi_kernend. The 3 uint32_t's added in 1998 reach 0x54. Then all > the uint64_t fields are misaligned on i386, and on amd64 there is unnamed > padding before the first of them to align them. But and64 doesn't use > bootinfo.h in the kernel, so I think only the i386 version is used on amd64 > (in the boot loader), so the misaligned case isn't used. Interesting observations. It looks like these newest fields were ported from IA64 for EFI support, but it doesn't look like that support is actually in x86 yet. > The struct declaration is also broken at the end. The last field is 32 bits, > so there is unnamed padding after it on amd64 only. This padding should be > explicit, like the padding before the uint64_t fields, or just put the 32-bit > field before the 64-bit fields. > > % }; > > So apart you could hard-code the size to the 1998 value of 0x54 without > losing anything except the buggy 2010 fields. But it shouldn't be > hard-coded. I am inclined to agree. Thank you again. P.S. Actually I feel like arguing if the genassym approach is totally correct/safe for BI_SIZE. One could easily insert a field before bi_size and thus change BI_SIZE and thus break compatibility with binaries compiled before the change. And all that without getting any hint during compilation. OTOH, if BI_SIZE is explicitly defined to constant and there is a CTASSERT to assert that BI_SIZE == offsetof(..., bi_size), then the chances of unwittingly breaking things are smaller. Of course, something like this would never happen in reality. -- Andriy Gapon