Date: Mon, 9 Feb 2015 09:37:18 -0800 From: Mark Millard <markmi@dsl-only.net> To: Nathan Whitehorn <nwhitehorn@freebsd.org> Cc: FreeBSD PowerPC ML <freebsd-ppc@freebsd.org> Subject: Re: HEADS UP: ... [Better info for __syncicache exceptions at a dcbst: %DAR holding address not covered by PTEG's] Message-ID: <70827670-5645-4445-BCEF-1809F2554F84@dsl-only.net> In-Reply-To: <DA3763F9-748F-4A55-A4E3-47A14649EF26@dsl-only.net> References: <335C8DCD-33DF-4430-A0FA-77669C513C61@dsl-only.net> <449E0C48-B57D-4873-B2E7-BC217D891897@dsl-only.net> <E527514A-96F6-4794-8F03-504E51EC8CCB@dsl-only.net> <DBFD0918-6274-47D3-8B35-5178ABD16728@dsl-only.net> <DA3763F9-748F-4A55-A4E3-47A14649EF26@dsl-only.net>
next in thread | previous in thread | raw e-mail | index | archive | help
objdump -h /boot/kernel10.1S/kernel shows a hole in the memory map = between 25 (kern_conf) and 26 (.data) below: /boot/kernel10.1S/kernel: file format elf64-powerpc Sections: Idx Name Size VMA LMA File off = Algn 0 .text 007e5f84 0000000000100190 0000000000100190 00000190 = 2**4 CONTENTS, ALLOC, LOAD, READONLY, CODE 1 .interp 0000000d 00000000008e6114 00000000008e6114 007e6114 = 2**0 CONTENTS, ALLOC, LOAD, READONLY, DATA 2 .hash 00023598 00000000008e6128 00000000008e6128 007e6128 = 2**3 CONTENTS, ALLOC, LOAD, READONLY, DATA ... 25 kern_conf 000008d1 0000000000be6240 0000000000be6240 00ae6240 = 2**3 CONTENTS, ALLOC, LOAD, READONLY, DATA 26 .data 000635b6 0000000000bf0000 0000000000bf0000 00af0000 = 2**16 CONTENTS, ALLOC, LOAD, DATA Note that 0xbe6240+0x8d1 =3D=3D 0xbe6b11 and 0xbe7000 is on a different = page, and there are several more pages in the hole before 0xbf0000. Also far more than "code" areas are having __syncicache's dcbst = executed. By contrast for the working kernel10.1RE kern_conf and .data are close = together, .data being on the very next page (0xbd600+0x8d1=3D0xbdfed1 = vs. 0xbe0000, a difference of 0x12f): 25 kern_conf 000008d1 0000000000bdf600 0000000000bdf600 00adf600 = 2**3 CONTENTS, ALLOC, LOAD, READONLY, DATA 26 .data 000635b6 0000000000be0000 0000000000be0000 00ae0000 = 2**16 CONTENTS, ALLOC, LOAD, DATA =3D=3D=3D Mark Millard markmi@dsl-only.net On 2015-Feb-9, at 08:28 AM, Mark Millard <markmi@dsl-only.net> wrote: An early copy/paste error propagated throughout my figures and messes = things up. So I try again... Also it turns out openfirmware has a "word" for dumping the registers = for a crash: .registers So I can coordinate some register values with __syncicache operation and = that is better information than I tried to give before anyway. I used a = failing direct boot attempt for the kernel10.1S that has len =3D = 0x10175d0. 01c27834 <__syncicache> lis r9,452 01c27838 <__syncicache+0x4> lwz r10,-25800(r9) Note: cacheline_size =3D r10 above ends up being 0x20. Yep the code is treating the 970MP processor as having a very small = cache line size: 32. My understanding is that it is really 128. So each = line is dcbst'd 4 times (offsets 0, 32, 64, and 96). 01c2783c <__syncicache+0x8> addi r11,r10,-1 Note: cacheline_size-1 =3D r11 above (0x1F). [So fixing both my initial copy/paste of the from value and the = cacheline_size-1 figures: off =3D=3D from & (cacheline_size-1) off =3D=3D 0x100000 & 0x1F (in all 3 cases) off =3D=3D 0x0 So in essence that old material should be ignored.] 01c27840 <__syncicache+0xc> and r0,r3,r11 01c27844 <__syncicache+0x10> add r0,r0,r4 01c27848 <__syncicache+0x14> andc r8,r3,r11 01c2784c <__syncicache+0x18> subf r0,r10,r0 01c27850 <__syncicache+0x1c> neg r7,r10 01c27854 <__syncicache+0x20> mr r9,r8 01c27858 <__syncicache+0x24> mr r11,r0 01c2785c <__syncicache+0x28> dcbst r0,r9 Note: p =3D r9 above. The register dump shows r9=3D0xbe7000 and shows DAR=3D0xbe7000. It also shows SRR0=3D0x1c2785c and interrupt vector 0x300. DSISR=3D0x40000000 (not found in the primary or secondary PTEG). And: r10=3D0x20, r6=3D0x5305d0, r11=3D5305b0, r7=3D0xffffffffffffffe0 (so = -0x20) (r6,r11 together are related to l in the C source.) 01c27860 <__syncicache+0x2c> add r11,r11,r7 01c27864 <__syncicache+0x30> add. r6,r10,r11 01c27868 <__syncicache+0x34> add r9,r9,r10 01c2786c <__syncicache+0x38> bgt+ 01c2785c <__syncicache+0x28> 01c27870 <__syncicache+0x3c> sync =20 01c27874 <__syncicache+0x40> icbi r0,r8 01c27878 <__syncicache+0x44> add r0,r0,r7 01c2787c <__syncicache+0x48> add. r9,r0,r10 01c27880 <__syncicache+0x4c> add r8,r8,r10 01c27884 <__syncicache+0x50> bgt+ 01c27874 <__syncicache+0x40> 01c27888 <__syncicache+0x54> sync =20 01c2788c <__syncicache+0x58> isync 01c27890 <__syncicache+0x5c> blr So the problem seems to be not at the ends of the range but in the = middle of it: a hole in the address range with no PTEG coverage. =3D=3D=3D Mark Millard markmi@dsl-only.net On 2015-Feb-8, at 04:20 PM, Mark Millard <markmi@dsl-only.net> wrote: [Side note: My prior comment about having additional signed/unsigned = conversion code related to "int len" in __syncicache was wrong. That the = calling argument (*result)->f_size has size_t as a type ends up just = depending on the value not being so big as to cause problems when it is = converted to "int len" for __syncicache's len parameter.] %SRR0 being 0x1c2785c is a loader code address and does not change = unless one switches/updates the loader. objdump shows the following for the unstripped variant of the loader = that is in use = (/usr/obj/usr/home/markmi/src_10_1_releng/sys/boot/powerpc/ofw/loader): 01c27834 <__syncicache> lis r9,452 01c27838 <__syncicache+0x4> lwz r10,-25800(r9) 01c2783c <__syncicache+0x8> addi r11,r10,-1 01c27840 <__syncicache+0xc> and r0,r3,r11 01c27844 <__syncicache+0x10> add r0,r0,r4 01c27848 <__syncicache+0x14> andc r8,r3,r11 01c2784c <__syncicache+0x18> subf r0,r10,r0 01c27850 <__syncicache+0x1c> neg r7,r10 01c27854 <__syncicache+0x20> mr r9,r8 01c27858 <__syncicache+0x24> mr r11,r0 01c2785c <__syncicache+0x28> dcbst r0,r9 01c27860 <__syncicache+0x2c> add r11,r11,r7 01c27864 <__syncicache+0x30> add. r6,r10,r11 01c27868 <__syncicache+0x34> add r9,r9,r10 01c2786c <__syncicache+0x38> bgt+ 01c2785c <__syncicache+0x28> 01c27870 <__syncicache+0x3c> sync =20 01c27874 <__syncicache+0x40> icbi r0,r8 01c27878 <__syncicache+0x44> add r0,r0,r7 01c2787c <__syncicache+0x48> add. r9,r0,r10 01c27880 <__syncicache+0x4c> add r8,r8,r10 01c27884 <__syncicache+0x50> bgt+ 01c27874 <__syncicache+0x40> 01c27888 <__syncicache+0x54> sync =20 01c2788c <__syncicache+0x58> isync 01c27890 <__syncicache+0x5c> blr So the address in %SRR0 that is reported is for: 01c2785c <__syncicache+0x28> dcbst r0,r9 The source shows: void __syncicache(void *from, int len) { register_t l, off; char *p; off =3D (uintptr_t)from & (cacheline_size - 1); l =3D len +=3D off; p =3D (char *)from - off; do { __asm __volatile ("dcbst 0,%0" :: "r"(p)); p +=3D cacheline_size; } while ((l -=3D cacheline_size) > 0); __asm __volatile ("sync"); p =3D (char *)from - off; do { __asm __volatile ("icbi 0,%0" :: "r"(p)); p +=3D cacheline_size; } while ((len -=3D cacheline_size) > 0); __asm __volatile ("sync; isync"); } Comparing the 3 examples of dcbst use for off and initial len values = (using cacheline_size =3D=3D 128 so cacheline_size-1 =3D=3D 0x7F): off =3D=3D 0x1c35b48 & 0x7F (in all 3 cases) off =3D=3D 0x48 len =3D=3D 0x1014b80 is cache-line-multiple sized: This is the case that = works fine. from+len =3D=3D 0x1c35b48 + 0x1014b80 from+len =3D=3D 0x2c4A6C8 from+len-1 =3D=3D 0x2c4A6C7 start at 0x1c35b00 end with 0x2c4A680 (spans through 0x2c4A6FF) len =3D=3D 0x1014bb0 is NOT cache-line-multiple sized: FAILS! from+len =3D=3D 0x1c35b48 + 0x1014b80 from+len =3D=3D 0x2c4A6F8 from+len-1 =3D=3D 0x2c4A6F7 start at 0x1c35b00 end with 0x2c4A680 (spans through 0x2c4A6FF) NOTE: Same start/end range as above but fails by causing an exception! len =3D=3D 0x10175d0 is NOT cache-line-multiple sized: FAILS! from+len =3D=3D 0x1c35b48 + 0x10175d0 from+len =3D=3D 0x2c4D118 from+len-1 =3D=3D 0x2c4D117 start at 0x1c35b00 end with 0x2c4D100 (spans through 0x2c4A67F) The 3 cases have the same number of "extra before from" in the = cache-lines. But the 3 cases have varying numbers of "extra after from+len-1". Only for len being a multiple of the cache-line-size would "extra before = from" + "extra after from+len-1" be the cache-line-size (or zero if from = was cache-line-aligned). My guess is that something is configured based on "extra before from" + = "extra after from+len-1" being the cache-line-size (or zero if from was = cache-line-aligned) when it does not apply. So the processor ends up = rejecting some bytes during a dcbst instruction for len values that are = not multiples of the cache-line size. For __syncicache's use from sys/boot/ofw/libofw/ppc64_elf_freebsd.c I've = not figured out what all is configured related to the the range = [from,from+len) (excludes from+len) --i.e. the range [from,from+len-1] = since 0<len. where from =3D=3D (void *) (*result)->f_addr len =3D=3D (*result)->f_size Blindly listing things that can cause dcbst to exception (quotes form = PowerISA_203_Final_Public.pdf): > A Data Storage interrupt occurs when no higher priority exception = exists and a data access cannot be per- formed for any of the following = reasons. Data address translation is enabled (MSRDR=3D1) and the virtual = address of any byte of the storage location specified by a Load, Store, = icbi, dcbz, dcbst, dcbf[l], eciwx, or ecowx instruction cannot be = translated to a real address.=20 > A Data Segment interrupt occurs when no higher prior- ity exception = exists and a data access cannot be per- formed because data address = translation is enabled and the effective address of any byte of the = storage location specified by a Load, Store, icbi, dcbz, dcbst, dcbf[l] = eciwx, or ecowx instruction cannot be trans- lated to a virtual address.=20= > dcbf, dcbfep, dcbst, and dcbstep instructions are treated as Loads = with respect to protection. Flushing or storing a line from the cache is = not considered a Store since the store has already been done to update = the cache and the dcbf, dcbfep, dcbst, or dcbstep instruction is only = updating the copy in main storage. As a Load, they can cause Read Access = Control exception type Data Storage interrupts.=20 (A table indicates Yes for Read Protection Violation and No for Write = Protection Violation.) > On implementations on which a Machine Check interrupt can be caused by = referring to an invalid real address, executing a dcbz, dcbzep, or dcba = instruction can cause a delayed Machine Check interrupt by establishing = in the data cache a block that is associated with an invalid real = address. See Section 3.2 of Book II. A Machine Check interrupt can = eventually occur if and when a subsequent attempt is made to write that = block to main storage, for example as the result of executing an = instruc- tion that causes a cache miss for which the block is the target = for replacement or as the result of exe- cuting a dcbst, dcbstep, dcbf, = or dcbfep instruc- tion.=20 The dcbst related details for the code are (picking a failing example's = argument values): from =3D=3D 0x1c35b48 len =3D=3D 0x1014bb0 (example) starting at 0x1c35b00 ending with 0x2c4A680 (but it will span through 0x2c4A6FF) The code's calculations are based on: off =3D=3D (uintptr_t)from & (cacheline_size - 1) off =3D=3D 0x1c35b48 & 0x7F off =3D=3D 0x48 (len as in the argument here, not later updates) len+off =3D=3D 0x1014bb0 + 0x48 len+off =3D=3D 0x1014BF8 l =3D=3D 0x1014BF8 (initially) Note that len+off is not a multiple of the cache size. Repeated = subtractions of the cache size will end up with a non-zero remainder. (p as it is initialized here, not later updates) p =3D=3D from - off p =3D=3D 0x1c35b48 - 0x48 p =3D=3D 0x1c35b00 p+l =3D=3D 0x1c35b00+0x1014BF8 p+l =3D=3D 0x2c4A6F8 Note that 0x2c4A6F8 this is not cache aligned. Note that as the dcbst loop progresses p+l is constant (other than in = the middle of updating the pair). So when... l=3D=3D0xF8: p=3D=3D0x2c4A600 l=3D=3D0x78: p=3D=3D0x2c4A680 l<0: no dcbst for p It appears that the dcbst loop does span the range of cache-lines that = it should. But apparently the processor is rejecting some bytes from that range = when full cache-line sized units are counted. I would guess the = rejection is on the l=3D0x78 side of things that varies from example to = example. =3D=3D=3D Mark Millard markmiatdsl-only.net On 2015-Feb-8, at 01:16 AM, Mark Millard <markmi at dsl-only.net> wrote: I've narrowed down greatly where the crashes happen, which need not be = where root-cause is: that could be earlier. In the following code [I'm using 10.1-RELEASE-p5 for reference here] $ svnlite diff sys/boot/ofw/libofw/ppc64_elf_freebsd.c=20 Index: sys/boot/ofw/libofw/ppc64_elf_freebsd.c =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- sys/boot/ofw/libofw/ppc64_elf_freebsd.c (revision 277808) +++ sys/boot/ofw/libofw/ppc64_elf_freebsd.c (working copy) @@ -59,7 +59,11 @@ * be done by the kernel after relocation. */ if (!strcmp((*result)->f_type, "elf kernel")) +{ +printf("ppc64_ofw_elf_loadfile before __syncicache\n"); __syncicache((void *) (*result)->f_addr, = (*result)->f_size); +printf("ppc64_ofw_elf_loadfile after __syncicache\n"); +} return (0); } for a directly-bootable (no crash) kernel-build both printf's are = displayed. (The above code part of /boot/loader .) But for the kernels that I build that fail to directly boot only the = first of the two printf's is displayed when a direct boot is attempted: = Openfirmware's notice with %SRR0 and %SRR1 shows up after that instead = of the text from the second printf. Based on that much it looks like the crash is either in evaluating the = arguments to __syncicache or happens during __syncicache's execution, = not after. Changing the first printf to something like the sequence: printf("ppc64_ofw_elf_loadfile before __syncicache\n"); printf("(void*)result: %p\n",(void*)result); printf("(void*)(*result): %p\n",(void*)(*result)); printf("(void*)(*result)->f_addr: %p\n",(void*)(*result)->f_addr); printf("(*result)->f_size : 0x%lx\n",(*result)->f_size); Shows that all of the stages print before the crash happens, answering = the question about evaluation of the arguments: there is no problem = evaluating them. So the crashes are strictly during __syncicache's activity. Only (*result)->f_size varies between the 3 examples that I've used: 10.1-RELEASE-p5 variant without VERBOSE_SYSINIT (and BOOTVERBOSE, = BOOTHOWTO). (boots fine) 10.1-RELEASE-p5 variant with VERBOSE_SYSINIT (and BOOTVERBOSE, = BOOTHOWTO). (crashes) 10.1-STABLE (-r278028) variant without VERBOSE_SYSINIT (and BOOTVERBOSE, = BOOTHOWTO). (crashes) What the above printf's reported was: (void*)result: 0x1c35b48 (void*)(*result): 0x1ebc0 (*result)->f_addr: 0x100000 10.1-RELEASE-p5 variant without VERBOSE_SYSINIT: (*result)->f_size: = 0x1014b80 10.1-RELEASE-p5 variant with VERBOSE_SYSINIT: (*result)->f_size: = 0x1014bb0 10.1-STABLE (-r278028) variant without VERBOSE_SYSINIT: = (*result)->f_size: 0x10175d0 (Listed in increasing order.) As I remember for the last two the crash report listed %SRR0: 0x1c2785c = for both. The "without VERBOSE_SYSINIT" one above does not crash and for = it the "ppc64_ofw_elf_loadfile after __syncicache" message shows up as = it should. =3D=3D=3D Mark Millard markmi at dsl-only.net On 2015-Feb-7, at 09:43 AM, Mark Millard <markmi at dsl-only.net> wrote: Correction to earlier Email: VERBOSE_SYSINIT with DDB (and GDB) all = enabled (indirectly booted via using kernel10.1RE) got 0x1c277ec for the = %SRR0 value, not 0x1c277fc. So slightly different than Kernel10.1S's = 0x1c277fc (for this 10.1-STABLE variant). (I looked at the wrong notes = when composing the original Email.) More comparisons of kernel build options: VERBOSE_SYSINIT enabled with DDB (and GDB) disabled still has the = booting problem for my 10.1-RELEASE-p5 variant. It also still has the = 0x1c277ec for the %SRR0 value. For VERBOSE_SYSINIT disabled (DDB and GDB enabled) directly booted... Preloaded elf kernel "/boot/kernel/kernel" at 0x1106000. ... real memory =3D 17152118784 (16357 MB) available KVA =3D 7222611967 (6888 MB) Physical memory chunk(s): 0x0000000000024000 - 0x00000000000fffff, 901120 bytes (220 pages) 0x0000000001115000 - 0x00000000017fffff, 7254016 bytes (1771 pages) 0x0000000001814000 - 0x0000000001bfffff, 4112384 bytes (1004 pages) 0x0000000001c3d000 - 0x0000000001c3cfff, 0 bytes (0 pages) 0x0000000004cbd000 - 0x000000000fffffff, 187969536 bytes (45891 pages) 0x0000000020000000 - 0x000000007f5effff, 1600061440 bytes (390640 pages) 0x0000000100000000 - 0x0000000466827fff, 14604730368 bytes (3565608 = pages) 0x0000000200000000 - 0x00000001ffffffff, 0 bytes (0 pages) 0x0000000300000000 - 0x00000002ffffffff, 0 bytes (0 pages) 0x0000000400000000 - 0x00000003ffffffff, 0 bytes (0 pages) avail memory =3D 16374190080 (15615 MB) So 0x1c277ec is between the two: 0x0000000001814000 - 0x0000000001bfffff, 4112384 bytes (1004 pages) 0x0000000001c3d000 - 0x0000000001c3cfff, 0 bytes (0 pages) (But I do not know what most of the regions and holes are supposed to = be.) VERBOSE_SYSINIT, DDB, and GDB enabled but indirectly booted via = kernel10.1RE (via /boot/loader.conf's kernel=3D"kernel10.1RE"), = stopping, unloading, then doing "boot kernel": Preloaded elf kernel "/boot/kernel/kernel" at 0x1116000. ... real memory =3D 17152118784 (16357 MB) available KVA =3D 7222611967 (6888 MB) Physical memory chunk(s): 0x0000000000024000 - 0x00000000000fffff, 901120 bytes (220 pages) 0x0000000001105000 - 0x0000000001114fff, 65536 bytes (16 pages) 0x0000000001125000 - 0x00000000017fffff, 7188480 bytes (1755 pages) 0x0000000001814000 - 0x0000000001bfffff, 4112384 bytes (1004 pages) 0x0000000001c3d000 - 0x0000000001c3cfff, 0 bytes (0 pages) 0x0000000004cbd000 - 0x000000000fffffff, 187969536 bytes (45891 pages) 0x0000000020000000 - 0x000000007f5effff, 1600061440 bytes (390640 pages) 0x0000000100000000 - 0x0000000466827fff, 14604730368 bytes (3565608 = pages) 0x0000000200000000 - 0x00000001ffffffff, 0 bytes (0 pages) 0x0000000300000000 - 0x00000002ffffffff, 0 bytes (0 pages) 0x0000000400000000 - 0x00000003ffffffff, 0 bytes (0 pages) avail memory =3D 16374190080 (15615 MB) =3D=3D=3D Mark Millard markmi at dsl-only.net On 2015-Feb-7, at 03:49 AM, Mark Millard <markmi at dsl-only.net> wrote: Nathan, you had the below written about my problems with booting my = builds of, say, 10.1-STABLE (kernel=3D"kernel10.1S" in = /boot/loaderl.conf) without involving the kernel from my build of = 10.1-RELEASE-p5 (kernel=3D"kernel10.1RE" or sometimes kernel=3D"kernel" = in /boot/loader.conf), where kernel=3D"kernel10.1RE" in = /boot/loader.conf boots just fine... > So this has to be some kind of icache issue. If you unload and reload=20= > the *same* kernel, does it also help? > -Nathan (Part of the evidence was: Using kernel=3D"kernel10.1RE" in = /boot/loader.conf, stopping at the 10sec prompt, unloading, and doing = "boot kernel 10.1S" lets my 10.1-STABLE builds boot that will not boot = directly.) Well I've got a little more information from a different direction: A = way to create the problem when building my 10.1-RELEASE-p5 kernel is to = enable VERBOSE_SYSINIT. More specifically the comparison/contrast I've = done so far is... I added the following 3 lines to my GENERIC64vtsc for my 10.1-RELEASE-p5 = source tree (no other changes elsewhere at all) options VERBOSE_SYSINIT options BOOTVERBOSE=3D1 options BOOTHOWTO=3DRB_VERBOSE and rebuilt kernel the via KERNCONF=3DGENERIC64vtsc INSTKERNNAME=3Dkernel = the resulting kernel load fails if referenced by /boot/loader.conf via = kernel=3D"kernel" line. The %SRR0 address value listed is the same as = for kernel10.1S: 1c277fc. But booting using kernel=3D"kernel10.1RE" in = /boot/loader.conf, stopping at the 10sec wait, unloading, and typing = "boot kernel" boots fine --just like "boot kernel10.1S". Note: GENERIC64vtsc has option DDB enabled (and GBD too). (This is = associated my with my information gathering for early G5 boot = crashes/hangups.) Note: This is the first time I've ever tried any of those 3 options. My = kernel10.1S build was not based on them. Then I changed the 3 lines by just commenting out the first of the 3 = that I had added #options VERBOSE_SYSINIT options BOOTVERBOSE=3D1 options BOOTHOWTO=3DRB_VERBOSE and rebuilt via KERNCONF=3DGENERIC64vtsc INSTKERNNAME=3Dkernel again. = The resulting /boot/kernel/... boots just fine when kernel=3D"kernel" is = used in /boot/loader.conf : no need for using kernel10.1RE or for = stopping to do anything special. =3D=3D=3D Mark Millard markmi at dsl-only.net
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?70827670-5645-4445-BCEF-1809F2554F84>