Date: Wed, 2 Sep 2015 16:47:38 +0200
From: Svatopluk Kraus <onwahe@gmail.com>
To: Dmitry Marakasov <amdmi3@amdmi3.ru>
Cc: Adrian Chadd <adrian.chadd@gmail.com>, "freebsd-arm@FreeBSD.org" <freebsd-arm@freebsd.org>, Ian Lepore <ian@freebsd.org>
Subject: Re: Instability likely related to new pmap on Cubieboard A10
Message-ID: <CAFHCsPXqgGpXPMJLeZT53bZgwQKOo1r%2B1EAqqn9NitHX4if1vA@mail.gmail.com>
In-Reply-To: <20150901130117.GK1245@hades.panopticon>
References: <20150819120753.GH79354@hades.panopticon>
 <CAFHCsPVSGuWWY97ac2QVGAE77Lz2gJ12wDLpzH_kNdZsLQxh%2BQ@mail.gmail.com>
 <20150819134708.GJ79354@hades.panopticon>
 <CAFHCsPVwZS_rCnvKztg7g2%2BvrOBwQpqpPYyA2=hCLGiiU5=mrQ@mail.gmail.com>
 <20150819232836.GA1245@hades.panopticon>
 <CAFHCsPUfMhkCjiip7o6ZwGx4jNb1-Xqptsj9h_CzZ8xtfZswiA@mail.gmail.com>
 <20150820185417.GB1245@hades.panopticon>
 <CAFHCsPVrNFEDkq-j99j7a5GDBN2u7kRJRGk6Ggym2v_dre%2BN3A@mail.gmail.com>
 <20150820201020.GC1245@hades.panopticon>
 <CAFHCsPUMxiNHoUqxmz2TjELX943v2b8dkJzgTNowJ%2B=%2B7qDSMg@mail.gmail.com>
 <20150901130117.GK1245@hades.panopticon>
[-- Attachment #1 --]

On Tue, Sep 1, 2015 at 3:01 PM, Dmitry Marakasov <amdmi3@amdmi3.ru> wrote:
> * Svatopluk Kraus (onwahe@gmail.com) wrote:
>
>> >> Thanks. Meantime, I tried most recent HEAD on pandaboard and
>> >> beaglebone black and no problem there. Do you have enabled INVARIANTS
>> >> and INVARIANT_SUPPORT in your config?
>> >
>> > I've enabled them at some point - at least last two runs had these
>> > enabled. Any other way I could help? Maybe I should check if it was
>> > new pmap commit which caused this, and if not, bisect it?
>> >
>>
>> Can you try attached semi-debug patch, please? I want to be sure that
>> problem is not on patched place.
>
> Sorry for delay, I was short on time last week, and then I was busy with
> setting up tftp/nfs netboot for my cubieboard. Now it finally works
> and I'd say it's pretty cool when I can test another build without
> plugging sd card around. Unfortunately, with this setup panic doesn't
> reproduce: there are just around 10 sh(1) segfaults during init, and
> then it boots into somehow usable state. Only once I've had panic with
> your latest patch applied:
>
> https://people.freebsd.org/~amdmi3/pmap4.log
>
> With my new netboot, I plan to try to bisect it; for panic debugging I
> guess I'll have to get back to plugging SD around. If you want me to do
> more panic tests, could we please revisit which patches should be
> applied cause I'm kinda lost in them.
>

Okay then, here is my summary:

I thought that the segmentation faults and the panics were caused by
two different things. Now it looks like they are not. The logs you
provided from the old SD card configuration show that something is
corrupting kernel memory: three logs, three quite different
corruptions. So now I would like to focus on the segmentation faults,
which point to some corruption too. As you are the only one reporting
this kind of problem, it is probably related to your hardware or to
the way your system is booted.

Thus, if you would be so kind:

(A1) Use only one hardware configuration from now on. I would like to
investigate the segmentation faults, so you may use netboot.

(A2) Start with a clean, up-to-date kernel without any patches to
learn how the system behaves.

(A3) You can try the old pmap; however, the result has no relevance.
Note that with the old pmap the system memory layout is different, the
system timing is different (for example, when a process is forking),
and pointless cache and TLB operations are done in addition.

(A4) Apply the attached patch, enable KTR, set KTR_MASK to KTR_TRAP,
and when the system breaks into the debugger, send me the output of
the following commands:

"show ktr" ... at least 10 lines, but more is better ;)
"show pmap /u"

(A5) If you type "continue", you can repeat step A4 and send me info
from several segmentation faults at once.

(A6) If you get a panic, send me the kernel file together with the
panic backtrace.

(B) Another thing you could try is to omit as many devices as possible
from your configuration, mainly the ones which use DMA. For example,
boot from the SD card without the network driver compiled in, and vice
versa.

(C) If you think there was a kernel revision that included the new
pmap and worked without problems, please confirm that and tell me
which one it was.
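
For step A4, a minimal sketch of the kernel configuration lines that
would likely be needed, assuming the option names documented in ktr(4)
and ddb(4). KTR_COMPILE, KDB and DDB are not mentioned above; they are
included on the assumption that the KTR_TRAP class has to be compiled
in and that breakpoint() can only stop in the debugger if one is
present. Adjust to the board's existing config file:

```
# Assumed additions to the kernel config file (option names per ktr(4)).
options 	KTR				# kernel trace buffer
options 	KTR_COMPILE=(KTR_TRAP)		# compile in the trap trace class (assumption)
options 	KTR_MASK=(KTR_TRAP)		# log trap events from boot
options 	KDB				# debugger framework (assumption)
options 	DDB				# in-kernel debugger for "show ktr"
```

With such a kernel, each segmentation fault should drop to the db>
prompt, where "show ktr", "show pmap /u" and "continue" can be typed
as described in A4 and A5.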

Svata

[-- Attachment #2 --]
Index: sys/arm/arm/trap-v6.c
===================================================================
--- sys/arm/arm/trap-v6.c	(revision 287394)
+++ sys/arm/arm/trap-v6.c	(working copy)
@@ -167,7 +167,27 @@
 	{abort_fatal,	"Undefined Code (0x40F)"}
 };
 
+static void
+cpu_tracesigexit(struct thread *td)
+{
+	struct trapframe *tf;
 
+	tf = td->td_frame;
+	if (tf == NULL)
+		return;
+
+	CTR3(KTR_TRAP, "pc 0x%08x usr_lr 0x%08x usr_sp 0x%08x",
+	    tf->tf_pc, tf->tf_usr_lr, tf->tf_usr_sp);
+	CTR3(KTR_TRAP, "spsr 0x%08x svc_lr 0x%08x svc_sp 0x%08x",
+	    tf->tf_spsr, tf->tf_svc_lr, tf->tf_svc_sp);
+	CTR5(KTR_TRAP, "r0 0x%08x r1 0x%08x r2 0x%08x r3 0x%08x r4 0x%08x",
+	    tf->tf_r0, tf->tf_r1, tf->tf_r2, tf->tf_r3, tf->tf_r4);
+	CTR5(KTR_TRAP, "r5 0x%08x r6 0x%08x r7 0x%08x r8 0x%08x r9 0x%08x",
+	    tf->tf_r5, tf->tf_r6, tf->tf_r7, tf->tf_r8, tf->tf_r9);
+	CTR3(KTR_TRAP, "r10 0x%08x r11 0x%08x r12 0x%08x",
+	    tf->tf_r10, tf->tf_r11, tf->tf_r12);
+}
+
 static __inline void
 call_trapsignal(struct thread *td, int sig, int code, vm_offset_t addr)
 {
@@ -176,6 +196,9 @@
 	CTR4(KTR_TRAP, "%s: addr: %#x, sig: %d, code: %d",
 	   __func__, addr, sig, code);
 
+	cpu_tracesigexit(td);
+	breakpoint();
+
 	/*
 	 * TODO: some info would be nice to know
 	 * if we are serving data or prefetch abort.
