From owner-freebsd-arm@FreeBSD.ORG Sun May 5 23:20:31 2013
Subject: Re: Is this related to the general panic discussed in freebsd-current?
From: Tim Kientzle
In-Reply-To: <20130505233729.63ac23bc@bender.lan>
Date: Sun, 5 May 2013 16:20:28 -0700
References: <51835891.4050409@thieprojects.ch>
 <03971BD1-4ADE-4435-BDD0-B94B62634F1D@bsdimp.com>
 <5183BF8C.4040406@thieprojects.ch>
 <6D0E82C9-79D1-4804-9B39-3440F99AA8FE@kientzle.com>
 <20130505140006.0d671ba5@bender>
 <20130505233729.63ac23bc@bender.lan>
To: Andrew Turner
Cc: freebsd-arm@freebsd.org

On May 5, 2013, at 3:37 PM, Andrew Turner wrote:

> On Sun, 5 May 2013 09:37:48 -0700
> Tim Kientzle wrote:
>> On May 5, 2013, at 6:00 AM, Andrew Turner wrote:
>>
>>> On Sat, 4 May 2013 15:44:37 -0700
>>> Tim Kientzle wrote:
>>>> I'm baffled.
>>>> If I insert a printf into the loop in stack_capture,
>>>> the kernel boots.  But the generated assembly looks perfectly
>>>> correct to me in either case.  So inserting the printf must have
>>>> some side-effect.
>>>>
>>>> The stack does end up aligned differently:  The failing version
>>>> puts 16 bytes on the stack, the working version puts 24 bytes.
>>>> But I can't figure out how that would explain what I'm seeing...
>>>
>>> It feels like an alignment issue but those stack sizes should both
>>> be valid. Are you able to send me the asm for the working and broken
>>> versions of the function?
>>>
>>> Also which ABI are you using? I have not been able to reproduce it
>>> with EABI, but that may have been because I have a patched clang
>>> I've been using to track down another issue.
>>
>> I'm using whatever the default is in FreeBSD-CURRENT.  I've seen
>> this consistently with both RaspberryPi and BeagleBone kernels
>> for the last few weeks.
> Ok, it's the old ABI. I note this function may be broken with EABI as
> it makes assumptions on the layout of each frame.

Thought so.

>> /* Broken version */
>> c0519cec <stack_save>:
>> void
>> stack_save(struct stack *st)
>> {
>> c0519cec:	e92d4830 	push	{r4, r5, fp, lr}
>
> This stack layout is incorrect. It should store (from a low address to
> high address) r4, r5, fp, ip, lr and pc.

If I understand right, you're claiming that Clang
is generating a wrong preamble for OABI functions
which is manifesting as crashes in the stack-walking
code.

I'm not sure I understand the frame layout you're
saying it should use, though.  Pushing PC seems a very
strange thing to do on ARM.  (Though it would seem
to match sys/arm/include/stack.h.)

It doesn't look like Clang/OABI is using the layout you suggest
anywhere in the kernel code:  I grepped through the
kernel disassembly and found only a single instance
of "fp, ip, lr, pc" and that was from assembly.

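A minimal sketch of the mismatch, not the kernel code: it walks a
simulated stack using either the {fp, ip, lr, pc} offsets or offsets
that match the push {r4, r5, fp, lr} prologue above.  The names
(frame_layout, walk_frames, fill_demo_stack) are hypothetical, word
indices stand in for addresses, and the assumption that fp ends up
pointing at the saved fp slot is mine; the disassembly quoted here
only shows the push.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Where a walker expects the saved fp and the return address,
 * in words relative to the current fp. */
struct frame_layout {
	int fp_off;
	int lr_off;
};

/* Frame of {fp, ip, lr, pc} with fp pointing at the saved pc. */
static const struct frame_layout apcs_layout = { -3, -1 };

/* ASSUMPTION: after "push {r4, r5, fp, lr}" fp points at the saved
 * fp slot, so the saved lr sits one word above it. */
static const struct frame_layout fplr_layout = { 0, 1 };

/*
 * Walk a simulated stack.  Index 0 is the end-of-chain marker and
 * callers live at higher indices.  Returns the number of return
 * addresses stored in pcs[].
 */
static size_t
walk_frames(const uint32_t *stack, size_t nwords, size_t fp,
    struct frame_layout l, uint32_t *pcs, size_t max)
{
	size_t n = 0;

	assert(stack != NULL && pcs != NULL);
	while (fp != 0 && n < max) {
		long lr_idx = (long)fp + l.lr_off;
		long fp_idx = (long)fp + l.fp_off;

		if (lr_idx < 0 || fp_idx < 0 ||
		    (size_t)lr_idx >= nwords || (size_t)fp_idx >= nwords)
			break;
		pcs[n++] = stack[lr_idx];
		/* A caller's frame must sit above ours; else stop. */
		if (stack[fp_idx] <= fp)
			break;
		fp = stack[fp_idx];
	}
	return (n);
}

/*
 * Hypothetical demo data: three nested frames in the
 * push {r4, r5, fp, lr} shape.  Needs 16 words; returns the
 * innermost fp.
 */
static size_t
fill_demo_stack(uint32_t *stack)
{
	for (size_t i = 0; i < 16; i++)
		stack[i] = 0;
	stack[12] = 0;      stack[13] = 0x1000;	/* outermost: fp, lr */
	stack[8] = 12;      stack[9] = 0x2000;	/* middle: fp, lr */
	stack[2] = 0xAAAA;  stack[3] = 0xBBBB;	/* innermost: r4, r5 */
	stack[4] = 8;       stack[5] = 0x3000;	/* innermost: fp, lr */
	return (4);
}
```

On this demo data, walking with fplr_layout recovers all three return
addresses.  Walking with apcs_layout records the saved r5 as if it
were a return address and then stops after one frame, because the
word it takes for the saved fp happens to be zero.
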
It also looks like sys/arm/include/stack.h needs to
be taught about the difference between EABI and OABI.

> The unwind code following is
> incorrect for this stack layout.

Ah.  I'll take another look.  I hadn't tried to match up the
offsets to see if they made sense for the stack layout.

I could probably change this stack-walking code to
match the frame layout being used by Clang here,
but I'm not sure whether that's the "right" fix.

> In your working code how deep is the stack you are printing? I
> suspect you are getting lucky with the data on the stack.

Yes, almost certainly it's a matter of luck here.  I had noticed that
when I added the printf, it became apparent that the function
never walked more than one frame.  Now I understand why.

Tim