From owner-freebsd-alpha Fri May 10 17:30:45 2002
Delivered-To: freebsd-alpha@freebsd.org
Received: from mail.speakeasy.net (mail12.speakeasy.net [216.254.0.212])
        by hub.freebsd.org (Postfix) with ESMTP id 6936C37B403
        for ; Fri, 10 May 2002 17:29:58 -0700 (PDT)
Received: (qmail 14001 invoked from network); 11 May 2002 00:29:57 -0000
Received: from unknown (HELO server.baldwin.cx) ([216.27.160.63])
        (envelope-sender )
        by mail12.speakeasy.net (qmail-ldap-1.03) with DES-CBC3-SHA encrypted SMTP
        for ; 11 May 2002 00:29:57 -0000
Received: from laptop.baldwin.cx (laptop.baldwin.cx [192.168.0.4])
        by server.baldwin.cx (8.11.6/8.11.6) with ESMTP id g4B0TqF44742;
        Fri, 10 May 2002 20:29:52 -0400 (EDT)
        (envelope-from jhb@FreeBSD.org)
Message-ID:
X-Mailer: XFMail 1.5.2 on FreeBSD
X-Priority: 3 (Normal)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
In-Reply-To: <15580.13914.162169.930227@grasshopper.cs.duke.edu>
Date: Fri, 10 May 2002 20:29:49 -0400 (EDT)
From: John Baldwin
To: Andrew Gallatin
Subject: Re: gcc3 & alpha kernels
Cc: obrien@FreeBSD.ORG, alpha@FreeBSD.ORG, Jeff Roberson
Sender: owner-freebsd-alpha@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

On 10-May-2002 Andrew Gallatin wrote:
>
> Jeff Roberson writes:
> > On Fri, 10 May 2002, Andrew Gallatin wrote:
> >
> > >
> > > Alan Cox writes:
> > > > >
> > > > > Did Jeff see a lockup at boot? Or was this on a running system?
> > > >
> > > > I believe it was at boot time. I can't swear to that, however.
> > > >
> > >
> > > Thanks.. that's the same as me. It would seem that the new compiler
> > > is botching the atomic inlines then.
> > >
> > > Hmm.. According to the disassembly, it looks like the correct
> > > sequences are there, though..
> > >
> > > Drew
> > >
> >
> > It was at boot time. I believe that this was the first time we ever did
> > negative atomic ints on alpha.
> > This was with the old compiler as well. I
> > haven't looked at the gcc3 output.
> >
> > When I looked at the assembly it was pretty clear that the inline wasn't
> > written to support non sign extended values. If you change the prototype
> > the signed int everything works as expected though.
>
> FWIW, this (atomic) is the problem. I can boot a kernel
> where everything but vm_object.o is built with gcc 3.1 and vm_object.o
> is built by the -stable gcc 2.95 compiler.
>
> I'm not sure where I can go from here. David, is this enough
> information for you to use?
>
> I haven't used this kernel that I just built, as I'm not sure that I
> should trust it :-(

Ok, I've made a mostly clean diff between these two as follows:
(I've removed diffs between label symbol names due to different offsets in
the function):

--- one.1 Fri May 10 19:45:36 2002
+++ two.1 Fri May 10 19:45:45 2002
@@ -1,6 +1,6 @@
--------- GCC 3.1--------------------------------------
+--------------- gcc 2.95 ------------------------------------
-0000000000000068 <_vm_object_allocate>:
+00000000000000a0 <_vm_object_allocate>:
 : 00 00 bb 27 ldah gp,0(t12)
 : 00 00 bd 23 lda gp,0(gp)
 : e0 ff de 23 lda sp,-32(sp)
@@ -9,74 +9,83 @@
 : 10 00 5e b5 stq s1,16(sp)
 : 0a 04 f1 47 mov a1,s1
 : 09 04 f2 47 mov a2,s0
- : 30 00 f2 b7 stq zero,48(a2)
- : 30 00 32 20 lda t0,48(a2)
- : 38 00 32 b4 stq t0,56(a2)
- : 10 00 f2 b7 stq zero,16(a2)
- : 10 00 32 20 lda t0,16(a2)
- : 18 00 32 b4 stq t0,24(a2)
- : 5c 00 12 3a stb a0,92(a2)
- : 48 00 29 b6 stq a1,72(s0)
+ : 30 00 e9 b7 stq zero,48(s0)
+ : 01 14 26 41 addq s0,0x30,t0
+ : 38 00 29 b4 stq t0,56(s0)
+ : 10 00 e9 b7 stq zero,16(s0)
+ : 01 14 22 41 addq s0,0x10,t0
+ : 18 00 29 b4 stq t0,24(s0)
+ : 5c 00 09 3a stb a0,92(s0)
+ : 48 00 49 b5 stq s1,72(s0)

This hunk is just using a2 instead of s0 and using lda instead of addq.

 : 01 00 3f 20 lda t0,1
- : 50 00 32 b0 stl t0,80(a2)
- : 5e 00 f2 37 stw zero,94(a2)
- : 01 f0 1f 46 and a0,0xff,t0
- : a1 37 20 40 cmpule t0,0x1,t0
- : 06 00 20 e4 beq t0,d8 <_vm_object_allocate+0x70>
- : 10 04 f2 47 mov a2,a0
+ : 50 00 29 b0 stl t0,80(s0)
+ : 5e 00 e9 37 stw zero,94(s0)
+ : b0 37 00 42 cmpule a0,0x1,a0
+ : 07 00 00 e6 beq a0,110 <_vm_object_allocate+0x70>
+ : 10 04 e9 47 mov s0,a0

More a2 instead of s0. 2.95 uses a0 directly, while 3.1 masks off the
upper bits into t0 first. I don't think this is harmful.

 : 00 20 3f 22 lda a1,8192
 : 00 00 7d a7 ldq t12,0(gp)
 : 00 40 5b 6b jsr ra,(t12),104 <_vm_object_allocate+0x64>
 : 00 00 ba 27 ldah gp,0(ra)
 : 00 00 bd 23 lda gp,0(gp)
+ : 00 00 e0 2f unop

2.95 has an extra nop. Woo.

 : a1 77 42 41 cmpule s1,0x13,t0
- : 02 00 5f 41 addl s1,zero,t1
 : 13 00 3f 22 lda a1,19
- : d1 04 22 44 cmovne t0,t1,a1
+ : 01 00 20 e4 beq t0,120 <_vm_object_allocate+0x80>
+ : 11 00 5f 41 addl s1,zero,a1

Here 3.1 uses a conditional move to avoid a branch.

 : 00 00 7d a4 ldq t2,0(gp)
+ : 00 00 e0 2f unop
+ : 1f 04 ff 47 nop
+ : 00 00 e0 2f unop

2.95 pads in some more nops.

 : 00 00 23 30 ldwu t0,0(t2)
 : 60 00 29 34 stw t0,96(s0)
- : 22 76 20 48 zapnot t0,0x3,t1
 : 21 76 20 48 zapnot t0,0x3,t0
- : 01 04 21 42 addq a1,t0,t0
+ : 22 f6 21 48 zapnot t0,0xf,t1
+ : 01 04 31 40 addq t0,a1,t0

This one is perhaps the most questionable but probably ok. Here 3.1 uses a
zapnot mask of 0x3 rather than 0xf when copying t0 to t1, so it keeps two
bytes instead of four.

 : 01 f0 23 44 and t0,0x1f,t0
 : 00 00 83 a8 ldl_l t3,0(t2)
 : 24 f6 81 48 zapnot t3,0xf,t3
 : a4 05 82 40 cmpeq t3,t1,t3
 : 04 00 80 e4 beq t3,168 <_vm_object_allocate+0xc8>
 : 04 04 e1 47 mov t0,t3
 : 00 00 83 b8 stl_c t3,0(t2)
 : 00 00 80 e4 beq t3,164 <_vm_object_allocate+0xc4>
 : 00 40 00 60 mb

This is our atomic operation number 1, unchanged.

- : 21 f6 81 48 zapnot t3,0xf,t0
- : f0 ff 3f e4 beq t0,ec <_vm_object_allocate+0x84>
+ : 01 04 e4 47 mov t3,t0
+ : 21 f6 21 48 zapnot t0,0xf,t0
+ : ef ff 3f e4 beq t0,130 <_vm_object_allocate+0x90>

gcc 3.1 is simply smarter about storing the result of the zapnot directly
into t0 to avoid a mov.

 : 88 00 e9 b7 stq zero,136(s0)
 : 68 00 e9 b7 stq zero,104(s0)
 : 70 00 e9 b7 stq zero,112(s0)
 : 00 00 7d a4 ldq t2,0(gp)
+ : 00 00 e0 2f unop
+ : 1f 04 ff 47 nop
+ : 00 00 e0 2f unop

More 2.95 padding.

 : 00 00 43 a0 ldl t1,0(t2)
- : 7f ff 22 20 lda t0,-129(t1)
+ : 21 35 50 40 subq t1,0x81,t0

lda preferred to subq for some reason..

 : 58 00 29 b0 stl t0,88(s0)
- : 01 00 3f 40 addl t0,zero,t0
+ : 22 f6 41 48 zapnot t1,0xf,t1
+ : 21 f6 21 48 zapnot t0,0xf,t0

This I do not grok. Here 3.1 adds zero to t0 and stores the result in t0,
but since it is an addl I guess that does the equivalent of the zapnot to
clear the sign bits. Note that 3.1 only does this for t0 and not t1
however.

 : 00 00 83 a8 ldl_l t3,0(t2)
 : 24 f6 81 48 zapnot t3,0xf,t3
 : a4 05 82 40 cmpeq t3,t1,t3
 : 04 00 80 e4 beq t3,1c4 <_vm_object_allocate+0x124>
 : 04 04 e1 47 mov t0,t3
 : 00 00 83 b8 stl_c t3,0(t2)
 : 00 00 80 e4 beq t3,1c0 <_vm_object_allocate+0x120>
 : 00 40 00 60 mb

Atomic operation number 2, also unchanged.

- : 21 f6 81 48 zapnot t3,0xf,t0
- : f2 ff 3f e4 beq t0,13c <_vm_object_allocate+0xd4>
+ : 01 04 e4 47 mov t3,t0
+ : 21 f6 21 48 zapnot t0,0xf,t0
+ : f0 ff 3f e4 beq t0,190 <_vm_object_allocate+0xf0>

3.1 again uses zapnot more efficiently to avoid a mov.

 : 40 00 29 a0 ldl t0,64(s0)
- : 01 00 21 20 lda t0,1(t0)
+ : 01 34 20 40 addq t0,0x1,t0

lda instead of addq.

 : 40 00 29 b0 stl t0,64(s0)
 : 00 00 1d a6 ldq a0,0(gp)
 : 11 04 ff 47 clr a1
 : 00 00 5d a6 ldq a2,0(gp)
 : e4 00 7f 22 lda a3,228
 : 00 00 7d a7 ldq t12,0(gp)
 : 00 40 5b 6b jsr ra,(t12),1f4 <_vm_object_allocate+0x154>
 : 00 00 ba 27 ldah gp,0(ra)
 : 00 00 bd 23 lda gp,0(gp)
 : 00 00 e9 b7 stq zero,0(s0)
@@ -99,3 +108,8 @@
 : 10 00 5e a5 ldq s1,16(sp)
 : 20 00 de 23 lda sp,32(sp)
 : 01 80 fa 6b ret
+ : 00 00 e0 2f unop
+ : 1f 04 ff 47 nop
+ : 00 00 e0 2f unop
+ : 1f 04 ff 47 nop
+ : 00 00 e0 2f unop

2.95 pads out with more nops.

As you can see, I don't see much of anything in this assembly which
indicates the function should execute any differently aside from the two
weirdisms involving t1. Using -fno-strict-aliasing might get rid of those
btw, not sure.

-- 

John Baldwin <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-alpha" in the body of the message
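
For anyone following the two t1 oddities in the listing, here is a rough C model of the compare step in the ldl_l/stl_c loop. It is only a sketch under stated assumptions: the helper names (sext32, zapnot_0xf, cmpset_compare) are made up for illustration and are not the kernel's atomic code, and it assumes the usual Alpha behavior that ldl, ldl_l and addl leave a sign-extended 32-bit result in the register, while zapnot with mask 0xf keeps only the low four bytes. Under those assumptions, the cmpeq can only succeed for a value with bit 31 set if the expected register (t1) was zero-extended too.

```c
#include <stdint.h>

/* Assumed model of "zapnot r,0xf,r": keep the low four bytes, clear the rest. */
static uint64_t zapnot_0xf(uint64_t r)
{
    return r & 0xffffffffULL;
}

/* Assumed model of ldl / addl r,zero,r: 32-bit value sign-extended to 64 bits. */
static uint64_t sext32(uint64_t r)
{
    uint64_t low = r & 0xffffffffULL;
    return (low & 0x80000000ULL) ? (low | 0xffffffff00000000ULL) : low;
}

/*
 * Hypothetical model of the compare in the loop from the listing:
 *     ldl_l  t3,0(t2)
 *     zapnot t3,0xf,t3
 *     cmpeq  t3,t1,t3
 * Returns 1 when the loop would go on to attempt the stl_c.
 */
static int cmpset_compare(uint32_t mem, uint64_t t1)
{
    uint64_t t3 = sext32(mem);   /* ldl_l sign-extends the loaded longword */
    t3 = zapnot_0xf(t3);         /* then the upper four bytes are cleared  */
    return t3 == t1;             /* 64-bit compare against the expected value */
}
```

With a small positive value, either form of t1 compares equal. For a value with bit 31 set, such as 0xffffff7f, cmpset_compare() returns 1 when t1 went through zapnot_0xf() and 0 when t1 is still sign-extended. Whether the value is actually negative at this point in vm_object_allocate is not something the listing shows.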