Date: Tue, 20 Dec 2016 11:54:57 +0100
From: Jakub Palider <jpa@semihalf.com>
To: Dimitry Andric <dim@freebsd.org>
Cc: Hans Petter Selasky <hps@selasky.org>, Colin Percival <cperciva@tarsnap.com>, freebsd-current@freebsd.org
Subject: Re: clang/llvm 3.9.0 mysteriously zeroing variables?
Message-ID: <CAL7QUyNeHiUANEtBzT1gGU9En_tOvy%2Bey5qGR9p_dLhWgsJsAw@mail.gmail.com>
In-Reply-To: <78FB227F-3542-452F-9A16-4FB0E0E698AC@FreeBSD.org>
References: <01000158c7252f0c-6c3198b0-fbef-4a60-ade9-e3b91d9e83bd-000000@email.amazonses.com> <e0646eb8-d793-1ffb-bd12-febbce86a4f8@selasky.org> <78FB227F-3542-452F-9A16-4FB0E0E698AC@FreeBSD.org>
Hi,
Do you still observe this behaviour? Which types of EC2 instances were
affected?
I tried to reproduce it with a kernel and tools from Dec 15 and did not
manage to crash the machine.
Jakub
On Sun, Dec 4, 2016 at 5:38 PM, Dimitry Andric <dim@freebsd.org> wrote:
> On 04 Dec 2016, at 10:52, Hans Petter Selasky <hps@selasky.org> wrote:
> >
> > On 12/04/16 01:04, Colin Percival wrote:
> >> Starting with r309124 (when clang/llvm 3.9.0 was imported) I'm seeing EC2
> >> instances panic on boot with a division-by-zero error; the code in question
> >> is in blkfront.c, printing out the size of disks:
> >>
> >>> device_printf(dev, "%juMB <%s> at %s",
> >>> (uintmax_t) sectors / (1048576 / sector_size),
> >>> device_get_desc(dev),
> >>> xenbus_get_node(dev));
> >>
> >> My first thought was that 'sector_size' must be either zero or very large...
> >> but no, when I add printf("sector_size = %ju\n", (uintmax_t)sector_size), it's
> >> entirely normal. What's more, adding that printf makes the division-by-zero
> >> panic go away.
> >>
> >> I'd think I was just hallucinating, but earlier today I heard that a similarly
> >> "impossible" panic had been observed in the NFS client code when compiled with
> >> clang/llvm 3.9.0.
> >>
> >> So... is anyone else seeing unexpected panics or other odd behaviour starting
> >> after clang/llvm 3.9.0 was imported?
> >>
> >
> > Hi,
> >
> > Can you look at the code with "objdump -Dx --source" and see what is
> > going on there? Might it be the "sector" variable is shadowed?
>
> I don't see anything in the generated code for the call that can cause
> this, except for sector_size really being zero, or the result of
> 1048576/sector_size being zero.
>
> On i386, you get this:
>
> .loc 1 1349 19 # /usr/src/sys/dev/xen/blkfront/blkfront.c:1349:19
> movl -56(%ebp), %ecx # -56(%ebp) = sectors
> .Ltmp1148:
> #DEBUG_VALUE: xbd_connect:sectors <- %ECX
> .loc 1 1349 38 is_stmt 0 # /usr/src/sys/dev/xen/blkfront/blkfront.c:1349:38
> movl $1048576, %eax # imm = 0x100000
> xorl %edx, %edx
> divl -52(%ebp) # -52(%ebp) = sector_size
> movl %eax, %edi
> .loc 1 1349 27 # /usr/src/sys/dev/xen/blkfront/blkfront.c:1349:27
> xorl %edx, %edx
> movl %ecx, %eax
> divl %edi
> movl %eax, -32(%ebp) # 4-byte Spill
>
> On amd64, it looks pretty similar:
>
> .loc 1 1349 19 # /usr/src/sys/dev/xen/blkfront/blkfront.c:1349:19
> movq -112(%rbp), %rcx # -112(%rbp) = sectors
> .Ltmp1128:
> #DEBUG_VALUE: xbd_connect:sectors <- %RCX
> .loc 1 1349 38 is_stmt 0 # /usr/src/sys/dev/xen/blkfront/blkfront.c:1349:38
> movl $1048576, %eax # imm = 0x100000
> xorl %edx, %edx
> divq -88(%rbp) # -88(%rbp) = sector_size
> movq %rax, %rsi
> .loc 1 1349 27 # /usr/src/sys/dev/xen/blkfront/blkfront.c:1349:27
> xorl %edx, %edx
> movq %rcx, %rax
> divq %rsi
> movq %rax, %r15
>
> Colin, does it panic for you in the first or the second div?
>
> -Dimitry
>
>
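
For reference, the quoted expression contains two divisions, and either one
can trap: the first if sector_size is zero, the second if 1048576/sector_size
is zero. A minimal C sketch of those two cases follows. It is an illustration
only, not code from blkfront.c: the helper name disk_size_mb and the uint64_t
parameter types are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of blkfront.c): computes the size in MB
 * the same way the device_printf() argument does, but refuses to divide
 * when either divisor would be zero.
 */
static bool
disk_size_mb(uint64_t sectors, uint64_t sector_size, uintmax_t *mb)
{
	assert(mb != NULL);

	/* First division: 1048576 / sector_size traps if sector_size == 0. */
	if (sector_size == 0)
		return (false);

	uint64_t sectors_per_mb = 1048576 / sector_size;

	/* Second division: the quotient is 0 when sector_size > 1048576. */
	if (sectors_per_mb == 0)
		return (false);

	*mb = (uintmax_t)sectors / sectors_per_mb;
	return (true);
}
```

With 512-byte sectors, 1048576 / 512 is 2048, so 20971520 sectors are
reported as 10240 MB. A sector_size of 0 fails at the first division, and any
value above 1048576 fails at the second.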
