From owner-freebsd-current@FreeBSD.ORG  Tue Sep  2 18:41:12 2008
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 8C9E81065670
	for <freebsd-current@freebsd.org>; Tue,  2 Sep 2008 18:41:12 +0000 (UTC)
	(envelope-from freebsd-current@adam.gs)
Received: from mail.adam.gs (cl-127.ewr-01.us.sixxs.net
	[IPv6:2001:4830:1200:7e::2])
	by mx1.freebsd.org (Postfix) with ESMTP id 180118FC2C
	for <freebsd-current@freebsd.org>; Tue,  2 Sep 2008 18:41:12 +0000 (UTC)
	(envelope-from freebsd-current@adam.gs)
Received: from [127.0.0.1] (localhost.adam.gs [127.0.0.1])
	by mail.adam.gs (Postfix) with ESMTP id 99888F8D61D
	for <freebsd-current@freebsd.org>; Tue,  2 Sep 2008 14:41:10 -0400 (EDT)
DomainKey-Signature: a=rsa-sha1; q=dns; c=simple; s=mail; d=adam.gs;
	b=Qy4SktDGZf3OQ4mxIAR62j/qaE4uOwFhLdKS+XQghXt2G27xFpWXYksHGyvvo28BQFEuRbZ3iDqEoaXomWJOb8mHpULGVMdRWvI/GECR8GJrIhtKtKdhdGm7m4ZNYF/WPnMZWcTSK9xc/n3CaRSx77D2Vo7pGu1//Zv2iuyolk4=;
Message-Id: <EE59210F-072C-48E7-BDBE-0164F5407187@adam.gs>
From: Adam Jacob Muller <freebsd-current@adam.gs>
To: Kostik Belousov <kostikbel@gmail.com>
In-Reply-To: <20080901145315.GU2038@deviant.kiev.zoral.com.ua>
Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
Content-Transfer-Encoding: 7bit
Mime-Version: 1.0 (Apple Message framework v928.1)
Date: Tue, 2 Sep 2008 14:41:08 -0400
References: <ed91d4a80808300946s49ff076dw64b57f8e9058f2d@mail.gmail.com>
	<20080830183804.GG2038@deviant.kiev.zoral.com.ua>
	<ed91d4a80808301250j1a4802d4o412c6b5e30979079@mail.gmail.com>
	<20080830195844.GI2038@deviant.kiev.zoral.com.ua>
	<ed91d4a80808301403t5b776d10ubd184bc1ff01215@mail.gmail.com>
	<20080831071618.GK2038@deviant.kiev.zoral.com.ua>
	<20080831091639.GM2038@deviant.kiev.zoral.com.ua>
	<80861bfa0809010733h47580d3evb3eb68c972a2bb25@mail.gmail.com>
	<20080901145315.GU2038@deviant.kiev.zoral.com.ua>
X-Authentication: 5Y9zP9uohPVCIKSKssvSwTvzf0+nN8I7IHuNjRHchA2ZGkQ0S1nNBDHYYPDC3FPZGd9A1cjjjyDCoZ9VN+JLpLiSCkZVi7UuiAa/zQMWg10/YCOcNIalQ2HNL//J/eqV60ibrXDFlBGWDW/aW03wCn3+ZAGKOzfiwburp0e9qa2uGm7X2FGol2eC+A==
Cc: freebsd-current@freebsd.org, Vyacheslav Bocharov <adeepv@gmail.com>
Subject: Re: __tls_get_addr problem with recent current
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
	<freebsd-current.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>, 
	<mailto:freebsd-current-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current>
List-Post: <mailto:freebsd-current@freebsd.org>
List-Help: <mailto:freebsd-current-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>,
	<mailto:freebsd-current-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 02 Sep 2008 18:41:12 -0000


On Sep 1, 2008, at 10:53 AM, Kostik Belousov wrote:

> On Mon, Sep 01, 2008 at 05:33:37PM +0300, Vyacheslav Bocharov wrote:
>> I have a similar problem in 7-STABLE (from 1 Sep):
>> a 32-bit application execs a 64-bit application and we get a core dump:
>>
>> # gdb fw.sh fw.sh.core
>> GNU gdb 6.1.1 [FreeBSD]
>> Copyright 2004 Free Software Foundation, Inc.
>> GDB is free software, covered by the GNU General Public License, and you are
>> welcome to change it and/or distribute copies of it under certain
>> conditions.
>> Type "show copying" to see the conditions.
>> There is absolutely no warranty for GDB.  Type "show warranty" for details.
>> This GDB was configured as "amd64-marcel-freebsd"...
>> Core was generated by `fw.sh'.
>> Program terminated with signal 11, Segmentation fault.
>> Reading symbols from /usr/lib/libstdc++.so.6...done.
>> Loaded symbols for /usr/lib/libstdc++.so.6
>> Reading symbols from /lib/libm.so.5...done.
>> Loaded symbols for /lib/libm.so.5
>> Reading symbols from /lib/libgcc_s.so.1...done.
>> Loaded symbols for /lib/libgcc_s.so.1
>> Reading symbols from /lib/libc.so.7...done.
>> Loaded symbols for /lib/libc.so.7
>> Reading symbols from /libexec/ld-elf.so.1...done.
>> Loaded symbols for /libexec/ld-elf.so.1
>> #0  0x0000000800507483 in __tls_get_addr () from /libexec/ld-elf.so.1
>> (gdb) bt
>> #0  0x0000000800507483 in __tls_get_addr () from /libexec/ld-elf.so.1
>> #1  0x0000000800ad8892 in _pthread_mutex_init_calloc_cb () from
>> /lib/libc.so.7
>> #2  0x0000000800ada35f in malloc () from /lib/libc.so.7
>> #3  0x00000008007050ad in operator new () from /usr/lib/libstdc++.so.6
>> #4  0x00000008006b5f21 in std::string::_Rep::_S_create ()
>>   from /usr/lib/libstdc++.so.6
>> #5  0x00000008006b6ca5 in std::string::_S_copy_chars ()
>>   from /usr/lib/libstdc++.so.6
>> #6  0x00000008006b6dc2 in std::basic_string<char,  
>> std::char_traits<char>,
>> std::allocator<char> >::basic_string () from /usr/lib/libstdc++.so.6
>> #7  0x00000000004021ec in __static_initialization_and_destruction_0 (
>>    __initialize_p=1, __priority=65535) at CCmdLine.cpp:16
>> #8  0x00000000004026c3 in global constructors keyed to cmdlist ()
>>    at CCmdLine.cpp:177
>> #9  0x00000000004033a2 in __do_global_ctors_aux ()
>> #10 0x000000000040113e in _init ()
>> #11 0x0000000800b2b0c0 in __cxa_atexit () from /lib/libc.so.7
>> #12 0x00000000004014e8 in _start ()
>> #13 0x000000080052c000 in ?? ()
>>
>> I tried your patch but nothing changed.
> Exactly which patch ? There were three, one of which caused immediate
> panic. I put the patches at
> http://people.freebsd.org/~kib/misc/fsbase.1.patch
> http://people.freebsd.org/~kib/misc/fsbase.2.patch
>
> Could you, please, try both and report the results ?
> And, isolated test case, as several C files or recipe to reproduce
> this with base system, would be ideal.
>
>>
>> 2008/8/31 Kostik Belousov <kostikbel@gmail.com>
>>
>>> On Sun, Aug 31, 2008 at 10:16:18AM +0300, Kostik Belousov wrote:
>>>> On Sat, Aug 30, 2008 at 02:03:00PM -0700, Artem Belevich wrote:
>>>>> With the new patch kernel has crashed as soon as I ran i386 app,
>>>>> though the crash happened within in-kernel thread g_up:
>>>>>
>>>>> Fatal trap 12: page fault while in kernel mode
>>>>> cpuid = 2; apic id = 02
>>>>> fault virtual address   = 0x20
>>>>> fault code              = supervisor read data, page not present
>>>>> instruction pointer     = 0x8:0xffffffff804a821f
>>>>> stack pointer           = 0x10:0xffffffffac280b60
>>>>> frame pointer           = 0x10:0x0
>>>>> code segment            = base 0x0, limit 0xfffff, type 0x1b
>>>>>                       = DPL 0, pres 1, long 1, def32 0, gran 1
>>>>> processor eflags        = resume, IOPL = 0
>>>>> current process         = 3 (g_up)
>>>>> trap number             = 12
>>>>> panic: page fault
>>>>> cpuid = 2
>>>>> Uptime: 37s
>>>>> Physical memory: 8169 MB
>>>>> Dumping 380 MB: 365 349 333 317 301 285 269 253 237 221 205 189 173
>>>>> 157 141 125 109 93 77 61 45 29 13
>>>> Could you, please, show me the disassembled code around the faulted
>>>> %rip ?
>>>
>>> No need, it seems I found the problem. I trashed the %rdx that contains
>>> the third cpu_switch argument. Please, try the updated patch.
>>>
>>> Thanks for the testing !
>>>
>>> diff --git a/sys/amd64/amd64/cpu_switch.S b/sys/amd64/amd64/cpu_switch.S
>>> index f34b0cc..03f0eca 100644
>>> --- a/sys/amd64/amd64/cpu_switch.S
>>> +++ b/sys/amd64/amd64/cpu_switch.S
>>> @@ -249,6 +249,12 @@ store_seg:
>>> 1:     movl    %ds,PCB_DS(%r8)
>>>       movl    %es,PCB_ES(%r8)
>>>       movl    %fs,PCB_FS(%r8)
>>> +       movq    %rdx,%r11
>>> +       movl    $MSR_FSBASE,%ecx
>>> +       rdmsr
>>> +       shlq    $32,%rdx
>>> +       leaq    (%rax,%rdx),%r9
>>> +       movq    %r11,%rdx
>>>        jmp     done_store_seg
>>> 2:     movq    PCB_GS32P(%r8),%rax
>>>       movq    (%rax),%rax
>>>
>>
>>
>>
>> -- 
>> Vyacheslav Bocharov


Hi,
I have this same issue on recent RELENG_7 (both before and after
7.1-PRERELEASE). It is reproducible with a simple C program compiled on
32-bit 7.x:

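#include <stdio.h>
#include <unistd.h>
int main(void)
{
    /* Replace this process image with /bin/ls. */
    execl("/bin/ls", "/bin/ls", (char *)0);
    perror("execl");    /* only reached if execl() fails */
    return 1;
}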

This program segfaults fairly reliably, though not on every run. I looped
it until the first failure with:

    while true; do ./test; if [ "$?" -gt "0" ]; then break; fi; done
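
The shell loop above could also be written as a small C harness, which makes it easy to count how many runs it takes before the first failure and to tell a signal (such as the segfault) apart from a plain non-zero exit. This is only a sketch: the names run_once and run_until_failure are made up for illustration, and it treats any non-zero exit status as a failure, the same way the shell test on $? does.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] once.
 * Returns 0 on a clean exit with status 0, -1 otherwise. */
static int run_once(char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        execv(argv[0], argv);
        _exit(127);             /* only reached if execv() failed */
    }
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    if (WIFSIGNALED(status)) {
        /* This is the case the segfault would land in. */
        fprintf(stderr, "child killed by signal %d\n", WTERMSIG(status));
        return -1;
    }
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}

/* Hypothetical helper: repeat the command up to max_runs times.
 * Returns the 1-based number of the first failing run, or 0 if
 * every run succeeded. */
long run_until_failure(char *const argv[], long max_runs)
{
    assert(argv != NULL && argv[0] != NULL);

    for (long i = 1; i <= max_runs; i++) {
        if (run_once(argv) != 0)
            return i;
    }
    return 0;
}
```

Called from a main() with an argv of { "./test", NULL } and a generous max_runs, the return value says how many iterations were needed to hit the crash, which may help when comparing how often each patch fails.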

Patch 1 (http://people.freebsd.org/~kib/misc/fsbase.1.patch) fixes the
issue for me.
Patch 2 (http://people.freebsd.org/~kib/misc/fsbase.2.patch) does not,
though it may mitigate it slightly (the crashes seem less frequent).


-Adam