Date: Wed, 20 May 2009 15:52:45 GMT
From: Stephen Sanders <ssanders@opnet.com>
To: freebsd-gnats-submit@FreeBSD.org
Subject: amd64/134757: 32 bit processes on 64 bit platforms occasionally drop core with bad ds reg
Message-ID: <200905201552.n4KFqjxY096124@www.freebsd.org>
Resent-Message-ID: <200905201600.n4KG0CjM055281@freefall.freebsd.org>

>Number:         134757
>Category:       amd64
>Synopsis:       32 bit processes on 64 bit platforms occasionally drop core with bad ds reg
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    freebsd-amd64
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed May 20 16:00:12 UTC 2009
>Closed-Date:
>Last-Modified:
>Originator:     Stephen Sanders
>Release:        6.3 Release amd64
>Organization:
OPNET
>Environment:
FreeBSD alt-4100-2.lab.opnet.com 6.3-RELEASE FreeBSD 6.3-RELEASE #0: Tue Mar 31 14:11:07 PDT 2009 pmai@focus7.networkphysics.com:/u1/builds/ping/NP/FreeBSD/package/NPbabkernel/bld-tmp/sys/amd64/compile/NPBAB amd64
>Description:
With fair regularity, 32 bit processes (perl and bash in particular) drop
core on our 64 bit systems.

Our system is definitely a hybrid, but that aspect does not appear to be the
issue. The system works properly more often than not.

I have attached a file containing 2 gdb sessions. One session examines a
core that bash left behind; the other examines a live bash session with no
core.

In the core case, the core drop occurs when the instruction
"cmpl $0x0,0x80d41d4" is executed. Checking the registers, one will see that

    ss == ds == es == fs == gs == 0x23 (35), with cs == 0x1b (27)

In the non-core case, I halted execution on execute_command() and found that

    ss == 0x23 (35)
    ds == es == fs == gs == 0x0

This sounds suspiciously like a bug that was fixed in 7.1. I believe the
issue was in cpuswitch.S.

Porting the 32 bit processes up to 64 bits is not currently an option.
>How-To-Repeat:
Fork a 32 bit process on a 64 bit 6.3 FBSD machine often and long enough.
Something like once a minute.

Alternatively, fork a large number of 32 bit processes at boot time.
>Fix:
None.

Patch attached with submission follows:

The following is gdb output from debugging a /usr/local/bin/bash core drop.
Note that ds == ss.
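
As a reading aid for the register dumps that follow, the segment register values can be split into their architectural fields: bits 1..0 hold the requested privilege level (RPL), bit 2 the table indicator (0 = GDT, 1 = LDT), and bits 15..3 the descriptor index. The sketch below is only an illustration; the names selector_fields, decode_selector and print_selector are made up for this note and are not part of FreeBSD or gdb.

```c
#include <stdint.h>
#include <stdio.h>

/* Fields of an x86 segment selector (architectural layout). */
struct selector_fields {
    unsigned index; /* bits 15..3: index into the descriptor table */
    unsigned ti;    /* bit 2: 0 = GDT, 1 = LDT */
    unsigned rpl;   /* bits 1..0: requested privilege level */
};

/* Hypothetical helper: split a raw selector value into its fields. */
struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f;
    f.index = sel >> 3;
    f.ti = (sel >> 2) & 1u;
    f.rpl = sel & 3u;
    return f;
}

/* Hypothetical helper: print one register in a compact form. */
void print_selector(const char *name, uint16_t sel)
{
    struct selector_fields f = decode_selector(sel);
    printf("%s=0x%02x index=%u table=%s rpl=%u\n",
           name, (unsigned)sel, f.index, f.ti ? "LDT" : "GDT", f.rpl);
}
```

Applied to the values in this report, 0x23 decodes to index 4, GDT, RPL 3, and 0x1b to index 3, GDT, RPL 3, while 0x0 is the null selector. Calling print_selector("ds", 0x23) and print_selector("ds", 0x0) shows the difference between the two sessions at a glance.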
==============================================================================
(gdb)
(gdb) bt
#0  0x080759fe in kill_pid ()
#1  0x08074dc8 in wait_for ()
#2  0x08067d18 in execute_command_internal ()
#3  0x08068af1 in execute_command_internal ()
#4  0x08068cb3 in execute_command_internal ()
#5  0x08067ef5 in execute_command_internal ()
#6  0x08097317 in parse_and_execute ()
#7  0x0807cadc in command_substitute ()
#8  0x080801bc in pat_subst ()
#9  0x0807a6cc in cond_expand_word ()
#10 0x0807a76d in cond_expand_word ()
#11 0x0807a7c6 in expand_string_unsplit ()
#12 0x0807a44c in string_rest_of_args ()
#13 0x08079e7c in strip_trailing_ifs_whitespace ()
#14 0x0807a018 in do_assignment ()
#15 0x0808175e in expand_words_shellexp ()
#16 0x080811a4 in expand_words ()
#17 0x0806a95c in execute_command_internal ()
#18 0x08067cab in execute_command_internal ()
#19 0x08067796 in execute_command ()
#20 0x08068c63 in execute_command_internal ()
#21 0x08067ef5 in execute_command_internal ()
#22 0x08067796 in execute_command ()
---Type <return> to continue, or q <return> to quit---q
Quit
(gdb) info frame
Stack level 0, frame at 0xffffca40:
 eip = 0x80759fe in kill_pid; saved eip 0x8074dc8
 called by frame at 0xffffca90
 Arglist at 0xffffca38, args:
 Locals at 0xffffca38, Previous frame's sp is 0xffffca40
 Saved registers:
  ebx at 0xffffca2c, ebp at 0xffffca38, esi at 0xffffca30, edi at 0xffffca34,
  eip at 0xffffca3c
(gdb) disassemble 0x80759fe
Dump of assembler code for function kill_pid:
0x0807579c <kill_pid+0>:        push %ebp
0x0807579d <kill_pid+1>:        mov %esp,%ebp
0x0807579f <kill_pid+3>:        push %edi
0x080757a0 <kill_pid+4>:        push %esi
0x080757a1 <kill_pid+5>:        push %ebx
0x080757a2 <kill_pid+6>:        sub $0x3c,%esp
0x080757a5 <kill_pid+9>:        mov 0xc(%ebp),%edi
0x080757a8 <kill_pid+12>:       movl $0x0,0xffffffc0(%ebp)
0x080757af <kill_pid+19>:       cmpl $0x0,0x10(%ebp)
0x080757b3 <kill_pid+23>:       je 0x8075948 <kill_pid+428>
0x080757b9 <kill_pid+29>:       sub $0xc,%esp
0x080757bc <kill_pid+32>:       lea 0xffffffd8(%ebp),%esi
0x080757bf <kill_pid+35>:       push %esi
0x080757c0 <kill_pid+36>:       call 0x8059908 <_init+316>
0x080757c5 <kill_pid+41>:       add $0x8,%esp
0x080757c8 <kill_pid+44>:       push $0x14
0x080757ca <kill_pid+46>:       push %esi
0x080757cb <kill_pid+47>:       call 0x8059b78 <_init+940>
0x080757d0 <kill_pid+52>:       lea 0xffffffc8(%ebp),%ebx
0x080757d3 <kill_pid+55>:       mov %ebx,(%esp)
0x080757d6 <kill_pid+58>:       call 0x8059908 <_init+316>
0x080757db <kill_pid+63>:       add $0xc,%esp
---Type <return> to continue, or q <return> to quit---
0x080757de <kill_pid+66>:       push %ebx
0x080757df <kill_pid+67>:       push %esi
0x080757e0 <kill_pid+68>:       push $0x1
0x080757e2 <kill_pid+70>:       call 0x805a2b8 <close+112>
0x080757e7 <kill_pid+75>:       add $0xc,%esp
0x080757ea <kill_pid+78>:       lea 0xffffffc4(%ebp),%eax
0x080757ed <kill_pid+81>:       push %eax
0x080757ee <kill_pid+82>:       push $0x0
0x080757f0 <kill_pid+84>:       pushl 0x8(%ebp)
0x080757f3 <kill_pid+87>:       call 0x8073d4c <kill_current_pipeline+20>
0x080757f8 <kill_pid+92>:       mov %eax,%ebx
0x080757fa <kill_pid+94>:       add $0x10,%esp
0x080757fd <kill_pid+97>:       cmpl $0xffffffff,0xffffffc4(%ebp)
0x08075801 <kill_pid+101>:      je 0x807591c <kill_pid+384>
0x08075807 <kill_pid+107>:      mov 0xffffffc4(%ebp),%edx
0x0807580a <kill_pid+110>:      mov 0x80d3d60,%eax
0x0807580f <kill_pid+115>:      mov (%eax,%edx,4),%eax
0x08075812 <kill_pid+118>:      andl $0xfffffffd,0x10(%eax)
0x08075816 <kill_pid+122>:      mov 0xffffffc4(%ebp),%edx
0x08075819 <kill_pid+125>:      mov 0x80d3d60,%eax
0x0807581e <kill_pid+130>:      mov (%eax,%edx,4),%edx
0x08075821 <kill_pid+133>:      mov 0x8(%edx),%eax
0x08075824 <kill_pid+136>:      cmp 0x80cd618,%eax
---Type <return> to continue, or q <return> to quit---
0x0807582a <kill_pid+142>:      jne 0x8075878 <kill_pid+220>
0x0807582c <kill_pid+144>:      mov 0x4(%edx),%ebx
0x0807582f <kill_pid+147>:      nop
0x08075830 <kill_pid+148>:      sub $0x8,%esp
0x08075833 <kill_pid+151>:      push %edi
0x08075834 <kill_pid+152>:      pushl 0x4(%ebx)
0x08075837 <kill_pid+155>:      call 0x8059ca8 <_init+1244>
0x0807583c <kill_pid+160>:      add $0x10,%esp
0x0807583f <kill_pid+163>:      cmpl $0x0,0xc(%ebx)
0x08075843 <kill_pid+167>:      jne 0x8075860 <kill_pid+196>
0x08075845 <kill_pid+169>:      cmp $0xf,%edi
0x08075848 <kill_pid+172>:      je 0x807584f <kill_pid+179>
0x0807584a <kill_pid+174>:      cmp $0x1,%edi
0x0807584d <kill_pid+177>:      jne 0x8075860 <kill_pid+196>
0x0807584f <kill_pid+179>:      sub $0x8,%esp
0x08075852 <kill_pid+182>:      push $0x13
0x08075854 <kill_pid+184>:      pushl 0x4(%ebx)
0x08075857 <kill_pid+187>:      call 0x8059ca8 <_init+1244>
0x0807585c <kill_pid+192>:      add $0x10,%esp
0x0807585f <kill_pid+195>:      nop
0x08075860 <kill_pid+196>:      mov (%ebx),%ebx
0x08075862 <kill_pid+198>:      mov 0xffffffc4(%ebp),%eax
0x08075865 <kill_pid+201>:      mov 0x80d3d60,%edx
---Type <return> to continue, or q <return> to quit---
0x0807586b <kill_pid+207>:      mov (%edx,%eax,4),%eax
0x0807586e <kill_pid+210>:      cmp %ebx,0x4(%eax)
0x08075871 <kill_pid+213>:      jne 0x8075830 <kill_pid+148>
0x08075873 <kill_pid+215>:      jmp 0x8075930 <kill_pid+404>
0x08075878 <kill_pid+220>:      sub $0x8,%esp
0x0807587b <kill_pid+223>:      push %edi
0x0807587c <kill_pid+224>:      mov 0xffffffc4(%ebp),%eax
0x0807587f <kill_pid+227>:      mov 0x80d3d60,%edx
0x08075885 <kill_pid+233>:      mov (%edx,%eax,4),%eax
0x08075888 <kill_pid+236>:      pushl 0x8(%eax)
0x0807588b <kill_pid+239>:      call 0x8059e88 <unlink+160>
0x08075890 <kill_pid+244>:      mov %eax,0xffffffc0(%ebp)
0x08075893 <kill_pid+247>:      add $0x10,%esp
0x08075896 <kill_pid+250>:      test %ebx,%ebx
0x08075898 <kill_pid+252>:      je 0x8075930 <kill_pid+404>
0x0807589e <kill_pid+258>:      mov 0xffffffc4(%ebp),%eax
0x080758a1 <kill_pid+261>:      mov 0x80d3d60,%edx
0x080758a7 <kill_pid+267>:      mov (%edx,%eax,4),%eax
0x080758aa <kill_pid+270>:      cmpl $0x1,0xc(%eax)
0x080758ae <kill_pid+274>:      jne 0x80758d6 <kill_pid+314>
0x080758b0 <kill_pid+276>:      cmp $0xf,%edi
0x080758b3 <kill_pid+279>:      je 0x80758ba <kill_pid+286>
0x080758b5 <kill_pid+281>:      cmp $0x1,%edi
---Type <return> to continue, or q <return> to quit---
0x080758b8 <kill_pid+284>:      jne 0x80758d6 <kill_pid+314>
0x080758ba <kill_pid+286>:      sub $0x8,%esp
0x080758bd <kill_pid+289>:      push $0x13
0x080758bf <kill_pid+291>:      mov 0xffffffc4(%ebp),%eax
0x080758c2 <kill_pid+294>:      mov 0x80d3d60,%edx
0x080758c8 <kill_pid+300>:      mov (%edx,%eax,4),%eax
0x080758cb <kill_pid+303>:      pushl 0x8(%eax)
0x080758ce <kill_pid+306>:      call 0x8059e88 <unlink+160>
0x080758d3 <kill_pid+311>:      add $0x10,%esp
0x080758d6 <kill_pid+314>:      test %ebx,%ebx
0x080758d8 <kill_pid+316>:      je 0x8075930 <kill_pid+404>
0x080758da <kill_pid+318>:      mov 0xffffffc4(%ebp),%edx
0x080758dd <kill_pid+321>:      mov 0x80d3d60,%eax
0x080758e2 <kill_pid+326>:      mov (%eax,%edx,4),%eax
0x080758e5 <kill_pid+329>:      cmpl $0x1,0xc(%eax)
0x080758e9 <kill_pid+333>:      jne 0x8075930 <kill_pid+404>
0x080758eb <kill_pid+335>:      cmp $0x13,%edi
0x080758ee <kill_pid+338>:      jne 0x8075930 <kill_pid+404>
0x080758f0 <kill_pid+340>:      push %edx
0x080758f1 <kill_pid+341>:      call 0x807543c <reap_dead_jobs+580>
0x080758f6 <kill_pid+346>:      mov 0xffffffc4(%ebp),%edx
0x080758f9 <kill_pid+349>:      mov 0x80d3d60,%eax
0x080758fe <kill_pid+354>:      mov (%eax,%edx,4),%eax
---Type <return> to continue, or q <return> to quit---
0x08075901 <kill_pid+357>:      andl $0xfffffffe,0x10(%eax)
0x08075905 <kill_pid+361>:      mov 0xffffffc4(%ebp),%edx
0x08075908 <kill_pid+364>:      mov 0x80d3d60,%eax
0x0807590d <kill_pid+369>:      mov (%eax,%edx,4),%eax
0x08075910 <kill_pid+372>:      orl $0x2,0x10(%eax)
0x08075914 <kill_pid+376>:      add $0x4,%esp
0x08075917 <kill_pid+379>:      jmp 0x8075930 <kill_pid+404>
0x08075919 <kill_pid+381>:      lea 0x0(%esi),%esi
0x0807591c <kill_pid+384>:      sub $0x8,%esp
0x0807591f <kill_pid+387>:      push %edi
0x08075920 <kill_pid+388>:      pushl 0x8(%ebp)
0x08075923 <kill_pid+391>:      call 0x8059e88 <unlink+160>
0x08075928 <kill_pid+396>:      mov %eax,0xffffffc0(%ebp)
0x0807592b <kill_pid+399>:      add $0x10,%esp
0x0807592e <kill_pid+402>:      mov %esi,%esi
0x08075930 <kill_pid+404>:      sub $0x4,%esp
0x08075933 <kill_pid+407>:      push $0x0
0x08075935 <kill_pid+409>:      lea 0xffffffc8(%ebp),%eax
0x08075938 <kill_pid+412>:      push %eax
0x08075939 <kill_pid+413>:      push $0x3
0x0807593b <kill_pid+415>:      call 0x805a2b8 <close+112>
0x08075940 <kill_pid+420>:      add $0x10,%esp
0x08075943 <kill_pid+423>:      jmp 0x807595a <kill_pid+446>
---Type <return> to continue, or q <return> to quit---
0x08075945 <kill_pid+425>:      lea 0x0(%esi),%esi
0x08075948 <kill_pid+428>:      sub $0x8,%esp
0x0807594b <kill_pid+431>:      push %edi
0x0807594c <kill_pid+432>:      pushl 0x8(%ebp)
0x0807594f <kill_pid+435>:      call 0x8059ca8 <_init+1244>
0x08075954 <kill_pid+440>:      mov %eax,0xffffffc0(%ebp)
0x08075957 <kill_pid+443>:      add $0x10,%esp
0x0807595a <kill_pid+446>:      mov 0xffffffc0(%ebp),%eax
0x0807595d <kill_pid+449>:      lea 0xfffffff4(%ebp),%esp
0x08075960 <kill_pid+452>:      pop %ebx
0x08075961 <kill_pid+453>:      pop %esi
0x08075962 <kill_pid+454>:      pop %edi
0x08075963 <kill_pid+455>:      leave
0x08075964 <kill_pid+456>:      ret
0x08075965 <kill_pid+457>:      lea 0x0(%esi),%esi
0x08075968 <kill_pid+460>:      push %ebp
0x08075969 <kill_pid+461>:      mov %esp,%ebp
0x0807596b <kill_pid+463>:      push %ebx
0x0807596c <kill_pid+464>:      sub $0x4,%esp
0x0807596f <kill_pid+467>:      call 0x8059d58 <_init+1420>
0x08075974 <kill_pid+472>:      mov (%eax),%ebx
0x08075976 <kill_pid+474>:      incl 0x80d41d4
0x0807597c <kill_pid+480>:      cmpl $0x0,0x80d41d8
---Type <return> to continue, or q <return> to quit---
0x08075983 <kill_pid+487>:      jne 0x8075994 <kill_pid+504>
0x08075985 <kill_pid+489>:      sub $0x8,%esp
0x08075988 <kill_pid+492>:      push $0x0
0x0807598a <kill_pid+494>:      push $0xffffffff
0x0807598c <kill_pid+496>:      call 0x80759a0 <kill_pid+516>
0x08075991 <kill_pid+501>:      add $0x10,%esp
0x08075994 <kill_pid+504>:      call 0x8059d58 <_init+1420>
0x08075999 <kill_pid+509>:      mov %ebx,(%eax)
0x0807599b <kill_pid+511>:      mov 0xfffffffc(%ebp),%ebx
0x0807599e <kill_pid+514>:      leave
0x0807599f <kill_pid+515>:      ret
0x080759a0 <kill_pid+516>:      push %ebp
0x080759a1 <kill_pid+517>:      mov %esp,%ebp
0x080759a3 <kill_pid+519>:      push %edi
0x080759a4 <kill_pid+520>:      push %esi
0x080759a5 <kill_pid+521>:      push %ebx
0x080759a6 <kill_pid+522>:      sub $0x1c,%esp
0x080759a9 <kill_pid+525>:      mov $0x0,%edi
0x080759ae <kill_pid+530>:      movl $0x0,0xffffffe8(%ebp)
0x080759b5 <kill_pid+537>:      movl $0xffffffff,0xffffffe4(%ebp)
0x080759bc <kill_pid+544>:      cmpl $0x0,0x80cd634
0x080759c3 <kill_pid+551>:      je 0x80759d3 <kill_pid+567>
0x080759c5 <kill_pid+553>:      mov $0x6,%esi
---Type <return> to continue, or q <return> to quit---
0x080759ca <kill_pid+558>:      cmpl $0x0,0x80d5454
0x080759d1 <kill_pid+565>:      je 0x80759d8 <kill_pid+572>
0x080759d3 <kill_pid+567>:      mov $0x0,%esi
0x080759d8 <kill_pid+572>:      cmpl $0x0,0x80d41d4
0x080759df <kill_pid+579>:      jne 0x80759e7 <kill_pid+587>
0x080759e1 <kill_pid+581>:      cmpl $0x0,0xc(%ebp)
0x080759e5 <kill_pid+585>:      jne 0x80759ea <kill_pid+590>
0x080759e7 <kill_pid+587>:      or $0x1,%esi
0x080759ea <kill_pid+590>:      sub $0x4,%esp
0x080759ed <kill_pid+593>:      push %esi
0x080759ee <kill_pid+594>:      lea 0xfffffff0(%ebp),%eax
0x080759f1 <kill_pid+597>:      push %eax
0x080759f2 <kill_pid+598>:      push $0xffffffff
0x080759f4 <kill_pid+600>:      call 0x8059838 <_init+108>
0x080759f9 <kill_pid+605>:      mov %eax,%ebx
0x080759fb <kill_pid+607>:      add $0x10,%esp
0x080759fe <kill_pid+610>:      cmpl $0x0,0x80d41d4
0x08075a05 <kill_pid+617>:      jle 0x8075a15 <kill_pid+633>
(gdb) info registers
eax            0x369            873
ecx            0x0              0
edx            0x0              0
ebx            0x369            873
esp            0xffffca10       0xffffca10
ebp            0xffffca38       0xffffca38
esi            0x0              0
edi            0x0              0
eip            0x80759fe        0x80759fe
eflags         0x10282          66178
cs             0x1b             27
ss             0x23             35
ds             0x23             35
es             0x23             35
fs             0x23             35
gs             0x23             35

===============================================================================
The following is a gdb session that is not a core drop. Here ds != ss
===============================================================================
alt-4100-2[1036] # gdb /usr/local/bin/bash
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.
Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...(no debugging symbols found)...
(gdb) break execute_command
Breakpoint 1 at 0x8067755
(gdb) run
Starting program: /usr/local/bin/bash
bash-3.00# ls

Breakpoint 1, 0x08067755 in execute_command ()
(gdb) bt
#0  0x08067755 in execute_command ()
#1  0x0805c749 in reader_loop ()
#2  0x0805aba0 in main ()
(gdb) info frame
Stack level 0, frame at 0xffffdc00:
 eip = 0x8067755 in execute_command; saved eip 0x805c749
 called by frame at 0xffffdc30
 Arglist at 0xffffdbf8, args:
 Locals at 0xffffdbf8, Previous frame's sp is 0xffffdc00
 Saved registers:
  ebx at 0xffffdbf0, ebp at 0xffffdbf8, esi at 0xffffdbf4, eip at 0xffffdbfc
(gdb) info registers
eax            0x80dd5a0        135124384
ecx            0x0              0
edx            0xffffd400       -11264
ebx            0x0              0
esp            0xffffdbf0       0xffffdbf0
ebp            0xffffdbf8       0xffffdbf8
esi            0x0              0
edi            0xffffdcd4       -9004
eip            0x8067755        0x8067755
eflags         0x292            658
cs             0x1b             27
ss             0x23             35
ds             0x0              0
es             0x0              0
fs             0x0              0
gs             0x0              0
(gdb)

>Release-Note:
>Audit-Trail:
>Unformatted:
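
The How-To-Repeat recipe (fork a 32 bit process repeatedly over a long period) could be scripted roughly as follows. This is a minimal sketch, not the submitter's actual test setup: the function names spawn_and_wait and stress are hypothetical, and the binary to run, the iteration count, and the delay are left to the caller. It uses only standard POSIX calls (fork, execv, waitpid).

```c
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork, exec argv[0] with argv, and wait for the child.
 * Returns the signal number if the child was killed by a signal,
 * 0 if it exited on its own (whatever its exit code, including an
 * exec failure reported as 127), and -1 if fork or waitpid failed.
 */
int spawn_and_wait(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execv(argv[0], argv);
        _exit(127); /* only reached if execv failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (WIFSIGNALED(status))
        return WTERMSIG(status);
    return 0;
}

/*
 * Run the command `iterations` times, sleeping `delay_seconds`
 * between runs, and return how many runs died from a signal.
 */
int stress(char *const argv[], int iterations, unsigned delay_seconds)
{
    int killed = 0;
    for (int i = 0; i < iterations; i++) {
        int sig = spawn_and_wait(argv);
        if (sig > 0) {
            killed++;
            fprintf(stderr, "run %d: child killed by signal %d\n", i, sig);
        }
        if (delay_seconds > 0)
            sleep(delay_seconds);
    }
    return killed;
}
```

To approximate "once a minute", one would call stress() with argv pointing at a 32 bit binary such as the /usr/local/bin/bash from this report, a large iteration count, and a delay of 60. A nonzero return value means at least one child died from a signal; whether that was the fault described here would still have to be confirmed from the core file.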