Date: Sun, 05 Oct 2008 21:33:03 -0600
From: Dale Hagglund <dale.hagglund@gmail.com>
To: freebsd-questions@freebsd.org
Cc: db@freebsd.org, Mel <fbsd.questions@rachie.is-a-geek.net>
Subject: Re: processes hanging in _umtx_op
Message-ID: <86skra4cuo.fsf@ponoka.ab.hsia.telus.net>
In-Reply-To: <200810052019.01920.fbsd.questions@rachie.is-a-geek.net> (Mel's message of "Sun, 5 Oct 2008 20:19:01 +0200")
References: <86r66v6gsj.fsf@ponoka.ab.hsia.telus.net> <200810051546.28440.fbsd.questions@rachie.is-a-geek.net> <86bpxz58l9.fsf@ponoka.ab.hsia.telus.net> <200810052019.01920.fbsd.questions@rachie.is-a-geek.net>

[Mel, the last time I replied to your @rachie address, I got a bounce.
I'm still including it here on the CC list. Should I remove it and
just reply to you via this list? --rdh]

Diane, Mel, thanks for your suggestions so far.

Mel> If upgrading ports is a possible solution, then you have the
Mel> fine task of finding out, which library in everything that's
Mel> being loaded is *NOT* linked with libthr, cause a likely
Mel> candidate would be two different threading libraries being
Mel> used. I would start with ldd -a /path/to/python/wx.so and see
Mel> if both libthr.so and libpthread.so (or maybe even libkse) show
Mel> up.

What I did was this:

    $ python -c "import wx"&

which hangs. Then I did

    $ lsof -p $pid | grep '\.so'

to get a list of open shared objects. The only matches for "thr" are

    /lib/libthr.so.3
    /usr/local/lib/libgthread-2.0.so.0

There are no matches for "kse". Then I started doing

    $ lsof -p $pid |
    > grep '\.so' |
    > awk '{print $NF}' |
    > xargs -n 1 ldd -a | less

When I looked closely at the many libthr.so.3 references, though, I
saw something quite interesting. As far as I can tell, not all are
loaded at the same address. This is quite confusing to me.

    $ lsof -p $pid |
    > grep '\.so' |
    > awk '{print $NF}' |
    > xargs -n 1 ldd -f '\t%o %p %x\n' -a |
    > awk 'NF==1 {prefix=$1; next} {print prefix, $0}' |
    > awk '$2 ~ /libthr/ { print $4 }' |
    > sort |
    > uniq -c |
    > sort -nr
      22 0x28bc8000
       7 0x2953f000
       5 0x2945f000
       5 0x29371000
       4 0x29407000
       4 0x293fa000
       3 0x2934d000
       2 0x28952000
       2 0x2894b000
       1 0x29a79000
       1 0x2960d000
       1 0x289fb000
       1 0x28921000
       1 0x28548000
       1 0x281b6000
    $

However, closer inspection shows that, confusing as it is, this
behaviour is common to almost all the shared libraries loaded into the
stuck python process. Indeed, only libc seems to have just one loaded
address. Also, this pipeline is actually inspecting the results from
many different runs of ldd on each .so, instead of looking at the
state of the running process.

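For anyone who wants to reproduce the counting step without a hung
process at hand, here is a minimal sketch of the same aggregation run
on made-up input. The function names count_addrs and sample, the
/hypothetical/*.so paths, and the libc address are invented for
illustration; only the two awk stages and the sort/uniq tail come from
the pipeline above.

```shell
#!/bin/sh
# Sketch only: same aggregation as the pipeline above, fed canned text.

# count_addrs LIB
# Reads text shaped like `ldd -a -f '\t%o %p %x\n'` output on stdin:
# a one-field header line per inspected object, followed by
# "name path address" lines. Prints how many times each load address
# was reported for libraries whose name matches LIB.
count_addrs() {
    lib="$1"
    awk 'NF==1 {prefix=$1; next} {print prefix, $0}' |
    awk -v lib="$lib" '$2 ~ lib { print $4 }' |
    sort |
    uniq -c |
    sort -nr
}

# Hypothetical sample input, NOT taken from the affected machine.
sample() {
cat <<'EOF'
/hypothetical/a.so:
    libthr.so.3 /lib/libthr.so.3 0x28bc8000
    libc.so.7 /lib/libc.so.7 0x28100000
/hypothetical/b.so:
    libthr.so.3 /lib/libthr.so.3 0x28bc8000
/hypothetical/c.so:
    libthr.so.3 /lib/libthr.so.3 0x2953f000
EOF
}

sample | count_addrs libthr
```

With this sample, 0x28bc8000 is counted twice and 0x2953f000 once.
As with the real pipeline, the counts summarise separate ldd runs, one
per shared object, and say nothing about where the running process
actually mapped each library.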
A little more poking leads to the following result, which is again
confusing to me:

    $ lsof -p 79117 |
    > grep '\.so' |
    > awk '{print $NF}' |
    > sort | uniq -c | sort -nr |
    > head
       2 /usr/local/lib/python2.5/site-packages/wx-2.8-gtk2-ansi/wx/_core_.so
       1 /usr/local/lib/libxml2.so.5
       1 /usr/local/lib/libwx_gtk2_xrc-2.8.so.0.2.0
       1 /usr/local/lib/libwx_gtk2_qa-2.8.so.0.2.0
       1 /usr/local/lib/libwx_gtk2_html-2.8.so.0.2.0
       1 /usr/local/lib/libwx_gtk2_core-2.8.so.0.2.0
       1 /usr/local/lib/libwx_gtk2_aui-2.8.so.0.2.0
       1 /usr/local/lib/libwx_gtk2_adv-2.8.so.0.2.0
       1 /usr/local/lib/libwx_base_xml-2.8.so.0.2.0
       1 /usr/local/lib/libwx_base_net-2.8.so.0.2.0
    $

The python wx core library seems to have been opened twice, unlike
every other shared object that the python process has opened.

Anyway, I don't know what to make of these results. Also, they seem
at least somewhat unlikely to be related to seeing the same hang in
ooo3.

Mel> Also inspect /etc/libmap.conf for entries you may have added in
Mel> a not too recent past and forgot about.

No such file on my system.

Mel> Unfortunately, I see no obvious candidates in your package list (ie:
Mel> compat-[456]x, *flash*).

I had compat-5x installed and removed it, but the problem persisted.
I still have compat-6x installed.

So, the upshot is I still don't see a smoking gun anywhere, but I
certainly see some things that are confusing, although that has no
bearing on whether or not they're actually problems. If anything
above inspires you with more questions, let me know and I can do more
poking around.

The next step, I guess, is to rebuild with ULE and/or try out 7.1
prerelease.

Thanks again for your help so far.

Dale.