From: Anton Shterenlikht
To: jhell
Cc: FreeBSD Current, freebsd-ia64@freebsd.org
Date: Thu, 18 Mar 2010 15:51:13 +0000
Subject: Re: ldd leaves the machine unresponsive
Message-ID: <20100318155113.GE1552@mech-cluster241.men.bris.ac.uk>
References: <20100317163230.GJ87732@mech-cluster241.men.bris.ac.uk>
List-Id: Porting FreeBSD to the IA-64

On Thu, Mar 18, 2010 at 11:29:36AM -0400, jhell wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
>
> On Wed, 17 Mar 2010 12:32, Anton Shterenlikht wrote:
> In Message-Id: <20100317163230.GJ87732@mech-cluster241.men.bris.ac.uk>
>
> > Just updated to ia64 r205248
> >
> > If my problem is due to my mis-configuration,
> > I apologise in advance.
> >
> > I run this shell script after each upgrade
> > and 'make delete-old-libs' to check
> > if any shared objects need to be rebuilt:
> >
> > #!/bin/sh
> >
> > for file in `find /bin /sbin /usr/bin /usr/sbin /usr/lib /usr/libexec /usr/local -name "*"`
> > do
> >     echo $file
> >     ldd $file >> /root/ldd_results 2> /dev/zero
> > done
> >
>
> This will probably do closer to what you actually would want to look for.
>
> Writing to /dev/zero ... I don't know never tried it since /dev/null is
> usually the standard place to throw trash.
>
> #!/bin/sh
> for file in `find /*bin /usr/*bin /usr/lib* /usr/local/*bin -type f`
> do
>     echo $file
>     ldd $file >>/root/ldd_results 2>/dev/null
> done
>
> The problem with your script is that it finds most files that it can not
> or is not useful to run ldd on and leaves you junk in return.
>
> It might be more useful if you searched for dynamically linked ELF
> binaries to run ldd against like the following.
>
> === Script starts here ===
> #!/bin/sh
>
> SEARCHPATH="/*bin /usr/*bin /usr/lib* /usr/local/*bin"
>
> trap 'exit 1' 2
>
> check_libs() {
>     for spath in $SEARCHPATH; do
>         for ifelf in `find $spath -type f`; do
>             ldd `file $ifelf | grep dynamically | cut -f1 -d:`
>         done
>     done
> }
>
> check_libs 2>/dev/null
> === Script ends here ===
>
> The above will find all type ELF * that are dynamically linked within the
> SEARCHPATH variable and run ldd on them and print the results to stdout.
>
> Obviously since you are going to have thousands of files being questioned,
> stdout is not going to be useful.
>
> So with the above stated:
> save the script to: checklibs.sh
> run with: "sh checklibs.sh >/root/checklibs_output"
> or: "script /root/checklibs_output checklibs.sh"
>
> > After the upgrade to r205248, the script
> > freezes at seemingly random points.
> >
>
> Unneeded disk usage & execution.
>
> > I can still ssh to the machine (using keys), i.e.
> > I see the welcome message, but cannot get to the console prompt.
>
> Of course... too many open files or processes in wait. SSH already has the
> information it needs loaded into memory, that's why you can get sort-of-in
>
> ZFS file-system perhaps ?
>
> >
> > On the serial console I cannot get the prompt
> > after entering the root password.
> >
>
> See above.
>
> > I have top(1) running interactively in another window.
> > The sh process is in "getblk" state, and ignores kill -9.
> > But there's no ldd process.
> >
> > And shutdown requests are also ignored:
> >
> > # shutdown -r now
> > Shutdown NOW!
> > shutdown: [pid 8019]
> > #
> > and nothing happens after that
> >
> > So I have to do a cold reset via MP.
> >
> > On ia64 r204322, this script causes no problems.
> >
> > Please advise
> >
>
> The above edited script should help to limit disk usage and the too many open
> processes that cause your machine to bog down like that. This script does
> have its limitations and there is one bug in it... I'll let you figure out
> how to get rid of that bug but it really does not affect the intended
> output so I left it alone and sent error output to fd/2.
>
> The limitation you'll find is how many files ldd(1) or file(1) can
> handle at one time. But if you specify specific paths like already in
> SEARCHPATH then you will most likely never see this unless the files in
> /*bin grow to be over the max number of files that file(1) or ldd(1) can
> handle at one time. Shortly said... use direct paths or short globs like
> above.
>
> > many thanks
> > anton
> >
>
> A final note you might want to just install sysutils/libchk and run that.
>
> Standard Disclaimer: NONE OF THIS CONTAINED HEREIN "THIS MESSAGE" EXCUSES
> ANY OF THE UNEXPLAINED DISK LOCKING THAT IS GOING ON AND THE INFORMATION
> FOR WHICH IT MAY CONTAIN BECOMING UNAVAILABLE AT ANY POINT IN TIME DURING
> THE ORIGINAL RUN OF THE FIRST SCRIPT OR THE SECOND SCRIPT THAT WAS POSTED
> EITHER AS A ATTACHMENT OR IN-LINE.
>
> ;) JK!
>
> Good Luck.

many thanks, this is very helpful

I don't seem to have this lockup anymore.
Don't know what was happening. I've run it now
several times on 3 different ia64 current
(different revisions) boxes, with disks of different
speed, and can't reproduce.

My script was very crude, of course.
I'll try sysutils/libchk

thanks again
anton

--
Anton Shterenlikht
Room 2.6, Queen's Building
Mech Eng Dept
Bristol University
University Walk, Bristol BS8 1TR, UK
Tel: +44 (0)117 331 5944
Fax: +44 (0)117 929 4423
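
For reference, a minimal sketch of the approach discussed in the thread: find regular files, ask file(1) whether each one is dynamically linked, and only then hand it to ldd(1), one file at a time. The function names is_dynamic and check_libs are illustrative, not taken from any posted script, and the match on the string "dynamically linked" assumes the usual wording of file(1) output. As far as I can tell this also avoids calling ldd with no argument when a file turns out not to be dynamically linked. It is untested on ia64 and does not explain the lockup.

```shell
#!/bin/sh
# Sketch only: is_dynamic and check_libs are hypothetical helper names.

# Succeed if file(1) reports "$1" as dynamically linked.
# -b drops the file name from the output so only the description is matched.
is_dynamic() {
    file -b "$1" 2>/dev/null | grep -q 'dynamically linked'
}

# For every directory given as an argument, print ldd output for each
# dynamically linked regular file below it. Arguments that are not
# directories (for example unmatched globs) are skipped.
# File names containing newlines are not handled.
check_libs() {
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        find "$dir" -type f 2>/dev/null | while IFS= read -r f; do
            if is_dynamic "$f"; then
                ldd "$f" 2>/dev/null
            fi
        done
    done
}
```

With these functions defined, something like `check_libs /*bin /usr/*bin /usr/lib* /usr/local/*bin > /root/checklibs_output` would mirror the search paths suggested above. Reading names from a pipe rather than a backtick expansion keeps names with spaces intact and avoids building one very long word list in the shell.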