From: Ian Lepore
To: Mark Millard, freebsd-arm
Date: Thu, 07 Jan 2016 09:12:50 -0700
Subject: Re: FYI: various 11.0-CURRENT -r293227 (and older) hangs on arm (rpi2): a description of sorts
Message-ID: <1452183170.1215.4.camel@freebsd.org>
List-Id: "Porting FreeBSD to ARM processors."

On Thu, 2016-01-07 at 02:19 -0800, Mark Millard wrote:
> I've had various hangs when the rpi2 was busy over longish periods,
> both debug buildkernel/buildworld builds of the arm and non-debug
> variants. No log files or console messages produced.
>
> I've not had any analogous issues with powerpc64 (PowerMac G5) or
> with amd64 (Virtual Box used on Mac OS X).
>
> I've finally discovered that if I have, say, top running on the rpi2
> serial console that top continues to update its display so long as I
> leave it alone during the hang. (Otherwise it hangs too.) So I
> finally have a little window for seeing some of what is happening.
>
> An example top display showed after the hang:
>
> Mem: 764M Active 12M Inact 141M Wired 98M Buf 8k free
> Swap: 2048M Total 29M Used 2019 Free 1% in use
>
> (Yep: Just 8K free Mem.)
>

That's not a problem.

> The unusual STATEs for processes seemed to be (for the specific
> hang):
>
> STATE   COMMANDs
> pfault  [ld] [ld] /usr/sbin/syslogd
> vmwait  [ld] [md0] [kernel]
> wswbuf  [pagedaemon]
>
> Those same 3 states seem to always be involved. Some of the processes
> vary from one hang to the next: the prior hang had build/genautoma ,
> /usr/sbin/moused , and /usr/sbin/ntpd instead of 3 [ld]'s.
>
> /usr/sbin/syslogd, [md0], [kernel], and [pagedaemon] and their states
> do not seem to vary (so far).
>

Everything is backed up waiting for slow sdcard IO. You can get an
amd64 system with many cores and gigabytes of ram into the same state
with an sdcard (or any other storage device that takes literally
seconds for any individual IO to complete). All the available buffers
get queued up to the one slow device, then you can't do anything that
requires IO (even launch tools to try to figure out what's going on).

-- Ian
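
The buffer starvation described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not how the FreeBSD buffer cache or pager is implemented: the function name `simulate`, the single shared pool, the tick-based timing, and all parameter values are hypothetical and chosen purely for illustration. It shows that once every buffer in a shared pool is queued to a device that needs many ticks per IO, an unrelated request that needs just one buffer stalls on almost every tick.

```python
def simulate(pool_size, slow_service_ticks, ticks, slow_arrivals_per_tick=1):
    """Toy model: one shared buffer pool, one slow device, one unrelated request.

    Hypothetical illustration only; not the FreeBSD buffer cache.
    Returns (first tick on which the pool was found empty, or None,
             number of ticks on which the unrelated request stalled).
    """
    free = pool_size
    slow_in_flight = 0              # buffers held by IO queued to the slow device
    countdown = slow_service_ticks  # ticks until the slow device finishes one IO
    first_exhausted = None
    stalled = 0

    for tick in range(ticks):
        # The slow device completes one queued IO every slow_service_ticks ticks
        # and only then returns that buffer to the pool.
        if slow_in_flight > 0:
            countdown -= 1
            if countdown == 0:
                slow_in_flight -= 1
                free += 1
                countdown = slow_service_ticks

        # Writers aimed at the slow device take a buffer whenever one is free.
        for _ in range(slow_arrivals_per_tick):
            if free > 0:
                free -= 1
                slow_in_flight += 1

        # An unrelated request needs one buffer briefly. If one is free it
        # borrows and returns it within the tick; otherwise it stalls.
        if free == 0:
            stalled += 1
            if first_exhausted is None:
                first_exhausted = tick

    return first_exhausted, stalled


if __name__ == "__main__":
    # Slow device: 50 ticks per IO, two new writes arriving per tick.
    print("slow device:", simulate(pool_size=8, slow_service_ticks=50, ticks=200,
                                   slow_arrivals_per_tick=2))
    # Fast device: 1 tick per IO, one new write arriving per tick.
    print("fast device:", simulate(pool_size=8, slow_service_ticks=1, ticks=200,
                                   slow_arrivals_per_tick=1))
```

In this model the pool size and core count hardly matter. As long as writes arrive faster than the slow device retires them, the pool drains within a few ticks and stays drained, which matches the point that a large amd64 machine can end up in the same state.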