From: Mark Millard
To: freebsd-arm
Date: Thu, 7 Jan 2016 02:19:12 -0800
Subject: FYI: various 11.0-CURRENT -r293227 (and older) hangs on arm (rpi2): a description of sorts
List-Id: "Porting FreeBSD to ARM processors."

I've had various hangs when the rpi2 was busy over longish periods, with
both debug and non-debug buildkernel/buildworld builds for arm. No log
files or console messages were produced.

I've not had any analogous issues with powerpc64 (PowerMac G5) or with
amd64 (VirtualBox used on Mac OS X).

I've finally discovered that if I have, say, top running on the rpi2
serial console, top continues to update its display so long as I leave
it alone during the hang. (Otherwise it hangs too.) So I finally have a
little window for seeing some of what is happening.

An example top display after the hang showed:

Mem: 764M Active 12M Inact 141M Wired 98M Buf 8k free
Swap: 2048M Total 29M Used 2019 Free 1% in use

(Yep: Just 8K free Mem.)

The unusual STATEs for processes seemed to be (for the specific hang):

STATE    COMMANDs
pfault   [ld] [ld] /usr/sbin/syslogd
vmwait   [ld] [md0] [kernel]
wswbuf   [pagedaemon]

Those same 3 states seem to always be involved. Some of the processes
vary from one hang to the next: the prior hang had build/genautoma,
/usr/sbin/moused, and /usr/sbin/ntpd instead of 3 [ld]'s.

/usr/sbin/syslogd, [md0], [kernel], and [pagedaemon] and their states do
not seem to vary (so far).

(Note: I may not be able to rapidly apply any investigative steps asked
for but hopefully can get to any requested over time. I can get into ddb
via the serial console, not that I'm familiar with the kernel or with
using ddb.)
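
A minimal sketch, assuming the "Mem:" line has the shape quoted above
(a number with a k/M/G suffix followed by its label), of how the free
figure could be pulled out of a captured top line and normalized to KiB
for comparison across hangs. The function name mem_free_kib is
hypothetical and not part of top or FreeBSD.

```shell
#!/bin/sh
# Hypothetical helper: print the "free" figure of a top "Mem:" line in KiB.
# Assumption: each field pair is "<number><unit> <label>", units k/K, M or G.
# A trailing comma on the label (as some top versions print) is tolerated.
mem_free_kib() {
    printf '%s\n' "$1" | awk '
    {
        for (i = 2; i <= NF; i++) {
            label = tolower($i)
            gsub(/,/, "", label)
            if (label != "free")
                continue
            value = $(i - 1)
            unit = toupper(substr(value, length(value), 1))
            num = value + 0          # awk keeps the leading numeric part
            if (unit == "K")      print num
            else if (unit == "M") print num * 1024
            else if (unit == "G") print num * 1024 * 1024
            exit                   # unknown units print nothing
        }
    }'
}

# The line from the hang described above:
mem_free_kib 'Mem: 764M Active 12M Inact 141M Wired 98M Buf 8k free'
```

Feeding it lines saved from the serial console session would give one
number per sample, which makes it easier to see how quickly free memory
collapses before a hang.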

Context:

> $ freebsd-version -ku; uname -aKU
> 11.0-CURRENT
> 11.0-CURRENT
> FreeBSD rpi2 11.0-CURRENT FreeBSD 11.0-CURRENT #0 r293227M: Wed Jan 6 22:32:34 PST 2016 root@FreeBSDx64:/usr/obj/clang/arm.armv6/usr/src/sys/RPI2 arm 1100093 1100093

> Filesystem          1M-blocks  Used  Avail Capacity  Mounted on
> /dev/ufs/RPI2rootfs    443473 16791 391203     4%    /
> devfs                       0     0      0   100%    /dev
> /dev/mmcsd0s1              49     7     42    15%    /boot/msdos

> $ mount
> /dev/ufs/RPI2rootfs on / (ufs, local, noatime, soft-updates)
> devfs on /dev (devfs, local)
> /dev/mmcsd0s1 on /boot/msdos (msdosfs, local, noatime)

(I picked soft-updates without journaling so I could use dump -L and
restore together.)

/dev/ufs/RPI2rootfs is on an SSD on a powered hub and is found via the
/etc/fstab on the mmcsd referencing /dev/ufs/RPI2rootfs instead. The
/etc/fstab on the SSD has the same content (see below). I do
installkernel and installworld (from an amd64 context) on both the mmcsd
and the SSD so that they fully match for that content. (It is possible
to boot the rpi2 without the SSD.)

> $ more /etc/fstab
> /dev/mmcsd0s1        /boot/msdos  msdosfs  rw,noatime  0  0
> /dev/ufs/RPI2rootfs  /            ufs      rw,noatime  1  1
> md                   none         swap     sw,late,file=/swapfile0  0  0

> $ swapinfo
> Device          1K-blocks     Used    Avail Capacity
> /dev/md0          2097152        0  2097152     0%

So the rpi2 is swapping/paging to a file, not to a swap partition. When
I looked it turned out that I did not leave a free space for a swap
partition on the SSD. There is a free space for a swap partition on the
mmcsd, so at least I set up that one as I intended. Currently no place
meets the criteria for a crash dump since I have not made a swap
partition on the mmcsd yet.
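
A minimal sketch of the check implied above: given swapinfo-style
output, list the swap devices that are not md(4)-backed and so might
qualify as a crash dump device. The function name dump_candidates is
hypothetical, and treating every non-md /dev entry as a candidate is an
assumption that follows the reasoning in this post rather than any
documented rule.

```shell
#!/bin/sh
# Hypothetical filter: read swapinfo-style output on stdin and print swap
# devices that are NOT md(4) devices. Assumption: file-backed swap always
# shows up as /dev/mdN, and anything else under /dev is a real partition.
dump_candidates() {
    awk '$1 ~ /^\/dev\// && $1 !~ /^\/dev\/md[0-9]+$/ { print $1 }'
}

# With the swapinfo output shown above, nothing is printed:
printf '%s\n' \
    'Device 1K-blocks Used Avail Capacity' \
    '/dev/md0 2097152 0 2097152 0%' | dump_candidates
```

On the rpi2 itself this would be run as "swapinfo | dump_candidates";
empty output matches the situation described here, where only the
file-backed /dev/md0 is in use.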

The following were used for buildworld and buildkernel built via clang
in an amd64 FreeBSD context to produce my more recent arm builds (some
with my KERNCONF=RPI2-NODBG instead):

-march=armv7a -mcpu=cortex-a7 -mno-unaligned-access
KERNCONF=RPI2
TARGET=arm
TARGET_ARCH=armv6
WITH_FAST_DEPEND=
WITH_LIBCPLUSPLUS=
WITH_BINTOOLS_BOOTSTRAP=
WITH_CLANG=
WITH_CLANG_IS_CC=
WITH_CLANG_FULL=
WITH_LLDB=
WITH_CLANG_EXTRAS=
WITH_BOOT=
WITH_DEBUG=
WITH_DEBUG_FILES=
WITHOUT_LIB32=
WITHOUT_ELFTOOLCHAIN_BOOTSTRAP=
WITHOUT_CLANG_BOOTSTRAP=
WITHOUT_GCC_BOOTSTRAP=
WITHOUT_GCC=
WITHOUT_GNUCXX=
NO_WERROR=

The test case:

I've set up to portmaster an arm-gnueabi-gcc analogous to powerpc64-gcc,
more to give the rpi2 lots to do than to use the result at this point.
It is built via gcc49 from pkg install. I've never come close to
completing a build of arm-gnueabi-gcc but I have been able to complete
an RPI2 buildkernel. But I've also had hangs during buildkernel.

As far as I can tell the details of the arm-gnueabi-gcc build do not
matter, such as /etc/make.conf details: I've even had hangs during
svnlite status (over all of /usr/src or /usr/ports) and svnlite diff
(over all of /usr/src or /usr/ports).

Most of my contexts for hangs predate my discovery that top would keep
updating on the serial console so I had no window into what was
happening at the time.

===
Mark Millard
markmi at dsl-only.net