From owner-freebsd-current@freebsd.org Sun Dec 1 01:52:51 2019 Return-Path: Delivered-To: freebsd-current@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id C909A1C1AF3; Sun, 1 Dec 2019 01:52:51 +0000 (UTC) (envelope-from fbsd@www.zefox.net) Received: from www.zefox.net (www.zefox.net [50.1.20.27]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "www.zefox.org", Issuer "www.zefox.org" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 47QWTt4xRdz3KyL; Sun, 1 Dec 2019 01:52:50 +0000 (UTC) (envelope-from fbsd@www.zefox.net) Received: from www.zefox.net (localhost [127.0.0.1]) by www.zefox.net (8.15.2/8.15.2) with ESMTPS id xB11qrF4046037 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Sat, 30 Nov 2019 17:52:54 -0800 (PST) (envelope-from fbsd@www.zefox.net) Received: (from fbsd@localhost) by www.zefox.net (8.15.2/8.15.2/Submit) id xB11qqSk046036; Sat, 30 Nov 2019 17:52:52 -0800 (PST) (envelope-from fbsd) Date: Sat, 30 Nov 2019 17:52:52 -0800 From: bob prohaska To: freebsd-arm@freebsd.org, freebsd-current@freebsd.org Cc: bob prohaska Subject: Re: Rpi3 panic: non-current pmap 0xfffffd001e05b130 Message-ID: <20191201015252.GB45887@www.zefox.net> References: <20191201011615.GA45887@www.zefox.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20191201011615.GA45887@www.zefox.net> User-Agent: Mutt/1.5.24 (2015-08-30) X-Rspamd-Queue-Id: 47QWTt4xRdz3KyL X-Spamd-Bar: / Authentication-Results: mx1.freebsd.org; dkim=none; dmarc=none; spf=none (mx1.freebsd.org: domain of fbsd@www.zefox.net has no SPF policy when checking 50.1.20.27) smtp.mailfrom=fbsd@www.zefox.net X-Spamd-Result: default: False [0.54 / 15.00]; ARC_NA(0.00)[]; NEURAL_HAM_MEDIUM(-0.81)[-0.808,0]; WWW_DOT_DOMAIN(0.50)[]; RCVD_TLS_ALL(0.00)[]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; TO_DN_SOME(0.00)[]; IP_SCORE(0.07)[ip: (0.30), ipnet: 50.1.16.0/20(0.15), asn: 7065(-0.04), country: US(-0.05)]; MIME_GOOD(-0.10)[text/plain]; DMARC_NA(0.00)[zefox.net]; AUTH_NA(1.00)[]; NEURAL_HAM_LONG(-0.62)[-0.619,0]; TO_MATCH_ENVRCPT_SOME(0.00)[]; R_SPF_NA(0.00)[]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:7065, ipnet:50.1.16.0/20, country:US]; MID_RHS_MATCH_FROM(0.00)[]; MID_RHS_WWW(0.50)[]; RCVD_COUNT_TWO(0.00)[2] X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 01 Dec 2019 01:52:51 -0000 On Sat, Nov 30, 2019 at 05:16:15PM -0800, bob prohaska wrote: > A Pi3 running r355024 reported a panic while doing a -j3 make of > www/chromium: > Ok, another panic, looks like a dying storage device. This time there was a preamble on the console: (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 00 c3 90 d8 00 00 08 00 (da0:umass-sim0:0:0:0): CAM status: CCB request completed with an error (da0:umass-sim0:0:0:0): Error 5, Retries exhausted swap_pager: I/O error - pageout failed; blkno 1442883,size 4096, error 5 swap_pager: I/O error - pageout failed; blkno 1442884,size 4096, error 5 swap_pager: I/O error - pageout failed; blkno 1442885,size 8192, error 5 swap_pager: I/O error - pageout failed; blkno 1442887,size 4096, error 5 swap_pager: I/O error - pagein failed; blkno 1103209,size 4096, error 5 vm_fault: pager read error, pid 681 (devd) swap_pager: I/O error - pagein failed; blkno 1130270,size 4096, error 5 vm_fault: pager read error, pid 2362 (c++) Nov 30 17:37:34 www kernel: Failed to fully fault in a core file segment at VA 0x40400000 with size 0x60b000 to be written at offset 0x32b000 for process devd panic: vm_page_assert_unbusied: page 0xfffffd0030f8af80 busy @ /usr/src/sys/vm/vm_object.c:777 cpuid = 3 time = 1575164255 Earlier panics didn't have any proximate warnings on the console, but they're probably the same story. apologies for the noise! bob prohaska > panic: non-current pmap 0xfffffd001e05b130 > cpuid = 0 > time = 1575161361 > KDB: stack backtrace: > db_trace_self() at db_trace_self_wrapper+0x28 > pc = 0xffff000000729e4c lr = 0xffff0000001066c8 > sp = 0xffff000059f3e2b0 fp = 0xffff000059f3e4c0 > > db_trace_self_wrapper() at vpanic+0x18c > pc = 0xffff0000001066c8 lr = 0xffff000000400d7c > sp = 0xffff000059f3e4d0 fp = 0xffff000059f3e580 > > vpanic() at panic+0x44 > pc = 0xffff000000400d7c lr = 0xffff000000400b2c > sp = 0xffff000059f3e590 fp = 0xffff000059f3e610 > > panic() at pmap_remove_pages+0x8d4 > pc = 0xffff000000400b2c lr = 0xffff00000074154c > sp = 0xffff000059f3e620 fp = 0xffff000059f3e6e0 > > pmap_remove_pages() at vmspace_exit+0xc0 > pc = 0xffff00000074154c lr = 0xffff0000006c9c00 > sp = 0xffff000059f3e6f0 fp = 0xffff000059f3e720 > > vmspace_exit() at exit1+0x4f8 > pc = 0xffff0000006c9c00 lr = 0xffff0000003bc2a4 > sp = 0xffff000059f3e730 fp = 0xffff000059f3e7a0 > > exit1() at sys_sys_exit+0x10 > pc = 0xffff0000003bc2a4 lr = 0xffff0000003bbda8 > sp = 0xffff000059f3e7b0 fp = 0xffff000059f3e7b0 > > sys_sys_exit() at do_el0_sync+0x514 > pc = 0xffff0000003bbda8 lr = 0xffff000000747aa4 > sp = 0xffff000059f3e7c0 fp = 0xffff000059f3e860 > > do_el0_sync() at handle_el0_sync+0x90 > pc = 0xffff000000747aa4 lr = 0xffff00000072ca14 > sp = 0xffff000059f3e870 fp = 0xffff000059f3e980 > > handle_el0_sync() at 0x404e6d60 > pc = 0xffff00000072ca14 lr = 0x00000000404e6d60 > sp = 0xffff000059f3e990 fp = 0x0000ffffffffd590 > > KDB: enter: panic > [ thread pid 94966 tid 100145 ] > Stopped at 0x40505460: undefined 54000042 > db> bt > Tracing pid 94966 tid 100145 td 0xfffffd002552b000 > db_trace_self() at db_stack_trace+0xf8 > pc = 0xffff000000729e4c lr = 0xffff000000103b0c > sp = 0xffff000059f3de80 fp = 0xffff000059f3deb0 > > db_stack_trace() at db_command+0x228 > pc = 0xffff000000103b0c lr = 0xffff000000103784 > sp = 0xffff000059f3dec0 fp = 0xffff000059f3dfa0 > > db_command() at db_command_loop+0x58 > pc = 0xffff000000103784 lr = 0xffff00000010352c > sp = 0xffff000059f3dfb0 fp = 0xffff000059f3dfd0 > > db_command_loop() at db_trap+0xf4 > pc = 0xffff00000010352c lr = 0xffff000000106830 > sp = 0xffff000059f3dfe0 fp = 0xffff000059f3e200 > > db_trap() at kdb_trap+0x1d8 > pc = 0xffff000000106830 lr = 0xffff0000004492fc > sp = 0xffff000059f3e210 fp = 0xffff000059f3e2c0 > > kdb_trap() at do_el1h_sync+0xf4 > pc = 0xffff0000004492fc lr = 0xffff000000747418 > sp = 0xffff000059f3e2d0 fp = 0xffff000059f3e300 > > do_el1h_sync() at handle_el1h_sync+0x78 > pc = 0xffff000000747418 lr = 0xffff00000072c878 > sp = 0xffff000059f3e310 fp = 0xffff000059f3e420 > > handle_el1h_sync() at kdb_enter+0x34 > pc = 0xffff00000072c878 lr = 0xffff000000448948 > sp = 0xffff000059f3e430 fp = 0xffff000059f3e4c0 > > kdb_enter() at vpanic+0x1a8 > pc = 0xffff000000448948 lr = 0xffff000000400d98 > sp = 0xffff000059f3e4d0 fp = 0xffff000059f3e580 > > vpanic() at panic+0x44 > pc = 0xffff000000400d98 lr = 0xffff000000400b2c > sp = 0xffff000059f3e590 fp = 0xffff000059f3e610 > > panic() at pmap_remove_pages+0x8d4 > pc = 0xffff000000400b2c lr = 0xffff00000074154c > sp = 0xffff000059f3e620 fp = 0xffff000059f3e6e0 > > pmap_remove_pages() at vmspace_exit+0xc0 > pc = 0xffff00000074154c lr = 0xffff0000006c9c00 > sp = 0xffff000059f3e6f0 fp = 0xffff000059f3e720 > > vmspace_exit() at exit1+0x4f8 > pc = 0xffff0000006c9c00 lr = 0xffff0000003bc2a4 > sp = 0xffff000059f3e730 fp = 0xffff000059f3e7a0 > > exit1() at sys_sys_exit+0x10 > pc = 0xffff0000003bc2a4 lr = 0xffff0000003bbda8 > sp = 0xffff000059f3e7b0 fp = 0xffff000059f3e7b0 > > sys_sys_exit() at do_el0_sync+0x514 > pc = 0xffff0000003bbda8 lr = 0xffff000000747aa4 > sp = 0xffff000059f3e7c0 fp = 0xffff000059f3e860 > > do_el0_sync() at handle_el0_sync+0x90 > pc = 0xffff000000747aa4 lr = 0xffff00000072ca14 > sp = 0xffff000059f3e870 fp = 0xffff000059f3e980 > > handle_el0_sync() at 0x404e6d60 > pc = 0xffff00000072ca14 lr = 0x00000000404e6d60 > sp = 0xffff000059f3e990 fp = 0x0000ffffffffd590 > > db> > > The last top screen showed > > last pid: 94966; load averages: 1.22, 1.42, 1.40 up 0+05:10:16 16:49:20 > 43 processes: 1 running, 42 sleeping > CPU: 3.7% user, 0.0% nice, 20.0% system, 5.5% interrupt, 70.8% idle > Mem: 502M Active, 6672K Inact, 150M Laundry, 184M Wired, 90M Buf, 55M Free > Swap: 7194M Total, 3835M Used, 3359M Free, 53% Inuse, 11M In, 3852K Out > > PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND > 71350 root 1 22 0 951M 144M swread 0 10:20 5.97% c++ > 58502 root 1 21 0 986M 232M swread 1 11:23 3.42% c++ > 77283 root 1 22 0 963M 151M swread 0 8:45 3.30% c++ > 6904 root 1 22 0 1144M 191M swread 0 21:26 3.29% c++ > 1091 bob 1 52 0 11M 324K wait 3 0:55 0.27% sh > 1079 bob 1 20 0 13M 1636K CPU0 0 0:57 0.22% top > 1074 bob 1 20 0 19M 1316K select 1 0:13 0.03% sshd > 970 root 1 20 0 16M 1500K select 1 0:02 0.02% sendmail > 1069 root 1 20 0 204M 1044K select 3 1:40 0.00% ninja > 1050 root 1 20 0 12M 972K select 2 0:02 0.00% make > 977 root 1 20 0 11M 0B nanslp 1 0:02 0.00% > 957 root 1 20 0 19M 1216K select 1 0:01 0.00% sshd > 824 root 1 20 0 11M 1084K select 2 0:01 0.00% syslogd > 1084 bob 1 20 0 13M 1008K ttyin 0 0:00 0.00% tcsh > > and the last few storage activity log entries were: > > dT: 1.002s w: 1.000s > L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps ms/d %busy Name > 1 713 694 4666 3.7 20 116 6.0 0 0 0.0 90.9 mmcsd0 > 1 713 694 4666 3.8 20 116 6.0 0 0 0.0 91.2 mmcsd0s2 > 2 751 734 4730 2.0 17 96 0.6 0 0 0.0 79.4 da0 > 1 713 694 4666 3.8 20 116 6.0 0 0 0.0 91.4 mmcsd0s2b > 2 751 734 4730 2.0 17 96 0.6 0 0 0.0 79.9 da0p6 > Sat Nov 30 16:48:21 PST 2019 > Device 1K-blocks Used Avail Capacity > /dev/mmcsd0s2b 4404252 1958976 2445276 44% > /dev/da0p6 5242880 1956872 3286008 37% > Total 9647132 3915848 5731284 41% > Nov 30 16:38:17 www sshd[91264]: error: PAM: Authentication error for illegal user support from 103.133.104.114 > Nov 30 16:38:17 www sshd[91264]: error: Received disconnect from 103.133.104.114 port 52716:14: No more user authentication methods available. [preauth] > 0/1016/1016/19178 mbuf clusters in use (current/cache/total/max) > procs memory page disks faults cpu > r b w avm fre flt re pi po fr sr mm0 da0 in sy cs us sy id > 0 0 12 4523836 56064 6988 186 715 257 6931 25128 0 0 30789 1073 29817 14 26 60 > dT: 1.002s w: 1.000s > L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps ms/d %busy Name > 0 709 687 4588 3.9 22 144 5.8 0 0 0.0 90.5 mmcsd0 > 0 709 687 4588 3.9 22 144 5.8 0 0 0.0 90.9 mmcsd0s2 > 2 698 679 4696 2.1 19 104 0.6 0 0 0.0 75.2 da0 > 0 709 687 4588 3.9 22 144 5.8 0 0 0.0 91.0 mmcsd0s2b > 2 698 679 4696 2.2 19 104 0.6 0 0 0.0 75.7 da0p6 > Sat Nov 30 16:48:22 PST 2019 > Device 1K-blocks Used Avail Capacity > /dev/mmcsd0s2b 4404252 1959032 2445220 44% > /dev/da0p6 5242880 1956928 3285952 37% > Total 9647132 3915960 5731172 41% > Nov 30 16:38:17 www sshd[91264]: error: PAM: Authentication error for illegal user support from 103.133.104.114 > Nov 30 16:38:17 www sshd[91264]: error: Received disconnect from 103.133.104.114 port 52716:14: No more user authentication methods available. [preauth] > 0/1016/1016/19178 mbuf clusters in use (current/cache/total/max) > procs memory page disks faults cpu > r b w avm fre flt re pi po fr sr mm0 da0 in sy cs us sy id > 0 0 12 4523836 55844 6989 186 715 257 6932 25127 604 604 30790 1073 29819 14 26 60 > dT: 1.001s w: 1.000s > L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps ms/d %busy Name > 2 736 680 4829 3.9 56 292 5.0 0 0 0.0 94.4 mmcsd0 > 2 736 680 4829 4.0 56 292 5.0 0 0 0.0 94.6 mmcsd0s2 > 1 680 627 4014 1.9 53 328 1.0 0 0 0.0 71.1 da0 > 2 736 680 4829 4.0 56 292 5.0 0 0 0.0 94.7 mmcsd0s2b > 1 680 627 4014 1.9 53 328 1.0 0 0 0.0 71.7 da0p6 > Sat Nov 30 16:48:24 PST 2019 > Device 1K-blocks Used Avail Capacity > /dev/mmcsd0s2b 4404252 1959324 2444928 44% > /dev/da0p6 5242880 1957468 3285412 37% > Total 9647132 3916792 5730340 41% > Nov 30 16:38:17 www sshd[91264]: error: PAM: Authentication error for illegal user support from 103.133.104.114 > Nov 30 16:38:17 www sshd[91264]: error: Received disconnect from 103.133.104.114 port 52716:14: No more user authentication methods available. [preauth] > 0/1016/1016/19178 mbuf clusters in use (current/cache/total/max) > procs memory page disks faults cpu > r b w avm fre flt re pi po fr sr mm0 da0 in sy cs us sy id > 0 0 12 4523836 52860 6989 186 715 257 6932 25125 1038 1038 30790 1073 29820 14 26 60 > dT: 1.002s w: 1.000s > L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps ms/d %busy Name > 1 751 702 4860 4.0 49 251 5.0 0 0 0.0 94.4 mmcsd0 > 1 751 702 4860 4.1 49 251 5.1 0 0 0.0 94.7 mmcsd0s2 > 2 704 658 4082 1.8 46 235 0.6 0 0 0.0 71.9 da0 > 1 751 702 4860 4.1 49 251 5.1 0 0 0.0 94.8 mmcsd0s2b > 2 704 658 4082 1.8 46 235 0.7 0 0 0.0 72.6 da0p6 > Sat Nov 30 16:48:26 PST 2019 > Device 1K-blocks Used Avail Capacity > /dev/mmcsd0s2b 4404252 1959504 2444748 44% > /dev/da0p6 5242880 1957540 3285340 37% > Total 9647132 3917044 5730088 41% > Nov 30 16:38:17 www sshd[91264]: error: PAM: Authentication error for illegal user support from 103.133.104.114 > Nov 30 16:38:17 www sshd[91264]: error: Received disconnect from 103.133.104.114 port 52716:14: No more user authentication methods available. [preauth] > 0/1016/1016/19178 mbuf clusters in use (current/cache/total/max) > procs memory page disks faults cpu > r b w avm fre flt re pi po fr sr mm0 da0 in sy cs us sy id > 0 0 12 4523868 46872 6989 186 715 257 6932 25123 0 0 30790 1073 29820 14 26 60 > dT: 1.002s w: 1.000s > L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps ms/d %busy Name > 2 700 681 4888 3.7 19 108 3.7 0 0 0.0 92.1 mmcsd0 > 2 700 681 4888 3.8 19 108 3.7 0 0 0.0 92.5 mmcsd0s2 > 2 709 687 4314 2.1 22 108 3.4 0 0 0.0 78.2 da0 > 2 700 681 4888 3.8 19 108 3.7 0 0 0.0 92.6 mmcsd0s2b > 2 709 687 4314 2.1 22 108 3.4 0 0 0.0 78.7 da0p6 > Sat Nov 30 16:48:28 PST 2019 > Device 1K-blocks Used Avail Capacity > /dev/mmcsd0s2b 440 > > It's clear the machine was heavily loaded, but storage didn't appear to be swamped. > I hope the foregoing has been of some interest, thanks for reading! > > bob prohaska >