Date: Sun, 1 Jan 2023 17:30:25 +0000
From: John F Carr <jfc@mit.edu>
To: Ronald Klop <ronald-lists@klop.ws>
Cc: Mark Millard <marklmi@yahoo.com>, "freebsd-arm@freebsd.org" <freebsd-arm@FreeBSD.org>, Andrew Turner <andrew@FreeBSD.org>
Subject: Re: lsof crashes in Arm Optimized Routines
Message-ID: <FE629840-F87B-414C-B445-354986749950@mit.edu>
In-Reply-To: <bf6e8328-9a47-e0be-8e58-62c1aadf60a7@klop.ws>
References: <1331707040.259440.1668459233836@localhost> <490902644.115954.1668511998644@localhost> <690511F4-25A6-41E8-A75A-FFE80C352DFA@yahoo.com> <bf6e8328-9a47-e0be-8e58-62c1aadf60a7@klop.ws>
> On Jan 1, 2023, at 07:49, Ronald Klop <ronald-lists@klop.ws> wrote:
>
> On 11/18/22 01:57, Mark Millard wrote:
>>> On Nov 15, 2022, at 03:33, Ronald Klop <ronald-lists@klop.ws> wrote:
>>>
>>> Sorry for the noise.
>>>
>>> But I cannot reproduce this today. I can scroll back in my terminal and see the command and error from yesterday, but running the same again just works.
>> FYI:
>> I do not have specifics any more, but I'll note that I've seen
>> such lsof behavior of failing at one time and later working
>> without any installed updates to it or the system between. I
>> rarely use lsof and, so, this was not recently.
>> I've no clue how to cause the failure(s) to show up. I've no
>> clue how common the issue is. But, over time, it is not just
>> you.
>>>
>>> From: Ronald Klop <ronald-lists@klop.ws>
>>> Date: Monday, 14 November 2022 21:53
>>> To: freebsd-arm@FreeBSD.org, Andrew Turner <andrew@FreeBSD.org>
>>> Subject: lsof crashes in Arm Optimized Routines
>>> Hi,
>>>
>>> See https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=267760 : Segmentation fault in lsof. Program received signal SIGSEGV, Segmentation fault.
>>> Invalid permissions for mapped object.
>>> memcpy () at /home/ronald/dev/freebsd/src/contrib/arm-optimized-routines/string/aarch64/memcpy.S:175
>>> 175 stp D_l, D_h, [dst, 64]!
>>>
>>> I also remembered this change: https://cgit.freebsd.org/src/log/contrib/arm-optimized-routines?showmsg=1 about Arm Optimized Routines.
>>>
>>> Could this be related? What can I do to help debug this?
>>>
>> ===
>> Mark Millard
>> marklmi at yahoo.com
>
>
> I'm having this issue again.
>
> No debugging symbols found in lsof)
> (gdb) run
> Starting program: /usr/local/sbin/lsof
>
> Program received signal SIGSEGV, Segmentation fault.
> Invalid permissions for mapped object.
> memcpy () at /home/ronald/dev/freebsd/src/contrib/arm-optimized-routines/string/aarch64/memcpy.S:171
> bt
> 171 stp B_l, B_h, [dst, 32]
> (gdb) bt
> #0 memcpy () at /home/ronald/dev/freebsd/src/contrib/arm-optimized-routines/string/aarch64/memcpy.S:171
> #1 0x0000000000218be4 in ?? ()
> #2 0x0000000400000000 in ?? ()
> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
> (gdb)
>
>
> Some output of "truss -o /tmp/lsof.txt lsof":
>
> __sysctl("kern.proc.filedesc.1",4,0x0,0x80ba06f0,0x0,0) = 0 (0x0)
> __sysctl("kern.proc.filedesc.1",4,0x851d6000,0x80ba06f0,0x0,0) = 0 (0x0)
> __sysctl("kern.proc.filedesc.385",4,0x0,0x80ba06f0,0x0,0) = 0 (0x0)
> __sysctl("kern.proc.filedesc.385",4,0x8516ec00,0x80ba06f0,0x0,0) = 0 (0x0)
> __sysctl("kern.proc.filedesc.97537",4,0x0,0x80ba06f0,0x0,0) = 0 (0x0)
> __sysctl("kern.proc.filedesc.97537",4,0x8516ec00,0x80ba06f0,0x0,0) = 0 (0x0)
> statfs("/data/jails/jail13/_root/home/root/dev/workspace/FreeBSD-Ports-13/_root/usr/local/poudriere/data/.m/freebsd13-custom/04/bin/sh",{ fstypename=nullfs,mntonname=/data/jails/jail13/_root/home,mntfromname=/data/jails/_home,fsid=3cff022929000000 }) = 0 (0x0)
> statfs("/data/jails/jail13/_root",{ fstypename=nullfs,mntonname=/data/jails/jail13/_root,mntfromname=/data/jails/freebsd13,fsid=37ff022929000000 }) = 0 (0x0)
> statfs("/data/jails/_home3root/dev/workspace/FreeBSD-Ports-13/_root/usr/local/poudriere/data/.m/freebsd13-custom/04/bin/sh",0x80b9ef40) ERR#2 'No such file or directory'
> statfs("/data/jails/_home3root/dev/workspace/FreeBSD-Ports-13/_root/usr/local/poudriere/data/.m/freebsd13-custom/04/wrkdirs/usr/ports/devel/cmake-core/work/cmake-3.24.3/Source",0x80b9ef40) ERR#2 'No such file or directory'
> statfs("/data/jails/_home3root/dev/workspace/FreeBSD-Ports-13/_root/usr/local/poudriere/data/.m/freebsd13-custom/04",0x80b9ef40) ERR#2 'No such file or directory'
> statfs("/data/jails/freebsd13ovt",0x80b9ef40) ERR#2 'No such file or directory'
> SIGNAL 11 (SIGSEGV) code=SEGV_MAPERR trapno=36 addr=0x80ba1000
> process killed, signal = 11 (core dumped)
>
>
> I'm surprised that the path names in the truss output are corrupted: _home3root should be _home/root.
>
> NB: I'm using lsof while running poudriere in a jail in a Jenkins agent.
>
> Regards,
> Ronald.
>
>

I think this is a bug in lsof, and the optimized memcpy routine is doing what it is asked to do: copying into a block of memory that the caller does not have write access to. The faulting data address 0x80ba1000 is at the start of a page. The faulting instruction address is in the middle of a block of code that writes to successively increasing addresses. The destination pointer passed to memcpy must be valid or the function would have crashed earlier. But the end address is out of bounds, meaning the size is wrong. If you can get the program in a debugger again, or you can find a core file, check the value of register x2 ("count" in assembly code). If that is huge then you have an uninitialized or otherwise invalid third argument to memcpy.

In a jail, system calls to determine the current filesystem behave differently. The odd path names may be symptoms of jail-induced confusion.
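
To illustrate the failure mode I am describing (valid destination, oversized count), here is a minimal sketch in C. It assumes a POSIX system with mmap, mprotect and fork. The helper name copy_into_guarded_page is made up for this example and has nothing to do with the lsof source or the Arm Optimized Routines code. It maps one writable page followed by an inaccessible guard page, then lets a child process call memcpy with the given count and reports which signal, if any, killed the child.

```c
#define _DEFAULT_SOURCE
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from lsof): copy `count` bytes into a buffer that
 * is one writable page followed by one inaccessible guard page.
 * Returns the signal that killed the child, 0 if the child exited on its
 * own, or -1 if the setup failed.
 */
int copy_into_guarded_page(size_t count)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    size_t pg = (size_t)page;

    /* Destination: two pages, the second becomes the guard page below. */
    unsigned char *dst = mmap(NULL, 2 * pg, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANON, -1, 0);
    /* Source: two readable pages, so only the destination side can fault. */
    unsigned char *src = mmap(NULL, 2 * pg, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANON, -1, 0);
    if (dst == MAP_FAILED || src == MAP_FAILED)
        return -1;
    if (count > 2 * pg || mprotect(dst + pg, pg, PROT_NONE) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* dst itself is valid; only a count larger than one page is wrong. */
        memcpy(dst, src, count);
        _exit(dst[0] == src[0] ? 0 : 1);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    munmap(dst, 2 * pg);
    munmap(src, 2 * pg);
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

With a count that fits in the first page the child should exit normally. With a count larger than one page the copy starts fine and the child is killed (SIGSEGV, or SIGBUS on some systems) once a store reaches the guard page. That is the same pattern as a good pointer in x0 combined with a bad value in x2.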