Date:      Sun, 1 Jan 2023 17:30:25 +0000
From:      John F Carr <jfc@mit.edu>
To:        Ronald Klop <ronald-lists@klop.ws>
Cc:        Mark Millard <marklmi@yahoo.com>, "freebsd-arm@freebsd.org" <freebsd-arm@FreeBSD.org>, Andrew Turner <andrew@FreeBSD.org>
Subject:   Re: lsof crashes in Arm Optimized Routines
Message-ID:  <FE629840-F87B-414C-B445-354986749950@mit.edu>
In-Reply-To: <bf6e8328-9a47-e0be-8e58-62c1aadf60a7@klop.ws>
References:  <1331707040.259440.1668459233836@localhost> <490902644.115954.1668511998644@localhost> <690511F4-25A6-41E8-A75A-FFE80C352DFA@yahoo.com> <bf6e8328-9a47-e0be-8e58-62c1aadf60a7@klop.ws>

> On Jan 1, 2023, at 07:49, Ronald Klop <ronald-lists@klop.ws> wrote:
>
> On 11/18/22 01:57, Mark Millard wrote:
>>> On Nov 15, 2022, at 03:33, Ronald Klop <ronald-lists@klop.ws> wrote:
>>>
>>> Sorry for the noise.
>>>
>>> But I cannot reproduce this today. I can scroll back in my terminal and see the command and error from yesterday, but running the same again just works.
>> FYI:
>> I do not have specifics any more, but I'll note that I've seen
>> such lsof behavior of failing at one time and later working
>> without any installed updates to it or the system between. I
>> rarely use lsof and, so, this was not recently.
>> I've no clue how to cause the failure(s) to show up. I've no
>> clue how common the issue is. But, over time, it is not just
>> you.
>>>
>>> From: Ronald Klop <ronald-lists@klop.ws>
>>> Date: Monday, 14 November 2022 21:53
>>> To: freebsd-arm@FreeBSD.org, Andrew Turner <andrew@FreeBSD.org>
>>> Subject: lsof crashes in Arm Optimized Routines
>>> Hi,
>>>
>>> See https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=267760 : Segmentation fault in lsof. Program received signal SIGSEGV, Segmentation fault.
>>> Invalid permissions for mapped object.
>>> memcpy () at /home/ronald/dev/freebsd/src/contrib/arm-optimized-routines/string/aarch64/memcpy.S:175
>>> 175 stp D_l, D_h, [dst, 64]!
>>>
>>> I also remembered this change: https://cgit.freebsd.org/src/log/contrib/arm-optimized-routines?showmsg=1 about Arm Optimized Routines.
>>>
>>> Could this be related? What can I do to help debug this?
>>>
>>   ===
>> Mark Millard
>> marklmi at yahoo.com
>
>
> I'm having this issue again.
>
> No debugging symbols found in lsof)
> (gdb) run
> Starting program: /usr/local/sbin/lsof
>
> Program received signal SIGSEGV, Segmentation fault.
> Invalid permissions for mapped object.
> memcpy () at /home/ronald/dev/freebsd/src/contrib/arm-optimized-routines/string/aarch64/memcpy.S:171
> bt
> 171             stp     B_l, B_h, [dst, 32]
> (gdb) bt
> #0  memcpy () at /home/ronald/dev/freebsd/src/contrib/arm-optimized-routines/string/aarch64/memcpy.S:171
> #1  0x0000000000218be4 in ?? ()
> #2  0x0000000400000000 in ?? ()
> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
> (gdb)
>
>
> Some output of "truss -o /tmp/lsof.txt lsof":
>
> __sysctl("kern.proc.filedesc.1",4,0x0,0x80ba06f0,0x0,0) = 0 (0x0)
> __sysctl("kern.proc.filedesc.1",4,0x851d6000,0x80ba06f0,0x0,0) = 0 (0x0)
> __sysctl("kern.proc.filedesc.385",4,0x0,0x80ba06f0,0x0,0) = 0 (0x0)
> __sysctl("kern.proc.filedesc.385",4,0x8516ec00,0x80ba06f0,0x0,0) = 0 (0x0)
> __sysctl("kern.proc.filedesc.97537",4,0x0,0x80ba06f0,0x0,0) = 0 (0x0)
> __sysctl("kern.proc.filedesc.97537",4,0x8516ec00,0x80ba06f0,0x0,0) = 0 (0x0)
> statfs("/data/jails/jail13/_root/home/root/dev/workspace/FreeBSD-Ports-13/_root/usr/local/poudriere/data/.m/freebsd13-custom/04/bin/sh",{ fstypename=nullfs,mntonname=/data/jails/jail13/_root/home,mntfromname=/data/jails/_home,fsid=3cff022929000000 }) = 0 (0x0)
> statfs("/data/jails/jail13/_root",{ fstypename=nullfs,mntonname=/data/jails/jail13/_root,mntfromname=/data/jails/freebsd13,fsid=37ff022929000000 }) = 0 (0x0)
> statfs("/data/jails/_home3root/dev/workspace/FreeBSD-Ports-13/_root/usr/local/poudriere/data/.m/freebsd13-custom/04/bin/sh",0x80b9ef40) ERR#2 'No such file or directory'
> statfs("/data/jails/_home3root/dev/workspace/FreeBSD-Ports-13/_root/usr/local/poudriere/data/.m/freebsd13-custom/04/wrkdirs/usr/ports/devel/cmake-core/work/cmake-3.24.3/Source",0x80b9ef40) ERR#2 'No such file or directory'
> statfs("/data/jails/_home3root/dev/workspace/FreeBSD-Ports-13/_root/usr/local/poudriere/data/.m/freebsd13-custom/04",0x80b9ef40) ERR#2 'No such file or directory'
> statfs("/data/jails/freebsd13ovt",0x80b9ef40)    ERR#2 'No such file or directory'
> SIGNAL 11 (SIGSEGV) code=SEGV_MAPERR trapno=36 addr=0x80ba1000
> process killed, signal = 11 (core dumped)
>
>
> I'm surprised that the path names in the truss output are corrupted: _home3root should be _home/root.
>
> NB: I'm using lsof while running poudriere in a jail in a Jenkins agent.
>
> Regards,
> Ronald.
>
>

I think this is a bug in lsof, and the optimized memcpy routine is doing what it is asked to do: copying into a block of memory that the caller does not have write access to. The faulting data address 0x80ba1000 is at the start of a page. The faulting instruction address is in the middle of a block of code that writes to successively increasing addresses. The destination pointer passed to memcpy must be valid or the function would have crashed earlier. But the end address is out of bounds, meaning the size is wrong. If you can get the program in a debugger again, or you can find a core file, check the value of register x2 ("count" in the assembly code). If that is huge, then you have an uninitialized or otherwise invalid third argument to memcpy.

In a jail, system calls to determine the current filesystem behave differently. The odd path names may be symptoms of jail-induced confusion.




