Date: Sun, 6 Dec 2020 12:51:32 +0100
From: Michal Meloun <meloun.michal@gmail.com>
To: Mark Millard <marklmi@yahoo.com>
Cc: Marcel Flores <marcel@brickporch.com>, freebsd-arm@freebsd.org
Subject: Re: ThunderX Panic after r368370
Message-ID: <91654fc4-8734-d8a7-5309-0400f418438a@freebsd.org>
In-Reply-To: <56F0E9EB-0B78-4B0B-830A-48F8AFC5ABE1@yahoo.com>
References: <1C3442ED-278E-45B8-9206-0DD24FCBC237@brickporch.com> <4331eee0-74a6-565c-3bec-0051415b2bc1@freebsd.org> <56F0E9EB-0B78-4B0B-830A-48F8AFC5ABE1@yahoo.com>

On 06.12.2020 10:47, Mark Millard wrote:
>
>
> On 2020-Dec-6, at 00:17, Michal Meloun <meloun.michal at gmail.com> wrote:
>
>> On 06.12.2020 3:21, Marcel Flores wrote:
>>> Hi All,
>>> Looks like the ThunderX started panicking at boot after r368370:
>>> https://reviews.freebsd.org/rS368370
>>> From a verbose boot, it looks like it bails in gic0 redistributor setup(?):
>>> gic0: CPU29 Re-Distributor woke up
>>> gic0: CPU24 enabled CPU interface via system registers
>>> gic0: CPU17 enabled CPU interface via system registers
>>> gic0: CPU29 enabled CPU interface via system registers
>>> done
>>> Full Verbose boot:
>>> https://gist.github.com/mesflores/f026122495c8494d041bce04d30b15bb
>>> I'm not really familiar with the details of the commit, but happy to test
>>> anything if anyone has any ideas.
>>
>>
>> Hi Marcel
>> are you able to get crashdump and do backtrace?
>> https://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug.html#kerneldebug-obtain
>> and
>> https://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug-gdb.html
>> If not, I'll make some debug patch.
>>
>> It's weird, even though GIC is potentially affected by my patch, in this case the cpuid numbering was not changed.
>
> (I've no access to a ThunderX. I just looked for my own curiosity.
> Sorry if this is obvious and so is noise.)
>
> When I looked at the code it appeared to be the last "->" in
> the following that was dereferencing the nullptr value (via [x8]
> in assembler notation):
>
> static uint64_t
> its_cmd_prepare(struct its_cmd *cmd, struct its_cmd_desc *desc)
> {
>         uint64_t target;
>         uint8_t cmd_type;
>         u_int size;
>
>         cmd_type = desc->cmd_type;
>         target = ITS_TARGET_NONE;
>
>         switch (cmd_type) {
>         case ITS_CMD_MOVI:      /* Move interrupt ID to another collection */
>                 target = desc->cmd_desc_movi.col->col_target;
> . . .
>
> In other words: it appeared to me that the above desc->cmd_desc_movi.col
> evaluated as 0 when used in what was reported.
>

This is very probably the right analysis. The problem is that
cmd_desc_movi.col should not be NULL: it is initialized in its_cmd_movi
from sc->sc_its_cols, which should be allocated in gicv3_its_attach().

Marcel, can you please also try this debug patch?
https://github.com/strejda/freebsd/commit/a25ed736644b05672e3e813891af213c280daac3

Unfortunately, I have only a single-socket board with GICv3 (Honeycomb),
and it still boots fine.

Thanks,
Michal
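
For illustration only, the suspected failure mode can be reduced to a minimal, self-contained sketch. This is not the driver code and not the debug patch linked above: the `sk_` types, the `SK_TARGET_NONE` sentinel, and the helper names are hypothetical stand-ins for the real ITS structures, and they assume nothing more than a descriptor that holds a pointer to a collection. The sketch only shows that reading `col->col_target` through a never-initialized `col` pointer is the kind of access that would fault, and how a guard could turn it into a detectable condition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel, standing in for a "no target" value. */
#define SK_TARGET_NONE	UINT64_MAX

/* Simplified stand-in for a collection entry. */
struct sk_col {
	uint64_t col_target;
};

/* Simplified stand-in for a MOVI command descriptor. */
struct sk_desc {
	struct sk_col *col;	/* NULL if the collection was never set up */
};

/*
 * Return the collection target, or SK_TARGET_NONE when the descriptor
 * or its collection pointer is NULL, instead of dereferencing it.
 */
static uint64_t
sk_movi_target(const struct sk_desc *desc)
{
	if (desc == NULL || desc->col == NULL)
		return (SK_TARGET_NONE);
	return (desc->col->col_target);
}

/* Small self-check covering the initialized and uninitialized cases. */
static void
sk_selftest(void)
{
	struct sk_col col = { .col_target = 29 };
	struct sk_desc ok = { .col = &col };
	struct sk_desc bad = { .col = NULL };

	assert(sk_movi_target(&ok) == 29);
	assert(sk_movi_target(&bad) == SK_TARGET_NONE);
}
```

A guard like this would only hide the symptom. The open question in this thread is why the collection pointer would be unset on this board in the first place.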