Date:      Sun, 13 Nov 2022 18:37:56 -0500
From:      Mark Johnston <markj@freebsd.org>
To:        Christopher Bowman <crb@chrisbowman.com>
Cc:        hackers@freebsd.org
Subject:   Re: I could use some help
Message-ID:  <Y3F/1BknnC/tpy/U@nuc>
In-Reply-To: <5C07A058-977C-4F2E-8B41-01EBAB4FF24B@chrisbowman.com>
References:  <ED4D59FC-DE1B-4591-AE51-5AA61B5647A6@chrisbowman.com> <Y3FDfpKWKjyHaWAK@nuc> <5C07A058-977C-4F2E-8B41-01EBAB4FF24B@chrisbowman.com>

On Sun, Nov 13, 2022 at 02:48:02PM -0800, Christopher Bowman wrote:
> 
> 
> > On Nov 13, 2022, at 11:20 AM, Mark Johnston <markj@freebsd.org> wrote:
> > It's indeed a bit hard to see why commit
> > ce9c3848ff369467749f59fd24f8b9f1241e725c might cause an sdhci command
> > timeout.  I suspect the commit itself is innocent, but happens to
> > uncover a problem elsewhere by changing the order in which page
> > allocations are distributed to different consumers.  At the time of the
> > command timeouts, the kernel is still running on a single CPU so the
> > order in which pages are allocated should be fairly deterministic and
> > thus consistent across multiple boots.
> > 
> > One thing you can try is to take a releng/13.1 tree and revert just
> > ce9c3848ff369467749f59fd24f8b9f1241e725c, and/or take releng/13.0 and
> > apply just that commit.  It would be useful to know whether either of
> > the resulting kernels can boot successfully.
> > 
> > Could you also please post your kernel configuration (ARTYZ7) somewhere,
> > together with a dmesg of a successful boot?
> 
> Mark,
> 	If I checkout release/13.1.0 and revert the commit in question then my kernel boots.  I wasn’t able to checkout releng/13.1 as I get the following:
> 
> crb@eclipse:142> git checkout releng/13.1
> M	sys/vm/uma_core.c
> Already on 'releng/13.1'
> Your branch is up to date with 'origin/releng/13.1'.
> 
> I think that’s just me being incompetent with git.

It's because sys/vm/uma_core.c is modified in your local copy, and
checking out releng/13.1 would need to modify uma_core.c.  You can remove
that local modification with "git checkout -f sys/vm/uma_core.c && git
checkout releng/13.1", or just "git checkout -f releng/13.1", though the
latter will discard any other uncommitted modifications you might have
in your tree.

I'm not sure what release/13.1.0 is, though; that branch doesn't appear
to exist in the official git repo.  If it points to the same commit as
releng/13.1, then that's fine.

> I can add the commit to release/13.1.0 if you would still like
>
> My kernel is essentially identical to the in-tree ZEDBOARD config
> with the name changed.  I can post it if truly desired.
> 
> Please find the dmesg output for 13.1 release below.
> Thanks
> Christopher

Ok, nothing really stands out to me.  It looks like mmcsd_attach() is
running, but before it can discover any partitions it issues a command
that times out, so we'll have to dig in a bit more.

Could you please try booting a problematic kernel with the hw.mmc.debug
and hw.sdhci.debug tunables set to "3", and share the resulting output
up until the hang?  It might be pretty verbose.
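
One way to set these, assuming your board boots through the FreeBSD
loader and reads /boot/loader.conf (if it doesn't, setting the same
variables at the loader prompt with "set" before "boot" should have the
same effect):

```conf
# /boot/loader.conf (sketch; remove these once the debugging is done)
# Verbose debug output from the MMC bus layer and the SDHCI driver.
hw.mmc.debug="3"
hw.sdhci.debug="3"
```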

Separately, I wonder if you could try booting a problematic kernel with
the kern.maxphys tunable set to 131072.  Does it change anything?
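
As a sketch, assuming /boot/loader.conf is honoured on your setup, that
would look like:

```conf
# /boot/loader.conf
# 131072 bytes = 128 * 1024, i.e. 128KB
kern.maxphys="131072"
```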
