Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 22 Jun 2020 21:44:41 +0300
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Andriy Gapon <avg@freebsd.org>
Cc:        src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject:   Re: svn commit: r362294 - head/sys/dev/sound/pci/hda
Message-ID:  <20200622184441.GA1783@kib.kiev.ua>
In-Reply-To: <df3a1e48-328d-c216-623f-eb429babfad3@FreeBSD.org>
References:  <202006180612.05I6C6EE013037@repo.freebsd.org> <20200621215945.GA2057@kib.kiev.ua> <df3a1e48-328d-c216-623f-eb429babfad3@FreeBSD.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Mon, Jun 22, 2020 at 09:37:15AM +0300, Andriy Gapon wrote:
> On 22/06/2020 00:59, Konstantin Belousov wrote:
> > This commit (or rather, a merge of this commit to stable/12) causes an issue
> > on my Apollo Lake machine.  Look:
> > hdac0@pci0:0:14:0:   class=0x040300 card=0xa1821458 chip=0x5a988086 rev=0x0b hdr=0x00
> >     vendor     = 'Intel Corporation'
> >     device     = 'Celeron N3350/Pentium N4200/Atom E3900 Series Audio Cluster'
> >     class      = multimedia
> >     subclass   = HDA
> >     bar   [10] = type Memory, range 64, base 0x91410000, size 16384, enabled
> >     bar   [20] = type Memory, range 64, base 0x91000000, size 1048576, enabled
> >     cap 01[50] = powerspec 3  supports D0 D3  current D0
> >     cap 09[80] = vendor (length 20) Intel cap 15 version 0
> >     cap 05[60] = MSI supports 1 message, 64 bit
> > 
> > DMAP base is 0xfffff80000000000.
> > tom% sudo /usr/local/bin/kgdb ~/work/DEV/svn/RELENG_12/sys/amd64/compile/TOM/kernel.full /dev/mem
> > GNU gdb (GDB) 9.2 [GDB v9.2 for FreeBSD]
> > Reading symbols from /usr/home/kostik/work/DEV/svn/RELENG_12/sys/amd64/compile/TOM/kernel.full...
> > ...
> > sched_switch (td=0xfffff8029d516000, newtd=0xfffff800025f4000,
> >     flags=<optimized out>) at ../../../kern/sched_ule.c:2143
> > 2143                    cpuid = PCPU_GET(cpuid);
> > (kgdb) p/x *(unsigned int *)(0xfffff80000000000+0x91410000+0x24) INTSTS
> > $1 = 0xc0000000
> > This causes the first interrupt handler to loop forever.
> > 
> > And this is why:
> > (kgdb) p/x *(unsigned char *)(0xfffff80000000000+0x91410000+0x5d) RIRBSTS
> > $3 = 0x0
> > (kgdb) p/x *(unsigned char *)(0xfffff80000000000+0x91410000+0x4d) CORBSTS
> > $4 = 0x0
> > (kgdb) p/x *(unsigned short *)(0xfffff80000000000+0x91410000+0xe) WAKESTS
> > $5 = 0x5
> > So the SDIN State Change status bit is set, and nothing in the driver
> > clears it, it seems.
> > 
> > More, the EDS specifies that another reason for INTSTS CIS bit set may
> > be the CORB Memory Error Indication (CMEI), and again it seems that driver
> > never clears the reason.
> > 
> > So as is the change perhaps helps for some situations, but also
> > practically makes some systems to loose core in tight loop with interrupt
> > priority, which causes weird machine misbehavior like unkillable processes
> > and never finishing epochs.
> 
> 
> Kostik,
> 
> thank you for the report and the debugging!
> 
> It seems that the driver currently does do not much with the register 0xe
> (referred to by you as WAKESTS, by the code and the HDA specification as STATESTS).
> I think that we can just clear that register (documented as RW1C)
> More, I am surprised that the driver does not clear the register in
> hdac_attach2() when probing codecs.
> 
> So, maybe something like a quick patch below.
> 
> Also, I see that the driver never explicitly manages WAKEEN register.
> I think that it should.
> If we do not need or do not expect any wake events then we should clear the
> corresponding bits.
> So, maybe it would be better to just zero WAKEEN in attach and resume methods.
> Although, WAKEEN is documented as zero after reset, the specification has this note:
>   These bits are only cleared by a power-on reset. Software must make no
>   assumptions about how these bits are set and set them appropriately.
> 
> 
> Regarding CMEI, we never enable it by setting CMEIE and it should be zero after
> reset.  So, we should never get that interrupt.
> The same applies to another potential interrupt source, RIRBOIS / RIRBOIC.
> 
> Index: sys/dev/sound/pci/hda/hdac.c
> ===================================================================
> --- sys/dev/sound/pci/hda/hdac.c	(revision 362383)
> +++ sys/dev/sound/pci/hda/hdac.c	(working copy)
> @@ -312,6 +312,10 @@ hdac_one_intr(struct hdac_softc *sc, uint32_t ints
> 
>  	/* Was this a controller interrupt? */
>  	if (intsts & HDAC_INTSTS_CIS) {
> +		/* XXX just clear SDIWAKE bits. */
> +		HDAC_WRITE_2(&sc->mem, HDAC_STATESTS,
> +		    HDAC_STATESTS_SDIWAKE_MASK);
> +
>  		rirbsts = HDAC_READ_1(&sc->mem, HDAC_RIRBSTS);
>  		/* Get as many responses that we can */
>  		while (rirbsts & HDAC_RIRBSTS_RINTFL) {
> @@ -1530,6 +1534,7 @@ hdac_attach2(void *arg)
>  		device_printf(sc->dev, "Scanning HDA codecs ...\n");
>  	);
>  	statests = HDAC_READ_2(&sc->mem, HDAC_STATESTS);
> +	HDAC_WRITE_2(&sc->mem, HDAC_STATESTS, statests);
>  	hdac_unlock(sc);
>  	for (i = 0; i < HDAC_CODEC_MAX; i++) {
>  		if (HDAC_STATESTS_SDIWAKE(statests, i)) {
> 

The patch worked, thanks.



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20200622184441.GA1783>