Date:      Sun, 15 Mar 1998 22:29:54 -0000 (GMT)
From:      Duncan Barclay <dmlb@ragnet.demon.co.uk>
To:        Peter Jeremy <Peter.Jeremy@alcatel.com.au>
Cc:        freebsd-hackers@FreeBSD.ORG
Subject:   RE: IDE+LPIP causing random lockups
Message-ID:  <XFMail.980315222954.dmlb@computer.my.domain>
In-Reply-To: <199803152137.IAA04231@gsms01.alcatel.com.au>


On 15-Mar-98 Peter Jeremy wrote:
> I have been having occasional lockups on my main system for some time
> now (it occurs with all 2.2.x releases, I can't recall if 2.1.x was
> affected).  I am wondering if anyone else has seen something similar.
> 
> Configuration:
> main machine: VLB-based 486DX2-50 with 2 IDE disks (32-bit, multiblock 16)
>       on 1 (VLB) controller, running 2.2.5-R.
> 2nd machine: Toshiba T1850 (386SX25 with 1 IDE disk) running 2.2.1-R
> 
> The machines are joined by a laplink cable and use LPIP.
> 
> The symptoms are that my main machine (only) locks up and needs a
> reset to recover.  The laptop has never been affected.  The problem
> only occurs when there is LPIP activity between the machines and seems
> to also correlate with disk activity on the main machine, and using
> ssh to transfer data.  Running XFree86 also seems to make it worse
> (but this might just be the increased disk activity associated with
> running X11 in less than infinite RAM).  Note that between talking to
> IDE disks and communicating with a slow host via LPIP, the machine
> spends a lot of time inside interrupts - often over 30% according to
> top.

I can only add that I have seen this happen in a similar setup. However, I feel
it is also related to having a "fast" and a "slow" machine. The "fast" machine
dies more frequently than the "slow" machine, and it depends on which machine
sets the sockets up. E.g. an rcp from "fast" to "slow" dies, but an rcp of
the same file from "slow" to "fast" works!

The machines are a Contura 4/25 with 8MB RAM and a VESA/ISA/PCI 5x86
with 20MB, both running 2.2-stable from mid summer.

I posted something on this a few months ago to c.u.b.f.m. Have a look on Deja
News (I managed to configure my news reader not to save postings, sorry).

> 
> Having (finally) gotten around to loading a kernel with DDB, I find
> that the lockup appears to be caused by sbcompress() attempting to
> add an mbuf to itself - sbappend() is called to append an mbuf to
> sockbuf that already includes that mbuf in the mbuf list.  When this
> is passed to sbcompress(), it winds up in an infinite loop.
> 
> My suspicion is that something is continuing to use an mbuf after it
> frees it, probably associated with some sort of interrupt window.
> I've added some checks in the mbuf code to try and pick this up, but
> haven't found the problem yet.  I did notice that the splXXX()
> routines don't atomically update cpl, but the associated comments (in
> i386/include/spl.h) say this is OK.
> 
> I have (less frequently) seen kernel page faults with addresses like
> 0xf400xxxx, 0x79xxxxxx.  I haven't looked into these yet.  I'm hoping
> they are caused by the mbuf's being used for two things at once.
> 
> Does anyone have any ideas?

You've got a lot further than I did in tracking it down!
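
For anyone following along, here is a minimal sketch of the failure mode
described above: appending a buffer to a singly linked chain that already
contains it closes the chain into a loop, so any code that walks to the tail
never terminates. This is not the kernel's sbappend()/sbcompress() code;
struct node, naive_append(), has_cycle() and demo_self_append() are
hypothetical names made up for the illustration, and the cycle check is only
one way such a debugging test might be written.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an mbuf: only the chain link matters here. */
struct node {
    struct node *next;
};

/*
 * Naive append: walk to the tail and link n there.
 * It does not check whether n is already on the list.
 */
static void naive_append(struct node **head, struct node *n)
{
    struct node **pp = head;

    while (*pp != NULL)
        pp = &(*pp)->next;
    *pp = n;
}

/*
 * Floyd's two-pointer check: returns 1 if the chain loops back on
 * itself, 0 if it ends in NULL. Safe to call on a corrupted chain.
 */
static int has_cycle(const struct node *head)
{
    const struct node *slow = head;
    const struct node *fast = head;

    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;
        fast = fast->next->next;
        if (slow == fast)
            return 1;
    }
    return 0;
}

/* Build a two-element chain, then append the tail to it a second time. */
static int demo_self_append(void)
{
    struct node a = { NULL };
    struct node b = { NULL };
    struct node *head = NULL;

    naive_append(&head, &a);
    naive_append(&head, &b);
    assert(!has_cycle(head));   /* a -> b -> NULL is still well formed */

    naive_append(&head, &b);    /* b is already the tail: b.next = &b */
    return has_cycle(head);     /* a plain tail walk would now spin forever */
}
```

Calling demo_self_append() reports that the chain has become cyclic after the
second append of the same element. A check of this kind, run before the
append, would trap the bad call instead of hanging in the tail walk.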

> Peter
> --
> Peter Jeremy (VK2PJ)                    peter.jeremy@alcatel.com.au
> Alcatel Australia Limited
> 41 Mandible St                          Phone: +61 2 9690 5019
> ALEXANDRIA  NSW  2015                   Fax:   +61 2 9690 5247
> 
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-hackers" in the body of the message

---
________________________________________________________________________
Duncan Barclay          | God smiles upon the little children,
dmlb@ragnet.demon.co.uk | the alcoholics, and the permanently stoned.
________________________________________________________________________



