From owner-freebsd-hackers@FreeBSD.ORG Mon Sep 15 14:30:41 2003
Received: (from root@localhost) by Mail.NOSPAM.DynDNS.dK (8.11.6/FNORD) id h8FLRrv71220; Mon, 15 Sep 2003 23:27:53 +0200 (CEST) (envelope-from bounce@NOSPAM.dyndns.dk)
Date: Mon, 15 Sep 2003 23:27:53 +0200 (CEST)
Message-Id: <200309152127.h8FLRrv71220@Mail.NOSPAM.DynDNS.dK>
From: Barry Bouwsma
To: FreeBSD List of Hackers
Subject: Machine wedges solid after one serial-port source-line addition...
List-Id: Technical Discussions relating to FreeBSD

[NOTE: IPv6-only e-mail above, so you probably want to drop me from the recipients and just send to the list, which I'll read later, as I'm not always online -- else remove just the hostname part to reveal an IPv4-aware e-mail for me that may well timeout and bounce. Sorry.]

Hello gurus and the like;

In the process of trying to enhance my FreeBSD kernel's PPS and related NTP timekeeping ability, I discovered I could reliably wedge my machine (two different machines, actually) solid, such that I couldn't break into the kernel debugger and the NumLock key wouldn't toggle the LED, and only hitting the reset/power switch could return me to sanity.

Thinking it was a problem with the logic of my added code, I pruned things and realized a single printf() line would cause my machine to hang within a few minutes of boot; of course, with a PPS source (radio clock) connected to the serial port to toggle the DCD line every second and trigger the printf().

I'd been stuck with STABLE-09.Dec.2002 for a while, but the same thing seems to happen as well with a RELENG_4 kernel as of a week or so ago -- at least with my hardware.

Would anyone care to explain why the following simple patch could be enough to wedge my machine solid? (My original hack-patches without any console printf() debuggery did the same thing within seconds, as well...)

All it does is notify the console whenever a serial port DCD PPS signal transition is detected, as follows (patch against 4.foo; I haven't tried this with 5.bar or later -- also, not a real patch as I've included context and snipped my comments) :

--- /usr/local/system/src/sys/isa/sio.c Tue Sep 2 08:57:19 2003
+++ /usr/local/source-hacks/sys/isa/sio.c Tue Sep 2 18:55:31 2003
[...]
@@ -1999,21 +2015,56 @@
 	while (!com->gone) {
 		if (com->pps.ppsparam.mode & PPS_CAPTUREBOTH) {
 			modem_status = inb(com->modem_status_port);
 			if ((modem_status ^ com->last_modem_status) & MSR_DCD) {
 				tc = timecounter;
 				count = tc->tc_get_timecount(tc);
 				pps_event(&com->pps, tc, count,
 				    (modem_status & MSR_DCD) ?
 				    PPS_CAPTUREASSERT : PPS_CAPTURECLEAR);
+				printf("DCD status change\n");
 			}
 		}
 		line_status = inb(com->line_status_port);
[...]

I'd be grateful for enlightenment.
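
For anyone who wants to poke at the transition test outside the kernel, the XOR-and-mask check in the patch can be sketched in user space roughly as below. This is only an illustration under stated assumptions: `MSR_DCD` is given the 16550 modem-status value 0x80, while `struct dcd_state`, `dcd_transition()`, `dcd_sample()` and `dcd_report()` are hypothetical names that do not exist in sio.c. The sketch also updates `last_modem_status` itself, which the snipped patch context does not show. Instead of printing at every transition it only bumps counters in the hot path and leaves the reporting to a separate call; that is one possible way to keep console output out of the sampling loop, not a diagnosis of the wedge.

```c
#include <stdint.h>
#include <stdio.h>

/* DCD bit of the 16550 modem status register (assumed value, as in MSR_DCD). */
#define MSR_DCD 0x80

/* Hypothetical stand-in for the per-port state; only what this sketch needs. */
struct dcd_state {
	uint8_t  last_modem_status;	/* status seen on the previous pass */
	unsigned assert_count;		/* DCD 0 -> 1 transitions seen */
	unsigned clear_count;		/* DCD 1 -> 0 transitions seen */
};

/* Return 1 if the DCD bit differs between the two status bytes, else 0. */
static int
dcd_transition(uint8_t last_status, uint8_t modem_status)
{
	return ((modem_status ^ last_status) & MSR_DCD) != 0;
}

/* Hot path: record the transition, but do no console output here. */
static void
dcd_sample(struct dcd_state *st, uint8_t modem_status)
{
	if (dcd_transition(st->last_modem_status, modem_status)) {
		if (modem_status & MSR_DCD)
			st->assert_count++;
		else
			st->clear_count++;
	}
	/* Assumption: the caller's loop would otherwise do this update. */
	st->last_modem_status = modem_status;
}

/* Slow path: report the counters later, outside the sampling loop. */
static void
dcd_report(const struct dcd_state *st)
{
	printf("DCD asserts: %u, clears: %u\n",
	    st->assert_count, st->clear_count);
}
```

Feeding `dcd_sample()` a sequence of status bytes and calling `dcd_report()` afterwards shows how many assert and clear edges were seen, without any output being generated at the moment of the edge.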

I'd successfully added other lines to record timestamps of other modem lines in addition to DCD (TIOCDCDTIMESTAMP) but any attempt to do anything with code comparable to the above would invariably result in a wedge within seconds to hours, from which keyboard debugger entry was ineffective.

Also note that added debuggery reveals the solid wedge doesn't happen anywhere in the suspect section of code that I sprinkled with printf()s, but I haven't done enough debuggery to narrow down where it does or does not happen.

I'm wondering if it's something really blindingly obvious that I should be but am not aware of, or something I gotta work on to track down.

Thanks,
Barry Bouwsma