Date:      Mon, 23 Jan 1995 01:53:04 -0500 (EST)
From:      Wankle Rotary Engine <wpaul@skynet.ctr.columbia.edu>
To:        wraith@csd.uwm.edu (Robert Michael Gorichanaz)
Cc:        hackers@FreeBSD.org, questions@FreeBSD.org
Subject:   Re: Is this a bug?!?
Message-ID:  <199501230653.BAA02642@skynet.ctr.columbia.edu>
In-Reply-To: <199501230250.UAA20330@alpha2.csd.uwm.edu> from "Robert Michael Gorichanaz" at Jan 22, 95 08:50:32 pm

They say this Robert Michael Gorichanaz person was kidding when he wrote:
> 
> I seem to have stumbled upon a bug (?!?) in FreeBSD

What version?!?!?!

> - or maybe its a
> hardware thing - dunno.
> 
> I just finished upgrading one of my 'bsd boxes from a dx2-50 ISA to a dx2-66
> VLB motherboard.  Problems:
> 
> 	I now seem to be unable to issue a Ctrl-Alt-Del (system
> 	just beeps at me each time I hit the combo).  
> 
> 	'shutdown' results in a kernel panic:
> 
> 		panic: b_to_q to a clist with no reserved cblocks

Well, there's a good chance that this particular panic isn't the result of 
a hardware problem, though it certainly is a bug. You need to supply
*all* the messages generated between the time you typed 'shutdown' and
the time when the panic message appeared. Write them down if you have
to ("I don't remember the error messages exactly" won't cut it: detail
is important). What I'm curious to know is exactly how far shutdown
gets before the system pukes. Does it do it immediately? Does it
shut down some processes first? Please be specific. For extra credit,
provide a stack trace. :)

I've been able to reliably reproduce this panic message on my system now 
that I've started looking into actually using /dev/console for something. 
Basically, I want to be able to put a getty on /dev/console (instead of 
/dev/ttyv0) so that I can boot either using the VGA display or a serial 
port as a console and have a getty pop up in the right place without 
having to modify /etc/ttys. (SunOS handles this correctly: /dev/console 
is always the console device no matter what physical I/O device you 
use.) This would be especially handy now that FreeBSD-current can be 
booted from a serial port without any special tinkering. The general 
consensus is that you aren't supposed to put a getty on /dev/console, but 
I disagree (and I certainly don't think it should result in a panic), so 
I've decided to hunt down and terminate this bug with extreme prejudice 
no matter how much of my real work I have to put aside to do it. :)

The way I've been able to duplicate the problem is as follows:

1) Boot the system with a VGA console
2) Edit /etc/ttys and activate a getty on 'console' while turning off the
   getty on 'ttyv0.'
3) Log in on 'console' as root
4) Type the following: echo kaboom > /dev/ttyv0

Alternatively, if you boot with a serial port as your console, you
can replace step 4 with: echo kaboom > /dev/ttyd0. Another way to do it 
is to fire up the X server. I've also seen the kernel panic in putc() 
because of the same problem (no reserved cblocks). Basically, if you
write to the console device while /dev/console is also open, you get
a panic.

The immediate problem is that the tty struct that eventually makes it
to ttwrite() doesn't have its clists set up correctly. cblocks are
supposed to be reserved in ttyopen() (from what I can tell). After
finally building a kernel with debugging symbols (6 Mbytes worth!)
and 'options DODUMP,' I managed to analyze a crash dump and discovered
that writing to /dev/ttyv0 actually causes the kernel to go from
write(), to vn_write(), to ffsspec_write(), to spec_write(), to cnwrite(),
to scwrite(), to ttwrite() and then to b_to_q() and a panic. The
strange thing is the call to cnwrite(): this should only happen if
you actually write to /dev/console -- a write directly to /dev/ttyv0
should not end up there. Tracing back, I discovered that 
vn->v_un.vu_specinfo->si_rdev in spec_write() was 0x00000000 where
it probably should have been 0x0c000000 (or something like that --
basically, 0x00000000 is major 0/minor 0 where it should have been
major 12/minor 0 (for syscons) instead).

My feeling is that there's a lookup function somewhere that's
getting confused and returning the major/minor numbers for /dev/console
when it should be returning the major/minor of the actual driver
device. That, or it's failing entirely and just returning 0. The trouble 
is that there's a lot of code to cover: the problem could be in vn_write(),
or it could be in vn_open() (you have to open the device before you can
write to it, right?) or somewhere in between. Kernel printf()s themselves
don't cause problems because there aren't any 'struct tty's involved.
I'm not at all sure why normal writes to /dev/console work but writes to 
/dev/ttyv0 (or /dev/ttyd0 depending on the circumstances) with 
/dev/console open will bomb. Hmmm... now that I think of it, maybe 
/dev/ttyv0 isn't being opened at all... could it be /dev/console is
actually being opened in place of /dev/ttyv0 by mistake? Maybe we're
getting the major/minor number wrong right from the start and 
ttyopen()ing /dev/console for a second time instead of ttyopen()ing 
/dev/ttyv0, and then ttwrite()ing to ttyv0 when it hasn't really been
properly open()ed yet. That would explain the bogus call to cnwrite()
and uninitialized clists. I'll have to look into this tomorrow. 

> 	'reboot' 'fastboot' 'halt' etc. all lock the system just as
> 	it is about to reboot to system.  I get the "Press any key to reboot"
> 	message, and then the keyboard locks and I hafta hit the reset
> 	switch.

This I don't know about. Whenever my system panics because of the
'no reserved cblocks' problem it always reboots cleanly. I have
extremely generic hardware, however. (386DX/40 (no FPU) with 8 megs
of RAM and only IDE disks)
 
> The board I used to use had the Symphony chipset, AMI bios and an Intel
> dx-50.  This new one has a Bioteq chipset, AMI bios, and an AMD dx2-66.
> 
> MoBo functions just fine under a DOS/Windows environment (ctrl-alt-del
> works).

Rrrrr... please don't say this. In reality, nothing functions just fine
in a DOS/Windoze environment. If it did, you wouldn't be running FreeBSD.

> Cards:  Soundblaster 16
> 	Buslogic BT-545 16bit busmaster SCSI
> 	'No-name' VLB IDE controller
> 	Trident 9400CXI VLB video 1MB
> 
> This is the only panic I have experienced.  While this isnt a HUGE problem,
> I really need to be able to reboot this machine remotely (have an autodial 
> script to start a point-to-point link w/outside world).  
> 
> Has anyone had similar problems with VLB boards?  Do I have a piece of sh*t
> MoBo?

There may be a hardware problem involved, but I'm not sure how closely
related it is to this panic. From what I've read, FreeBSD uses something
of a brute force approach to reset the CPU (it actually causes the CPU
to triple fault, which shuts it down). But that doesn't happen until
*after* the panic: I can understand your hardware objecting to the
brute force CPU reset, but that doesn't account for the panic since
that happens before cpu_reset() is called. 

If you're feeling really adventurous you can generate a crash
dump and try to trace through it to see exactly what the kernel does
that leads up to the panic. This is a little tricky, but it's the
best way to isolate the problem.

-Bill

-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~T~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Bill Paul            (212) 854-6020 | System Manager
Work:         wpaul@ctr.columbia.edu | Center for Telecommunications Research
Home:  wpaul@skynet.ctr.columbia.edu | Columbia University, New York City
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The Møøse Illuminati: ignore it and be confused, or join it and be confusing!
~~~~~~~~ FreeBSD 2.1.0-Development #1: Fri Jan 20 14:28:17 EST 1995 ~~~~~~~~~


