Date:      Mon, 13 Nov 1995 11:52:43 -0700 (MST)
From:      Terry Lambert <terry@lambert.org>
To:        jkh@time.cdrom.com (Jordan K. Hubbard)
Cc:        current@FreeBSD.org
Subject:   Re: ISP state their FreeBSD concerns
Message-ID:  <199511131852.LAA17000@phaeton.artisoft.com>
In-Reply-To: <14486.816158986@time.cdrom.com> from "Jordan K. Hubbard" at Nov 11, 95 10:49:46 pm

> 
> First off, I really have to wonder why Frank Durda went to what was
> obviously a considerable effort to be anonymous.  The message had no
> signature and has been sent with a "From" of -current.  C'mon Frank,
> we don't bite!  Valid criticism always has a place in these lists and
> going out of your way to mask its origin only calls its validity into
> question unnecessarily.  If I didn't know you were the only poster
> we've ever had from fw.ast.com, I might have thought this was
> something with a deeper political agenda from one of the other *BSD
> advocates and ignored it.

Frank obviously includes his signature manually, and simply did the
incantation as the last thing before sending -- and happened to be in
insert mode.  I don't think he was hiding.

> > 1.	A concern that FreeBSD tends to "bind" for brief periods when
> > 	loaded. Here is how it was described to me:  You will be doing
> > 	something (like skimming news articles in trn or tin) that is
> 
> This one really needs some additional context.  Suffice it to say that
> I've not experienced such behavior myself, and that's about all one
> can say with a fairly subjective report like this.  That's not to say
> that it's totally lacking in validity, but consider what you yourself
> would do with a user report that "the system seemed slow."

Try "the system appears to halt for a few seconds".

To repeat:

Build a system with:

	P90 with ASUS motherboard
	16M of physical memory or less
	A lot of swap
	An NCR SCSI controller
	1G or more of disk

Then:

	Build a bunch of kernels
	Build world once or twice
	(get that cache nice and full and fragmented)
	Pop up an xterm
	Run Elm on a couple of large mailboxes

It's infinitely repeatable.  I do it 3-4 times a week.

Typing "sync" fixes the problem for a while, as does rebooting.

I believe it to be an over-caching issue.  Maybe there needs to be
high/low watermarking on how much memory is allowed to be used for
the buffer cache vs. the VM system.  Or a page reserve for VM that is
not allowed to be allocated for buffers.  Etc.

It seems to be a starvation deadlock.
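
Something along these lines is what I have in mind for the
watermarking (a sketch only; the counter names are made up, and the
real numbers would have to come from the buffer cache / VM
accounting):

	#define BUF_HIWAT_PCT	30	/* buffers may use at most 30% of RAM */
	#define BUF_LOWAT_PCT	10	/* don't reclaim buffers below 10% */

	/* may the buffer cache take another page? */
	int
	buf_may_grow(int bufpages, int physpages)
	{
		return (bufpages < (physpages * BUF_HIWAT_PCT) / 100);
	}

	/* should buffers be given back to the VM system? */
	int
	buf_should_shrink(int bufpages, int physpages, int vm_free, int vm_reserve)
	{
		return (vm_free < vm_reserve &&
		    bufpages > (physpages * BUF_LOWAT_PCT) / 100);
	}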


> > 4.	A concern about problems related to filesystem stacks, such as
> > 	ISO9660 and DOS.  (They may be talking about Samba, and not
> > 	the actual mounting of DOS filesystems.)  One of the ISO9660
> > 	issues is the one I reported last week where a plain user
> > 	can easily break all mounted ISO9660 access and hang all processes
> > 	that attempt to access the ISO filesystems.   (We discovered this
> > 	on one of the local ISPs archive server - they were not pleased.)
> > 	These guys with lots of CD-ROMs mounted on their systems don't
> > 	like the sound of that remaining broken for any period of time,
> > 	particularly when it ripples to other systems via NFS.
> 
> DOS filesystem support is broken.  I would not be surprised to learn that
> the ISO9660 support was full of mice as well.  Neither of these things
> will be fixed until either:
> 
> 	1. Somebody volunteers the time.

I would be happy to fix this.  Unfortunately, relookup() is the main
failure mode here (from my investigations), and it seems that there
are some indeterminate states the VFS can get into because the
layering abstraction is being broken.

Ultimately, I'd like to get rid of relookup() entirely by propagating
the lock mechanism up and allowing lock reentrancy in cases other than
the ffs_readwrite.c one.
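
The idea, in userland terms, is a lock that the same holder can take
again without deadlocking, so an upper layer can hold it across a
lookup instead of dropping it and calling relookup().  A pthreads
sketch for illustration only (the kernel version would track the
owning process however the vnode lock code does today):

	#include <pthread.h>

	struct rlock {
		pthread_mutex_t	mtx;	/* init with pthread_mutex_init() */
		pthread_cond_t	cv;	/* init with pthread_cond_init() */
		pthread_t	owner;
		int		depth;	/* 0 == unowned */
	};

	void
	rlock_acquire(struct rlock *l)
	{
		pthread_mutex_lock(&l->mtx);
		if (l->depth > 0 && pthread_equal(l->owner, pthread_self())) {
			l->depth++;		/* reentrant acquire */
		} else {
			while (l->depth > 0)
				pthread_cond_wait(&l->cv, &l->mtx);
			l->owner = pthread_self();
			l->depth = 1;
		}
		pthread_mutex_unlock(&l->mtx);
	}

	void
	rlock_release(struct rlock *l)
	{
		pthread_mutex_lock(&l->mtx);
		if (--l->depth == 0)
			pthread_cond_broadcast(&l->cv);
		pthread_mutex_unlock(&l->mtx);
	}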

> > 6.	Multi-port serial support and even single-port serial support.
> > 	Seems they feel hardware flow control doesn't work or isn't enabled
> > 	when it should be (or can't be) or both.  This may actually be
> > 	an issue with the 16550 drivers not setting silo depth to a
> > 	reasonable level.   Most of these guys use terminal servers for
> > 	the actual users where these local serial ports are used for UPS,
> > 	router control, serial printers, and other in-house controls.  They
> > 	complain of seeing problems with lost data outbound when flow
> > 	control was in use, including during UUCP sessions.  
> 
> I use a standard serial port for a 115.2K ISDN connection and it works
> *great*, even better than a friend of mine who's using a Cisco for the
> purpose and looks enviously at my 10.5K/sec FTP transfer rates when
> he's only getting 7K.  Not to say that this scales to hundreds of
> serial ports or anything, but it does work!

He didn't identify hardware vs. software flow control.  If software, then
his observation about SILO depth is well taken.  Your ^S could be in limbo
for quite some time before it is seen.  A ^Q in limbo is a much worse
problem, actually.
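
For reference, the silo depth is just the receive trigger level in the
16550 FIFO control register; setting it looks something like the
following (outb() here is a stand-in for whatever port output routine
the driver really uses):

	#define UART_FCR	2	/* FIFO control reg, offset from base */
	#define FCR_ENABLE	0x01	/* enable TX/RX FIFOs */
	#define FCR_RX_RST	0x02	/* reset receive FIFO */
	#define FCR_TX_RST	0x04	/* reset transmit FIFO */
	#define FCR_TRIG_1	0x00	/* interrupt at 1 byte in the silo */
	#define FCR_TRIG_4	0x40	/* interrupt at 4 bytes */
	#define FCR_TRIG_8	0x80	/* interrupt at 8 bytes */
	#define FCR_TRIG_14	0xc0	/* interrupt at 14 bytes: 2 bytes headroom */

	extern void outb(unsigned port, unsigned char val);	/* stand-in */

	void
	sio_set_trigger(unsigned iobase, unsigned char trig)
	{
		outb(iobase + UART_FCR,
		    FCR_ENABLE | FCR_RX_RST | FCR_TX_RST | trig);
	}

At a trigger of 14, an incoming ^S can sit in the silo until the
trigger (or the receiver timeout) finally fires; meanwhile the local
side keeps transmitting.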

> > 	They point to Linux, where the core seems to be going through
> > 	few changes (what about the "FT" guys?), or a purchased system
> > 	(SUN/SCO) which sees a maintenance release that only alters a small
> 
> 1. Linux is hardly unchanging, the FT effort being still mostly on the
>    drawing board.

This was offered as a counterexample: there is "tinkering under the
hood" going on in Linux, too.

> 2. SCO doesn't change because it's been *moribund* for years!  They started
>    with a mediocre system and enshrined it.  You won't get very far with
>    me by holding them up as any sort of paragon to be emulated.  I've done
>    my time with SCO, from both ends of the picture, and all of it was
>    unilaterally horrible.

Commercial release processes have advantages and disadvantages.  The
BSD release processes are nowhere near commercial quality on QA/QC,
but they frequently include much more than a typical commercial
release.  It's simply a different tradeoff between surety and
stagnation than the one SCO makes.

> > 8.	File creation (particularly directories) appears to be slow compared
> > 	to other BSD-like systems.  They say the stats for INN and CNEWS
> > 	for articles processed per second are quite a bit lower than that
> > 	on some "other" systems.  They say that file deletion seems to be
> > 	a bit slower than BSDI, but not by much.  I think they are talking
> > 	2.0.5 on this item, although one ISP was experimenting with 1026 SNAP.
> 
> It would help if they could percolate this down to some benchmarks
> that could be easily run by the people responsible for enhancing said
> performance.  It's hard to speed up what you can't reproduce without
> being on-site with the ISP.

I think the University of Michigan EECS's paper on "Metadata Update
Performance in File Systems" (USENIX Symposium on Operating Systems
Design and Implementation, November 1994, pp. 49-60) is well taken in
this regard.  They used BSDI as their base platform, so their code
will need rewriting before they can release it.  It wants ~1500 lines
of changes to the buffer cache code.

> > 10.	More support for high-end hardware.  I put this last because
> > 	it is one of the harder things for FreeBSD to do much about
> > 	since hardware vendors don't always want to tell us the tricks
> 
> We're working on it, though if some people want to DONATE said
> high-end hardware to us so that we have a better chance of actually
> doing it, it would be a great help.  These people don't expect us to
> go out of pocket for $8K multi-CPU P6 machines now do they?  Your ISPs
> seem to expect a lot if so! :-)

Depending on the hardware you insist on having with it, a 2 CPU P90 box
costs on the order of $2000.  A 2 CPU PPC-66 box costs on the order of
$2600.

That said, there does not seem to be much interest in integrating
changes that move toward SMP and other high-end hardware support.
Full SMP support will require many incremental moves in the direction
of kernel multithreading before a high degree of parallelism can be
put into place.  One of the main issues here is enabling the code for
subsystem-level mutex-based locking by simplifying the task of
maintaining lock state; the best way to achieve this is to enable the
code for routine entry/exit locking, allowing incremental enabling of
reentrancy at the system call layer.
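
To make "entry/exit locking" concrete, here's a toy userland
illustration of the shape of it (the names and the table layout are
made up): one big lock is taken on system call entry and dropped on
exit, and individual calls get flagged as reentrant-safe one subsystem
at a time as the underlying code is made safe.

	#include <pthread.h>

	static pthread_mutex_t kernel_lock = PTHREAD_MUTEX_INITIALIZER;

	struct sysent {
		int	(*sy_call)(void *args);
		int	sy_reentrant;	/* set to 1 once the subsystem is safe */
	};

	int
	syscall_dispatch(struct sysent *se, void *args)
	{
		int error;

		if (!se->sy_reentrant)
			pthread_mutex_lock(&kernel_lock);	/* entry lock */

		error = se->sy_call(args);

		if (!se->sy_reentrant)
			pthread_mutex_unlock(&kernel_lock);	/* exit unlock */

		return (error);
	}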


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.


