From: grog@lemis.de
Subject: Re: Informix efficiency--any ideas?
In-Reply-To: <199704100747.RAA01734@genesis.atrad.adelaide.edu.au> from Michael Smith at "Apr 10, 97 05:17:34 pm"
To: msmith@atrad.adelaide.edu.au (Michael Smith)
Date: Thu, 10 Apr 1997 17:25:40 +0900 (KST)
Cc: chat@FreeBSD.ORG (FreeBSD Chat)
Organisation: LEMIS, Schellnhausen 2, 36325 Feldatal, Germany
Phone: +49-6637-919123
Fax: +49-6637-919122
Reply-to: grog@lemis.de (Greg Lehey)
WWW-Home-Page: http://www.FreeBSD.org/~grog

Michael Smith writes:
> grog@lemis.de stands accused of saying:
>>
>> We did some examination and found that the port of Informix on A
>> performs 21729 system calls to insert 5000 records into a table. Of
>> these, 21403 calls are to semsys. B performs approximately half this
>> number of calls, equally split (what a surprise) between send() and
>> recv(): although it's System V.4, it has a native sockets
>> implementation.
>>
>> I'm obviously following this up with Informix and vendor A, but I'd
>> be interested if anybody here had a view with a different bias. Do
>> semaphores have to be so much less efficient, or could this just be
>> a poor semxxx() implementation on A?
>
> I don't think it's that the semaphores are less efficient, but that
> there is more semaphore activity required to transfer the given data.

Sure, there's that.

> At a guess, a record transaction with semaphores goes something like:
>
>   writer puts data in shared memory area, raises 'data is there' semaphore.
>   reader wakes on semaphore, raises 'I am busy with your data' semaphore.
>   reader finishes with data, lowers 'I am busy with your data' semaphore.
>   writer wakes on semaphore, repeat.
>
> That's 4 system calls for a single transaction.
>
> With sockets, I would expect:
>
>   writer calls send() with data, repeat.
>   reader returns from recv() with data, repeat.
>
> i.e. two calls, and if socket buffering works the two will be
> decoupled as well, reducing the impact of the (potentially) less
> efficient send/recv by eliminating the context switches between
> transaction cycles.

Well, that's the question. This particular program ran for 2 minutes on
platform A and one minute on platform B, a pretty constant rate of 175
system calls per second on both, not exactly likely to swamp a machine
of this class. This suggests to me that there's something seriously
wrong somewhere.

> TBH, I don't know how the various SMP implementations would come into
> play here; specifically, on the send/recv model I would expect
> (assuming an idle system) that one half would be running on each CPU,
> i.e. the long-term throughput would be that of the slowest component
> rather than of the system as a whole. In contrast, the lock-step
> model using semaphores basically ensures that a single client/server
> pair runs as a single logical thread.
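Just to make the cost concrete: the handshake Michael describes above
comes out to something like this in System V terms. This is purely an
illustration, not Informix's actual code; the function names, the record
size and the exact semaphore discipline are all invented.

    /*
     * Sketch of a one-record exchange through shared memory, paced by
     * a set of two System V semaphores.  Assumes a freshly created set
     * starts with both values at 0, as System V implementations
     * generally arrange.
     */
    #include <sys/types.h>
    #include <sys/ipc.h>
    #include <sys/sem.h>
    #include <sys/shm.h>
    #include <string.h>

    #define RECSIZE 512                     /* invented record size */

    enum { DATA_READY = 0, DATA_DONE = 1 }; /* two semaphores in one set */

    static void sem_change(int semid, int which, int delta)
    {
        struct sembuf op;

        op.sem_num = which;
        op.sem_op = delta;                  /* +1 raises, -1 waits and lowers */
        op.sem_flg = 0;
        semop(semid, &op, 1);               /* one system call */
    }

    /* Writer: copy a record into shared memory, wake the reader, wait. */
    void put_record(int semid, char *shm, const char *rec)
    {
        memcpy(shm, rec, RECSIZE);          /* no system call, just a copy */
        sem_change(semid, DATA_READY, 1);   /* syscall 1 */
        sem_change(semid, DATA_DONE, -1);   /* syscall 2: wait for reader */
    }

    /* Reader: wait for a record, copy it out, release the writer. */
    void get_record(int semid, const char *shm, char *rec)
    {
        sem_change(semid, DATA_READY, -1);  /* syscall 3: wait for data */
        memcpy(rec, shm, RECSIZE);
        sem_change(semid, DATA_DONE, 1);    /* syscall 4: release writer */
    }

    /* One-off setup: a private shared segment and the two semaphores. */
    int setup_channel(int *semid, char **shm)
    {
        int shmid = shmget(IPC_PRIVATE, RECSIZE, IPC_CREAT | 0600);

        *semid = semget(IPC_PRIVATE, 2, IPC_CREAT | 0600);
        *shm = shmat(shmid, NULL, 0);
        return (shmid == -1 || *semid == -1 || *shm == (char *) -1) ? -1 : 0;
    }

That's two semop() calls on each side for every record, four system
calls in all, against one send() and one recv() on the socket path,
where the socket buffer also lets the writer run ahead of the reader
instead of blocking after every record.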
Now that's an interesting thought. Basically, the socket implementation
doesn't even need to be as fast as the semaphore implementation (*if*
you're restricting your semaphore count to 1; IIRC, you can have more
than one on System V), since it can queue. That might explain why we
were getting up to 30% idle on System A.

In the meantime, we have found some more documentation. It seems that
there are "protocols" for talking to the server. One is called tlitcp,
and the other is called ipcshm. System A was using ipcshm, and System B
was using tlitcp, so now we're repeating a (longer, more accurate)
benchmark the other way round.

Greg
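PS: For anyone who wants to repeat the comparison: as far as I recall,
the ipcshm/tlitcp choice is made per server entry in Informix's sqlhosts
file, and the client picks an entry via the INFORMIXSERVER environment
variable. The lines below are a sketch from memory only; the server
names and host are invented, and the exact nettype spellings vary with
the engine and version:

    # dbservername   nettype    hostname   servicename
    ifx_shm          onipcshm   dbhost     ifx_shm
    ifx_tcp          ontlitcp   dbhost     sqlexec

If that's right, switching the benchmark from one transport to the other
should just be a matter of pointing INFORMIXSERVER at the other entry,
provided both are configured on the server side.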