Date:      Thu, 5 Sep 1996 11:23:06 +0200 (SAT)
From:      rv@groa.uct.ac.za (Russell Vincent)
To:        smp@freebsd.org
Subject:   SMP on Intel XXpress
Message-ID:  <m0uyaeS-0004vdC@groa.uct.ac.za>

Steve and I have come across another problem while getting the
FreeBSD SMP code working on the Intel XXpress and I thought I
would bounce it off the list:

We seem to be able to start the second CPU now.

 sysctl -w kern.smp_active=2

starts the CPU, but the machine appears to freeze at this point.
Steve added some debug code, which shows that the processors are
still running.  What was added was the following (compressed a little
to save space). More comments follow the included code.

------------------------------------------------
--- sys/i386/i386/machdep.c
void pp1( int d );
void pp2( int d, int x );
static int pptimes = 0;
extern int smp_active;

void
pp1( int d )
{
    if ( smp_active != 2 ) return;
    if ( pptimes > 0 ) { if ( d == 9 ) --pptimes; return; }
    printf( " --- point: %d ---\n", d );
    if ( d == 9 ) pptimes = PP_PASSES;
}

void
pp2( int d, int x )
{
    if ( smp_active != 2 ) return;
    if ( pptimes > 0 ) { if ( d == 9 ) --pptimes; return; }
    printf( " --- point: %d, val: 0x%08x ---\n", d, x );
    if ( d == 9 ) pptimes = PP_PASSES;
}

-------------------------------------------------
--- sys/i386/i386/swtch.s

/*
 * cpu_switch()
 */
ENTRY(cpu_switch)
        
#ifdef MY_DEBUG
        pushl $1; call _pp1; addl $4,%esp
        GETCPUID(%ecx)
        pushl %ecx; pushl $2; call _pp2; addl $4,%esp; popl %ecx
#endif  /** MY_DEBUG */
        /* switch to new process. first, save context as needed */
        GETCURPROC(%ecx)
#ifdef MY_DEBUG
        pushl %ecx; pushl $3; call _pp2; addl $4,%esp; popl %ecx
#endif  /** MY_DEBUG */

        /* if no process to save, don't bother */
        testl   %ecx,%ecx
        je      sw1

[ Code deleted for this message ]

swtch_com:
#ifdef MY_DEBUGxx
        pushl %ecx; pushl $4; call _pp2; addl $4,%esp; popl %ecx
        /* pushfl; pushl $5; call _pp2; addl $8,%esp  ** Causes panic - rv ** */
#endif  /** MY_DEBUG */
        movl    $0,%eax
        movl    %eax,_want_resched

#ifdef  DIAGNOSTIC

[ More code deleted for this message ]

        movb    $0,_intr_nesting_level
#ifdef SMP
        movl    _apic_base, %eax
        movl    APIC_ID(%eax), %eax
#ifdef MY_DEBUG
        andl    $0x0f000000, %eax
# ifdef XXPRESS
        je 9f
        shrl $24,%eax
        decl %eax
        shll $24,%eax
9:
# endif  /** XXPRESS */
#else
        andl    $0xff000000, %eax
#endif  /** MY_DEBUG */
        orl     PCB_MPNEST(%edx),%eax
        movl    %eax, _mp_lock
#endif /* SMP */

[ More code deleted for this message ]

2:
#endif

#ifdef MY_DEBUGxx
        pushl $9; call _pp1; addl $4,%esp
#endif  /** MY_DEBUG */
        sti
        ret

-------------------------------------------------

After adding that code, the console prints out the 'point'
lines, with what appear to be acceptable values. Even more
interesting, the machine then became responsive, albeit a
little sluggish.

Disabling the 'MY_DEBUG' code causes the machine to go back
into its 'frozen' state. Enabling 'MY_DEBUG' but commenting
out the printf statements in pp1() and pp2() also causes the
machine to 'freeze'.

Increasing 'PP_PASSES' for pp1() and pp2() makes the 'point' lines
print out a lot more slowly, which in turn makes the machine even
more sluggish.
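
To make the effect of 'PP_PASSES' concrete, here is a minimal
user-space sketch of the pptimes counter from pp1()/pp2() above. It
assumes every pass through cpu_switch hits points 1, 2, 3 and 9, and
it leaves out the smp_active check; the names throttle_point() and
count_printing_passes() are hypothetical and are not in the kernel.

```c
/* Hypothetical user-space model of the pptimes throttle in pp1()/pp2(). */
struct throttle {
    int passes;   /* stands in for PP_PASSES */
    int pptimes;  /* passes still to be skipped before printing again */
};

/* Returns 1 if a 'point' line would be printed for point d, else 0. */
static int
throttle_point(struct throttle *t, int d)
{
    if (t->pptimes > 0) {
        if (d == 9)
            --t->pptimes;   /* point 9 marks the end of one pass */
        return 0;           /* output suppressed during quiet passes */
    }
    if (d == 9)
        t->pptimes = t->passes;  /* start a new quiet period */
    return 1;
}

/* Simulates n passes (assumed points: 1, 2, 3, 9) and counts how many print. */
static int
count_printing_passes(int pp_passes, int n)
{
    static const int points[] = { 1, 2, 3, 9 };
    struct throttle t = { pp_passes, 0 };
    int printed = 0;

    for (int i = 0; i < n; i++) {
        int any = 0;
        for (int j = 0; j < 4; j++)
            any |= throttle_point(&t, points[j]);
        printed += any;
    }
    return printed;
}
```

Under those assumptions only one pass in every PP_PASSES + 1 reaches
printf, so a larger PP_PASSES means proportionally fewer printf calls.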

This, to me, sounds as if the machine is no longer handling interrupts
once the second processor is enabled with sysctl. What increases my
suspicion is that I can adjust the ping response times to the machine
by adjusting 'PP_PASSES' and hence the frequency with which the printf
statements are called, i.e. it looks like the call to 'printf' is
enabling the interrupts temporarily. The amount that is actually
printed in the printf also makes a difference.

More confirmation comes from a little script I have that times a
series of processes, starting with a single process, then two
simultaneous processes, followed by three, followed by four. Each
process is a small C program in a tight loop. A sample result after
the sysctl is:

Single:
        3.95 real         0.00 user         1.95 sys
Dual:
        3.77 real         3.76 user         0.00 sys
        3.87 real         3.75 user         0.05 sys
Triple:
        3.77 real         0.00 user         3.76 sys
        5.68 real         3.72 user         0.05 sys
        5.74 real         3.60 user         0.17 sys
Quad:
        3.79 real         3.76 user         0.00 sys
        6.33 real         3.70 user         0.07 sys
        7.58 real         3.72 user         0.06 sys
        7.66 real         3.67 user         0.09 sys
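
For reference, a tight-loop worker of the kind described above can be
as small as the sketch below. This is a hypothetical stand-in, not the
program that produced the figures; the name spin() and the choice of
iteration count are assumptions.

```c
/* Hypothetical stand-in for the "small C program in a tight loop". */
static unsigned long
spin(unsigned long iterations)
{
    /* volatile stops the compiler from optimising the loop away */
    volatile unsigned long counter = 0;

    for (unsigned long i = 0; i < iterations; i++)
        counter++;
    return counter;
}
```

Called from main() with a count large enough to run for a few seconds,
and started one, two, three or four at a time under a timing command,
this gives pure user-mode CPU load with no I/O.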

When only a single processor is running, the real times double,
triple, and so on as processes are added, which is correct. As
an aside, the times are about 3.7 real without the SMP code in
src/sys, indicating that things are pretty efficient.  :-)
(Also remember that the kernel is sending streams of printf's to
the console while dual processors are alive, using up some CPU.)
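
One possible reading of the dual-processor figures, offered as an
assumption and not as a diagnosis: one CPU keeps whichever process it
picks up until that process exits, while the remaining processes
time-share the other CPU. The sketch below works out the finish times
that model would give; predict_finish_times() is a hypothetical helper.

```c
/*
 * Assumed model: n identical CPU-bound jobs, each needing `work`
 * seconds of CPU. One CPU runs a single job to completion without
 * switching; the other CPU shares its time equally among the rest.
 * When the dedicated job exits, one of the waiting jobs takes its place.
 * finish[] must have room for n entries.
 */
static void
predict_finish_times(int n, double work, double *finish)
{
    double remaining = work;  /* CPU time every unfinished job still needs */
    double now = 0.0;
    int left = n;
    int k = 0;

    while (left > 0) {
        /* the dedicated job runs its remaining work uninterrupted */
        now += remaining;
        finish[k++] = now;
        left--;
        /* meanwhile the other `left` jobs each got 1/left of one CPU */
        if (left > 0)
            remaining -= remaining / left;
    }
}
```

With work = 3.77 this gives about 3.77, 5.66, 5.66 for three jobs and
3.77, 6.28, 7.54, 7.54 for four, against the measured 3.77, 5.68, 5.74
and 3.79, 6.33, 7.58, 7.66. The agreement is suggestive only; a single
sample run cannot confirm the model.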

I realise that this is pretty vague, but it does show that user
response (i.e. interrupt-driven code) is the only code being
affected by the 'sluggishness' - both processors seem to be
running perfectly.

Have I made enough sense for anyone to prod us in the right
direction?  Shout if you need any more code samples or info.

 -Russell



