Date:      Tue, 15 Sep 2009 16:56:17 +0000 (UTC)
From:      Attilio Rao <attilio@FreeBSD.org>
To:        cvs-src-old@freebsd.org
Subject:   cvs commit: src/sys/kern sched_ule.c
Message-ID:  <200909151656.n8FGuZ9P037542@repoman.freebsd.org>

attilio     2009-09-15 16:56:17 UTC

  FreeBSD src repository

  Modified files:
    sys/kern             sched_ule.c 
  Log:
  SVN rev 197223 on 2009-09-15 16:56:17Z by attilio
  
  Fix sched_switch_migrate():
  - In 8.x and above the run-queue locks are no longer shared, even in the
    HTT case, so remove the special case.
  - The deadlock explained in the removed comment is still possible even
    with different locks, with the contribution of tdq_lock_pair().
    An explanation follows.
    (Hypothesis: a thread needs to migrate to another CPU. thread1 is the
    thread that needs to migrate and is doing sched_switch_migrate();
    thread2 is the thread handling the sched_switch() request, that is,
    the thread that is going to be preempted, most likely an idle thread.
    'old' refers to the context (in terms of run-queue and CPU) thread1
    is leaving and 'new' refers to the context thread1 is going into.
    Finally, thread3 is doing tdq_idletd() or sched_balance() and
    ultimately doing tdq_lock_pair().)
  
    * thread1 blocks its td_lock. Now td_lock is 'blocked'
    * thread1 drops its old runqueue lock
    * thread1 acquires the new runqueue lock
    * thread1 adds itself to the new runqueue and sends an IPI_PREEMPT
      through tdq_notify() to the new CPU
    * thread1 drops the new lock
    * thread3, scanning the runqueues, locks the old lock
    * thread2 receives the IPI_PREEMPT and does thread_lock() with td_lock
      pointing to the new runqueue
    * thread3 wants to acquire the new runqueue lock, but it can't because
      it is held by thread2, so it spins
    * thread1 wants to acquire the old lock, but it can't as long as it is
      held by thread3
    * thread2, going further, at some point wants to switch in thread1,
      but it will wait forever because thread1->td_lock is in the blocked
      state
  
  This deadlock has manifested mostly on 7.x and has been reported several
  times on the mailing lists as 'spinlock held too long'.
  Many thanks to des@ for working hard on producing suitable textdumps
  and to Jeff for help with the comment wording.
  
  Reviewed by:    jeff
  Reported by:    des, others
  Tested by:      des, Giovanni Trematerra
                  <giovanni dot trematerra at gmail dot com>
                  (STABLE_7 based version)
  
  Revision  Changes    Path
  1.258     +11 -11    src/sys/kern/sched_ule.c


