Date: Fri, 7 Feb 2014 16:12:56 -0800
From: Adrian Chadd <adrian@freebsd.org>
To: "freebsd-arch@freebsd.org" <freebsd-arch@freebsd.org>, FreeBSD Net <freebsd-net@freebsd.org>
Subject: flowtable, collisions, locking and CPU affinity
Message-ID: <CAJ-VmonNCzFED=20_C2fV1g1jvFNRE=N-H%2B09Wb2OdxdzHp9JQ@mail.gmail.com>

Hi,

I've been knee deep in the flowtable code looking at some of the less .. predictable ways it behaves. One of them is the collisions that pop up from time to time.

I dug into it in quite some depth and found out what's going on. This assumes it's a per-CPU flowtable.

* A flowtable lookup is performed, on say CPU #0
* the flowtable lookup fails, so it goes to do a flowtable insert
* .. but in between the two, the flowtable "lock" is released so it can do a route/adjacency lookup, and that grabs a lock
* .. then the flowtable insert is done on a totally different CPU
* .. which happens to _have_ the flowtable entry already, so the insert fails as a collision with the matching entry that is already there.

Now, the reason for this is primarily that there's no CPU pinning in the lookup path. If there's contention during the route lookup phase, the scheduler may decide to run the kernel thread on a totally different CPU to the one that was running the code when the lock was entered.

Now, Gleb's recent changes seem to have made the instances of this drop, but he didn't set out to fix it. So something about his changes has altered the locking/contention profile that I was using to easily reproduce it.

In any case - the reason it's happening is that there's no actual lock held over the whole lookup/insert path. It's a per-CPU critical enter/exit path, so the only way to guarantee consistency is to use sched_pin() for the entirety of the function.

I'll go and test that out in a moment and see if it quietens the collisions that I see in lab testing.

Has anyone already debugged/diagnosed this? Can anyone think of an alternate (better) way to fix this?

Thanks,

-a
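
For illustration, the sequence described in the list above can be modelled in userspace. This is only a sketch under stated assumptions, not the kernel flowtable code: the table layout, NCPU, NSLOT and the names ft_reset(), ft_lookup(), ft_insert() and flow_miss_collides() are all hypothetical, and the "pinned" flag merely stands in for the effect sched_pin() is expected to have (no migration between the lookup and the insert).

```c
#include <stdbool.h>
#include <string.h>

#define NCPU  2   /* hypothetical CPU count for the model */
#define NSLOT 8   /* hypothetical number of hash slots per table */

/* Hypothetical per-CPU flow table: one direct-mapped slot array per CPU. */
struct flow_table {
    bool     used[NSLOT];
    unsigned key[NSLOT];
};

static struct flow_table tables[NCPU];

/* Empty every per-CPU table. */
static void ft_reset(void)
{
    memset(tables, 0, sizeof(tables));
}

/* Return true if this CPU's table already holds an entry for key. */
static bool ft_lookup(int cpu, unsigned key)
{
    unsigned slot = key % NSLOT;

    return tables[cpu].used[slot] && tables[cpu].key[slot] == key;
}

/* Insert key into this CPU's table; return false on a collision,
 * i.e. when a matching entry is already present. */
static bool ft_insert(int cpu, unsigned key)
{
    unsigned slot = key % NSLOT;

    if (ft_lookup(cpu, key))
        return false;
    tables[cpu].used[slot] = true;
    tables[cpu].key[slot] = key;
    return true;
}

/* Model of the miss path: lookup on start_cpu, then a route lookup
 * during which the thread may be moved to migrate_to (unless pinned),
 * then the insert on whichever CPU the thread is now running on.
 * Returns true if the insert collided. */
static bool flow_miss_collides(int start_cpu, int migrate_to,
                               bool pinned, unsigned key)
{
    int cpu = start_cpu;

    if (ft_lookup(cpu, key))
        return false;           /* hit: nothing to insert */
    if (!pinned)
        cpu = migrate_to;       /* scheduler moved us during the route lookup */
    return !ft_insert(cpu, key);
}
```

With CPU #1's table pre-seeded with a flow, flow_miss_collides(0, 1, false, key) reports a collision, because the insert lands on CPU #1. The same call with pinned set to true inserts into CPU #0's table and does not collide. Whether pinning has this effect in the real code path is exactly what still needs testing.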