FreeBSD SMP Project

Project Goal

The FreeBSD SMP project, often referred to as SMPng (SMP next generation), is focused on implementing fine-grained SMP support for the FreeBSD 5.0 kernel (scheduled for November 2002). Due to FreeBSD's history, this is much like trying to fit a square peg into a round hole, and as such, the intermediate results aren't pretty in many ways. We are specifically not attempting to rewrite the kernel from scratch, nor are we on a crusade to fix all the architectural nits currently present in the kernel. In fact, we expect to leave a trail of architectural nits that will still be evident in many ways when FreeBSD 5.0 is released. This is a pragmatic project rather than a theoretical one; we need to have the kernel working and stable in under a year, so time constraints require that we be realistic about what to do when.

Project Plan

This web page contains information related to the effort to improve SMP support in FreeBSD. In general, this project uses what it can from the BSD/OS 5.0 development kernel, and re-implements what cannot be directly used due to divergence in the code bases.

As with any free software project, a detailed schedule is not possible. We expect to have significant performance and stability issues that need to be worked through over the first several months of the project, though every effort will be made to keep -current running as well as possible.

The task list below is not intended to be complete, but does represent a set of relevant and/or important components of the overall work. The "Responsible" field identifies a developer who has expressed willingness to be responsible for completing the identified task; this doesn't preclude others working on it, but suggests that coordination with the responsible party might be appropriate so as to avoid unnecessary duplication of work, and to maximize forward progress. If beginning work on a new area of substantial size, or one that appears unclaimed, it may be worth dropping an e-mail to the FreeBSD SMP mailing list to see if any progress has been made.

The definition of the date field varies depending on the status of a task. For completed tasks, it refers to the date completed or reported completed. For in-progress tasks, it refers to the date of the last update of the entry. For stalled tasks, it refers to the date that the task was declared stalled. For new tasks, it refers to the date the task was added to the list.

Tasks are sorted first by status, then by date.

Resources and Links

Status

Following is an incomplete list of general tasks.

Task Responsible Last updated Status
Convert the giant lock from spinning to blocking, add the scheduler lock, add per-CPU idle processes. Matt Dillon 25 June 2000 Done
Port the BSD/OS locking primitives (i386). Jake Burkholder 3 July 2000 Done
Implement heavy-weight interrupt threads (i386). Greg Lehey 3 August 2000 Done
Rewrite the low level interrupt code (i386 UP). Greg Lehey 3 August 2000 Done
Demonstrated reasonable stability (self-hosted buildworld) (i386 UP). -smp developers 12 August 2000 Done
Port the BSD/OS locking primitives (alpha). Doug Rabson 24 August 2000 Done
Stub out (disable) spl()s. Greg Lehey 30 August 2000 Done
Port the BSD/OS ktr code. Greg Lehey, John Baldwin 30 August 2000 Done
Rewrite the low level interrupt code (i386 SMP). John Baldwin 1 September 2000 Done
Demonstrated reasonable stability (self-hosted buildworld) (i386 SMP). -smp developers 6 September 2000 Done
Demonstrated reasonable stability (self-hosted buildworld) (alpha). -smp developers 6 September 2000 Done
Make malloc and friends thread-safe. Jason Evans 10 September 2000 Done
Implement msleep(), make tsleep() an msleep() wrapper. Jake Burkholder 11 September 2000 Done
Make fxp driver thread-safe. Chuck Paterson 17 September 2000 Done
Make mbuf's thread-safe. Bosko Milekic 29 September 2000 Done
Lock manager re-work. Jason Evans 3 October 2000 Done
Implement heavy-weight interrupt threads (alpha). John Baldwin, Doug Rabson 5 October 2000 Done
Rewrite the low level interrupt code (alpha). Doug Rabson, John Baldwin 5 October 2000 Done
Process accounting. Tor Egge, John Baldwin 5 October 2000 Done
Make ethernet drivers thread-safe. Bill Paul 15 October 2000 Done
Make the mutex headers mostly machine-independent. John Baldwin 20 October 2000 Done
Rename SMP_DEBUG to MUTEX_DEBUG. John Baldwin 20 October 2000 Done
Give each soft interrupt its own thread. Chuck Paterson 25 October 2000 Done
Make sf_bufs (sendfile(2)) thread-safe. Bosko Milekic 5 November 2000 Done
Make the witness code work correctly. John Baldwin 18 November 2000 Done
Split the ktr-specific code out of db_interface.c. John Baldwin 15 December 2000 Done
Convert the sio driver to using a spin mutex. John Baldwin 18 December 2000 Done
Implement condition variables. Jake Burkholder, Jason Evans 15 January 2001 Done
Add a flag to mtx_init() (MTX_RECURSE) that denotes whether a mutex is allowed to recurse. Bosko Milekic 19 January 2001 Done
Make the zone allocator thread-safe. Dag-Erling Smorgrav 21 January 2001 Done
Convert simplelocks to mutexes. Jason Evans 24 January 2001 Done
Make kernel preemptive with respect to interrupts. Jake Burkholder 31 January 2001 Done
Cleanup of mutex API. Bosko Milekic 8 February 2001 Done
Remove COM_LOCK. Mark Murray 11 February 2001 Done
Merge various scheduling classes into one run queue. Modify scheduler to support preemptable kernel. Jake Burkholder 11 February 2001 Done
Make priority propagation work correctly. Jake Burkholder 11 February 2001 Done
Make most of the interrupt thread code MI and shared between hardware and software interrupts. John Baldwin 18 February 2001 Done
Add protection to struct jail and jail-related functionality. Robert Watson 20 February 2001 Done
Implement sx (shared/exclusive) locks. Jason Evans 5 March 2001 Done
Generalize/improve witness to handle more complex locking primitives (mtx, sx). John Baldwin 28 March 2001 Done
Convert the allproc and proctree locks from lockmgr locks to sx locks. John Baldwin 28 March 2001 Done
Make mbuf system use condition variables instead of msleep()/wakeup(). Bosko Milekic 2 April 2001 Done
Remove <sys/mutex.h> includes from other kernel headers such as <vm/vm_zone.h>, <sys/resourcevar.h>, <sys/ucred.h>, and <sys/mbuf.h>. Mark Murray 15 May 2001 Done
Clean up the various mp_machdep.c's, unify various SMP APIs such as IPI delivery, etc. John Baldwin 15 May 2001 Done
Make most of the forward_* and forwarded_* functions MI. John Baldwin 15 May 2001 Done
Complete the MD support for SMP on the Alpha platform. Andrew Gallatin, Doug Rabson, John Baldwin 15 May 2001 Done
Convert select() to use condition variables. Seigo Tanimura 15 May 2001 Done
Add a "giant" lock around the VM subsystem. Alfred Perlstein 13 June 2001 Done
Introduce a modified slab allocator for the mbuf subsystem. Bosko Milekic 21 June 2001 Done
Add a witness_assert() function to handle lock assertions. John Baldwin 27 June 2001 Done
Extend sx locks to support try lock operations. John Baldwin 27 June 2001 Done
Document KTR. John Baldwin 28 June 2001 Done
Make fork_return, fork_exit, ast, and userret MI. John Baldwin 29 June 2001 Done
Make sched_lock's savecrit a per-process property saved and restored in mi_switch and initialized in fork_exit. John Baldwin 30 June 2001 Done
Make ast() loop. John Baldwin 10 August 2001 Done
Add upgrade/downgrade sx lock operations. Alexander Kabaev, Jason Evans 13 August 2001 Done
Implement semaphores. Jason Evans 14 August 2001 Done
Add support for upgrade/downgrades in witness. John Baldwin 23 August 2001 Done
Make most of cpu_wait() and cpu_exit() MI. Peter Wemm 9 September 2001 Done
Split NFS into client and server. Peter Wemm 18 October 2001 Done
Lock taskqueues. Andrew Reiter, John Baldwin 25 October 2001 Done
Add a per-thread ucred reference. John Baldwin 25 October 2001 Done
Make most of the per-CPU stuff MI. John Baldwin 11 December 2001 Done
Make critical section saved state per-thread instead of per-lock so that interlocking spin locks work properly. John Baldwin 17 December 2001 Done
Replace the APIC-specific imen_mtx with a MI-named icu_lock to protect interrupt controllers and associated data within the kernel for both i386 and alpha. John Baldwin 20 December 2001 Done
Use the per-thread critical section nesting level in the mutex and interrupt thread code to automatically determine when to not preempt. This makes the MTX_NOSWITCH, SWI_SWITCH, and SWI_NOSWITCH flags obsolete as the kernel will be able to figure out the proper behavior on its own. John Baldwin 5 January 2002 Done
Lock struct filedesc and struct file. Seigo Tanimura, Alfred Perlstein 12 January 2002 Done
Lock struct pgrp, struct session, and struct sigio. Seigo Tanimura 23 February 2002 Done
Lock pipe implementation, but not sigio/fown, VM interactions. Alfred Perlstein 27 February 2002 Done
Move to explicit reference counting for soft vnode references. Poul-Henning Kamp 8 March 2002 Done
Initialize mutex pools early enough that sx locks can be used for VM. Brian Feldman 14 March 2002 Done
Place a global lock (sellock) around selinfo structures to fix a variety of lock order reversals, and make select() MP-safe. Alfred Perlstein, Chad David 14 March 2002 Done
Push down Giant on read, write, pread, pwrite system calls, acquiring Giant in the per-subsystem fileop layer for sockets, VFS, etc. Alfred Perlstein 15 March 2002 Done
Lock down kernel module structures. Andrew Reiter 18 March 2002 Done
Lock down kernel linker globals. Andrew Reiter 18 March 2002 Done
Rewrite kernel memory allocator to be a slab allocator that uses per-cpu caches. Jeff Roberson 21 March 2002 Done
Replace incorrect use of MD critical section API to disable interrupts with a specific interrupt disable API. Warner Losh, Doug Rabson, Benno Rice, John Baldwin 21 March 2002 Done
Lock down access to the shared p_args "process arguments" structure through appropriate protection of that structure and references to it. Jonathan Mini 31 March 2002 Done
Move from flags/tsleep lock to sx locks to protect sysctl tree from updates during sysctl operations. Jonathan Mini 1 April 2002 Done
Create/port userland tool to manage KTR event dumps. Jake Burkholder 1 April 2002 Done
Create MTX_SYSINIT and SX_SYSINIT macros that allow for initializing locks that are subsystem independent. Andrew Reiter 2 April 2002 Done
Lock down the global securelevel variable. Andrew Reiter 2 April 2002 Done
Make grow_stack() MI. Possibly even a macro or inline. Alan L. Cox 6 April 2002 Done
Lock use of p_fd, which otherwise can result in corrupted p_fd panics during heavy operation. Start with a global, and move to per-proc locking. Alfred Perlstein, Seigo Tanimura 8 April 2002 Done
Lock struct pargs. Jonathan Mini 9 April 2002 Done
Make {o,}sigreturn() MPSAFE. Alan L. Cox 11 April 2002 Done
Rewrite kernel memory allocator so that Giant is not required for malloc() or free(). Jeff Roberson 2 May 2002 Done
Replace complex shared/exclusive locking scheme in the VM system with a purely exclusive lockmgr locking scheme, simplifying locking and removing potential livelock/deadlock scenarios. Brian Feldman, Alan L. Cox 3 May 2002 Done
Push down Giant into readv/writev system calls in style of read/write/pread/pwrite once malloc no longer requires Giant in the handling of iovec structures for uio. Alan L. Cox 9 May 2002 Done
Push down Giant in mprotect(), minherit(), and madvise() so that it is no longer acquired and released directly. Alan L. Cox 18 May 2002 Done
Update suser() and p_can*() APIs to accept threads instead of processes. John Baldwin 18 May 2002 Done
Broadly transition to td_ucred from p_ucred once KSE dependencies are in place. John Baldwin 18 May 2002 Done
Add a witness_sleep() check to uma_zalloc() to catch code calling malloc() or uma_zalloc() while holding non-sleepable locks. John Baldwin 20 May 2002 Done
Optimize UP support by changing spin locks to only perform critical section enter and exits. John Baldwin 21 May 2002 Done
Make sleep mutexes spin if the current lock holder is executing on another CPU. John Baldwin 21 May 2002 Done
Add support for the IA32 pause instruction to spin loops in locks. John Baldwin 21 May 2002 Done
Make KTRACE write into tracefiles asynchronously. John Baldwin 7 June 2002 Done
Remove Giant from jail(2). Andrew Reiter 25 June 2002 Done
Remove Giant from modnext(2), modfnext(2), modstat(2), and modfind(2). Andrew Reiter 25 June 2002 Done
Fix synchronization of TLB flushes and invlpg() on x86 SMP. Peter Wemm 12 July 2002 Done
Make cpu_coredump MI. Peter Wemm 7 September 2002 Done
Add a subsystem lock to the accounting code. Andrew Reiter 11 September 2002 Done
Lock down TrustedBSD MAC implementation. Robert Watson 11 November 2002 Done
Lock struct proc. John Baldwin 20 February 2001 In progress
Make the kernel fully preemptive. John Baldwin 7 September 2001 In progress
Lock down the tty subsystem. Dick Garner, Jeremy Scofield, Thomas Moestl 2 April 2002 In progress
Fix clock locking to be the same on all platforms. John Baldwin 16 November 2001 In progress
Lock pipe implementation: sigio/fown-related evil. Alfred Perlstein 27 February 2002 In progress
Make use of process locking and process reference counting to protect debugging interfaces (and procfs). John Baldwin 27 February 2002 In progress
Make use of process locking to protect process monitoring sysctls, including those employed by 'ps' and related tools. John Baldwin 27 February 2002 In progress
Lock down newbus infrastructure to support driver fine-graining. Warner Losh 28 February 2002 In progress
Remove the MP safe syscall flag from the syscall table and add explicit mtx_lock/unlock's of Giant to all syscalls. Matt Dillon, Maxime Henrion 28 February 2002 In progress
SMPng architecture document. John Baldwin, Robert Watson 28 February 2002 In progress
Move to shared lock for VOP_GETATTR() to reduce blocking during frequent lightweight VFS operations. Modify namei() to provide a LOOKUP_SHARED flag to indicate when the lock required may be shared instead of exclusive. Jeff Roberson 11 March 2002 In progress
Create mutex profiling tool for the kernel so as to measure contention and behavior of kernel mutexes. Eivind Eklund, Dag-Erling Smorgrav 31 March 2002 In progress
Lock eventhandlers. Mike Smith, Jonathan Mini 8 April 2002 In progress
Lock sysctl hierarchy and access methods. Jonathan Mini 9 April 2002 In progress
Document existing vm_map locking and verify its correctness. Alan L. Cox 18 May 2002 In progress
Document existing vm_object locking and verify its correctness. Alan L. Cox 4 May 2002 In progress
Implement generic turnstiles to use when blocking on non-sleepable locks. John Baldwin 23 May 2002 In progress
Lock down linker_file_t structures in the kernel linker. Andrew Reiter 19 June 2002 In progress
Lock down the SysV IPC code. Alfred Perlstein 13 August 2002 In progress
Review locking strategy and correctness of VFS operations and fix up various failure modes associated with enabling VFS locking assertions. Jeff Roberson 10 December 2002 In progress
Document in-vnode locking strategy, clean it up, remove interlock, switch to sx locks. Jeff Roberson 10 December 2002 In progress
Implement lazy interrupt thread switching (context stealing) on i386. Bosko Milekic, Alexander Kabaev 10 December 2002 In progress
Implement lazy interrupt thread switching (context stealing) on sparc64. Jake Burkholder 10 December 2002 In progress
Switch from using lockmgr in VM to using a mutex or exclusive sxlock. Push down Giant on all VM except for vm_object/VFS and vm_page/pmap components. Alan L. Cox 10 December 2002 In progress
Create mechanism in cdevsw structure to protect thread-unsafe drivers. John Baldwin 15 May 2001 Stalled
Make printf() safe to call in almost any situation to avoid deadlocks. Chuck Paterson 15 May 2001 Stalled
Add locking to NFS.   15 May 2001 Not Started
Remove priority argument from tsleep(), msleep(), cv_*wait*().   12 January 2001 Not Started
Reimplement kqueue using condition variables. Jonathan Lemon 15 March 2001 Not Started
Conditionalize atomic ops in the SMP code that are used for debugging statistics. Peter Wemm 15 March 2001 Not Started
Add a new witness check for exiting processes to verify that an exiting process holds no locks. John Baldwin 13 June 2001 Not Started
Specify priorities for condition variables, semaphores, and sx locks.   7 September 2001 Not Started
Fix SIGXCPU and other #if 0'd things in mi_switch().   7 September 2001 Not Started
Axe schedcpu() in favor of event driven priority updates as much as possible.   7 September 2001 Not Started
Fix PHOLD() so that it blocks to guarantee PS_INMEM.   7 September 2001 Not Started
Fix *hold (e.g. crhold) to return reference to object.   7 September 2001 Not Started
Fix various procfs_machdep.c to use PHOLD, not sched_lock.   7 September 2001 Not Started
Add witness checking for lockmgr locks.   7 September 2001 Not Started
Add ICU spin locks on ia64.   4 January 2002 Not Started
Fast-path push-down of Giant for VOP_READ() and VOP_WRITE().   25 February 2002 Not Started
Lock contention measurement tool to measure heat of various locks, including Giant, and permit more directed performance and locking strategy optimization.   25 February 2002 Not Started
Push the grabbing of Giant into Linux i386 ABI system calls.   25 February 2002 Not Started
Push the grabbing of Giant into Linux AXP ABI system calls.   25 February 2002 Not Started
Push the grabbing of Giant into SVR4 i386 ABI system calls.   25 February 2002 Not Started
Push the grabbing of Giant into OSF/1 AXP ABI system calls.   25 February 2002 Not Started
Push the grabbing of Giant into IBCS i386 ABI system calls.   25 February 2002 Not Started
Lock pipe implementation: VM optimizations.   27 February 2002 Not Started
Expand mutex profiling tool to also profile sx locks. Eivind Eklund, Dag-Erling Smorgrav 1 April 2002 Not Started
Implement atomic_fetchadd() for int's and long's with acq and rel versions.   23 May 2002 Not Started
Implement a simple reference count API using atomic operations and use this to replace locks that just protect a reference count.   23 May 2002 Not Started
Implement a sleep queue abstraction to be used by both msleep() and condition variables. This new abstraction should use a hash table of sleep queues with a spin lock on each sleep queue chain similar to turnstile chain locks to make sched_lock finer grained. John Baldwin 23 May 2002 Not Started
Add a witness_sleep() check to copyin/out() and s/fuword(). John Baldwin 7 June 2002 Not Started
Split witness_lock() into witness_checkorder() and witness_lock(). witness_checkorder() would be called before acquiring a lock to increase the chances of detecting and warning about a reversal prior to deadlocking. witness_lock() would simply update witness' internal state to note that a lock has been acquired. John Baldwin 7 June 2002 Not Started
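
The "simple reference count API using atomic operations" task above can be pictured with a small userland sketch. It uses C11 <stdatomic.h> rather than the kernel's own atomic primitives, and the names struct ref, ref_init(), ref_hold(), and ref_drop() are hypothetical, chosen only for illustration; this is not the interface the project settled on.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userland analogue; not the FreeBSD kernel API. */
struct ref {
	atomic_uint count;	/* number of outstanding references */
};

/* Set the initial reference count (normally 1 for the creator). */
static inline void
ref_init(struct ref *r, unsigned int value)
{
	atomic_init(&r->count, value);
}

/* Take an extra reference; the caller must already hold one. */
static inline void
ref_hold(struct ref *r)
{
	unsigned int old;

	old = atomic_fetch_add_explicit(&r->count, 1u, memory_order_relaxed);
	assert(old > 0);
	(void)old;
}

/*
 * Drop a reference. Returns true when the last reference is gone and
 * the caller may free the object. The release ordering on the
 * decrement, paired with the acquire fence on the final drop, makes
 * earlier writes to the object visible to whoever frees it.
 */
static inline bool
ref_drop(struct ref *r)
{
	unsigned int old;

	old = atomic_fetch_sub_explicit(&r->count, 1u, memory_order_release);
	assert(old > 0);
	if (old == 1) {
		atomic_thread_fence(memory_order_acquire);
		return (true);
	}
	return (false);
}
```

With a counter like this, an object whose lock exists only to guard its reference count no longer needs that lock: holders call ref_hold() and ref_drop(), and whichever caller sees ref_drop() return true frees the object.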

This table lists the todo subtasks for multithreading the network stack.

Task Responsible Last updated Status
Protect network interface queues. Jonathan Lemon 24 November 2000 Done
Lock down struct socket. Seigo Tanimura 21 April 2002 In progress
Lock down struct inpcb. Jeffrey Hsu 29 April 2002 In progress
Lock struct ifnet.   19 January 2001 Not Started
Reduce contention upon locking a socket buffer by replacing tsleep() and wakeup() with a condvar. Seigo Tanimura 21 April 2002 Not Started
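
The last task above, waiting on a condition variable instead of a tsleep()/wakeup() pair, can be sketched in userland with POSIX threads. This is only an analogue under stated assumptions: struct sbuf_sketch, sbuf_append(), and sbuf_consume() are hypothetical names, the buffer tracks only a byte count, and the kernel's cv_*wait*() routines are replaced here by pthread_cond_wait().

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a socket buffer: only a byte count is kept. */
struct sbuf_sketch {
	pthread_mutex_t	lock;	/* protects avail */
	pthread_cond_t	cv;	/* signalled whenever avail grows */
	size_t		avail;	/* bytes ready to be consumed */
};

#define	SBUF_SKETCH_INITIALIZER \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0 }

/* Producer side: account for n new bytes and wake any waiters. */
static void
sbuf_append(struct sbuf_sketch *sb, size_t n)
{
	assert(sb != NULL);
	pthread_mutex_lock(&sb->lock);
	sb->avail += n;
	pthread_cond_broadcast(&sb->cv);
	pthread_mutex_unlock(&sb->lock);
}

/*
 * Consumer side: block until at least n bytes are available, take
 * them, and return the number of bytes left. The wait atomically
 * drops the mutex and re-acquires it on wakeup, so the predicate is
 * always re-checked under the lock.
 */
static size_t
sbuf_consume(struct sbuf_sketch *sb, size_t n)
{
	size_t left;

	assert(sb != NULL);
	pthread_mutex_lock(&sb->lock);
	while (sb->avail < n)
		pthread_cond_wait(&sb->cv, &sb->lock);
	sb->avail -= n;
	left = sb->avail;
	pthread_mutex_unlock(&sb->lock);
	return (left);
}
```

Because the condition variable is tied to the mutex that protects the buffer state, a waiter cannot miss a wakeup that arrives between checking the count and going to sleep.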

Known Issues

Issue Last updated Status
Idle processor time is not charged to the idle processes. 20 September 2000 Resolved
microuptime creeps backwards. 4 October 2000 Resolved
microuptime() went backwards. 4 October 2000 Resolved
Process accounting is not accurate (the more CPUs, the closer to correct it is). 5 October 2000 Resolved
M_DEVBUF is probably the wrong memory pool for interrupt stuff and we should think about creating a new malloc pool for that stuff. 9 February 2001 Resolved
PC card eject panics due to a race condition in the interrupt thread code. 15 March 2001 Resolved
SMP x86 boxes are seeing NCPU * 100 clk interrupts and NCPU * 128 rtc interrupts. 15 May 2001 Resolved
Witness will infinitely recurse when it acquires Giant after sleeping with a sleepable lock. 27 June 2001 Resolved
Serial gdb does not work if boot_ddb and boot_gdb options are specified. 14 July 2002 Resolved
Serial gdb does not work at 115200 baud. 14 July 2002 Resolved
Serial gdb never regains control once 'cont' has been entered. 14 July 2002 Resolved
Profiling is broken. 20 February 2001 Unresolved
jail_sysvipc_allowed is checked in an unsafe manner in the SYSV IPC syscalls. 5 March 2002 Unresolved

News

The remainder of this page is structured as a reverse-chronological log.

13 January 2002 15 May 2001 22 March 2001 5 March 2001 24 January 2001 12 January 2001 11 October 2000 8 September 2000 6 September 2000 5 September 2000 1 September 2000 30 August 2000 12 August 2000 3 August 2000 6 July 2000 5 July 2000 3 July 2000 26 June 2000 25 June 2000 19 June 2000

Copyright © 1995-2002 The FreeBSD Project. All rights reserved.
Last modified: 2002/12/10 21:13:04