8000 Improve monitor implementation · Issue #1917 · qbicc/qbicc · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content
Improve monitor implementation #1917
Open
@dmlloyd

Description

@dmlloyd

TL;DR:

  • Add owned-locks array to java.lang.Thread
  • Implement new locking procedure in java.lang.Thread (phase 1, fast lock/heap lock, no nursery)
  • Implement new AQS-based Monitor class in java.lang without condition
  • Implement new AQS-based Condition class in java.lang
  • Implement nursery GC support (TBD, see below for rough ideas)
  • Revise new locking procedure (phase 2, nursery support and GC-based monitor location)
  • Drop monitor field from java.lang.Object
  • Drop monitor class from qbicc runtime API

Long version:

Right now monitors are implemented using a single lazily-allocated, ReentrantLock-based monitor class. This approach is simple and functional but has several drawbacks:

  • The monitor field is an oop-sized field that always exists on every object, even if that object's monitor is never used
  • When a lock is taken, even if it is uncontended, the full monitor object must be allocated
  • The lock includes a condition object, even if wait/notify is never used
  • The monitor uses a minimum of three objects: the monitor itself, a ReentrantLock instance, and a Condition instance

Ideally, we would want an implementation with the following characteristics:

  • Objects which are never locked should not have measurable any overhead at all (no heap objects, no extra fields)
  • Objects which are locked but can be statically shown to be uncontendable should have their corresponding monitor operations deleted or replaced with functional equivalent lockless operations at compile time (specifically, wait could be converted into an equivalent sleep operation)
  • Objects which are locked without contention should have minimal overhead (e.g. a one-bit lock)
  • Objects which are contended for should only allocate the necessary objects for managing waiters on the lock, and only once contention actually happens (i.e. no lock object until a second locker arrives, and no condition object until/unless it is required)
  • Avoid the overhead of a field or fields on an object pointing to its monitor and/or condition even when one exists
  • Avoid the overhead of additional indirection to lock and/or condition sub-objects
  • If a nursery is used, have a low-overhead mechanism to deregister nursery objects when their corresponding heap objects are GC'd or the nursery object is moved out of the nursery
  • Initial heap objects which are locked but can somehow be statically shown to be contended could be preinitialized with a monitor (and optionally a condition)

Many of these characteristics are not possible to achieve under our current architecture, but we can take some steps in this direction.

I propose a design where we do the following:

  • Drop the current monitor implementation and field
  • Introduce a lock nursery, essentially a ConcurrentReferenceHashMap from reference-to-Object to monitor object instance
  • Introduce a new Reference subtype, NurseryReference, which is weaker than WeakReference or FinalReference but stronger than PhantomReference (i.e. can be read by GC code) and which is also queued when an object no longer needs to be tracked in the nursery, for use as the key reference type in the nursery map (note that this is contingent on implementing Reference support in GC, which is not yet done at the time of this writing)
  • Organize fully-inflated locks such that the lock object always appears directly after the object that owns it in the parseable heap
  • Add two bits to the object header, reflecting the following states:
    • 00 - object is not locked and has no monitor
    • 01 - object is "fast-locked", no waiters are contending for the lock
    • 10 - object has an inflated monitor in the nursery, lock state is there
    • 11 - object has an inflated monitor on the heap consecutively after this object itself
  • Introduce an embedded array field of small size (5 to 10 slots) on Thread to act as a set or list of currently-owned fast-locked monitors, to facilitate Thread.holdsLock for objects which are fast-locked

The fast lock algorithm is contingent on identifying matching monitor enter-exit pairs, and would be something like this:

  • On enter,
    • If the thread's owned-lock array is full, skip fast locking and go directly to inflation
    • If the lock state is 00 (unlocked)
      • CAS it to 01
      • Record the object in the embedded owned-lock array
      • Return true (monitor needs to be released i.e. this is the outermost owner)
    • Else if the lock state is 01 (fast-locked)
      • If the object is present in the embedded owned-lock array, return false (monitor does not need to be released)
    • Else go to inflated lock procedure
  • On exit,
    • If the fast-lock attempt returned true
      • Remove the object from the embedded owned-lock array
      • If the lock state is 01 (still fast-locked)
        • CAS it to 00
      • Else if the lock state is 10 or 11 (lock was inflated)
        • Go to inflated unlock procedure

The lock inflation procedure would be something like this:

  • Construct a new lock instance, and add it to the nursery
    • The owner of the lock is null unless the corresponding object appears in the calling thread's owned-lock array, in which case we can set the owner
  • CAS the lock state to 10
    • If the CAS fails, clean up the mess appropriately
  • If we set the owner of the lock to our own thread, remove the object from the owned-lock array

The inflated unlock procedure would be like this:

  • If the owner of the lock is null, it was inflated while we held the lock (verify assertion by checking the owned-lock array)
    • After unlock, remove the object from the owned-lock array
  • Perform the normal unlock() call on the monitor object

The inflated monitor itself would be a subclass of AbstractQueuedSynchronizer with only simple lock/unlock and ownership operations.

This process does not take into account condition objects. In order to facilitate conditions, we would extend this process to account for a second object: the Condition object.

A Condition object would normally not exist for an object, until/unless it performs a condition operation. Once a valid condition operation (i.e. one which happens while the corresponding lock is held) takes place, we would then need to inflate a condition object to mediate the operation. In order to track the condition's inflation without needing extra bits on the source object, we can use the same two lock bits on the inflated lock's instance (which itself may not have a monitor, leaving these bits free to use). The bits would be defined as follows:

  • 00 - No condition exists
  • 10 - The condition is inflated in the nursery
  • 11 - The condition is inflated on-heap directly after the lock object

The following would be done:

  • The lock would be inflated if it is not already inflated
  • The condition would be inflated using an equivalent process

GC

The procedure to move objects out of the nursery could work like this:

  • When marking an object, if it has an on-heap monitor, also mark the monitor. If there is an on-heap condition, also mark the condition.
  • When copying an object to the destination space, if the nursery monitor is also marked, and there is sufficient room in the destination space, then preemptively move the monitor object to the destination space directly after its host object. Queue and clear the nursery reference for reference queue processing.
  • Follow a similar process for conditions.

Note that this procedure relies on being able to read (but not write) the nursery map during safepoint/GC, which is itself a tricky detail.

Deflation

With a system like this, it is theoretically possible to track the object's use of monitors somehow (perhaps by keeping a counter on the monitor instance itself which is incremented on GC and cleared on usage) such that if a monitor is unused (perhaps for a certain number of GC cycles, or for some time duration), it can be discarded and the lock state of the host object can be set back to 00. Likewise if a condition object is unused for a certain length of time, it too could be discarded in a similar way.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions

      0