From JVMLangSummit
Jump to navigationJump to search

Friday night after the Summit, some of us who hadn't had enough yet went across the street to Piatti for dinner.

Emboldened by good food and drink, some of the JSR 292 EG members and others (Forax, Ohrstrom, Szegedi, Bini, Rose, ...) worked on some hard remaining issues. Here are the napkins to prove it.


MethodHandle sub-kernel

class MH {
  MT type()
  ? exactInvoke(?)
  ? genericInvoke(?)
  ? varargsInvoke(Object...)  // maybe? does a spread

  MH convertArguments(MT)
  MH bind(x)
  MH spreadArguments()     // creates a spreader
  MH collectArguments(int n) // creates a fixed-arity collector

  MH convertArgments(MT)

  "MH collectVarargs()" // maybe?  creates a variable-arity collector

  toString() ??
  examine() ??

Rename MH.insertArgument to MH.bind or (per Ola) MH.prependArgument.

Perhaps rename MH.convertArguments to MH.asType. This is needed as a fundamental factory to create MHs of arbitrary exact types.

What should toString return?

Simple answers:

  • Implementation-dependent
  • Object.toString
  • Current RI: The simple-name of the target method. (For system-defined combinators, delegates to target.toString; user-defined ones also.) Simple and useful.

Completely Generic Argument Collection

As Fredrik has blogged, with genericInvoke we allow one method handle to accept any call type, as long as they have a certain given arity. This allows us to glue together method handles without generating signature specific bytecodes.

But it still requires (sometimes) arity-specific bytecodes; there are currently up to 256 different arities in the JVM. This is close enough to the "infinite bytecode" requirement to be impractical.

Possible Solution: Allow a "varargs" method handle type, which can be generically invoked from any call site, of any signature, and (moreover) any arity. It would be uniformly able to collect arguments from any call site.


  Object foo(Object[] arglist)
  { ...do something complicated with arglist... }
  MH foo = #foo;
  foo.genericInvoke();    // WMT, arity
  foo.genericInvoke(1);   // CCE Integer != Object[]
  foo.genericInvoke(1,2); // WMT, arity
  MH bar = foo.collectVarargs();
  bar.genericInvoke();    // = foo.exactInvoke(new Object[]{})
  bar.genericInvoke(1);   // = foo.exactInvoke(new Object[]{1})
  bar.genericInvoke(1,2); // = foo.exactInvoke(new Object[]{1,2})

The MH.type of such a MH would be some special value that would make it impossible to perform exact invocation.

Spreading and collecting are of course interrelated:

spreading from av to mh(x,y) collecting from (x,y) to mh(av)
call mh.varargsInvoke(av) mh.exactInvoke(new Object[]{x,y})
adapt mh.spreadArgumments() mh.collectArguments(2)

Note that collectArguments could be implemented by collectVarargs plus asType.

Calling Sequences

MH.genericInvoke does autoboxing. Can result in an OOM error. The JVM can make the adapter non-blocking, by preallocating all the boxing memory in advance (in thread-local buffer) before advancing into the call. If the allocation fails, the call site is restarted after GC.

// virtual call through vtable:
mov [obj] -> ecx
call [ecx+off]
// mh call through vtable:
mov [obj] -> ecx
mov [obj+6] -> edx
call [ecx+edx]
// inexact call to DMH
// esi=this, edi/eax/edx=args
cmp [obj+8], #ExactMT
jmp/eq [...generic method entry...]

...adjust arguments, etc...
push ebp
call [...exact method entry...]
... adjust return value...
pop ebp

The adapter stub can be generated once per method. Better to generate one per signature; need linkage register:

// generic method entry adapter, per-signature version
alias dest = r8  // linkage
mov dest <- #(...exact method entry...)
if (cs.type == mh.type)
  call [dest]
  call [dest-8]  // pushes call to r8

Note the need for special stack frames for adapting the return value.

Java Method Handles

class Foo extends JavaMethodHandle {
  Foo() { super(#myInvoke); }
  void myInvoke() { ... }

Constant Pool Constants

Should be for MethodType and four kinds of MH: virtual/interface, static, special.

Idea: Have the constant pool format for CONSTANT_MethodHandle_info be parameterized by (a) a member reference, and (b) a bytecode point (invokespecial, etc.). Makes clear the correspondence between the symbolic reference and the semantics of the resulting method handle. This also extends naturally to all other symbolic references:

  • invokevirtual, CONSTANT_Methodref = findVirtual (must be class)
  • invokeinterface, CONSTANT_InterfaceMethodref = findVirtual (must be interface)
  • invokestatic, CONSTANT_Methodref = findStatic
  • invokespecial, CONSTANT_Methodref = findSpecial
  • getfield, CONSTANT_Fieldref = unreflectGetter (must be non-static)
  • getstatic, CONSTANT_Fieldref = unreflectGetter (must be static)
  • putfield, CONSTANT_Fieldref = unreflectSetter (must be non-static)
  • putstatic, CONSTANT_Fieldref = unreflectSetter (must be static)

CONSTANT_MethodHandle_info tag can be 15.

For MethodType, I (Rémi) propose to add a new constant pool format CONSTANT_MethodType_info CONSTANT_MethodType_info {

 u1 tag;                      
 u2 method_descriptor_index;



   The tag item of the CONSTANT_MethodType_info structure has the value CONSTANT_MethodType (14).


   The value of the descriptor_index item must be a valid index into the constant_pool table. The constant_pool entry at that index must be a CONSTANT_Utf8_info (§4.4.7) structure representing a valid method descriptor (§4.3.3).

Also LDC instruction spec should be changed to allow CONSTANT_MethodHandle_info and CONSTANT_MethodType_info. About the runtime semantics,

 LDC CONSTANT_MethodType_info == LDC CONSTANT_MethodType_info is required to return true
 if the constant values are equals.
 The same is not true with LDC CONSTANT_MethodHandle_info.

Advantages to using constant pool:

  • static checking, static analysis tools
  • pre-linking, application-level early binding
  • less need for user-managed cache variables (static final private MethodHandle FOO = ...)

Exactly Typed Guards

(napkins #3, #4)

Attila's MOP handles overload resolution. This makes it unable to use plain invokevirtual calls (MHs.findVirtual) even as an optimization. Reason: A virtual method may be overlapped by another method which "steals" some of its argument types. This could happen even with an "innocent looking" method like Object.equals. For example, String.equals(String) steals string arguments from String.equals(Object), when overloading is resolved. It usually cannot be proved that this won't happen; see example below where Baz calls Bar, and later on Foo is loaded to complicate the resolution of Bar.f.

As a result, type guards for overloaded methods must use equality comparisons on classes, not the more "natural" instanceof.

class Bar {
  int f(Object o) { ... }
  // f might also be Bar.equals inherited from Object
class Baz extends Bar {
  { invokedynamic f(String) ->
       guardWithTest({=> o.getClass == Bar.class}, Bar.#f, ...)
// later loaded:
class Foo extends Bar {
  int f(String s) { ... }

Open Problem: Is there a way to register interest in class hierarchy changes, in such a way that the MOP could use virtual invocations, and "devirtualize" (like JVMs do) if a conflicting overloading is ever loaded? The difficulty is defining a critical section during which (a) a new class is installed in the hierarchy, and (b) dependent call sites are reset, while (c) no calls are in progress. It's the same critical section that JVMs (like Hotspot) use at a low level. It should not allow execution of bytecodes. It could, perhaps, associated invokedynamic sites with some given point in the type hierarchy, such that those points would be invalidated when new classes are loaded under that point.

The problem gets a little more complex if argument or return types are replaced arbitrarily; for example int Foo.equals(boolean) overriding boolean Object.equals(Object).


Dynamic languages can reverse themselves: The meaning of a call site can change over time due to changes in the runtime, especially in the application's type schema.

Groovy currently gets a serious performance hit from polling a schema evolution counter on each method call. This counter is volatile, to make sure it updates to every thread. This is a typical "pull" or "polling" type of invalidation logic.

JSR 292 will include "push" invalidation.

Call site states, excluding invalidation:

  • not yet executed
  • executed, but bootstrap method not yet returned
  • linked: reified as a unique call site (returned from BSM)
  • relinked: CallSite.setTarget can be called any number of times

Note that CS.setTarget does not have volatile semantics. Threads may take any amount of time to witness changes. This is by design.

Key question: How do we (a) back up to a known state, (b) forcing all threads to see the change, (c) without race conditions, and (d) without per-call polling? Answer: Collect call sites for invalidation, and bulk-invalidate at a well-defined safepoint.

(Why bulk? Because most implementations will be too slow for piecemeal work; need big transactions.)

The JVM (at least, Hotspot) has subtle mechanisms for this, to perform revirtualization when a devirtualized call site becomes invalid, due to class loading (class schema change). After some discussion, it was agreed that GCs in general require such global critical sections too.

The answer we came up with is "full back-out at a safepoint". Details:

  • The runtime (language or application) initiates any global transaction it wants
  • The runtime marks the CallSite it wants to invalidated (and any other sites as well)
  • The runtime requests bulk invalidation from the JVM
  • The JVM invalidates the CallSite in a global critical section ("safepoint")
  • During this safepoint, every affected invokedynamic instruction is either blocked or has rolled forward
  • The CallSite object is discarded (collectable by GC)
  • The invokedynamic instruction is backed up to a "not yet executed" state
  • As soon as the JVM returns, threads may execute the invokedynamic instruction, triggering its BSM
  • The BSM interacts correctly with the runtime's current transaction

Backing all the way out to an unlinked state is cleanest. Alternative: Back up to a "known initial target value" on the call site. Seems to require messy bytecode execution in uncontrolled contexts, to compute and store the initial target value.

Implications of full back-out:

  • BSM can return the same call site as last time, if desired (only if we supply an "is-live" query)
  • gives a "hook" for changing the class and/or identity of a call site
  • 1-to-many bytecode-to-CS relation meshes nicely with call site splitting (which Remy has rightly insisted on)

This design allows threads to run invokedynamic instructions with complete concurrency except during specified global safepoints.

Remaining problem: Define what it means to "roll forward" through an invokedynamic call. Should execute the whole call site target entry including pre-entry adapters, including argument boxing, etc. Can define this w.r.t. the functions in MHs, but this adds a funny privilege to them. This could interact with tail-call guarantees.