Package serp.bytecode
Bytecode Manipulation
This package contains a framework for Java bytecode manipulation.
Bytecode manipulation is a powerful tool in the arsenal of the Java developer. It can be used for tasks from compiling alternative programming languages to run in a JVM, to creating new classes on the fly at runtime, to instrumenting classes for performance analysis, to debugging, to altering or enhancing the capabilities of existing compiled classes. Traditionally, however, this power has come at a price: modifying bytecode has required an in-depth knowledge of the class file structure and has necessitated very low-level programming techniques. These costs have proven too much for most developers, and bytecode manipulation has been largely ignored by the mainstream.
The goal of the serp bytecode framework is to tap the full power of bytecode modification while lowering its associated costs. The framework provides a set of high-level APIs for manipulating all aspects of bytecode, from large-scale structures like class member fields to the individual instructions that comprise the code of methods. Advanced manipulation still requires some understanding of the class file format and especially of the JVM instruction set, but the framework makes it as easy as possible to enter the world of bytecode development.
There are several other excellent bytecode frameworks available. Serp excels, however, in the following areas:
- Ease of use. Serp provides very high-level APIs for all normal bytecode modification functionality. Additionally, the framework contains a large set of convenience methods to make code that uses it as clean as possible. From overloading its methods to prevent you from having to make type conversions, to making shortcuts for operations like adding default constructors, serp tries to take the pain out of bytecode development.
- Power. Serp does not hide any of the power of bytecode manipulation behind a limited set of high-level functions. In addition to its available high-level APIs, which themselves cover the functionality all but the most advanced users will ever need, serp gives you direct access to the low-level details of the class file and constant pool. You can even switch back and forth between low-level and high-level operations; serp maintains complete consistency of the class structure at all times. A change to a method descriptor in the constant pool, for example, will immediately change the return values of all the high-level APIs that describe that method.
- Constant pool management. In the class file format, all constant values are stored in a constant pool of shared entries to minimize the size of class structures. Serp gives you access to the constant pool directly, but most of you will never use it; serp's high-level APIs completely abstract management of the constant pool. Any time a new constant is needed, serp will automatically add it to the pool while ensuring that no duplicates ever exist. Serp also does its best to manipulate the pool so that the effects of changing a constant are as expected. For example, changing one instruction to use the string "bar" instead of "foo" will not affect other instructions that use the string "foo", but changing the name of a class field will instantly change all instructions that reference that field to use the new name.
- Instruction morphing. Dealing with the individual instructions that make up method code is the most difficult part of bytecode manipulation. To facilitate this process, most serp instruction representations have the ability to change their underlying low-level opcodes on the fly as you modify the parameters of the instruction. For example, accessing the constant integer value 0 requires the opcode iconst_0, while accessing the string constant "foo" requires a different opcode, ldc, followed by the constant pool index of "foo". In serp, however, there is only one instruction, constant. This instruction has setValue methods which use the given value to automatically determine the correct opcodes and arguments: iconst_0 for a value of 0, and ldc plus the proper constant pool index for the value of "foo".
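The constant pool and instruction morphing behavior described in the list above can be illustrated with a small toy model. This is only a sketch that uses nothing but the Java standard library: ToyConstantPool, ToyConstantInstruction, and MorphingSketch are hypothetical names invented for this illustration, and the code is not serp's implementation or API. It assumes the standard JVM rule that small integers have dedicated opcodes (iconst_m1 through iconst_5, bipush, sipush) while other constants are loaded with ldc plus a constant pool index.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy constant pool (hypothetical, not a serp type): 1-based indexes, no duplicates. */
final class ToyConstantPool {
    private final Map<Object, Integer> indexes = new LinkedHashMap<>();

    /** Returns the index of the value, adding it only if it is not already present. */
    int indexOf(Object value) {
        Integer existing = indexes.get(value);
        if (existing != null) {
            return existing;
        }
        int index = indexes.size() + 1; // class file pool indexes start at 1
        indexes.put(value, index);
        return index;
    }

    int size() {
        return indexes.size();
    }
}

/** Toy "constant" instruction (hypothetical) that re-selects its opcode on every setValue. */
final class ToyConstantInstruction {
    private final ToyConstantPool pool;
    private String opcode = "nop";
    private Integer operand; // null means the opcode takes no operand

    ToyConstantInstruction(ToyConstantPool pool) {
        this.pool = pool;
    }

    ToyConstantInstruction setValue(int value) {
        if (value == -1) {
            opcode = "iconst_m1";
            operand = null;
        } else if (value >= 0 && value <= 5) {
            opcode = "iconst_" + value;
            operand = null;
        } else if (value >= Byte.MIN_VALUE && value <= Byte.MAX_VALUE) {
            opcode = "bipush";
            operand = value;
        } else if (value >= Short.MIN_VALUE && value <= Short.MAX_VALUE) {
            opcode = "sipush";
            operand = value;
        } else {
            opcode = "ldc"; // simplification: ignores ldc_w for pool indexes above 255
            operand = pool.indexOf(Integer.valueOf(value));
        }
        return this;
    }

    ToyConstantInstruction setValue(String value) {
        opcode = "ldc"; // same simplification as above
        operand = pool.indexOf(value);
        return this;
    }

    /** Renders the opcode, with "#n" for a pool index or a plain number for an immediate. */
    String describe() {
        if (operand == null) {
            return opcode;
        }
        return opcode + (opcode.equals("ldc") ? " #" : " ") + operand;
    }
}

final class MorphingSketch {
    public static void main(String[] args) {
        ToyConstantPool pool = new ToyConstantPool();
        ToyConstantInstruction first = new ToyConstantInstruction(pool);
        ToyConstantInstruction second = new ToyConstantInstruction(pool);

        System.out.println(first.setValue(0).describe());      // iconst_0
        System.out.println(first.setValue("foo").describe());  // ldc #1
        System.out.println(second.setValue("foo").describe()); // ldc #1 (entry is shared)
        System.out.println(first.setValue("bar").describe());  // ldc #2
        System.out.println(second.describe());                 // ldc #1 (unaffected)
    }
}
```

In this sketch the caller only ever calls setValue; the choice of opcode and the bookkeeping of pool entries stay hidden, which is the division of labor the two bullets describe.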
Serp is not ideally suited to all applications. Here are a few disadvantages of serp:
- Speed. Serp is not built for speed. Though there are plans for performing incremental parsing, serp currently fully parses class files when a class is loaded, which is a slow process. Also, serp's insistence on full-time consistency between the low-level and high-level class structures slows down both accessor and mutator methods. These factors are less of a concern, though, when creating new classes at runtime (rather than modifying existing code), or when using serp as part of the compilation process. Serp excels in both of these scenarios.
- Memory. Serp's high-level structures for representing class bytecode are very memory-hungry.
- Multi-threaded modifications. The serp toolkit is not thread-safe. Multiple threads cannot safely make modifications to the same classes at the same time.
- Project-level modifications. Changes made in one class in a serp project are not yet automatically propagated to other classes. However, there are plans to implement this, as well as plans to allow operations to modify bytecode based on specified patterns, similar to aspect-oriented programming.
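Because the toolkit is not thread-safe, code that must modify the same classes from several threads has to serialize those modifications itself. One common way to do this is to funnel every change through a single lock, as in the sketch below. UnsafeClassModel is a hypothetical stand-in for any non-thread-safe class model; it and the other names here are invented for illustration and are not serp types.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a non-thread-safe class model; not a serp type. */
final class UnsafeClassModel {
    private final List<String> fieldNames = new ArrayList<>();

    void declareField(String name) {
        fieldNames.add(name);
    }

    int fieldCount() {
        return fieldNames.size();
    }
}

/** Funnels every read and write through one lock so that edits never interleave. */
final class GuardedClassModel {
    private final Object lock = new Object();
    private final UnsafeClassModel model = new UnsafeClassModel();

    void declareField(String name) {
        synchronized (lock) {
            model.declareField(name);
        }
    }

    int fieldCount() {
        synchronized (lock) {
            return model.fieldCount();
        }
    }
}

final class LockingSketch {
    /** Starts the given number of threads, each declaring fields, and returns the total. */
    static int runWorkers(int threads, int fieldsPerThread) {
        GuardedClassModel model = new GuardedClassModel();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < fieldsPerThread; i++) {
                    model.declareField("f" + id + "_" + i);
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return model.fieldCount();
    }

    public static void main(String[] args) {
        System.out.println(runWorkers(4, 1000)); // 4000
    }
}
```

A single coarse lock is the simplest choice; it trades concurrency for safety, which is usually acceptable when bytecode modification is not on a hot path.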
The first class that you should study in this package is the Project type. From there, move on to the BCClass, and trace its APIs into BCFields, BCMethods, and finally into actual Code.
Interface Summary

| Interface | Description |
| --- | --- |
| BCEntity | Interface implemented by all bytecode entities. |
| Constants | Interface to track constants used in bytecode. |
| InstructionPtr | An entity that maintains ptrs to instructions in a code block. |
Class Summary

| Class | Description |
| --- | --- |
| Annotated | An annotated entity. |
| Annotation | A declared annotation. |
| Annotation.Property | An annotation property. |
| Annotation.Property.Value | Property value struct. |
| Annotations | Java annotation data. |
| ArrayInstruction | Any array load or store instruction. |
| ArrayLoadInstruction | Loads a value from an array onto the stack. |
| ArrayState | State implementing the behavior of an array class. |
| ArrayStoreInstruction | Stores a value from the stack into an array. |
| Attribute | In bytecode, attributes are used to represent anything that is not part of the class structure. |
| Attributes | Abstract superclass for all bytecode entities that hold attributes. |
| BCClass | The BCClass represents a class object in the bytecode framework, in many ways mirroring the Class class of Java reflection. |
| BCClassLoader | Class loader that will attempt to find requested classes in a given Project. |
| BCField | A field of a class. |
| BCMember | A member field or method of a class. |
| BCMethod | A method of a class. |
| BootstrapMethodElement | |
| BootstrapMethods | |
| ClassConstantInstruction | Pseudo-instruction used to place Class objects onto the stack. |
| ClassInstruction | An instruction that takes as an argument a class to operate on. |
| CmpInstruction | An instruction comparing two stack values. |
| Code | Representation of a code block of a class. |
| CodeEntry | An entry in a code block. |
| ConstantInstruction | An instruction that loads a constant onto the stack. |
| ConstantValue | A constant value for a member field. |
| ConvertInstruction | A conversion opcode such as i2l, f2i, etc. |
| Deprecated | Attribute signifying that a method or class is deprecated. |
| ExceptionHandler | Represents a try {} catch() {} statement in bytecode. |
| Exceptions | Attribute declaring the checked exceptions a method can throw. |
| FieldInstruction | Instruction that takes as an argument a field to operate on. |
| GetFieldInstruction | Loads a value from a field onto the stack. |
| GotoInstruction | An instruction that specifies a position in the code block to jump to. |
| IfInstruction | An if instruction such as ifnull, ifeq, etc. |
| IIncInstruction | The iinc instruction. |
| InnerClass | Any referenced class that is not a package member is represented by this structure. |
| InnerClasses | Attribute describing all referenced classes that are not package members. |
| Instruction | An opcode in a method of a class. |
| InstructionPtrStrategy | InstructionPtrStrategy handles the different strategies for finding the Instructions that InstructionPtrs point to. |
| JumpInstruction | An instruction that specifies a position in the code block to jump to. |
| LineNumber | A line number corresponds to a sequence of opcodes that map logically to a line of source code. |
| LineNumberTable | Code blocks compiled from source have line number tables mapping opcodes to source lines. |
| LoadInstruction | Loads a value from the locals table to the stack. |
| Local | A local variable or local variable type. |
| LocalTable | Code blocks compiled from source have local tables mapping locals used in opcodes to their names and descriptions. |
| LocalVariable | A local variable contains the name, description, index and scope of a local used in opcodes. |
| LocalVariableInstruction | An instruction that has an argument of an index into the local variable table of the current frame. |
| LocalVariableTable | Code blocks compiled from source have local variable tables mapping locals used in opcodes to their names and descriptions. |
| LocalVariableType | A local variable type contains the name, signature, index and scope of a generics-using local used in opcodes. |
| LocalVariableTypeTable | Code blocks compiled from source have local variable type tables mapping generics-using locals used in opcodes to their names and signatures. |
| LookupSwitchInstruction | The lookupswitch instruction. |
| MathInstruction | One of the math operations defined in the Constants interface. |
| MethodInstruction | An instruction that invokes a method. |
| MonitorEnterInstruction | The monitorenter instruction. |
| MonitorExitInstruction | The monitorexit instruction. |
| MonitorInstruction | A synchronization instruction. |
| MultiANewArrayInstruction | The multianewarray instruction, which creates a new multi-dimensional array. |
| NameCache | Caching and conversion of names in both internal and external form. |
| NewArrayInstruction | The newarray instruction, which is used to create new arrays of primitive types. |
| ObjectState | State implementing the behavior of an object type. |
| PrimitiveState | State implementing the behavior of a primitive class. |
| Project | The Project represents a working set of classes. |
| PutFieldInstruction | Stores a value from the stack into a field. |
| RetInstruction | The ret instruction is used in the implementation of finally. |
| ReturnInstruction | Returns a value (or void) from a method. |
| SourceFile | Attribute naming the source file for this class. |
| StackInstruction | Represents an instruction that manipulates the stack of the current frame. |
| State | The State type is extended by various concrete types to change the behavior of a BCClass. |
| StoreInstruction | An instruction to store a value from the stack into a local variable. |
| SwitchInstruction | Contains functionality common to the different switch types (TableSwitch and LookupSwitch). |
| Synthetic | Attribute marking a member as synthetic, or not present in the class source code. |
| TableSwitchInstruction | The tableswitch instruction. |
| TypedInstruction | Any typed instruction. |
| UnknownAttribute | An unrecognized attribute; class files are allowed to contain attributes that are not recognized, and the JVM must ignore them. |
| WideInstruction | The wide instruction, which is used to allow other instructions to index values beyond what they can normally index based on the length of their arguments. |