Discussing the design, decisions and discovery involved in developing Mutability Detector, an open-source analysis tool for Java.

Monday 5 July 2010

Immutability and subclassing: are they mutually exclusive?

In an earlier post I introduced the Mutability Detector, and how Josh Bloch's rules were used in its development. One point I mentioned was that Josh's rules are a bit on the strict side. I'm using this blog entry to clarify what I meant by that, and get some of the ol' wheels turning on the question: are immutability and subclassing mutually exclusive?

One of the immutability rules from Effective Java is:
2. Ensure that the class can’t be extended.


I'd like to make the case that yes, while preventing your class being extended is not going harm the immutability status it isn't always necessary. There's also the issue to consider that declaring your class final may have benefits, somewhere, at some point, in the hazy world of the Java memory model.

I'll start by explaining when it is necessary.

For example, given the following class:

public class IAmImmutable {
private String label;
public IAmImmutable(String label) {
this.label = label;
}
public String getLabel() {
return this.label;
}
}


I'm pretty sure everyone would consider this immutable - an instance of IAmImmutable cannot change after construction. However, it fails one of Josh's rules: the class can be extended. So you could have a situation like:

public class ImInUrClassRuiningUrImmutability extends IAmImmutable {
... // similar constructor
public void setLabel(String label) {
this.label = label;
}
}


So you could have a class which depends on IAmImmutable. Perhaps you're composing your big immutable objects out of smaller immutable ones. Like this:

public class ComposedOfImmutables {
public final IAmImmutable firstField;
public final SecondImmutableClass secondField;

public ComposedOfImmutables(IAmImmutable first, SecondImmutableClass second) {
firstField = first;
secondField = second;
}
}


All is well: ComposedOfImmutables, because it's being built on the solid foundation of immutable instances, is easier to understand, debug, reason about, and almost trivial to use in multithreaded applications. That is, however, until IAmImmutable is subclassed, and the menace that is the immutability-ruining LOLclass is free to pollute the otherwise immutable world around it. An instance of ImInUrClassRuiningUrImmutability, because it extends IAmImmutable, can be passed as a parameter to ComposedOfImmutables, corrupted it with mutability. Like so:

ImInUrClassRuiningUrImmutability mutableThing = new ImInUrClassRuiningUrImmutability("lol");
ComposedOfImmutables supposedlyImmutable = new ComposedOfImmutables(mutableThing, ...);

publishImmutableThingToManyThreads(supposedlyImmutable);

supposedlyImmutable.firstField.setLabel("hr i goez - ruinin ur immutability");


This is the case where it's important that IAmImmutable cannot be extended. It's a consumer thing. The existence of a mutable subclass does not mean IAmImmutable instantly becomes mutable. You can still new them up and pass them around multiple threads with joyful abandon. But... ComposedOfImmutables depends on the instance it receives in the constructor being immutable. The onus of immutability in this case does, and should lie, with ComposedOfImmutables.

You may be asking, what's the problem? IAmImmutable obviously wants to be immutable. It lives to be used immutably (is that a word?), why not just prompt the developer to prevent subclassing?. Well, it becomes an issue for static code checking tools, like Mutability Detector. In the past year I spent my honours year project steeped in the stew of static bug checkers, with their incomplete analysis, unsoundness, and swathes of false positives. My research in that area has convinced me that any static analysis tool has to be very, very careful of reporting errors that get in the users way. If it's not reporting real problems they want to fix, it's wasting their time.

[Edit: looking back after the comments made by Seb Charrot, this conclusion was really misleading. Hopefully this revision makes more sense.]

That's why it's my opinion that Josh's rule is too strict. And it's the reason that Mutability Detector will not emit errors that IAmImmutable is in fact mutable, it will instead emit warnings. Instead of an "omg teh buildz broken!", it says "You know, there's a danger here. Gather round to hear why...". However, Mutability Detector has, and will, tell you that ComposedOfImmutables is definitely not immutable. If guarantees can't be made about the runtime instances passed to a class, the foundations it is built on are too weak to support it, and immutability cannot be claimed.

So while most of the time immutability and subclassing are incompatible, hopefully this post has demonstrated that they are not always mutually exclusive.

15 comments:

  1. I'm not sure I agree - isn't immutability by definition "the quality of being incapable of mutation"?

    A class with a publicly-exposed variable is mutable though it may never be mutated. I.e. the variable may never be assigned from an external source.

    In the same way, a class which can be extended is mutable - it can be mutated, even if it never will be. This is the very definition of mutability, isn't it? That a consumer has the means to mutate an object?

    It seems to me that you've drawn a distinction between "class X is mutable" and "class X is in danger of being mutable" when in fact there is no difference. If there's a danger of mutation, then it *is* mutable.

    Disclaimer: This comment is not meant as an attack but an invitation to hear some robust arguments :)

    ReplyDelete
  2. First off, thanks for your comments: frist psot!

    You're right that it's the danger of being mutated which is important. Generally in developing immutable classes, defensiveness is the key. This is why I've said that the likes of public non-final fields (which can be reassigned after construction) or setter methods (which also reassign fields after construction) are both definitely not immutable.

    In the example in the blog entry, the distinction isn't between being mutable and danger of being mutable, it's between the use of a class by its clients and the class itself.

    The specific, concrete class IAmImmutable, is not in danger of being mutated. No matter what you do, if you construct an instance of it directly, from it's concrete constructor, it cannot be changed*. So IAmImmutable is found to be immutable when looked at in isolation, the tool issuing what is essentially only a warning. You're slightly off in your statement "class X is in danger of being mutable" (assuming you meant IAmImmutable) in that the *class* isn't in danger of being mutable. What's more accurate is that "classes which extend X can render runtime instances referenced as type X to be mutable" (bit of a mouthful I know, but the difference is significant).

    The class not being declared final only becomes significant when instances are being consumed. As in the example, if a class accepts an instance of IAmImmutable, it also accepts any mutable subclasses. At this point, something is in danger of being mutable, but what? Is it IAmImmutable, whose instances *are* immutable, or clients which depend on the instances it receives being immutable, but don't guarantee that those instances are definitely immutable? For me, it's clearly the latter.

    The distinction wouldn't be that important, and the solution is the same nevertheless (declaring it final), if it wasn't for the devastating effect false positives can have on static analysis tools. If IAmImmutable was found to be mutable because it wasn't declared final, I could see it putting people off using the tool. In my experience with analysis tools, I've found it's easy to be more strict, but much more difficult to be realistic in terms of what the tool demands from developers.

    Getting back to the example, IAmImmutable introduces mutability, but not in 100% of cases. Detecting the percentage of cases where it does matter reduces false positives and increases effectiveness of the tool (in theory).

    Hope this clarifies my point, and if you're still in doubt, feel free to question me till I do convince ya :-)

    ~ Graham

    * actively ignoring the capabilities of reflection, and other such (I don't want to say 'nasty', but yeah) stuff.

    ReplyDelete
  3. Here’s my problem with that argument – IAmImmutable doesn’t expose any means of exterior modification of its state after construction. But one of its subclasses – ImInUrClassRuininUrImmutability – can clearly provide paths to modify information defined in IAmImmutable. Although I understand that the mutable *instance* is of a subclass, we’re still allowing a state which is defined in the base class to be changed. If IAmImmutable were truly, reliably immutable, you would be happy consuming any of its children (now that’s an evil phrase). But as it is, we have no guarantee that any object under the guise of IAmImmutable is immutable.

    > If IAmImmutable was found to be mutable because it wasn't declared final, I could see it putting people off using the tool.

    I’m all for pragmatism, but if people are creating classes which can be subclassed and modified, then I think they should be told that there’s no guarantee that objects declared as IAmImmutable will be their namesake. And I’d always thought that immutability was something which by definition had to have a guarantee.

    Talking in practical terms, you could always provide an option for developers to toggle “Treat non-final classes as mutable”... though I think that’s a weak compromise.

    ReplyDelete
  4. > If IAmImmutable were truly, reliably immutable, you would be happy consuming any of its children (now that’s an evil phrase). But as it is, we have no guarantee that any object under the guise of IAmImmutable is immutable.


    Herein lies the point I was chipping away at. You *wouldn't* be happy consuming IAmImmutable's children*. There *is no* guarantee that a reference to an IAmImmutable instance is immutable. If you had a constructor which assigned an IAmImmutable instance to one of it's fields, it can be made mutable, the same as if the class received an instance of an interface or abstract class. Until subclassing IAmImmutable is prevented, there can be no guarantee that when you receive a reference to one, it will point to a truly, reliably immutable object.

    BUT... Mutability Detector is aware of this, and complains when consumers depend upon a property of a class which can't be guaranteed. The reasons you want IAmImmutable to be exposed as mutable fall under the conditions that I tried to describe: when instances are being consumed, and the consumers depend upon their immutability.

    Without users to gather data from, it's essentially an unknown how much the distinction will matter in practice. What we've been discussing is, I expect, an edge case, where the significance of the distinction is the demotion of a particular case being definitely not immutable, to a warning about a gotcha that may or may not matter. Where consuming classes previously failed, they still fail. Given the examples in the blog, IAmImmutable has went from not immutable to a warning; ComposedOfImmutables has remained definitely not immutable.

    In regard to the toggle, what I envisage being the real use for this tool is for making assertions in unit tests (a blog entry on this, when the functionality is nearing completion, will follow). Rather than a toggle, or option for the tool, I intend to make the hypothetical Assert.isImmutable() method flexible enough to allow users to overrule the tool's decision, in the inevitable event that the tool generates a false positive. Since one of my favourite features of FindBugs is how it explains its decisions in detail, I'd rather put information in front of the user and let them judge, rather than swallowing warnings silently. Not filling the gap that allows IAmImmutable to be subclasses would fall under this - the user would be alerted, but they could suppress the warning.

    However, though I still stand by what I've wrote at this moment in time, if I'm proved to be wrong, rest assured the plight of immutability will not suffer under the yoke of my stubborness. For born am I with the humility to admit mistake, which I can effortlessly achieve while simultaneously writing in the tone of a first class twat.

    * "No I don't want to eat babies! I don't have the time... they're not delicious." - Kyle Cease, anyone?

    ReplyDelete
  5. I took a look back at the content of this post, and saw how some of the wording was completely misleading. Perhaps you wouldn't mind taking a look at the last couple of paragraphs and telling me if the edit makes my point any clearer?

    ReplyDelete
  6. Where in Effective Java does it say that an object is only immutable if it can't be extended?

    ReplyDelete
  7. I'll go ahead an answer my own question. Effective Java does not say that an object is only immutable if it can't be extended, because that's not true. The requirements for immutability are quite precise and listed in Java Concurrency in Practice. According to the rules, IAmImmutable is NOT immutable because its field is non-final. If its field were final, it would be irrelevant if its subclasses were mutable, it would still be immutable no matter what. Even without writing a subclass or using crazy reflection, I can write a unit test which proves that IAmImmutable is in fact mutable.

    ReplyDelete
  8. Craig, I'll answer your first question too, then take another comment to answer your second points.

    Effective Java includes as one of its five rules for immutability "Ensure that the class can't be extended."

    See here (Google Books)

    ReplyDelete
  9. I'm not sure if I'm interpreting you correctly, but you're right, "Can't be extended" is definitely not the only rule. Classes which cannot be extended can be mutable. Yep, there's much more to it.

    In my example, I probably should have made the same distinction between Immutable and Effectively Immutable, as JCIP also does. In my example, IAmImmutable is the latter: once an instance of that class is safely published, it cannot be seen to change by clients of the class.

    However, if it is unsafely published to other threads, the lack of the final modifier would become an issue, as the field could be accessed before it has been assigned. Thus a client of the class could observe getLabel() returning null on the first call, then returning a non-null String on the second call. Safely publishing prevents this.

    At least, that's my understanding. If you could share a unit test or some other piece of code that demonstrates what you mean, I'd love to see it.

    ReplyDelete
  10. That's exactly what I meant. A unit test could unsafely publish IAmImmutable and another thread could observe a null field.

    I didn't remember the term "effectively immutable" from JCIP. Discovering effectively immutable objects seems interesting, and difficult.

    I recommend splitting the API into assertEffectivelyImmutable() and assertImmutable(). JCIP says "Immutable objects, on the other hand, can be safely accessed even when synchronization is not used to publish the object reference." I think that assertImmutable() should be strict on that point.

    ReplyDelete
  11. That's something I've been thinking about, and a mechanism for tailoring the output of assert has been in place since 0.6. The default assert will be the most strict and the user will be able to supply suppressions. For example, for java.lang.String, you could write something like:

    assertInstancesOf(String.class, areImmutable(), allowing(field("hash").toBeReassigned());

    (don't quote me on the syntax, hopefully you get the idea)

    Something similar could be used for effectively immutable objects. The overall codebase would then have to be assessed manually, to check that publication was safe, but hopefully that would still be a helpful utility.

    Thanks Craig.

    ReplyDelete
  12. private String label; in IAmImmutable class has private access. How can u change its value in sublcass by setLabel() method? It will give error:

    "label has private access in IAmImmutable".

    Thanks...

    ReplyDelete
    Replies
    1. Hi Rajinder,

      Good catch. The subclass could use any field as the return value for getField(). It could either redeclare the field label, or introduce a new one, or return a random string.

      The point still stands that, because of subclassing, getField() could return different result on different calls.

      Make sense?

      Regards,
      Graham

      Delete
    2. OK, I think this could be something like this:

      class Immutable{
      private String label="Original Label";
      public String getLabel(){
      return label;
      }
      }

      public class MutableMischief extends Immutable{

      @Override
      public String getLabel(){
      return "Not Original Label";
      }

      public static void main(String args[]){

      MutableMischief mutObj= new MutableMischief();
      ImmutableClient immutClient= new ImmutableClient(mutObj);
      System.out.println(immutClient.printImmutableLabel());
      }
      }


      class ImmutableClient{

      Immutable immutObj;

      ImmutableClient(Immutable immut){
      this.immutObj=immut;
      }

      public String printImmutableLabel(){
      return immutObj.getLabel();
      }

      }

      The above program should print Original Label But it will print Not Original Label.

      Thanks...

      String Pool is possible due to String Immutability

      Delete
    3. Just to be pedantic, your MutableMischief is also immutable :)

      Although the return value of getLabel() is different from its superclass, it will always be the same for an instance of MutableMischief, just like a final field would be. To show what I mean, consider MutableMischief to be like this:

      public class MutableMischief extends Immutable {

      @Override
      public String getLabel() {
      return "Return value keeps changing: " + (System.currentTimeMillis() % 10);
      }

      public static void main(String args[]){
      MutableMischief mutObj= new MutableMischief();
      ImmutableClient immutClient= new ImmutableClient(mutObj);
      System.out.println(immutClient.printImmutableLabel());
      Thread.sleep(1000);
      System.out.println(immutClient.printImmutableLabel());
      }
      }

      One of the times I ran this I got the result:
      Return value keeps changing: 6
      Return value keeps changing: 8

      So the issue is *not* that MutableMischief returns a different value than its superclass. It's that, on different invocations of the same method of the same instance of MutableMischief, different results are returned. It just so happens that having subclasses is a way to introduce that behaviour.

      Is that any clearer?

      Kind regards,
      Graham


      Delete