Sunday 6 December 2009

The dangers of overloading

I posted the equals(Object) solution from my last blog entry to the artima discussion on this topic. As one contributor demonstrated, calling the protected field-comparison methods from outside an instance can produce some very strange results. The answer, of course, is that protected methods should not be called from outside a class, even though Java's same-package rule permits the access. Default access gives plenty of scope for any required kludge without ignoring this last vestige of encapsulation.

These strange results should certainly discourage giving these methods public access, or overloading equals() with the class type as its parameter, as seen in at least one published implementation. As things stand, it is still quite possible to get the protected methods to return a symmetrical false when comparing an object with itself. It is all down to overloading, substitution and the aforementioned weak encapsulation, but it is worth a look.

The first thing to do is write a Point base class and a ColoredPoint subclass, as in the artima article, but using the new equals() implementation rather than canEqual. In the base class, field comparison is factored out to fEquals(Point), and equals(Object) calls the received object's version of fEquals.

public class Point {

    private final int x;
    private final int y;

    public Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // accessors are optional for this discussion

    @Override public boolean equals(Object obj) {
        return (obj instanceof Point &&
                ((Point)obj).fEquals(this) );
    }

    protected boolean fEquals(Point obj) {
        return (this.x == obj.x && this.y == obj.y);
    }

    @Override public int hashCode() {
        // fields are read directly because no accessors are shown here
        return (41 * (41 + x) + y);
    }
}

Point can now be subclassed without overriding equals(), and instances of all types will provide symmetry when compared for equality. ColoredPoint overrides equals() and overrides fEquals(Point) to return false, which is the required result when comparing a ColoredPoint with a Point.

public class ColoredPoint extends Point {

    private final Color color;

    public ColoredPoint(int x, int y, Color color) {
        super(x, y);
        this.color = color;
    }

    // ....................

    @Override public boolean equals(Object obj) {
        return (obj instanceof ColoredPoint &&
                ((ColoredPoint)obj).fEquals(this));
    }

    @Override protected boolean fEquals(Point obj) {
        return false;
    }

    protected boolean fEquals(ColoredPoint obj) {
        return (this.color.equals(obj.color) && super.fEquals(obj));
    }

    @Override public int hashCode() {
        return (41 * super.hashCode() + color.hashCode());
    }
}

ColoredPoint gets an fEquals(ColoredPoint) overload, and again equals() calls the received object's version. Using this technique we can write as many Point and ColoredPoint subclasses as we like, whether or not they override equals().
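As a rough illustration of that claim, the sketch below adds a subclass that does not override equals() and checks that comparisons stay symmetric. LabeledPoint and the Hue enum are hypothetical names introduced only for this example; Hue stands in for Color so that the block has no outside dependency. The two base classes follow the listings above.

```java
// Hypothetical stand-in for the Color type used in the listings above.
enum Hue { PINK, BLUE }

class Point {
    private final int x;
    private final int y;

    Point(int x, int y) { this.x = x; this.y = y; }

    @Override public boolean equals(Object obj) {
        // Hand the field comparison to the received object.
        return obj instanceof Point && ((Point) obj).fEquals(this);
    }

    protected boolean fEquals(Point obj) {
        return this.x == obj.x && this.y == obj.y;
    }

    @Override public int hashCode() { return 41 * (41 + x) + y; }
}

class ColoredPoint extends Point {
    private final Hue color;

    ColoredPoint(int x, int y, Hue color) { super(x, y); this.color = color; }

    @Override public boolean equals(Object obj) {
        return obj instanceof ColoredPoint && ((ColoredPoint) obj).fEquals(this);
    }

    // A ColoredPoint never equals something that is only known to be a Point.
    @Override protected boolean fEquals(Point obj) { return false; }

    protected boolean fEquals(ColoredPoint obj) {
        return this.color.equals(obj.color) && super.fEquals(obj);
    }

    @Override public int hashCode() { return 41 * super.hashCode() + color.hashCode(); }
}

// Hypothetical subclass: its extra field is not part of equality,
// so it inherits equals(), fEquals() and hashCode() unchanged.
class LabeledPoint extends Point {
    private final String label;

    LabeledPoint(int x, int y, String label) { super(x, y); this.label = label; }

    String label() { return label; }
}

class SubclassDemo {
    public static void main(String[] args) {
        Point p = new Point(1, 1);
        Point lp = new LabeledPoint(1, 1, "a label");
        Point cp = new ColoredPoint(1, 1, Hue.PINK);

        System.out.println(p.equals(lp) + " " + lp.equals(p));   // true true
        System.out.println(lp.equals(cp) + " " + cp.equals(lp)); // false false
    }
}
```

Whether a LabeledPoint ought to equal a plain Point with the same coordinates is a design decision; the sketch only shows that whatever the answer is, both directions agree.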

To show two references to the same object giving a symmetrical false, the same ColoredPoint is assigned to both a Point reference and a ColoredPoint reference:

public static void main(String[] args) {
    Point p1 = new Point(1, 1);
    ColoredPoint cp1 = new ColoredPoint(1, 1, Color.PINK);
    Point p2 = cp1;

First, check that equals(Object) is working correctly. Of the following, the first two pairs should return a symmetric false, since each compares a Point with a ColoredPoint. The last pair should return a symmetric true, since it compares a ColoredPoint with itself. No same-object check has been put in equals(), so the full field comparison is done:

    System.out.println("p1.equals(p2) is " + (p1.equals(p2)));
    System.out.println("p2.equals(p1) is " + (p2.equals(p1)));
    System.out.println("p1.equals(cp1) is " + (p1.equals(cp1)));
    System.out.println("cp1.equals(p1) is " + (cp1.equals(p1)));
    System.out.println("p2.equals(cp1) is " + (p2.equals(cp1)));
    System.out.println("cp1.equals(p2) is " + (cp1.equals(p2)) + "\n");

Output is:
p1.equals(p2) is false
p2.equals(p1) is false
p1.equals(cp1) is false
cp1.equals(p1) is false
p2.equals(cp1) is true
cp1.equals(p2) is true


Everything works as expected, but now the overloaded fEquals() methods are tested directly:

    System.out.println("p1.fEquals(p2) is " + (p1.fEquals(p2)));
    System.out.println("p2.fEquals(p1) is " + (p2.fEquals(p1)) + "\n");

The p1 reference is a Point type that references a Point, so Point's fEquals is called and the x, y fields are compared: true expected. The ColoredPoint argument referenced by p2 at runtime has no effect at compile time or at runtime. The runtime type of an argument never has any effect on which method gets called.

For the second comparison the p2 reference is also a Point type. The compile-time call is to fEquals(Point), but p2 references a ColoredPoint and its overridden version of fEquals(Point) is called at runtime: false expected, since this method always returns false.

Output is:
p1.fEquals(p2) is true
p2.fEquals(p1) is false

The next two comparisons behave in a similar way:

    System.out.println("p1.fEquals(cp1) is " + (p1.fEquals(cp1)));
    System.out.println("cp1.fEquals(p1) is " + (cp1.fEquals(p1)) + "\n");

p1.fEquals(cp1) behaves the same as p1.fEquals(p2): true expected. cp1.fEquals(p1) resolves to ColoredPoint's version of fEquals(Point), which is always false.

Output is:
p1.fEquals(cp1) is true
cp1.fEquals(p1) is false
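
The second result is obvious, but now the two references to the same ColoredPoint are compared:

    System.out.println("p2.fEquals(cp1) is " + (p2.fEquals(cp1)));
    System.out.println("cp1.fEquals(p2) is " + (cp1.fEquals(p2)));
}

Output is:
p2.fEquals(cp1) is false
cp1.fEquals(p2) is false

In p2.fEquals(cp1) the receiver is declared as a Point, so the compiler can only see fEquals(Point), and ColoredPoint's override of it runs. In cp1.fEquals(p2) the argument is declared as a Point, so the compiler again selects fEquals(Point). The second result is particularly unintuitive unless we remember that the runtime type of an argument never has any effect on which method gets called. The one exception is an exception handler: a catch clause is selected by the runtime type of the thrown object.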
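The rule can be seen in isolation with a much smaller sketch. Vehicle, Car and DispatchDemo are hypothetical names used only here. The first part shows an overload being chosen from the declared type of the argument, and the second shows a catch clause being chosen from the runtime type of the thrown object.

```java
class Vehicle { }
class Car extends Vehicle { }

class DispatchDemo {
    // Overloads: selected at compile time from the declared type of the argument.
    static String describe(Vehicle v) { return "describe(Vehicle)"; }
    static String describe(Car c)     { return "describe(Car)"; }

    // Catch clauses: selected at run time from the actual type of the thrown object.
    static String handle(RuntimeException e) {
        try {
            throw e;
        } catch (IllegalArgumentException iae) {
            return "caught as IllegalArgumentException";
        } catch (RuntimeException re) {
            return "caught as RuntimeException";
        }
    }

    public static void main(String[] args) {
        Car car = new Car();
        Vehicle sameCar = car;                   // same object, wider declared type

        System.out.println(describe(car));       // describe(Car)
        System.out.println(describe(sameCar));   // describe(Vehicle)

        // Declared as RuntimeException, but the handler for the actual type runs.
        RuntimeException e = new IllegalArgumentException("demo");
        System.out.println(handle(e));           // caught as IllegalArgumentException
    }
}
```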


The binary method problem

Elsewhere, multiple dispatch and predicate dispatch add another dimension to overridden methods by letting the runtime types of the arguments direct execution to an appropriate method. A Java extension that supports predicate dispatching is [here] with a covering paper [pdf].

One thing that multiple dispatching is good at is solving the binary method problem, but at least a limited solution is possible in standard Java. Suppose we write a base class with a method whose argument is an instance of the same class, or an array of that type: a binary method. How can we allow for this method being overridden so that it works for receivers and arguments of any subclass, including those that have not yet been written, and without changing code in the base class? The equals() implementation is a special-case solution to the binary method problem using programmed double dispatching.
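A stripped-down sketch of programmed double dispatch may make the two hops easier to see. Base, Derived, meet and metBy are hypothetical names, and the methods return a description of which code ran instead of doing real work. The first hop is the ordinary virtual call on the receiver; the second is a virtual call on the argument, which passes the original receiver back.

```java
class Base {
    // Hop 1: dynamic dispatch on the receiver of meet().
    String meet(Base other) {
        // Hop 2: dynamic dispatch on the argument, which becomes the new receiver.
        return other.metBy(this);
    }

    String metBy(Base first) { return "Base meets Base"; }
}

// Written later; Base is not modified.
class Derived extends Base {
    @Override String meet(Base other) {
        if (!(other instanceof Derived)) {
            return "Derived meets Base";          // decided locally, like equals() returning false
        }
        // 'this' is declared as Derived here, so the metBy(Derived) overload is selected.
        return ((Derived) other).metBy(this);
    }

    @Override String metBy(Base first) { return "Base meets Derived"; }

    String metBy(Derived first) { return "Derived meets Derived"; }
}

class DoubleDispatchDemo {
    public static void main(String[] args) {
        // Every reference is declared as Base, yet all four pairings are told apart.
        Base b = new Base();
        Base d = new Derived();

        System.out.println(b.meet(b));   // Base meets Base
        System.out.println(b.meet(d));   // Base meets Derived
        System.out.println(d.meet(b));   // Derived meets Base
        System.out.println(d.meet(d));   // Derived meets Derived
    }
}
```

The solution is limited in the sense the text describes: each new subclass has to override both methods itself, because Base can never learn about overloads that take the subclass type.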


Diverse identifiers

Overloading is simply a convenience that saves having to think up a new method name. It forms the basis of operator overloading but may not be appropriate when methods are subject to being overridden in a subclass. Overloading adds nothing to functionality and, as shown above, can produce unwanted side effects when an overloaded method is overridden. Maybe we need to consider whether this kind of polymorphism is generally helpful, or whether it only adds ambiguity in some and perhaps in many cases: not for the compiler, of course, but for anyone trying to interpret someone else's code, or their own at a later time.

There is no reason why fEquals should not be given different identifiers in different classes, pointEquals(Point) and coloredPointEquals(ColoredPoint) for example. On the other hand, doing this runs into another common OOP problem: linkedColoredPointEquals(something), huge identifiers that wrap code onto the next line. When implementing equals() as above, there is never any scope for calling the protected method against a reference with an inappropriate nominal type. Perhaps all that is needed to provide better clarity is a more explicit overloaded method name, equalFields(...) for example.
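For comparison, here is a minimal sketch of the distinct-identifier variant, using the pointEquals and coloredPointEquals names suggested above. The Hue enum is a hypothetical stand-in for Color, and RenamedDemo is likewise only for illustration. With separate names, the call that gave the surprising false can no longer be written by accident, because each call site states which comparison it is asking for.

```java
// Hypothetical stand-in for the Color type used in the listings above.
enum Hue { PINK, BLUE }

class Point {
    private final int x;
    private final int y;

    Point(int x, int y) { this.x = x; this.y = y; }

    @Override public boolean equals(Object obj) {
        return obj instanceof Point && ((Point) obj).pointEquals(this);
    }

    protected boolean pointEquals(Point obj) {
        return this.x == obj.x && this.y == obj.y;
    }

    @Override public int hashCode() { return 41 * (41 + x) + y; }
}

class ColoredPoint extends Point {
    private final Hue color;

    ColoredPoint(int x, int y, Hue color) { super(x, y); this.color = color; }

    @Override public boolean equals(Object obj) {
        return obj instanceof ColoredPoint && ((ColoredPoint) obj).coloredPointEquals(this);
    }

    // Overridden, not overloaded: a ColoredPoint never equals a plain Point.
    @Override protected boolean pointEquals(Point obj) { return false; }

    // A separate name, so no overload resolution is involved at the call site.
    protected boolean coloredPointEquals(ColoredPoint obj) {
        return this.color.equals(obj.color) && super.pointEquals(obj);
    }

    @Override public int hashCode() { return 41 * super.hashCode() + color.hashCode(); }
}

class RenamedDemo {
    public static void main(String[] args) {
        ColoredPoint cp1 = new ColoredPoint(1, 1, Hue.PINK);
        Point p2 = cp1;   // same object, declared as Point

        System.out.println(cp1.equals(p2));               // true
        System.out.println(cp1.coloredPointEquals(cp1));  // true
        System.out.println(cp1.pointEquals(p2));          // false, and the name says why
        // cp1.coloredPointEquals(p2) would not compile: p2 is declared as Point.
    }
}
```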
