Happy Types

I had prepared this month’s monthly newsletter on generics in Java for my team. It may not be the one ring, but might help you gaining a step, to become the one ring in `Generics` in Java (In case if you are not already). It can be accessed here.

Table of Contents and brief summary:

Generics

Variance
Here we introduce some terms: Co-Variance, Contra-Variance, In-variance. And then explain what do they actually mean.

Case-Studies

  • We observe the true strengths of `? extends` and `? super` and when should you be using what.
  • Almost all different case one can think of.

Some More Cool Stuff

Please let me know in case you disagree with something.

 

PS: The booklet has been generated using Scalatex developed by Li Haoyi. You might want to look at it. Its easier to setup than Jekyll and certainly more expressive than WordPress. And provides compile time like type safety.

Cats

Cats is a simple and concise functional programming library for Scala. It appeals to me for its compactness and simpler class hierarchy.

I have no formal education in category theory. But I am a student of theoretical Mathematics and have studied Abstract Algebra. Though I am not aware of formal category theory terms, I will try converting them to simple set theory.

Category theory is concerned with study of algebraic structures. Algebraic system can be described as a set of objects together with some operations for combining them. Example’s can be vectors, groups, fields etc. Or even Sets.

The aim of this blog post is to explain cats first in a very simple lucid way. And they try putting it in code. I will be using some terms:

  1. Universe: For example: Integer-Universe would imply a universe containing integers or set of integers. Think of it as our infinite terrestrial universe which contains milky way which contains solar system and ultimately planets.
  2. Space/Type: A set of elements corresponding to a property. Space exists in universe. You can think Space as a type in scala.

Functor

Let A and B be space in some universe. And there exists a function `f` from A to B.  Then Functor for `X` universe is a mapping from (mapping of space A in X universe) `X(A)` to (mapping of space `B` in universe `X`) `X(B)`

Factor

Now what does this mean in scala world? Say we have a function `f` that tells us how to convert from type A to type B. So `f` does `Int => String`.

Now we want a mechanism to convert elements of another universe, say `Option` universe. First we need a mechanism to convert from space `A` in our universe to `Option(A)` (Space A in Option universe). We will see how it happens later below (Applicative).

So now some how we have mapped A to Option universe. Lets call it `Option(A)` space. We wish to convert from space `Option(A)` to space `Option(B)` (i.e. we need to `Option[Int] => Option[String]`) (Or F[Option] in above image)

Functors are exactly for this purpose with `map` being its flagship method:

trait Functor[F[_]]  { self =>
    def map[A, B](f: F[A])(f: A => B): F[B]
}
implicit val OptionFunctor = new Functor[Option] {
  override def map[A, B](fa: Option[A])(f: (A) => B): Option[B] = fa map f
}

Functor[Option].map(Some(1))(x => x*13)
Functor[Option].map(Some(1))(x => x.toString)

Functor[Option] is a mapping mechanism from space `Option(A)` to set `Option(B)` in Option universe. Think of it as:

We know how to convert from a space A to another space B in some universe. In another universe X, Functor of X lets us convert from X(A) to X(B)

So it lets you convert from `Option[Int] => Option[String]` or `Option[String] => Option[List[Int]]` etc. In short any space to any space in `Option` universe

Lift:

Lift is my favorite method in Functor. Notice with map, provided you have `Option[A]`, map converts to `Option[B]`. i.e. it consumes `Option[A]` object. But the conversion from `Option[A] => Option[B]` is just another function can be useful if it can be pass around as function. i.e. `F[X]` in above image.

def toStri(n:Int) = n.toString
val lift: (Option[Int]) => Option[String] = Functor[Option].lift(toStri)
lift(Some(1))

If you think, lift seems something like this:

val lift = Functor[Option].map _ //ignore map types

`lift` seems everything of a `map` but as a pure function. And thats how lift is implemented

def lift[A, B](f: A => B): F[A] => F[B] = 
                                          x => map(x)(f)

For a function from Space A to Space B, lift provides a function in Universe X to convert from X(A) to X(B)

Applicative

Now notice, using a Functor we can convert from X(A) to X(B), given a function A to B. But how do we convert an element x belonging to A, to an element in X(A)?
Applicative
Applicative lets us do that:

implicit val OptionApplicative = new Applicative[Option] {
  override def pure[A](x: A): Option[A] = Option(x)

  override def apply[A, B](fa: Option[A])(f: Option[(A) => B]): Option[B] = ...
}
val op: Option[Int] = Applicative[Option].pure(1)

`pure` lets us convert an element from a universe to another. In above case, it lets us convert from element of type `Int` to `Option[Int]`. Applicative is an `Apply`

Apply

Apply is also a Functor.

def apply[A, B](fa: F[A])(f: F[A => B]): F[B]

Just instead of the function `f` which converts from `A => B`, you have `X(A => B)`. i.e. a function in X universe. Notice it is not `X[A] => X[B]`.


Lets have some more comprehensive example:

implicit val ListApplicative = new Applicative[List] {
  override def pure[A](x: A): List[A] = List(x)

  override def apply[A, B](fa: List[A])(f: List[(A) => B]): List[B] = fa.flatMap(x => f.map(y => y(x)))
}

val lsOp = Applicative[List].pure(2)

You ask the Applicative for `List` to convert an element 2 to something in `List` universe. i.e. `List(2)`

 
scala> val ls = Applicative[List].apply(List(1,2,3))(List(x => "a=>"+x, y => "b ==>"+y)) 
ls: List[String] = List(a=>1, b ==>1, a=>2, b ==>2, a=>3, b ==>3)

scala> val ls2 = Applicative[List].apply2(List(1,2,3), List(4,5,6))(List((x:Int,y:Int) => x+" --a--> "+y, (x:Int,y:Int) => x+" --b--> "+y))
ls2: List[String] = List(1 --a--> 4, 1 --b--> 4, 2 --a--> 4, 2 --b--> 4, 3 --a--> 4, 3 --b--> 4, 1 --a--> 5, 1 --b--> 5, 2 --a--> 5, 2 --b--> 5, 3 --a--> 5, 3 --b--> 5, 1 --a--> 6, 1 --b--> 6, 2 --a--> 6, 2 --b--> 6, 3 --a--> 6, 3 --b--> 6)

Compose:

implicit val OptionApplicative = new Applicative[Option] {
  override def pure[A](x: A): Option[A] = Option(x)

  override def apply[A, B](fa: Option[A])(f: Option[(A) => B]): Option[B] = fa.flatMap(x => f.map(y => y(x)))
}
scala> val comp= (Applicative[List] compose Applicative[Option])
comp: cats.Applicative[[α]List[Option[α]]] = cats.Applicative$$anon$1@168d1858

scala> comp.pure(12)
res5: List[Option[Int]] = List(Some(12))

Compose is quite interesting. So say you have an Applicative for List universe and another Applicative for Option universe. By composing them, you get an Applicative for `List[Option]`. Calling pure convert’s 12 from Int to a List(Option(1)).

PS: Will keep updating with time

Tagged

Type-information of generic type for a Parametrized class

Sometimes you feel that your code is world-class and deserves to be shared. There is one such class I have written for a personal project and find it to be extremely useful. You can view it here. Its called `CaptureType`

This class obtains the type information of the generic type and can be accessed in a type-safe manner. Some simple examples:

   static void test1() {
        CaptureType<String> t1 = new CaptureType<String>() {};
        equals(t1.getRawType(), String.class);
    }

    static void test2() {
        CaptureType<List<String>> t1 = new CaptureType<List<String>>() {};
        equals(t1.getRawType(), List.class);
        equals(t1.getParamADT().getParameters().get(0).getRawType(), String.class);
    }

Little tougher:

    static class Test4 extends CaptureType<List<String>> {}

    static void test4() {
        Test4 test4 = new Test4();
        equals(test4.getParamADT().getRawType(), List.class);
    }

    static class PreTest6<S> extends CaptureType<S> {}

    static class Test6 extends PreTest6<Integer> {
    }

A really tough one:

    class X<T> extends CaptureType<T> {
    }

    class Y<A, B> extends X<B> {
    }

    class Z<Q> extends Y<Q, Map<Integer, List<List<List<Integer>>>>> {
    }

    void test7(){
        Z<String> z = new Z<>();
        TypeADT param = z.getParamADT();
        equals(param.getRawType(), Map.class);
        List<TypeADT> parameters = param.getParameters();
        equals(parameters.get(0).getRawType(), Integer.class);
        equals(parameters.get(1).getRawType(), List.class);
        equals(parameters.get(1).getParameters().get(0).getRawType(), List.class);
        equals(parameters.get(1).getParameters().get(0).getParameters().get(0).getRawType(), List.class);
        equals(parameters.get(1).getParameters().get(0).getParameters().get(0).getParameters().get(0).getRawType(), Integer.class);
    }

This class captures information using Reflection. Sadly, it only obtain whatever reflection can provide. It captures generic types using the super token.
So in cases like:

 class SomeClass<T> extends CaptureType<T>{}
 SomeClass<String> claz = new SomeClass<>();

It is impossible to obtain the type information as it is erased. `CaptureType` class will only work,
if the type is mentioned while declaring the Class (at any point in hierarchy). For example in:

     class A<Z> extends CaptureType<Z>{}
     class B<E,T> extends A<T>{}
     class C <X,Y,Z> extends B<X, String>
     }

It is possible to obtain the type {@code String}, because it is mentioned at class level and can be mapped by following hierarchy. So `Z` in {@code A} is {@code String}

Please comment down your feedback. Even better an improvement :). I give as good as I get!

Parallel Streams!!

I am always skeptical of using parallel streams in Java8. For simple reason:

You do not have control to decide the pool responsible for executing tasks. By default, common ForkJoinPool executes it.

So in short when you do:

stream.parallel().filter(x -> isPrime(x));

The common pool executes it. And you cannot let some other pool execute `filter` in a straight-forward manner. It gets really dangerous on large codebases where people stupidly using parallel streams. If any computation gets time taking or worse blocking, it impacts all the parallel streams because they all use the same pool.

Though there is a way to let your custom pool execute tasks, but it is insanely irritating (and breakable).

ForkJoinPool pool = new ForkJoinPool(4);
int[] ans = pool.submit(() -> {
            return IntStream.range(1,1000000).parallel().filter(x -> isPrime(x)).toArray();
        }).get();

public static boolean isPrime(int i){
        System.out.println(Thread.currentThread());
        return true;
    }

The above prints Thread[ForkJoinPool-1-worker-0,5,main] rather than Thread[ForkJoinPool.commonPool-worker-2,5,main]. Which means the subtasks are executed by our pool rather than the common pool. It seems magic because we haven’t requested the parallel stream to use our pool. This implicitly happens because of ForkJoinTask#doInvoke internals

private int doInvoke() {
        int s; Thread t; ForkJoinWorkerThread wt;
        return (s = doExec()) < 0 ? s :
            ((t = Thread.currentThread()) instanceof ForkJoinWorkerThread) ?
            (wt = (ForkJoinWorkerThread)t).pool.awaitJoin(wt.workQueue, this) :
            //else delegate to common pool
    }

Where it checks if the current thread is part of any pool, if so obtains reference to the pool and uses it. Else delegates to common pool.

I haven’t found documentation which suggests above behavior. Hence the behavior cannot be fully trusted.

Tagged

Constraint Argument to Multiple Types

A lot of times while writing quick scripts (where I haven’t thought through well), I wish the compiler would be a bit lenient to allow any argument of different types. Like this:

class Animal
def getInt(n: Any) : Int = n match {
    case x: Int => x
    case x: String => Integer.parseInt(x)
    case x: Animal => 42
}

Duck typing looks well. But its too lenient for me. I dont like the ability to do getInt(new Object). I wish I was allowed to do:

def getInt(n: Int || String || Animal) : Int = n match {
    case x: Int => ....

One very smart way of doing it can be:

sealed trait GetIntType[-T]
object GetIntType{
implicit object IntType extends GetIntType[Int]
implicit object StringType extends GetIntType[String]
implicit object AnimalType extends GetIntType[Animal]
}

class Dog extends Animal

def getInt[T : GetIntType](n: T) : Int = n match {
case x: Int => x
case x: String => Integer.parseInt(x)
case x: Animal => 42
}

scala> getInt(1)
res7: Int = 1

scala> getInt("1")
res8: Int = 1

It works because, the compiler constraint is from the availability of implicit objects in `GetIntType` companion object. The context bound implicit def f: GetIntType[T] by default searches for implicit objects inside companion object of `GetIntType`. So now not only you can do this:

scala> getInt(new Animal)
res5: Int = 42</pre>

But also this:

scala> getInt(new Dog)
res6: Int = 42

It works because `GetIntType` is contra-variant. Which means if Dog <: Animal then GetIntType[Animal] <: GetIntType[Dog].

But I still find the above tedious. And too much baggage code.


 

 

 

Hack Puzzle – 1

I have been thinking for quite some-time to build a puzzle. I think I have built an awesome one and hope everyone finds it awesome to hack :)

You can download the jar from here.

Aim:
  • Make the program terminate gracefully
Terminate gracegullyImplies that the program should end. And it should not end by throwing an Exception or in an abnormal manner. Using `System.exit` or the program terminating due to an exception is not graceful termination.
To run:
  • java -jar DeadLock2.jar
If successfully run it will output:

Oops there is a deadlock. Please solve the deadlock to make me move forward.

Rules:
  1. You cannot replace any of the jar contents with your files
  2. You cannot recompile any of the classes in jar and replace them
  3. You are permitted to alter/modify the contents of the jar/class files
  4. You are permitted to add or use any external library to solve it
If successfully solved, the program prints the below:

Puzzle Soved!! Congratulations you have solved the puzzle.

Comment below the tools you used and time it took to solve. Happy Hacking :-)

Replace View Bounds

Well the scala community has decided to deprecate view bounds in Scala. I suspect it might have to do with brevity. Full discussion can be viewed here

Well looks like most of the view bounds (<%) can be converted to context bounds. Lets see different strategies that can help us do the same. Suppose we have:


scala> def foo[T <% Int](x: T):Int = x
foo: [T](x: T)(implicit evidence$1: T => Int)Int

scala> implicit def a[T](n:T) = n match {
 | case x:String => x.toInt
 | }
warning: there were 1 feature warning(s); re-run with -feature for details
a: [T](n: T)Int

scala> foo("23")
res4: Int = 23

To convert it to `context bound` we need to implicitly have something that does T => Int. We could:


scala> type L[X] = X => Int
defined type alias L

scala> def goo[T : L](x: T):Int = x
goo: [T](x: T)(implicit evidence$1: T => Int)Int

scala> goo(23)
res2: Int = 23

We could combine the type declaration with the function definition as:


scala> def goo[T : ({type L[X] = X => Int})#L](x: T):Int = x
goo: [T](x: T)(implicit evidence$1: T => Int)Int

//quite cryptic though

A more readable version:


scala> def goo2[T](x: T)(implicit ev: T => Int):Int = x
goo2: [T](x: T)(implicit ev: T => Int)Int

scala> goo2("12")
res8: Int = 12

More generalized version:


scala> def goo[E,T : ({type L[X] = X => E})#L](x: T):E = x
goo: [E, T](x: T)(implicit evidence$1: T => E)E

scala> def goo2[T, E](x: T)(implicit ev: T => E):E = x
goo2: [T, E](x: T)(implicit ev: T => E)E

scala> goo("1000")
res10: String = 1000

scala> goo2("1000")
res11: String = 1000

Asynchronous Distributed Key-Value DB

I have developed a simple-beta version of an synchronous distributed key-value database. It is simple and quite robust and I plan to make it not so-simple and open to other languages (apart from JVM based ones) in the future :). You can download it and start using  from github.

In short (Refer wiki for more info): Suraj Bhirani

  1. It is Asynchronous
  2. You can have a single primary node with any number of secondary nodes which can join or leave arbitrarily. 
  3. It is truly distributed. You just need the address of “Arbiter” using which any node running on any instance can start acting as a node.
  4. Insert (Key-Value pair) can only happen at primary node. Search can be done on any node.
  5. Order and consistency is maintained at all nodes. i.e. given enough time, all replicas settle on the same view.
  6. Each node has a Persistence service which takes backup of every insertion/removal. This is client dependent and can be done as desired using any SQL/NoSQL database or file for that matter.
  7. Tested vigorously. I believe more effort has gone in testing it than building it.

The following are the current limitations. I hope to remove them by the next version

  1. Currently updates are only possible on a dedicated node, the primary replica.
  2. The primary (leader) node cannot fail during the uptime of the system.
  3. Membership is handled reliably by a provided subsystem (see section “The Arbiter”). New Replicas which wish to Join should send aJoin call to Arbiter. But for a replica to leave, the current process is quite tedious by sending a manual Replcias(Set[ActorRef])call to Primary Replica.
  4. The update rate is low, meaning that no incoming requests are rejected due to system overload. In case of rejecting an update the store is left in a possibly inconsistent state which may require a subsequent succeeding write to the same key to repair it.
  5. Clients are expected not to reuse request IDs before the request has been fully processed and responded to.
  6. There is an utter small chance for the database to be in in-consistent state. Please refer the section “Consistency in the case of failed replication or persistence” in the overview.

But in short “It Works!!”

Technologies:

  1. Scala
  2. Akka
  3. SBT as build tool

I highly recommend everyone to have a look at Akka. There are very frameworks that scale as beautifully as Akka does.

Special Thanks to the course “Principles of Reactive Programming” at coursera to help me get started.

Tagged ,

Atomicity from single-cycle compare & swap

There is a wonderful function defined in scala.concurrent.Future trait called onComplete. Here is the declaration:

def onComplete[U](func: Try[T] => U)
              (implicit executor: ExecutionContext): Unit

There is one key-point in the documentation which says :

If the future has already been completed, this will either be applied immediately or be scheduled asynchronously.

The reason I call it is wonderful function is its implementation. Roughly speaking doing:

import scala.concurrent._
import ExecutionContext.Implicits.global
val future = future{
Thread.sleep(1000)
"Successful"
}
future onComplete{
case Success(i) => println(i)
case Failure(i) => println(i)
}

Line 3 dispatches the method to a thread and returns a Future. At line 7 we start an asynchronous call which says – When the task is completed, depending upon the return value, please print it.

On reading the documentation, it  might suggest that it creates a new java.lang.Thread as soon as line-7 is executed. This Thread waits for “future” to complete and then print the value. But say if the thread is from Executors.newCachedThreadPool then this might also mean that for every call to onComplete, a separate Thread is responsible to execute the above mentioned. A Disaster!!

But its inventors were smart. What they instead did was to have an internal list which stores all functions to be called once the Future completes. Once it completes, it calls back all those registered functions itself or using ExecutorService. This way no extra Thread has to be creates or blocked. What is interesting is that they did not use any internal list to store the list of callback functions.


private volatile Object _ref;
final static long _refoffset;

protected final boolean updateState(Object oldState, Object newState) {
return Unsafe.instance.compareAndSwapObject(this, _refoffset, oldState, newState);
 }

Now calling onComplete does below. It replaces previous pointer with a new list containing previous list and a new addition. On calling multiple times, it keeps adding to the previous list.

updateState(listeners, runnable :: listeners)

function tryComplete of CallbackRunnable is called once the future isCompleted. Now is when all the above listeners are to be executed which are dispatched to the ExecutorService.

def tryComplete(value: Try[T]): Boolean = {
  val resolved = resolveTry(value)
   (try {
     @tailrec
     def tryComplete(v: Try[T]): List[CallbackRunnable[T]] = {
      getState match {
        case raw: List[_] =>
        val cur = raw.asInstanceOf[List[CallbackRunnable[T]]]
        if (updateState(cur, v)) cur else tryComplete(v)
            case _ => null
      }
    }
   tryComplete(resolved)
  } finally {
    synchronized { notifyAll() } //Notify any evil blockers
  }) match {
     case null => false
     case rs if rs.isEmpty => true
     case rs => rs.foreach(r => r.executeWithValue(resolved)); true
   }
 }
def onComplete[U](func: Try[T] => U)(implicit executor: ExecutionContext): Unit = {
 val preparedEC = executor.prepare
 val runnable = new CallbackRunnable[T](preparedEC, func)

  @tailrec //Tries to add the callback, if already completed, it dispatches the callback to be executed
  def dispatchOrAddCallback(): Unit =
   getState match {
    case r: Try[_] => runnable.executeWithValue(r.asInstanceOf[Try[T]])
    case listeners: List[_] => if (updateState(listeners, runnable :: listeners)) () else dispatchOrAddCallback()
   }
  dispatchOrAddCallback()
  }
 }

What is nice about the above is how they have used a single cycle processor call – “Unsafe.instance.compareAndSwapObject” to achieve synchronization. At line-30 recursive calls are made to swap it. Say if some other thread at the exact moment did not change the reference (i.e. add a future), then in the first recursive call, the operation succeeds. Say if someone did, then one more recursive call is made where it tries again. This is a superb optimization where without using any synchronization, it is able to do both “Memory Visibility” and “Atomicity”.

Here is lazy initialization (github link) implementation in Java inspired from Scala’s lazy val implementation. It uses similar technique to guarantee atomicity with memory-visibility.

Tagged

Issues with Java 7 on Mac OS

At my work place, I was involved in building a desktop application for Mac (priority) & Windows (lesser priority). So for about 5 months, I did all my development work on Mac and now I think I have a good feel of the OS.

So the application was built using  JavaFX. There were native calls to be made to some C libraries, and I used awesome JNA to do it. But in the whole, I had a terrible time working with JVM on Mac and below is a list of why it is such a pain:

 

Some History:

Till Snow Leopard (not sure), Java came per-installed on Mac. Apple had its own bindings to HotSpot VM. But after Java 7, Apple decided to stop porting JDK and Oracle provided the built. And here starts, half the problems:

(In no particular order)

  • Firstly it is excruciating working with Java 7 based desktop applications . Do not even try it on Retina display; For the font.  It is horrible and blurry. I use Monospaced 13 on eclipse and it is pathetic. And I dont think any more text can ever describe how blurry and pathetic it is.

It is a registered bug which will only get fixed by Java 8 (What!!). It seems the behavior is due to the fact that Oracle built does not support HiDPI. A quick working solution is to use Apple Java 6 to run your applications if you can . For eclipse and netbeans, here.

  • A lot of apple libraries such as Quick Time and others are 32-bit builds. The canon sdk, I was working with only has a 32-bit build available. And only a 32-bit JVM can do native calls with 32-bit libraries. Apple Java 6 is a 64-bit VM and can be run in 32-bit mode using `-d32` jvm argument.

The issue: Java 7 no longer supports `-d32` option. Which means you cannot use it to call 32-bit native libraries. And this breaks half the code. Hence, I simply cannot use Java 7 for development.

  • It is buggy. Our client was Norwegian. Obviously, he had a lot of files named in Norwegian. The issue: Java 7 could not detect files that were named with non-ascii characters (norwegian). (This was caught at the client side and believe me it hurts when you see this happen).  It was a bug which was solved but the damage was done.

And then it was corrected. But during testing, I realized that the DirectoryChooser of javaFX did detect them, and on user selection, weirdly returns null instead of file. I have submitted the bug. But again it was painful telling the client not to use files/directories with Norwegian characters. And I had no time building a custom component.

With all the above, it was still important to use Java 7. We wanted to build a cross-platform application and had no developer resources to write them separately for each OS. This made C/C++ out of picture. We did not want to use Adobe Flex (license & future maintenance issue). Not python, as no one in team knew it and we all were comfortable with Java.

We built on Java 7 because:

  • Apple does not give license to redistribute VM along with application. But my company people wanted to port it along. Hence Java 6 cannot be used.
  • We wanted to use JavaFX (awesome MVC UI toolkit) and not Swing. JavaFX looked promising (which actually is) and hence the decision was taken. But unlike windows, JavaFX build is only available from Java 7 on Mac. Hence we are stuck with it now.

The only positive was this small issue, which worked well on windows and not on Mac.

 

PS: There are a lot more and will keep adding up the list as time permits.

Tagged ,