OpenSource For You

CODE SPORT

In this month’s column, we focus on virtual machines for languages. We will also discuss static and dynamic languages, and how the latter can be supported over virtual machines.

- Sandya Mannarswamy

Over the last couple of columns, we have been looking at concurrency support in the C++11 standard. This month, however, we take a break from that topic. Since virtualisation is LFY’s theme this month, let’s focus on virtual machines for languages: why it makes sense to have a language VM, static and dynamic languages, the different typing capabilities of languages, and how these languages are implemented over virtual machines.

Language VMs

We are all familiar with the C language. We know that a program written in C is compiled by our favourite C compiler to native machine code for our machine. For instance, if you are compiling a C program on an x86 machine, your C compiler, ‘gcc’ for instance, translates your high level C code to x86 instructions. Now, if you want to run the same program on, say, an IBM PowerPC, you can’t just take the executable from your x86 machine and run it directly on the PowerPC machine. (Of course, binary emulators/translators can do just this, but for now, we will assume that there are no binary emulators/translators from x86 to PowerPC.) What you need to do is compile your C program for the PowerPC architecture so that the resulting executable can run on the PowerPC machine. This, of course, requires a C compiler that can generate code for PowerPC.

The drawback with languages like C, which are natively compiled to machine code, is that the resulting binaries are not portable. If you want to run your program on a different architecture, you need to recompile it. Such programs are not ‘Write Once, Run Anywhere’. On the other hand, consider a Java application like HelloWorld.java. Once you use the javac compiler to create the class file corresponding to your application, say ‘HelloWorld.class’, you can take it virtually anywhere and run it on any platform on which Java is supported. Your application, consisting of Java byte codes, is portable because these byte codes do not run on the hardware platform directly. They run on an abstract machine called the Java Virtual Machine (JVM), which abstracts away the details of the underlying hardware. As long as you have a JVM implementation on the target hardware on which you want to run your application, you can run your Java class files there directly. A language like C, in which the application is compiled natively into the target machine’s binary code, trades portability for the additional performance it gains by compiling the application statically for the native platform.

In a native compiler from C to machine language, the lowering from the high level language to machine language typically happens in two or more steps. Broadly, the program in the high level language is first lowered to an intermediate code. This intermediate code is typically independent of the target machine, and various machine independent optimisations are performed on it. The intermediate code is then lowered to a form closer to the target machine code. However, the intermediate code is usually internal to the specific compiler implementation and is not exposed outside it. Also, the application binary interface specification, which governs, among other things, how application code makes calls to the underlying operating system, is platform specific. Therefore, it is not possible to convert C to intermediate code once and then compile that intermediate code to target machine code on whichever platform we want to run. That, on the other hand, is exactly what Java facilitates. The high level Java code is translated to an intermediate representation made up of byte codes. The byte code representation is well defined and exposed externally. The target machine on which the byte codes are expected to execute is the Java Virtual Machine, whose interface definitions and behaviour are well defined by the JVM specification.
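To make the idea of a well-defined intermediate representation concrete, here is a minimal sketch of a stack-based virtual machine in Python. The instruction names (PUSH, ADD, MUL) and the function `run` are hypothetical and far simpler than real JVM byte codes; the only point is that the instruction list is the portable artefact, and the interpreter is the part that has to exist on each platform.

```python
# A toy stack-based virtual machine. The instruction names below are
# hypothetical and far simpler than real JVM byte codes.

def run(program):
    """Execute a list of (opcode, operand) pairs and return the top of stack."""
    stack = []
    for opcode, operand in program:
        if opcode == "PUSH":
            stack.append(operand)
        elif opcode == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == "MUL":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack.pop()


# "Byte code" for the expression (2 + 3) * 4. The same list runs unchanged
# wherever this interpreter runs, regardless of the hardware underneath.
PROGRAM = [
    ("PUSH", 2),
    ("PUSH", 3),
    ("ADD", None),
    ("PUSH", 4),
    ("MUL", None),
]

result = run(PROGRAM)
print(result)
```

Porting this toy ‘language’ to a new machine means porting `run`, not recompiling every program written for it.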

Given what we now know, we can have abstract virtual machines built for different programming languages. We can classify languages as static or dynamic, depending on whether they are converted to target machine code statically, or translated to target machine code dynamically at runtime while the application is actually being executed. Languages like C, C++ and Fortran are statically compiled, whereas languages such as Java, Python or Lisp are typically dynamically translated. Note that I used the term ‘dynamically translated’ instead of ‘dynamically compiled’ in the previous sentence. It is possible to use an interpreter, a compiler or a combination of the two at runtime to translate a dynamic language to target machine code. For instance, a Java Virtual Machine can start off by interpreting the byte codes of methods and compile only those methods that are frequently called in the application. This is the technique adopted by Oracle’s HotSpot JVM, which uses a combination of interpreter and compiler. The name ‘HotSpot’ refers to compiling only the hot, that is, frequently executed, methods at runtime. Note that irrespective of whether we use interpretation or dynamic compilation, the cost of translating to native machine code is incurred at runtime and adds to the application’s execution time. This is where traditional statically compiled binaries score over dynamically translated applications.
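The interpret-first, compile-when-hot strategy can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `HOT_THRESHOLD`, `Method`, `interpret` and `compile_program` are hypothetical names, the instruction set is a toy one, and ‘compiled’ here means translated once into a Python function rather than into native machine code as a real JVM would do.

```python
# Toy mixed-mode execution: interpret a method until it becomes "hot",
# then switch to a compiled version. HOT_THRESHOLD and all names here are
# hypothetical; "compiled" means a Python function, not native code.

HOT_THRESHOLD = 3


def interpret(program, x):
    """Slow path: walk the instruction list on every call."""
    stack = [x]
    for opcode, operand in program:
        if opcode == "PUSH":
            stack.append(operand)
        elif opcode == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif opcode == "MUL":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack.pop()


def compile_program(program):
    """Fast path: translate the instruction list once into a Python function."""
    exprs = ["x"]
    for opcode, operand in program:
        if opcode == "PUSH":
            exprs.append(repr(operand))
        else:
            right, left = exprs.pop(), exprs.pop()
            symbol = {"ADD": "+", "MUL": "*"}[opcode]
            exprs.append(f"({left} {symbol} {right})")
    return eval(f"lambda x: {exprs.pop()}")


class Method:
    """A method that starts out interpreted and is compiled once it is hot."""

    def __init__(self, program):
        self.program = program
        self.calls = 0          # number of interpreted calls so far
        self.compiled = None    # set once the method has become hot

    def __call__(self, x):
        if self.compiled is not None:
            return self.compiled(x)
        self.calls += 1
        if self.calls >= HOT_THRESHOLD:
            # The translation cost is paid here, at runtime, exactly once.
            self.compiled = compile_program(self.program)
        return interpret(self.program, x)


# Byte code for x * 2 + 1, with x already on the stack.
DOUBLE_PLUS_ONE = [("PUSH", 2), ("MUL", None), ("PUSH", 1), ("ADD", None)]

method = Method(DOUBLE_PLUS_ONE)
results = [method(5) for _ in range(5)]
print(results, method.calls)
```

The first three calls go through the interpreter; from the fourth call onwards the translated function is used, so the one-off translation cost is spent only on code that is called often enough to repay it.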

Statically compiled languages do not incur any runtime cost for compilation to machine code. However, dynamically translated languages have benefits of their own. For instance, a dynamically translated language offers greater flexibility, since it can determine properties or generate new functionality dynamically, based on runtime data. As an example, it is possible to delay type checking to runtime instead of having to do it statically. That said, a dynamically translated language can also be statically type checked. For instance, Java is a statically type checked language, just as C is. Languages like JavaScript, Python, Ruby and Lisp, on the other hand, are dynamically typed in the sense that the majority of their type checking is performed at runtime. Note, however, that there have been recent proposals to support dynamic typing on the Java Virtual Machine. The reason is that implementations of many popular dynamically typed languages, such as Python, Ruby and JavaScript, can emit Java byte codes, and allowing them to run on the JVM facilitates easier interaction. More details on dynamic type support in the JVM can be found in JSR-292, available at http://jcp.org/en/jsr/detail?id=292. Also, note that some dynamically typed languages optionally provide static type checking where it is possible.
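A small Python sketch shows what delaying type checking to runtime looks like in practice. The function `total_length` is a hypothetical example: nothing about the types of its arguments is checked when it is defined, and a mismatch surfaces only when the offending operation actually executes.

```python
# Dynamic type checking: types belong to values, and an operation is
# checked only at the moment it runs.

def total_length(items):
    """Sum the lengths of the items; any object that supports len() will do."""
    return sum(len(item) for item in items)


# A string and a list both support len(), so this call succeeds.
mixed_ok = total_length(["ab", [1, 2, 3]])
print(mixed_ok)

# An int does not support len(). The mistake is reported only when the
# call is executed, not when the function was defined.
try:
    total_length(["ab", 7])
    outcome = "no error"
except TypeError as exc:
    outcome = f"TypeError at runtime: {exc}"

print(outcome)
```

A statically type checked language would typically report a comparable mismatch at compile time, before the program ever runs.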

Another dimension to consider is whether a language is strongly typed or weakly typed. In a strongly typed language, the type associated with a block of memory is fixed. To mix operands of different types in an operation, the programmer has to request an explicit type conversion by means of casting, and no type conversion occurs implicitly. For instance, Java is a strongly typed language, which avoids errors caused by incorrect implicit type conversions. A weakly typed language, on the other hand, is one that allows implicit conversions between types. C is often described as a weakly typed language. Strong typing combines with either kind of type checking: C# is statically typed and strongly typed, whereas Python is dynamically typed and strongly typed. Equally, there are dynamically typed languages that are weakly typed, such as Perl or PHP. So strong/weak typing is a property orthogonal to static or dynamic typing.
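Python makes a convenient illustration of a language that is dynamically typed yet strongly typed. In the sketch below, `add_implicit` and `add_explicit` are hypothetical helper names; the point is that mixing a string and an integer is rejected unless the programmer states which conversion is intended.

```python
# Python is dynamically typed yet strongly typed: it will not silently
# convert between str and int; the programmer has to convert explicitly.

def add_implicit(text, number):
    return text + number          # raises TypeError for str + int


def add_explicit(text, number):
    return int(text) + number     # explicit conversion chosen by the programmer


try:
    add_implicit("1", 2)
    implicit_result = "converted silently"
except TypeError:
    implicit_result = "rejected"

print(implicit_result)            # the str + int mix is refused
print(add_explicit("1", 2))       # numeric addition after int("1")
print(str(1) + "2")               # string concatenation after str(1)
```

A weakly typed language would instead pick one of the two conversions on the programmer’s behalf, which is convenient until it picks the wrong one.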

Another property associated with languages supported over virtual machines is that many of them have support for automatic memory management, a.k.a. garbage collection. In a language like C, the programmer needs to manually free dynamically allocated memory. On the other hand, in a dynamic language like Java, the VM provides automatic memory management facilities. A garbage collector is an important part of a virtual machine as it supports automatic memory management, and the performance of the VM depends heavily on the garbage collector’s performance. Another major characteristic of a language virtual machine is the instruction set exposed by the VM, and its interactions with the underlying OS and hardware. We will discuss these in next month’s column.
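Automatic memory management can be observed from inside a managed language. The following sketch assumes CPython and uses its standard `gc` and `weakref` modules; `Node` and `make_cycle` are hypothetical names. Two objects that refer to each other become unreachable, no code frees them explicitly, and the collector reclaims them.

```python
import gc
import weakref


class Node:
    """A small object that can point at another Node."""

    def __init__(self, name):
        self.name = name
        self.partner = None


def make_cycle():
    """Build two nodes that reference each other, then drop both names."""
    a = Node("a")
    b = Node("b")
    a.partner = b
    b.partner = a
    # A weak reference lets us observe the object without keeping it alive.
    return weakref.ref(a)


gc.disable()                      # avoid an automatic collection mid-demo
probe = make_cycle()
alive_before = probe() is not None
gc.collect()                      # ask the collector to reclaim garbage now
alive_after = probe() is not None
gc.enable()

print(alive_before, alive_after)
```

On CPython the cycle survives until the collector runs, after which the weak reference no longer resolves; other Python implementations may reclaim the objects at a different moment. The equivalent C program would have to call free() on both nodes itself.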

Meanwhile, here are a couple of takeaway questions for our readers. We have been talking a lot about the Java Virtual Machine, but we have not mentioned anything about a virtual machine for C. Are there any popular virtual machines for C? If not, why is it so? The second question is quite straightforward. Is it possible to implement a Java Virtual Machine in Java itself?

This month’s ‘must-read book’ suggestion comes from one of our readers, Karthik B, who recommends the book ‘The Pragmatic Programmer: From Journeyman to Master’ by Andrew Hunt and David Thomas. As Karthik says, “This book discusses the various issues and concerns of the programmer, and provides simple techniques for efficient programming. No matter which level of programming expertise you are at, you will benefit from this book.” Thank you, Karthik, for the recommendation. I have not read the book yet, but plan to do so in the coming weeks.

If you have a favourite programming book/article that you think is a must-read for every programmer, please do send me a note with the book’s name, and a short writeup on why you think it is useful so I can mention it in the column. This would help many readers who want to improve their coding skills.

If you have any favourite programming puzzles that you would like to discuss on this forum, please send them to me, along with your solutions and feedback, at sandyasm_at_yahoo_dot_com. Till we meet again next month, happy programming and here’s wishing you the very best!
