Software Engineering Involves a Lot More than Programming
In this article, the author talks about the important lessons he learnt while working on a college project—lessons that hold good even in the real world.
During my college days, a few of my classmates and I decided to take up a challenging project. Five of us had time to hack some code: two months of semester holidays to make use of, and enough enthusiasm to take up a project that was unusual and difficult. During those days, in the late 90s, Java was getting very popular. So we decided to implement a simple compiler and virtual machine for Java as a group project.
Though we were aspiring to accomplish a lot, the chances of completing the project were slim. We had little experience writing code for a large project; we lacked coordination and did not know how to plan; we did not know what configuration management was; we knew how to use only primitive tools (like text editors) and we debugged manually by tracing the code. But there was an even more formidable challenge: none of us knew Java!
However, our objective was very clear: to implement a compiler and virtual machine for Java. The Java language specification and the virtual machine specification were (and are) available for free and are well written. A book was also available describing the internals of Java. Armed with these resources, we divided the project tasks into five parts:
Java parser (that would use recursive descent parsing since it was the easiest to implement)
Assembler (our Java compiler implementation would generate Java byte codes that would be ‘assembled’ to generate a Java ‘.class’ file)
Loader (the loader would dissect the ‘.class’ file and create runtime data structures; it would also pass the byte code instructions to the interpreter for execution)
Interpreter (the stack-based interpreter would actually execute the byte code instructions, which users of the VM would see as the ‘execution’ of the input program)
Utilities (meant for common utility code for the compiler and VM)
The only thing we planned was to complete each of these tasks in two months (because that was the duration of our semester holidays)! To avoid ‘wasting time’ in discussions, we limited our communications and found our own ways to develop components independently. For instance, to implement the assembler, we would take the output of the Java disassembler (i.e., the javap tool shipped as part of the JDK) and then assemble it to create ‘.class’ files.
Independently, we struggled a lot to complete our respective modules. For instance, my task was to write a Java parser. I ran into so many problems writing the code, and spent 80 per cent or more of the time debugging the code rather than writing any useful new code.
There were some small disasters as well. We did not use any version control tools; instead, we used the date or keywords like ‘latest’ in folder names to keep track of versions of the source code. We often lost source code and had to write it again. For instance, we had copies of the zip files on many machines and floppy disks, so the label ‘latest’ was never reliable, and we had to redo work whenever we picked up the wrong version.
At the end of two months, we were quite happy that we had completed our tasks. We thought it was a simple task to put together our parts and get the code working, but we got a rude shock: nothing worked. The interfaces were incompatible. We had made many assumptions about each other's parts, which led to build failures (and later crashes). We thought it would take a week or two to get the final working product. But it took two more months with late night sessions and considerable reworking to get it going. We realised we spent more time integrating components than developing the independent components. One part went completely unused: the utilities component. Each of us had written our own helper classes and methods for our parts, so the shared utilities were never called!
It was thrilling to see the final version of the compiler and interpreter working, and running as expected—starting from a simple ‘Hello world’ application to large programs spanning hundreds of lines of code.
We went our different ways after completing the project. But this experience gave us confidence and led to opportunities that we didn't expect at all when we started it. For instance, based on this project experience, I landed my first job as an engineer in a C++ compiler team in an MNC. Two others are now successful project managers, and the other two run their own small businesses.
Learning from experience
If I reflect upon our experience, there are many interesting things to learn from it.
Why did we divide our work into five modules? Because we were five people who had to work as independently as possible!
Later I learnt about Conway’s Law, which is about this effect on a larger scale: “Organisations which design systems ... are constrained to produce designs which are copies of the communication structures of these organisations.”
We were using concepts such as mocking that are popular today without knowing what they were. With this, I learnt that we can successfully create and use various methods, techniques, or solution approaches in projects without knowing their names.
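As an illustration of the idea, here is a minimal sketch in Python of developing one component against a stand-in for another. It is not our original code: the `Interpreter` class, its `load` call and the tuple-based instruction format are hypothetical names invented for this example. Only `unittest.mock.Mock` is a real standard-library facility.

```python
from unittest.mock import Mock


class Interpreter:
    """Hypothetical stack-based interpreter that depends on a loader object."""

    def __init__(self, loader):
        self.loader = loader

    def run(self, class_name):
        stack = []
        # The loader is expected to return a list of instruction tuples.
        for op, *args in self.loader.load(class_name):
            if op == "PUSH":
                stack.append(args[0])
            elif op == "ADD":
                right = stack.pop()
                left = stack.pop()
                stack.append(left + right)
            else:
                raise ValueError(f"unknown instruction {op!r}")
        return stack[-1] if stack else None


def make_fake_loader(instructions):
    """Build a mock loader that returns canned instructions for any class."""
    loader = Mock()
    loader.load.return_value = instructions
    return loader


if __name__ == "__main__":
    fake = make_fake_loader([("PUSH", 2), ("PUSH", 3), ("ADD",)])
    print(Interpreter(fake).run("Hello"))
    # The mock also records how it was called.
    fake.load.assert_called_once_with("Hello")
```

The interpreter can be exercised this way before a real loader exists. The catch, as we found out during integration, is that a stand-in only helps if both sides agree on what the real interface will look like.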
We started with the simplest thing that worked. For instance, I used a recursive descent parser, because it was the easiest to implement and debug. Using compiler generators would have complicated implementation, and more importantly, debugging would have been a nightmare. Similarly, we used a simple stack-based interpreter instead of something advanced like Just-in-Time compilation. We used reference counting as the implementation for GC, since that is the simplest to implement. As the first cut, doing the simplest thing that works is the best way to get going. After working in real-world projects, I realised most engineers try complex solutions first and wonder why they fail. I could also relate to Brooks’s famous second-system effect: “The tendency of small, elegant, and successful systems to have elephantine, feature-laden monstrosities as their successors due to inflated expectations.”
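To show how little machinery these two choices need, here is a minimal sketch in Python, assuming a toy language of integer arithmetic rather than Java. The grammar, the instruction names (`PUSH`, `ADD`, `SUB`, `MUL`, `DIV`) and all function names are invented for this illustration and are not taken from our project or from the JVM specification.

```python
import re

TOKEN = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(text):
    """Split the input into number and operator tokens."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    """Recursive descent parser: one method per grammar rule.

    expr   := term (('+' | '-') term)*
    term   := factor (('*' | '/') factor)*
    factor := NUMBER | '(' expr ')'
    """

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0
        self.code = []  # emitted stack-machine instructions

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            self.term()
            self.code.append(("ADD",) if op == "+" else ("SUB",))

    def term(self):
        self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            self.factor()
            self.code.append(("MUL",) if op == "*" else ("DIV",))

    def factor(self):
        token = self.advance()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token.isdigit():
            self.code.append(("PUSH", int(token)))
        elif token == "(":
            self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
        else:
            raise SyntaxError(f"unexpected token {token!r}")


def compile_expr(text):
    """Parse an expression and return its instruction list."""
    parser = Parser(tokenize(text))
    parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return parser.code


def run(code):
    """Stack-based interpreter: operands are popped, results pushed back."""
    stack = []
    for instruction in code:
        if instruction[0] == "PUSH":
            stack.append(instruction[1])
            continue
        right = stack.pop()
        left = stack.pop()
        if instruction[0] == "ADD":
            stack.append(left + right)
        elif instruction[0] == "SUB":
            stack.append(left - right)
        elif instruction[0] == "MUL":
            stack.append(left * right)
        elif instruction[0] == "DIV":
            stack.append(left // right)  # integer division in this sketch
    return stack.pop()


if __name__ == "__main__":
    print(run(compile_expr("(1 + 2) * 3")))
```

Each grammar rule maps to one method, so a parse error can be traced by stepping through ordinary function calls. That was the main reason the approach was easy to debug with the primitive tools we had.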
Requirements play a key role in the success of the project. In our case, we had a well-written and stable specification and that helped complete our implementation.
Without realising it, we had followed the Waterfall model for implementation. I am a fan of iterative methods like Agile, but it is important to know when and where Agile is suitable. Since requirements were known and remained stable, the Waterfall model worked well for this project and was a good choice.
At its core, programming is a fun-filled problem solving activity. The happiness and excitement I experience when I successfully solve a problem is something I find very hard to describe to others. That adrenaline rush when you have a complex project working is something that must be experienced.
We learned many hard lessons as well. We had spread ourselves too thin by taking on the task of implementing both the compiler and the VM. Functionality completion was not a major problem, but the quality of the code was really bad. Had we taken up only one of these two tasks, we could have achieved both functionality and a quality product.
Good software configuration management practices are essential for a software project. The success of a project or product is not determined solely by the SCM practices, but bad SCM practices will certainly lead to failure.
The quality of the code was poor and this fact was clear to all of us who wrote the code; we didn't have to run the application to know that the compiler or the VM could crash at any time. So, quality of the code (and of design, by extension) is the most important indicator of quality. (But if we were to start over again, the only way to improve the quality would be to improve the process of developing the code.) Later, when working in large projects, I was quite surprised (and dismayed) to see project managers looking at testing results and defect trends as the main indicator of quality. Testing or defect counts are, of course, some of the indicators of quality; but it is the quality of the code (and design) that is the main determining factor of software quality.
Our effort and time estimates were unrealistic. I later learned about Hofstadter’s Law: “It always takes longer than you expect,” which is especially applicable to software engineering projects.
We made a costly mistake by thinking that communicating and interacting were a ‘waste of time’. Aspects such as effective communication and teamwork constitute the core of effective software engineering. In fact, I learnt later that our ability to communicate and work as part of a team is more important than our technical skills and abilities, especially when working as part of a large software development team.
I can sum up our experience with this statement: programming is not software engineering. Programming is essentially a personal activity. Programming involves solving a problem through coding and testing. Software engineering is a team activity. Programming is just one aspect of software engineering. Software engineering is about creating a software product that could be used, reused and modified by others; it includes a range of activities, from requirements gathering and modelling or prototyping solutions to testing, tracking the project’s progress and status, and applying processes and methods for quality. Software engineering is a structured and disciplined approach towards designing, developing and maintaining high-quality software for the real world. Understanding and applying this distinction between programming and software engineering is the key to successfully developing products or managing projects in the real world.