How not to port software, part 2 23 February 2007

Posted by Matthew Fulmer in AME Education, software.

In my previous post, I described my experience trying to port a
Windows-only C++ program first to the cross-platform glib toolkit, then
to the Squeak environment. I gave up because I thought it would take too
long, and because I was not willing to ask for help using Smallapack, a
Squeak/VisualWorks matrix library. Now I have finished the short project
assigned to me and can look at what happened in retrospect.

The code was even harder to work with than I had imagined it to be. In
spite of being written in C++, the program has no modularity whatsoever,
since all the work is done in a single instance of a single class, and a
mighty fat class it is. The class contains three separate, but heavily
related, algorithms to analyze a stream of data using overlapping window
analysis. So, from the very beginning, there is extreme code
duplication, with up to four functions playing nearly the same role; only
three of the functions were actually shared by all three algorithms.
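
To illustrate what sharing the windowing code could look like, here is a
minimal C++ sketch. It is not taken from the program described above:
analyze_windows, window_mean, and window_max are hypothetical names, and the
two per-window functions are simple stand-ins for the real algorithms, which
are not shown in this post. The assumption is that the sliding loop can be
written once, with each algorithm supplying only its per-window computation.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Per-window computation: receives a pointer to the first sample of the
// window and the number of samples in it, and returns one result.
using WindowFn = std::function<double(const double*, std::size_t)>;

// Hypothetical helper: slide a window of `size` samples over `data`,
// advancing by `step` samples each time (step < size gives overlapping
// windows), and apply `analyze` to every full window.
std::vector<double> analyze_windows(const std::vector<double>& data,
                                    std::size_t size, std::size_t step,
                                    const WindowFn& analyze) {
    std::vector<double> results;
    if (size == 0 || step == 0) {
        return results;  // nothing sensible to do; avoids an infinite loop
    }
    for (std::size_t start = 0; start + size <= data.size(); start += step) {
        results.push_back(analyze(data.data() + start, size));
    }
    return results;
}

// Stand-in "algorithms" (assumptions, not the real analyses).
double window_mean(const double* w, std::size_t n) {
    return std::accumulate(w, w + n, 0.0) / static_cast<double>(n);
}

double window_max(const double* w, std::size_t n) {
    return *std::max_element(w, w + n);
}
```

With this shape, calling analyze_windows(data, 4, 2, window_mean) and
analyze_windows(data, 4, 2, window_max) reuses the same loop for both
analyses, so a third algorithm would only need one more small function.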

But what of it? This code is a maintenance nightmare, but will it be
maintained? No! So no maintenance means no nightmare. Harvey, my
adviser, said that he had the same problem I had when he first started
making disposable systems: He had to learn how to write bad code.

So what did I do? I just gutted the functions that did things I did not
need and made them do what I wanted instead. Sure, I
accidentally deleted a bit of key functionality, but I could just look at
an old version of the code in Subversion to put it back in, and then I
had it: an ugly working program, built from another working program,
neither worse than the other. Each separate functional unit is a
separate program in a separate Subversion branch, never to be merged
with the trunk. Again, a maintenance nightmare, but no maintenance
means it is all good.

So is my experience applicable in the real world where code lives on?
I hope not. My code has no future. If it did, a refactoring would be
necessary.

So I wonder, is this more of a problem with early-bound, bottom-up
systems like C++ than it is with late-bound, top-down systems like Smalltalk?
I don’t know. I started the port with the goal of portability and
maintainability in mind, and chose Smalltalk because I wanted to see if
it really was easier to use. So was it?

Refactoring was very easy in Squeak, thanks to the refactoring browser.
However, one problem was that I could not find a way to limit the
scope of a method rename to just my own classes. I managed to trash the
image twice by renaming a message globally.

Backtracking in Smalltalk was easier than in Subversion, thanks to the
ability to view all changes to each method rather than only coarse
commits. However, I did have a little trouble importing changes after a
crash; the change viewer seemed to file in a range of changes in a
non-temporal order, then abort because it tried to define a method on a
non-existent class. That is probably easy to fix.

The Smalltalk debugger actually works. The MSVC debugger just crashed. I
had to resort to printf’s and debugging macros. C++ got totally owned in
this respect.

Squeak does not have a stable matrix library. I sincerely wish I had
been able to use Smallapack; the only thing that stopped me was the
deadline pressure. It was not the deadline itself, but the pressure of
“just get it working, now” that prevented me from debugging the very
nice Smallapack library. Without that pressure, Smallapack would now be
slightly more stable, and Squeak would have a pre-alpha dataflow framework.

So, what have I learned? The best way to make a deadline is to cheat.
Therefore, I am going to do whatever it takes to make my next project be
free of milestone deadlines, where functionality always trumps code
sharing. This project is open-source in spirit, especially within our
research department, but the focus on the present seems to prevent
the code from having a future, other than as mutant forms of itself.
Luckily, my internship at Intel seems, at least right now, to be free of
milestone deadlines. I have created milestones to track my progress, but
not for scheduling purposes.

How not to port software 7 February 2007

Posted by Matthew Fulmer in Uncategorized.

Three weeks ago, I received an assignment from my research adviser to
add a new, simple motion model to our path tracing program. Quite a
simple assignment. However, I do not like the code base for several
reasons:

  1. It only works under Visual Studio on Windows
  2. It is monolithic
  3. It is not modular

So I thought, why not get around to cleaning up this code? I had been
wanting to do it all of last semester. The first order of business was to
get it to work under Linux, as I didn’t have easy access to Windows or
Visual Studio at home. So I figured out how to use Autoconf and
Automake, and had the program building under Cygwin (but not linking) by
the end of the day. I then went home, thinking I could get the rest done
at home on Linux.

So I started examining what was preventing the program from running
anywhere. I found three things:

  1. A Windows-only C++ threading package
  2. A Windows/Mac-only UDP multicast client and server
  3. A possibly non-portable Mersenne Twister RNG

Being the over-confident programmer that I was, I figured I could solve
all these problems by refactoring to use glibmm, which had portable
implementations of all these things.

At the thought of refactoring, I had a bright idea: This would be so
much easier in Squeak Smalltalk! Why don’t I translate the program into
Squeak, then I could do the refactoring really fast! Also, I could do
the required visualizations in eToys and Morphic rather than learn how
to program in Max/MSP. So I started systematically translating the
relevant parts of the C++ code into Smalltalk.

By this time, I was about a week into the assignment. I had read enough
of the code to be able to decipher what was going on. Since I had this
understanding, why not make a simple, extensible framework that does the
same thing! So I installed the Smallapack matrix library for Squeak.
Then I set about creating the framework that does what the C++ code did
as a bunch of nested loops.

Two weeks go by, and I finally have a framework to send in data, process
it using the matrix library, and draw a simple visualization. So I try
running it. Everything breaks. It seems that Smallapack for Squeak is
still in an early version, and diagonal matrices are unusably broken. So
I fix that, and find that I still have many of the calculations
wrong, and that matrix inversion does not quite work in Smallapack. I do a
bit of Smallapack debugging, but by this time I am tired of fixing
code and just want to get it working. But also by this time, I have
spent so much time on this project that I have a few pending assignments
I need to work on.

About this time, Intel gave me a new Windows laptop. Well, that kind of
invalidated the entire reason I had been porting this software in the
first place. So I stopped working on this software for a week and worked
on the other projects I had to do.

That is what I have been up to for the past four weeks. I have the new
motion models in the Squeak version, but that version is broken, and
does not yet integrate into the rest of Smallab, although that would be
pretty easy to add.

So what do I have to show? I have a little bit of broken code with a
fraction of the functionality I started with. Sure, it is cleaner and
a bit easier to work with, but I don’t see an easy way to get it
functional without debugging Smallapack.

I now see that I went about the problem completely wrong:

  • I tried to refactor before I even solved the problem
  • I tried to change the build system while refactoring
  • I tried to change the platform out from under the system while
    refactoring
  • I tried to translate the program just to make the refactoring I hadn’t
    even started yet easier
  • I tried to rewrite the program on an unstable platform,
    with no commitment to fix the platform
  • I tried to fix the program with no way to run it or test it until the
    very end. This was my biggest mistake.

What I should have done is swallow my arrogance, stay at the lab, and
use Visual Studio to add the new model to the C++ code base. I probably
would then have had a bit of time to go about refactoring the code in a
more leisurely manner. That would have prevented me from trying to do
the entire jump from Windows to glib or Squeak in one step. If I wanted
to port it to Squeak, I should have made the C++ code into a Squeak
plug-in and written some tests for it. Then I should have translated it
method by method, refactoring as I went and rerunning the tests after
each step.
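
The test-first porting approach described above can be sketched in a few
lines of C++. This is only an illustration under stated assumptions:
find_mismatches, legacy_model, and ported_model are hypothetical names, and
the two model functions are trivial stand-ins, not the real motion models.
The idea is to keep the old implementation around as the reference and to
check every newly translated piece against it on the same inputs.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// A model here is just "one number in, one number out" (an assumption made
// to keep the sketch small).
using Model = std::function<double(double)>;

// Run both implementations on every input and return the indices of the
// inputs on which they disagree by more than `tolerance`. The comparison is
// written as !(diff <= tolerance) so that a NaN result counts as a mismatch.
std::vector<std::size_t> find_mismatches(const Model& reference,
                                         const Model& candidate,
                                         const std::vector<double>& inputs,
                                         double tolerance) {
    std::vector<std::size_t> mismatches;
    for (std::size_t i = 0; i < inputs.size(); ++i) {
        double diff = reference(inputs[i]) - candidate(inputs[i]);
        if (diff < 0.0) {
            diff = -diff;
        }
        if (!(diff <= tolerance)) {
            mismatches.push_back(i);
        }
    }
    return mismatches;
}

// Hypothetical stand-ins: the "old" code and a freshly translated version.
double legacy_model(double x) { return 2.0 * x + 1.0; }
double ported_model(double x) { return x + x + 1.0; }
```

An empty result from find_mismatches(legacy_model, ported_model, inputs,
1e-12) means the translated piece still agrees with the old one on those
inputs, which is the signal to move on to the next method.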

Finally, I would like to apologize to my adviser: Harvey, I am sorry I
spent my time taking the long path. I know that you value my time and I
see now that all we need is something that works, so that should be
given top priority. Again, I apologize for not following your advice. I
will most definitely have the new model and visualization working by
next Wednesday.