A Life of Software Development: The Year at Harvard and Parallel Programming
1991-1992
I was now a regular member of the faculty and was concentrating on my research. I was still teaching my Programming course and was thinking of introducing object-oriented concepts to replace the structured ones. Since I had become a professor at my own university, there was pressure to spend time at an external university and gain the experience I had missed by not doing my graduate work at a different institution.
My background was in simulation, so I started looking for a research post involving simulation. Fortunately, my Ph.D. advisor was a graduate of Harvard University, and he contacted his own Ph.D. advisor, who said he needed somebody to work on simulation, in a new area called the Standard Clock Method. After a couple of e-mail exchanges to understand what the task was about, I received a letter of acceptance from Harvard to work for one year as a Postgraduate Researcher at the School of Applied Sciences.
The research team I was joining was run by Professor Yu-Chi Ho, a very prominent Asian-American researcher who had once co-authored a paper with the famous Prof. Kalman and was one of the best-known researchers in the area of Discrete-Event Dynamic Systems. The team consisted of several Chinese doctoral students, an Indian, a Greek, an American and myself.
Prof. Ho wanted me to work in the area of Standard Clock simulation. This was a new technique that exploited the properties of certain simulation problems to harness the power of massively parallel computers. A typical simulation study consists of running many variations of a simulation model to find out how different parameters influence the outcome. Since simulation models are not analytical, there is no easy theoretical solution; results are statistical and can only be obtained by running many simulations.
Harvard had a massively parallel computer built by MasPar, a newly established company specializing in massively parallel machines based on the Single-Instruction, Multiple-Data (SIMD) paradigm (the company would be defunct in 1999). Its first machine, the MP-1, shipped in 1990, and Harvard was one of the first users. The MP-1 had 1024 processors plus a central control unit (usually driven by another computer acting as the front end). The SIMD paradigm was based on data-level parallelism: the machine ran the same instruction on all processors at the same time, with each processor working on its own data item. It was programmed in a C-like language called MPL (MasPar Programming Language). The main difference from C was in the data structures, which could be declared as plural and thus hold a vector of values, one per processor, instead of a single value. If you added two plural variables with the + operator, the machine performed all 1024 separate additions, one on each processor, in a single instruction.
int plural num;
would declare an integer variable called num that would always have the same value on all processors.
plural float plural x;
would define a floating-point variable called x that could have a different value on each processor (hence the doubled plural keyword).
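To the best of my recollection, a fragment of MPL looked something like the sketch below; the variable names are made up and the exact syntax may differ slightly from what I remember, but it shows how a single + on plural variables drove all 1024 processors at once.

plural float a;      /* a copy of a on each of the 1024 processors */
plural float b;      /* likewise for b */
plural float sum;    /* will hold 1024 separate results */

sum = a + b;         /* one SIMD instruction: every processor adds its own a and b */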
In the way I used the machine, there was no communication between the individual processors; the central unit passed data to, and collected results from, each processor.
Memory allocation was also a problem, since any allocation would take place on all processors at once, giving every processor the same amount of memory. If different simulations required different amounts of memory, this had to be managed somehow. I solved the problem by managing memory myself, centrally: I pre-allocated memory on all processors, whether they needed it yet or not, and handed pieces of it to a processor only when it was needed.
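In plain C terms, the idea was roughly the following sketch; the pool size, names and layout are made up for illustration and this is a simplification of what I actually wrote, but it shows a fixed, pre-allocated pool being handed out slice by slice.

#include <stddef.h>

#define POOL_BYTES 65536              /* assumed fixed budget, set aside on every processor up front */

static char   pool[POOL_BYTES];       /* pre-allocated once, whether it is needed yet or not */
static size_t used = 0;               /* how much of the pool has already been handed out */

/* Hand out the next 'size' bytes of the pool, or NULL when it is exhausted. */
static void *pool_alloc(size_t size)
{
    if (used + size > POOL_BYTES)
        return NULL;
    void *p = &pool[used];
    used += size;
    return p;
}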
The algorithm I was using exploited the property of certain probability distributions that they can be scaled linearly, in order to run many variants of the same simulation model that differed only slightly in one parameter. A common stream of events was offered to all processors; each processor accepted some of those events and ignored others, so that I ended up with 1024 parallel simulations with different outcomes.
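A minimal sketch of that accept/reject step, written here in ordinary C rather than MPL, might look like the following; the rates, names and the simple event counter are illustrative assumptions rather than Prof. Ho's actual formulation, but the principle is the same: one common clock draw is offered to every variant, and variant i accepts it with a probability proportional to its own, slightly different, parameter.

#include <stdlib.h>

#define NSIM 1024                     /* one simulation variant per processor */

static double rate[NSIM];             /* each variant's event rate, all <= MAX_RATE */
static double MAX_RATE = 1.0;         /* the dominating rate of the common "standard clock" */
static long   accepted[NSIM];         /* events accepted by each variant so far */

/* One tick of the common clock. On the MP-1 the loop body would execute
   on all processors simultaneously instead of in a loop. */
static void standard_clock_tick(void)
{
    double u = (double)rand() / RAND_MAX;   /* the same draw is offered to every variant */
    for (int i = 0; i < NSIM; i++) {
        if (u < rate[i] / MAX_RATE)
            accepted[i]++;                  /* variant i accepts this event */
        /* otherwise variant i simply ignores the event */
    }
}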
I connected to the MasPar front end (a Sun workstation) remotely from my own Sun workstation; this was before Solaris. I was getting more and more used to Unix, having previously used HP's variant, HP-UX. The same workstation also served for e-mail and searching. At that time the popular search tool was something called Gopher, an essentially text-based, menu-driven tool that looked nothing like the search tools we have today in our browsers. The World Wide Web was in its infancy, the first popular browser (NCSA Mosaic) would only appear two years later, and the founding of Google was still seven years in the future.
I had not really used C before I started using MasPar C. In a sense that was good, since I would go directly to C++ afterwards and would not pick up the habits encouraged by the deficiencies of plain C. MasPar C was of course different, since it was geared towards parallel data structures. I was able to implement the Standard Clock algorithm and run many thousands of simulations on the MasPar MP-1. Along the way I also found several bugs and helped the MasPar team fix them.
I also had a chance to run the software on a larger MasPar machine with 4096 processors. Execution times plateaued almost immediately: beyond a handful of processors, the run took essentially the same time regardless of how many parallel simulations were being executed.
So this year at Harvard, apart from its contribution to my academic career, gave me my first experience with the Internet, the World Wide Web and other features that are part of today's computing environment. After a year in Cambridge, one of the most European cities in the U.S., I went back to my university and to teaching.