Three ways to Parallelize a Raytracer 8 July 2008
Posted by Matthew Fulmer in Uncategorized.add a comment
Now, back to my original question: I’ve done a bit of reading on how to parallelize raytracing, especially given the memory constraints of the cell. There are three approaches I’ve discovered so far:
- Distribute the rays: Each node runs a full raytracer on the full scene, and processes a subset of the rays from start to finish
- Distribute the processing: One node handles ray creation, another ray intersection, another shading, another shadows. Every node needs access to different aspects of the entire scene.
- Distribute the scene: Each node handles every ray segment that intersects it’s slice of the scene.
For my purposes, I’m interested in two things:
- How well this will work on the Cell processor, where each node has only 256K of memory, and
- How well the approach maps to the E Concurrency model , which I also call the vat model, below.
Distribute the Rays
The first approach is unquestionably the best when the entire scene fits into local memory of each node, as it is pretty easy on network bandwidth. Using this approach, once the scene is availible to all nodes, the raytracing problem is embarassingly parallel: every ray shooting out of the camera can be processed with zero communication among the nodes. However, if the scene does not fit in memory, the node will have to fetch parts of it and cache them locally.
Distribute the Computation
The second approach is closest to what the Cell was meant to do: act as a stream processor. However, this does not scale up to any number of nodes, nor does it scale down; it requires each node to be developed as part of a stream. This is rather network-intensive; most rays will make at least one pass through every node.
Distribute the Scene
The third is the closest to how vats natively operate. Scenes are usually stored in a tree structure, with the root pointing to a number of children. In this scheme, Many of the child pointers in the tree would be local pointers, but some would be far refs; entire subtrees of the scene graph would be seperated from the root and reside in a seperate vat. This would be very easy to program for.
A raytracer would be started on the vat with the root node of the scene, and as the program progressed, it would send remote messages to the parts of the scene stored in other vats, distributing the computation as a side-effect of sending regular, asynchronous messages to various parts of the scene. This is network intensive; every ray passes through several slices of the scene, and it is also unfair; slices with a light will have to process nearly every ray, while slices in the corner may get only a handful of rays.
E Concurrency model 21 March 2008
Posted by Matthew Fulmer in Uncategorized.4 comments
The E Language has an interesting and very nice concurrency model. It is related to the Croquet/Tweak messaging model. This is an outline of it. I also made an animated demo of how promises work (more…)
Smalltalk debugger realization 8 June 2007
Posted by Matthew Fulmer in Uncategorized.add a comment
As I was working on adding email notification to SqueakSource commits, I had the debugger open to a method with two very interesting objects. I wanted to send a few test messages involving those objects to see what kind of information I could coax out of them. But I couldn’t think of a way to have them both open in a workspace or inspector at the same time! How could I test them out?
Then it hit me. You can edit the method here in the debugger, test them out however I want, then save the messages I want into this method, and erase the ones I don’t. It is more than what I wanted, and I had to press zero buttons. I like interfaces like that
How not to port software 7 February 2007
Posted by Matthew Fulmer in Uncategorized.3 comments
Three weeks ago, I received an assignment from my research adviser to
add a new, simple motion model to our path tracing program. Quite a
simple assignment. However, I do not like the code base for several
reasons:
- It only works under Visual Studio on Windows
- It is monolithic
- It is not modular
So I thought, why not get around to cleaning up this code? I’ve been
wanting to do it all of last semester. First order of business is to get
it to work under Linux, as I didn’t have easy access to Windows or
Visual Studio at home. So I figured out how to use Autoconf and
Automake, and had the program building under Cygwin (but not linking) by
the end of the day. I then went home, thinking I could get the rest done
at home on Linux.
So I started examining what was preventing the program from running
anywhere. I found three things:
- A windows-only C++ threading package
- A windows/Mac only UDP Multicast client and server
- A possibly non-portable Mersenne Twister RNG
Being the over-confident programmer that I was, I figured I could solve
all these problems by refactoring to use glibmm, which had portable
implementations of all these things.
At the thought of refactoring, I had a bright idea: This would be so
much easier in Squeak Smalltalk! Why don’t I translate the program into
Squeak, then I could do the refactoring really fast! Also, I could do
the required visualizations in eToys and Morphic rather than learn how
to program in Max/MSP. So I started systematically translating the
relevant parts of the C++ code into Smalltalk.
By this time, I was about a week into the assignment. I had read enough
of the code to be able to decipher what was going on. Since I had this
understanding, why not make a simple, extensible framework that does the
same thing! So I installed the Smallapack matrix library for Squeak.
Then I set about creating the framework that does what the C++ code did
as a bunch of nested loops.
Two weeks go by, and I finally have a framework to send in data, process
it using the Matrix library, and draws a simple visualization. So I try
running it. Everything breaks. It seems that Smallapack for Squeak is
still in an early version, and diagonal matrices are unusably broken. So
I fixed that, and I find that I still have many of the calculations
wrong, and matrix inversion does not quite work in Smallapack. I did a
bit of Smallapack debugging, but by this time, I am tired of fixing
code, and just want to get it working. But also by this time, I have
spent so much time on this project that I have a few pending assignments
I need to work on.
About this time, Intel gave me a new Windows laptop. Well, that kind of
invalidated the entire reason I had been porting this software in the
first place. So I stopped working on this software for a week and worked
on the other projects I had to do.
That is what I have been up to for the past four weeks. I have the new
motion models in the Squeak version, but that version is broken, and
does not yet integrate into the rest of Smallab, although that would be
pretty easy to add.
So what do I have to show? I have a little bit of broken code with a
fraction of the functionality I started with. Sure, it is cleaner and
a bit easier to work with, but I don’t see an easy way to get it
functional without debugging Smallapack.
I now see that I went about the problem completely wrong:
- I tried to refactor before I even solved the problem
- I tried to change the build system while refactoring
- I tried to change the platform out from under the system while
refactoring - I tried to translate the program just to make the refactoring I hadn’t
even started yet easier - I tried to rewrite the program on an unstable platform,
with no commitment to fix the platform - I tried to fix the program with no way to run it or test it until the
very end.
This was my biggest mistake.What I should have done is to suck in my arrogance, stay at the lab, and
use Visual Studio to add the new model to the C++ code base. I probably
would then have had a bit of time to go about refactoring the code in a
more leisurely manner. That would have prevented me from trying to do
the entire jump from Windows to glib or Squeak in one step. If I wanted
to port it to Squeak, I should have made the C++ code into a Squeak
plug-in, and wrote some tests for it. Then I should have translated it
method by method, refactoring as I translate, and keep running the
tests.
Finally, I would like to apologize to my adviser: Harvey, I am sorry I
spent my time taking the long path. I know that you value my time and I
see now that all we need is something that works, so that should be
given top priority. Again, I apologize for not following your advice, I
will most definitely have the new model and visualization working by
next Wednesday.
A blog for the New Year 6 January 2007
Posted by Matthew Fulmer in Uncategorized.add a comment
I am starting this blog primarily as a geek journal. I found last semester that I had a hard time remembering what I had done, and this blog is my solution to that problem. Maybe it will also make it easier for me to have a better home page than my old lame one.