Last week I finally coded something I had wanted in my game for a long time: a scheduling visualizer! It gives you a much better feeling for how all the parallel stuff works. The picture below shows 20 ms of game time. Each row represents one thread, and the rightmost frame is the last frame rendered. I will use the picture to explain how my architecture works. To see the scheduling visualizer in action, jump to the video at the bottom.
20 ms of game time: each row represents one thread (the ThreadId is shown at the beginning of a row).
- Graphics: Green
- Physics: Blue
- AI: Orange
- Input: Turquoise
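A minimal sketch of how a visualizer like this could collect its data, written in Python rather than in the game's own code. The `SpanRecorder` class and all of its method names are made up for illustration, and the color table simply repeats the legend above. The idea is to store one time span per piece of work together with the thread that ran it, and then group the spans into one row per thread for the visible time window.

```python
import threading
import time
from collections import defaultdict

# Colors taken from the legend above; the dictionary itself is illustrative.
SYSTEM_COLORS = {
    "Graphics": "green",
    "Physics": "blue",
    "AI": "orange",
    "Input": "turquoise",
}


class SpanRecorder:
    """Hypothetical recorder: stores (thread id, system, start, end) spans."""

    def __init__(self):
        self._lock = threading.Lock()
        self._spans = []

    def record(self, system, start, end):
        # Tag the span with the id of the thread that did the work.
        with self._lock:
            self._spans.append((threading.get_ident(), system, start, end))

    def measure(self, system, work):
        # Time a callable and record the span even if it raises.
        start = time.perf_counter()
        try:
            return work()
        finally:
            self.record(system, start, time.perf_counter())

    def rows(self, window_start, window_end):
        # One row per thread, holding the spans that overlap the window,
        # clipped to the window and sorted by start time.
        rows = defaultdict(list)
        with self._lock:
            for thread_id, system, start, end in self._spans:
                if end >= window_start and start <= window_end:
                    clipped = (system, max(start, window_start), min(end, window_end))
                    rows[thread_id].append(clipped)
        for spans in rows.values():
            spans.sort(key=lambda span: span[1])
        return dict(rows)
```

Drawing is then just a matter of turning each clipped span into a rectangle whose color comes from `SYSTEM_COLORS` and whose horizontal extent is proportional to its start and end times.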
Graphics System
One example of this can be seen in the picture. The graphics system issues a task that updates the positions of all particles in the scene. This is the single small green rectangle. The graphics system issues this task at the beginning of frame rendering. At the end of frame rendering, the particles are drawn. If the task hasn't finished by then, the graphics system has to wait. This happens very rarely, but you never know what the OS scheduler does with your threads.
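The issue-early, wait-late pattern described here can be sketched roughly as follows. This is a Python illustration using the standard `concurrent.futures` thread pool, not the game's actual taskmanager; `update_particles` and `render_frame` are hypothetical names, and the "drawing" is reduced to building a list of strings.

```python
from concurrent.futures import ThreadPoolExecutor


def update_particles(positions, velocities, dt):
    # Advance every particle by its velocity (illustrative particle model).
    return [
        (x + vx * dt, y + vy * dt)
        for (x, y), (vx, vy) in zip(positions, velocities)
    ]


def render_frame(pool, positions, velocities, dt):
    # Issue the particle update at the beginning of the frame.
    future = pool.submit(update_particles, positions, velocities, dt)

    # Stand-in for the rest of the frame rendering, which runs while the
    # particle task is executing on a worker thread.
    drawn = ["scene geometry"]

    # At the end of the frame the particles are needed. result() returns
    # immediately if the task is done and blocks only if it is not.
    new_positions = future.result()
    drawn.append(f"{len(new_positions)} particles")
    return new_positions, drawn
```

A frame would then be rendered with something like `render_frame(pool, positions, velocities, dt)` inside a `with ThreadPoolExecutor(max_workers=3) as pool:` block, so the wait at the end only costs time when the worker was scheduled late.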
Physics System
Real data-parallel work is done in the physics system. If you look at the picture, you can identify four data-parallel parts. To be honest, I don't know exactly what is happening there. Jitter, the physics library I use, was already parallelized. Luckily, the library is implemented very well and the source code is available too. I only had to replace Jitter's Threadmanager with my own. My custom Threadmanager uses a parallel for loop, provided by the taskmanager, to iterate over the work issued by Jitter. The loop iterations are then distributed among the worker threads. Jitter is really nicely designed; only a few lines of code were needed to achieve this.
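To make the parallel for loop idea concrete, here is a rough Python sketch. It is not Jitter's interface and not the game's taskmanager; `parallel_for` and its parameters are assumptions chosen for illustration. The index range is split into contiguous chunks, each chunk becomes one task for the pool, and the call returns only after every chunk has finished.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_for(pool, start, stop, body, workers=4):
    """Call body(i) for every i in [start, stop), spread over pool tasks."""
    count = stop - start
    if count <= 0:
        return

    # Ceiling division: at most `workers` chunks of roughly equal size.
    chunk = -(-count // workers)

    def run_range(lo, hi):
        for i in range(lo, hi):
            body(i)

    futures = [
        pool.submit(run_range, lo, min(lo + chunk, stop))
        for lo in range(start, stop, chunk)
    ]

    # Wait for all chunks; result() also re-raises any exception from body.
    for future in futures:
        future.result()
```

One caveat about the sketch itself: in CPython, threads do not speed up CPU-bound pure-Python work because of the global interpreter lock, so this only shows the structure of the loop, not the performance you would see in the game.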
Change Distribution
Between frames you can see gaps about 0.6 ms wide. This is the time needed to distribute the changes made by the systems. For instance, the physics system may have changed the position of an object. The graphics system needs this position to draw the object at the correct place in the next frame.
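One way such a change distribution step could be organized is sketched below in Python. The `ChangeDistributor` class, its method names, and the `"position"` change type are all hypothetical and not taken from the game. Systems queue their changes while a frame is running, and nothing reaches the other systems until `distribute()` is called in the gap between two frames.

```python
import threading


class ChangeDistributor:
    """Hypothetical collector that forwards queued changes between frames."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = []
        self._subscribers = {}

    def subscribe(self, change_type, handler):
        # handler(object_id, value) is called for each change of this type.
        self._subscribers.setdefault(change_type, []).append(handler)

    def queue_change(self, change_type, object_id, value):
        # Called by systems during a frame, possibly from worker threads.
        with self._lock:
            self._pending.append((change_type, object_id, value))

    def distribute(self):
        # Called between frames: swap out the pending list, then notify
        # every subscriber. Returns the number of changes delivered.
        with self._lock:
            pending, self._pending = self._pending, []
        for change_type, object_id, value in pending:
            for handler in self._subscribers.get(change_type, ()):
                handler(object_id, value)
        return len(pending)
```

In this sketch the physics system would call `queue_change("position", object_id, new_position)`, and the graphics system would subscribe to `"position"` and store the value it needs for drawing the next frame.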
To finish, here's a video where you can see the scheduling visualizer in action. The scene contained 2400 asteroids and 60 space ships with a rudimentary AI (only roaming). Without multithreading I get 40 frames per second, with multithreading 60 frames per second. Without video capturing the speedup is even better (50 single-threaded vs. 90 multithreaded). Nevertheless, with only two systems doing hard work (physics and graphics), the four CPU cores can't be fully utilized. This may change when I implement more stuff in the AI system. Getting more data parallelism out of the physics system would also improve utilization, but I don't think that can be achieved easily, and it's not worth the hassle.
Hardware used: Intel Core 2 Quad Q9300 (2.5 GHz), Radeon HD 6700
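The frame rates quoted above can be turned into speedup and per-core efficiency with a little arithmetic. The small Python helper below is only a back-of-the-envelope illustration; it assumes that frames per second is a fair proxy for throughput and that all four cores are available to the game.

```python
def speedup(single_fps, multi_fps):
    # How many times faster the multithreaded build renders frames.
    return multi_fps / single_fps


def efficiency(single_fps, multi_fps, cores):
    # Fraction of the ideal speedup (one full core's worth per core).
    return speedup(single_fps, multi_fps) / cores


CORES = 4  # the quad-core CPU listed above
for label, single, multi in [("with capture", 40, 60), ("without capture", 50, 90)]:
    s = speedup(single, multi)
    e = efficiency(single, multi, CORES)
    print(f"{label}: speedup {s:.2f}x, efficiency {e:.1%}")
```

That works out to a speedup of 1.5x with video capturing and 1.8x without, which is well short of the ideal 4x and matches the observation that the four cores are not fully utilized.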
XNA multithreading from Jan Tepelmann on Vimeo.