Monday, February 10, 2014

Shock Wave Refraction Pixel Shader

Yesterday I had a few hours of time and tweaked my explosion effects a little. It's not 100% finished, but here is a first look:

Shock Wave Pixel Shader from Jan Tepelmann on Vimeo.

The implementation is straightforward. I use a pixel shader to manipulate the drawn pixels around the explosion. To be able to do this, you first have to render the whole scene to a texture and draw that texture onto a full screen quad. For more details about post-processing with full screen quads, here's a nice blog post about it. By drawing the quad you can execute a pixel shader for every drawn pixel; the vertex shader does nothing in this case. This process is called post-processing, and effects like HDR, bloom, or more advanced effects like Screen Space Ambient Occlusion (SSAO) are realized this way. But the possibilities for different effects are endless. Deferred shading is also a kind of post-processing effect (nice slides explaining deferred shading).
In SunBurn you can simply create a post-processor by deriving from a specific base class. This class uses a full screen quad for applying a pixel shader to the current frame.
The post-processor is responsible for passing all the input variables to the actual shader and may do some pre-processing. In my case I had to transform the explosions' world positions into the coordinate system of the full screen quad. This is simply the texture coordinate space, with (0,0) at the top left and (1,1) at the bottom right. That's different from the normal projection process with the view-projection matrix! To get everything right I made a sketch of how the positions have to be transformed.

Transformation from world space coordinates to texture coordinates
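The transformation can be sketched like this. The game code is C#/HLSL; this is an illustrative Python sketch, assuming a row-major view-projection matrix and the usual clip-space conventions:

```python
def world_to_texture(world_pos, view_projection):
    """Project a world-space position into full-screen-quad texture space.

    view_projection is a 4x4 row-major matrix (list of lists). Texture
    space has (0, 0) at the top left and (1, 1) at the bottom right, so
    after the perspective divide the clip-space y axis is flipped and
    both axes are remapped from [-1, 1] to [0, 1].
    """
    x, y, z = world_pos
    v = (x, y, z, 1.0)                                   # homogeneous position
    clip = [sum(row[i] * v[i] for i in range(4)) for row in view_projection]
    ndc_x, ndc_y = clip[0] / clip[3], clip[1] / clip[3]  # perspective divide
    u = 0.5 * ndc_x + 0.5   # x: [-1, 1] -> [0, 1]
    t = -0.5 * ndc_y + 0.5  # y flipped: top of the screen is t = 0
    return u, t

identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
print(world_to_texture((0.0, 0.0, 0.0), identity))  # (0.5, 0.5): screen center
```

The key difference to normal projection is the sign flip on y and the remap from [-1, 1] to [0, 1].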
Next I want to add a refraction effect to the space ship engines.

Monday, January 20, 2014

First Beams effect

This weekend I finally had time to finish my code for auto aiming beams. Have a look:

Beams from Jan Tepelmann on Vimeo.

Not perfect yet, but a good start! When the beam has no target, the effect looks a little bit ugly. That surely needs improvement.

Calculating the beam curve
Every beam consists of a start and an end point. First, I create a local coordinate system to make things easier. In the local coordinate system the start point is at (0, 0) and the end point is at (1, 1). Now I use the XNA Curve class to define a smooth curve between those two points. This is simply done by specifying the desired value at each position. The curve class can then interpolate between those position-value pairs (just call curve.Evaluate(position)). In the sketch below I defined five position-value pairs. The pairs are called CurveKeys:
  • (0, 0)
  • (0.25, 0.25²)
  • (0.50, 0.50²)
  • (0.75, 0.75²)
  • (1, 1)
You can see that this is simply a quadratic curve. So why bother with the curve class at all? Because now I can easily add an oscillation to the curve by modifying the CurveKey values depending on the current time. This allows adding a wave effect to the beam.
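Sketched in Python (the game uses XNA's Curve class in C#; here a simple piecewise-linear interpolation stands in for Curve.Evaluate, and the oscillation amplitude is an illustrative choice):

```python
import math

def make_curve_keys(time):
    """Quadratic base curve with a time-dependent wave added to the
    inner keys. Returns (position, value) pairs, analogous to CurveKeys."""
    keys = []
    for i in range(5):
        pos = i / 4.0
        value = pos * pos                     # base quadratic shape
        if 0.0 < pos < 1.0:                   # keep the endpoints fixed
            value += 0.05 * math.sin(2.0 * math.pi * (pos + time))
        keys.append((pos, value))
    return keys

def evaluate(keys, position):
    """Piecewise-linear stand-in for XNA's Curve.Evaluate."""
    for (p0, v0), (p1, v1) in zip(keys, keys[1:]):
        if p0 <= position <= p1:
            t = (position - p0) / (p1 - p0)
            return v0 + t * (v1 - v0)
    return keys[-1][1]

keys = make_curve_keys(time=0.0)
print(evaluate(keys, 0.5))  # close to the base value 0.25 at time 0
```

Advancing `time` every frame makes the wave travel along the beam while the endpoints stay pinned to start and target.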
A good sketch is half the battle :-)
Rendering the beam curve
To render the beam curve I break it down into N line segments (in the video N=30). Those line segments are drawn with the help of a custom vertex shader (with hardware instancing! cool :-). The drawing code is based on the XNA RoundLine example (describing blog post). Since the original vertex shader draws on the XY-plane, I had to modify it to draw on the XZ-plane. I also made some other changes to the pixel shader to get a smoother glow effect. I think I will do some more tweaking in the pixel shader to make the beam look more like plasma.
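The segmentation step can be sketched as follows (a Python sketch of the idea, with an arbitrary curve function standing in for the XNA Curve; the 2D local space is the (0,0)-(1,1) system from above):

```python
def beam_segments(start, end, curve, n):
    """Break the beam into n line segments: sample the curve at n + 1
    positions in local (0,0)-(1,1) space and map each sample back into
    the start/end coordinate frame."""
    dx = end[0] - start[0]
    dy = end[1] - start[1]
    points = []
    for i in range(n + 1):
        t = i / n
        points.append((start[0] + t * dx, start[1] + curve(t) * dy))
    return list(zip(points, points[1:]))  # consecutive pairs = segments

segments = beam_segments((0.0, 0.0), (10.0, 10.0), lambda t: t * t, 30)
print(len(segments))  # 30 segments from 31 sample points
```

Each segment would then be handed to the instanced line shader as one instance.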

Thursday, November 28, 2013


Last week I stumbled upon this blog post about generating and rendering lightning bolts. Really cool stuff. I was searching for such an effect for my missile interception weapon (see video below), so I implemented my own lightning effect generator. Have a look:

Missile Shield on Vimeo.

At first I wanted to implement the lightning effects in a straightforward way and simply use a SpriteBatch for rendering. But then it occurred to me that it would be really nice to integrate the lightning effect into the Mercury Particle Engine, which I already use in my game. This allows customizing the bolt effect by adding all kinds of Controllers and Modifiers without changing any source code. Mercury also has a really good particle editor, and the effects can be saved to and loaded from XML. Some useful modifiers are, for example:
  • OpacityInterpolator: Adds a fading out effect.
  • ColourInterpolator: Bolt can change color depending on its age.
  • ScaleInterpolator: Bolt can change size.
It's also easy to add your own modifier. I created a simple jitter modifier, which you can see in the video above. The modifier moves all emitted particles randomly along their normal in the XZ plane. Implementing it took me only a few minutes.
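The core of such a modifier fits in a few lines. A Python sketch (Mercury's actual Modifier interface is C#; the particle layout and field names here are illustrative):

```python
import random

def jitter_modifier(particles, max_offset):
    """Move every active particle randomly along its normal,
    restricted to the XZ plane (the y coordinate stays untouched)."""
    for p in particles:
        offset = random.uniform(-max_offset, max_offset)
        p["x"] += offset * p["nx"]   # nx, nz: particle normal in the XZ plane
        p["z"] += offset * p["nz"]

particles = [{"x": 0.0, "z": 0.0, "nx": 1.0, "nz": 0.0}]
jitter_modifier(particles, max_offset=0.5)
print(abs(particles[0]["x"]) <= 0.5)  # True: displaced at most max_offset
```

In Mercury this logic would live in the modifier's per-particle update loop.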

Needed Changes in Mercury
Now let's talk about the changes necessary to integrate the lightning effects into Mercury. Mercury provides Controllers and Modifiers to customize your particle effects.
Modifiers: They iterate over all active particles of an emitter and modify particle data like position, scale, color, velocity and so on.
Controllers: These add logic that is executed before the particle trigger event. They get a TriggerContext, which allows them, for example, to cancel a trigger event or change the trigger position where the particles get released.
The TriggerContext looks like this:
public struct TriggerContext
{
        /// True if the trigger should be cancelled.
        public Boolean Cancelled;

        /// Gets or sets the position of the trigger.
        public Vector3 Position;

        /// ADDED.
        /// The texture the emitter should use.
        public Texture2D ParticleTexture;
}

I added a field for the particle texture to the TriggerContext. This allows me to create controllers that can change the emitter's particle texture dynamically. If the controller sets the ParticleTexture field, the emitter's old texture reference is overwritten by the one provided by the controller (this is done in ParticleEffect.Trigger).
I created an abstract dynamic texture creation controller that can be used as a base class for all texture-changing controllers. All derived classes have to provide a DrawTexture method, which gets called periodically (the TextureRecalculatePeriod can be defined in the XML description of the controller).
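The scheduling idea can be sketched like this (Python stand-in for the C# base class; the timing logic and names are illustrative, only DrawTexture and TextureRecalculatePeriod come from the post):

```python
class DynamicTextureController:
    """Abstract base: calls draw_texture at most once per recalculate
    period and hands the cached texture to the trigger context."""

    def __init__(self, recalculate_period):
        self.recalculate_period = recalculate_period
        self._elapsed = recalculate_period  # force a draw on the first update
        self._texture = None

    def update(self, dt, context):
        self._elapsed += dt
        if self._elapsed >= self.recalculate_period:
            self._elapsed = 0.0
            self._texture = self.draw_texture()
        context["particle_texture"] = self._texture  # TriggerContext field

    def draw_texture(self):
        raise NotImplementedError  # provided by derived controllers

class CountingController(DynamicTextureController):
    def __init__(self, recalculate_period):
        super().__init__(recalculate_period)
        self.draws = 0

    def draw_texture(self):
        self.draws += 1
        return "texture-%d" % self.draws

c = CountingController(recalculate_period=0.5)
ctx = {}
for _ in range(60):            # one second of 60 fps updates
    c.update(1.0 / 60.0, ctx)
print(c.draws)                 # redrawn only a couple of times, not 60
```

The point is that the expensive texture generation runs at its own rate, decoupled from the frame rate.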

Bolt generation controller
Bolt generation and rendering is done in the BoltParticleTextureController, which is currently the only class derived from the abstract controller. Right now there are only a few parameters that tweak the particle texture's look:
  • Two parameters to control the random placement of the start and end point of the bolt
  • GenerationCount: Number of times the line between start and end will be halved
  • BoltCount: Number of bolts to render into the particle texture
In the future I want to add more parameters to customize the created glow textures. Here are two example textures created by the controller:
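The halving scheme behind GenerationCount is classic midpoint displacement, as described in the linked blog post. A Python sketch (the offset scale and its per-generation decay are illustrative choices, not the controller's actual values):

```python
import random

def generate_bolt(start, end, generations, offset):
    """Midpoint displacement: each generation halves every segment and
    pushes the midpoint sideways by a random amount, with the maximum
    offset halved per generation."""
    points = [start, end]
    for _ in range(generations):
        next_points = [points[0]]
        for (x0, y0), (x1, y1) in zip(points, points[1:]):
            mx, my = (x0 + x1) / 2.0, (y0 + y1) / 2.0
            dx, dy = x1 - x0, y1 - y0
            length = (dx * dx + dy * dy) ** 0.5 or 1.0
            d = random.uniform(-offset, offset)
            # displace along the segment's perpendicular
            next_points.append((mx - dy / length * d, my + dx / length * d))
            next_points.append((x1, y1))
        points = next_points
        offset *= 0.5
    return points

bolt = generate_bolt((0.0, 0.0), (1.0, 0.0), generations=4, offset=0.2)
print(len(bolt))  # 2**4 + 1 = 17 points
```

For BoltCount > 1, the controller would simply run this several times and render all polylines into the same texture.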

In the last texture three bolts were rendered. But with more than one bolt there are some artifact problems I could not fix yet. Write a comment if you have an idea :-)

Edit: In the last few days I improved the bolt effect a little. The new effect can be seen in the video at the top. The old videos are still here:

Red Bolt on Vimeo.

Missile Shield on Vimeo.

Wednesday, November 13, 2013

Game Architecture and Networking, Part 2

In this post I will talk in more detail about my game architecture and how the networking is realized in it. For the basics of my architecture, see these two old blog posts from last year:
Parallel Game Architecture, Part 1
Parallel Game Architecture, Part 2
In short, my architecture consists of six subsystems which all work on their own copy of a subset of the game data. Because they work on a copy, they can be updated concurrently. If a subsystem changes its copy of the game data, it has to queue the change up in the change manager. After all subsystems have finished updating, the change manager distributes the changes to the other subsystems. Right now my engine has the following six subsystems:
  • Input: responsible for reading the input from the gamepad or keyboard
  • Physics: collision detection and physical simulation of the world (uses the Jitter Physics Engine)
  • Graphics and Audio: uses the SunBurn Engine and Mercury Particle Engine
  • LevelLogic: responsible for checking winning/losing conditions, creating/destroying objects, and so on
  • Networking: communicate with other peers over the network
  • AI: calculates the InputActions for game entities not controlled by humans (allies, enemies, homing missiles, ...)
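The queue-then-distribute scheme can be sketched like this (Python sketch; the real change manager is C# and certainly more involved, the data layout here is illustrative):

```python
from collections import defaultdict

class ChangeManager:
    """Subsystems queue changes during their (conceptually parallel)
    update; after all updates finish, the changes are handed to every
    other subsystem's private data copy."""

    def __init__(self, subsystems):
        self.subsystems = subsystems   # name -> {entity -> {field -> value}}
        self.queued = []

    def queue_change(self, source, entity, field, value):
        self.queued.append((source, entity, field, value))

    def distribute(self):
        for source, entity, field, value in self.queued:
            for name, data in self.subsystems.items():
                if name != source:               # the source already has it
                    data[entity][field] = value
        self.queued.clear()

subsystems = {name: defaultdict(dict) for name in
              ("input", "physics", "graphics", "level", "network", "ai")}
cm = ChangeManager(subsystems)
cm.queue_change("physics", "ship1", "position", (1.0, 2.0, 3.0))
cm.distribute()
print(subsystems["graphics"]["ship1"]["position"])  # (1.0, 2.0, 3.0)
```

Because nothing is written into another subsystem's copy until distribute(), the six updates can run on different cores without locking.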
In the picture below you can see how a player-controllable game entity (for example the player's space ship) is represented in the game architecture. Every box corresponds to an object in the respective subsystem.
Representation of a player controllable entity on the host and a client system. The numbers in the white circles show the latency in number of frames.
In the input system the pressed buttons are read and translated into InputActions (Accelerate, Rotate, Fire Weapon, ...). Those InputActions are interpreted by the physics system, which is the central subsystem of the engine. It informs the other systems about the physical state of all game entities: current position, current acceleration, collisions, and so on. This information is needed in the graphics and audio system to render the game entities' models at the right position, play a sound, and/or trigger a particle effect.

Host Side Networking
The networking system has to send the physical state of the game entities to all clients. For important game entities like player ships this is done a few times per second; for AI objects much less frequently, because AI objects behave more or less deterministically (as long as the client and host are in sync!). Some important physical state changes trigger an immediate network send. For example, if a game entity is firing, we want to create the fired projectile on the clients as fast as possible, to minimize the divergence between the host world and the client worlds.
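The send decision boils down to something like this (Python sketch; the concrete intervals are illustrative, only the player/AI split and the fire-immediately rule come from the post):

```python
def should_send(entity, now, last_sent):
    """Decide whether to send an entity's physical state this frame.
    A 'fired' event forces an immediate send; otherwise player-controlled
    entities are sent far more often than deterministic AI entities."""
    if entity.get("fired"):
        return True                                # minimize divergence
    interval = 1.0 if entity["is_ai"] else 0.2     # illustrative intervals
    return now - last_sent >= interval

print(should_send({"is_ai": True, "fired": True}, now=0.0, last_sent=0.0))  # True
```

Everything that doesn't meet its interval simply keeps dead-reckoning on the client until the next update arrives.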

Client Side Networking
On the client side, the networking system receives the physical state and calculates a prediction based on the latency, the received velocity, and the current acceleration. In the next change distribution phase the current physical state in the physics system is overwritten by the predicted state. Then it takes one more frame until the information is actually visible on the client side. On the Xbox my game runs at 60 frames per second on average. This means that if a player fires her weapon, this is seen at the earliest 66 ms (4 frames) + X ms later on another peer. X depends on the latency and on the point in time the network packet is received: if the packet arrives after the networking system has already finished updating, it has to wait for the next frame until it gets processed.
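The prediction step is essentially dead reckoning. One plausible reading of "based on the latency, the received velocity and current acceleration" is constant-acceleration extrapolation (a Python sketch, not the game's actual C# code):

```python
def predict_state(position, velocity, acceleration, latency):
    """Dead-reckon a received physical state forward by the latency:
    p' = p + v*t + 0.5*a*t^2 and v' = v + a*t, per component."""
    t = latency
    predicted_pos = tuple(p + v * t + 0.5 * a * t * t
                          for p, v, a in zip(position, velocity, acceleration))
    predicted_vel = tuple(v + a * t for v, a in zip(velocity, acceleration))
    return predicted_pos, predicted_vel

pos, vel = predict_state((0.0, 0.0, 0.0), (10.0, 0.0, 0.0),
                         (0.0, 0.0, 0.0), latency=0.05)
print(pos)  # (0.5, 0.0, 0.0): 50 ms of travel at 10 units/s
```

The predicted state is what gets written into the physics system in the next change distribution phase.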

The latency is a disadvantage of this game architecture. It could be improved, for example, by merging the input system into the physics system. A separate input system does not improve performance, but merging the two systems would reduce maintainability, so I have no plans to do this in the near future. The separation of concerns between the systems is really a big plus: you could even replace a system completely without affecting the others.
One important part of my networking code is not finished yet: the interpolation. Right now the old physical state is simply overwritten by the new one received from the host. If the two states differ too much, the movement of the game entity looks jerky. To solve this, the difference between the old and the new physical state can be evenly spread and applied over the next few frames.
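Spreading the correction evenly could look like this (a minimal Python sketch of the planned smoothing, not existing code):

```python
def smooth_correction(current, target, frames_left):
    """Move the locally simulated state one even step towards the
    authoritative host state, spreading the remaining difference
    over the remaining frames."""
    step = tuple((t - c) / frames_left for c, t in zip(current, target))
    return tuple(c + s for c, s in zip(current, step))

state = (0.0, 0.0, 0.0)
target = (3.0, 0.0, 0.0)
for frames_left in range(3, 0, -1):   # apply the correction over three frames
    state = smooth_correction(state, target, frames_left)
print(state)  # (3.0, 0.0, 0.0): fully converged, one third per frame
```

Instead of one visible jump, the entity drifts to the corrected position over a few frames.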

Wednesday, October 16, 2013

Networking, Part 1

I'm back :-) This summer I finished my thesis. Yay! I wanted to continue working on my game while writing my thesis, but sadly a day only has twenty-four hours...

Anyway, now I'm ready to finish my game. The last two months I worked mostly on my network code. It's almost finished. The networking is one of the bad parts of XNA. Don't get me wrong, it is nicely integrated into the Xbox environment and easy to use. However, you can't use XNA networking on Windows, yet you are forced to use it on the Xbox, since the normal .NET networking classes are not accessible there. Therefore one needs two different networking implementations, one for Windows and one for the Xbox. This sucks!

To support both implementations in my framework I wrote a wrapping layer that provides access to the XNA and Windows specific implementations on the respective platforms. Since I didn't want to reinvent the wheel, the code of the wrapping layer is based on some classes from the IndieFreaks Game Framework. The interface of the wrapping layer is modeled on the XNA networking capabilities, so for XNA the wrapping layer is very thin.

The Windows networking implementation uses the Lidgren networking library. Lidgren is really an excellent library and widely used in .NET-based games (AI War, XenoMiner, IndieFreaks Game Framework, etc.). Basically it is a layer over the unreliable, connectionless UDP protocol. More complex features like session management (player joining, player leaving, start/end session, ...) are therefore not supported, which means I had to implement them on my own. But this wasn't very difficult. Below you can see a sketch of the sequence diagram for the client join. It doesn't get more complicated than that. In XNA this is all hidden behind the provided API.
In the next post I will go into more detail about how my networking system works on the game logic level.

Tuesday, June 19, 2012

Fixing My Xbox Performance Problem

The Symptoms
Most of the time I tested my game on my PC, because I simply had no Xbox available. Now that I have one, I tested my game on the Xbox and got really bad results. It ran at only 40-50 frames per second on average. So I looked at my scheduling visualizer to search for the bottleneck (see this blog post from March for a description of the visualizer):

Task execution times Xbox 360

In the test scene there were about ten missiles and 15 spacecraft (enemies + own ships), and the framerate was down to about 35 fps. Very bad! The bottleneck was the physics task (blue bar). It took 31 ms per frame on average. The graphics task (green bar) always had to wait for the physics task, so that's where I had to search for my problem.
And just for fun the same scene on my PC (Intel Quadcore 2.5 GHz + Radeon HD 6700):

Task execution times PC (Intel Quadcore 2.5 GHz + Radeon HD 6700)

You can see there is absolutely no performance problem on the PC! You can also see that the Xenon CPU used in the Xbox is really, really slow: the physics calculations take over a hundred times longer. OK, it's an unfair comparison, since the Xenon is from 2005 and my Core 2 Quad Q9300 is from 2008. But there must be other problems on top of that. A factor of 100 is just too much and can't all be attributed to the hardware, maybe to some specific problems of the .NET Compact Framework on the Xbox.
I learned from this that you have to test all the time on the hardware your game targets.

The Problem
Searching for the problem, I had a closer look at the Jitter Physics source code for the first time. The engine's debug timers showed that it spent 95% of the time in the collision detection system, which uses the sweep and prune algorithm. A very good description of this algorithm and its implementation can be found on the Jitter homepage. The core idea behind this and most other collision detection algorithms is to figure out which of the possible collisions can actually happen and put them into a list. This step is called the broadphase. In the next step (the narrowphase), only those object pairs in the list are actually tested for collision, using a more complex collision check which also calculates things like the collision points.
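For one axis, sweep and prune boils down to sorting interval endpoints and keeping only overlapping pairs. A simplified Python sketch (Jitter's real implementation is C#, incremental, and works on all three axes):

```python
def broadphase_pairs(boxes):
    """1-D sweep and prune: sort the interval endpoints on one axis and
    collect overlapping pairs as candidates for the narrowphase.
    boxes maps a name to its (min_x, max_x) interval."""
    events = []
    for name, (lo, hi) in boxes.items():
        events.append((lo, 0, name))   # 0 sorts opens before closes at ties
        events.append((hi, 1, name))
    events.sort()
    active, pairs = set(), []
    for _, kind, name in events:
        if kind == 0:                  # interval opens: overlaps all active
            pairs.extend((other, name) for other in active)
            active.add(name)
        else:                          # interval closes
            active.discard(name)
    return pairs

boxes = {"a": (0.0, 2.0), "b": (1.0, 3.0), "c": (5.0, 6.0)}
print(broadphase_pairs(boxes))  # only a and b overlap
```

With every sensor shape added to this list, both the number of intervals and the number of candidate pairs grow, which is exactly what made the broadphase so expensive in my case.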
After reading the article about sweep and prune, I had an idea what the reason for my performance problem could be.
I had put the perception for the AI (also used for the player's radar) into the physics system. Perception means figuring out which other game entities the AI can see right now. So every physics object carried a sensor collision shape around to "see" surrounding objects: every game entity inside the shape can be seen by the AI. This is illustrated in the picture below:

I have drawn circles into the radar on the screenshot to show the approximate shape of the sensor. Doing perception inside the collision system doubled the number of objects that need a broadphase collision check. There are also many completely useless broadphase checks between two sensors. Putting the perception into the physics system was just a very bad idea! But the main waste of CPU cycles came from gathering perception data EVERY FRAME. That's just not necessary.

The Solution
The solution is easy: move the perception gathering out of the collision system and don't gather every frame. It took me a few hours to move perception into the AI system. Have a look at the nice result:
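The gist of the fix, as a Python sketch (the real code is C# inside the AI system; the spherical sensor and the one-second interval match the post, everything else is illustrative):

```python
class PerceptionSensor:
    """Perception moved out of the collision system: gather the visible
    entities only once per interval and reuse the cached result in
    between, instead of paying for it every frame."""

    def __init__(self, interval, radius):
        self.interval = interval
        self.radius = radius
        self._elapsed = interval       # gather on the first update
        self.visible = []
        self.gather_count = 0

    def update(self, dt, own_pos, entities):
        self._elapsed += dt
        if self._elapsed < self.interval:
            return                     # reuse the last gathered result
        self._elapsed = 0.0
        self.gather_count += 1
        self.visible = [name for name, pos in entities.items()
                        if sum((a - b) ** 2 for a, b in zip(own_pos, pos))
                        <= self.radius ** 2]

sensor = PerceptionSensor(interval=1.0, radius=10.0)
entities = {"enemy": (3.0, 0.0, 0.0), "far": (100.0, 0.0, 0.0)}
for _ in range(120):                   # two seconds at 60 fps
    sensor.update(1.0 / 60.0, (0.0, 0.0, 0.0), entities)
print(sensor.visible, sensor.gather_count)
```

Compared to the sensor-shape approach, the distance checks run only once a second and never enter the physics broadphase at all.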

Good Performance (Xbox): Perception in AI system, gathering interval one second
For comparison - Bad Performance (Xbox): Perception in physics system

Now I get frame rates between 70 and 80, and all tasks have to wait for the graphics system. This means I now have plenty of CPU cycles left to increase the number of enemies, implement cool new weapons, or improve my AI.

Wednesday, June 13, 2012

Dream Build Play 2012

Yesterday I submitted my game to the Dream Build Play 2012 contest. Awesome feeling :-) The week before, I did my utmost to finish it. I have never programmed more in one week.
Here is my epic trailer ;-)

It took a whole day to make. I'm really glad that I found the cool trailer music from Martin von Hagenau on SoundCloud ("Trailer 3 - Hellraiser" (Martin von Hagenau) / CC BY 3.0).
Before I publish my game on Xbox Live I want to finish three features that couldn't make it into the Dream Build Play version.
  • Two player coop mode
  • Two player death match
  • Survival highscore mode
This shouldn't be very difficult because I already have working split-screen code. Only a few parts of my code have to be adapted (the losing conditions, for example). I think this will take two or three weeks (hopefully).