Saturday, October 24, 2009

Major gmerlin player upgrade

While working on several optimizations of the player engine, I found out that the player architecture sucked. So I made a major upgrade (well, a downgrade actually, since lots of code was kicked out). Let me elaborate on what exactly was changed.

Below you see a block schematic of the player engine as it was before (subtitle handling is omitted for simplicity):
The audio and video frames were read from the file by the input thread, pulled through the filter chains (see here) and pushed into the fifos.

The output threads for audio and video pulled the frames from the fifos and sent them to the soundcard or the display window.

The idea behind the separate input thread was that if CPU load is high and decoding a frame takes longer than it should, the output threads can still continue with the frames buffered in the fifos. It turned out that this was the only advantage of this approach, and it only worked as long as the average decoding time stayed below realtime.

The major disadvantage is that if you have fifos with frames pushed at the input and pulled at the output, the system becomes very prone to deadlocks. In fact, the code for the fifos became bloated and messy over time.

While programming a nice new feature (updating the video display while the seek slider is moved), playback got messed up after seeking, and I quickly blamed the fifos for this. The result was a big cleanup, shown below:
You see that the input thread and the fifos are completely gone. Instead, the input plugin is protected by a simple mutex, and the output threads do the decoding and processing themselves. The advantages are obvious:
  • Much less memory usage (one video frame instead of up to 8 frames in the fifo)
  • Deadlock conditions are much less likely (if not impossible)
  • Much simpler design, bugs are easier to fix
The only disadvantage is that if a file cannot be decoded in realtime, audio and video run out of sync. In the old design, the input thread took care that the decoding of the streams didn't drift too far apart. For these cases, I need to implement frame skipping. This can be done in several steps:
  • If the stream has B-frames, skip them
  • If the stream has P-frames, skip them (display only I-frames)
  • Display only every nth I-frame with increasing n
Frame skipping is the next major thing to do. But with the new architecture it will be much simpler to implement than with the old one.
