Sunday, March 15, 2009

Dirac video

Libquicktime and gmerlin-avdecoder now support Dirac in quicktime. En- and decoding is done with the libschrödinger library. Having already implemented support for lots of other video codecs I noticed some things, both positive and negative.

Positive
  • Very precise specification of the uncompressed video format. Interlacing (including field order) is stored in the stream as well as signal ranges (video range or full range). This brings direct support for lots of colormodels.

  • Support for > 8 bit encoding. This is really a rare feature. While ffmpeg always sticks with 8 bit even for codecs with 10 bit or 12 bit modes, the libschrödinger API has higher precision options. Not sure if these modes are really supported internally by now.

  • It seems to aim for scalability from low-end internet downloads to intra-only modes for video editing applications. A lossless mode is also there. Whether it performs equally well for all usage scenarios is yet to be found out.

  • It claims to be patent free. But since the patent jungle is so dense, it is almost impossible to prove this.

  • It has the BBC behind it, which hopefully means serious development, funding and a chance for a wide deployment.

Negative

Sequence end code in its own sample
The quicktime mapping specification requires that the sequence end code (a 13 byte string telling that the stream ends here) be stored in a sample of its own. This is a mess, since for all Quicktime codecs I know (even the most disgusting ones) 1 sample always corresponds to 1 frame. Having a sample which is not a frame screws up the timing, because there is a "duration" associated with the sequence-end sample, which makes the total stream duration seem larger than it actually is. Also, a frame accurate demuxer will expect one frame more than the file actually has. For both libquicktime and gmerlin-avdecoder I wrote workarounds for this (they simply discard the last packet).
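The workaround can be sketched in a few lines of Python. This is only an illustration, not the actual libquicktime or gmerlin-avdecoder code. It assumes the 13 byte sequence end code is a Dirac parse info header (4 byte prefix "BBCD", 1 byte parse code, two 4 byte offsets) with parse code 0x10 for end of sequence, as I understand the Dirac specification. The function names and the list-based sample table are hypothetical.

```python
PARSE_PREFIX = b"BBCD"      # assumed parse info prefix
PARSE_INFO_SIZE = 13        # prefix (4) + parse code (1) + two offsets (4 + 4)
END_OF_SEQUENCE = 0x10      # assumed parse code for "end of sequence"


def is_end_of_sequence(sample: bytes) -> bool:
    """Return True if the sample is nothing but a sequence end code."""
    if len(sample) != PARSE_INFO_SIZE:
        return False
    if sample[:4] != PARSE_PREFIX:
        return False
    return sample[4] == END_OF_SEQUENCE


def drop_trailing_end_code(samples, durations):
    """Discard a trailing end-of-sequence sample together with its duration.

    samples   -- list of sample payloads in file order (hypothetical layout)
    durations -- list of per-sample durations from the stts table
    """
    if samples and is_end_of_sequence(samples[-1]):
        return samples[:-1], durations[:-1]
    return samples, durations
```

After the call, the number of samples equals the number of frames again, and the sum of the durations no longer includes the bogus duration of the sequence-end sample.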

If I had written the mapping spec, I would require the sequence end code to be appended to the last frame (in the same sample). It could also be made optional, since it's not really needed in quicktime.

If libquicktime encodes dirac files, everything is done according to the spec. Conformance to the spec is more important than my personal opinion about it :)

No ctts atom required
Quicktime timestamps (as given by the stts atom) are decoding timestamps. For all low-delay streams (i.e. streams without B-frames), these are equal to the presentation timestamps. For H.264 and MPEG4 ASP streams with B-frames, the ctts atom specifies the difference between PTS and DTS for each frame and lets the demuxer calculate correct presentation timestamps without touching the video data. If the ctts atom is missing, such quicktime files become as disgusting as AVIs with B-frames. Unfortunately the ctts atom isn't required by the mapping spec, which means we'll see such files in the wild.
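To make the role of the two atoms concrete, here is a small Python sketch of the calculation a demuxer performs. It assumes both tables have already been read into lists of (sample_count, value) runs, which is how stts and ctts are stored. The function names are hypothetical and the numbers in the usage example below are made up.

```python
def expand(runs):
    """Expand run-length (sample_count, value) pairs into one value per sample."""
    return [value for count, value in runs for _ in range(count)]


def presentation_times(stts, ctts=None):
    """Compute one presentation timestamp per sample, in decoding order.

    stts -- runs of (sample_count, sample_duration)
    ctts -- runs of (sample_count, pts_minus_dts), or None if the atom is missing
    """
    dts = []
    t = 0
    for duration in expand(stts):
        dts.append(t)        # decoding timestamp = sum of all earlier durations
        t += duration
    if ctts is None:
        # Without ctts the best we can do is assume PTS == DTS,
        # which is only correct for streams without B-frames.
        return dts
    return [d + offset for d, offset in zip(dts, expand(ctts))]
```

For four frames of duration 1 stored in decoding order I P B B, `presentation_times([(4, 1)], [(1, 1), (1, 3), (2, 0)])` gives `[1, 4, 2, 3]`: the P-frame is decoded second but shown last. Without the ctts table the same call returns `[0, 1, 2, 3]`, which is wrong for this stream.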

The good news is that the ctts atom isn't mentioned at all in the spec. From this I conclude that it is not forbidden either. Therefore, libquicktime always writes a ctts atom if the stream has B-frames.

On the decoding side (in gmerlin-avdecoder), the quicktime demuxer checks if a dirac stream has a ctts atom. If yes, it is used and everything is fine. If not (i.e. if the file wasn't written by libquicktime), a parser is fired up and an index must be built if one wants sample accuracy. The good news is that the parser is pretty simple and the same thing is needed anyway for decoding dirac in MPEG transport streams.
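The kind of parser meant here can be sketched as follows. This is a simplified Python illustration and not the gmerlin-avdecoder implementation. It relies on several assumptions about the Dirac bitstream as I understand the specification: every parse unit starts with a 13 byte parse info header ("BBCD", parse code, next and previous parse offset as big-endian 32 bit integers), picture parse codes have bit 0x08 set, the end of sequence code is 0x10, and a picture starts with a 32 bit big-endian picture number that counts in display order. The function name is hypothetical, and a real parser would also have to resynchronize on "BBCD" when the next parse offset is zero.

```python
import struct

# prefix, parse code, next parse offset, previous parse offset (13 bytes total)
PARSE_INFO = struct.Struct(">4sBII")


def index_pictures(stream: bytes):
    """Walk the parse units and return (byte_position, picture_number) per picture."""
    index = []
    pos = 0
    while pos + PARSE_INFO.size <= len(stream):
        prefix, code, next_offset, _prev_offset = PARSE_INFO.unpack_from(stream, pos)
        if prefix != b"BBCD":
            raise ValueError("lost sync at byte %d" % pos)
        if code & 0x08:
            # Assumed layout: the picture number directly follows the header.
            (picture_number,) = struct.unpack_from(">I", stream, pos + PARSE_INFO.size)
            index.append((pos, picture_number))
        if code == 0x10 or next_offset == 0:
            break            # end of sequence, or no link to the next unit
        pos += next_offset
    return index
```

The list comes out in decoding order. Sorting it by picture number gives the presentation order, and from that presentation timestamps can be assigned without decoding any picture data.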

Other news

gmerlin-avdecoder also supports Dirac in Ogg and MPEG-2 transport streams.

4 comments:

Velmont said...

Yay! I've been eagerly waiting for an excuse to look further into Dirac. I've big hopes for it.

Although the recent progress from Theora has taken most of my focus ;-)

(That is, end-user focus :-) )

burkhard said...

Libschrödinger is currently focussed on speed rather than encoding quality.
I made some encoding tests (PSNR vs. file size) and they confirm that. Let's wait until more encoding features from the dirac research codec are ported to schrödinger. Maybe today's 1.0.6 release already brings improvements.

Tommy Thorn said...

Very interesting. Your comment is about the Quicktime mapping, but Dirac is mapped to quite a few containers, including MPEG4 and Ogg. I assume the issue doesn't exist there?

Qualitywise, Schrödinger still lags significantly behind dirac-research when encoding, though about 10X faster. I'd hope people aren't making conclusions about the Dirac potential based on Schrödinger.

What is the official/best mailing list for discussing Dirac?

burkhard said...

The mappings for mp4 and quicktime are identical.

The ogg mapping guarantees correct timestamps. But it also demands that the end-of-sequence code is in a packet of its own. This makes frame-accurate seeking much more difficult than for theora.

The mapping for MPEG-2 transport streams is similar, but it doesn't hurt much because for the other MPEG codecs it's worse.

There is the schroedinger-devel list, it's low traffic and focused on libschroedinger. Don't know about others.

I really hope that libschroedinger gets more quality improvements. IMO Dirac is the most promising free video codec. But without a good and fast implementation it's pretty worthless.