Saturday, May 1, 2010

Processing compressed streams with gmerlin

As I already mentioned, a main goal of this development cycle is to read compressed streams on the input side and write compressed streams on the encoding side. It's a bit of work, but it's definitely worth it because it offers enormous possibilities:
  • Lossless transmultiplexing from one container to another
  • Adding/removing streams of a file without recompressing the other streams.
  • Lossless concatenation of compressed files
  • Changing metadata of files (e.g. mp3/vorbis tagging)
  • Quicktime has some codecs which correspond to image formats (png, jpeg, tiff, tga). With support for compressed frames, single images can be converted to quicktime movies and back
  • In some cases broken files can be fixed as well
General approach
To limit the possibilities of creating broken files, we are a bit strict about the codecs we support for compressed I/O. This means that the new feature cannot automatically transfer every compressed stream. For compressed I/O, the following conditions must be met:
  • The precise codec must be known to gavl. While for decoding it never matters if we have MPEG-1 or MPEG-2 video (libmpeg2 decodes both), for compressed I/O it must be known.
  • For some codecs, we need other parameters like the bitrate or if the stream contains B-frames or field pictures.
  • Each audio packet must consist of an independently decompressible frame, and we must know how many uncompressed samples it contains.
  • For each video packet, we must know the pts, how long the frame will be displayed and if it's a keyframe.
Compression support in gavl
For transferring compressed packets, we need 2 data structures:
  • An info structure, which describes the compression format (i.e. the codec). The actual codec is an enum (similar to ffmpeg's CodecID), but other parameters can be required as well (see above).
  • A structure for a data packet.
Both of these are in gavl, in a new header file gavl/compression.h. Gavl itself never messes around with the contents of compressed packets; it just provides some housekeeping functions for packets and compression definitions. The definitions were moved here because gavl is the only common dependency of gmerlin and gmerlin-avdecoder, and I didn't want to define them twice.
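To make the two structures more concrete, here is a minimal, self-contained sketch of what such a pair could look like. Everything prefixed with demo_ is hypothetical and chosen for illustration only; the real definitions live in gavl/compression.h and may have different names and fields. The point is that the housekeeping code only manages the buffer and never looks inside the payload.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical codec IDs; the real enum in gavl/compression.h may differ. */
typedef enum {
  DEMO_CODEC_NONE = 0,
  DEMO_CODEC_MPEG1_VIDEO,
  DEMO_CODEC_MPEG2_VIDEO,
  DEMO_CODEC_MP3
} demo_codec_id_t;

/* Describes the compression format of one stream (assumed fields). */
typedef struct {
  demo_codec_id_t id;
  int bitrate;       /* bits per second, 0 if unknown */
  int has_b_frames;  /* nonzero if the stream contains B-frames */
} demo_compression_info_t;

/* One compressed packet (assumed fields). The payload is opaque here. */
typedef struct {
  uint8_t *data;
  size_t data_len;    /* bytes currently used */
  size_t data_alloc;  /* bytes allocated */
  int64_t pts;
  int64_t duration;
  int keyframe;
} demo_packet_t;

/* Make sure the packet can hold at least len bytes.
   Returns 1 on success, 0 if the allocation failed. */
int demo_packet_alloc(demo_packet_t *p, size_t len)
{
  assert(p != NULL);
  if (len > p->data_alloc) {
    uint8_t *tmp = realloc(p->data, len);
    if (!tmp)
      return 0;
    p->data = tmp;
    p->data_alloc = len;
  }
  return 1;
}

/* Store a payload without interpreting it. */
int demo_packet_set_data(demo_packet_t *p, const uint8_t *src, size_t len)
{
  assert(src != NULL || len == 0);
  if (!demo_packet_alloc(p, len))
    return 0;
  if (len)
    memcpy(p->data, src, len);
  p->data_len = len;
  return 1;
}

/* Release the buffer and reset all fields. */
void demo_packet_free(demo_packet_t *p)
{
  assert(p != NULL);
  free(p->data);
  *p = (demo_packet_t){0};
}
```

A zero-initialized demo_packet_t is a valid empty packet, so the same packet can be reused for every read and only grows when a larger payload arrives.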

gmerlin-avdecoder
There are 2 new functions for getting the compression format of A/V streams:
int bgav_get_audio_compression_info(bgav_t * bgav, int stream,
                                    gavl_compression_info_t * info);

int bgav_get_video_compression_info(bgav_t * bgav, int stream,
                                    gavl_compression_info_t * info);
They can be called after the track was selected with bgav_select_track(). If the demuxer alone cannot satisfy the conditions listed above for a stream, a parser is tried. If there is no parser for this stream, compressed output fails and the functions return 0.

If you decide to read compressed packets from a stream, pass BGAV_STREAM_READRAW to bgav_set_audio_stream() or bgav_set_video_stream(). Then you can read compressed packets with:
int bgav_read_audio_packet(bgav_t * bgav, int stream, gavl_packet_t * p);

int bgav_read_video_packet(bgav_t * bgav, int stream, gavl_packet_t * p);
There is a small command-line tool, bgavdemux, which writes the compressed packets to raw files, but only if the compression supports a raw format. This is not the case for e.g. vorbis or theora.

libgmerlin
In the gmerlin library, the new feature shows up mainly in the plugin API. The input plugin (bg_input_plugin_t) got 4 new functions, which have the same meaning as their counterparts in gmerlin-avdecoder:
int (*get_audio_compression_info)(void * priv, int stream,
gavl_compression_info_t * info);

int (*get_video_compression_info)(void * priv, int stream,
gavl_compression_info_t * info);

int (*read_audio_packet)(void * priv, int stream, gavl_packet_t * p);

int (*read_video_packet)(void * priv, int stream, gavl_packet_t * p);

On the encoding side, there are 6 new functions, which are used for querying if compressed writing is possible, adding compressed A/V tracks and writing compressed A/V packets:
int (*writes_compressed_audio)(void * priv,
const gavl_audio_format_t * format,
const gavl_compression_info_t * info);

int (*writes_compressed_video)(void * priv,
const gavl_video_format_t * format,
const gavl_compression_info_t * info);

int (*add_audio_stream_compressed)(void * priv, const char * language,
const gavl_audio_format_t * format,
const gavl_compression_info_t * info);

int (*add_video_stream_compressed)(void * priv,
const gavl_video_format_t * format,
const gavl_compression_info_t * info);

int (*write_audio_packet)(void * data, gavl_packet_t * packet, int stream);

int (*write_video_packet)(void * data, gavl_packet_t * packet, int stream);
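
As a rough illustration of how the two sides fit together, the following self-contained sketch wires a read_audio_packet callback to a write_audio_packet callback. It does not use the real gmerlin headers: plug_packet_t and the two reduced plugin tables are hypothetical stand-ins that only mirror the argument order of the prototypes above, and the toy source and sink exist purely to make the example runnable. It also assumes that a return value of 0 from the read callback means "no more packets" and 0 from the write callback means failure.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for gavl_packet_t; the fields are assumptions. */
typedef struct {
  const unsigned char *data;
  size_t len;
} plug_packet_t;

/* Reduced, hypothetical plugin tables with the packet callbacks only. */
typedef struct {
  int (*read_audio_packet)(void *priv, int stream, plug_packet_t *p);
} plug_input_t;

typedef struct {
  int (*write_audio_packet)(void *data, plug_packet_t *packet, int stream);
} plug_encoder_t;

/* Copy packets until the input returns 0.
   Returns the number of packets copied, or -1 if a write failed. */
int plug_copy_audio(const plug_input_t *in, void *in_priv, int in_stream,
                    const plug_encoder_t *out, void *out_priv, int out_stream)
{
  plug_packet_t p = { NULL, 0 };
  int count = 0;
  assert(in && in->read_audio_packet);
  assert(out && out->write_audio_packet);
  while (in->read_audio_packet(in_priv, in_stream, &p)) {
    if (!out->write_audio_packet(out_priv, &p, out_stream))
      return -1;
    count++;
  }
  return count;
}

/* Toy in-memory source and sink, for demonstration only. */
typedef struct { int remaining; } plug_toy_source_t;
typedef struct { int written; size_t bytes; } plug_toy_sink_t;

static const unsigned char plug_toy_payload[4] = { 0xde, 0xad, 0xbe, 0xef };

/* Hands out the same 4-byte payload until the counter runs out. */
int plug_toy_read(void *priv, int stream, plug_packet_t *p)
{
  plug_toy_source_t *s = priv;
  (void)stream;
  if (s->remaining <= 0)
    return 0;
  s->remaining--;
  p->data = plug_toy_payload;
  p->len = sizeof(plug_toy_payload);
  return 1;
}

/* Counts packets and bytes instead of writing a file. */
int plug_toy_write(void *data, plug_packet_t *packet, int stream)
{
  plug_toy_sink_t *s = data;
  (void)stream;
  s->written++;
  s->bytes += packet->len;
  return 1;
}
```

In a real application the encoder side would first be asked via writes_compressed_audio() whether it accepts the format, and the stream would be added with add_audio_stream_compressed() before any packet is written.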
gmerlin-transcoder
In the gmerlin transcoder you have a configuration for each A/V stream:

The options for the stream can be "transcode", "copy (if possible)" or "forget". Copying of a stream is possible if the following conditions are met:
  • The source can deliver compressed packets
  • The encoder can write compressed packets of that format
  • No subtitles are blended onto video images
All filters are however completely ignored. You can configure any filters you want, but when you choose to copy the stream, none of them will be applied.

If a stream cannot be copied, it will be transcoded.
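
The decision described above can be sketched in a few lines. This is not the transcoder's actual code; the tc_ names are hypothetical and only restate the three conditions and the fallback to transcoding.

```c
#include <assert.h>
#include <stddef.h>

/* The three per-stream options from the configuration dialog. */
typedef enum {
  TC_ACTION_TRANSCODE,
  TC_ACTION_COPY,
  TC_ACTION_FORGET
} tc_action_t;

/* Facts about one stream that decide whether copying is possible. */
typedef struct {
  int source_delivers_packets;  /* input can deliver compressed packets */
  int encoder_writes_format;    /* encoder can write this compressed format */
  int subtitles_blended;        /* subtitles are blended onto the video */
} tc_stream_state_t;

/* Turn the requested action into the one that is actually performed.
   "Copy" silently falls back to "transcode" if any condition fails. */
tc_action_t tc_resolve_action(tc_action_t requested, const tc_stream_state_t *s)
{
  assert(s != NULL);
  if (requested != TC_ACTION_COPY)
    return requested;
  if (s->source_delivers_packets &&
      s->encoder_writes_format &&
      !s->subtitles_blended)
    return TC_ACTION_COPY;
  return TC_ACTION_TRANSCODE;
}

/* Filters only take effect when the stream is really transcoded. */
int tc_filters_applied(tc_action_t effective)
{
  return effective == TC_ACTION_TRANSCODE;
}
```

Keeping the requested and the effective action separate makes it easy to tell the user when a stream that was configured as "copy (if possible)" ended up being transcoded.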

libquicktime
Another major project was support in libquicktime. It's a bit nasty because libquicktime codecs do tasks which should actually be done by the (de)multiplexer. In practice this means that compressed streams have to be enabled for each codec and container separately. The public API is in compression.h. It was modeled after the functions in libgmerlin, but the definition of the compression (lqt_compression_info_t) is slightly different because inside libquicktime we can't use gavl.

I made a small tool, lqtremux. If it is called with a single file as an argument, all A/V streams are exported to separate quicktime files. If you pass more than one file on the command line, the last file is considered the output file and all tracks of all other files are multiplexed into the output file. Note that lqtremux is a pretty dumb application, which was written mainly as a demonstration and testbed for the new functionality. In particular, you cannot copy some tracks while transcoding others. For more sophisticated tasks use gmerlin-transcoder or write your own tool.
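
The command-line convention is simple enough to sketch. The code below is not taken from lqtremux; the remux_ names are hypothetical, and it only shows how the list of file arguments (without the program name) maps to the two modes described above.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
  REMUX_USAGE,      /* no files given, print usage */
  REMUX_EXPORT,     /* one file: export each stream to its own file */
  REMUX_MULTIPLEX   /* several files: the last one is the output */
} remux_mode_t;

typedef struct {
  remux_mode_t mode;
  int num_inputs;
  char **inputs;   /* first num_inputs entries of the argument list */
  char *output;    /* NULL in export mode */
} remux_job_t;

/* Interpret the file arguments. files must not include the program name. */
remux_job_t remux_parse_files(int nfiles, char **files)
{
  remux_job_t job = { REMUX_USAGE, 0, NULL, NULL };
  assert(nfiles <= 0 || files != NULL);
  if (nfiles <= 0)
    return job;
  job.inputs = files;
  if (nfiles == 1) {
    job.mode = REMUX_EXPORT;
    job.num_inputs = 1;
  } else {
    job.mode = REMUX_MULTIPLEX;
    job.num_inputs = nfiles - 1;
    job.output = files[nfiles - 1];
  }
  return job;
}
```

From a main() function this would typically be called as remux_parse_files(argc - 1, argv + 1).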

Status and TODO
Most major codecs and containers work, although not all of them are heavily tested. Therefore I cannot guarantee that files written that way will be compatible with all other decoders. Future work will be testing, fixing and supporting more codecs in more containers. Of course any help (like bug reports or compatibility testing on Windows or OS X) is highly appreciated.

With this feature my A/V pipelines are ready for a 1.x version now.