6/18/2011

Streaming JPEG images over RTP using live555 streaming media

Live555 Streaming Media is a software library that mainly provides RTP/RTSP streaming features. It supports many popular video and audio codecs. JPEG is still the most popular lossy image compression format, and Live555 supports it as well, though not directly. Live555 implements the RTP payload format for JPEG, but it does not come with a JPEG parser. Following the usual Live555 programming flow, a user who wants to stream JPEG images through it needs:
(i) "JPEGVideoRTPSink", which will be fed by
(ii) a *subclass* of "JPEGVideoSource".


If you take a peek at "JPEGVideoSource", you will find that it is an abstract base class. You must implement your own subclass of "JPEGVideoSource" that provides your own implementations of "type()", "qFactor()", "width()", "height()" and possibly "quantizationTables()". In short, a JPEG parser has to be implemented first to make it work. Take a look at RFC 2435:

3.1.  JPEG header
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Type-specific |              Fragment Offset                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      Type     |       Q       |     Width     |     Height    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
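
As a rough illustration of the layout above, the eight bytes of the main header can be packed as in the following sketch. The struct "JpegMainHeader" and the function "packMainHeader" are hypothetical names used only for this example; they are not part of Live555 or of the parser below. RFC 2435 carries width and height in units of 8 pixels, so a 640x480 frame is sent as 80 and 60.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical plain struct mirroring the main JPEG header fields
// of RFC 2435, section 3.1. Not a Live555 type.
struct JpegMainHeader {
    uint8_t  typeSpecific;
    uint32_t fragmentOffset;  // offset of this packet's data in the frame; 24 bits on the wire
    uint8_t  type;
    uint8_t  q;
    uint8_t  width;           // image width in 8-pixel units
    uint8_t  height;          // image height in 8-pixel units
};

// Writes the header into out[0..7] in network byte order.
void packMainHeader(const JpegMainHeader& h, uint8_t out[8])
{
    assert(h.fragmentOffset <= 0xFFFFFFu);  // must fit in 24 bits
    out[0] = h.typeSpecific;
    out[1] = static_cast<uint8_t>((h.fragmentOffset >> 16) & 0xFF);
    out[2] = static_cast<uint8_t>((h.fragmentOffset >> 8) & 0xFF);
    out[3] = static_cast<uint8_t>(h.fragmentOffset & 0xFF);
    out[4] = h.type;
    out[5] = h.q;
    out[6] = h.width;
    out[7] = h.height;
}
```

The one-byte width and height fields are why the parser below rejects images larger than 2040 pixels in either dimension (255 * 8 = 2040).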

3.1.7.  Restart Marker header
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       Restart Interval        |F|L|       Restart Count       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
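
The Restart Marker header follows the main header when the Type field is in the range 64 to 127. A minimal sketch of packing it is shown here; "packRestartHeader" is a hypothetical helper name, not a Live555 function. According to RFC 2435, a sender that cannot guarantee that restart intervals line up with packet boundaries sets both F and L to 1 and the Restart Count to 0x3FFF.

```cpp
#include <cassert>
#include <cstdint>

// Writes the 4-byte Restart Marker header (RFC 2435, section 3.1.7)
// into out[0..3]. 'first' and 'last' are the F and L bits;
// restartCount is a 14-bit value.
void packRestartHeader(uint16_t restartInterval, bool first, bool last,
                       uint16_t restartCount, uint8_t out[4])
{
    assert(restartCount <= 0x3FFF);  // only 14 bits are available
    out[0] = static_cast<uint8_t>(restartInterval >> 8);
    out[1] = static_cast<uint8_t>(restartInterval & 0xFF);
    out[2] = static_cast<uint8_t>((first ? 0x80 : 0x00) |
                                  (last ? 0x40 : 0x00) |
                                  ((restartCount >> 8) & 0x3F));
    out[3] = static_cast<uint8_t>(restartCount & 0xFF);
}
```

Calling it with F = 1, L = 1 and a count of 0x3FFF yields 0xFF 0xFF in the last two bytes, which tells the receiver to reassemble the whole frame before decoding.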

3.1.8.  Quantization Table header
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      MBZ      |   Precision   |             Length            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Quantization Table Data                    |
   |                              ...                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
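
The Quantization Table header is only sent when the Q field is in the range 128 to 255, and only in the first packet of a frame. The sketch below shows one way to serialize it; "buildQuantizationHeader" is a hypothetical helper name chosen for this example. For two 8-bit tables of 64 bytes each, the Precision field is 0 and the Length field is 128.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Builds a Quantization Table header (RFC 2435, section 3.1.8)
// followed by 'length' bytes of table data.
std::vector<uint8_t> buildQuantizationHeader(uint8_t precision,
                                             const uint8_t* tables,
                                             uint16_t length)
{
    assert(tables != nullptr || length == 0);
    std::vector<uint8_t> out;
    out.reserve(4u + length);
    out.push_back(0);                                    // MBZ, must be zero
    out.push_back(precision);                            // one bit per table, 0 = 8-bit
    out.push_back(static_cast<uint8_t>(length >> 8));    // Length, high byte
    out.push_back(static_cast<uint8_t>(length & 0xFF));  // Length, low byte
    out.insert(out.end(), tables, tables + length);      // Quantization Table Data
    return out;
}
```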

The fields of the JPEG/RTP payload header are supplied by "JPEGVideoSource". After studying RFC 2435 and other implementations from the open source community, I wrote a JPEG image parser that works with "JPEGVideoSource" and its subclasses. Use this parser to parse JPEG image files.

JpegFrameParser.hh
/*
    Copyright (C) 2011, W.L. Chuang <ponponli2000 at gmail.com>

    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program.  If not, see <http://www.gnu.org/licenses/>.
*/

#ifndef _JPEG_FRAME_PARSER_HH_INCLUDED
#define _JPEG_FRAME_PARSER_HH_INCLUDED


class JpegFrameParser
{
public:
    JpegFrameParser();
    virtual ~JpegFrameParser();

    unsigned char width()     { return _width; }
    unsigned char height()    { return _height; }
    unsigned char type()      { return _type; }
    unsigned char precision() { return _precision; }
    unsigned char qFactor()   { return _qFactor; }

    unsigned short restartInterval() { return _restartInterval; }

    unsigned char const* quantizationTables(unsigned short& length)
    {
        length = _qTablesLength;
        return _qTables;
    }

    int parse(unsigned char* data, unsigned int size);

    unsigned char const* scandata(unsigned int& length)
    {
        length = _scandataLength;

        return _scandata;
    }

private:
    unsigned int scanJpegMarker(const unsigned char* data,
                                unsigned int size,
                                unsigned int* offset);
    int readSOF(const unsigned char* data,
                unsigned int size, unsigned int* offset);
    unsigned int readDQT(const unsigned char* data,
                         unsigned int size, unsigned int offset);
    int readDRI(const unsigned char* data,
                unsigned int size, unsigned int* offset);

private:
    unsigned char _width;
    unsigned char _height;
    unsigned char _type;
    unsigned char _precision;
    unsigned char _qFactor;

    unsigned char* _qTables;
    unsigned short _qTablesLength;

    unsigned short _restartInterval;

    unsigned char* _scandata;
    unsigned int   _scandataLength;
};


#endif /* _JPEG_FRAME_PARSER_HH_INCLUDED */

JpegFrameParser.cpp
/*
    Copyright (C) 2011, W.L. Chuang <ponponli2000 at gmail.com>

    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program.  If not, see <http://www.gnu.org/licenses/>.
*/

#include <string.h>

#include "JpegFrameParser.hh"

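#ifndef NDEBUG
    #include <stdio.h>
    #define LOGGY(format, ...) fprintf (stderr, format, ##__VA_ARGS__)
#else
    /* compile logging out of release builds so LOGGY is always defined */
    #define LOGGY(format, ...) ((void)0)
#endif /* NDEBUG */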


enum
{
    START_MARKER = 0xFF,
    SOI_MARKER   = 0xD8,
    JFIF_MARKER  = 0xE0,
    CMT_MARKER   = 0xFE,
    DQT_MARKER   = 0xDB,
    SOF_MARKER   = 0xC0,
    DHT_MARKER   = 0xC4,
    SOS_MARKER   = 0xDA,
    EOI_MARKER   = 0xD9,
    DRI_MARKER   = 0xDD
};

typedef struct
{
    unsigned char id;
    unsigned char samp;
    unsigned char qt;
} CompInfo;


JpegFrameParser::JpegFrameParser() :
    _width(0), _height(0), _type(0),
    _precision(0), _qFactor(255),
    _qTables(NULL), _qTablesLength(0),
    _restartInterval(0),
    _scandata(NULL), _scandataLength(0)
{
    _qTables = new unsigned char[128 * 2];
    memset(_qTables, 8, 128 * 2);
}

JpegFrameParser::~JpegFrameParser()
{
    if (_qTables != NULL) delete[] _qTables;
}

unsigned int JpegFrameParser::scanJpegMarker(const unsigned char* data,
                                             unsigned int size,
                                             unsigned int* offset)
{
    while ((data[(*offset)++] != START_MARKER) && ((*offset) < size));

    if ((*offset) >= size) {
        return EOI_MARKER;
    } else {
        unsigned int marker;

        marker = data[*offset];
        (*offset)++;

        return marker;
    }
}

static unsigned int _jpegHeaderSize(const unsigned char* data, unsigned int offset)
{
    return data[offset] << 8 | data[offset + 1];
}

int JpegFrameParser::readSOF(const unsigned char* data, unsigned int size,
                             unsigned int* offset)
{
    int i, j;
    CompInfo elem;
    CompInfo info[3] = { {0,}, };
    unsigned int sof_size, off;
    unsigned int width, height, infolen;

    off = *offset;

    /* we need at least 17 bytes for the SOF */
    if (off + 17 > size) goto wrong_size;

    sof_size = _jpegHeaderSize(data, off);
    if (sof_size < 17) goto wrong_length;

    *offset += sof_size;

    /* skip size */
    off += 2;

    /* precision should be 8 */
    if (data[off++] != 8) goto bad_precision;

    /* read dimensions */
    height = data[off] << 8 | data[off + 1];
    width = data[off + 2] << 8 | data[off + 3];
    off += 4;

    if (height == 0 || height > 2040) goto invalid_dimension;
    if (width == 0 || width > 2040) goto invalid_dimension;

    _width = width / 8;
    _height = height / 8;

    /* we only support 3 components */
    if (data[off++] != 3) goto bad_components;

    infolen = 0;
    for (i = 0; i < 3; i++) {
        elem.id = data[off++];
        elem.samp = data[off++];
        elem.qt = data[off++];

        /* insertion sort by component id, from the last element to the first */
        for (j = infolen; j > 0; j--) {
            if (info[j - 1].id < elem.id) break;
            info[j] = info[j - 1];
        }
        info[j] = elem;
        infolen++;
    }

    /* see that the components are supported */
    if (info[0].samp == 0x21) {
        _type = 0;
    } else if (info[0].samp == 0x22) {
        _type = 1;
    } else {
        goto invalid_comp;
    }

    if (!(info[1].samp == 0x11)) goto invalid_comp;
    if (!(info[2].samp == 0x11)) goto invalid_comp;
    if (info[1].qt != info[2].qt) goto invalid_comp;

    return 0;

    /* ERRORS */
wrong_size:
    LOGGY("Wrong SOF size\n");
    return -1;

wrong_length:
    LOGGY("Wrong SOF length\n");
    return -1;

bad_precision:
    LOGGY("Bad precision\n");
    return -1;

invalid_dimension:
    LOGGY("Invalid dimension\n");
    return -1;

bad_components:
    LOGGY("Bad component\n");
    return -1;

invalid_comp:
    LOGGY("Invalid component\n");
    return -1;
}

unsigned int JpegFrameParser::readDQT(const unsigned char* data,
                                      unsigned int size,
                                      unsigned int offset)
{
    unsigned int quant_size, tab_size;
    unsigned char prec;
    unsigned char id;

    if (offset + 2 > size) goto too_small;

    quant_size = _jpegHeaderSize(data, offset);
    if (quant_size < 2) goto small_quant_size;

    /* clamp to available data */
    if (offset + quant_size > size) {
        quant_size = size - offset;
    }

    offset += 2;
    quant_size -= 2;

    while (quant_size > 0) {
        /* not enough to read the id */
        if (offset + 1 > size) break;

        id = data[offset] & 0x0f;
        if (id == 15) goto invalid_id;

        prec = (data[offset] & 0xf0) >> 4;
        if (prec) {
            tab_size = 128;
            _qTablesLength = 128 * 2;
        } else {
            tab_size = 64;
            _qTablesLength = 64 * 2;
        }

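        /* there is not enough for the table */
        if (quant_size < tab_size + 1) goto no_table;

        /* _qTables holds 256 bytes; reject ids that would write past it */
        if ((id + 1) * tab_size > 128 * 2) goto invalid_id;

        //LOGGY("Copy quantization table: %u\n", id);
        memcpy(&_qTables[id * tab_size], &data[offset + 1], tab_size);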

        tab_size += 1;
        quant_size -= tab_size;
        offset += tab_size;
    }

done:
    return offset + quant_size;

    /* ERRORS */
too_small:
    LOGGY("DQT is too small\n");
    return size;

small_quant_size:
    LOGGY("Quantization table is too small\n");
    return size;

invalid_id:
    LOGGY("Invalid table ID\n");
    goto done;

no_table:
    LOGGY("table doesn't exist\n");
    goto done;
}

int JpegFrameParser::readDRI(const unsigned char* data,
                             unsigned int size, unsigned int* offset)
{
    unsigned int dri_size, off;

    off = *offset;

    /* we need at least 4 bytes for the DRI */
    if (off + 4 > size) goto wrong_size;

    dri_size = _jpegHeaderSize(data, off);
    if (dri_size < 4) goto wrong_length;

    *offset += dri_size;
    off += 2;

    _restartInterval = (data[off] << 8) | data[off + 1];

    return 0;

wrong_size:
    return -1;

wrong_length:
    *offset += dri_size;
    return -1;
}

int JpegFrameParser::parse(unsigned char* data, unsigned int size)
{
    _width  = 0;
    _height = 0;
    _type = 0;
    _precision = 0;
    //_qFactor = 0;
    _restartInterval = 0;

    _scandata = NULL;
    _scandataLength = 0;

    unsigned int offset = 0;
    unsigned int dqtFound = 0;
    unsigned int sosFound = 0;
    unsigned int sofFound = 0;
    unsigned int driFound = 0;
    unsigned int jpeg_header_size = 0;

    while ((sosFound == 0) && (offset < size)) {
        switch (scanJpegMarker(data, size, &offset)) {
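        case JFIF_MARKER:
        case CMT_MARKER:
        case DHT_MARKER:
            /* skip the whole segment; bail out if its length field is cut off */
            if (offset + 2 > size) goto invalid_format;
            offset += _jpegHeaderSize(data, offset);
            break;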
        case SOF_MARKER:
            if (readSOF(data, size, &offset) != 0) {
                goto invalid_format;
            }
            sofFound = 1;
            break;
        case DQT_MARKER:
            offset = readDQT(data, size, offset);
            dqtFound = 1;
            break;
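        case SOS_MARKER:
            /* the scan data starts right after the SOS segment */
            if (offset + 2 > size) goto invalid_format;
            sosFound = 1;
            jpeg_header_size = offset + _jpegHeaderSize(data, offset);
            if (jpeg_header_size > size) goto invalid_format;
            break;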
        case EOI_MARKER:
            /* EOI reached before SOS!? */
            LOGGY("EOI reached before SOS!?\n");
            break;
        case SOI_MARKER:
            //LOGGY("SOI found\n");
            break;
        case DRI_MARKER:
            LOGGY("DRI found\n");
            if (readDRI(data, size, &offset) == 0) {
                driFound = 1;
            }
            break;
        default:
            break;
        }
    }
    /* without DQT, SOF and SOS there is nothing usable to stream */
    if ((dqtFound == 0) || (sofFound == 0) || (sosFound == 0)) {
        goto unsupported_jpeg;
    }

    if (_width == 0 || _height == 0) {
        goto no_dimension;
    }

    _scandata = data + jpeg_header_size;
    _scandataLength = size - jpeg_header_size;

    if (driFound == 1) {
        _type += 64;
    }

    return 0;

    /* ERRORS */
unsupported_jpeg:
    return -1;

no_dimension:
    return -1;

invalid_format:
    return -1;
}
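
To make the control flow of parse() easier to follow, here is a small self-contained sketch that walks the marker segments of a JPEG buffer and records the marker codes it meets, stopping at SOS the same way the parser does. The function "listMarkers" is a hypothetical helper written for this illustration and is not part of JpegFrameParser; it only lists markers and does not read SOF, DQT or DRI contents.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the marker codes found in a JPEG buffer, in order, up to and
// including SOS (0xDA). Stops early on EOI or on a truncated segment.
std::vector<uint8_t> listMarkers(const uint8_t* data, size_t size)
{
    assert(data != nullptr || size == 0);
    std::vector<uint8_t> markers;
    size_t pos = 0;
    while (pos + 1 < size) {
        if (data[pos] != 0xFF) { ++pos; continue; }  // not a marker prefix
        uint8_t marker = data[pos + 1];
        if (marker == 0xFF) { ++pos; continue; }     // fill byte, keep scanning
        pos += 2;
        markers.push_back(marker);
        if (marker == 0xD8) continue;                // SOI carries no length field
        if (marker == 0xD9 || marker == 0xDA) break; // EOI, or SOS: scan data follows
        if (pos + 2 > size) break;                   // length field is cut off
        size_t len = (static_cast<size_t>(data[pos]) << 8) | data[pos + 1];
        if (len < 2 || pos + len > size) break;      // malformed or truncated segment
        pos += len;                                  // the length includes its own 2 bytes
    }
    return markers;
}
```

Running it over a frame from your encoder is a quick way to check that the DQT (0xDB) and SOF0 (0xC0) segments the parser requires are actually present before SOS.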

Demo:

21 comments:

  1. What about the code for the
    "*subclass* of "JPEGVideoSource""?

  2. The original liveMedia doesn't provide a "real" JPEG source. Instead, live555 declares an abstract C++ class named "JPEGVideoSource". Anyone who wants to stream JPEG frames using liveMedia should derive a JPEG frame source from "JPEGVideoSource". A JPEG frame source would contain a JPEG frame parser, just like the one above. :)

  3. What about _qFactor? It's always 255, and that is not working for me.

    Replies
    1. qFactor can be parsed from JPEG frames. It will not always be 255. Please refer to RFC2435 for the meaning of qFactor.

    2. I cannot make sense of your situation. This JPEG frame parser has been tested with JPEG frames produced by both software and hardware JPEG encoders. It should work in most cases. A Q field of 255 means dynamically defined quantization tables. Anyway, please refer to section 4.2 of RFC2435.

  4. Hi

    Could you please tell me how you used that parser with live555? For example, what values does the parser return, which class are they assigned to, and what type of input should be given to the parser class?

    Thanks

    Replies
    1. This is a JPEG frame parser. You'd derive a new JPEG source class from "JPEGVideoSource", which should embed the parser and parse JPEG frames that are read from somewhere like a file or a live source.

    2. Firstly, thank you so much for the quick reply :)

      Actually I have written a subclass of JPEGVideoSource which takes an input of type FramedSource*, but now I am confused about how I should use your parser class. Kindly send me your JPEGVideoSource subclass in which the parser class is already integrated (if possible). I would be very thankful. Thanks

      my E-mail ID is:-
      hasnat.hym@gmail.com

    3. You'd composite the parser into the subclassed JPEG video source. There is no magic.

    4. OK, I am done with the parser.

      But now the VLC client creates a session but displays nothing except a black screen.

    5. Please check RFC2435 for more information.

    6. For testing, should I use a .MJpeg file or a .jpg?

    7. The file name doesn't matter. The point is that a JPEG source should be able to read and parse JPEG frames periodically.

    8. This comment has been removed by the author.

    9. Is your demo code available somewhere?

    10. I am sorry. The demo code is not ready to be published yet. Only the parser is open source so far.

    11. Ok no problem and thanks for helping :)

      i am trying ...

  5. This comment has been removed by the author.

  6. Hello
    I know that you said that compositing the parser into the subclassed JPEG video source is no magic, but it is for me :). I'm not sure how to write this class and how to use your parser on a single file. Could you give me an example or some explanation?
    Thanks

    Replies
    1. Please refer to the following links:
      http://www.live555.com/liveMedia/faq.html#jpeg-streaming
      http://lists.live555.com/pipermail/live-devel/2005-January/001908.html
      http://lists.live555.com/pipermail/live-devel/2003-November/000037.html

    2. Hi William Chuang,
      I have seen your JPEG streaming demo application. Please share your demo T.T
      Also, how do I use your JPEG streaming?
      Help me please.
