MJPEG decoding (SendStream()) randomly shows a high delay

Hello everyone,
I’ve recently been testing MJPEG hardware decoding on the Duo-S (SG2000), but I ran into an issue: when I send certain frames from an MJPEG file to vdec using SendStream, the function takes nearly 2 seconds to run. I haven’t seen this problem with H.264 hardware decoding, but with MJPEG it sometimes happens.

Details:

  • Development board: Milk-V Duo-S
  • Firmware version: V1

What I’ve tried:

  1. Changed the third parameter (the timeout) of SendStream to -1, 0, and 200 ms — no effect
  2. Replaced problematic frames with other frames from the MJPEG file that don’t cause delays — this avoids delays at the original positions, but new delays then appear in other places
  3. Split file reading/decoding and frame fetching/YOLO inference into two threads — no effect
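To confirm the stall really is inside `CVI_VDEC_SendStream` itself (the timing code below measures read, decode, and inference, but not the send call), a small monotonic-clock helper can wrap it. This is a minimal sketch; the SDK call shown in the usage comment is only illustrative:

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Millisecond difference between two CLOCK_MONOTONIC timestamps. */
static double timespec_diff_ms(const struct timespec *a, const struct timespec *b)
{
    return (b->tv_sec - a->tv_sec) * 1000.0 +
           (b->tv_nsec - a->tv_nsec) / 1e6;
}

/* Usage sketch around the suspect call:
 *
 *   struct timespec t0, t1;
 *   clock_gettime(CLOCK_MONOTONIC, &t0);
 *   // CVI_VDEC_SendStream(VdChn, &stStream, -1);
 *   clock_gettime(CLOCK_MONOTONIC, &t1);
 *   double ms = timespec_diff_ms(&t0, &t1);
 *   if (ms > 100.0)
 *       fprintf(stderr, "frame %d: SendStream took %.1f ms\n", frame_count, ms);
 */
```

Logging only the outliers (plus the frame index and its byte length) makes it easy to correlate the stalls with specific JPEGs in the file.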

Other information:

  1. For MJPEG files, the frames that cause delays are always the same ones. For UVC streams, the problematic frames vary
  2. YOLO inference runs at around 20 FPS, while the video file itself is 30 FPS

My code:

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <string.h>
#include <unistd.h>
#include <time.h>
#include <signal.h>
#include <sys/stat.h>

#include "cvi_tdl.h"
#include "cvi_tdl_media.h"
#include "core/utils/vpss_helper.h"
#include "cvi_vb.h"
#include "cvi_sys.h"
#include "cvi_vdec.h"

// Define common constants for clarity
#define VIDEO_WIDTH 1920
#define VIDEO_HEIGHT 1088
#define STREAM_BUFFER_SIZE (1024 * 1024 * 2) // 2MB buffer for reading stream data

// Global flag to handle program termination via signals.
static volatile bool bExit = false;

// Signal handler for graceful shutdown on SIGINT or SIGTERM.
void handle_sigint(int sig) {
    if (sig == SIGINT || sig == SIGTERM) {
        printf("\nCaught signal, preparing to exit...\n");
        bExit = true;
    }
}

/**
 * @brief Reads a single H.264 frame (NAL unit) from a file stream.
 * This function locates NAL unit start codes (0x000001 or 0x00000001) to delimit frames.
 * @param fp File pointer to the H.264 stream.
 * @param pu8Buf Buffer to store the frame data.
 * @param pu32Len Output pointer for the length of the read frame.
 * @return CVI_SUCCESS on success, CVI_FAILURE on EOF or error.
 */
CVI_S32 h264_read_frame(FILE *fp, CVI_U8 *pu8Buf, CVI_U32 *pu32Len) {
    int read_len;
    int start_code_found = 0;
    int len = 0;
    int zero_count = 0;
    unsigned char *p;

    if (feof(fp)) {
        return CVI_FAILURE;
    }

    read_len = fread(pu8Buf, 1, STREAM_BUFFER_SIZE, fp);
    if (read_len <= 0) {
        return CVI_FAILURE;
    }

    p = pu8Buf;
    while (p < pu8Buf + read_len) {
        if (*p == 0) {
            zero_count++;
        } else if (*p == 1 && zero_count >= 2) {
            if (start_code_found) {
                fseek(fp, (long)(p - zero_count - (pu8Buf + read_len)), SEEK_CUR);
                *pu32Len = len - zero_count;
                return CVI_SUCCESS;
            }
            start_code_found = 1;
            zero_count = 0;
        } else {
            zero_count = 0;
        }
        len++;
        p++;
    }

    *pu32Len = len;
    return CVI_SUCCESS;
}

/**
 * @brief Reads a single MJPEG frame from a file stream.
 * This function robustly finds a complete JPEG frame by locating the
 * Start Of Image (SOI) 0xFFD8 and End Of Image (EOI) 0xFFD9 markers.
 * @param fp File pointer to the MJPEG stream.
 * @param pu8FrameBuf Buffer to store the complete frame data.
 * @param pu32Len Output pointer for the length of the read frame.
 * @param u32BufSize The total size of pu8FrameBuf to prevent overflow.
 * @return CVI_SUCCESS on success, CVI_FAILURE on error or end of stream.
 */
CVI_S32 mjpeg_read_frame(FILE *fp, CVI_U8 *pu8FrameBuf, CVI_U32 *pu32Len, CVI_U32 u32BufSize) {
    int c, prev_c = EOF;
    CVI_U32 len = 0;
    bool soi_found = false;

    // 1. Find Start of Image (SOI) marker: 0xFFD8
    while ((c = fgetc(fp)) != EOF) {
        if (prev_c == 0xFF && c == 0xD8) {
            soi_found = true;
            break;
        }
        prev_c = c;
    }

    if (!soi_found) {
        return CVI_FAILURE; // End of stream or corrupted file
    }

    // 2. We found SOI. The frame data starts with it. Store it in the buffer.
    if (len + 2 > u32BufSize) {
        fprintf(stderr, "Buffer too small for JPEG frame.\n");
        return CVI_FAILURE;
    }
    pu8FrameBuf[len++] = 0xFF;
    pu8FrameBuf[len++] = 0xD8;
    prev_c = 0xD8;

    // 3. Read and store bytes until the End of Image (EOI) marker (0xFFD9) is found.
    while ((c = fgetc(fp)) != EOF) {
        if (len >= u32BufSize) {
            fprintf(stderr, "Error: MJPEG frame is larger than the buffer size (%u bytes).\n", u32BufSize);
            return CVI_FAILURE;
        }
        pu8FrameBuf[len++] = (CVI_U8)c;
        if (prev_c == 0xFF && c == 0xD9) {
            *pu32Len = len;
            return CVI_SUCCESS; // Frame is complete
        }
        prev_c = c;
    }

    // Reached end of file before finding the EOI marker.
    return CVI_FAILURE;
}


/**
 * @brief Initializes YOLOv8 model pre-processing and algorithm parameters.
 * This function is mandatory and sets the specific configurations required for YOLOv8.
 * @param tdl_handle Handle to the TDL (SDK's AI library).
 * @return CVI_SUCCESS on success, otherwise a CVI error code.
 */
CVI_S32 init_yolo_param(const cvitdl_handle_t tdl_handle) {
    printf("Setting up YOLOv8 parameters...\n");

    YoloPreParam preprocess_cfg = CVI_TDL_Get_YOLO_Preparam(tdl_handle, CVI_TDL_SUPPORTED_MODEL_YOLOV8_DETECTION);
    for (int i = 0; i < 3; i++) {
        preprocess_cfg.factor[i] = 0.003922; // 1/255.0
        preprocess_cfg.mean[i] = 0.0;
    }
    preprocess_cfg.format = PIXEL_FORMAT_RGB_888_PLANAR;
    preprocess_cfg.rescale_type = RESCALE_CENTER;

    CVI_S32 ret = CVI_TDL_Set_YOLO_Preparam(tdl_handle, CVI_TDL_SUPPORTED_MODEL_YOLOV8_DETECTION, preprocess_cfg);
    if (ret != CVI_SUCCESS) {
        fprintf(stderr, "Failed to set YOLOv8 preprocess parameters, error: %#x\n", ret);
        return ret;
    }

    YoloAlgParam yolov8_param = CVI_TDL_Get_YOLO_Algparam(tdl_handle, CVI_TDL_SUPPORTED_MODEL_YOLOV8_DETECTION);
    yolov8_param.cls = 2; // Example: set expected class number

    ret = CVI_TDL_Set_YOLO_Algparam(tdl_handle, CVI_TDL_SUPPORTED_MODEL_YOLOV8_DETECTION, yolov8_param);
    if (ret != CVI_SUCCESS) {
        fprintf(stderr, "Failed to set YOLOv8 algorithm parameters, error: %#x\n", ret);
        return ret;
    }

    CVI_TDL_SetModelThreshold(tdl_handle, CVI_TDL_SUPPORTED_MODEL_YOLOV8_DETECTION, 0.5);
    CVI_TDL_SetModelNmsThreshold(tdl_handle, CVI_TDL_SUPPORTED_MODEL_YOLOV8_DETECTION, 0.5);

    printf("YOLOv8 parameters setup successfully.\n");
    return CVI_SUCCESS;
}

int main(int argc, char **argv) {
    if (argc != 3) {
        fprintf(stderr, "Usage: %s <yolo_model_path> <video_file_path (.h264 or .mjpg/.mjpeg)>\n", argv[0]);
        return CVI_FAILURE;
    }
    const char *yolo_model_path = argv[1];
    const char *video_file_path = argv[2];

    // --- Variable Declarations ---
    CVI_S32 s32Ret = CVI_SUCCESS;
    cvitdl_handle_t tdl_handle = NULL;
    VDEC_CHN VdChn = 0;
    VIDEO_FRAME_INFO_S stFrameInfo;
    cvtdl_object_t obj_meta = {0};
    FILE *fpStrm = NULL;
    CVI_U8 *pu8Buf = NULL;
    int frame_count = 0;
    struct timespec start_time, end_time;
    double elapsed_seconds;
    VB_CONFIG_S stVbConf;
    VDEC_CHN_ATTR_S stVdecChnAttr;
    PAYLOAD_TYPE_E enType;
    struct timespec read_start, read_end, decode_start, decode_end, inference_start, inference_end;
    double read_ms = 0, decode_ms = 0, inference_ms = 0;

    // --- Determine Video Type ---
    const char *file_ext = strrchr(video_file_path, '.');
    if (file_ext && (strcmp(file_ext, ".mjpg") == 0 || strcmp(file_ext, ".mjpeg") == 0)) {
        enType = PT_MJPEG;
        printf("Info: Detected MJPEG stream: %s\n", video_file_path);
    } else if (file_ext && strcmp(file_ext, ".h264") == 0) {
        enType = PT_H264;
        printf("Info: Detected H.264 stream: %s\n", video_file_path);
    } else {
        fprintf(stderr, "Error: Unsupported file type. Please use .h264, .mjpg, or .mjpeg\n");
        return CVI_FAILURE;
    }

    // --- Signal Handler ---
    signal(SIGINT, handle_sigint);
    signal(SIGTERM, handle_sigint);

    // --- System & VB Initialization ---
    memset(&stVbConf, 0, sizeof(VB_CONFIG_S));
    stVbConf.u32MaxPoolCnt = 1;
    if (enType == PT_MJPEG) {
        stVbConf.astCommPool[0].u32BlkSize = VDEC_GetPicBufferSize(
            PT_MJPEG, VIDEO_WIDTH, VIDEO_HEIGHT,
            PIXEL_FORMAT_YUV_PLANAR_444, DATA_BITWIDTH_8, COMPRESS_MODE_NONE);
        stVbConf.astCommPool[0].u32BlkCnt = 3;
    } else {
        stVbConf.astCommPool[0].u32BlkSize = VDEC_GetPicBufferSize(
            PT_H264, VIDEO_WIDTH, VIDEO_HEIGHT,
            PIXEL_FORMAT_YUV_PLANAR_420, DATA_BITWIDTH_8, COMPRESS_MODE_NONE);
        stVbConf.astCommPool[0].u32BlkCnt = 10;
    }
    stVbConf.astCommPool[0].enRemapMode = VB_REMAP_MODE_CACHED;

    s32Ret = CVI_VB_SetConfig(&stVbConf);
    if (s32Ret != CVI_SUCCESS) {
        fprintf(stderr, "CVI_VB_SetConfig failed with %#x!\n", s32Ret);
        return s32Ret;
    }
    s32Ret = CVI_VB_Init();
    if (s32Ret != CVI_SUCCESS) {
        fprintf(stderr, "CVI_VB_Init failed with %#x!\n", s32Ret);
        return s32Ret;
    }
    s32Ret = CVI_SYS_Init();
    if (s32Ret != CVI_SUCCESS) {
        fprintf(stderr, "CVI_SYS_Init failed with %#x!\n", s32Ret);
        goto cleanup_vb;
    }

    // --- TDL (AI Model) Initialization ---
    s32Ret = CVI_TDL_CreateHandle(&tdl_handle);
    if (s32Ret != CVI_SUCCESS) {
        fprintf(stderr, "CVI_TDL_CreateHandle failed with %#x!\n", s32Ret);
        goto cleanup_sys;
    }
    s32Ret = init_yolo_param(tdl_handle);
    if (s32Ret != CVI_SUCCESS) {
        fprintf(stderr, "init_yolo_param failed!\n");
        goto cleanup_tdl;
    }
    s32Ret = CVI_TDL_OpenModel(tdl_handle, CVI_TDL_SUPPORTED_MODEL_YOLOV8_DETECTION, yolo_model_path);
    if (s32Ret != CVI_SUCCESS) {
        fprintf(stderr, "CVI_TDL_OpenModel failed for %s with %#x\n", yolo_model_path, s32Ret);
        goto cleanup_tdl;
    }
    printf("YOLOv8 model opened successfully.\n");

    // --- VDEC Channel Initialization ---
    memset(&stVdecChnAttr, 0, sizeof(VDEC_CHN_ATTR_S));
    stVdecChnAttr.enType = enType;
    stVdecChnAttr.enMode = VIDEO_MODE_FRAME;
    stVdecChnAttr.u32PicWidth = VIDEO_WIDTH;
    stVdecChnAttr.u32PicHeight = VIDEO_HEIGHT;
    // Set buffer counts based on codec type
    stVdecChnAttr.u32FrameBufCnt = (enType == PT_MJPEG) ? 1 : 5; // 1 for MJPEG, 5 for H264 is safe
    stVdecChnAttr.u32StreamBufSize = 0; // Set to 0 to let the driver manage

    s32Ret = CVI_VDEC_CreateChn(VdChn, &stVdecChnAttr);
    if (s32Ret != CVI_SUCCESS) {
        fprintf(stderr, "CVI_VDEC_CreateChn failed with %#x\n", s32Ret);
        goto cleanup_tdl;
    }

    // --- CORRECTION: Set detailed parameters using Get/Set ChnParam ---
    VDEC_CHN_PARAM_S stChnParam;
    s32Ret = CVI_VDEC_GetChnParam(VdChn, &stChnParam);
    if (s32Ret != CVI_SUCCESS) {
        fprintf(stderr, "CVI_VDEC_GetChnParam failed with %#x!\n", s32Ret);
        goto cleanup_vdec_chn;
    }

    // Now, modify the parameters within the correct structure
    if (enType == PT_MJPEG) {
        stChnParam.enPixelFormat = PIXEL_FORMAT_YUV_PLANAR_444;
        stChnParam.u32DisplayFrameNum = 0;
    } else { // PT_H264
        stChnParam.enPixelFormat = PIXEL_FORMAT_YUV_PLANAR_420;
        stChnParam.u32DisplayFrameNum = 2;
    }

    s32Ret = CVI_VDEC_SetChnParam(VdChn, &stChnParam);
    if (s32Ret != CVI_SUCCESS) {
        fprintf(stderr, "CVI_VDEC_SetChnParam failed with %#x!\n", s32Ret);
        goto cleanup_vdec_chn;
    }
    // --- End of Correction ---

    s32Ret = CVI_VDEC_StartRecvStream(VdChn);
    if (s32Ret != CVI_SUCCESS) {
        fprintf(stderr, "CVI_VDEC_StartRecvStream failed with %#x\n", s32Ret);
        goto cleanup_vdec_chn;
    }

    // --- Main Processing Loop ---
    fpStrm = fopen(video_file_path, "rb");
    if (!fpStrm) {
        fprintf(stderr, "Cannot open video file: %s\n", video_file_path);
        goto cleanup_vdec_recv;
    }
    pu8Buf = (CVI_U8*)malloc(STREAM_BUFFER_SIZE);
    if (!pu8Buf) {
        fprintf(stderr, "Failed to allocate stream buffer.\n");
        goto cleanup_file;
    }

    printf("Starting decoding and YOLO inference loop...\n");
    clock_gettime(CLOCK_MONOTONIC, &start_time);

    while (!bExit) {
        VDEC_STREAM_S stStream = {0};
        clock_gettime(CLOCK_MONOTONIC, &read_start);
        if (enType == PT_H264) {
            s32Ret = h264_read_frame(fpStrm, pu8Buf, &stStream.u32Len);
        } else {
            s32Ret = mjpeg_read_frame(fpStrm, pu8Buf, &stStream.u32Len, STREAM_BUFFER_SIZE);
        }
        clock_gettime(CLOCK_MONOTONIC, &read_end);
        read_ms = (read_end.tv_sec - read_start.tv_sec) * 1000.0 + (read_end.tv_nsec - read_start.tv_nsec) / 1000000.0;

        if (s32Ret != CVI_SUCCESS || stStream.u32Len == 0) {
            printf("\nEnd of video stream or read error.\n");
            break;
        }

        stStream.pu8Addr = pu8Buf;
        stStream.u64PTS = frame_count;
        stStream.bEndOfStream = CVI_FALSE;
        stStream.bEndOfFrame = CVI_TRUE;

        if (CVI_VDEC_SendStream(VdChn, &stStream, -1) != CVI_SUCCESS) {
            fprintf(stderr, "CVI_VDEC_SendStream failed, retrying...\n");
            usleep(10000);
            continue;
        }

        clock_gettime(CLOCK_MONOTONIC, &decode_start);
        s32Ret = CVI_VDEC_GetFrame(VdChn, &stFrameInfo, -1);
        clock_gettime(CLOCK_MONOTONIC, &decode_end);
        
        if (s32Ret != CVI_SUCCESS) {
            fprintf(stderr, "\nWarning: CVI_VDEC_GetFrame failed with %#x\n", s32Ret);
            usleep(1000);
            continue;
        }
        decode_ms = (decode_end.tv_sec - decode_start.tv_sec) * 1000.0 + (decode_end.tv_nsec - decode_start.tv_nsec) / 1000000.0;

        clock_gettime(CLOCK_MONOTONIC, &inference_start);
        CVI_TDL_YOLOV8_Detection(tdl_handle, &stFrameInfo, &obj_meta);
        clock_gettime(CLOCK_MONOTONIC, &inference_end);
        inference_ms = (inference_end.tv_sec - inference_start.tv_sec) * 1000.0 + (inference_end.tv_nsec - inference_start.tv_nsec) / 1000000.0;

        printf("\rFrame %d: Detected %u objects. | Read: %.2fms, Decode: %.2fms, Inference: %.2fms ",
               frame_count, obj_meta.size, read_ms, decode_ms, inference_ms);
        fflush(stdout);

        CVI_VDEC_ReleaseFrame(VdChn, &stFrameInfo);
        CVI_TDL_Free(&obj_meta);
        frame_count++;
    }

    // --- Performance Calculation ---
    clock_gettime(CLOCK_MONOTONIC, &end_time);
    elapsed_seconds = (end_time.tv_sec - start_time.tv_sec) +
                      (end_time.tv_nsec - start_time.tv_nsec) / 1000000000.0;
    if (elapsed_seconds > 0) {
        double fps = frame_count / elapsed_seconds;
        printf("\n----------------------------------------\n");
        printf("Processing finished.\n");
        printf("Total frames processed: %d\n", frame_count);
        printf("Total time: %.2f seconds\n", elapsed_seconds);
        printf("Actual FPS (Decode + Inference): %.2f\n", fps);
        printf("----------------------------------------\n");
    }

    // --- Cleanup ---
    if(pu8Buf) free(pu8Buf);
cleanup_file:
    if(fpStrm) fclose(fpStrm);
cleanup_vdec_recv:
    CVI_VDEC_StopRecvStream(VdChn);
cleanup_vdec_chn:
    CVI_VDEC_DestroyChn(VdChn);
cleanup_tdl:
    CVI_TDL_DestroyHandle(tdl_handle);
cleanup_sys:
    CVI_SYS_Exit();
cleanup_vb:
    CVI_VB_Exit();

    printf("Cleanup complete. Exiting.\n");
    return s32Ret;
}
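One thing worth checking in `mjpeg_read_frame` above: a raw byte scan for 0xFFD9 can terminate a frame early if the camera's JPEGs embed an EXIF thumbnail, because the thumbnail carries its own SOI/EOI pair inside an APP1 segment. Feeding such a truncated frame to the decoder could plausibly cause it to stall waiting for the rest. Below is a hedged sketch of a segment-aware length check on an in-memory buffer (marker walking per the JPEG spec; restart markers and truncated buffers are only minimally handled):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the total length of the first complete JPEG in buf, or 0 if
 * none is found. Unlike a raw scan for 0xFFD9, this walks marker
 * segments by their declared length, so an EOI inside an embedded
 * thumbnail (e.g. in an EXIF APP1 segment) is not mistaken for the end
 * of the frame. */
static size_t jpeg_frame_len(const uint8_t *buf, size_t n)
{
    size_t i;

    if (n < 4 || buf[0] != 0xFF || buf[1] != 0xD8) /* must start with SOI */
        return 0;

    for (i = 2; i + 4 <= n && buf[i] == 0xFF; ) {
        uint8_t marker = buf[i + 1];
        size_t seglen = ((size_t)buf[i + 2] << 8) | buf[i + 3];

        if (marker == 0xD9)            /* EOI before any scan data */
            return i + 2;
        if (marker == 0xDA) {          /* SOS: skip header, scan for EOI */
            for (i += 2 + seglen; i + 1 < n; i++)
                if (buf[i] == 0xFF && buf[i + 1] == 0xD9)
                    return i + 2;      /* real end of this frame */
            return 0;                  /* EOI not in buffer yet */
        }
        i += 2 + seglen;               /* skip APPn/DQT/DHT/SOF/... */
    }
    return 0;
}
```

Comparing this length against what `mjpeg_read_frame` returns for the "problematic" frames would quickly confirm or rule out truncation as the cause.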

Thanks a lot in advance for any advice or suggestions! I really appreciate your help!


On the same SoC.

Which output device, MIPI DSI ?

So do you decode on the SoC, with the MJPEG and H.264 decoder?

The “problematic” thing here, I think, is memory transfer between the “userspace” functions.

Some functions are not optimized, ie. the decoder.

The source is “generic” for all IPs from the IP vendor (IP = intellectual property)

Maybe I need some numbers from /proc/interrupts
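A hedged sketch of how those numbers could be grabbed from inside the test program (snapshot before a stall, snapshot after, then compare the dumps to see which interrupt lines fired while SendStream was blocked):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Snapshot /proc/interrupts into buf (NUL-terminated); returns bytes read,
 * or 0 on error. */
static size_t snap_interrupts(char *buf, size_t cap)
{
    size_t len;
    FILE *fp = fopen("/proc/interrupts", "r");

    if (!fp)
        return 0;
    len = fread(buf, 1, cap - 1, fp);
    buf[len] = '\0';
    fclose(fp);
    return len;
}

/* Usage sketch: call once before the delayed frame is sent and once
 * after GetFrame returns, write both buffers to files, and diff them.
 * (The name of the JPEG decoder's IRQ line on SG2000 is not assumed here.) */
```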

.. only general help in this area, aka the SDK.

I don’t use this, because of “obvious” reasons


Thank you for your helpful answer. I tried reading MJPEG frames from both UVC and MJPEG files, but the delay occurs in both cases. I will look into memory management further.


I tried adding a VbPool to the VDEC module, but it still doesn't work.

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <string.h>
#include <unistd.h>
#include <time.h>
#include <signal.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <pthread.h> // Include pthread library for threading

#include "cvi_tdl.h"
#include "cvi_tdl_media.h"
#include "cvi_vb.h"
#include "cvi_sys.h"
#include "cvi_vdec.h"
#include "cvi_buffer.h"

#include "uvctest.h" // Contains UVC related V4L2 definitions

// ====================================================================
// 1. UVC Camera Control Section (Unchanged from original)
// ====================================================================

typedef enum pixformat {
    CVI_PIXFMT_YUYV_E,
    CVI_PIXFMT_MJPEG_E,
    CVI_PIXFMT_BUTT
} CVI_PIXFMT_E;

struct v4l2_bufsinfo {
    __u32 length;
    void *start;
};

#define REQ_BUFS_CNT 4 // Use 4 buffers

typedef struct cvi_uvc_host_ctx {
    int fd;
    CVI_PIXFMT_E fmt;
    int width;
    int height;
    struct v4l2_bufsinfo bufs[REQ_BUFS_CNT];
} CVI_UVC_HOST_CTX;

static int uvc_open(const char *dev) {
    int fd = open(dev, O_RDWR);
    if (fd <= 0) {
        fprintf(stderr, "UVC: open error with %s\n", strerror(errno));
    }
    return fd;
}

static void uvc_close(int fd) {
    if (fd > 0) {
        close(fd);
    }
}

static int uvc_set_fmt(int fd, unsigned int pfmt, int w, int h, int frate) {
    struct v4l2_format fmt;
    memset(&fmt, 0x0, sizeof(struct v4l2_format));
    fmt.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    fmt.fmt.pix.pixelformat = pfmt;
    fmt.fmt.pix.height = h;
    fmt.fmt.pix.width = w;
    fmt.fmt.pix.field = V4L2_FIELD_NONE;
    if (ioctl(fd, VIDIOC_S_FMT, &fmt) == -1) {
        fprintf(stderr, "UVC: set fmt failed with %s\n", strerror(errno));
        return -1;
    }

    struct v4l2_streamparm param;
    memset(&param, 0x0, sizeof(struct v4l2_streamparm));
    param.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    param.parm.capture.timeperframe.denominator = frate;
    param.parm.capture.timeperframe.numerator = 1;
    if(ioctl(fd, VIDIOC_S_PARM, &param) == -1) {
        fprintf(stderr, "UVC: set framerate failed with %s\n", strerror(errno));
    }

    printf("UVC: Set format to %c%c%c%c, %dx%d @ %d fps\n",
           pfmt & 0xFF, (pfmt >> 8) & 0xFF, (pfmt >> 16) & 0xFF, (pfmt >> 24) & 0xFF, w, h, frate);
    return 0;
}

static int uvc_req_buffers(int fd, int num, struct v4l2_bufsinfo *bufs) {
    struct v4l2_requestbuffers req;
    req.count = num;
    req.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    req.memory = V4L2_MEMORY_MMAP;
    if (ioctl(fd, VIDIOC_REQBUFS, &req) == -1) {
        fprintf(stderr, "UVC: req buffer failed with %s\n", strerror(errno));
        return -1;
    }
    if (req.count < num) {
        fprintf(stderr, "UVC: Insufficient buffer memory.\n");
        return -1;
    }

    for (int i = 0; i < num; i++) {
        struct v4l2_buffer buf;
        memset(&buf, 0, sizeof(buf));
        buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
        buf.memory = V4L2_MEMORY_MMAP;
        buf.index = i;
        if (ioctl(fd, VIDIOC_QUERYBUF, &buf) == -1) {
            fprintf(stderr, "UVC: query buffer failed with %s\n", strerror(errno));
            return -1;
        }
        bufs[i].length = buf.length;
        bufs[i].start = mmap(NULL, bufs[i].length, PROT_READ | PROT_WRITE, MAP_SHARED, fd, buf.m.offset);
        if (bufs[i].start == MAP_FAILED) {
            fprintf(stderr, "UVC: mmap %d len %d failed with %s\n", i, bufs[i].length, strerror(errno));
            return -1;
        }
    }
    printf("UVC: Requested %d buffers successfully.\n", num);
    return 0;
}

static int uvc_stream_on(int fd, int num) {
    for (int i = 0; i < num; i++) {
        struct v4l2_buffer buf;
        memset(&buf, 0x0, sizeof(struct v4l2_buffer));
        buf.index = i;
        buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
        buf.memory = V4L2_MEMORY_MMAP;
        if (ioctl(fd, VIDIOC_QBUF, &buf) == -1) {
            fprintf(stderr, "UVC: VIDIOC_QBUF failed for index %d: %s\n", i, strerror(errno));
            return -1;
        }
    }
    enum v4l2_buf_type type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    if (ioctl(fd, VIDIOC_STREAMON, &type) == -1) {
        fprintf(stderr, "UVC: VIDIOC_STREAMON failed: %s\n", strerror(errno));
        return -1;
    }
    printf("UVC: Stream ON.\n");
    return 0;
}

static int uvc_get_video_frame(int fd, struct v4l2_buffer *buf) {
    memset(buf, 0x0, sizeof(struct v4l2_buffer));
    buf->type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    buf->memory = V4L2_MEMORY_MMAP;
    if (ioctl(fd, VIDIOC_DQBUF, buf) == -1) {
        if (errno != EAGAIN) {
             fprintf(stderr, "UVC: get video frame (DQBUF) failed with %s\n", strerror(errno));
        }
        return -1;
    }
    return buf->bytesused;
}

static int uvc_release_video_frame(int fd, int index) {
    struct v4l2_buffer buf;
    memset(&buf, 0x0, sizeof(struct v4l2_buffer));
    buf.index = index;
    buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    buf.memory = V4L2_MEMORY_MMAP;
    if (ioctl(fd, VIDIOC_QBUF, &buf) == -1) {
        fprintf(stderr, "UVC: release video frame (QBUF) failed with %s\n", strerror(errno));
        return -1;
    }
    return 0;
}

static int uvc_stream_off(int fd) {
    enum v4l2_buf_type type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
    if (ioctl(fd, VIDIOC_STREAMOFF, &type) == -1) {
        fprintf(stderr, "UVC: VIDIOC_STREAMOFF failed: %s\n", strerror(errno));
        return -1;
    }
    printf("UVC: Stream OFF.\n");
    return 0;
}

void *cvi_uvc_create(const char *dev, CVI_PIXFMT_E pfmt, unsigned int w, unsigned int h, int frate) {
    CVI_UVC_HOST_CTX *ctx = (CVI_UVC_HOST_CTX *)malloc(sizeof(CVI_UVC_HOST_CTX));
    if (ctx == NULL) {
        fprintf(stderr, "UVC: cvi_uvc_create out of mem with %s\n", strerror(errno));
        return NULL;
    }
    memset(ctx, 0x0, sizeof(CVI_UVC_HOST_CTX));
    ctx->fd = -1;

    ctx->fd = uvc_open(dev);
    if (ctx->fd < 0) {
        free(ctx);
        return NULL;
    }

    unsigned int u32Fmt = (pfmt == CVI_PIXFMT_MJPEG_E) ? V4L2_PIX_FMT_MJPEG : V4L2_PIX_FMT_YUYV;
    if (uvc_set_fmt(ctx->fd, u32Fmt, w, h, frate) != 0) {
        uvc_close(ctx->fd);
        free(ctx);
        return NULL;
    }

    if (uvc_req_buffers(ctx->fd, REQ_BUFS_CNT, ctx->bufs) != 0) {
        uvc_close(ctx->fd);
        free(ctx);
        return NULL;
    }
    
    ctx->fmt = pfmt;
    ctx->width = w;
    ctx->height = h;
    return ctx;
}

void cvi_uvc_destroy(void *ctx) {
    CVI_UVC_HOST_CTX *c = (CVI_UVC_HOST_CTX *)ctx;
    if (c) {
        for (int i = 0; i < REQ_BUFS_CNT; i++) {
            if (c->bufs[i].start) {
                munmap(c->bufs[i].start, c->bufs[i].length);
            }
        }
        if (c->fd > 0) uvc_close(c->fd);
        free(c);
        printf("UVC: Context destroyed.\n");
    }
}

// ====================================================================
// 2. YOLOv8 and VDEC Section
// ====================================================================

static volatile bool bExit = false;

// Struct to pass parameters to threads
typedef struct {
    cvitdl_handle_t tdl_handle;
    CVI_UVC_HOST_CTX *uvc_ctx;
    VDEC_CHN VdChn;
    const char *yolo_model_path;
    const char *uvc_dev_path;
    int width;
    int height;
    long long *frame_count; // Pointer to update shared frame count
} ThreadParams;


void handle_sigint(int sig) {
    if (sig == SIGINT || sig == SIGTERM) {
        printf("\nCaught signal, preparing to exit...\n");
        bExit = true;
    }
}

CVI_S32 init_yolo_param(const cvitdl_handle_t tdl_handle) {
    printf("Setting up YOLOv8 parameters...\n");

    YoloPreParam preprocess_cfg = CVI_TDL_Get_YOLO_Preparam(tdl_handle, CVI_TDL_SUPPORTED_MODEL_YOLOV8_DETECTION);
    for (int i = 0; i < 3; i++) {
        preprocess_cfg.factor[i] = 0.003922; // 1/255.0
        preprocess_cfg.mean[i] = 0.0;
    }
    preprocess_cfg.format = PIXEL_FORMAT_RGB_888_PLANAR;
    preprocess_cfg.rescale_type = RESCALE_CENTER;

    CVI_S32 ret = CVI_TDL_Set_YOLO_Preparam(tdl_handle, CVI_TDL_SUPPORTED_MODEL_YOLOV8_DETECTION, preprocess_cfg);
    if (ret != CVI_SUCCESS) {
        fprintf(stderr, "Failed to set YOLOv8 preprocess parameters, error: %#x\n", ret);
        return ret;
    }

    YoloAlgParam yolov8_param = CVI_TDL_Get_YOLO_Algparam(tdl_handle, CVI_TDL_SUPPORTED_MODEL_YOLOV8_DETECTION);
    yolov8_param.cls = 2; // Example: set expected class number

    ret = CVI_TDL_Set_YOLO_Algparam(tdl_handle, CVI_TDL_SUPPORTED_MODEL_YOLOV8_DETECTION, yolov8_param);
    if (ret != CVI_SUCCESS) {
        fprintf(stderr, "Failed to set YOLOv8 algorithm parameters, error: %#x\n", ret);
        return ret;
    }

    CVI_TDL_SetModelThreshold(tdl_handle, CVI_TDL_SUPPORTED_MODEL_YOLOV8_DETECTION, 0.5);
    CVI_TDL_SetModelNmsThreshold(tdl_handle, CVI_TDL_SUPPORTED_MODEL_YOLOV8_DETECTION, 0.5);

    printf("YOLOv8 parameters setup successfully.\n");
    return CVI_SUCCESS;
}

// Thread 1: UVC Capture -> VDEC SendStream
void *uvc_to_encoder_thread(void *arg) {
    ThreadParams *params = (ThreadParams *)arg;
    CVI_S32 s32Ret = CVI_SUCCESS;
    
    // --- 1. UVC Camera Initialization ---
    params->uvc_ctx = (CVI_UVC_HOST_CTX *)cvi_uvc_create(params->uvc_dev_path, CVI_PIXFMT_MJPEG_E, params->width, params->height, 30);
    if (!params->uvc_ctx) {
        fprintf(stderr, "[Capture Thread] Failed to initialize UVC camera.\n");
        bExit = true; // Signal other threads to exit
        return NULL;
    }

    // --- 2. Start UVC Stream ---
    if (uvc_stream_on(params->uvc_ctx->fd, REQ_BUFS_CNT) != 0) {
        fprintf(stderr, "[Capture Thread] Failed to start UVC stream.\n");
        cvi_uvc_destroy(params->uvc_ctx);
        params->uvc_ctx = NULL;
        bExit = true; // Signal other threads to exit
        return NULL;
    }

    struct timespec send_start_time, send_end_time;
    printf("[Capture Thread] Starting UVC capture loop...\n");
    long long uvc_frame_count = 0;
    while (!bExit) {
        struct v4l2_buffer uvc_buf;
        int len = uvc_get_video_frame(params->uvc_ctx->fd, &uvc_buf);
        
        if (len <= 0) {
            usleep(1000); // Wait for a new frame
            continue;
        }

        VDEC_CHN_STATUS_S stStatus;
        CVI_VDEC_QueryStatus(params->VdChn, &stStatus);

        if ((int)stStatus.u32LeftPics >= 3) {
            uvc_release_video_frame(params->uvc_ctx->fd, uvc_buf.index);
            continue;
        }

        VDEC_STREAM_S stStream = {0};
        stStream.pu8Addr = (CVI_U8*)(params->uvc_ctx->bufs[uvc_buf.index].start);
        stStream.u32Len = len;
        stStream.u64PTS = uvc_frame_count;
        stStream.bEndOfStream = CVI_FALSE;
        stStream.bEndOfFrame = CVI_TRUE;
    
        s32Ret = CVI_VDEC_SendStream(params->VdChn, &stStream, 10); // Use a timeout

        // Release UVC buffer immediately after sending for reuse
        uvc_release_video_frame(params->uvc_ctx->fd, uvc_buf.index);

        if (s32Ret != CVI_SUCCESS) {
            // If send stream times out or fails, just continue to next frame
            fprintf(stderr, "\n[Capture Thread] Warning: CVI_VDEC_SendStream failed with %#x", s32Ret);
        } else {
            uvc_frame_count++;
        }
    }

    // --- Cleanup for this thread ---
    uvc_stream_off(params->uvc_ctx->fd);
    printf("[Capture Thread] Exiting.\n");
    return NULL;
}


// Thread 2: VDEC GetFrame -> YOLO Inference
void *decoder_to_yolo_thread(void *arg) {
    ThreadParams *params = (ThreadParams *)arg;
    CVI_S32 s32Ret = CVI_SUCCESS;
    VIDEO_FRAME_INFO_S stFrameInfo;
    cvtdl_object_t obj_meta = {0};

    struct timespec inference_start, inference_end;
    double inference_ms = 0;

    // --- 1. TDL Initialization ---
    s32Ret = CVI_TDL_CreateHandle(&params->tdl_handle);
    if (s32Ret != CVI_SUCCESS) {
        fprintf(stderr, "[Inference Thread] CVI_TDL_CreateHandle failed with %#x!\n", s32Ret);
        bExit = true;
        return NULL;
    }

    s32Ret = init_yolo_param(params->tdl_handle);
    if (s32Ret != CVI_SUCCESS) {
        fprintf(stderr, "[Inference Thread] init_yolo_param failed!\n");
        CVI_TDL_DestroyHandle(params->tdl_handle);
        params->tdl_handle = NULL;
        bExit = true;
        return NULL;
    }

    s32Ret = CVI_TDL_OpenModel(params->tdl_handle, CVI_TDL_SUPPORTED_MODEL_YOLOV8_DETECTION, params->yolo_model_path);
    if (s32Ret != CVI_SUCCESS) {
        fprintf(stderr, "[Inference Thread] CVI_TDL_OpenModel failed for %s with %#x\n", params->yolo_model_path, s32Ret);
        CVI_TDL_DestroyHandle(params->tdl_handle);
        params->tdl_handle = NULL;
        bExit = true;
        return NULL;
    }
    printf("[Inference Thread] YOLOv8 model opened successfully.\n");
    printf("[Inference Thread] Starting YOLO inference loop...\n");

    while (!bExit) {
        // Get a decoded YUV frame from VDEC (non-blocking: 0 means return immediately if no frame is ready)
        s32Ret = CVI_VDEC_GetFrame(params->VdChn, &stFrameInfo, 0);
        
        if (s32Ret != CVI_SUCCESS) {
            usleep(1000);
            continue; 
        }

        clock_gettime(CLOCK_MONOTONIC, &inference_start);
        // Run YOLOv8 inference on the decoded frame
        CVI_TDL_YOLOV8_Detection(params->tdl_handle, &stFrameInfo, &obj_meta);
        clock_gettime(CLOCK_MONOTONIC, &inference_end);
        inference_ms = (inference_end.tv_sec - inference_start.tv_sec) * 1000.0 + (inference_end.tv_nsec - inference_start.tv_nsec) / 1000000.0;

        // Print results
        printf("\rFrame %lld: Detected %u objects. | Inference: %.2fms",
              *(params->frame_count), obj_meta.size, inference_ms);
        fflush(stdout);

        // Release the YUV frame back to the VB pool
        CVI_VDEC_ReleaseFrame(params->VdChn, &stFrameInfo);
        // Free memory allocated for object metadata
        CVI_TDL_Free(&obj_meta);
        (*(params->frame_count))++;
    }

    printf("\n[Inference Thread] Exiting.\n");
    return NULL;
}

int main(int argc, char **argv) {
    if (argc != 5) {
        fprintf(stderr, "Usage: %s <yolo_model_path> <uvc_device_path> <width> <height>\n", argv[0]);
        fprintf(stderr, "Example: %s yolov8.cvimodel /dev/video0 1920 1080\n", argv[0]);
        return CVI_FAILURE;
    }
    const char *yolo_model_path = argv[1];
    const char *uvc_dev_path = argv[2];
    int width = atoi(argv[3]);
    int height = atoi(argv[4]);

    // --- Variable declarations ---
    // NOTE: All variables are declared at the top of main to ensure
    // the goto statements do not jump over any initializations.
    CVI_S32 s32Ret = CVI_SUCCESS;
    VB_CONFIG_S stVbConf;
    VDEC_CHN VdChn = 0;
    VDEC_CHN_ATTR_S stVdecChnAttr;
    PAYLOAD_TYPE_E enType = PT_MJPEG;
    VB_POOL VbPool0 = VB_INVALID_POOLID;
    
    pthread_t capture_tid, inference_tid;
    ThreadParams params = {0}; // Initialize all members to NULL or 0
    
    long long frame_count = 0;
    struct timespec start_time, end_time;
    double elapsed_seconds, fps;

    signal(SIGINT, handle_sigint);
    signal(SIGTERM, handle_sigint);

    // --- 1. System and VB Initialization (Done in Main Thread) ---
    memset(&stVbConf, 0, sizeof(VB_CONFIG_S));
    stVbConf.u32MaxPoolCnt = 1;
    stVbConf.astCommPool[0].u32BlkSize = 1024 * 1024 * 2;
    stVbConf.astCommPool[0].u32BlkCnt = 10;
    stVbConf.astCommPool[0].enRemapMode = VB_REMAP_MODE_NONE;
    
    s32Ret = CVI_VB_SetConfig(&stVbConf);
    if (s32Ret != CVI_SUCCESS) {
        fprintf(stderr, "CVI_VB_SetConfig failed with %#x!\n", s32Ret);
        return s32Ret;
    }
    s32Ret = CVI_VB_Init();
    if (s32Ret != CVI_SUCCESS) {
        fprintf(stderr, "CVI_VB_Init failed with %#x!\n", s32Ret);
        return s32Ret;
    }
    s32Ret = CVI_SYS_Init();
    if (s32Ret != CVI_SUCCESS) {
        fprintf(stderr, "CVI_SYS_Init failed with %#x!\n", s32Ret);
        goto cleanup_vb;
    }

    VB_POOL_CONFIG_S stVbPoolConf;
    memset(&stVbPoolConf, 0, sizeof(VB_POOL_CONFIG_S));
    printf("Picture width: %d, height: %d\n", width, height);
    stVbPoolConf.u32BlkSize = 1280 * 720 * 10;
    stVbPoolConf.u32BlkCnt = 10;
    stVbPoolConf.enRemapMode = VB_REMAP_MODE_NONE;
    
    VbPool0 = CVI_VB_CreatePool(&stVbPoolConf);
    if (VbPool0 == VB_INVALID_POOLID)
    {
        fprintf(stderr, "CVI_VB_CreatePool VbPool0 failed %d\n", VbPool0);
        goto cleanup_vb;
    }

    // --- 2. VDEC Initialization (Done in Main Thread) ---
    memset(&stVdecChnAttr, 0, sizeof(VDEC_CHN_ATTR_S));
    stVdecChnAttr.enType = enType;
    // NOTE: this is the compressed-stream input buffer (~0.94 MB here). A single
    // 1920x1080 MJPEG frame can exceed that, which may cause SendStream to block.
    stVdecChnAttr.u32StreamBufSize = ALIGN(1280, 128) * ALIGN(720, 64);
    stVdecChnAttr.u32FrameBufCnt = 5;
    stVdecChnAttr.enMode = VIDEO_MODE_FRAME;
    stVdecChnAttr.u32PicWidth = width;   // taken from the command line
    stVdecChnAttr.u32PicHeight = height;
    stVdecChnAttr.u32FrameBufSize = VDEC_GetPicBufferSize(
        stVdecChnAttr.enType,
        stVdecChnAttr.u32PicWidth,
        stVdecChnAttr.u32PicHeight,
        /* pixel format - keep this consistent with the channel parameter set
         * below, e.g. PIXEL_FORMAT_YUV_PLANAR_444 or 420 */
        PIXEL_FORMAT_YUV_PLANAR_422,
        DATA_BITWIDTH_8,
        COMPRESS_MODE_NONE);
    s32Ret = CVI_VDEC_CreateChn(VdChn, &stVdecChnAttr);
    if (s32Ret != CVI_SUCCESS) {
        fprintf(stderr, "CVI_VDEC_CreateChn failed with %#x\n", s32Ret);
        goto cleanup_sys;
    }

    // ModParam: switch VDEC to the user VB pool attached below.
    // NOTE: vendor samples typically set enVdecVBSource *before*
    // CVI_VDEC_CreateChn; setting it after channel creation may not take effect.
    VDEC_MOD_PARAM_S stModParam;
    s32Ret = CVI_VDEC_GetModParam(&stModParam);
    if (s32Ret != CVI_SUCCESS)
    {
        fprintf(stderr, "CVI_VDEC_GetModParam failed %x\n", s32Ret);
        goto cleanup_vdec_chn;
    }

    stModParam.enVdecVBSource = VB_SOURCE_USER;

    s32Ret = CVI_VDEC_SetModParam(&stModParam);
    if (s32Ret != CVI_SUCCESS)
    {
        fprintf(stderr, "CVI_VDEC_SetModParam failed %x\n", s32Ret);
        goto cleanup_vdec_chn;
    }

    VDEC_CHN_PARAM_S stChnParam;
    s32Ret = CVI_VDEC_GetChnParam(VdChn, &stChnParam);
    if (s32Ret != CVI_SUCCESS) {
        fprintf(stderr, "CVI_VDEC_GetChnParam failed with %#x!\n", s32Ret);
        goto cleanup_vdec_chn;
    }
    
    stChnParam.enPixelFormat = PIXEL_FORMAT_YUV_PLANAR_422;
    stChnParam.u32DisplayFrameNum = 2;
    stChnParam.stVdecPictureParam.u32Alpha = 0;

    s32Ret = CVI_VDEC_SetChnParam(VdChn, &stChnParam);
    if (s32Ret != CVI_SUCCESS) {
        fprintf(stderr, "CVI_VDEC_SetChnParam failed with %#x!\n", s32Ret);
        goto cleanup_vdec_chn;
    }

    VDEC_CHN_POOL_S stPool;
    stPool.hPicVbPool = VbPool0;
    stPool.hTmvVbPool = VB_INVALID_POOLID;

    s32Ret = CVI_VDEC_AttachVbPool(VdChn, &stPool);
    if (s32Ret != CVI_SUCCESS)
    {
        fprintf(stderr, "CVI_VDEC_AttachVbPool failed %x\n", s32Ret);
        goto cleanup_vdec_chn;
    } else {
        printf("VbPool attach ok\n");
    }

    s32Ret = CVI_VDEC_StartRecvStream(VdChn);
    if (s32Ret != CVI_SUCCESS) {
        fprintf(stderr, "CVI_VDEC_StartRecvStream failed with %#x\n", s32Ret);
        goto cleanup_vdec_chn;
    }

    // --- 3. Create and Start Threads ---
    params.VdChn = VdChn;
    params.yolo_model_path = yolo_model_path;
    params.uvc_dev_path = uvc_dev_path;
    params.width = width;
    params.height = height;
    params.frame_count = &frame_count;

    printf("Starting processing threads...\n");
    clock_gettime(CLOCK_MONOTONIC, &start_time);

    pthread_create(&capture_tid, NULL, uvc_to_encoder_thread, &params);
    pthread_create(&inference_tid, NULL, decoder_to_yolo_thread, &params);

    // --- 4. Wait for Threads to Finish ---
    pthread_join(capture_tid, NULL);
    pthread_join(inference_tid, NULL);

    // --- 5. Performance Summary and Resource Cleanup ---
    clock_gettime(CLOCK_MONOTONIC, &end_time);
    elapsed_seconds = (end_time.tv_sec - start_time.tv_sec) +
                      (end_time.tv_nsec - start_time.tv_nsec) / 1000000000.0;
    if (elapsed_seconds > 0) {
        fps = frame_count / elapsed_seconds;
        printf("\n----------------------------------------\n");
        printf("Processing finished.\n");
        printf("Total frames processed: %lld\n", frame_count);
        printf("Total time: %.2f seconds\n", elapsed_seconds);
        printf("Average FPS (Inference Thread): %.2f\n", fps);
        printf("----------------------------------------\n");
    }

    // --- Cleanup is done in reverse order of initialization ---
    CVI_VDEC_StopRecvStream(VdChn);
cleanup_vdec_chn:
    CVI_VDEC_DestroyChn(VdChn);
    // UVC and TDL resources are cleaned up after threads are joined
    if (params.uvc_ctx) cvi_uvc_destroy(params.uvc_ctx);
    if (params.tdl_handle) CVI_TDL_DestroyHandle(params.tdl_handle);
cleanup_sys:
    CVI_SYS_Exit();
cleanup_vb:
    CVI_VB_Exit();
    
    printf("Cleanup complete. Exiting.\n");
    return s32Ret;
}


Memory management is hard, and that includes kernel memory management.

I assume you are using the “sample” code, which is not optimized for zero copy.

Also, is your camera attached via MIPI CSI?

Try using mmap() if possible — maybe the userspace parts support it. read()/write() sometimes do an extra memcpy().

You can use top to check CPU usage.

Can you do me a favor? For my personal enjoyment, and to understand the different IP blocks (i.e. the hardware blocks for H.264 and MJPEG), I’d like to see some output:

[root@milkv-duo]~# cat /proc/interrupts

Thanks
