MulticoreWare

Case Studies

Optimizing & Enhancing the Performance of an Image Processing Algorithm

November 30, 2022

This case study describes our role in building an optimized pipeline for a Chroma Correction algorithm, and in planning future enhancements, for one of our clients.

The Client

The client is a leading global developer of semiconductor solutions. It was building the world’s smallest image sensor for smartphone cameras and ISPs, along with the corresponding software pipeline.

The Project

The client had a complex image processing pipeline as part of their RGB sensor and camera ISP module. The goal of the project was to speed up the Chroma Correction module of this pipeline by a factor of ~10x.

Challenges

  • A very naïve version of the algorithm as the starting baseline
  • Substantial dependency on third-party libraries such as OpenCV
  • Data bandwidth issues that had to be managed carefully across modules

Typical Software Optimization Workflow

A typical Software Optimization workflow can be split into the following phases:

Phase 1: This phase involves modifying, compiling, and building the application on the target platform, ideally with all compiler optimizations disabled. The goal is to establish the correctness of the software.

Phase 2: This phase is profiling: finding the areas of code where the application spends most of its run time.
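As a rough illustration of Phase 2, the sketch below times one pipeline stage with the standard C `clock()` function. The `stage_fn` signature, `time_stage`, and `invert_stage` are hypothetical names introduced only for this example; the case study does not say which profiling tools were used, and a real project would more likely rely on a dedicated profiler.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stage signature: processes `n` bytes in place. */
typedef void (*stage_fn)(unsigned char *data, size_t n);

/* Returns the average CPU time, in seconds, of one call to `stage`,
   measured over `repeats` back-to-back calls. */
double time_stage(stage_fn stage, unsigned char *data, size_t n, int repeats)
{
    assert(repeats > 0);
    clock_t start = clock();
    for (int i = 0; i < repeats; ++i) {
        stage(data, n);
    }
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC / repeats;
}

/* Example stage used only for illustration: inverts every byte. */
void invert_stage(unsigned char *data, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        data[i] = (unsigned char)(255 - data[i]);
    }
}
```

Timing each stage this way and sorting the results gives a first ranking of hotspots, which is the input Phase 3 needs.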

Phase 3: This phase is where the actual optimization happens:

  • Enabling relevant compiler optimizations
  • Cache Friendly Algorithms
  • Optimal usage of available registers & memory transfers
  • Hardware specific optimizations
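To make "cache-friendly algorithms" concrete, here is a minimal sketch of a tiled (blocked) transpose of an 8-bit image plane. It is not taken from the client's pipeline; the function name `transpose_tiled` and the tile size `TILE` are assumptions chosen for illustration, and the best tile size depends on the target's cache.

```c
#include <assert.h>
#include <stddef.h>

#define TILE 32 /* Assumed tile edge; tune for the target cache. */

/* Transposes a rows x cols row-major plane `src` into `dst` (cols x rows).
   The plane is visited tile by tile so that the reads and writes for one
   tile touch only a small working set at a time. */
void transpose_tiled(const unsigned char *src, unsigned char *dst,
                     size_t rows, size_t cols)
{
    assert(src != dst); /* in-place transpose is not supported */
    for (size_t r0 = 0; r0 < rows; r0 += TILE) {
        for (size_t c0 = 0; c0 < cols; c0 += TILE) {
            size_t r_end = (r0 + TILE < rows) ? r0 + TILE : rows;
            size_t c_end = (c0 + TILE < cols) ? c0 + TILE : cols;
            for (size_t r = r0; r < r_end; ++r) {
                for (size_t c = c0; c < c_end; ++c) {
                    dst[c * rows + r] = src[r * cols + c];
                }
            }
        }
    }
}
```

A plain two-loop transpose writes `dst` with a stride of `rows` bytes on every iteration; tiling keeps those strided writes inside a block small enough to stay cached.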

The phases and their interdependencies are shown in the figure below.

[Figure: Phases of a typical software optimization workflow]

Solutions Proposed

  • Create control flow graph
  • Hand-optimize modules to replace API calls from OpenCV
  • Design a cache-aware algorithm to reduce cache thrashing
  • Loop Optimizations
    • Loop-Invariant Code Motion
    • Iteration Reordering
    • Loop Unrolling
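The three loop optimizations listed above can be sketched on a simple per-pixel gain kernel. This is an illustrative example under stated assumptions, not the client's Chroma Correction code: `scale_naive`, `scale_optimized`, `scale_px`, and the Q8 fixed-point gain are all hypothetical. Many compilers apply hoisting and unrolling themselves at higher optimization levels, so the hand-written form mainly matters when they do not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Multiplies a pixel by a Q8 fixed-point gain (256 == 1.0) and clamps. */
static uint8_t scale_px(uint8_t p, int gain_q8)
{
    int v = (p * gain_q8) >> 8;
    return (uint8_t)(v > 255 ? 255 : v);
}

/* Baseline: walks columns first, so consecutive accesses are `width`
   bytes apart, and recomputes the row offset for every pixel. */
void scale_naive(const uint8_t *src, uint8_t *dst,
                 size_t width, size_t height, int gain_q8)
{
    assert(gain_q8 >= 0);
    for (size_t x = 0; x < width; ++x) {
        for (size_t y = 0; y < height; ++y) {
            dst[y * width + x] = scale_px(src[y * width + x], gain_q8);
        }
    }
}

void scale_optimized(const uint8_t *src, uint8_t *dst,
                     size_t width, size_t height, int gain_q8)
{
    assert(gain_q8 >= 0);
    /* Iteration reordering: rows outermost, so memory is read linearly. */
    for (size_t y = 0; y < height; ++y) {
        /* Loop-invariant code motion: row pointers computed once per row. */
        const uint8_t *s = src + y * width;
        uint8_t *d = dst + y * width;
        size_t x = 0;
        /* Loop unrolling: four pixels per iteration. */
        for (; x + 4 <= width; x += 4) {
            d[x]     = scale_px(s[x],     gain_q8);
            d[x + 1] = scale_px(s[x + 1], gain_q8);
            d[x + 2] = scale_px(s[x + 2], gain_q8);
            d[x + 3] = scale_px(s[x + 3], gain_q8);
        }
        /* Remainder loop for widths that are not a multiple of four. */
        for (; x < width; ++x) {
            d[x] = scale_px(s[x], gain_q8);
        }
    }
}
```

Both functions produce identical output; keeping the naive version around as a reference makes it easy to check each optimization step for correctness.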

The MulticoreWare Advantage

MulticoreWare has deep-rooted expertise in performance optimization, especially for image and video processing pipelines. We possess in-depth experience in creating software solutions and developing tools for multi-core and heterogeneous computing environments. This project combined optimization with video/image processing, another area where MulticoreWare is considered a market leader.

Redefining the Technical Architecture: With our experience in developing bare-metal image/video APIs that are available as open-source SDKs (x265/rpp/rocAL), it was straightforward for the MulticoreWare team to remove the dependency on third-party libraries such as OpenCV. Once the external dependency was removed, designing the new control flow was the next step.
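The case study does not say which OpenCV calls were replaced. As a hypothetical example of what a hand-written replacement for a library color conversion might look like, the sketch below converts interleaved RGB to planar Y/Cb/Cr using full-range BT.601 coefficients rounded to 8-bit fixed point. The function name `rgb_to_ycbcr` and the buffer layout are assumptions, and the output is not guaranteed to match OpenCV bit for bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamps a non-negative intermediate value to the 8-bit range. */
static uint8_t clamp_u8(int v)
{
    return (uint8_t)(v > 255 ? 255 : v);
}

/* Converts `pixels` interleaved RGB triplets to planar Y, Cb, Cr.
   Coefficients are BT.601 full-range weights scaled by 256; the chroma
   offset 128 is folded in before the shift so every intermediate value
   stays non-negative. */
void rgb_to_ycbcr(const uint8_t *rgb, uint8_t *y, uint8_t *cb, uint8_t *cr,
                  size_t pixels)
{
    assert(rgb != NULL && y != NULL && cb != NULL && cr != NULL);
    for (size_t i = 0; i < pixels; ++i) {
        int r = rgb[3 * i];
        int g = rgb[3 * i + 1];
        int b = rgb[3 * i + 2];
        y[i]  = clamp_u8((77 * r + 150 * g + 29 * b + 128) >> 8);
        cb[i] = clamp_u8((-43 * r - 85 * g + 128 * b + 128 * 256 + 128) >> 8);
        cr[i] = clamp_u8((128 * r - 107 * g - 21 * b + 128 * 256 + 128) >> 8);
    }
}
```

A routine like this has no external dependency and processes the buffer in a single linear pass, so it can be specialized and fused with neighbouring stages in ways a general-purpose library call cannot.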

Outcome

Within the estimated project timeline, the MulticoreWare team achieved a ~8x speedup for the algorithm.
