Jul 07, 2025

FPGA acceleration: how MEP100 boosts IP video editing

Broadcast & ProAV: Accelerate your ST 2110 workflows with flexible, high-performance architecture

By Andrew Starks, Director of Product Management at Macnica Americas, Inc.

 

As more live production workflows move to IP and adopt standards like ST 2110, system architects face a critical tradeoff: stick with software running on general-purpose CPUs, or add hardware acceleration to handle the growing demands of uncompressed video transport and synchronization. This article explores when pure software platforms start to struggle and how FPGA-accelerated SmartNICs, like Macnica’s MEP100, can offload key bottlenecks without abandoning the flexibility of COTS infrastructure. We’ll look at the challenges of real-time IP media transport, where software-only systems hit their limits, and how the MEP100 bridges that gap by managing ST 2110 flows in hardware while keeping the application layer agile and scalable.

 

Balancing COTS Flexibility with FPGA Power in Live IP Workflows

In 2011, Marc Andreessen famously wrote that “software is eating the world,” and today that’s undeniably true. From fintech to surgical robotics to telecom infrastructure, high-performance applications that once relied on custom hardware are now written in software and deployed on commercial off-the-shelf (COTS) platforms. But not every task plays well with software running on general-purpose CPUs or even GPUs - live broadcast production pushes systems to their limits. A single uncompressed 4K video stream requires nearly 12 gigabits per second of bandwidth. These streams must be synchronized to within 200 nanoseconds - roughly 500,000 times shorter than the blink of an eye - and that’s just the timing requirement. The data must also be processed instantly, with ultra-low latency, because operators interact with the content in real time. In these environments, too much latency in a device can be a deal breaker. As production systems increasingly move to ST 2110 and other IP-native workflows, the burden of just moving and managing media in real time can dominate system resources, leaving little room for what actually matters: content processing. That’s why it’s worth stepping back to ask: when does software alone make sense, and where does it start to fall short? By understanding those boundaries, we can see where a well-designed, FPGA-accelerated SmartNIC like the MEP100 adds real value.
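
As a quick sanity check on that bandwidth figure, here is a rough calculation. It assumes a 4K60 4:2:2 10-bit signal; the “nearly 12 gigabits per second” corresponds to the full SDI raster including blanking, while the active pixels alone come to roughly 10 Gb/s.

```python
# Quick check of the "nearly 12 Gb/s" figure for uncompressed 4K60 video.
# Assumes 4:2:2 10-bit sampling (20 bits per pixel). The ~11.88 Gb/s
# number corresponds to the full SDI raster including blanking (12G-SDI);
# an ST 2110-20 payload carries only the active pixels, closer to 10 Gb/s.

FPS = 60
BITS_PER_PIXEL = 20                    # 4:2:2 10-bit

def gbps(width, height):
    return width * height * FPS * BITS_PER_PIXEL / 1e9

print(f"active 3840x2160 only: {gbps(3840, 2160):.2f} Gb/s")   # ~9.95 Gb/s
print(f"full 4400x2250 raster: {gbps(4400, 2250):.2f} Gb/s")   # ~11.88 Gb/s
```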

 

The Choice: Pure Software or Hardware Acceleration?

For many small live productions - schools, houses of worship, independent creators, and local event broadcasters - software-only systems can work well. These setups often rely on best-effort protocols like NDI, paired with low-bitrate codecs such as H.265 or SpeedHQ, to manage bandwidth constraints. These technologies are approachable, flexible, and don’t require precise time synchronization. Each source is simply re-timed on arrival, masking the complexity of transport from the user. This can work great. Until it doesn’t. Compressed video streams reduce bandwidth, but they also introduce latency. The receiver can’t decompress the image until it has complete data - sometimes half a frame, sometimes several frames - introducing delays that can stretch into hundreds of milliseconds. Combine that with asynchronous sources, and you get audio and video that are never quite in sync - and never out of sync in the same way. For an engineer, it’s a nightmare. For a production operator trying to follow a fast-moving subject with a PTZ camera, the added latency makes it nearly impossible to “follow the ball.”
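
To make that latency problem concrete, here is a small illustrative budget. Every figure is an assumption chosen for the example - not a measurement of NDI, H.265, SpeedHQ, or any specific product - but it shows how a few frames of buffering plus encode, decode, and network time add up to the kind of delay described above.

```python
# Illustrative latency budget for a compressed, best-effort workflow.
# Every figure here is an assumption chosen for the example, not a
# measurement of NDI, H.265, SpeedHQ, or any specific product.

FPS = 30
frame_ms = 1000 / FPS              # ~33 ms per frame

buffered_frames = 4                # receiver waits on a few frames of data
encode_ms = 15                     # encoder pipeline
decode_ms = 15                     # decoder pipeline
network_ms = 20                    # best-effort transport plus re-timing

total_ms = buffered_frames * frame_ms + encode_ms + decode_ms + network_ms
print(f"end-to-end latency: ~{total_ms:.0f} ms")   # ~183 ms
```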

 

As these systems mature, their limitations become hard to ignore. Whether it’s inconsistent latency between sources or unreliable real-time control, many productions eventually hit a ceiling. When they do, they start looking at open standards like ST 2110, which offer pixel-accurate synchronization, uncompressed video, and interoperable system design. But ST 2110 brings its own set of challenges. Handling uncompressed video at 3, 6, or even 12 gigabits per second per stream demands significant processing - just to receive, time-align, and forward the media. Packet overhead grows fast: a single frame of 4K video can require more than 14,000 packets. CPU load spikes. Network jitter becomes a real problem. A workflow that was once “software-simple” turns into a real-time systems engineering exercise. That’s where hardware acceleration - and specifically the MEP100 - comes in. By offloading time-sensitive ST 2110 tasks like packet processing, timestamp alignment, and flow management to a dedicated FPGA, the MEP100 lightens the CPU load, reduces system jitter, and ensures reliable, low-latency media transport - all while keeping the flexibility of a COTS platform.

 

High-Performance Architecture Powered by FPGA Acceleration

Let’s take a closer look at just a few of the performance bottlenecks that the MEP100’s FPGA-based architecture is purpose-built to solve: packet processing, traffic shaping, and time synchronization.

 

Packet Processing at Scale

Start with the packet load. A single uncompressed 4K60 video stream over ST 2110 can generate over 14,000 packets per frame. At 60 frames per second, that’s more than 800,000 packets per second, per stream. Multiply that by 8 incoming and 8 outgoing video flows, and the system is now moving and processing more than 13 million packets per second, not even counting audio and ancillary data. Now add ST 2022-7 redundancy, which is required in most professional broadcast systems. It duplicates the packet stream across two network paths. The NIC must now receive and evaluate two identical streams, sorting through the duplicates in real time and selecting the best packets to forward. This alone doubles the input traffic - and demands precise, packet-level filtering to avoid unnecessary processing. At this scale, transport overhead often exceeds the workload of the application itself. The MEP100 handles all of this in hardware. Packet parsing, sequencing, redundancy filtering - it’s all done inside the FPGA, freeing the CPU to focus on the actual media workflows that the user cares about.
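
The arithmetic is easy to verify. The sketch below assumes 4:2:2 10-bit sampling and roughly 1,400 bytes of video payload per RTP packet on a standard 1,500-byte MTU; both are common choices for ST 2110-20, though actual payload sizes vary by implementation.

```python
# Back-of-the-envelope packet math for one ST 2110-20 4K60 stream.
# Assumptions: 4:2:2 10-bit sampling (2.5 bytes per pixel) and roughly
# 1,400 bytes of video payload per RTP packet on a 1,500-byte MTU.

WIDTH, HEIGHT, FPS = 3840, 2160, 60
BYTES_PER_PIXEL = 2.5
PAYLOAD_BYTES = 1400

frame_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL            # ~20.7 MB per frame
packets_per_frame = frame_bytes / PAYLOAD_BYTES           # ~14,800
packets_per_second = packets_per_frame * FPS              # ~889,000
total_pps = packets_per_second * (8 + 8)                  # 16 video flows

print(f"packets per frame:            {packets_per_frame:,.0f}")
print(f"packets per second, 1 stream: {packets_per_second:,.0f}")
print(f"packets per second, 16 flows: {total_pps:,.0f}")  # ~14 million

# ST 2022-7 duplicates every incoming packet on a second network path,
# roughly doubling what the NIC has to receive and filter in real time.
```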

 

Traffic Shaping and Real-Time Precision

Modern CPUs are fast, but speed alone isn’t enough. Real-time media systems require predictability, not just raw throughput. Nowhere is this more evident than in packet pacing. In an ST 2110 environment, packets must be emitted at precisely controlled intervals to avoid network jitter and switch buffer overflows. This is especially critical when multiple high-bandwidth streams share a link. Without tight pacing, a system can fall out of spec, triggering packet drops and playback glitches downstream. Operating systems aren’t built for this kind of precision timing: threads can be interrupted, and scheduling is non-deterministic. Even with real-time OS tuning, maintaining fine-grained spacing between media packets is a fragile process. The MEP100 doesn’t guess. It uses hardware-based pacing to emit each packet at exactly the right time. Regardless of host load or network traffic, stream timing remains consistent and compliant with ST 2110 requirements - allowing system designers to operate near line rate, maintain low latency, and achieve performance on par with traditional baseband systems.
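
To put numbers on “fine-grained,” the sketch below estimates the inter-packet spacing a 4K60 sender has to hold, assuming the ~14,800 packets per frame from the earlier calculation are spread evenly across the frame period. Real ST 2110-21 senders follow the standard’s gapped and linear pacing models rather than a simple even spread, but the order of magnitude is the same.

```python
# Rough inter-packet spacing a 4K60 ST 2110 sender has to hold.
# Assumes ~14,800 packets per frame (see the packet math above) spread
# evenly across the frame period; real ST 2110-21 senders follow the
# standard's gapped/linear models, but the order of magnitude holds.

FPS = 60
PACKETS_PER_FRAME = 14_800

frame_period_us = 1e6 / FPS                       # ~16,667 µs
spacing_us = frame_period_us / PACKETS_PER_FRAME  # ~1.1 µs between packets

print(f"frame period:   {frame_period_us:,.0f} µs")
print(f"packet spacing: {spacing_us:.2f} µs")

# Typical OS scheduling jitter is tens of microseconds - an order of
# magnitude larger than the spacing itself - which is why software-only
# pacing is fragile and hardware pacing helps.
```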

 

Time Accuracy with Hardware-Accelerated PTP

ST 2110 systems depend on PTP (Precision Time Protocol) to keep all devices synchronized. In software-only systems, the OS typically handles PTP timestamps, which introduces variability. Thread latency, interrupt overhead, and queue depth can all distort time accuracy - especially when dealing with millions of packets per second across multiple streams. By accelerating timestamping in hardware, the MEP100 applies accurate PTP timestamps at the point of ingress and egress, independent of OS scheduling. This results in tighter synchronization, better lip-sync, and lower jitter across audio and video flows.
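
For context, the arithmetic PTP performs is simple - the hard part is the quality of the timestamps that feed it. The sketch below shows the standard IEEE 1588 offset and delay calculation with illustrative values, and how a single software timestamp delayed by scheduling latency skews the computed offset.

```python
# Minimal sketch of the IEEE 1588 (PTP) offset/delay calculation, to show
# why timestamp quality matters: any error in the local timestamps lands
# directly in the computed offset. Values are illustrative, in nanoseconds.

def ptp_offset_and_delay(t1, t2, t3, t4):
    """t1: Sync sent (master), t2: Sync received (local),
    t3: Delay_Req sent (local), t4: Delay_Req received (master)."""
    offset = ((t2 - t1) - (t4 - t3)) / 2   # local clock offset from master
    delay = ((t2 - t1) + (t4 - t3)) / 2    # mean path delay
    return offset, delay

# Ideal hardware timestamps: true offset of +500 ns, 2,000 ns path delay.
print(ptp_offset_and_delay(t1=0, t2=2_500, t3=10_000, t4=11_500))

# Same exchange, but t2 was stamped in software 30 µs late (interrupt and
# scheduling latency): the computed offset is off by half of that error.
print(ptp_offset_and_delay(t1=0, t2=2_500 + 30_000, t3=10_000, t4=11_500))
```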

 

DMA and Kernel Bypass: Designed for Throughput

Of course, offloading packet processing only solves half the problem. That media still needs to get to the application. The MEP100 uses a high-performance DMA engine to move data directly between the SmartNIC and user-space application - completely bypassing the host kernel. Why does this matter? Kernel bypass avoids the overhead of system calls, buffer copies, and interrupt context switches. It reduces latency, increases throughput, and improves determinism - three things that traditional NICs struggle to guarantee under real-time conditions. The result is a level of performance and reliability normally reserved for dedicated hardware appliances but delivered within the flexibility and scalability of a COTS platform.
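
One way to see why this matters is to divide the available time by the packet rate. The syscall and copy costs below are rough, generic assumptions - not measurements of any particular kernel or NIC - but even as ballpark figures they show that a per-packet trip through the kernel doesn’t fit the budget.

```python
# Per-packet time budget at the packet rates discussed earlier, to show
# why a per-packet trip through the kernel doesn't fit. The syscall and
# copy costs are rough, generic assumptions, not measurements.

TOTAL_PPS = 14_000_000             # ~16 uncompressed 4K60 flows (see above)
budget_ns = 1e9 / TOTAL_PPS        # ~71 ns available per packet

SYSCALL_NS = 500                   # ballpark cost of one recv/send syscall
COPY_NS = 140                      # ballpark cost of copying ~1.4 KB to user space

naive_cost_ns = SYSCALL_NS + COPY_NS
print(f"budget per packet: {budget_ns:.0f} ns")
print(f"naive kernel path: {naive_cost_ns} ns (~{naive_cost_ns / budget_ns:.0f}x over budget)")

# Hence DMA straight into user-space buffers and kernel bypass: the cost
# per packet has to come down to tens of nanoseconds, not hundreds.
```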

 

Scaling for What’s Next

Software may be eating the world, but it’s also being asked to do more than ever before. In live production, 4K is quickly becoming the new baseline, pushing systems that once handled 1080p with ease into territory where bandwidth, timing, and CPU load become even more critical. This is exactly where the MEP100 shines. With dual 100GbE interfaces and built-in support for redundant ST 2110 flows, it delivers the throughput and precision needed for high-performance applications like multiviewers and live production switchers - especially those handling multiple uncompressed 4K streams in real time. As expectations grow, systems need to scale without compromising on latency, reliability, or flexibility. The MEP100 enables that evolution by offloading the bottlenecks and giving developers the freedom to focus on what really matters: delivering powerful, responsive tools for live content creation. If you’re building the future of live video, the MEP100 is ready to help you get there.

 

Contact Macnica today.

 

 

 
