CAVU Aerospace UK
OBR Architecture Configurator v2.1
Mission Requirements
  • Payload raw data rate: 23.1 Gbps — total sensor output before compression; sets the DDR4 ingest bandwidth requirement
  • Design point: 25 Gbps
  • Burst duration: 5 s (100 ms variant)
  • Inter-burst gap: ≥ 60 s
  • Downstream: LVDS digital interface to IDT transmitter — estimated 8 Gbps via 4–8 LVDS pairs
  • Payload OBC duties: control/monitor payload components, limited online image processing, communicate with spacecraft OBC
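A quick sanity check of what these requirements imply for ingest bandwidth and per-burst volume (a minimal Python sketch; figures are taken from the list above, and the decimal-GB rounding matches the page's own):

```python
# Ingest bandwidth and per-burst data volume implied by the requirements above.
raw_gbps, design_gbps = 23.1, 25.0     # sensor output vs. design point
burst_s = 5.0                          # burst duration (5 s variant)

ingest_gb_s = design_gbps / 8          # DDR4 ingest requirement in GB/s
margin = 100 * (design_gbps / raw_gbps - 1)
print(f"DDR4 ingest: {ingest_gb_s:.3f} GB/s (raw {raw_gbps / 8:.2f} GB/s, ~{margin:.0f}% margin)")
print(f"Per-burst volume: {design_gbps * burst_s / 8:.2f} GB")   # 15.63 GB
```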

Architecture Options
Option A — Recommended

Combined Payload OBC + Recorder

3-slot chassis — single combined card approach
  • Slot 1: OBC-PF-VPX Combined
  • Slot 2: PolarStore-4N (4 TB)
  • Slot 3: Connector / Utility / Power
Mass: ~2.0 kg | Peak power: ~15 W | Slots: 3
Option B

Split Payload OBC and Recorder

4-slot chassis — dedicated cards approach
  • Slot 1: OBC-PF-VPX Recorder
  • Slot 2: PolarStore-4N (4 TB)
  • Slot 3: OBC-PF-VPX Payload OBC
  • Slot 4: Connector / Utility / Power
Mass: ~2.5 kg | Peak power: ~16 W | Slots: 4
Standalone OBC-HYPER reference: ~700 g, single-card form factor. Not shown in chassis view — intended for standalone deployments without the high-speed recorder subsystem.
Chassis Layout — Engineering View
[Interactive chassis diagram — animated data flow paths: CoaXPress ingest (sensor → DDR4), storage drain (DDR4 → NVMe), egress (storage → PolarFire → 2×10GigE @ 2.4 GB/s), image processing (DDR4 ↔ FPGA).]
Legend: teal border = processor card | amber border = storage card | grey border = utility/power card. Connectors are shown on the front panel; the VPX backplane provides high-speed inter-card communication. A conservative 1 GB/s write speed is used for safety margin.
RISC-V CPU — PolarFire SoC MPFS460T

Core Configuration

  • Application cores: 4× U54 @ 600 MHz
  • Monitor core: 1× E51
  • DMIPS/MHz: 1.714
  • Total DMIPS: ~4,114
  • OS support: Linux / bare-metal
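The headline DMIPS figure follows directly from the per-MHz rating, and the CoreMark single-core number in the benchmark list below scales the same way; a quick check:

```python
# Derivation of the headline CPU figures from the per-MHz ratings.
cores, clock_mhz = 4, 600
print(f"Total DMIPS: {cores * clock_mhz * 1.714:,.0f}")      # ~4,114
print(f"CoreMark single-core: {clock_mhz * 3.128:,.0f}")     # ~1,877 (x4 cores ~= 7,500)
```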

Benchmark Performance

  • CoreMark/MHz/core: 3.128
  • CoreMark single-core: ~1,877
  • CoreMark 4-core: ~7,500
  • CoreMark-PRO multi-core: 802.42
  • CoreMark-PRO single-core: 237.68
Practical throughput: a Sobel filter on a 1 Megapixel image completes in ~50–100 ms on 4 cores (NSF-SHREC benchmarks). Linux or bare-metal is selectable per deployment requirements.
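For reference, the operation behind that benchmark is a 3×3 gradient-magnitude filter. A minimal NumPy version is sketched below; it is illustrative only, not the NSF-SHREC benchmark code, and the 1024×1024 test frame is an assumption:

```python
# Minimal Sobel 3x3 gradient-magnitude filter (illustrative sketch).
import numpy as np

KX = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.int32)  # horizontal gradient
KY = KX.T                                                            # vertical gradient

def sobel(img: np.ndarray) -> np.ndarray:
    """Gradient magnitude of a 2-D grayscale image via 3x3 Sobel kernels."""
    h, w = img.shape
    gx = np.zeros((h - 2, w - 2))
    gy = np.zeros_like(gx)
    # Accumulate the 3x3 neighbourhood as shifted array views (vectorised inner loop)
    for i in range(3):
        for j in range(3):
            win = img[i:i + h - 2, j:j + w - 2].astype(np.float64)
            gx += KX[i, j] * win
            gy += KY[i, j] * win
    return np.sqrt(gx * gx + gy * gy)

frame = np.random.randint(0, 256, (1024, 1024), dtype=np.uint8)  # ~1 Mpixel test frame
edges = sobel(frame)
```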
FPGA Fabric
  • Logic elements: 461,053
  • Math blocks (18×18): 1,473
  • Fabric clock: 150–250 MHz — Microchip datasheet Table 56 gives a 250 MHz maximum fabric clock; complex designs typically achieve 150–200 MHz per user reports.
The fabric can implement custom image processing pipelines: histogram equalization, edge detection, thresholding, ROI extraction. It operates in parallel with the RISC-V — the fabric handles data movement and pixel processing while the CPU handles control logic.
Custom Image Processing Pipelines (FPGA Fabric)
  • Histogram Equalization — contrast enhancement; real-time per frame.
  • Edge Detection (Sobel/Canny) — feature extraction; 3×3 kernel convolution.
  • Thresholding / Binarization — segmentation; single-cycle per pixel.
  • ROI Extraction — region crop and forward; zero-copy DMA.

Pipeline Resource Usage & Benchmarks

Pipeline | Fabric LUTs | Fabric DSPs | Fabric Clock | Throughput | CPU Involvement | Notes
Histogram Equalization | ~2,000 (~0.4%) | 0 | 200 MHz | ~200 Mpix/s | Config only | Fully fabric-accelerated, single-pass
Edge Detection (Sobel 3×3) | ~5,000 (~1.1%) | 8 | 200 MHz | ~200 Mpix/s | Config only | 3×3 convolution kernel in fabric
Thresholding / Binarization | ~500 (~0.1%) | 0 | 200 MHz | ~200 Mpix/s | Config only | Single-cycle per pixel, trivial resources
ROI Extraction | ~1,500 (~0.3%) | 0 | 200 MHz | ~200 Mpix/s | ROI coords via RISC-V | Zero-copy DMA crop
All 4 combined | ~9,000 (~2%) | 8 | 200 MHz | ~200 Mpix/s | Minimal | Pipelined: all stages in series
All processing is fabric-accelerated — the CPU handles configuration and control only, so the RISC-V cores remain free for payload OBC duties, telemetry, and communication. Combined fabric usage is only ~2% of the 461K LEs, leaving ample headroom for VectorBlox (up to the 75%-LUT custom configuration below) or other logic. Pixel throughput at a 200 MHz fabric clock: 200 Mpixel/s, i.e. 200 frames/s for 1 Megapixel images.
How the streaming pipeline works: These pipelines operate as a streaming chain inside the FPGA fabric. Data flows through all stages at wire speed — one pixel per clock cycle. At 200 MHz fabric clock, this delivers ~200 Mpixel/s throughput. The RISC-V CPU is NOT involved in pixel processing — it only sets configuration registers (thresholds, ROI coordinates, kernel coefficients). This keeps the CPU cores free for payload control, telemetry, and spacecraft communication.
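The throughput and resource claims cross-check in a few lines (figures taken from the resource table above; Python used purely as a calculator):

```python
# Streaming-chain throughput and combined fabric usage, per the table above.
FABRIC_CLOCK_HZ = 200e6        # 200 MHz fabric clock, one pixel per clock
FRAME_PIXELS = 1_000_000       # 1 Megapixel frames
TOTAL_LES = 461_053            # MPFS460T logic elements

stage_luts = {"hist_eq": 2_000, "sobel_3x3": 5_000, "threshold": 500, "roi": 1_500}
total_luts = sum(stage_luts.values())

print(f"Throughput: {FABRIC_CLOCK_HZ / 1e6:.0f} Mpix/s = "
      f"{FABRIC_CLOCK_HZ / FRAME_PIXELS:.0f} frames/s at 1 Mpixel")
print(f"Combined pipeline: ~{total_luts:,} LUTs (~{100 * total_luts / TOTAL_LES:.1f}% of fabric)")
```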

Pipeline Data Flow

Camera input → AXI4-Stream DMA → FPGA fabric pipeline processing → AXI4 → DDR4 output. The RISC-V CPU provides control and configuration over FIC 0/1.
VectorBlox AI Accelerator — soft-core neural network overlay from Microchip; runs quantised INT8 models inside the FPGA fabric. Optional add-on.
Config | Peak GOPS | Fabric Usage | MobileNet-v1 | TinyYOLO-v3 | Power
V250 | 79 GOPS INT8 | ~6% LUTs | 26 fps | 9 fps | 452 mW
V500 | 146 GOPS INT8 | ~10% LUTs | 48 fps | 17 fps | 825 mW
V1000 | 279 GOPS INT8 | ~13% LUTs | 68 fps | 27 fps | 1,300 mW
VBX Custom (75% fabric) | ~1,600 GOPS INT8 | 75% LUTs | ~400+ fps | ~150+ fps | ~5 W
Custom 75% fabric configuration: Extrapolated from V1000 baseline (279 GOPS at ~13% LUTs). Scaling to 75% fabric utilisation: 279 × (75/13) ≈ 1,610 GOPS INT8. Standard V250/V500/V1000 configs also available for lower-power profiles. Power estimate assumes proportional scaling.
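Reproducing that extrapolation (the linear LUT-share scaling is the assumption stated above; real designs may scale sub-linearly once routing congestion sets in):

```python
# VBX Custom (75% fabric) extrapolated linearly from the V1000 baseline.
v1000_gops, v1000_lut_frac = 279, 0.13
print(f"~{v1000_gops * 0.75 / v1000_lut_frac:,.0f} GOPS INT8")   # ~1,610 GOPS
```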
Memory Bandwidth
  • DDR4 peak: 12.8 GB/s
  • Ingest (25 Gbps): 3.125 GB/s — CoaXPress v2.0, up to 12.5 Gbps/lane high-speed camera interface
  • NVMe write (PolarStore-4N): ~4 GB/s
  • Remaining during burst: ~5.7 GB/s
  • LPDDR4 (MSS): 3.2 GB/s
Between bursts, the full 12.8 GB/s DDR4 bandwidth is available for image processing. During a burst, ~5.7 GB/s remains after ingest and storage drain. For selective processing of individual frames (not real-time on the full stream), this is more than sufficient. LPDDR4 4 GB (3.2 GB/s, x32 @ 1600 MT/s) is dedicated to RISC-V, independent of fabric DDR4.
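The ~5.7 GB/s figure is simply the DDR4 peak minus the two consumers active during a burst:

```python
# DDR4 bandwidth budget during a burst (values from the list above).
ddr4_peak = 12.8      # GB/s, fabric DDR4 peak
ingest = 25 / 8       # GB/s, 25 Gbps CoaXPress design point (3.125 GB/s)
nvme_drain = 4.0      # GB/s, PolarStore-4N peak write

print(f"Remaining for processing: {ddr4_peak - ingest - nvme_drain:.2f} GB/s")  # ~5.7 GB/s
```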
Option A vs B — Processing Budget
Option A — Combined

RISC-V handles payload control + image processing + recorder management. Must time-share.
Fabric handles ingest + storage DMA + processing pipeline.

Option B — Split

Recorder card handles ingest + storage + forwarding exclusively.
OBC card has full RISC-V + fabric available for payload control and image processing. No contention.

Burst & Memory Calculator

Input Parameters

  • Burst duration: 5.00 s
  • Ingest rate: 25.0 Gbps
  • Inter-burst gap: 60 s
  • Storage write speed: 1 GB/s

A conservative 1 GB/s write speed is used for safety margin. PolarStore-4N (2× 2 TB NVMe, 4 TB total) peaks at 4 GB/s.

  • Burst volume: 15.63 GB
  • DDR4 buffer fill: 97.7%
  • Time to drain: 3.91 s
  • Ready before next burst: Yes
  • Bursts cacheable (no drain): 1
  • Average write rate: 266.7 MB/s
  • Egress via 2×10GigE: 6.51 s @ 2.4 GB/s
  • Egress via 1×10GigE: 13.02 s @ 1.2 GB/s
  • Egress via LVDS: 15.63 s @ 1.0 GB/s
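The outputs above can be reproduced with a short script. This is a sketch, not the page's actual code; note that the displayed drain time corresponds to the 4 GB/s peak write speed, while the 1 GB/s conservative input would give 15.63 s:

```python
# Re-derivation of the burst calculator outputs.
burst_s, ingest_gbps, gap_s = 5.0, 25.0, 60.0
ddr4_gb, nvme_peak = 16.0, 4.0               # DDR4 buffer (GB), NVMe peak write (GB/s)

burst_gb = ingest_gbps * burst_s / 8          # 15.63 GB burst volume
print(f"Burst volume: {burst_gb:.2f} GB, buffer fill {100 * burst_gb / ddr4_gb:.1f}%")
drain_s = burst_gb / nvme_peak                # 3.91 s at peak write speed
print(f"Drain: {drain_s:.2f} s, ready before next burst: {drain_s <= gap_s}")
print(f"Bursts cacheable with no drain: {int(ddr4_gb // burst_gb)}")
for link, rate in [("2x10GigE", 2.4), ("1x10GigE", 1.2), ("LVDS", 1.0)]:
    print(f"Egress via {link}: {burst_gb / rate:.2f} s @ {rate} GB/s")
```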

Burst Timeline Visualization

Phases: burst capture → storage drain → available for image processing + LVDS transfer → idle / next burst.
Ping-Pong Double Buffering

DDR4 buffer split into two halves — enables continuous recording without data loss. While one buffer captures incoming data, the other drains to storage. Roles swap seamlessly every burst cycle.

[Diagram: the 16 GB DDR4 buffer is split into Buffer A (8 GB) and Buffer B (8 GB). Phase 1: A = WRITE (from CoaXPress), B = DRAIN (to NVMe); Phase 2: roles swap. Continuous recording with zero data loss between bursts. Teal dots = incoming write; amber dots = NVMe drain.]
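A quick feasibility check of the swap (rates from the memory-bandwidth section). At the 1 GB/s conservative write speed the swap alone would overrun, but the full 16 GB buffer still absorbs a complete 15.63 GB burst and drains during the inter-burst gap:

```python
# Ping-pong feasibility: the draining half must empty before the writing half fills.
half_gb = 8.0        # each half of the 16 GB DDR4 buffer
write_gb_s = 3.125   # CoaXPress ingest (25 Gbps)
drain_gb_s = 4.0     # PolarStore-4N peak write

t_fill = half_gb / write_gb_s    # 2.56 s to fill one half during a burst
t_drain = half_gb / drain_gb_s   # 2.00 s to empty the other half
verdict = "continuous recording, zero loss" if t_drain <= t_fill else "overrun risk"
print(f"fill {t_fill:.2f} s vs drain {t_drain:.2f} s: {verdict}")
```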
Configuration Summary

Selected Architecture

Option A — Combined Payload OBC + Recorder

Burst Scenario Results (see Burst & Memory Calculator above)

Processing Capability Highlights

  • CPU: 4× U54 @ 600 MHz — ~4,114 DMIPS
  • CoreMark 4-core: ~7,500
  • FPGA fabric: 461K LEs, 1,473 DSPs @ 150–250 MHz
  • Pipeline processing: ~9,000 LUTs (2% fabric) — 200 Mpix/s @ 200 MHz
  • DDR4 bandwidth: 12.8 GB/s peak
  • AI inference (VBX Custom 75%): ~1,600 GOPS INT8
  • Image processing: Sobel on 1 MP in ~50–100 ms (4 cores)
  • Downstream: LVDS → IDT (est. 8 Gbps, 4–8 pairs)