Add Lossless Data Product Compression Components#5235

Open
kubiak-jpl wants to merge 35 commits into nasa:devel from kubiak-jpl:dev/dpcompressproc

Conversation

@kubiak-jpl

Collaborator
Related Issue(s) #4941
Has Unit Tests (y/n) y
Documentation Included (y/n) y
Generative AI was used in this contribution (y/n) n

Change Description

This PR makes the following changes

  • Adds the DpCompressProc component
    • This component plugs into the DpWriter Proc interface and is responsible for creating compressed data products. It uses a port call to a second compression component for the actual compression
  • Adds the DpZLibCompressor component
    • Uses libz to compress chunks of data
    • DpZLibCompressor requires libz to compile. This is a very common library, even in embedded contexts, but I currently don't have any CMake logic to guard against its absence. @LeStarch Any thoughts here?
  • Adds compressed data product pipeline to TestDeployment and DpDemo subtopology
  • Adds command arguments to DpDemo to trigger compression of demo data products via command
  • Updates serializeFrom to use memmove instead of memcpy
    • DpCompressProc re-serializes data into an existing buffer, so the input buffer and the destination serialized buffer can overlap. In those situations memcpy would execute with overlapping buffers and trigger undefined behavior
    • This change lets me use serializeFrom method calls, which makes the algorithm that rewrites the data product cleaner. However, I can also revert this change and modify my code to use memmove directly

Rationale

Integrates lossless data product compression seamlessly into the F Prime Data Product subsystem.

Testing/Review Recommendations

The algorithm to modify the data product in place is non-trivial. I believe I have sufficiently thought through the state machine and have unit tests to cover all the cases, but it could use a thorough review.

I had this integrated into the Ref deployment before that was moved.

Future Work

  1. Implement a compressor using LZMA for better compression ratios

AI Usage (see policy)

N/A

LeStarch
LeStarch previously approved these changes Jun 9, 2026
Comment thread Svc/DpZLibCompressor/DpZLibCompressor.hpp Fixed
static_cast<FwAssertArgType>(ctx->zlib_alloc_buffer.getSize()));
const FwSizeType free_space = ctx->zlib_alloc_buffer.getSize() - ctx->bump_allocator;

const FwSizeType alloc_size = items * size;
// Component construction and destruction
// ----------------------------------------------------------------------

DpCompressProc ::DpCompressProc(const char* const compName) : DpCompressProcComponentBase(compName) {}

DpCompressProc ::~DpCompressProc() {}
Comment thread Svc/DpCompressProc/DpCompressProc.cpp Fixed
Comment thread Svc/DpCompressProc/DpCompressProc.cpp Fixed
Comment thread Svc/DpCompressProc/DpCompressProc.cpp Fixed

}

Svc::CompressionAlgorithm DpZLibCompressor ::compressChunk_handler(FwIndexType portNum,
Comment thread Svc/DpZLibCompressor/DpZLibCompressor.cpp Fixed
Comment thread Svc/DpZLibCompressor/DpZLibCompressor.cpp Fixed
Comment thread Svc/DpZLibCompressor/DpZLibCompressor.cpp Fixed
Comment thread Svc/DpZLibCompressor/DpZLibCompressor.hpp Fixed

@github-advanced-security github-advanced-security AI left a comment

CodeQL found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

@github-actions

github-actions Bot commented Jun 9, 2026

Contributor

Coverage report — base devel

Overall (line): 81.10% → 81.22% (+0.12)
Regression threshold: 0.50% (line).

Regressions

(none over threshold)

Modules changed

| Module | Line | Δ | Function | Δ | Branch | Δ |
|---|---|---|---|---|---|---|
| Fw/DataStructures | 98.21 | -0.09 | 97.14 | +0.00 | 82.84 | -0.19 |

Modules without UTs

CFDP/Checksum/GTest, Drv/ByteStreamDriverModel, Drv/Interfaces, Drv/LinuxGpioDriver, Drv/LinuxI2cDriver, Drv/LinuxSpiDriver, Drv/LinuxUartDriver, Drv/Ports, Drv/Ports/DataTypes, FppTestProject/FppTest/interfaces, FppTestProject/FppTest/topology/async, FppTestProject/FppTest/topology/components/Comp, FppTestProject/FppTest/topology/components/Framework, FppTestProject/FppTest/topology/components/Receiver, FppTestProject/FppTest/topology/components/Sender, FppTestProject/FppTest/topology/guarded, FppTestProject/FppTest/topology/ports, FppTestProject/FppTest/topology/sync, FppTestProject/FppTest/topology/top_ports, FppTestProject/FppTest/topology/types, Fw/Cmd, Fw/Com, Fw/Comp, Fw/FilePacket/GTest, Fw/Fpy, Fw/Interfaces, Fw/Obj, Fw/Port, Fw/Ports/CompletionStatus, Fw/Ports/Ready, Fw/Ports/Signal, Fw/Ports/SuccessCondition, Fw/Prm, Fw/SerializableFile/test/TestSerializable, Fw/Sm, Fw/Test, Fw/Types/GTest, Os/Models, Svc/Cycle, Svc/DpPorts, Svc/Fatal, Svc/FatalHandler, Svc/FileDownlinkPorts, Svc/FprimeProtocol, Svc/Interfaces, Svc/PassiveConsoleTextLogger, Svc/Ping, Svc/PolyIf, Svc/Ports/CommsPorts, Svc/Ports/FilePorts, Svc/Ports/OsTimeEpoch, Svc/Ports/TlmPacketizerPorts, Svc/Ports/VersionPorts, Svc/Sched, Svc/Seq, Svc/Subtopologies/CdhCore, Svc/Subtopologies/ComCcsds, Svc/Subtopologies/ComFprime, Svc/Subtopologies/ComLoggerTee, Svc/Subtopologies/DataProducts, Svc/Subtopologies/FileHandling, Svc/Types/TlmPacketizerTypes, Svc/WatchDog, TestDeploymentsProject/Ref/PingReceiver, TestDeploymentsProject/Ref/RecvBuffApp, TestDeploymentsProject/Ref/SendBuffApp, TestDeploymentsProject/Ref/Top, TestDeploymentsProject/Ref/TypeDemo, cmake/test/data/TestDeployment/TestBuildAutocoder, cmake/test/data/TestDeployment/TestChainedAutocoder, cmake/test/data/TestDeployment/TestHeaderAutocoder, cmake/test/data/TestDeployment/TestTargetAutocoder, cmake/test/data/test-fprime-library/TestLibrary/TestComponent, cmake/test/data/test-fprime-library2/TestLibrary2/TestComponent

@kubiak-jpl

Collaborator Author

@LeStarch Looks like some of the failures are due to missing zlib.h. Do you want me to find a solution to compiling on platforms without zlib?

@kubiak-jpl

Collaborator Author

I'm a little confused by this error in the RHEL8 build

/__w/fprime/fprime/Svc/DpCompressProc/DpCompressProc.cpp:358:31: error: conversion from ‘int’ to ‘FwSizeStoreType’ {aka ‘short unsigned int’} may change value [-Werror=conversion]
             uncompressed_size += chunk_size;

because both uncompressed_size and chunk_size are FwSizeStoreType.

// Component construction and destruction
// ----------------------------------------------------------------------

DpZLibCompressor ::DpZLibCompressor(const char* const compName) : DpZLibCompressorComponentBase(compName) {}
ctx.comp.log_WARNING_LO_BufferTooBigForZLib(in_buffer.getSize(), ctx.compression_buffer.getSize(),
std::numeric_limits<uInt>::max());
return CompressionAlgorithm::UNCOMPRESSED;
}
// The call to deflateEnd does not flush any additional output data
}

voidpf DpZLibCompressor::zlib_alloc_fn(voidpf opaque, uInt items, uInt size) {
// Handler implementations for typed input ports
// ----------------------------------------------------------------------

void DpCompressProc::serializeCompressionHeader(Fw::LinearBufferBase& serializer,
FW_ASSERT(ok == Fw::FW_SERIALIZE_OK, ok);
}

void DpCompressProc ::procRequest_handler(FwIndexType portNum, Fw::Buffer& fwBuffer) {
return alg;
}

CompressionAlgorithm DpZLibCompressor::zlibCompressionHelper(ZLibCtx& ctx, const Fw::Buffer& in_buffer) {
DpZLibCompressor& comp;

ZLibCtx(DpZLibCompressor& c)
: compression_buffer(), zlib_alloc_buffer(), bump_allocator(0), zlib_stream(), comp(c) {}
Comment thread Svc/DpZLibCompressor/DpZLibCompressor.hpp Fixed
@thomas-bc

Collaborator

@kubiak-jpl this is a type of warning we're used to seeing on the RHEL8 compilers. FwSizeStoreType is a U16, but the addition in += promotes the operands to int (as the error message shows), so there's an implicit narrowing conversion back to U16. I think uncompressed_size = static_cast<FwSizeStoreType>(uncompressed_size + chunk_size) should fix it.

@kubiak-jpl

Collaborator Author

> @kubiak-jpl this is a type of warning we're used to seeing on the RHEL8 compilers. FwSizeStoreType is a U16, but the addition in += promotes the operands to int (as the error message shows), so there's an implicit narrowing conversion back to U16. I think uncompressed_size = static_cast<FwSizeStoreType>(uncompressed_size + chunk_size) should fix it.

I didn't realize that could happen. I made your requested change.

@thomas-bc

Collaborator

For the missing zlib.h - hopefully installing zlib/zlib-devel around here (in all 3 jobs) should fix it

dnf install -y git python3.12 python3.12-pip llvm-toolset libasan libubsan java-1.8.0-openjdk


instance dpCompressProc: Svc.DpCompressProc base id DataProductsConfig.BASE_ID + 0x04000

instance dpZLibCompressor: Svc.DpZLibCompressor base id DataProductsConfig.BASE_ID + 0x05000

@thomas-bc thomas-bc Jun 10, 2026

Collaborator

CI is exposing that the zlib dependency introduced by the DpZLibCompressor component is not a standard library that's available on most distributions (the Ubuntu, RHEL8, and macOS runners all lack it).

Adding this component to the DataProducts subtopology adds a new system dependency on zlib for systems that want to use this subtopology. I don't think this is desirable as a default option.

Likely the best way forward here is to make the inclusion of this component optional (build option, other subtopology, or something else). That would be awesome, but it's possibly more work than you want to be doing right now.

Given that the change is only adding port connections, a good-and-easy solution would be to add it in the topology of one of our references, but leave it out of the subtopology by default. https://github.com/nasa/fprime-examples/tree/pr-5235 would be a good place.

There may be other options too.

@kubiak-jpl thoughts? TL;DR multiple options, let me know what you prefer

Collaborator Author

Another option would be to conditionally compile the component as a no-op if zlib isn't available. That would prevent breaking these other deployments, but it might add more confusion about whether compression is actually enabled or not.

Maybe there's a middle ground where the DpCompressProc component stays in the subtopology, but we omit the compressor. Then users could add compressors their system supports. Do subtopologies support exporting ports yet?

Collaborator

Yeah conditionally compiling a feature with a no-op component doesn't really respect the intended component-based design.

And yes subtopologies can export ports, for the top-level topology to easily connect to: https://nasa.github.io/fpp/fpp-users-guide.html#Defining-Topologies_Topology-Ports

Collaborator Author

How about this: I can update the DataProducts subtopology to export the DpWriter Proc port, and I'll write a subtopology with DpCompressProc and DpZLibCompressor as an example of how to enable compression. Does that work? Would it make sense for the compression subtopology to exist in the fprime repo?

@thomas-bc thomas-bc Jun 11, 2026

Collaborator

Yes that's a great idea! Amazing.

If it were me I would keep the DpProc and DpZLib components in the same (optional) subtopology. That way to enable compression, all one needs to do is

import Subtopologies.DataProducts
+ import Subtopologies.DpCompression

[....]

+ DataProducts.procBufferSendOut[0] -> DpCompression.procRequest

Or is there a use-case for having a dpCompressProc component in the DataProducts subtopology even if it's not connected to anything?

Collaborator

And this is actually a really nifty pattern for all DpProcessing systems...

Collaborator Author

Sounds good. I'll also add some cmake logic to only expose the DpZLibCompressor component if zlib is found. Something with find_library

4 participants