MINIFICPP-2849 Implement LMDB based content repository #2201

Open
lordgamez wants to merge 3 commits into apache:main from lordgamez:MINIFICPP-2849

@lordgamez

Contributor

https://issues.apache.org/jira/browse/MINIFICPP-2849


Thank you for submitting a contribution to Apache NiFi - MiNiFi C++.

In order to streamline the review of the contribution we ask you to ensure the following steps have been taken:

For all changes:

  • Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?

  • Does your PR title start with MINIFICPP-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.

  • Has your PR been rebased against the latest commit within the target branch (typically main)?

  • Is your initial contribution a single, squashed commit?

For code changes:

  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE file?
  • If applicable, have you updated the NOTICE file?

For documentation related changes:

  • Have you ensured that format looks appropriate for the output in which it is rendered?

Note:

Please ensure that once the PR is submitted, you check GitHub Actions CI results for build issues and submit an update to your PR as soon as possible.

size_t LmdbStream::write(const uint8_t* value, size_t size) {
  if (!write_enable_) { return STREAM_ERROR; }
  if (size != 0 && IsNullOrEmpty(value)) { return STREAM_ERROR; }
  value_.append(reinterpret_cast<const char*>(value), size);

Contributor Author

LMDB has no append operation comparable to RocksDB's Merge function. Instead of rereading the original value, appending to it, and writing the new value back, all writes are buffered until the stream is closed; the actual write and commit happen at that point. Currently all content repository streams are used for either write-only or read-only purposes, so there should be no use case where reads and writes are mixed. A separate PR should change the content repository interface to use distinct OutputStream and InputStream types to enforce this, which would also result in separate LmdbInputStream and LmdbOutputStream types (and the same for RocksDB).
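The buffering approach described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `FakeStore`, `BufferedWriteStream`, `kStreamError`, and `demo_buffered_write` are hypothetical names, and a `std::map` stands in for the database so the example stays self-contained. In an LMDB-backed stream, the single assignment in `close()` is where the put and the transaction commit would presumably happen.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for the key-value store; this is NOT the LMDB API.
using FakeStore = std::map<std::string, std::string>;

// Hypothetical write-only stream: write() only buffers, close() stores once.
class BufferedWriteStream {
 public:
  static constexpr std::size_t kStreamError = static_cast<std::size_t>(-1);

  BufferedWriteStream(FakeStore& store, std::string key)
      : store_(store), key_(std::move(key)) {}

  // Closing in the destructor ensures buffered data is not silently dropped.
  ~BufferedWriteStream() { close(); }

  // Appends to the in-memory buffer; nothing reaches the store yet.
  std::size_t write(const std::uint8_t* value, std::size_t size) {
    if (closed_) { return kStreamError; }
    if (size == 0) { return 0; }
    if (value == nullptr) { return kStreamError; }
    buffer_.append(reinterpret_cast<const char*>(value), size);
    return size;
  }

  // Performs the one and only store update for this stream.
  void close() {
    if (closed_) { return; }
    store_[key_] = buffer_;
    closed_ = true;
  }

 private:
  FakeStore& store_;
  std::string key_;
  std::string buffer_;
  bool closed_ = false;
};

// Usage sketch: two writes become one stored value when the stream closes.
inline std::string demo_buffered_write() {
  FakeStore store;
  BufferedWriteStream stream(store, "claim-1");
  const std::string a = "hello ";
  const std::string b = "world";
  stream.write(reinterpret_cast<const std::uint8_t*>(a.data()), a.size());
  stream.write(reinterpret_cast<const std::uint8_t*>(b.data()), b.size());
  assert(store.count("claim-1") == 0);  // nothing stored before close()
  stream.close();
  return store.at("claim-1");
}
```

The trade-off of this design is that the whole value is held in memory until the stream closes, in exchange for avoiding a read-modify-write cycle on every `write()` call.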

lordgamez marked this pull request as ready for review June 22, 2026 09:54
Comment thread cmake/LMDB.cmake Outdated

if (WIN32)
    get_directory_property(MINIFI_SAVED_COMPILE_DEFS COMPILE_DEFINITIONS)
    remove_definitions(-DWIN32_LEAN_AND_MEAN)

Member

Can you explain why you did this?

Contributor Author

LMDB fails to compile on Windows when WIN32_LEAN_AND_MEAN is defined. That definition is added automatically to the compile definitions in CMakeLists.txt, so it applies to every third-party dependency pulled in with FetchContent and has to be removed separately for LMDB on Windows.

szaszm (Jun 22, 2026)

Member

I'd like to make absolutely sure that we only undefine WIN32_LEAN_AND_MEAN while compiling LMDB, or find another workaround. The macro prevents windows.h from including a number of additional headers that usually end up unused, may conflict with other headers, and increase compile times. Alternatively, we could look up the transitively included headers that LMDB relies on and include them explicitly with target_compile_options(target PRIVATE -include foo.h).

Ideally Microsoft would have made the lightweight header the default, with the option to opt in to additional features, but for historical reasons the heavyweight header is the default and you have to opt out of the extra features (the bloat).

Contributor Author

When I checked this, my understanding was that the change only affects the lmdb subdirectory, so it should not affect any other compilation. I can still check whether there is another workaround.

Contributor Author

There were 2 issues:

  • Defining WIN32_LEAN_AND_MEAN dropped winternl.h, which is needed for NTSTATUS, so it had to be included explicitly
  • After including winternl.h, the function pointer names in mdb.c clashed with the Windows symbols, so they had to be renamed

Added a patch in: 52d93b2
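The shape of the two fixes can be illustrated with a small sketch. It is not the actual patch: `SketchNtCloseFn`, `sketch_nt_close`, and `windows_sketch_active` are hypothetical names, and `NtClose` is used only as an example of a function that winternl.h declares, since the thread does not list which names clashed. The Windows-specific part is guarded so the snippet also compiles on other platforms.

```cpp
#ifdef _WIN32
#ifndef WIN32_LEAN_AND_MEAN
#define WIN32_LEAN_AND_MEAN  // keep the lightweight windows.h
#endif
#include <windows.h>
#include <winternl.h>  // included explicitly; provides NTSTATUS

// Hypothetical renamed pointer. A variable named exactly NtClose would
// conflict with the NtClose function that winternl.h declares, so the
// pointer gets a distinct name instead.
typedef NTSTATUS(NTAPI* SketchNtCloseFn)(HANDLE);
static SketchNtCloseFn sketch_nt_close = nullptr;
#endif

// True only on Windows builds, where the guarded block above is compiled.
inline bool windows_sketch_active() {
#ifdef _WIN32
  return sketch_nt_close == nullptr;
#else
  return false;
#endif
}
```

Keeping WIN32_LEAN_AND_MEAN defined and adding the one missing header avoids changing compile definitions for anything other than the file that needs the extra declarations.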
