ImportAll work queue manager to import items in parallel in some situations #970

Open
raymond-rebbeck wants to merge 2 commits into intersystems:main from raymond-rebbeck:import-all-no-compile-queue

@raymond-rebbeck
Contributor

Description

Import All currently runs in a single thread, which becomes a bottleneck with a large codebase: for example, it can take 10+ minutes to import 6000+ files. We have a CI/CD pipeline that runs ImportAll dozens of times a day, and ImportAll is the step that currently takes the most time.

We do not use compileOnImport or decomposeProductions at all, so the scope has purposely been limited to the case where neither option is in use. In that case it seems relatively straightforward to apply a work queue manager so that ImportItem is performed in parallel.
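As a rough illustration of the gating described above (not the actual ObjectScript change in this PR), the following Python sketch uses a thread pool as a stand-in for the work queue manager. The names `import_item` and `import_all`, the `workers` parameter and the returned mode string are all hypothetical and exist only for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def import_item(name):
    """Hypothetical stand-in for ImportItem; just reports what it handled."""
    return f"imported {name}"


def import_all(items, compile_on_import=False, decompose_productions=False, workers=4):
    """Import items in parallel only when neither option is enabled.

    Returns a (mode, results) tuple so the chosen path can be inspected.
    """
    if compile_on_import or decompose_productions:
        # Either option is in use: keep the existing single-threaded path.
        return "serial", [import_item(item) for item in items]

    # Neither option is in use: fan the items out to a pool of workers.
    # Executor.map returns results in the same order as the input items.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return "parallel", list(pool.map(import_item, items))
```

Calling `import_all(["a.cls", "b.inc"])` takes the parallel path, while passing `compile_on_import=True` or `decompose_productions=True` falls back to the serial one.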

This has yielded significant improvements, with importing being 3-4 times faster. When system processes are observed using top, multiple irisdb processes are seen churning away with high CPU usage (as expected), rather than only a single process as before.

It is very likely possible to refactor ImportRoutines so that more can be done in parallel. However, the current code structure makes it straightforward to apply this to ImportItem, which seems to be where most of the heavy lifting is currently done anyway, so that is likely where the most benefit is derived.

Using a work queue manager with compileOnImport would probably require a separate implementation within the pull event handler itself, which is not planned as part of this work because we do not use this setting. Using it with decomposeProductions would probably require additional locking in ImportItem so that productions are created and updated safely; this is also not planned as part of this work, for the same reason. It is conceivable that such a change may help with #917.
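To make the locking concern concrete, here is a minimal Python sketch of the general idea, assuming one lock per production so that parallel workers serialize their create-or-update step for the same production. It is not how ImportItem works today; `import_items`, the `(production, item)` pairs and the in-memory dictionaries are hypothetical and exist only for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def import_items(pairs, workers=4):
    """Apply (production, item) pairs in parallel with one lock per production."""
    productions = {}          # production name -> list of items (hypothetical store)
    locks = {}                # production name -> lock guarding that production
    guard = threading.Lock()  # protects the locks dictionary itself

    def import_production_item(pair):
        production, item = pair
        with guard:
            # Create the per-production lock on first use.
            lock = locks.setdefault(production, threading.Lock())
        with lock:
            # Create-or-update is a read-modify-write, so it runs under the lock.
            current = productions.get(production, [])
            productions[production] = current + [item]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Consume the iterator so any worker exception is raised here.
        list(pool.map(import_production_item, pairs))

    return {name: sorted(items) for name, items in productions.items()}
```

Without the per-production lock, two workers could both read the same state for a production and one update would overwrite the other, which is the kind of race the paragraph above anticipates.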

Testing

Tested manually with a codebase of 6000+ files (types include .cls, .hl7, .inc and .lut), importing everything from scratch (i.e. into a fresh, empty IRIS container) and then repeatedly running Import All (Force) over the top. Tests were run with and without compileOnImport and decomposeProductions enabled, to check that the work queue manager is and isn't being used as expected. After an Import All using the work queue manager, compiling all classes and running unit tests completed without any issues.

Checklist

  • [Y] This branch has the latest changes from the main branch rebased or merged.
  • [N/A] Web UI has been built (any changes in git-webui/src have matching changes in git-webui/release)
  • [N/A] CHANGELOG.md entry added if appropriate.
  • [N/A] Documentation has been/will be updated

…o improve performance

Currently only used when not compiling.

When compiling, the pull event handler is used, which will require a different work queue manager implementation.
…mposed productions are in use

It seems very likely that there would be race conditions within ImportItem without further locking being implemented to ensure that productions are created and updated safely.