refactor: split levenshtein_distance.js into cohesive single-responsibility modules #776
Open
Wolfvin wants to merge 1 commit into
…bility modules

DECOMPOSITION: Split levenshtein_distance.js (242 lines, 4 exports) into:
- levenshtein_distance.js: Pure Levenshtein distance only (LevenshteinDistance + LevenshteinDistanceSearch)
- damerau_levenshtein_distance.js: Damerau-Levenshtein distance only (DamerauLevenshteinDistance + DamerauLevenshteinDistanceSearch)
- edit_distance_utils.js: Shared DP utilities (computeEditDistance, findMinCostSubstring)

NAMING: Renamed internal functions for clarity:
- _getMatchStart() → traceBackMatchStart()
- getMinCostSubstring() → findMinCostSubstring()
- levenshteinDistance() → computeEditDistance()

COHESION: Each file now has a single responsibility:
- edit_distance_utils.js: Core DP algorithm (shared by both variants)
- levenshtein_distance.js: Levenshtein-specific wrapper functions
- damerau_levenshtein_distance.js: Damerau-specific wrapper functions

SINGLE RESPONSIBILITY: One file = one distance algorithm variant

REDUCE COUPLING: Levenshtein and Damerau variants no longer need to know about each other. They both delegate to the shared DP core in edit_distance_utils.js.

All existing tests pass. The index.js re-exports are unchanged, so this is a non-breaking internal refactor.

Verified by Regrets fingerprint-based regression testing:
- 19 clusters: ALL GREEN
- 3 chains: ALL GREEN
- KEBENARAN 1 (raw output): IDENTICAL
- KEBENARAN 2 (fingerprints): IDENTICAL
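The delegation described in the commit message can be pictured with a minimal single-file sketch. It is not the code from this PR: the function names are taken from the commit message, but the signatures, the options object, and the restricted (adjacent-swap) transposition rule are assumptions made for illustration, and the textbook unit-cost DP stands in for whatever edit_distance_utils.js actually contains.

```javascript
// Stand-in for edit_distance_utils.js: one shared DP core.
// ASSUMPTION: the { transposition } option is hypothetical, not the library's API.
function computeEditDistance(source, target, { transposition = false } = {}) {
  const rows = source.length + 1;
  const cols = target.length + 1;
  // d[i][j] = cost of turning the first i chars of source into the first j chars of target
  const d = Array.from({ length: rows }, () => new Array(cols).fill(0));
  for (let i = 0; i < rows; i++) d[i][0] = i; // i deletions
  for (let j = 0; j < cols; j++) d[0][j] = j; // j insertions

  for (let i = 1; i < rows; i++) {
    for (let j = 1; j < cols; j++) {
      const substitution = source[i - 1] === target[j - 1] ? 0 : 1;
      d[i][j] = Math.min(
        d[i - 1][j] + 1, // deletion
        d[i][j - 1] + 1, // insertion
        d[i - 1][j - 1] + substitution // match or substitution
      );
      // Restricted Damerau step: swap of two adjacent characters costs 1.
      if (
        transposition && i > 1 && j > 1 &&
        source[i - 1] === target[j - 2] &&
        source[i - 2] === target[j - 1]
      ) {
        d[i][j] = Math.min(d[i][j], d[i - 2][j - 2] + 1);
      }
    }
  }
  return d[rows - 1][cols - 1];
}

// Stand-in for levenshtein_distance.js: a thin wrapper, unaware of the Damerau variant.
function LevenshteinDistance(source, target) {
  return computeEditDistance(source, target);
}

// Stand-in for damerau_levenshtein_distance.js: another thin wrapper over the same core.
function DamerauLevenshteinDistance(source, target) {
  return computeEditDistance(source, target, { transposition: true });
}

console.log(LevenshteinDistance("kitten", "sitting")); // 3
console.log(DamerauLevenshteinDistance("az", "za")); // 1
console.log(LevenshteinDistance("az", "za")); // 2
```

Neither wrapper references the other. Both depend only on the shared core, which is the coupling reduction the commit message describes.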
What was refactored and why
lib/natural/distance/levenshtein_distance.js was a 242-line file exporting 4 functions (Levenshtein + Damerau-Levenshtein variants) from a single module. This violated several design principles:
- The names _getMatchStart() and levenshteinDistance() were unclear

What changed
- Before: levenshtein_distance.js (242 lines, 4 exports); index.js imports from one file
- After: levenshtein_distance.js (68 lines, 2 exports), damerau_levenshtein_distance.js (70 lines, 2 exports), edit_distance_utils.js (163 lines); index.js imports from two files

Naming improvements
- _getMatchStart() → traceBackMatchStart(): describes what it does
- getMinCostSubstring() → findMinCostSubstring(): more precise verb
- levenshteinDistance() → computeEditDistance(): generic name reflecting shared nature

Behavioral verification
This refactor was verified using Regrets (fingerprint-based regression testing) with a 4-verification pattern:
KEBENARAN 1 (raw output) vs Final Output
All 19 clusters produce identical output after refactoring:
- LevenshteinDistance("kitten", "sitting") → 3 (unchanged)
- DamerauLevenshteinDistance("az", "za") → 1 (unchanged)

Before/After Fingerprints
Before/After Chain Hashes
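The idea behind comparing fingerprints and chain hashes before and after a refactor can be illustrated generically. This sketch does not reproduce how Regrets works. The FNV-1a hash, the `fingerprintOutputs` and `chainHash` helpers, and the two toy implementations are all hypothetical stand-ins chosen to keep the example self-contained.

```javascript
// FNV-1a (32-bit) over UTF-16 code units: a small non-cryptographic hash,
// used here only as a stand-in for a real fingerprint function.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// One fingerprint per test case: hash of the inputs together with the output.
function fingerprintOutputs(fn, cases) {
  return cases.map((args) => fnv1a(JSON.stringify([args, fn(...args)])));
}

// Chain hash: fold the fingerprints in order, so a change in any case
// (or in their order) changes the final value.
function chainHash(fingerprints) {
  return fingerprints.reduce((acc, fp) => fnv1a(acc + fp), "");
}

// Hypothetical "before" and "after" implementations of the same behavior.
const before = (a, b) => Math.abs(a.length - b.length);
const after = (a, b) => Math.max(a.length, b.length) - Math.min(a.length, b.length);

const cases = [["kitten", "sitting"], ["az", "za"], ["", "abc"]];
const fingerprintsBefore = fingerprintOutputs(before, cases);
const fingerprintsAfter = fingerprintOutputs(after, cases);

const sameFingerprints = fingerprintsBefore.every((fp, i) => fp === fingerprintsAfter[i]);
const sameChain = chainHash(fingerprintsBefore) === chainHash(fingerprintsAfter);

console.log(sameFingerprints, sameChain); // true true
```

Identical fingerprints show that each case produced the same output; an identical chain hash summarizes the whole run in a single value.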
Non-breaking
The distance/index.js re-exports are completely unchanged. Any code that imports from natural/distance will continue to work identically. The only change is internal file organization.
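One way to picture the non-breaking claim is to compare the exported names before and after the split. The sketch below uses stub objects in place of the real modules. The four export names come from this PR, but the stub bodies and the aggregation via object spread are assumptions, not the actual contents of distance/index.js.

```javascript
// Stub for the single module before the split (function bodies omitted on purpose).
const singleModuleBefore = {
  LevenshteinDistance: () => {},
  LevenshteinDistanceSearch: () => {},
  DamerauLevenshteinDistance: () => {},
  DamerauLevenshteinDistanceSearch: () => {},
};

// Stubs for the two modules after the split.
const levenshteinModule = {
  LevenshteinDistance: () => {},
  LevenshteinDistanceSearch: () => {},
};
const damerauModule = {
  DamerauLevenshteinDistance: () => {},
  DamerauLevenshteinDistanceSearch: () => {},
};

// Hypothetical index-style aggregation: merge the exports of both modules.
const aggregatedAfter = { ...levenshteinModule, ...damerauModule };

// The public surface is unchanged if both sides expose the same names.
const namesBefore = Object.keys(singleModuleBefore).sort();
const namesAfter = Object.keys(aggregatedAfter).sort();
const surfaceUnchanged = JSON.stringify(namesBefore) === JSON.stringify(namesAfter);

console.log(surfaceUnchanged); // true
```

A name-level check like this only covers the export surface; behavioral equivalence still depends on the regression results reported above.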