fix: comprehensive multi_patch improvements preventing byte offset corruption#3355
Open
Mustaqeem66 wants to merge 4 commits into
This fix addresses .forge.db corruption issues in ForgeCode by:

1. Startup WAL Recovery:
   - Checkpoints any leftover WAL from previous crashed sessions
   - Runs database integrity check on startup
   - Ensures data is recovered before new session starts
2. Auto-Checkpoint Threshold Reduced:
   - Changed from 1000 to 100 frames (~5MB max instead of ~50MB)
   - Prevents massive WAL files during long sessions
3. Async Checkpoint Method:
   - Added checkpoint_async() for graceful shutdown scenarios
   - Uses pool-based connection (async-safe)
4. Drop Checkpoint:
   - Checkpoints WAL when DatabasePool is dropped
   - Logs warnings if fails (expected on force-kill)
5. Comprehensive Tests:
   - test_checkpoint_method_exists
   - test_drop_calls_checkpoint
   - test_in_memory_pool_has_checkpoint
   - test_checkpoint_truncates_wal
   - test_wal_recovery_on_startup
   - test_async_checkpoint_method
   - test_autocheckpoint_threshold_reduced

Fixes tailcallhq#3260 related corruption issues by preventing WAL accumulation and ensuring data integrity on startup.

Co-authored-by: Mustaqeem66 <ageisnode@gmail.com>
Phase 1 - Safety Critical:
- Add unique match validation (count all matches, error if > 1)
- Add overlap detection with validation
- Add atomic write with temp file + rename
- Add verification and memory-based rollback
- Add better error messages with file path

Phase 2 - Robustness:
- Add line-based whitespace normalization
- Add line-window fuzzy matching with 0.90 threshold
- Add 3-layer fallback chain (exact -> whitespace -> fuzzy)

Key improvements:
- Reverse-order application (already done)
- Unique match validation prevents silent wrong replacements
- Overlap detection rejects logically impossible edits
- Atomic write prevents half-written files
- Whitespace normalization handles LLM whitespace differences
- Fuzzy matching catches near-matches
- Better error messages with file path

Tests added:
- 30+ new tests covering all features

Fixes: tailcallhq#3249, tailcallhq#3182, tailcallhq#2815, tailcallhq#2773, tailcallhq#2997, tailcallhq#3115, tailcallhq#3291

Co-authored-by: Mustaqeem66 <ageisnode@gmail.com>
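The unique match validation and overlap detection described in this commit could look roughly like the sketch below. It is not the PR's actual code: `Edit`, `Planned`, `PlanError` and `plan_edits` are hypothetical names chosen for illustration, and matches are counted with `str::match_indices`, which reports non-overlapping occurrences only.

```rust
/// Hypothetical edit request: replace the single occurrence of `old` with `new`.
pub struct Edit<'a> {
    pub old: &'a str,
    pub new: &'a str,
}

/// Hypothetical located edit: a byte range in the ORIGINAL content.
#[derive(Debug, PartialEq)]
pub struct Planned {
    pub index: usize,    // position of the edit in the request list
    pub position: usize, // byte offset of the match
    pub old_len: usize,  // byte length of the matched text
}

#[derive(Debug, PartialEq)]
pub enum PlanError {
    NotFound { index: usize },
    Ambiguous { index: usize, count: usize },
    Overlap { first: usize, second: usize },
}

/// Locate every edit in `content`, rejecting missing, ambiguous and overlapping edits.
pub fn plan_edits(content: &str, edits: &[Edit<'_>]) -> Result<Vec<Planned>, PlanError> {
    let mut plans = Vec::with_capacity(edits.len());
    for (index, edit) in edits.iter().enumerate() {
        // An empty search string matches everywhere, so treat it as "not found".
        if edit.old.is_empty() {
            return Err(PlanError::NotFound { index });
        }
        let mut hits = content.match_indices(edit.old);
        let position = match hits.next() {
            Some((pos, _)) => pos,
            None => return Err(PlanError::NotFound { index }),
        };
        // Count the remaining matches: any extra match makes the edit ambiguous.
        let extra = hits.count();
        if extra > 0 {
            return Err(PlanError::Ambiguous { index, count: extra + 1 });
        }
        plans.push(Planned { index, position, old_len: edit.old.len() });
    }
    // Sort by offset; two neighbours overlap if the first ends after the second starts.
    plans.sort_by_key(|p| p.position);
    for pair in plans.windows(2) {
        if pair[0].position + pair[0].old_len > pair[1].position {
            return Err(PlanError::Overlap { first: pair[0].index, second: pair[1].index });
        }
    }
    Ok(plans)
}
```

Validating every edit against the original content before anything is written means a bad batch fails as a whole instead of leaving the file partially edited.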
Added line:column information to overlap error messages for better debugging. This helps users identify exactly where overlapping edits occur in their files.

Co-authored-by: Mustaqeem66 <ageisnode@gmail.com>
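One way to derive a line:column pair from a byte offset is sketched below. The helper name `line_col` and the conventions used (1-based lines and columns, columns counted in characters rather than bytes) are assumptions for illustration; the PR may use different ones.

```rust
/// Convert a byte offset into a 1-based (line, column) pair.
/// Returns None if the offset is out of range or falls inside a multi-byte character.
pub fn line_col(content: &str, offset: usize) -> Option<(usize, usize)> {
    if offset > content.len() || !content.is_char_boundary(offset) {
        return None;
    }
    let before = &content[..offset];
    // Line number: one more than the number of newlines before the offset.
    let line = before.bytes().filter(|&b| b == b'\n').count() + 1;
    // Column: characters between the start of the current line and the offset.
    let line_start = before.rfind('\n').map_or(0, |i| i + 1);
    let column = before[line_start..].chars().count() + 1;
    Some((line, column))
}
```

An overlap error could then be rendered with something like `format!("{}:{}", line, column)` next to the file path.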
Changed multi_patch edit application to use pre-computed positions (plan.position and plan.old_len) instead of re-searching in modified content. This ensures byte offset corruption cannot happen since we're using exact positions from the original content rather than fresh searches.

Co-authored-by: Mustaqeem66 <ageisnode@gmail.com>
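A minimal sketch of applying edits from pre-computed positions is shown below. The field names `position` and `old_len` come from the commit message; the `EditPlan` struct, its `new_text` field and the `apply_plans` function are hypothetical. Edits are applied from the highest offset to the lowest, so every offset still refers to unmodified text when it is used.

```rust
/// Hypothetical plan entry: where the old text sits in the ORIGINAL content
/// and what replaces it.
pub struct EditPlan {
    pub position: usize,
    pub old_len: usize,
    pub new_text: String,
}

/// Apply all plans using offsets computed against `original`.
/// Returns None if a range is out of bounds, splits a character, or overlaps another.
pub fn apply_plans(original: &str, mut plans: Vec<EditPlan>) -> Option<String> {
    // Highest offset first: replacing a later range never shifts an earlier one.
    plans.sort_by(|a, b| b.position.cmp(&a.position));
    let mut out = original.to_string();
    let mut upper = original.len(); // start of the previously applied (later) edit
    for plan in &plans {
        let end = plan.position.checked_add(plan.old_len)?;
        if end > upper
            || !original.is_char_boundary(plan.position)
            || !original.is_char_boundary(end)
        {
            return None;
        }
        // Everything below `upper` in `out` is still identical to `original`.
        out.replace_range(plan.position..end, &plan.new_text);
        upper = plan.position;
    }
    Some(out)
}
```

For example, with the content `a b` and the edits `a` -> `b` and `b` -> `c`, re-searching after the first edit would find the freshly inserted `b` at offset 0 and produce `c b`; using the original offsets 0 and 2 yields `b c`.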
Summary
Comprehensive multi_patch improvements that fix issue #3249 and 7 related issues. This PR adds safety-critical features: unique match validation, overlap detection, atomic writes, and a layered matching fallback (exact, whitespace-normalized, fuzzy).
Changes
Phase 1 - Safety Critical
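The "atomic write with temp file + rename" item from the Phase 1 commit can be sketched with the standard library alone. This is an illustration under assumptions, not the PR's implementation: `atomic_write` and the `.tmp` suffix are hypothetical, and the memory-based rollback mentioned in the commit is not shown, because in this sketch the target file is left untouched until the final rename.

```rust
use std::fs;
use std::io::{self, Write};
use std::path::Path;

/// Write `new_content` to a sibling temp file, verify it, then rename it over `path`.
pub fn atomic_write(path: &Path, new_content: &str) -> io::Result<()> {
    // Keep the temp file in the same directory so the rename stays on one filesystem.
    let mut tmp_name = path
        .file_name()
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "path has no file name"))?
        .to_os_string();
    tmp_name.push(".tmp"); // hypothetical suffix
    let tmp = path.with_file_name(tmp_name);

    let result = (|| -> io::Result<()> {
        let mut file = fs::File::create(&tmp)?;
        file.write_all(new_content.as_bytes())?;
        file.sync_all()?; // flush to disk before the rename
        drop(file);
        // Verification: read the temp file back and compare with the intended content.
        if fs::read(&tmp)?.as_slice() != new_content.as_bytes() {
            return Err(io::Error::new(io::ErrorKind::Other, "verification failed"));
        }
        fs::rename(&tmp, path)
    })();

    // On any failure, remove the temp file; the original file was never modified.
    if result.is_err() {
        let _ = fs::remove_file(&tmp);
    }
    result
}
```

A reader of the target path sees either the old content or the complete new content, never a half-written file.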
Phase 2 - Robustness
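The 3-layer fallback chain (exact -> whitespace -> fuzzy) named in the Phase 2 commit might be structured as in the sketch below. Several details are assumptions: all three layers compare whole lines, whitespace normalization trims each line and collapses internal runs of whitespace, and the fuzzy score is a normalized Levenshtein similarity. Only the 0.90 threshold comes from the commit message; `find_block`, `MatchKind` and the helpers are hypothetical names, and uniqueness checking is left out for brevity.

```rust
#[derive(Debug, PartialEq)]
pub enum MatchKind {
    Exact,
    Whitespace,
    Fuzzy,
}

/// Trim the line and collapse internal whitespace runs to single spaces.
fn normalize_line(line: &str) -> String {
    line.split_whitespace().collect::<Vec<_>>().join(" ")
}

/// Classic two-row Levenshtein edit distance over characters.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut cur = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let cost = if ca == cb { 0 } else { 1 };
            cur.push((prev[j] + cost).min(prev[j + 1] + 1).min(cur[j] + 1));
        }
        prev = cur;
    }
    prev[b.len()]
}

/// Similarity in [0, 1]: 1.0 means identical strings.
fn similarity(a: &str, b: &str) -> f64 {
    let longest = a.chars().count().max(b.chars().count());
    if longest == 0 {
        return 1.0;
    }
    1.0 - levenshtein(a, b) as f64 / longest as f64
}

/// Find `needle` in `content` as a window of lines.
/// Returns the 0-based start line and the layer that produced the match.
pub fn find_block(content: &str, needle: &str, threshold: f64) -> Option<(usize, MatchKind)> {
    let lines: Vec<&str> = content.lines().collect();
    let want: Vec<&str> = needle.lines().collect();
    if want.is_empty() || want.len() > lines.len() {
        return None;
    }
    // Layer 1: exact line-for-line match.
    if let Some(i) = lines.windows(want.len()).position(|w| w == want.as_slice()) {
        return Some((i, MatchKind::Exact));
    }
    // Layer 2: match after whitespace normalization.
    let norm_want: Vec<String> = want.iter().map(|l| normalize_line(l)).collect();
    let norm_lines: Vec<String> = lines.iter().map(|l| normalize_line(l)).collect();
    if let Some(i) = norm_lines.windows(want.len()).position(|w| w == norm_want.as_slice()) {
        return Some((i, MatchKind::Whitespace));
    }
    // Layer 3: best-scoring line window at or above the threshold.
    let target = norm_want.join("\n");
    let mut best: Option<(usize, f64)> = None;
    for (i, w) in norm_lines.windows(want.len()).enumerate() {
        let score = similarity(&w.join("\n"), &target);
        if score >= threshold && best.map_or(true, |(_, s)| score > s) {
            best = Some((i, score));
        }
    }
    best.map(|(i, _)| (i, MatchKind::Fuzzy))
}
```

Trying the strictest layer first keeps the common case cheap and predictable; the looser layers only run when an exact match is absent.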
Issues Fixed
Testing
30+ new tests covering:
Algorithm