fix: detect binary files via NUL bytes, drop x/tools #55
Replace godoc/util.IsText with an in-tree UTF-8 and NUL-byte check on the read prefix and while streaming the remainder. Skip rewriting files that contain a NUL anywhere, avoiding corruption of mixed text/binary content.

Closes dolph#9

Co-authored-by: Cursor <cursoragent@cursor.com>
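As a rough illustration of the kind of check described above, here is a minimal sketch that treats a byte slice as text only if it is valid UTF-8 and contains no NUL byte. The helper name `looksLikeText` is hypothetical and is not taken from this PR; the actual in-tree implementation may differ.

```go
package main

import (
	"bytes"
	"fmt"
	"unicode/utf8"
)

// looksLikeText is a hypothetical helper (not the PR's code). It reports
// whether b contains no NUL byte and is valid UTF-8.
func looksLikeText(b []byte) bool {
	// A NUL byte is valid UTF-8 but almost never appears in text files,
	// so it is checked separately.
	if bytes.IndexByte(b, 0) >= 0 {
		return false
	}
	return utf8.Valid(b)
}

func main() {
	fmt.Println(looksLikeText([]byte("hello, world\n"))) // true
	fmt.Println(looksLikeText([]byte("text\x00binary"))) // false: contains NUL
	fmt.Println(looksLikeText([]byte{0xff, 0xfe}))       // false: invalid UTF-8
}
```

Applying the same predicate to each chunk while streaming the remainder is what would let a NUL far past the prefix still mark the file as binary.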
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ae53ef65d9
    if n == 0 {
        return ""
    }
    if !isTextBytes(buf[:n]) {
Handle UTF-8 rune split across probe boundary
Using isTextBytes on the first 1024-byte probe rejects valid UTF-8 files when a multibyte rune is split at byte 1024 (for example, 1023 ASCII bytes followed by é). In that case utf8.Valid returns false on the prefix even though the full file is valid text, so Read() returns "" and replacements are silently skipped for legitimate text files.
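One possible way to address this, sketched under assumptions rather than taken from the PR: before validating a fixed-size probe, drop a trailing multibyte sequence that is merely incomplete, so that only genuinely invalid bytes cause rejection. The function name `validPrefix` is hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"unicode/utf8"
)

// validPrefix is a hypothetical helper (not the PR's code). It reports
// whether b is valid UTF-8, tolerating one incomplete multibyte sequence
// at the very end, as happens when a probe cuts a rune in half.
func validPrefix(b []byte) bool {
	// Scan back at most utf8.UTFMax bytes to find the last rune start.
	for i := len(b) - 1; i >= 0 && i >= len(b)-utf8.UTFMax; i-- {
		if utf8.RuneStart(b[i]) {
			// FullRune is false only for a truncated (not an invalid) encoding.
			if !utf8.FullRune(b[i:]) {
				b = b[:i] // drop the truncated rune before validating
			}
			break
		}
	}
	return utf8.Valid(b)
}

func main() {
	// 1023 ASCII bytes followed by a two-byte rune, probed at 1024 bytes.
	data := append(bytes.Repeat([]byte("a"), 1023), "é"...)
	probe := data[:1024]
	fmt.Println(utf8.Valid(data), utf8.Valid(probe), validPrefix(probe))
	// prints: true false true
}
```

This only relaxes the check on the probe. The bytes trimmed from the end would still need to be validated together with the start of the streamed remainder, otherwise a file that ends in a truncated rune could slip through.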
Summary
- Replace godoc/util.IsText with NUL-byte and UTF-8 validation on the prefix and full streamed read.
- Drop the golang.org/x/tools dependency.
Test plan
- go test ./...
- TestReadSkipsBinaryWithNUL
Closes #9
Made with Cursor