This repository was archived by the owner on Oct 8, 2025. It is now read-only.

parseTar fails on large GitHub repositories tar files #82

Closed
remorses wants to merge 2 commits into mjackson:main from remorses:fails-on-large-git-repo

Conversation

@remorses

I added a test that fails with Invalid tar header. Maybe the tar is corrupted or needs to be gunzipped?

The tar returned by GitHub should be correct, so there must be an issue in how tar-parser computes the checksums.

@remorses
Author

remorses commented Jul 24, 2025

Tried using Opus and Claude Code to fix the issue and it did it!

The problem was in the #parseBody() method: the parser wasn't handling the case where a file's content ended exactly at a buffer boundary.

The fix adds a check to detect when #missing becomes 0 after consuming a buffer, which means the file body has been completely read. When this happens, it properly closes the body controller and calculates the padding bytes needed to align to the next 512-byte block boundary.
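
To make the idea concrete, here is a minimal sketch of that check. It is not the actual diff from this PR or the tar-parser source: `BodyState`, `consumeBodyChunk`, and `paddingFor` are hypothetical names chosen for illustration, and `missing` only loosely mirrors the `#missing` field mentioned above.

```typescript
const BLOCK_SIZE = 512; // tar entries are aligned to 512-byte blocks

// Padding bytes that follow a body of `size` bytes so the next header
// starts on a 512-byte boundary (0 when the size is already aligned).
function paddingFor(size: number): number {
  return (BLOCK_SIZE - (size % BLOCK_SIZE)) % BLOCK_SIZE;
}

// Hypothetical parser state for the entry currently being read.
interface BodyState {
  size: number;    // total body size taken from the entry header
  missing: number; // body bytes not yet consumed
  closed: boolean; // whether the body stream has been closed
  padding: number; // padding bytes to skip before the next header
}

// Consume as much of `chunk` as belongs to the current body.
// The important part is the `missing === 0` check: it also fires when the
// body ends exactly at the end of `chunk`, so the entry is closed right away
// instead of waiting for more data.
function consumeBodyChunk(
  state: BodyState,
  chunk: Uint8Array,
): { body: Uint8Array; rest: Uint8Array } {
  const take = Math.min(state.missing, chunk.length);
  const body = chunk.subarray(0, take);
  state.missing -= take;

  if (state.missing === 0 && !state.closed) {
    state.closed = true; // stand-in for closing the body controller
    state.padding = paddingFor(state.size);
  }

  return { body, rest: chunk.subarray(take) };
}
```

In this sketch, a 1000-byte body delivered as two 500-byte chunks is closed as soon as the second chunk is consumed, with 24 padding bytes left to skip before the next header.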

@mjackson
Owner

Hi @remorses 👋 Thanks for the PR.

Unfortunately I can't merge it here because I migrated all the code in this repo onto the v3 branch of the remix-run/remix repo earlier this week, and we're doing further development over there now. I am archiving this repo today.

If you're keen to get it merged, would you mind opening another PR over there? I'd really appreciate it!

@mjackson mjackson closed this Jul 24, 2025