CLDSRV-908: CopyObject handle checksums#6176
Conversation
Hello leif-scality,My role is to assist you with the merge of this Available options
Available commands
Status report is not available. |
Incorrect fix versionThe
Considering where you are trying to merge, I ignored possible hotfix versions and I expected to find:
Please check the |
|
LGTM — clean implementation of checksum propagation and recompute on CopyObject. Stream handling (jsutil.once guards, Azure per-part passthrough, error propagation) and the _shouldRecomputeChecksum decision logic are solid. One minor test style issue flagged inline. |
Codecov Report❌ Patch coverage is
Additional details and impacted files
... and 1 file with indirect coverage changes @@ Coverage Diff @@
## development/9.4 #6176 +/- ##
===================================================
+ Coverage 85.25% 85.26% +0.01%
===================================================
Files 208 208
Lines 13919 14012 +93
===================================================
+ Hits 11867 11948 +81
- Misses 2052 2064 +12
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
5ca489a to
57d9894
Compare
|
LGTM |
57d9894 to
5bd2262
Compare
|
5bd2262 to
8b2b196
Compare
|
LGTM |
Propagate the source FULL_OBJECT checksum when the algorithm matches the request (or no algorithm was requested); otherwise stream source bytes through a ChecksumTransform to compute a fresh digest before writing the destination.
8b2b196 to
720f686
Compare
|
LGTM |
| 'The x-amz-checksum-type header can only be used ' + 'with the x-amz-checksum-algorithm header.', | ||
| ); | ||
|
|
||
| // TODO: Update with 'MD5', 'SHA512', 'XXHASH128', 'XXHASH3', 'XXHASH64' when they are introduced. |
There was a problem hiding this comment.
you can add the ticket number in the comment if you have a ticket for this TODO
| * @returns {boolean} | ||
| */ | ||
| function _shouldRecomputeChecksum(headers, sourceObjMD) { | ||
| const requestedAlgo = headers['x-amz-checksum-algorithm'] && headers['x-amz-checksum-algorithm'].toLowerCase(); |
There was a problem hiding this comment.
I think we can use optional chaining operator in cloudserver.
| const requestedAlgo = headers['x-amz-checksum-algorithm'] && headers['x-amz-checksum-algorithm'].toLowerCase(); | |
| const requestedAlgo = headers['x-amz-checksum-algorithm']?.toLowerCase(); |
Maybe this can be used in other places as well to simplify
| if (err) { | ||
| done(err); |
There was a problem hiding this comment.
When this error happens, it could be any error and the perPart is still piped.
So it could theoretically receive data from azure and continue to stream during a short time window until the final callback calls passthrough.destroy and unpipes the source perPart.
Maybe the perPart should be destroyed here to stop immediately the streaming
| // into the master passthrough and use its 'end' as the completion | ||
| // signal — same pattern arsenal's data.copyObject uses. | ||
| const perPart = new PassThrough(); | ||
| perPart.once('error', done); |
There was a problem hiding this comment.
Should we consider encapsulating the error to include some part details in the error for better troubleshooting if the error is ever logged somewhere else in the streaming path.
| // and masterKeyId stored properly in metadata | ||
| if (sourceIsDestination && storeMetadataParams.locationMatch | ||
| && !isVersionedObj && !needsEncryption) { | ||
| && !isVersionedObj && !needsEncryption && !shouldRecomputeChecksum) { |
There was a problem hiding this comment.
If you don't skip, you'll trigger a data GET + PUT + DELETE (for previous location).
This only for checksum recomputation. Should you rather define another path where if it's only recompute checksum, you allow only the GET to stream into the ChecksumTransform stream and then you discard the end of the data stream, and avoid having to do a data PUT + DELETE
Uh oh!
There was an error while loading. Please reload this page.