Fix issue with non-ASCII characters in Buffer and Stream attachments #255
Closed
This assumes that the buffer is always going to be binary. Is this a safe assumption?
I've just pushed a commit that adds some additional tests around this. The commit also fixes a bug I just found with attaching strings that contain non-ASCII characters.
Best I can tell from http://nodejs.org/api/buffer.html, `Buffer` is only designed to handle arrays of octets (bytes); just search for the word "octet" on that page to see what I mean. That would mean it is safe to assume that a `Buffer` only ever contains binary. What sort of binary it contains (e.g. raw binary or a UTF-8 encoded string) depends on which `Buffer` constructor you use when creating the `Buffer` you pass into `Scenario.attach()`.
Looking at the `encode64s` function in https://github.com/cucumber/gherkin/blob/master/js/lib/gherkin/formatter/json_formatter.js, it assumes that the string you pass as an attachment to the `JSONFormatter` only contains 8-bit characters; basically it can only base64 encode binary. Which makes sense, as you can only base64 encode binary.
Cucumber only seems to store a MIME type (aka media type) with attachments/embeddings. It doesn't store the character encoding. This means that if you attach text, you have no idea from the output JSON what encoding the text is in. cucumber/cucumber-jvm#501 seems to be related.
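To illustrate the point about octets versus characters, here is a minimal sketch. It assumes a Node.js version that provides `Buffer.from()` and the `'latin1'` encoding name (an alias of `'binary'`); the helper `toBase64` and the sample string are hypothetical and are not taken from cucumber-js or gherkin.

```javascript
// Sketch only: `toBase64` is a hypothetical helper, not cucumber-js or gherkin code.
const text = 'café ☕'; // sample string containing non-ASCII characters

// A Buffer always holds octets; the encoding only decides how a string
// is converted into those octets.
const utf8Bytes = Buffer.from(text, 'utf8');

// Base64-encoding the Buffer itself is lossless for any byte values.
function toBase64(buffer) {
  return buffer.toString('base64');
}

const encoded = toBase64(utf8Bytes);
const decoded = Buffer.from(encoded, 'base64').toString('utf8');

// Treating each character as a single 8-bit value loses data:
// 'latin1' (alias 'binary') keeps only the low 8 bits of each UTF-16 code unit.
const truncated = Buffer.from(text, 'latin1');
const roundTripped = truncated.toString('latin1');

console.log(decoded === text);      // true
console.log(roundTripped === text); // false
```

The second comparison fails because `☕` (U+2615) does not fit in 8 bits, which is the kind of loss an 8-bit-only string path can introduce for non-ASCII text.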
Hi @samccone. Did that answer your question?
@simondean Reading the docs here, I see the note below, which says that the 'binary' encoding will be removed in future versions of Node. Is it safe to use it here?
"'binary' - A way of encoding raw binary data into strings by using only the first 8 bits of each character. This encoding method is deprecated and should be avoided in favor of Buffer objects where possible. This encoding will be removed in future versions of Node."
What do you think about using the `hex` encoding, since the `binary` encoding appears to be deprecated?
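For comparison, a small sketch of what `hex` and `base64` produce for the same bytes, without going through a `binary` string at all. The sample bytes are illustrative only, and whether `hex` would suit the JSON formatter is not settled in this thread; this just shows that both encodings round-trip arbitrary octets.

```javascript
// Sketch only: sample bytes are illustrative and not taken from the pull request.
const raw = Buffer.from([0x00, 0x7f, 0x80, 0xff]);

// Both encodings are produced straight from the Buffer, so no intermediate
// 'binary' string is needed.
const asHex = raw.toString('hex');       // two characters per byte
const asBase64 = raw.toString('base64'); // four characters per three bytes, padded

// Decoding either string yields the original octets again.
const fromHex = Buffer.from(asHex, 'hex');
const fromBase64 = Buffer.from(asBase64, 'base64');

console.log(asHex);                     // 007f80ff
console.log(asBase64);                  // AH+A/w==
console.log(fromHex.equals(raw));       // true
console.log(fromBase64.equals(raw));    // true
```

Note that `hex` output is larger than `base64` for the same input, and a consumer expecting base64 would have to be told which encoding was used.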