Fix issue with non-ASCII characters in Buffer and Stream attachments #255

simondean · 2014-10-04T09:43:49Z

This fixes #249. Buffers are now converted to strings using buffer.toString('binary') instead of buffer.toString(). buffer.toString() defaults to utf8, which broke any non-ASCII characters.
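A minimal sketch of the encoding difference described above, for illustration only. It assumes a current Node.js where `Buffer.from` is available (the 2014 code would have used the `Buffer` constructor), and the sample bytes and variable names are made up for this example rather than taken from the pull request.

```javascript
// The first four bytes of a PNG file; 0x89 is not valid UTF-8 on its own.
var png = Buffer.from([0x89, 0x50, 0x4e, 0x47]);

// Default (utf8) decoding replaces the invalid byte with U+FFFD,
// so the original byte value is lost.
var asUtf8 = png.toString();

// 'binary' decoding maps each byte to one character with code 0-255.
var asBinary = png.toString('binary');

// Converting back shows which conversion is lossless.
var roundTripUtf8 = Buffer.from(asUtf8, 'utf8');
var roundTripBinary = Buffer.from(asBinary, 'binary');

console.log(roundTripUtf8.equals(png));   // false
console.log(roundTripBinary.equals(png)); // true
```

Only the 'binary' round trip returns the original bytes, which is why screenshots and other binary attachments were corrupted by the default conversion.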

This pull request also includes a fix for an issue in features/attachments.feature that meant the node.js version wasn't detected correctly and meant that streams weren't being tested in the feature file (they were still being tested in the specs).

simondean · 2014-10-04T09:44:27Z

The Travis CI build for this pull request is currently failing due to #254.

simondean · 2014-10-04T10:38:06Z

I plan to do a rebase on this pull request once #256 (the fix for the node.js v0.8 npm upgrade issue) is merged.

simondean · 2014-10-04T16:52:24Z

I've rebased the pull request and the Travis CI build is now passing (no more npm upgrade issue)

simondean · 2014-10-20T14:36:26Z

Hi @jbpros. Would it be possible for someone to take a look at this pull request sometime? Unfortunately attachments are pretty useless without this fix, because without it you can't attach any binary attachments (e.g. screenshots) or anything else containing non-ASCII characters (e.g. UTF-8 encoded text). Thanks

samccone · 2014-10-20T14:51:36Z

lib/cucumber/api/scenario.js

        data.on('end', function() {
-          astTreeWalker.attach(Buffer.concat(buffers).toString(), mimeType);
+          astTreeWalker.attach(Buffer.concat(buffers).toString('binary'), mimeType);


This assumes that the buffer is always going to be binary. Is this a safe assumption?

simondean · 2014-10-20

I've just pushed a commit that adds some additional tests around this. The commit also fixes a bug I just found with attaching strings that contain non-ASCII characters.

As far as I can tell from http://nodejs.org/api/buffer.html, Buffer is only designed to handle arrays of octets (bytes); just search for the word octet on that page to see what I mean. That would mean it is safe to assume that a Buffer only ever contains binary data. What sort of binary it contains (e.g. raw binary or a UTF-8 encoded string) depends on which Buffer constructor you use when creating the Buffer you pass into Scenario.attach().

Looking at the encode64s function in https://github.com/cucumber/gherkin/blob/master/js/lib/gherkin/formatter/json_formatter.js it assumes that the string you pass as an attachment to the JSONFormatter only contains 8-bit characters; basically it can only base64 encode binary. Which makes sense as you can only base64 encode binary.
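To illustrate the 8-bit constraint mentioned above, here is a rough sketch. It does not reproduce the gherkin `encode64s` function; `isBinaryString` is a hypothetical helper written for this example, and `Buffer.from` assumes a current Node.js. It shows text being turned into UTF-8 bytes first, so that the resulting string only holds character codes 0-255 before it is base64 encoded.

```javascript
// Hypothetical helper: true when every character code fits in one byte.
function isBinaryString(str) {
  for (var i = 0; i < str.length; i++) {
    if (str.charCodeAt(i) > 0xff) {
      return false;
    }
  }
  return true;
}

// Text containing U+20AC (euro sign), which is outside the 0-255 range.
var text = 'price: 5\u20ac';

// Encode the text as UTF-8 bytes, then expose those bytes as a binary string.
var binaryText = Buffer.from(text, 'utf8').toString('binary');

// A binary string can be base64 encoded and decoded without loss.
var base64 = Buffer.from(binaryText, 'binary').toString('base64');
var decoded = Buffer.from(base64, 'base64').toString('utf8');

console.log(isBinaryString(text));       // false
console.log(isBinaryString(binaryText)); // true
console.log(decoded === text);           // true
```

The reader of the base64 output still has to know the bytes are UTF-8 in order to decode them as text.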

Cucumber only seems to store a MIME type (aka media type) with attachments/embeddings. It doesn't store the character encoding. Which means if you attach text, you have no idea from the output JSON what encoding the text is in. cucumber/cucumber-jvm#501 seems to be related.

simondean · 2014-10-24

Hi @samccone. Did that answer your question?

nikulkarni · 2014-12-29

@simondean Reading the docs here, I see the note below, which says 'binary' encoding will be removed in future versions of Node. Is it safe to use it here?

" 'binary' - A way of encoding raw binary data into strings by using only the first 8 bits of each character. This encoding method is deprecated and should be avoided in favor of Buffer objects where possible. This encoding will be removed in future versions of Node."

charlierudolph · 2015-10-26

What do you think about using the hex encoding, since the binary encoding appears to be deprecated?

simondean · 2014-12-29T19:41:11Z

@nikulkarni currently the encode64s function in https://github.com/cucumber/gherkin/blob/master/js/lib/gherkin/formatter/json_formatter.js requires the input argument to be in binary format. If the encode64s function was changed to use the Buffer class, it would make it node.js specific and would no longer support running in a browser. There's probably a good case for node.js to keep the binary format as it makes it easier to interop between node.js and other JavaScript platforms.

nikulkarni · 2014-12-29T20:02:16Z

@simondean understood, thanks. Hopefully this PR gets merged soon. Without it, attaching screenshots only works if you pass a correctly encoded image.

charlierudolph · 2016-06-24T16:21:22Z

Closing as with 5fb6706 there is no intermediate conversion to string and we just encode it as base64 when adding it to the json output.
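As a rough sketch of what "no intermediate conversion to string" can look like (this is not the code from 5fb6706): the hypothetical helper `streamToBase64` below collects the stream's chunks as Buffers and encodes the concatenated bytes directly as base64. `Readable.from` is used only to build a sample stream and assumes a reasonably recent Node.js.

```javascript
var Readable = require('stream').Readable;

// Hypothetical helper: read a stream of Buffer chunks and pass the
// base64 encoding of all its bytes to the callback.
function streamToBase64(stream, callback) {
  var chunks = [];
  stream.on('data', function (chunk) {
    chunks.push(chunk);
  });
  stream.on('error', callback);
  stream.on('end', function () {
    // The bytes go straight to base64; no utf8 or 'binary' string in between.
    callback(null, Buffer.concat(chunks).toString('base64'));
  });
}

// Sample stream emitting the first four bytes of a PNG file in two chunks.
var sample = Readable.from([
  Buffer.from([0x89, 0x50]),
  Buffer.from([0x4e, 0x47])
]);

streamToBase64(sample, function (err, base64) {
  if (err) throw err;
  console.log(base64); // iVBORw==
});
```

Because the bytes are never decoded as text, the question of which string encoding to use does not arise.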

lock · 2018-10-25T07:16:23Z

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

Fix issue with non-ASCII characters in Buffer and Stream attachments

53a380d

simondean force-pushed the binary_attachments branch from 40b7321 to 53a380d Compare October 4, 2014 16:47

samccone reviewed Oct 20, 2014
Additional tests

73cb7d2

jlin412 mentioned this pull request Dec 17, 2014

Cannot attach screenshot to cucumber report #275

Closed

jbpros added bug labels Dec 22, 2014

jbpros added this to the 0.5 major features milestone Dec 22, 2014

jbpros removed this from the major features milestone Oct 10, 2015

jbpros removed the next-milestone label Oct 10, 2015

charlierudolph closed this Jun 24, 2016

lock bot locked as resolved and limited conversation to collaborators Oct 25, 2018

samccone Oct 20, 2014

simondean Oct 20, 2014

simondean Oct 24, 2014

nikulkarni Dec 29, 2014

charlierudolph Oct 26, 2015