# xlsx-stream-reader
+
+ [![JavaScript Style Guide](https://img.shields.io/badge/code_style-standard-brightgreen.svg)](https://standardjs.com)
+
======

Memory efficient minimalist streaming XLSX reader that can handle piped
@@ -79,57 +82,56 @@ fs.createReadStream(fileName).pipe(workBookReader);
```

-
Beta Warning
+
-------
This module is currently in use on a live internal business system for product
management. That said, it should still be considered beta. More usage
and input from users will be needed due to the numerous differences/incompatibilities/flukes
I have already run into with XLSX files.

-
Limitations
+
-------
The row reader currently returns stored values for formulas (these are normally available)
and does not calculate the formula itself. As time permits the row handler will become more capable,
but it was enough for current purposes (loading values from large worksheets fast).

-
-
Inspiration
+
-----------
- I needed a simple streaming XLSX reader to handle large Excel sheets, but the only
- one available/compatible was guyonroche/exceljs. The stream reader module at
- the time was unfinished/unusable, and rewrite attempts exposed column shifting I
+ I needed a simple streaming XLSX reader to handle large Excel sheets, but the only
+ one available/compatible was guyonroche/exceljs. The stream reader module at
+ the time was unfinished/unusable, and rewrite attempts exposed column shifting I
could not solve.

-
More Information
+
-----------
- Events are emitted as pertinent parts of the workbook and worksheet are received
- in the stream. Theoretically you could pause the input stream if events are being
+ Events are emitted as pertinent parts of the workbook and worksheet are received
+ in the stream. Theoretically you could pause the input stream if events are being
received too fast, but this has not been tested.

- Events can potentially (even though I have not seen it) be received out of order.
- If you receive a worksheet end event while still receiving rows, make sure
+ Events can potentially (even though I have not seen it) be received out of order.
+ If you receive a worksheet end event while still receiving rows, make sure
your number of rows received equals the `workSheetReader.rowCount`.
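
The row-count check described above could be sketched roughly as follows. This is only an illustration: `trackCompletion` and `onComplete` are hypothetical names, and the sketch assumes that `workSheetReader.rowCount` already holds its final value by the time the `end` event fires, which the text above does not confirm.

```javascript
// Hypothetical helper: report completion only once 'end' has fired
// AND the number of rows seen equals workSheetReader.rowCount.
// Assumption: rowCount is final when 'end' is emitted.
function trackCompletion (workSheetReader, onComplete) {
  let rowsSeen = 0
  let ended = false
  let done = false

  function check () {
    if (!done && ended && rowsSeen === workSheetReader.rowCount) {
      done = true // guard so onComplete is called at most once
      onComplete(rowsSeen)
    }
  }

  workSheetReader.on('row', () => {
    rowsSeen += 1
    check() // covers rows that arrive after 'end'
  })
  workSheetReader.on('end', () => {
    ended = true
    check()
  })
}
```

Call it before `workSheetReader.process()` so that both handlers are attached before any rows arrive.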

- Theoretically you could process an Excel sheet as it is being uploaded, depending
- on the sheet type, but this is untried (I encountered some XLSX files that have a different
- zip format that requires having the entire file to read the archive contents properly),
+ Theoretically you could process an Excel sheet as it is being uploaded, depending
+ on the sheet type, but this is untried (I encountered some XLSX files that have a different
+ zip format that requires having the entire file to read the archive contents properly),
so it is still probably better to save to a temp file first and read the stream from there.

- Currently, if the zip archive does not have the shared strings at the beginning of the
- archive, the input stream for each sheet is piped into a temp file until the shared
- strings are encountered and processed; the temp worksheets are then re-read with the shared
- strings.
-
+ Currently, if the zip archive does not have the shared strings at the beginning of the
+ archive, the input stream for each sheet is piped into a temp file until the shared
+ strings are encountered and processed; the temp worksheets are then re-read with the shared
+ strings.

API Information
+
-----------
#### new XlsxStreamReader()

- Create a new XlsxStreamReader object (workBookReader). After attaching handlers you
+ Create a new XlsxStreamReader object (workBookReader). After attaching handlers you
can `pipe()` your input stream into the reader to begin processing

#### Event: 'error'
@@ -150,7 +152,7 @@ are available via array `workBookReader.workBookSharedStrings`.

#### Event: 'styles'

- After the workbook styles have been parsed this event is emitted. Styles are available
+ After the workbook styles have been parsed this event is emitted. Styles are available
via array `workBookReader.workBookStyles`

#### Event: 'worksheet'
@@ -161,8 +163,8 @@ Emitted when a worksheet is reached. The sheet number is available via
{Number} `workSheetReader.id`. You can either process or skip at this point,
but you must do one of the two for processing to continue to the next sheet and finish.

- Once the event is received you can attach worksheet handlers (end, row) and then
- call `workSheetReader.process()`. If you do not want to process a sheet and instead
+ Once the event is received you can attach worksheet handlers (end, row) and then
+ call `workSheetReader.process()`. If you do not want to process a sheet and instead
want to skip it entirely, call `workSheetReader.skip()` without attaching any handlers.
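
The process-or-skip choice described above might look like the following minimal sketch. It relies only on `workSheetReader.id`, `on()`, `process()` and `skip()` as described here; the `handleWorkSheet` helper and its `wantedIds` and `onRow` parameters are hypothetical names introduced for illustration.

```javascript
// Hypothetical helper: decide per worksheet whether to process or skip it.
// Assumes workSheetReader exposes id, on(), process() and skip() as described above.
function handleWorkSheet (workSheetReader, wantedIds, onRow) {
  // id is documented as a Number; coerce defensively in case it arrives as a string.
  if (!wantedIds.includes(Number(workSheetReader.id))) {
    // Skip without attaching any handlers so the reader can move on.
    workSheetReader.skip()
    return false
  }
  // Attach handlers first, then start processing.
  workSheetReader.on('row', onRow)
  workSheetReader.process()
  return true
}
```

It could then be wired to the `worksheet` event, for example `workBookReader.on('worksheet', (ws) => handleWorkSheet(ws, [1], (row) => console.log(row.values)))`.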

#### Worksheet Event: 'end'
@@ -178,44 +180,43 @@ Emitted on every row encountered in the worksheet. For more details on what
is in the row object attributes, see the [Row class][msdnRows] on MSDN.

For example:
+
* `row.values`: sparse array containing all cell values
* `row.formulas`: sparse array containing all cell formulas
* `row.attributes.r`: row index
* `row.attributes.ht`: Row height measured in point size
* `row.attributes.customFormat`: '1' if the row style should be applied.
* `row.attributes.hidden`: '1' if the row is hidden
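
As a rough illustration of the row shape listed above, a handler could flatten each row into a plain record. The `summarizeRow` helper and the field names of the record it returns are hypothetical; the sketch only assumes that `row.values` is a sparse array and that the attributes arrive as strings, as the `'1'` values above suggest.

```javascript
// Hypothetical helper: flatten a row object into a plain record.
// Assumes the row shape listed above: a sparse values array plus string attributes.
function summarizeRow (row) {
  const attrs = row.attributes || {}
  const cells = []
  // forEach skips the holes in a sparse array, so only populated cells are visited.
  row.values.forEach((value, index) => {
    cells.push({ index, value })
  })
  return {
    rowIndex: Number(attrs.r), // attribute values are strings, so convert
    hidden: attrs.hidden === '1',
    cells
  }
}
```

Keeping the array index next to each value preserves the cell's position, which a dense copy of the sparse array would lose.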

-
References
+
-----------
* [Working with sheets (Open XML SDK)][msdnSheets]
* [Row class][msdnRows]
* [ExcelJS][ExcelJS]

-
Used Modules
+
-----------
* [Path][modPath]
* [Util][modUtil]
* [Stream][modStream]
* [Sax][modSax]
- * [Unzip2][modUnzip2]
+ * [unzipper][modUnzipper]
* [Temp][modTemp]

-
Authors
+
-----------
- Written by [Brian Taber](https://github.com/DaSpawn)
+ Written by [Brian Taber](https://github.com/DaSpawn) and [Kirill Husyatin](https://github.com/kikill95)

[![DaSpawn's Gratipay][gratipay-image-daspawn]][gratipay-url-daspawn]

-
License
+
-----------
[MIT](LICENSE)

-
-
[gratipay-url-daspawn]: https://gratipay.com/~DaSpawn
[gratipay-image-daspawn]: https://img.shields.io/gratipay/team/daspawn.svg
[msdnRows]: https://msdn.microsoft.com/EN-US/library/office/documentformat.openxml.spreadsheet.row.aspx
@@ -226,5 +227,5 @@ License
[modStream]: https://nodejs.org/api/stream.html
[modUtil]: https://nodejs.org/api/util.html
[modSax]: https://github.com/isaacs/sax-js
- [modUnzip2]: https://github.com/glebdmitriew/node-unzip-2
+ [modUnzipper]: https://github.com/ZJONSSON/node-unzipper
[modTemp]: https://github.com/bruce/node-temp