Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Discover] feat: download CSV from discover page #9530

Merged
merged 13 commits into from
Mar 17, 2025

Conversation

angle943
Copy link
Collaborator

@angle943 angle943 commented Mar 12, 2025

Description

  • adds the ability to export to CSV from the discover page
  • There are two options to download: "Visible" and "Max Available"
  • "Visible" option is always available (whenever there are rows rendered) and will download the current table as CSV
  • "Max Available" option is available whenever we have HITS data (ie for DQL/Lucene) and will make a query call behind the scenes to download Math.min(hits, 10000) with your current configuration

Screenshot

  • When hits are available

Screenshot 2025-03-11 at 8 24 47 PM

  • when hits are not available

Screenshot 2025-03-11 at 8 24 55 PM

  • Video of download. We are downloading three times:
  1. Downloading the visible (500) results with default table (timestamp & _source)
  2. Downloading the max (10,000) with default table
  3. Adding some fields and downloading it
downloadcsv.mp4

Testing the changes

  • I have added both unit tests and cypress test

Changelog

  • feat: add the ability to export to CSV from the discover page

Check List

  • All tests pass
    • yarn test:jest
    • yarn test:jest_integration
  • New functionality includes testing.
  • New functionality has been documented.
  • Update CHANGELOG.md
  • Commits are signed per the DCO using --signoff

Signed-off-by: Justin Kim <jungkm@amazon.com>
@angle943 angle943 changed the title download CSV from discover page [Discover] feat: download CSV from discover page Mar 12, 2025
Signed-off-by: Justin Kim <jungkm@amazon.com>
@opensearch-project opensearch-project deleted a comment from codecov bot Mar 12, 2025
@opensearch-project opensearch-project deleted a comment from codecov bot Mar 12, 2025
Copy link

codecov bot commented Mar 12, 2025

Codecov Report

Attention: Patch coverage is 86.53846% with 14 lines in your changes missing coverage. Please review.

Project coverage is 61.82%. Comparing base (9f0a27e) to head (a052c4b).
Report is 7 commits behind head on main.

Files with missing lines Patch % Lines
...ic/application/view_components/utils/use_search.ts 56.25% 3 Missing and 4 partials ⚠️
...cation/components/download_csv/use_download_csv.ts 92.68% 2 Missing and 1 partial ⚠️
...plication/components/download_csv/download_csv.tsx 83.33% 0 Missing and 2 partials ⚠️
...omponents/download_csv/use_download_csv_toasts.tsx 94.11% 0 Missing and 1 partial ⚠️
...tion/view_components/utils/update_search_source.ts 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #9530      +/-   ##
==========================================
+ Coverage   61.78%   61.82%   +0.03%     
==========================================
  Files        3819     3825       +6     
  Lines       91937    92040     +103     
  Branches    14581    14599      +18     
==========================================
+ Hits        56804    56902      +98     
+ Misses      31469    31464       -5     
- Partials     3664     3674      +10     
Flag Coverage Δ
Linux_1 28.94% <7.69%> (-0.04%) ⬇️
Linux_2 56.38% <ø> (ø)
Linux_3 39.42% <86.53%> (+0.09%) ⬆️
Linux_4 28.85% <7.69%> (-0.04%) ⬇️
Windows_1 28.96% <7.69%> (-0.04%) ⬇️
Windows_2 56.33% <ø> (ø)
Windows_3 39.42% <86.53%> (+0.10%) ⬆️
Windows_4 28.85% <7.69%> (-0.04%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.


export const saveDataAsCsv = (csvData: string) => {
const blob = new Blob([csvData], { type: 'text/csv;charset=utf-8' });
const fileName = `opensearch_export_${moment().format('YYYY-MM-DD')}`;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we be pulling some of this from advanced settings?

Screenshot 2025-03-12 at 4 33 22 PM

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are talking about the formatting of the filename? I'm a bit reluctant on that, but open for debate. But here's my rationale:

Because we are saving this to a file, we have to respect the allowed characters that a filename can have. So things like / and such are not possible. So if we want to respect the user configured date format, I feel like we have to consider all these exceptions, and I didn't want to deal with it. Especially since its just a file name and they can rename it as they choose.

But open to suggestions!

if you were talking about something else, let me know!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the quoteValues and separator setting being read from L100-L101

      const csvData = unparse(
        { fields: displayedColumnNames, data: csvRowData },
        {
          quotes: uiSettings.get('csv:quoteValues', true),
          delimiter: uiSettings.get('csv:separator', ','),
          header: true,
        }
      );

@abbyhu2000
Copy link
Member

Nice feature! Just one small comment: I see in the download_csv folders have a lot of files and it seems like some files like <download_csv_button.tsx> and <download_csv_callout.tsx> are fairly simple single component file without any extra logics, imo is it worth it to combine some of them to increase some readability and faster navigation?

@angle943
Copy link
Collaborator Author

angle943 commented Mar 13, 2025

@abbyhu2000 hey! this is a contentious topic, but love to open up the debate.

You are right, DiscoverDownloadCsvButton and DownloadCsvCallout is pretty simple. But I would argue these should be separate. For context to everyone:

  • DiscoverDownloadCsvButton, it is passed as a prop to the EuiPopover's button prop in download_csv.tsx.
  • DownloadCsvCallout is rendered in DownloadcsvPopoverContent component and is hidden behind a conditional {showMaxOption && <DiscoverDownloadCsvCallout />}.

I would argue that these should be separate for the following reason:

  • Within the context of <DownloadCsv />, you whouldn't worry about the implementation details of how the button looks like and what type of props it has. Similarly, within context of DownloadCsvPopoverContent, you shouldn't worry about the UI and text and stuff of the PopoverContent. Just whehther or not it should render it or not. So i think your comment on "readability" is up for debate. Let me follow that up with the next point I have.
  • Both the parent components <DownloadCsv /> and <DownloadCsvPopoverContent /> can blow up and get big over time, so I think it is worth abstracting these things out. We have a lot of components I see with hundreds of lines of code and is in charge of doing so many things. I would argue this decreases readability (and increases chances of bugs).
  • For code navigation, I agree it can get annoying to have to hop through multiple components to get to what you are looking for, but going back to the previous point of having thousands of lines of code components is not ideal either. I also think a general DX(Developer experience) frustration with deeply nested components is solved not by having less nested components, but a better state management strategy that will allow us to not prop drill so much (usually when you are hopping code it is to find the original definition of a certain prop)
  • The unit tests can be more easily scoped, and easier to write in general. For example, the parent's unit test doesn't have to worry about the inner workings of these sub components, and the sub components can have their own unit tests
  • Due to how React renders the DOM tree using a virtual DOM tree, this can potentially be an optimization technique as the whole of the parent components don't need to re-render, if nothing within the child components haven't re-rendered.

Due to the above, I would personally vote that we split it up where we can without over doing it. Of course it can be argued that I might be over doing it here, and it is up for debate, but for the above reasons I think keeping them separate makes more sense

…h variable

Signed-off-by: Justin Kim <jungkm@amazon.com>
Signed-off-by: Justin Kim <jungkm@amazon.com>
Signed-off-by: Justin Kim <jungkm@amazon.com>
Signed-off-by: Justin Kim <jungkm@amazon.com>
Signed-off-by: Justin Kim <jungkm@amazon.com>
Signed-off-by: Justin Kim <jungkm@amazon.com>
Signed-off-by: Justin Kim <jungkm@amazon.com>
@angle943 angle943 merged commit 29f05d7 into opensearch-project:main Mar 17, 2025
71 of 74 checks passed
@angle943 angle943 deleted the download-csv branch March 17, 2025 18:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants