[Format][FlightRPC] Flight SQL evolution #41840

Open
2 tasks
lidavidm opened this issue May 27, 2024 · 1 comment

Comments

@lidavidm
Member

lidavidm commented May 27, 2024

Describe the enhancement requested

From apache/arrow-rs#5731 (comment)

Originally Flight RPC was implemented as a framework wrapping gRPC. This was especially expedient for the C++ implementation. By now it's mostly a weight dragging down Flight users, especially Flight SQL.

If we have the chance to evolve Flight SQL and/or Flight RPC, some changes may include:

Component(s)

FlightRPC, Format

@lidavidm
Member Author

lidavidm commented May 28, 2024

Other potential things

  • There's a mix of stateful and stateless components, e.g. transactions use explicit handles but sessions are ambient (some of this stems from trying to accommodate JDBC/ODBC more directly)
  • The name needs to be something that doesn't make people think it's a SQL dialect
  • The metadata methods are incomplete (e.g. there are no procedures, types, etc.)
  • There's no support for multiple result sets (this is also a limitation of using IPC streams in Flight, and it has been complained about before too)
  • The underlying IPC stream could use enhancements
    • the small result proposal from Micah, which would embed trivial result sets into the initial response
    • IPC data often isn't aligned
    • IPC data has to deal with the gRPC message size limits
  • Duplication between GetFlightInfo/PollFlightInfo due to evolution over time (though you could argue it makes sense to have both anyways)
  • There's no first-class support for result sets/queries as an actual API resource
  • There's no support for result set pagination, and only limited support for retries
  • There's no way to extend the protocol with custom actions
  • There's no way to pass extra metadata alongside queries (e.g. a query ID to use)
  • It's hard for proxies to interject info into requests/responses
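
To make the gRPC message size point above concrete, here is a rough sketch of the kind of bookkeeping a server ends up doing to keep each batch under the limit. It is illustration only, not how any Arrow implementation does it: `plan_batches`, `row_sizes` and `overhead_bytes` are hypothetical names, and the 4 MiB figure is assumed because it is the usual default gRPC inbound message limit (deployments can configure a different value).

```python
from typing import List, Sequence, Tuple

# Assumption: 4 MiB is the common default gRPC maximum inbound message size.
DEFAULT_GRPC_MAX_MESSAGE_BYTES = 4 * 1024 * 1024


def plan_batches(
    row_sizes: Sequence[int],
    max_bytes: int = DEFAULT_GRPC_MAX_MESSAGE_BYTES,
    overhead_bytes: int = 0,
) -> List[Tuple[int, int]]:
    """Group rows into half-open (start, end) ranges that fit in one message.

    row_sizes[i] is the estimated encoded size of row i in bytes.
    overhead_bytes is a hypothetical per-message allowance for framing/metadata.
    """
    budget = max_bytes - overhead_bytes
    if budget <= 0:
        raise ValueError("overhead_bytes leaves no room for data")

    ranges: List[Tuple[int, int]] = []
    start = 0  # first row of the batch currently being filled
    used = 0   # bytes already assigned to that batch
    for i, size in enumerate(row_sizes):
        if size > budget:
            # A single oversized row cannot be sent without raising the limit.
            raise ValueError(f"row {i} ({size} bytes) cannot fit in one message")
        if used + size > budget:
            # Close the current batch before this row and start a new one.
            ranges.append((start, i))
            start = i
            used = 0
        used += size
    if start < len(row_sizes):
        ranges.append((start, len(row_sizes)))
    return ranges
```

For example, `plan_batches([3, 3, 3, 3], max_bytes=6)` splits four 3-byte rows into two ranges of two rows each. A row that is larger than the budget by itself has no good answer here, which is part of why the limit is awkward for IPC data.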

@felipecrv felipecrv self-assigned this Jun 28, 2024
@felipecrv felipecrv removed their assignment Aug 20, 2024