Set SSL and SSL verification in catalog or DB credentials #801

Closed
Soufraz opened this issue Jun 23, 2021 · 7 comments
Labels
Issue: Bug Report 🐞 Bug that needs to be fixed

Comments

@Soufraz

Soufraz commented Jun 23, 2021

Description

Connect to a Trino/Presto database using a catalog entry and a credentials file.
In DataGrip and DBeaver I can connect only with these two parameters set:
SSL=True
SSLVerification=NONE
I am trying to find out how to set these parameters in the catalog definition or the credentials file.
When I add them to the connection URL, the pipeline does not even start to run.

Context

I'd like to get data through Trino/Presto using catalogs. I installed sqlalchemy-trino and changed the connection URL to trino://
But to connect to the database I need to set the SSL parameters, even in DataGrip or DBeaver.
So I am trying to find out how to pass these parameters in the credentials con.
I tried ssl, ssl_context, ssl_verification and a few more, but every parameter I add produces an unrecognized-parameter error.

kedro.io.core.DataSetError: Failed while loading data from data set SQLQueryDataSet(load_args={}, sql=BIG_QUERY_HERE.
HTTPSConnectionPool(host='presto.address.com', port=8446): Max retries exceeded with url: /v1/statement (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f183810d520>, 'Connection to presto.address.com timed out. (connect timeout=30.0)'))

Your Environment

sqlalchemy==1.4.15
sqlalchemy-trino==0.3.0
psycopg2-binary

  • Kedro version used: 0.17.3
  • Python version used: 3.8.5
  • Operating system and version: Manjaro 21
@Soufraz Soufraz added the Issue: Bug Report 🐞 Bug that needs to be fixed label Jun 23, 2021
@merelcht
Member

merelcht commented Jul 5, 2021

Hi @Soufraz, thanks for reaching out!
Could you paste the catalog entry you tried?

@Soufraz
Author

Soufraz commented Jul 5, 2021

# credentials
trino_connection:
  con: trino://user:pass@presto.address.com:8446/hive

# catalog
attachments:
  type: pandas.SQLQueryDataSet
  credentials: trino_connection
  sql: 'query here'

Params I've tried:

load_args:
  ssl:
    fake_flag_to_enable_tls: True
    verification: None

@merelcht
Member

merelcht commented Jul 5, 2021

Under the hood, load_args are passed on to pandas read_sql_query. You can find the supported arguments here: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_sql_query.html
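
To illustrate why keys such as ssl under load_args are rejected, here is a minimal sketch that checks a dict of keyword arguments against a function's signature. The helper name unsupported_kwargs is hypothetical and is not part of Kedro or pandas; it only mirrors the point above that load_args must match what read_sql_query accepts.

```python
import inspect
from typing import Any, Callable, Dict, List


def unsupported_kwargs(func: Callable[..., Any], kwargs: Dict[str, Any]) -> List[str]:
    """Return the keys in `kwargs` that `func` cannot take as keyword arguments."""
    params = inspect.signature(func).parameters

    # A **kwargs catch-all means every key would be accepted.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []

    # Only these parameter kinds can be passed by name.
    by_name = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    accepted = {name for name, p in params.items() if p.kind in by_name}
    return [key for key in kwargs if key not in accepted]


# Usage (requires pandas to be installed):
#   import pandas as pd
#   load_args = {"ssl": {"verification": None}, "index_col": "id"}
#   unsupported_kwargs(pd.read_sql_query, load_args)
```

Any key this reports would raise a TypeError when forwarded to read_sql_query, which suggests that connection-level settings like SSL verification have to be configured on the connection itself rather than in load_args.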

@Soufraz
Author

Soufraz commented Jul 13, 2021

I can only connect to Trino this way, and I haven't found how to pass things like _http_session.verify and auth in Kedro catalogs.
Is there a way to create this connection in code and use it as the connection for a catalog entry?

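from trino import dbapi, auth

# host, port, username, password, dialect and database_name are defined elsewhere
conn = dbapi.connect(
    host=host,
    port=port,
    user=username,
    catalog=dialect,  # Trino catalog name, e.g. "hive"
    schema=database_name,
    http_scheme='https',
    auth=auth.BasicAuthentication(username, password),
)
# Disable TLS certificate verification on the underlying HTTP session
conn._http_session.verify = False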

@Soufraz
Author

Soufraz commented Jul 13, 2021

For now I built a custom dataset. Thanks.
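
The custom dataset itself was not posted in this thread. As a rough sketch of what such a dataset could look like, the class below wraps the dbapi snippet from the earlier comment in a Kedro AbstractDataSet subclass. The class name TrinoQueryDataSet, the credential keys (host, port, user, password, catalog, schema) and the verify argument are all assumptions made for illustration, and it relies on the same private _http_session attribute, which may change between trino client versions.

```python
from typing import Any, Dict

try:
    from kedro.io import AbstractDataSet  # base class name in Kedro 0.17.x
except ImportError:  # keeps the sketch importable where Kedro is missing
    AbstractDataSet = object


class TrinoQueryDataSet(AbstractDataSet):
    """Hypothetical read-only dataset that runs one query through trino.dbapi."""

    def __init__(self, sql: str, credentials: Dict[str, Any], verify: bool = True):
        self._sql = sql
        self._credentials = dict(credentials)
        self._verify = verify

    def _load(self):
        # Imported lazily so the class can be defined without the drivers installed.
        import pandas as pd
        from trino import auth, dbapi

        creds = self._credentials
        conn = dbapi.connect(
            host=creds["host"],
            port=creds["port"],
            user=creds["user"],
            catalog=creds["catalog"],
            schema=creds["schema"],
            http_scheme="https",
            auth=auth.BasicAuthentication(creds["user"], creds["password"]),
        )
        # Same workaround as in the thread: a private attribute, so treat it as fragile.
        conn._http_session.verify = self._verify

        cursor = conn.cursor()
        cursor.execute(self._sql)
        rows = cursor.fetchall()
        columns = [col[0] for col in cursor.description]
        return pd.DataFrame(rows, columns=columns)

    def _save(self, data) -> None:
        raise NotImplementedError("TrinoQueryDataSet is read-only")

    def _describe(self) -> Dict[str, Any]:
        # Leave the password out of anything that may end up in logs.
        return {
            "sql": self._sql,
            "host": self._credentials.get("host"),
            "catalog": self._credentials.get("catalog"),
            "verify": self._verify,
        }
```

In the catalog, type would then point at the full import path of the class (for example a module inside your own project), with sql and verify as entry keys and credentials referencing an entry that holds the separate fields instead of a single con URL. Setting verify to False skips certificate validation, so it is probably only appropriate for trusted internal hosts.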

@Soufraz Soufraz closed this as completed Jul 13, 2021
@datajoely
Contributor

@Soufraz we're always happy to accept PRs if you'd like to contribute a sql.PrestoDataSet.

@Soufraz
Author

Soufraz commented Jul 13, 2021

I can try. Hahaha. I'll send it by the end of this week.
