ClientConnectorCertificateError on GET request to any blob #296

Open
v-hunt opened this issue Oct 19, 2020 · 8 comments

@v-hunt

v-hunt commented Oct 19, 2020

What happened:
We are trying to read file(s) from a Google Cloud Storage bucket, but every request fails with an SSL certificate error.

What you expected to happen:
We can run any gcsfs API command

Minimal Complete Verifiable Example:
Please note that this is a minimal example. Any other call (e.g. the code for opening a file) causes the same error.

import gcsfs
fs = gcsfs.GCSFileSystem(project='my-project')
fs.ls('my-bucket')

This code will cause an exception. Error traceback:

Traceback (most recent call last):
  File "/path/to/my-project/python3.7/site-packages/aiohttp/connector.py", line 936, in _wrap_create_connection
    return await self._loop.create_connection(*args, **kwargs)  # type: ignore  # noqa
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/base_events.py", line 981, in create_connection
    ssl_handshake_timeout=ssl_handshake_timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/base_events.py", line 1009, in _create_connection_transport
    await waiter
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/sslproto.py", line 530, in data_received
    ssldata, appdata = self._sslpipe.feed_ssldata(data)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/sslproto.py", line 189, in feed_ssldata
    self._sslobj.do_handshake()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/ssl.py", line 774, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1076)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/path/to/my-project/python3.7/site-packages/IPython/core/interactiveshell.py", line 3417, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-10-daebfc8e4d60>", line 1, in <module>
    fs.ls('bduk-dev-tmt')
  File "/path/to/my-project/python3.7/site-packages/fsspec/asyn.py", line 121, in wrapper
    return maybe_sync(func, self, *args, **kwargs)
  File "/path/to/my-project/python3.7/site-packages/fsspec/asyn.py", line 100, in maybe_sync
    return sync(loop, func, *args, **kwargs)
  File "/path/to/my-project/python3.7/site-packages/fsspec/asyn.py", line 71, in sync
    raise exc.with_traceback(tb)
  File "/path/to/my-project/python3.7/site-packages/fsspec/asyn.py", line 55, in f
    result[0] = await future
  File "/path/to/my-project/python3.7/site-packages/gcsfs/core.py", line 808, in _ls
    out = await self._list_objects(path)
  File "/path/to/my-project/python3.7/site-packages/gcsfs/core.py", line 598, in _list_objects
    items, prefixes = await self._do_list_objects(path)
  File "/path/to/my-project/python3.7/site-packages/gcsfs/core.py", line 633, in _do_list_objects
    json_out=True,
  File "/path/to/my-project/python3.7/site-packages/gcsfs/core.py", line 494, in _call
    timeout=self.requests_timeout,
  File "/path/to/my-project/python3.7/site-packages/aiohttp/client.py", line 1012, in __aenter__
    self._resp = await self._coro
  File "/path/to/my-project/python3.7/site-packages/aiohttp/client.py", line 483, in _request
    timeout=real_timeout
  File "/path/to/my-project/python3.7/site-packages/aiohttp/connector.py", line 523, in connect
    proto = await self._create_connection(req, traces, timeout)
  File "/path/to/my-project/python3.7/site-packages/aiohttp/connector.py", line 859, in _create_connection
    req, traces, timeout)
  File "/path/to/my-project/python3.7/site-packages/aiohttp/connector.py", line 1004, in _create_direct_connection
    raise last_exc
  File "/path/to/my-project/python3.7/site-packages/aiohttp/connector.py", line 986, in _create_direct_connection
    req=req, client_error=client_error)
  File "/path/to/my-project/python3.7/site-packages/aiohttp/connector.py", line 939, in _wrap_create_connection
    req.connection_key, exc) from exc
aiohttp.client_exceptions.ClientConnectorCertificateError: Cannot connect to host www.googleapis.com:443 ssl:True [SSLCertVerificationError: (1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1076)')]

Anything else we need to know?:

It looks like this issue can be caused by this. But recompiling Python is not a practical solution. There should be a simpler solution or fix.

This issue also makes pandas.read_excel and pandas.read_csv fail, which makes it more painful.

Environment:

  • Dask version: We don't use Dask, gcsfs version is 0.7.1
  • Python version: Python 3.7.4
  • Operating System: macOS Catalina 10.15.7
  • Install method (conda, pip, source): pip

pip freeze:

aiohttp==3.6.2
appnope==0.1.0
argon2-cffi==20.1.0
async-generator==1.10
async-timeout==3.0.1
attrs==19.3.0
backcall==0.2.0
bleach==3.2.1
cachetools==4.1.1
certifi==2020.6.20
cffi==1.14.3
chardet==3.0.4
click==7.1.2
decorator==4.4.2
defusedxml==0.6.0
entrypoints==0.3
Flask==1.1.2
fsspec==0.8.4
gcsfs==0.7.1
google-api-core==1.21.0
google-api-python-client==1.10.0
google-auth==1.19.2
google-auth-httplib2==0.0.4
google-auth-oauthlib==0.4.1
google-cloud-core==1.4.3
google-cloud-pubsub==1.7.0
google-cloud-storage==1.31.0
google-cloud-trace==0.23.0
google-crc32c==1.0.0
google-resumable-media==1.1.0
googleapis-common-protos==1.52.0
grpc-google-iam-v1==0.12.3
grpcio==1.30.0
httplib2==0.18.1
idna==2.9
importlib-metadata==2.0.0
ipykernel==5.3.4
ipython==7.18.1
ipython-genutils==0.2.0
itsdangerous==1.1.0
jedi==0.17.2
Jinja2==2.11.2
jsonschema==3.2.0
jupyter-client==6.1.7
jupyter-core==4.6.3
jupyterlab-pygments==0.1.2
MarkupSafe==1.1.1
mistune==0.8.4
multidict==4.7.6
nbclient==0.5.0
nbconvert==6.0.7
nbformat==5.0.8
nest-asyncio==1.4.1
notebook==6.1.4
numpy==1.19.2
oauthlib==3.1.0
opencensus==0.7.9
opencensus-context==0.1.1
packaging==20.4
pandas==1.1.2
pandocfilters==1.4.2
parso==0.7.1
pexpect==4.8.0
pickleshare==0.7.5
prometheus-client==0.8.0
prompt-toolkit==3.0.8
protobuf==3.12.2
ptyprocess==0.6.0
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycparser==2.20
Pygments==2.7.1
pyparsing==2.4.7
pyrsistent==0.17.3
python-dateutil==2.8.1
pytz==2020.1
PyYAML==5.3.1
pyzmq==19.0.2
requests==2.24.0
requests-oauthlib==1.3.0
rsa==4.6
Send2Trash==1.5.0
six==1.15.0
terminado==0.9.1
testpath==0.4.4
tornado==6.0.4
traitlets==5.0.4
typing-extensions==3.7.4.3
uritemplate==3.0.1
urllib3==1.25.9
wcwidth==0.2.5
webencodings==0.5.1
Werkzeug==1.0.1
wrapt==1.12.1
xlrd==1.2.0
yarl==1.5.1
zipp==3.3.0
@martindurant
Member

Do you succeed with other calls, such as connecting and listing a bucket?
Do google's own python APIs work for you?

To me, an SSL error suggests that you may be behind some complex firewall or proxy. It seems unlikely to me that GCS requires some special weak cipher to be compiled into Python; other people are connecting just fine.

@v-hunt
Author

v-hunt commented Oct 22, 2020

Hi @martindurant,

| Do you succeed with other calls, such as connecting and listing a bucket?

I used the fs.ls() call in this example. I also tried to read the file blob with fs.open() and got the same error. So I'm pretty sure this error occurs for any HTTP call.

| Do google's own python APIs work for you?

Since we can't read spreadsheets with pandas directly because of this issue, we read them manually with Google's official google.cloud.storage module and pass them to pandas as BytesIO objects. In other words, we can read any GCS file without issues, so it doesn't look like a general gateway issue.
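The workaround described above can be sketched roughly as follows. This is a minimal sketch, not the exact code used in the project: `read_blob_bytes` is a hypothetical helper, the bucket and object names are placeholders, and it assumes google-cloud-storage 1.31.0 or later (for `Blob.download_as_bytes`) with default application credentials.

```python
import io


def read_blob_bytes(bucket_name, blob_name, client=None):
    """Download one GCS object into an in-memory buffer (hypothetical helper)."""
    if client is None:
        # Imported lazily so the helper can be exercised with a stand-in client.
        from google.cloud import storage

        client = storage.Client()
    blob = client.bucket(bucket_name).blob(blob_name)
    # The whole object is held in memory, so this suits small files only.
    return io.BytesIO(blob.download_as_bytes())


# Usage (placeholder names; needs network access and credentials):
#   import pandas as pd
#   df = pd.read_csv(read_blob_bytes("my-bucket", "path/to/data.csv"))
#   sheet = pd.read_excel(read_blob_bytes("my-bucket", "path/to/book.xlsx"))
```

This path goes through requests rather than aiohttp, which may be why it is unaffected by the certificate error.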

@martindurant
Member

Perhaps with a combination of pdb and logging you can figure out exactly what call the google API is making, and then, why the gcsfs via aiohttp is different. This error is coming from pretty deep within python.

Note that I don't see cryptography or pyopenssl (or any ssl) in your installed packages.

Please also check any environment variables or configuration you might have relating to certificate trust stores.
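One way to run that check, as a minimal sketch using only the standard library (certifi is reported only if it happens to be installed). The function name `describe_trust_store` is made up for illustration. It shows where the interpreter's OpenSSL looks for CA certificates and how many the default SSL context actually loaded.

```python
import os
import ssl


def describe_trust_store():
    """Collect where this interpreter looks for CA certificates."""
    paths = ssl.get_default_verify_paths()
    info = {
        "openssl_version": ssl.OPENSSL_VERSION,
        # cafile/capath are None when the compiled-in location does not exist.
        "cafile": paths.cafile,
        "capath": paths.capath,
        "openssl_cafile": paths.openssl_cafile,
        "openssl_capath": paths.openssl_capath,
        # Environment overrides honoured by OpenSSL, if set.
        paths.openssl_cafile_env: os.environ.get(paths.openssl_cafile_env),
        paths.openssl_capath_env: os.environ.get(paths.openssl_capath_env),
    }
    # Certificates found only in a capath directory are loaded on demand
    # and are not counted here.
    info["loaded_ca_certs"] = len(ssl.create_default_context().get_ca_certs())
    try:
        import certifi

        info["certifi_bundle"] = certifi.where()
    except ImportError:
        info["certifi_bundle"] = None
    return info


if __name__ == "__main__":
    for key, value in describe_trust_store().items():
        print(f"{key}: {value}")
```

If `cafile` and `capath` are both empty and no certificates were loaded, that would point at a missing trust store for this Python build rather than at gcsfs. Libraries such as requests ship with the certifi bundle and would not be affected in that case.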

@martindurant
Member

(ping)

@v-hunt
Author

v-hunt commented Nov 25, 2020

Hi @martindurant
Sorry, I have been pretty busy.
I'm going to step through it with a debugger and update you.
For now, just let me share some thoughts:

  • If I don't use async mode, it should not use asyncio, in my opinion.
  • All HTTP-related libs work without any third-party SSL and/or encryption libs.

@martindurant
Member

| If I don't use async mode, it should not use asyncio

Making that distinction would mean writing two separate implementations, with double the code. Even if you don't use asyncio directly, you might still appreciate the concurrent bulk operations it provides.
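The general pattern can be illustrated with a toy example. This is a simplified sketch, not the actual fsspec or gcsfs code: `ToyFileSystem` and its methods are hypothetical, and `asyncio.sleep` stands in for an HTTP request. A single async implementation is exposed through blocking wrappers that run each coroutine on an event loop in a background thread.

```python
import asyncio
import threading

# One event loop running in a background daemon thread, shared by all calls.
_loop = asyncio.new_event_loop()
threading.Thread(target=_loop.run_forever, daemon=True).start()


def sync(coro, timeout=None):
    """Run a coroutine on the background loop and block until it finishes."""
    return asyncio.run_coroutine_threadsafe(coro, _loop).result(timeout)


class ToyFileSystem:
    """Hypothetical stand-in for an async-first filesystem class."""

    async def _cat_file(self, path):
        await asyncio.sleep(0.01)  # stands in for one HTTP round trip
        return f"contents of {path}".encode()

    async def _cat(self, paths):
        # Bulk operation: all requests are in flight at the same time.
        results = await asyncio.gather(*(self._cat_file(p) for p in paths))
        return dict(zip(paths, results))

    # Blocking wrappers: the caller never touches asyncio directly.
    def cat_file(self, path):
        return sync(self._cat_file(path))

    def cat(self, paths):
        return sync(self._cat(paths))
```

Calling `ToyFileSystem().cat(["a", "b"])` from ordinary blocking code fetches both paths concurrently, which is the benefit a separate synchronous implementation would lose.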

@lorabit110

aio-libs/aiohttp#5375 (comment) solved the problem for me.

@martindurant
Member

@lorabit110, do you know which version that was released in?
