Replace CoreNLP with spaCy #3
Comments
Are you talking about this: https://stanfordnlp.github.io/CoreNLP/corenlp-server.html ? |
Yes. PyCobalt currently uses CoreNLP for POS tagging and NER. Running it is clunky: starting the CoreNLP server is cumbersome, even with Docker. With spaCy, all code runs directly in Python. This benchmark shows that spaCy is faster, although its NER is slightly worse. |
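To illustrate the "all code runs directly in Python" point, here is a minimal sketch of what an in-process spaCy call for POS tagging and NER could look like. It is not PyCobalt's actual code: the names `Annotation`, `doc_to_annotation`, and `annotate` are hypothetical, and the model name `en_core_web_sm` is an assumption (any installed spaCy pipeline with a tagger, parser, and NER component should work).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Annotation:
    """Plain container (hypothetical) so callers do not depend on spaCy types."""
    sentences: List[str]
    pos_tags: List[Tuple[str, str]]  # (token text, coarse POS tag)
    entities: List[Tuple[str, str]]  # (entity text, entity label)


def doc_to_annotation(doc) -> Annotation:
    """Copy sentences, POS tags, and entities out of a spaCy Doc."""
    return Annotation(
        sentences=[sent.text for sent in doc.sents],
        pos_tags=[(tok.text, tok.pos_) for tok in doc],
        entities=[(ent.text, ent.label_) for ent in doc.ents],
    )


def annotate(text: str, model: str = "en_core_web_sm") -> Annotation:
    """Run spaCy in-process; no separate server has to be started."""
    import spacy  # imported lazily so this module loads without spaCy installed

    nlp = spacy.load(model)  # assumes the model was downloaded beforehand
    return doc_to_annotation(nlp(text))
```

Under these assumptions, usage would be `annotate("Some text.")` after installing the model with `python -m spacy download en_core_web_sm`. Returning plain tuples rather than spaCy objects is a design choice that would keep the rest of the code independent of the NLP backend.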
Great move, I think spaCy will be much better than CoreNLP. I am eagerly waiting for this update. Please let me know if you need any help. |
I'm sorry for raising expectations about the implementation and the
timeline. This issue was meant as a reminder for myself, for when I have
time in the future. It won't be resolved this month or next.
We would be happy to accept a pull request, though.
swathimithran <notifications@github.com> wrote on Thu, 27 July 2017,
11:05:
Great Move, I think Spacy is will be much better than CoreNLP. I am
eagerly waiting for this update. Please let me know if you need any help.
--
Universität Passau
Bernhard Bermeitinger, M.Sc.
Research Associate
Faculty of Computer Science and Mathematics
Chair of Computer Science with a Focus on Digital Libraries and Web
Information Systems
Innstraße 43, ITZ/IH 112
94032 Passau
+49-(0)851/509-3394
bernhard.bermeitinger@uni-passau.de
http://www.fim.uni-passau.de/digital-libraries/
|
Starting the CoreNLP server is not pleasant for anyone: it is big, relatively slow, and a bit clunky to use.
Other options are spaCy or nltk.
First experiments show that nltk's Named Entity Recognition is not very accurate and its sentence splitter is worse than CoreNLP's. The next choice is spaCy, which shows nice results in simple experiments. Before we implement, we have to check the following: