Found an expert, who read the webpage. The expert found the proposal interesting, but unclear on some points.
Created an account. About the webpage proposed by Quentin to collect the dataset:
will include it in the PPP user interface (e.g. "Are you happy with the answer?").
Reached an agreement on communication between modules. Several scripts to create modules automatically (PHP and Python).
Updated proposal, taking into account what we said.
Created a question dataset (6000 questions). Algorithm: will use Stanford libraries,
and then test some heuristics. Other proposal: do not build triple structures,
but rather match typical question forms (e.g. "How to ... ?" or "When was ... ?").
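A minimal sketch of the "typical questions" idea, assuming each typical form can be written as a regular expression. The template names, the `classify` function and the `topic` group are hypothetical, chosen only for illustration; the two patterns mirror the two examples above.

```python
import re

# Hypothetical templates: one regular expression per typical question form.
TEMPLATES = [
    ("how_to", re.compile(r"^how\s+to\s+(?P<topic>.+?)\s*\?*$", re.IGNORECASE)),
    ("when_was", re.compile(r"^when\s+was\s+(?P<topic>.+?)\s*\?*$", re.IGNORECASE)),
]


def classify(question):
    """Return (template_name, topic) for the first matching template, else None."""
    text = question.strip()
    for name, pattern in TEMPLATES:
        match = pattern.match(text)
        if match:
            return name, match.group("topic")
    return None
```

A question that fits no template returns None, so it could still be passed on to the triple-based approach.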
Functional router. Only one request for the moment (does not ask all modules). Todo: add logging to record the questions asked.
No work done yet. Will begin this week.
Downloaded a dictionary. Wrote some code (C++) to use it. Will test his idea. Expects to have first results (simple queries, learning some functions) in two weeks (with one week of computation for learning). Needs functional Wikidata and core modules. Needs a good algorithm to find the nearest neighbour of a given point among a list of other points (dimension 50, 10k points). Would like a sub-linear algorithm. Approximation is acceptable.
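One possible answer to the nearest-neighbour request is locality-sensitive hashing with random hyperplanes. The sketch below is an assumption, not the algorithm chosen by the team: the class name `HyperplaneLSH` and the parameters `n_bits` and `n_tables` are illustrative. Points falling on the same side of every hyperplane share a bucket, and the exact distance is computed only for the points in the query's buckets. The hashing favours points at a small angle from the query, so it works best on centred or normalised vectors; the result is approximate and the number of candidates examined is not guaranteed to be small.

```python
import random
from collections import defaultdict


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class HyperplaneLSH:
    """Approximate nearest-neighbour index using random-hyperplane hashing."""

    def __init__(self, dim, n_bits=8, n_tables=10, seed=0):
        rng = random.Random(seed)
        # One independent set of random hyperplane normals per hash table.
        self.planes = [
            [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]
            for _ in range(n_tables)
        ]
        self.tables = [defaultdict(list) for _ in range(n_tables)]
        self.points = []

    def _key(self, planes, v):
        # Each bit records on which side of a hyperplane the vector lies.
        return tuple(
            sum(p * x for p, x in zip(plane, v)) >= 0.0 for plane in planes
        )

    def add(self, v):
        """Store a vector and return its index."""
        idx = len(self.points)
        self.points.append(v)
        for planes, table in zip(self.planes, self.tables):
            table[self._key(planes, v)].append(idx)
        return idx

    def query(self, v):
        """Return the index of the closest candidate, or None if no bucket matches."""
        candidates = set()
        for planes, table in zip(self.planes, self.tables):
            candidates.update(table.get(self._key(planes, v), ()))
        if not candidates:
            return None
        # Exact distances are computed only for the candidates.
        return min(candidates, key=lambda i: sq_dist(self.points[i], v))
```

More bits per table give smaller buckets (faster, but more misses); more tables give better recall at the cost of memory. A query that matches no bucket returns None, in which case a linear scan can serve as a fallback.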
Found a lookup table (260k words). Words are encoded as vectors of floats. Two words at a small distance from each other are treated as synonyms. Expects to have a working algorithm soon (one to three weeks). Problem: dataset.
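To illustrate the lookup-table idea, here is a toy sketch. The four words and their 3-dimensional vectors are made up for the example (the real table has 260k words), and Euclidean distance is an assumption, since the notes do not say which distance is used. The helper name `closest_words` is hypothetical.

```python
import math

# Made-up vectors, for illustration only.
VECTORS = {
    "big": [0.9, 0.1, 0.0],
    "large": [0.85, 0.15, 0.05],
    "small": [-0.8, 0.2, 0.1],
    "river": [0.0, -0.7, 0.9],
}


def distance(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def closest_words(word, table, k=1):
    """Return the k words whose vectors are closest to the vector of `word`."""
    target = table[word]
    ranked = sorted(
        (distance(target, vec), other)
        for other, vec in table.items()
        if other != word
    )
    return [other for _, other in ranked[:k]]
```

This scan is linear in the size of the table, which is fine for a toy example but is exactly the cost a sub-linear nearest-neighbour index would avoid on the full table.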
6k lines of code, version 0.1 done. Handles simple queries.