# Getting the best language match from predictions (#6)
Should this be done by modifying the `test()` function in `main.py`?
Yeah, I suppose. Here's the relevant code block in the `test()` function:

```python
for dist in model.test(instances):
    print(dir(dist))
    print(dist.classes())
```

You could write a function to normalize the values (e.g. set the prediction with the highest confidence of a False value to 0, the one with the highest confidence of a True value to 1, and scale everything else accordingly). Then replace the code block above with something like:

```python
ranked_list = normalize_probabilities(model.test(instances))
if len(ranked_list) != 0:
    top = ranked_list[0]
    ...
```
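The `normalize_probabilities` helper named above doesn't exist yet; here's a minimal sketch of what it could look like, assuming each prediction can be reduced to a `(label, is_true, confidence)` tuple (the real `model.Distribution` interface isn't shown in this thread, so that shape is an assumption):

```python
def normalize_probabilities(predictions):
    """Rank predictions on a 0..1 scale.

    Hypothetical helper: `predictions` is assumed to be an iterable of
    (label, is_true, confidence) tuples. The most confident False
    prediction maps to 0, the most confident True prediction maps to 1,
    and everything else is scaled linearly in between.
    """
    # Signed score: confident True -> large positive, confident False -> large negative.
    signed = [(label, conf if is_true else -conf)
              for label, is_true, conf in predictions]
    if not signed:
        return []
    lo = min(s for _, s in signed)
    hi = max(s for _, s in signed)
    span = (hi - lo) or 1.0  # avoid division by zero when all scores match
    ranked = [(label, (s - lo) / span) for label, s in signed]
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked
```

With that in place, `ranked_list[0]` would be the top-scoring language for the span.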
Reviewing the code, it looks like `model.test()` returns `Distribution` objects, each of which contains a dictionary mapping class to probability. Each `Distribution` also has a `best_class` field, so if I'm not mistaken this issue might be solved by doing:

```python
for dist in model.test(instances):
    print(dir(dist))
    top = dist.best_class
```

I can put some normalization code into the `Distribution` class to make sure the probabilities are normalized.
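For illustration, here's a minimal stand-in showing what that normalization inside the class could look like; the field names are assumed from the description above, not taken from the actual `model.Distribution` source:

```python
class Distribution:
    """Hypothetical stand-in for model.Distribution (assumed interface)."""

    def __init__(self, probs):
        # Normalize so the class probabilities sum to 1.
        total = sum(probs.values()) or 1.0
        self.probs = {cls: p / total for cls, p in probs.items()}

    @property
    def best_class(self):
        # The class with the highest normalized probability.
        return max(self.probs, key=self.probs.get)
```

Normalizing in the constructor means every consumer of `probs` sees comparable values, and `best_class` stays a one-line lookup.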
Hmm, possibly. I didn't write that code, so I could be wrong.
Currently (when the code works), it only returns the True/False prediction and its score (as `model.Distribution` objects). It may be the case that more than one of the languages, or none of them, is chosen as True. The prediction scores should be used to rank the list of languages for a span, and the top-ranked language used as the final prediction.
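That ranking could be sketched as follows, assuming each span's predictions can be gathered into a language → (is_true, score) mapping (a hypothetical shape, not the actual model API): languages predicted True rank above those predicted False, and ties within each group break on score, so a single language is picked even when none (or several) are True.

```python
def pick_language(predictions):
    """Pick one language for a span from hypothetical (is_true, score) predictions.

    True predictions outrank False ones; within each group, higher score wins.
    Returns None if there are no predictions at all.
    """
    ranked = sorted(predictions.items(),
                    key=lambda kv: (kv[1][0], kv[1][1]),  # (is_true, score)
                    reverse=True)
    return ranked[0][0] if ranked else None
```

Because the sort key is a `(bool, float)` tuple, a low-scoring True prediction still beats a high-scoring False one, which matches the "rank, then take the top" behavior described above.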