- export results to an Excel file
- report the token length of the documents we feed
- if the result CSV file exists, ask whether to rerun from the start
- support Meta AI LLaMA and Claude
- support Google API
- context + paper + all questions
- check the map_reduce and stuff RetrievalQA methods
- use GPT-3 only, if possible
- load local index with embedding
- use the chain-of-thought method for specific questions
- markdown preparation code
- w/wo cheatsheet
- w/wo embedding
- extract content about the question
- small text/sentence embedding
- group questions by relationship
- or simply 3 questions per request?
- categorize the paper
- contain table or not
- choose test set
- choose cheatsheet or not
- choose split question or not
- choose split content or not
- choose report or not
- variation analysis
- overall
- for section split content
- test chunk overlap
- use title to know if the paper is related
- data file structure as an object
- if plasma then active replication
- if naive then active replication
- if drug class contains then PI and INSTI
- if naive and not treated, should have number of individuals
- 4301 and 4302
- 4301 and 4101
if one of them gets too many disagreements in a paper, the human results may need rechecking.
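
A minimal sketch of the recheck rule above, assuming a "disagreement" means the model answer differs from the human answer for one of the listed question IDs within a paper. The function name, the `(paper_id, question_id)` key layout, and the `max_disagreements` threshold are all hypothetical and not taken from these notes.

```python
from collections import defaultdict


def flag_papers_for_recheck(ai_answers, human_answers, question_ids, max_disagreements=1):
    """Return sorted paper IDs whose disagreement count exceeds the threshold.

    ai_answers / human_answers: dict mapping (paper_id, question_id) -> answer string.
    question_ids: set of question IDs to compare (assumed, e.g. {"4301", "4302", "4101"}).
    max_disagreements: hypothetical cutoff; a paper is flagged when it has more than this.
    """
    disagreements = defaultdict(int)
    for (paper_id, qid), ai_value in ai_answers.items():
        if qid not in question_ids:
            continue
        human_value = human_answers.get((paper_id, qid))
        if human_value is None:
            # No human answer to compare against, so skip this pair.
            continue
        # Compare case-insensitively and ignore surrounding whitespace.
        if ai_value.strip().lower() != human_value.strip().lower():
            disagreements[paper_id] += 1
    return sorted(p for p, n in disagreements.items() if n > max_disagreements)


# Made-up example data, for illustration only.
ai = {
    ("p1", "4301"): "yes", ("p1", "4302"): "no", ("p1", "4101"): "plasma",
    ("p2", "4301"): "yes", ("p2", "4302"): "yes",
}
human = {
    ("p1", "4301"): "no", ("p1", "4302"): "yes", ("p1", "4101"): "Plasma",
    ("p2", "4301"): "yes", ("p2", "4302"): "no",
}
flagged = flag_papers_for_recheck(ai, human, {"4301", "4302", "4101"})
print(flagged)
```

Counting per paper keeps the check cheap; whether the right cutoff is an absolute count or a fraction of the questions asked is left open here.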