Allow send list of str for the Prompt on openai demo endpoint /v1/completions #323
Conversation
Hi @ironpinguin! Thanks for the contribution! However, I believe this is not how the OpenAI API behaves. Can you take a look at the example below?

```python
import openai

completion = openai.Completion.create(
    model="text-davinci-003",
    prompt=["Say", "this", "is", "a", "test"],
    echo=True,
    n=1,
    stream=False,  # `stream` was an undefined variable in the original snippet
)
print(completion)
```

Output:
The OpenAI API treats the strings in the list as separate prompts. As a temp fix, when the …
LGTM! Thank you for your contribution!
Hi, I just want to be sure that it's on my side and not a regression. Trying to use LangChain with vLLM, I am getting exactly this problem:

```python
from langchain.llms import VLLMOpenAI

llm = VLLMOpenAI(
    openai_api_key="EMPTY",
    openai_api_base="http://localhost:8000/v1",
    model_name="TheBloke/Llama-2-70B-chat-AWQ",
    model_kwargs={"stop": ["."]},
)
print(llm("Rome is"))
```

with response …
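For context on why this fails: LangChain's OpenAI-compatible client serializes the `prompt` field as a JSON array of strings when it POSTs to `/v1/completions`. A minimal sketch of the request body it produces (field values copied from the snippet above; the exact payload shape beyond `model`, `prompt`, and `stop` is an assumption):

```python
import json

# Illustrative body of the POST to /v1/completions: note that "prompt"
# is a JSON array, not a plain string, which the demo endpoint rejected.
payload = {
    "model": "TheBloke/Llama-2-70B-chat-AWQ",
    "prompt": ["Rome is"],
    "stop": ["."],
}
body = json.dumps(payload)
print(body)
```

A server that only accepts `"prompt": "<str>"` will fail validation on this body, which matches the error reported here.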
…pletions (vllm-project#323)

* allow str or List[str] for prompt

* Update vllm/entrypoints/openai/api_server.py

Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>

---------

Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>
The LangChain implementation sends the prompt as an array of strings to the /v1/completions endpoint.
With this change, the prompt can be sent either as a single string or as an array of strings.
If the prompt is an array, all strings are concatenated into one string, so the downstream engine works with both prompt data types.
This resolves #186.
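The normalization described above can be sketched as a small helper. This is a minimal illustration, not the actual PR diff; in particular, the PR description only says the strings are "concatenated", so joining with a single space is an assumption made here for readability:

```python
from typing import List, Union


def normalize_prompt(prompt: Union[str, List[str]]) -> str:
    """Accept a prompt as either a string or a list of strings.

    If a list is received (as LangChain sends it), join the parts into
    one string so the engine always sees a single prompt string.
    Joining with a space is an assumption; the PR only specifies
    concatenation.
    """
    if isinstance(prompt, list):
        return " ".join(prompt)
    return prompt
```

With this in place, both `normalize_prompt("Rome is")` and `normalize_prompt(["Rome", "is"])` yield a single string, so the rest of the request handler needs no branching on the prompt type.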