add documentation for reranking #261
Signed-off-by: HenryL27 <hmlindeman@yahoo.com>
Good start. Here's some feedback.
@@ -0,0 +1,119 @@
# Reranking

Second-stage reranking is a technique to use AI to gain significant search relevancy (we've seen +5-10% recall in the top 5) for your queries. OpenSearch 2.12 introduces the [Rerank processor](https://opensearch.org/docs/latest/search-plugins/search-pipelines/rerank-processor/), contributed by Aryn, to perform this operation. For more information, visit the [OpenSearch reranking documentation](https://opensearch.org/docs/latest/search-plugins/search-relevance/reranking-search-results/).
I don't think we should give percentages on improvements. Let's also remove "significant" - let's just say it can improve results, given there is a wide variance.
Can we say "Sycamore uses OpenSearch's Rerank processor [link], contributed by Aryn...
## Usage

In order to use the rerank processor, you first need to register and deploy a [Text Similarity Model](https://opensearch.org/docs/latest/ml-commons-plugin/custom-local-models/#cross-encoder-models). If you're using the quickstart containers, one such model comes deployed already. You can get its id (let's call this `reranker_id`) with
They aren't called Quickstart containers. It's just Sycamore, so instead say "Sycamore includes NAME OF MODEL, and you will need to get its unique ID to use for reranking. To get its ID..."
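For readers following the thread, here is a minimal sketch of the ID lookup being discussed. It assumes the model is found through the ML Commons model search endpoint (`POST /_plugins/_ml/models/_search`) and that the reply has the usual OpenSearch search shape, with IDs under `hits.hits[]._id`. The sample response, the model name, and the helper `extract_model_ids` are hypothetical and are not taken from the PR.

```python
from typing import Any


def extract_model_ids(search_response: dict[str, Any]) -> list[str]:
    """Return the _id of every hit in an OpenSearch-style search response."""
    hits = search_response.get("hits", {}).get("hits", [])
    return [hit["_id"] for hit in hits if "_id" in hit]


# Hypothetical response, trimmed to the fields this sketch actually reads.
sample_response = {
    "hits": {
        "total": {"value": 1, "relation": "eq"},
        "hits": [
            {
                "_id": "example-reranker-id",
                "_source": {"name": "example-cross-encoder"},
            }
        ],
    }
}

# The first matching ID plays the role of `reranker_id` in the doc text.
reranker_id = extract_model_ids(sample_response)[0]
print(reranker_id)
```

An empty or malformed response yields an empty list rather than raising, so a caller can decide how to report "no reranking model deployed".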
}
```

Now, create a pipeline with the rerank processor in it:
"create a new search pipeline (link to the docs where we talk about this)..."
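As a point of reference for this step, the sketch below builds the request body for a search pipeline containing a single rerank response processor, following the `ml_opensearch` / `context.document_fields` layout described in the OpenSearch rerank processor documentation. The model ID, the field name `text_representation`, and the helper name are placeholders chosen for illustration; the PR's own example is not reproduced here.

```python
import json


def rerank_pipeline_body(reranker_id: str, document_fields: list[str]) -> dict:
    """Build a search pipeline body with one rerank response processor."""
    return {
        "description": "Rerank search results with a text similarity model",
        "response_processors": [
            {
                "rerank": {
                    # Model that scores (query, document) pairs.
                    "ml_opensearch": {"model_id": reranker_id},
                    # Document fields whose text is passed to the model.
                    "context": {"document_fields": document_fields},
                }
            }
        ],
    }


# Placeholder ID and field name; substitute the values from your deployment.
body = rerank_pipeline_body("example-reranker-id", ["text_representation"])
print(json.dumps(body, indent=2))
```

A body like this would be sent with `PUT /_search/pipeline/<pipeline_name>`; the pipeline name is left to the user.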
For more information, visit the OpenSearch reranking documentation and Search Pipelines documentation NEED LINK.
}
```

You can compose processors in pipelines. For example, to create a search pipeline that performs hybrid search, reranking, and RAG:
search pipelines. For example, to create a pipeline...
}
```

The blocks of processors are ordered. We recommend reranking before doing RAG since it's important for RAG that the LLM gets the best results.
recommend adding reranking before the RAG processor...
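To make the ordering point concrete, here is a hedged sketch of a composed pipeline body: a hybrid search normalization step in `phase_results_processors`, followed by `rerank` and then `retrieval_augmented_generation` in `response_processors`. The processor names follow the OpenSearch search pipeline documentation; the model IDs, the field name, the normalization settings, and both helper functions are illustrative assumptions rather than the PR's actual example.

```python
def composed_pipeline_body(reranker_id: str, llm_model_id: str, text_field: str) -> dict:
    """Build a pipeline body doing hybrid search, reranking, then RAG."""
    return {
        "phase_results_processors": [
            {
                # Combines sub-query scores for hybrid search.
                "normalization-processor": {
                    "normalization": {"technique": "min_max"},
                    "combination": {"technique": "arithmetic_mean"},
                }
            }
        ],
        # Response processors run in list order: rerank first, so the RAG
        # step sees the reordered hits.
        "response_processors": [
            {
                "rerank": {
                    "ml_opensearch": {"model_id": reranker_id},
                    "context": {"document_fields": [text_field]},
                }
            },
            {
                "retrieval_augmented_generation": {
                    "model_id": llm_model_id,
                    "context_field_list": [text_field],
                }
            },
        ],
    }


def response_processor_order(body: dict) -> list[str]:
    """Return the response processor names in execution order."""
    return [next(iter(processor)) for processor in body["response_processors"]]


# Placeholder IDs and field name for illustration only.
pipeline = composed_pipeline_body(
    "example-reranker-id", "example-llm-id", "text_representation"
)
print(response_processor_order(pipeline))
```

Swapping the two entries in `response_processors` would hand the LLM the pre-rerank ordering, which is the situation the recommendation is meant to avoid.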
}
```
> Note: reranking can have sizeable latency - the processing required grows linearly with the number of search results. We recommend using this feature only in a high-resource environment.
Reranking can add latency to your query -
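Since the reranking cost scales with the number of hits, one way to bound the added latency is to cap `size` in the search request. The sketch below builds such a request body, passing the query text to the reranker through `ext.rerank.query_context.query_text` as described in the OpenSearch reranking documentation. The helper name, the default cap of 20, the simple `match` query, and the field name are assumptions for illustration.

```python
def reranked_search_body(query_text: str, text_field: str, max_results: int = 20) -> dict:
    """Build a search body whose hit count, and so rerank work, is capped."""
    if max_results < 1:
        raise ValueError("max_results must be at least 1")
    return {
        # Only this many hits are returned and therefore reranked.
        "size": max_results,
        "query": {"match": {text_field: query_text}},
        # Query text the rerank processor compares each document against.
        "ext": {"rerank": {"query_context": {"query_text": query_text}}},
    }


# Placeholder query and field name for illustration only.
search_body = reranked_search_body(
    "how does reranking work", "text_representation", max_results=10
)
print(search_body["size"])
```

The request would be run against a pipeline that contains the rerank processor, for example via the `search_pipeline` query parameter.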
> Note: reranking can have sizeable latency - the processing required grows linearly with the number of search results. We recommend using this feature only in a high-resource environment.
## Theoretical motivation
Put this at the top, in the introduction.
Signed-off-by: HenryL27 <hmlindeman@yahoo.com>
Please remove the heading in my comment. Otherwise LGTM
Second-stage reranking is a technique to use AI to improve search relevancy for your queries. Sycamore uses OpenSearch's [Rerank processor](https://opensearch.org/docs/latest/search-plugins/search-pipelines/rerank-processor/), contributed by Aryn, to perform this operation. For more information, visit the [OpenSearch reranking documentation](https://opensearch.org/docs/latest/search-plugins/search-relevance/reranking-search-results/) and [search pipelines documentation](https://opensearch.org/docs/latest/search-plugins/search-pipelines/index/).

## Theoretical motivation
Remove this header
Signed-off-by: HenryL27 <hmlindeman@yahoo.com>
* add documentation for reranking

Signed-off-by: HenryL27 <hmlindeman@yahoo.com>

* incorporate PR edits

Signed-off-by: HenryL27 <hmlindeman@yahoo.com>

* remove theoretical motivation header

Signed-off-by: HenryL27 <hmlindeman@yahoo.com>

---------

Signed-off-by: HenryL27 <hmlindeman@yahoo.com>
No description provided.