add documentation for reranking #261

Merged
merged 3 commits into from
Mar 1, 2024

Conversation

HenryL27
Collaborator

No description provided.

Signed-off-by: HenryL27 <hmlindeman@yahoo.com>
@HenryL27 HenryL27 requested a review from jonfritz February 26, 2024 19:31
Contributor

@jonfritz jonfritz left a comment

Good start. Here's some feedback.

@@ -0,0 +1,119 @@
# Reranking

Second-stage reranking is a technique to use AI to gain significant search relevancy (we've seen +5-10% recall in the top 5) for your queries. OpenSearch 2.12 introduces the [Rerank processor](https://opensearch.org/docs/latest/search-plugins/search-pipelines/rerank-processor/), contributed by Aryn, to perform this operation. For more information, visit the [OpenSearch reranking documentation](https://opensearch.org/docs/latest/search-plugins/search-relevance/reranking-search-results/).
Contributor

I don't think we should give percentages on improvements. Let's also remove "significant" - let's just say it can improve results, given there is a wide variance.

Contributor

Can we say "Sycamore uses OpenSearch's Rerank processor [link], contributed by Aryn..."?


## Usage

In order to use the rerank processor, you first need to register and deploy a [Text Similarity Model](https://opensearch.org/docs/latest/ml-commons-plugin/custom-local-models/#cross-encoder-models). If you're using the quickstart containers, one such model comes deployed already. You can get its id (let's call this `reranker_id`) with
Contributor

They aren't called Quickstart containers. It's just Sycamore, so instead say "Sycamore includes NAME OF MODEL, and you will need to get its unique ID to use for reranking. To get its ID..."
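
For readers following the thread without the full diff: the request the docs use to look up the model ID is not visible in this excerpt. The sketch below is one possible way to do it, not the PR's actual snippet. It assumes the ML Commons model search endpoint (`POST /_plugins/_ml/models/_search`) and filters on the `function_name` and `model_state` fields; the filter values and both helper names are illustrative assumptions.

```python
import json


def build_model_search_body(function_name="TEXT_SIMILARITY", state="DEPLOYED"):
    """Build a query body for POST /_plugins/_ml/models/_search.

    Assumption: the reranking model is registered as a text similarity
    (cross-encoder) model and is already deployed.
    """
    return {
        "query": {
            "bool": {
                "must": [
                    {"match": {"function_name": function_name}},
                    {"match": {"model_state": state}},
                ]
            }
        }
    }


def extract_model_ids(search_response):
    """Pull document IDs out of a standard OpenSearch search response."""
    hits = search_response.get("hits", {}).get("hits", [])
    return [hit["_id"] for hit in hits]


if __name__ == "__main__":
    # Print the request so it can be pasted into Dev Tools or sent with curl.
    print("POST /_plugins/_ml/models/_search")
    print(json.dumps(build_model_search_body(), indent=2))
```

One of the returned IDs would then serve as `reranker_id` in the steps that follow.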

}
```

Now, create a pipeline with the rerank processor in it:
Contributor

"create a new search pipeline (link to the docs where we talk about this)..."
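
The pipeline definition itself is cut off in this view of the diff. As a rough sketch of what a search pipeline with a rerank response processor could look like, the helper below builds the body for `PUT /_search/pipeline/<pipeline-name>`. The function name, the pipeline description, and the `text_representation` document field are assumptions for illustration, not taken from the PR.

```python
import json


def build_rerank_pipeline(reranker_id, document_fields=("text_representation",)):
    """Build a search pipeline body containing a single rerank processor.

    reranker_id: ID of a deployed text similarity (cross-encoder) model.
    document_fields: fields whose text is scored against the query
    (the default field name here is an assumption).
    """
    return {
        "description": "Rerank search results with a cross-encoder (sketch)",
        "response_processors": [
            {
                "rerank": {
                    "ml_opensearch": {"model_id": reranker_id},
                    "context": {"document_fields": list(document_fields)},
                }
            }
        ],
    }


if __name__ == "__main__":
    # "<reranker_id>" is a placeholder; substitute the real model ID.
    print(json.dumps(build_rerank_pipeline("<reranker_id>"), indent=2))
```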

@@ -0,0 +1,119 @@
# Reranking

Second-stage reranking is a technique to use AI to gain significant search relevancy (we've seen +5-10% recall in the top 5) for your queries. OpenSearch 2.12 introduces the [Rerank processor](https://opensearch.org/docs/latest/search-plugins/search-pipelines/rerank-processor/), contributed by Aryn, to perform this operation. For more information, visit the [OpenSearch reranking documentation](https://opensearch.org/docs/latest/search-plugins/search-relevance/reranking-search-results/).
Contributor

For more information, visit the OpenSearch reranking documentation and Search Pipelines documentation NEED LINK.

}
```

You can compose processors in pipelines. For example, to create a search pipeline that performs hybrid search, reranking, and RAG:
Contributor

search pipelines. For example, to create a pipeline...

}
```

The blocks of processors are ordered. We recommend reranking before doing RAG since it's important for RAG that the LLM gets the best results.
Contributor

recommend adding reranking before the RAG processor...
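
To illustrate the ordering point, here is a hedged sketch of a composed pipeline in which the rerank processor is listed before the RAG processor, so the LLM receives the reranked hits. The normalization settings, the `text_representation` field, the `tag` value, and the function name are all assumptions; the PR's actual example is not visible in this excerpt.

```python
import json


def build_hybrid_rerank_rag_pipeline(reranker_id, llm_model_id,
                                     text_field="text_representation"):
    """Compose hybrid search normalization, reranking, and RAG in one pipeline.

    Response processors run in list order, so "rerank" is placed before
    "retrieval_augmented_generation".
    """
    return {
        "description": "Hybrid search, then rerank, then RAG (sketch)",
        # Hybrid search: normalize and combine sub-query scores.
        "phase_results_processors": [
            {
                "normalization-processor": {
                    "normalization": {"technique": "min_max"},
                    "combination": {"technique": "arithmetic_mean"},
                }
            }
        ],
        "response_processors": [
            {
                "rerank": {
                    "ml_opensearch": {"model_id": reranker_id},
                    "context": {"document_fields": [text_field]},
                }
            },
            {
                "retrieval_augmented_generation": {
                    "tag": "rag-after-rerank",  # hypothetical label
                    "model_id": llm_model_id,
                    "context_field_list": [text_field],
                }
            },
        ],
    }


if __name__ == "__main__":
    # Both IDs are placeholders; substitute real model IDs.
    body = build_hybrid_rerank_rag_pipeline("<reranker_id>", "<llm_model_id>")
    print(json.dumps(body, indent=2))
```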

}
```
> Note: reranking can have sizeable latency - the processing required grows linearly with the number of search results. We recommend using this feature only in a high-resource environment.
Contributor

Reranking can add latency to your query -
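
Since the reranking cost scales with the number of results, one way to keep latency bounded is to cap `size` on the search request. The sketch below builds a request body that supplies the rerank query text through `ext.rerank.query_context.query_text`; the `match` query, the `text_representation` field, and the function name are assumptions for illustration. The body would be sent to `GET /<index>/_search?search_pipeline=<pipeline-name>`.

```python
import json


def build_reranked_search(query_text, size=10, text_field="text_representation"):
    """Build a search body for use with a rerank search pipeline.

    size caps how many hits are returned, and therefore how many
    documents the cross-encoder has to score.
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    return {
        "size": size,
        "query": {"match": {text_field: query_text}},
        "ext": {
            "rerank": {
                "query_context": {"query_text": query_text}
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_reranked_search("what is reranking?", size=5), indent=2))
```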

## Theoretical motivation
Contributor

Put this at the top, in the introduction.

Signed-off-by: HenryL27 <hmlindeman@yahoo.com>
@jonfritz jonfritz self-requested a review February 28, 2024 19:50
Contributor

@jonfritz jonfritz left a comment

Please remove the heading in my comment. Otherwise LGTM


Second-stage reranking is a technique to use AI to improve search relevancy for your queries. Sycamore uses OpenSearch's [Rerank processor](https://opensearch.org/docs/latest/search-plugins/search-pipelines/rerank-processor/), contributed by Aryn, to perform this operation. For more information, visit the [OpenSearch reranking documentation](https://opensearch.org/docs/latest/search-plugins/search-relevance/reranking-search-results/) and [search pipelines documentation](https://opensearch.org/docs/latest/search-plugins/search-pipelines/index/).

## Theoretical motivation
Contributor

Remove this header

Signed-off-by: HenryL27 <hmlindeman@yahoo.com>
@HenryL27 HenryL27 merged commit c67b16d into main Mar 1, 2024
2 checks passed
bohou-aryn pushed a commit that referenced this pull request Mar 13, 2024
* add documentation for reranking

Signed-off-by: HenryL27 <hmlindeman@yahoo.com>

* incorporate PR edits

Signed-off-by: HenryL27 <hmlindeman@yahoo.com>

* remove theoretical motivation header

Signed-off-by: HenryL27 <hmlindeman@yahoo.com>

---------

Signed-off-by: HenryL27 <hmlindeman@yahoo.com>