[Doc] Add new sets of blueprint for embedding model with no pre and post processing functions #3619

Open
mingshl opened this issue Mar 4, 2025 · 1 comment
Labels
enhancement New feature or request untriaged

Comments

@mingshl
Collaborator

mingshl commented Mar 4, 2025

Is your feature request related to a problem?
We have a set of blueprints for embedding models at https://opensearch.org/docs/latest/ml-commons-plugin/remote-models/blueprints/, and they are also available on GitHub.
Until now, these embedding model blueprints have included pre- and post-processing functions that standardize the model input and output format so that the model can be used for neural search queries.
Now that the ML inference processor lets users map the model input and output format, users can register a connector and model without pre- and post-processing functions. We want to provide a new set of blueprints for embedding models with no pre- and post-processing functions.

What solution would you like?
Write a new set of blueprints with no processing functions for embedding models in the ml-commons repo.

Update the documentation website with the following outline:

Model Blueprints for Vector Search

OpenSearch provides two approaches for implementing embedding models, depending on your needs:

Standard Blueprints (Using ML Inference Processor)

Recommended for new implementations

These blueprints use the ML inference processor to handle input/output mapping, offering:

  • Simpler implementation
  • Direct model registration
  • Flexible input/output mapping through the processor
  • [List of available models with links]
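
As a rough illustration of the standard approach, the sketch below builds two request bodies as plain Python dictionaries: a connector whose predict action has no `pre_process_function` or `post_process_function`, and an ingest pipeline whose `ml_inference` processor handles the mapping through `input_map` and `output_map`. The endpoint URL, model name, `request_body` template, field names (`passage_text`, `passage_embedding`), the `data[0].embedding` output path, and the placeholder model ID are all assumptions made for illustration; they are not taken from an existing blueprint.

```python
import json


def build_connector_body() -> dict:
    """Connector body with no pre/post processing functions (illustrative only)."""
    return {
        "name": "Example embedding connector",  # hypothetical name
        "description": "Embedding connector without processing functions",
        "version": "1",
        "protocol": "http",
        "parameters": {"model": "example-embedding-model"},  # hypothetical model
        "credential": {"api_key": "<YOUR_API_KEY>"},
        "actions": [
            {
                "action_type": "predict",
                "method": "POST",
                "url": "https://api.example.com/v1/embeddings",  # hypothetical URL
                "headers": {"Authorization": "Bearer ${credential.api_key}"},
                # Assumed template; the exact shape depends on the model API.
                "request_body": '{ "input": ${parameters.input}, "model": "${parameters.model}" }',
                # Note: no pre_process_function / post_process_function here.
            }
        ],
    }


def build_pipeline_body(model_id: str) -> dict:
    """Ingest pipeline that maps fields with the ML inference processor."""
    return {
        "description": "Generate embeddings with the ML inference processor",
        "processors": [
            {
                "ml_inference": {
                    "model_id": model_id,
                    # model input field <- document field (assumed names)
                    "input_map": [{"input": "passage_text"}],
                    # new document field <- path in the model output (assumed path)
                    "output_map": [{"passage_embedding": "data[0].embedding"}],
                }
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_connector_body(), indent=2))
    print(json.dumps(build_pipeline_body("<MODEL_ID>"), indent=2))
```

In this sketch the connector only forwards the request, and all input/output shaping lives in the pipeline definition, which is the split the standard blueprints are meant to document.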

Legacy Blueprints (With Pre/Post Processing)

For existing implementations or specific customization needs

These blueprints include pre- and post-processing functions, suitable when you need:

  • Custom preprocessing logic
  • Specific output formatting requirements
  • Compatibility with existing implementations
  • [List of available models with links]

Choosing the Right Blueprint

  1. For new implementations: Use the standard blueprints with the ML inference processor
  2. For existing systems: Continue using legacy blueprints or consider migrating
  3. For custom processing: Choose legacy blueprints if you need specific preprocessing logic
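
For the migration path mentioned in item 2, one possible starting point is to derive a standard-style connector body from a legacy one by dropping the processing functions. The helper below is a minimal sketch under that assumption: `strip_processing_functions` is a hypothetical name, the sample connector is invented for illustration, and a real migration would likely also need the `request_body` template reviewed and the input/output mapping moved into an ML inference processor.

```python
import copy

# Connector action keys that legacy blueprints use for processing functions.
PROCESSING_KEYS = ("pre_process_function", "post_process_function")


def strip_processing_functions(connector_body: dict) -> dict:
    """Return a copy of a connector body with processing functions removed.

    Hypothetical helper: it only edits the dictionary and does not call OpenSearch.
    """
    stripped = copy.deepcopy(connector_body)  # leave the caller's dict untouched
    for action in stripped.get("actions", []):
        for key in PROCESSING_KEYS:
            action.pop(key, None)
    return stripped


if __name__ == "__main__":
    # Invented legacy-style connector body, for illustration only.
    legacy = {
        "name": "Example legacy embedding connector",
        "protocol": "http",
        "actions": [
            {
                "action_type": "predict",
                "method": "POST",
                "url": "https://api.example.com/v1/embeddings",  # hypothetical URL
                "pre_process_function": "connector.pre_process.openai.embedding",
                "post_process_function": "connector.post_process.openai.embedding",
            }
        ],
    }
    print(strip_processing_functions(legacy))
```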
@mingshl mingshl added enhancement New feature or request untriaged labels Mar 4, 2025
@jngz-es jngz-es moved this to In Progress in ml-commons projects Mar 11, 2025
@jngz-es
Collaborator

jngz-es commented Mar 11, 2025

@mingshl, does this feature need code changes, or is it documentation only?

@mingshl mingshl changed the title [FEATURE] Add new sets of blueprint for embedding model with no pre and post processing functions [Doc] Add new sets of blueprint for embedding model with no pre and post processing functions Mar 14, 2025
Projects
Status: In Progress
Development

No branches or pull requests

2 participants