Commit 2daf852
feat: Adding Milvus demo to examples (#4910)
* checking in progress but this PR still is not ready yet
* feat: Adding new method to FeatureStore to allow more flexible retrieval of features from vector similarity search
* Adding requested_features back into online_store
* feat: Adding RAG demo displaying Milvus usage for RAG
* uploading sample data and updated yaml
* updating workflow
* updated example
* removing modified files
* reverting postgres change
* updating test_workflow
* updated and fixed bug
* fixing a bad merge/rebase
* updated linter because of latest install
* reverting feature store change
* adding logging of local milvus back
* Updating readme and adding notebook
* updated readme

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
1 parent aaa915a commit 2daf852

10 files changed: +1281 -33 lines changed

docs/reference/online-stores/overview.md

+18 -18
@@ -34,21 +34,21 @@ Details for each specific online store, such as how to configure it in a `featur

Below is a matrix indicating which online stores support what functionality.

The updated matrix adds a Milvus column; the existing rows are otherwise unchanged:

| | Sqlite | Redis | DynamoDB | Snowflake | Datastore | Postgres | Hbase | [[Cassandra](https://cassandra.apache.org/_/index.html) / [Astra DB](https://www.datastax.com/products/datastax-astra?utm_source=feast)] | [IKV](https://inlined.io) | Milvus |
| :-------------------------------------------------------- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :----- |
| write feature values to the online store | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| read feature values from the online store | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| update infrastructure (e.g. tables) in the online store | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| teardown infrastructure (e.g. tables) in the online store | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| generate a plan of infrastructure changes | yes | no | no | no | no | no | no | yes | no | no |
| support for on-demand transforms | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| readable by Python SDK | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| readable by Java | no | yes | no | no | no | no | no | no | no | no |
| readable by Go | yes | yes | no | no | no | no | no | no | no | no |
| support for entityless feature views | yes | yes | yes | yes | yes | yes | yes | yes | yes | no |
| support for concurrent writing to the same key | no | yes | no | no | no | no | no | no | yes | no |
| support for ttl (time to live) at retrieval | no | yes | no | no | no | no | no | no | no | no |
| support for deleting expired data | no | yes | no | no | no | no | no | no | no | no |
| collocated by feature view | yes | no | yes | yes | yes | yes | yes | yes | no | no |
| collocated by feature service | no | no | no | no | no | no | no | no | no | no |
| collocated by entity key | no | yes | no | no | no | no | no | no | yes | no |

examples/rag/README.md

+88
@@ -0,0 +1,88 @@
# 🚀 Quickstart: Retrieval-Augmented Generation (RAG) using Feast and Large Language Models (LLMs)

This project demonstrates how to use **Feast** to power a **Retrieval-Augmented Generation (RAG)** application. The RAG architecture combines retrieval of documents (using vector search) with In-Context Learning (ICL) through a **Large Language Model (LLM)** to answer user questions accurately using structured and unstructured data.

## 💡 Why Use Feast for RAG?

- **Online retrieval of features:** Ensure real-time access to precomputed document embeddings and other structured data.
- **Declarative feature definitions:** Define feature views and entities in a Python file and empower Data Scientists to easily ship scalable RAG applications with all of the existing benefits of Feast.
- **Vector search:** Leverage Feast's integration with vector databases like **Milvus** to find relevant documents based on a similarity metric (e.g., cosine).
- **Structured and unstructured context:** Retrieve both embeddings and traditional features, injecting richer context into LLM prompts.
- **Versioning and reusability:** Collaborate across teams with discoverable, versioned data pipelines.

---
## 📂 Project Structure

- **`data/`**: Contains the demo data, including Wikipedia summaries of cities with sentence embeddings stored in a Parquet file.
- **`example_repo.py`**: Defines the feature views and entity configurations for Feast.
- **`feature_store.yaml`**: Configures the offline and online stores (using local files and Milvus Lite in this demo).
- **`test_workflow.py`**: Demonstrates key Feast commands to define, retrieve, and push features.

---
## 🛠️ Setup

1. **Install the necessary packages**:

   ```bash
   pip install feast torch transformers openai
   ```

2. Initialize the feature store and apply the feature definitions:

   ```bash
   feast apply
   ```

3. Materialize features into the online store:

   ```bash
   python -c "from datetime import datetime; from feast import FeatureStore; store = FeatureStore(repo_path='.'); store.materialize_incremental(datetime.utcnow())"
   ```

4. Run a query:

   - Prepare your question:
     `question = "Which city has the largest population in New York?"`
   - Embed the question using `sentence-transformers/all-MiniLM-L6-v2`.
   - Retrieve the top K most relevant documents using Milvus vector search.
   - Pass the retrieved context to the OpenAI model for conversational output (see the end-to-end sketch after this list).
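Putting step 4 together, here is a minimal end-to-end sketch. It assumes `feast apply` has been run, the features have been materialized, and `OPENAI_API_KEY` is set in the environment; the chat model name `gpt-4o-mini` is an illustrative placeholder, not something this demo prescribes.

```python
import torch
import torch.nn.functional as F
from feast import FeatureStore
from openai import OpenAI
from transformers import AutoModel, AutoTokenizer

question = "Which city has the largest population in New York?"

# Embed the question with the same model used to embed the documents.
tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
model = AutoModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
inputs = tokenizer(question, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    out = model(**inputs)
# Mean-pool the token embeddings and L2-normalize, as in test_workflow.py.
mask = inputs["attention_mask"].unsqueeze(-1).float()
emb = (out[0] * mask).sum(1) / mask.sum(1).clamp(min=1e-9)
query = F.normalize(emb, p=2, dim=1)[0].tolist()

# Retrieve the top 3 most similar documents from the Milvus online store.
store = FeatureStore(repo_path=".")
context_data = store.retrieve_online_documents_v2(
    features=[
        "city_embeddings:vector",
        "city_embeddings:state",
        "city_embeddings:sentence_chunks",
        "city_embeddings:wiki_summary",
    ],
    query=query,
    top_k=3,
    distance_metric="COSINE",
).to_df()

# Inject the retrieved context into the LLM prompt.
context = "\n".join(context_data["wiki_summary"].tolist())
client = OpenAI()  # requires OPENAI_API_KEY in the environment
response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[
        {"role": "system", "content": f"Answer using only this context:\n{context}"},
        {"role": "user", "content": question},
    ],
)
print(response.choices[0].message.content)
```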
## 🛠️ Key Commands for Data Scientists

- Apply feature definitions:

  ```bash
  feast apply
  ```

- Materialize features to the online store (here `df` holds the feature data; see the loading sketch below):

  ```python
  store.write_to_online_store(feature_view_name='city_embeddings', df=df)
  ```
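In the snippet above, `df` is a DataFrame holding the entity key, feature, and timestamp columns. For this demo it can be loaded from the bundled Parquet file, as `test_workflow.py` does:

```python
import pandas as pd

# Columns include item_id, vector, state, sentence_chunks, wiki_summary, event_timestamp.
df = pd.read_parquet("./data/city_wikipedia_summaries_with_embeddings.parquet")
```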
- Inspect retrieved features using Python (here `query` is the embedded question vector, as produced in the Setup sketch above):

  ```python
  context_data = store.retrieve_online_documents_v2(
      features=[
          "city_embeddings:vector",
          "city_embeddings:item_id",
          "city_embeddings:state",
          "city_embeddings:sentence_chunks",
          "city_embeddings:wiki_summary",
      ],
      query=query,
      top_k=3,
      distance_metric='COSINE',
  ).to_df()
  display(context_data)
  ```
## 📊 Example Output

When querying: *Which city has the largest population in New York?*

The model provides:

```
The largest city in New York is New York City, often referred to as NYC. It is the most populous city in the United States, with an estimated population of 8,335,897 in 2022.
```

examples/rag/__init__.py

Whitespace-only changes.

examples/rag/feature_repo/__init__.py

Whitespace-only changes.
examples/rag/feature_repo/data/city_wikipedia_summaries_with_embeddings.parquet

Binary file not shown.
examples/rag/feature_repo/example_repo.py

+42

@@ -0,0 +1,42 @@

```python
from datetime import timedelta

from feast import (
    FeatureView,
    Field,
    FileSource,
)
from feast.data_format import ParquetFormat
from feast.types import Float32, Array, String, ValueType
from feast import Entity

# Entity used as the key for the embedded documents.
item = Entity(
    name="item_id",
    description="Item ID",
    value_type=ValueType.INT64,
)

parquet_file_path = "./data/city_wikipedia_summaries_with_embeddings.parquet"

source = FileSource(
    file_format=ParquetFormat(),
    path=parquet_file_path,
    timestamp_field="event_timestamp",
)

city_embeddings_feature_view = FeatureView(
    name="city_embeddings",
    entities=[item],
    schema=[
        Field(
            name="vector",
            dtype=Array(Float32),
            # Index this field for vector similarity search in Milvus.
            vector_index=True,
            vector_search_metric="COSINE",
        ),
        Field(name="state", dtype=String),
        Field(name="sentence_chunks", dtype=String),
        Field(name="wiki_summary", dtype=String),
    ],
    source=source,
    ttl=timedelta(hours=2),
)
```
examples/rag/feature_repo/feature_store.yaml

+17

@@ -0,0 +1,17 @@

```yaml
project: rag
provider: local
registry: data/registry.db
online_store:
  type: milvus
  path: data/online_store.db
  vector_enabled: true
  embedding_dim: 384
  index_type: "IVF_FLAT"

offline_store:
  type: file
entity_key_serialization_version: 3
# By default, no_auth is used for authentication and authorization; other possible values are kubernetes and oidc. Refer to the documentation for more details.
auth:
  type: no_auth
```
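Note that `embedding_dim` must match the output dimensionality of the embedding model. A quick sanity-check sketch for the model this demo uses:

```python
from transformers import AutoConfig

# all-MiniLM-L6-v2 produces 384-dimensional sentence embeddings,
# which must agree with embedding_dim: 384 in feature_store.yaml.
cfg = AutoConfig.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
assert cfg.hidden_size == 384
```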
examples/rag/feature_repo/test_workflow.py

+74

@@ -0,0 +1,74 @@

```python
import pandas as pd
import torch
import torch.nn.functional as F
from feast import FeatureStore
from transformers import AutoTokenizer, AutoModel
from example_repo import city_embeddings_feature_view, item

TOKENIZER = "sentence-transformers/all-MiniLM-L6-v2"
MODEL = "sentence-transformers/all-MiniLM-L6-v2"


def mean_pooling(model_output, attention_mask):
    # First element of model_output contains all token embeddings
    token_embeddings = model_output[0]
    input_mask_expanded = (
        attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    )
    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(
        input_mask_expanded.sum(1), min=1e-9
    )


def run_model(sentences, tokenizer, model):
    encoded_input = tokenizer(
        sentences, padding=True, truncation=True, return_tensors="pt"
    )
    # Compute token embeddings
    with torch.no_grad():
        model_output = model(**encoded_input)

    sentence_embeddings = mean_pooling(model_output, encoded_input["attention_mask"])
    sentence_embeddings = F.normalize(sentence_embeddings, p=2, dim=1)
    return sentence_embeddings


def run_demo():
    store = FeatureStore(repo_path=".")
    df = pd.read_parquet("./data/city_wikipedia_summaries_with_embeddings.parquet")
    embedding_length = len(df["vector"][0])
    print(f"embedding length = {embedding_length}")

    # Register the entity and feature view, then load a few rows into Milvus.
    store.apply([city_embeddings_feature_view, item])
    fields = (
        [f.name for f in city_embeddings_feature_view.features]
        + city_embeddings_feature_view.entities
        + [city_embeddings_feature_view.batch_source.timestamp_field]
    )
    print("\ndata=")
    print(df[fields].head().T)
    store.write_to_online_store("city_embeddings", df[fields][0:3])

    # Embed a question with the same model used for the documents.
    question = "the most populous city in the state of New York is New York"
    tokenizer = AutoTokenizer.from_pretrained(TOKENIZER)
    model = AutoModel.from_pretrained(MODEL)
    query_embedding = run_model(question, tokenizer, model)
    query = query_embedding.detach().cpu().numpy().tolist()[0]

    # Retrieve top k documents
    features = store.retrieve_online_documents_v2(
        features=[
            "city_embeddings:vector",
            "city_embeddings:item_id",
            "city_embeddings:state",
            "city_embeddings:sentence_chunks",
            "city_embeddings:wiki_summary",
        ],
        query=query,
        top_k=3,
    )
    print("features =")
    print(features.to_df())
    store.teardown()


if __name__ == "__main__":
    run_demo()
```
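The script applies the definitions, writes a few rows to Milvus, runs a vector search, and tears the store down again, so it can be run end to end from the feature repo directory (a usage sketch, assuming the packages from the README are installed):

```bash
python test_workflow.py
```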
