_episodes/06-lsa.md
+5 -3
@@ -279,6 +279,8 @@ We don't know *why* they are getting arranged this way, since we don't know what
Let's write a helper to get the strongest words for each topic. This will show the terms with the *highest* and *lowest* association with a topic. In LSA, each topic is a spectrum of subject matter, from the kinds of terms on the low end to the kinds of terms on the high end. So, inspecting the *contrast* between these high and low terms (and checking that against our domain knowledge) can help us interpret what our model is identifying.

```python
+import pandas as pd
+
def show_topics(topic, n):
    # Get the feature names (terms) from the vectorizer
    terms = vectorizer.get_feature_names_out()
@@ -287,7 +289,7 @@ def show_topics(topic, n):
    weights = svdmodel.components_[topic]

    # Create a DataFrame with terms and their corresponding weights
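
The hunks above only show a fragment of `show_topics`. Below is a minimal, self-contained sketch of the kind of helper this diff is editing, so the idea can be run in isolation. The toy corpus, the `TfidfVectorizer`/`TruncatedSVD` setup, and the use of `nlargest`/`nsmallest` to pull the two ends of each topic's spectrum are illustrative assumptions, not the episode's exact code.

```python
import pandas as pd
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy corpus standing in for the lesson's corpus (an assumption for illustration).
docs = [
    "whales swim in the deep ocean",
    "ships sail across the ocean",
    "dogs and cats are common pets",
    "cats chase mice around the house",
]

# Build a TF-IDF matrix and fit an LSA (truncated SVD) model over it.
vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(docs)
svdmodel = TruncatedSVD(n_components=2)
svdmodel.fit(tfidf)

def show_topics(topic, n):
    # Get the feature names (terms) from the vectorizer
    terms = vectorizer.get_feature_names_out()

    # Get each term's weight for the requested topic
    weights = svdmodel.components_[topic]

    # Pair terms with weights, then return the n highest- and n lowest-weighted
    # terms, i.e. the two ends of this topic's spectrum
    df = pd.DataFrame({"term": terms, "weight": weights})
    return pd.concat([df.nlargest(n, "weight"), df.nsmallest(n, "weight")])

print(show_topics(0, 3))
```

Calling `show_topics(0, 3)` lists the three highest- and three lowest-weighted terms for the first topic; contrasting those two ends is what lets us interpret what the model has picked up.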