Let's write a helper to get the strongest words for each topic. This will show the top and bottom weighted terms for each topic.

``` python
def show_topics(topic, n):
    # Get the feature names (terms) from the vectorizer
    terms = vectorizer.get_feature_names_out()

    # Get the weights of the terms for the specified topic from the SVD model
    weights = svdmodel.components_[topic]

    # Create a DataFrame with terms and their corresponding weights
    df = pandas.DataFrame({"Term": terms, "Weight": weights})

    # Sort by weight in descending order and take the first n rows: the top terms
    tops = df.sort_values(by=["Weight"], ascending=False)[0:n]

    # Take the last n rows of the same descending sort: the most negative terms
    bottoms = df.sort_values(by=["Weight"], ascending=False)[-n:]

    # Concatenate top and bottom terms into a single DataFrame and return
    return pandas.concat([tops, bottoms])

# Get the top 5 and bottom 5 terms for each specified topic
topic_words_x = show_topics(1, 5)  # Topic 1
topic_words_y = show_topics(2, 5)  # Topic 2
```
You can also use a helper we prepared for learners: