A Data-Driven Exploration of My Reading Journey
Introduction
The Power of Digital Books
Topic Modeling — A Pathway to Uncovering Book Genres
Harnessing ChatGPT to Unveil the Essence of Book Topics
Conclusion — The Enduring Impact of Reading on Personal Growth

Quotes from the literary realm, sparking thought and reflection.
Vivid Word Galaxy: a word cloud capturing the essence and themes of countless literary adventures, as seen through the eyes of a digital reader.
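import en_core_web_md
import pyLDAvis
import pyLDAvis.gensim_models
from gensim.corpora import Dictionary
from gensim.models import LdaMulticore

nlp = en_core_web_md.load()

# Part-of-speech tags I would like to remove from the text
removal = ['ADV', 'PRON', 'CCONJ', 'PUNCT',
           'PART', 'DET', 'ADP', 'SPACE', 'NUM', 'SYM']

# Lemmatize each highlight; keep alphabetic, non-stop-word tokens longer than two characters
tokens = []
for highlight in nlp.pipe(df_highlights['Highlights']):
    proj_tok = [
        token.lemma_.lower()
        for token in highlight
        if token.pos_ not in removal and not token.is_stop
        and token.is_alpha and len(token) > 2
    ]
    tokens.append(proj_tok)

# get_cleaned_string is a helper defined elsewhere (not shown here)
tokens_concatenated = [' '.join(doc) for doc in tokens]
tokens_cleaned = [get_cleaned_string(text) for text in tokens_concatenated]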

dictionary = Dictionary(tokens)

dictionary.filter_extremes(no_below=5, no_above=0.5, keep_n=1000)

corpus = [dictionary.doc2bow(doc) for doc in tokens]

# Optimal model
topics_count = 15
lda_model = LdaMulticore(corpus=corpus, id2word=dictionary, iterations=100, num_topics=topics_count, workers=4, passes=100)

# Print topics
lda_model.print_topics(-1)

# Visualize topics
lda_display = pyLDAvis.gensim_models.prepare(lda_model, corpus, dictionary, R=10)
pyLDAvis.display(lda_display)

# Save the report
pyLDAvis.save_html(lda_display, f'data/generated_html/index_{topics_count}.html')
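
To make the dictionary step above less of a black box, here is a minimal pure-Python sketch of roughly what `filter_extremes` and `doc2bow` do. It is not gensim's implementation: `build_vocabulary`, `to_bow`, and the toy `docs` list are hypothetical names and data made up for illustration, and the token ids it assigns will not match the ones gensim produces.

```python
from collections import Counter


def build_vocabulary(docs, no_below=5, no_above=0.5, keep_n=1000):
    """Return {token: id} for tokens that survive document-frequency filtering."""
    # Document frequency: in how many documents does each token appear at least once?
    doc_freq = Counter(token for doc in docs for token in set(doc))
    max_docs = no_above * len(docs)

    # Drop tokens that are too rare (< no_below docs) or too common (> no_above share of docs).
    kept = [tok for tok, df in doc_freq.items() if no_below <= df <= max_docs]

    # Of the survivors, keep only the keep_n tokens found in the most documents.
    kept.sort(key=lambda tok: (-doc_freq[tok], tok))
    kept = kept[:keep_n]

    # Assign ids alphabetically (an arbitrary choice made for this sketch).
    return {tok: idx for idx, tok in enumerate(sorted(kept))}


def to_bow(doc, vocabulary):
    """Convert one tokenized document into sorted (token_id, count) pairs."""
    counts = Counter(tok for tok in doc if tok in vocabulary)
    return sorted((vocabulary[tok], n) for tok, n in counts.items())


# Toy stand-in for the lemmatized highlights (invented data).
docs = [
    ["habit", "change", "habit"],
    ["habit", "mind"],
    ["mind", "change"],
    ["story", "mind"],
]

vocab = build_vocabulary(docs, no_below=2, no_above=0.5, keep_n=10)
bows = [to_bow(doc, vocab) for doc in docs]

print(vocab)  # {'change': 0, 'habit': 1}
print(bows)   # [[(0, 1), (1, 2)], [(1, 1)], [(0, 1)], []]
```

In this toy corpus "mind" is dropped because it appears in three of four documents (more than the 50% ceiling), and "story" is dropped because it appears in only one. The last document ends up with an empty bag of words, which is worth keeping in mind: with aggressive filtering, short highlights can contribute nothing to the topic model.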

Discovering Expected and Unexpected Themes

Exploring Evolving Interests

Uncovering Core Values and Beliefs

Reflecting on Reading Patterns and Personal Growth
