Know your doctor: Topic modeling and sentiment analysis based approach to review doctor
Kavya Krishna K V (91616009)
Know your doctor: Topic modeling and sentiment analysis based approach to review doctor - MSC MI 2016-2018
Nowadays people tend to search for doctors through business review websites, and they
naturally opt for those with the highest ratings and a large number of reviews that
support those ratings. The best-rated doctors may have hundreds or even thousands of
reviews under their profiles, so comparing one highly rated option with every
alternative becomes a tedious task. Furthermore, even if there is only one highly
rated doctor, one may still want to read the reviews to see why people like this
doctor and whether the reviewers addressed one's own concerns. This, again, can be
time-consuming. In both cases, some sort of review summarizer would be helpful.
Web services such as Zocdoc and Yelp have offered their own version of “doctor
reviews” to help users quickly see what other reviewers have said about doctors.
Zocdoc rates doctors based on three categories: “overall rating,” “bedside manner,”
and “wait time.” However, these three categories do not capture other useful points
that users raise in their reviews. Yelp automatically highlights representative review
sentences that share common phrases with other sentences, but no explicit rating is
given for the topics mentioned in those sentences.
This project aimed to build a tool that combines the best of both products. Know
Your Doctor first detects the topics discussed in the reviews (e.g. bedside manner).
It then analyzes whether people talk positively or negatively about those topics, and
finally assigns appropriate ratings to the topics. The summarizer analyzes public
review data in two stages. Topic modeling uses Latent Dirichlet Allocation (LDA), a
standard Natural Language Processing (NLP) technique that discovers topics in a
corpus. Sentiment analysis, the computational study of people's opinions, attitudes
and emotions expressed in a review, is based on word2vec. Word2vec is a two-layer
neural network that maps the words of a text corpus to a set of feature vectors. The
reviews, taken from Yelp, an online rating website, cover doctors across San
Francisco. As a result of this study, a snapshot is created for each doctor that
contains the most dominant topics in their reviews and the overall sentiment
expressed about each topic.
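The final step described above, turning topic and sentiment labels into a per-doctor snapshot, can be illustrated with a minimal sketch. It is not the project's implementation: it assumes each review sentence has already been assigned a topic label (standing in for the LDA output) and a sentiment score in [-1, 1] (standing in for the word2vec-based classifier). The function names, the sentence-level granularity, the example data, and the linear mapping onto a 1-5 rating scale are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def build_snapshot(labelled_sentences, top_n=3):
    """Summarize one doctor's reviews as (topic, mentions, mean sentiment) rows.

    labelled_sentences: iterable of (topic, sentiment) pairs, where topic is a
    label from a topic model and sentiment is a score in [-1.0, 1.0].
    Both inputs are assumed to come from earlier pipeline stages.
    """
    scores_by_topic = defaultdict(list)
    for topic, sentiment in labelled_sentences:
        scores_by_topic[topic].append(sentiment)

    rows = [
        (topic, len(scores), sum(scores) / len(scores))
        for topic, scores in scores_by_topic.items()
    ]
    # Most-discussed topics first; ties broken alphabetically for stable output.
    rows.sort(key=lambda row: (-row[1], row[0]))
    return rows[:top_n]


def to_star_rating(mean_sentiment):
    """Map a mean sentiment in [-1, 1] linearly onto a 1-5 scale (assumed scale)."""
    return round(3 + 2 * mean_sentiment, 1)


# Hypothetical labelled sentences for a single doctor.
example = [
    ("bedside manner", 1.0),
    ("bedside manner", 0.5),
    ("wait time", -1.0),
    ("bedside manner", 0.0),
    ("wait time", -0.5),
    ("billing", 0.5),
]

snapshot = build_snapshot(example, top_n=2)
for topic, mentions, mean in snapshot:
    print(f"{topic}: {mentions} mentions, rating {to_star_rating(mean)}")
```

Ranking topics by mention count is one plausible reading of "most dominant"; weighting by the topic probabilities that LDA assigns would be an equally reasonable alternative.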
LATENT DIRICHLET ALLOCATION
NATURAL LANGUAGE PROCESSING
WORD2VEC