Authorship recognition for short texts
Material type: Text
Dissertation note: M.Phil. Computer Science 2014-2015 INT
Summary:
Most research on author identification considers large texts. Little research has been done on author identification for short texts, even though short texts have become common since the rise of digital media. The anonymous nature of internet applications makes it possible to use the internet for illegitimate purposes.
Authorship recognition is a technique used to identify the author of an unclaimed document, or to resolve cases in which more than one author claims a document. Authorship recognition has great potential for applications in computer forensics. The goal of this study is to identify the author of an anonymous text given text samples from a few authors, assuming that the anonymous text was written by one of the authors of the known samples. A set of documents with known authorship is used for training, and the aim is to automatically determine the corresponding author of an anonymous text. In recent years, practical applications for authorship attribution have grown in areas such as intelligence (linking intercepted messages to each other and to known terrorists), criminal law (identifying writers of threatening mail), civil law (copyright), and computer security (tracking authors of computer virus source code).
In this study we propose an algorithm for authorship recognition. We performed experiments using the basic-9 and writeprints feature sets to determine authorship. A support vector machine (SVM), a commonly used algorithm for author recognition, was used as the classification method; in this study SMO (sequential minimal optimization) is used for classification. Results revealed that the writeprints feature set gives better results than the basic-9 feature set.
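The general workflow described in the summary (extract stylistic features from texts of known authorship, then assign an anonymous text to the closest known author) can be illustrated with a minimal sketch. It rests on several assumptions: the five features below are hypothetical stand-ins and are not the basic-9 or writeprints feature sets, and a simple nearest-centroid rule replaces the SVM/SMO classifier used in the study so that the example needs only the Python standard library. The function names are illustrative only.

```python
import math
import re
from collections import defaultdict


def stylometric_features(text):
    """Return a small, hypothetical feature vector for one text.

    These features are illustrative stand-ins, not the basic-9 or
    writeprints feature sets named in the summary.
    """
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = max(len(words), 1)
    n_chars = max(len(text), 1)
    return [
        sum(len(w) for w in words) / n_words,       # average word length
        len({w.lower() for w in words}) / n_words,  # type-token ratio
        len(words) / max(len(sentences), 1),        # words per sentence
        sum(c in ",;:" for c in text) / n_chars,    # punctuation rate
        sum(c.isupper() for c in text) / n_chars,   # uppercase rate
    ]


def train_centroids(samples):
    """Map each author to the mean feature vector of their known texts.

    samples is a list of (author, text) pairs with known authorship.
    """
    grouped = defaultdict(list)
    for author, text in samples:
        grouped[author].append(stylometric_features(text))
    return {
        author: [sum(column) / len(vectors) for column in zip(*vectors)]
        for author, vectors in grouped.items()
    }


def predict_author(centroids, text):
    """Assign the text to the author with the nearest centroid.

    Assumption: nearest-centroid matching stands in for the SVM/SMO
    classifier; features are left unscaled for brevity.
    """
    vector = stylometric_features(text)
    return min(centroids, key=lambda author: math.dist(vector, centroids[author]))


if __name__ == "__main__":
    known = [
        ("A", "I go. You run. We win."),
        ("A", "He ran. She sat. It hit."),
        ("B", "Nevertheless, the extraordinary circumstances surrounding the "
              "investigation required considerable deliberation; consequently, "
              "proceedings were postponed."),
    ]
    model = train_centroids(known)
    print(predict_author(model, "We sit. I nap."))
```

In practice the features would need scaling before classification, because unscaled values such as words per sentence dominate the distance. A margin-based classifier such as an SVM would take the place of the nearest-centroid rule.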
| Item type | Current library | Call number | Status | Date due | Barcode |
|---|---|---|---|---|---|
| Project Reports | Kerala University of Digital Sciences, Innovation and Technology Knowledge Centre | | Not for loan | | R-616 |
Mini Project Report, M.Phil. CS
M.Phil. Computer Science 2014-2015 INT Sabu M Thampi