WebLog Analyzer Using Big Data Technology

By: Meraj Uddin; Rakesh Kumar R G. Material type: Text. Dissertation note: Master of Science in Computer Science & Information Security, 2013-2015 EXT, "Mirox Cyber Security & Technology Pvt Ltd". Summary: In today's Internet world, log file analysis has become a necessary task: analyzing customer behavior helps improve advertising and sales, and in domains such as environmental, medical, and banking systems it is equally important to analyze log data to extract the required knowledge from it. Web mining is the process of discovering knowledge from web data. Log files are generated very fast, at a rate of 1-10 MB/s per machine, so a single data center can produce tens of terabytes of log data in a day. Analyzing such large datasets requires a parallel processing system and a reliable data storage mechanism. A virtual database system is an effective solution for integrating data, but it becomes inefficient for large datasets. The Hadoop framework provides reliable data storage through the Hadoop Distributed File System (HDFS) and parallel processing of large datasets through the MapReduce programming model. HDFS splits the input data and distributes blocks of it across the machines in a Hadoop cluster. This mechanism allows the log data to be processed in parallel on all the machines in the cluster, so results are computed efficiently. The overall objective of this project is to analyze the system logs of an internal organization. Log files contain a list of activities, each a response to a request made to the system server or a hosted application; these log files may reside on the same server. Each individual request is listed on a separate line of the log file, called a log entry. The purpose of a log file is to keep track of what is going on with the system server or application. Analyzing these log files can yield insights that help in understanding traffic patterns, user activity, security breaches, users' interests, and more.
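The map-and-reduce pattern described above can be sketched in plain Python. This is a minimal illustration, not the project's actual implementation: the Apache-style log line format, the field layout, and the choice of counting requests per HTTP status code are all assumptions made for the example. On a real Hadoop cluster the same map and reduce functions would run across machines (e.g. via Hadoop Streaming) over blocks that HDFS has distributed.

```python
import re
from collections import defaultdict

# Hypothetical Apache-style access-log entries; the field layout is an
# assumption for illustration, not the format used by the project.
LOG_LINES = [
    '10.0.0.1 - - [12/Mar/2015:10:15:32 +0530] "GET /index.html HTTP/1.1" 200 1043',
    '10.0.0.2 - - [12/Mar/2015:10:15:40 +0530] "GET /missing HTTP/1.1" 404 209',
    '10.0.0.1 - - [12/Mar/2015:10:16:01 +0530] "POST /login HTTP/1.1" 200 512',
]

# host, two ignored fields, [timestamp], "method path protocol", status, size
LOG_PATTERN = re.compile(r'^(\S+) \S+ \S+ \[.*?\] "(\S+) (\S+) \S+" (\d{3}) \d+$')

def map_phase(line):
    """Map step: emit a (status_code, 1) pair for each parsable log entry."""
    match = LOG_PATTERN.match(line)
    if match:
        host, method, path, status = match.groups()
        yield (status, 1)

def reduce_phase(pairs):
    """Reduce step: sum the counts emitted for each status code."""
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return dict(counts)

# In Hadoop, the framework shuffles mapper output to reducers by key;
# here we simply collect all pairs and reduce them in one place.
mapped = [pair for line in LOG_LINES for pair in map_phase(line)]
print(reduce_phase(mapped))  # {'200': 2, '404': 1}
```

The same two-function structure scales to other log-analysis questions (requests per host, hits per URL) by changing the key emitted in the map step.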
Holdings
Item type: Project Reports | Current library: Kerala University of Digital Sciences, Innovation and Technology Knowledge Centre | Status: Not for loan | Barcode: R-922
Item type: Project Reports | Current library: Kerala University of Digital Sciences, Innovation and Technology Knowledge Centre | Status: Not for loan | Barcode: R-699

Master of Science in Computer Science & Information Security, 2013-2015 EXT. Meraj Uddin, Rakesh Kumar R G. "Mirox Cyber Security & Technology Pvt Ltd"

