CAP 6777 Web Mining

Description:

This course teaches students basic techniques for mining the Web and information networks (including social networks and social media). Topics fall into three areas: (1) Web crawling, indexing, ranking, and search algorithms using content and link analysis; (2) Web clustering, classification, and mining algorithms; and (3) social network analysis and online social media mining. Students will also gain experience through a course project on one of the topics covered in class (such as building a search engine or an online Twitter sentiment analysis tool).

 

Textbooks (Reference):

http://nlp.stanford.edu/IR-book/

http://www.cs.uic.edu/~liub/WebMiningBook.html

 

Location:

Class location and time: FL 401, Mondays, 4:00PM – 6:50PM (distance learning available)

Goal: The goal of this class is for students to gain hands-on experience with information search and web mining. The course covers techniques used to collect, analyze, and understand data from the Internet and the Web (including social networks). By the end of the class, students should understand the whole process of collecting information from the Web and be able to design systems for searching and mining it. We will use online web documents (such as Twitter data) as a testbed for practicing web mining techniques.

Prerequisites: STA 4821 or equivalent

 

Tentative Topics:

  1. Information Retrieval
  2. Web Mining Algorithms
  3. Social Network Analysis

Reading List

Grading Policy:

Homework: 40%

Mid Term Exam: 15%

Course Projects (term project): 20%

Student Presentation (research paper presentation): 10%

Final Exam (or a research report): 15%

 

Grading Scale:

90 and above A

85-89 A-

76-84 B+

70-75 B

66-69 C+

60-65 C

50-59 D

49 and below F
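
The grading policy and scale above can be combined into a short calculation. The following is only an illustrative sketch, not an official grade calculator: it assumes every component is scored from 0 to 100, and it assumes the weighted total is rounded to the nearest whole point before the letter is looked up (the syllabus does not say how fractional totals such as 89.5 are handled). The names WEIGHTS, SCALE, weighted_total, and letter_grade are made up for this example.

```python
# Weights taken from the Grading Policy section (they add up to 100%).
WEIGHTS = {
    "homework": 0.40,
    "midterm": 0.15,
    "project": 0.20,
    "presentation": 0.10,
    "final": 0.15,
}

# Lower bound of each band in the Grading Scale section, highest first.
SCALE = [(90, "A"), (85, "A-"), (76, "B+"), (70, "B"),
         (66, "C+"), (60, "C"), (50, "D")]


def weighted_total(scores):
    """Combine per-component scores (each assumed to be 0-100) into one total."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def letter_grade(total):
    """Map a total to a letter grade.

    Assumption (not stated in the syllabus): the total is rounded half up
    to a whole point before it is compared against the bands.
    """
    points = int(total + 0.5)
    for lower_bound, letter in SCALE:
        if points >= lower_bound:
            return letter
    return "F"  # 49 and below


# Hypothetical scores, for illustration only.
example = {"homework": 90, "midterm": 80, "project": 85,
           "presentation": 95, "final": 70}
print(letter_grade(weighted_total(example)))  # A-
```

With these hypothetical scores the weighted total is 36 + 12 + 17 + 9.5 + 10.5 = 85 points, which falls in the 85-89 band.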