IS 4700 / CS 5750 - Social Computing
General Information
Professors: | Christo Wilson and Alan Mislove |
Room: | West Village H 108 |
Time: | Mondays and Wednesdays, 2:50-4:30pm |
Office Hours: | TBA |
Teaching Assistant: | Arash Molavi Kakhki |
TA Email: | arash@ccs.neu.edu |
Lab Hours: | TBA |
Class Forum: | On Piazza |
Paper List: | Here |
Syllabus: | Here |
Course Project Description: | Here |
Course Description
Recently, online social networking sites have exploded in popularity. Numerous sites are dedicated to finding and maintaining contacts and to locating and sharing different types of content. Online social networks represent a new kind of information network that differs significantly from existing networks like the Web. For example, in the Web, hyperlinks between content form a graph that is used to organize, navigate, and rank information. The properties of the Web graph have been studied extensively, and have led to useful algorithms such as PageRank. In contrast, few links exist between content in online social networks; instead, the links exist between content and users, and between users themselves.
The resulting graph is used to connect and to communicate. Unlike previous networks, graphs in online social networks intermingle people and content, allowing system designers to relate the reputation of content to the reputation of users, and vice versa. This opens the door to new types of systems, new ways of solving longstanding problems, and new security attacks and vulnerabilities.
This course provides a detailed look at popular social information systems, including online social networks (Facebook, MySpace, Orkut), blogging and microblogging platforms (LiveJournal, Blogger, Twitter), social recommendation engines (Digg, Reddit, last.fm), collaborative organization (Wikipedia), and content sharing sites (Flickr, YouTube). Coursework includes studying models (both formal and sociological) of social information systems and applying them, both in theory and by analyzing real data from social network interactions.
The graduate version of this course places greater emphasis on the computing infrastructure that underlies these emerging systems. It focuses on building scalable systems for managing and manipulating large amounts of data, on ensuring privacy for users, on designing and using interfaces for third-party applications, and on leveraging the mobile nature of the access mechanisms that many users rely on. A course project of the student's choosing will be expected.
Logistics
The class will meet twice per week for 100-minute sessions (2:50-4:30pm). As this is a discussion-based class, you are required to attend and participate.
Prerequisites
The CS5750 version of this course is intended for Computer Science Master's and Ph.D. students; the IS4700 version is intended for Computer Science or Information Science B.S. students. We expect you to understand the basics of computer systems, and to have experience implementing non-trivial systems-and-networking-type projects. You should also be able to read unix manual pages, and be able to familiarize yourself with unix utilities.
This course will be partially project-centric, and all students will complete projects in groups of two (or possibly three, if necessary). Thus, to succeed in this course, you must be able to work in a group. We will allow you to form your own groups, and the course staff will serve as a matching service if necessary. As you are free to choose your partner(s), we will not be sympathetic to complaints at the end of the semester about how your group-mates did not do any work.
Finally, to succeed in this course, you should have some experience programming with unix development tools, and you should be willing to learn how to use online APIs (e.g., Facebook, Twitter). It is also highly recommended that you become familiar with using a debugger, as this will greatly aid you in completing the projects. At a high level, you should be motivated, eager to learn, willing to work hard, and willing to make up, on your own, any prerequisite deficiencies you may have.
Class Forum
The class forum is on Piazza. Why Piazza? Because they have a nice web interface, as well as iPhone and Android apps. Piazza is the best place to ask questions about projects, programming, debugging issues, exams, etc. In order to keep things organized, please tag all posts with the appropriate hashtags, e.g. #lecture1, #project3, etc. We will also use Piazza to broadcast announcements to the class. Bottom line: unless you have a private problem, post to Piazza before emailing the professors and TAs.
Schedule, Lecture Slides, and Assigned Readings
# | Date | Topic | Readings | Summary # | Comments |
---|---|---|---|---|---|
1 | Sept. 4 | Intro, Basic Graph Theory | [9] - Ch. 2 | Join Piazza | |
2 | Sept. 9 | Graph Types and Models | [3], [39] | summary02 | |
3 | Sept. 11 | Sociology Background | [9] - Ch. 3-3.2, [11], [22] | summary03 | |
4 | Sept. 16 | Graph Sampling and Representation | [23], [19] | summary04 | |
5 | Sept. 18 | Interactions Over Networks | [40], [15], [20] | summary05 | |
6 | Sept. 23 | Information Dissemination and Influence | [6], [7] | summary06 | |
7 | Sept. 25 | Ranking Nodes | [32], [16], [10] | summary07 | Hw. 1 Due |
8 | Sept. 30 | Graphs Over Time | [43], [41] | summary08 | Instructor Meeting Deadline |
9 | Oct. 2 | Link Prediction | [18], [2] | summary09 | |
10 | Oct. 7 | Community Detection | [27], [4], [1] | summary10 | |
11 | Oct. 9 | Privacy | [17], [24] | summary11 | Project Proposal Due |
| | Oct. 14 | No Class - Columbus Day | | | |
12 | Oct. 16 | Anonymization and Deanonymization | [26], [34], [8] | summary12 | Hw. 2 Due |
13 | Oct. 21 | Spam | [12], [35] | summary13 | |
14 | Oct. 23 | Sybil Attacks | [37], [36] | summary14 | |
15 | Oct. 28 | Geo-social Networks | [29], [28] | summary15 | |
16 | Oct. 30 | Content Rating Systems | [25], [31] | summary16 | |
17 | Nov. 4 | Network Economics | [5], [33] | summary17 | |
18 | Nov. 6 | Human Mobility | [30], [38] | summary18 | Interim Report Due |
| | Nov. 11 | No Class - Veterans Day | | | |
19 | Nov. 13 | New Approaches to Services | [42], [13] | summary19 | Hw. 3 Due |
20 | Nov. 18 | Personalization and Price Discrimination | [14], [21] | summary20 | |
21 | Nov. 20 | No Class - Work on Your Projects! | |||
22 | Nov. 25 | No Class - Work on Your Projects! | |||
| | Nov. 27 | No Class - Thanksgiving | | | |
23 | Dec. 2 | Student Presentations | Sarah & Vatsa | Xiaofeng, Zhuoli & Qizhen | Amitash & Rushabh |
24 | Dec. 4 | Student Presentations | Nick & Moon | Prachi, James & Ryan | |
| | TBA | Final Report Due | | | |
Textbook
The recommended (but not required) textbooks for the course are:
- Easley, David and Kleinberg, Jon. Networks, Crowds, and Markets: Reasoning About a Highly Connected World. Cambridge University Press, Cambridge, UK. 2010.
- Watts, Duncan J. Six Degrees: The Science of a Connected Age. W. W. Norton & Company, 2009.
- Shirky, Clay. Here Comes Everybody: The Power of Organizing Without Organizations. Penguin Press, 2008.
Presentations and Summaries
This course will focus on reading and discussing research papers. As such, you will be required to give an approximately 20-minute presentation on one or more research papers and lead the class discussion on them. We expect high-quality presentations; you should expect to spend 8 to 10 hours preparing your presentation. In addition, you should come to class with a list of questions that will spark an interesting discussion.
You should also read all of the assigned papers before the class to ensure that you arrive prepared to take part in the discussion. To ensure that you have read the papers, you are required to submit a summary of approximately 500 words covering all of the assigned papers (500 words in aggregate, not per paper) by midnight before the class. This submission must be in ASCII text format (no Word, PDF, etc.). In aggregate, these summaries will count for 10% of your grade.
Summaries are due at 11:59:59pm on the day before the lecture. Summaries will not be accepted late, and slip days cannot be used on summaries.
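Before submitting, it may help to confirm that a summary really is plain ASCII and roughly the expected length. The following is only a minimal sketch: the helper name `check_summary` is hypothetical, it is not part of the course's submission scripts, and counting whitespace-separated tokens is just one reasonable way to approximate a word count.

```python
def check_summary(data: bytes) -> tuple[bool, int]:
    """Return (is_ascii, word_count) for the raw bytes of a summary file.

    Hypothetical helper for self-checking; the course scripts do not use it.
    """
    # bytes.isascii() is True only if every byte is in the range 0-127.
    is_ascii = data.isascii()
    # bytes.split() with no arguments splits on runs of ASCII whitespace.
    word_count = len(data.split())
    return is_ascii, word_count


# Example usage (the file name is illustrative):
#   from pathlib import Path
#   ok, words = check_summary(Path("summary02.txt").read_bytes())
#   print(ok, words)
```

A file saved from a word processor often contains curly quotes or other non-ASCII characters, which this check would flag.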
Homeworks
The goal of this course is to teach both the fundamentals of social information systems and how to write programs that leverage social media. As such, there will be multiple homework programming projects throughout the semester (in addition to a course-long project, described below). You will form groups of two people to do the homeworks. (If necessary, one group of three will be allowed.) To collaborate effectively, you should both be involved in all of the major design decisions. You should also determine a partitioning of responsibilities so that you can both work effectively in parallel. For example, one might be responsible for generating all the test code while the other is responsible for the main code. You may switch groups between homeworks.
As the graduate version of this course will contain extra requirements, it is strongly recommended (but not required) that teams be formed of either all undergraduates or all graduate students. If any of the team members are graduate students, the project will be graded as a graduate student project and will not receive any additional credit. Homeworks are due at 11:59:59pm on the specified date. You do not need to inform the instructor about the use of slip days; they will be automatically deducted.
Assignment | Description | Due Date | Piazza Tag |
---|---|---|---|
Homework 1 | Basic Graph Analysis | September 25 | #hw1 |
Homework 2 | Link Prediction | October 16 | #hw2 |
Homework 3 | Twitter Spam | November 13 | #hw3 |
Course Project
You will also complete a course-long project on a topic of your choosing. The goal of the project is to conduct a miniature version of a "real" research project. You will first pick a topic, and argue in a written research proposal that this is a topic worth exploring, and that you are capable and prepared to do so. You will design and implement a solution to the problem you have chosen, and quantitatively evaluate your solution. You will then write up the results of your project in a draft final report, which, after review, you will turn into a final report. Finally, you will give a 25-minute presentation of the results of your project in class.
The syllabus for the course project is available here.
We emphasize that this is a "research project", and not a "programming project". Although the implementation of your solution is an essential component, it is only one aspect of the project, alongside other equally important components, such as the evaluation and the presentation of your results.
You are free to choose any project topic related to this course (with instructor approval). For example, we have collected a large amount of data on online social networks (e.g., a large collection of tweets from Twitter). This suggests an interesting avenue to explore, as the data has not been looked at by researchers before. Finally, if you are doing large-scale data analysis, we will try to get computing resources that are appropriate for the amount of data analysis you aim to do.
The course project is due at 11:59:59pm on December 10, 2013. No slip days may be used on the course project. The course project also has intermediate deadlines that will be announced as the course progresses.
Teamwork
You will form groups of two people to do the homeworks and programming projects (if necessary, one group of three will be allowed). To collaborate effectively, you should both be involved in all of the major design decisions. You may switch groups between programming projects.
Important: You alone are responsible for finding a partner. The class forum located on Piazza is a particularly good resource for this - there will be a thread there that serves exactly this purpose. Right before or right after lecture, as well as during TA lab hours, are also good times to look for partners.
We often receive complaints that somebody cannot find their partner, or that their partner continues to promise things that are never delivered. To address this concern, the policy is: you flake, you fail. Simply put, if you disappear, or are generally not pulling your own weight at any time during the semester, you get an F in the course right then. End of story. If you don't completely flake, but are under-responsive, we reserve the right to design an appropriate (but still fair) way of redistributing points. If your partner is flaking on you, do not wait until the end of the semester to let the course staff know; let us know immediately.
Of course, disasters happen that may pull you away from campus. You are responsible for notifying your partner and the course staff if a major time conflict arises in your life. In the real world, you don't just disappear from your job for a week. You tell people you have to go. The same thing applies here. Likewise, if you feel you're going to need to drop this class, then do it between projects, not in the middle of one. Dropping the course in the middle of a project may be allowed by the university, but it's extremely rude to your partner(s). Be polite.
One useful bit of advice: work together with your partner. We don't mandate pair programming, but it really works. You'll be more than twice as effective as you would be if you split the work up and did it separately. Also, you'll avoid the sort of rude surprises that often arise when partners have different expectations.
Submitting Summaries, Homeworks, and Projects
We will use Khoury College Linux-based scripts for submitting homeworks and programming projects; instructions for using these will be included with each project. Submitting projects via email is not acceptable: no credit will be given for such submissions, and no extensions will be granted. You can submit your projects multiple times, and we will grade the latest submission. We suggest submitting your project a few times well before the deadline to ensure there are no configuration errors.
Before you can submit any assignments, you must register for the system with your student ID. To do so, ssh into any Khoury College Linux machine and execute
bash$ /course/cs5750f13/bin/register-student ID
About to register user 'USER' with student ID 'ID'. Is this correct? [yn]
where ID is your student ID, with any leading 0s removed. For example, if your NEU student ID is 003044, you would enter
bash$ /course/cs5750f13/bin/register-student 3044
Double-check that the userid and student ID are correct; if so, type y and Enter. If not, type n and Enter. If it was successful, you will see the message
bash$ /course/cs5750f13/bin/register-student ID
About to register user 'USER' with student ID 'ID'. Is this correct? [yn] y
User 'USER' is now successfully registered for CS5750 with student ID 'ID'.
bash$
If you see any other message (in particular, a message with "Error" in it), it has not succeeded. Email the instructor with the error message if you are not able to diagnose the problem.
To submit summaries, you will use the CS5750 turnin system described above. In particular, you should run
/course/cs5750f13/bin/turnin summary02 /path/to/file
if you wanted to submit the summary for lecture 02 (see the schedule for the list of assigned lecture numbers and submission keywords). When run, you should see output that looks like
bash$ /course/cs5750f13/bin/turnin summary02 ~/cs5750/summary02.txt
Added file summary02.txt (28392 bytes)
Successfully submitted summary02 for user amislove (confirmation ZiwKE5).
Submitted a total of 1 files (28392 bytes) in 0 directories.
The script will print out every file that you are submitting; make sure that it prints out all of the files you wish to submit. Finally, make sure you see the "Successfully submitted" line at the end. You should also receive an email confirmation of your submission. If you receive any error messages, email the instructor with the error message if you are not able to diagnose the problem.
Exams
There will be no exams.
Grading
The breakdown of the grades in this course is
Course Project: | 40% |
Homeworks: | 20% (5% each) |
Paper Presentation(s): | 20% |
Paper Summaries: | 10% |
Participation: | 10% |
Final grades will be calculated by summing up the points obtained by each student (the points will sum up to some number x out of 100) and then applying the following scale to determine the letter grade: [0-59] F, [60-62] D-, [63-66] D, [67-69] D+, [70-72] C-, [73-76] C, [77-79] C+, [80-82] B-, [83-86] B, [87-89] B+, [90-92] A-, [93-100] A. Final grades will not be curved in any way.
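To make the arithmetic concrete, the weighting and the letter scale can be sketched roughly as follows. This is an illustration only, not the staff's grading script: the function names are hypothetical, each category score is assumed to be a percentage from 0 to 100, a total of exactly 60 is assumed to count as a D-, and fractional totals are assumed to fall into the band whose lower bound they have reached.

```python
# Category weights in percentage points, taken from the breakdown above.
WEIGHTS = {
    "project": 40,
    "homeworks": 20,
    "presentations": 20,
    "summaries": 10,
    "participation": 10,
}

# Lower bound of each letter band, highest first, taken from the scale above.
# Assumption: exactly 60 is a D-; anything below 60 is an F.
GRADE_BANDS = [
    (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (63, "D"), (60, "D-"),
]


def total_points(scores: dict[str, float]) -> float:
    """Combine per-category percentages (0-100) into a total out of 100."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def letter_grade(points: float) -> str:
    """Map a total out of 100 to a letter grade; no curve is applied."""
    if not 0 <= points <= 100:
        raise ValueError("points must be between 0 and 100")
    for lower_bound, letter in GRADE_BANDS:
        if points >= lower_bound:
            return letter
    return "F"


example = {
    "project": 90,
    "homeworks": 85,
    "presentations": 95,
    "summaries": 100,
    "participation": 80,
}
print(letter_grade(total_points(example)))
```

In the example, the weighted total is 36 + 17 + 19 + 10 + 8 = 90 points, which falls in the A- band.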
Any requests for grade changes or regrading must be made within 7 days of when the work was returned. To ask for a regrade, attach to your work a page that specifies (a) the problem or problems you want to be regraded, and (b) for each of these problems, why you think the problem was misgraded.
Requests for Regrading
In this class, we will use the Coaches Challenge to handle requests for regrading. Each student is allotted two (2) challenges each semester. If you want a project or a test to be regraded, you must come to the professors' office hours and make a formal challenge specifying (a) the problem or problems you want to be regraded, and (b) for each of these problems, why you think the problem was misgraded. If it turns out that there has been an error in grading, the grade will be corrected, and you get to keep your challenge. However, if the original grade was correct, then you permanently lose your challenge. Once your two challenges are exhausted, you will not be able to request regrades. You may not challenge the use of slip days, or any points lost due to lateness.
Note that, in the case of projects, all group members must have an available challenge in order to contest a grade. If the challenge is successful, then all group members get to keep their challenge. However, if the challenge is unsuccessful, then all group members permanently lose one challenge.
Late Policy
For programming projects, we will use flexible slip days. Each student is given ten (10) slip days for the semester. You may use the slip days on any project or homework during the semester in increments of one day. For example, you can hand in one project ten days late, or one project two days late and two projects four days late. You do not need to ask permission before using slip days; simply turn in your assignment late and the grading scripts will automatically tabulate any slip days you have used.
Slip days will be deducted from each group member's remaining slip days. Keep this stipulation in mind: if one member of a group has zero slip days remaining, then that means the whole group has zero slip days remaining.
After you have used up your slip days, any project handed in late will be marked off using the following formula:
Original_Grade * (1 - ceiling(Seconds_Late / 86400) * 0.2) = Late_Grade
In other words, every day late is 20% off your grade. Being 1 second late is exactly equivalent to being 23 hours and 59 minutes late. Since you will be turning in your code on the Khoury College machines, their clocks are the benchmark time (so beware clock skew between your desktop and Khoury College if you're thinking about turning in work seconds before the deadline). Our late policy is extremely generous, and therefore we will not be sympathetic to excuses for lateness.
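The late formula and the group slip-day rule can be sketched in a few lines. This is a sketch under stated assumptions rather than the actual grading script: the function names are hypothetical, `seconds_late` is assumed to be the lateness that is not covered by slip days, and the grade is assumed to bottom out at zero (the formula as written would go negative after five days).

```python
import math

SECONDS_PER_DAY = 86400
PENALTY_PERCENT_PER_DAY = 20


def group_slip_days(remaining_per_member: list[int]) -> int:
    """Slip days a group can use: limited by the member with the fewest left."""
    return min(remaining_per_member)


def late_grade(original_grade: float, seconds_late: int) -> float:
    """Apply the 20%-per-day penalty; any partial day counts as a full day."""
    if seconds_late <= 0:
        return original_grade
    days_late = math.ceil(seconds_late / SECONDS_PER_DAY)
    # Assumption: the remaining percentage is clamped at zero.
    remaining_percent = max(100 - PENALTY_PERCENT_PER_DAY * days_late, 0)
    return original_grade * remaining_percent / 100


print(late_grade(90, 1))        # one second late counts as one full day
print(late_grade(90, 86401))    # one day and one second counts as two days
print(group_slip_days([3, 0]))  # a member with none left means the group has none
```

Working in whole percentage points keeps the arithmetic exact for typical grades, which avoids small floating-point surprises from multiplying by 0.2.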
Cheating Policy
It's ok to ask your peers about the concepts, algorithms, or approaches needed to do the assignments. We encourage you to do so; both giving and taking advice will help you to learn. However, what you turn in must be your own, or for projects, your group's own work. Looking at or copying code or homework solutions from other people or the Web is strictly prohibited. In particular, looking at other solutions (e.g., from other groups or students who previously took the course) is a direct violation. Projects must be entirely the work of the students turning them in, i.e. you and your group members. If you have any questions about using a particular resource, ask the course staff or post a question to the class forum.
All students are subject to Northeastern University's Academic Integrity Policy. Per Khoury College policy, all cases of suspected plagiarism or other academic dishonesty must be referred to the Office of Student Conduct and Conflict Resolution (OSCCR). This may result in deferred suspension, suspension, or expulsion from the university.
Accommodations for Students with Disabilities
If you have a disability-related need for reasonable academic accommodations in this course and have not yet met with a Disability Specialist, please visit www.northeastern.edu/drc and follow the outlined procedure to request services. If the Disability Resource Center has formally approved you for an academic accommodation in this class, please present the instructor with your "Professor Notification Letter" at your earliest convenience, so that we can address your specific needs as early as possible.
Title IX
Title IX makes it clear that violence and harassment based on sex and gender are Civil Rights offenses subject to the same kinds of accountability and the same kinds of support applied to offenses against other protected categories such as race, national origin, etc. If you or someone you know has been harassed or assaulted, you can find the appropriate resources here.