**This is an inactive course webpage**
This course looks at social networks and markets through the eyes of a computer scientist. We will look at questions including, “Why are acquaintances rather than friends more likely to get us job opportunities?” and, “Why do the rich get richer?” We begin with graph theory, which lets us study the structure and interactions of social networks at an introductory level. Among other topics, we will study matching problems, epidemics, and the structure of the internet (including web searches). This course examines the intersection of computer science, economics, sociology, and applied mathematics.
Topics and order are subject to change as the semester evolves.
I Graph Theory & Social Networks
- complex systems and networks
- graph representation, notation and definitions
- degree distribution, paths, distance distribution, diameter, connectedness
- triadic closure, clustering coefficient
- strong and weak ties
- node centrality and importance, [digression: robustness and resilience]
- homophily and social-affiliation networks
- [signed networks]
II Network Models
- small-world phenomenon
- random graph model (Erdős-Rényi graphs)
- Watts-Strogatz model
- rich-get-richer phenomena
- fitness model, Barabási-Albert graphs, and scale-free networks
III Analyzing Networks: Graph Mining
- communities, betweenness
- community detection, Girvan-Newman algorithm
- graph clustering, graph partitioning
- triangles, cliques, and paths
- node similarity and node classification
- graph similarity and graph classification
IV Modeling Interactions: Matching Problems
- bipartite graphs and perfect matchings
- matching markets
- advertising on the web: adwords model
V Modeling Information Networks: Structure of the Web & Link Analysis
- web graph
- hubs and authorities (HITS)
- PageRank
VI Modeling Dynamics: Information Cascades, Link Prediction, Epidemics
We can choose which of those topics to study in more detail (time permitting).
- information cascades, diffusion models: how does information spread?
- link prediction, supervised random walks: how do friendship suggestions work?
- epidemic models: how do diseases spread?
Prerequisites: CSE 247, ESE 326, and programming experience (we will do some case studies using Python)
Course Objective
After taking this course you should be able to
- understand…
- what networked data is
- how we can represent networked data conceptually and programmatically
- what we can do with networked data in the context of different areas of study such as social sciences, information technology, natural science, etc.
- have some hands-on experience with…
- working with networked data
- implementing and applying algorithms for networked data
Background & Prerequisites
Before taking this course you should have previously studied…
- a computer programming language such as Java or Python
- graphs in the context of data structures and algorithms in CS
- basic concepts in probability and statistics
Specifically you should be familiar with…
- those topics covered in CSE247
- traveling salesman problem
- data structures for graph representation
- single-source shortest path algorithm (Dijkstra)
- breadth-first search and depth-first search
- those topics covered in ESE326
- laws of probability
- discrete and continuous probability distributions
- how to write a mathematical proof (covered in CSE240)
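As a rough self-check for the CSE247 background listed above, the sketch below stores a graph as an adjacency list and runs breadth-first search to get hop distances. The toy friendship network and the function name `bfs_distances` are made up for illustration and are not taken from the course materials.

```python
from collections import deque


def bfs_distances(graph, source):
    """Return the hop distance from source to every node reachable from it."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            # A node is visited the first time it receives a distance.
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


# Hypothetical toy network: an undirected graph as an adjacency list.
friends = {
    "ana": ["ben", "cara"],
    "ben": ["ana", "dev"],
    "cara": ["ana", "dev"],
    "dev": ["ben", "cara", "eli"],
    "eli": ["dev"],
    "finn": [],  # isolated node, unreachable from everyone else
}

print(bfs_distances(friends, "ana"))
```

Nodes that cannot be reached from the source (here `finn`) simply do not appear in the result. If you can follow and modify this without trouble, the programming side of the prerequisites is probably in good shape.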
Course Materials & Reading
I expect you to be able to read book chapters intended for researchers and higher education as well as scientific papers describing experiments and applications in the context of social network analysis.
Here are some samples:
- Chapter 2 in [NCM] “Networks, Crowds, and Markets: Reasoning about a Highly Connected World”
- expect to be reading a lot in this book
- Chapter 10 in [MMDS] “Mining of Massive Data Sets”
- this is more mathematical, so make sure you can follow the materials presented in this chapter
- we will cover some topics along these lines
- Code described in Chapter 1 in [DSCN] “Data Science & Complex Networks”
- we will implement and use programs to analyze graphs
- make sure you are prepared to code in Python
- this course will not teach you how to program (you should have taken CSE131 already)
- research paper
Lectures and Lecture Notes
Most of the time I will be writing on the board, so I expect you to attend classes and take your own notes. Not all covered material can be found in the posted reading. It is the student’s responsibility to know what was covered in the lectures. I will use slides only for illustrations, mainly fancy drawings I couldn’t do on the board. These slides will be posted online after the lecture. They are by no means a complete summary of what was covered in class.
Summary:
- board: technical content
- slides: illustrations and pretty pictures
The content of this class is based on the following sources:
- [NCM] “Networks, Crowds, and Markets: Reasoning about a Highly Connected World” book and class taught at Cornell by David Easley and Jon Kleinberg [online available]
- [MMDS] “Mining of Massive Data Sets” book [online available] and “Social and Information Network Analysis” class taught at Stanford by Jure Leskovec
- [DSCN] “Data Science & Complex Networks” book and code by Guido Caldarelli and Alessandro Chessa [course book – please buy]
- [SMM] Social Media Mining: An Introduction by Reza Zafarani, Mohammad Ali Abbasi, and Huan Liu [online available]
Instructor: Marion Neumann
Office: Jolley Hall Room 222
Office Hours: WED 11am-12pm (or individual appointment* – avoid drop ins w/o appointment)
*request individual appointments via email and allow for 2-3 days reply and scheduling time
TA Office Hours:
- conceptual: TUE 5:45-7:45pm in Jolley 224 by Benjamin Choi
- git, eclipse: THU 2-3pm in Jolley 431 by Nate Jarvis
Lectures: TUE/THU 10-11:30am in Lab Sciences / 250
Piazza: Please ask any questions related to the course materials and homework problems on Piazza. Other students might have the same questions or be able to provide a quick answer. Any postings of (partial) solutions to problems (written or in the form of source code or pseudocode) will result in a grade of zero for that particular problem for ALL students. Sign up using your wustl email address here: piazza.com/wustl/fall2017/cse316a
Lectures
Lectures will be held every Tuesday and Thursday 10-11:30am in Lab Sciences / 250.
Homework assignments
There will be written homework assignments:
- should be worked on in groups of up to (not more than) 2 students
- submit via Gradescope (use course entry code MV88WW)
- ca. 4-5 assignments (each weighted equally)
- contribute 40% towards your total course performance
Homework will be assigned concurrently with the lecture sessions covering the respective materials. Due dates will be indicated on the course webpage under homework assignments. It is every student’s responsibility to meet the submission requirements and deadlines. We cannot accept late submissions or submissions that do not follow the submission instructions, for any reason (see also Late Policy below).
Each homework assignment will be graded and the total grade achieved for all homework assignments (no drops, no make-ups) will contribute 40% towards your total course performance.
Regrade Requests
Any regrade requests and claims of missing scores will have to be made within 2 weeks of the grade announcement. We will not take any regrade requests after this 2-week period for any reason. Grade announcements and grading comments will be provided via Gradescope. All grades will be maintained on Blackboard. Regrade requests should be submitted exclusively via Gradescope.
Participation
There will be a participation score based on in-class activities (experiments, quizzes, etc.) and answering questions on Piazza. This participation score will contribute 10% towards your total course performance. Use your wustl email address to sign up for Piazza.
Midterm and Final Exams
There will be one written midterm exam and one written final exam contributing 25% each towards your total course performance. Dates are
- Midterm: 19 Oct 2017 (in-class)
- Final: 19 Dec 2017 6:00-8:00pm in TBA
Grading Summary
40% homework assignments
10% class participation (including piazza activity)
25% midterm exam
25% final exam
Final course grades will be assigned using the following straight scale:
Letter Grade | Cutoff Percentage |
---|---|
A | 93% |
A- | 90% |
B+ | 87% |
B | 83% |
B- | 80% |
C+ | 77% |
C | 73% |
C- | 70% |
D+ | 67% |
D | 63% |
D- | 60% |
F | < 60% |
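A minimal sketch of how the weights and cutoffs above combine. It assumes that each component is scored as a percentage from 0 to 100, that the cutoffs are inclusive lower bounds, and that no rounding is applied before comparing against them (the syllabus does not say how rounding is handled). The function names and example scores are illustrative only.

```python
# Weights from the grading summary above.
WEIGHTS = {
    "homework": 0.40,
    "participation": 0.10,
    "midterm": 0.25,
    "final": 0.25,
}

# Cutoffs from the straight scale above, highest first.
CUTOFFS = [
    (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (63, "D"), (60, "D-"),
]


def course_percentage(scores):
    """Weighted total, given one 0-100 percentage per graded component."""
    return sum(WEIGHTS[part] * scores[part] for part in WEIGHTS)


def letter_grade(percentage):
    """Map a total percentage to a letter (assumes inclusive cutoffs)."""
    for cutoff, letter in CUTOFFS:
        if percentage >= cutoff:
            return letter
    return "F"


# Hypothetical scores, for illustration only.
example = {"homework": 92, "participation": 100, "midterm": 85, "final": 88}
total = course_percentage(example)
print("%.2f%% -> %s" % (total, letter_grade(total)))
```

For the example scores, the total is 0.40 × 92 + 0.10 × 100 + 0.25 × 85 + 0.25 × 88 = 90.05%, which falls in the A- band under these assumptions.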
Late Policy
Your homework assignments must be turned in on time. There are absolutely no makeup quizzes or assignments for any reason. You get an automatic 3-day extension on every homework.
WARNING: there is absolutely NO extension to this extension for ANY reason!
Collaboration Policy
You are encouraged to discuss the course material with other students. Discussing the material and the general form of solutions to the labs is a key part of the class. Since, for many of the assignments, there is no single “right” answer, talking to other students and to the TAs is a good thing. However, everything that you turn in should be your own work, unless we tell you otherwise. If you talk about assignments with another student, then you need to explicitly tell us on the hand-in. You are not allowed to copy answers or parts of answers from anyone else, or from material you find on the Internet. This will be considered willful cheating, and will be dealt with according to the official collaboration policy:
Academic Dishonesty
Unless explicitly instructed otherwise, everything that you turn in for this course must be your own work. If you willfully misrepresent someone else’s work as your own, you are guilty of cheating. Cheating, in any form, will not be tolerated in this class.
There is zero tolerance of Academic Dishonesty. I will be actively searching for academic dishonesty on all homework assignments, quizzes, and exams. If you are guilty of cheating on any assignment or exam, you will receive an F in the course and be referred to the School of Engineering Discipline Committee. In severe cases, this can lead to expulsion from the University, as well as possible deportation for international students. If you copy from anyone in the class, both parties will be penalized, regardless of which direction the information flowed. This is your only warning.
Please refer to the University Undergraduate Academic Integrity Policy for more information. If you suspect that you may be entering an ambiguous situation, it is your responsibility to clarify it before the professor or TAs detect it. If in doubt, please ask us.
Mental Health
Mental Health Services professional staff members work with students to resolve personal and interpersonal difficulties, many of which can affect the academic experience. These include conflicts with or worry about friends or family, concerns about eating or drinking patterns, and feelings of anxiety and depression. See: http://shs.wustl.edu/MentalHealth
If you have any problems with the workload of this class, please come and talk to me. The earlier we talk the better.
Accommodations based upon sexual assault
The University is committed to offering reasonable academic accommodations to students who are victims of sexual assault. Students are eligible for accommodation regardless of whether they seek criminal or disciplinary action. Depending on the specific nature of the allegation, such measures may include but are not limited to: implementation of a no-contact order, course/classroom assignment changes, and other academic support services and accommodations. If you need to request such accommodations, please direct your request to Kim Webb (kim_webb@wustl.edu), Director of the Relationship and Sexual Violence Prevention Center. Ms. Webb is a confidential resource; however, requests for accommodations will be shared with the appropriate University administration and faculty. The University will maintain as confidential any accommodations or protective measures provided to an individual student so long as it does not impair the ability to provide such measures.
If a student comes to me to discuss or disclose an instance of sexual assault, sex discrimination, sexual harassment, dating violence, domestic violence or stalking, or if I otherwise observe or become aware of such an allegation, I will keep the information as private as I can, but as a faculty member of Washington University, I am required to immediately report it to my Department Chair or Dean or directly to Ms. Jessica Kennedy, the University’s Title IX Coordinator. If you would like to speak with the Title IX Coordinator directly, Ms. Kennedy can be reached at (314) 935-3118, jwkennedy@wustl.edu, or by visiting her office in the Women’s Building. Additionally, you can report incidents or complaints to Tamara King, Associate Dean for Students and Director of Student Conduct, or by contacting WUPD at (314) 935-5555 or your local law enforcement agency.
You can also speak confidentially and learn more about available resources at the Relationship and Sexual Violence Prevention Center by calling (314) 935-8761 or visiting the 4th floor of Seigle Hall.
Bias Reporting
The University has a process through which students, faculty, staff and community members who have experienced or witnessed incidents of bias, prejudice or discrimination against a student can report their experiences to the University’s Bias Report and Support System (BRSS) team. See: http://brss.wustl.edu
Date | Topic | Materials |
---|---|---|
29 Aug | Syllabus, Course Overview; INTRO | |
I Graph Theory & Social Networks. slides: Illustrations; quizzes: 7 Bridges, Centrality, Network Structure & Link Formation | | |
31 Aug | 1 Motivation; 2 Basic Concepts | |
5 Sept | 3 Measuring Distance; 4 The Global Social Network | |
7 Sept | 4 The Global Social Network; 5 Data Structures for Graphs | |
12 Sept | 7 Node Importance and Centrality | |
14 Sept | 8 Network Structure and Flow of Information | |
19 Sept | 9 Network Structure vs. Surrounding Contexts | |
21 Sept | 10 Social Affiliation Networks; 11 Signed Networks | |
26 Sept | Review Part I; Centrality measures on Actor Network; Intro to Random Graph Models | |
II Network Models. slides: Illustrations | | |
28 Sept | 1 Random Graph Model | |
3 Oct | 1 Random Graph Model; Digression: Configuration Model | |
5 Oct | NO lecture!!! | hw2 is due today at 10am! |
10 Oct | 2 Small World Model | |
12 Oct | Midterm Review | |
17 Oct | Fall break (no class) | |
19 Oct | MIDTERM EXAM (in-class) | |
24 Oct | 3 Scale-free Networks | |
26 Oct | 4 Preferential Attachment Model | |
III Analyzing Social Networks: Graph Mining. slides: Illustrations; quiz: Community Detection; interesting blog post: Graph-powered Machine Learning at Google | | |
31 Oct, 2 Nov | 1 Communities in Networks; 2 Betweenness-based Clustering | |
7 Nov | 3 Modularity Maximization | |
9 Nov | 4 Spectral Clustering | |
14 Nov, 16 Nov | 5 Overlapping Communities | |
21 Nov | 6 Node Similarity | |
23 Nov | Thanksgiving (no class) | |
28 Nov | 7 Node Classification | |
IV Network Dynamics & Applications of Social Network Analysis. slides: Illustrations; quiz: Epidemics | | |
30 Nov | 1 Spreading Processes | |
5 Dec | 2 Cascading Behavior | |
7 Dec | 3 Applications and Use Cases of SN; 4 Beyond the Basics in SN; Review | |
19 Dec | FINAL EXAM (6-8pm in Simon 023) | You can bring one hand-written, double-sided US-letter-sized cheat sheet. |
All homework submissions must be made via Gradescope. Find a tutorial on submitting a PDF to Gradescope HERE.
Use entry code MV88WW to sign up.
GROUP SUBMISSIONS
Find a tutorial on how to add a group member to your submission in the second half of this video.
- 11/28 hw5
- due: TUE 12/07/2017 at 10am
- follow this LINK to set up your team repository for hw5 (Note: we reuse the teams from hw3!)
- CAUTION: always add yourself and your team members as collaborators to the repository – even if you work on your own! See instructions for hw1.
- submit written answers via Gradescope
- submit code via commit to your hw5 repository on GitHub
- 11/14 hw4
- due: TUE 11/28/2017 at 10am
- follow this LINK to set up your team repository for hw4 (Note: we reuse the teams from hw3!)
- CAUTION: always add yourself and your team members as collaborators to the repository – even if you work on your own! See instructions for hw1.
- submit written answers via Gradescope
- submit code via commit to your hw4 repository on GitHub
- 10/26 hw3
- due: TUE 11/14/2017 at 10am (originally THU 11/09/2017)
- follow this LINK to set up your team repository for hw3
- CAUTION: always add yourself and your team members as collaborators to the repository – even if you work on your own!
- submit written answers via Gradescope
- submit code via commit to your hw3 repository on GitHub
- 09/21 hw2
- 09/05 hw1
[GitHub team and repository set up] follow all of the steps below:
- first, set up a GitHub account using your @wustl.edu email address when registering
- now, follow this LINK to set up your team repository for hw1
- IMPORTANT: team set up and naming instructions
- finally, clone into your repository from your computer (both partners can do that on their machines independently). Now, you are ready to develop your solutions and commit them to your repository. Find instructions on how to create a clone of your repository on your local machine and commit to your repository below.
Books
- [NCM] “Networks, Crowds, and Markets: Reasoning about a Highly Connected World” book and class taught at Cornell by David Easley and Jon Kleinberg [online available]
- [MMDS] “Mining of Massive Data Sets” book [online available] and “Social and Information Network Analysis” class taught at Stanford by Jure Leskovec
- [DSCN] “Data Science & Complex Networks”
- book by Guido Caldarelli and Alessandro Chessa [course book]
- code on GitHub
Gradescope
We will use Gradescope for written homework submissions and all homework grading. Find a tutorial on submitting a PDF to Gradescope HERE. To sign up use entry code MV88WW.
Getting Python (and Eclipse) ready
We will use Python 2.7 and we will need NumPy, SciPy, Matplotlib, and NetworkX. All those packages are included in the Anaconda distribution. You don’t have to, but you may want to use Eclipse. HERE is a tutorial on how to install Eclipse and PyDev.
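One way to confirm that the setup worked is a short script that tries to import each required package and reports its version. This is only a sketch: the helper name `check_packages` is made up, and it assumes each package exposes a `__version__` attribute (it falls back to "unknown" if not).

```python
import importlib

# Import names of the packages the course relies on.
REQUIRED = ["numpy", "scipy", "matplotlib", "networkx"]


def check_packages(names):
    """Map each package name to its version string, or None if it is missing."""
    found = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            found[name] = None
        else:
            found[name] = getattr(module, "__version__", "unknown")
    return found


for name, version in sorted(check_packages(REQUIRED).items()):
    print("%-12s %s" % (name, version if version else "MISSING"))
```

Any line that prints `MISSING` means the package is not visible to the Python interpreter you ran the script with, which often points to Eclipse or the terminal using a different interpreter than the Anaconda one.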
Setting up Git
First, go to the GitHub page for your repository (your repository name should contain the name of your assignment and the name of your team). Now, you can either follow the instructions under create a new repository on the command line or you can import the repository directly from Eclipse (you do not need to use Eclipse for this course but it is highly recommended). Find instructions on how to set up the git repository in Eclipse HERE (these instructions are from CSE132, but they should be very helpful).
Learning git while playing a game!!
DSCN Code
As the code used in the DSCN book is also available as a GitHub repository, you can check it out in the same way as you did for homework 1 using this LINK or by cloning the repository from GitHub directly.
Git Help Videos from CSE131
Please ignore all the CSE131-specific parts. Also note that we are using GitHub and not Bitbucket.
Using git: loading your repository, making changes, commit/pushing (start watching from minute 1:35)
How to get unstuck if you can’t commit/push:
- First try to pull and then try the commit/push again
- Drastic steps to get yourself unstuck