# Important terms used in Analysis of Social Media

From Cohen Courses

Jump to navigationJump to searchBelow are some key concepts discussed in Social Media Analysis 10-802 in Spring 2010

- adjacency matrix
- An N-by-N matrix A encoding an N-node graph, where a(i,j)=1 if nodes i and j are connected. See the Class Meeting for 10-802 01/28/2010

- associative sorting
- The tendency of "actors" in a social network to form links based on common properties (e.g., friendships between people of common age). The opposite is disassociative sorting. See the Class Meeting for 10-802 01/28/2010

- association network
- A graph with two types of nodes, one corresponding to "actors" (usually people) and one corresponding to organizations (e.g., clubs, neighborhoods, workplaces, etc). See the Class Meeting for 10-802 01/28/2010

- block diagonal matrix
- A matrix that can be decomposed into (sometimes approximately) uniformly dense blocks - equivalently, the adjacency matrix for a stochastic blockmodel graph. See the Class Meeting for 10-802 02/04/2010

- block matrix
- A matrix that can be decomposed into (sometimes approximately) uniformly dense blocks all of which span the diagonal - equivalently, the adjacency matrix for a stochastic blockmodel graph where all links between block i,j for i!=j have the same probability. See the Class Meeting for 10-802 02/04/2010

- Barbosi-Albert random graph
- A random graph model in which nodes are added one by one, and attached to previous nodes stochastically, in a way that prefers links to nodes with higher degree. These graphs have scale-free degree distribution and small diameter. See the Class Meeting for 10-802 01/26/2010

- betweenness of a node or edge in a graph
- Intuitively actors or edges between actors that are involved in many paths in a social network may have different roles than others (e.g., some role involved in communication or coordination). There are a number of proposed measures of betweenness. See the Class Meeting for 10-802 01/28/2010

- centrality of a node in a graph
- Intuitively actors that are central in a social network may have different roles than others (e.g., some sort of leadership role). There are a number of proposed measures of centrality. See the Class Meeting for 10-802 01/28/2010

- degree of a node in a graph
- The number of neighbors of a node. The distribution of node degrees is one high-level statistic of a graph that is frequently analyzed. See the Class Meeting for 10-802 01/26/2010.

- degree matrix for a graph
- An N-by-N matrix encoding the degree of each node in an N-node graph, where d(i,i) is the degree of node i, and d(i,j)=0 for i!=j. See the Class Meeting for 10-802 02/04/2010

- diameter of a graph
- The maximum length over all pairs of nodes u,v of the shortest path between u and v. Diameter, and variants like mean diameter, is a high-level statistic of a graph that is frequently analyzed. See the Class Meeting for 10-802 01/26/2010

- eigenvector of a matrix
- If W is N-by-N matrix, an eigenvector is a vector v such that Wv=cv, where c is a constant (the eigenvalue of v). See the Class Meeting for 10-802 02/04/2010

- Erdos-Renyi random graph
- A random graph model in pairs of nodes are linked independently according to a Bernoulli. These graphs usually have degree distributions that are binomially distributed, and small degree. See the Class Meeting for 10-802 01/26/2010

- giant connected component of a graph
- Many graphs, including most natural graphs and most random graph models, will have one large connected component (rather than say two or three). This is usually called the giant connected component. See the Class Meeting for 10-802 01/26/2010

- group cohesiveness in graphs
- This refers to a high degree of homophily in a portion of a graph, presumably one corresponding to a meaningful subgroup or subcommunity in the graph. See the Class Meeting for 10-802 01/28/2010

- homophily in graphs
- The tendency for two nodes u,v connected to a third node w to be connected to each other. See the Class Meeting for 10-802 01/26/2010

- Laplacian of a graph
- The matrix D-A, where D is the degree matrix of the graph and A the adjacency matrix. This matrix is important in spectral clustering. There are some important variants of this - for instance the normalized Laplacian is the matrix I-W where W is the weighted adjacency matrix. See the Class Meeting for 10-802 02/04/2010

- Lexicon
- A list of words with semantic or other (linguistic?) information about them. E.g. WordNet, the Pitt/OpinionFinder subjectivity lexicon, [dictionary.com dictionary.com], etc.

- private state
- In sentiment analysis, a private state of a person is a state that can't be verified by anyone else. Subjective statements are defined to be statements about private state. See the Class Meeting for 10-802 01/19/2010

- semantic orientation
- In sentiment analysis, a document (or word, or phrase, or ...) has a positive orientation if it is primarily favorable (about some "target") and a negative orientation if it is primarily unfavorable. Semantic orientation is also sometimes called "polarity". See the Class Meeting for 10-802 01/14/2010

- small world effect in graphs
- An informal name given to tendency for natural graphs to have relatively small diameter. See the Class Meeting for 10-802 01/26/2010

- spatial segregation models
- A class of models that analyze the tendency of strongly homogenous spatial neighborhoods to develop from relatively weak local preferences for spatial homogeneity. See the Class Meeting for 10-802 01/26/2010

- stochastic blockmodel
- A random graph model in which there are k "blocks", and links between each pair of blocks i,j are determined by a Bernoulli random variable. See the Class Meeting for 10-802 01/28/2010

- triadic closure
- The tendency of social networks to develop homophily over time. See the Class Meeting for 10-802 01/28/2010

- Watts-Strogatz random graph
- A random graph in which nodes are arranged in a lattice (often a 1-D lattice shaped like a ring) with nearby nodes connected in a regular pattern, and distance nodes connected randomly. See the Class Meeting for 10-802 01/26/2010

- social bookmarking
- Social bookmarking is a method for internet users to share and organize common bookmarks. Websites like delicious.com provide a bookmarking system.

- weighted adjacency matrix
- An N-by-N matrix W encoding an N-node graph, where w(i,j)=1/degree(i) if nodes i and j are connected. See the Class Meeting for 10-802 02/04/2010

- Bipartite graph
- A bipartite graph is a graph where nodes can be divided into two groups V1 and V2 such that no edge connects the vertices in the same group.

- Information Cascade
- Term used to describe the spread of information through a network of people or websites.