Important terms used in Analysis of Social Media

From Cohen Courses

Jump to navigation Jump to search

Below are some key concepts discussed in Social Media Analysis 10-802 in Spring 2010

adjacency matrix: An N-by-N matrix A encoding an N-node graph, where a(i,j)=1 if nodes i and j are connected. See the Class Meeting for 10-802 01/28/2010

associative sorting: The tendency of "actors" in a social network to form links based on common properties (e.g., friendships between people of common age). The opposite is disassociative sorting. See the Class Meeting for 10-802 01/28/2010

association network: A graph with two types of nodes, one corresponding to "actors" (usually people) and one corresponding to organizations (e.g., clubs, neighborhoods, workplaces, etc). See the Class Meeting for 10-802 01/28/2010

block diagonal matrix: A matrix that can be decomposed into (sometimes approximately) uniformly dense blocks - equivalently, the adjacency matrix for a stochastic blockmodel graph. See the Class Meeting for 10-802 02/04/2010

block matrix: A matrix that can be decomposed into (sometimes approximately) uniformly dense blocks all of which span the diagonal - equivalently, the adjacency matrix for a stochastic blockmodel graph where all links between block i,j for i!=j have the same probability. See the Class Meeting for 10-802 02/04/2010

Barbosi-Albert random graph: A random graph model in which nodes are added one by one, and attached to previous nodes stochastically, in a way that prefers links to nodes with higher degree. These graphs have scale-free degree distribution and small diameter. See the Class Meeting for 10-802 01/26/2010

betweenness of a node or edge in a graph: Intuitively actors or edges between actors that are involved in many paths in a social network may have different roles than others (e.g., some role involved in communication or coordination). There are a number of proposed measures of betweenness. See the Class Meeting for 10-802 01/28/2010

centrality of a node in a graph: Intuitively actors that are central in a social network may have different roles than others (e.g., some sort of leadership role). There are a number of proposed measures of centrality. See the Class Meeting for 10-802 01/28/2010

degree of a node in a graph: The number of neighbors of a node. The distribution of node degrees is one high-level statistic of a graph that is frequently analyzed. See the Class Meeting for 10-802 01/26/2010.

degree matrix for a graph: An N-by-N matrix encoding the degree of each node in an N-node graph, where d(i,i) is the degree of node i, and d(i,j)=0 for i!=j. See the Class Meeting for 10-802 02/04/2010

diameter of a graph: The maximum length over all pairs of nodes u,v of the shortest path between u and v. Diameter, and variants like mean diameter, is a high-level statistic of a graph that is frequently analyzed. See the Class Meeting for 10-802 01/26/2010

eigenvector of a matrix: If W is N-by-N matrix, an eigenvector is a vector v such that Wv=cv, where c is a constant (the eigenvalue of v). See the Class Meeting for 10-802 02/04/2010

Erdos-Renyi random graph: A random graph model in pairs of nodes are linked independently according to a Bernoulli. These graphs usually have degree distributions that are binomially distributed, and small degree. See the Class Meeting for 10-802 01/26/2010

giant connected component of a graph: Many graphs, including most natural graphs and most random graph models, will have one large connected component (rather than say two or three). This is usually called the giant connected component. See the Class Meeting for 10-802 01/26/2010

group cohesiveness in graphs: This refers to a high degree of homophily in a portion of a graph, presumably one corresponding to a meaningful subgroup or subcommunity in the graph. See the Class Meeting for 10-802 01/28/2010

homophily in graphs: The tendency for two nodes u,v connected to a third node w to be connected to each other. See the Class Meeting for 10-802 01/26/2010

Laplacian of a graph: The matrix D-A, where D is the degree matrix of the graph and A the adjacency matrix. This matrix is important in spectral clustering. There are some important variants of this - for instance the normalized Laplacian is the matrix I-W where W is the weighted adjacency matrix. See the Class Meeting for 10-802 02/04/2010

Lexicon: A list of words with semantic or other (linguistic?) information about them. E.g. WordNet, the Pitt/OpinionFinder subjectivity lexicon, [dictionary.com dictionary.com], etc.

private state: In sentiment analysis, a private state of a person is a state that can't be verified by anyone else. Subjective statements are defined to be statements about private state. See the Class Meeting for 10-802 01/19/2010

semantic orientation: In sentiment analysis, a document (or word, or phrase, or ...) has a positive orientation if it is primarily favorable (about some "target") and a negative orientation if it is primarily unfavorable. Semantic orientation is also sometimes called "polarity". See the Class Meeting for 10-802 01/14/2010

small world effect in graphs: An informal name given to tendency for natural graphs to have relatively small diameter. See the Class Meeting for 10-802 01/26/2010

spatial segregation models: A class of models that analyze the tendency of strongly homogenous spatial neighborhoods to develop from relatively weak local preferences for spatial homogeneity. See the Class Meeting for 10-802 01/26/2010

stochastic blockmodel: A random graph model in which there are k "blocks", and links between each pair of blocks i,j are determined by a Bernoulli random variable. See the Class Meeting for 10-802 01/28/2010

triadic closure: The tendency of social networks to develop homophily over time. See the Class Meeting for 10-802 01/28/2010

Watts-Strogatz random graph: A random graph in which nodes are arranged in a lattice (often a 1-D lattice shaped like a ring) with nearby nodes connected in a regular pattern, and distance nodes connected randomly. See the Class Meeting for 10-802 01/26/2010

social bookmarking: Social bookmarking is a method for internet users to share and organize common bookmarks. Websites like delicious.com provide a bookmarking system.

weighted adjacency matrix: An N-by-N matrix W encoding an N-node graph, where w(i,j)=1/degree(i) if nodes i and j are connected. See the Class Meeting for 10-802 02/04/2010

Bipartite graph: A bipartite graph is a graph where nodes can be divided into two groups V1 and V2 such that no edge connects the vertices in the same group.

Information Cascade: Term used to describe the spread of information through a network of people or websites.

Retrieved from "http://curtis.ml.cmu.edu/w/courses/index.php?title=Important_terms_used_in_Analysis_of_Social_Media&oldid=21"

Navigation menu