Daly et al Social Lens: Personalization Around User Defined Collections for Filtering Enterprise Message Streams ICWSM 2011


Citation

Elizabeth Daly, Michael Muller, Liang Gou, David R. Millen. Social Lens: Personalization Around User Defined Collections for Filtering Enterprise Message Streams (ICWSM 2011).

Online version

ICWSM 2011

Summary

This paper addresses the problem of information filtering in a message-stream-based enterprise social media setting that integrates social networking, social bookmarking, social file sharing, blogs, wikis, online communities, and shared project/task management. The goal of the paper was to create "social lenses" that filter the information stream while still providing discovery and serendipity, using only examples (people and/or objects such as blogs and communities) provided by a user. The motivation was that lenses created in this way can be easily shared between users, whereas more common personalization approaches that rely on user metadata cannot.

Starting with the initial people and objects that a user specifies, the system identifies the people and object pool to show as follows:

  1. Find related people and objects through authorship, community membership etc.
  2. Rank related people with a score that is a function of three signals: how frequently the person appears in the relatedness graph, Friend-of-Friend similarity with the initial people (the paper neither defines nor cites this measure, and a search turned up nothing), and the cosine similarity of the TF-IDF weighted documents representing the people.
  3. Rank related objects (the initial objects plus the objects of the top 50 relevant people from step 2) using a combination of frequency of appearance in the relatedness graph, Friend-of-Friend similarity of the person that links the object to the initial people, and content similarity with the initial objects.
  4. Score the initial people and objects by taking the top 10 related objects and related people and treating them as though they were the initial set. (A form of pseudo-relevance feedback.)
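
Step 2 can be sketched in code to make the ingredients concrete. The paper does not specify the scoring function (see Criticism), so everything below is an assumption for illustration: the function names, the weighted-sum combination, the equal default weights, and the smoothed IDF formula are all hypothetical. Because the Friend-of-Friend measure is not defined, it is taken as a precomputed input supplied by the caller.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build a TF-IDF weighted term vector for each person's text.

    docs maps a person name to the concatenated text representing them.
    The smoothed IDF used here is an assumption, not taken from the paper.
    """
    tokenized = {name: text.lower().split() for name, text in docs.items()}
    n_docs = len(tokenized)
    doc_freq = Counter(t for tokens in tokenized.values() for t in set(tokens))
    vectors = {}
    for name, tokens in tokenized.items():
        tf = Counter(tokens)
        vectors[name] = {
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in tf.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_people(seeds, frequency, fof, docs, weights=(1.0, 1.0, 1.0)):
    """Rank non-seed people by a hypothetical weighted sum of three signals.

    frequency: person -> count of appearances in the relatedness graph
    fof:       person -> precomputed Friend-of-Friend similarity (undefined
               in the paper, so supplied by the caller)
    """
    vectors = tfidf_vectors(docs)
    max_freq = max(frequency.values(), default=1) or 1
    w_freq, w_fof, w_text = weights
    scores = {}
    for person, freq in frequency.items():
        if person in seeds:
            continue
        # Text similarity to the closest seed person.
        text_sim = max(
            (cosine(vectors.get(person, {}), vectors.get(s, {})) for s in seeds),
            default=0.0,
        )
        scores[person] = (
            w_freq * freq / max_freq
            + w_fof * fof.get(person, 0.0)
            + w_text * text_sim
        )
    return sorted(scores, key=scores.get, reverse=True)
```

With equal graph frequency and no Friend-of-Friend signal, a candidate whose text overlaps with a seed's text ranks above one whose text does not. Step 4 could then be approximated by calling `rank_people` again with the top 10 results as the new seed set.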

Given the above ranking of people and objects, the user is shown an interface where they can specify the "importance" of the initial people seeds and the initial object seeds. For example, increasing the importance of the people seed will result in a stream that only contains activity from people in the initial seed set, while decreasing it will show activity from related people as well. It is not clear what increasing/decreasing the importance of the initial object set does.
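
The behaviour of the people-importance control can be illustrated with a minimal sketch. The paper summary only states the two extremes (high importance restricts the stream to seed people; lower importance admits related people), so the linear cutoff, the 0-to-1 scale, the function names, and the `actor` field on stream events are all assumptions made here for illustration.

```python
def visible_people(seeds, ranked_related, importance):
    """Pick whose activity is shown for a given seed importance.

    importance is assumed to lie in [0, 1]: at 1.0 only seed people are
    shown; lower values admit more of the ranked related people.
    """
    if not 0.0 <= importance <= 1.0:
        raise ValueError("importance must be between 0 and 1")
    n_related = int((1.0 - importance) * len(ranked_related))
    return list(seeds) + list(ranked_related[:n_related])


def filter_stream(events, allowed):
    """Keep only stream events whose (hypothetical) 'actor' is allowed."""
    allowed_set = set(allowed)
    return [event for event in events if event["actor"] in allowed_set]
```

For example, with seeds `["alice"]` and four ranked related people, an importance of 0.5 under this assumed mapping would show activity from alice plus the two highest-ranked related people.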

Experiments

The authors ran a very small experiment with 10 people from within the company who use the enterprise system, asking them to rate the output of the proposed system against two baseline systems, Top (which shows all updates from friends) and Discover (which shows all updates throughout the entire system), on interestingness and relatedness. They found that their system scored higher on both metrics, with statistically significant differences.

They also examined the information rated by the testers as interesting and found that a large portion of it came from users who were not in the original lens creator's social network. In addition, they noted that much of the interesting information came from objects that have very little owner/author overlap with the original lens creator. They therefore concluded that personalizing around a collection of resources, rather than using a user-centric method, produced discovery and serendipity that would not otherwise have existed.

Criticism

The small sample size of testers for the evaluation makes the results of the paper less than convincing. Also, the scoring functions for the people and objects are not specified, making this research difficult to reproduce. Finally, this system of filtering will only work in a very specific and information-rich environment, where links between users, documents, and communities are explicit and there are no concerns about identifying users across different domains of the social media system; unfortunately, the paper makes no comment on the applicability of this research to the more loosely integrated systems that are common in public social media.