Forum-Based Language Learning Analysis

From Cohen Courses
Revision as of 00:15, 2 February 2011 by Askory (talk | contribs)

Fast Learning of Graph Structure for Anomalous Pattern Detection

Team Members

Adam Skory

Gabriel Parent

Introduction

Online forums have been used to create topic-topic, user-user, and user-topic graphs. These graphs have been used for tasks such as building recommendation systems, investigating knowledge propagation, and identifying influence. In this work we plan to use data from a forum dedicated to studying the Spanish language to identify salient topics among learners of Spanish and to track influence among the forum's users.

Dataset

For this dataset we will perform a crawl of http://forums.tomisimo.org/

Some statistics about the forum:

  • Threads: 9,046
  • Posts: 100,535
  • Members: 4,863
  • Active Members: 742
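
As a rough sense of scale for the graphs we plan to build, two ratios can be derived from the counts above. This is only a back-of-the-envelope sketch: the variable names are ours, and the forum does not state how it defines an "active" member.

```python
# Counts copied from the forum statistics listed above.
threads = 9_046
posts = 100_535
members = 4_863
active_members = 742

# Average thread length, i.e. Post nodes per Thread node.
posts_per_thread = posts / threads

# Fraction of registered members the forum reports as active.
active_share = active_members / members

print(f"posts per thread: {posts_per_thread:.1f}")
print(f"active share of members: {active_share:.1%}")
```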

The primary areas of the forum are:

  • Vocabulary
  • Translations
  • Grammar
  • Practice & Homework
  • Teaching & Learning
  • Culture
  • Teaching and Learning Techniques
  • Introductions
  • General Chat

The forum is run on the vBulletin system and anonymous postings are not allowed.
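
Because the forum runs on vBulletin, one possible first step for the crawl is to collect thread ids from forum listing pages. The sketch below assumes the classic vBulletin link form `showthread.php?t=<id>`; we have not verified that this forum uses it (it may use rewritten URLs), and the sample markup, class name, and function name are hypothetical. It parses HTML that has already been downloaded and makes no network requests.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qs, urljoin, urlparse

BASE_URL = "http://forums.tomisimo.org/"


class ThreadLinkParser(HTMLParser):
    """Collect thread ids from anchors shaped like showthread.php?t=<id>.

    The URL shape is an assumption based on common vBulletin installs.
    """

    def __init__(self):
        super().__init__()
        self.thread_ids = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the forum root before inspecting them.
        url = urlparse(urljoin(BASE_URL, href))
        if not url.path.endswith("showthread.php"):
            return
        ids = parse_qs(url.query).get("t", [])
        if ids and ids[0].isdigit():
            self.thread_ids.add(int(ids[0]))


def extract_thread_ids(html):
    """Return the sorted, de-duplicated thread ids linked from one page."""
    parser = ThreadLinkParser()
    parser.feed(html)
    return sorted(parser.thread_ids)


# Hypothetical sample markup, not copied from the real forum.
sample = """
<a href="showthread.php?t=101">thread A</a>
<a href="showthread.php?s=abc&amp;t=102">thread B</a>
<a href="member.php?u=7">profile</a>
<a href="showthread.php?t=101&amp;page=2">thread A, page 2</a>
"""
print(extract_thread_ids(sample))  # [101, 102]
```

Since anonymous posting is not allowed, every post fetched this way should carry an identifiable author, which is what the Post-User link in the proposed network relies on.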

Proposed Work

We will construct a network with four types of nodes: Thread, Post, User, and Topic. The first three node types are explicit in the forum structure. The Topic nodes are not explicit and must be extracted from the thread titles, post texts, and network structure. The following table shows potential link types between each pair of node types.

         Thread      Post                   User                   Topic
Thread   Hyperlink   Membership             Creator, Participant   Primary, Secondary
Post     -           Direct Reply, Replay   Author                 Primary, Secondary
User     -           -                      Quotation, Hyperlink   Interest
Topic    -           -                      -                      Related
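
One way the typed network could be represented is sketched below. It is a minimal illustration rather than our final design: the class and function names are hypothetical, the link-type labels are copied verbatim from the table above, and links are stored as undirected even though reply and quotation links will probably need a direction for the influence analysis.

```python
from collections import defaultdict
from enum import Enum


class NodeType(Enum):
    THREAD = "Thread"
    POST = "Post"
    USER = "User"
    TOPIC = "Topic"


# Link types allowed between each pair of node types, copied from the table.
# Each pair is stored once and looked up in either order.
LINK_TYPES = {
    (NodeType.THREAD, NodeType.THREAD): {"Hyperlink"},
    (NodeType.THREAD, NodeType.POST): {"Membership"},
    (NodeType.THREAD, NodeType.USER): {"Creator", "Participant"},
    (NodeType.THREAD, NodeType.TOPIC): {"Primary", "Secondary"},
    (NodeType.POST, NodeType.POST): {"Direct Reply", "Replay"},
    (NodeType.POST, NodeType.USER): {"Author"},
    (NodeType.POST, NodeType.TOPIC): {"Primary", "Secondary"},
    (NodeType.USER, NodeType.USER): {"Quotation", "Hyperlink"},
    (NodeType.USER, NodeType.TOPIC): {"Interest"},
    (NodeType.TOPIC, NodeType.TOPIC): {"Related"},
}


def allowed_links(a, b):
    """Return the link types permitted between node types a and b."""
    return LINK_TYPES.get((a, b)) or LINK_TYPES.get((b, a)) or set()


class ForumGraph:
    """Undirected multigraph with typed nodes and typed links (a sketch)."""

    def __init__(self):
        self.nodes = {}                # node id -> NodeType
        self.edges = defaultdict(set)  # node id -> {(neighbor id, link type)}

    def add_node(self, node_id, node_type):
        self.nodes[node_id] = node_type

    def add_link(self, src, dst, link_type):
        # Reject link types the table does not list for this pair of node types.
        if link_type not in allowed_links(self.nodes[src], self.nodes[dst]):
            raise ValueError(f"{link_type!r} is not valid between {src} and {dst}")
        self.edges[src].add((dst, link_type))
        self.edges[dst].add((src, link_type))

    def neighbors(self, node_id, link_type=None):
        """Sorted neighbor ids, optionally restricted to one link type."""
        return sorted(
            n for n, t in self.edges.get(node_id, ())
            if link_type is None or t == link_type
        )


g = ForumGraph()
g.add_node("t1", NodeType.THREAD)
g.add_node("p1", NodeType.POST)
g.add_node("u1", NodeType.USER)
g.add_link("t1", "p1", "Membership")
g.add_link("p1", "u1", "Author")
g.add_link("t1", "u1", "Creator")
print(g.neighbors("u1"))            # ['p1', 't1']
print(g.neighbors("u1", "Author"))  # ['p1']
```

Validating each link against the table when it is added keeps crawl or extraction errors (for example, an Author link attached to a Thread) from silently entering the network.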

Related Work

References