Forum-Based Language Learning Analysis

From Cohen Courses

Revision as of 23:15, 1 February 2011

Fast Learning of Graph Structure for Anomalous Pattern Detection

Team Members

Adam Skory

Gabriel Parent

Introduction

Online forums have been used to create topic-topic, user-user, and user-topic graphs. These graphs have been used for tasks such as building recommendation systems, investigating knowledge propagation, and identifying influence. In this work we plan to use data from a forum dedicated to studying the Spanish language to identify salient topics among learners of Spanish and to track influence among the users of the forum.

Dataset

For this dataset we will perform a crawl of http://forums.tomisimo.org/

Some statistics about the forum:

  • Threads: 9,046
  • Posts: 100,535
  • Members: 4,863
  • Active Members: 742

The primary areas of the forum are:

  • Vocabulary
  • Translations
  • Grammar
  • Practice & Homework
  • Teaching & Learning
  • Culture
  • Teaching and Learning Techniques
  • Introductions
  • General Chat

The forum is run on the vBulletin system and anonymous postings are not allowed.

Proposed Work

We will construct a network with four node types: Thread, Post, User, and Topic. The first three node types are explicit in the forum structure. The Topic nodes are not explicit and must be extracted from the thread titles, post texts, and network structure. The following table shows potential link types between each pair of node types.

         | Thread    | Post                 | User                 | Topic
Thread   | Hyperlink | Membership           | Creator, Participant | Primary, Secondary
Post     |           | Direct Reply, Replay | Author               | Primary, Secondary
User     |           |                      | Quotation, Hyperlink | Interest
Topic    |           |                      |                      | Related
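
The node and link types in the table could be represented along the following lines. This is only a minimal sketch, not the project's implementation: the names `LINK_TYPES`, `allowed_links`, and `ForumNetwork` are hypothetical, the link labels are copied verbatim from the table, and treating every link as undirected is an assumption made because the table fills only one cell per pair of node types.

```python
from collections import defaultdict

NODE_TYPES = {"Thread", "Post", "User", "Topic"}

# Link types per pair of node types, copied verbatim from the table.
LINK_TYPES = {
    ("Thread", "Thread"): {"Hyperlink"},
    ("Thread", "Post"): {"Membership"},
    ("Thread", "User"): {"Creator", "Participant"},
    ("Thread", "Topic"): {"Primary", "Secondary"},
    ("Post", "Post"): {"Direct Reply", "Replay"},
    ("Post", "User"): {"Author"},
    ("Post", "Topic"): {"Primary", "Secondary"},
    ("User", "User"): {"Quotation", "Hyperlink"},
    ("User", "Topic"): {"Interest"},
    ("Topic", "Topic"): {"Related"},
}


def allowed_links(type_a, type_b):
    """Return the link types permitted between two node types, in either order."""
    return LINK_TYPES.get((type_a, type_b)) or LINK_TYPES.get((type_b, type_a), set())


class ForumNetwork:
    """Typed network; links are stored as undirected (an assumption of this sketch)."""

    def __init__(self):
        self.node_types = {}           # node id -> node type
        self.edges = defaultdict(set)  # node id -> {(neighbor id, link type)}

    def add_node(self, node_id, node_type):
        if node_type not in NODE_TYPES:
            raise ValueError(f"unknown node type: {node_type}")
        self.node_types[node_id] = node_type

    def add_link(self, a, b, link_type):
        # Reject any link type the table does not list for this pair of node types.
        type_a, type_b = self.node_types[a], self.node_types[b]
        if link_type not in allowed_links(type_a, type_b):
            raise ValueError(f"{link_type!r} is not a {type_a}-{type_b} link type")
        self.edges[a].add((b, link_type))
        self.edges[b].add((a, link_type))

    def neighbors(self, node_id, link_type=None):
        """Return sorted neighbor ids, optionally restricted to one link type."""
        return sorted(n for n, t in self.edges[node_id]
                      if link_type is None or t == link_type)


# Usage with made-up node ids: one thread, one post in it, and the post's author.
net = ForumNetwork()
net.add_node("thread-1", "Thread")
net.add_node("post-1", "Post")
net.add_node("user-1", "User")
net.add_link("thread-1", "post-1", "Membership")
net.add_link("post-1", "user-1", "Author")
print(net.neighbors("post-1"))
```

Validating each link against the table at insertion time keeps crawl errors (for example, an Author link attached to a Thread) from silently entering the network. Topic nodes would be added the same way once they have been extracted.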

Related Work

References