Difference between revisions of "Analyzing Community driven Question Answering Sites"
Revision as of 23:16, 15 October 2012
Comments
- Interesting idea. But how much mileage would you get from analyzing just the question itself?
- Do you plan to use related questions to analyze a particular question?
- If the related questions are used, will you use their answers?
- How do you plan to evaluate the longevity? Do you plan to predict a range?
--Apappu 11:39, 11 October 2012 (UTC)
Team Members
Abstract
Question answering communities such as Yahoo! Answers and StackOverflow have emerged as popular and effective information resources on the web. The questions, together with the entire set of corresponding answers, are a rich resource for exploring many research questions related to question answering. One interesting analysis is to track the lifetime of questions in such environments. A question's lifetime can take several forms: the question may be declared closed by the community, it may be short-lived because an expert answers it sufficiently, or it may generate a lot of interaction among users over a relatively long duration. Analyzing such questions and trying to predict their longevity is one of the goals of our project. Other interesting aspects to explore are identifying questions that have not been sufficiently answered, and identifying user expertise for improved recommendations and automatic tag prediction.
Revised Abstract
Question answering communities such as Yahoo! Answers and StackOverflow have emerged as popular and effective information resources on the web. The questions, together with the entire set of corresponding answers, are a rich resource for exploring many research questions related to question answering. Given a sub-topic, say Python, a flow of concepts can be computed based on the difficulty of the questions as well as the number of people who could answer them. Another interesting question deals with community detection among experts. By zooming in on the interactions taking place between experts and on their interest in particular concepts, an implicit network of experts can be created. A further goal is recommending questions to experts by learning each user's specific skills from their past question-answering behavior. One interesting analysis is to track the lifetime of questions in such environments. A question's lifetime can take several forms: the question may be declared closed by the community, it may be short-lived because an expert answers it sufficiently, or it may generate a lot of interaction among users over a relatively long duration. Analyzing such questions and trying to predict their longevity is one of the goals of our project. Other interesting aspects to explore are identifying questions that have not been sufficiently answered, and identifying user expertise for improved recommendations and automatic tag prediction.
Datasets
The Stack Overflow data that we plan to use is publicly available from StackOverflow under a Creative Commons license. One can download the latest version from here.
Here are some statistics about the data:
- Users 440K (198K questioners, 71K answerers)
- Questions 1M (69% with accepted answer)
- Answers 2.8M (26% marked as accepted)
- Votes 7.6M (93% positive)
- Favorites 775K actions on 318K questions
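As a rough illustration of how question lifetime could be measured on this data, the sketch below computes the time from a question's creation to the creation of its accepted answer. It assumes the dump's posts are stored as `row` elements with `Id`, `PostTypeId` (1 for questions, 2 for answers), `AcceptedAnswerId` and `CreationDate` attributes; the sample rows and the function names are made up for illustration, and "time to accepted answer" is only one possible definition of lifetime, not necessarily the one the project will adopt.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Tiny made-up sample, assuming the row/attribute layout of the dump's posts file.
SAMPLE_POSTS = """<posts>
  <row Id="1" PostTypeId="1" AcceptedAnswerId="3" CreationDate="2012-10-01T10:00:00.000" />
  <row Id="2" PostTypeId="2" ParentId="1" CreationDate="2012-10-01T10:05:00.000" />
  <row Id="3" PostTypeId="2" ParentId="1" CreationDate="2012-10-01T10:30:00.000" />
  <row Id="4" PostTypeId="1" CreationDate="2012-10-02T09:00:00.000" />
</posts>"""


def parse_time(value):
    """Parse a timestamp in the (assumed) dump format."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%f")


def minutes_to_accepted_answer(xml_text):
    """Map each question Id to the minutes until its accepted answer was posted.

    Questions without an accepted answer in the data map to None.
    """
    rows = ET.fromstring(xml_text).findall("row")
    created = {r.get("Id"): parse_time(r.get("CreationDate")) for r in rows}
    result = {}
    for r in rows:
        if r.get("PostTypeId") != "1":  # keep questions only
            continue
        accepted = r.get("AcceptedAnswerId")
        if accepted in created:
            delta = created[accepted] - created[r.get("Id")]
            result[r.get("Id")] = delta.total_seconds() / 60.0
        else:
            result[r.get("Id")] = None
    return result


lifetimes = minutes_to_accepted_answer(SAMPLE_POSTS)
```

For the full dump, the same logic would need a streaming parser such as `xml.etree.ElementTree.iterparse`, since the posts file is too large to load in one string.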
Techniques Used
- We plan to use a wide set of features, incorporating textual as well as network attributes.
- To gain initial insights into the data, we'll use standard topic models such as LDA, and SVMs for classification.
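To make the idea of mixing textual and network attributes concrete, here is a minimal sketch that builds one feature dictionary per question. The toy records, the feature names and the choice of features are all hypothetical; the sketch does not implement LDA or an SVM, it only shows the kind of per-question vector that such models could consume.

```python
import re
from collections import Counter

# Hypothetical toy records: (question_id, asker, title, [answerers]).
QUESTIONS = [
    ("q1", "alice", "How do I sort a list in Python?", ["bob", "carol"]),
    ("q2", "dave", "Python list comprehension syntax", ["bob"]),
    ("q3", "bob", "Why is my regex slow", []),
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def answer_counts(questions):
    """Count the answers each user has given.

    This equals the user's weighted in-degree in an asker -> answerer graph.
    """
    counts = Counter()
    for _, _, _, answerers in questions:
        counts.update(answerers)
    return counts


def featurize(question, counts):
    """Combine simple textual and network attributes for one question."""
    _, asker, title, answerers = question
    tokens = tokenize(title)
    return {
        "num_tokens": len(tokens),                  # textual
        "mentions_python": int("python" in tokens),  # textual
        "asker_answers_given": counts[asker],       # network
        "num_answerers": len(answerers),            # network
    }


counts = answer_counts(QUESTIONS)
features = [featurize(q, counts) for q in QUESTIONS]
```

Note that a feature such as `num_answerers` is only known after the fact, so a real longevity predictor would have to restrict itself to attributes available at the time the question is asked.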
Challenges
- Relatively unexplored dataset. Most prior work has used the Yahoo! Answers data set.
- Complex network dynamics like the reputation system and bounties. Understanding them is key to getting good results.
Relevant Literature
- Anderson et al
- Adamic et al. study Yahoo! Answers to explore the interactions of users, with preliminary work on predicting the best answer.
- Jeon et al. predict the quality of answers using non-textual features.