Search results

From Cohen Courses
  • The N words of a document are denoted by <math>w=\{w_1,w_2,...,w_N\}</math> ...ers are estimated using maximum likelihood estimation on a set of training documents. For inference, one approach is to approximate the parameter <math>\phi</math>
    4 KB (616 words) - 16:55, 24 November 2010
  • ...ew form of topic model which can take into account the inner structures in documents.
    733 bytes (112 words) - 15:54, 29 September 2012
  • ...e queries pose a particular problem for search engines because very recent documents may not even be indexed yet, and even if they are indexed, there may be a r #Twitter is likely to contain URLs of uncrawled documents likely to be relevant to recency-sensitive queries.
    6 KB (944 words) - 10:22, 29 March 2011
  • This paper studies the problem of aligning documents at the sentence level when they are on the same topic or are describing the ...tiple components, first clustering paragraphs within-corpus, then aligning documents at the paragraph level (essentially marking candidate sentence-sentence pai
    5 KB (807 words) - 08:10, 30 September 2011
  • ...ce that they are labeled correctly. Use these high-confidence freshly labeled documents as the input and build the feature graph again. This step can be done itera
    3 KB (408 words) - 00:25, 16 October 2012
  • ...Taylor and C. Lee Giles. 2010. Enhancing Cross Document Coreference of Web Documents with Context Similarity and Very Large Scale Text Categorization. In Procee ...essesProblem::Cross Document Coreference (CDC)]] for web-scale corpora of documents, by using document-level categories, sub-document level context and extract
    5 KB (658 words) - 15:58, 7 December 2010
  • e.g., clustering of similar documents, summarization, etc.
    1 KB (142 words) - 00:42, 7 February 2011
  • ...n topics from a subset of the documents? If yes, how can we collect sample documents that are representative of the original distribution? ...ccurately model the corpus by modeling it as a collection of collections of documents?
    4 KB (592 words) - 10:14, 16 October 2012
  • ...of [[AddressesProblem::Authority_Identification|identifying authoritative documents]] in a given domain using textual content and report their best performing Authoritative documents are ones which exhibit novel and relevant information relative to a documen
    6 KB (961 words) - 08:16, 4 October 2012
  • * Diversify search results (return documents written from different perspectives on topics of interest) * Personalize search results (return documents from the user's viewpoint)
    3 KB (397 words) - 17:01, 1 February 2011
  • ...s of the <math>m</math> unique terms within a collection of <math>n</math> documents. In a term-document matrix, each term is represented by a row, and each do ...scribes the relative frequency of the term within the entire collection of documents.
    5 KB (774 words) - 00:36, 1 December 2010
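The term-document matrix in the entry above is concrete enough for a small sketch. This is a minimal illustration, not code from the wiki page or the underlying paper: term_document_matrix and its within-document relative-frequency weighting are assumptions, and the page may describe a different (e.g. collection-wide) weighting.

<syntaxhighlight lang="python">
# Illustrative sketch of a term-document matrix (not taken from the wiki page).
# Rows correspond to the m unique terms, columns to the n documents; each entry
# here is the term's relative frequency within that document, one simple
# weighting choice among several.
from collections import Counter

def term_document_matrix(documents):
    """documents: list of token lists. Returns (terms, matrix) with
    matrix[i][j] = relative frequency of terms[i] in documents[j]."""
    terms = sorted({tok for doc in documents for tok in doc})
    row = {t: i for i, t in enumerate(terms)}
    matrix = [[0.0] * len(documents) for _ in terms]
    for j, doc in enumerate(documents):
        counts = Counter(doc)
        total = sum(counts.values()) or 1
        for term, c in counts.items():
            matrix[row[term]][j] = c / total
    return terms, matrix

terms, M = term_document_matrix([["topic", "model", "topic"], ["model", "search"]])
# terms == ['model', 'search', 'topic']; M[2][0] == 2/3 (frequency of "topic" in document 0)
</syntaxhighlight>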
  • .... Mei et al. aim at finding subtopics at different times and locations from documents that have the same topics. ..., the data sets they used are very different. Jacob et al. use Twitter-type documents, which are very short. Q. Mei et al. use weblogs, which are relatively long.
    3 KB (516 words) - 11:12, 6 November 2012
  • ...ral ways: (1) the unit of output (the blog) is composed of a collection of documents (the blog posts) rather than a single document, (2) the query represents an ...tain a lot of noise in the form of reader comments and spam, unlike traditional documents
    9 KB (1,328 words) - 03:49, 6 November 2012
  • ...iven a series of documents d and the number of comments associated with each document, denoted as <math>N(d)</math> ...ment. Specifically, given a topic <math>t_{i}</math>, we hope to find the documents that hold a positive sentiment toward this topic, defined as <math>D_{t_{i}+}</math>
    4 KB (744 words) - 01:48, 16 October 2012
  • ...es are co-bursting if they appear close together in a large number of news documents in the given time period. ...nts in which both entities appear divided by the product of the numbers of documents each entity appears in (i.e. the [[UsesMethod::Pointwise mutual information
    11 KB (1,678 words) - 22:58, 2 November 2011
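The co-bursting entry above describes a concrete score: the number of documents containing both entities divided by the product of the per-entity document counts. A minimal sketch of that ratio follows; cooccurrence_score and the set-of-entities document representation are illustrative assumptions, and PMI proper would additionally scale by the collection size and take a logarithm.

<syntaxhighlight lang="python">
# Illustrative sketch of the document co-occurrence score described above:
# |docs containing both entities| / (|docs with entity a| * |docs with entity b|).
# This is the raw ratio; PMI proper is log(N * both / (n_a * n_b)) for N documents.
def cooccurrence_score(docs, entity_a, entity_b):
    """docs: iterable of sets, each holding the entities mentioned in one document."""
    docs = list(docs)
    both = sum(1 for d in docs if entity_a in d and entity_b in d)
    n_a = sum(1 for d in docs if entity_a in d)
    n_b = sum(1 for d in docs if entity_b in d)
    if n_a == 0 or n_b == 0:
        return 0.0
    return both / (n_a * n_b)

docs = [{"EntityA", "EntityB"}, {"EntityA"}, {"EntityB", "EntityA"}, {"EntityC"}]
print(cooccurrence_score(docs, "EntityA", "EntityB"))  # 2 / (3 * 2) ≈ 0.33
</syntaxhighlight>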
  • For network data, such as social networks of friends, citation networks of documents, or hyperlinked networks of web pages, people want to point social network m 2. For each pair of documents <math>d</math>, <math>d'</math>:
    3 KB (442 words) - 15:40, 31 March 2011
  • ...ents. Unlike in Link-LDA and Link-PLSA, which only use citations of other documents with respect to topic k in determining the influence of document d', their
    3 KB (521 words) - 14:43, 2 October 2012
  • ...troduces and evaluates methods for fusing the extracted information across documents to return a consensus answer. It could be applied together with cross-docum ...proach to combine the attribute values extracted for one person across the documents. Two alternatives are considered: one is to pick the most probable value, t
    3 KB (514 words) - 01:09, 1 December 2010
  • The biggest difference is that this models the text of the cited documents as well. It is worth noting that the same priors <math>\Omega</math> and <m ...f links off of the words expressed in the original document and the linked documents (either a comment on a blog post or a linked blog) can help in this task.
    5 KB (895 words) - 22:20, 1 December 2012
  • ...The purpose of this paper is to learn such "scripts" from a collection of documents automatically. The experiment is conducted on documents from the [[UsesDataset::Gigaword corpus]]. The temporal classifier is train
    8 KB (1,180 words) - 01:38, 29 November 2011
