Search results

From Cohen Courses
  • The N words of a document are denoted by <math>w=\{w_1,w_2,...,w_N\}</math> ...ers are estimated using maximum likelihood estimation on a set of training documents. For inference, one approach is to approximate the parameter <math>\phi</math>
    4 KB (616 words) - 16:55, 24 November 2010
  • ...ew form of topic model which can take into account the inner structures in documents.
    733 bytes (112 words) - 15:54, 29 September 2012
  • ...e queries pose a particular problem for search engines because very recent documents may not even be indexed yet, and even if they are indexed, there may be a r #Twitter is likely to contain URLs of uncrawled documents likely to be relevant to recency-sensitive queries.
    6 KB (944 words) - 10:22, 29 March 2011
  • This paper studies the problem of aligning documents at the sentence level when they are on the same topic or are describing the ...tiple components, first clustering paragraphs within-corpus, then aligning documents at the paragraph level (essentially marking candidate sentence-sentence pai
    5 KB (807 words) - 08:10, 30 September 2011
  • ...ce that they are labeled correctly. Use these high-confidence freshly labeled documents as the input and build the feature graph again. This step can be done itera
    3 KB (408 words) - 00:25, 16 October 2012
  • ...Taylor and C. Lee Giles. 2010. Enhancing Cross Document Coreference of Web Documents with Context Similarity and Very Large Scale Text Categorization. In Procee ...essesProblem::Cross Document Coreference (CDC)]] for web-scale corpora of documents, by using document-level categories, sub-document level context and extract
    5 KB (658 words) - 15:58, 7 December 2010
  • e.g., clustering of similar documents, summarization, etc.
    1 KB (142 words) - 00:42, 7 February 2011
  • ...n topics from a subset of the documents? If yes, how can we collect sample documents that are representative of the original distribution? ...ccurately model the corpus by modeling it as a collection of collections of documents?
    4 KB (592 words) - 10:14, 16 October 2012
  • ...of [[AddressesProblem::Authority_Identification|identifying authoritative documents]] in a given domain using textual content and report their best performing Authoritative documents are ones which exhibit novel and relevant information relative to a documen
    6 KB (961 words) - 08:16, 4 October 2012
  • * Diversify search results (return documents written from different perspectives on topics of interest) * Personalize search results (return documents from the user's viewpoint)
    3 KB (397 words) - 17:01, 1 February 2011
  • ...s of the <math>m</math> unique terms within a collection of <math>n</math> documents. In a term-document matrix, each term is represented by a row, and each do ...scribes the relative frequency of the term within the entire collection of documents.
    5 KB (774 words) - 00:36, 1 December 2010
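The term-document matrix in the entry above is concrete enough for a small sketch. This is a minimal illustration, not code from the wiki page or the underlying paper: term_document_matrix and its within-document relative-frequency weighting are assumptions, and the page may describe a different (e.g. collection-wide) weighting.

<syntaxhighlight lang="python">
# Illustrative sketch of a term-document matrix (not taken from the wiki page).
# Rows correspond to the m unique terms, columns to the n documents; each entry
# here is the term's relative frequency within that document, one simple
# weighting choice among several.
from collections import Counter

def term_document_matrix(documents):
    """documents: list of token lists. Returns (terms, matrix) with
    matrix[i][j] = relative frequency of terms[i] in documents[j]."""
    terms = sorted({tok for doc in documents for tok in doc})
    row = {t: i for i, t in enumerate(terms)}
    matrix = [[0.0] * len(documents) for _ in terms]
    for j, doc in enumerate(documents):
        counts = Counter(doc)
        total = sum(counts.values()) or 1
        for term, c in counts.items():
            matrix[row[term]][j] = c / total
    return terms, matrix

terms, M = term_document_matrix([["topic", "model", "topic"], ["model", "search"]])
# terms == ['model', 'search', 'topic']; M[2][0] == 2/3 (frequency of "topic" in document 0)
</syntaxhighlight>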
  • .... Mei et al. aim at finding subtopics at different times and locations from documents that have the same topics. ..., the data sets they used are very different. Jacob et al. use Twitter-type documents, which are very short. Q. Mei et al. use weblogs, which are relatively long.
    3 KB (516 words) - 11:12, 6 November 2012
  • ...ral ways: (1) the unit of output (the blog) is composed of a collection of documents (the blog posts) rather than a single document, (2) the query represents an ...tain a lot of noise in the form of reader comments and spam, unlike traditional documents
    9 KB (1,328 words) - 03:49, 6 November 2012
  • ...iven a series of documents d and the number of comments associated with each document, denoted as <math>N(d)</math> ...ment. Specifically, given a topic <math>t_{i}</math>, we hope to find the documents that hold a positive sentiment toward this topic, defined as <math>D_{t_{i}+}</math>
    4 KB (744 words) - 01:48, 16 October 2012
  • ...es are co-bursting if they appear close together in a large number of news documents in the given time period. ...nts in which both entities appear divided by the product of the numbers of documents each entity appears in (i.e. the [[UsesMethod::Pointwise mutual information
    11 KB (1,678 words) - 22:58, 2 November 2011
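The co-bursting entry above describes a concrete score: the number of documents containing both entities divided by the product of the per-entity document counts. A minimal sketch of that ratio follows; cooccurrence_score and the set-of-entities document representation are illustrative assumptions, and PMI proper would additionally scale by the collection size and take a logarithm.

<syntaxhighlight lang="python">
# Illustrative sketch of the document co-occurrence score described above:
# |docs containing both entities| / (|docs with entity a| * |docs with entity b|).
# This is the raw ratio; PMI proper is log(N * both / (n_a * n_b)) for N documents.
def cooccurrence_score(docs, entity_a, entity_b):
    """docs: iterable of sets, each holding the entities mentioned in one document."""
    docs = list(docs)
    both = sum(1 for d in docs if entity_a in d and entity_b in d)
    n_a = sum(1 for d in docs if entity_a in d)
    n_b = sum(1 for d in docs if entity_b in d)
    if n_a == 0 or n_b == 0:
        return 0.0
    return both / (n_a * n_b)

docs = [{"EntityA", "EntityB"}, {"EntityA"}, {"EntityB", "EntityA"}, {"EntityC"}]
print(cooccurrence_score(docs, "EntityA", "EntityB"))  # 2 / (3 * 2) ≈ 0.33
</syntaxhighlight>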
  • For network data, such as social networks of friends, citation networks of documents, or hyperlinked networks of web pages, people want to point social network m 2. For each pair of documents <math>d</math>, <math>d'</math>:
    3 KB (442 words) - 15:40, 31 March 2011
  • ...ents. Unlike in Link-LDA and Link-PLSA, which only use citations of other documents with respect to topic k in determining the influence of document d', their
    3 KB (521 words) - 14:43, 2 October 2012
  • ...troduces and evaluates methods for fusing the extracted information across documents to return a consensus answer. It could be applied together with cross-docum ...proach to combine the attribute values extracted for one person across the documents. Two alternatives are considered: one is to pick the most probable value, t
    3 KB (514 words) - 01:09, 1 December 2010
  • The biggest difference is that this models the text of the cited documents as well. It is worth noting that the same priors <math>\Omega</math> and <m ...f links off of the words expressed in the original document and the linked documents (either a comment on a blog post or a linked blog) can help in this task.
    5 KB (895 words) - 22:20, 1 December 2012
  • ...The purpose of this paper is to learn such "scripts" from a collection of documents automatically. The experiment is conducted on documents from the [[UsesDataset::Gigaword corpus]]. The temporal classifier is train
    8 KB (1,180 words) - 01:38, 29 November 2011
