Search results

From Cohen Courses
  • .... Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. It is also well-suited for developin
    540 bytes (85 words) - 21:09, 26 September 2012
  • ...the data as well as in the iterative refinement approach employed by both algorithms. ...>), where each observation is a ''d''-dimensional real vector, ''k''-means clustering aims to partition the ''n'' observations into ''k'' sets (''k'' ≤ ''n'')
    1 KB (190 words) - 00:02, 28 March 2011
  • ...in and William W. Cohen (2010)]: Semi-Supervised Classification of Network Data Using Very Few Labels in ASONAM-2010. ...nars/docs/BinderPartha.pdf PP Talukdar, K Crammer (2009):] New regularized algorithms for transductive learning Machine Learning and Knowledge Discovery in Datab
    2 KB (214 words) - 12:20, 14 November 2017
  • ...in and William W. Cohen (2010)]: Semi-Supervised Classification of Network Data Using Very Few Labels in ASONAM-2010. ...nars/docs/BinderPartha.pdf PP Talukdar, K Crammer (2009):] New regularized algorithms for transductive learning Machine Learning and Knowledge Discovery in Datab
    2 KB (231 words) - 10:50, 30 March 2018
  • ...ng| coauthors = X. Wu| date = 2009| first = D.| last = Lin| title = Phrase clustering for discriminative learning| url = http://www.aclweb.org/anthology/P/P09/P0 This paper makes use of phrase [[UsesMethod::clustering]] to improve on the state of the art for the [[AddressesProblem::Named Enti
    4 KB (577 words) - 01:07, 30 September 2011
  • The approach used here has multiple components: first clustering paragraphs within-corpus, then aligning documents at the paragraph level (e == Algorithms ==
    5 KB (807 words) - 08:10, 30 September 2011
  • ...would be beneficial if we could automatically produce these templates from data. ...ur most often in the data is one way, while we could also use more complex clustering like Chambers and Jurafsky.
    4 KB (707 words) - 22:45, 6 October 2011
  • ...Profile Based Cross-Document Coreference Using Kernelized Fuzzy Relational Clustering. In Proceedings of the 47th Annual Meeting of the ACL and the 4th IJCNLP of ...ce function, and finally clustering them using kernelized fuzzy relational clustering.
    5 KB (765 words) - 01:45, 1 December 2010
  • ...g Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms, Collins, EMNLP 2002]. ...= A.| last = Globerson| pages = 305–312| title = Exponentiated gradient algorithms for log-linear structured prediction}}]]. A more recent EG-based approach.
    2 KB (291 words) - 16:39, 22 September 2011
  • ...g Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms, Collins, EMNLP 2002]. ...= A.| last = Globerson| pages = 305–312| title = Exponentiated gradient algorithms for log-linear structured prediction}}]]. A more recent EG-based approach.
    2 KB (279 words) - 10:22, 4 October 2010
  • ...he project will be designed to compare the scalability of variant learning algorithms on datasets. * Streaming learning algorithms
    5 KB (671 words) - 12:55, 14 November 2011
  • ...n Fall 2016 Overview|Overview]]. Grading policies etc., History of Big Data, Complexity theory and cost of important operations ...5 in Fall 2016 Probability Review|Probability Review]]. Counting for big data and density estimation, streaming Naive Bayes, Rocchio and TFIDF
    7 KB (1,002 words) - 11:54, 11 August 2017
  • ...paper to assign the weight. Finally, an algorithm similar to hierarchical clustering is proposed and evaluated based on three computer-generated or real-world ne ...unity structure detection algorithm, similar to the iterative hierarchical clustering algorithm, can be formalized as follows:
    6 KB (875 words) - 00:11, 19 December 2012
  • ...erimental results demonstrate that our method outperforms the state-of-the-art algorithms.
    2 KB (276 words) - 22:01, 1 October 2012
  • Title: '''Fine & coarse grain clustering of tweets based on topics''' ...ce, and "Just for Fun". We propose to address this problem by gathering Twitter data for approximately 30 popular hash tags corresponding to the different topic
    5 KB (862 words) - 20:21, 14 February 2011
  • ...for 10-605 Overview|Overview]]. Grading policies etc., History of Big Data, Complexity theory and cost of important operations ...ting for 10-605 Probability Review|Probability Review]]. Counting for big data and density estimation, streaming Naive Bayes, Rocchio and TFIDF
    9 KB (1,220 words) - 12:06, 28 November 2017
  • * Tues Sep 8. [[Class meeting for 10-605 Streaming Naive Bayes|Streaming algorithms and Naive Bayes; The stream-and-sort design pattern; Naive Bayes for large ...10-605 Phrases_with_Stream_and_Sort|Implementing Phrase Finding and Large-Data Testing for Naive Bayes with Stream-and-Sort]].
    6 KB (908 words) - 10:07, 11 October 2016
  • * Find a recent paper which includes data and an implementation for a non-trivial neural model in a standard framewor ...tion on CPUs. The architecture you should build should stream through the data, and construct multiple tasks which require a worker to perform a minibatch
    9 KB (1,249 words) - 13:46, 30 April 2018
  • ...els. I have also worked previously on the area of opinion mining and graph clustering.
    3 KB (426 words) - 12:40, 8 September 2011
  • ...wledge propagation, and identifying influence. In this work we plan to use data from a forum dedicated to studying the Spanish language to facilitate lang ...nal unsupervised clustering techniques such as Yang & Meng's (2006) Markov clustering approach.
    6 KB (834 words) - 10:46, 15 February 2011
  • * Clustering for aligning multiple language entity names based on page topic * Large scale data
    4 KB (604 words) - 03:59, 24 October 2011
  • For clustering the hyperlinks, they use [[UsesMethod::PHITS]] which is mathematically iden * K. Bharat and M. R. Henzinger. Improved algorithms for topic distillation in hyperlinked environments.
    4 KB (610 words) - 17:08, 5 November 2012
  • * Tues Jan 20. [[Class meeting for 10-605 Streaming Naive Bayes|Streaming algorithms and Naive Bayes; The stream-and-sort design pattern; Naive Bayes for large * Thurs Feb 19. [[Class meeting for 10-605 Randomized|Randomized Algorithms 1]]
    9 KB (1,328 words) - 14:50, 14 October 2015
  • ...vant to the course - e.g., to compare the scalability of variant learning algorithms on datasets. * Geographical names and places - data on places from GeoNames, Wikipedia, and Geo-tagged Flickr images.
    5 KB (716 words) - 11:34, 1 May 2012
  • * Wed Jan 23. [[Class meeting for 10-605 2013 01 23|Streaming algorithms and Naive Bayes; The stream-and-sort design pattern; Naive Bayes for large * Wed Jan 30. [[Class meeting for 10-605 2013 01 30|More on streaming algorithms: Rocchio, and theory of on-line learning]]
    7 KB (1,005 words) - 17:20, 10 January 2014
  • • We store a persistent database of entities using this clustering, whereby each cluster represents a real-world entity. In other words, an en o For how many entities does a given attribute exist in the data?
    4 KB (675 words) - 18:19, 1 February 2011
  • • We store a persistent database of entities using this clustering, whereby each cluster represents a real-world entity. In other words, an en o For how many entities does a given attribute exist in the data?
    5 KB (739 words) - 18:19, 1 February 2011
  • ...n Proceedings of the fourth ACM international conference on Web search and data mining, 2011. [http://web.eecs.umich.edu/~congy/work/wsdm11.pdf] Both papers made use of algorithms from time series models and graph clustering to solve their respective problems.
    5 KB (842 words) - 23:49, 5 November 2012
  • ..._WWW2009]] || [[Preserving the privacy of sensitive relationships in graph data. PinKDD, 2007]] [http://www.springerlink.com/content/n1404m0668452854/] || ...supervised_learning_algorithm_for_link_prediction]] || [[Fast and scalable algorithms for semi-supervised link prediction on static and dynamic graphs]] [http://
    12 KB (1,642 words) - 17:02, 30 November 2012
  • ...hus does not need tweet ranking. As time goes on, we will acquire more data from him, so we can recommend accordingly. ...ted in we want to use the learnt graph from the previous step and then use algorithms for link prediction to the nodes which are tweets. Given a graph G(V, E), w
    15 KB (2,240 words) - 23:45, 14 February 2011
  • ...particular research area looking at the changes in the currently available data. Algorithms, Sociology, Signal Processing.
    15 KB (2,315 words) - 00:18, 15 February 2011
  • |title=Maximum Likelihood from Incomplete Data via the EM Algorithm |title=Maximum likelihood theory for incomplete data from an exponential family
    39 KB (5,817 words) - 21:17, 26 September 2012
  • ...activity (# of tweets posted on Twitter), while [15] did this by means of clustering email-exchange network graph. In topic modeling to model document network data, [19] proposed relation topic models for document networks, [19] proposed a
    12 KB (1,759 words) - 16:03, 3 February 2011