Talk:Improving Knowledge-Based Weakly Supervised Information Extraction



Freebase is a repository of structured data covering almost 22 million entities. It has been used as a knowledge base by Google.


Data Access

Freebase data is available through either the API or a data dump.


Follow the instructions here to download the data dump. The data is around 3.5 GB compressed (35 GB uncompressed).


Freebase data can be queried with the Metaweb Query Language (MQL), and the API is open to plain HTTP GET requests. One example of a URL-encoded query is: {%22query%22:{%22type%22:%22/music/artist%22%2C%22name%22:%22The%20Police%22%2C%22album%22:%5B%5D}}
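As a minimal sketch of how an encoded query like the one above can be produced, the Python snippet below serializes a query envelope to compact JSON and percent-encodes it. The function name encode_mql is hypothetical, and the choice of which characters to leave unescaped is an assumption made only to match the example string. The endpoint URL is not shown here, so the sketch builds only the query portion.

```python
import json
from urllib.parse import quote, unquote


def encode_mql(query: dict) -> str:
    """Serialize a query envelope to compact JSON and percent-encode it.

    Hypothetical helper: the name and the set of unescaped characters are
    assumptions chosen to match the example string in this post.
    """
    compact = json.dumps(query, separators=(",", ":"))
    # Leave braces, colons and slashes unescaped, as in the example above.
    return quote(compact, safe="{}:/")


# Same query as the encoded example: an artist named "The Police",
# with an empty list asking for the albums to be filled in.
envelope = {"query": {"type": "/music/artist", "name": "The Police", "album": []}}
encoded = encode_mql(envelope)
print(encoded)

# Decoding reverses the process and recovers the original JSON.
decoded = json.loads(unquote(encoded))
```

Building the string from a dictionary rather than by hand avoids quoting mistakes, and decoding it again is a quick way to check that nothing was lost.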

Queries and return values are expressed in JSON, which most programming languages handle easily. For example, for the following query:

 {
   "query": {"type": "/music/artist", "name": "The Police", "album": []}
 }

The return value is:

 {
   "code": "/api/status/ok",
   "result": {
     "album": [
       "Outlandos d'Amour",
       "Reggatta de Blanc",
       "Zenyatt\u00e0 Mondatta",
       "Ghost in the Machine",
       "Every Breath You Take: The Singles"
     ],
     "name": "The Police",
     "type": "/music/artist"
   },
   "status": "200 OK",
   "transaction_id": "cache;cache03.p01.sjc1:8101;2011-10-12T03:48:44Z;0018"
 }

Wherever the query contains 'null' or '[]', Freebase fills in the corresponding data in the response.
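The fill-in behaviour can be illustrated with a small Python sketch that lists the placeholder properties of a query and then reads the matching values out of a response. The helper names placeholder_fields and read_result are hypothetical, and treating any code other than "/api/status/ok" as a failure is an assumption, since only the success case appears in this post. The response text is the one quoted above.

```python
import json


def placeholder_fields(query: dict) -> list:
    """Return the property names whose value is null (None) or an empty list."""
    return [key for key, value in query.items() if value is None or value == []]


def read_result(response_text: str) -> dict:
    """Parse a response envelope and return its "result" object.

    Assumption: any code other than "/api/status/ok" means the query failed.
    """
    envelope = json.loads(response_text)
    if envelope.get("code") != "/api/status/ok":
        raise ValueError(f"query failed: {envelope.get('code')}")
    return envelope["result"]


query = {"type": "/music/artist", "name": "The Police", "album": []}

# Raw string so that the \u00e0 escape is decoded by the JSON parser.
response_text = r"""
{
  "code": "/api/status/ok",
  "result": {
    "album": ["Outlandos d'Amour", "Reggatta de Blanc", "Zenyatt\u00e0 Mondatta",
              "Ghost in the Machine", "Every Breath You Take: The Singles"],
    "name": "The Police",
    "type": "/music/artist"
  },
  "status": "200 OK",
  "transaction_id": "cache;cache03.p01.sjc1:8101;2011-10-12T03:48:44Z;0018"
}
"""

result = read_result(response_text)
for field in placeholder_fields(query):
    # Each placeholder in the query has been filled in by the response.
    print(field, "->", result[field])
```

Properties given with a concrete value in the query ("type" and "name" here) act as constraints and come back unchanged, while the empty list is the only part that gets filled.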

Freebase provides an MQL query editor for generating queries.


--posted by Wpang 03:55, 12 October 2011 (UTC)

Freebase Wikipedia Extraction (WEX)

Google provides an open-source tool, Freebase Wikipedia Extraction (WEX), for processing dumps of English-language Wikipedia pages. Please help with understanding what this tool does.

--posted by Wpang 04:02, 12 October 2011 (UTC)