SoBigData Event

Summer School on Large Scale Text and Social Media Analytics with GATE

The GATE training course will be held from 17 to 21 June 2019 at the University of Sheffield, UK. Early bird registration is available at a reduced rate before 1 May.

This event will follow a similar format to that of the 2018 course, with one track Monday to Thursday, and two parallel tracks on Friday, all delivered by the GATE development team. For more information about the schedule, course materials, travel, accommodation, and local information, please see the FIG participants wiki.

The focus will be on mining text and social media content with GATE. Many of the hands-on exercises will focus on analysing news articles, tweets, and other textual content.

The planned schedule is as follows (note: it may still be subject to timetabling changes).

Single track from Monday to Thursday (9am - 5pm):

  • Monday: Module 1: Basic Information Extraction with GATE
    • Intro to GATE + Information Extraction (IE)
    • Corpus Annotation and Evaluation
    • Writing Information Extraction Patterns with JAPE
  • Tuesday: Module 2: Using GATE for social media analysis
    • Challenges for analysing social media, GATE for social media
    • Twitter intro + JSON structure
    • Language identification, tokenisation for Twitter
    • POS tagging and Information Extraction for Twitter
  • Wednesday: Module 3: Crowdsourcing, GATE Cloud/MIMIR, and Machine Learning
    • Crowdsourcing annotated social media content with the GATE crowdsourcing plugin
    • GATE Cloud, deploying your own IE pipeline at scale (how to process 5 million tweets in 30 minutes)
    • GATE Mimir - how to index and search semantically annotated social media streams
    • Challenges of opinion mining in social media
    • Training Machine Learning Models for IE in GATE
  • Thursday: Module 4: Advanced IE and Opinion Mining in GATE
    • Advanced Information Extraction
    • Useful GATE components (plugins)
    • Opinion mining components and applications in GATE

On Friday, there is a choice of modules (9am - 5pm):

  • Module 5: GATE for developers
    • Basic GATE Embedded
    • Writing your own plugin
    • GATE in production - multi-threading, web applications, etc.
  • Module 6: GATE Applications
    • Building your own applications
    • Examples of some current GATE applications: social media summarisation, visualisation, Linked Open Data for IE, and more

Please note that these two modules are run in parallel, so you can only attend one of them. You must state on the booking form which module you would like to follow on Friday. Note that you will be expected to have some programming experience and knowledge of Java to follow Module 5 on the Friday. No particular expertise is needed for Module 6.