Face-to-face training

Scraping - getting a computer to capture information from online sources - is one of the most powerful techniques for data-savvy journalists who want to get to the story first, or find exclusives that no one else has spotted.

This three-day workshop in scraping is designed for reporters with no knowledge of scraping or programming and provides essential skills for getting original stories by compiling data across a range of online sources. By the end of the workshop, you will be able to use specialist scraping tools (without programming) and begin to write your own, more advanced, scrapers. You will also be able to communicate with programmers on relevant projects.


Tuesday, 23 January: Scraping basics

10-10:30am           Registration

10:30-11:15am      Introduction: What scraping is and how news organisations are using it 

11:30am-12:15pm      Pitching story ideas involving scraping

12:15-1pm             Scraping basics: finding structure in HTML and URLs 

1-2pm                    Lunch 

2-3:45pm               Simple scraping jobs: checking a webpage every day; identifying information using XPath

4-5pm                    Introduction to scraping tools: Outwit Hub 

Wednesday, 24 January: Looking at what's available 

9-10am                   Advanced Outwit Hub: scraping multiple pages 

10-10:15am            What's possible with programming: APIs, regex and loops 

10:30am-12pm       Scraping text that fits a pattern: regex 

12-1pm                    Lunch 

1-3:45pm                 Basic scraping with Python and Morph.io

4-5pm                      Scraping database search results by following links: loops 

Thursday, 25 January: Advanced techniques

9-10am                    Advanced scraping: spreadsheets 

10-11am                  Advanced scraping: PDFs 

11am-12pm             Scraping lab: problem solving 

12-1pm                    Lunch 

1-4pm                      Scraping lab: problem solving 

4-5pm                      Wrap up, final results


Big organisations (10+ people) - £405
Freelancers and small organisations (9 people and fewer) - £305
Students (correspondence/evening course) - £205 (limited availability)
Students (full time) - £155 (limited availability)

Full-time Goldsmiths students get a 20% discount on all CIJ courses (limited availability). Please contact marina(at)tcij.org for more details.

Tags: Journalism education, Online news, Online media, Data journalism
Start Date: 23/01/2018
End Date: 25/01/2018