Previous Workshops

If your university or organisation would like to host a workshop, please contact us.

Workshop on Language Corpora in Australia

Over decades of work in Australia, significant collections of language data have been amassed, covering varieties of Australian English, Australian migrant languages, Australian Indigenous languages, sign languages and others. These collections represent a trove of knowledge not only of language in Australia, but also of Australia’s social and cultural history. And yet, not all are well known and many lack published descriptions. The purpose of this workshop is to provide an opportunity to share information about existing language corpora in Australia, with a view to producing a special issue of the Australian Journal of Linguistics that introduces a selection of these corpora, explores how they can contribute to our understanding of language, society, and history in Australia, and considers avenues that such corpora open up for future research.

When: Monday July 3 2023

Length: Full day

Organisers: Catherine Travis, Li Nguyen


Australian Text Analytics Platform: New tools for text analysis

The main means of delivery for ATAP is Jupyter notebooks and this half-day workshop began with a brief introduction to notebooks for participants not already familiar with the technology. The main body of the workshop introduced two tools made available by ATAP, and the session ended with a short summary of other tools being developed in recent work.

When: Monday June 19 2023

Length: 3 hours

Facilitators: Simon Musgrave, Sam Hames, Ben Foley, Chao Sun

More information: This workshop was a satellite event of the 7th Conference of the International Society for the Linguistics of English (ISLE7) - see the conference website for further details.

Exploring Digital Text Collections with Juxtorpus: A Taster Webinar on the Latest ATAP Text Analysis Tool

Join us for a hybrid taster webinar on the latest addition to the suite of ATAP text analysis tools - Juxtorpus. Developed to provide a unified framework for managing and exploring text contents and metadata, Juxtorpus offers a Corpus package that enables flexible building, exploration, and slicing of your corpus while maintaining its shape, and a Jux package that allows for easy comparison and highlighting of differences between any two corpora with tools and visualisations that come off-the-shelf. During the webinar, we’ll also show you how to use other ATAP tools in combination with the Corpus to create a reusable workflow that will boost your analysis capabilities.

This 1.5-hour webinar will come with minimal hands-on opportunities, and we invite anyone interested in learning how to handle and analyse their digital text collections to join us. No programming knowledge or skills are required.

When: Thursday, 25th May 2023

Length: 90 minutes

Facilitator: Chao Sun

Jefferson Transcript Search Tool

The Search Tool project uses programming to explore how to easily search and manipulate transcripts without the need to ‘clean’ the transcript. A browser-based tool has been developed, designed to be used by researchers unfamiliar with programming.

The workshop was presented by Evelyn Ansell and is an outcome of her Career Development placement with Australia’s Academic and Research Network (AARNet). The Jupyter Notebook tool and this workshop were developed during that placement.

Date: Friday 17 March 2023

Length: 90 minutes

Facilitator: Evelyn Ansell

A hands-on guide to Semantic Tagger for your text data analysis

The Australian Text Analytics Platform (ATAP) project aims to provide researchers with the tools and training for analysing, processing, and exploring text. As part of this project, we have adapted, with permission, a Semantic Tagger developed by the University Centre for Computer Corpus Research on Language (UCREL) at Lancaster University. This tool uses the Python Multilingual UCREL Semantic Analysis System (PyMUSAS) to tag your text data so that you can extract token-level semantic tags from your text. In addition to the USAS tags, this tool can also recognise Multi Word Expressions (MWE), i.e., expressions formed by two or more words that behave like a unit such as ‘South Australia’, and identify lemmas and Part-of-Speech (POS) tags in the text. For example, in the sentence ‘President Joe Biden attended two meetings today’, the tool will tag each token with its semantic tag as follows: ‘President Joe Biden’: MWE of [Personal names], ‘attended’: [Participating], ‘two’: [Number], ‘meetings’: [Participating] and ‘today’: [Time: Present; simultaneous]. This tool is available in both English and multilingual (Chinese, Italian and Spanish) versions and supports saving the results locally for further analysis, enabling you to gain meaningful insights into your research questions.
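
To give a rough sense of what token-level tagging with multi-word expressions produces, here is a minimal Python sketch. It does not use PyMUSAS or the ATAP Semantic Tagger itself: the LEXICON entries, the tag_tokens function and the 'Unmatched' label are hypothetical, written only for illustration, and the tags are copied from the example sentence in the description. The sketch simply looks up the longest matching run of tokens in a tiny hand-written lexicon.

```python
# Illustrative only: a toy longest-match tagger, NOT the PyMUSAS API.
# LEXICON is a hypothetical, hand-written lexicon built from the
# example sentence in the workshop description.
LEXICON = {
    ("president", "joe", "biden"): "Personal names",
    ("attended",): "Participating",
    ("two",): "Number",
    ("meetings",): "Participating",
    ("today",): "Time: Present; simultaneous",
}

# Longest entry in the lexicon, measured in tokens.
MAX_LEN = max(len(key) for key in LEXICON)


def tag_tokens(tokens):
    """Return (text, tag, is_mwe) triples, preferring the longest match."""
    results = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate span first so that multi-word
        # expressions win over their individual words.
        for n in range(min(MAX_LEN, len(tokens) - i), 0, -1):
            key = tuple(token.lower() for token in tokens[i:i + n])
            if key in LEXICON:
                results.append((" ".join(tokens[i:i + n]), LEXICON[key], n > 1))
                i += n
                break
        else:
            # No lexicon entry covers this token.
            results.append((tokens[i], "Unmatched", False))
            i += 1
    return results


if __name__ == "__main__":
    sentence = "President Joe Biden attended two meetings today"
    for text, tag, is_mwe in tag_tokens(sentence.split()):
        label = "MWE of " if is_mwe else ""
        print(f"{text}: {label}[{tag}]")
```

The real tagger relies on much larger lexicons and on lemma and POS information, so this sketch should be read as an illustration of the output shape rather than of the underlying method.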

Date: Wednesday 22 March 2023

Length: 90 minutes

Facilitator: Sony Jufri

Australian Text Analytics Platform tools: Discursis, Juxtorpus, Quotation tool and Semantic tagger

This workshop was part of the USyd Digital Humanities Day 2023.

The workshop demonstrated and taught several recently or soon-to-be-released tools from the ATAP text analytic tool collection. These tools include Discursis for analysing human conversational texts, Juxtorpus for advanced corpus slicing and comparison, Semantic Tagger for semantically tagging every word in your text collections, and Quotation Tool for NLP algorithm-based quotation extraction, analysis, and visualisation.

Date: Tuesday 14 March 2023

Length: 3 hours

Facilitators: Staff of the Sydney Informatics Hub

HASS Research Data Commons and IRC Computational Skills Summer School

The Australian Research Data Commons (ARDC) through the HASS Research Data Commons and Indigenous Research Capability (HASS RDC and IRC Program) offered a Computational Skills Summer School in Sydney, February 7 and 8, 2023.

The Summer School featured skills development workshops to help researchers use the research infrastructure that is being created in the HASS RDC and IRC Program.

The projects from the HASS RDC and IRC Program presented workshops on using the tools and platforms.


Pre-conference workshop (before the 2022 Conference of the Australian Linguistic Society)

The Australian Text Analytics Platform and the Language Data Commons of Australia presented a day of workshop activities to give ALS conference delegates (and anyone else who is interested) the opportunity to learn more about the work of the two projects. The day included:

  • an overview of the projects and the work to date
  • reports on specific sub-projects
  • introductory workshops on the tools and resources being developed
  • a workshop on using Discursis, a tool for tracking topics in interactive use of language
  • the opportunity to influence future work by exploring and providing feedback on the data interface which we are building.

Further details (including full program)

Geolocating Australian Historical Resources

This workshop was part of the Australian Society of Archivists 2022 Conference

Date: October 20 2022

Length: 3 hours

Facilitators: Michael Niemann, Fiannuala Morgan (ANU), Simon Musgrave

Learn how to collect and analyse comments on YouTube videos using the open-source tools Youte and Discursis

Date: September 21 2022

Length: 3 hours

Facilitators: Boyd Nguyen (ADO), Sam Hames (ATAP)

Finding quotes and speakers in text using the ATAP quotation tools

Date: September 8 2022

Length: 1 hour

Facilitator: Sony Jufri

Advance care planning for your research data

Date: September 7 2022

Event: FAVeR Showcasing Approaches to Digital Humanities for Researchers

Length: 1 hour

Facilitators: Sam Hames, Ben Foley


Date: June 21 2022

Event: SICSS-Sydney

Length: 1 hour

Facilitators: Sam Hames, Ben Foley

Computational Thinking in the Humanities

Computational Thinking in the Humanities was a 3-hour online workshop featuring two plenary talks, lightning presentations, and a panel discussion. The workshop was co-organised by the Australian Text Analytics Platform (ATAP), FIN-CLARIAH and its UEF representatives, and the Australian Digital Observatory.

Date: September 1 2022

Length: 3 hours

Further details

Network analysis and Topic Modeling on Twitter data using R

Date: May 18 2022

Event: Joint event ADO and ATAP

Length: 3 hours

Facilitators: Alice Miller, Simon Musgrave

Monotreme Mania! Comparative text analytics on Twitter data

Date: 16 March 2022

Event: Joint event ADO and ATAP

Length: 3 hours

Facilitators: Sam Hames, Simon Musgrave

An introduction to Jupyter notebooks for text analysis: Virtual workshop for absolute beginners

Date: August 24 2022

Event: FAVeR Showcasing Approaches to Digital Humanities for Researchers

Length: 2 hours

Facilitators: Sara King, Simon Musgrave


Date: 27 July 2022

Event: Workshop for Sydney Corpus Lab

Length: 3 hours

Facilitators: Sara King, Simon Musgrave


Date: 24 November 2021

Event: Digital Humanities Australasia 2021 Conference

Length: 3 hours

Facilitators: Sara King, Simon Musgrave