Huizhi Liang

Also published as: HuiZhi Liang


2022

pdf bib
UoR-NCL at SemEval-2022 Task 3: Fine-Tuning the BERT-Based Models for Validating Taxonomic Relations
Thanet Markchom | Huizhi Liang | Jiaoyan Chen
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)

In human languages, there are many presuppositional constructions that impose a constrain on the taxonomic relations between two nouns depending on their order. These constructions create a challenge in validating taxonomic relations in real-world contexts. In SemEval2022-Task3 Presupposed Taxonomies: Evaluating Neural Network Semantics (PreTENS), the organizers introduced a task regarding validating the taxonomic relations within a variety of presuppositional constructions. This task is divided into two subtasks: classification and regression. Each subtask contains three datasets in multiple languages, i.e., English, Italian and French. To tackle this task, this work proposes to fine-tune different BERT-based models pre-trained on different languages. According to the experimental results, the fine-tuned BERT-based models are effective compared to the baselines in classification. For regression, the fine-tuned models show promising performance with the possibility of improvement.

pdf bib
UoR-NCL at SemEval-2022 Task 6: Using ensemble loss with BERT for intended sarcasm detection
Emmanuel Osei-Brefo | Huizhi Liang
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)

Sarcasm has gained notoriety for being difficult to detect by machine learning systems due to its figurative nature. In this paper, Bidirectional Encoder Representations from Transformers (BERT) model has been used with ensemble loss made of cross-entropy loss and negative log-likelihood loss to classify whether a given sentence is in English and Arabic tweets are sarcastic or not. From the results obtained in the experiments, our proposed BERT with ensemble loss achieved superior performance when applied to English and Arabic test datasets. For the validation dataset, our model performed better on the Arabic dataset but failed to outperform the baseline method (made of BERT with only a single loss function) when applied on the English validation set.

2021

pdf bib
UoR at SemEval-2021 Task 4: Using Pre-trained BERT Token Embeddings for Question Answering of Abstract Meaning
Thanet Markchom | Huizhi Liang
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)

Most question answering tasks focuses on predicting concrete answers, e.g., named entities. These tasks can be normally achieved by understanding the contexts without additional information required. In Reading Comprehension of Abstract Meaning (ReCAM) task, the abstract answers are introduced. To understand abstract meanings in the context, additional knowledge is essential. In this paper, we propose an approach that leverages the pre-trained BERT Token embeddings as a prior knowledge resource. According to the results, our approach using the pre-trained BERT outperformed the baselines. It shows that the pre-trained BERT token embeddings can be used as additional knowledge for understanding abstract meanings in question answering.

pdf bib
UoR at SemEval-2021 Task 7: Utilizing Pre-trained DistilBERT Model and Multi-scale CNN for Humor Detection
Zehao Liu | Carl Haines | Huizhi Liang
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)

Humour detection is an interesting but difficult task in NLP. Because humorous might not be obvious in text, it can be embedded into context, hide behind the literal meaning and require prior knowledge to understand. We explored different shallow and deep methods to create a humour detection classifier for task 7-1a. Models like Logistic Regression, LSTM, MLP, CNN were used, and pre-trained models like DistilBert were introduced to generate accurate vector representation for textual data. We focused on applying multi-scale strategy on modelling, and compared different models. Our best model is the DistilBERT+MultiScale CNN, it used different sizes of CNN kernel to get multiple scales of features, which achieved 93.7% F1-score and 92.1% accuracy on the test set.

pdf bib
UOR at SemEval-2021 Task 12: On Crowd Annotations; Learning with Disagreements to optimise crowd truth
Emmanuel Osei-Brefo | Thanet Markchom | Huizhi Liang
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)

Crowdsourcing has been ubiquitously used for annotating enormous collections of data. However, the major obstacles to using crowd-sourced labels are noise and errors from non-expert annotations. In this work, two approaches dealing with the noise and errors in crowd-sourced labels are proposed. The first approach uses Sharpness-Aware Minimization (SAM), an optimization technique robust to noisy labels. The other approach leverages a neural network layer called softmax-Crowdlayer specifically designed to learn from crowd-sourced annotations. According to the results, the proposed approaches can improve the performance of the Wide Residual Network model and Multi-layer Perception model applied on crowd-sourced datasets in the image processing domain. It also has similar and comparable results with the majority voting technique when applied to the sequential data domain whereby the Bidirectional Encoder Representations from Transformers (BERT) is used as the base model in both instances.

2020

pdf bib
UoR at SemEval-2020 Task 4: Pre-trained Sentence Transformer Models for Commonsense Validation and Explanation
Thanet Markchom | Bhuvana Dhruva | Chandresh Pravin | Huizhi Liang
Proceedings of the Fourteenth Workshop on Semantic Evaluation

SemEval Task 4 Commonsense Validation and Explanation Challenge is to validate whether a system can differentiate natural language statements that make sense from those that do not make sense. Two subtasks, A and B, are focused in this work, i.e., detecting against-common-sense statements and selecting explanations of why they are false from the given options. Intuitively, commonsense validation requires additional knowledge beyond the given statements. Therefore, we propose a system utilising pre-trained sentence transformer models based on BERT, RoBERTa and DistillBERT architectures to embed the statements before classification. According to the results, these embeddings can improve the performance of the typical MLP and LSTM classifiers as downstream models of both subtasks compared to regular tokenised statements. These embedded statements are shown to comprise additional information from external resources which help validate common sense in natural language.

pdf bib
UoR at SemEval-2020 Task 8: Gaussian Mixture Modelling (GMM) Based Sampling Approach for Multi-modal Memotion Analysis
Zehao Liu | Emmanuel Osei-Brefo | Siyuan Chen | Huizhi Liang
Proceedings of the Fourteenth Workshop on Semantic Evaluation

Memes are widely used on social media. They usually contain multi-modal information such as images and texts, serving as valuable data sources to analyse opinions and sentiment orientations of online communities. The provided memes data often face an imbalanced data problem, that is, some classes or labelled sentiment categories significantly outnumber other classes. This often results in difficulty in applying machine learning techniques where balanced labelled input data are required. In this paper, a Gaussian Mixture Model sampling method is proposed to tackle the problem of class imbalance for the memes sentiment classification task. To utilise both text and image data, a multi-modal CNN-LSTM model is proposed to jointly learn latent features for positive, negative and neutral category predictions. The experiments show that the re-sampling model can slightly improve the accuracy on the trial data of sub-task A of Task 8. The multi-modal CNN-LSTM model can achieve macro F1 score 0.329 on the test set.

2016

pdf bib
UNIMELB at SemEval-2016 Tasks 4A and 4B: An Ensemble of Neural Networks and a Word2Vec Based Model for Sentiment Classification
Steven Xu | HuiZhi Liang | Timothy Baldwin
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

pdf bib
UniMelb at SemEval-2016 Task 3: Identifying Similar Questions by combining a CNN with String Similarity Measures
Timothy Baldwin | Huizhi Liang | Bahar Salehi | Doris Hoogeveen | Yitong Li | Long Duong
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

2015

pdf bib
RoseMerry: A Baseline Message-level Sentiment Classification System
Huizhi Liang | Richard Fothergill | Timothy Baldwin
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)