Ata Nizamoglu, Lea Dahm, Talia Sari, Vera Schmitt, Salar Mohtaj, Sebastian Möller. In Proceedings of the 10th Language & Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics.
In 2015, the United Nations (UN) provided a blueprint for sustainable development in various domains. This blueprint described 17 Sustainable Development Goals (SDGs), such as “No Poverty”, “Zero Hunger”, and “Gender Equality”. Since then, many companies have started publishing yearly sustainability reports that explain their efforts with respect to the SDGs. However, manually assessing these reports is infeasible, so automatic text processing is necessary to aggregate information about the distribution of SDGs across domains. In this research, we developed and evaluated a range of natural language processing models, from classical to transfer learning-based approaches, to identify the SDG targeted in sustainability reports. Transformer-based models perform best on this task, especially BERT-based models such as RoBERTa. The results show that automatically classifying SDGs in text documents is feasible and can be used to aggregate information about which SDGs are covered by which companies and industry domains.
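To make the classification task concrete, the following is a minimal sketch of what a classical baseline for it could look like. It is not the authors' implementation: the choice of multinomial Naive Bayes, the class name `NaiveBayesSDG`, and the toy training sentences are all assumptions made for illustration, and the sketch uses only the Python standard library.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesSDG:
    """Hypothetical multinomial Naive Bayes classifier with add-one smoothing."""

    def fit(self, texts, labels):
        # Count documents per SDG label and word occurrences per label.
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        # Words never seen during training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(n_docs / total_docs)
            score += sum(math.log((counts[t] + 1) / denom) for t in tokens)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy sentences invented for this sketch; they are not from the paper's data.
train_texts = [
    "We fund programmes that lift families out of extreme poverty.",
    "Our microfinance initiative supports low income households.",
    "We donate surplus food to reduce hunger in local communities.",
    "Our farming partners improve crop yields and food security.",
    "We closed the pay gap between women and men.",
    "Half of our leadership positions are held by women.",
]
train_labels = [
    "No Poverty", "No Poverty",
    "Zero Hunger", "Zero Hunger",
    "Gender Equality", "Gender Equality",
]

model = NaiveBayesSDG().fit(train_texts, train_labels)
prediction = model.predict("The company increased the share of women in management.")
print(prediction)  # Gender Equality
```

A bag-of-words model like this only picks up on surface vocabulary. A transformer-based model such as RoBERTa would replace the word counts with contextual representations fine-tuned on labeled report passages.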