Florida International University
Mustafa Ocal is a Ph.D. candidate at the Knight Foundation School of Computing and Information Sciences, Florida International University, under the supervision of Professor Mark A. Finlayson. He conducts research at the Cognition, Narrative, and Culture Laboratory (Cognac Lab). His research interests lie in Natural Language Processing, with a focus on Temporal Information Extraction and Temporal Reasoning. He received an M.S. degree in Computer Science from Florida International University in 2021 and a B.S. degree in Computer Engineering from Gazi University, Turkey, in 2017.
Narratives are sequences of events, and one important way to understand a narrative is to interpret the order of its events (i.e., its timeline). However, timelines are not stated explicitly in texts and cannot be read off directly. Instead, texts reveal partial orderings of events and times. This information can be used to construct a temporal graph using the TimeML temporal representation language.
The first part of my work focused on timeline extraction from TimeML annotations. Prior approaches are machine learning-based systems with several limitations: their scores are imperfect, they ignore subordinated relations, and they cannot handle all types of temporal relations. I addressed these issues with a solution based on constraint satisfaction problems (CSP) that achieved state-of-the-art performance.
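To make the constraint-satisfaction idea concrete, here is a minimal sketch in Python. It is not the system described above. It assumes each event is an integer interval, covers only a handful of TimeML relation types with simplified semantics, and uses plain backtracking. The names `holds` and `solve` and the example events are hypothetical.

```python
# Illustrative sketch only: a toy backtracking solver, not the author's system.
# Assumption: each event is an interval (start, end) with start < end on a
# small integer grid, and relation semantics are simplified.

def holds(rel, a, b):
    """Return True if intervals a and b satisfy the given relation."""
    (s1, e1), (s2, e2) = a, b
    if rel == "BEFORE":
        return e1 < s2
    if rel == "AFTER":
        return e2 < s1
    if rel == "SIMULTANEOUS":
        return s1 == s2 and e1 == e2
    if rel == "INCLUDES":          # simplified: strict containment
        return s1 < s2 and e2 < e1
    if rel == "IS_INCLUDED":
        return s2 < s1 and e1 < e2
    raise ValueError(f"unsupported relation: {rel}")


def solve(events, tlinks):
    """Assign an interval to every event so that all links hold.

    events: list of event names
    tlinks: list of (event_a, relation, event_b) triples
    Returns a dict {event: (start, end)}, or None if the links are inconsistent.
    """
    horizon = 2 * len(events)      # enough grid points for all endpoints
    domain = [(s, e) for s in range(horizon) for e in range(s + 1, horizon)]
    assignment = {}

    def consistent():
        # Check only the links whose two events are both assigned so far.
        return all(
            holds(rel, assignment[a], assignment[b])
            for a, rel, b in tlinks
            if a in assignment and b in assignment
        )

    def backtrack(i):
        if i == len(events):
            return True
        for interval in domain:
            assignment[events[i]] = interval
            if consistent() and backtrack(i + 1):
                return True
        assignment.pop(events[i], None)   # undo before returning to caller
        return False

    return dict(assignment) if backtrack(0) else None


# Hypothetical example: a partial ordering of four events.
events = ["wake", "eat", "leave", "morning"]
tlinks = [
    ("wake", "BEFORE", "eat"),
    ("eat", "BEFORE", "leave"),
    ("morning", "INCLUDES", "eat"),
]
timeline = solve(events, tlinks)
```

Sorting the returned intervals by start point gives one timeline consistent with the links. A contradictory set of links, such as A BEFORE B together with B BEFORE A, makes `solve` return `None`. Exhaustive backtracking is only practical for very small graphs.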
TimeML annotations can be generated either manually or by automatic TimeML annotators. However, manual annotations contain human errors. In the second part of my work, I built a system to detect and fix errors in gold-standard annotations. I tested the system on the TimeBank corpus and provided corrections for the entire corpus.
In the third part of my work, I developed a novel suite of methods to evaluate the performance of automatic TimeML annotators and to measure the information lost during automatic annotation. I presented eight metrics and evaluated four state-of-the-art automatic annotation tools.
In the final part of my work, I implemented a duration extraction system. This work resulted in a large dataset that contains hundreds of thousands of possible event durations. By combining this system with the timeline extraction system, I was able to extract the duration of entire narratives.