Instruction-Tuned LLMs for Event Extraction

Research on enhancing event extraction using instruction-tuned large language models (LLMs), optimizing annotation guidelines, and improving fine-tuning techniques for NLP tasks.

---
Research conducted at George Mason University (GMU) - Natural Language Processing Lab
Research Supervisor: Professor Ziyu Yao
Conference Submission: ACL 2025
---

📜 Research Paper & Resources


🛠 Tech Stack & Tools

  • Machine Learning & NLP: LLaMA-3.1, Hugging Face Transformers, PyTorch, LoRA Fine-Tuning, Unsloth, Quantization
  • Data Processing: JSON, Python (Pandas, NumPy)
  • GPU Resources: HPC, CUDA
  • Evaluation Metrics: Precision, Recall, F1-score

📖 Research Overview

This research explores enhancing event extraction (EE) by leveraging instruction-tuned large language models (LLMs). Traditional event extraction models struggle with limited training data, ambiguous event definitions, and poor scalability to new event types. To address these challenges, our approach:

  • Synthesizes annotation guidelines for 500+ event types and 4000+ argument structures (an illustrative guideline entry is sketched at the end of this overview)
  • Fine-tunes LLaMA-3.1 8B using LoRA and structured regularization techniques
  • Develops an evaluation framework for optimizing inference speed and cost efficiency
  • Reduces hallucinations by reframing free-text evaluation as code-based prompt optimization

This work aims to improve F1-score performance on event extraction tasks and enhance the reliability of LLM-generated event predictions.
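
To make the guideline format concrete, here is a hypothetical entry for a single event type (a sketch only; the exact schema synthesized in this work may differ):

```python
# Hypothetical guideline entry for one event type (illustrative only;
# the schema actually synthesized in this work may differ).
guideline_entry = {
    "event_type": "Conflict.Attack",
    "definition": (
        "An Attack event occurs when an agent deliberately uses force to "
        "cause physical harm or damage to a target."
    ),
    "argument_roles": {
        "Attacker": "The agent carrying out the attack.",
        "Target": "The person, object, or place being attacked.",
        "Instrument": "The weapon or means used in the attack.",
        "Place": "Where the attack takes place.",
    },
    "examples": [
        {"text": "Rebels attacked the convoy with rockets.", "trigger": "attacked"},
    ],
}
```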


📊 Major Contributions

1. Annotation Guideline Optimization

  • Developed a structured framework for creating high-quality event extraction annotation guidelines.
  • Synthesized event schemas covering 500+ event types and 4000+ argument structures using GPT-4o (see the synthesis sketch below).
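
A minimal sketch of how such a guideline can be drafted with GPT-4o through the OpenAI Python client; the prompt wording, JSON layout, and helper function are illustrative assumptions, not the exact synthesis pipeline used here:

```python
# Sketch of guideline synthesis with GPT-4o via the OpenAI Python client.
# The prompt wording and JSON layout are assumptions for illustration.
import json
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def synthesize_guideline(event_type: str, roles: list[str]) -> dict:
    """Ask GPT-4o to draft an annotation guideline for one event type."""
    prompt = (
        f"Write an annotation guideline for the event type '{event_type}'.\n"
        f"Cover these argument roles: {', '.join(roles)}.\n"
        "Return JSON with keys: definition, argument_roles (role -> description), "
        "and examples (two short annotated sentences)."
    )
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        response_format={"type": "json_object"},  # request machine-readable output
    )
    return json.loads(response.choices[0].message.content)

# Example: draft a guideline for one ACE-style event type.
guideline = synthesize_guideline(
    "Conflict.Attack", ["Attacker", "Target", "Instrument", "Place"]
)
```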

2. LLM Fine-Tuning & Optimization

  • Implemented Low-Rank Adaptation (LoRA) for fine-tuning LLaMA-3.1 8B on event extraction tasks using the Unsloth library.
  • Improved model efficiency using structured regularization techniques (a fine-tuning sketch follows this list).
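
A condensed sketch of LoRA fine-tuning with Unsloth on top of Hugging Face TRL; the checkpoint name, LoRA rank, target modules, dataset contents, and training hyperparameters below are illustrative assumptions rather than the exact configuration behind the reported results:

```python
# Sketch of LoRA fine-tuning LLaMA-3.1 8B with Unsloth + TRL.
# Checkpoint name, LoRA settings, and hyperparameters are illustrative assumptions.
from datasets import Dataset
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel

# Load a 4-bit quantized base model so it fits on a single GPU.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Meta-Llama-3.1-8B",
    max_seq_length=4096,
    load_in_4bit=True,
)

# Attach LoRA adapters so only small low-rank weight matrices are trained.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,                 # LoRA rank
    lora_alpha=16,
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing=True,
)

# Placeholder dataset; in practice this holds guideline-augmented EE instructions.
train_dataset = Dataset.from_list([
    {"text": "### Instruction: Extract all events...\n"
             "### Input: Rebels attacked the convoy.\n"
             "### Output: ..."},
])

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train_dataset,
    dataset_text_field="text",
    max_seq_length=4096,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,
        num_train_epochs=3,
        learning_rate=2e-4,
        bf16=True,
        output_dir="outputs",
    ),
)
trainer.train()
```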

3. Cost-Effective & Low-Latency Inference Pipelines

  • Designed a scalable NLP pipeline using GPU-optimized environments.
  • Reduced hallucinations during inference via prompt engineering guided by each event type's annotation guideline (see the sketch below).
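
The idea of guideline-guided prompting can be sketched as follows; the builder function and template wording are hypothetical and only illustrate injecting the per-event-type guideline into the prompt:

```python
# Hypothetical prompt builder: the template wording is illustrative, not the exact
# prompt used in this work. The guideline keeps the model constrained to the roles
# defined for the event type, which helps curb hallucinated arguments.
def build_prompt(passage: str, guideline: dict) -> str:
    roles = "\n".join(
        f"- {role}: {desc}" for role, desc in guideline["argument_roles"].items()
    )
    return (
        f"Event type: {guideline['event_type']}\n"
        f"Definition: {guideline['definition']}\n"
        f"Argument roles:\n{roles}\n\n"
        f"Passage: {passage}\n"
        "Extract the trigger span and an argument span for each role. "
        "If a role is not expressed in the passage, output None for it; "
        "do not invent spans."
    )
```

Here `guideline` matches the hypothetical entry format sketched in the Research Overview above.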

4. Performance Improvement

  • Achieved a 10% increase in F1-score compared to traditional event extraction baselines (see the scoring sketch below).
  • Enhanced the generalization ability of instruction-tuned LLMs across multiple datasets.
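
For reference, event-argument precision, recall, and F1 are typically computed by matching predicted (event type, role, span) tuples against gold tuples; the sketch below assumes exact span matching, whereas published EE benchmarks often apply stricter or head-word-based criteria:

```python
# Micro-averaged precision / recall / F1 over predicted vs. gold
# (event_type, role, argument_span) tuples. Exact-match scoring only;
# treat as a sketch, not the official benchmark scorer.
def prf1(pred: list[tuple], gold: list[tuple]) -> tuple[float, float, float]:
    pred_set, gold_set = set(pred), set(gold)
    tp = len(pred_set & gold_set)
    precision = tp / len(pred_set) if pred_set else 0.0
    recall = tp / len(gold_set) if gold_set else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Example: one of two predicted arguments matches the gold annotation.
p, r, f = prf1(
    pred=[("Conflict.Attack", "Attacker", "Rebels"),
          ("Conflict.Attack", "Target", "city")],
    gold=[("Conflict.Attack", "Attacker", "Rebels"),
          ("Conflict.Attack", "Target", "the convoy")],
)
print(f"P={p:.2f} R={r:.2f} F1={f:.2f}")  # -> P=0.50 R=0.50 F1=0.50
```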

🚀 Future Work & Applications

  • Expanding dataset coverage to include multi-domain event extraction tasks
  • Exploring multimodal event extraction by integrating text, images, and video content


For collaboration or inquiries, feel free to reach out via LinkedIn or Email.