Overview

Developed a transformer-based AI text detection model to distinguish between human and AI-generated academic writing, specifically engineered to address false-positive bias in academic settings where students' legitimate work was being wrongly flagged as AI-generated. Using RoBERTa with LoRA (Low-Rank Adaptation), achieved 99.6% accuracy while training only 0.82% of the parameters required by full fine-tuning. Implemented a two-stage training methodology: initially trained on ~1,378 essays from Kaggle, then evaluated and fine-tuned on academic abstracts from the RAID dataset, including adversarial examples designed to evade detection. Minimized unfair misclassification of real student work by applying aggressive 10:1 class weighting and optimizing for fairness-focused metrics such as human accuracy and balanced accuracy, reducing false positives on human academic writing from 83.2% to 0.7% while maintaining 99.4% AI detection accuracy.
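
For reference, the sketch below illustrates the kind of LoRA configuration and 10:1 class-weighted loss described above. Only the RoBERTa backbone and the 10:1 weighting come from the project itself; the rank, alpha, target modules, optimizer-free training loop, and label ordering are illustrative assumptions rather than the exact settings used.

```python
# Minimal sketch of a RoBERTa + LoRA classifier with a 10:1 class-weighted loss.
# LoRA rank/alpha, target modules, and the label order are assumptions.
import torch
from torch import nn
from transformers import RobertaForSequenceClassification, RobertaTokenizerFast
from peft import LoraConfig, get_peft_model, TaskType

model = RobertaForSequenceClassification.from_pretrained("roberta-base", num_labels=2)
tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")

# Wrap the backbone with LoRA adapters so only a small fraction of parameters
# (adapter matrices plus the classifier head) are trainable.
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    r=8,                                # assumed rank
    lora_alpha=16,                      # assumed scaling
    lora_dropout=0.1,
    target_modules=["query", "value"],  # assumed attention projections
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # reports the small trainable fraction

# 10:1 class weighting: misclassifying human text (assumed label 0) costs
# ten times more than misclassifying AI text (assumed label 1).
class_weights = torch.tensor([10.0, 1.0])
loss_fn = nn.CrossEntropyLoss(weight=class_weights)

def weighted_loss(batch):
    """Compute the class-weighted loss for one tokenized batch."""
    outputs = model(input_ids=batch["input_ids"], attention_mask=batch["attention_mask"])
    return loss_fn(outputs.logits, batch["labels"])
```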


Next Steps

Building on our initial detection framework, I plan to continue this research independently, further exploring how reliably LLM-generated content can be detected and contributing to the broader discussion on ethical AI use in education.

Enhanced Detection Methods and Benchmarking

  • Conduct a thorough analysis of existing detection models and methods, expanding beyond our initial three-model comparison to include state-of-the-art approaches and commercial tools
  • Implement the stylometric features identified as future work, including perplexity and burstiness analysis, vocabulary richness metrics, and sentence complexity measurements, to create more sophisticated hybrid detection models (a feature-extraction sketch follows this list)
  • Explore integration of traditional linguistic features with our transformer-based approach to improve robustness and generalizability
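
As a starting point, the sketch below computes a few of the stylometric features named above: perplexity under a small causal language model, burstiness as sentence-length variation, type-token ratio for vocabulary richness, and average sentence length as a simple complexity proxy. The specific definitions and the use of GPT-2 for perplexity are assumptions for illustration, not a fixed design.

```python
# Candidate stylometric features; definitions and the GPT-2 perplexity model
# are illustrative assumptions.
import math
import re
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

lm = GPT2LMHeadModel.from_pretrained("gpt2")
lm_tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
lm.eval()

def perplexity(text: str) -> float:
    """Perplexity of the text under a small causal LM."""
    ids = lm_tokenizer(text, return_tensors="pt", truncation=True).input_ids
    with torch.no_grad():
        loss = lm(ids, labels=ids).loss
    return math.exp(loss.item())

def sentence_lengths(text: str) -> list[int]:
    """Word counts per sentence, using a simple punctuation-based split."""
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s.strip()]
    return [len(s.split()) for s in sentences]

def burstiness(text: str) -> float:
    """Coefficient of variation of sentence length; human writing tends to vary more."""
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0
    mean = sum(lengths) / len(lengths)
    var = sum((x - mean) ** 2 for x in lengths) / (len(lengths) - 1)
    return (var ** 0.5) / mean

def type_token_ratio(text: str) -> float:
    """Vocabulary richness: unique words divided by total words."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    return len(set(words)) / len(words) if words else 0.0

def stylometric_features(text: str) -> dict:
    """Bundle the features into one vector-ready dictionary."""
    lengths = sentence_lengths(text)
    return {
        "perplexity": perplexity(text),
        "burstiness": burstiness(text),
        "type_token_ratio": type_token_ratio(text),
        "avg_sentence_length": sum(lengths) / len(lengths) if lengths else 0.0,
    }
```

These features could then be concatenated with the transformer's pooled representation or fed to a separate lightweight classifier to form the hybrid model mentioned above.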

Comprehensive RAID Dataset Evaluation

  • Expand evaluation to the complete RAID test dataset beyond our focused subset of academic abstracts, testing across all domains including news articles, creative writing, technical documentation, and social media content
  • Submit results to the RAID leaderboard to benchmark our approach against other state-of-the-art detection methods and establish comparative performance metrics
  • Analyze performance variations across different text domains and LLM source models to understand generalization capabilities (a per-domain breakdown sketch follows this list)
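
A per-domain breakdown might look like the sketch below, which assumes the RAID test predictions have been collected in a dataframe with domain, label, and pred columns; those column names and the CSV file name are hypothetical placeholders, not the dataset's actual schema.

```python
# Hypothetical per-domain breakdown of detector performance on RAID test items.
# Column names ("domain", "label", "pred") and the file name are assumptions.
import pandas as pd

def per_domain_report(df: pd.DataFrame) -> pd.DataFrame:
    """Accuracy and human false-positive rate per domain."""
    rows = []
    for domain, group in df.groupby("domain"):
        human = group[group["label"] == "human"]
        rows.append({
            "domain": domain,
            "accuracy": (group["pred"] == group["label"]).mean(),
            # fraction of human-written texts wrongly flagged as AI
            "human_fpr": (human["pred"] == "ai").mean() if len(human) else float("nan"),
            "n": len(group),
        })
    return pd.DataFrame(rows).sort_values("human_fpr", ascending=False)

# Usage (hypothetical file name):
# df = pd.read_csv("raid_test_predictions.csv")
# print(per_domain_report(df))
```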

Robustness and Real-World Application

  • Evaluate detection performance against newer and evolving LLM architectures to ensure sustained effectiveness as AI text generation continues to advance
  • Investigate adversarial robustness by testing against AI-generated content specifically designed to evade detection systems
  • Develop practical implementation tools, including web interfaces or API endpoints that educators could integrate into existing academic workflows (a minimal endpoint sketch follows this list)
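
One possible shape for such a tool is the minimal FastAPI endpoint sketched below. The framework choice, checkpoint path, and route name are assumptions rather than a committed design; the endpoint simply wraps the trained classifier behind a single POST route.

```python
# Minimal sketch of an API endpoint educators could call.
# FastAPI, the checkpoint path, and the route name are assumptions.
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import pipeline

app = FastAPI(title="AI-text detection (prototype)")

# Hypothetical path to the fine-tuned RoBERTa+LoRA checkpoint (merged weights).
classifier = pipeline("text-classification", model="./detector-checkpoint")

class Submission(BaseModel):
    text: str

@app.post("/detect")
def detect(submission: Submission) -> dict:
    """Return the predicted label and its confidence for one submitted text."""
    result = classifier(submission.text, truncation=True)[0]
    return {"label": result["label"], "confidence": round(result["score"], 4)}

# Run locally with: uvicorn detector_api:app --reload
```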

Ethical AI and Academic Integrity

  • Conduct comprehensive fairness audits to ensure equitable performance across diverse student populations and writing styles, addressing potential biases in detection accuracy
  • Implement explainable AI features to help educators understand detection decisions and support more informed academic integrity assessments
  • Explore confidence scoring mechanisms that provide nuanced predictions rather than binary classifications, better serving real-world educational decision-making (a confidence-banding sketch follows this list)
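
A first cut at such confidence scoring could look like the sketch below, which maps the model's softmax probability onto coarse bands instead of a hard label. The thresholds, band wording, checkpoint path, and label index are illustrative assumptions.

```python
# Confidence-scored output instead of a hard binary decision.
# Thresholds, band wording, checkpoint path, and label index are assumptions.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Hypothetical path to the fine-tuned detector checkpoint.
model = AutoModelForSequenceClassification.from_pretrained("./detector-checkpoint")
tokenizer = AutoTokenizer.from_pretrained("./detector-checkpoint")
model.eval()

def score(text: str) -> dict:
    """Return the AI probability plus a coarse, human-readable confidence band."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        probs = torch.softmax(model(**inputs).logits, dim=-1)[0]
    p_ai = probs[1].item()  # assumes label index 1 = AI-generated
    if p_ai >= 0.9:
        band = "likely AI-generated"
    elif p_ai <= 0.1:
        band = "likely human-written"
    else:
        band = "inconclusive; human review recommended"
    return {"p_ai": round(p_ai, 3), "assessment": band}
```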