About Me

I architect, train, and deploy multimodal AI systems spanning 2D/3D Generative AI, LLMs, and Agentic AI. As an AI Scientist & Engineer with a Ph.D. in Computer Science from the University of Western Australia, I bridge research and production, from fine-tuning the Qwen 3.5 model family on geological domain data to building real-time video analytics pipelines and environmental monitoring platforms.

  • Generative AI & Computer Vision: Diffusion models, 3D Gaussian Splatting, object detection, video analytics; 130% throughput gains, 5.6× model compression, 82% GPU memory reduction via custom CUDA kernels
  • LLMs & Agentic AI: LLM fine-tuning (LoRA), RAG pipelines, Agentic AI systems, document data extraction from 1000+ geological PDFs
  • Scalable ML: Production pipelines serving 50M+ users, synthetic data generation reducing labelling costs by 60–75%
  • Research: 10+ peer-reviewed papers in ICRA, IEEE TMM, IEEE Access, and Int. J. Remote Sensing; industry experience at Dolby Laboratories and Novarc Technologies

Selected Research & Projects

Submitted

Text-to-Skeleton Cascades for Controllable Complex Human Motion Video Generation

Ashkan Taghipour, Morteza Ghahremani, Zinuo Li, Hamid Laga, Farid Boussaid, Mohammed Bennamoun

Project Page

ICRA 2026

SVR-GS: Spatially Variant Regularization for Probabilistic Masks in 3D Gaussian Splatting

Ashkan Taghipour, Vahid Naghshin, Benjamin Southwell, Farid Boussaid, Hamid Laga, Mohammed Bennamoun

This work was conducted during my research internship at Dolby.

Project Page | Code | Short Video

IEEE TMM

Box It to Bind It: Unified Layout Control and Attribute Binding in T2I Diffusion Model

Ashkan Taghipour, Morteza Ghahremani, Mohammed Bennamoun, Aref Miri Rekavandi, Hamid Laga, Farid Boussaid

Code | Short Video

IEEE Access

Faster Image2Video Generation: A Closer Look at CLIP Image Embedding's Impact on Spatio-Temporal Cross-Attentions

Ashkan Taghipour, Morteza Ghahremani, Aref Miri Rekavandi, Zinuo Li, Mohammed Bennamoun, Hamid Laga, Farid Boussaid

Short Video

LLM Project

GeoLLM: Domain-Specific LLM Fine-Tuning

End-to-end pipeline extracting structured QA datasets from 1000+ geological PDFs using OCR, then fine-tuning Qwen 3.5 models (0.8B–27B) with LoRA for domain-specific reasoning.

Code | Live Demo | Dataset | Models

Data Science

MineWatchAI: Mining Rehabilitation Monitoring

End-to-end application for monitoring vegetation rehabilitation at WA mining sites using Sentinel-2 imagery, vegetation indices (NDVI, SAVI, EVI), and automated compliance reporting.

Live Demo
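
As a rough illustration of the vegetation indices named in the MineWatchAI entry (not the application's actual code), the standard NDVI, SAVI, and EVI formulas can be sketched as below. The function names, the default soil factor of 0.5, and the sample reflectance values are assumptions for illustration. Inputs are assumed to be surface reflectance already scaled to the 0–1 range; for Sentinel-2 the near-infrared, red, and blue bands are B08, B04, and B02.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    return (nir - red) / (nir + red)


def savi(nir, red, soil_factor=0.5):
    """Soil-Adjusted Vegetation Index; soil_factor is the usual L term (0.5 assumed)."""
    return (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor)


def evi(nir, red, blue):
    """Enhanced Vegetation Index with the standard coefficients G=2.5, C1=6, C2=7.5, L=1."""
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)


if __name__ == "__main__":
    # Hypothetical reflectance values for a single vegetated pixel.
    nir, red, blue = 0.45, 0.08, 0.04
    print(f"NDVI={ndvi(nir, red):.3f}  SAVI={savi(nir, red):.3f}  EVI={evi(nir, red, blue):.3f}")
```

The functions use only arithmetic, so they should also work element-wise on NumPy arrays holding whole band rasters, provided pixels where the denominator is zero are masked beforehand.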

Fintech

WealthPathAU: Investment Portfolio Simulator

ASX investment simulator with Monte Carlo projections, historical backtesting, and risk-based portfolio allocation serving Australian retail investors.

Live Demo
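
To give a sense of what a Monte Carlo projection involves, here is a minimal sketch using only the Python standard library. It is not the WealthPathAU implementation: the function names, the assumption of independent normally distributed annual returns, and every parameter value are hypothetical choices made for illustration.

```python
import random
import statistics


def simulate_final_values(initial, annual_contribution, years,
                          mean_return, volatility, n_paths=10_000, seed=0):
    """Simulate end-of-horizon portfolio values over n_paths random return paths."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    finals = []
    for _ in range(n_paths):
        value = initial
        for _ in range(years):
            # Assumed model: each year's return is an independent normal draw.
            annual_return = rng.gauss(mean_return, volatility)
            # Floor at zero so an extreme draw cannot make the balance negative,
            # then add the contribution at the end of the year.
            value = max(value * (1.0 + annual_return), 0.0) + annual_contribution
        finals.append(value)
    return finals


def summarise(finals):
    """Return the 5th percentile, median, and 95th percentile of the outcomes."""
    cuts = statistics.quantiles(finals, n=20)  # 19 cut points at 5% steps
    return {"p5": cuts[0], "median": cuts[9], "p95": cuts[18]}


if __name__ == "__main__":
    outcomes = simulate_final_values(
        initial=10_000.0, annual_contribution=5_000.0, years=20,
        mean_return=0.07, volatility=0.15,
    )
    print(summarise(outcomes))
```

Reporting a percentile band rather than a single expected value is the main point of the approach: it shows the spread of plausible outcomes for a given risk level.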


Experience

AI/ML Engineer at Novarc Technologies

Apr 2024 – Sep 2025 (Part-Time, Remote) | Vancouver, Canada

  • Built deep learning pipelines for real-time video analytics, improving throughput by 130% (15 to 35+ FPS)
  • Designed synthetic data generation pipelines, reducing labelling costs by 60–75%
  • Applied object detection and segmentation with 4–8% accuracy gains through edge-case analysis
  • Designed multimodal conditioning frameworks for complex video understanding tasks

Research Intern, Advanced Technology Group at Dolby Laboratories

May 2025 – Sep 2025 | Sydney, Australia

  • Developed a 3D scene reconstruction pipeline using Gaussian Splatting (SVR-GS, accepted at ICRA 2026)
  • Achieved 5.6× model compression and 82% GPU memory reduction via custom CUDA kernels
  • Optimised models for real-time inference on consumer hardware

Computer Vision Researcher (Ph.D.) at University of Western Australia

Apr 2023 – Mar 2026 | Perth, Australia

  • Published 10+ peer-reviewed papers in ICRA, IEEE TMM, IEEE Access, and Int. J. Remote Sensing
  • Trained multi-billion parameter video models on 10TB+ datasets using distributed multi-GPU computing
  • Built annotation, evaluation, and data quality pipelines for deep learning at scale

Innovation Center Manager, ML & Data Science at MTN Group

Apr 2021 – Apr 2023 | Tehran, Iran

  • Built predictive analytics models using PySpark and Databricks, serving 50+ million users
  • Designed dashboards and reports for 15+ stakeholders, reducing report generation time by 40%
  • Led end-to-end ML projects from concept through model training and handoff to engineering teams