Hi, I'm Emad.
I build intelligent systems.
I design and build end-to-end ML systems — from large-scale annotation pipelines and entity resolution to LLM-powered summarization and agentic AI products for compliance and due diligence workflows. Seven years of industry experience, primarily in regulated enterprise environments.
Experience
Senior Machine Learning Engineer II
Thomson Reuters
Toronto, Canada
Details & Impact
Senior Research Engineer at TR Labs, Thomson Reuters' applied research division, focused on CLEAR — the company's flagship due diligence platform for KYC/AML screening and investigations. Built production ML systems across the full lifecycle — including a commercially launched agentic investigation product, LLM-powered summarization features, and large-scale data infrastructure — in a regulated environment where model accuracy has direct compliance consequences.
NLP & Machine Learning Engineer
INAGO INC.
Toronto, Canada
Details & Impact
Built NLP models for language understanding and automated text generation, including fine-tuning BERT and T5 for domain-specific tasks. Managed the full training lifecycle and led collaborative research projects with university partners.
Education
M.Sc. in Computer Science
York University
Toronto, Canada
GPA: 8.17 / 9
Thesis: Interactive Question Answering Using Frame-based Knowledge Representation
B.Sc. in Computer Engineering
Amirkabir University of Technology
Tehran, Iran
GPA: 17.18 / 20
Technical Skills
Projects
Agentic Investigation System — CLEAR Investigate
2025 - 2026
Contributed to building CLEAR Investigate, Thomson Reuters' first agentic AI product, live in production for customers. Designed the agent experimentation architecture using PydanticAI for rapid prototyping of multi-step workflows across entity search, report retrieval, web search, and an internal knowledge graph. Implemented LLM caching (~25% cost reduction), automated LLM-as-judge evaluation pipelines, and SME annotation workflows for ground-truth benchmarking.
CLEAR Business AI — GenAI Report Summarization
2024
Built the AI-powered summary panel for CLEAR Business entity reports, live in production processing hundreds of reports daily. Designed a selective XML parsing engine and a multi-prompt LLM architecture with separate calls for business overview, risk analysis, and social media discovery, supported by Bing Search integration and a verification LLM call to filter false positives. Implemented citation linking to source locations and ran SME annotation and evaluation rounds before launch.
Entity Resolution Data Infrastructure — CLEAR KYC/AML Platform
2022 - 2024
Designed and built the data engineering infrastructure for an entity resolution system operating across ~800M entities and billions of documents at 700 idents/second. Architected a PII-isolated AWS environment and a unified versioned schema reconciling two incompatible annotation sources spanning hundreds of thousands of labeled records. Built Spark and Python pipelines for multi-format merging, conflict signal surfacing, and model experimentation, later automated into SageMaker Pipelines.
Semantic Search Improvement — Checkpoint Tax Research
2021
Shipped a semantic search improvement for Checkpoint, Thomson Reuters' tax research platform, improving access to conceptual documents by 95%. Built a query intent classifier using sentence embeddings to dynamically promote higher-level content for general queries, delivered to production within a ~100ms latency budget.
Automated Question Generation from Documents
2020
Fine-tuned a T5 transformer for automated question generation, cutting manual data curation effort by 40%. Experimented with model input representations and evaluated output quality using BLEURT as part of a collaborative university research project.
Domain-Specific Language Understanding Engine
2019
Trained domain-specific Word2Vec embeddings and LSTM-based NLU models with interpretability testing to improve language understanding accuracy and model transparency.
Conversational Question Answering
2018
Built a domain-specific question answering dialogue system using syntactic and semantic document analysis and ontology generation, as part of a collaborative industry research project.
Publications
Question-worthy Sentence Selection for Question Generation
Interactive Question Answering Using Frame-based Knowledge Representation
Time Aware Topic-based Recommender System
A Study on Prediction of User's Tendency Toward Purchases in Websites based on Behavior Models