Komal Vardhan.
Build like an engineer. Teach like a friend.

© 2026 Komal Vardhan Lolugu


Designed & built by Komal. Made in India.
2023 · ML · Internal · Hexaware

Employee Referral System

AI-driven tool that matches candidates to open roles based on parsed résumés and skill graphs. Replaced a manual screening workflow for a 10k-employee org.

5d → 2h · Time-to-first-feedback cut
10k · Employee org (scale of deployment)
0 · Manual first-pass reviews after launch
↑ · Referral quality improved via feedback loop
§ 01

The Problem

The referral pipeline at Hexaware was entirely manual: HR reviewed hundreds of PDFs per week to decide whether a referred candidate fit a job description (JD). The process was slow (5 days on average to first feedback) and inconsistent, and it gave referrers a poor experience, which lowered the quality of future referrals.

§ 02

The Solution

I built an ML-driven matching engine that parses uploaded résumés (PDF/DOCX), extracts skill and experience vectors, and scores them against structured JDs using cosine similarity followed by a gradient-boosted ranking model. Recruiters get a ranked shortlist with per-candidate fit explanations. Referrers get a match score immediately on upload. The engine replaced the manual first pass entirely, cutting time-to-first-feedback from 5 days to under 2 hours.

§ 02b

How it works

01
Résumé upload

An employee uploads a referred candidate's résumé as PDF or DOCX. spaCy parses it, extracting skills, experience, and education as structured entities.
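A minimal sketch of the skill-extraction part of this step, assuming the résumé has already been converted from PDF/DOCX to plain text. The skill list, the function names, and the use of spaCy's `PhraseMatcher` on a blank English pipeline are illustrative assumptions; the source does not say which spaCy components were used.

```python
import spacy
from spacy.matcher import PhraseMatcher

# Hypothetical skill vocabulary; a real system would load a much larger list.
SKILLS = ["python", "machine learning", "distributed systems", "postgresql"]


def build_skill_matcher(nlp, skills):
    """Return a PhraseMatcher that finds the given skills case-insensitively."""
    matcher = PhraseMatcher(nlp.vocab, attr="LOWER")
    matcher.add("SKILL", [nlp.make_doc(skill) for skill in skills])
    return matcher


def extract_skills(text, nlp, matcher):
    """Return the sorted, de-duplicated skills mentioned in the text."""
    doc = nlp(text)
    return sorted({doc[start:end].text.lower() for _, start, end in matcher(doc)})


nlp = spacy.blank("en")  # tokenizer only, so no model download is needed
matcher = build_skill_matcher(nlp, SKILLS)

# Assumed input: text already extracted from the uploaded file.
resume_text = "Built Machine Learning pipelines in Python on PostgreSQL."
skills = extract_skills(resume_text, nlp, matcher)
print(skills)
```

A dictionary-based matcher like this only finds skills it already knows about; experience and education extraction would need additional rules or a trained entity recogniser.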

02
Feature vectors

Skills and experience are embedded as vectors. JDs are pre-vectorised and stored. Cosine similarity gives a first-pass score.
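To make the first-pass score concrete, here is a small sketch of cosine similarity between a candidate and a JD. It assumes a binary bag-of-skills vector over a fixed vocabulary, which is a simplification chosen for illustration; the source does not specify how skills and experience were actually embedded, and `VOCAB` and `skill_vector` are hypothetical names.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def skill_vector(skills, vocabulary):
    """Binary vector: 1.0 where the vocabulary term is in the skill set."""
    return [1.0 if term in skills else 0.0 for term in vocabulary]


# Hypothetical vocabulary shared by résumés and JDs.
VOCAB = ["python", "machine learning", "distributed systems", "postgresql"]

# The JD vector would be computed once and stored; the candidate vector on upload.
jd_vec = skill_vector({"python", "machine learning", "distributed systems"}, VOCAB)
cand_vec = skill_vector({"python", "machine learning", "postgresql"}, VOCAB)

score = cosine_similarity(cand_vec, jd_vec)
print(round(score, 3))
```

Here the candidate shares two of the three required skills, so the score is 2/3.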

03
XGBoost ranking

A gradient-boosted model re-ranks candidates using recruiter preference signals learned from historical accept/reject data.

04
Explained shortlist

The recruiter dashboard shows ranked candidates with per-skill fit explanations. Recruiter accept/reject decisions feed back into the model.
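A per-skill fit explanation can be as simple as a set comparison between the candidate's extracted skills and the JD's required skills. The sketch below assumes both are available as plain lists of strings; `explain_fit` and `format_explanation` are hypothetical helpers, not the dashboard's actual code.

```python
def explain_fit(candidate_skills, jd_skills):
    """Split the JD's required skills into those the candidate has and those missing."""
    have = {s.lower() for s in candidate_skills}
    required = {s.lower() for s in jd_skills}
    return {
        "has": sorted(have & required),
        "missing": sorted(required - have),
    }


def format_explanation(explanation):
    """Render the explanation as a short line for a shortlist row."""
    missing = ", ".join(explanation["missing"]) or "none"
    has = ", ".join(explanation["has"]) or "none"
    return f"missing: {missing}; has: {has}"


explanation = explain_fit(
    candidate_skills=["Python", "ML"],
    jd_skills=["python", "ml", "distributed systems"],
)
print(format_explanation(explanation))
```

Showing the named gaps next to the score gives the recruiter something to act on, which a bare number does not.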

§ 03

What I Learnt

  • 01

    Résumé parsing quality is the highest-leverage problem: garbage in means garbage match scores, regardless of model sophistication.

  • 02

    Explainability matters as much as accuracy for recruiter adoption: 'this candidate scored 0.82' means nothing; 'missing: distributed systems, has: Python, ML' gets buy-in.

  • 03

    Building an internal feedback loop (recruiter accept/reject signals) let the model improve over time without a separate labelling workflow.

  • 04

    React dashboards for non-technical users need to ruthlessly hide complexity; early versions had too many knobs.
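The feedback loop from point 03 can be sketched as a log of recruiter decisions that doubles as labelled training data. This is an in-memory illustration under assumed names (`FeedbackEvent`, `FeedbackLog`); the source stores candidate and match history in PostgreSQL and does not describe the schema or the retraining schedule.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class FeedbackEvent:
    """One recruiter decision on a shortlisted candidate."""
    candidate_id: str
    features: List[float]  # the features the ranking model saw for this candidate
    accepted: bool


class FeedbackLog:
    """Collects accept/reject decisions and exposes them as training data."""

    def __init__(self) -> None:
        self._events: List[FeedbackEvent] = []

    def record(self, candidate_id: str, features: List[float], accepted: bool) -> None:
        self._events.append(FeedbackEvent(candidate_id, list(features), accepted))

    def training_data(self) -> Tuple[List[List[float]], List[int]]:
        """Return (X, y), where y is 1 for accepted and 0 for rejected."""
        X = [event.features for event in self._events]
        y = [1 if event.accepted else 0 for event in self._events]
        return X, y


log = FeedbackLog()
log.record("cand_a", [0.82, 6.0], accepted=True)
log.record("cand_b", [0.41, 2.0], accepted=False)

X, y = log.training_data()
print(X, y)
```

Because every accept or reject click is already a label, the next training run can reuse `X` and `y` directly instead of waiting on a separate labelling pass.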

§ 04

Technologies Used

Python

ML pipeline: parsing, feature extraction, ranking model

scikit-learn / XGBoost

Gradient-boosted ranking model

spaCy

NLP résumé parsing and skill entity extraction

React

Recruiter dashboard and referral portal

PostgreSQL

Candidate and match history storage
