Hi, I'm Kunal Deokar

Technical Data Analyst at Addepar, Pune — building data pipelines, analytics tools, and AI-assisted workflows for financial data. 4+ years across financial services, collaborating daily with Data Solution Consultants (DSCs), clients, and cross-functional teams to deliver production-ready data solutions.

About Me

A bit about who I am and what drives my work

I'm a data analyst who sits at the intersection of financial data operations and analytics engineering. My day-to-day involves handling complex client data migrations, building Python tools that automate what used to be manual, and making messy real-world data usable for wealth management platforms. I work closely with Data Solution Consultants (DSCs), clients, and cross-functional teams — translating business requirements into reliable data pipelines and bridging the gap between operations and delivery.

I believe the most valuable skill isn't knowing every function by heart — it's knowing what to build and why, then figuring out the how. I use AI, code assistants, and the internet as tools, the same way engineers use IDEs and documentation.

Outside work: head of the music team in my college cultural group, Rangbhumi; occasional interviewer at Addepar; and always up for a conversation about cricket or financial markets.

4+ years in financial data
500+ client accounts migrated
5,000+ batch job logs parsed automatically
40% reduction in triage effort at FIS
Skills

Tools and technologies I work with regularly

Languages
Python SQL HTML/CSS
Libraries & Frameworks
Pandas NumPy scikit-learn Matplotlib Streamlit Regex
Platforms & Tools
Databricks Unity Catalog Git VS Code Jupyter GitHub Jira
Databases
MySQL PostgreSQL Delta Lake
AI & ML
LLM Integration NLP TF-IDF Logistic Regression Model Serving
Domain
Financial Data Migration ETL Pipelines Data Quality Batch Processing SLA Monitoring
Experience

Where I've worked and what I've built

Technical Data Analyst
Addepar, Pune
July 2025 – Present
  • Designed and deployed the HDC Security and Account Agent — an AI-powered Databricks App (Streamlit + Claude API via Databricks Model Serving) running on US and EU production; built a multi-step waterfall matching engine (ISIN, CUSIP, SEDOL, fuzzy-name), an ML-based model-type inference engine (~85–90% auto-suggestion rate), and a natural language chat interface with 8 tool definitions — cutting hours of manual cross-referencing per client onboarding (a simplified sketch of the matching step appears after these highlights)
  • Built a UBS Data Engineering Framework from scratch for KI Capitals including a Data Quality and Cutover Readiness module and an Optimization Logic Engine that picks transformation paths based on legacy performance signatures — reduced effort for all future UBS migrations by 40–60%
  • Led end-to-end data migration for Fortis Advisors (366 high-complexity accounts, Black Diamond legacy) in Databricks; refactored core transformation pipeline with modular logic and upstream quality checks — delivered production-ready imports in ~4 weeks vs. 5–8 week team benchmark
  • Built a config-driven Batch Processing and Precision Mapping Framework in Python (Pandas, Regex, NumPy, OpenPyXL) for JCF Montreal — processed 132 structurally inconsistent Excel tabs with formula-trapped values; zero errors in production sample load across 56 accounts
  • Handled end-to-end historical data conversion for Clarity Partners SA (69 accounts, EU production): ingested 70 source files, mapped 621 ISINs, reverse-engineered 6 net value calculation patterns, and built a 972-file pricing pipeline delivering 195,329 prices with full duplicate resolution audit trail
  • Mentored 2–3 incoming Technical Data Analysts through structured onboarding and domain ramp-up; conducted 4–5 candidate interviews for the Lead role
  • Collaborated daily with Data Solution Consultants (DSCs) and clients to clarify data requirements, resolve mapping ambiguities, and align on delivery timelines across active migration projects
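
For a flavour of the waterfall matching approach in the first highlight above, here is a minimal sketch: exact identifier joins in priority order (ISIN, then CUSIP, then SEDOL), with a fuzzy name fallback for whatever is left. The column names and the difflib-based fuzzy step are illustrative assumptions, not the production engine.

    import difflib
    import pandas as pd

    def waterfall_match(source: pd.DataFrame, reference: pd.DataFrame,
                        fuzzy_cutoff: float = 0.9) -> pd.DataFrame:
        """Resolve each source row to a reference security_id, recording which step matched."""
        source = source.copy()
        source["security_id"] = pd.NA
        source["match_method"] = pd.NA

        # Steps 1-3: exact identifier joins, in priority order.
        for key in ["isin", "cusip", "sedol"]:
            lookup = (reference.dropna(subset=[key])
                               .drop_duplicates(subset=[key])
                               .set_index(key)["security_id"])
            open_rows = source["security_id"].isna() & source[key].notna()
            source.loc[open_rows, "security_id"] = source.loc[open_rows, key].map(lookup)
            source.loc[open_rows & source["security_id"].notna(), "match_method"] = key

        # Step 4: fuzzy name match for anything still unresolved.
        by_name = reference.drop_duplicates(subset=["name"]).set_index("name")["security_id"]
        for idx in source.index[source["security_id"].isna()]:
            hit = difflib.get_close_matches(source.at[idx, "name"], by_name.index.tolist(),
                                            n=1, cutoff=fuzzy_cutoff)
            if hit:
                source.at[idx, "security_id"] = by_name[hit[0]]
                source.at[idx, "match_method"] = "fuzzy_name"
        return source
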
Software Engineer I
FIS (Fidelity Information Services), Pune
June 2022 – June 2025
  • Automated parsing of 5,000+ batch job execution logs using Python (Regex + Pandas) to detect SLA breaches, classify error codes (RC/ABEND/JCL), and score job instability — increasing triage efficiency by 40%
  • Built a Streamlit dashboard for job failure pattern analysis and SLA trend monitoring; integrated SQL outputs with Excel reports for root cause tracking used directly by leadership — reduced weekly reporting effort by 60%
  • Developed internal data quality tools for financial batch processes covering portfolio records, cash flow statements, and position data; delivered automated weekly reports reducing manual operations checks by 60%+
  • Collaborated daily with Client Service Representatives (CSRs), Business Analysts (BAs), and end clients to gather requirements, resolve data discrepancies, and coordinate production deployments — acting as the technical liaison between operations and business teams
  • Actively participated in corporate social responsibility (CSR) initiatives throughout tenure, representing the organisation in community engagement programmes
Lead Intern — Technology Platform Development
EvolvingX, Pune
Sep 2020 – Dec 2021
  • Led a team of 3 frontend developers, managing delivery timelines, code quality, and architectural decisions for a product web application
  • Built a reusable React.js component library; contributed to database schema design and ran user research activities including persona creation and journey mapping
Projects

Personal projects built end-to-end — data, ML, dashboards, and deployed apps

Reconciliation tool for comparing two financial datasets: configurable join keys, per-field tolerance thresholds (absolute, %, basis points), break severity ranking, and a downloadable 8-sheet Excel report. Three pre-built use cases: positions, prices, transactions.

Python Pandas Streamlit scikit-learn
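
A minimal sketch of the tolerance logic, assuming illustrative field names and thresholds; the deployed app drives these from its configuration and layers break severity ranking and the Excel report on top.

    import pandas as pd

    # Illustrative tolerance config: one spec per compared field (types: absolute, percent, bps).
    TOLERANCES = {
        "quantity":     {"type": "absolute", "value": 0.01},
        "market_value": {"type": "percent",  "value": 0.5},   # 0.5 percent
        "price":        {"type": "bps",      "value": 5},     # 5 basis points
    }

    def find_breaks(left: pd.DataFrame, right: pd.DataFrame, key: str = "account_id") -> pd.DataFrame:
        """Join the two datasets on the key and flag any field difference outside its tolerance."""
        merged = left.merge(right, on=key, suffixes=("_a", "_b"))
        breaks = []
        for field, tol in TOLERANCES.items():
            diff = (merged[f"{field}_a"] - merged[f"{field}_b"]).abs()
            if tol["type"] == "absolute":
                limit = tol["value"]
            elif tol["type"] == "percent":
                limit = merged[f"{field}_a"].abs() * tol["value"] / 100
            else:  # basis points
                limit = merged[f"{field}_a"].abs() * tol["value"] / 10_000
            bad = merged.loc[diff > limit, [key]].copy()
            bad["field"] = field
            bad["difference"] = diff[diff > limit]
            breaks.append(bad)
        return pd.concat(breaks, ignore_index=True)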

Data pipeline for three financial datasets with configurable quality checks (nulls, range, schema, cross-field, duplicates), transformation with derived metrics, and SQLite storage. Dashboard tracks run history, pass rates per dataset, and lets you explore the processed data.

Python Pandas Streamlit SQLite
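
A minimal sketch of the config-driven check runner, with example rule names and thresholds; the actual pipeline covers more check types (schema, cross-field) and persists results to SQLite for the run-history dashboard.

    import pandas as pd

    # Example rules only; each rule is plain data, so new checks can be added without code changes.
    RULES = [
        {"check": "not_null",      "column": "account_id"},
        {"check": "range",         "column": "price", "min": 0, "max": 1e7},
        {"check": "no_duplicates", "columns": ["account_id", "trade_date", "security_id"]},
    ]

    def run_checks(df: pd.DataFrame) -> list[dict]:
        """Run each configured rule and report failure counts, ready to persist per run."""
        results = []
        for rule in RULES:
            if rule["check"] == "not_null":
                failures = int(df[rule["column"]].isna().sum())
            elif rule["check"] == "range":
                col = df[rule["column"]]
                failures = int(((col < rule["min"]) | (col > rule["max"])).sum())
            else:  # no_duplicates
                failures = int(df.duplicated(subset=rule["columns"]).sum())
            results.append({"rule": rule["check"], "failures": failures, "passed": failures == 0})
        return results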

Data quality tool for financial client onboarding files. Upload any Excel file and get 12 automated checks covering schema, identifiers, formatting, and business rules, with a readiness score and downloadable report.

Python Pandas Streamlit
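
One identifier rule, sketched below as an example of the kind of check the tool runs: a standard structural and Luhn-checksum validation for ISINs. The other checks and the readiness scoring are not reproduced here.

    import re

    def isin_is_valid(isin: str) -> bool:
        """Validate ISIN structure (2 letters, 9 alphanumerics, 1 check digit) and its Luhn checksum."""
        if not re.fullmatch(r"[A-Z]{2}[A-Z0-9]{9}\d", isin or ""):
            return False
        digits = "".join(str(int(ch, 36)) for ch in isin)  # expand letters: A -> 10 ... Z -> 35
        total = 0
        for pos, ch in enumerate(reversed(digits)):
            d = int(ch)
            if pos % 2 == 1:                    # double every second digit from the right
                d = d * 2 - 9 if d * 2 > 9 else d * 2
            total += d
        return total % 10 == 0

    print(isin_is_valid("US0378331005"))  # True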

NLP pipeline for Hinglish (code-mixed Hindi-English) text. Handles token-level language detection, transliteration normalisation, TF-IDF features, and Logistic Regression classification across 3,000 social media comments.

Python NLP ML Chart.js
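
A rough illustration of the token-level language detection step, using a tiny stand-in lexicon; the project relies on a much larger word list and transliteration normalisation before the TF-IDF stage.

    import re

    # Tiny illustrative lexicons; the real pipeline uses far larger word lists.
    HINDI_HINTS = {"hai", "nahi", "kya", "bahut", "acha", "yaar", "matlab", "thoda"}
    ENGLISH_HINTS = {"the", "is", "good", "movie", "really", "very", "not", "this"}

    def tag_tokens(text: str) -> list[tuple[str, str]]:
        """Tag each romanised token as Hindi, English, or other based on the lexicons."""
        tags = []
        for token in re.findall(r"[a-zA-Z]+", text.lower()):
            if token in HINDI_HINTS:
                tags.append((token, "hi"))
            elif token in ENGLISH_HINTS:
                tags.append((token, "en"))
            else:
                tags.append((token, "other"))
        return tags

    print(tag_tokens("yaar this movie bahut acha hai"))
    # [('yaar', 'hi'), ('this', 'en'), ('movie', 'en'), ('bahut', 'hi'), ('acha', 'hi'), ('hai', 'hi')]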

Python pipeline on 5,000 simulated batch job logs: SLA breach detection, error code classification via Regex, job instability scoring (60/40 weighted formula), and an interactive dashboard. Directly inspired by real monitoring work at FIS.

Python Pandas Chart.js
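
A simplified version of the parsing and scoring steps, assuming a made-up log line format and SLA threshold; the 60/40 split mirrors the weighted formula mentioned above, but which signals it weights here (failure rate and SLA-breach rate) is an assumption.

    import re
    import pandas as pd

    # Assumed log line shape, e.g. "JOB123 ABEND rc=8 runtime=4820s"
    LINE_RE = re.compile(
        r"(?P<job>\w+)\s+(?P<status>OK|ABEND|JCLERR)\s+rc=(?P<rc>\d+)\s+runtime=(?P<runtime>\d+)s"
    )

    def parse_logs(lines: list[str]) -> pd.DataFrame:
        """Extract job name, status, return code, and runtime from raw log lines."""
        rows = [m.groupdict() for line in lines if (m := LINE_RE.search(line))]
        df = pd.DataFrame(rows)
        df[["rc", "runtime"]] = df[["rc", "runtime"]].astype(int)
        return df

    def instability_score(df: pd.DataFrame, sla_seconds: int = 3600) -> pd.Series:
        """Per-job score: failure rate weighted 0.6, SLA-breach rate weighted 0.4."""
        grouped = df.groupby("job")
        failure_rate = grouped["status"].apply(lambda s: (s != "OK").mean())
        breach_rate = grouped["runtime"].apply(lambda r: (r > sla_seconds).mean())
        return (0.6 * failure_rate + 0.4 * breach_rate).sort_values(ascending=False)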

End-to-end sentiment classification on 3,000 tweets across 5 topics using TF-IDF + Logistic Regression. 95% test accuracy, 0.997 ROC-AUC, 5-fold CV validation. Full evaluation including confusion matrix and top discriminative features per class.

Python scikit-learn NLP Chart.js
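
The core of that pipeline fits in a few lines; this sketch assumes a placeholder tweets.csv with text and sentiment columns, and the vectoriser settings shown are illustrative rather than the tuned ones.

    import pandas as pd
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline

    # Placeholder path and column names for illustration.
    df = pd.read_csv("tweets.csv")

    # TF-IDF features feeding a Logistic Regression classifier, scored with 5-fold CV.
    model = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2), min_df=2, stop_words="english"),
        LogisticRegression(max_iter=1000),
    )
    scores = cross_val_score(model, df["text"], df["sentiment"], cv=5, scoring="accuracy")
    print(f"5-fold accuracy: {scores.mean():.3f} ± {scores.std():.3f}")
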
Education & Certifications

B.E. in Computer Engineering
Honours in Data Science
SPPU, Pune
2018 – 2022
Python for Data Science — Coursera
Data Science Foundations — IBM
Machine Learning — Stanford / Coursera
SQL for Data Analysis — Mode Analytics
Data Engineering with Databricks — Databricks Academy
Get in Touch

Open to interesting conversations, collaborations, and the right opportunities

I'm always happy to connect with people working on interesting data problems — especially in financial services, analytics, or anything involving Python and messy real-world data.

Based in Pune, Maharashtra. Open to roles in Pune, anywhere in India, or globally — remote or relocation.