Hi, I'm Kunal Deokar

Technical Data Analyst at Addepar, Pune — building data pipelines, analytics tools, and AI-assisted workflows for financial data. 4+ years across financial services, collaborating daily with Data Solution Consultants (DSCs), clients, and cross-functional teams to deliver production-ready data solutions.

About Me

A bit about who I am and what drives my work

I'm a data analyst who sits at the intersection of financial data operations and analytics engineering. My day-to-day involves handling complex client data migrations, building Python tools that automate what used to be manual, and making messy real-world data usable for wealth management platforms. I work closely with Data Solution Consultants (DSCs), clients, and cross-functional teams — translating business requirements into reliable data pipelines and bridging the gap between operations and delivery.

I believe the most valuable skill isn't knowing every function by heart — it's knowing what to build and why, then figuring out the how. I use AI, code assistants, and the internet as tools, the same way engineers use IDEs and documentation.

Outside work: head of the music team in my college cultural group, Rangbhumi; occasional interviewer at Addepar; and always up for a conversation about cricket or financial markets.

4+ years in financial data
500+ client accounts migrated
5,000+ batch job logs parsed automatically
40% reduction in triage effort at FIS
Skills

Tools and technologies I work with regularly

Languages
Python SQL HTML/CSS
Libraries & Frameworks
Pandas NumPy scikit-learn Matplotlib Streamlit Regex
Platforms & Tools
Databricks Unity Catalog Git VS Code Jupyter GitHub Jira
Databases
MySQL PostgreSQL Delta Lake
AI & ML
LLM Integration NLP TF-IDF Logistic Regression Model Serving
Domain
Financial Data Migration ETL Pipelines Data Quality Batch Processing SLA Monitoring
Experience

Where I've worked and what I've built

Technical Data Analyst
Addepar, Pune
July 2025 – Present
  • Designed and deployed the HDC Security and Account Agent — an AI-powered Databricks App (Streamlit + Claude API via Databricks Model Serving) running on US and EU production; built a multi-step waterfall matching engine (ISIN, CUSIP, SEDOL, fuzzy-name), an ML-based model-type inference engine (~85–90% auto-suggestion rate), and a natural language chat interface with 8 tool definitions — cutting hours of manual cross-referencing per client onboarding (a simplified sketch of the matching step appears after these highlights)
  • Built a UBS Data Engineering Framework from scratch for KI Capitals including a Data Quality and Cutover Readiness module and an Optimization Logic Engine that picks transformation paths based on legacy performance signatures — reduced effort for all future UBS migrations by 40–60%
  • Led end-to-end data migration for Fortis Advisors (366 high-complexity accounts, Black Diamond legacy) in Databricks; refactored core transformation pipeline with modular logic and upstream quality checks — delivered production-ready imports in ~4 weeks vs. 5–8 week team benchmark
  • Built a config-driven Batch Processing and Precision Mapping Framework in Python (Pandas, Regex, NumPy, OpenPyXL) for JCF Montreal — processed 132 structurally inconsistent Excel tabs with formula-trapped values; zero errors in production sample load across 56 accounts
  • Handled end-to-end historical data conversion for Clarity Partners SA (69 accounts, EU production): ingested 70 source files, mapped 621 ISINs, reverse-engineered 6 net value calculation patterns, and built a 972-file pricing pipeline delivering 195,329 prices with full duplicate resolution audit trail
  • Mentored 2–3 incoming Technical Data Analysts through structured onboarding and domain ramp-up; conducted 4–5 candidate interviews for the Lead role
  • Collaborated daily with Data Solution Consultants (DSCs) and clients to clarify data requirements, resolve mapping ambiguities, and align on delivery timelines across active migration projects
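
For a flavour of the waterfall matching approach in the first highlight above, here is a minimal sketch: exact identifier joins in priority order (ISIN, then CUSIP, then SEDOL), with a fuzzy name fallback for whatever is left. The column names and the difflib-based fuzzy step are illustrative assumptions, not the production engine.

    import difflib
    import pandas as pd

    def waterfall_match(source: pd.DataFrame, reference: pd.DataFrame,
                        fuzzy_cutoff: float = 0.9) -> pd.DataFrame:
        """Resolve each source row to a reference security_id, recording which step matched."""
        source = source.copy()
        source["security_id"] = pd.NA
        source["match_method"] = pd.NA

        # Steps 1-3: exact identifier joins, in priority order.
        for key in ["isin", "cusip", "sedol"]:
            lookup = (reference.dropna(subset=[key])
                               .drop_duplicates(subset=[key])
                               .set_index(key)["security_id"])
            open_rows = source["security_id"].isna() & source[key].notna()
            source.loc[open_rows, "security_id"] = source.loc[open_rows, key].map(lookup)
            source.loc[open_rows & source["security_id"].notna(), "match_method"] = key

        # Step 4: fuzzy name match for anything still unresolved.
        by_name = reference.drop_duplicates(subset=["name"]).set_index("name")["security_id"]
        for idx in source.index[source["security_id"].isna()]:
            hit = difflib.get_close_matches(source.at[idx, "name"], by_name.index.tolist(),
                                            n=1, cutoff=fuzzy_cutoff)
            if hit:
                source.at[idx, "security_id"] = by_name[hit[0]]
                source.at[idx, "match_method"] = "fuzzy_name"
        return source
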
Software Engineer I
FIS (Fidelity Information Services), Pune
June 2022 – June 2025
  • Automated parsing of 5,000+ batch job execution logs using Python (Regex + Pandas) to detect SLA breaches, classify error codes (RC/ABEND/JCL), and score job instability — increasing triage efficiency by 40%
  • Built a Streamlit dashboard for job failure pattern analysis and SLA trend monitoring; integrated SQL outputs with Excel reports for root cause tracking used directly by leadership — reduced weekly reporting effort by 60%
  • Developed internal data quality tools for financial batch processes covering portfolio records, cash flow statements, and position data; delivered automated weekly reports reducing manual operations checks by 60%+
  • Collaborated daily with Client Service Representatives (CSRs), Business Analysts (BAs), and end clients to gather requirements, resolve data discrepancies, and coordinate production deployments — acting as the technical liaison between operations and business teams
  • Actively participated in corporate social responsibility (CSR) initiatives throughout tenure, representing the organisation in community engagement programmes
Lead Intern — Technology Platform Development
EvolvingX, Pune
Sep 2020 – Dec 2021
  • Led a team of 3 frontend developers, managing delivery timelines, code quality, and architectural decisions for a product web application
  • Built a reusable React.js component library; contributed to database schema design and ran user research activities including persona creation and journey mapping
Projects

Personal projects built end-to-end — data, ML, dashboards, and deployed apps

Reconciliation tool for comparing two financial datasets: configurable join keys, per-field tolerance thresholds (absolute, %, basis points), break severity ranking, and a downloadable 8-sheet Excel report. Three pre-built use cases: positions, prices, transactions.

Python Pandas Streamlit scikit-learn
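
A minimal sketch of the tolerance logic, assuming illustrative field names and thresholds; the deployed app drives these from its configuration and layers break severity ranking and the Excel report on top.

    import pandas as pd

    # Illustrative tolerance config: one spec per compared field (types: absolute, percent, bps).
    TOLERANCES = {
        "quantity":     {"type": "absolute", "value": 0.01},
        "market_value": {"type": "percent",  "value": 0.5},   # 0.5 percent
        "price":        {"type": "bps",      "value": 5},     # 5 basis points
    }

    def find_breaks(left: pd.DataFrame, right: pd.DataFrame, key: str = "account_id") -> pd.DataFrame:
        """Join the two datasets on the key and flag any field difference outside its tolerance."""
        merged = left.merge(right, on=key, suffixes=("_a", "_b"))
        breaks = []
        for field, tol in TOLERANCES.items():
            diff = (merged[f"{field}_a"] - merged[f"{field}_b"]).abs()
            if tol["type"] == "absolute":
                limit = tol["value"]
            elif tol["type"] == "percent":
                limit = merged[f"{field}_a"].abs() * tol["value"] / 100
            else:  # basis points
                limit = merged[f"{field}_a"].abs() * tol["value"] / 10_000
            bad = merged.loc[diff > limit, [key]].copy()
            bad["field"] = field
            bad["difference"] = diff[diff > limit]
            breaks.append(bad)
        return pd.concat(breaks, ignore_index=True)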

Data pipeline for three financial datasets with configurable quality checks (nulls, range, schema, cross-field, duplicates), transformation with derived metrics, and SQLite storage. Dashboard tracks run history, pass rates per dataset, and lets you explore the processed data.

Python Pandas Streamlit SQLite
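
A minimal sketch of the config-driven check runner, with example rule names and thresholds; the actual pipeline covers more check types (schema, cross-field) and persists results to SQLite for the run-history dashboard.

    import pandas as pd

    # Example rules only; each rule is plain data, so new checks can be added without code changes.
    RULES = [
        {"check": "not_null",      "column": "account_id"},
        {"check": "range",         "column": "price", "min": 0, "max": 1e7},
        {"check": "no_duplicates", "columns": ["account_id", "trade_date", "security_id"]},
    ]

    def run_checks(df: pd.DataFrame) -> list[dict]:
        """Run each configured rule and report failure counts, ready to persist per run."""
        results = []
        for rule in RULES:
            if rule["check"] == "not_null":
                failures = int(df[rule["column"]].isna().sum())
            elif rule["check"] == "range":
                col = df[rule["column"]]
                failures = int(((col < rule["min"]) | (col > rule["max"])).sum())
            else:  # no_duplicates
                failures = int(df.duplicated(subset=rule["columns"]).sum())
            results.append({"rule": rule["check"], "failures": failures, "passed": failures == 0})
        return results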

Data quality tool for financial client onboarding files. Upload any Excel file and get 12 automated checks covering schema, identifiers, formatting, and business rules, with a readiness score and downloadable report.

Python Pandas Streamlit
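
One identifier rule, sketched below as an example of the kind of check the tool runs: a standard structural and Luhn-checksum validation for ISINs. The other checks and the readiness scoring are not reproduced here.

    import re

    def isin_is_valid(isin: str) -> bool:
        """Validate ISIN structure (2 letters, 9 alphanumerics, 1 check digit) and its Luhn checksum."""
        if not re.fullmatch(r"[A-Z]{2}[A-Z0-9]{9}\d", isin or ""):
            return False
        digits = "".join(str(int(ch, 36)) for ch in isin)  # expand letters: A -> 10 ... Z -> 35
        total = 0
        for pos, ch in enumerate(reversed(digits)):
            d = int(ch)
            if pos % 2 == 1:                    # double every second digit from the right
                d = d * 2 - 9 if d * 2 > 9 else d * 2
            total += d
        return total % 10 == 0

    print(isin_is_valid("US0378331005"))  # True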

NLP pipeline for Hinglish (code-mixed Hindi-English) text. Handles token-level language detection, transliteration normalisation, TF-IDF features, and Logistic Regression classification across 3,000 social media comments.

Python NLP ML Chart.js
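
A rough illustration of the token-level language detection step, using a tiny stand-in lexicon; the project relies on a much larger word list and transliteration normalisation before the TF-IDF stage.

    import re

    # Tiny illustrative lexicons; the real pipeline uses far larger word lists.
    HINDI_HINTS = {"hai", "nahi", "kya", "bahut", "acha", "yaar", "matlab", "thoda"}
    ENGLISH_HINTS = {"the", "is", "good", "movie", "really", "very", "not", "this"}

    def tag_tokens(text: str) -> list[tuple[str, str]]:
        """Tag each romanised token as Hindi, English, or other based on the lexicons."""
        tags = []
        for token in re.findall(r"[a-zA-Z]+", text.lower()):
            if token in HINDI_HINTS:
                tags.append((token, "hi"))
            elif token in ENGLISH_HINTS:
                tags.append((token, "en"))
            else:
                tags.append((token, "other"))
        return tags

    print(tag_tokens("yaar this movie bahut acha hai"))
    # [('yaar', 'hi'), ('this', 'en'), ('movie', 'en'), ('bahut', 'hi'), ('acha', 'hi'), ('hai', 'hi')]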

Python pipeline on 5,000 simulated batch job logs: SLA breach detection, error code classification via Regex, job instability scoring (60/40 weighted formula), and an interactive dashboard. Directly inspired by real monitoring work at FIS.

Python Pandas Chart.js
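
A simplified version of the parsing and scoring steps, assuming a made-up log line format and SLA threshold; the 60/40 split mirrors the weighted formula mentioned above, but which signals it weights here (failure rate and SLA-breach rate) is an assumption.

    import re
    import pandas as pd

    # Assumed log line shape, e.g. "JOB123 ABEND rc=8 runtime=4820s"
    LINE_RE = re.compile(
        r"(?P<job>\w+)\s+(?P<status>OK|ABEND|JCLERR)\s+rc=(?P<rc>\d+)\s+runtime=(?P<runtime>\d+)s"
    )

    def parse_logs(lines: list[str]) -> pd.DataFrame:
        """Extract job name, status, return code, and runtime from raw log lines."""
        rows = [m.groupdict() for line in lines if (m := LINE_RE.search(line))]
        df = pd.DataFrame(rows)
        df[["rc", "runtime"]] = df[["rc", "runtime"]].astype(int)
        return df

    def instability_score(df: pd.DataFrame, sla_seconds: int = 3600) -> pd.Series:
        """Per-job score: failure rate weighted 0.6, SLA-breach rate weighted 0.4."""
        grouped = df.groupby("job")
        failure_rate = grouped["status"].apply(lambda s: (s != "OK").mean())
        breach_rate = grouped["runtime"].apply(lambda r: (r > sla_seconds).mean())
        return (0.6 * failure_rate + 0.4 * breach_rate).sort_values(ascending=False)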

End-to-end sentiment classification on 3,000 tweets across 5 topics using TF-IDF + Logistic Regression. 95% test accuracy, 0.997 ROC-AUC, 5-fold CV validation. Full evaluation including confusion matrix and top discriminative features per class.

Python scikit-learn NLP Chart.js
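
The core of that pipeline fits in a few lines; this sketch assumes a placeholder tweets.csv with text and sentiment columns, and the vectoriser settings shown are illustrative rather than the tuned ones.

    import pandas as pd
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline

    # Placeholder path and column names for illustration.
    df = pd.read_csv("tweets.csv")

    # TF-IDF features feeding a Logistic Regression classifier, scored with 5-fold CV.
    model = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2), min_df=2, stop_words="english"),
        LogisticRegression(max_iter=1000),
    )
    scores = cross_val_score(model, df["text"], df["sentiment"], cv=5, scoring="accuracy")
    print(f"5-fold accuracy: {scores.mean():.3f} ± {scores.std():.3f}")
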
Education & Certifications

B.E. in Computer Engineering
Honours in Data Science
SPPU, Pune
2018 – 2022
Python for Data Science — Coursera
Data Science Foundations — IBM
Machine Learning — Stanford / Coursera
SQL for Data Analysis — Mode Analytics
Data Engineering with Databricks — Databricks Academy
Get in Touch

Open to interesting conversations, collaborations, and the right opportunities

I'm always happy to connect with people working on interesting data problems — especially in financial services, analytics, or anything involving Python and messy real-world data.

Based in Pune, Maharashtra. Open to roles in Pune, anywhere in India, or globally — remote or relocation.