Data Scientist - Codincity

Job Overview

We are looking for a Data Scientist with strong experience in Generative AI and Large Language Models (LLMs) to design, build, and deploy intelligent AI-powered applications. The ideal candidate has hands-on expertise in LLM integration, Retrieval-Augmented Generation (RAG), prompt engineering, and AI model evaluation, and builds systems with safety, accuracy, and performance in mind.

The role involves working closely with data engineers, ML engineers, and product teams to develop scalable GenAI solutions deployed on cloud platforms such as Azure, using services like Azure OpenAI.

Key Responsibilities

Generative AI & LLM Development

Design, build, and optimize LLM-based applications using platforms like OpenAI / Azure OpenAI and locally hosted models.
Develop and refine prompt engineering strategies and system prompts to improve response accuracy and reliability.
Implement Retrieval-Augmented Generation (RAG) pipelines to enable context-aware responses.

RAG & Knowledge Integration

Build scalable retrieval systems using techniques such as document chunking, embeddings, and semantic search.
Implement and manage vector databases such as FAISS or Pinecone for knowledge retrieval.
Integrate structured and unstructured data sources into GenAI pipelines.

Frameworks & AI Development

Develop AI workflows using frameworks like LangChain and LlamaIndex.
Build modular pipelines to support conversational AI, document intelligence, and knowledge assistants.

Model Evaluation & Experimentation

Evaluate LLM outputs using quantitative metrics such as BLEU and BERTScore, along with other benchmarking techniques.
Implement human-in-the-loop evaluation frameworks to improve model quality.
Track experiments and model performance using MLflow or similar tools.

AI Safety & Responsible AI

Implement guardrails and safety mechanisms to handle toxicity, bias, and harmful outputs.
Detect and manage PII (Personally Identifiable Information) in AI responses.
Apply techniques for grounding, citation, and hallucination mitigation to ensure reliable AI outputs.

Deployment & Optimization

Deploy scalable GenAI services on Azure, particularly using Azure OpenAI (AOAI).
Optimize latency, cost, and inference performance for production systems.
Monitor system performance and continuously improve reliability.

Required Skills & Technologies

Programming & AI

Python
NLP libraries such as spaCy and Hugging Face Transformers
Data processing and model experimentation

GenAI & LLM Technologies

OpenAI / Azure OpenAI APIs
Prompt engineering and system prompt design
Retrieval-Augmented Generation (RAG)

Frameworks

LangChain
LlamaIndex

Vector Databases

FAISS
Pinecone

Evaluation & Experiment Tracking

BLEU, BERTScore
MLflow or similar experiment tracking tools

Cloud & Deployment

Azure
Azure OpenAI services
Model deployment and monitoring

Preferred Qualifications

Experience building enterprise GenAI applications such as chatbots, document assistants, or knowledge search platforms.
Understanding of MLOps and LLMOps practices.
Experience with fine-tuning or parameter-efficient tuning techniques.
Familiarity with data governance, AI ethics, and responsible AI frameworks.

Employment Information

Codincity

India - Bengaluru

India - Coimbatore

India - Chennai

USA

Australia
