Data Science Intern - NLP, LLM and GenAI

Applications are closed

  • Internship
    Full-time
    Off-cycle Internship
  • Data
    IT & Cybersecurity
  • New York City

Requirements

  • Working towards a Bachelor's or Master's degree in a technical field of study in Engineering, Computer Science, Computer Engineering, Management Information Systems, or IT
  • Anticipated graduation date in Fall 2024 or Spring 2025
  • Strong data manipulation skills, including cleaning and handling data
  • Experience using Python and Java
  • Willingness to work with unstructured, messy data
  • Excellent communication skills

Additional Preferred Qualifications

  • An aptitude to learn and adapt quickly in a rapidly changing environment
  • Intellectual curiosity and desire to learn new things, techniques, and technologies
  • The ability to build strong working relationships in a collaborative setting
  • Hands-on experience leveraging large sets of structured and unstructured data to develop data-driven tactical and strategic analytics and insights using ML, NLP, and computer vision solutions.
  • Demonstrated hands-on experience with Python, Hugging Face, TensorFlow, Keras, PyTorch, Spark, or similar statistical tools. Expert-level Python programming skills.
  • Hands-on experience developing natural language processing (NLP) models, ideally with transformer architectures.
  • Knowledge of information search and retrieval at scale, using solutions ranging from keyword search to semantic search with embeddings.

Responsibilities

  • ML, Gen AI, NLP, LLM Model Development: Design and develop custom ML, Gen AI, NLP, and LLM models for batch and stream processing-based AI/ML pipelines. Model components will include data ingestion, preprocessing, search and retrieval, Retrieval Augmented Generation (RAG), NLP/LLM model development, fine-tuning, and prompt engineering. Ensure the solution meets all technical and business requirements. Work closely with other members of the data science, MLOps, and technology teams in the design, development, and implementation of the ML model solutions.
  • ML, NLP, LLM Model Evaluation: Work closely with the other data science team members to develop, validate, and maintain robust solutions and tools for evaluating model performance, accuracy, consistency, and reliability during development and UAT. Implement model optimizations to improve system efficiency.
  • NLP, LLM, Gen AI Model Deployment: Work closely with the MLOps team to deploy machine learning models into production environments, ensuring reliability and scalability.
  • Documentation: Write and maintain comprehensive documentation of ML modeling processes and procedures for reference and knowledge sharing.
  • Develop Models Based on Standards and Best Practices: Ensure that models are designed and developed in line with the standards, governance, and best practices for ML model development set by senior Data Science and MLOps leads.
  • Assist in Problem Solving: Troubleshoot complex issues related to machine learning model development and data pipelines and develop innovative solutions.

Industry: Finance
Employees: 10,001+

Mission & Purpose

S&P Global (NYSE: SPGI) provides essential intelligence. We enable governments, businesses and individuals with the right data, expertise and connected technology so that they can make decisions with conviction. From helping our customers assess new investments to guiding them through sustainability and energy transition across supply chains, we unlock new opportunities, solve challenges and accelerate progress for the world. We are widely sought after by many of the world’s leading organizations to provide credit ratings, benchmarks, analytics and workflow solutions in the global capital, commodity and automotive markets. With every one of our offerings, we help the world’s leading organizations plan for tomorrow, today. For more information, visit www.spglobal.com.

Our divisions include:

  • S&P Global Market Intelligence partners with customers to broaden their perspective and operate with confidence by bringing them leading data sources and technologies that embed insight in their daily work.
  • S&P Global Ratings offers critical insights for credit, risk and sustainable finance solutions that are essential to translating complexity into clarity, so market participants can uncover opportunities.
  • S&P Global Commodity Insights enables organizations to create long-term, sustainable value with data and insights for a complete view on the global energy and commodities markets.
  • S&P Global Mobility turns invaluable insights captured from automotive data to help our clients understand today’s market, reach more customers, and shape the future of automotive mobility.
  • S&P Dow Jones Indices provides iconic and innovative index solutions, bringing transparency to global capital markets.
  • S&P Global Engineering Solutions solves for tomorrow’s challenges today by transforming workflows and end-user experiences with data, insights and technology.