Data Engineer (Scala, Spark)

Capgemini

1mo ago

  • Job
    Full-time
    Junior Level
  • Data
    IT & Cybersecurity
  • Madrid
    Remote

AI generated summary

  • You need 2+ years in Python/Scala and Spark, cloud experience (Azure/AWS/GCP), knowledge of Databricks, Airflow, and a team player attitude with a desire to learn.
  • You will develop projects from scratch using Apache Spark, design architectures, build ETLs with Python, and create scalable pipelines on cloud platforms while collaborating with an international team.

Requirements

  • At least 2 years of experience in Python or Scala and Spark, processing large volumes of data.
  • Experience with or knowledge of cloud platforms (Azure, AWS, or GCP).
  • Experience with Databricks, Data Factory, Synapse, Apache Airflow, etc.
  • A team player with a desire to keep learning.

Responsibilities

  • You will work as a Data Engineer and…
  • You will be able to develop projects from scratch in collaboration with the team.
  • You will develop projects using Apache Spark for both batch and real-time architectures.
  • You will take part in architecture design and decision-making in a constructive environment.
  • You will develop ETLs with Python (Scala / Spark).
  • You will develop projects in the cloud (Azure / AWS / GCP).
  • You will build scalable pipelines with a range of technologies.
  • English is quite important for our projects: in many cases we work with international clients, and day-to-day work is conducted in English.

FAQs

What programming languages are preferred for this role?

The preferred programming languages for this role are Python and Scala, particularly with experience in Spark.

What kind of projects will I work on as a Data Engineer?

You will develop projects from scratch, utilizing Apache Spark for both Batch and Real-Time architectures, and participate in designing architectures and decision-making.

Is cloud experience required for this position?

Yes, experience or knowledge in cloud platforms such as Azure, AWS, or GCP is required.

Will I be required to communicate in English?

Yes, English is important for our projects as we often work with international clients and daily communication is primarily in English.

How many years of experience should I have for this role?

You should have at least 2 years of experience in Python or Scala and Spark, processing large volumes of data.

What tools and technologies will I be working with?

You will work with tools such as Databricks, Data Factory, Synapse, and Apache Airflow to develop ETLs and scalable pipelines.

Is a background in teamwork important for this position?

Yes, being a team player and having a willingness to continue learning are important qualities for this role.

Does Capgemini offer training and development opportunities?

Yes, Capgemini offers a wide range of training opportunities, including access to platforms like Coursera, Udemy, and Capgemini University.

What are some benefits of working at Capgemini?

Benefits include a unique work environment, flexible holiday options, continuous training, wellbeing initiatives, and participation in volunteer and social action activities, among others.

Is there a policy in place for diversity and inclusion?

Yes, Capgemini has a commitment to inclusion and equality of opportunity, implementing a Plan of Equality and a Code of Ethics to ensure non-discrimination based on various personal and social circumstances.

Get the Future You Want

Industry: Technology
Employees: 10,001+
Founded Year: 1967

Mission & Purpose

Capgemini is a global leader in partnering with companies to transform and manage their business by harnessing the power of technology. The Group is guided every day by its purpose of unleashing human energy through technology for an inclusive and sustainable future. It is a responsible and diverse organization of 360,000 team members in more than 50 countries. With its strong 55-year heritage and deep industry expertise, Capgemini is trusted by its clients to address the entire breadth of their business needs, from strategy and design to operations, fueled by the fast-evolving and innovative world of cloud, data, AI, connectivity, software, digital engineering, and platforms. In 2022, the Group reported global revenues of €22 billion.