May 24, 2024


My Anti-Drug Is Computer

Big data infrastructure internship | Adaltas

Big data infrastructure internship | Adaltas

Task description

Huge Info and distributed computing are at the core of Adaltas. We accompagny our partners in the deployment, maintenance, and optimization of some of the biggest clusters in France. Considering that not long ago we also deliver assist for day-day operations.

As a great defender and energetic contributor of open up resource, we are at the forefront of the info platform initiative TDP (TOSIT Facts System).

Through this internship, you will contribute to the growth of TDP, its industrialization, and the integration of new open supply factors and new functionalities. You will be accompanied by the Alliage professional group in cost of TDP editor guidance.

You will also do the job with the Kubernetes ecosystem and the automation of datalab deployments Onyxia, which we want to make accessible to our shoppers as effectively as to learners as component of our instructing modules (devops, big information, etc.).

Your skills will enable to extend the products and services of Alliage’s open up supply help featuring. Supported open resource factors consist of TDP, Onyxia, ScyllaDB, … For people who would like to do some website function in addition to large details, we by now have a really functional intranet (ticket management, time administration, innovative look for, mentions and linked articles, …) but other good capabilities are anticipated.

You will apply GitOps launch chains and create articles or blog posts.

You will operate in a staff with senior advisors as mentor.

Enterprise presentation

Adaltas is a consulting agency led by a workforce of open source industry experts concentrating on info administration. We deploy and operate the storage and computing infrastructures in collaboration with our clients.

Lover with Cloudera and Databricks, we are also open source contributors. We invite you to look through our web page and our numerous technical publications to master far more about the organization.

Competencies necessary and to be acquired

Automating the deployment of the Onyxia datalab needs know-how of Kubernetes and Cloud indigenous. You will have to be relaxed with the Kubernetes ecosystem, the Hadoop ecosystem, and the dispersed computing model. You will learn how the standard parts (HDFS, YARN, object storage, Kerberos, OAuth, etc.) do the job together to meet the takes advantage of of huge data.

A fantastic awareness of utilizing Linux and the command line is demanded.

Through the internship, you will understand:

  • The Kubernetes/Hadoop ecosystem in buy to add to the TDP job
  • Securing clusters with Kerberos and SSL/TLS certificates
  • Substantial availability (HA) of providers
  • The distribution of means and workloads
  • Supervision of providers and hosted applications
  • Fault tolerant Hadoop cluster with recoverability of dropped data on infrastructure failure
  • Infrastructure as Code (IaC) by using DevOps equipment these as Ansible and [Vagrant](/en/tag/hashicorp- vagrant/)
  • Be cozy with the architecture and procedure of a details lakehouse
  • Code collaboration with Git, Gitlab and Github


  • Come to be acquainted with the architecture and configuration strategies of the TDP distribution
  • Deploy and take a look at safe and hugely readily available TDP clusters
  • Add to the TDP know-how base with troubleshooting guides, FAQs and posts
  • Actively contribute suggestions and code to make iterative enhancements to the TDP ecosystem
  • Research and analyze the dissimilarities concerning the key Hadoop distributions
  • Update Adaltas Cloud employing Nikita
  • Add to the improvement of a device to collect consumer logs and metrics on TDP and ScyllaDB
  • Actively lead concepts to build our help option

Further info

  • Location: Boulogne Billancourt, France
  • Languages: French or English
  • Beginning date: March 2023
  • Length: 6 months

A lot of the electronic planet runs on Open Source program and the Huge Facts industry is booming. This internship is an possibility to acquire precious experience in each domains. TDP is now the only actually Open up Source Hadoop distribution. This is a good momentum. As part of the TDP group, you will have the chance to find out a single of the core big information processing types and participate in the growth and the potential roadmap of TDP. We believe that that this is an fascinating prospect and that on completion of the internship, you will be all set for a effective vocation in Huge Facts.

Equipment out there

A laptop computer with the adhering to features:

  • 32GB RAM
  • 1TB SSD
  • 8c/16t CPU

A cluster designed up of:

  • 3x 28c/56t Intel Xeon Scalable Gold 6132
  • 3x 192TB RAM DDR4 ECC 2666MHz
  • 3x 14 SSD 480GB SATA Intel S4500 6Gbps

A Kubernetes cluster and a Hadoop cluster.


  • Income 1200 € / thirty day period
  • Restaurant tickets
  • Transportation pass
  • Participation in 1 worldwide meeting

In the past, the conferences which we attended involve the KubeCon organized by the CNCF foundation, the Open up Supply Summit from the Linux Basis and the Fosdem.

For any ask for for supplemental info and to post your application, be sure to get hold of David Worms: