CANNOT SCALE BIG DATA PROCESSING

€30-250 EUR

Closed
Posted about 1 year ago

Paid on delivery
I built an ETL pipeline to process terabytes of data. To do that, I set up a Spark cluster (Scala) and a MinIO server for object storage. Using 10 virtual machines for Spark processing, I can process and save 200 gigabytes in roughly 30 minutes. The issue is that I cannot scale this processing: if I double the number of Spark virtual machines, the processing time does not change. I need a Data Architect with enough expertise to help me identify the bottleneck and fix the issue.

ARCHITECTURE SUMMARY
• I use virtual machines set up on-premises using VMware ESXi 6
• Physical machines (which host the VMs) are on a 1 GB network
• There is no overcommitment of vCPU or RAM
• Spark VMs: 16 vCPU, 64 GB RAM
• MinIO (storage): 16 vCPU, 64 GB RAM, configured using RAID 0

SOME DETAILS ABOUT DATA PROCESSING
The process is straightforward.
• Read data from 2 sources on MinIO
• Union the data from the two sources
• Filter out empty values in one column of the resulting dataset
• Apply 2 groupBy operations on that column (we save intermediate values after the first groupBy)
• Union the dataset obtained after the groupBy operations with the rows that have empty values in that column
• Save the whole result back to MinIO
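As a rough sanity check on the figures above, the sketch below compares the reported end-to-end throughput with the theoretical ceiling of a single network link. It rests on two assumptions that the post does not confirm: that "1 GB network" means a 1 Gbit/s link, and that sizes are decimal (1 GB = 10^9 bytes). The function names are illustrative only.

```python
# Back-of-envelope check using the figures from the post.
# ASSUMPTION: "1 GB network" means a 1 Gbit/s link (not confirmed by the post).
# ASSUMPTION: sizes are decimal, i.e. 1 GB = 10**9 bytes.

DATA_BYTES = 200 * 10**9       # 200 GB processed per run (from the post)
RUN_SECONDS = 30 * 60          # roughly 30 minutes (from the post)
LINK_BITS_PER_SECOND = 10**9   # assumed 1 Gbit/s link


def throughput_mb_per_s(data_bytes: int, seconds: float) -> float:
    """Average end-to-end throughput in MB/s."""
    return data_bytes / seconds / 10**6


def link_ceiling_mb_per_s(bits_per_second: int) -> float:
    """Theoretical maximum rate of one link in MB/s, ignoring protocol overhead."""
    return bits_per_second / 8 / 10**6


observed = throughput_mb_per_s(DATA_BYTES, RUN_SECONDS)
ceiling = link_ceiling_mb_per_s(LINK_BITS_PER_SECOND)

print(f"observed: {observed:.1f} MB/s")           # observed: 111.1 MB/s
print(f"ceiling: {ceiling:.1f} MB/s")             # ceiling: 125.0 MB/s
print(f"utilisation: {observed / ceiling:.0%}")   # utilisation: 89%
```

Under these assumptions, the reported rate is already close to what a single 1 Gbit/s link can carry. That would be consistent with adding Spark VMs having no effect if all traffic passes through one MinIO host, but it does not prove the network is the bottleneck. Measuring actual link, disk, and CPU utilisation during a run would be needed to confirm it.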
Project ID: 35893478

About the project

5 proposals
Remote project
Active 1 year ago

5 freelancers are bidding an average of €334 EUR for this job
Hi there, I am excited to share my expertise and skills in data engineering and big data, which I have acquired over the past 3 years. I am confident that I can meet your requirements. I would be delighted to work with you, and I look forward to hearing more about the project if you are interested. PS: my services are satisfaction guaranteed. PS 2: I can communicate with you in French.
€140 EUR in 5 days
5.0 (1 review)
Hi there, how are you? I have gone through your project details. I have extensive experience in VMware, Spark, data engineering, big data, and Amazon S3. Please start a chat with me so we can discuss CANNOT SCALE BIG DATA PROCESSING. You can check my profile to see that I have a 100% completion rate on my projects, and it would be my pleasure to build a long-term relationship with you. All my skills are relevant to this project. Hoping to hear from you soon. Cheers, Rashid Amjad.
€250 EUR in 8 days
0.0 (0 reviews)
Hi Saint Denis, I am a Data Engineer with 7+ years of experience. I would like to offer my help to fix this issue. Please let me know if we can connect.
€140 EUR in 7 days
0.0 (0 reviews)
Hi, I have 10 years of experience in this. I would like to work for you, as I have already done similar tasks and supported many projects and people in the same way. I would like to hear from you. Thank you for
€140 EUR in 7 days
5.0 (1 review)
Hi, I am a data engineer with 5 years of experience. I have designed and built large-scale Spark pipelines for use cases similar to yours. Unfortunately, as you might be aware, there is no straightforward answer to your problem. The bottleneck could be anywhere. Finding it will require understanding the existing system and all its parts, plus some experimentation; then we can come up with a strategy to solve the issue.
€1,000 EUR in 15 days
0.0 (0 reviews)

About the client

SAINT DENIS, France
4.9
5
Payment method verified
Member since Sep 18, 2016
