
Develop Text Classification and/or Clustering Algorithms in Python

$250-750 USD

Completed
Posted almost 8 years ago


Paid on delivery
We require assistance on the following tasks. Please contact us directly to describe how you would solve them. Russian language skills may be necessary.

1) Task: Develop/employ a text-classification algorithm in Python or R that classifies items as one of several thousand 10-digit product codes using a descriptive text field of roughly 300 characters in UTF-8 (Russian / Cyrillic).
Description: We have a database of several million textual descriptions of products that have been entered by humans. Each entry is connected to a 10-digit product code, but the same product code can be used for multiple differing textual entries. We require a text-classification algorithm that classifies a document probabilistically and that can then be applied to another dataset (see task 2). This task requires tokenizing, stemming, and removing stop words, and therefore you may need to know Russian or to use available NLTK packages. Similarly, several different algorithms may need to be tried to improve precision.
Output: Python scripts/algorithm(s) classifying documents into 10-digit product codes that can be used in task 2.

2) Task: Use the classification algorithm in (1) to classify textual entries in a second dataset.
Description: Once the clean list has been created, employ a machine learning algorithm to assign the 10-digit codes to a target dataset of over 60 million textual product descriptions in UTF-8 (Russian / Cyrillic). Not all entries will have sufficient information to be classified ('leftovers'), and these should be marked as such. For example, an entry could be marked as a leftover if no classification has a probability above some threshold. Also, the dataset in (1) only contains examples of a subset of the items in the second dataset, but we will be able to estimate which items these are.
Output: The second dataset of 60 million entries, matched to 10-digit product codes.

3) Task: For the 'leftovers' of (2), develop/employ a text clustering algorithm that groups entries into k subclasses.
Description: We will provide you with a higher-level grouping variable for the 'leftovers' and a number k that designates how many clusters we need within each grouping. Your task will be to use a text clustering algorithm to create k clusters within each higher-level group of 'leftovers'.
Output: A unique variable designating cluster membership for each item in the 'leftovers' (those without 10-digit product codes from step 2).
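The three tasks above can be illustrated with a minimal, dependency-free Python sketch. It is not the client's or any bidder's actual method. It assumes a simple multinomial Naive Bayes classifier, a probability threshold of 0.8 for marking 'leftovers', and k-means on bag-of-words counts with a deterministic farthest-first initialisation. All names (NaiveBayes, classify, cluster_leftovers, the 10-digit codes) are hypothetical, and stemming and stop-word removal are omitted; in practice something like NLTK's Russian Snowball stemmer and a scalable library would likely be needed for tens of millions of rows.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and keep runs of letters; the pattern is Unicode-aware, so Cyrillic works."""
    return re.findall(r"[^\W\d_]+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (hypothetical helper class)."""

    def fit(self, texts, codes):
        self.doc_counts = Counter(codes)         # documents per product code
        self.word_counts = defaultdict(Counter)  # token counts per product code
        for text, code in zip(texts, codes):
            self.word_counts[code].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict_proba(self, text):
        """Return {code: probability}, or {} if no token was seen in training."""
        tokens = [t for t in tokenize(text) if t in self.vocab]
        if not tokens:
            return {}
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for code, n_docs in self.doc_counts.items():
            counts = self.word_counts[code]
            denom = sum(counts.values()) + len(self.vocab)
            log_scores[code] = math.log(n_docs / total_docs) + sum(
                math.log((counts[t] + 1) / denom) for t in tokens
            )
        top = max(log_scores.values())  # subtract the max for numerical stability
        exp = {c: math.exp(s - top) for c, s in log_scores.items()}
        norm = sum(exp.values())
        return {c: v / norm for c, v in exp.items()}


def classify(model, text, threshold=0.8):
    """Task 2: return the best code, or None (a 'leftover') below the assumed threshold."""
    probs = model.predict_proba(text)
    if not probs:
        return None
    code, p = max(probs.items(), key=lambda kv: kv[1])
    return code if p >= threshold else None


def sq_dist(a, b):
    """Squared Euclidean distance between two sparse vectors stored as dicts."""
    return sum((a.get(w, 0.0) - b.get(w, 0.0)) ** 2 for w in a.keys() | b.keys())


def kmeans(vectors, k, iters=20):
    """Plain k-means; assumes 1 <= k <= len(vectors)."""
    centroids = [vectors[0]]
    while len(centroids) < k:  # farthest-first init keeps the sketch deterministic
        centroids.append(max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids)))
    labels = [0] * len(vectors)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: sq_dist(v, centroids[j])) for v in vectors]
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                total = Counter()
                for m in members:
                    total.update(m)
                centroids[j] = {w: c / len(members) for w, c in total.items()}
    return labels


def cluster_leftovers(leftovers, k):
    """Task 3: leftovers is a list of (item_id, group, text); returns {item_id: 'group-cluster'}."""
    by_group = defaultdict(list)
    for item_id, group, text in leftovers:
        by_group[group].append((item_id, Counter(tokenize(text))))
    result = {}
    for group, items in by_group.items():
        labels = kmeans([vec for _, vec in items], min(k, len(items)))
        for (item_id, _), label in zip(items, labels):
            result[item_id] = f"{group}-{label}"
    return result
```

A model would be trained with NaiveBayes().fit(texts, codes) on the labelled dataset, applied with classify(model, text) to each row of the target dataset, and every row that comes back as None would be passed to cluster_leftovers together with its higher-level group. The sketch uses one k for all groups; if k differs per group, it could be passed as a mapping instead.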
Project ID: 10479220

About the project

18 proposals
Remote project
Active 8 years ago

Awarded to:
Hello! My name is Andrey. I'm a physicist from Russia with experience in the machine learning field. I know how to implement ML methods in practice. For example, I developed a predictive algorithm for sports betting. You can find additional information about this job in my Upwork profile. I also have experience in text classification: I developed a model for classifying Wikipedia articles at my work. You can also find me on kaggle.com; my nick is gradiente. But first of all I have to estimate the amount of work and see text samples for classification. I hope to work with you on this project!
$611 USD in 20 days
5.0 (6 reviews)
5.5
18 freelancers are bidding on average $1,329 USD for this job
Hi there. I would love to be part of this project as it seems very interesting. I am a data scientist with experience applying data mining algorithms to large amounts of data for prediction and description. I do not know Russian, but I do have experience using existing packages to preprocess data. I would do all tasks in Python. Hope to hear back from you soon. Thanks, Daniel
$526 USD in 10 days
4.9 (101 reviews)
7.8
We are a group of Data Scientists based in Bangalore. Our core areas of expertise are big data and machine learning.
$10,000 USD in 40 days
4.9 (9 reviews)
6.4
I am a computer science professional with a PhD degree and excellent skills in Python and a number of other languages. I've done many projects involving Clustering or Classification. I'm also a fluent Russian speaker. Please see reviews on my profile. It would be my pleasure to do your project. Here is another large project in which I had to process a large volume of texts in Russian using Python: https://www.freelancer.com/projects/Python/Data-Extraction-from-Word-documents/
$1,000 USD in 10 days
5.0 (59 reviews)
6.3
I am very interested in your project. I have experience in this field. If you work with me, you will succeed. I am ready to work with you now. Phon.
$736 USD in 10 days
4.9 (25 reviews)
5.8
Dear Client, Greetings from Flowgica technologies. I have experience with these skills, and we have done similar work, so I look forward to discussing the project and moving ahead. Please check our Freelancer portfolio at https://www.freelancer.com/u/mmadi.html?page=portfolio I am ready to work with you and await your response. Thanks & Regards, Mmadi
$600 USD in 10 days
5.0 (1 review)
4.0
My name is Mike and I’m from the UK. I work with individual clients and also provide outsourcing services for a number of UK- and USA-based agencies. Your project description sounds interesting to me, and I have the skills and experience required to complete this project. I can show you some examples of my work. Please contact me to discuss your project.
$555 USD in 10 days
5.0 (1 review)
3.2
I have gone through your requirements. We have done a similar kind of job before. I look forward to your earliest reply so that we can discuss the project.
$555 USD in 10 days
0.0 (0 reviews)
0.0
Hello, I understand the initial scope of this project, although I would like to discuss the job further in order to prepare the final concept. After a complete discussion over a call or in chat, I will prepare the following for you:
- Technical project proposal
- Flow chart for this project
- Execution plan (a step-by-step procedure explaining how and when we will execute each task)
$773 USD in 20 days
0.0 (0 reviews)
0.0
I'm currently working part time, using R on a daily basis. I have practical experience with R programming and also with classification algorithms, text mining, clustering and machine learning. I'm also a student of Economics and Econometrics in Prague.
$1,666 USD in 10 days
0.0 (0 reviews)
0.0
A proposal has not yet been provided
$1,111 USD in 21 days
0.0 (0 reviews)
0.0

About the client

Washington, United States
5.0
6
Payment method verified
Member since Jan 6, 2016

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)