Intro

Hello, my name is Amir, and this is my story. I'm a data scientist interested in RecSys, geoscience, computer vision, and NLP. Here is a collection of my favorite projects. I hope you will like some of them.

At different times, I have been a mathematician, physicist, software developer, geologist, data scientist, meteorologist, chess player, and board game community organizer. Even the greatest road starts with import numpy as np.
Let's start.

In 2014, I started working in science. My first projects were devoted to applied mathematics in geomechanics, at the research institute of an oil company. There I learned classical mathematical modeling and got my first programming experience in Python. The problem concerned the mechanical stability of anisotropic layered porous media. Based on an empirical approximation, I developed an analytical model that calculates the stress distribution in boreholes of different trajectories and predicts the available pressure window for drilling mud. The safe window is the pressure range that stays short of wellbore failure, with some safety margin. Here is the GitHub project.

At that time, I wanted to improve my coding skills, so I joined a large IT company as a Java backend developer (Spring, Hibernate, PostgreSQL). There I gained production programming skills and experience with teamwork, agile, and sprint iterations, and I improved my git culture. I also tried mobile development for Android. Here is my GitHub project from that time, a mobile app with singers' bios. After coding a CRM platform, I decided to return to research.

The next chapter of my research was meteorology. I was involved in weather research at ITMO University, a research university in St. Petersburg, where I worked as a data science researcher. There were many interesting projects: an atmospheric modeling platform, a transactional credit scoring model, prediction of clients' interests from social media posts, urban weather forecasting from citizen weather stations, and reanalysis. My meteo projects: intro, data assimilation, genetic feature selection, geospatial interpolation, anomaly detection with Timeseries VAE. While working on these projects, I got involved in NLP and FinTech. There was probabilistic topic modeling for clustering communities, users' posts, and even transactions. Social media analysis projects: geo posts crawling, site parsing tutorial, text classification, arXiv papers analysis.

There is the Open Data Science community, ods.ai. I have been a member since 2016. ODS includes events, conferences, meetups, pet projects, and competitions. Together with enthusiastic people, we worked on the following open source projects: urban data analysis, ODS help bot, ODS video converter. Competition projects: 2nd place at the ODS pet project hackathon 2020 (team "Unnamed: 0", ODS help bot @ods_help_bot on Telegram); 5th place at SIBUR Challenge 2019 (team "D$Bears", prediction web app for a chemical process, repo); 30th place at VK Cup 2020 (P1 text classification model); top 100 at AutoML Sberbank 2016 (repo).

Next, I was a data scientist at the advertising company Segmento. Advertising is a real Big Data domain with many machine learning tasks. On the one hand, your models have to be more accurate than your competitors'; on the other, they have to be high-performing. Therefore, the technology stack consisted of Hadoop, Scala, Java, Spark, PySpark, Spark ML/SynapseML, Airflow, and Kafka. I developed models for real-time bidding (RTB), first-price prediction, users' socio-demographic parameters, and lookalike. I worked there as a team lead data scientist, and my responsibilities included interviewing and hiring ML engineers.

This period included a lot of additional events. Hackathon hacksai: 1st place for a solution for outlier detection in 1 TB of transactional data. The Profit team was founded during this hackathon. I was also on the other side of hackathons: I tried myself as an organizer and expert at a biomedical hackathon with more than 300 participants. The next hackathon brought 3rd place in geospatial retail goods prediction (repo). Another cool hackathon was on computer vision: object detection and tracking of tigers and leopards in real images from forest camera traps (repo).

In 2023, I worked as a data scientist at VK, where I developed RecSys models for the VK Clips service, vk.com/clips. It's the largest Russian short-video service. There were a lot of challenging tasks, such as a cold-start model, a ranking model for videos, and a churn prediction model. VK had a great culture of internal hackathons. One hackathon project was devoted to real-time video generation from a text query (repo). Together with colleagues, we took 1st place in the internal hackathon Spring Code 2023 with a project on neural network semantic search in videos (CLIP, FAISS). To be continued...

Links