Machine learning engineer
Ennery, France
Passionate about data science, and with 25 years of experience in finance, IT and management, I have just graduated as a Machine Learning engineer. I now want to put my skills to work on a new challenge in this field.
Projects
Data analysis
Data exploration and cleaning
In this project, I am going to walk you through the end-to-end data analysis process with Python and the OpenFoodFacts Dataset.
The goals are to:
Modelling
Anticipate building consumption needs
Now that most of us accept that global warming is caused by human activities, several major cities around the world are trying to reduce their impact.
This project will focus on the city of Seattle, which aims to be carbon-neutral by 2050. Using the meticulous surveys carried out by city officials in 2016, I will first take a close look at the energy consumption and emissions of non-residential buildings. Then I will test different regression models to predict them. To go further, I will also evaluate the usefulness of the "ENERGY STAR Score" for forecasting emissions.
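Testing a regression model usually means comparing it with a naive baseline on held-out data. The sketch below is only an illustration under stated assumptions: the numbers are made up (they are not the Seattle survey), a single hypothetical feature (floor area) stands in for the real ones, and plain one-feature least squares stands in for whichever models the project actually compares.

```python
from math import sqrt


def fit_simple_linear(xs, ys):
    """Ordinary least squares with one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(y_true, y_pred):
    """Root mean squared error between two equally long sequences."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Made-up data: floor area vs. yearly energy use (arbitrary units).
train_area = [10.0, 20.0, 30.0, 40.0, 50.0]
train_energy = [12.0, 19.0, 33.0, 41.0, 48.0]
test_area = [15.0, 35.0]
test_energy = [16.0, 36.0]

slope, intercept = fit_simple_linear(train_area, train_energy)

# Baseline: always predict the mean of the training targets.
baseline_value = sum(train_energy) / len(train_energy)

model_rmse = rmse(test_energy, [slope * x + intercept for x in test_area])
baseline_rmse = rmse(test_energy, [baseline_value] * len(test_area))

print(f"model RMSE: {model_rmse:.2f}, baseline RMSE: {baseline_rmse:.2f}")
```

A model is only worth keeping if its error on the held-out buildings is clearly lower than the baseline's.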
Clustering
Segment the customers of an e-commerce site
"A good level of customer knowledge allows the company to better know those who contribute to its commercial prosperity, in particular information on their profiles, their needs, their centers of interest and their expectations."
During this project I will use the KMeans clustering technique to provide actionable customer segments to an e-commerce site. Then I will estimate how often the segmentation needs to be refreshed, based on an analysis of the stability of the segments over time.
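To make the clustering step concrete, here is a minimal sketch of the k-means idea (Lloyd's algorithm) in plain Python. It is not the project's code: the customer features (recency and frequency scores) and their values are invented for illustration, and the starting centroids are fixed by hand so the result is reproducible.

```python
def nearest(point, centroids):
    """Index of the centroid closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centroids[k])),
    )


def kmeans(points, centroids, n_iter=10):
    """Plain Lloyd's algorithm starting from the given centroids."""
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        labels = [nearest(point, centroids) for point in points]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for k, old in enumerate(centroids):
            members = [pt for pt, lab in zip(points, labels) if lab == k]
            if members:
                updated.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid in place
        centroids = updated
    # Final labels are computed against the final centroids.
    return [nearest(point, centroids) for point in points], centroids


# Hypothetical customers described by (recency score, frequency score).
customers = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centers = kmeans(customers, centroids=[customers[0], customers[3]])
print(labels)
```

Re-running the same procedure on later time windows and comparing the resulting labels is one simple way to judge how stable the segments are.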
Natural language processing
Automatically categorize questions
Stack Overflow is a website offering questions and answers on a wide range of computer programming topics. A member wishing to ask a question must use the dedicated form, filling in a title, a question and 1 to 5 tags to categorize it. For experienced users this is not a problem, but for new users it would be helpful to suggest some tags related to the question asked.
The proposed solution consists of setting up a tag suggestion tool. It will be based on the title and content of the question. After a first pre-processing, a machine learning model will propose a series of tags depending on the content of the question asked.
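The pre-process-then-predict pipeline described above can be sketched as follows. This is a deliberately simple stand-in, not the model used in the project: it counts which tags co-occur with which words in a handful of invented training questions, then scores tags for a new question. The stop-word list, the helper names and the example questions are all assumptions made for illustration.

```python
import re
from collections import Counter, defaultdict

# Tiny, hand-picked stop-word list (an assumption, not a standard one).
STOP_WORDS = {"a", "an", "the", "how", "to", "in", "do", "i", "my", "is", "with", "of", "on"}


def preprocess(text):
    """Lowercase, split into word-like tokens, drop stop words."""
    tokens = re.findall(r"[a-z0-9+#]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def train(questions):
    """Count, for every token, how often each tag appears alongside it."""
    token_tags = defaultdict(Counter)
    for text, tags in questions:
        for token in set(preprocess(text)):
            token_tags[token].update(tags)
    return token_tags


def suggest_tags(token_tags, text, max_tags=5):
    """Sum the tag counts of the question's tokens and return the top tags."""
    scores = Counter()
    for token in set(preprocess(text)):
        scores.update(token_tags.get(token, {}))
    return [tag for tag, _ in scores.most_common(max_tags)]


# Invented training examples: (question text, tags).
training = [
    ("How to merge two dataframes in pandas", ["python", "pandas"]),
    ("Sort a list of dictionaries in Python", ["python", "sorting"]),
    ("Center a div with CSS flexbox", ["css", "flexbox"]),
    ("Async await in JavaScript fetch", ["javascript", "async-await"]),
]
model = train(training)
suggestions = suggest_tags(model, "Merge pandas dataframes on a column")
print(suggestions)
```

The cap of five suggestions mirrors the 1 to 5 tags allowed by the question form.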
Computer vision
Classify images using deep learning algorithms
Most computer vision algorithms use a convolutional neural network, or CNN. Like basic feedforward neural networks, CNNs learn from inputs, adjusting their parameters to make a prediction. However, what makes CNNs special is their ability to extract features from images.
In this project, which aims to classify dog images according to the dog's breed, I will first implement my own CNN inspired by the famous InceptionV1 model. Next, I'll demonstrate how transfer learning outperforms this baseline using other popular pre-trained models.
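The feature extraction that makes CNNs special comes from sliding small filters over the image. As a minimal sketch, the function below applies one filter to a tiny grayscale image in plain Python. The image and the vertical-edge filter are hand-made for illustration; in a real CNN the filter values are learned during training rather than written by hand.

```python
def conv2d(image, kernel):
    """'Valid' 2D sliding-window product (cross-correlation, as used in CNN layers)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Toy 4x5 image: dark on the left (0), bright on the right (1).
image = [
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
]

# Hand-written filter that responds to dark-to-bright vertical edges.
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = conv2d(image, vertical_edge)
print(feature_map)  # [[0, 3, 3], [0, 3, 3]]
```

The output is zero over the flat dark region and positive wherever the window covers the edge, which is the kind of localized signal later layers build on.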
Time series
Long Short-Term Memory Neural Network for Financial Time Series
The emergence of Machine Learning, and more recently Deep Learning, has brought a new dimension to time series analysis. Deep Learning methods hold much promise for time series forecasting, such as the automatic learning of temporal dependence and the automatic handling of temporal structures (trends and seasonality).
In this project, which implements the attached arXiv paper, I will present the different avenues explored in developing a prototype that uses an LSTM recurrent neural network to predict the evolution of stock prices.
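Before a recurrent network such as an LSTM can be trained, the price series is typically cut into fixed-length input windows, each paired with the value that follows it. The sketch below shows that framing step only; the window length and the prices are made up, and nothing here is taken from the paper the project implements.

```python
def make_windows(series, window):
    """Turn a series into (input window, next value) training pairs."""
    inputs, targets = [], []
    for start in range(len(series) - window):
        inputs.append(series[start:start + window])
        targets.append(series[start + window])
    return inputs, targets


# Invented daily closing prices.
prices = [100, 101, 103, 102, 105, 107]
inputs, targets = make_windows(prices, window=3)

for x, y in zip(inputs, targets):
    print(x, "->", y)
```

Because the pairs are built in chronological order, any train/test split should also be chronological so that the model is never evaluated on data older than what it was trained on.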
Audio classification
BirdCLEF 2022
Birds are all around us, and just by listening to them we can learn a lot about our surroundings. Ecologists use birds to understand food systems and the health of environments: for example, if there are more woodpeckers in a forest, it means that there is a lot of dead wood. Moreover, because birds communicate and mark their territory with songs and calls, it is more effective to identify them based on audio.
This Kaggle competition proposes to use automatic audio classification to identify bird species by sound. More specifically, it involves developing a model capable of processing continuous audio data and then acoustically recognizing species. To do so, using a mel spectrogram transformation, I will first explore transfer learning and computer vision models. Next, I'll implement and test dedicated audio CNNs like VGGish and TRILL.
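A mel spectrogram re-maps the frequency axis of a spectrogram onto the mel scale, which is finer at low frequencies and coarser at high ones. The sketch below shows only that frequency mapping, using one common variant of the mel formula; the frequency range and number of points are arbitrary choices for illustration, and an audio library may use a slightly different variant.

```python
import math


def hz_to_mel(freq_hz):
    """Convert a frequency in Hz to mels (one common variant of the formula)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_spaced_frequencies(f_min, f_max, n_points):
    """Frequencies in Hz that are evenly spaced on the mel scale (endpoints included)."""
    mel_min, mel_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (mel_max - mel_min) / (n_points - 1)
    return [mel_to_hz(mel_min + i * step) for i in range(n_points)]


# Arbitrary example: 5 points between 0 Hz and 8 kHz.
frequencies = mel_spaced_frequencies(0.0, 8000.0, 5)
print([round(f) for f in frequencies])
```

The gaps between successive frequencies grow as the frequency rises, so the resulting image devotes more rows to the lower part of the spectrum, which is then fed to the computer vision models.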