In 2016, I created my first ecommerce store and had my first contact with web analysis. While researching how to bring more users to my website, I learned the ins and outs of Pay-Per-Click (PPC) advertising and became a Google Partner specialist in it. I then founded Bloco-b, which started providing PPC services and web analysis all over Brazil. Since 2019, I have been studying Data Analytics and Data Science and doing projects related to my background. Now I’m seeking to make the web a better place with creative machine learning findings.
Personal Projects
I’m always looking for ways to bring value to online businesses. For that reason, my projects are related to website data, predictions, or insights. The projects are the following:
Organic Traffic and Conversions Predictions w/ FB Prophet & SEO Advice
I was asked for an approach to predict monthly organic traffic and conversions for the following 12 months, accounting for growth over time and seasonality.
I fitted a Facebook Prophet model with monthly data and created interactive graphs with Plotly.
Using Prophet's native functionality for time series cross-validation, I achieved:
Organic Traffic - MSE for 30 days: 0.88 and for 303 days: 1.62
Conversion - MSE for 30 days: 1.10 and for 303 days: 2.04
I also presented some simple advice for the SEO team, using the data provided and basic sorting in Python to focus the content marketing efforts on conversion rates.
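The time series cross-validation mentioned above can be pictured as a rolling-origin evaluation: train on everything up to a cutoff, predict a fixed horizon, measure the MSE, then move the cutoff forward. Prophet provides this natively, so the following is only a library-free sketch of the idea, not the project's code. The names `rolling_origin_mse` and `seasonal_naive`, the stand-in forecaster, and the toy series are all assumptions made for illustration.

```python
from statistics import mean


def rolling_origin_mse(series, initial, horizon, step, forecast):
    """Return one MSE per fold of a rolling-origin evaluation.

    series   : list of numbers ordered in time
    initial  : number of points in the first training window
    horizon  : number of points predicted after each cutoff
    step     : how far the cutoff moves between folds
    forecast : callable(train, horizon) -> list of `horizon` predictions
    """
    fold_errors = []
    cutoff = initial
    while cutoff + horizon <= len(series):
        train = series[:cutoff]
        actual = series[cutoff:cutoff + horizon]
        predicted = forecast(train, horizon)
        fold_errors.append(mean((a - p) ** 2 for a, p in zip(actual, predicted)))
        cutoff += step
    return fold_errors


def seasonal_naive(train, horizon, season=12):
    """Stand-in model: repeat the value observed one season earlier."""
    return [train[len(train) - season + (i % season)] for i in range(horizon)]


# Toy monthly series (invented): linear growth plus a bump in the last two months of each year.
monthly = [100 + 2 * t + 10 * ((t % 12) in (10, 11)) for t in range(48)]

errors = rolling_origin_mse(monthly, initial=24, horizon=12, step=6, forecast=seasonal_naive)
print(errors)
```

Because the stand-in forecaster ignores the trend, every fold shows the same error here. A model that captures growth as well as seasonality would be expected to score lower on the same folds.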
I tested four different algorithms to label search queries based on the user’s online behavior, to help online businesses reduce costs in their campaigns.
The four clustering algorithms were: K-Means, Spectral Clustering, Agglomerative Clustering, and Gaussian Mixture.
I selected the number of clusters for each clustering algorithm using the Silhouette Score.
All algorithms showed promising results, but the most important part was finding the most meaningful clusters. I chose the Gaussian Mixture for the reasons detailed in section 9 - The Best Choice.
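To make the Silhouette Score step more concrete, here is a minimal sketch of how the score compares two candidate numbers of clusters. It is not the project's code: in practice the labels would come from a clustering algorithm, and scikit-learn offers a ready-made `silhouette_score` in `sklearn.metrics`. The pure-Python function, the toy points, and the hand-written labelings below are assumptions for illustration.

```python
from math import dist
from statistics import mean


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (closer to 1 is better)."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(dist(point, other) for other in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster
        b = min(
            mean(dist(point, other) for other in members)
            for other_label, members in clusters.items()
            if other_label != label
        )
        scores.append(0.0 if max(a, b) == 0 else (b - a) / max(a, b))
    return mean(scores)


# Toy data (invented): two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]

# Hypothetical labelings for k=2 and k=3, standing in for a clustering algorithm's output.
candidates = {
    2: [0, 0, 0, 1, 1, 1],
    3: [0, 0, 2, 1, 1, 1],
}

best_k = max(candidates, key=lambda k: silhouette_score(points, candidates[k]))
print(best_k)
```

The number of clusters with the highest score is kept. As noted above, a high score does not guarantee that the clusters are meaningful for the business, so the final choice still needs a look at what each cluster contains.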
Sentiment Analysis + Deep Learning - Amazon Shopping App Reviews
I scraped over 3,000 reviews from the Google Play Store and the App Store using APIs.
I built a pipeline that uses text preprocessing to bring the reviews into a form that is predictable and analyzable for the neural network.
I also built a sentiment analysis model using the Keras modules Model, Dense, LSTM, and Embedding.
In the end, I built a function that applies the text preprocessing pipeline, classifies a comment with the model, and tells us if the comment is Positive, Negative, or Neutral.
On the test data set, my model reached: Loss 0.60 - Accuracy 0.74 - Precision 0.75 - Recall 0.73 - F1-Score 0.74.
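The final function described above, which preprocesses a comment, runs the model, and returns a label, might look roughly like the sketch below. It is not the project's code. The cleaning rules, the tiny vocabulary, the class order in `LABELS`, and `toy_model` (a hard-coded stand-in for the trained Keras network) are all assumptions for illustration.

```python
import re

LABELS = ("Negative", "Neutral", "Positive")  # assumed class order


def preprocess(text, vocabulary, max_len=10):
    """Lowercase, drop non-letters, map words to ids, and pad at the end."""
    tokens = re.sub(r"[^a-z\s]", " ", text.lower()).split()
    ids = [vocabulary.get(token, 1) for token in tokens]  # 1 = unknown word
    ids = ids[:max_len]
    return ids + [0] * (max_len - len(ids))  # 0 = padding


def classify(text, vocabulary, predict_proba):
    """Apply the pipeline and return the label with the highest probability."""
    probabilities = predict_proba(preprocess(text, vocabulary))
    best = max(range(len(LABELS)), key=probabilities.__getitem__)
    return LABELS[best]


# Hypothetical vocabulary; ids 0 and 1 are reserved for padding and unknown words.
vocabulary = {"great": 2, "terrible": 3, "app": 4}


def toy_model(ids):
    """Stand-in for the trained network: returns one probability per class."""
    if 2 in ids:
        return [0.1, 0.2, 0.7]
    if 3 in ids:
        return [0.8, 0.1, 0.1]
    return [0.2, 0.6, 0.2]


print(classify("Great app!!!", vocabulary, toy_model))
print(classify("Terrible, it crashes", vocabulary, toy_model))
```

Keeping the preprocessing in one function means a new comment goes through exactly the same steps as the training reviews before it reaches the model.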
Google Analytics API - Ecommerce - Exploratory Data Analysis
I built an exploratory data analysis to get different insights about each traffic medium of an ecommerce site.
The data was extracted using the Google Analytics API and split based on the traffic medium of the website.
With the analysis, we can recognize patterns and differences between the traffic mediums.
I also found a hypothesis to be tested: Android users who are New Visitors may be having problems with the page loading speed and mobile usability of the site, which raises the website’s Bounce Rate.
A Kaggle user recently commented on the project, and the code needs adjustments.
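A first check of the hypothesis above could be a comparison of bounce rates per segment. The sketch below is not the project's code: the field names (`os`, `user_type`, `sessions`, `bounces`) and every number in the sample rows are invented for illustration and do not come from the analysis.

```python
from collections import defaultdict


def bounce_rate_by_segment(rows, keys):
    """Return bounces / sessions for each combination of the given keys."""
    totals = defaultdict(lambda: [0, 0])  # segment -> [bounces, sessions]
    for row in rows:
        segment = tuple(row[key] for key in keys)
        totals[segment][0] += row["bounces"]
        totals[segment][1] += row["sessions"]
    return {segment: b / s for segment, (b, s) in totals.items() if s}


# Invented sample rows, shaped like an aggregated analytics export.
rows = [
    {"os": "Android", "user_type": "New Visitor", "sessions": 200, "bounces": 150},
    {"os": "Android", "user_type": "Returning Visitor", "sessions": 100, "bounces": 40},
    {"os": "iOS", "user_type": "New Visitor", "sessions": 150, "bounces": 60},
    {"os": "Android", "user_type": "New Visitor", "sessions": 100, "bounces": 75},
]

rates = bounce_rate_by_segment(rows, keys=("os", "user_type"))
for segment, rate in sorted(rates.items()):
    print(segment, round(rate, 2))
```

A gap between segments would only be a starting point. Page speed and mobile usability data for the same segments would be needed before drawing conclusions.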
Google Analytics-API - Ecommerce - Binary Classification - Transactions Predictor
I tested 42 different methods to predict the transaction, tuned the best method, and plotted the decision tree.
I built four functions to split the DataFrames into train and test sets: a raw split, without zeros (without bounce rate), without outliers, and without zeros and outliers.
I also built three functions for the resampling strategies: Random Under Sample, Random Over Sample, and Tomek Links.
Then I built the primary function, which tested all those variations with three different classification algorithms: DecisionTreeClassifier, RandomForestClassifier, and Extreme Gradient Boosting.
I used Hyperopt to tune the best model for each traffic medium and plotted the feature importance and the best decision tree.
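Of the resampling strategies listed above, random under-sampling is the simplest to illustrate: majority-class rows are dropped at random until the classes are balanced. The sketch below is a plain-Python illustration rather than the project's code. The function name `random_under_sample`, the fixed seed, and the toy data are assumptions.

```python
import random
from collections import Counter


def random_under_sample(rows, labels, seed=0):
    """Drop random rows from larger classes until all classes have equal size."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = min(len(members) for members in by_class.values())

    sampled_rows, sampled_labels = [], []
    for label, members in by_class.items():
        for row in rng.sample(members, target):
            sampled_rows.append(row)
            sampled_labels.append(label)
    return sampled_rows, sampled_labels


# Toy data (invented): 20 sessions, only 3 of which ended in a transaction.
rows = list(range(20))
labels = [1 if i < 3 else 0 for i in rows]

x_res, y_res = random_under_sample(rows, labels)
print(Counter(labels), "->", Counter(y_res))
```

Resampling is normally applied to the training split only, so that the test split keeps the real class balance and the reported metrics stay realistic.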
That was my first project. I have learned a lot since then, and now I regret many things, haha.