As a data scientist, you can benefit from data generation: it lets you experiment with different ways of exploring datasets, algorithms, and data visualization techniques, and validate assumptions about a method's behavior against many different datasets of your choosing.
When you have to test a proof of concept, a tempting option is simply to use real data. One small problem, though: production data is typically hard to obtain, even partially, and new European privacy and security laws are not making it any easier.
Sales cubes are used to report on sales transactions, specifically concerning posting sales order invoices and sales order packing slips. Sales cube datasets are self-contained and do not require users to create table profiles.
Various units of measure can be incorporated into a sales cube report to ensure that the quantity is correct, such as a SUM, MEAN, or COUNT aggregation.
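To make the aggregation choices concrete, here is a minimal sketch in plain Python over hypothetical sales rows (the product names and quantities are illustrative, not from the article):

```python
# Hypothetical sales rows: (product, quantity) pairs — illustrative only.
sales = [("salad", 3), ("salad", 5), ("soup", 2)]

quantities = [qty for _, qty in sales]
total = sum(quantities)      # SUM: total units sold
count = len(quantities)      # COUNT: number of transactions
mean = total / count         # MEAN: average units per transaction

print(total, count, round(mean, 2))
```

The right aggregation depends on the measure's semantics: quantities and revenues are additive (SUM), while ratios such as unit price usually need a MEAN or a derived measure instead.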
Would you like to create a business intelligence web application with atoti?
Business intelligence tools are helpful to stay competitive. Organizations of every size and stage use BI tools to analyze, manage and visualize business data. It is extremely easy to create a business intelligence web application with atoti.
First, let us create a session in atoti. This session will give us a web application without any cube or any data loaded into it.
After running the above, you will get a link to the web application.
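A minimal sketch of the step described above, assuming atoti is installed; the exact API has changed across atoti versions, so check the documentation for the release you are using:

```python
import atoti as tt

# Start an empty session: this spins up the web application
# with no cube and no data loaded into it yet.
session = tt.Session()

# The session exposes the address of the web application.
print(session.url)
```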
In this article, we will talk about one of the hot topics in Machine Learning Ethics — how to reduce machine learning bias. We shall also discuss the tools and techniques available for doing so.
Machine learning bias, also sometimes known as bias in artificial intelligence, is a phenomenon that occurs when an algorithm produces results that are systemically prejudiced due to erroneous assumptions in the machine learning process.
Bias could be prejudice in favor of or against a person, group, or thing, in a way that is considered to be unfair.
It happens quite often that the Data Science team has spent weeks, even months, understanding the data, performing state-of-the-art feature engineering, trying various machine learning and deep learning modeling techniques, tuning the hyperparameters, and finally building the ULTIMATE model to make predictions.
And then, when this machine learning model is put into production, the business results fall short of those observed in development conditions.
We shall be using the subpopulation analysis technique on the model predictions to understand why it is important to see through these subpopulations and how to do such an…
In this article, we shall discuss one of the ubiquitous steps in the machine learning pipeline — feature scaling. This article's origin lies in one of the coffee discussions in my office: which models are actually affected by feature scaling, and what is the best way to do it — to normalize, to standardize, or something else?
In this article, in addition to the above, we will also cover a gentle introduction to feature scaling, the various feature scaling techniques, how it might lead to data leakage, when to perform feature scaling, and when NOT to…
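The normalize-versus-standardize distinction discussed above can be sketched in plain Python (the data values are illustrative):

```python
data = [2.0, 4.0, 6.0, 8.0]

# Min-max normalization: rescale values into the [0, 1] range.
low, high = min(data), max(data)
normalized = [(x - low) / (high - low) for x in data]

# Standardization: shift to zero mean, scale to unit variance
# (population standard deviation used here).
mean = sum(data) / len(data)
variance = sum((x - mean) ** 2 for x in data) / len(data)
std = variance ** 0.5
standardized = [(x - mean) / std for x in data]

print(normalized)
print(standardized)
```

Normalization bounds the feature but is sensitive to outliers (they define the min and max), whereas standardization keeps the values unbounded but centers and rescales the distribution.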
We shall be doing a probabilistic time-series forecast of the sales for a salad manufacturer and then use atoti for inventory management and find the optimum number of refrigerators to store the salad products.
In this article, we will explore a method to tackle the issue of optimizing inventory management thanks to quantile-based time-series predictions, leveraging atoti — a data visualization platform with an aggregation engine and native multidimensional and what-if analysis support.
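The idea behind a quantile-based forecast can be sketched in plain Python (the demand samples and the `quantile` helper below are hypothetical, for illustration only):

```python
# Hypothetical demand samples for one day, e.g. drawn from a
# probabilistic forecast model (values are illustrative).
samples = sorted([80, 95, 100, 105, 110, 120, 130, 140, 150, 170])

def quantile(sorted_values, q):
    # Nearest-rank style quantile: the sample value covering
    # fraction q of the sorted observations.
    idx = min(int(q * len(sorted_values)), len(sorted_values) - 1)
    return sorted_values[idx]

# Planning capacity on the median risks a stockout roughly half
# the time; a higher quantile builds in a safety margin.
median_demand = quantile(samples, 0.5)
safe_demand = quantile(samples, 0.9)
print(median_demand, safe_demand)
```

Choosing which quantile to plan against is exactly the trade-off the article explores: higher quantiles mean fewer stockouts but more storage (here, refrigerator) capacity.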
Inventory management is one of the most critical components of the manufacturing and supply chain processes. It is also one of the major issues faced by the industries.
When it comes to sales for the manufacturing industries, both extreme scenarios are unfavorable: running out of stock means lost sales, while overstocking perishable goods means waste.
‘Data leakage’ is a ubiquitous term associated with predictive modeling and a familiar entry in most Kagglers’ dictionaries.
If your model is performing too well, reflect on your methods before popping open the champagne.
Predictive modeling focuses on making predictions on novel data using a model that learns the pattern from the training data.
This is a challenging problem. It’s hard because the model cannot be evaluated on data that is not yet available.
Hence, the existing training data is leveraged both for learning the patterns and for testing the model’s ability to accurately predict an…
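A classic form of the leakage discussed above is computing preprocessing statistics on the full dataset before splitting. A minimal sketch with hypothetical numbers:

```python
# Toy 1-D dataset: first 4 points for training, last 2 held out.
data = [1.0, 2.0, 3.0, 4.0, 100.0, 200.0]
train, test = data[:4], data[4:]

# Leaky: the centering statistic is computed on *all* data,
# so information about the test set bleeds into preprocessing.
leaky_mean = sum(data) / len(data)

# Correct: the statistic is computed on the training set only,
# then reused unchanged to transform the test set.
train_mean = sum(train) / len(train)

print(leaky_mean, train_mean)
```

The two means differ dramatically because the held-out outliers pull the leaky statistic upward — an effect that silently inflates evaluation scores in real pipelines.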
I’m trying to implement a Naive Bayes model following Bayes’ theorem. The problem I face is that some class labels are missing when applying the theorem, which drives the overall probability estimate to zero. How can I handle such missing classes when using the Naive Bayes model?
Background: Let us first recall what the Naive Bayes algorithm is. As the name suggests, it is based on Bayes’ theorem from probability and statistics, with the naive assumption that the features are independent of each other.
Bayes’ theorem describes the probability of an event based on prior knowledge of conditions that might…
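One common remedy for the zero-probability problem (not necessarily the only one the article considers) is Laplace, or additive, smoothing: every count is padded with a small constant so unseen classes or features keep a non-zero likelihood. A minimal sketch with hypothetical vocabulary and data:

```python
from collections import Counter

def smoothed_likelihoods(observations, vocabulary, alpha=1.0):
    """P(word | class) with Laplace (additive) smoothing, so words
    never observed in a class get a small non-zero probability
    instead of zeroing out the whole product in Bayes' theorem."""
    counts = Counter(observations)
    total = len(observations) + alpha * len(vocabulary)
    return {w: (counts[w] + alpha) / total for w in vocabulary}

vocab = ["cheap", "offer", "meeting"]
spam_words = ["cheap", "cheap", "offer"]  # "meeting" never seen in spam
probs = smoothed_likelihoods(spam_words, vocab)
print(probs)
```

With `alpha = 1`, “meeting” receives probability 1/6 rather than 0, and the smoothed probabilities still sum to one over the vocabulary.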
Master in Data Science and Business Analytics, ESSEC Business School- CentraleSupélec