That's Fresh! Newsletter
Read a selection of our past issues.
- Google's answer to ChatGPT. And: Generating synthetic data within relational databases. Let's meet at WAICF! (February 8, 2023)
- Understanding ChatGPT better. And: How to deal with imbalanced data. More about our product. (December 14, 2022)
- A curated list of failed ML projects. And: How to build a data strategy. Clearbox AI and BearingPoint partnership. (November 16, 2022)
- Our open source library is now on GitHub. And: Clearbox AI on Cybernews. (June 22, 2022)
- Discovering Dagster. And: Quantifying privacy risks. Use case: a synthetic data sandbox to freely share data. (June 8, 2022)
- Can interaction data be fully anonymized? And: Synthetic Data for privacy preservation: understanding privacy risks. Discover our Enterprise solution. (April 6, 2022)
- What are GFlow nets? And: Improve models with Synthetic Data. Use case: augment financial time series. (March 16, 2022)
- The European Commission selected us for Women TechEU pilot project! And: What is Synthetic Data. The new Synthetic Data platform. (March 09, 2022)
- The EDPS on Synthetic Data. And: From raw to good quality data. Changelogs: now you can upload unlabeled datasets. (February 23, 2022)
- 2022 Gartner’s Technology Trends. And: How to harness the power of AI in companies. Changelogs: new metrics available for your synthetic dataset. (February 09, 2022)
FROM THE AI WORLD
I recently stumbled upon this nicely curated repository containing a list of machine learning projects that failed miserably. The list spans different model families and domains and links to relevant news sources. You might have heard about these ML incidents individually while reading the news; however, having a repository that collects and updates them is an excellent idea. I was familiar with the most famous ones, such as the Apple Credit Card or Zillow projects; nevertheless, I was happy to learn about lesser-known ones.
ML project failures can arise for different reasons, ranging from poor problem definition to problems with the data used to train the model. The data aspect will become more prevalent as large-scale NLP and computer vision models use massive datasets for training. These datasets are becoming increasingly difficult to clean and curate, which makes it equally challenging to control what models learn. As they say, garbage in, garbage out!
ML projects epic fails
Despite their potential, ML projects may fail. In this GitHub repository curated by Kenneth Leung (kennethleungty on GitHub), you'll find real-world examples from which we can learn a lot.
How to build a data strategy
In one of the latest episodes of "People Also Ask...and we answer!" we interviewed Alberto Danese, Head of Data at Nexi, and discussed data strategies.
HOT FROM THE PRESS
Clearbox AI and BearingPoint
We recently started a partnership with BearingPoint, a leader in technology consulting, on fraud detection to help fintech institutions. (Lang: 🇮🇹)