That's Fresh! Newsletter
Read a selection of our past issues.
- Google's answer to ChatGPT. And: Generating synthetic data within relational databases. Let's meet at WAICF! (February 8, 2023)
- Understanding ChatGPT better. And: How to deal with imbalanced data. More about our product. (December 14, 2022)
- A curated list of failed ML projects. And: How to build a data strategy. Clearbox AI and Bearing Point partnership. (November 16, 2022)
- Our open source library is now on GitHub. And: Clearbox AI on Cybernews. (June 22, 2022)
- Discovering Dagster. And: Quantifying privacy risks. Use case: a synthetic data sandbox to freely share data. (June 8, 2022)
- Can interaction data be fully anonymized? And: Synthetic Data for privacy preservation: understanding privacy risks. Discover our Enterprise solution. (April 6, 2022)
- What are GFlowNets? And: Improve models with Synthetic Data. Use case: augment financial time series. (March 16, 2022)
- The European Commission selected us for the Women TechEU pilot project! And: What is Synthetic Data? The new Synthetic Data platform. (March 9, 2022)
- The EDPS on Synthetic Data. And: From raw to good-quality data. Changelogs: now you can upload unlabeled datasets. (February 23, 2022)
- Gartner's 2022 Technology Trends. And: How to harness the power of AI in companies. Changelogs: new metrics available for your synthetic dataset. (February 9, 2022)
It’s always exciting when seasoned researchers, usually unassuming about their results, get enthusiastic about the potential of their work. The researcher in question is none other than the brilliant Prof. Yoshua Bengio. His recent work on GFlowNets, a new type of generative architecture, caught my attention for exactly that reason.
According to Prof. Bengio, GFlowNets have great potential to make AI models better at learning causal effects and at out-of-distribution generalisation. But how do GFlowNets work? I have to admit I am still exploring the principles behind the idea and its implementation, but I found this tutorial to be a great way to dig into the topic. As the name suggests, the central concept of GFlowNets is analogous to fluid flow. In this case, the flow runs over the set of possible discrete trajectories a model can follow to construct a realistic-looking data point. GFlowNets use a reinforcement-learning approach to learn the policy that defines these trajectories.
In the academic paper introducing the architecture, the authors present a practical implementation to synthesise realistic molecules. I am sure many more examples of real-life applications of GFlowNets will be showcased in the next few months. Meanwhile, I will make sure to understand the theory behind it fully!
Generative Flow Networks live at the intersection of reinforcement learning, deep generative models and energy-based probabilistic modelling. Check them out!
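To make the flow analogy above concrete, here is a toy sketch of my own (an illustrative example, not code from the tutorial or the paper). States are binary prefixes, and appending one bit at a time builds a length-3 string, so the trajectories form a small tree. If each terminal state's flow equals its reward and each internal state's flow is the sum of its children's flows, then the forward policy P(child | state) = F(child) / F(state) samples terminal states with probability proportional to their reward, which is the defining property a trained GFlowNet approximates:

```python
import random

N = 3  # length of the binary strings we construct bit by bit

def reward(x: str) -> float:
    # Arbitrary positive reward for this toy example: favour strings with more 1s.
    return 1.0 + 2.0 * x.count("1")

def flow(prefix: str) -> float:
    # Flow conservation: F(terminal) = R(x); F(internal) = sum of child flows.
    if len(prefix) == N:
        return reward(prefix)
    return flow(prefix + "0") + flow(prefix + "1")

def sample(rng: random.Random) -> str:
    # Follow the forward policy P(child | s) = F(child) / F(s) from the root.
    s = ""
    while len(s) < N:
        f0, f1 = flow(s + "0"), flow(s + "1")
        s += "0" if rng.random() < f0 / (f0 + f1) else "1"
    return s

Z = flow("")  # total flow at the root = partition function = sum of all rewards
print(f"Z = {Z}")                              # Z = 32.0
print(f"P('111') = {flow('111') / Z:.3f}")     # = R('111') / Z = 7/32 = 0.219
```

In a real GFlowNet the state space is far too large to enumerate flows exactly, so a neural network is trained (with objectives such as flow matching or trajectory balance) to approximate this policy; the toy version only shows why flow conservation makes sampling proportional to reward.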
Discover how we used the Clearbox Synthetic Data Engine to generate high-quality synthetic time series for model training and strategy backtesting.
ML models learn from data. As the amount of good quality data increases, so does the quality of the models. What's the role of Synthetic Data in this process?