Data Science—the Future Is Now!
We still hear a lot of buzzwords in the food sector: blockchain, machine learning, self-driving vehicles in farming, predictive analytics and many more.
What Brought Us Here?
These buzzwords grew out of a discipline that came to life when statisticians, computer scientists and business experts began to question the future of their guild. In 1962 the statistician John W. Tukey wrote: “For a long time I thought I was a statistician, interested in inferences from the particular to the general. But as I have watched mathematical statistics evolve, I have had cause to wonder and doubt … I have come to feel that my central interest is in data analysis, which I take to include, among other things: procedures for analyzing data, techniques for interpreting the results of such procedures, ways of planning the gathering of data to make its analysis easier, more precise or more accurate, and all the machinery and results of (mathematical) statistics which apply to analyzing data. … How vital and how important… is the rise of the stored-program electronic computer?” (Tukey, John W. The Future of Data Analysis. Ann. Math. Statist. 33 (1962), no. 1, 1-67.)
Tukey addressed a challenge in both science and practice that we can still observe today: combining domain knowledge, computer science and statistics to solve problems holistically and with scientific rigor.
Status Quo and Beyond
Today, classical Business Intelligence (BI) is standard in many companies. BI is grounded in the development of specific software and the storage of mainly quantitative data. Data is stored and retrieved in a way that helps businesses automate processes and make low-level decisions. This is often combined with basic descriptive statistics and mathematics such as averages, sums and variances, and with plots such as histograms, dot plots and sometimes even box plots.
The whole is greater than the sum of its parts, and this is especially true in Data Science. New technology could only be developed at the interface of computer science and mathematics/statistics. Algorithms that learn from data and improve over time were just a bold vision back in the days of Tukey's original article. In addition, new software has been developed for advanced data analysis and machine learning, which enables people who are less math- and computer-savvy to use cutting-edge technology in their own data analysis.
Challenges in the Food (Audit) World and Possible Solutions
Food Fraud Detection
Food fraud had its inglorious breakthrough in public awareness with the 2013 horse meat scandal, but its most tragic episode came earlier, in 2008, with the horrific Chinese milk powder melamine incidents. Here, machine learning models can help to monitor data streams and detect anomalies. The inputs may include mass balance data, product prices, corruption indices and other related data. Such models are not available off the shelf; they have to be tailored to your needs.
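To make the idea of anomaly detection on mass balance data a bit more tangible, here is a minimal sketch in Python. It is not a machine learning model, just a simple robust statistical baseline (median and median absolute deviation), and all numbers, names and the threshold of 3.5 are assumptions made up for illustration.

```python
import statistics


def robust_z_scores(values):
    """Score each value by its distance from the median, scaled by the MAD."""
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        # At least half of the values equal the median; nothing to scale by.
        return [0.0 for _ in values]
    # 0.6745 makes the score comparable to a standard z-score for normal data.
    return [0.6745 * (v - median) / mad for v in values]


def flag_anomalies(values, threshold=3.5):
    """Return the positions whose robust z-score exceeds the threshold."""
    scores = robust_z_scores(values)
    return [i for i, s in enumerate(scores) if abs(s) > threshold]


# Hypothetical weekly records: (kg of raw material bought, kg of product sold).
records = [
    (1000, 940), (1200, 1130), (900, 850),
    (1100, 1040), (1000, 1400), (950, 890),
]
yield_ratios = [sold / bought for bought, sold in records]
suspicious_weeks = flag_anomalies(yield_ratios)
print(suspicious_weeks)
```

In this made-up data set, the only week that stands out is the one where more product was sold than raw material was bought. A real model would have to be tailored to the product, the process losses and the other data sources you have available.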
Transparency and Security
Bitcoin! Yes, I just added another buzzword to this blog entry… These days, (unfortunately) no article is complete without blockchain. Blockchain is a technology that makes transaction data more or less immune to manipulation after the fact. Imagine a supply chain where information about producer and buyer, time stamps and mass data is stored in a decentralized way and is available to everyone in the chain at any time. This would take much of the opportunity for fraud out of the chain, at least until the first commercial quantum computers exist.
What will happen when and where? What sounds a bit like fortune telling is an actual science called predictive analytics. Based on the data you have gathered so far, predictive models can be developed to help you make better decisions. Risk-based auditing is then no longer a myth but something that can be handled based on hard facts.
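Why is such a chain hard to manipulate? The core trick can be sketched in a few lines of Python. Note that this is only a simplified hash chain on a single machine: it leaves out everything that makes a real blockchain decentralized (network, consensus, signatures), and the record fields and company names are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(record, prev_hash):
    """Hash a record together with the hash of the block before it."""
    payload = json.dumps({"record": record, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(records):
    """Link records so that each block depends on all blocks before it."""
    chain = []
    prev = GENESIS_HASH
    for record in records:
        current = block_hash(record, prev)
        chain.append({"record": record, "prev": prev, "hash": current})
        prev = current
    return chain


def verify_chain(chain):
    """Recompute every hash; any edited record breaks the chain."""
    prev = GENESIS_HASH
    for block in chain:
        if block["prev"] != prev:
            return False
        if block["hash"] != block_hash(block["record"], prev):
            return False
        prev = block["hash"]
    return True


# Hypothetical supply chain transactions.
shipments = [
    {"producer": "Farm A", "buyer": "Mill B", "time": "2024-01-05T08:00", "kg": 1000},
    {"producer": "Mill B", "buyer": "Bakery C", "time": "2024-01-07T10:30", "kg": 950},
    {"producer": "Bakery C", "buyer": "Retailer D", "time": "2024-01-08T06:15", "kg": 900},
]

chain = build_chain(shipments)
intact = verify_chain(chain)          # chain as originally recorded

chain[0]["record"]["kg"] = 1500       # someone quietly edits an old record
still_valid = verify_chain(chain)     # the stored hash no longer matches
print(intact, still_valid)
```

Because every block carries the hash of its predecessor, changing one old entry would require recomputing every later block, and in a decentralized setting doing so on the copies held by the other participants as well.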
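As a rough illustration of what a predictive model for risk-based auditing could look like, here is a small logistic regression written in plain Python. The features (past nonconformities, years since the last audit), the training data, the supplier names and the learning settings are all invented for this sketch; a real model would need your own data and proper validation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, learning_rate=0.1, epochs=5000):
    """Fit logistic regression weights with batch gradient descent."""
    n_samples, n_features = len(X), len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(bias + sum(w * x for w, x in zip(weights, xi)))
            error = p - yi
            for j, x in enumerate(xi):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - learning_rate * g / n_samples
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_samples
    return weights, bias


def predict_risk(weights, bias, features):
    """Estimated probability of a major finding at the next audit."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


# Hypothetical history: [past nonconformities, years since last audit]
X_train = [[0, 1], [1, 1], [0, 2], [4, 3], [5, 2], [3, 3], [1, 2], [6, 1]]
# 1 = major finding at the following audit, 0 = no major finding
y_train = [0, 0, 0, 1, 1, 1, 0, 1]

weights, bias = fit_logistic(X_train, y_train)

# Hypothetical suppliers waiting to be scheduled for an audit.
candidates = {"Supplier X": [5, 3], "Supplier Y": [0, 1], "Supplier Z": [2, 2]}
risks = {name: predict_risk(weights, bias, f) for name, f in candidates.items()}
audit_order = sorted(risks, key=risks.get, reverse=True)
print(audit_order)
```

The output is an audit order sorted by estimated risk, so the suppliers most likely to have a major finding are visited first. That is the whole point of risk-based auditing: the schedule follows the data instead of the calendar.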
My next blog entry will be titled "5 steps to (audit) heaven! Your roadmap".