- Posted by Daitan CTO
- On November 5, 2019
- Data, Data Centers, Data Science, Machine Learning
Since I joined Daitan last year, I have had the opportunity to learn about Machine Learning (ML), and this has been an exciting time in my career. For years I have been hearing and reading about this topic, and I now find myself working in an environment where companies (Daitan clients) are considering the strategic value of building AI into their products.
In this post, I will share a very basic view of the technologies that are used to build ML solutions with the goal of helping those who have not had exposure to this domain understand where all the hype comes from.
For my engineering colleagues: keep in mind that this article is meant to present a technically complex field very simply. It does not tackle detailed technical challenges, the compelling issue of bias, or social concerns about ML/AI replacing jobs.
Programming With Data
My favorite way of thinking about ML technologies is by understanding the concept of “Programming with Data.”
For example, suppose you want to create an application that alerts you when a person appears on the video feed of your home surveillance camera. Traditional developers would create a software application that analyzes all the pixels on the video feed looking for a person, which, as you can imagine, would be an extremely complex software application.
ML developers, on the other hand, would create a software application that is able to “learn” how to recognize whether a person is present on the video feed. This software is expressed in the form of algorithms, called a model by ML practitioners, that can be “programmed” with thousands of image data examples. In keeping with our basic objective, some examples would contain a person, and some wouldn’t. With these examples, the model would “learn” how to recognize a person on the video feed, hence the “learning” in Machine Learning.
In short: traditional developers write software programs to accomplish very specific tasks; ML developers write software programs that can learn how to accomplish very specific tasks.
The surveillance example described above has the three key components of an ML solution: the data, the model, and training (learning algorithms). Each plays an important role in any ML project.
First, the data are the image examples used to train a model. A data scientist/engineer would define what type of image data is needed for the project, collect it (possibly from an internal source or a published library), and normalize it so that it is prepared and organized for easier consumption during training. This is often referred to as data wrangling. Defining the “right training data” for a project, and the volume of training data needed to produce reliable results, are critical and challenging steps in a successful ML project.
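As a rough illustration of the normalize-and-organize step, here is a minimal Python sketch. It assumes each image is a flat list of 8-bit pixel values paired with a label (1 for "person present"); the toy data and the function names `normalize_image` and `split_dataset` are hypothetical, and real projects typically do far more wrangling than this.

```python
import random

def normalize_image(pixels):
    """Scale 8-bit pixel intensities (0-255) to floats in [0.0, 1.0]."""
    return [p / 255.0 for p in pixels]

def split_dataset(examples, test_fraction=0.2, seed=0):
    """Shuffle labeled examples and split them into training and test sets."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical toy data: (pixels, label), where label 1 means "person present".
raw_examples = [
    ([0, 128, 255, 64], 1),
    ([10, 20, 30, 40], 0),
    ([255, 255, 0, 0], 1),
    ([5, 5, 5, 5], 0),
    ([200, 100, 50, 25], 1),
]

prepared = [(normalize_image(pixels), label) for pixels, label in raw_examples]
train_set, test_set = split_dataset(prepared)
```

Holding back a test set is one common way to check later whether the trained model gives reliable results on examples it has never seen.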
The second key component, the model, is the actual “machine” that will be trained with the image data. Frequently a model is composed of multiple algorithms that transform data into a prediction: is somebody at your door or not?
The third component is training. Without getting technical, these learning algorithms include a measure of how well the model is making its predictions (technically called a loss function) and an optimization algorithm that searches for the best possible configuration of the model to maximize the accuracy of the predictions (or minimize the loss). The objective is to demonstrate consistent, reproducible results in detecting whether or not a person is standing at your door.
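To make the three components concrete, here is a small Python sketch, not a production approach. It assumes a very simple model (logistic regression), cross-entropy as the loss function, and plain gradient descent as the optimization algorithm. The two features per video frame and all of the numbers are made up for illustration; a real person detector would work on images with a much larger model.

```python
import math

def predict(weights, bias, features):
    """Model: weighted sum of the features squashed to a probability in (0, 1)."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def loss(weights, bias, data):
    """Loss function: average cross-entropy; lower means better predictions."""
    eps = 1e-12  # avoids log(0)
    total = 0.0
    for features, label in data:
        p = predict(weights, bias, features)
        total -= label * math.log(p + eps) + (1 - label) * math.log(1 - p + eps)
    return total / len(data)

def train(data, steps=500, learning_rate=0.5):
    """Optimizer: gradient descent, nudging the parameters to reduce the loss."""
    weights = [0.0] * len(data[0][0])
    bias = 0.0
    for _ in range(steps):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for features, label in data:
            error = predict(weights, bias, features) - label
            for i, x in enumerate(features):
                grad_w[i] += error * x
            grad_b += error
        weights = [w - learning_rate * g / len(data) for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(data)
    return weights, bias

# Hypothetical data: (two made-up frame features, label), label 1 = person present.
data = [
    ([0.9, 0.8], 1), ([0.8, 0.9], 1), ([0.7, 0.6], 1),
    ([0.1, 0.2], 0), ([0.2, 0.1], 0), ([0.3, 0.2], 0),
]

initial_loss = loss([0.0, 0.0], 0.0, data)
weights, bias = train(data)
final_loss = loss(weights, bias, data)
```

Comparing `final_loss` with `initial_loss` shows the "learning": the program itself never changes, only the weights that the optimizer adjusts from the examples.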
Applications and Types of Machine Learning
With this very basic understanding of ML, we can start thinking about multiple applications for these types of technologies. Any domain in which there are hundreds of examples (a.k.a. labeled data), and a need to make predictions, lends itself well to an ML application. Although there are multiple types of ML, the following examples cover a few common ones:
- The home surveillance example is a Classification problem. In Classification problems, a model looks at a data point and assigns one of a pre-defined set of labels to it. In our earlier example there are two possible labels, “human present” and “human not present.”
- If we decided we want our ML surveillance application to also detect dogs, cats, raccoons, and any combination of the four, we would build a Tagging model to expand the model’s ability to detect multiple types of entities present. Tagging models can apply multiple labels to a data point (for example, both “person” and “dog” when a person with a dog is standing at your door) instead of the single label applied by a Classification model.
- One of the most common examples is a Recommendation System, which all of us have experienced on Netflix, Amazon, and many other web services.
- Predicting monthly sales performance based on the forecast of the sales team, macroeconomic indicators, and microeconomic variables is more complex and incorporates a time-based component into the analysis. This model would need a large training data set based on historical data from multiple sources. This type of problem, in which the model delivers a numerical prediction, is classified as a Regression problem.
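The difference between Classification and Tagging described in the list above can be sketched in a few lines of Python. This assumes an already-trained model that emits one score per label; the scores, label names, and the 0.5 threshold are hypothetical values chosen for illustration.

```python
def classify(scores):
    """Classification: return the single label with the highest score."""
    return max(scores, key=scores.get)

def tag(scores, threshold=0.5):
    """Tagging: return every label whose score clears the threshold."""
    return sorted(label for label, score in scores.items() if score >= threshold)

# Hypothetical per-label scores a trained model might emit for one video frame.
frame_scores = {"person": 0.92, "dog": 0.81, "cat": 0.05, "raccoon": 0.10}

single_label = classify(frame_scores)  # one label only
all_tags = tag(frame_scores)           # zero, one, or several labels
```

With these scores, `classify` picks only "person", while `tag` reports both "dog" and "person". A Regression model would instead return a number, such as a predicted sales figure, rather than a label.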
In my previous job I worked with IT infrastructure and data centers. This whole space is full of examples where an ML model can add tremendous value to operators, with the advantage that most facilities have years of log data that can be used to train, test and validate new predictive models. Some examples:
- Predicting when a service will experience an outage, which would require data from all components used to deliver a given service (compute, network, storage, etc.).
- Predicting when pieces of supporting infrastructure (power and cooling systems) will fail. These failures usually lead to huge losses. In this era of big data, it seems highly plausible that most large data center operators are already using ML to predict these conditions.
- Detecting zombie servers. Many data centers have problems detecting which servers are being used productively and which are just wasting power and space. It is possible that an ML model can be trained to identify when a server is doing nothing, which could lead to its removal from the facility and the associated cost savings.
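As a purely illustrative sketch of the zombie-server idea, the Python below trains a very simple classifier (nearest centroid) on labeled utilization history. The features (average CPU and network utilization as fractions between 0 and 1), the labels, and the numbers are all assumptions; real log data would be far richer and would likely call for a more capable model.

```python
def centroid(rows):
    """Average each feature column across a list of feature rows."""
    n = len(rows)
    return [sum(column) / n for column in zip(*rows)]

def train_centroids(labeled_rows):
    """'Training': compute one average feature vector per label."""
    by_label = {}
    for features, label in labeled_rows:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(rows) for label, rows in by_label.items()}

def predict_label(centroids, features):
    """Predict the label whose centroid is closest to the given features."""
    def squared_distance(center):
        return sum((a - b) ** 2 for a, b in zip(center, features))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))

# Hypothetical history: ([avg CPU utilization, avg network utilization], label).
history = [
    ([0.02, 0.01], "zombie"), ([0.03, 0.00], "zombie"), ([0.01, 0.02], "zombie"),
    ([0.45, 0.30], "active"), ([0.60, 0.55], "active"), ([0.35, 0.40], "active"),
]

centroids = train_centroids(history)
idle_server = predict_label(centroids, [0.02, 0.02])
busy_server = predict_label(centroids, [0.50, 0.40])
```

Here the nearly idle server lands closest to the "zombie" examples and the busy one closest to the "active" examples, which is the kind of prediction an operator could review before removing hardware.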
In this post I have described a small part of the ML universe with a few example applications, and without getting into details. ML technologies are appropriate for domains where large amounts of data are available and there is an interest in making predictions. Additionally, ML models can potentially be used to make predictions that would be difficult or impossible to make with traditional software technologies.