Edge Computing and Accurate Data Collection in Industrial IoT

Most people in the industry are talking about big data, deep learning, machine learning, artificial intelligence, data science and digital twins. We need to keep in mind that all of these fields share a single prerequisite: accurate data collection. We can have the best technology in the world, but if we don't have the data to feed these systems, we are not going to get anything out of them.

Data is just numbers without context, so we have to do extra work to transform that data into information. Doing that requires analysis skills and process knowledge. Transformed data should tell us something like: this machine processed that part, using these bits and pieces, at this time. Recording the "at this time" part is called timestamping. After you've done this initial processing, the data is ready for data science purposes.
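As a rough illustration, here is a minimal Python sketch of that initial processing step. The field names and IDs are made up for the example; the point is that a raw number only becomes information once it carries a timestamp and its process context:

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import json

@dataclass
class ProcessRecord:
    """A raw reading enriched with the context that turns it into information."""
    timestamp: str      # ISO 8601, UTC -- the "at this time" part
    machine_id: str     # which machine processed the part
    part_id: str        # which part was processed
    tool_id: str        # which bits and pieces were used
    spindle_rpm: float  # the raw number we actually measured

def contextualize(raw_rpm: float, machine_id: str, part_id: str, tool_id: str) -> ProcessRecord:
    # Stamp the reading the moment we capture it, in UTC, to avoid
    # timezone ambiguity when records from different sites are compared.
    return ProcessRecord(
        timestamp=datetime.now(timezone.utc).isoformat(),
        machine_id=machine_id,
        part_id=part_id,
        tool_id=tool_id,
        spindle_rpm=raw_rpm,
    )

if __name__ == "__main__":
    record = contextualize(1480.0, machine_id="CNC-07", part_id="P-1234", tool_id="T-42")
    print(json.dumps(asdict(record), indent=2))
```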

There is also a lot of data in PLCs, robots, sensors and in our IT systems. The first step is connecting these sources and putting their data into one place, a data lake. Many people say the first step is to just send all the data up to the cloud, but this is not ideal, because handling that raw data takes time and costs money.
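To make the idea concrete, here is a hedged sketch of that first collection step. The read functions are stand-ins for whatever real drivers you have (an OPC UA client for the PLC, a REST call to the robot controller, a query against an IT database); the point is merging OT and IT readings into one timestamped record before anything is shipped anywhere:

```python
import random
import time
from datetime import datetime, timezone

# Stand-ins for real drivers; replace with your actual PLC, robot and IT connectors.
def read_plc_tags() -> dict:
    return {"spindle_rpm": 1480 + random.uniform(-5, 5)}

def read_environment_sensors() -> dict:
    return {"temperature_c": 22.5, "humidity_pct": 41.0}

def read_it_system() -> dict:
    return {"operator": "Johny", "product_grade": "A"}

def collect_snapshot() -> dict:
    """Merge OT and IT sources into one timestamped record for the data lake."""
    snapshot = {"timestamp": datetime.now(timezone.utc).isoformat()}
    snapshot.update(read_plc_tags())
    snapshot.update(read_environment_sensors())
    snapshot.update(read_it_system())
    return snapshot

if __name__ == "__main__":
    for _ in range(3):
        print(collect_snapshot())
        time.sleep(1)
```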

It is time to give an example of the interval at which we log data. Think about a production line with a machine running at a specific rpm, producing an A-grade product, with Johny operating the machine on that shift. Around it we have sensors, such as temperature and humidity, that we read from our conditioning systems.

Is it OK to log all the data at a five second interval? How about two, or one? The answer depends on the application. If we are trying to catch a sudden spike that happens in a split second, a one second interval is not enough. On the other hand, if an operator runs a single machine for a whole shift, there is no need to log who is operating the machine every second.
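One simple way to act on this is to give every signal its own logging policy. The sketch below uses made-up rates: it samples the fast process signal frequently, the slow environmental signal at a relaxed interval, and the operator field only when it changes:

```python
import time

# Hypothetical per-signal policies: sample at a fixed interval, or log on change.
POLICIES = {
    "spindle_rpm": {"mode": "interval", "seconds": 0.1},    # fast signal: catch spikes
    "temperature_c": {"mode": "interval", "seconds": 5.0},  # slow-moving signal
    "operator": {"mode": "on_change"},                      # changes once per shift at most
}

class SignalLogger:
    def __init__(self):
        self.last_logged_at = {}  # signal -> time of last interval log
        self.last_value = {}      # signal -> last value seen (for on_change)

    def should_log(self, signal: str, value, now: float) -> bool:
        policy = POLICIES[signal]
        if policy["mode"] == "on_change":
            changed = self.last_value.get(signal) != value
            self.last_value[signal] = value
            return changed
        # Interval mode: log only if enough time has passed since the last log.
        last = self.last_logged_at.get(signal, 0.0)
        if now - last >= policy["seconds"]:
            self.last_logged_at[signal] = now
            return True
        return False

if __name__ == "__main__":
    logger = SignalLogger()
    print(logger.should_log("operator", "Johny", time.time()))       # True: first value
    print(logger.should_log("operator", "Johny", time.time()))       # False: unchanged
    print(logger.should_log("temperature_c", 22.5, time.time()))     # True: first sample
    print(logger.should_log("temperature_c", 22.6, time.time()))     # False: within 5 s
```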

We need something at the edge of the factory, for both OT and IT systems, that helps us with the initial processing of the data. We can use this edge computer to filter the data and turn it into information, throwing away anything we won't need in analytics.
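A classic example of such edge-side filtering is a deadband: drop any reading that has barely moved since the last one we kept. A minimal sketch, where the 0.5 degree deadband is an arbitrary example value:

```python
def deadband_filter(readings, deadband):
    """Keep only readings that move more than `deadband` from the last kept value."""
    kept = []
    last = None
    for value in readings:
        if last is None or abs(value - last) > deadband:
            kept.append(value)
            last = value
    return kept

raw = [22.50, 22.51, 22.49, 22.50, 23.80, 23.81, 22.52]
print(deadband_filter(raw, deadband=0.5))  # -> [22.5, 23.8, 22.52]
```

The noise around 22.5 never leaves the factory; only the genuine change survives, which is exactly the bandwidth and storage saving the edge computer is there for.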

Edge computing pushes data processing to edge devices, meaning workloads are placed closer to the source of data collection. Data is transformed into information where the action takes place.
Suppose we prepared the data and pushed it to the cloud for analytics. The analytics suggest that, at some point, certain temperature and pressure levels cause a valve to fail and lead to downtime. If the analysis of that temperature and pressure data takes place on cloud servers, the automatic shutoff instruction may come too late. If we leave that job to the edge, which has enough processing power and far lower latency, we can significantly cut downtime and save property, even lives.
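In code, the edge side of that split can be as small as a threshold rule derived from the cloud analytics. Everything below is hypothetical, the thresholds, the signal names and the shutoff call, but it shows why the reaction is immediate: the check runs locally on every reading, with no round trip to the cloud:

```python
# Thresholds assumed to be produced by the cloud analytics (hypothetical values).
MAX_TEMPERATURE_C = 85.0
MAX_PRESSURE_BAR = 6.5

def close_valve():
    # Stand-in for the real actuator command (e.g., writing a PLC output).
    print("EMERGENCY: closing valve")

def on_new_reading(temperature_c: float, pressure_bar: float) -> None:
    """Runs on the edge device for every reading; no network round trip."""
    if temperature_c > MAX_TEMPERATURE_C and pressure_bar > MAX_PRESSURE_BAR:
        close_valve()

on_new_reading(86.2, 6.8)  # both limits exceeded -> valve closes immediately
```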

While emphasizing the importance of edge devices, we do not place less value on computing centers. Although edge devices provide local computing and storage, there will still be a need to connect them to data centers, whether on premises or in the cloud. Thanks to their high computing power, cloud or on-premises data centers are better when we develop knowledge, such as training the analytics. When we talk about execution, it is better to be near where the action is happening.