From raw sensory input to motor output

RL & Engineering-Free Discipline

Richard Sutton, the father of Reinforcement Learning (RL), argued that the core promise of RL is that, given enough sensory capability, motor output can be learned without any engineering intervention or other hand-crafted process. This, of course, will not be easy to digest for skeptics in academia and for industrial administrators, whose minds hold that everything should run under a clear set of rules.

Riding Bicycles: Engineering or Practice?

So to defuse this dilemma, consider a simple mechanism, say bicycle riding, and ask whether there is any engineering behind it. The first claim would be that the gyroscopic effect keeps the bike upright thanks to its velocity. This is simply not true: bring in a novice rider and the chances are extremely high he will crash the moment he takes off. The reality is that the mind is processing small corrections every second, and this is clearly seen when toddlers take their first bike ride. So everything is in the mind, on the condition that there is enough sensory capability. What remains will take care of itself.
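The "small corrections each second" idea can be made concrete with a toy feedback loop. This is an invented illustration, not a physical model of a bicycle: a proportional correction, applied repeatedly, drives the lean angle back toward zero without any equation of motion being engineered in.

```python
# Toy illustration: repeated small corrections recover balance.
# The dynamics and the gain value are invented for illustration only.

lean = 10.0   # degrees off vertical
gain = 0.3    # strength of each tiny correction

for _ in range(20):
    lean -= gain * lean   # one small correction per "tick"

print(abs(lean) < 1.0)    # True: balance recovered
```

No model of the bike appears anywhere; the loop only senses the current lean and nudges against it, which is the whole point of the example.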

It just works

To add further, RL is criticized as unscientific, which is fair in one sense: there are no definitive proofs explaining how these neural networks process all the data. However, the idea was to get closer to how the human mind works when learning a specific skill, and in RL things look promising in terms of "what works", as other accomplishments in the current AI scene show. So we should now accept the reality that we cannot engineer a complex system, but only describe how it can be driven or operated.

Building an AI Industrial Agent in Minutes

Let's go back to heavy industry and examine how human operators manage to drive machinery, vis-à-vis our bicycle example. The main idea is that everything is read on a monitor or sensed through a relay; corrections then come as a second step in order to reach a specific target. Hence States, Actions and Targets are all that is needed to configure a probabilistic sensory-motor medium for any process. And the entertaining part is that no extensive team meetings are needed to fulfill such a configuration, thanks to the intuitive aspect of the service we want to provide heavy manufacturers with. Say we have to configure a smart sensory-motor agent for a distillation column: the main variables that come to mind are the flow variables of the materials inside the DT, the temperature at each level, etc. All of these are stacked into a vector. Second, what are the action variables that would make this configured vector inch toward the desired target? Perhaps some valves need to be tuned constantly to achieve this step. The Target is simply the ideal measurement vector, i.e. the same vector we described but with the best values within it. All the previously mentioned steps can be done in minutes.
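The configuration described above can be sketched in a few lines. Everything here is hypothetical: the variable names, the units and the values are illustrative placeholders for a distillation column, not readings from any real plant or any real product's API.

```python
import numpy as np

# Hypothetical States / Actions / Target configuration for a
# distillation-column agent. All names and values are illustrative.

# State: sensor readings stacked into one vector
state = np.array([
    120.5,  # feed flow (m3/h)
    88.0,   # reboiler temperature (deg C)
    76.3,   # mid-column temperature (deg C)
    64.1,   # overhead temperature (deg C)
])

# Actions: actuators the agent is allowed to adjust (valve openings, 0..1)
action = np.array([0.42, 0.65])  # e.g. reflux valve, steam valve

# Target: the same measurement vector, but with the best values within it
target = np.array([118.0, 90.0, 75.0, 63.5])

# A simple distance-to-target signal the agent can try to reduce
error = float(np.linalg.norm(state - target))
print(round(error, 2))  # 3.51
```

Stacking the readings into a plain vector is what keeps the configuration fast: naming the three arrays is the entire setup, which is why the steps fit in minutes.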


Speaking of the benefits, there are many. Not needing to go through any engineering pipeline automatically means not needing any model or transfer function for, say, a DT, a WHR system, a combined-cycle turbine or even a wind farm. No models means no simulation of the physical asset (dangerously similar to administering heart-condition beta-blockers for a slight tachycardia). And to conclude, no simulation of physical assets means no interventionism at all on our part in our client's business. For the skeptics again: wouldn't it be more rational to provide value on top of a business without the need to interfere in it?
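The "no models needed" claim is exactly what model-free RL formalizes. As a minimal sketch, the classic tabular Q-learning update (Watkins, 1989) improves a policy from logged (state, action, reward, next state) samples alone; no transfer function or simulator of the asset appears anywhere. The state and action counts below are arbitrary placeholders.

```python
import numpy as np

# Minimal model-free update sketch: learning from one observed
# transition, with no plant model or simulator involved.

n_states, n_actions = 4, 2
Q = np.zeros((n_states, n_actions))  # value table, all zeros initially
alpha, gamma = 0.5, 0.9              # learning rate, discount factor

# One transition observed during real operation (illustrative values)
s, a, r, s_next = 0, 1, 1.0, 2

# Q-learning update: purely sample-driven, nothing about the asset's
# physics is encoded anywhere in this rule.
Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])

print(float(Q[0, 1]))  # 0.5
```

The point of the sketch is structural: the update rule consumes measurements and rewards, so replacing the asset (turbine, column, wind farm) changes the data, not the code.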

Data Privacy

More on that: with the absence of any engineering, the meaning of the interacting variables becomes irrelevant. In other words, privacy for client data is explicitly provided by the absence of any data labels, i.e. as a service provider we would not know whether the data uploaded to our servers belongs to a turbine or a distillation column. Everything runs under a purely numerical framework.
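A minimal sketch of what "label-free" means in practice, assuming a hypothetical upload step: the client holds named readings locally, but only the bare numbers leave the site, so nothing on the server side identifies the asset.

```python
# Hypothetical label-stripping step before upload. The variable names
# exist only on the client side; the provider receives pure numbers.

readings = {
    "feed_flow_m3h": 120.5,
    "reboiler_temp_C": 88.0,
    "overhead_temp_C": 64.1,
}

# Strip every label: the payload carries no hint of what was measured
payload = list(readings.values())

print(payload)  # [120.5, 88.0, 64.1]
```

Whether this vector came from a turbine or a column is undecidable from the payload alone, which is the privacy guarantee described above.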


In the end, this is what we strive to achieve with our AI decision-making platform: less interventionism in machinery, and more privacy, simplicity and user-friendliness in configuration. And to finish with one last maxim: we are not creating something new, but rather bringing back the robustness of machine operation from before automatic control took over and made us, unwillingly, passively controlled by the processes around us rather than the other way around.