Machine learning uses data and algorithms to imitate the way humans learn, gradually improving its accuracy. In recent years, it has become part of our daily lives, powering image recognition, automatic translation, self-driving cars, and more. It has also found applications in many areas of natural science, from cosmology to bioinformatics, where it has helped to analyze enormous amounts of data and to identify the relationships between the parameters of physical systems that the laws of nature describe.
Typically, machine learning requires predefined variables that characterize a system in order to analyze it. Perhaps the best-known examples are the positions and velocities of particles, whose dynamics obey Newton’s laws.
In a new study published in Nature Computational Science, a team of physicists led by Boyuan Chen and Qiang Du of Columbia University’s Creative Machines Lab developed a machine learning algorithm that was able not only to deduce the number of basic variables of various physical systems, but also to identify them and predict the systems’ future dynamics after watching video recordings of their behavior.
The hope is that it could one day be used to automate the discovery of new physical laws that govern the dynamics of complex systems.
Machine learning tries its hand at discovery
All laws of physics are relationships between variables that give a complete description of a corresponding system. Knowing what these variables are is extremely important to begin formulating the rules that govern them. The laws of thermodynamics, for example, were discovered only after temperature, pressure, and energy were formalized. The laws of solid mechanics, electromagnetism, fluid dynamics, atomic physics, and so on, all required their own set of variables to be defined before they could be formulated.
Through trial and error and scientific exploration, humanity has gained significant knowledge about our world and universe. However, this process takes time, and automating scientific research could accelerate the discovery of new laws of nature. This is the goal of the new study.
The new machine learning algorithm operates through a two-step process. In the first step, it deduces the number of independent variables in a system; in the second, it determines what they are.
The scientists began by giving the program records of physical systems whose variables and dynamics are already known, such as the behavior of a single pendulum, which is determined by just two variables: the angle and angular velocity of its arm. As somewhat more complicated examples, they considered a double pendulum with two arms described by four parameters, and an elastic double pendulum whose arm lengths are not fixed, which increases the number of independent variables to six.
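To make concrete what it means for a pendulum to be "determined by just two variables", here is a minimal Python sketch of an idealized, frictionless pendulum. It is only an illustration and has nothing to do with the study's algorithm: the function names, the gravitational acceleration, the arm length, and the time step are all assumptions chosen for the example. The point is that the pair (angle, angular velocity) is enough to compute everything that happens next.

```python
import math

def pendulum_step(theta, omega, dt=0.001, g=9.81, length=1.0):
    """Advance the state (theta, omega) by one semi-implicit Euler step.

    theta: angle of the arm from vertical, in radians
    omega: angular velocity, in radians per second
    g, length, dt: assumed constants for this illustration
    """
    # Update the angular velocity from the old angle, then the angle
    # from the new angular velocity (semi-implicit Euler).
    omega_next = omega - (g / length) * math.sin(theta) * dt
    theta_next = theta + omega_next * dt
    return theta_next, omega_next

def simulate(theta0, omega0, steps, dt=0.001):
    """Return the state reached after `steps` time steps."""
    theta, omega = theta0, omega0
    for _ in range(steps):
        theta, omega = pendulum_step(theta, omega, dt)
    return theta, omega

def energy(theta, omega, g=9.81, length=1.0):
    """Total mechanical energy per unit mass for the state (theta, omega)."""
    return 0.5 * (length * omega) ** 2 + g * length * (1.0 - math.cos(theta))

if __name__ == "__main__":
    start = (0.5, 0.0)            # released from rest at 0.5 rad
    end = simulate(*start, steps=5000)
    print("state after 5 s:", end)
    print("energy before and after:", energy(*start), energy(*end))
```

Nothing besides the two numbers in the state is carried from one step to the next, which is the sense in which the system has two variables. A double pendulum would need the same pair for each arm, giving four.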
After several hours of analysis, the machine learning program returned its own answers for the number of variables in the three systems: 2.05, 4.71, and 5.34, respectively. These are quite close to the correct values of two, four, and six, and the differences can be attributed to imperfections in the algorithm.
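Fractional answers such as 2.05 may look odd, but they are typical of statistical estimates of how many independent variables underlie a data set. The Python sketch below illustrates the idea on synthetic data; it is not presented as the method used in the study. It applies a generic nearest-neighbor maximum-likelihood estimator (in the style of Levina and Bickel) to points that are generated from two hidden variables but recorded as five numbers each. The function name, sample size, neighbor count `k`, and the embedding are all assumptions made for this example.

```python
import numpy as np

def mle_intrinsic_dimension(points, k=10):
    """Nearest-neighbor maximum-likelihood estimate of intrinsic dimension.

    points: array of shape (n_samples, n_features)
    k: number of nearest neighbors used per point (assumed value)
    Returns the per-point estimates averaged over the whole data set.
    """
    # Pairwise Euclidean distances between all points.
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # Sort each row; column 0 is the zero distance from a point to itself,
    # so columns 1..k hold the distances to the k nearest neighbors.
    knn = np.sort(dists, axis=1)[:, 1:k + 1]
    # log(T_k / T_j) for j = 1..k-1, where T_j is the j-th neighbor distance.
    log_ratios = np.log(knn[:, -1:] / knn[:, :-1])
    per_point = (k - 2) / log_ratios.sum(axis=1)
    return float(per_point.mean())

# Synthetic data: two hidden variables (u, v) observed through five features.
rng = np.random.default_rng(0)
uv = rng.uniform(0.0, 1.0, size=(800, 2))
u, v = uv[:, 0], uv[:, 1]
embedded = np.stack([u, v, np.sin(3 * u), np.cos(3 * v), u * v], axis=1)

estimate = mle_intrinsic_dimension(embedded)
print(f"estimated intrinsic dimension: {estimate:.2f}")
```

Because the estimate is an average over noisy local measurements, it lands near 2 rather than exactly on it, even though every data point has five recorded features.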
“We thought this answer was close enough,” said Hod Lipson, director of the Creative Machines Lab. “Especially since all the [machine learning algorithm] had access to was raw video footage, without any knowledge of physics or geometry. But we wanted to know what the variables actually were, not just their number.”
In the next phase of their study, the researchers tried to have their system find not just the number of variables, but also their identity. Deriving them from the video recordings was difficult because of the mathematical language the program uses to describe them. However, after some investigation by the team, it turned out that some of the variables the algorithm identified loosely corresponded to the angles of the arms, although the rest remained mysterious.
“We tried correlating the other variables with anything and everything we could think of: angular and linear velocities, kinetic and potential energy, and various combinations of known quantities,” explained Chen, now an assistant professor at Duke University. “But nothing seemed to match perfectly.”
Although the scientists were unable to figure out what all the variables predicted by the machine learning system were, they are confident that the program found the right set because it made accurate predictions about the future behavior of the pendulums.
The team decided to apply their program beyond pendulums, to more complex systems for which the number of variables remains unknown. They fed the algorithm recordings of an “air dancer”, a lava lamp, and flames from a holiday fireplace loop, and the program returned variable counts of 7.57, 7.89, and 24.70, respectively.
Beyond the number of variables, the scientists were interested in whether the variables themselves would remain the same each time the program was restarted.
“I always wondered, if we ever met an intelligent alien race, would they have discovered the same physics laws as we have, or might they describe the universe in a different way?” said Lipson. “Perhaps some phenomena seem enigmatically complex because we are trying to understand them using the wrong set of variables.”
And indeed, that’s what seems to have happened: the number of variables that the program returned for the more complex systems was always the same, although the variables themselves were different each time. This is not surprising, since we know that there are many alternative ways to describe even the simplest of systems. For example, a pendulum can be characterized by the angle of its arm and its angular velocity, or by its potential and kinetic energies.
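The pendulum example can be written out as a short Python sketch that converts between the two descriptions. The constants and function names are assumptions chosen for illustration, and the pendulum is idealized as a point mass on a rigid, massless arm. One caveat worth noting: the two energies fix only the magnitudes of the angle and the angular velocity, so the signs (which side the arm is on and which way it is swinging) are not recovered by the reverse conversion.

```python
import math

G = 9.81      # gravitational acceleration in m/s^2 (assumed)
LENGTH = 1.0  # arm length in m (assumed)
MASS = 1.0    # bob mass in kg (assumed)

def to_energies(theta, omega):
    """Describe the state by (kinetic energy, potential energy)."""
    kinetic = 0.5 * MASS * (LENGTH * omega) ** 2
    potential = MASS * G * LENGTH * (1.0 - math.cos(theta))
    return kinetic, potential

def to_angle_and_speed(kinetic, potential):
    """Recover |theta| and |omega|; the energies do not encode the signs."""
    omega_abs = math.sqrt(2.0 * kinetic / MASS) / LENGTH
    theta_abs = math.acos(1.0 - potential / (MASS * G * LENGTH))
    return theta_abs, omega_abs

if __name__ == "__main__":
    kinetic, potential = to_energies(theta=0.4, omega=-1.2)
    print("energies:", kinetic, potential)
    print("magnitudes recovered:", to_angle_and_speed(kinetic, potential))
```

Either way, two numbers describe the state, which matches the observation that the count of variables stayed fixed while the variables themselves changed between runs.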
These results have given researchers hope that similar algorithms might help unravel unknown laws of nature in the future. The fact that machines see the world differently, hear frequencies humans can’t hear, and experience nature using senses humans don’t have at all may mean that the laws of nature can be defined by variables that we cannot even imagine.
Reference: Boyuan Chen, et al., Automated discovery of fundamental variables hidden in experimental data, Nature Computational Science (2022). DOI: 10.1038/s43588-022-00281-6