This is a schematic diagram of machine learning for materials discovery. Image: Chiho Kim, Ramprasad Lab, UConn.

For most of human history, the discovery of new materials has been largely a matter of trial and error. But now, researchers from the University of Connecticut (UConn) have systematized the search by developing a machine learning tool that can scan millions of theoretical compounds for qualities that would make better solar cells, fibers and computer chips. The search for new materials may never be the same.

No one knows why an early metallurgist decided to smelt a hunk of tin into some copper, but the resulting bronze alloy was harder and more durable than any material previously known. Most materials discovery over the ensuing 7000 years has been similarly random, guided largely by philosophy and chemical intuition. But with at least 95 stable elements, the number of possible combinations is enormous, and experimentation is an awfully inefficient way to find what you're looking for.

Enter UConn materials scientist Ramamurthy 'Rampi' Ramprasad. Instead of randomly mixing chemicals to see what they do, Ramprasad designs them rationally, using machine learning to figure out which atomic configurations make a polymer a good electrical conductor or insulator.

Polymers can have diverse electronic properties: they can be good insulators or good conductors. What controls these properties is mainly how the atoms in the polymer connect to each other. But until recently, no one had systematically related these properties to atomic configurations.

So Ramprasad and his colleagues decided to do just that. First, they analyzed known polymers using laborious but accurate quantum mechanics-based calculations to figure out which arrangements of atoms confer which properties, and then quantified those atomic-level relationships via a string of numbers that fingerprint each polymer. Once they had those, they could conduct a computer search through any number of theoretical polymers to figure out which ones might have which properties. Then anyone looking for a polymer with a certain property could quickly scan the list and decide which theoretical polymers might be worth trying.

For their project, Ramprasad's group looked at polymers made up of just seven molecular building blocks containing carbon, hydrogen, oxygen, nitrogen and sulfur: CH2, C6H4, CO, O, NH, CS and C4H2S. These building blocks are found in common plastics such as polyethylene, polyesters and polyureas, and could theoretically produce an enormous variety of different polymers. Ramprasad's group decided at first to analyze just 283 simple polymers, each composed of a repeated four-block unit.
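To give a feel for the size of this search space, here is a minimal Python sketch that enumerates every four-block repeat unit that can be written with the seven building blocks. It rests on two assumptions that are not from the article: the repeat unit is treated as a closed loop (because it repeats along the chain), and a unit read backwards counts as the same polymer. The helper name `canonical` is hypothetical. The sketch makes no attempt to reproduce the group's selection of 283 polymers, whose criteria the article does not describe.

```python
from itertools import product

# The seven building blocks named in the article.
BLOCKS = ["CH2", "C6H4", "CO", "O", "NH", "CS", "C4H2S"]


def canonical(unit):
    """Return one representative for all equivalent spellings of a repeat unit.

    Assumption (not from the article): rotating the unit or reading it
    backwards describes the same periodic chain.
    """
    n = len(unit)
    variants = []
    for seq in (tuple(unit), tuple(reversed(unit))):
        for shift in range(n):
            variants.append(seq[shift:] + seq[:shift])
    return min(variants)


# Every ordered way of writing four blocks in a row.
ordered_units = list(product(BLOCKS, repeat=4))

# Collapse spellings that differ only by rotation or reading direction.
distinct_units = {canonical(unit) for unit in ordered_units}

print(len(ordered_units), "ordered sequences")
print(len(distinct_units), "distinct repeat units under the assumptions above")
```

Under these assumptions, the 2401 ordered sequences collapse to 406 distinct repeat units. Many of those would not be chemically sensible, and longer repeat units grow the count rapidly, which is where a fast screening tool becomes useful.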

They started from basic quantum mechanics, and calculated the three-dimensional atomic and electronic structures of each of those 283 four-block polymers. This is not a trivial process, though: calculating the position of every electron and atom in a molecule with more than two atoms takes a powerful computer a significant chunk of time, which is why they only did it for 283 molecules.

Once they had the three-dimensional structures, they could calculate what they really wanted to know: each polymer's properties. They calculated the band gap, which is the amount of energy it takes for an electron in the polymer to break free of its home atom and travel around the material, and the dielectric constant, which is a measure of the effect an electric field can have on the polymer. These properties determine how much electric energy each polymer can store in itself.

Ramprasad's group then used this information to develop a much simpler, shorthand system that a computer could use to look at the building blocks of a polymer and how they connect to each other, and then make educated guesses about its properties.

Computers deal with numbers, so first they had to define each polymer as a string of numbers, a sort of numerical fingerprint. Since there are seven possible building blocks, the first string has seven numbers, each indicating how many blocks of one type are contained in that polymer. But a simple number string doesn't convey enough information about the polymer's structure, so they added a second string of numbers to denote how many pairs there are of each combination of building blocks, such as NH-O or C6H4-CS. They then added a third string that described how many triplets, like NH-O-CH2, there were. They arranged these strings as a three-dimensional matrix, which is a convenient way to describe such strings of numbers in a computer.
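The counting scheme described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the group's code: the function names `fingerprint` and `undirected` are hypothetical, the repeat unit is assumed to wrap around on itself, a pair or triplet and its mirror image are assumed to be the same motif, and sparse counters stand in for the matrix layout the article mentions. Any normalization the group may apply is left out.

```python
from collections import Counter

# The seven building blocks named in the article.
BLOCKS = ["CH2", "C6H4", "CO", "O", "NH", "CS", "C4H2S"]


def undirected(ngram):
    """Treat a motif and its mirror image as the same (an assumption)."""
    ngram = tuple(ngram)
    return min(ngram, ngram[::-1])


def fingerprint(repeat_unit):
    """Count single blocks, adjacent pairs and adjacent triplets.

    The repeat unit is treated as periodic, so the last block is
    followed by the first one again (an assumption).
    """
    n = len(repeat_unit)
    singles = Counter(repeat_unit)
    pairs, triplets = Counter(), Counter()
    for i in range(n):
        a = repeat_unit[i]
        b = repeat_unit[(i + 1) % n]
        c = repeat_unit[(i + 2) % n]
        pairs[undirected((a, b))] += 1
        triplets[undirected((a, b, c))] += 1
    return singles, pairs, triplets


singles, pairs, triplets = fingerprint(["NH", "O", "CH2", "CO"])

# First string: one count per block type, in the fixed order of BLOCKS.
first_string = [singles[block] for block in BLOCKS]
print(first_string)
print(dict(pairs))
print(dict(triplets))
```

For the example unit NH-O-CH2-CO, the first string records one each of CH2, CO, O and NH and none of the other three blocks, while the pair and triplet counters each hold four entries, one per position in the loop.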

Then they let the computer go to work. Using the library of 283 polymers that had been laboriously calculated using quantum mechanics, the machine compared each polymer's numerical fingerprint to its band gap and dielectric constant, and gradually 'learned' which building block combinations were associated with which properties. It could even map those properties onto a two-dimensional matrix of the polymer building blocks.

Once the machine learned which atomic building block combinations gave rise to which properties, it no longer needed the quantum mechanics calculations of atomic structure. It could accurately evaluate the band gap and dielectric constant for any polymer made of any combination of those seven building blocks, using just the numerical fingerprint of its structure.
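The learn-then-predict idea can be illustrated with a toy example. The article does not name the learning algorithm, so the sketch below swaps in plain ridge regression as a stand-in. Everything numeric here is invented: `TOY_SCORE` and `toy_property` are hypothetical placeholders that play the role of the expensive quantum mechanical calculation, and only the seven block counts are used as the fingerprint. No band gaps or dielectric constants are reproduced.

```python
from itertools import product

BLOCKS = ["CH2", "C6H4", "CO", "O", "NH", "CS", "C4H2S"]

# Made-up per-block scores, used only to generate toy labels (not physical data).
TOY_SCORE = dict(zip(BLOCKS, [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]))


def count_vector(unit):
    """Fingerprint used here: how many of each block the unit contains."""
    return [float(unit.count(block)) for block in BLOCKS]


def toy_property(unit):
    """Stand-in for an expensive calculation: the average made-up score."""
    return sum(TOY_SCORE[block] for block in unit) / len(unit)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_ridge(X, y, lam=1e-8):
    """Ridge regression weights from the normal equations (X^T X + lam I) w = X^T y."""
    d = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    b = [sum(row[i] * target for row, target in zip(X, y)) for i in range(d)]
    return solve(A, b)


def predict(weights, unit):
    return sum(w * x for w, x in zip(weights, count_vector(unit)))


# Hold out every fifth four-block unit; "train" on the rest.
units = list(product(BLOCKS, repeat=4))
train = [u for i, u in enumerate(units) if i % 5 != 0]
held_out = [u for i, u in enumerate(units) if i % 5 == 0]

weights = fit_ridge([count_vector(u) for u in train],
                    [toy_property(u) for u in train])

# After fitting, predictions need only the fingerprint, not the slow calculation.
max_error = max(abs(predict(weights, u) - toy_property(u)) for u in held_out)
print("largest error on held-out units:", max_error)
```

Because the toy labels are an exact linear function of the block counts, the fit recovers them almost perfectly. Real properties depend on pairs, triplets and more, which is why the richer fingerprint and a more flexible learner matter in practice.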

Many of the predictions of quantum mechanics and the machine learning tool have already been validated by Ramprasad's UConn collaborators, chemistry professor Greg Sotzing and electrical engineering professor Yang Cao. Sotzing actually made several of the novel polymers, while Cao tested their properties; they came out just as Ramprasad's computations had predicted.

"What's most surprising is the level of accuracy with which we can make predictions of the dielectric constant and band gap of a material using machine learning," says Ramprasad. "These properties are generally computed using quantum mechanical methods such as density functional theory, which are six to eight orders of magnitude slower." The group reported their polymer work in a recent paper in Scientific Reports; another paper on utilizing machine learning in a different manner – to discover laws that govern the dielectric breakdown of insulators – will be published in a forthcoming issue of Chemistry of Materials.

The predicted properties of every polymer Ramprasad's group has evaluated so far are freely available in their online data vault, Khazana, which also provides their machine learning apps to predict polymer properties on the fly. They are also uploading data and the machine learning tools from their Chemistry of Materials work, and from an additional recent paper on predicting the band gap of perovskites, which are inorganic compounds used in solar cells, lasers and light-emitting diodes.

As a theoretical materials scientist, what Ramprasad wants to know is why materials behave the way they do. What about a polymer makes its dielectric constant just so? Or what makes an insulator withstand enormous electric fields without breaking down? But he also wants this understanding to be put to work designing new useful materials rationally, so he is making the results of his calculations freely available.

This story is adapted from material from the University of Connecticut, with editorial changes made by Materials Today. The views expressed in this article do not necessarily represent those of Elsevier.