A.I. Predicts Future Discoveries

A.I. predicts future discoveries and reveals new scientific knowledge hidden in old research papers.

“Can machines think?” Alan Turing, the famous mathematician, code breaker and computer scientist, once asked. Today, some experts have no doubt that artificial intelligence will soon be able to develop the kind of general intelligence that humans have. Others argue that machines will never measure up. Although artificial intelligence, like a calculator, can already outperform humans at certain tasks, it can’t be taught human creativity.

Using only the language in millions of old scientific papers, a machine learning algorithm was able to make completely new scientific discoveries.

In a research study published in Nature on July 3, scientists from the Lawrence Berkeley National Laboratory used an algorithm called Word2vec. The algorithm sifts through scientific papers for connections that humans had missed. It then outputs predictions for possible thermoelectric materials, which convert heat into electricity and are used in many heating and cooling applications.

The researchers created a system that could accurately identify and extract information independently. It used sophisticated techniques based on the statistical and geometrical properties of data to identify chemical names, concepts, and structures. This was based on about 1.5 million abstracts of scientific papers on materials science.

A machine learning program then classified words in the data based on specific features such as “elements”, “energetics” and “binders”. For example, “heat” was classified as part of “energetics”, and “gas” as part of “elements”. Among other things, this helped connect certain compounds with types of magnetism and with similar materials, providing insight into how the words were connected without any human intervention.

The algorithm made its predictions by connecting a compound with words such as “chalcogenide” (a material containing chalcogen elements such as sulfur or selenium), “optoelectronic” (describing electronic devices that source, detect and control light) and “photovoltaic applications”. Many thermoelectric materials share such properties, and the algorithm was quick to show that.

The Word2vec algorithm didn’t know the definition of thermoelectric, though. It received no training in materials science. Using only these word associations, the algorithm was able to provide candidates for future thermoelectric materials, some of which may be better than those currently in use.
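To make the idea of ranking candidates by word association concrete, here is a minimal sketch. It assumes that association is measured as the cosine similarity between word vectors, which is a common choice for Word2vec embeddings; the article itself does not spell out the scoring. The three-dimensional vectors and the names material_a, material_b and material_c are invented for illustration and are not data from the study.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Hypothetical toy embeddings; real Word2vec vectors typically
# have hundreds of dimensions and are learned from text.
embeddings = {
    "thermoelectric": [0.9, 0.1, 0.3],
    "material_a": [0.8, 0.2, 0.4],
    "material_b": [0.1, 0.9, 0.2],
    "material_c": [0.5, 0.5, 0.5],
}

def rank_candidates(target, candidates, vectors):
    """Sort candidate words by similarity to the target word, best first."""
    scores = [(name, cosine(vectors[target], vectors[name])) for name in candidates]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

ranking = rank_candidates(
    "thermoelectric", ["material_a", "material_b", "material_c"], embeddings
)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy setup, the material whose vector points in nearly the same direction as “thermoelectric” ranks first, even though nothing in the code knows what the word means.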

A.I. can read any paper on materials science, so it can make connections that no researcher could, said researcher Anubhav Jain. He added that sometimes it does what a scientist would do, and sometimes it makes these kinds of cross-discipline associations.

To train the algorithm, the researchers first assessed the language in 3.3 million abstracts related to materials science, with a vocabulary of about 500,000 words. They fed the abstracts to Word2vec, which used machine learning to analyze relationships between words.

The way the Word2vec algorithm works is that you train a neural network model to remove each word and then predict what the words next to it will be, Jain said. He added that by training a neural network on a word, we get representations of words that can actually confer knowledge. Using just the words found in scientific abstracts, the Word2vec algorithm was able to understand concepts such as the periodic table and even the chemical structures of molecules.

The Word2vec algorithm linked words that were found close together, creating vectors of related words that helped to define concepts. In some cases, words were linked to thermoelectric concepts but had never been written about as thermoelectric in any abstract the researchers surveyed. This kind of gap in knowledge is hard to catch with the human eye, but it is easy for an algorithm like Word2vec to spot.
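The training setup Jain describes, taking each word and predicting its neighbors, can be illustrated with a short sketch of how (center word, neighboring word) training pairs are built. This shows only the data preparation step, not the neural network itself. The function name skipgram_pairs, the window size and the example sentence are assumptions made for illustration, not details from the study.

```python
def skipgram_pairs(tokens, window=2):
    """Build (center, context) pairs: each word paired with its nearby words."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)                 # first neighbor index
        hi = min(len(tokens), i + window + 1)   # one past the last neighbor
        for j in range(lo, hi):
            if j != i:                          # skip the center word itself
                pairs.append((center, tokens[j]))
    return pairs

# A made-up sentence standing in for one abstract.
abstract = "bi2te3 is a promising thermoelectric material".split()
pairs = skipgram_pairs(abstract, window=1)

for center, context in pairs:
    print(center, "->", context)
```

A model trained on many such pairs is pushed to give similar vectors to words that appear in similar surroundings, which is how a compound name can end up close to “thermoelectric” without ever appearing next to it.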

After demonstrating the algorithm’s capacity to predict future materials, the researchers took their work back in time, virtually. They set aside recent data and tested the Word2vec algorithm on older scientific papers to see if it could predict scientific discoveries before they happened. Once again, the algorithm worked.

In one experiment, the scientists analyzed only papers published before 2009 and were able to predict one of the best modern-day thermoelectric materials four years before it was discovered in 2012.
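The back-in-time test can be sketched as a simple split by publication year: train only on the older papers, then check whether the predicted materials show up alongside the target keyword in later papers. The sketch below is an illustration under stated assumptions, not the researchers’ actual pipeline. The abstracts, the material names and the predictions list are invented, and the model training step is skipped entirely.

```python
def split_by_year(abstracts, cutoff_year):
    """Separate abstracts published before the cutoff year from later ones."""
    past = [a for a in abstracts if a["year"] < cutoff_year]
    future = [a for a in abstracts if a["year"] >= cutoff_year]
    return past, future

def confirmed_predictions(predictions, future_abstracts, keyword):
    """Return predicted materials that later appear together with the keyword."""
    confirmed = []
    for material in predictions:
        for abstract in future_abstracts:
            words = abstract["text"].lower().split()
            if material in words and keyword in words:
                confirmed.append(material)
                break  # one later mention is enough
    return confirmed

# Hypothetical records standing in for real abstracts.
abstracts = [
    {"year": 2005, "text": "material_x shows unusual optoelectronic behavior"},
    {"year": 2007, "text": "material_y studied for photovoltaic applications"},
    {"year": 2012, "text": "material_x reported as an efficient thermoelectric"},
]

past, future = split_by_year(abstracts, cutoff_year=2009)

# Stand-in for the output of a model trained only on `past`.
predictions = ["material_x", "material_y"]

hits = confirmed_predictions(predictions, future, "thermoelectric")
print(hits)
```

The key design point is that nothing published from the cutoff year onward is visible during training, so any prediction confirmed by the later papers counts as a genuine forecast.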

This new application of a machine learning algorithm goes beyond materials science. Because it is not trained on a specific scientific dataset, it could easily be applied to other disciplines by retraining it on the literature of whatever subject is of interest. Vahe Tshitoyan, the lead author of the study, says other researchers have already reached out, wanting to learn more.

The Word2vec algorithm is unsupervised and builds its own connections, Tshitoyan added.

 


Ria Roy completed her postgraduate degree at the Visvesvaraya Technological University. She has a strong grounding in technical, analytical and research skills. She is a motivated life science professional with experience working at well-known research institutes.