Teaching decision-making involves objectives, rewards, values, policies

By Sergei Kalinin | Nov 28, 2023

The doomsayers notwithstanding, decision-making remains the exclusive preserve of humans and the day when AI usurps it is still far off. In this blog post, Prof Sergei Kalinin reflects on a GenAI-assisted year of teaching ML and draws lessons from it for next year’s curriculum.

KNOXVILLE, TENNESSEE - One of the most important aspects of machine learning (ML) in domain areas is the connection between ML methods per se and decision-making, covering both short-term decisions and long-term planning. Unlike established areas such as ad placement or robot operations, decision-making in materials science is considerably more complex because the concepts of objective, reward, values, and policies depend strongly on where and when one is in the process.

Correspondingly, this is a big part of the Machine Learning for Materials Science course I am teaching, and the subject of one of its key homework assignments - the one I have enjoyed checking the most.

After looking at students’ homework, while also adjusting my own approach to keep up with these rapidly changing times, I have so far come up with several observations:

- It is actually possible to detect ChatGPT, at least with some probability. While the use of ChatGPT is part of my course’s policy, I was nonetheless curious to note that the majority of my students either did not use ChatGPT or opted to combine ChatGPT answers with their own logic - and this is exactly the outcome I hoped to achieve.

- Many of my students have discovered that even the simple placement of ads allows for several ways of defining a reward. To be more precise, the reward for a platform hosting ads (the click rate) is different from the reward for the organization creating those ads (revenue). This is very impressive, and clearly illustrates that many of these definitions are relative and must be integrated into hierarchical or multiplayer frameworks.
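
The ad-placement observation can be made concrete with a toy calculation. The sketch below is only an illustration, not part of the course material: the two ads, their click probabilities, and their revenue figures are made-up assumptions, and the reward functions are the simplest ones consistent with "click rate" for the platform and "revenue" for the ad's creator.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Ad:
    name: str
    click_prob: float         # assumed probability that a viewer clicks
    revenue_per_click: float  # assumed revenue the ad's creator earns per click


def platform_reward(ad: Ad) -> float:
    """Reward from the hosting platform's point of view: the click rate."""
    return ad.click_prob


def advertiser_reward(ad: Ad) -> float:
    """Reward from the ad creator's point of view: expected revenue per impression."""
    return ad.click_prob * ad.revenue_per_click


def best_ad(ads: Sequence[Ad], reward: Callable[[Ad], float]) -> Ad:
    """Greedy policy: show the ad with the highest reward under the given definition."""
    return max(ads, key=reward)


# Hypothetical numbers chosen only to show that the two rewards can disagree.
ADS = [
    Ad("catchy_banner", click_prob=0.10, revenue_per_click=0.5),
    Ad("niche_offer", click_prob=0.03, revenue_per_click=4.0),
]

platform_choice = best_ad(ADS, platform_reward).name
advertiser_choice = best_ad(ADS, advertiser_reward).name

print("Platform would show:  ", platform_choice)
print("Advertiser would show:", advertiser_choice)
```

With these assumed numbers, the same greedy policy picks a different ad depending on whose reward it maximizes, which is one way of seeing why the definitions have to be reconciled in a hierarchical or multiplayer framework.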

Play time

Parentheti
