
Reinforcement learning

Reinforcement learning is a mechanism that drives learning and memory in living species, from birds to humans, and it has been carried over into machine learning. It is used both to influence our behavior on social media and to train machines.

The essential components were developed by B.F. Skinner, one of the 20th century’s most eminent psychologists. His key concept was that behavior that is rewarded will be repeated. He studied how behavior can be reinforced through intermittent schedules of reinforcement, and he believed that complex behavior sequences could be taught through reinforcement. His key hypothesis was that human behavior is controlled by the environment, and that the future of humanity could be secured by systematically directing behavior toward specific desirable ends rather than leaving it to chance.

He developed his method by training pigeons, going so far as to train them well enough to guide a missile for the US military (“Pigeons in a Pelican”: https://www.appstate.edu/~steelekm/classes/psy3214/Documents/Skinner1960.pdf).

“To say that a reinforcement is contingent upon a response may mean nothing more than that it follows the response. It may follow because of some mechanical connection or because of the mediation of another organism; but conditioning takes place presumably because of the temporal relation only, expressed in terms of the order and proximity of response and reinforcement. Whenever we present a state of affairs which is known to be reinforcing at a given drive, we must suppose that conditioning takes place, even though we have paid no attention to the behavior of the organism in making the presentation.”

– B.F. Skinner, “‘Superstition’ in the Pigeon” (p. 168)

He developed the operant conditioning chamber, later called the Skinner box, in which animals were taught certain behaviors by rewarding or punishing their actions.

By the time Skinner retired from Harvard, behaviorism had declined and his theories had been criticized, but this method of reinforcement is used in machine learning today to train a machine to recognize specific patterns while ignoring others.
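To make the idea concrete, here is a minimal sketch of reward-driven learning in Python. It is only an illustration and not Skinner’s procedure or any particular library’s method: it assumes a toy setting with two possible actions, each rewarded only some of the time (loosely like an intermittent schedule), and uses a simple “epsilon-greedy” rule. The names `train_agent` and `reward_probs` are made up for this example.

```python
import random


def train_agent(reward_probs, steps=5000, epsilon=0.1, seed=0):
    """Toy epsilon-greedy learner (illustrative assumption, not a standard API).

    reward_probs[i] is the chance that action i is rewarded when chosen.
    Returns the learned value estimate and the pull count for each action.
    """
    rng = random.Random(seed)
    n = len(reward_probs)
    values = [0.0] * n  # running average reward seen for each action
    counts = [0] * n    # how many times each action was chosen

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: occasionally try a random action.
            action = rng.randrange(n)
        else:
            # Exploit: repeat the action that has been rewarded most so far.
            action = max(range(n), key=lambda a: values[a])

        # Reward arrives only some of the time.
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0

        # Update the running average for the chosen action.
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]

    return values, counts


if __name__ == "__main__":
    values, counts = train_agent([0.1, 0.9])
    print("estimated values:", values)
    print("times chosen:", counts)
```

With these assumed reward chances, the learner should end up choosing the second action far more often than the first, which is the “rewarded behavior gets repeated” principle in miniature.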

In addition, the little “like” button and the “number of followers” count are today’s reward systems: they reward a social media poster for posting, which leads to more posts and more rewards.
