
Reinforcement learning


Reinforcement learning is a mechanism that drives learning and memory in birds, humans and other living species, and it has been carried over into machine learning. Today it is used both to influence our behavior on social media and to train machines.

The essential components were developed by B.F. Skinner, one of the 20th century’s most eminent psychologists. His key concept was that behavior that is rewarded will be repeated. His work focused on how to reinforce behavior through intermittent schedules of reinforcement, and he believed that complex behavior sequences could be taught through reinforcement. His key hypothesis was that human behavior is controlled by the environment, and that the future of humanity could be saved by systematically directing behavior toward specific desirable ends rather than leaving it to develop haphazardly.

He developed his method by training pigeons, going so far as to train them well enough that they could be used to guide a missile for the US military (“Pigeons in a Pelican”: https://www.appstate.edu/~steelekm/classes/psy3214/Documents/Skinner1960.pdf).

“To say that a reinforcement is contingent upon a response may mean nothing more than that it follows the response. It may follow because of some mechanical connection or because of the mediation of another organism; but conditioning takes place presumably because of the temporal relation only, expressed in terms of the order and proximity of response and reinforcement. Whenever we present a state of affairs which is known to be reinforcing at a given drive, we must suppose that conditioning takes place, even though we have paid no attention to the behavior of the organism in making the presentation.”

– B.F. Skinner, “‘Superstition’ in the Pigeon” (p. 168)

He developed the operant conditioning chamber, later called the Skinner box, in which animals were taught certain behaviors by rewarding or punishing their actions.

By the time Skinner retired from Harvard, behaviorism had declined and his theories had come under criticism. Even so, this method of reinforcement is used in machine learning today to train a machine to recognize specific patterns while ignoring others.
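To make the link between the Skinner box and machine learning concrete, here is a minimal sketch of reward-driven learning. It is an illustration only, not Skinner's procedure or any particular library's algorithm: it uses a simple epsilon-greedy rule, and the function name `train_agent`, the lever probabilities and all parameters are assumptions chosen for the example. Each "lever" pays out only some of the time, loosely in the spirit of an intermittent schedule, and the simulated agent gradually repeats the action that is rewarded more often.

```python
import random


def train_agent(reward_prob, steps=2000, epsilon=0.1, seed=0):
    """Toy 'Skinner box': learn which lever to press from rewards alone.

    reward_prob[i] is the (hypothetical) chance that lever i gives a reward.
    Returns the learned value estimate and the press count for each lever.
    """
    rng = random.Random(seed)          # seeded so runs are repeatable
    n = len(reward_prob)
    values = [0.0] * n                 # running average reward per lever
    counts = [0] * n                   # how often each lever was pressed

    for _ in range(steps):
        if rng.random() < epsilon:
            # Occasionally explore: press a lever at random.
            action = rng.randrange(n)
        else:
            # Otherwise exploit: press the lever with the best estimate so far.
            action = max(range(n), key=lambda a: values[a])

        # The environment rewards the action only some of the time.
        reward = 1.0 if rng.random() < reward_prob[action] else 0.0

        # Update the running average for the lever that was pressed.
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]

    return values, counts


if __name__ == "__main__":
    values, counts = train_agent([0.1, 0.8])
    print("estimated values:", values)
    print("press counts:", counts)
```

With one lever that pays off about 10% of the time and another that pays off about 80% of the time, the agent should end up pressing the second lever far more often: rewarded behavior is repeated.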

In addition, the little “like” button and the “number of followers” count are today’s reward systems: they reward a social media poster for posting, which encourages more posts and, in turn, more rewards.
