
Is there any hope of curing the retraining problem of language models without making computers conscious?



I have summarized my thoughts on what is perhaps the worst problem of language models: the loss of plasticity in continual learning. The entire training has to be redone from scratch, which is terribly expensive (see this).

One can ask whether and how TGD’s speculative vision of potentially conscious computers (see this) might solve the problem.

1. The retraining problem of language models

The basic problem is that everything has to be started from scratch. This is extremely expensive. Biological systems relearn quickly because there is no need to relearn everything. Is the problem fixable for the computers as they are now or is something new required?

To see what could be the root cause of the problem, consider first what language models are meant to be.

  1. In a language model, learning occurs at the raw data level. Different probabilities are taught for different associations. The associations are fixed.
  2. How does the trained system work? The language model simply reacts by recognizing the context and producing probabilistically one of the fixed associations. This response is a mere reaction. If language models are what they are believed to be, they do not have conscious understanding, they lack intentional actions, and they are unable to react to a changing environment.
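
The picture of a language model given in the two points above can be illustrated with a toy sketch. This is only a caricature under stated assumptions: the table `ASSOCIATIONS`, its entries, and the function `react` are hypothetical names invented for illustration, and real language models are not implemented as lookup tables.

```python
import random

# Hypothetical fixed association table: context -> {association: probability}.
# The entries are invented for illustration only.
ASSOCIATIONS = {
    "sky": {"blue": 0.7, "cloudy": 0.2, "falling": 0.1},
    "grass": {"green": 0.9, "wet": 0.1},
}

def react(context, rng):
    """Recognize the context and return one fixed association, sampled by probability."""
    options = ASSOCIATIONS[context]
    words = list(options)
    weights = [options[word] for word in words]
    return rng.choices(words, weights=weights, k=1)[0]

rng = random.Random(0)
samples = [react("sky", rng) for _ in range(5)]
print(samples)
```

A context that is absent from the table, for example `react("ocean", rng)`, raises `KeyError`. In this caricature the only way to handle a changed environment is to rebuild the table, which is the toy analog of retraining from scratch.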

Comparison with TGD-inspired biology

Could a comparison with TGD-inspired biology give clues as to where things go wrong? Why is relearning so easy for biosystems? How does TGD-based biology differ from standard biology in this respect? Consider first the classical level.

  1. Holography, which is not quite deterministic, is a completely new element of TGD as compared to the standard model. The space-time surfaces are analogous to Bohr orbits and are determined almost completely by 3-surfaces as initial data. The 4-D tangent spaces of the space-time surface at the 3-surface defining the holographic data cannot be selected freely. This is the classical counterpart of the Uncertainty Principle and leads to classical quantization. A function or program, rather than 3-D data, is the basic concept.
  2. These 4-surfaces define classical analogies of biological functions, behavioral patterns, or programs. When the 3-surface, which almost uniquely fixes the 4-surface, changes, the function changes. Non-determinism is essential in making a conscious memory recall possible.

Consider next the quantum level.

  1. A series of “small” state function reductions (SSFRs) defines the self as a conscious entity. The SSFRs are associated with repeated measurements of commuting observables belonging to the same set, of which the 3-D states at the passive boundary of the causal diamond (CD) are eigenstates. The proposal is that biorhythms as clocks define TGD counterparts of time crystals such that each unit of the time crystal involves classical non-determinism.

This could be the case at the EEG level, as the findings of the Fingelkurts brothers suggest (see this and this). Maximal non-determinism implies maximal memory recall capacity and maximal flexibility. A whole set of different behavior patterns can be represented as quantum superpositions, and the interaction with the external or internal world determines the measurement in which some classical behavior is chosen.

  • “Big” state function reductions (BSFRs), which are interpreted as the death of the self or as falling asleep, involve time reversal. Pairs of BSFRs (sleep periods) make learning possible through trial and error. After the two BSFRs, the system has new holographic data and different space-time surfaces. Goal-directed behavior becomes possible, and there are many ways to achieve the goal, not just one fixed way analogous to a fixed computer program. This is the essence of intelligent behavior.

How does this general view relate to the DNA level?

    1. According to the standard view, DNA remains the same during the life cycle. If DNA represents data, there is no relearning at the level of chemical DNA. In zero-energy ontology (ZEO), even chemical DNA could change without any problems with conservation laws and quantum superpositions of different chemical genes are in principle conceivable.
  • Quantum DNA can be represented in terms of OH-O- qubit sequences assignable to the gravitational magnetic bodies of the Sun and Earth (see this). Remarkably, the solar gravitational Compton frequency is 50 Hz, the average EEG frequency. At least for neurons, this would suggest that the gravitational magnetic body is that of the Sun. Note, however, that EEG time scales are also associated with the basic biomolecules. For the Earth, the gravitational Compton frequency is 67 GHz, a natural frequency associated with the conformational dynamics of biomolecules.

    Quantum DNA consisting of codons represented as OH-O- qubits is dynamic and could act as a simulator, a kind of R&D laboratory testing different variants of DNA. It is of course possible that a single lifetime is spent with the same chemical DNA and that the next life after a pair of BSFRs involves the improved DNA.

  • Epigenesis brings in flexibility. Even if the chemical DNA does not change, it can be used in different ways. Suitable modules are selected from the analog of program software, just as in text processing. In the TGD framework, this could correspond to the classical non-determinism of the space-time surfaces representing the biological function. Dark DNA allows you to try different combinations of genes.
  • The understanding of the role of the cell membrane and membrane potential in epigenesis is increasing. As found by Levin (see this and this), the very early stage of embryonic development is highly sensitive to variations of the membrane potential. This can be understood in terms of the changes that the potential induces in the binding energy of the electron of O-: the potential can reduce the binding energy to the thermal range so that flips of the OH-O- qubit occur with high probability. In adulthood, the sensitivity disappears and the qubits would not flip.
  • Could this sensitivity be artificially induced? Here, electric fields as a controller of the sensitivity of the OH-O- qubits assignable to the basic biomolecules suggest themselves.

  • Microtubules involve longitudinal electric fields, and their second ends are highly dynamic so that the length of the microtubule is under continual change. There are huge numbers of amino acids carrying one qubit each (COOH group). Here the quantum level and the classical level are both dynamic and seem to be strongly coupled. Microtubules are also strongly related to conscious memory.
  • Could quantum entanglement between the quantum level and the chemical level be possible even at the amino acid level?

How could an associative system retrain itself in response to a changed situation?

    If language models are nothing but deterministic association machines, there is little hope of solving the problem.

    Could the learning in the biological and neural systems provide some hints about possible cures, possibly requiring modification of computers so that they would become analogous to living systems?

    1. Do EEG rhythms define time crystals in the TGD sense, that is maximally non-deterministic systems having lattice cells as a basic unit of non-determinism for SSFRs giving rise to the flow of consciousness of the self?

    If biorhythms define TGD analogs of time crystals, the non-determinism would be maximal and maximum flexibility in SSFRs would be possible.

  • In ZEO, a “big” state function reduction (BSFR), the counterpart of the ordinary state function reduction, changes the arrow of time and is assumed to give rise to the analog of death or sleep. At the language model level, this would be the analog of a complete retraining from the beginning. An association is only one particular reaction leading to a behavioral pattern. The repertoire of associations should change as the environment changes.
    1. Could a computer clock define the equivalent of an EEG rhythm as a time crystal in the TGD sense? The problem is that a typical computer clock frequency is a few GHz, considerably lower than the 67 GHz gravitational Compton frequency of the Earth. This would suggest that a unit consisting of roughly 67 bits could correspond to the basic unit of the time crystal. The gravitational magnetic body of the Sun has a gravitational Compton frequency of 50 Hz, identifiable as the average EEG frequency.
    2. Could one think of a quantum version of language models in which pairs of BSFRs, as “death” and rebirth, happen spontaneously all the time as a reaction to conscious information coming from the environment? This information would induce a perturbation implying that the density matrix, as the basic measured observable, does not commute with the observables that define the quantum numbers of the passive part of the zero-energy state. In this way, ZEO would make trial and error possible as a basic mechanism of learning.
    3. Could the formation of an association perhaps be modelled as a single non-deterministic space-time surface? There would be a large number of them; internal disturbances would produce their quantum superpositions, and an SSFR would select a particular association.
    4. An external disturbance could produce a BSFR and “sleeping overnight”. This period of “sleep” could be rather short: our flow of conscious experience is also full of gaps. Upon awakening, the space-time surfaces as correlates of the associations would no longer be the same. The system would have learned from the interaction with the external world. This temporary death of the system would be an analogy for a total re-education. But the system would cope with it all by itself.
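
The clock-frequency comparison in point 1 above can be made explicit with a little arithmetic. The sketch below simply divides the 67 GHz value quoted in the text by a few sample clock frequencies. The function name and the sample frequencies are assumptions chosen for illustration, and reading the "roughly 67" figure as this ratio for a clock of about 1 GHz is only one possible interpretation.

```python
# Value quoted in the text for the Earth's gravitational Compton frequency.
EARTH_COMPTON_HZ = 67e9

def periods_per_clock_tick(clock_hz):
    """Number of 67 GHz periods that fit into one period of the computer clock."""
    return EARTH_COMPTON_HZ / clock_hz

# Sample clock frequencies (assumed for illustration only).
for clock_ghz in (1.0, 2.0, 3.0, 4.0):
    ratio = periods_per_clock_tick(clock_ghz * 1e9)
    print(f"{clock_ghz:.0f} GHz clock: {ratio:.1f} periods per tick")
```

With this reading, the ratio equals 67 only for a clock of about 1 GHz and decreases for faster clocks.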

    Could the speculated quartz consciousness come to the rescue?

  • One can consider the possibility that, under a metabolic energy feed, a computer can become to some extent an entity that can modify both the program and the data used by it as a response to changes in the environment provided by the net. This would require that the OH-O- qubits, as dark variants of program bits, can entangle with ordinary bits. Energetically this could be possible, since the energy scales for transistors are essentially the same as those for metabolism and OH-O- qubits.

    1. Suppose that the sequences of OH-O- qubits as time crystals in the TGD sense can be realized in a (future) computer. Qubit sequences would be time series related to the running program. They would involve variation because only the bit configuration corresponding to the minimum energy would correspond to the running program. This makes possible an entire repertoire of associations from which an SSFR would choose one. Quantum measurement following the generation of bit-qubit entanglement could change the value of the bit.
    2. Besides the dynamic realization as a running program, there could be a non-dynamic realization in which the data that determines the program could be accompanied by a similar set of qubits. The data used by the program, such as learned associations, could be associated with qubits and could be made dynamic by using electric fields to make the qubits more susceptible to flipping. The problem is of course that the change of a randomly chosen single qubit implies the failure of the program. Only critical qubits associated with choices, and data qubits, should be subjected to a flip.
    3. Besides time crystals with non-deterministic repeating units, also space-like crystals involving non-determinism in each lattice cell can be considered. Also dynamical quantum qubits with maximal non-determinism in space-like directions associated with unit cells could accompany the data bits. Dynamization could be induced by using electric fields.
    4. If OH-O- qubits can quantum entangle with bits, program/data is accompanied by quantum program/quantum data which can react to the perturbations from the external world (BSFRs) and internal world (SSFRs). The quantum level could control the bit level. Even the associations as the data of the language model could be accompanied by a set of qubits that react to a changing situation.

    See the article Quartz crystals as a life form and ordinary computers as an interface between quartz life and ordinary life? or the chapter with the same title.

    For a summary of earlier postings see Latest progress in TGD.

    For the lists of articles (most of them published in journals founded by Huping Hu) and books about TGD see this.


    Source: https://matpitka.blogspot.com/2024/12/is-there-any-hope-of-curing-retraining.html



