
Difference between LLMs and traditional computer technology

Everything is deterministic in traditional computer technology. Some programmers say that, compared with today's LLMs, there is a beauty in that certainty.

In terms of impact on the world, traditional computer technology solves the problems that are well defined and deterministic, while LLMs can attempt all the others, though they sometimes fail to solve them correctly.

Compared with SFT, RL-based methods such as DPO and GRPO ask humans to supply deterministic logic that improves the quality of LLMs on uncertain problems. That is the direction we take when we train LLMs with RL-based methods.

What we need to do is mine as much deterministic logic as we can. One source is all the data in the real world, which has already been used in the pre-training stage. The other is logic that we create ourselves and that applies to any data produced by LLMs, such as the rules of math and coding. That is why RL-based methods work well in these two fields. In math, finding the correct answer is hard, but validating whether a specific answer is correct is relatively easy. The same holds in programming: all we need is to write enough test cases and set up an automated test process.
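The verification idea above can be sketched as two small reward functions, one for math and one for code. This is only an illustration under my own assumptions: the names `math_reward` and `code_reward`, the exact-match rule for math answers, and the pass-rate score for code are hypothetical choices, not part of any specific RL library. The code check uses a bare `exec` with no sandbox or timeout, so a real pipeline would need to isolate untrusted model output.

```python
from fractions import Fraction


def math_reward(model_answer: str, reference: str) -> float:
    """Return 1.0 if the answer equals the reference as an exact rational number."""
    try:
        same = Fraction(model_answer.strip()) == Fraction(reference.strip())
    except (ValueError, ZeroDivisionError):
        # Unparseable answers (or "1/0") simply earn no reward.
        return 0.0
    return 1.0 if same else 0.0


def code_reward(source: str, func_name: str, test_cases: list) -> float:
    """Return the fraction of (args, expected) test cases the generated function passes."""
    namespace = {}
    try:
        # Assumption: the model output defines a function called func_name.
        # WARNING: exec runs the code unsandboxed; fine for a sketch only.
        exec(source, namespace)
        func = namespace[func_name]
    except Exception:
        return 0.0

    passed = 0
    for args, expected in test_cases:
        try:
            if func(*args) == expected:
                passed += 1
        except Exception:
            # A crashing test case counts as a failure.
            pass
    return passed / len(test_cases) if test_cases else 0.0


if __name__ == "__main__":
    print(math_reward("0.5", "1/2"))
    generated = "def add(a, b):\n    return a + b\n"
    print(code_reward(generated, "add", [((1, 2), 3), ((0, 0), 0)]))
```

Both functions are cheap and deterministic, which is the point: checking an answer costs far less than producing it, so the same check can score as many model outputs as we want.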

This post is licensed under CC BY 4.0 by the author.