[TOC]

  1. Title: Recent Language Model Techniques 2024
  2. Review Date: Thu, Apr 25, 2024
  3. url: https://www.youtube.com/watch?v=kzB23CoZG30
  4. url2: https://www.youtube.com/watch?v=iH-wmtxHunk
  5. url3: https://www.youtube.com/watch?v=o68RRGxAtDo

Llama 3

(figure: image-20240425125031837)

Llama architecture

Extra: Sampling and Rejection Sampling

ref: https://www.youtube.com/watch?v=9ixzzPQWuAY (Inverse Transform Sampling)

ref: https://www.youtube.com/watch?v=OXDqjdVVePY (Accept-Reject Sampling)
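
To make the two linked ideas concrete, here is a rough NumPy sketch (my own toy example, not taken from the videos): inverse transform sampling pushes Uniform(0, 1) draws through the inverse CDF, while accept-reject sampling draws from an easy proposal q and keeps x with probability p(x) / (M * q(x)).

```python
import numpy as np

rng = np.random.default_rng(0)

# Inverse transform sampling: push Uniform(0, 1) draws through the inverse CDF.
# Example: Exponential(lam) has inverse CDF F^-1(u) = -ln(1 - u) / lam.
def sample_exponential(lam, n):
    u = rng.uniform(size=n)
    return -np.log(1.0 - u) / lam

# Accept-reject sampling: draw x from an easy proposal q and accept it with
# probability p(x) / (M * q(x)), where M is chosen so p(x) <= M * q(x) everywhere.
def rejection_sample(target_pdf, proposal_sample, proposal_pdf, M, n):
    samples = []
    while len(samples) < n:
        x = proposal_sample()
        if rng.uniform() < target_pdf(x) / (M * proposal_pdf(x)):
            samples.append(x)
    return np.array(samples)

# Example: standard normal truncated to [-1, 1], using a Uniform(-1, 1) proposal.
p = lambda x: np.exp(-0.5 * x * x)        # unnormalised target density
q_sample = lambda: rng.uniform(-1.0, 1.0)
q_pdf = lambda x: 0.5                      # Uniform(-1, 1) density
draws = rejection_sample(p, q_sample, q_pdf, M=2.0, n=1000)
print(sample_exponential(0.5, 5), draws.mean())
```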

Grouped Query Attention

(figures: image-20240425154324824, image-20240425154556029, image-20240425154646875)
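
A rough PyTorch sketch of the idea (tensor shapes and names are my own): each K/V head is shared by a group of query heads, so the KV cache shrinks by the grouping factor while the attention computation itself is unchanged.

```python
import torch
import torch.nn.functional as F

def grouped_query_attention(q, k, v):
    """Sketch of GQA: several query heads share one K/V head.
    q: (B, n_heads, T, d); k, v: (B, n_kv_heads, T, d)."""
    group = q.shape[1] // k.shape[1]
    # replicate each K/V head so it serves `group` query heads
    k = k.repeat_interleave(group, dim=1)
    v = v.repeat_interleave(group, dim=1)
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
    return F.softmax(scores, dim=-1) @ v   # causal mask omitted for brevity

# toy example: 8 query heads, 2 KV heads (each KV head serves 4 query heads),
# so the KV cache is 4x smaller than with full multi-head attention
B, T, d = 1, 16, 64
q = torch.randn(B, 8, T, d)
k = torch.randn(B, 2, T, d)
v = torch.randn(B, 2, T, d)
print(grouped_query_attention(q, k, v).shape)   # torch.Size([1, 8, 16, 64])
```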

Result

(figure: image-20240425154750368)

HorNet: Efficient High-Order Spatial Interactions with Recursive Gated Convolutions

ref: https://arxiv.org/abs/2207.14284

(figures: image-20240425223058722, image-20240425223112089)

Check the definition of depthwise convolution: https://www.youtube.com/watch?v=vVaRhZXovbw (a minimal example follows below)
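
For quick reference, a depthwise convolution in PyTorch is just a Conv2d with groups equal to the number of input channels (toy numbers below, my own example):

```python
import torch
import torch.nn as nn

x = torch.randn(1, 64, 56, 56)   # (batch, channels, height, width)

# Depthwise convolution: groups == in_channels, so every channel gets its own
# spatial filter and channels are not mixed with each other.
depthwise = nn.Conv2d(64, 64, kernel_size=7, padding=3, groups=64)

# A following 1x1 ("pointwise") conv mixes channels; together they form a
# depthwise-separable convolution.
pointwise = nn.Conv2d(64, 128, kernel_size=1)

print(pointwise(depthwise(x)).shape)   # torch.Size([1, 128, 56, 56])
```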

(figure: image-20240425223355380)
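
A simplified PyTorch sketch of the recursive gated convolution (gnConv), based on my reading of the paper: the input is projected and split into gating branches, a 7x7 depthwise convolution mixes them spatially, and the branches are multiplied back in order by order. The paper's scaling factor and global-filter variant are omitted, so treat this as an illustration rather than the reference implementation.

```python
import torch
import torch.nn as nn

class gnConv(nn.Module):
    """Simplified recursive gated convolution: order-n spatial interactions
    built from repeated element-wise gating."""
    def __init__(self, dim, order=3):
        super().__init__()
        self.order = order
        # channel widths per order, smallest first (doubling each order)
        self.dims = [dim // 2 ** i for i in range(order)][::-1]
        self.proj_in = nn.Conv2d(dim, 2 * dim, kernel_size=1)
        # 7x7 depthwise conv over the concatenated gating branches
        self.dwconv = nn.Conv2d(sum(self.dims), sum(self.dims), kernel_size=7,
                                padding=3, groups=sum(self.dims))
        # 1x1 projections between successive orders
        self.pws = nn.ModuleList(
            nn.Conv2d(self.dims[i], self.dims[i + 1], kernel_size=1)
            for i in range(order - 1)
        )
        self.proj_out = nn.Conv2d(dim, dim, kernel_size=1)

    def forward(self, x):
        fused = self.proj_in(x)                                   # (B, 2*dim, H, W)
        p0, q = torch.split(fused, (self.dims[0], sum(self.dims)), dim=1)
        q = self.dwconv(q)                                        # depthwise spatial mixing
        qs = torch.split(q, self.dims, dim=1)
        y = p0 * qs[0]                                            # 1st-order gating
        for i in range(self.order - 1):
            y = self.pws[i](y) * qs[i + 1]                        # recursive higher-order gating
        return self.proj_out(y)

block = gnConv(dim=64, order=3)
x = torch.randn(1, 64, 56, 56)
print(block(x).shape)   # torch.Size([1, 64, 56, 56])
```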

Capability Forgetting (from GLM-4)

Applied as a post-training step after SFT, RLHF also produced unexpected behaviour in the policy: the authors observed that, after the RLHF stage, the model shows reduced capability in handling specific scenarios.

Direct Preference Optimization

Can we just use a cross-entropy-style loss over preference pairs instead of PPO? That is essentially what DPO does (a sketch follows after the figures below).

(figures: image-20240426231336360, image-20240426232104891)
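
Roughly yes: DPO turns the RLHF objective into a binary cross-entropy over preference pairs, with the reward expressed implicitly through log-probability ratios against a frozen reference model. A minimal sketch (variable names and the beta value are my own choices):

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Sketch of the DPO loss: binary cross-entropy over preference pairs.
    Each input is the summed log-probability of the chosen / rejected response
    under the trainable policy or the frozen reference model."""
    # implicit reward margin between the chosen and rejected responses
    logits = beta * ((policy_chosen_logps - ref_chosen_logps)
                     - (policy_rejected_logps - ref_rejected_logps))
    # -log sigmoid(margin) is the cross-entropy with target "chosen is preferred"
    return -F.logsigmoid(logits).mean()

# toy example: a batch of 4 preference pairs with random log-probabilities
torch.manual_seed(0)
pc, pr, rc, rr = (torch.randn(4) for _ in range(4))
print(dpo_loss(pc, pr, rc, rr))
```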