Softmax Linear Unit (softmax(x)*x) Neurons – Preliminary Results

Mechanistic Interpretability

In previous work, we’ve found the MLP layers of transformers very difficult to understand. Switching the activation function to be softmax(x)*x — which we call “softmax linear” units (SoLU) — seems to greatly help.
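As a sketch of the activation itself, the formula SoLU(x) = x · softmax(x) can be written in a few lines of NumPy. The choice of softmax axis (the feature dimension) is an assumption here, as is the absence of any surrounding normalization:

```python
import numpy as np

def solu(x):
    """Softmax Linear Unit: elementwise product of x with softmax(x).

    Softmax is taken over the last (feature) axis, so activations
    compete with one another before gating the linear term x.
    """
    # Numerically stable softmax: subtract the per-row max first.
    z = x - np.max(x, axis=-1, keepdims=True)
    e = np.exp(z)
    s = e / np.sum(e, axis=-1, keepdims=True)
    return x * s
```

For a uniform input vector the softmax weights are all 1/n, so `solu(np.array([1.0, 1.0]))` yields `[0.5, 0.5]`; large entries receive most of the softmax mass, which tends to suppress all but a few activations.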

As an experiment, we’ve been recording a couple videos discussing our…
