
Reflection 70b AI Model Update. Is it Broken? What is going on here?

#Reflection #70b #Model #Update #Broken

“Digital Spaceport”

The Reflection 70b Llama 3.1 was going to be the best Llama 3.1 fine-tune we had ever seen, even lauded as rivaling OpenAI and Claude. The current state is not that, however, and I wanted to give a quick update on what I found out and the direction things are going. HOPEFULLY we can see a killer…

source

 


6 Comments

  1. I think the benchmarks are very limited. I get that they want "smart", but the biggest failing of LLMs so far is their utter inability "to ask the important question". Instead of asking, we get one of two things: a false assumption (and, because that's usually triggered by the 'safety' training, a good scolding with it), or a massive outpouring of verbiage trying to cover every possible facet of a topic without ever getting to any useful point. This is awful for two reasons: first, they know nothing about who we on the other end are (what we already know, believe, or understand), so an insightful question or two would dramatically improve their answers; and second, since we're apparently all supposed to live with these things every day, we will spend our lives being barked at or drowned in data. In fact, that's exactly why the description of this "reflection" model seemed so interesting to me: it sounded like the model asks itself "what if I'm wrong?" before giving the answer. If it goes to the next step of asking itself how it can find out, maybe we'll finally get an endurable artificial companion!

  2. Already deleted the 2nd upload due to its erratic performance. Some things it answered very well, but on others it was much worse than the 3.1 70b (which I will never delete).

  3. I understand what you mean in this video. The way we evaluate LLMs now isn’t based on graphs or scores, but more on the experience and feel behind it. Unquantized and quantized models can give different results even if the answer is correct—hard to explain, isn’t it?

  4. Didn't "matt" already post that they had a problem during the upload?

    Idk the actual reality of the benchmark scores, but if it's not the real model, then it's probably just premature judgement.

  5. A simple prompt fails:
    "Write a script that implements the “tree -L 2” functionality, but in bash without using tree. make a one-line script, without arguments, for the current directory."
    ANY other LLM can do this more or less correctly, except Reflection (70b-q8_0 tested); Reflection's code just does something useless. (One plausible reference answer is sketched below.)
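
For reference, one plausible bash one-liner for that prompt (a sketch built from find and sed; this is an illustrative answer, not output from any of the models discussed):

    find . -maxdepth 2 | sort | sed -e 's|[^/]*/|  |g'

find limits the walk to two levels below the current directory, and sed turns each leading path component into two spaces of indentation. Unlike tree, it also lists hidden entries and prints plain indentation rather than branch characters, but it is a single line, takes no arguments, and covers the current directory two levels deep, which is what the prompt asks for.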
