2 Comments
Rainbow Roxy:

This piece really made me think! It's so true how we focus on the bubble instead of the tech itself. Your take on self-supervised learning hits different, really. Do you think there's any approach that can evolve past token prediction with the current model design? Brilliant insights, loved this read!

Taylor G. Lunt:

Thanks!

I think as long as we're training with self-supervised learning, predicting the next token, the model is inherently forced into inefficient memorization of a bunch of things humans don't care about. You can do somewhat better with optimizations, of course, but I think you'll never do way better in terms of capabilities.
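To make the point concrete: a minimal sketch of the next-token objective, using a hypothetical toy corpus and a simple bigram count model (not anything from the post). The loss charges the model for every single next token, whether or not the token encodes anything a human would bother to remember.

```python
# Sketch of self-supervised next-token training: the model pays
# cross-entropy loss on EVERY token in the corpus, so incidental
# details must be memorized just as much as important ones.
# The corpus and bigram model here are illustrative assumptions.
import math
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate".split()

# Fit a bigram model by counting: estimate P(next | prev).
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def prob(prev, nxt):
    total = sum(counts[prev].values())
    return counts[prev][nxt] / total

# Average negative log-likelihood over all next-token predictions.
# Every position contributes, regardless of how "important" it is.
nll = -sum(math.log(prob(p, n)) for p, n in zip(corpus, corpus[1:]))
avg_nll = nll / (len(corpus) - 1)
print(round(avg_nll, 3))
```

Lowering this number requires getting better at predicting all of the tokens, which is the inefficiency the comment is pointing at.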
