VANCOUVER, CANADA - GPT-4 is amazing yet also a failure, a claim that may be difficult to comprehend at first. To be clear, GPT-4 is legitimately amazing: It can see (though few details on that are available yet); it does astonishingly well on a whole range of standardized tests - the LSAT, GRE, and SAT - and it has already been adopted in various commercial systems, e.g., Khan Academy.
However, it is also a failure, for the following reasons:
- It does not actually solve any of the core problems of truthfulness and reliability that I laid out in my infamous March 2022 essay "Deep Learning Is Hitting a Wall." Alignment is still shaky: One still cannot use it reliably to guide robots or scientific discovery, the kinds of things that make it worth being excited about A(G)I in the first place. Outliers remain a problem, too.
- This section about the limitations of GPT-4 reads in some ways like a reprise of that March 2022 paper, and it does not offer authoritative solutions to any of those earlier problems. The creators themselves acknowledge the following in their own words:
Massive scaling has thus far not led to revolution. Although GPT-4 is clearly an improvement over GPT-3 and 3.5, it is not - at least as far as I can tell - qualitatively better, only quantitatively better. As noted above, its limitations remain more or less the same. Its quantitative improvements may or may not have considerable commercial implications in particular domains, but clearly [...]
The content herein is subject to copyright by The Yuan. All rights reserved. The content of the services is owned or licensed to The Yuan. Such content from The Yuan may be shared and reprinted but must clearly identify The Yuan as its original source. Content from a third-party copyright holder identified in the copyright notice contained in such third party's content appearing in The Yuan must likewise be clearly labeled as such.