The Promise and Limits of AI Large Language Models

March 30, 2023

We have all had the opportunity to try ChatGPT and its competitors, Bing and Bard. Spooked by their progress, Elon Musk and a group of other signatories released an open letter imploring AI labs to pause development of artificial intelligence more powerful than GPT-4. I am more skeptical, but still optimistic about the promise of AI and LLMs.

Why It Matters

Whether the argument is about AI being sentient, dangerous, or out to steal our jobs, everyone seems worried about what is next. If the hype is correct, we are about to enter a golden age of prosperity that solves our national labor shortage, even as we come under threat from endless misinformation. If it is wrong, ChatGPT will evaporate with the same disappointment as NFTs and cryptocurrency, the stars of our last hype cycle.

General AI Requires Lots of Computing Power

According to a research paper by Nvidia and Microsoft Research, GPT-3 would take 288 years to train on a single Nvidia V100 GPU. It is neat to see the outputs and the variety of questions GPT-3 can answer, but producing them takes a lot of resources and requires connecting to the cloud. At this scale, keeping an AI up to date is expensive.
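To put the 288-year figure in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes training time divides perfectly across GPUs, which real clusters do not achieve, so treat the results as a lower bound on hardware rather than an actual cluster size. The function name and the target durations are my own choices for illustration.

```python
import math

# Figure quoted above: years to train GPT-3 on one V100.
SINGLE_GPU_YEARS = 288
DAYS_PER_YEAR = 365


def gpus_needed(target_days: float, gpu_years: float = SINGLE_GPU_YEARS) -> int:
    """Minimum GPU count to finish in target_days, ASSUMING ideal linear scaling."""
    total_gpu_days = gpu_years * DAYS_PER_YEAR
    return math.ceil(total_gpu_days / target_days)


# Hypothetical training deadlines: a year, a month, a week.
for days in (365, 30, 7):
    print(f"{days:>3} days -> at least {gpus_needed(days):,} V100s")
```

Even under this generous assumption, finishing in a month takes thousands of GPUs, which is why training at this scale stays in the cloud.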

AI’s Other Limitation is Data

Over time, high-quality sources of data on the web have been displaced by content farms and paywalled resources. As a software engineer, I find that the most useful resources tend to be the books published by programmers, not websites or blogs. The New York Times and Wall Street Journal paywall their news, protecting it from being slurped into these datasets. General AI might be able to give me cursory overviews of knowledge, but at best it might end up being a summarizing tool for Wikipedia.

How I See These Tools Being Used

  • Training on corporate datasets will make it easier to retrieve knowledge from the piles of documents in internal file systems.
  • Integration into my text editor and terminal, like GitHub Copilot, but as a mere enhancement to regular autocomplete tools: something I can run quickly and locally, and extend with datasets specific to the programming languages and apps I use.
  • A lens through which I can look at existing code and see less complex variations of it to understand what I am looking at. Conversely, a tool that will take my simple code and refactor it to use the latest performance-boosting features.
  • A summarizer for long articles.

This work by Matt Zagaja is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.