Thanh's Islet 🏝️

My Take on LLMs

My career began in Natural Language Processing, and I’ve studied the mathematics and theories behind AI extensively. So when ChatGPT took the world by storm, I approached it with skepticism. I viewed Large Language Models (LLMs) as sophisticated “next word predictors” – black boxes that ingest vast amounts of data and generate plausible word sequences from initial prompts – and felt that the hype surrounding them was overblown. I struggled to see how these “predictors” could revolutionize our work and lives. However, curiosity and a touch of boredom eventually led me to explore the world of LLMs.

Dumb Code Generator, Reading Partner, and Code Kick-starter

My first practical application of LLMs came when I faced the task of creating numerous Golang API endpoints with similar structures but minor differences. Those small variations made deduplication challenging. To my surprise, I found LLMs incredibly useful for generating exactly that kind of predictable-with-little-twists code. Their code generation ability saved time and reduced errors in these tedious tasks.
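To illustrate the pattern, here’s a minimal Go sketch. The User and Order types, routes, and renderJSON helper are hypothetical stand-ins, not the endpoints from my actual project – but they show the near-duplicate shape that’s tedious to write by hand yet easy for an LLM to stamp out:

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// Hypothetical domain types -- illustrative only.
type User struct {
	ID   int    `json:"id"`
	Name string `json:"name"`
}

type Order struct {
	ID    int     `json:"id"`
	Total float64 `json:"total"`
}

// renderJSON marshals any value to a JSON string,
// flattening errors to "{}" for brevity.
func renderJSON(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "{}"
	}
	return string(b)
}

// listUsers and listOrders have the same structure; only the
// payload type and route differ -- the "predictable with little
// twists" shape that resists deduplication but suits an LLM.
func listUsers(w http.ResponseWriter, _ *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	fmt.Fprint(w, renderJSON([]User{{ID: 1, Name: "alice"}}))
}

func listOrders(w http.ResponseWriter, _ *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	fmt.Fprint(w, renderJSON([]Order{{ID: 7, Total: 9.99}}))
}

func main() {
	http.HandleFunc("/users", listUsers)
	http.HandleFunc("/orders", listOrders)
	fmt.Println(renderJSON([]User{{ID: 1, Name: "alice"}}))
}
```

Each new endpoint is a copy of the last with a different type and route – exactly the prompt shape LLMs handle well.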

Experimenting with LLM-aided reading was a revelation. The experience surpassed traditional reading in many ways.

Using LLMs as a reading partner made complex texts more accessible and engaging. For instance, when I needed to skim through lengthy, complex regulation documents, using LLMs to summarize the text and answer follow-up questions proved highly effective.

Taking it a step further, I began using LLMs as first draft writers and project kick-starters. They excelled at initiating writing projects and providing a solid foundation for side projects. I could request initialization commands and receive a “good enough” project structure to build upon.

Limitations

Despite their benefits, LLMs have significant limitations. To understand these, let’s simplify the LLM working model into two components: a “short-term memory” (the limited context window holding the current prompt and conversation) and a “long-term memory” (the knowledge baked into the model’s weights during training).

The limited context length means LLMs struggle with longer documents or keeping large codebases in mind. While we can work around this by dividing knowledge into smaller pieces or creating summarizations, the limitations of the “long-term memory”[1] are more challenging to overcome.
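As a sketch of the “divide into smaller pieces” workaround, here’s a naive word-count chunker in Go. The function name and size limit are my own illustration, not from any library; real pipelines split on semantic boundaries and add overlap between chunks:

```go
package main

import (
	"fmt"
	"strings"
)

// chunk splits text into pieces of at most maxWords words each --
// a naive way to fit a long document into a limited context window.
func chunk(text string, maxWords int) []string {
	words := strings.Fields(text)
	var out []string
	for start := 0; start < len(words); start += maxWords {
		end := start + maxWords
		if end > len(words) {
			end = len(words)
		}
		out = append(out, strings.Join(words[start:end], " "))
	}
	return out
}

func main() {
	pieces := chunk("a b c d e f g", 3)
	fmt.Println(len(pieces), pieces) // 3 chunks: "a b c", "d e f", "g"
}
```

Each chunk would then be summarized separately, and the summaries fed back in – which helps with context length, but does nothing for the gaps in the model’s “long-term memory.”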

There’s a wealth of tacit and “esoteric” knowledge that LLMs don’t possess in their “long-term memory.” Creating new, nuanced, and complex code, especially in niche domains where knowledge isn’t widely available, remains a significant challenge for LLMs. Even when writing this “relatively simple” blog post, I found that while LLMs can produce a decent first draft, they struggle to fully capture personal experiences or develop more complex arguments without making the process feel convoluted.[2]

Useful LLM Tools and Notes

Final Thoughts

Despite their current limitations, LLMs have become an invaluable tool in my workflow. They’ve transformed how I approach various tasks, from coding to writing and researching. I’m excited about future developments in this field. While I believe LLMs can aid[3] us in various tasks, I don’t foresee them replacing human judgment anytime soon.

Footnotes


  1. While there is actual research on “long-term memory for LLMs”, I’m referring to their “plausible text generation ability”. Scientifically speaking, LLMs do not have memory at all: https://datascience.stackexchange.com/questions/122225/will-llms-accumulate-its-skills-after-each-time-it-is-taught-by-one-in-context-l↩︎

  2. You can view an example of this on Telegraph, as I couldn’t find a way to share my Claude conversation: https://telegra.ph/Prompt-and-Draft-for-My-Take-on-LLMs-08-25↩︎

  3. There is actually a tool named Aider that promises complex feature coding. I experimented with Aider using the OpenAI API, but found it required significant hand-holding to be useful, so I abandoned it after a while.↩︎

#llm #ai #claude #chatgpt #supermaven