
Market report September 2024

Key takeaways

LLM advancement: Reasoning capability and contextual retrieval

Reasoning capability: Overhyped?

OpenAI’s recent release of o1-preview and o1-mini models sparked interest due to their improved “reasoning” capabilities. While many see these models as the next evolution of AI, it’s important to note that they introduce trade-offs between cost and performance. The key question is whether these reasoning models are truly effective or whether they’ll end up consuming more tokens without delivering significant benefits.

The models appear to build on the “chain of thought” prompting pattern (which can be applied to any model, as sketched below), but early adopters are still debating whether they’re worth the hype. While reasoning tokens are hidden from the user, they still take up space in the model’s context window and incur costs.

“People are excited about tokens being 100x cheaper, but now we have models using 100x more tokens.”
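To make the trade-off concrete: the chain-of-thought pattern is just prompting, so it works with any chat model. Below is a minimal sketch assuming an OpenAI-compatible chat endpoint and an API key in `OPENAI_API_KEY`; the model name is illustrative, not a recommendation. Unlike o1, the reasoning tokens here are visible, but both approaches bill them as output.

```typescript
// Minimal chain-of-thought prompting sketch. Assumes an OpenAI-compatible
// /v1/chat/completions endpoint; runs on Node 18+ (built-in fetch).
async function solveWithCoT(question: string): Promise<string> {
  const res = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    body: JSON.stringify({
      model: "gpt-4o-mini", // illustrative model name
      messages: [
        // The CoT pattern is just prompting: ask the model to reason step
        // by step before committing to an answer. These reasoning tokens
        // are visible in the output and count toward what you pay for.
        {
          role: "system",
          content:
            "Think step by step. Show your reasoning, then end with 'Answer: <answer>'.",
        },
        { role: "user", content: question },
      ],
    }),
  });
  const data = await res.json();
  return data.choices[0].message.content;
}
```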

Contextual retrieval: A new embedding technique for RAG systems

Anthropic’s introduction of contextual retrieval marks an advancement in how AI models access and use background knowledge. Traditional Retrieval-Augmented Generation (RAG) pipelines often strip away context when encoding information into chunks, leading to failures in retrieving relevant data.

We see contextual retrieval addressing this gap by:

- Using an LLM to generate a short explanation that situates each chunk within its source document before indexing
- Prepending that chunk-specific context to the chunk ahead of both embedding and BM25 indexing, so retrieval can match on meaning the bare chunk would otherwise lose

This technique isn’t limited to Anthropic’s products; it can serve as a general way to improve any RAG system’s performance. We expect it to make AI models more reliable and effective in tasks that require extensive background knowledge.
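To illustrate, here’s a minimal sketch of the indexing step. The `chat` and `embed` helpers are hypothetical wrappers around your model provider’s API, not a specific library:

```typescript
// Hypothetical wrappers around your provider's chat and embedding APIs.
declare function chat(prompt: string): Promise<string>;
declare function embed(text: string): Promise<number[]>;

// Contextual retrieval sketch: before indexing, ask an LLM to situate each
// chunk within its document, then embed "context + chunk" instead of the
// bare chunk.
async function indexWithContext(document: string, chunks: string[]) {
  const indexed: { chunk: string; context: string; vector: number[] }[] = [];
  for (const chunk of chunks) {
    // Generate a short, chunk-specific preamble from the full document.
    const context = await chat(
      `<document>${document}</document>\n` +
        "Give a short context situating this chunk within the document, " +
        `to improve search retrieval:\n<chunk>${chunk}</chunk>`
    );
    // Embedding the contextualized chunk preserves information a bare
    // chunk loses (e.g., which company or quarter "it" refers to).
    const vector = await embed(`${context}\n${chunk}`);
    indexed.push({ chunk, context, vector });
  }
  return indexed;
}
```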

Generative AI in legacy codebases

One of the most powerful capabilities of generative AI is its ability to summarize large volumes of text and media. In software delivery, this translates to helping developers and business analysts understand complex or legacy codebases more efficiently.

We are seeing tools like GitHub Copilot Workspace and Amazon Q’s developer agent revolutionize how we interact with code:

- GitHub Copilot Workspace turns an issue into a plan and a spec, then proposes coordinated code changes across a repository
- Amazon Q’s developer agent can implement features and automate code transformations, such as Java version upgrades

Despite these advancements, challenges remain:

- Context windows still cap how much of a large codebase an agent can reason about at once
- Generated changes still demand careful human review before they can be trusted
- Results degrade quickly when a task falls outside a well-scoped problem

We think it’s crucial to focus on specific problem spaces where these agents excel, rather than expecting them to handle every coding challenge. Their true value lies in augmenting developer capabilities in targeted areas.
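One example of such a targeted area is summarizing a single module for onboarding. The sketch below assumes a hypothetical `chat` wrapper around whichever provider you use:

```typescript
import { readFileSync, readdirSync } from "node:fs";
import { join } from "node:path";

// Hypothetical LLM call; swap in your provider of choice.
declare function chat(prompt: string): Promise<string>;

// Sketch: summarize one module of a legacy codebase at a time, rather than
// asking an agent to "understand" the whole repository at once. This keeps
// the task inside the agent's context window and review budget.
async function summarizeModule(dir: string): Promise<string> {
  const sources = readdirSync(dir)
    .filter((f) => f.endsWith(".ts") || f.endsWith(".js"))
    .map((f) => `// file: ${f}\n${readFileSync(join(dir, f), "utf8")}`)
    .join("\n\n");
  return chat(
    "Summarize what this module does, its key entry points, and any " +
      `hidden coupling a new developer should know about:\n\n${sources}`
  );
}
```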

The rise of small models

While most users rely on commercial giants like ChatGPT, researchers and developers are finding value in running smaller models locally. Tools like Ollama are proving that models with 8 to 70 billion parameters can run efficiently on personal devices, including laptops. As hardware continues to improve, local models will become an increasingly viable alternative for tasks that don’t require massive computational power.

Local models are also immutable: a pinned model never changes underneath you, which guarantees consistent outputs and makes them indispensable in scientific research, where reproducibility is key. With the recent release of Llama 3.2 supporting vision, we’re seeing even more possibilities for lightweight, portable AI.
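Getting started takes very little code. Here’s a sketch against Ollama’s local REST API, which listens on localhost:11434 by default; the model tag assumes you’ve already run `ollama pull llama3.2`:

```typescript
// Call a local model through Ollama's REST API (default port 11434).
async function askLocal(prompt: string): Promise<string> {
  const res = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "llama3.2", // assumes the model has been pulled locally
      prompt,
      stream: false, // return one JSON object instead of a token stream
    }),
  });
  const data = await res.json();
  return data.response; // Ollama puts the completion in `response`
}
```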

Serverless overuse is becoming a problem

Serverless computing may have once seemed like a silver bullet, but it’s now becoming clear that it’s not the answer to every problem. We’re seeing a growing number of cases where serverless is being misapplied, particularly in workloads that call for a persistent server: long-lived connections, steady high-volume traffic, or long-running processes.

It’s fine to start with serverless when you’re small, but as usage scales, moving to a dedicated server can save a lot of time and money. It’s time for developers to get smarter about where and when to use serverless architectures.
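A back-of-the-envelope calculation shows where the economics flip. All prices in the sketch below are illustrative assumptions, not any provider’s actual rates:

```typescript
// Back-of-the-envelope comparison (all prices illustrative, not quotes):
// at low traffic, per-invocation billing wins; at sustained high traffic,
// a flat-rate server does.
const PER_MILLION_INVOCATIONS = 0.2; // assumed serverless price, USD
const GB_SECOND = 0.0000166667;      // assumed compute price, USD
const SERVER_MONTHLY = 40;           // assumed dedicated server, USD

function serverlessMonthlyCost(
  requests: number,
  avgSeconds: number,
  memGb: number
): number {
  return (
    (requests / 1e6) * PER_MILLION_INVOCATIONS +
    requests * avgSeconds * memGb * GB_SECOND
  );
}

// 1M requests/month stays cheap on serverless; 200M flips the economics
// well past the assumed $40/month flat-rate server.
console.log(serverlessMonthlyCost(1e6, 0.1, 0.5).toFixed(2)); // ~1.03
console.log(serverlessMonthlyCost(2e8, 0.1, 0.5).toFixed(2)); // ~206.67
```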

OpenAI ditches Next.js for Remix

This month, OpenAI surprised the community by switching the framework behind ChatGPT from Next.js to Remix. While the exact reasons haven’t been disclosed, there are a few theories. Remix’s ability to handle server-side rendering more efficiently and reduce the need for API calls seems to have played a part.

Meanwhile, Next.js has grown increasingly complex, with features like server actions and the App Router lengthening its build times. That complexity may be driving developers toward leaner alternatives like Remix, particularly for client-heavy applications.
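The SSR point is easiest to see in Remix’s loader pattern, where route data is fetched on the server as part of rendering, so the page arrives with its data already in place. A minimal sketch; `fetchReport` is a hypothetical stand-in for a real data source:

```tsx
// app/routes/report.tsx — a minimal sketch of Remix's data-loading model.
// The loader runs on the server during SSR, so there is no client-side
// API round trip on first load.
import { json } from "@remix-run/node";
import { useLoaderData } from "@remix-run/react";

export async function loader() {
  const report = await fetchReport(); // hypothetical data source
  return json({ report });
}

export default function Report() {
  const { report } = useLoaderData<typeof loader>();
  return <h1>{report.title}</h1>;
}

// Hypothetical helper standing in for a DB or upstream service call.
async function fetchReport() {
  return { title: "Market report" };
}
```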

JavaScript in crisis?

We’ve been following Mathias Schäfer’s thought-provoking observations on the current state of JavaScript in web development, and many of his points resonate with us.

We believe anyone building for the next web should take these observations into account.

Who is hiring?

Our analysis of the latest job posts in Y Combinator’s hiring thread shows that while the number of listings has held steady month over month, it’s only a third of what it was in 2022.

The numbers suggest that companies will continue to be selective and conservative in their hiring strategies. For more detailed insights, you can view the full dataset here.
