A Case for Narrow, Purpose-Built LLMs

The days of using the weather as a conversation starter seem to be long gone, at least within tech circles. Generative AI has not only overtaken the weather as the conversation starter but is often the only topic of conversation. Everyone wants to know what everyone else is building with Gen AI, which proofs of concept (POCs) are being considered, and so on. Large Language Model (LLM) names and their capabilities are discussed with an excitement that rivals even the Thanksgiving deals.

Yet a new report on Gen AI adoption and usage, published last week, seems to contradict this weather-replacing hype. The report indicated a decrease in traffic to ChatGPT and other Gen AI services for the first time in several months. It further showed that the decline was in both new traffic and traffic from existing users of these services.

What might be causing this gap between the hype and the actual usage?

When Gen AI opened to the public, it was the shiny object that everyone wanted. However, interest seems to have waned due to a lack of real-world use. Based purely on what I have researched and seen of LLMs, I strongly believe they can help further human potential. What I think is lacking are ideas on how to leverage that potential, and a well-thought-through approach to implementing those ideas.

Here are my five points to explain this gap.

  1. We are looking at LLMs as the final, finished product, and this might be one of the problems. I think we should look at an LLM only as an ingredient, a raw material that can help us build a finished product. Essentially, we need to move the focus away from the LLMs themselves to what we can actually build with them.
  2. LLMs can do more than spit out a poem, create a story, or write a blog. We need to start thinking about real use cases that can make a real difference. This is what I meant when I said LLMs can help further human potential.
  3. Security, data privacy, access management, and model tenancy need to be addressed as part of building the use case, not as an afterthought. We cannot build anything useful at scale without addressing these problems.
  4. Equally real are the risks of models being used to build malicious use cases, and of prompt manipulation leading to undesired output.
  5. Most critical of all, LLMs are, in my opinion, too broad for any particular use case. We need to look at training LLMs for specific tasks using specific datasets, with the training happening in a controlled environment that supports human collaboration.

 

Case for Narrow or Focused LLMs

Of the five points I have noted above, points 1, 2 and 4 are the focus areas for architects, while point 3 can be addressed by services such as Amazon EC2 instance families, Amazon EC2 Auto Scaling, AWS Identity and Access Management (IAM) and Amazon Bedrock.

I tried language-to-language conversion using different LLMs. I used COBOL program snippets and asked the LLMs to convert them to a different language. What I noticed was:

  • Only a few LLMs can convert program code from one language to another.
  • Of the ones that do, most do fairly well on simple code snippets.
  • For functionality of medium complexity, the output in the other language was either incorrect or, if correct, verbose.
  • For complex snippets, I have so far found no LLM that translated them correctly.
  • Most critical of all, none of the LLMs expressed the logic using the constructs of the target programming language. All the LLMs I tried did a line-by-line conversion only.
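
To make the last observation concrete, here is a small sketch in Python. The COBOL fragment in the comment and both function names are illustrative assumptions of mine, not output captured from any particular LLM. The first function mirrors the COBOL control flow statement by statement; the second expresses the same logic with constructs native to the target language.

```python
# Hypothetical COBOL input (illustrative only):
#     MOVE 0 TO WS-TOTAL.
#     PERFORM VARYING WS-IDX FROM 1 BY 1 UNTIL WS-IDX > 5
#         IF WS-AMOUNT(WS-IDX) > 0
#             ADD WS-AMOUNT(WS-IDX) TO WS-TOTAL
#         END-IF
#     END-PERFORM.


def total_line_by_line(ws_amount):
    """Mirrors the COBOL statements one by one (1-based index, manual loop)."""
    ws_total = 0
    ws_idx = 1
    # The table length stands in for the fixed bound of 5 in the COBOL loop.
    while not ws_idx > len(ws_amount):
        if ws_amount[ws_idx - 1] > 0:
            ws_total = ws_total + ws_amount[ws_idx - 1]
        ws_idx = ws_idx + 1
    return ws_total


def total_idiomatic(amounts):
    """Same logic expressed with a generator expression and sum()."""
    return sum(amount for amount in amounts if amount > 0)


amounts = [120, -30, 0, 45, 10]
assert total_line_by_line(amounts) == total_idiomatic(amounts) == 175
```

Both versions return the same result; the difference is that only the second reads like code a developer in the target language would write and maintain.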

These observations bring me to point 5 above. LLMs today are too broad to be implemented for any specific use case, so we need to contextualize them for specific use cases. Think of an ETL tool, which is purpose-built for extracting, transforming, and loading data into a database, or of a language such as SQL, which is purpose-built to extract, manipulate, and make sense of data in a relational database. Like ETL tools or SQL, we need narrow, focused LLMs that are purpose-built for specific use cases.

For example, if the use case is to convert code written in one language to another, train an LLM with specific code snippets from both languages. There are already tools that can translate one language to another line by line, but they most often do not use the capabilities of the target language to build a better version of the same code. This is where narrow LLMs can be useful. An LLM can be trained on specific code snippets, including language constructs and functions, so that it understands the input code and uses the constructs of the new language not only to convert the code but also to rearchitect it in the output language.
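
One way to picture the training data for such a narrow model is as a set of paired examples: a source snippet and a hand-written, idiomatic target. The sketch below is my assumption of how those pairs might be serialized as JSON Lines. The `prompt` and `completion` field names, the instruction wording, and the helper names are all illustrative; the actual format depends on the model and the fine-tuning service being used.

```python
import json


def build_record(source_lang, target_lang, source_code, target_code):
    """Package one translation pair as a prompt/completion record (assumed schema)."""
    prompt = (
        f"Rewrite this {source_lang} code as idiomatic {target_lang}:\n"
        f"{source_code}"
    )
    return {"prompt": prompt, "completion": target_code}


def to_jsonl(records):
    """Serialize records as JSON Lines, one JSON object per line."""
    return "\n".join(json.dumps(record) for record in records)


# Illustrative pairs only; a real dataset would be far larger and reviewed by people.
pairs = [
    ("ADD WS-AMOUNT(WS-IDX) TO WS-TOTAL", "total += amounts[idx]"),
    ("COMPUTE WS-TAX = WS-PRICE * 0.2", "tax = price * 0.2"),
]
records = [build_record("COBOL", "Python", src, tgt) for src, tgt in pairs]
print(to_jsonl(records))
```

Keeping the target side hand-written is the point: the model can only learn to use the constructs of the output language if the examples it sees actually use them.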




How I envision Amazon Bedrock can help

Out of the box, Amazon Bedrock shines by offering access to a wide range of LLMs through its partnerships and, like other AWS services, it excels at seamless integration with the rest of AWS.

This feature of Amazon Bedrock can be used to focus on industry-specific scenarios or even use-case-specific scenarios.

Amazon Bedrock for Industry Specific Scenarios

Services can call LLMs that are “Fit for Industry”, such as Banking or Telecom. This will help customers in those industries leverage LLMs that fit their specific domain needs.




Amazon Bedrock for Use Case Specific Scenarios

Another way I think AWS can build out Amazon Bedrock is to offer services based on use cases. Services can call LLMs that are “Fit for Use Case”, such as ELT, Financial Accounting, or Language-to-Language conversion. “Fit for Use Case” LLMs can be a subset of “Fit for Industry”, or they can be a category by themselves if they apply to more than one industry. This will help customers who need those specific use cases leverage LLMs that fit them.
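
The idea can be sketched as a simple lookup that picks the narrowest model available for a request. Everything here is hypothetical: the catalog keys, the model identifiers (which are placeholders, not real Amazon Bedrock model IDs), the `select_model` function, and the choice to prefer a use-case model over an industry model are my assumptions for illustration, not an existing Bedrock feature.

```python
# Hypothetical catalogs; the identifiers are placeholders, not real model IDs.
USE_CASE_MODELS = {
    "code-conversion": "example.code-conversion-v1",
    "financial-accounting": "example.financial-accounting-v1",
}
INDUSTRY_MODELS = {
    "banking": "example.banking-v1",
    "telecom": "example.telecom-v1",
}
GENERAL_MODEL = "example.general-purpose-v1"


def select_model(industry=None, use_case=None):
    """Pick the narrowest model available: use case, then industry, then general."""
    if use_case in USE_CASE_MODELS:
        return USE_CASE_MODELS[use_case]
    if industry in INDUSTRY_MODELS:
        return INDUSTRY_MODELS[industry]
    return GENERAL_MODEL


# A use-case model wins even when the industry is also known.
assert select_model(industry="banking", use_case="code-conversion") == "example.code-conversion-v1"
# With no matching use case, fall back to the industry model.
assert select_model(industry="telecom") == "example.telecom-v1"
# With nothing matching, fall back to the broad model.
assert select_model(industry="retail") == "example.general-purpose-v1"
```

A calling service would pass the selected identifier to whatever invocation API it uses, so the application code stays the same while the model behind it gets narrower.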



Conclusion:

I envision narrow LLMs as wheels, which the general population can use to build cars, trucks, and trains. Unless this happens, the general population will stay focused on designing the wheels and will never get to building the cars, trucks, or trains, and LLMs will continue to be that shiny object with no real-world use beyond being a subject of elevator talk.


~Narendra V Joshi
