Despite the hype, text-generating AI models like OpenAI's GPT-4 make plenty of mistakes, some of them harmful. James Vincent of The Verge once called one such model an "emotionally manipulative liar," which sums up the current state of affairs rather well.
The companies behind these models say they're deploying filters and teams of human moderators to fix problems as they're flagged. But there's no single solution: even today's best models are susceptible to biases, toxicity, and malicious attacks.
Today, Nvidia unveiled NeMo Guardrails, an open-source toolkit designed to make text-generating models, and the apps built on top of them, more "accurate, appropriate, on topic and secure."
Jonathan Cohen, Nvidia's VP of applied research, says the company had been working on Guardrails' underlying architecture for "many years," but only realized about a year ago that it was a good fit for models like GPT-4 and ChatGPT.
Cohen told TechCrunch via email that NeMo Guardrails has been in development ever since, adding that safety tooling of this kind is critical to deploying AI models for enterprise use cases.
Guardrails includes code, examples, and documentation to add "safety" to AI apps that generate text as well as speech. Nvidia says the toolkit works with most generative language models, letting developers create rules in just a few lines of code.
Guardrails can be used to prevent (or at least try to prevent) models from veering off topic, responding with inaccurate or toxic information, or connecting to "unsafe" external sources. Think of it as a way to keep a customer service assistant from answering questions about the weather, say, or a search engine chatbot from linking to disreputable academic papers.
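For a sense of what those "few lines of code" look like in practice, here's a minimal sketch using the toolkit's Python API and its Colang modeling language to keep a hypothetical customer service bot from fielding weather questions. The flow, sample utterances, and model settings below are illustrative assumptions, not Nvidia's own example.

```python
# Minimal sketch of a topical guardrail with NeMo Guardrails.
# The flow, sample utterances, and model choice are assumptions
# made for illustration, not Nvidia's canonical example.
from nemoguardrails import LLMRails, RailsConfig

# Colang defines canonical user/bot message forms and the flows
# that connect them.
colang_content = """
define user ask about weather
  "what's the weather like today?"
  "will it rain tomorrow?"

define bot deflect off topic
  "I can only help with questions about your orders and our products."

define flow weather guardrail
  user ask about weather
  bot deflect off topic
"""

# The YAML config tells Guardrails which LLM to wrap (assumed model here).
yaml_content = """
models:
  - type: main
    engine: openai
    model: text-davinci-003
"""

config = RailsConfig.from_content(
    colang_content=colang_content, yaml_content=yaml_content
)
rails = LLMRails(config)

# Off-topic questions get routed to the canned deflection
# instead of being passed through to the underlying LLM.
response = rails.generate(
    messages=[{"role": "user", "content": "Will it rain tomorrow?"}]
)
print(response["content"])
```

Incoming messages are matched against the canonical forms by similarity, so the guardrail can catch phrasings beyond the exact examples listed, though how reliably depends on the underlying model.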
Cohen added that Guardrails lets developers define the boundaries of their own applications, though they risk writing rules that are too broad or too narrow for their use case.
A universal fix for language models' shortcomings sounds too good to be true, and it is. While companies like Zapier are using Guardrails to add a layer of safety to their generative models, Nvidia acknowledges that the toolkit won't catch everything.
Cohen also notes that Guardrails works best with models that are "sufficiently good at instruction-following," like ChatGPT, and that use the popular LangChain framework for building AI-powered apps. That rules out some of the open-source options out there.
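The LangChain tie-in means a LangChain-wrapped LLM can be handed to Guardrails directly. A rough sketch of that wiring follows, assuming a config directory of Colang and YAML files; the `./config` path and model name are placeholders, not fixed requirements.

```python
# Sketch of wiring a LangChain LLM into NeMo Guardrails.
# The config path and model name are placeholder assumptions.
from langchain.llms import OpenAI
from nemoguardrails import LLMRails, RailsConfig

# Any LangChain-compatible LLM can be passed to LLMRails, which is
# why Guardrails leans on models that follow instructions reliably.
llm = OpenAI(model_name="text-davinci-003", temperature=0.0)

# Load the Colang flows and YAML settings from a config directory.
config = RailsConfig.from_path("./config")
rails = LLMRails(config, llm=llm)

print(rails.generate(prompt="Summarize our return policy."))
```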
However useful Guardrails may be, Nvidia isn't releasing it out of charity. It's part of the company's NeMo framework, which is available through Nvidia's enterprise AI software suite and its fully managed NeMo cloud service. Any company can adopt the open-source release, but Nvidia would clearly prefer that they pay for the hosted version.
So while there's probably no harm in Guardrails, keep in mind that it's no cure-all, and be wary if Nvidia ever implies otherwise.