I think agentic AI machine translation has huge potential to improve over traditional neural machine translation, and I am releasing as open source a demonstration I'd been playing with as a fun weekend project. Using an agentic workflow, this demonstration (i) prompts an LLM to translate text from one language to another, (ii) reflects on the translation to come up with constructive suggestions for improving it, and (iii) uses those suggestions to refine the translation. In our limited testing, this is sometimes competitive with, and sometimes worse than, leading commercial providers. But it gives a highly steerable translation system: by simply changing the prompt, you can specify the tone (formal/informal), handle regional variation (do you want Spanish as spoken in Spain or in Latin America?), and ensure consistent translation of terms (by providing a glossary). This is not mature software, but I hope the open-source community can make agentic translation work much better. Given that a simple reflection workflow already gives decent results, I think there's significant headroom to make agentic translation much better. Releasing an early software prototype like this is something new I decided to try, to see if it is helpful to the developer community. I'd love any feedback on this. Thanks to Joaquin Dominguez, Nedelina Teneva, PhD and John Santerre PhD for help with this. https://lnkd.in/gjGANH6H
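The three-step workflow above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `call_llm` is a hypothetical stand-in for any chat-completion API call (stubbed here so the control flow runs without network access), and the prompt wording is assumed.

```python
def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM API call (stubbed for illustration)."""
    return f"[LLM response to: {prompt[:40]}...]"

def agentic_translate(source_text: str, source_lang: str, target_lang: str,
                      tone: str = "formal") -> str:
    # Step 1: initial translation, with steerable constraints in the prompt.
    draft = call_llm(
        f"Translate this {source_lang} text to {target_lang} in a {tone} tone. "
        f"Output only the translation.\n\n{source_text}"
    )
    # Step 2: reflect on the draft and list concrete, constructive suggestions.
    critique = call_llm(
        f"Review this {target_lang} translation of the {source_lang} source. "
        f"List specific issues with accuracy, fluency, style, and terminology.\n\n"
        f"Source:\n{source_text}\n\nTranslation:\n{draft}"
    )
    # Step 3: refine the translation using the critique.
    return call_llm(
        f"Rewrite the translation, addressing every point in the critique.\n\n"
        f"Source:\n{source_text}\n\nDraft:\n{draft}\n\nCritique:\n{critique}"
    )
```

Swapping the stub for a real model call (and tweaking the tone, regional-variant, or glossary wording in step 1) is all the steering the post describes.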
Yep, I did this around a year ago in a prototype too, and am in the process of integrating the concept into the open-source tool redaktool.ai that I'm working on. Thank you for sharing; I'll check out the prompts and workflow now. Btw, my prototype was based on early versions of OpenAI GPT-4, and even back then it helped translate metaphors well and handle complex linguistic details. Very promising, and sometimes it produced much better results than commercial providers would offer. It's also more flexible and instructable, obviously.
Andrew Ng This is along the path of what many in the industry are doing. Two comments: 1. Replace step 2 with a formalized reflection approach such as AutoMQM. 2. BLEU isn't discriminative at the accuracy levels modern MT systems achieve. Use a trained metric like BLEURT or COMET to see real differences between systems. 20 years ago we called this idea automatic post-editing, where you'd train a second string transducer to rewrite the initial MT output. However, those systems were very difficult to train and cascade. Using LLMs in cascades is significantly more effective.
I can't agree more. I think the potential of agentic systems, even with current LLMs and SLMs, is severely underestimated. There are so many potential permutations of self-reflection, RAG, tool use, etc. that have yet to be explored. I'm excited about the potential.
Why do we need anything agentic here? Can't we just put all the requirements (such as tone, the glossary, etc.) in the initial prompt and then, once the initial translation has been generated, start a separate thread, paste in the initial translation, and add: "This translation was supposed to respect the constraints of tone X and glossary Y. If it doesn't, rewrite the translation so it respects the constraints." That would give the best result. But research also shows that a simple "Are you sure about this result?" asked in the same thread improves the output significantly.
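The separate-thread verification step described above might look something like this sketch. Everything here is assumed for illustration: `call_llm` is a hypothetical stand-in for a fresh-context API call, and the prompt wording follows the comment's own phrasing.

```python
def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for a chat-completion call in a fresh thread (stubbed)."""
    return f"[LLM response to: {prompt[:40]}...]"

def verify_translation(translation: str, tone: str, glossary: dict) -> str:
    """Single verification pass: check constraints, rewrite only if violated."""
    glossary_str = "; ".join(f"{src} -> {tgt}" for src, tgt in glossary.items())
    prompt = (
        f"This translation was supposed to respect the constraints of a {tone} "
        f"tone and this glossary: {glossary_str}. If it doesn't, rewrite the "
        f"translation so it respects the constraints; otherwise return it "
        f"unchanged.\n\n{translation}"
    )
    return call_llm(prompt)
```

Whether one verification pass matches an iterative critique-then-rewrite loop is exactly the open question the thread is debating.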
Couldn’t we make frontier models immediately 90% more reliable just by adding two or three agentic reflection/web search rounds after each output? Why don’t current AI providers do that by default? Reducing hallucinations and increasing output quality this way would seem like a no-brainer, no?
Remarkable idea of iterative (reflective) LLM usage. I wonder how much of the prompting could be improved with RL tools (for instance, the hard-coded texts in "multichunk_reflect_on_translation").
What distresses me most is that the times when novices could spend hours tweaking and tinkering with experimental apps are long gone. Accessing AI tools now almost universally requires shelling out money in advance, making it out of the question to spend lots of time trying this or that tool. I wish that when access to an LLM is declared open source, there were no requirement to pay, even ostensibly because servicing the platform costs a lot of money. Such a requirement lengthens the learning curve for some while shortening it for others.
It's great to see insights such as this one. I believe we've only begun to plumb the depths of foundation models and LLMs in general. While they will need some help to get to AGI, my instinct is that these models represent the start of an equivalent to the human connectome, one that needs sensory prompts approximating the human sensory array that builds our connectome. LLM technology is fundamentally backwards, given that human intelligence emerges from the connectome we build through life. https://agiish.com/f/are-generative-models-creating-connectomes
So, there are six prompts in the utils.py file. Here's a summary of what they do:
PROMPT 1: Request an initial translation of the entire source text without explanations.
PROMPT 2: Ask for constructive criticism and suggestions to improve the initial translation.
PROMPT 3: Request an improved translation based on the expert suggestions and criticisms.
PROMPT 4: Request a translation of a specific chunk of the source text, using the rest for context.
PROMPT 5: Ask for constructive criticism and suggestions to improve a specific chunk's translation.
PROMPT 6: Request an improved translation of a specific chunk based on the expert suggestions.
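How those six prompts might fit together can be sketched as a dispatch on document length: short texts go through prompts 1-3 in one shot, long texts are split into chunks that each go through prompts 4-6. This is an assumed reconstruction, not the repo's actual code; `call_llm` is a hypothetical stub, and the word-count budget stands in for real token counting.

```python
def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM API call (stubbed for illustration)."""
    return f"[LLM response to: {prompt[:40]}...]"

MAX_WORDS = 500  # assumed per-call budget; real code would count tokens

def split_into_chunks(text: str, max_words: int) -> list:
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]

def translate_document(text: str, src: str, tgt: str) -> str:
    if len(text.split()) <= MAX_WORDS:
        # Prompts 1-3: whole-document translate, critique, improve.
        draft = call_llm(f"PROMPT 1: translate from {src} to {tgt}:\n{text}")
        critique = call_llm(f"PROMPT 2: critique this translation:\n{draft}")
        return call_llm(f"PROMPT 3: improve the draft using:\n{critique}\n{draft}")
    # Prompts 4-6: per-chunk translate, critique, improve, with the
    # full document available as surrounding context.
    results = []
    for chunk in split_into_chunks(text, MAX_WORDS):
        draft = call_llm(f"PROMPT 4: translate this chunk of the document:\n{chunk}")
        critique = call_llm(f"PROMPT 5: critique this chunk's translation:\n{draft}")
        results.append(call_llm(f"PROMPT 6: improve the chunk using:\n{critique}"))
    return " ".join(results)
```

The real prompts presumably pass the untranslated remainder of the document as context in the chunked path, which is what PROMPT 4's description implies.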