
Trick the AI with just a pen and paper

Updated: Jun 30, 2023

As artificial intelligence systems go, it is pretty smart: show Clip a picture of an apple and it can recognise that it is looking at a fruit. It can even tell you which one, and sometimes go as far as differentiating between varieties.

But even the cleverest AI can be fooled with the simplest of hacks. If you write out the word “iPod” on a sticky label and paste it over the apple, Clip does something odd: it decides, with near certainty, that it is looking at a mid-00s piece of consumer electronics. In another test, pasting dollar signs over a picture of a dog caused it to be recognised as a piggy bank.

An image of a poodle is labelled ‘poodle’, and an image of a poodle with $$$ pasted over it is labelled ‘piggybank’

Source: The Guardian

OpenAI, the machine learning research organisation that created Clip, calls this weakness a “typographic attack”. “We believe attacks such as those described above are far from simply an academic concern”, the organisation said in a paper published this week. “By exploiting the model’s ability to read text robustly, we find that even photographs of handwritten text can often fool the model. This attack works in the wild… but it requires no more technology than pen and paper”.
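The mechanics of such an attack can be illustrated with a toy sketch. It assumes the commonly described zero-shot setup, in which an image embedding is compared against label embeddings by cosine similarity and the scores are turned into probabilities. The three-axis "concept" space, every number in it, and the function names are invented for illustration; none of it is taken from OpenAI's paper or code.

```python
import math

# Hypothetical concept axes: [fruit, electronics, money].
# Real embeddings are learned and far larger; these values are made up.
LABELS = {
    "apple": [1.0, 0.0, 0.0],
    "iPod": [0.0, 1.0, 0.0],
    "piggy bank": [0.0, 0.0, 1.0],
}


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def classify(image_embedding, temperature=0.1):
    """Softmax over cosine similarities between the image and each label."""
    sims = {label: cosine(image_embedding, vec) for label, vec in LABELS.items()}
    top = max(sims.values())  # subtract the maximum for numerical stability
    exps = {label: math.exp((s - top) / temperature) for label, s in sims.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


# Assumed embedding of a plain apple photo: mostly "fruit".
apple_photo = [0.9, 0.1, 0.0]

# Assumption: reading the word "iPod" adds a strong push along "electronics".
ipod_sticker = [0.0, 2.0, 0.0]
attacked_photo = [p + s for p, s in zip(apple_photo, ipod_sticker)]

clean_probs = classify(apple_photo)
attacked_probs = classify(attacked_photo)

print("clean:   ", max(clean_probs, key=clean_probs.get))
print("attacked:", max(attacked_probs, key=attacked_probs.get))
```

In this toy the sticker wins only because its assumed contribution outweighs the visual one. How strongly the real model weighs written text against what it sees is an empirical question that the sketch does not answer.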

Like GPT-3, the last AI system made by the lab to hit the front pages, Clip is more a proof of concept than a commercial product. But both have made huge advances in what was thought possible in their domains: GPT-3 famously wrote a Guardian comment piece last year, while Clip has shown an ability to recognise the real world better than almost all similar approaches.

While the lab’s latest discovery raises the prospect of fooling AI systems with nothing more complex than a T-shirt, OpenAI says the weakness is a reflection of some underlying strengths of its image recognition system. Unlike older AIs, Clip is capable of thinking about objects not just on a visual level, but also in a more “conceptual” way. That means, for instance, that it can understand that a photo of Spider-man, a stylised drawing of the superhero, or even the word “spider” all refer to the same basic thing - but also that it can sometimes fail to recognise the important differences between those categories.

“We discover that the highest layers of Clip organise images as a loose semantic collection of ideas”, OpenAI says, “providing a simple explanation for both the model’s versatility and the representation’s compactness”. In other words, just like how human brains are thought to work, the AI thinks about the world in terms of ideas and concepts, rather than purely visual structures.


But that shorthand can also lead to problems, of which “typographic attacks” are just the top level. The “Spider-man neuron” in the neural network can be shown to respond to the collection of ideas relating to Spider-man and spiders, for instance; but other parts of the network group together concepts that may be better separated out.

Source: The Guardian newspaper by editor Alex Hern


