Overview: UI/browser automation tests can be brittle because they hook into implementation details of the UI that may not be relevant to actual user interaction. Visual test automation is more robust because it uses the UI the same way a user would. This post explains a solution for visual browser automation. … Continue reading Visual browser automation with OmniParser
Category: AI
Running large LLMs on small hardware: Gemma 4 12B on a VRAM-constrained Radeon laptop
Google released Gemma 4 12B today. I'm a huge fan of the Gemma model family: the models have improved with each iteration and consistently perform on par with larger models. It didn't run at first because it needs more VRAM than my laptop has, but there's a workaround. Here are short instructions for how to run … Continue reading Running large LLMs on small hardware: Gemma 4 12B on a VRAM-constrained Radeon laptop
OPAW: Real-Time Target Sound Extraction
In this instalment of "One Paper a Week", we're looking at Waveformer, a neural network for extracting specific waveforms from a sound mix in real-time. If you're thinking "Independent Component Analysis", you're not alone: ICA can also extract a desired signal from a mix of signals (similarly to how we are able to understand a … Continue reading OPAW: Real-Time Target Sound Extraction
Restricting VS Code terminal commands to an approved commands list
Motivation: If you've ever needed to restrict which commands can be run inside a VS Code integrated terminal - nowadays mainly to prevent agents from wreaking havoc - you can achieve this using a combination of VS Code terminal profiles and PowerShell's PSReadLine module. I'm not sure if/how this works with other terminals, however I've … Continue reading Restricting VS Code terminal commands to an approved commands list
OPAW: Tracking Capabilities for Safer Agents
With AI agents rampaging on half the population's computers, there is increased interest in safeguarding AI agent workflows. In "Tracking Capabilities for Safer Agents", none other than Martin Odersky et al. propose a framework for running AI agents subject to security policies. The answer is - of course - Scala. I'm skipping the problem … Continue reading OPAW: Tracking Capabilities for Safer Agents

