Blog

Visit and subscribe to the blog: Semantics & Systems

Where To Trust LLMs in the Program Analysis Pipeline by Khaled Ahmed, PhD

Reflections from my thesis defense on keeping correctness with analysis and using models for interpretation.

Read on Substack

adaptive-testing-tools: a small Python library for Adaptive Random Testing by Khaled Ahmed, PhD

From one-off LLM eval scripts to a reusable ART primitive you can drop into any Python test harness.

Read on Substack

Testing Tool-Calling LLMs with Adaptive Random Inputs by Khaled Ahmed, PhD

Measuring Tool Call Accuracy to catch brittle agent behavior before it ships.

Read on Substack

Evaluating LLM prompts using Adaptive Random Testing by Khaled Ahmed

For quickly finding test inputs that reveal "problems" with the prompts.

Read on Substack

Adaptive Random Testing Introduction by Khaled Ahmed

A step-by-step guide.

Read on Substack

Mutation-Based Fault Localization Introduction by Khaled Ahmed

A step-by-step guide.

Read on Substack

Why Trust Matters in AI, and Why We Still Don’t Have It by Khaled Ahmed

Read on Substack