Is AI and machine learning development eligible for the R&DTI?

AI and machine learning work is a rich source of genuine R&D, and a fast-growing target for ATO scrutiny. Here is how the eligibility test actually applies to AI development, with examples of what qualifies and what does not.

George Walch, Founder and R&D Tax Expert, Rand Advisory · 5 min read

AI and machine learning attract more R&D Tax Incentive claims every year, and more attention from the regulators. Both the ATO and the Department of Industry, Science and Resources (DISR) have signalled that AI-related claims are an area of focus. That combination, high claim volume and rising scrutiny, makes it worth being precise about when AI work is genuinely eligible and when it is not.

The short answer: building with AI is not automatically R&D, and neither is training a model. Eligibility turns on the same test as any other software: whether you were resolving a genuine technical uncertainty through systematic experimentation. What changes with AI is how easy it is to blur that line.

The test has not changed, the scrutiny has

There is no special AI category in Division 355. AI development is assessed against the same four core-activity criteria as all software. If you can name the technical unknown and show the experiment, you have a claim. If you used established techniques to a predictable result, you do not, no matter how sophisticated the tooling.

The eligibility test, applied to AI

For a refresher on the four core-activity criteria, see the software eligibility guide. Applied to AI and machine learning, the questions become concrete:

  • Was the outcome genuinely unknown? Could a competent professional, with access to worldwide knowledge, have predicted whether your approach would work? Fine-tuning a well-documented model on a clean dataset to a result the literature already supports is predictable. Pushing a model into a regime where its behaviour is not documented may not be.
  • Did you run a systematic experiment? Hypothesis, experiment, observation, evaluation, conclusion. In ML this maps naturally onto experiment tracking: a hypothesis about an architecture or training regime, runs that test it, metrics that evaluate it, and a conclusion that changes the next step.
  • Were you generating new knowledge? The dominant purpose has to be resolving the uncertainty, not shipping a feature that happens to use a model.
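
The hypothesis-to-conclusion sequence in the second question can be kept as one structured record per experiment. What follows is a minimal sketch, not an ATO or DISR template: the `ExperimentRecord` class, its field names, and the example values are all hypothetical, chosen only to show how each step maps onto something an ML team already tracks.

```python
from dataclasses import dataclass, field, asdict
from datetime import date


@dataclass
class ExperimentRecord:
    """One experiment laid out as hypothesis, experiment, observation,
    evaluation, conclusion. Hypothetical structure, for illustration only."""
    run_date: date
    hypothesis: str                                   # what was expected, and why it was uncertain
    experiment: str                                   # what was actually run
    observations: dict = field(default_factory=dict)  # raw metrics from the run
    evaluation: str = ""                              # what the metrics say about the hypothesis
    conclusion: str = ""                              # how the result changes the next step

    def missing_steps(self) -> list[str]:
        """Return the names of the steps that are still empty."""
        return [name for name, value in asdict(self).items()
                if value in ("", {}, None)]


# Illustrative values only; they do not describe a real project.
record = ExperimentRecord(
    run_date=date(2025, 3, 14),
    hypothesis="Illustrative: a 4-bit distilled model keeps recall above 0.90",
    experiment="Trained three distilled variants and scored each on a held-out set",
    observations={"recall_variant_a": 0.84, "recall_variant_b": 0.88},
)

# The evaluation and conclusion have not been written up yet.
print(record.missing_steps())
```

A record whose `evaluation` and `conclusion` are never filled in looks more like a settings search than an experiment, which is the distinction the questions above draw.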

What usually qualifies

| AI/ML activity | Likely core R&D? | Why |
|---|---|---|
| Developing a novel model architecture where performance is genuinely unknown | Yes | Outcome cannot be known in advance; requires experimentation |
| Devising a new technique to train under a hard constraint (latency, memory, tiny dataset) no known method handles | Yes | Genuine technical uncertainty |
| Researching whether a model can achieve a capability not demonstrated in the literature for your conditions | Often | Depends on whether the uncertainty is real and documented as such |
| Calling a third-party LLM API per its documentation | No | Routine integration, outcome knowable |
| Fine-tuning a standard model on your data to an expected result | No | Predictable application of known methods |
| Prompt engineering to improve outputs | Usually no | Iterative tuning, not hypothesis-driven experimentation |
| Building a RAG pipeline from documented components | Usually no | Integration of known techniques |

The recurring trap with AI is mistaking sophistication for uncertainty. Using a frontier model is not R&D. Building a product around one is not R&D. The R&D, if there is any, lives in the specific places where you genuinely did not know if something would work and had to experiment to find out.

Tuning is not experimentation

Sweeping hyperparameters, trying prompts, or swapping models to see which scores best is benchmarking known options, not resolving a technical uncertainty. To be core R&D, the question has to be one the field could not already answer, and the work has to be a structured experiment, not a search over settings.

Why AI claims draw extra scrutiny

Three reasons. First, volume: AI is where a lot of new spend is going, so it is where the regulators look. Second, the gap between marketing language and technical reality: "AI-powered" is a sales phrase, not an eligibility statement, and reviewers know it. Third, the outcome-unknown test is genuinely harder to satisfy now that powerful models are off-the-shelf, because a competent professional can do far more without experimenting than they could five years ago.

None of this means AI work is hard to claim. It means the claim has to be framed honestly and evidenced well.

Documenting an AI claim well

AI development produces excellent contemporaneous evidence almost for free, if you keep it. Experiment trackers, model cards, evaluation runs, and the dated discussion around failed approaches are exactly the records that prove the outcome was uncertain. The failures matter most: a tracker full of approaches that did not work is strong proof that a competent professional could not have known the result in advance.
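
One lightweight way to keep the failed approaches alongside the successful ones is an append-only log that timestamps each attempt when it is recorded. The sketch below assumes a JSON Lines file; the helper names `log_attempt` and `failed_attempts`, the field names, and the sample entries are all hypothetical. It shows a record-keeping habit and makes no claim about what a reviewer will accept as evidence.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def log_attempt(log_path: Path, approach: str, outcome: str, worked: bool) -> dict:
    """Append one timestamped attempt to a JSON Lines file and return the entry."""
    entry = {
        "recorded_at": datetime.now(timezone.utc).isoformat(),  # time of writing, in UTC
        "approach": approach,
        "outcome": outcome,
        "worked": worked,
    }
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
    return entry


def failed_attempts(log_path: Path) -> list[dict]:
    """Read the log back and keep only the approaches that did not work."""
    with log_path.open(encoding="utf-8") as f:
        entries = [json.loads(line) for line in f if line.strip()]
    return [e for e in entries if not e["worked"]]


# Illustrative usage in a temporary directory; the entries are made up.
with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "attempts.jsonl"
    log_attempt(log, "illustrative approach A", "did not converge", worked=False)
    log_attempt(log, "illustrative approach B", "met the target metric", worked=True)
    failures = failed_attempts(log)

print([f["approach"] for f in failures])
```

Opening the file in append mode means earlier entries are never rewritten by this code, so the failed attempts stay in the log next to the one that eventually worked.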

The discipline is the same as for any R&D claim: capture the experimental story as the work happens. See how to document a defensible claim for the evidence hierarchy and what fails in a review.

The 2028 angle

From 1 July 2028 the program narrows toward core experimental activity and removes supporting activities, with higher rates on core. For AI teams, this rewards a sharp account of the experimental core and penalises claims padded with integration and data-preparation work framed as supporting. Read the 2026-27 Budget changes for the full picture, and start tightening how you describe your AI experiments now.

The R&D Tax Incentive is a self-assessment program. Rand helps you prepare and document a defensible claim, but your company and its directors remain responsible for its accuracy, and you should seek advice specific to your circumstances.
