
IDEA: STRATA – A Systematic Framework for Choosing the Right Level of Abstraction in Simulations

The Challenge of Abstraction

Ever wondered which level of abstraction applies best to your simulation?

Imagine a brain. Simulating it in full would take more processing power than we currently have. A natural fallback is to simulate only part of a brain, or something smaller still, say a single neuron. But what conclusions can you draw from a single neuron when your question revolves around consciousness or awareness? One neuron may not be enough to tell you anything of relevance, one whole brain might not provide the answer either (not that I have one on this topic anyway), and in any case we lack the capability to simulate it. So scientists face a difficult choice that has so far rested on best-guess scenarios, supported, of course, by the scientific method.

 

But what if you could use a repeatable meta-model for finding the optimal level of abstraction?

 

Here is STRATA (Scoring Tool for Rational Abstraction Tradeoff Analysis)

 

STRATA computes a score for each candidate abstraction level A_i as a weighted sum over the criteria:

Score(A_i) = Σ_k w_k · Score_ik,  k = 1 … n

A_i: a candidate abstraction level relevant for your field (biology, sociology, AI, etc.)

k: index of the criterion

n: number of criteria (nine in STRATA, developed below)

w_k: the weighting parameters for your simulation, summing to 1

Score_ik: the score you assign to level A_i on criterion k
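In code, this weighted sum is essentially a one-liner. A minimal sketch (the function and variable names are my own, not part of STRATA):

```python
def strata_score(weights, scores):
    """Weighted STRATA score for one abstraction level A_i.

    weights: the w_k weighting parameters (should sum to 1)
    scores:  the Score_ik values on a 1-5 scale, in the same order
    """
    if len(weights) != len(scores):
        raise ValueError("one weight per criterion required")
    return sum(w * s for w, s in zip(weights, scores))

# Toy usage with three criteria weighted 0.5 / 0.3 / 0.2:
print(round(strata_score([0.5, 0.3, 0.2], [4, 2, 5]), 2))  # 3.6
```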

 

The appeal of this tool is its domain independence: you can use it across all disciplines, and it documents how you arrived at your level of abstraction. It balances complexity, accuracy, explainability and actionability, a balance that is much needed in every field.

 

Why Existing Tools Fall Short

Existing frameworks such as the bias-variance trade-off, Occam's Razor, Minimum Description Length (MDL) and Kolmogorov complexity only help to choose model complexity, not the abstraction level directly: they guide model selection rather than abstraction granularity.

Multi-level or multi-scale models in physics, systems biology, economics and climate science recognize that systems operate on nested, interacting levels, yet they often lack formal criteria for choosing the "right" level of abstraction.

A qualitative approach has been developed by Carl Craver, who argued that a good explanation maps the relevant mechanisms at the right organizational level. Yet it offers no quantitative framework that can be reproduced.

Lastly, there are machine learning heuristics such as the Information Bottleneck method of Tishby et al., which compresses data so as to preserve maximal relevant information about the output. This already comes very close to what I want to achieve with STRATA.

 

Introducing STRATA and Its Nine Criteria

Each criterion is scored from 1 (low) to 5 (high):

Explanatory Power – Captures the mechanisms behind the phenomenon; understanding often matters more than prediction. (1 = correlational, 5 = causal/generative)

Predictive Accuracy – Reliability of forecasting outcomes; the more accurate the better, e.g. for chemical processes or predictive maintenance. (1 = unreliable, 5 = generalizable & precise)

Information Density – Relevant information per unit of complexity; models should be as simple as possible yet as rich as necessary. (1 = redundant, 5 = compressed & rich)

Intervenability – Potential for effective manipulation; in the end we want to steer systems actively. (1 = ineffective, 5 = actionable & causal)

Robustness – Model stability across contexts; robust abstractions reveal structure that persists across time, space and variation. (1 = narrow use, 5 = broad applicability)

Computational Efficiency – Resource cost to simulate; efficiency often determines whether an abstraction is practically usable. (1 = intractable, 5 = easily computable)

Model Complexity – Structural simplicity; complex models are harder to validate, maintain and interpret, so lower complexity is usually preferable. (1 = unwieldy, 5 = elegant & minimal)

Interpretability – Human understanding of the model; knowing how a system works is just as important as what it outputs. (1 = opaque, 5 = intuitive & explainable)

Domain Relevance – Fit with field practices; even a mathematically elegant abstraction must resonate with real-world practice and be usable by domain experts. (1 = unused or irrelevant, 5 = widely accepted)

 

Each of these criteria is then combined with a weighting scheme (especially if you have priorities), e.g. with weights from 0 to 1. If your field or simulation has no given weights, you can instead use Pareto dominance or a Pareto optimum.
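Pareto dominance can be checked without any weights at all: level A dominates level B if A scores at least as well on every criterion and strictly better on at least one. A minimal sketch (the three-criterion score vectors are illustrative):

```python
def dominates(a, b):
    """True if score vector a Pareto-dominates b (higher is better)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

levels = {"A1": [3, 4, 2], "A2": [4, 4, 4], "A3": [5, 4, 5]}

# Keep only levels that no other level dominates (the Pareto front)
front = [name for name, v in levels.items()
         if not any(dominates(w, v) for other, w in levels.items() if other != name)]
print(front)  # ['A3']
```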

A Bayesian decision network (BDN) can also be used to estimate the most probable optimal level under uncertainty.
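A full Bayesian decision network is beyond the scope of this post, but its core idea, checking whether the ranking survives uncertainty in the weights, can be approximated with a simple Monte Carlo perturbation. This is a sketch, not a BDN implementation; the jitter of 0.03 is an arbitrary choice, and the scores are those used in the brain example below:

```python
import random

random.seed(0)

base_weights = [0.2, 0.15, 0.1, 0.15, 0.1, 0.1, 0.05, 0.1, 0.05]
scores = {"A1": [3, 4, 2, 2, 2, 1, 2, 1, 5],
          "A2": [4, 4, 4, 3, 4, 3, 3, 4, 5],
          "A3": [5, 4, 5, 5, 4, 4, 4, 5, 5]}

wins = {name: 0 for name in scores}
for _ in range(10_000):
    # Jitter each weight, keep it positive, then renormalize to sum 1
    w = [max(1e-9, wk + random.gauss(0, 0.03)) for wk in base_weights]
    total = sum(w)
    w = [wk / total for wk in w]
    best = max(scores, key=lambda n: sum(wk * s for wk, s in zip(w, scores[n])))
    wins[best] += 1

for name, n in wins.items():
    print(name, n / 10_000)  # fraction of draws this level wins
# A3 wins every draw here, since it dominates the other levels
```

If the winner flips across draws, the choice of abstraction level is sensitive to the weights and deserves closer scrutiny.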

 

Ok, let's walk through an example: simulating the brain.

You want to find out at which level you should simulate your biological system for the best outcome, while avoiding under- or overfitting your model. So we compare these three levels of abstraction for our case:

A1: Molecular or

A2: Cellular or

A3: Functional Network

 

Next, we assign weights to our criteria to establish what really matters for our simulation; the weights must of course sum to 1:

Criterion             Weight w_k
Explanatory power     0.20
Predictive power      0.15
Information density   0.10
Intervenability       0.15
Generalizability      0.10
Computational cost    0.10
Model complexity      0.05
Interpretability      0.10
Domain relevance      0.05
Sum                   1.00

 

Then we assign the scores according to our understanding, our goal and field standards:

Criterion             w_k    A1 (Molecular)   A2 (Cellular)   A3 (Functional Network)
Explanatory power     0.20   3                4               5
Predictive power      0.15   4                4               4
Information density   0.10   2                4               5
Intervenability       0.15   2                3               5
Generalizability      0.10   2                4               4
Computational cost    0.10   1                3               4
Model complexity      0.05   2                3               4
Interpretability      0.10   1                4               5
Domain relevance      0.05   5                5               5

Afterwards we multiply each score by its weight and sum per column, which in our case results in:

A1 (Molecular): 2.45
A2 (Cellular): 3.75
A3 (Functional Network): 4.60
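The whole calculation can be reproduced in a few lines of Python. A minimal sketch (the dictionary layout is my own; the weights and scores are taken from the tables above):

```python
weights = {
    "explanatory power": 0.2, "predictive power": 0.15,
    "information density": 0.1, "intervenability": 0.15,
    "generalizability": 0.1, "computational cost": 0.1,
    "model complexity": 0.05, "interpretability": 0.1,
    "domain relevance": 0.05,
}
scores = {
    "A1 (Molecular)":          [3, 4, 2, 2, 2, 1, 2, 1, 5],
    "A2 (Cellular)":           [4, 4, 4, 3, 4, 3, 3, 4, 5],
    "A3 (Functional Network)": [5, 4, 5, 5, 4, 4, 4, 5, 5],
}

w = list(weights.values())
# Weighted sum per level, rounded to 2 decimals
totals = {level: round(sum(wk * s for wk, s in zip(w, vals)), 2)
          for level, vals in scores.items()}
best = max(totals, key=totals.get)

print(totals)  # weighted sums per level
print(best)    # → A3 (Functional Network)
```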

 

So, according to the assumptions we have made, the best way to simulate this biological system, e.g. the brain or parts of it, is the Functional Network level of abstraction.

 

As we've seen, choosing the right abstraction level in any domain is rarely trivial. The STRATA framework (Scoring Tool for Rational Abstraction Tradeoff Analysis) offers a robust guide:

 

  1. Define candidate abstraction levels
  2. Evaluate them across nine criteria: explanatory power, predictive accuracy, compression, actionable interventions, generalizability, computational feasibility, model complexity, interpretability, and domain alignment
  3. Score and compare, using Pareto analysis or Bayesian decision tools where appropriate

Consider again the example from neuroscience, this time modeling working memory. A molecular-level model (receptor kinetics, intracellular signaling) may score high on precision but fail on interpretability and efficiency. A network-level model captures the functional loops (e.g. prefrontal cortex circuits), scores high on intervenability and comprehension, and often aligns best with experimental and clinical practice. This reflects principles seen in the Frontiers literature on functional brain networks and the importance of abstraction levels in model design.

 

The science of abstraction is at a tipping point. As systems grow ever more complex, the absence of systematic methods for choosing abstraction levels becomes a bottleneck in itself. STRATA proposes to fill that gap, offering:

·        Transparency: Clear, weighted scoring across nine foundational dimensions

·        Tractability: Pareto and Bayesian tools for objective comparison

·        Universality: Applicable from molecules to macro-systems, from AI to social science

 

As computational modeling continues to expand across domains, from neuroscience to social systems to artificial intelligence, abstraction is becoming a science in its own right. STRATA offers a repeatable, explainable and cross-domain method for evaluating abstraction levels, making model-building not only smarter, but more intentional. The future of simulation isn’t about maximizing detail; it’s about choosing the right detail and tradeoff.

 

by mario