Expression problem

Data abstraction problem in programming languages


title: "Expression problem" type: doc version: 1 created: 2026-02-28 author: "Wikipedia contributors" status: active scope: public tags: ["programming-language-design", "unsolved-problems-in-computer-science"] description: "Data abstraction problem in programming languages" topic_path: "technology/computing" source: "https://en.wikipedia.org/wiki/Expression_problem" license: "CC BY-SA 4.0" wikipedia_page_id: 0 wikipedia_revision_id: 0

::summary Data abstraction problem in programming languages ::

The expression problem is a problem in programming languages that concerns the extensibility and modularity of statically typed data abstractions. The goal is to define a data abstraction that is extensible both in its representations and its behaviors, where one can add new representations and new behaviors to the data abstraction, without recompiling existing code, and while retaining static type safety (e.g., no casts). The statement of the problem exposes deficiencies in programming paradigms and programming languages. Philip Wadler, one of the co-authors of Haskell, has originated the term.

History

Philip Wadler formulated the challenge and named it "The Expression Problem"{{cite web | title=The Expression Problem | url=http://homepages.inf.ed.ac.uk/wadler/papers/expression/expression.txt He also cited three sources that defined the context for his challenge:

The problem was first observed by John Reynolds in 1975. | chapter= User-defined Types and Procedural Data Structures as complementary approaches to Data Abstraction. | title= New Directions in Algorithmic Languages | year= 1975 | last1= Reynolds | first1= John C. | publisher= IFIP Working Group 2.1 on Algol | pages= 157–168 | url= https://www.cs.tufts.edu/~nr/cs257/archive/john-reynolds/procedural-data-structures.pdf Reynolds discussed two forms of Data Abstraction: User-defined Types, which are now known as Abstract Data Types (ADTs) (not to be confused with Algebraic Data Types), and Procedural Data Structures, which are now understood as a primitive form of Objects with only one method. He argued that they are complementary, in that User-defined Types could be extended with new behaviors, and Procedural Data Structures could be extended with new representations. He also discussed related work going back to 1967. Fifteen years later in 1990, William Cook | last= Cook |first= William | author-link= William Cook (computer scientist) | date= 1990 | editor-last1= Bakker | editor-first1= J.W. de | editor-last2= Roever | editor-first2= W.P. de | editor-last3= Rozenberg | editor-first3= G. | title= Foundations of Object-Oriented Languages (FOOL), REX School/Workshop | location= Noordwijkerhout, The Netherlands | publisher= Springer Berlin Heidelberg |pages=151–178 | chapter= Object-Oriented Programming versus Abstract Data Types |series= Lecture Notes in Computer Science |volume= 489 |doi= 10.1007/BFb0019443 | chapter-url= https://link.springer.com/chapter/10.1007/BFb0019443 | isbn= 978-3-540-46450-1}} applied Reynold's idea in the context of Objects and Abstract Data Types, which had both grown extensively. Cook identified the matrix of representations and behaviors that are implicit in a Data Abstraction, and discussed how ADTs are based on the behavioral axis, while Objects are based on the representation axis. He provides extensive discussion of work on ADTs and Objects that are relevant to the problem. He also reviewed implementations in both styles, discussed extensibility in both directions, and also identified the importance of static typing. Most importantly, he discussed situations in which there was more flexibility than Reynolds considered, including internalization and optimization of methods.

At ECOOP '98, Shriram Krishnamurthi et al. | title=Synthesizing Object-Oriented and Functional Design to Promote Re-Use | url=https://cs.brown.edu/~sk/Publications/Papers/Published/kff-synth-fp-oo/ | title= Modular object-oriented programming with units and mixins | date=1999 | doi=10.1145/291251.289432 | last1=Findler | first1=Robert Bruce | last2=Flatt | first2=Matthew | journal=ACM SIGPLAN Notices | volume=34 | pages=94–104 | doi-access=free }} via a rediscovery of mixins.{{cite thesis |type= PhD | last= Cook | first= William | date= 1989 | title= A Denotational Semantics of Inheritance | publisher= Brown University | url= https://www.cs.utexas.edu/~wcook/papers/thesis/cook89.pdf }} | chapter= Classes and Mixins | doi= 10.1145/268946.268961 | title= Proceedings of the 25th ACM SIGPLAN-SIGACT symposium on Principles of programming languages - POPL '98 | pages= 171–183 | year= 1998 | last1= Flatt | first1= Matthew | last2= Krishnamurthi | first2= Shriram | last3= Felleisen | first3= Matthias | isbn= 978-0897919791 | s2cid= 5815257

Others co-discovered variants of the expression problem around the same time as Rice University's PLT, in particular Thomas Kühne in his dissertation, and Smaragdakis and Batory in a parallel ECOOP 98 article.

Some follow-up work used the expression problem to showcase the power of programming language designs. | chapter=Extensible Algebraic Datatypes with Defaults | first1=Matthias | last1=Zenger | first2=Martin | last2=Odersky | title=Proceedings of the sixth ACM SIGPLAN international conference on Functional programming | pages=241–252 | citeseerx=10.1.1.28.6778 | year=2001 | doi=10.1145/507635.507665 | isbn=1-58113-415-0 }} | chapter=Independently extensible solutions to the expression problem | chapter-url=https://homepages.inf.ed.ac.uk/wadler/fool/program/final/10/10_Paper.pdf | first1=Matthias | last1=Zenger | first2=Martin | last2=Odersky | title=FOOL 2005 | publisher=ACM | citeseerx=10.1.1.107.4449 | year=2005

The expression problem is also a fundamental problem in multi-dimensional Software Product Line design and in particular as an application or special case of FOSD Program Cubes.

Solutions

There are various solutions to the expression problem. Each solution varies in the amount of code a user must write to implement them, and the language features they require.

  • Multiple dispatch
  • Coproducts of functors{{cite journal |author = Wouter Swierstra |title = Data Types à La Carte |journal = Journal of Functional Programming |volume = 18 |number = 4 |year = 2008 |issn = 0956-7968 |pages = 423–436 |publisher = Cambridge University Press |doi = 10.1017/S0956796808006758 |s2cid = 21038598 |doi-access = free
  • JavaGI type classes
  • Tagless-final Finally tagless, partially evaluated: Tagless staged interpreters for simpler typed languages" / Object algebras
  • Polymorphic Variants

Example

Problem description

We can imagine we do not have the source code for the following library, written in C#, which we wish to extend:

::code[lang=c#] interface IEvalExp { int Eval(); }

class Lit : IEvalExp { internal Lit(int n) { N = n; }

internal int N { get; }

public int Eval()
{
    return N;
}

}

class Add : IEvalExp { internal Add(IEvalExp left, IEvalExp right) { Left = left; Right = right; }

internal IEvalExp Left { get; }

internal IEvalExp Right { get; }

public int Eval()
{
    return Left.Eval() + Right.Eval();
}

}

static class ExampleOne { static IEvalExp AddOneAndTwo() => new Add(new Lit(1), new Lit(2)); static int EvaluateTheSumOfOneAndTwo() => AddOneAndTwo().Eval(); } ::

Using this library we can express the arithmetic expression as we did in and can evaluate the expression by calling . Now imagine that we wish to extend this library, adding a new type is easy because we are working with an Object-oriented programming language. For example, we might create the following class:

::code[lang=c#] class Mult : IEvalExp { internal Mult(IEvalExp left, IEvalExp right) { Left = left; Right = right; }

internal IEvalExp Left { get; }

internal IEvalExp Right { get; }

public int Eval()
{
    return Left.Eval() * Right.Eval();
}

} ::

However, if we wish to add a new function over the type (a new method in C# terminology), for example to pretty print an expression, we have to change the interface and then modify all the classes that implement the interface. Another possibility is to create a new interface that extends the interface and then create sub-types for , and classes, but the expression returned in has already been compiled so we will not be able to use the new function over the old type. The problem is reversed in functional programming languages like F# where it is easy to add a function over a given type, but extending or adding types is difficult.

Problem solution using object algebra

Let us redesign the original library with extensibility in mind using the ideas from the paper Extensibility for the Masses.

::code[lang=c#] interface ExpAlgebra { T Lit(int n); T Add(T left, T right); }

class ExpFactory : ExpAlgebra { public IEvalExp Lit(int n) { return new Lit(n); }

public IEvalExp Add(IEvalExp left, IEvalExp right)
{
    return new Add(left, right);
}

}

static class ExampleTwo { public static T AddOneToTwo(ExpAlgebra ae) => ae.Add(ae.Lit(1), ae.Lit(2)); } ::

We use the same implementation as in the first code example but now add a new interface containing the functions over the type as well as a factory for the algebra. Notice that we now generate the expression in using the interface instead of directly from the types. We can now add a function by extending the interface, we will add functionality to print the expression:

::code[lang=c#] interface IPrintExp : IEvalExp { string Print(); }

class PrintableLit : Lit, IPrintExp { internal PrintableLit(int n) : base(n) { N = n; }

internal int N { get; }

public string Print()
{
    return N.ToString();
}

}

class PrintableAdd : Add, IPrintExp { internal PrintableAdd(IPrintExp left, IPrintExp right) : base(left, right) { Left = left; Right = right; }

internal new IPrintExp Left { get; }

internal new IPrintExp Right { get; }

public string Print()
{
    return Left.Print() + " + " + Right.Print();
}

}

class PrintFactory : ExpFactory, ExpAlgebra { public IPrintExp Add(IPrintExp left, IPrintExp right) { return new PrintableAdd(left, right); }

public new IPrintExp Lit(int n)
{
    return new PrintableLit(n);
}

}

static class ExampleThree { internal static int Evaluate() => ExampleTwo.AddOneToTwo(new PrintFactory()).Eval(); internal static string Print() => ExampleTwo.AddOneToTwo(new PrintFactory()).Print(); } ::

Notice that in we are printing an expression that was already compiled in , we did not need to modify any existing code. Notice also that this is still strongly typed, we do not need reflection or casting. If we would replace the with the in the we would get a compilation error since the method does not exist in that context.

References

References

  1. (November 1995). "Type Checking and Modules for Multi-Methods". ACM Transactions on Programming Languages and Systems.
  2. (2000). "Proceedings of the 15th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications".
  3. (July 2011). "JavaGI: The Interaction of Type Classes with Interfaces and Inheritance". ACM Transactions on Programming Languages and Systems.
  4. (2009). "Finally Tagless, Partially Evaluated: Tagless Staged Interpreters for Simpler Typed Languages". J. Funct. Program..
  5. (2012). "Extensibility for the Masses: Practical Extensibility with Object Algebras". Ecoop '12.
  6. (2000). "Workshop on Foundations of Software Engineering. Sasaguri, Japan, November 2000".

::callout[type=info title="Wikipedia Source"] This article was imported from Wikipedia and is available under the Creative Commons Attribution-ShareAlike 4.0 License. Content has been adapted to SurfDoc format. Original contributors can be found on the article history page. ::

programming-language-designunsolved-problems-in-computer-science