Ambiguity in spoken and programming languages

Notes on ambiguity and its effects in spoken and programming languages.

This post touches on:

PS: I’m not a linguist and all my findings are empirical, so feel free to point out my mistakes…

Ambiguity

Let’s define ‘ambiguity’:

the quality of being open to more than one interpretation.

Ambiguity: in spoken languages

There are many kinds of ambiguity, but this post focuses on structural ambiguity (aka syntactic ambiguity).

Here’s an example:

he gave her cat food.

The above sentence could be interpreted in at least 2 ways:

  1. he gave her (cat food)
  2. he gave (her cat) food

(Parentheses are used here to disambiguate the structure, similar to how programming languages use them.) English provides syntactic means to disambiguate, so the sentence can become:

  1. he gave her cat-food
  2. he gave, her cat, food

or restructure:

  1. he gave food to her cat
  2. he fed her cat

More on spoken language ambiguity

We’ll never have a completely unambiguous language because attempts to fix ambiguity result in more ambiguity:

  1. Why you need to be using Oxford commas
  2. How Oxford comma is creating ambiguity
  3. Court Rules the Oxford Comma Necessary

And likely ambiguity is here to stay: ambiguity is a good thing.

Ambiguity in PL syntax

To warm up, what’s the value of the expression:

1 or 2 and 3

It’ll likely take a second to disambiguate the expression before evaluating it:

  1. is it (1 or 2) and 3, or
  2. 1 or (2 and 3)
  3. and I’ve not specified the language…

Syntax ambiguity worsens readability and understanding since it requires more cognitive effort to decipher the intent.
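To see that the choice of grouping matters, here is a minimal sketch in Ruby (picked only because the next section uses Ruby's grammar). The method names left_grouping and right_grouping are made up for illustration; the parentheses make each reading explicit, so no precedence rule is involved yet.

```ruby
# Reading 1: (a or b) and c
def left_grouping(a, b, c)
  (a or b) and c
end

# Reading 2: a or (b and c)
def right_grouping(a, b, c)
  a or (b and c)
end

puts left_grouping(1, 2, 3)   # prints 3
puts right_grouping(1, 2, 3)  # prints 1
```

Same tokens, two structures, two different values.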

Disambiguation in PL syntax

Disambiguation is done with special rules encoded in the grammar. Here’s a snippet of the Ruby grammar:

%left  keyword_or keyword_and
...
%right keyword_not
...
%left  tOROP
%left  tANDOP

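There are 2 main rules:

  1. %left and %right set associativity, and operators declared on the same line share one precedence level
  2. lines declared later have higher precedence

So given the rules:

  1. or and and sit on the same line, so they have equal precedence
  2. both are left-associative

the expression 1 or 2 and 3 is disambiguated as (1 or 2) and 3, which yields 3.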

Another example (that looks very similar):

1 || 2 && 3

Given the rules, && (tANDOP) is declared after || (tOROP) and therefore binds tighter, so the expression is disambiguated as 1 || (2 && 3), which yields 1.
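The way such declarations resolve a flat expression can be sketched with a tiny precedence-climbing routine. This is not how Ruby's generated parser works internally; it is only an illustration. The PRECEDENCE table and the group helper are hypothetical, and the numeric levels are a simplification that keeps only the four operators from the snippet (the real grammar has many more levels in between).

```ruby
# Precedence table mirroring the grammar snippet: a higher number binds tighter.
# All four operators are declared with %left, so all are left-associative.
PRECEDENCE = {
  "or" => 1, "and" => 1, # %left keyword_or keyword_and (same line, same level)
  "||" => 2,             # %left tOROP
  "&&" => 3              # %left tANDOP (declared later, binds tighter)
}.freeze

# Hypothetical helper: turn a flat token list into a fully parenthesized string.
# Note: it consumes (mutates) the token array it is given.
def group(tokens, min_prec = 1)
  left = tokens.shift
  while (op = tokens.first) && PRECEDENCE.fetch(op) >= min_prec
    tokens.shift
    # Left-associative: the right operand may only absorb tighter operators.
    right = group(tokens, PRECEDENCE.fetch(op) + 1)
    left = "(#{left} #{op} #{right})"
  end
  left
end

puts group(%w[1 or 2 and 3])  # prints ((1 or 2) and 3)
puts group(%w[1 || 2 && 3])   # prints (1 || (2 && 3))
```

The two inputs differ only in spelling, yet the table alone decides which way each one is grouped.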

The main issue with disambiguation rules is that they’re implicit and not part of the syntax being read, which makes the syntax easy to misinterpret.
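A commonly cited Ruby pitfall shows how easy the misreading is. The sketch below relies on a rule that is not visible in the grammar snippet above: assignment binds tighter than the keyword and, but looser than &&. The method names with_keyword and with_symbol are made up for illustration.

```ruby
# Parsed as (x = a) and b: x receives a, and b is evaluated but never assigned.
def with_keyword(a, b)
  x = a and b
  x
end

# Parsed as y = (a && b): y receives the result of the whole && expression.
def with_symbol(a, b)
  y = a && b
  y
end

puts with_keyword(true, false).inspect  # prints true
puts with_symbol(true, false).inspect   # prints false
```

Nothing on the assignment lines hints at the difference; the reader has to already know the precedence rules.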

Deliberate ambiguity in PL

Yes, PL designers choose to have an ambiguous grammar for the sake of “readability”, which, in this case, stands for “reduced syntax” and doesn’t mean non-ambiguous or exact.

Some examples of ambiguity:

There are non-ambiguous syntaxes but, for some reason, they’re not in favour:

Cost of deliberate ambiguity in PL

One consequence of syntax ambiguity is that the same code can be written in countless ways, which is why style guides exist:

  1. Airbnb Ruby style guide
  2. rubocop-hq Ruby style guide
  3. GitHub’s Ruby style guide
  4. Google JS style guide

Go removes the formatting side of this problem with gofmt, which rewrites code into a single canonical layout.

Ambiguity is expensive: how many people-hours are spent on keeping code consistent? And all the yak-shaving? Tabs vs spaces? OMG…

Ambiguity vs Determinism elsewhere

Summary

So my personal take on language ambiguity:

References
