tisdagen den 9:e juni 2009

What's in a forall?

Look at the following module:

module Foo where
type Foo = Int
Is this module syntactically correct? I doubt anyone would say no to that, it's just a standard type synonym declaration, nothing strange. Fully Haskell 98 compliant too.

What about this module then:
module Foo where
type Foo = Bar
Is it syntactically correct? Sure it is, the fact that Bar isn't in scope is completely irrelevant where syntax is concerned. Loading it into GHCi would give a
Not in scope: type constructor or class `Bar'
error, which comes from the name resolver/type inference stage, not from the parsing stage. There's a clear cut line between a syntactic error and a semantic error here.

Let's look at yet another module:
module Foo where
type Foo = forall a . a -> a
Is it syntactically correct? Obviously that depends on your angle - it isn't Haskell 98 due to the explicit forall, but GHC would accept it readily with the proper extension enabled, namely Rank2Types or RankNTypes. But does that mean that this is only syntactically correct when one of those extensions are enabled? Let's see.

With no extension flags given, GHCi refuses the module with the error message
Illegal operator `.' in type `forall a . (a -> a)'
It doesn't recognize the explicit quantification syntax, and instead thinks that . is a type-level operator, which isn't allowed since I don't have TypeOperators on. Just for fun, trying it with TypeOperators enabled makes GHCi complain that the names forall and a are not in scope. I should add that in both these cases, GHCi actually also suggests that I might be looking for something like Rank2Types, since it cleverly recognizes the forall word. At any rate, with no extension given or with only TypeOperators it is clear that we don't get the syntax we're after.

With either Rank2Types or RankNTypes enabled, GHCi obviously correctly recognizes the forall as a quantifier and will accept the program. But those two flags are not the only ones that enable the forall syntax. In fact there are three other flags that do - ExistentialQuantification, LiberalTypeSynonyms and PolymorphicComponents. If we try to load our module into GHCi with either of these enabled, we get a different error message:
Illegal polymorphic or qualified type: 
forall a. a -> a
Apparently GHCi recognized and accepted the syntax of the type declaration, but the type checker balked at it because we haven't enabled sufficiently liberal types. But this is not a syntactic issue, but something else entirely, at least where GHCi is concerned.

I don't fully agree with GHCi here though, because of ExistentialQuantification. For the two other cases - LiberalTypeSynonyms and PolymorphicComponents - I think it's perfectly fine that forall-quantified types are made generally available syntactically, even if the type checker will exclude some programs that try to use them in the wrong ways. Both those extention deal with how and where to allow forall-quantified types, the first as arguments to type constructors and the second as arguments to data constructors. But for ExistentialQuantification, the category of syntactically allowed programs is actually different. What ExistentialQuantification allows us is to write forall quantifiers that precede constructors in data type declarations, e.g:
module Bar where
data Bar = forall a . Bar a
Indeed when I try to run this with no extentions enabled, GHCi complains that forall is not a viable data constructor and suggests that I use ExistentialQuantification. I would definitely consider it a syntactic issue to allow the forall keyword to precede data constructor declarations like this, and hence I don't agree with GHCi.

On a side note, something funny happens when I turn on e.g. PolymorphicComponents which enables forall as a keyword. Then GHCi complains that
Data constructor `Bar' has existential 
type variables, or a context
and suggests I use ExistentialQuantification or GADTs to allow this. Indeed, GADTs are a generalisation of ExistentialQuantification, except that it doesn't enable forall as a keyword. So using any extension that enables the forall keyword, plus GADTs, or just ExistentialQuantification, will make GHCi accept the program.

Going back to types, what about the following module:
module Bar where
import Foo (Foo)
type Bar = (Foo -> Int) -> Int
Syntactically correct? Absolutely, there's nothing strange here at all, fully Haskell 98 compliant. Will GHCi accept it? That depends on Foo. If the type Foo is defined as we did earlier as an explicitly polymorphic type, then Bar is a rank-3 type, which requires the RankNTypes extension. The correctness of Bar doesn't depend on module Bar alone, but on its dependent module Foo. Even if I enable Rank2Types for Bar (and Foo of course, which needs it), I still get an error since this type is actually rank 3. And in fact, if I inline Foo in the definition of Bar to instead write
module Bar where
type Bar = ((forall a . a -> a) -> Int) -> Int
GHCi gives me the exact same error message (except of course it mentions the explicit polymorphic type instead of the equivalent Foo):
Illegal polymorphic or qualified type:
forall a. a -> a
So clearly this last program is also syntactically correct, as long as the forall keyword is enabled, just like the simple one that imported Foo. Neither is type correct without RankNTypes, but both are syntactically correct, assuming explicit quantification is enabled at all.

What I'm trying to get at here with all of this is that it's not easy to draw a line between what constitutes a syntactic error and what is rather a semantic error when it comes to forall-quantified types. There are a number of different stances to take here, and I need to pick one for haskell-src-exts to implement.

The core issue is this: Is there a general syntactic difference between programs accepted with Rank2Types and programs accepted with RankNTypes? It is certainly possible to syntactically detect some programs that would require the full RankNTypes, by inspection of the types used. If a type has a higher rank than allowed, and this can be seen right in the syntax by inspecting the arguments to the function arrows, then it would be possible to rule the program out. But as shown by the import example above, there is no way we can syntactically see the difference between these classes of programs in general, we would need to do actual type analysis (and import chasing) for that. Fundamentally it's simply not a syntactic issue.

The question then becomes, to what extent do we bother to even try to detect the difference for those cases where it is possible? I don't think we should, at all. There may not be a clean cut separation between syntactic and type/semantic issues at all times, but I feel we are best off trying to adhere to what separation we can, and making it as clear cut as possible. The principle of least surprise suggests to me that it would be better to not try to be clever, and instead document clearly what haskell-src-exts considers actual syntax and what is left for other analyses.

The stance I intend to take is thus the following:
  • Allow forall-quantified types in all shapes and forms as long as one of Rank2Types, RankNTypes, LiberalTypeSynonyms and PolymorphicComponents is enabled.

  • Allow forall-quantified data constructor declarations (using normal non-GADT-style syntax) if ExistentialQuantification is enabled. GADTs would not lexically enable the forall keyword, and even when coupled with another extension that does I don't think it makes sense to let it enable this syntax.

  • Don't make any other difference between any of these classes of programs in what is allowed and what isn't.
If you think I am making the wrong choice here, please speak up, I would love to get input on this. And of course, please speak up if you agree with me as well. :-)

4 kommentarer:

  1. I agree with you totally on this - anything else would be way too confusing.

  2. Niklas. I think your decisions are the right ones. But I also think that you pay way too much attention to what GHC does. The extensions in GHC is a hodgepodge of stuff accumulated over the years and does not always have a coherent design. I don't mean that as a critisism agains GHC though, it's just the way things have turned out. But it also means that you should use your own head rather than GHC as the main guide. If you do that, I don't think it will be particularly difficult to decide what should go into the parser and what should be left to the type checker.

  3. Josef, you're right, of course. But some of these extensions *only* appear in GHC, and I have only the GHC behavior and sparse documentation to use to figure out what they do at all. Hence I must go via GHC to build up an intuition before I can start using my own head. :-)

  4. I think I agree. Why does LiberalTypeSynonyms enable the forall keyword though (although I guess I understand why)? I wish there was a plain old ExplicitForall extension that enabled the keyword in types (without extending the type checker -- only like (id :: forall a. a -> a) would be allowed)

    Also don't forget ScopedTypeVariables or whatever it's called nowadays that makes *those* explicit foralls have an effect! (Also, does it, also, enable forall syntax in types?)