Wednesday, September 21, 2011

A Small Patch for Bizarre but User Controllable Limited Overloading



Yes, it is bizarre. Yes, it is limited. Yes, it is nothing new at all. But yes, it is simple and useful.

I have written a small patch to the OCaml compiler to provide SML-style limited overloading. Limited means that you cannot derive new overloadings from overloaded values. For example, in SML (+) is (limitedly) overloaded:
1 + 2;
1.2 + 3.4;
But you cannot define an overloaded double using (+):
(* In SML syntax *)
fun double x = x + x; (* It is not overloaded. *)
                      (* Defaulted to int -> int *)
The patch provides this "poor man's overloading" to OCaml, additionally with controllability: you can choose what is overloaded. And this is the most bizarre part of the patch.

Let's overload plus operators. First of all, list the overloaded instances:
module Plus = struct

  module Int = struct
    let (+) = (+)
  end

  module Float = struct
    let (+) = (+.)
  end

end
That's all. Simple. What I did here is just list the definitions of (+) for different types (int and float). Since one module cannot export more than one value with the same name, those (+) are defined in separate modules. I named them Int and Float, but you can use whatever names you like. The preparation is done.

Now, make those (+) overloaded. It is surprisingly simple:
open* Plus   (* Strange open with a star *)
Done. Now if you use (+), it is overloaded:
let _ = assert (1 + 2 = 3); assert (1.2 + 3.4 = 4.6);; (* It works! *)
What does open* Plus do? It traverses the signature of Plus recursively and opens all its sub-modules. So in this case it is equivalent to:
open Plus
open Plus.Int
open Plus.Float
but in ADDITION, if open* finds more than one definition with the same name, here (+), they are registered as overloaded. So open* Plus overloads the two (+)!

The overloading by open* can be repeated, and overloaded instances are easily accumulated. This provides simple and powerful "open world" overload control:
module Num = struct

  let (+) = Num.(+/)
  module Big_int = struct let (+) = Big_int.add_big_int end
  module Nat = struct let (+) = Nat.add_nat end
  module Ratio = struct let (+) = Ratio.add_ratio end

end

open* Plus (* overload (+) for int and float *)
open* Num  (* overload (+) for 4 additional num types! *)
open* Int32 (* defines (+) for int32, omitted *)
open* Int64 (* defines (+) for int64, trivial *)
open* Natint (* defines (+) for natint *)

(* Now you can use (+) for 9 num types! *)
Or more simply, once you define the following:
module SuperPlus = struct
  include Plus
  let (+) = Num.(+/)
  module Big_int = struct let (+) = Big_int.add_big_int end
  module Nat = struct let (+) = Nat.add_nat end
  module Ratio = struct let (+) = Ratio.add_ratio end
  include Int32
  include Int64
  include Natint
end
You can just say open* SuperPlus to enjoy the overloading in your current module.

It is limited.

The overloading is limited. Any local ambiguity is reported as a type error immediately. For example, let double x = x + x is rejected since there is not enough contextual type information to resolve the overloading of (+). No defaulting, no fancy truly polymorphic overloading.

Each overloaded use must be locally resolvable by itself. The following example has no real ambiguity: (+) is clearly int -> int -> int from its context, so the type of one could be fixed as int:
module One = struct
  module Int = struct let one = 1 end
  module Float = struct let one = 1.0 end
end

open* Plus
open* One

let _ = assert (one + 2 = 3)
But this propagation of overloading resolution is not implemented, for simplicity, and this example is rejected due to a false-positive ambiguity on one. You need an explicit type annotation.
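For example, annotating one gives the resolver enough local information (a minimal sketch; I am assuming the patch accepts an ordinary type annotation here):
open* Plus
open* One

let _ = assert ((one : int) + 2 = 3) (* the annotation fixes one to int, which also resolves (+) *)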

The source

The source is available from my μ-tated OCaml ranch, poormans_overloading branch: https://bitbucket.org/camlspotter/mutated_ocaml/src/f4aeda4f648a

Thursday, September 15, 2011

Redundant open module warning for OCaml


GHC has a warning for unused imports; such imports are just redundant and cause unexpected namespace contamination. The warning is useful for keeping your import list as minimal as possible.

OCaml's open has the same namespace contamination issue, and unnecessary opens should be warned about, too. So I have added a new warning for it.
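For example, the warning targets code like the following (a hypothetical snippet; see the patch itself for the exact warning number and message):
open List (* redundant: nothing from List is actually used below *)

let double x = x * 2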

You can obtain the latest diff for OCaml 3.12.1 from my repo at https://bitbucket.org/camlspotter/mutated_ocaml . After cloning, get it by:

hg diff -r ocaml-3.12.1-11110 -r redundant_open_warning
After patching, building is as usual:
make core coreboot world opt opt.opt install # Beware what you are doing!
I have found nearly 150 redundant opens in the OCaml source code itself! You should check your OCaml code with it, too!

P.S. The first patch contained some garbage. Now the above command creates a clean patch for OCaml 3.12.1.

Wednesday, May 25, 2011

Planck: A Small Parser Combinator Library for OCaml


I have released Planck, a small monadic parser combinator library for OCaml. It includes a proof-of-concept implementation: an OCaml syntax lexer and parser which create a 100% identical AST to the original OCaml lex+yacc. It is available at https://bitbucket.org/camlspotter/planck .

Planck is yet another Parsec clone. I heard Parsec described as a "monad for parsing" last year, and started writing Planck just from that phrase, trying to avoid learning what Parsec really is. Sometimes I need this kind of puzzle to enjoy solving. :-) Interestingly, the implementation of Planck became quite similar to Parsec in retrospect. Therefore, once the basic functionality was implemented, I imported some of Parsec's combinator names into Planck, such as <|>.

At first it was meant to be my April Fools' project for this year. Unfortunately, the earthquake hit Japan on 3/11. I, my family and my relatives are all OK, and Tokyo is safe. But the event surely slowed down the development.

Planck internal

Planck consists of two layers: one for streams (lazy lists), whose module names start with 'S', and one for parser combinators, whose module names start with 'P'.

The implementation is quite straightforward. Streams define lazy lists with positions and other attributes, and parser combinators provide lazy list manipulation over them. Combinators are monadic and easily composable using bind (>>=). For example, the following is an extract from the rule for hex literals like 0xdead_beef:

(* Using pa_monad's perform, aka Haskell's do *)

let zero = token '0'

let hex_char : char t =
  tokenp (function
    | '0' .. '9' | 'A' .. 'F' | 'a' .. 'f' -> true
    | _ -> false)
    <?> "hex"

let char_or_underscores : 'a t -> string t = fun f ->
  matched (perform
             ignore f;
             ?* (ignore f <|> underscore))

let hex_literal : string t = perform
  matched (perform
    zero;
    ignore (one_of ['x'; 'X']);
    ignore (char_or_underscores hex_char))
For efficiency, Sbuffer and Pbuffer provide (hopefully) fast handling of char streams and their parsing. The Smemo module provides memoization over stream elements, which is indispensable for an efficient parser with backtracking.

Planck also provides a module Op_prec for operator associativity and precedence. Therefore, it is ready for defining parsers of programming languages with rather complex syntax.

Experience report: Parsing OCaml source code with Planck

As a working example of Planck, I have written a lexer and parser for the OCaml syntax in Planck, and managed to produce an AST identical to the original ocamllex+ocamlyacc one! Here identical means exactly the same tree, with exactly the same position information, which is impossible for the current CamlP4 (it has some bugs around position handling...).

BTW, if you already have a complex lex+yacc syntax definition for your language, just stick to it. LL(n) parser combinators are great if you write a syntax from scratch. Porting lex+yacc to LL(n) is just error-prone and a waste of time.

Porting lex rules

The lexer part reads a char stream and produces a stream of lex tokens.

Porting the lexer definition lexer.mll to Planck was done by hand, since lexer.mll is not very big (just 302 lines). It was not a complicated task to port its pairs of a regexp and an action into Planck; the only thing I had to be careful about was the fact that Lex takes the longest match. You can see the source code in planck/ocaml/lex.ml .
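With combinators, the longest-match discipline has to be encoded by the ordering of alternatives. A minimal sketch in the style of the hex_literal extract above, where decimal_literal is a hypothetical sibling rule:
let number : string t =
  hex_literal         (* try the longer "0x..." form first... *)
  <!> decimal_literal (* ...otherwise "0xff" would lex as "0" followed by junk *)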

Porting yacc rules

The parser part mimics yacc behavior: it reads a stream of lex tokens created by the lexer above and returns a parsed AST.

Porting the yacc rules, parser.mly, was hard, since Yacc is bottom-up LALR(1) while Planck (like Parsec) is top-down LL(n).

I am not a parsing researcher, nor do I have any plan to get another PhD in parsing, so I just took a stupid and simple method with some brute force. If you know a nicer way to convert Yacc to LL(n) automagically, please tell me.

Eliminating left recursions

Unlike lexer.mll (302 lines), parser.mly is huge (1669 lines), and hand-porting those rules one by one had to be avoided. Therefore, I first wrote a small parser of .mly files in Planck, planck/ocaml/ocamlyacc.ml, and obtained the rules and actions. Then I analyzed the dependency graph of those rules to find the left recursions. Left recursions are trouble for Planck (and other LL parsers) and must be removed: http://en.wikipedia.org/wiki/Left_recursion .
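Schematically, a directly left-recursive Yacc rule such as
expr: expr PLUS term | term
is rewritten into a non-left-recursive "seed" followed by a repeatable "tail":
expr:      term expr_tail
expr_tail: PLUS term expr_tail | /* empty */
which is exactly the nonleftrec/leftrec split you will see in the extract below.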

Luckily enough, I found there are only 2 groups of mutual left recursion in the OCaml syntax rules. All the other left recursions are direct. Eliminating direct left recursion is simple, and the resulting code is still understandable to a human being. (It is also possible to eliminate mutual left recursion automatically... but the resulting code is usually incomprehensible.) I wrote a small function to eliminate direct left recursions automatically, then hand-converted the 2 mutually left-recursive groups. One was expr and expr_comma_list, and the other was pattern and pattern_comma_list. As the names suggest, they are almost the same and were easy to translate. As a result I got planck/ocaml/plparser.ml, a source file of nearly 5000 lines (ugh).

Here is an extract from plparser.ml, the translated rule core_type2:

method core_type2 = leftrec "core_type2" self#core_type2_nonleftrec self#core_type2_leftrec

method core_type2_nonleftrec = (
      case "core_type2_nonleftrec_0" (perform

         token QUESTION;
         v_2 <-- get_LIDENT;
         token COLON;
         (* v_4 <-- self#core_type2 ; *)
         v_4 <-- self#simple_core_type_or_tuple ; (* ??? *)
         token MINUSGREATER;
         v_6 <-- self#core_type2 ;

         return (fun () ->  mktyp(Ptyp_arrow("?" ^ v_2 ,
             {ptyp_desc = Ptyp_constr(Ldot (Lident "*predef*", "option"), [v_4]);
              ptyp_loc = v_4.ptyp_loc}, v_6)) ))

  <|> case "core_type2_nonleftrec_2" (perform

         v_1 <-- get_OPTLABEL;
         (* v_2 <-- self#core_type2 ; *) (* eats too much *)
         v_2 <-- self#simple_core_type_or_tuple ; (* ??? *)
         token MINUSGREATER;
         v_4 <-- self#core_type2 ;

         return (fun () ->  mktyp(Ptyp_arrow("?" ^ v_1 ,
             {ptyp_desc = Ptyp_constr(Ldot (Lident "*predef*", "option"), [v_2]);
              ptyp_loc = v_2.ptyp_loc}, v_4)) ))

  <|> case "core_type2_nonleftrec_1" (perform

         v_1 <-- get_LIDENT;
         token COLON;
         (* v_3 <-- self#core_type2 ; *)
         v_3 <-- self#simple_core_type_or_tuple ; (* ??? *)
         token MINUSGREATER;
         v_5 <-- self#core_type2 ;

         return (fun () ->  mktyp(Ptyp_arrow(v_1, v_3, v_5)) ))

  <!> case "core_type2_nonleftrec_3" (perform

         v_1 <-- self#simple_core_type_or_tuple ; (* may start with LIDENT *)

         return (fun () ->  v_1 ))

    )

method core_type2_leftrec v_1 = (
      case "core_type2_leftrec_0" (perform

         token MINUSGREATER;
         v_3 <-- self#core_type2 ;

         return (fun () ->  mktyp(Ptyp_arrow("", v_1, v_3)) ))

    )
It is hard to read at a glance, but the basic structure is simple. First, the rule core_type2 is left recursive in Yacc. Therefore the rule is disassembled into two sub-rules, core_type2_nonleftrec and core_type2_leftrec, for left recursion elimination. The function leftrec at core_type2 is responsible for assembling the results of these two rules back together. Rules are defined as methods and form a huge class; therefore you can extend the parsing rules by class inheritance, if you dare...
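For the curious, leftrec can be implemented as an ordinary seed-then-loop combinator. This is just my reconstruction of the idea, not Planck's actual code (the real one presumably also uses the name for memoization and tracing, and handles the thunked actions):
let leftrec _name nonleftrec leftrec_case =
  (* parse the non-left-recursive seed first *)
  nonleftrec >>= fun v ->
  (* then repeatedly extend it with the left-recursive continuation *)
  let rec loop v = (leftrec_case v >>= loop) <|> return v in
  loop v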

A sub-rule consists of cases, which correspond exactly to the left-recursion-eliminated Yacc cases. These cases are listed with the <|> or <!> operators. <|> is the same as Parsec's <|>. <!> is <|> with backtracking: x <!> y == try x <|> y in Parsec.

A case is a list of symbol parses using monadic binds. Here I use pa_monad's CamlP4 syntax sugar (perform and <--). For example, you can see that the first case of core_type2_nonleftrec is equivalent to ocamlyacc's case description:

QUESTION LIDENT COLON simple_core_type_or_tuple MINUSGREATER core_type2
The action part is in the monadic return. The code is just copy-and-pasted automatically by the conversion program. Functions used in the action parts are wrapped to provide the correct position information. That was... the hardest part. I had to use some bad-mannered side effects. But once it was done, the parser started to create the identical AST to the original's, nicely.

Backtracking and memoization

(Maybe Parsec does the same thing for backtracking, but I am not sure.)

Usually in LL parser combinators, backtracking (<!> in Planck, try in Parsec) must be avoided as much as possible, since it might perform tons of unnecessary computations many times over and lower the performance of your parser. Therefore you are advised to use <|>, the or-combinator without backtracking: if the left argument of <|> fails after consuming any token, <|> immediately fails, without trying its right argument. And to use <|> you have to modify your rules... by hand, normally.
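To illustrate, here is a sketch with a hypothetical string parser (not part of Planck's API, just for exposition):
let p = string "let" >>= fun _ -> string "rec" (* consumes "let" before looking for "rec" *)
let q = string "letx"

(* On input "letx", p <|> q fails: p consumed "let" and then failed, so q is never tried. *)
(* p <!> q succeeds: <!> rewinds the stream and tries q. *)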

Writing rules without backtracking is not very hard when you write the syntax from scratch. It is horribly boring, however, if you are porting an existing huge set of Yacc rules to parser combinators without backtracking. So I just gave up on backtrack elimination.

Instead, I introduced memoization. The results of parsing method calls are recorded in memos attached to each stream element. If the same method is called at the same stream element, the saved result is returned immediately, instead of being recalculated. Of course, memoization requires your method functions to be somewhat pure, but that is normally true for parsing. The memoization works pretty nicely, and I could get the results in "realistic" time. Without memoization... things often go exponential and you never get the results.
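A minimal sketch of the idea, assuming each stream element carries a memo table keyed by rule name (the representation, including memo_table_of, is my guess, not Smemo's actual interface; in particular, the real table must somehow store heterogeneously typed results):
let memoize (name : string) (rule : 'a t) : 'a t = fun st ->
  let tbl = memo_table_of st in (* hypothetical: the memo attached to this stream element *)
  try Hashtbl.find tbl name     (* same rule at the same position: reuse the result *)
  with Not_found ->
    let res = rule st in        (* first visit: run the rule... *)
    Hashtbl.add tbl name res;   (* ...and record the outcome, success or failure *)
    res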

Debugging: trial and error

Once the basic translation was done, testing. You can see lots of my test files at planck/ocaml/test*.ml . Unfortunately, left recursion elimination is not all of the conversion required from Yacc to Planck (Parsec). Both frameworks have a notion of rule precedence, and they seem to be incompatible. I had to tweak the ordering of cases a lot, on a trial-and-error basis. I also had to move a small number of cases (fewer than 10, as far as I remember) from one rule to another. It was straightforward if you remember how yacc works. It was also fun to see the coverage rate of my parser gradually go up to 100%.

Finally, my parser can now parse all the vanilla (not CamlP4- or otherwise preprocessor-flavored) OCaml .ml and .mli source code in the OCaml compiler itself (one of the biggest OCaml applications publicly available) and LablGtk (one of the biggest OCaml OO libraries), and produces the identical AST to ocamllex+ocamlyacc. Of course this is not a proof of correctness, but it should cover a pretty good part of the OCaml syntax, I believe. If you are lucky enough to compile Planck (it is hard, since it depends on lots of things), you can try planck/ocaml/ocamltest.sh and planck/ocaml/ocamltest-lablgtk.sh to see the test results.

Performance of Planck

Planck parses all of the OCaml compiler source code in a realistic time. Great. Nice. Period. What? How fast is it compared to ocamllex+ocamlyacc? You dare ask? Sigh...

OK, it is... x100 slower. In my environment, Planck parses all the OCaml compiler source code in around 100 secs, while ocamllex+ocamlyacc parse it in just under 1 sec.

I can say, however, that x100 slowness is not a bottleneck. For example, compiling the whole OCaml compiler takes minutes. You might not know that ocamllex and ocamlyacc are not 100% written in OCaml but use some C for efficiency. You still say Planck is slow? Hmmm. OK.

I have thought a bit about why Planck is so slow. Everything below is just my guess:

Char streams.
The speed ratios of the lexer (planck/ocaml/lex.ml) and the parser (planck/ocaml/plparser.ml) are almost the same: x100. I had expected better performance in the lexer, since I wrote specialized stream and parser modules for chars (Sbuffer and Pbuffer). Probably I have not yet improved them enough.
Backtracking and memoization.
The backtracking rate in the lexer is 0.2%; that is, 0.2% of monadic bind calls are wasted and their results thrown away. So the lexer's backtracking is not a big issue, and actually I put memoization not into the lexer but only into the parser. The parser's backtracking rate is now 57%. You may think that is huge. However, I had almost the same parsing speed when I had tons of backtracking <!> and a rate of 90%. What I saw was: removing more backtracking by replacing <!> with <|>, the total number of bind calls became larger. So in my case the golden rule of LL(n) parser combinators, "avoid backtracking as much as possible", was not very true. It is also possible that too many cache queries slow down the parser; for now, all the parsing rule methods query the cache. Maybe some clever cache storage/query strategy would improve the performance.
No pattern match, but sequence of if-then-elses.
The parsing rules are huge sets of cases connected by <|> operators: simply speaking, sequences of if-then-elses. It would be great if these sequences of if-then-elses could be simplified into one pattern match, or at least one hash table query...
Too many monadic bind calls.
If you execute ocamltest.sh, you can see how many times the monadic bind is called to parse the entire OCaml compiler source: 1 billion!! (50% for the lexer and 50% for the parser.) OCaml is pretty great, since it performs 1 billion bind calls (and closure creations) in 1 minute! However, the results of subtests indicate that the parsing time is almost proportional to the number of bind calls.
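The counting itself is trivial; here is a sketch of the kind of counter involved (the real Profile module used by pa_bind_inline below may differ):
module Profile = struct
  let count = ref 0
  let incr () = count := !count + 1 (* called once per bind; see pa_bind_inline below *)
  let report () = Printf.printf "binds: %d\n" !count
end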

Is OCaml inappropriate for heavy monadic programming?

Elimination of binds by inlining + partial evaluation

We should simply get better results if we can reduce the number of bind calls. Let's discuss this issue further.

Planck's monad is a pretty simple functional one (pbase.ml):

type 'a t = Str.t -> ('a * Str.t, error) Result.t

let return v = fun st -> Ok (v, st)

let bind t f = fun st ->
  match t st with
  | Ok (r, st') -> f r st'
  | Error s -> Error s
The monad type 'a t is a function which takes a stream and does something; if successful, it returns a result value of type 'a and the rest of the stream after the job. If it fails, it returns an error. Ok and Error form a simple result monad, defined in result.ml (aka Haskell's Either).
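For example, a primitive that consumes one element of the stream would look like this (Str.uncons and the eof error value are hypothetical; they stand for whatever Planck's stream layer actually provides):
let any : char t = fun st ->
  match Str.uncons st with       (* hypothetical: head and tail of the stream *)
  | Some (c, st') -> Ok (c, st') (* success: the char and the rest of the stream *)
  | None -> Error eof            (* hypothetical error value for end of stream *)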

Look at the definition of bind: one application of bind, bind t f, creates a closure. OCaml is basically a functional language, and therefore the cost of closure construction is tiny, but it is not negligible. If we could eliminate this closure construction somehow, the performance should get better.

Actually, in Planck, and probably in other functional monadic libraries too, there are lots of opportunities for this closure construction elimination:

t >>= fun r st -> e[r,st]
Here, e[r,st] is an expression with free occurrences of r and st in it. This form of expression is found everywhere in the parsing rules. If we expand the bind (>>=), it is equal to the following:
fun st' ->
  match t st' with
  | Ok (r, st) -> (fun r st -> e[r,st]) r st
  | Error s -> Error s
The application in the Ok case can be reduced away (the lambda is applied to exactly the variables it binds), and we get:
fun st' ->
  match t st' with
  | Ok (r, st) -> e[r,st]
  | Error s -> Error s
Thus we now have one less closure creation! If this expression is preceded by another bind (t' >>= fun r' -> ...), we can do the same conversion again and again.

Inlining + partial evaluation by CamlP4

If this inlining and partial evaluation could be done automatically by the OCaml compiler, it would be pretty nice. But unfortunately, the current version of the compiler does not seem to perform this kind of optimization eagerly.

If the compiler does not do the job, I could do it myself, and I did. I wrote a small CamlP4 syntax extension module, planck/pa_bind_inline/pa_bind_inline.ml, which performs the inlining and automatic reductions described above around (>>=), (<|>) and (<?>):

(* Extract from planck/pa_bind_inline/pa_bind_inline.ml *)

module Make (AstFilters : Sig.AstFilters) = struct
  open AstFilters

  let bind _loc t f =
    match f with
    | <:expr< fun $lid:x$ $lid:st$ -> $e$ >> ->
        (* rewrite [bind t (fun x st -> e)] *)
        <:expr< fun st__ ->
          Profile.incr ();
          match $t$ st__ with
          | Result.Ok ($lid:x$, $lid:st$) -> $e$
          | Result.Error e -> Result.Error e >>,
        true

    | ... (* rules for <|> and <?> follow *)

  let _ =
    let simplify = object (self)
      inherit Ast.map as super

      method! expr e =
        match super#expr e with
        | <:expr@_loc< $lid:x$ $t$ $f$ >> when x = ">>=" || x = "bind" ->
            (* use [bind] to rewrite [bind t f] *)
            let e, x = bind _loc t f in
            if x then self#expr e else e

        | ... (* rules for <|> and <?> follow *)

        | x -> x
    end
    in AstFilters.register_str_item_filter simplify#str_item

end
With this hack, the speed ratio to ocamllex/ocamlyacc went down to x70, which means a x1.4 speed gain. If you are not satisfied with this: I still see lots of non-inlined calls, so you can inline more of them.

Current OCaml is too simple, not ready for the Monad boom from the Haskell world. But how about the future?

As we have seen, I unfortunately have the feeling that the current version of the OCaml compiler (3.12.0) is too simple to write efficient programs that make heavy use of functional monads, like Planck. Normally, compositions of functional monads by bind offer lots of opportunities for inlining + partial evaluation, like the reductions shown above; however, OCaml cannot take advantage of them. OCaml has proven that strongly typed functional programming languages can provide scalable and fast executables even with a rather simple compiler implementation without fancy optimizations. But now, OCaml's simplicity seems to be becoming a weakness compared to its cousins like F# and Haskell...

There is hope, however. Recently OCamlPro http://www.ocamlpro.com/ was founded, and these guys are working hard on various compiler enhancements, such as inline-more ( https://github.com/OCamlPro/OCamlPro-OCaml-Branch ). It does not work well for my parser example at the moment, but I am really looking forward to seeing the branch boost it one day!

Friday, January 21, 2011

Some enhancements for pa_monad's "unit binder"

In OCaml monadic programming, I am completely happy writing the bind binary operator (>>=). Here is a code piece from my LLVM thing, with a builder monad:

let run clos lty_ret =
B.cast clos lty_generic_clos_ptr >>= fun clos ->
B.const_load clos [0; pos_code_ptr] "code" >>= fun loaded ->
B.cast loaded (L.Type.pointer (L.Type.function_list lty_ret [ L.Type.void_pointer; L.Type.void_pointer ])) >>= fun code_ptr ->
B.const_load clos [0; pos_env_ptr] "env" >>= fun env_ptr ->
get_arg_ptr clos (L.Const.int 0) >>= fun args_ptr ->
B.check_call code_ptr [ env_ptr; args_ptr ] >>= fun () ->
B.call code_ptr [ env_ptr; args_ptr ] "called"
But some people cannot work with this conservative style and want to use Haskell's do notation. For OCaml, we have pa_monad's perform notation (http://www.cas.mcmaster.ca/~carette/pa_monad/):
let run clos lty_ret = perform
clos <-- B.cast clos lty_generic_clos_ptr;
loaded <-- B.const_load clos [0; pos_code_ptr] "code";
code_ptr <-- B.cast loaded (L.Type.pointer (L.Type.function_list lty_ret [ L.Type.void_pointer; L.Type.void_pointer ]));
env_ptr <-- B.const_load clos [0; pos_env_ptr] "env";
args_ptr <-- get_arg_ptr clos (L.Const.int 0);
B.check_call code_ptr [ env_ptr; args_ptr ]; (* It is "unit binder" *)
B.call code_ptr [ env_ptr; args_ptr ] "called"
Nice. But after playing with pa_monad for a while, I found two shortcomings of pa_monad's "unit binder", a bind expression without <-- (I am not sure what it is officially called, so I call it a "unit binder").


OCaml sequence expressions are forbidden in perform notation

The perform notation has the form perform e; e; e; e; ..., changing the original parsing of OCaml sequence expressions under the perform keyword. Therefore we cannot write a normal OCaml sequence expression e; e; e; ... in it. Instead, we had to use let () = e; e; e in:

let run clos lty_ret = perform
clos <-- B.cast clos lty_generic_clos_ptr;
let () = prerr_endline "clos done" in
loaded <-- B.const_load clos [0; pos_code_ptr] "code";
let () = prerr_endline "loaded done" in
code_ptr <-- B.cast loaded (L.Type.pointer (L.Type.function_list lty_ret [ L.Type.void_pointer; L.Type.void_pointer ]));
env_ptr <-- B.const_load clos [0; pos_env_ptr] "env";
args_ptr <-- get_arg_ptr clos (L.Const.int 0);
let () = prerr_endline "ptrs done"; prerr_endline "all things are prepared. Now call!" in
B.check_call code_ptr [ env_ptr; args_ptr ]; (* It is "unit binder" *)
B.call code_ptr [ env_ptr; args_ptr ] "called"
Oh, BTW, I do not recommend using let _ = e in, since it just throws away the result of e, which is unsafe. Use let () = e in instead, to make sure that e's type is unit. Anyway, this workaround requires lots of keystrokes, so I have introduced the \ escape to write side-effect expressions more easily in perform:
let run clos lty_ret = perform
clos <-- B.cast clos lty_generic_clos_ptr;
\ prerr_endline "clos done";
loaded <-- B.const_load clos [0; pos_code_ptr] "code";
\ prerr_endline "loaded done";
code_ptr <-- B.cast loaded (L.Type.pointer (L.Type.function_list lty_ret [ L.Type.void_pointer; L.Type.void_pointer ]));
env_ptr <-- B.const_load clos [0; pos_env_ptr] "env";
args_ptr <-- get_arg_ptr clos (L.Const.int 0);
\ prerr_endline "ptrs done";
\ prerr_endline "all things are prepared. Now call!";
B.check_call code_ptr [ env_ptr; args_ptr ]; (* It is "unit binder" *)
B.call code_ptr [ env_ptr; args_ptr ] "called"
Simple, isn't it? With the \ sign, normal OCaml sequences and unit binders are clearly distinguished.


Unit binders should really bind unit; otherwise, a warning should be raised.

I also noticed that unit binders accept monads with any contents, and those contents are simply discarded without any warning. That is very dangerous. If your expression has type t monad, where t <> unit, then the t value should be meaningful and should not be thrown away silently. Here is an example with the option monad:

let bind x f = match x with
| Some v -> f v
| None -> None

let return x = Some x

let the_answer = return 42

perform
the_answer; (* 42 is gone! *)
return 666
This is because the unit binder e; is converted to the expression bind e (fun _ -> ...), with a wildcard _. In the above example, bind the_answer (fun _ -> return 666). I have changed this conversion a little so that a warning is printed if a non-unit value is thrown away:
let bind x f = match x with
| Some v -> f v
| None -> None

let return x = Some x

let the_answer = return 42

perform
the_answer; (* Warning S: this expression should have type unit. *)
return 666
Now the expression is converted to bind the_answer (fun __should_be_unit -> __should_be_unit; return 666), and if __should_be_unit does not have type unit then the OCaml compiler complains about it.
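Spelled out, the new conversion produces the following shape (schematically, per the description above):
bind the_answer (fun __should_be_unit ->
  __should_be_unit; (* Warning S fires here if the type is not unit *)
  return 666)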


pa_monad_custom

The modified version is available at https://bitbucket.org/camlspotter/pa_monad_custom . Help yourself.