I’ve wanted to write OCaml for a long time. In my experience, the best way to learn a new language is simply to use it. Unfortunately, “just use it” is often easier said than done.
When the goal is merely to try out a language, I often end up inventing a project purely as an excuse to use it. In hindsight, this is usually a poor motivation for learning anything.
Fortunately, this time I had a real problem to solve.
I’ve been wanting an accounting system to track corporate bonds. However, the mental model I have in mind doesn’t fit well with the accounting tools I’ve seen so far. I blame myself for having a very specific interface in mind instead of adapting to existing tools.
That said, we now have a concrete problem to solve, and therefore the perfect excuse to finally start writing some OCaml :-)
What are we trying to solve
What I’m trying to design is a simple overdraft accounting system. I invest in corporate bonds where the interest is paid out monthly. The borrowers, however, can choose to skip payments during rough months, in which case the interest accrued is compounded at a penalty rate.
I am also an avid user of Emacs and Org mode. I want to maintain the overdraft system in the following way:
* Alice
#+NAME: principal
| Date | Amount | Rate | Comments |
|------------+-----------+------+----------|
| 2025-03-04 | 35,00,000 | 10 | |
| 2025-06-04 | 35,00,000 | 10 | |
| 2025-09-04 | 32,50,000 | 12 | |
| 2026-01-05 | 5,00,000 | 12 | |
#+NAME: interest
| Date | Amount | Comments |
|------------+----------+----------|
| 2025-03-05 | 29,166 | |
| 2025-04-05 | 29,166 | |
| 2025-05-05 | 29,166 | |
| 2025-06-05 | 58,333 | |
| 2025-07-05 | 18,333 | |
| 2025-12-05 | 5,77,513 | |
| 2026-01-05 | 95,833 | |
* Bob
#+NAME: principal
| Date | Amount | Rate | Comments |
|------------+----------+------+----------|
| 2026-03-02 | 5,00,000 | 10 | |
| 2026-03-04 | -10,000 | 10 | |
#+NAME: interest
| Date | Amount | Comments |
|------------+--------+----------|
| 2026-03-04 | 1,500 | |It’s simple, straightforward, and easy to read. The structure for each user will always be the same.
I want a tool called overdraft-render that will help me visualise this file in more detail.
> overdraft-render loans.org
# Alice
Date │ Bal @ 10% │ Bal @ 12% │ Bal │ Int @ 10% │ Int @ 12% │ Int │ Paid │ Due
───────────┼─────────────┼─────────────┼───────────────┼─────────────┼─────────────┼──────────┼────────────┼────────────
2025-03 │ 35,00,000 │ 0 │ 35,00,000 │ 29,166 │ 0 │ 29,166 │ 29,166 │ 0
───────────┼─────────────┼─────────────┼───────────────┼─────────────┼─────────────┼──────────┼────────────┼────────────
2025-04 │ 35,00,000 │ 0 │ 35,00,000 │ 29,166 │ 0 │ 29,166 │ 29,166 │ 0
───────────┼─────────────┼─────────────┼───────────────┼─────────────┼─────────────┼──────────┼────────────┼────────────
2025-05 │ 35,00,000 │ 0 │ 35,00,000 │ 29,166 │ 0 │ 29,166 │ 29,166 │ 0
───────────┼─────────────┼─────────────┼───────────────┼─────────────┼─────────────┼──────────┼────────────┼────────────
2025-06 │ 70,00,000 │ 0 │ 70,00,000 │ 58,333 │ 0 │ 58,333 │ 58,333 │ 0
───────────┼─────────────┼─────────────┼───────────────┼─────────────┼─────────────┼──────────┼────────────┼────────────
2025-07 │ 70,00,000 │ 0 │ 70,00,000 │ 58,333 │ 0 │ 58,333 │ 18,333 │ 40,000
───────────┼─────────────┼─────────────┼───────────────┼─────────────┼─────────────┼──────────┼────────────┼────────────
2025-08 │ 70,00,000 │ 0 │ 70,00,000 │ 58,333 │ 0 │ 58,333 │ 0 │ 99,133
───────────┼─────────────┼─────────────┼───────────────┼─────────────┼─────────────┼──────────┼────────────┼────────────
2025-09 │ 70,00,000 │ 32,50,000 │ 1,02,50,000 │ 58,333 │ 32,500 │ 90,833 │ 0 │ 1,91,948
───────────┼─────────────┼─────────────┼───────────────┼─────────────┼─────────────┼──────────┼────────────┼────────────
2025-10 │ 70,00,000 │ 32,50,000 │ 1,02,50,000 │ 58,333 │ 32,500 │ 90,833 │ 0 │ 2,86,619
───────────┼─────────────┼─────────────┼───────────────┼─────────────┼─────────────┼──────────┼────────────┼────────────
2025-11 │ 70,00,000 │ 32,50,000 │ 1,02,50,000 │ 58,333 │ 32,500 │ 90,833 │ 0 │ 3,83,184
───────────┼─────────────┼─────────────┼───────────────┼─────────────┼─────────────┼──────────┼────────────┼────────────
2025-12 │ 70,00,000 │ 32,50,000 │ 1,02,50,000 │ 58,333 │ 32,500 │ 90,833 │ 5,77,513 │ -95,833
───────────┼─────────────┼─────────────┼───────────────┼─────────────┼─────────────┼──────────┼────────────┼────────────
2026-01 │ 70,00,000 │ 37,50,000 │ 1,07,50,000 │ 58,333 │ 37,500 │ 95,833 │ 95,833 │ -95,833
───────────┼─────────────┼─────────────┼───────────────┼─────────────┼─────────────┼──────────┼────────────┼────────────
2026-02 │ 70,00,000 │ 37,50,000 │ 1,07,50,000 │ 58,333 │ 37,500 │ 95,833 │ 0 │ 0
───────────┼─────────────┼─────────────┼───────────────┼─────────────┼─────────────┼──────────┼────────────┼────────────
2026-03 │ 70,00,000 │ 37,50,000 │ 1,07,50,000 │ 58,333 │ 37,500 │ 95,833 │ 0 │ 95,833
# Bob
Date │ Bal @ 10% │ Bal │ Int @ 10% │ Int │ Paid │ Due
───────────┼─────────────┼────────────┼─────────────┼─────────┼─────────┼─────────
2026-03 │ 4,90,000 │ 4,90,000 │ 4,083 │ 4,083 │ 1,500 │ 2,583
This projection of the ledger history is very useful to me. There can be many such projections in the future, but for the time being, this will suffice.
This tool need not be capable of parsing the entire Org format, just the structure above. If the structure is any different, the tool should fail.
Implementation
I’m a functional programmer and have been writing Haskell for a long time. I want to see how my ideas translate to OCaml. I want the implementation to have a good balance between readability and performance. When I say performance, I want to compare it to the most efficient C alternative. I’m not concerned with the asymptotic complexity but more so with the allocations and the final code produced after compiler optimisation.
I will be skipping out on a few implementation details here. The code is hosted publicly at @adithyaov/overdraft-render and the commits history is a good indication of how things evolved. I will talk more about how I went about implementing it in OCaml and occasionally compare it with Haskell.
Parsing
I’ve used the applicative parser library called angstrom.
I expect the parsing to be linear without any backtracking. It would not have been very difficult to implement this naively, but I’m not sure if a handwritten parser would be significantly better (CPU + allocations) than applicative parsers.
That said, applicative parsers are highly readable and debuggable, and so I’ve decided to use Angstrom. It may be worth implementing a subset of Angstrom to get a better feel for OCaml.
My parser ended up looking like this:
type principal_payment = {
rate : int;
principal : int;
date: date;
} [@@deriving show, eq]
let get_unique_rates (principal_payments : principal_payment list) =
principal_payments
|> List.map (fun p -> p.rate)
|> List.sort_uniq Int.compare
type interest_payment = {
amount : int;
date: date;
} [@@deriving show, eq]
type borrower = string [@@deriving show, eq]
type borrower_details = {
unique_rates : int list;
principal_payments : principal_payment list;
interest_payments : interest_payment list;
} [@@deriving show, eq]
type database = borrower_details StringMap.t
type database_as_list = (borrower * borrower_details) list [@@deriving show, eq]
module Parser = struct
...
let borrower_details =
(fun bn ppl ipl ->
(bn, {
unique_rates = get_unique_rates ppl;
principal_payments = ppl;
interest_payments = ipl;
}))
<$> account_name <* filler
<*>
table_tag "principal"
*> principal_payment_header
*> divider
*> many principal_payment <* filler
<*> table_tag "interest"
*> interest_payment_header
*> divider
*> many interest_payment
let database =
(fun bd_list ->
bd_list
|> List.to_seq
|> StringMap.of_seq)
<$> many (filler *> borrower_details)
endI’m really curious about the final code after all the compiler optimisations. I’ll have to look into this later. I also want to compare the memory profile to hand-written C.
The OCaml module system is actually quite nice. It’s like Haskell’s backpack but much more mature. This is OCaml’s take on ad-hoc polymorphism and I really like its design.
However deriving common functionality seems less mature in OCaml. It looks like deriving is the job of a pre-processor creating new functions for us to use. This feels closer to a compiler plugin in Haskell than to built-in deriving.
Rendering
For every borrower, the idea is to create a timeline that has the state of the ledger every month.
Date │ Bal @ 10% │ Bal @ 12% │ Bal │ Int @ 10% │ Int @ 12% │ Int │ Paid │ Due
───────────┼─────────────┼─────────────┼───────────────┼─────────────┼─────────────┼──────────┼────────────┼────────────
2025-03 │ 35,00,000 │ 0 │ 35,00,000 │ 29,166 │ 0 │ 29,166 │ 29,166 │ 0
...
───────────┼─────────────┼─────────────┼───────────────┼─────────────┼─────────────┼──────────┼────────────┼────────────
2026-01 │ 70,00,000 │ 37,50,000 │ 1,07,50,000 │ 58,333 │ 37,500 │ 95,833 │ 95,833 │ -95,833
Most problems can be modelled naturally in a streaming abstraction. This is no different. Also, given that I’m one of the authors and maintainers of streamly, I have a natural bias.
The first thing I’ve googled was “Zero cost streaming abstractions in OCaml”. strymonas came up and the ideas were very similar to what I’ve been working with for a long time. Stream fusion is a fascinating topic and reliable stream fusion makes readable abstractions as performant as hand written C. While I’ve wanted to take a jab at exploring this, for the time being, I’ve taken the safer path of using what was in the base libraries: Seq.
The idea is simple. Given we have the following sparse streams,
principal_payment Seq.t
interest_payment Seq.twe should build a complete timeline with all the information required generating the output row,
type timeline_elem = {
date : date;
prin_map : int IntMap.t;
int_map : int IntMap.t;
int_paid : int;
int_due : int;
}
timeline_elem Seq.t Conceptually, the ledger can be produced by scanning a time-ordered stream of events. These events consist of principal changes, interest payments, and monthly ticks. So I ended up first building a complete event stream and then scanning over that event stream.
type event =
| Principal of principal_payment
| Interest of interest_payment
| Tick of date [@@deriving show, eq]
let build_event_stream (details : borrower_details) : event Seq.t = ...
let build_timeline_stream (event_stream : event Seq.t) : timeline_elem Seq.t = ...It was quite straightforward to do this OCaml using Seq. It ended up looking like this:
let build_event_stream (details : borrower_details) : event Seq.t =
let initial_date =
match details.principal_payments, details.interest_payments with
| [], [] -> None
| p :: _, [] -> Some p.date
| [], i :: _ -> Some i.date
| p :: _, i :: _ -> Some (if p.date < i.date then p.date else i.date)
in
let rec loop running_date (ps : principal_payment list) (is : interest_payment list) () =
if running_date > (current_date ()) then
Seq.Nil
else
match ps, is with
| p :: p_tl, _ when p.date = running_date ->
Seq.Cons (Principal p, loop running_date p_tl is)
| _, i :: i_tl when i.date = running_date ->
Seq.Cons (Interest i, loop running_date ps i_tl)
| _, _ ->
Seq.Cons (Tick running_date, loop (next_date running_date) ps is)
in
match initial_date with
| None -> Seq.empty
| Some d -> loop d details.principal_payments details.interest_payments
let build_timeline_stream all_rates event_stream =
let initial_state = {
principal_accured = IntMap.empty;
interest_paid = 0;
interest_due = 0;
} in
let rec loop ev_step state () =
match ev_step () with
| Seq.Nil -> Seq.Nil
| Seq.Cons (event, ev_step_next) ->
match event with
| Principal p ->
let new_state = {
state with
principal_accured =
IntMap.update p.rate (fun mres ->
match mres with
| Some res -> Some (res + p.principal)
| None -> Some p.principal
) state.principal_accured
} in
loop ev_step_next new_state ()
| Interest i ->
let new_state = {
state with
interest_paid = state.interest_paid + i.amount
} in
loop ev_step_next new_state ()
| Tick d ->
let standard_int_due_map =
IntMap.mapi (fun r p -> p * r / 100 / 12) state.principal_accured in
let standard_int_due =
IntMap.fold (fun _ amt acc -> acc + amt) standard_int_due_map 0 in
let interest_offset =
if state.interest_due > 0 then
state.interest_due + (state.interest_due * 24) / 100 / 12
else
state.interest_due in
let total_interest_due =
standard_int_due + interest_offset - state.interest_paid in
let new_state =
{ state with
interest_due = total_interest_due;
interest_paid = 0
} in
let loop_next = loop ev_step_next new_state in
Seq.Cons ({
date = d;
prin_map = state.principal_accured;
int_map = standard_int_due_map;
int_due = total_interest_due;
int_paid = state.interest_paid
}, loop_next) in
loop event_stream initial_stateOnce we have the timeline stream, each element can be mapped to a table row. I’ve used printbox for rendering helpers.
let make_table_row (inp : timeline_elem) : PrintBox.t listStreams are essentially functional loops and using streaming abstractions to model my solution helps build the loop step-by-step in a clean way.
As before, I’m still very curious about how this is optimised after the compilation pipeline. While working with streamly in Haskell, I’d occasionally check the Core to see if the pipeline had fused.
Miscellaneous Thoughts:
I miss the where clause in Haskell.
I’m curious about the naming scheme for types and values. Haskell’s clean separation of capitalised words for the type space and uncapitalised words for the value space is very appealing. I wonder why OCaml chose a different approach.
I also miss the explicit function type signatures. In Haskell, I can provide an explicit type signature on a separate line for any function, which serves as excellent documentation. In OCaml, type signatures must be inline, which feels a bit tedious.
Type errors are not very detailed and I’ve felt that type inference is more localised. For example,
type user = { name : string; age : int }
type street = { name : string }
(* Compilation fails with type error *)
let return_user user : user =
if user.name == "Alice" then user else user
(* Compilation succeeds *)
let return_user (user : user) : user =
if user.name == "Alice" then user else userI would have expected it to infer the type of user as user based on the output type.
Next Steps
I’m quite happy with how this turned out. It was very easy to start working with OCaml. There are a few quality-of-life features I miss compared to Haskell, but there is no significant hurdle.
There are a few things I plan to do:
- Learn more about the optimisation pipeline.
- Check how to see the final optimised output (Core equivalent in OCaml).
- Benchmark and study the performance of
overdraft-render. - Improve performance and memory profile of
overdraft-renderwhile maintaining readability.
Feel free to review the code and suggest improvements to overdraft-render. I’m curious to hear thoughts on whether what I’ve written is idiomatic OCaml.