Multicore OCaml: April 2021

2 years ago/90 comments/discuss.ocaml.org

Happy to answer any questions.

3 years ago by NanoCoaster

I'm sorry if these are dumb questions, I'd be totally fine with some pointers in the right directions if you don't want to waste your time :)

I'm under the impression that the implementation is based on algebraic effects. Does that include extending the language in a way that lets users define their own effects & handlers?

Also, how's the performance? Last time I looked this stuff up (and played around with Eff), which was quite a while ago, I was told that regular usage of effects may impact performance quite noticeably.

3 years ago by sadiq

Not dumb questions at all.

Multicore adds parallelism via Domains (which are essentially heavyweight threads) and concurrency via Effects and fibers. There's a multicore GC that supports both of those.

We plan to upstream things in two parts. First domains-only parallelism and then effects as a follow-up. When the latter lands users will be able to define their own effects and handlers, yes.

Performance is pretty good, you can see our PLDI2021 paper for a proper performance evaluation and loads more details: https://arxiv.org/abs/2104.00250
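
For anyone wondering what the domains-only part looks like in code, here is a minimal sketch. It assumes the `Domain` module as it later shipped in the OCaml 5 standard library (`Domain.spawn` and `Domain.join`); `sum_range` and `parallel_sum` are hypothetical names made up for the illustration, not something from the Multicore OCaml codebase.

```ocaml
(* Requires OCaml 5.0 or later, where Domain is in the standard library. *)

(* Sum the integers in [lo, hi] sequentially. *)
let sum_range lo hi =
  let acc = ref 0 in
  for i = lo to hi do
    acc := !acc + i
  done;
  !acc

(* Split [1, n] into [chunks] contiguous ranges, sum each range in its own
   domain, then join the domains and add up the partial results. *)
let parallel_sum ~chunks n =
  let size = (n + chunks - 1) / chunks in
  let domains =
    List.init chunks (fun k ->
        let lo = (k * size) + 1 in
        let hi = min n ((k + 1) * size) in
        Domain.spawn (fun () -> sum_range lo hi))
  in
  List.fold_left (fun total d -> total + Domain.join d) 0 domains

let () = Printf.printf "%d\n" (parallel_sum ~chunks:4 1_000)
```

Since domains are heavyweight, the sketch spawns a handful of them and gives each a large chunk of work rather than one domain per item.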

3 years ago by NanoCoaster

Thank you, interesting stuff. Very much looking forward to user-definable effects! I like OCaml in principle, but haven't yet strayed beyond learning the very basics some time ago. Seeing a usable implementation of algebraic effects in a somewhat popular language... that I gotta see :)

3 years ago by wuschel

Hello sadiq, I have some more general questions for you:

As I understand it, the situation is currently analogous to Python. The GIL allows for process-based concurrency, with the well-known disadvantage regarding memory consumption. Also, my guess is that OCaml relies on library-based solutions at the moment?

1) What does the introduction of Multicore potentially mean for OCaml? Will OCaml be a much better fit to run a webserver backend? Could you perhaps give a comparison to other programming languages?

2) How will OCaml stand out with Multicore in the PL world, and what solutions would it be uniquely suited for?

3) What tools are going to be present to deal with bugs introduced by your new runtime e.g. race conditions?

4) If one wants to have a go at Multicore and play around with it, where/how do I start?

Many thanks!

3 years ago by sadiq

Good questions!

1) As I mentioned in https://news.ycombinator.com/item?id=27142502 there is support for parallelism and concurrency.

To give an example of where these might be useful in a web service:

The addition of shared-memory parallelism is beneficial where you might have a great deal of shared state that needs to be used to service requests. An in-memory cache is a good example: with a process-based approach, managing reads/writes and avoiding significant overhead from marshalling the data is difficult.

Concurrency via effects at a minimum can make writing network-based services much more pleasant (and debuggable!). See the examples in https://arxiv.org/abs/2104.00250 where programs are written in a direct style similar to blocking IO but, using effects, are transformed to use asynchronous interfaces. There's work going on in the project at the moment to build fast cross-platform IO implementations that sit atop io_uring/GCD/IOCP.

3) This is a good question and one we're still working on. I think one of the lead developers KC has a few good ideas about instrumentation we can do to enable detecting races to global state. It's certainly going to be an issue for people porting large codebases.

4) This is the place to start: https://github.com/ocaml-multicore/multicore-opam#install-mu... .
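
On the direct-style point in 1), a rough sketch of the idea: tasks are written as ordinary sequential code, and an effect handler decides when they are suspended and resumed. This is not the code from the paper. It assumes the `Effect` module as it later shipped in the OCaml 5 standard library, and `Yield`, `run`, `worker`, `say` and `trace` are hypothetical names for the illustration. A real IO library would suspend on pending reads and writes rather than on an explicit yield.

```ocaml
(* Requires OCaml 5.0 or later (Effect module in the standard library). *)
open Effect
open Effect.Deep

(* Hypothetical effect: the running task asks to be suspended. *)
type _ Effect.t += Yield : unit Effect.t

(* Record what the tasks do so the interleaving can be inspected. *)
let trace : string list ref = ref []
let say s = trace := s :: !trace

(* Round-robin scheduler: a Yield parks the task's continuation at the
   back of the run queue and switches to the next runnable task. *)
let run (tasks : (unit -> unit) list) : unit =
  let queue : (unit -> unit) Queue.t = Queue.create () in
  let next () =
    match Queue.take_opt queue with
    | Some resume -> resume ()
    | None -> ()
  in
  let start task =
    match_with task ()
      { retc = (fun () -> next ());
        exnc = raise;
        effc =
          (fun (type a) (eff : a Effect.t) ->
            match eff with
            | Yield ->
                Some
                  (fun (k : (a, unit) continuation) ->
                    Queue.add (fun () -> continue k ()) queue;
                    next ())
            | _ -> None) }
  in
  List.iter (fun task -> Queue.add (fun () -> start task) queue) tasks;
  next ()

(* Direct-style task: no callbacks or monads, just a call that may suspend. *)
let worker name () =
  say (name ^ ":step1");
  perform Yield;
  say (name ^ ":step2")

let () =
  run [ worker "a"; worker "b" ];
  print_endline (String.concat " " (List.rev !trace))
```

The two workers interleave at the `perform Yield` call even though each one reads like blocking code, which is the property that makes this style easier to write and debug than callback-based or monadic code.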

3 years ago by wuschel

Thank you for your answer! I will check out the URIs.

3 years ago by hajile

Standard ML implementations have had good multicore support for decades now despite having only a tiny fraction of the users and development time. Meanwhile, OCaml has been promising support for years.

What in OCaml makes this so much harder to implement?

3 years ago by rwmj

OCaml (or its immediate predecessor[1]) had a multicore implementation, but it was dropped because of its complexity and effect on single-thread performance. The challenge is to add it back in a way that is maintainable and doesn't negatively affect current users.

[1] https://www.researchgate.net/publication/2774662_Concurrent_...

3 years ago by sadiq

As the sibling comment mentions, the hard part is actually retrofitting multicore whilst maintaining compatibility _and_ performance.

Our paper last year covers most of why this is tricky: https://arxiv.org/abs/2004.11663

3 years ago by 4ad

Algebraic effects as described in the OCaml papers about effects are dynamically typed. I recently heard in some talk that this idea was abandoned, and the new algebraic effects are in fact statically typed. Do I remember this correctly? If so, where can I read about these new algebraic effects?

3 years ago by kcsrk

Abandoned is perhaps too strong a term :-). Effect handlers in the language are supported by fibers, lightweight stacklets managed by the runtime. The details of the implementation can be found in the upcoming research paper at the PLDI'21 conference [1]. The effect handlers in Multicore OCaml today do not provide effect safety. Programs are not statically guaranteed to handle all the effects they may perform. This is only as bad as exceptions in OCaml and every other mainstream language with exceptions.

We are working on developing an effect system, which will ensure effect safety, i.e., the compiler ensures that all the effects performed are caught. You also get a nice inferred type that says what effects a particular function may perform; if it performs none, then it is a pure function! This implementation would still use the current fiber support in the runtime. Leo White, one of the developers of Multicore OCaml, gave a talk on this new effect system a few years ago [2]. That's the best place today to learn about the new effect handlers.

The plan is to first add the fiber runtime support to OCaml without the syntax extensions for effect handlers, and then introduce syntax along with the effect system.

[1] https://arxiv.org/abs/2104.00250

[2] https://www.janestreet.com/tech-talks/effective-programming/
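
To illustrate the point about effect safety, here is a small sketch. It assumes the `Effect` module API that later shipped in the OCaml 5 standard library (not the syntax used on the Multicore branch at the time), and `Ask`, `computation` and `run_with` are hypothetical names. The same function type-checks with or without a handler; forgetting the handler only shows up at run time, much like an uncaught exception.

```ocaml
(* Requires OCaml 5.0 or later (Effect module in the standard library). *)
open Effect
open Effect.Deep

(* A user-defined effect: ask the enclosing handler for an int. *)
type _ Effect.t += Ask : int Effect.t

(* Direct-style code that performs the effect twice. *)
let computation () = perform Ask + perform Ask

(* Run [f] under a handler that answers every Ask with [answer]. *)
let run_with answer f =
  try_with f ()
    { effc =
        (fun (type a) (eff : a Effect.t) ->
          match eff with
          | Ask -> Some (fun (k : (a, _) continuation) -> continue k answer)
          | _ -> None) }

let () =
  Printf.printf "handled: %d\n" (run_with 21 computation);
  (* No static effect safety: without a handler this fails at run time. *)
  match computation () with
  | n -> Printf.printf "unexpected: %d\n" n
  | exception Effect.Unhandled _ -> print_endline "unhandled Ask"
```

Running it prints `handled: 42` and then `unhandled Ask`. Nothing in the type of `computation` (`unit -> int`) records that it performs `Ask`, which is the gap an effect system would close.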

3 years ago by dang

Past related threads:

Multicore OCaml: Feb 2021 with new preprint on Effect Handlers - https://news.ycombinator.com/item?id=26424785 - March 2021 (29 comments)

Multicore OCaml: October 2020 - https://news.ycombinator.com/item?id=25034538 - Nov 2020 (9 comments)

Multicore OCaml: September 2020 - https://news.ycombinator.com/item?id=24719124 - Oct 2020 (43 comments)

Parallel Programming in Multicore OCaml - https://news.ycombinator.com/item?id=23740869 - July 2020 (15 comments)

Multicore OCaml: May 2020 update - https://news.ycombinator.com/item?id=23380370 - June 2020 (17 comments)

Multicore OCaml: March 2020 update - https://news.ycombinator.com/item?id=22727975 - March 2020 (37 comments)

Multicore OCaml: Feb 2020 update - https://news.ycombinator.com/item?id=22443428 - Feb 2020 (80 comments)

State of Multicore OCaml [pdf] - https://news.ycombinator.com/item?id=17416797 - June 2018 (103 comments)

OCaml-multicore now at 4.04.2 - https://news.ycombinator.com/item?id=16646181 - March 2018 (4 comments)

A deep dive into Multicore OCaml garbage collector - https://news.ycombinator.com/item?id=14780159 - July 2017 (89 comments)

Lock-free programming for the masses - https://news.ycombinator.com/item?id=11907584 - June 2016 (29 comments)

Lock-free programming for the masses - https://news.ycombinator.com/item?id=11893911 - June 2016 (4 comments)

OCaml 4.03 will, “if all goes well”, support multicore - https://news.ycombinator.com/item?id=9582980 - May 2015 (113 comments)

Multicore OCaml - https://news.ycombinator.com/item?id=8003699 - July 2014 (1 comment)

3 years ago by philzook

It seems like great work.

Something I don't understand is that I hear people mentioning a lack of multicore support as a blocker to using a programming language. I don't think this is something I have ever felt. What problem domains require multicore?

3 years ago by xfer

Anytime you need some parallelism like processing chunks of data independently? Using processes to do it is not very ergonomic.

3 years ago by gmfawcett

It depends? It's easy in Ocaml to fork child processes, set up pipes, and pass messages back and forth. There are libraries that paper over the details (or you can roll your own, as I suspect many of us have done over the years).

That approach is not suitable for every workload, but fine for most map/reduce applications.

3 years ago by toolslive

For us, it was IO actually: doing IO concurrently (Lwt) and letting the one core schedule the IO worked quite well for a long time, but then we started targeting NVMe devices... Now just the scheduling of the IO would fill up the core without coming even close to saturating the device.

We ended up splitting the device and using multiple processes but that was suboptimal.

3 years ago by simiones

Well, any interactive program benefits immensely from multi-threading - at the very least, you need 1 UI thread and 1 background thread. This can be achieved with asynchronous workflows, but it's significantly more complicated compared to spawning a thread to address a user action.

Similarly, wanting to spin a thread to do some background work or periodically check some property is an extremely common pattern, that has nothing to do with performance, and where spawning a new process is significantly more overhead, both conceptually and in terms of used resources.

3 years ago by blacktriangle

I feel the same way and I can't help but feel like I am missing something. Concurrency is still inherently more complicated than serial processing no matter what language features you add, and it's amazing how far you can get just by scaling across OS processes. I know there are plenty of domains that do want multicore (ML, graphics), but it seems like even the people writing line of business apps think lack of multicore is a deal breaker for them.

3 years ago by toolslive

Often, it's just about finding a stick. They don't want to use OCaml and the absence of multi-core features gives them the excuse they were looking for. Now they can inform management: "OCaml is not an option. we'll stick with <insert PL here>"

3 years ago by sshine

It often is.

But sometimes it is also about finding more reasons to use one of your favorite languages.

I’m doing pseudo-async PHP at work, and having proper support would certainly make these solutions less brittle.

3 years ago by Buttons840

I learned Haskell instead of OCaml in part because of this. Yeah, it probably wasn't that important, it was an emotional decision, and it wasn't "fair" to OCaml. But whatever it was or wasn't, I learned Haskell instead of OCaml.

Also, if I read between the lines correctly, in your scenario it's management who is interested in OCaml and the developers want to use something else? Wonder if that has ever actually happened?

3 years ago by hderms

Yeah, I really don't know, because OCaml already has async constructs. It doesn't seem like it should be a dealbreaker for most. That being said, it has a reputation for being extremely performant, so it might just be that people feel it's so close to being usable in a lot of situations where it currently isn't that there is more aggregate desire to allow for multicore execution.

3 years ago by gwmnxnp_516a

Forking new processes is less efficient than spawning new threads, as processes use more machine resources and more memory. In addition, communication between threads is far simpler than communication between processes. Multicore may not be a deal-breaker for server applications, but it matters for applications that require parallel processing.

3 years ago by gmfawcett

Forking a process on a modern Linux is relatively cheap these days, and for many jobs you will just prefork all the workers anyway. Passing messages via thread mailboxes or IPC pipes are about equally complex endeavours.

If threads had never been invented, I suspect that 90% of modern multicore programs would have done just fine on a mixture of multi-processing and asynchronous I/O. (The other 10% would have done poorly, though.)

3 years ago by blacktriangle

Less efficient for the machine, sure, but making the program deal with async is less efficient for the programmer. I'm just confused when I see people working on CRUD apps ruling out tools because they lack first-class async support.

3 years ago by toolslive

What I'm still missing is a strategy for moving existing OCaml programs that use (let's say) Lwt for concurrency, and a bit of C for things that were trivial to parallelize, over to Multicore OCaml AND benefiting from it.

3 years ago by kcsrk

(One of the Multicore OCaml devs here)

We have prototyped offloading CPU intensive computations in Lwt programs using Multicore OCaml [1]. We're currently working with Lwt maintainers to upstream it.

[1] https://sudha247.github.io/2020/10/01/lwt-multicore/

3 years ago by anuragsoni

> We have prototyped offloading CPU intensive computations in Lwt programs using Multicore OCaml

This is very interesting! I'm wondering if you are aware of any discussions with the async maintainers about what their plans are with the multicore runtime?

3 years ago by toolslive

thx! Another question. Suppose you're starting from scratch, is it worth going the effects based async io route ? https://github.com/kayceesrk/ocaml-aeio seems very interesting...

3 years ago by kcsrk

Yes, exactly. The reason monadic concurrency libraries such as Lwt and Async exist is that the OCaml language does not support concurrency natively. If it did, we would have built something similar to the `ocaml-aeio` library.

Btw there is a modern instantiation of `ocaml-aeio` called `eioio` [1] which supports Linux's io_uring. Eventually, this will be extended to support all the modern I/O stacks on different platforms, and also support performing I/O on multiple cores.

[1] https://github.com/ocaml-multicore/eioio

3 years ago by terminalserver

For the lay person, what is the advantage of ocaml over other languages?

Why would I reach for it?

3 years ago by richeyryan

OCaml is a statically typed functional programming language. It's a cousin of Haskell. It has some nice things like automatic type inference so you don't have to write very many type annotations. It's not as focused on purity as Haskell, so it's easier to mutate state where you want to, but you get a lot of the niceties of ML programming languages like pattern matching, variants, and structural typing in places.

It also has object-oriented features, though they aren't widely used. The attitude is something like: OO is there if we need it, and we're definitely willing to use it in places that require it.

It's pretty fast for a functional language and you could probably get pretty far with it before you'd have to consider using a real low-level language.

The disadvantages I think are pretty uncontroversial: a smaller community, not as many libraries, a bit of a fractured stdlib and build situation and until now no multicore.

3 years ago by dunefox

Also no unicode support which is a huge con.

3 years ago by rwmj

Please no. OCaml gets this exactly right - on the rare occasion I want to do something Unicode-y I'll use Camomile, and the rest of the time it doesn't get in the way with stupid language decisions (yes I'm looking at you Python 3 and Ruby).

3 years ago by patrec

I'd take ocaml's unicode support over python's, javascript's or java's any day of the week.

3 years ago by themulticaster

> It has some nice things like automatic type inference so you don't have to write very many type annotations.

I'm not quite sure what you're going for here. The Hindley-Milner type system Haskell is based on essentially does not require any type annotations [1]. By convention, every top-level declaration is annotated, but that is only for documentation and clarity.

Or did you mean that in comparing OCaml and Haskell to other (imperative) languages?

[1] There are a few buts that don't have much to do with the argument, but I'll list them here anyway:

1) Sometimes the type you end up with is too ambiguous and you'll need type annotations: E.g. what is the type of the term "2+3"? It is something like "Num a => a" (read: any type that is roughly number-like), but that is not useful if you want to run the program. However, in practice you won't need an annotation in almost all cases as long as some function you work with restricts the type.

2) Some Haskell extensions increase ambiguity in certain cases.

3 years ago by Zababa

> Sometimes the type you end up with is too ambiguous and you'll need type annotations: E.g. what is the type of the term "2+3"?

I don't think your example works in the case of OCaml since the signature of (+) is int -> int -> int. Basic operators not being polymorphic is one of the specificities of OCaml.
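
A tiny sketch of what that means in practice (the binding names are just for illustration): integer and float arithmetic use different operators, so the type of `2 + 3` is never ambiguous.

```ocaml
(* (+) works on int only; floats use the separate (+.) operator. *)
let int_sum : int = 2 + 3
let float_sum : float = 2.0 +. 3.0

(* Mixing the two needs an explicit conversion; `2 + 3.0` is a type error. *)
let mixed : float = float_of_int 2 +. 3.0

let () = Printf.printf "%d %.1f %.1f\n" int_sum float_sum mixed
```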

3 years ago by systems

OCaml is everything D wanted to be, but failed

OCaml is a high-performance, garbage-collected, hybrid system and application programming language

You can use OCaml to write high performance Apps and System tools without worrying too much about performance, or doing crazy manual memory management

I would also say, if you like Go, but think you hit a wall with it, then try OCaml

system programming language: C, C++, Rust

Hybrid system and application languages: OCaml, Go , D

If you need a language in this class (hybrid system and app), I think currently Go and OCaml are your only options. Go is closer to the C and Java family of languages and OCaml is an ML language, so choose as per your preference

3 years ago by Serow225

F# would probably sit in there somewhere too a little more on the app than system side, although you don't see many uses of the more low-level stuff that's been made available in the later releases of .NET, it's there for use in F#

3 years ago by abhijat

Is Haskell a good fit for your hybrid category, similar to go and ocaml, considering it also compiles to native?

I haven't heard of Haskell being used for the kind of "high level" systems programming that go is used for.

3 years ago by willtim

In my opinion, as a professional haskell developer, it is too hard to reason about the runtime space usage of Haskell programs to ever recommend Haskell as a "systems programming language". Haskell would be great for writing a DSL for generating such code though.

3 years ago by vips7L

Why do you think D isn't an option?

3 years ago by systems

Well, D has a GC, and by design cannot have a GC with good performance, because of its complex mutable object system

OCaml can achieve good GC performance because it's immutable by default

D, by design, will never outperform OCaml, at least this is my understanding

That, and I think D's community is too small to fix the language. While it does have several brilliant members and developers, it's just too small and underfunded

So, you have two reasons why D should not be an option. The first is technical (D will never have a good GC); the second is more of a logistics issue: the community is just not there to support a language as complex and as ambitious as D and deliver on all its claims

3 years ago by pjmlp

That language is C#, not OCaml, especially with the improvements done after version 7.

3 years ago by rwmj

I develop in OCaml from time to time, and it's pretty practical. Separate compilation, makes small-ish binaries that most people wouldn't know weren't written in C/C++, fast, garbage collection reduces mental burden, easily call out to C if you need to. We steer clear of the more complex language features like functors because they confuse most programmers.

Here's an example of one very widely used production application: https://github.com/libguestfs/virt-v2v/tree/master/v2v

It fits a similar niche to Golang or C++, but unlike those it's an enjoyable language to program in.

3 years ago by lmm

I consider OCaml the baseline for what a language from the last 20-or-so years should be. It doesn't do much that really pushes boundaries, but it has all the basic things you want and no major blunders (which is a surprisingly rare thing). In particular it has a sensible type system with proper algebraic types (which so many languages manage to get subtly wrong, even today), full pattern matching, and very little in the way of control flow keywords.
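
As a concrete (and deliberately tiny) picture of algebraic types plus pattern matching, with `shape` and `area` as made-up names for the illustration:

```ocaml
(* A sum type: a shape is either a circle or a rectangle, nothing else. *)
type shape =
  | Circle of float          (* radius *)
  | Rect of float * float    (* width, height *)

(* Pattern matching destructures each case; the compiler warns if a
   constructor is left unhandled. *)
let area = function
  | Circle r -> Float.pi *. r *. r
  | Rect (w, h) -> w *. h

let () = Printf.printf "%.1f\n" (area (Rect (2.0, 3.0)))
```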

3 years ago by usrnm

> no major blunders

No multithreading seems to be a pretty big problem. Hopefully, it will be fixed soon, but still

3 years ago by lmm

Nah. People think they want it (hence the popularity of Rust) but it doesn't actually help you achieve any of the things you actually want to achieve.
