

You are on JN Avila's weblog



Parallel computing with Python and Ruby for fun and profit

Monday 02 September 2013 at 9:37 pm.

Lately, I had to run a lot of computations, so I looked into harnessing the full power of the multicore CPUs that run most of our PCs today.


I had read before that Python does not really support multithreading for CPU-bound work (hence, for instance, the development of the Twisted framework). You can use threads, but in the end the Python code runs in a serialized way. This is due to a global locking mechanism in the interpreter (the Global Interpreter Lock). So for Python code, the only way to use several cores is to spawn new processes and have a way to move objects and code between them.
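To make "spawn new processes and move objects around" concrete, here is a minimal sketch using only the standard library. The names `heavy_compute`, `_worker` and `run_in_processes` are hypothetical, and the `fork` start method assumes a Unix-like system. Each argument gets its own process, and the result travels back to the parent through a pipe, pickled on the way.

```python
import multiprocessing as mp


def heavy_compute(n):
    """Hypothetical CPU-bound stand-in: sum of the squares below n."""
    return sum(i * i for i in range(n))


def _worker(conn, func, arg):
    # Runs in the child process; the result is pickled and sent to the parent.
    conn.send(func(arg))
    conn.close()


def run_in_processes(func, args):
    """Run func(arg) in one process per argument and return the results in order."""
    ctx = mp.get_context("fork")  # assumption: Unix-like OS where fork is available
    jobs = []
    for arg in args:
        # duplex=False: first end can only receive, second end can only send
        parent_end, child_end = ctx.Pipe(duplex=False)
        proc = ctx.Process(target=_worker, args=(child_end, func, arg))
        proc.start()
        child_end.close()  # the parent keeps only the receiving end
        jobs.append((proc, parent_end))
    results = []
    for proc, parent_end in jobs:
        results.append(parent_end.recv())  # read first, then wait for the child
        proc.join()
    return results


if __name__ == "__main__":
    print(run_in_processes(heavy_compute, [100, 200, 300]))
```

One process per argument is only reasonable for a handful of long tasks; with many small tasks, a pool of workers amortizes the start-up cost.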

I had read, on the other hand, that Ruby has been using native threads since version 1.9. Great! But if you take an attentive look at the fine print, you find out that a global lock is implemented there too and that the threads are in fact serialized through a mutex. Maybe the development is heading toward full thread support, but at the moment you are limited just as with Python. So, in the end, the solution involves forking other Ruby processes.

But enough talk, let's see some code.


My only experience with parallelizing Python is with IPython and NumPy. Using NumPy usually means running heavy computations, and not taking advantage of today's multicore processors is a crime. In my case, I had to run some simulations with sets of parameters in the form of:

parameters = [ 100, 200, 300, 400, 500, 600]
result = map(heavy_compute, parameters)

which is obviously parallelizable. In IPython, parallelization starts with launching a cluster of worker processes from the shell:

$ ipcluster start -n 4

Then, in the code:

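from IPython.parallel import Client  # packaged separately as ipyparallel since IPython 4

s = Client()[:]  # direct view on all the engines of the cluster
with s.sync_imports():  # run the import on every engine as well as locally
    from mylib import heavy_compute
res = s.map_sync(heavy_compute, parameters)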

In this case, we create a client to the cluster and take a view on all of its engines; the heavy function is then sent over to the cluster and called with each of the parameters in parallel.
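For comparison, the same parallel map can be sketched without any cluster, using a pool of worker processes from the standard library. This is not the IPython approach described above, only a stand-in: `heavy_compute` is a hypothetical placeholder for the simulation, `parallel_map` is a helper name of my own, and the `fork` start method assumes a Unix-like system.

```python
import multiprocessing as mp


def heavy_compute(n):
    """Hypothetical stand-in for the simulation: a CPU-bound loop."""
    total = 0.0
    for i in range(1, n * 1000):
        total += 1.0 / (i * i)
    return total


def parallel_map(func, items, workers=4):
    """Apply func to every item in a pool of worker processes, keeping the order."""
    ctx = mp.get_context("fork")  # assumption: Unix-like OS where fork is available
    with ctx.Pool(processes=workers) as pool:
        return pool.map(func, items)


if __name__ == "__main__":
    parameters = [100, 200, 300, 400, 500, 600]
    result = parallel_map(heavy_compute, parameters)
    print(result)
```

The pool lives only for the duration of the call, so nothing has to be started from the shell beforehand; the price is that the workers are created again on every call.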


For Ruby, this is even simpler. I wanted to parallelize the task of checking the Markdown syntax of all the translations of the Pro Git book. This task is indeed highly parallelizable.

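require 'parallel'

# assumption: language directories are named xx or xx-yy;
# the original glob expression was lost, Dir.glob stands in for it
langs = Dir.glob('??') + Dir.glob('??-??')
results = Parallel.map(langs) do |lang|
  error_code = test_lang(lang, $out)
  if error_code
    print "processing #{lang} KO\n"
  else
    print "processing #{lang} OK\n"
  end
  error_code  # value of the block, collected into results
end
fail "At least one language conversion failed" if results.any?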

Here, the parallel gem forks Ruby processes automatically on demand. This introduces a bit of overhead, but you don't have to manage a cluster. I had to pass the stdout descriptor on to the forked processes so that they could print to the console. Note that Ruby lets you define the parallel job in a more functional way.
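A rough Python counterpart of this fork-on-demand pattern can be sketched with `concurrent.futures`. The checker `check_lang` is purely hypothetical (it flags a language code that is not lowercase, just to have something to fail), `check_all` is a helper name of my own, and the `fork` start method again assumes a Unix-like system. In this sketch the parent prints the status lines as results come back, so no stdout descriptor has to be handed to the children.

```python
import multiprocessing as mp
from concurrent.futures import ProcessPoolExecutor, as_completed


def check_lang(lang):
    """Hypothetical checker: returns True when the language fails.

    For illustration only, a language fails if its code is not lowercase.
    """
    return lang != lang.lower()


def check_all(langs, workers=4):
    """Check every language in a worker process; return the sorted failed ones."""
    failed = []
    ctx = mp.get_context("fork")  # assumption: Unix-like OS where fork is available
    with ProcessPoolExecutor(max_workers=workers, mp_context=ctx) as executor:
        # Map each pending future back to the language it is checking.
        futures = {executor.submit(check_lang, lang): lang for lang in langs}
        for future in as_completed(futures):
            lang = futures[future]
            if future.result():
                print(f"processing {lang} KO")
                failed.append(lang)
            else:
                print(f"processing {lang} OK")
    return sorted(failed)


if __name__ == "__main__":
    failed = check_all(["fr", "de", "zh-tw"])
    if failed:
        raise SystemExit(f"At least one language conversion failed: {failed}")
```

Because `as_completed` yields results in completion order, the status lines may appear in a different order from the input list.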


As most modern computers have multiple cores, it has become quite simple to take advantage of this additional computing power in my favourite scripting languages.

Side note: yes, I know that any shell or Makefile is already able to spawn processes and parallelize tasks in a very natural way. But you can't do everything with bash and make...
