About

You are on JN Avila's weblog

parallel computing with python and ruby for fun and profit

Monday 02 September 2013 at 9:37 pm.

Lately, I had to run a lot of computations, so I looked into harnessing the full power of the multicore CPUs that run most of our PCs today.

Background

I had read before that Python does not really support multithreading (hence, for instance, the development of the Twisted framework). You can use threads, but in the end the application runs in a serialized way: the interpreter holds a global lock (the Global Interpreter Lock), so only one thread executes Python code at a time. So for CPU-bound work in Python, the practical way is to spawn new processes and have a way to move objects and code between them.

I had read, on the other hand, that Ruby has used native threads since version 1.9. Great! But if you read the fine print carefully, you find that a global lock is implemented there too (the Global VM Lock of the reference interpreter), so only one thread runs Ruby code at a time. Maybe development is heading toward full thread support, but at the moment you are as limited as with Python. So, in the end, the solution involves forking other Ruby processes.

But enough talk, let's see some code.

Python

My only experience with parallelizing Python is with IPython and NumPy. Using NumPy usually means running heavy computations, and not taking advantage of today's multicore processors is a crime. In my case, I had to run some simulations over sets of parameters, in the form of:

parameters = [ 100, 200, 300, 400, 500, 600]
result = map(heavy_compute, parameters)

which is obviously parallelizable. In IPython, you parallelize this by first starting a cluster of worker processes from the shell:

$ ipcluster start -n 4

Then, in the code:

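from IPython.parallel import Client

# Connect to the running cluster and take a view on all of its engines.
s = Client()[:]
# Run the import on every engine as well as locally.
with s.sync_imports():
    from mylib import heavy_compute
res = s.map_sync(heavy_compute, parameters)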

In this case, we create a client to the cluster, then the heavy function is sent to the cluster's workers and called with each of the parameters in parallel.
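
For comparison, a similar parallel map can be sketched with the standard library alone, without starting a cluster first. This is not what I used here, only a minimal sketch: the helper name parallel_map, its workers and start_method parameters, and the use of math.factorial as a stand-in for heavy_compute are all assumptions made for illustration.

```python
import math
import multiprocessing


def parallel_map(func, items, workers=4, start_method="fork"):
    """Apply func to every item using a pool of worker processes.

    Each worker is a separate interpreter with its own GIL, so CPU-bound
    calls can really run at the same time. func and the items must be
    picklable, because they are sent to the workers.
    """
    # Assumption: "fork" is available (Linux and other Unix-like systems).
    # Elsewhere, pass "spawn" and keep the __main__ guard below.
    ctx = multiprocessing.get_context(start_method)
    with ctx.Pool(processes=workers) as pool:
        # pool.map blocks until all results are back, in input order.
        return pool.map(func, items)


if __name__ == "__main__":
    # math.factorial stands in for a real CPU-bound function.
    parameters = [100, 200, 300, 400, 500, 600]
    results = parallel_map(math.factorial, parameters)
    # Print the number of digits of each result rather than the huge values.
    print([len(str(r)) for r in results])
```

The pool lives only for the duration of the call, which costs some startup time on each use, whereas the ipcluster workers stay around between calls.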

Ruby

For Ruby, this is even simpler. I wanted to parallelize the task of checking the Markdown syntax of all the translations of the progit book. This task is indeed highly parallelizable.

require 'parallel'

# Glob the two-letter ('??') and regional ('??-??') translation names.
langs = FileList.new('??') + FileList.new('??-??')
# Each block runs in a worker process; return values are collected in order.
results = Parallel.map(langs) do |lang|
  error_code = test_lang(lang, $out)
  if error_code
    print "processing #{lang} KO\n"
  else
    print "processing #{lang} OK\n"
  end
  error_code
end
fail "At least one language conversion failed" if results.any?

Here the parallel gem forks worker processes automatically on demand. This introduces a bit of overhead, but you don't have to manage a cluster. I had to pass the stdout descriptor to the forked processes so that they could print to the console. Note that Ruby lets you define the parallel job in a more functional style.

Conclusion

As most modern computers have multiple cores, it is good to see how simple it has become to take advantage of this additional computing power in my favourite scripting languages.

Side note: yes, I know that any shell language or Makefile can already spawn processes and parallelize tasks in a very natural way. But you can't do everything with bash and make...
