layout - Algorithm to implement a word cloud like Wordle



Context

My Questions

  • Is there an algorithm available that does what Wordle does?
  • If not, what are some alternatives that produce similar kinds of output?

Why I'm asking

  • just curious
  • want to learn

All Answers
  • I'm the creator of Wordle. Here's how Wordle actually works:

    Count the words, throw away boring words, and sort by the count, descending. Keep the top N words for some N. Assign each word a font size proportional to its count. Generate a Java2D Shape for each word, using the Java2D API.
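
    A minimal Python sketch of this counting step, not Wordle's actual code. The stop-word list, the tokenizing regex, and the names `weighted_words`, `top_n`, and `max_font_size` are assumptions made for illustration; shape generation (the Java2D part) is not shown.

```python
import re
from collections import Counter

# Hypothetical stop-word list ("boring words"); a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it"}

def weighted_words(text, top_n=100, max_font_size=96.0):
    """Return (word, count, font_size) tuples, most frequent first."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    top = counts.most_common(top_n)  # sorted by count, descending
    if not top:
        return []
    largest = top[0][1]
    # Font size is proportional to count; the top word gets max_font_size.
    return [(word, n, max_font_size * n / largest) for word, n in top]
```

    For example, `weighted_words("the cat and the dog and the cat", top_n=2)` keeps "cat" and "dog" and gives "cat" twice the font size of "dog".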

    Each word "wants" to be somewhere, such as "at some random x position in the vertical center". In decreasing order of frequency, do this for each word:

    place the word where it wants to be
    while it intersects any of the previously placed words
        move it one step along an ever-increasing spiral
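
    The loop above could be sketched in Python roughly as follows. This is a simplification under stated assumptions: each word is reduced to an axis-aligned rectangle instead of a real glyph shape, the spiral is Archimedean, and `step`, `spacing`, `width`, `height`, and `seed` are made-up parameters, not values from Wordle.

```python
import math
import random

def intersects(a, b):
    """Axis-aligned rectangle overlap test; rectangles are (x, y, w, h)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

def spiral_offsets(step=0.5, spacing=2.0):
    """Yield (dx, dy) offsets along an ever-widening Archimedean spiral."""
    theta = 0.0
    while True:
        r = spacing * theta
        yield r * math.cos(theta), r * math.sin(theta)
        theta += step

def place_words(sizes, width=800.0, height=600.0, seed=0):
    """Place (w, h) boxes, given largest first; return (x, y, w, h) rectangles."""
    rng = random.Random(seed)
    placed = []
    for w, h in sizes:
        # Where the word "wants" to be: random x, vertically centered.
        x0 = rng.uniform(0, width - w)
        y0 = (height - h) / 2
        # The first offset is (0, 0), so the desired spot is tried first.
        for dx, dy in spiral_offsets():
            candidate = (x0 + dx, y0 + dy, w, h)
            if not any(intersects(candidate, other) for other in placed):
                placed.append(candidate)
                break
    return placed
```

    The brute-force `any(...)` check against every placed word is the part that gets slow as the cloud grows.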
    

    That's it. The hard part is in doing the intersection-testing efficiently, for which I use last-hit caching, hierarchical bounding boxes, and a quadtree spatial index (all of which are things you can learn more about with some diligent googling).
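
    As one illustration of the spatial-index idea, here is a minimal quadtree over axis-aligned rectangles in Python. It is a sketch, not Wordle's implementation: the class name, the `capacity` and `max_depth` parameters, and the use of plain rectangles are all assumptions, and last-hit caching and hierarchical bounding boxes are not shown.

```python
class QuadTree:
    """Minimal quadtree over axis-aligned rectangles (x, y, w, h)."""

    def __init__(self, bounds, capacity=4, depth=0, max_depth=8):
        self.bounds = bounds
        self.capacity = capacity
        self.depth = depth
        self.max_depth = max_depth
        self.items = []       # rectangles stored at this node
        self.children = None  # four sub-quadrants once split

    @staticmethod
    def _overlaps(a, b):
        ax, ay, aw, ah = a
        bx, by, bw, bh = b
        return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

    @staticmethod
    def _contains(outer, inner):
        ox, oy, ow, oh = outer
        ix, iy, iw, ih = inner
        return ox <= ix and oy <= iy and ix + iw <= ox + ow and iy + ih <= oy + oh

    def _split(self):
        x, y, w, h = self.bounds
        hw, hh = w / 2, h / 2
        self.children = [
            QuadTree((cx, cy, hw, hh), self.capacity, self.depth + 1, self.max_depth)
            for cx, cy in ((x, y), (x + hw, y), (x, y + hh), (x + hw, y + hh))
        ]
        # Re-insert so items that fit entirely inside one child move down.
        old, self.items = self.items, []
        for rect in old:
            self.insert(rect)

    def insert(self, rect):
        if self.children is not None:
            for child in self.children:
                if self._contains(child.bounds, rect):
                    child.insert(rect)
                    return
            self.items.append(rect)  # straddles a boundary, so it stays here
            return
        self.items.append(rect)
        if len(self.items) > self.capacity and self.depth < self.max_depth:
            self._split()

    def hits(self, rect):
        """Return True if rect overlaps any stored rectangle."""
        if any(self._overlaps(rect, other) for other in self.items):
            return True
        if self.children is None:
            return False
        # Only descend into quadrants the query rectangle actually touches.
        return any(
            child.hits(rect)
            for child in self.children
            if self._overlaps(child.bounds, rect)
        )
```

    A placement loop would call `tree.hits(candidate)` in place of testing the candidate against every placed word, and `tree.insert(candidate)` once a free spot is found.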

    Edit: As Reto Aebersold pointed out, there's now a book chapter, freely available, that covers this same territory: Beautiful Visualization, Chapter 3: Wordle


  • I've implemented the algorithm described by Jonathan Feinberg in Python to create a tag cloud. It falls far short of the beautiful clouds of wordle.net, but it gives you an idea of how it could be done.

    You can find the project here.


  • I've created a Silverlight component that uses the algorithm Jonathan suggests here. The source code and example projects are all available on my blog:

    http://whydoidoit.com

    Color word cloud

    My cloud lets you color and size words based on different weightings and it supports word selection (from a coordinate) and selected word highlighting. The source is yours to use as you see fit.

    Example Word Cloud


  • Here's a really nice JavaScript one from Jason Davies that uses d3. You can even use web fonts with it.

    Demo: http://www.jasondavies.com/wordcloud/

    GitHub: https://github.com/jasondavies/d3-cloud


  • I'm working on WordCram, a Processing library for making word clouds. It's pretty heavily influenced by Wordle, and is informed by the same PDF aeby linked to above. It handles the collision detection for you, and lets you focus on how you want your words laid out, colored, rotated, etc.


  • http://code.google.com/apis/visualization/documentation/gallery.html

    Check out the word cloud visualization. Not as fancy as wordle.net but real easy to add to your site.


  • I was looking for a Wordle-like visualization that would let me assign the color, initial position, and size of a string based on other data, such as its relevance within a text. I didn't find anything, but thanks to the information I found here (especially Jonathan's explanation and aeby's link), I could finally implement 'Cloudio', which comes relatively close to Wordle (at least I think so...) and offers the features I was looking for.

    It is implemented with SWT and JFace, and I tried to integrate it into the MVC model of JFace, so that you can set content and label providers to modify the layout of a cloud and add it to other Eclipse plugins or RCP apps. You can also modify the way the initial position of a string is calculated, so it is not difficult to use it for cluster visualization or similar purposes. It is still poorly documented and limited in some ways (and I did the initial upload a few hours ago, so it might still be a bit buggy), but if you're interested, here's the link:

    And here's a link to some created clouds, in case you want a quick impression: https://github.com/sschwieb/Cloudio/wiki/Example-Clouds

    Cheers, Stephan


  • Here is my implementation of a Wordle-like cloud. It uses the same spiral algorithm and the QuadTree data structure.

    http://sourcecodecloud.codeplex.com

    or

    http://www.codeproject.com/Articles/224231/Word-Cloud-Tag-Cloud-Generator-Control-for-NET-Win


  • Cecil Lee

    Lion and Lamb is an open-source iOS app that creates word clouds using the most frequent words from a chosen book of the Bible.

    It's based on the algorithm as described by Jonathan Feinberg. Hit testing does utilize a quad tree, but the bounding boxes are based on the glyph's bounding rectangle. I want to break the glyph down into many smaller bounding rects to enable word placement within a glyph's bounding box.
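
    One way to picture the "many smaller bounding rects" idea is sketched below in Python. It is not code from Lion and Lamb: it assumes the glyph has already been rasterized to a small bitmap (a list of strings, with '#' marking ink), and the names `ink_rects` and `collide` are hypothetical. A fuller version would keep the rectangles in a hierarchy rather than a flat list.

```python
def ink_rects(mask, x=0, y=0, w=None, h=None):
    """Recursively split a bitmap into rectangles that contain only ink.

    mask is a list of equal-length strings; '#' marks an inked pixel.
    Returns (x, y, w, h) rectangles that together cover every inked pixel.
    """
    if w is None:
        w, h = len(mask[0]), len(mask)
    if w == 0 or h == 0:
        return []
    cells = [mask[r][c] == "#" for r in range(y, y + h) for c in range(x, x + w)]
    if not any(cells):
        return []               # empty region: other words may go here
    if all(cells):
        return [(x, y, w, h)]   # solid region: one rectangle is exact
    hw, hh = (w + 1) // 2, (h + 1) // 2
    return (
        ink_rects(mask, x, y, hw, hh)
        + ink_rects(mask, x + hw, y, w - hw, hh)
        + ink_rects(mask, x, y + hh, hw, h - hh)
        + ink_rects(mask, x + hw, y + hh, w - hw, h - hh)
    )

def collide(rects_a, pos_a, rects_b, pos_b):
    """True if any rectangle of shape A overlaps any rectangle of shape B."""
    ax0, ay0 = pos_a
    bx0, by0 = pos_b
    for ax, ay, aw, ah in rects_a:
        for bx, by, bw, bh in rects_b:
            if (ax0 + ax < bx0 + bx + bw and bx0 + bx < ax0 + ax + aw and
                    ay0 + ay < by0 + by + bh and by0 + by < ay0 + ay + ah):
                return True
    return False

# A crude 4x4 "L" glyph: the upper-right corner of its bounding box is empty.
L_MASK = [
    "#...",
    "#...",
    "#...",
    "####",
]
```

    With this split, a 2x2 block placed in the empty upper-right corner of the "L" no longer counts as a collision, even though it lies inside the glyph's bounding rectangle.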

    GitHub: https://github.com/PetahChristian/LionAndLamb

    A word cloud of the Bible book of Revelation


  • I have a Tag Cloud generator here, which I call Disorganizer :)

    The sources are TagCloudService, the Razor markup control, and a WinForm for testing purposes. With a little wrapper around it, you can put it in your blog, profile, etc. It uses C# 4.0 and the System.Drawing namespace heavily.

    I created it because with the other cloud generators you cannot click on tags to navigate and cannot create hover animations to show that they are clickable. Since showing hover animation in HTML is necessary for me (I'm doing this with overlaid, absolutely positioned <a> tags), I haven't developed any-angle word display: words are either vertical or horizontal.

    Warning: The above links may become invalid in a few months; I plan to slowly untie it from the surrounding project into a separate project.

    You can see a working demo on this sample blog post, but it is incomplete and sits in an incomplete site. Contact me if you want to contribute; I will get on with separating it out ASAP.


  • Here is yet another end-to-end implementation of Wordle in Python 3, largely based on the initial outline by Jonathan Feinberg (QuadTrees, spirals, etc.).

    The code (commented, with a detailed ReadMe file) is freely available at this GitHub repository, and this is a sample wordle created with the code.

    Macbeth