Nils Grunwald works at the French startup Linkfluence. Their product is, more or less, social network analysis and graph processing. They crawl the web and blogs, or obtain data from other social networks, and provide their customers with statistics and insights.

In this scenario big data is obviously involved, and the data has the natural structure of a graph. He says a system for processing this data has to meet the following constraints:

  • The processing should not compromise the rest of the system
  • Low maintenance costs
  • Used for queries and rapid prototyping (so they want a “general” graph processing solution, as customer needs change)
  • Flexible, since it is hard to tell beforehand which fields or metadata will be used

He then introduces their solution, Cascalog. It is based on Hadoop and inspired by Cascading, a workflow management system, and by Datalog, a subset of Prolog. As a declarative, expressive language, it offers a very concise way of writing queries and enables quick prototyping.
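To give a rough feel for the declarative style, here is a minimal sketch in plain Python (not Cascalog syntax) of the kind of join that a Datalog-like rule such as `two_hop(a, c) :- links(a, b), links(b, c)` expresses. The relation name `links`, the sample blogs, and the function `two_hop` are made up for illustration and were not part of the talk.

```python
from collections import defaultdict

# Hypothetical sample data: (source, target) hyperlinks between blogs.
links = {
    ("blog_a", "blog_b"),
    ("blog_b", "blog_c"),
    ("blog_b", "blog_d"),
    ("blog_c", "blog_a"),
}


def two_hop(edges):
    """Return all (a, c) pairs such that a links to some b and b links to c."""
    # Index edges by source so the join on the shared variable b is a lookup.
    targets_by_source = defaultdict(set)
    for source, target in edges:
        targets_by_source[source].add(target)

    result = set()
    for a, b in edges:
        for c in targets_by_source[b]:
            result.add((a, c))
    return result


if __name__ == "__main__":
    for pair in sorted(two_hop(links)):
        print(pair)
```

In a declarative language the rule alone would be the whole query; the indexing and looping above is what the engine works out for you, which is where the conciseness comes from.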

For me personally it is not a very interesting solution, since it cannot answer queries in real time, which is of course obvious if you consider the technologies it is based on. But I guess for people who have time and just do analysis, this solution will probably work pretty well!

What I really liked about his solution is that after processing the graph you can export the data to Gephi or to Neo4j to get fast query processing.
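The talk summary does not say how that export works, so the following is only one plausible handoff, sketched in Python: writing the processed edges as a CSV edge list. Gephi's spreadsheet import recognizes `Source` and `Target` columns, and Neo4j can read CSV files with `LOAD CSV`. The function name `edges_to_csv` and the `Weight` column are assumptions for illustration.

```python
import csv
import io


def edges_to_csv(edges):
    """Serialize (source, target, weight) tuples as a CSV edge list."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Header names follow the Source/Target convention of Gephi's importer.
    writer.writerow(["Source", "Target", "Weight"])
    # Sort the rows so the output is deterministic.
    for source, target, weight in sorted(edges):
        writer.writerow([source, target, weight])
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical result of a batch job: weighted links between blogs.
    print(edges_to_csv([("blog_b", "blog_c", 2), ("blog_a", "blog_b", 1)]))
```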

He then explained a lot of specific details about the syntax of Cascalog.



Nils Grunwald from Linkfluence talks about Cascalog at FOSDEM
