Finally metalcon.de, the social networking site that Jonas, Jens and I created in 2008, gets a redesign. Thanks to the great opportunities at the Institute for Web Science and Technologies here in Koblenz (why don’t you apply for a PhD position with us?), I will have the chance to code up the new version of metalcon. Kicking off on July 15th, I will lead a team of 5 programmers for 4 months. Not only will the development be open source, but during this time I will also write regularly (hopefully daily) in this blog about the design decisions we take in order to build a web service that scales well.

Before I share my thoughts on highly scalable architectures for web sites, I want to give a little history and background on what metalcon is and why this redesign is so necessary:

Metalcon is a social networking site for German fans of metal music. It currently has

  • a user base of 10’000 users.
  • about 500 registered bands
  • a highly semantic and interlinked database (bands, geographical coordinates, friendships, events)
  • 624 MB of text and structured data about the mentioned topics.
  • fairly good visibility in search engines.
  • > 30k lines of code (mostly PHP)
  • an architecture that scales badly (our own OR mapper, our own AJAX libraries, a big monolithic database design, bad usage of PHP, …)
  • no unit tests (so code maintenance is almost impossible)
  • no music and audio files
  • no processes for content moderation
  • no processes to fight spam and block users
  • really bad usability (I could write tons of posts about where the usability is lacking)
  • no clear separation of features that users could understand

When we built metalcon, no one on the team had experience with highly scalable web applications, and we were just happy to get it running at all. After returning from China and starting my PhD program in 2011, I was about to shut down metalcon. Though we had become close friends, the core team had already moved on to new projects and we were lacking manpower. On the other hand, everyone kept telling me that metalcon would be a great place to do research. So in 2011 Jonas and I decided to give it another shot and do an open redevelopment. We set up a wiki to document our features and the software, and we created a developer blog which we used to exchange ideas. We also created some open source projects, to which we hardly contributed any code due to the lack of manpower…

Well, at that time we already knew of so many problems that fixing the old code was not the way to go. At least we did learn a lot. Thinking about highly scalable architectures back then, I knew that a news feed (which the old version of metalcon already had) was at the very core of the user experience. From reading many Stack Exchange discussions I knew that you wouldn’t build such a stream on MySQL. Playing around with graph databases like neo4j also led to my first research paper, in which I built Graphity, a piece of software designed to distribute highly personalized news streams to users. Since our development was not making progress, we never deployed Graphity within metalcon. Building an autocomplete service for the site should also no longer be a problem.
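
To make the news feed problem a bit more concrete, here is a minimal Python sketch of one common way to assemble a personalized stream: keep the events of every followed band or user sorted newest-first and lazily merge only as many entries as the reader needs. This is not the Graphity algorithm from the paper, just an illustration; the names `Event` and `top_k_feed` and the sample data are made up for this example.

```python
import heapq
from itertools import islice
from typing import Dict, List, NamedTuple


class Event(NamedTuple):
    timestamp: int  # larger means newer
    author: str
    text: str


def top_k_feed(streams: Dict[str, List[Event]], followed: List[str], k: int) -> List[Event]:
    """Return the k newest events written by the followed authors.

    Every per-author list must already be sorted newest-first.
    """
    selected = [streams[author] for author in followed if author in streams]
    # heapq.merge is lazy; with reverse=True it expects inputs sorted largest-first.
    merged = heapq.merge(*selected, key=lambda event: event.timestamp, reverse=True)
    # Stop after k entries instead of materializing the whole merged stream.
    return list(islice(merged, k))


# Hypothetical sample data: the reader follows band_a and user_c, but not band_b.
streams = {
    "band_a": [Event(9, "band_a", "new album"), Event(3, "band_a", "tour dates")],
    "band_b": [Event(7, "band_b", "festival slot")],
    "user_c": [Event(8, "user_c", "concert review"), Event(1, "user_c", "joined")],
}
feed = top_k_feed(streams, ["band_a", "user_c"], 3)
```

The point of the sketch is that the work per request depends on the number of followed streams and on k, not on the total number of events in the system, which is the property a join-heavy relational query tends to lose.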

Roadmap for the redesign

  • Over the next weeks I hope to read as many interesting articles about technologies and high scalability as I can possibly find, and I will be more than happy to get your feedback and suggestions here. I will start by reading many articles on http://highscalability.com/. This blog is pure gold for serious web developers.
  • During a nice discussion about scalability with Heinrich we already came up with a potential architecture for metalcon. I will introduce this architecture soon, but first I want to check it against the best practices described on the High Scalability blog.
  • In parallel I will also collect the features needed for the new metalcon version and hopefully be able to pair them with useful technologies. I already started a wiki page about features and the technologies planned to support them.
  • I will also need to decide on the programming language and paradigms for the development. Right now I am playing around with Ruby on Rails vs GWT. We have had some great experiences with the power of GWT, but one major drawback is certainly that the result is more of an application than a lightweight website.

So again, feel free to give input and share your ideas and experiences with me and with the community. I will be very grateful for every recommendation of articles, videos, books and so on.


5 Comments on Metalcon finally gets a redesign – Thinking about high scalability

  1. Ruben says:

    Rene,

    Thanks for all of your assistance! Let me recap and add to our recent conversation, and make a case for Ruby on Rails (and / or other RAD frameworks – I don’t want to ignite a Rails vs Django vs Cake debate – all of these frameworks share some of the same pros and cons).

    I do not claim to be an expert, and welcome any corrections. I wrote this on a plane with no internet access, so keep that in mind.

    Pros

    1. Rails Scalability: Ruby on Rails and other frameworks have come a long way in terms of performance and scaling. It’s true that older versions of Rails had issues, which have since been improved or resolved. As far as blocking is concerned, app servers like Phusion Passenger spread processes over cores to resolve this issue. I list this as a pro, though it could be a disadvantage as well, to point out that things have improved for Rails in this area.

    2. Plugins, Gems, APIs etc: Frameworks offer features such as libraries, plug-ins, GEMS (Ruby on Rails) that can help cut development time down. Authentication, administration, security, and database GEMS are available in Ruby on Rails so you can focus on what is different about your application. (For example, if you want to authenticate with Facebook, you could write your own code, or you could drop in a GEM that has the code to do it. That lets you focus on the rest of your app, for which there are no plug-ins.)

    3. RailsCasts – 300 episodes that really help you to learn Ruby on Rails. One of the best resources for learning how to do things with ROR.

    4. Proven: Twitter, Hulu, LinkedIn are examples of sites that started with Rails. True, some of them have removed Rails from the back end, but ROR gave them a competitive advantage and allowed them to grow in the first place. (Also, the versions of Rails they used had more scalability issues than the current version does.) Unless your site is going to have tens or hundreds of millions of users, the odds are that you will not need to remove Rails from the back end. So in a nutshell, ROR is a proven framework for quickly creating top tier apps.

    5. Convention not configuration: Instead of worrying so much about configuration, simply write code to get the job done.

    6. Ease of development: Once you understand ROR, it’s very quick to develop functionality. For example, I’ve created social network follow functionality in Neo4j & ROR in a little under 10 minutes.

    7. Lightweight frameworks: Don’t need the heavy functionality of Rails? Need a lighter library to host web services? Try Sinatra instead of the full Rails stack. Create sockets, live chatting, etc this way.

    8. Proven in hybrid systems: (Both an advantage and disadvantage.) To scale out, no matter what you write your system in, the odds are you will be using a hybrid design. Whether you use Django, Ruby on Rails, PHP with Cake, Java, Node.js, etc, you’ll probably have a hybrid system with in-memory database servers, Node.js, message queuing systems, app servers, data sharding, data replication, HTTP caching, load balancers etc. The advantage Rails offers is the experience others have with it – they have shared it and provided a road map so you don’t have to guess at how to design a ROR hybrid. The disadvantage is that the architecture of a hybrid system is much more complex.

    9. Framework: There are common sets of functions to use, a common architecture, etc. Compare this to Node.js. Node.js is not an app framework – you don’t have a robust set of standard libraries to call on for common application functionality. Node.js doesn’t help you develop applications faster, though it will help you develop faster applications and to split out web services with great ease.

    10. JavaScript and jQuery: Rails now uses jQuery and JavaScript as standards. If you want to develop even faster, CoffeeScript can be used to develop your JavaScript routines. Take advantage of the many jQuery routines already written (like endless pages / lazy loading of pages) to help speed up your development. Integrate Twitter Bootstrap with ROR. Even use GWT with ROR as well … The con of this is that you have to learn how to integrate these with Rails, but there are GEMS to handle that so you don’t have to! Once you get past the initial learning curve of using these with Rails, it’s pretty easy.

    11. Learning curve: ROR in many ways is easy to learn. Scaffolding is a great way to start creating an app. I still use it even though I don’t need it anymore to develop Rails apps. Describing a table with minimal typing (you don’t specify field lengths etc – but you can if you want to) via scaffolding also creates the model and views for you, which helps get your app up and running very quickly. It is very quick to develop this way, and easy to learn.

    12. Database migrations: Add or remove fields to your tables with migrations. Very useful in synchronizing test and production database schemas – ROR will make the necessary changes to production tables without the assistance of a DBA.

    13. DBA is not Required: A database administrator (DBA) is not necessary to develop ROR apps. Well, a DBA could be helpful, but it’s not necessary to have a DBA create tables etc. Rails will do this for the developer, resulting in quicker development time.

    14. Streaming: Stream data to create live chatting, or send data to mobile devices with long request lifetimes. While it’s not as robust as doing so with Node.js, it nevertheless offers Node.js-like functionality and performance. (Available with ROR 4 and with 3.2 with some tweaks.)

    15. JRuby: Uses a Java VM and can access Java APIs.

    16. Community: The Rails community offers a tremendous amount of support not often found with other frameworks. Have a problem? It’s easy to get help from the community. Like this response about scalability – the community is very helpful.

    Disadvantages.

    1. ORM: Well, Rails uses ActiveRecord, which has some significant overhead. On the other hand it’s possible to use SQL statements and procs so that performance can be optimized. With my current Neo4j project, I’m not using ActiveRecord – only ActiveModel, and there are other alternatives to ActiveRecord. While some say never use ORM due to scalability, I think ORM’s proper place is to quickly develop an app – if a project is successful, there will be time and money to move to something else. If a project is only moderately successful, you may find that an ORM is just fine.

    2. Threading / Thread blocking: It’s possible to get around this, and Rails 4 offers thread-safe functionality, but still, it’s something that has to be considered in terms of the server architecture and app environment (Phusion Passenger etc).

    The typical way to get around this involves data sharding, data replication, load balancing, caching, in-memory database servers (Redis, Orient, MongoDB etc). Of course, you do want to make sure your code and database design take scalability into account – so use best practices when designing and coding.

    4. Code Samples: Not so much a ROR issue but more of a community issue: some examples of code don’t work in the version of Rails you may use. Or the examples are outdated… Some GEMS won’t work with new versions of Rails, or are minimally documented. Sometimes you might find that you end up spending more time trying to get the GEM to work than simply writing the equivalent code yourself. (If I am having problems with a new GEM, and can’t get it working in 3 hours or less, I just remove the GEM and start to code the functionality myself.)

    5. Performance issues are addressed by adding more servers. The way to scale out sites based on frameworks is to use multiple web servers to handle high loads. Node.js does not require this technique in terms of web servers, at least not to the same degree, but Node.js doesn’t provide the framework for apps and websites either.

    6. Plugins, Gems, APIs etc: Frameworks offer features such as libraries, plug-ins, GEMS (Ruby on Rails) that can help cut development time down. Authentication, administration, security, and database GEMS are available in Ruby on Rails so you can focus on what is different about your application.

    7. Lack of a standard IDE: While you can use Eclipse or NetBeans (or some others), there isn’t a default editor or IDE for ROR. I edit with TextMate. I could use Eclipse, which I love, but for now I’m happy with using a text editor. But not everyone likes to code this way.

    8. Limited Windows support: I started programming in ROR on Windows, but occasionally I would come across something that had not yet been compiled for Windows. I’m not interested in wasting time trying to compile something or searching for a workaround for Windows – I want to get my project running and developed quickly. Things may have improved, but I ended up switching to OSX where I’ve never had this issue.

    9. Framework Scalability: If you expect Rails or just about any other framework to scale right out of the box, it won’t. Not to the levels you will need to build a Twitter or Facebook. You’ll have to create a hybrid architecture to do this.

    10. Heavy Metal: As with most framework based apps, you add metal (servers) to help scale. Node.js is very fast, and you don’t have to throw so much metal at an app. The tradeoff with Node.js is that it will take longer to develop your apps.

    11. MVC. If one does not like the MVC pattern, ROR might not be a good choice. I believe it’s possible to use other patterns, but out of the box it’s MVC and the examples and help provided online will primarily consist of MVC…

    12. REST & routing. If you don’t like REST, and don’t want to be confined to it, Rails might not be a good choice. The routing is very flexible, and REST can have additional methods besides edit/delete, etc, but out of the box this is how it is, and if you come from an environment that doesn’t use REST, you might not like it so much.

    There are many other pros and cons. I’m partial to ROR. I’ve done JSP, ASP, ASP.NET, Websphere, and I just find ROR faster to develop with. I’ll pick up Django soon – I’m an equal opportunity coder when it comes to earning money so I *will* develop with any system/framework/etc.

    Finally, there are some things that irritate me, and some things I love. But all in all I highly recommend it.

    • Hey Ruben,

      thanks again for our phone conversation and now for your amazing summary! I was already more and more convinced to use Ruby on Rails as the framework, since I really see some advantages, and the parts that would prevent me from scaling will be implemented as standalone services and integrated later on.
      The Ruby discussion is not quite through yet but still your comment pushed it in a certain direction because it really shows how the community works!

      best Rene

  2. [...] been running metalcon, an online social network for metal fans and metal bands. As written recently I have the chance to rewrite the entire platform with a team of 6 programmers. This time we want to do it the correct way. Instead of Thinking of [...]

  3. [...] I had in mind that performance is the same as scaling (which is a wrong assumption) and asked about Ruby on Rails vs GWT. I am totally convinced that GWT is much more performant than Ruby. But I have seen GWT code and it [...]

  4. [...] the redesign phase of metalcon we started playing around with HBase to support the architecture of our like button and especially [...]
