metalcon – Data Science, Data Analytics and Machine Learning Consulting in Koblenz Germany
https://www.rene-pickhardt.de
Extract knowledge from your data and be ahead of your competition

My ranked list of priorities for Backend Web Programming: Scalability > Maintainable code > performance
https://www.rene-pickhardt.de/my-ranked-list-of-priorities-for-backend-web-programming-scalability-maintainable-code-performance/
Sat, 27 Jul 2013 09:21:55 +0000

The redevelopment of metalcon is under way, and so far I have been very concerned about performance and web scale. Prompted by Martin's progress on his bachelor thesis, we reviewed his code for calculating Generalized Language Models with Kneser-Ney smoothing. Even though his code is standalone (but very performance-sensitive) software, I realized that for a web application, writing maintainable code seems to be as important as thinking about scalability.

Scaling vs performance

I am a performance guy. I love algorithms and data structures. When I was 16 years old I had already written a program that could play chess against you, using a high-performance programming technique called bitboards.
But thinking about metalcon, I realized that web scale is not so much about the performance of single services or parts of the software but rather about the scalability of the entire architecture. After many discussions with colleagues, Heinrich Hartmann and I came up with a software architecture that we believe will scale for web sites that are supposed to handle several million monthly active users (probably not billions, though). After I discussed the architecture with my team of developers, Patrik wrote a nice blog article about the service-oriented, data-denormalized architecture for scalable web applications (which of course was not invented by Heinrich and me; Patrik found out that it had already been described in a WWW publication from 2008).
Anyway, this discussion showed me that scalability is more important than performance! I have to add, though, that the standalone services should still be very performant. If a service can only handle 20 requests per second, then even if it scales horizontally with ease, you will simply need too many machines.
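To make the "too many machines" point concrete, here is a small back-of-the-envelope sketch. Only the 20 requests per second figure comes from the text above; the peak load of 5,000 requests per second, the faster service with 2,000 requests per second, and the 50% headroom are assumptions chosen purely for illustration.

```python
import math


def machines_needed(peak_rps: float, rps_per_machine: float, headroom: float = 0.5) -> int:
    """Estimate how many machines a horizontally scaled service needs.

    peak_rps        -- requests per second the whole service must absorb (assumed)
    rps_per_machine -- requests per second a single machine can handle
    headroom        -- fraction of each machine's capacity kept free (assumed)
    """
    if peak_rps < 0 or rps_per_machine <= 0 or not 0 <= headroom < 1:
        raise ValueError("invalid capacity parameters")
    usable_rps = rps_per_machine * (1 - headroom)
    return math.ceil(peak_rps / usable_rps)


# A service that handles 20 requests per second per machine.
slow_service = machines_needed(peak_rps=5000, rps_per_machine=20)

# A hypothetical service that is 100 times faster per machine.
fast_service = machines_needed(peak_rps=5000, rps_per_machine=2000)
```

Under these assumptions the slow service needs 500 machines and the fast one needs 5. Both scale horizontally, but only one of them is affordable to run.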

Performance vs. Maintainable code

Especially after the code review, but also with the currently running metalcon version in mind, I came to the conclusion that there is an incredibly high value in maintainable code. The hacker community seems to agree that maintainability comes before performance (only one of many examples).
At this point I want to recall my initial post on the redesign of metalcon. I had in mind that performance is the same as scaling (which is a wrong assumption) and asked about Ruby on Rails vs GWT. I am totally convinced that GWT is much more performant than Ruby. But I have seen GWT code, and it seems almost impractical to maintain. On the other hand, from all that I know, Ruby on Rails is very easy to maintain but less performant. The good thing is that it scales horizontally with ease, so it seems almost like a no-brainer to use Ruby on Rails rather than GWT for the front end design and middle layer of metalcon.

Maintainable code vs scalability

Now comes the most interesting thing I realized. A software architecture scales best if it consists of many independent services. If services need to interact, their communication should be asynchronous and non-blocking. Creating a clear software architecture with clear communication protocols between its parts will do two things for you:

  1. It will help you to maintain the code. This will cut down development cost and time. In particular, it will be easy to add, remove or exchange functionality within the entire software architecture. The last point is crucial since
  2. Being able to easily exchange parts of the software or single services will help you to scale. Every time you identify a bottleneck you can fix it by replacing that part of the software with a better-performing system.
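The two points above can be sketched in a few lines. This is a minimal illustration and not metalcon's actual code: the service names, the `latest` and `name` methods, and the sleep times standing in for backend calls are all hypothetical. The page code depends only on a small protocol, the two services are queried concurrently without blocking each other, and the slow news feed can be replaced by a faster one without touching the caller.

```python
import asyncio
from typing import Protocol


class NewsFeedService(Protocol):
    """The communication contract: anything with this method can serve the feed."""

    async def latest(self, user_id: int) -> list[str]: ...


class SlowNewsFeed:
    async def latest(self, user_id: int) -> list[str]:
        await asyncio.sleep(0.05)  # stands in for a slow backend query
        return [f"news for user {user_id}"]


class FastNewsFeed:
    async def latest(self, user_id: int) -> list[str]:
        await asyncio.sleep(0.001)  # stands in for a better-performing system
        return [f"news for user {user_id}"]


class ProfileService:
    async def name(self, user_id: int) -> str:
        await asyncio.sleep(0.01)  # stands in for another independent service
        return f"user-{user_id}"


async def render_start_page(user_id: int, feed: NewsFeedService, profiles: ProfileService) -> dict:
    # Both services are awaited concurrently; neither call blocks the other.
    name, items = await asyncio.gather(profiles.name(user_id), feed.latest(user_id))
    return {"name": name, "items": items}
```

Calling `asyncio.run(render_start_page(7, SlowNewsFeed(), ProfileService()))` and then passing `FastNewsFeed()` instead returns the same page data; only the bottleneck was exchanged.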
To achieve scalable code one needs to include a middle layer for caching and to abstract certain things. The same is done to get maintainable code (often at the cost of some performance).
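A caching middle layer of the kind mentioned above could look roughly like the following sketch. The class `CachingLayer`, the function `slow_band_lookup` and the integer band ids are hypothetical names introduced for illustration, and the sketch assumes cached values never need to be invalidated, which a real system would have to handle.

```python
from typing import Callable, Dict


class CachingLayer:
    """Sits between callers and a backend; callers only ever see get()."""

    def __init__(self, backend: Callable[[int], str]) -> None:
        self._backend = backend
        self._cache: Dict[int, str] = {}
        self.backend_calls = 0  # counts how often the backend was really hit

    def get(self, key: int) -> str:
        if key not in self._cache:
            self.backend_calls += 1
            self._cache[key] = self._backend(key)
        return self._cache[key]


def slow_band_lookup(band_id: int) -> str:
    # Hypothetical stand-in for an expensive database query.
    return f"band-{band_id}"


bands = CachingLayer(slow_band_lookup)
first = bands.get(1)   # goes to the backend
second = bands.get(1)  # served from the cache
```

The extra indirection costs a little on every call, but the backend behind `get()` can be swapped or scaled without changing any caller, which is the same abstraction that keeps the code maintainable.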

Summary

I find this very interesting and counterintuitive. One would think that performance is a core element of scalability, but I have the strong feeling that writing maintainable code is much more important. So my ranked list of priorities for backend web programming (!) looks like this:

  1. Scalability first: No amount of maintainable code helps you if the system doesn’t scale and can’t be served to millions of users.
  2. Maintainable code: As stated above, this should go almost hand in hand with scalability.
  3. Performance: Of course we can’t have a database design where queries need seconds or minutes to run. Everything should happen within a few milliseconds. But if the code can become more maintainable at the cost of another few milliseconds, I guess that’s a good investment.

Metalcon finally gets a redesign – Thinking about high scalability
https://www.rene-pickhardt.de/metalcon-finally-becomes-a-redesign-thinking-about-high-scalability/
Mon, 17 Jun 2013 15:21:30 +0000

Finally metalcon.de, the social networking site that Jonas, Jens and I created in 2008, gets a redesign. Thanks to the great opportunities at the Institute for Web Science and Technologies here in Koblenz (why don’t you apply for a PhD position with us?) I will have the chance to code up the new version of metalcon. Kicking off on July 15th, I will lead a team of 5 programmers for 4 months. Not only will the development be open source, but during this time I will constantly (hopefully on a daily basis) write in this blog about the design decisions we take in order to achieve a web service that scales well.
Before I share my thoughts on highly scalable architectures for web sites, I want to give a little history and background on what metalcon is and why this redesign is so necessary:

Metalcon is a social networking site for German fans of metal music. It currently has

  • a user base of 10’000 users.
  • about 500 registered bands
  • a highly semantic and interlinked database (bands, geographical coordinates, friendships, events)
  • 624 MB of text and structured data about the mentioned topics.
  • fairly good visibility in search engines.
  • > 30k lines of code (mostly PHP)
  • a badly scaling architecture (own OR-mapper, own AJAX libraries, big monolithic database design, bad usage of PHP,…)
  • no unit tests (so code maintenance is almost impossible)
  • no music and audio files
  • no processes for content moderation
  • no processes to fight spam and block users
  • really bad usability (I could write tons of posts about where the usability falls short)
  • no clear distinction of features for users to understand

When we built metalcon, no one on the team had experience with highly scalable web applications, and we were just happy to get it running at all. After returning from China and starting my PhD program in 2011, I was about to shut down metalcon. Though we had become close friends, the core team had already moved on to new projects and we were lacking manpower. On the other hand, everyone kept telling me that metalcon would be a great place to do research. So in 2011 Jonas and I decided to give it another shot and do an open redevelopment. We set up a wiki to document our features and the software, and we created a developer blog which we used to exchange ideas. We also created some open source projects, to which we hardly contributed any code due to the lack of manpower…
Well, at that time we already knew of too many problems, so fixing the old code was not the way to go. At least we did learn a lot. Thinking about highly scalable architectures at that time, I knew that a news feed (which the old version of metalcon already had) was at the very core of the user experience. From reading many Stack Exchange discussions I knew that you wouldn’t build such a stream on MySQL. Playing around with graph databases like neo4j also led to my first research paper, building Graphity, software designed to distribute highly personalized news streams to users. Since our development was not proceeding, we never deployed Graphity within metalcon. Building an autocomplete service for the site should also not be a problem anymore.

Roadmap for the redesign

  • Over the next weeks I hope to read as many interesting articles about technologies and high scalability as I can possibly find, and I will be more than happy to get your feedback and suggestions here. I will start by reading many articles on http://highscalability.com/. This blog is pure gold for serious web developers.
  • During a nice discussion about scalability with Heinrich we already came up with a potential architecture for metalcon. I will introduce this architecture soon, but first I want to check it against the best practices described in the High Scalability blog.
  • In parallel I will also collect the features needed for the new metalcon version and hopefully be able to pair them with useful technologies. I have already started a wiki page about features and the technologies planned to support them.
  • I will also need to decide on the programming language and paradigms for the development. Right now I am playing around with Ruby on Rails vs GWT. We had some great experiences with the power of GWT, but one major drawback is certainly that the resulting website is more of an application than a lightweight website.

So again, feel free to give input and share your ideas and experiences with me and with the community. I will be very grateful for every recommendation of articles, videos, books and so on.
