New Docker Sandboxes To Speed Up Submissions

The speed at which we can run submissions in our sandbox has been a challenge for a while. For large, complex submissions we'd often have to wait a few minutes for the backlog to clear. Grinding out each submission in actual machine cycles worked really well, but it created real usability and scalability challenges as the number of users on the system grew. We knew we were going to need a faster and more user-friendly system.

The system we've moved to uses Docker, and it's really fast. 

The new CodeEval sandboxes are hosted on the latest AWS Linux instances with kernel version 4.1.7. The good news is that this kernel version supports the OverlayFS storage driver, which we've found to be the fastest storage option for Docker. Using the OverlayFS storage driver gives us a very big speed advantage. For example, here is how long the command docker run --rm --entrypoint=/bin/bash ubuntu:latest --version takes to run:

  • On AWS Linux with the OverlayFS storage driver: 0.393415212631 sec
  • On VirtualBox CentOS 6.7 with the default storage driver: 2.62845516205 sec
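
A comparison like this can be reproduced with a small timing harness. The sketch below is only an illustration: the helper name time_command and the constant DOCKER_VERSION_CMD are made up for this example, and actually running the Docker command assumes a working Docker installation on the host being measured.

```python
import subprocess
import time

# The command quoted above, split into an argument list.
DOCKER_VERSION_CMD = [
    "docker", "run", "--rm", "--entrypoint=/bin/bash", "ubuntu:latest", "--version",
]


def time_command(argv):
    """Run argv once and return (elapsed_seconds, exit_code).

    Output is discarded so that terminal I/O does not distort the timing.
    """
    start = time.perf_counter()
    completed = subprocess.run(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return time.perf_counter() - start, completed.returncode
```

Calling time_command(DOCKER_VERSION_CMD) on each host gives figures comparable to the ones above. A single run is noisy, so averaging several runs is advisable.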

To execute code for interpreted languages:

To execute the code for compiled languages:
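
As a rough sketch of how the two kinds of run might be assembled (this is not CodeEval's actual configuration), the helpers below build docker run argument lists for an interpreted and a compiled submission. The function names, the /sandbox mount point, the memory limit and the image names used in the usage note are all assumptions made for illustration.

```python
def build_run_command(image, host_dir, inner_argv, memory="256m"):
    """Return a docker run argv that executes inner_argv in a throwaway container."""
    return [
        "docker", "run", "--rm",       # remove the container when it exits
        "--network", "none",           # no network access for untrusted code
        "--memory", memory,            # cap memory use (assumed limit)
        "-v", f"{host_dir}:/sandbox",  # mount the submission directory
        "-w", "/sandbox",              # run from inside the mounted directory
        image,
    ] + list(inner_argv)


def interpreted_command(image, host_dir, interpreter, source_file):
    """Interpreted languages: hand the source file straight to the interpreter."""
    return build_run_command(image, host_dir, [interpreter, source_file])


def compiled_command(image, host_dir, compile_cmd, run_cmd):
    """Compiled languages: compile first, then run the binary only if that succeeds."""
    return build_run_command(image, host_dir, ["sh", "-c", f"{compile_cmd} && {run_cmd}"])
```

For example, interpreted_command("python:3", "/tmp/sub1", "python", "main.py") and compiled_command("gcc:latest", "/tmp/sub2", "gcc main.c -o main", "./main") produce argument lists that can be passed to subprocess.run.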

CodeEval Now Supports R and Visual Basic

Today we're excited to add support for two new languages, R and Visual Basic, which means that CodeEval now supports 20 different programming languages.

R and Visual Basic are the two languages that have generated the most requests to be added. They're also ranked #10 and #12 on TIOBE's language popularity index, which is important for benchmarking our own data on the most popular programming languages.

The R language in particular has grown tremendously in popularity over the last year, and we've seen overwhelming demand for it given the attention that big data and computational statistics are getting.

  • Primarily used for statistical computing and data mining
  • Increased popularity in 2014 (Rank 15 to Rank 12 from October 2014 to November 2014)
  • Implementation of the S programming language with lexical scoping
  • Interpreted language similar to APL and MATLAB

Visual Basic .NET has been around for a while and holds a strong presence in certain applications and areas.

  • High-level programming language built on the .NET Framework
  • Market share staying in place for 2014 with incremental growth
  • Object-oriented language with a collection and library of objects
  • Generally easier to learn than Visual Basic because of the built-in libraries from .NET
  • Derived from the BASIC programming language, with a GUI to compensate for the learning curve

If you have other languages you'd like to see, email our team. We'd love to hear from you.

Making CodeEval load faster.

We've had some tremendous growth lately in the number of challenges that developers are submitting, and that's necessitated some time devoted to DevOps.

We're seeing some performance issues caused by some of the new features we've launched around profiles, rankings and searching for other developers. Since we've been running those in real time, the extra features and capabilities have slowed the site down and pushed page load times past what we consider acceptable.

To combat that we're making a couple of changes: upgrading some databases, and moving the ranking system off real time so that rankings are recalculated periodically in the background, at least for a while. It's perhaps not quite optimal, but if the ranking is out of sync by a few minutes it's probably a good trade-off until we can get the ranking data organized in a way that lets us move it back to real time.
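
The general pattern here is to serve rankings from a cached snapshot that a background job refreshes on a schedule. The sketch below illustrates that pattern only; RankingCache, start_background_refresh and the load_scores callback are hypothetical names, and the real system presumably reads from a database rather than a Python dictionary.

```python
import threading


class RankingCache:
    """Serves a precomputed ranking; refresh() is the only expensive call."""

    def __init__(self, load_scores):
        self._load_scores = load_scores  # callable returning {user: score}
        self._lock = threading.Lock()
        self._snapshot = []

    def refresh(self):
        """Recompute the full ranking and swap it in atomically."""
        scores = self._load_scores()
        # Highest score first; ties broken by user name for a stable order.
        ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
        with self._lock:
            self._snapshot = ranked

    def top(self, n):
        """Return the top n (user, score) pairs from the last snapshot."""
        with self._lock:
            return self._snapshot[:n]


def start_background_refresh(cache, interval_seconds, stop_event):
    """Refresh the cache every interval_seconds until stop_event is set."""

    def loop():
        while not stop_event.is_set():
            cache.refresh()
            stop_event.wait(interval_seconds)

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread
```

Page requests only ever call top(), so they stay fast even when a refresh is slow; the cost is that a new score shows up in the ranking only after the next refresh.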

Any thoughts on this are welcome.


How to find other developers around you!

With the recent launch of social features, we wanted to show you just how easy it is to connect with other developers on the new CodeEval social platform. 

From your developer profile you'll notice a new "find developers" function on the top right of your menu. From here you'll be able to find developers based on geographic location.

You can find developers and filter based on country and state. If you're looking for your friends or someone in particular, just search using their name or email address.

Hit the follow button if you'd just like to be able to view their public profiles and activity. Or hit connect, which will send them an invitation that allows more visibility and messaging options.

With public profiles, you'll be able to view ranking, challenges, languages, connections and much more.

For connected users, you'll be able to message each other, exchange information, or collaborate on challenges together. 

This is just the beginning of our rollout on social features. Expect major improvements and added functionality in the next few months allowing you to connect with the best developers on our platform, code more, and compete! 

-CodeEval Team

We've Launched Social Features!

As our community continued to expand, we realized that we needed to evolve into something more than just a place for developers to work on coding challenges. We needed to build a community to help developers with what they do best - Coding. 

Earlier this year, we launched completely new profiles that were elegantly designed with details on your code such as speed, accuracy, and efficiency. We also allowed you to rank and compare yourself against other developers.

This launch went better than expected and many of you have emailed in requesting social features. Today, we're excited to announce that CodeEval now supports features such as messaging, connecting, teams, and following.

We've designed these features to build a more engaged community where developers can learn from one another and connect with the world's best talent.

If you have any feedback or suggestions, please let us know.


In the meantime, let's connect!

- CodeEval Team

Upcoming Features & Community

We love getting letters from our most active users; it really helps us push our community in the right direction as we build out more features for developers. Recently we received a letter from Chris, who has been working his way up the leaderboard and offered some feedback on some of the things he loves about CodeEval as well as some of the things he wants to see on CodeEval. Since his feedback hit some points we’re working on, we asked if we could share it with our community and talk about some of the things we’ll be rolling out.

First of all I would like to say that I really like the idea behind CodeEval. I joined the site a few years ago and solved a bunch of problems before I was distracted by some other work. I came back to CodeEval recently to solve some problems in Go to help learn the language. CodeEval has been a good learning tool to experiment with solving something in a language when you are first getting started to really get a grasp of the syntax. Seeing my rankings grow while solving the problems has been very satisfying and has made your tool pretty addictive. I have been telling some of my co-workers about it and started to get them interested in it. I would really like to see CodeEval grow to be much bigger than it is. I see potential in the form of Stack Overflow, GitHub, and LinkedIn in how it is important to establish one’s online profile. Basically if a company is researching a candidate they may look at these tools to see what impact this person has had in programming and how passionate/knowledgeable they may be. This may not tell the whole story but it does play a factor. As part of a software company we always filter our candidates with a coding problem before hiring and this seems to be a great service where your site can offer real value for companies and recruiters. Anyways, I know I am preaching to the choir here since you know your own product and its potential better than anyone but I came up with a bunch of ideas from observations I have made about your business model.

Like Stack Overflow, GitHub, LinkedIn, etc., your key value seems to be in your users/programmers that are solving problems. You need to attract many users but also quality users. These are a few ideas I have to grow your user base.

It seems that CodeEval is very focused around users who are seeking a job. This makes sense because that is how you are trying to make money but job seekers do not necessarily make up a large proportion or the most proficient coders in the world so this group may not be helping as much to grow your user base. To be credible you want to attract the best coders in the world regardless of whether they are seeking a job or not. When I google a popular coder that I really like who is making an impact in the community I should be able to see their CodeEval profile. This is great publicity for your site because I can see that this famous coder is on your site and that I should think about joining your site, or if I didn’t know about CodeEval before, then now I do. Also, I may want to try and pass some of my favorite coders in the ranking; it is a great competitive environment. Ok, so attracting the best coders in the world is not necessarily an easy thing because, what would they have to prove by rising up the ranking? They are already thought of as one of the best coders without being on CodeEval. I am not sure exactly what CodeEval could offer to the best coders in the world but I think that this is good food for thought on how CodeEval can try to attract top coders.

One idea that I do have is to have a team or company ranking. I see you do have teams on the site (although I am not sure how I become part of a team), but it would be nice to see which companies have the best coders. It is great publicity for the company to try and be close to the top so they have their name there which also might make candidates want to apply there. As well, the company would encourage its employees to join CodeEval and solve problems to improve the company ranking so that they are close to the top of the list. Also, I think coders should be able to browse company pages to learn a bit more about the companies. Offering badges that the company can put on their site would also be a great way to validate the company from a hiring perspective and would be great publicity and traffic for your site.

In terms of personas, I think CodeEval has done a good job at looking at the coder and company side of things. Another possible persona which might bring in some revenue is the recruiter persona. I am not sure how to handle this one exactly because if I was on CodeEval and being bombarded by recruiters that might stop me from using the site. Maybe there is the option that would allow recruiters to contact you that you can enable or disable. Your site may be trying to cut out the recruiter but that hasn’t really been the case with LinkedIn. Companies may not need paid profiles all year round but a recruiter may gladly pay the monthly price if he is just passing that cost along to the company anyway. This next suggestion may not be the most beneficial for making money but having a recruiting or company ranking and reviews would be a good service for users who are looking for jobs and seems like something that many people would be willing to pay for if it meant knowing more about good recruiters and companies.

You have done a great job with the rankings, adding badges for each programming language you code in and your percentile ranking. This is really great to have on your public profile and makes the site more addictive. I can’t think of anything that specific but any work on the gamification side is really important so that once you get a user you can drive them to use your site. Earning rewards and offers is also genius and I look forward to seeing this developed more.

Another thing, which I see you have recognized, is that plagiarism would be a killer for the company. I don’t have much to add here because I see you have already taken action on this by adding the uniqueness measure, but I am just mentioning it because it is probably one of the most important things for your company and potentially your key competitive advantage over other sites that may compete with you. If users are able to plagiarise their way to the top of the ranking then the site may lose all credibility. So, obviously having a really good plagiarism detection algorithm is really important. Out of curiosity I noticed that there are a few repositories that have many solutions to the problems on your site. While you can’t stop people from posting solutions, non-unique solutions that could have been plagiarised should carry a penalty to discourage people from sharing solutions.

Anyways, a lot of this was just commending you on the work you have done and your direction. I think your site is a very smart idea and I look forward to seeing its continued growth. I hope these suggestions and feedback are of some help.

Thanks for your message Chris, it’s super helpful to hear some of the things you want to see and it helps move us forward. As you may have noticed, we’re making a lot of big changes here and you touched on many of the things at the top of our list. We started out a few years ago by building a screening tool for employers and focusing on helping developers find jobs. We quickly realized that building a community was much more valuable and more fun! Our new direction has been to eliminate the focus on jobs and work on allowing developers to build up their credibility with elegant profiles and a ranking system to let them know where they stand in the world. We’re building a community to attract the world’s best developers as well as those who strive to be among them one day, and there’s a whole list of things in our pipeline to facilitate this transition.

Firstly, more robust team and company pages are coming! Currently, companies have to create a company page to add you to a team. It’s actually been a huge hit among our existing companies since it allows them to showcase their companies and give developers a better idea of what it’s like to work there as an engineer. Things like tech stack and languages are helpful but having team members is key.

With teams, you can see how proficient their developers are and see how they compare to other companies. It’s important to know you’ll be working with smart people, and for the company, having actual developers become a recruiting tool just makes sense. Developers can connect with each other and share what it’s really like working there. Companies can build up their ranking and establish credibility as well as gain valuable insight into what kind of people they are attracting. It’s definitely something we’re pushing companies to build out and we’ve made it free, so talk to someone at your company to build one out!

Another thing you touched on is privacy. We’ve gone to great lengths to ensure we’re not just another LinkedIn where you will get bombarded with recruiter emails. We’ve made it optional to turn public profiles on and off, and we will be introducing a new messaging system to allow you to communicate with recruiters or other developers.

Gamification has also been at the top of our list, with the introduction of badges and things like offers. In the last few weeks we rolled out many more offers, including credit at Firebase and Kloudless as well as giveaways of iPads and Oculus Rifts. Stay tuned for more exciting offers and rewards!

Something we’re doing that’s pretty unique is plagiarism detection. It’s something that few companies have bothered to tackle, but because we want to maintain a sense of credibility in our community, we’ve invested heavily in our new plagiarism detection algorithm. More in this blog post:

Thanks for your feedback Chris! If you ever have feedback on our product or features you want to see, drop us a line anytime. We read every email.


CodeEval Team

New: CodeEval Code Comparison (Plagiarism Detection)

CodeEval is now even smarter with the launch of our code comparison (plagiarism detection) engine.

One of the challenges with providing relevant information and realistic code rankings for developers on CodeEval is building in a comprehensive system to protect the community against code plagiarism.

The open web makes sharing code easy, and we want to make sure that we're providing information and code rankings that are actually relevant, and that the reputation of the platform and our developers is protected. One of the requirements for this is some kind of system to address cheating or copying code, so that when you see someone's code rank, you can be confident that they wrote the code themselves.

After considering a number of algorithms for finding plagiarism in source code, we've decided to build our custom similarity detection engine based on the most current academic research in the area of "Winnowing". Here's one example of the research we took a look at: Winnowing: Local Algorithms for Document Fingerprinting

While we're not going to discuss everything we're doing or exactly how we do it, the gist is that every submission goes through an analyzer that splits the source code into lexemes (the basic lexical units of a programming language, such as keywords, identifiers, operators and literals). We remove any dependence on the names of variables, classes, etc., then apply a hashing algorithm and the minimum-hash principle to choose a fingerprint that characterizes the source code. We then compare fingerprints with each other; if they're similar, the code is likely to have been duplicated.
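
To make the idea concrete, here is a minimal sketch of winnowing-style fingerprinting. It is not CodeEval's engine: the regex tokenizer, the use of Python's keyword list, CRC32 as the hash, and the k-gram and window sizes are all assumptions chosen to keep the example short.

```python
import keyword
import re
import zlib

# Identifiers/keywords, numbers, or any other single non-space character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+(?:\.\d+)?|\S")


def normalize_tokens(source):
    """Split source into tokens, replacing names with ID and numbers with NUM."""
    tokens = []
    for tok in TOKEN_RE.findall(source):
        if (tok[0].isalpha() or tok[0] == "_") and not keyword.iskeyword(tok):
            tokens.append("ID")
        elif tok[0].isdigit():
            tokens.append("NUM")
        else:
            tokens.append(tok)
    return tokens


def kgram_hashes(tokens, k):
    """Hash every run of k consecutive tokens."""
    return [
        zlib.crc32(" ".join(tokens[i:i + k]).encode("utf-8"))
        for i in range(len(tokens) - k + 1)
    ]


def winnow(hashes, window):
    """Keep the minimum hash of each sliding window as the fingerprint set."""
    if not hashes:
        return set()
    if len(hashes) <= window:
        return {min(hashes)}
    return {min(hashes[i:i + window]) for i in range(len(hashes) - window + 1)}


def fingerprint(source, k=5, window=4):
    return winnow(kgram_hashes(normalize_tokens(source), k), window)


def similarity(source_a, source_b):
    """Jaccard similarity of the two fingerprint sets, from 0.0 to 1.0."""
    fa, fb = fingerprint(source_a), fingerprint(source_b)
    union = fa | fb
    return len(fa & fb) / len(union) if union else 0.0
```

Because names are replaced before hashing, two solutions that differ only in variable and function names produce identical fingerprints, which is the property described above.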

A few of the features this supports:

  • Fingerprints can be organized in a database to speed up one-against-all checks.
  • Works across all of our 18 supported programming languages.
  • The tokenized representation means we automatically ignore the names of functions and variables (classes, objects, and so on), so small changes to the program code have little impact on duplication checking.
  • Moving small pieces of code around in the source only slightly affects the result of the duplication search.
  • The algorithm is insensitive to permutations of chunks of code.
  • Since the system compares new code to all existing solutions, solutions found online and then submitted (even if they've been manipulated) are easily identified.
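
One common way to organize fingerprints for one-against-all checks is an inverted index from hash value to the submissions that contain it. The sketch below assumes each fingerprint is a set of integer hashes; the class and method names are hypothetical, and an in-memory dictionary stands in for whatever database is actually used.

```python
from collections import defaultdict


class FingerprintIndex:
    """Maps each fingerprint hash to the submissions that contain it."""

    def __init__(self):
        self._by_hash = defaultdict(set)
        self._sizes = {}

    def add(self, submission_id, fingerprints):
        """Store a submission's fingerprint set."""
        self._sizes[submission_id] = len(fingerprints)
        for h in fingerprints:
            self._by_hash[h].add(submission_id)

    def candidates(self, fingerprints):
        """Return {submission_id: shared_hash_count} for overlapping submissions."""
        shared = defaultdict(int)
        for h in fingerprints:
            for submission_id in self._by_hash.get(h, ()):
                shared[submission_id] += 1
        return dict(shared)

    def most_similar(self, fingerprints):
        """Return (submission_id, jaccard_score) of the closest stored submission."""
        best_id, best_score = None, 0.0
        for submission_id, count in self.candidates(fingerprints).items():
            union = len(fingerprints) + self._sizes[submission_id] - count
            score = count / union
            if score > best_score:
                best_id, best_score = submission_id, score
        return best_id, best_score
```

A new submission is only compared against stored submissions that share at least one hash with it, which avoids a full pairwise scan.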

Of course, displaying this information publicly on profiles might be a little sensitive for some developers. It's not intended to offend anyone; it's designed to give you some insight into your code, add some credibility to your profile, and protect the community from those who are cheating or trying to game the rankings.

The comparison engine has been running behind CodeEval for a while now while we thought through how we wanted to present this information to developers. We've tried a number of things and learned a lot. There are benefits that come with simplicity, and in looking at the results returned in tests it becomes fairly evident who has copied code (even if they've manipulated it) and who has written unique code that has some similarity because it solves the same challenge (as could be expected). There are some thumbs on the scale in different regards. For example, simpler solutions have a higher probability of returning similar results, since the code is shorter and is designed to solve a specific challenge, so it's likely to have some duplication. Still, while we feel that we have a pretty good idea, we didn't want to be in the position of reporting any false positives. We decided against making any kind of true/false judgement and instead deliver the results as a percentage of 'uniqueness' when compared with other submissions, leaving the final judgement with the viewer.

Log in to your CodeEval account to see this in action.

How we calculate the percentage...

Whenever code is submitted for a challenge, we record whether the solution duplicates another submission. We then calculate the ratio of a user's challenges with a duplicate to the total number of challenges they have solved, and from that obtain the percentage of uniqueness.
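
As a worked example of that arithmetic, the sketch below assumes the uniqueness percentage is the complement of the duplicate ratio; the function name and the exact formula are an interpretation of the description above, not a confirmed implementation.

```python
def uniqueness_percentage(solved_count, duplicate_count):
    """Percentage of solved challenges whose solution was not flagged as a duplicate."""
    if solved_count <= 0:
        raise ValueError("solved_count must be positive")
    if not 0 <= duplicate_count <= solved_count:
        raise ValueError("duplicate_count must be between 0 and solved_count")
    return 100.0 * (solved_count - duplicate_count) / solved_count
```

Under this reading, a developer with 40 solved challenges, 6 of which were flagged as similar to other submissions, would show 85% uniqueness.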


  • The percentage of 'not unique' code does not mean that that code was plagiarized, only that it is similar to other submissions (which is to be expected to some extent).
  • We currently don't check the source code for easy level challenges since the code tends to be similar. 
  • We will be continually tweaking this to improve the results, so changes are to be expected.
  • While we keep all of the submissions for each coding challenge, we only use the results from your last successful submission. So, if you have a result that is less than favorable, you can write original code and improve your uniqueness score.

Feel free to email us with feedback or leave a comment below.

-CodeEval Team

