PSA: Don't open-source everything!

Affenity · January 16, 2020, 11:49am

Intro

Hello, everyone. It’s me, the noob (again), and I’m back now with a PSA about open-sourcing. Recently, I’ve started to see more people say “you should open-source this”, or “I’m going to open-source this”, “it would benefit the community”.

First off, it’s great that you want to open-source something you’ve made, which you can provide to others so they can learn from it, and see how it’s made. You help others by doing this, and sometimes it even helps you! By open-sourcing you allow someone to spot a bug, and correct it for you. This is known as the power of open-source. But, know that you should not open source everything you create.

TL;DR

You should not open-source software / products you make for your end users. By allowing anyone to view the contents of what you made, you let other people find exploits in your code, and take advantage of them. Mostly, not for a good purpose.

Why

Do you know why open-sourcing your products is not completely safe? It’s because it’s open-sourced. Anyone can view the contents, any cybercriminal (“hacker”), or just a curious person. When you open up your product to the public, you allow anyone to view what you made, and how you did it. And the pitfall for this is that people, well, they are not always the good ones.

If you have a vulnerability in your code, for example something that executes arbitrary code (eval is evil), and someone finds it, they will either:

Notify you about it
Exploit the heck out of it, and alert their friends to do it too

Now, don’t ask me what most people would do, because I imagine you’d figure it out. It makes sense, doesn’t it? And it’s especially important to not publish your code online where you provide a service to someone. You’re putting your end users in explicit danger by doing so.
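To make the "eval is evil" remark concrete, here is a minimal sketch in Python (used only for illustration; the function names are hypothetical and not taken from any real service). It contrasts evaluating user input directly with parsing it as plain data:

```python
import ast

def parse_setting_unsafe(user_input: str):
    # DANGEROUS: eval executes whatever expression the caller sends,
    # so "user input" becomes arbitrary code running on your server.
    return eval(user_input)

def parse_setting_safe(user_input: str):
    # ast.literal_eval only accepts Python literals (numbers, strings,
    # lists, dicts, ...) and raises on anything else, such as function calls.
    try:
        return ast.literal_eval(user_input)
    except (ValueError, SyntaxError, TypeError):
        return None  # treat anything that is not a plain literal as invalid
```

With this sketch, parse_setting_safe("[1, 2, 3]") returns a list, while an input such as "__import__('os').getcwd()" is rejected instead of being executed. The bug exists whether or not the source is public; publishing it only changes how quickly someone can spot it.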

Have you ever seen a Discord bot, a ranking service, or some ban panels? You most likely have. These services can be awesome. But, imagine for a second, the owner has a critical bug in one of the endpoints / commands. He also has decided to open-source his product. Now, if a “hacker” finds this bug, they can retrieve data from the service, and much more, which can cause severe harm to end-users.

What if you store (hypothetically, of course) credit card information, IP addresses, Roblox account cookies and more? By allowing someone to view the contents of what you have made, you’re asking your end users for a lot of trust. On the other hand, though, you receive more trust from other people who prefer to view what they’re using. Sometimes, security through obscurity is something we have to accept, and we generally do.

There are a few exceptions to this PSA, though. Take Bitwarden. It’s an open-sourced password manager. You might think that this would be completely horrible to open-source, but since it’s so popular to let others know how software works, many volunteer security experts are reviewing code changes, which makes it more secure. But, unless you have a handful of security experts ready to watch over your code, you really shouldn’t open-source what you made, even though it benefits the community.

Conclusion

I’m going to keep this point short. I’ve explained why open-sourcing is not a great idea, especially when you’re providing software to other end-users. It poses a security risk, and the cons are way worse than the pros. I’ve said a lot about why open-sourcing isn’t always a great idea, but that does not mean it’s never a good idea.

Take the following thread to see why it’s good to open-source what you’ve made (that is not a service/product ;)): The magic of sharing: Why you should open source

Please also share this topic with users who want to open-source something they really shouldn’t.

Anaminus · January 16, 2020, 1:29pm

Now, don’t ask me what most people would do, because I imagine you’d figure it out.

I’ll tell you: the vast majority of users that find a vulnerability will not realize that it’s a vulnerability. The next largest group will not care enough to take any action. The next largest group will try to notify you, but not have enough foresight to use a private channel. Next is the group of benevolent users that actually use the right channels and want to help. Finally, the smallest group is the one user that actually puts effort into crafting an exploit for the vulnerability. Indeed, there are a good number of users that would willingly use this exploit, but not enough to craft it themselves. Moral of the story is: most people are lazy.

ForbiddenJ · January 16, 2020, 1:30pm

So I guess you’re saying that if a product is not popular, don’t open source it for security reasons?

buildthomas · January 16, 2020, 1:40pm

This seems to be a bit one-sided of a post.

You should always separate code from application secrets. When open sourcing a system, you are just open sourcing the code; any secrets should be pulled from a separate file that you’ve either redacted or just not provided (i.e. environment/constants file).
If you are storing user data such as these, and the mere act of showing the code of how you are accessing this data to malicious parties would compromise your system, then your system is fundamentally broken and should not be used by anyone, since it cannot protect their data properly.
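As a rough sketch of what separating code from secrets can look like (Python is used only for illustration, and the variable name RANKING_API_KEY is a made-up example), the code reads the secret from the environment at startup, so the repository never contains it:

```python
import os

def load_api_key(env_var: str = "RANKING_API_KEY") -> str:
    """Read a secret from the environment instead of hard-coding it.

    The variable name is a hypothetical example; use whatever your
    hosting platform or local .env tooling provides.
    """
    key = os.environ.get(env_var)
    if not key:
        # Fail loudly at startup rather than running with a missing secret.
        raise RuntimeError(f"{env_var} is not set; refusing to start without it")
    return key
```

The published code then shows only that a key is needed, not what the key is.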

You should weigh this point much more carefully. There are outliers, but in general by open-sourcing you are solving more problems before they are abused compared to the closed-source situation, simply because more eyes see more things and there are more benign coders out there than there are malicious ones.

That being said, this only works if what you are open-sourcing is actually worth anything to other people. If it’s some weird niche thing and you are the only real user, then don’t expect to get much support or collaboration from others, and then this section becomes moot.

Blackboxing a system with an accessible API is not going to make it so much harder to find exploits. The most dedicated malicious users will do anything to reverse-engineer your system if they find the worth in it. They don’t even need to understand the code to find these kinds of vulnerabilities – probing is enough, i.e. throwing malformed requests at it in any way they can think of until the system starts behaving unexpectedly.
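A small sketch of the probing idea, assuming a hypothetical ranking endpoint that takes a JSON body with user_id and rank fields (these names and limits are invented for illustration). The handler validates everything before acting on it, and the loop at the bottom plays the attacker by throwing malformed bodies at it:

```python
import json

def handle_rank_request(raw_body: str) -> dict:
    """Hypothetical endpoint handler that rejects anything malformed."""
    try:
        data = json.loads(raw_body)
    except json.JSONDecodeError:
        return {"status": 400, "error": "body is not valid JSON"}
    if not isinstance(data, dict):
        return {"status": 400, "error": "body must be a JSON object"}

    user_id = data.get("user_id")
    rank = data.get("rank")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(user_id, bool) or not isinstance(user_id, int) or user_id <= 0:
        return {"status": 400, "error": "user_id must be a positive integer"}
    if isinstance(rank, bool) or not isinstance(rank, int) or not 1 <= rank <= 255:
        return {"status": 400, "error": "rank must be an integer from 1 to 255"}
    return {"status": 200, "user_id": user_id, "rank": rank}

# Probing without any source access: send junk and watch the responses.
probes = [
    "",
    "null",
    "[]",
    '{"user_id": "1; DROP TABLE users", "rank": 5}',
    '{"user_id": 1, "rank": 9999}',
    '{"user_id": true, "rank": 1}',
]
results = [handle_rank_request(p)["status"] for p in probes]
```

A handler that survives this kind of probing is in good shape whether its code is public or not; one that does not is exposed either way.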

boatbomber · January 16, 2020, 3:59pm

oh no someone is wrong on the internet i gotta go spend almost an hour writing a detailed reply with sources and research to prove my points mom i swear

Disclaimer: None of this is meant as a personal attack against OP, but rather against the points they have tried to make. @1TheNoobestNoob is still cool.

I disagree with this post. Shocking, I know! I’m actually a little annoyed, because many people will not read past the title and will have tons of misinformation and ideas stuck with them.

Your post completely ignores threat modeling, and basic security.

That pretty much invalidates a lot of what you’ve said.

Threat modeling

What is it?

In essence, it is a view of the application and its environment through security glasses. Threat modeling is a process for capturing, organizing, and analyzing all of this information. Threat modeling enables informed decision-making about application security risk.

Thanks, Google, that was a little vague. Threat modeling is a way to plan and optimize security operations. Security teams lay out their goals, identify vulnerabilities and outline defense plans to prevent and remediate cybersecurity threats, at minimal cost.

It’s best demonstrated with an example.

Let’s take a look at my phone.
It’s secured with my fingerprint. In Hollywood, that’s seen as high tech and secure. In real life, it’s actually terrible security. Passwords are meant to be secret, right? Well, if your fingerprint is your password, it’s not a secret. You leave the password everywhere you go! Not only that, but it’s fuzzy matching, because otherwise you’d get false negatives all the time, and it would piss you off. In 6th grade my friend and I broke into his phone by faking his fingerprint.

If it’s so weak, why do I use it? Threat modeling. Analyzing my threat model led me to take the option that speeds my workflow through convenience at the cost of security, because the security is overkill. Slowing myself down and making my phone less streamlined for the sake of security provides me with no added benefit.

What’s my threat model? In other terms, what am I trying to defend against?

My little brother, my mom, and my friends.
Is a fingerprint enough to stop them? Yeah. None of them will go through the effort of cracking it, even if all they needed to do is to watch a 5 minute YouTube tutorial.
What are the potential threats? They put their thumb on it, or try to force my hand on it. No real worries there.

If I were a spy or criminal, my phone would have a long alphanumeric+symbols password on it, because I’d be trying to keep out much stronger attackers.

So, threat modeling is used to decide what is necessary for our level of threat. I hope I explained that well.

Further reading on this topic can be found here:

Threat Modeling Cheat Sheet (Super helpful)
12 Methods of Threat Modeling (Extremely useful)
Threat Modeling Explained (More network specific but still good)

Basic Security

AAAAAAAAAAAAHHHHHHHHH!

If you store secrets or valuable info (someone’s credit card) in your code, or in a plain text file, you lose the right to code. Thanks.

Cryptographic hashing has joined the battle!

This is way too much for me to cover in a reply. Maybe I’ll update this post in the future, but I haven’t even had breakfast or a coffee yet.
For now, here’s some external reading:

If you’re wondering how to implement these hashes on Roblox, I’ve got your back.
I already open sourced a library of hashes! HashLib

This is a great counterpoint to what you said. This is giving everyone the key to my security, right? They know my hash algorithm! My library isn’t obfuscated or hidden! They know exactly how I store my secrets!

Wait. The hashes are standard, and can be found in almost any language you want.
How can they be secure? Surely, you can read it and exploit it!

That’s the entire point of a properly made cryptographic hash. It doesn’t matter if they know that you used a standard SHA256 hash, because it’s a one way function and they cannot get the original input data out of the hash result. It’s not encryption that can be undone with a key. Again, see the further reading on this.
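Here is a minimal sketch of that idea using Python's standard hashlib (this is not HashLib's API, just the same standard algorithm in another language). One assumption I am adding on top of the point above: for passwords specifically, a salted and deliberately slow construction such as PBKDF2 is generally preferred over a single bare SHA256, and the iteration count below is only an illustrative value.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

def sha256_hex(data: bytes) -> str:
    # Same input always gives the same digest; the digest does not reveal the input.
    return hashlib.sha256(data).hexdigest()

def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    # A random per-user salt means identical passwords produce different digests.
    if salt is None:
        salt = os.urandom(16)
    # 200_000 iterations is an illustrative assumption, not a recommendation.
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 200_000)
    return salt, digest

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    _, digest = hash_password(password, salt)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(digest, expected)
```

Everything here is public and standard, and knowing it does not help an attacker recover the stored secret; they would still have to guess inputs one at a time.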

Well, let’s take a look at open source work and decide, based on these concepts, if they should have been kept secret.

I’ll start with a few of my own works.

Yup, all good.

Alright, but some of those may not be fair examples because they aren’t full systems, but just pieces for you to implement into your own.

Here’s a long list of open source projects.
I’ve been scrolling for a while and still have yet to find one that I think should have been kept secret.

The replies above mine actually did threat modeling of the Roblox userbase, even if they might not have realized that’s what it was. @Anaminus did a fine analysis:

@berezaa open sourced Miner’s Haven. Your post implies that such a popular project with such incentives to exploit would totally have been cracked by now, but it hasn’t…

Your post gave no examples of an open source going sour, and only gave an example of it going well. Not the best way to make your point, is it?

What you meant to say was “Don’t open source poorly written security features and terribly stored secrets.”

In that case, yeah, I agree. But don’t discourage our community from sharing and growing together under false pretenses.

Edit:

@EtiTheSpirit’s reply made me realize that I only approached this from the cryptography and security side. There are other factors in play on the business side!

I have a syntax highlighter module that can highlight any TextObject you throw at it, and an IDE module that uses this along with a lot of other things (like autocomplete and error detection) to have a really nice in game IDE. I haven’t open sourced them, because they are one of the primary things that give Lua Learning an edge over its competition.

However, I still firmly believe in open sourcing and benefiting the community. What if I told you that I open sourced parts of them, or even other versions! My TextBox Plus module is built off a branched version of the IDE (uses the undo/redo sections of the IDE, stuff like that.) The lexer that runs under the highlighter is also given away for free!

Xan_TheDragon · January 16, 2020, 4:03pm

I can’t say I immediately agree with the primary points of this thread, especially as someone whose most popular creation is an open source raycasting module.

I don’t think the proper ideology is “don’t open source things” so much as it is “be smart and selective about what you open source”. I think the thread would be far better if you educated people about both why and when to open source content. Right now it seems almost akin to what could be crudely and inaccurately considered a “fear-mongering” thread (I know that’s not the point, no worries) in that you really try to make open sourcing look like a dangerous and foreign beast more than anything.

By describing when to open source and when not to, for instance, by covering the points like the following…

Are you an expert at the language?
Do you understand the potential repercussions of an exploit?
And most importantly, is the nature of the code one that can be abused or contains abusable segments (e.g. eval as you mention)?

… You provide a lot more information to users and show more care to the subject.

My own module is a shining example of one of these risks – Considering that a large portion of new FPS/TPS games are employing use of this module, it would be a very critical issue if it were exploitable, yes? In fact, I can already think of an exploit. Here’s the important part – Not all of the pitfalls are a result of the module writer so much as they are a fault of the person using the module. Incorrect use can always create new issues that make the original module look incorrect or faulty. In the case of the exploit I imagined, it affects clientside hit detection in that if the client freezes their game, they can get a really long and straight ray to appear, potentially even past the length limit specified by the developer. I’m comfortable mentioning that because most proper implementations will employ hit detection on the serverside, because people already know it’s dangerous to do that on the client.
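A rough sketch of the server-side check described above, written in Python for illustration only (it is not the module's actual code, and the length limit and tolerance are made-up values). The server recomputes the distance itself instead of trusting whatever ray the client reports:

```python
import math
from typing import Sequence

MAX_RAY_LENGTH = 300.0  # hypothetical limit chosen by the developer

def is_hit_plausible(
    origin: Sequence[float],
    hit_position: Sequence[float],
    max_length: float = MAX_RAY_LENGTH,
    tolerance: float = 1.05,  # small slack for latency and float error (assumption)
) -> bool:
    """Server-side sanity check on a hit position reported by a client."""
    distance = math.dist(origin, hit_position)
    # Reject NaN/inf coordinates as well as rays longer than the allowed length.
    return math.isfinite(distance) and distance <= max_length * tolerance
```

With a check like this on the server, a client that freezes its game to stretch a ray gets its hit thrown out, which is why publishing the client-side behaviour is not a real risk.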

There are other cases where this is selective. For instance, is the module extremely valuable to your game? There’s all kinds of private libraries in my indev game that many people would likely regard as extremely powerful or even revolutionary, and I’m not going to release them because they are far too valuable to me and my own success.

Affenity · January 16, 2020, 4:18pm

Hi everyone, I’m happy to get your viewpoints on the matter, but I seem to have worded myself poorly, which I can tell from the responses. That’s on me! I should of course word myself in a way that anyone can understand, so nobody gets confused.

I’m just going to state that I open-source stuff, I hope you don’t think of me as one who hates open-sourcing things because I don’t know what it is. I’m in fact a creator of several open-sourced projects myself, so if I were against open-sourcing, I wouldn’t have released them to the public at all!

I’m also going to state that I’m not talking about libraries you can use in your game that are a part of your product / service (product/service: game, website, etc.). Most of you have said you have open-source projects, and that’s fantastic! You’re helping the community out. But what you have listed is not what I was aiming for! HashLib, Raycasting module and all those “modules” are not in the scope of my PSA.

I’m terribly sorry for wording it that way. I meant end-services. For example websites.
I’ve seen multiple posts about services (Discord bots, ranking services, ban panels, etc.), on many of which the users / OPs say “we/you should open-source this”. I’m aiming for those kinds of things, because I disagree with open-source being the solution to everything. Again, I open-source stuff that is of value to others.

I believe that open-sourcing something that can expose end-users (the end-users of the said services) to a security risk should be avoided. Yes, open-sourcing your service will let others know how it all works and they can correct some bugs (maybe). But, I feel like it poses a security risk, due to anyone being able to see how your project works, and how they can exploit it.

I’m not going to say names, or point out anyone here, because that is not my intention, but: Those who have released these kinds of services on the forum have had serious security vulnerabilities, some of which are absolutely basic. And when they say they’re going to open-source these projects, it just makes me confident that there are oceans of things to exploit in them.

What I additionally don’t like about this is that they’re handling PII on behalf of users, such as email addresses, IP addresses, etc. They most likely have bugs, or something they didn’t plan for in their code, that can be used to abuse the functionality of said services, for example to get someone else’s PII. I believe this matters, in addition to the other factors of business secrets and that stuff.
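To show the kind of basic bug I mean, here is a minimal sketch in Python (the data, names and function are all hypothetical, not taken from any service on the forum). The whole vulnerability is one missing comparison:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Profile:
    user_id: int
    email: str

# Stand-in for a database table; the entries are made-up examples.
PROFILES = {
    1: Profile(1, "alice@example.com"),
    2: Profile(2, "bob@example.com"),
}

class Forbidden(Exception):
    """Raised when a user asks for data that is not theirs."""

def get_profile(requester_id: int, target_id: int) -> Profile:
    # The check that is easy to forget: without it, any logged-in user
    # can read anyone's PII just by changing target_id in the request.
    if requester_id != target_id:
        raise Forbidden(f"user {requester_id} may not read profile {target_id}")
    return PROFILES[target_id]
```

If that comparison is missing, reading the source makes the hole obvious, although simply changing the ID in a request would reveal it too.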

boatbomber · January 16, 2020, 4:21pm

I started my reply with a disclaimer that you’re cool.

I addressed that as well.

Alright, but some of those may not be fair examples because they aren’t full systems, but just pieces for you to implement into your own.
…
@berezaa open sourced Miner’s Haven. Your post implies that such a popular project with such incentives to exploit would totally have been cracked by now, but it hasn’t…
…
Your post gave no examples of an open source going sour, and only gave an example of it going well. Not the best way to make your point, is it?

This was my conclusion.

What you meant to say was “Don’t open source poorly written security features and terribly stored secrets.”

Provalities · January 16, 2020, 5:47pm

This isn’t the right conversation to have. The many-eyes theory that open-source improves the security of your project doesn’t seem to play out in practice, because such a tiny sliver of people will actually bother to read any source code, and to validate security of that code, you’ll need to put in considerable effort. We can compare the security vulnerabilities in Linux and Windows 10, with no clear winner.

This isn’t a good argument against open-source. Instead, we need to be focused on improved tooling, such as programming languages, frameworks, testing or formal validation, and good programming practices in the languages we employ. This is what’s going to make the difference. By the logic used in the original post, we should be cautious about making games because by doing so, we could have vulnerabilities in those games.

Security through obscurity is valuable in inherently unsolvable problems, like aimbots in a shooter game. We can have all the validation in the world, but ultimately, we need obscurity (unless we go the Google Stadia route, which I doubt the viability of). But when we can solve the problem, as with general security vulnerabilities in our code, that’s our objective.

Quenty · January 16, 2020, 7:25pm

I heavily disagree. Security through obscurity is a bad idea. The foundation of computer security is not through obscurity, but rather cryptography and other authentication and authorization methods. Security through obscurity is a bad practice and not something you should rely upon.

On a side note, my actual game is about 50% open source, and I don’t think any of the exploits that my game has experienced come from people being able to use the open source code.

JedDevs · January 16, 2020, 10:24pm

Agreed, it has been proven time and time again that open sourcing your code to the general public, with a moderation team in place, has a positive impact on security and quality and, in this ever changing and evolving world, even helps keep up with competition.

This is why so many Fortune 500 companies open-source their systems: their own team is no match for the world of open source developers.

While we haven’t got as big an open-source userbase as many would like, we still have one and as soon as you make a devforum post about it, you have a large group of people offering opinions, suggestions and assistance.

If it works for Microsoft, Google, Facebook, Walmart… why not for Roblox game development? The expertise is there (for now), the people are there, the backbone is there.

“The ninth annual Future of Open Source Survey 2015 says around 78 percent of companies around the world are using open source software for running their vital operations. This is a significant indicator of the large-scale application of open source software (OSS) in the current technological landscape. This write-up talks about various advantages that OSS offers, other than cost benefit, to big companies along with examples of its successful implementation in the likes of Google, Facebook, IBM, Toyota and others. You also here get to know about various job opportunities that one can apply to in this domain.” - opensourceforu. com

OverEngineeredCode · January 20, 2020, 9:10pm

I 100% see where you’re coming from, but I don’t think some of the points you made take the full picture into account.

Pretty much anything which is comparable to open source has problems and isn’t completely safe. Going closed source isn’t completely safe, as you have to rely on the safety of your source code and keep it safe. Open source just has different risk-reward situations.

Like you said a few times, it’s important to weigh up your options and I think it’s dangerous to shy away from something because it’s not “completely safe”, otherwise you will never get anything new happening.

I completely disagree with this statement. What you do with your code is your problem. End users don’t care, they just want a good experience.

There are so many huge projects that handle extremely sensitive data which are open source. These services don’t put their users in explicit danger.

Your users are in no more danger than if a black-hat security expert tries to break the program without code. If there is a serious bug, somebody will find it with or without code.

That’s not what happens, not intentionally anyway. You’re not meant to expose personal information with this, nor do you have to.

Most tools and platforms provide methods to let you sort this out. For example, repl.it has the Environments System.

And, if you do expose information unintentionally, there are tools to remove it. For example, here’s GitHub’s solution.

To put it bluntly, that’s your problem. As the developer, you shouldn’t ever leave a security hole in your program relying on the fact that nobody can see it.

An attacker will eventually find it - and that’s without code. You can still find exploits without code. If it’s there, it will be exploited.

This point in particular misses a major point of open sourcing. When you open source, you do have a team of security experts - you have whoever is viewing your project looking for vulnerabilities. The idea of open sourcing is that you will have far more eyes to look for problems and a higher chance of them being found and reported to you.

If you leave something as public with a bug in it, it will be found by somebody. If they don’t report it, somebody else will. And, if the first person exploits the bug, you will eventually find out.

Like I said before though, it’s not the community’s job to write your code. You should write good, strong code in the first place and not have to rely on community members. But, as a fallback, the mistakes you do make (and you will make some) will be found easier than if you had to find them yourself.

Sorry for the long reply, I had a large amount to say